Mar. 8th, 2006
Crazy hash tables
Mar. 8th, 2006 12:16 pmSo the other day in CS the question came up of, what would happen if you used a mutable object as a key in a hashtable? (Actually, IIRC, the question was entirely different, but Mr. Kurtz misunderstood it as that, so...) This was in Ruby, btw, so I'll use its syntax.
And of course it's pointed out that if you do something like
a=[1]
h={a=>1}
a[0]=0
Then h[a] will return nil, because it'll hash differently. This is not so unexpected. Between then and today Mr. Kurtz thought about the question some more, presumably actually tried it, and here's the crazy part: What if you do h[[1]]? That will also return nil, because while it'll hash the same, what's actually stored as the key in the hashtable is a reference to a, which is now [0], (i.e. not a copy of it, which would still be [1]) so they compare unequal. (This also means if you just happen to change a such that the hash is the same, h[a] will still return 1.) Thus if you say h[[1]]=0, it'll create a new entry in the hashtable - in the same hashbucket, of course.
So what this means is if you do
a=[1]
h={a=>1}
a[0]=0
h[[1]]=0
a[0]=1
Then you will get a hashtable containing, in the same hash bucket, two distinct entries with equal keys. What will h[[1]] return? Depends on whether the implementation adds new entries at the beginning or the end of the list. (I don't know why you would add things at the end, that seems kind of silly, but the point is still that it's implementation dependent and also just plain ridiculous.)
Now it happens that Mr. Kurtz was talking about this to Robby Findler, another professor here and one of the implementors of DrScheme, which also provides built-in hashtables. And Findler said, "This is incredibly broken. No sane implementation would do it this way." So they tried it on DrScheme. What does it do? The exact same thing. :D
-Sniffnoy
And of course it's pointed out that if you do something like
a=[1]
h={a=>1}
a[0]=0
Then h[a] will return nil, because it'll hash differently. This is not so unexpected. Between then and today Mr. Kurtz thought about the question some more, presumably actually tried it, and here's the crazy part: What if you do h[[1]]? That will also return nil, because while it'll hash the same, what's actually stored as the key in the hashtable is a reference to a, which is now [0], (i.e. not a copy of it, which would still be [1]) so they compare unequal. (This also means if you just happen to change a such that the hash is the same, h[a] will still return 1.) Thus if you say h[[1]]=0, it'll create a new entry in the hashtable - in the same hashbucket, of course.
So what this means is if you do
a=[1]
h={a=>1}
a[0]=0
h[[1]]=0
a[0]=1
Then you will get a hashtable containing, in the same hash bucket, two distinct entries with equal keys. What will h[[1]] return? Depends on whether the implementation adds new entries at the beginning or the end of the list. (I don't know why you would add things at the end, that seems kind of silly, but the point is still that it's implementation dependent and also just plain ridiculous.)
Now it happens that Mr. Kurtz was talking about this to Robby Findler, another professor here and one of the implementors of DrScheme, which also provides built-in hashtables. And Findler said, "This is incredibly broken. No sane implementation would do it this way." So they tried it on DrScheme. What does it do? The exact same thing. :D
-Sniffnoy