Details
-
Improvement
-
Resolution: Done
-
Major - P3
-
2.5.4
-
None
Description
The existing hash function used for StringMap has a very high collision rate for text names that differ by only a few characters, particularly in their suffixes. Running the SMHasher benchmark's text and performance tests shows that Murmur3 produces fewer collisions, performs comparably for short keys, and performs better for long keys.
The task of this ticket is therefore to replace the current implementation with Murmur3. I recommend doing this by making StringDataDefaultHash a typedef of StringData::Hasher and changing the implementation of StringData::Hasher::operator() as follows:
On platforms where size_t is 32 bits wide, we should use MurmurHash3_x86_32. Where it is 64 bits wide, we should use MurmurHash3_x64_128 and keep the low-order 64 bits.
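The width-based selection described above could look roughly like the following. This is a minimal, self-contained sketch and not the actual patch: the `sketch` namespace, the helper names, the seed of 0, and the use of `std::string` in place of StringData are all assumptions made for illustration. It also assumes that "low order 64 bits" means the first 64-bit word of the 128-bit result, which is what the first eight output bytes hold on a little-endian host.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

namespace sketch {  // hypothetical namespace, not part of the ticket

inline std::uint32_t rotl32(std::uint32_t x, int r) { return (x << r) | (x >> (32 - r)); }
inline std::uint64_t rotl64(std::uint64_t x, int r) { return (x << r) | (x >> (64 - r)); }

// Read n bytes (n <= 8) as a little-endian integer. The reference code reads
// blocks in native byte order, which gives the same result on little-endian hosts.
inline std::uint64_t loadLE(const unsigned char* p, std::size_t n) {
    std::uint64_t v = 0;
    for (std::size_t i = 0; i < n; ++i) v |= static_cast<std::uint64_t>(p[i]) << (8 * i);
    return v;
}

inline std::uint64_t fmix64(std::uint64_t k) {
    k ^= k >> 33; k *= 0xff51afd7ed558ccdULL;
    k ^= k >> 33; k *= 0xc4ceb9fe1a85ec53ULL;
    return k ^ (k >> 33);
}

// MurmurHash3_x86_32: 4-byte blocks, then a 0..3 byte tail, then finalization.
inline std::uint32_t murmur3_x86_32(const void* key, std::size_t len, std::uint32_t seed) {
    const unsigned char* data = static_cast<const unsigned char*>(key);
    assert(data != nullptr || len == 0);
    const std::uint32_t c1 = 0xcc9e2d51u, c2 = 0x1b873593u;
    std::uint32_t h = seed;
    const std::size_t nblocks = len / 4;
    for (std::size_t i = 0; i < nblocks; ++i) {
        std::uint32_t k = static_cast<std::uint32_t>(loadLE(data + 4 * i, 4));
        k *= c1; k = rotl32(k, 15); k *= c2;
        h ^= k; h = rotl32(h, 13); h = h * 5 + 0xe6546b64u;
    }
    const std::size_t rem = len & 3;
    if (rem > 0) {
        std::uint32_t k = static_cast<std::uint32_t>(loadLE(data + 4 * nblocks, rem));
        k *= c1; k = rotl32(k, 15); k *= c2; h ^= k;
    }
    h ^= static_cast<std::uint32_t>(len);
    h ^= h >> 16; h *= 0x85ebca6bu;
    h ^= h >> 13; h *= 0xc2b2ae35u;
    return h ^ (h >> 16);
}

// MurmurHash3_x64_128, returning only the first 64-bit word (h1) of the result.
inline std::uint64_t murmur3_x64_128_low64(const void* key, std::size_t len, std::uint32_t seed) {
    const unsigned char* data = static_cast<const unsigned char*>(key);
    assert(data != nullptr || len == 0);
    const std::uint64_t c1 = 0x87c37b91114253d5ULL, c2 = 0x4cf5ad432745937fULL;
    std::uint64_t h1 = seed, h2 = seed;
    const std::size_t nblocks = len / 16;
    for (std::size_t i = 0; i < nblocks; ++i) {
        std::uint64_t k1 = loadLE(data + 16 * i, 8);
        std::uint64_t k2 = loadLE(data + 16 * i + 8, 8);
        k1 *= c1; k1 = rotl64(k1, 31); k1 *= c2; h1 ^= k1;
        h1 = rotl64(h1, 27); h1 += h2; h1 = h1 * 5 + 0x52dce729;
        k2 *= c2; k2 = rotl64(k2, 33); k2 *= c1; h2 ^= k2;
        h2 = rotl64(h2, 31); h2 += h1; h2 = h2 * 5 + 0x38495ab5;
    }
    const unsigned char* tail = data + 16 * nblocks;
    const std::size_t rem = len & 15;
    if (rem > 8) {  // tail bytes 8..14 feed the second lane
        std::uint64_t k2 = loadLE(tail + 8, rem - 8);
        k2 *= c2; k2 = rotl64(k2, 33); k2 *= c1; h2 ^= k2;
    }
    if (rem > 0) {  // tail bytes 0..7 feed the first lane
        std::uint64_t k1 = loadLE(tail, rem > 8 ? 8 : rem);
        k1 *= c1; k1 = rotl64(k1, 31); k1 *= c2; h1 ^= k1;
    }
    h1 ^= len; h2 ^= len;
    h1 += h2; h2 += h1;
    h1 = fmix64(h1); h2 = fmix64(h2);
    return h1 + h2;  // the full algorithm would also emit h2 + (h1 + h2)
}

// Stand-in for StringData::Hasher: picks the variant by the width of size_t.
struct Hasher {
    std::size_t operator()(const std::string& s) const {
        static_assert(sizeof(std::size_t) == 4 || sizeof(std::size_t) == 8,
                      "unexpected size_t width");
        if (sizeof(std::size_t) == 4)
            return static_cast<std::size_t>(murmur3_x86_32(s.data(), s.size(), 0));
        return static_cast<std::size_t>(murmur3_x64_128_low64(s.data(), s.size(), 0));
    }
};

}  // namespace sketch
```

With this shape, making StringDataDefaultHash a typedef of the hasher is a one-line change, and callers such as StringMap would not need to know which Murmur3 variant is active. A real implementation would more likely call the upstream MurmurHash3 functions than re-implement them as done here.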