[HN Gopher] The PolymurHash universal hash function
___________________________________________________________________
The PolymurHash universal hash function
Author : fanf2
Score : 60 points
Date : 2023-08-18 14:39 UTC (1 days ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| nullc wrote:
| The proof of almost-universality won't apply to subsets of bits,
| and many applications of hashes only use a subset of bits. The
| construction techniques used to get almost-universality might
| even make some subset of bits very poor in their performance: or
| at least my experience with designing linear codes for error
| detection/correction has been that strong codes often have some
| bits that are surprisingly weak.
|
| So hopefully people have tested this empirically that the
| collision rates for the top and bottom n bits are all reasonable
| looking for varrious values of n (and might be useful to know
| which bits are the strongest, assuming you get a choice).
| Assuming there are no hidden gotchas like that, this looks pretty
| great.
| orlp wrote:
| I am working on a new hash that uses more memory (aiming around
| 4KiB that's shared in a process), but is a whole lot faster.
| The finalizer for that hash is a universal hash which I believe
| is 2^(2-k) almost-delta universal mod 2^k for any k <= 64. So
| the bottom bits are always high quality.
|
| PolymurHash has no such guarantee, the claims are only over the
| full hash. That said, I don't expect problems in practice due
| to the Murmur style bit mixing at the end.
| nullc wrote:
| > I am working on a new hash that uses more memory (aiming
| around 4KiB that's shared in a process), but is a whole lot
| faster.
|
| Though it's hard to imagine anything with 4kib of state being
| very fast! is it just a precomputed power table or something
| where the whole thing isn't even accessed for short inputs?
|
| > The finalizer for that hash is a universal hash which I
| believe is 2^(2-k) almost-delta universal mod 2^k for any k
| <= 64. So the bottom bits are always high quality.
|
| That's interesting!
|
| > PolymurHash has no such guarantee, the claims are only over
| the full hash. That said, I don't expect problems in practice
| due to the Murmur style bit mixing at the end.
|
| That mix at the end is just an addition with a secret
| constant, right? I'd expect then the least significant bits
| to not be improved much by it.
| rurban wrote:
| Confirmed, I tested it. https://github.com/rurban/smhasher
| fanf2 wrote:
| Nice! Does smhasher have a benchmark for short unaligned
| strings, up to about 16 bytes?
| rurban wrote:
| Only aligned, from 1--32
| cyanydeez wrote:
| I'd love this as a wasm library.
| aappleby wrote:
| MurmurHash author here, looks interesting and I will take a
| closer look later today.
___________________________________________________________________
(page generated 2023-08-19 23:00 UTC)