[HN Gopher] Consistent Hashing for Dummies
___________________________________________________________________
Consistent Hashing for Dummies
Author : alanfranz
Score : 50 points
Date : 2021-10-28 12:30 UTC (10 hours ago)
(HTM) web link (www.franzoni.eu)
(TXT) w3m dump (www.franzoni.eu)
| pixelpoet wrote:
| Every rendering person's hero, Eric Veach, makes an appearance
| again! He basically invented modern rendering for his thesis[1],
| then developed AdWords, among many other things...
|
| [1] https://graphics.stanford.edu/papers/veach_thesis/
|
| [2] https://en.wikipedia.org/wiki/Eric_Veach
|
| [3] Funny video of him winning an Academy Award:
| https://www.youtube.com/watch?v=a3ReTBe06Kw
| spmurrayzzz wrote:
| This is one of those algorithms that, once you understand the
| fundamental concepts, opens up so many doors for you in systems-
| level thinking. Great tool to have, portable, and relatively easy
| to implement.
|
| I recently had to figure out a short-term way of horizontally
| scaling a graphite service. I didn't really like the stateful
| complexity around the carbon relay-based solution, so I decided
| to build a simple orchestration layer in node that routes statsd
| metrics to graphite nodes using consistent hashing.
|
| Given the use cases for that particular service, I didn't have to
| worry about long term data storage so I could hand wave some of
| the pain points in cluster membership changes. A simple keyspace-
| aware rsync daemon was enough to get the job done.
| connorbrinton wrote:
| I've always had a hard time understanding consistent hashing, and
| find rendezvous hashing [1] to be much more understandable. It
| provides better load-balancing properties and uses less memory
| than consistent hashing, but requires more computation.
|
| It seems like consistent hashing is much more popular though,
| which definitely makes it worthwhile to learn.
|
| [1] https://en.wikipedia.org/wiki/Rendezvous_hashing
| benlivengood wrote:
| I think jump-based consistent hashing also requires less
| memory, O(1), than rendezvous hashing.
|
| From my limited perspective and the paper linked in the article
| it sounds like consistent hashing is best for numbered sharding
| (disk storage systems and databases) and rendezvous hashing is
| best for arbitrarily distributed storage where nodes can't be
| consecutively numbered.
|
| My best attempt at explaining jump consistent hashing is that
| it's possible to determine how likely a given key will be to
| move to a nearby bucket (small hash values make it less likely,
| large hash values make it more likely) and use that likelihood
| to choose a next bucket candidate for each key. About half of
| keys are likely to move from 1 bucket to 2, but only a third
| are likely to move from 2 buckets to 3, etc. and in general 1/n
| of keys are likely to move to bucket n.
| orasis wrote:
| I hadn't heard of the "Jump Consistent Hashing" algorithm that
| this article uses. The original paper is much more informative
| than the article:
|
| https://arxiv.org/abs/1406.2294
| barbazoo wrote:
| For some reason I didn't find this explanation intuitive at all,
| and I would consider myself to be at least "dummy" level. After a
| quick search, I found this [0] much more intuitive.
|
| [0] https://www.toptal.com/big-data/consistent-hashing
___________________________________________________________________
(page generated 2021-10-28 23:02 UTC)