[HN Gopher] The power of two random choices
___________________________________________________________________
The power of two random choices
Author : mfiguiere
Score : 128 points
Date : 2024-02-07 02:05 UTC (1 day ago)
(HTM) web link (brooker.co.za)
(TXT) w3m dump (brooker.co.za)
| Zanni wrote:
| Nice animated visualization of this:
| https://twitter.com/GrantSlatton/status/1754912113246798036
| koolba wrote:
| That's neat. An easy way to build intuition here is that with a
| pair to consider you will never target the most loaded node; at
| worst you land on the second most loaded one.
| atq2119 wrote:
| The underlying mathematical driver of the power of two random
| choices is a pretty amazing fact about the "n balls into n bins"
| random experiment.
|
| If the balls are thrown independently and uniformly at random,
| the most loaded bin will have Θ(log n / log log n) balls in it
| with high probability.
|
| If you instead pick two bins uniformly and independently at
| random for each ball and throw the ball into the bin which has
| fewer balls in it so far, then the most loaded bin will have
| Θ(log log n) balls in it, i.e. a massive asymptotic
| improvement.
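|
| The gap is easy to see in a quick simulation (a rough sketch;
| n = 100,000 is an arbitrary choice):
|
|     import random
|
|     n = 100_000  # n balls into n bins (size is arbitrary)
|
|     # One choice: each ball goes into a uniformly random bin.
|     one = [0] * n
|     for _ in range(n):
|         one[random.randrange(n)] += 1
|
|     # Two choices: pick two random bins, use the emptier one.
|     two = [0] * n
|     for _ in range(n):
|         a, b = random.randrange(n), random.randrange(n)
|         two[a if two[a] <= two[b] else b] += 1
|
|     print("max load, one choice: ", max(one))   # grows like log n / log log n
|     print("max load, two choices:", max(two))   # grows like log log n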
| mjb wrote:
| The balls-into-bins problem comes up often in distributed
| systems engineering, so it's super important to have the right
| intuition. I think a lot of people come into the space assuming
| that random balls-into-bins is much more even than it really is
| (and so underestimate the value of better-than-random
| placement).
|
| Another interesting example is hash-based partitioning of data.
| It doesn't spread data out nearly as evenly as one might
| expect!
|
| (I wrote a blog post about that too:
| https://brooker.co.za/blog/2018/01/01/balls-into-bins.html)
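|
| A rough sketch of the effect (10,000 keys over 100 partitions,
| both numbers arbitrary):
|
|     import hashlib
|
|     partitions, keys = 100, 10_000   # arbitrary sizes
|     counts = [0] * partitions
|     for i in range(keys):
|         h = int(hashlib.sha256(f"key-{i}".encode()).hexdigest(), 16)
|         counts[h % partitions] += 1
|
|     print("ideal per partition:", keys // partitions)
|     print("actual min / max:   ", min(counts), max(counts))
|
| Even with a good hash, the per-partition counts spread well
| beyond the "ideal" value, just like one-choice balls-into-bins.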
| nyrikki wrote:
| I have found it helpful to remember that both the law of large
| numbers and the central limit theorem depend on the independent
| and identically distributed (IID) assumption.
|
| As an example, many computing systems are Markovian: memoryless,
| with exponentially distributed waiting times.
|
| So the second part of IID doesn't hold, but it may be close
| enough for your needs.
| mjb wrote:
| However, in many real systems the aggregate behavior
| follows the asymptotic trend of the central limit theorem
| even though the underlying distributions may not fit the
| strict requirements.
| nyrikki wrote:
| Yes, I always start with IID, but try to watch for evidence
| that those assumptions need to change.
|
| Too many convenient properties are lost once you lose IID to
| drop it prematurely.
| mkhnews wrote:
| I used this method for a cache eviction/replacement scheme. For
| our use case it was better than traditional LRU.
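|
| One way such a scheme can look: on eviction, sample two resident
| entries at random and drop whichever was used least recently. A
| minimal sketch (the class and field names are made up here):
|
|     import random, time
|
|     class TwoChoiceCache:
|         def __init__(self, capacity):
|             self.capacity = capacity
|             self.data = {}        # key -> value
|             self.last_used = {}   # key -> last access time
|
|         def get(self, key):
|             if key in self.data:
|                 self.last_used[key] = time.monotonic()
|                 return self.data[key]
|             return None
|
|         def put(self, key, value):
|             if key not in self.data and len(self.data) >= self.capacity:
|                 # Power of two choices: compare two random victims
|                 # and evict the one touched least recently.
|                 picks = random.sample(list(self.data),
|                                       min(2, len(self.data)))
|                 victim = min(picks, key=self.last_used.get)
|                 del self.data[victim]
|                 del self.last_used[victim]
|             self.data[key] = value
|             self.last_used[key] = time.monotonic()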
| zellyn wrote:
| The most interesting question this has always raised for me is
| whether it implies you should always try two
| doctors/dentists/therapists/etc. before making a choice. Seems
| like it would be just as powerful "in real life".
| taeric wrote:
| Well, this strategy would only help you find ones that are not
| overloaded? And probably only works if you can change choices
| over time?
|
| That said, sampling, in general, works surprisingly well in all
| things.
| mjb wrote:
| There's a whole area of math in "optimal stopping" which aims
| at exactly this question: how many dentists should you try
| before you pick your favorite?
|
| For example: https://en.wikipedia.org/wiki/Secretary_problem
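|
| The classic answer is to look at the first n/e candidates
| without committing, then take the next one who beats everyone
| seen so far. A rough simulation sketch of that rule:
|
|     import math, random
|
|     def trial(n=100):
|         ranks = list(range(n))      # 0 is the single best candidate
|         random.shuffle(ranks)
|         cutoff = int(n / math.e)    # observe-only phase
|         best_seen = min(ranks[:cutoff])
|         for r in ranks[cutoff:]:
|             if r < best_seen:
|                 return r == 0       # did we pick the overall best?
|         return ranks[-1] == 0       # forced to take the last one
|
|     wins = sum(trial() for _ in range(10_000))
|     print(f"picked the single best ~{wins / 100:.0f}% of the time")  # ~1/e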
| vikingerik wrote:
| That problem is always sort of intuitively misleading. The
| success criterion is defined as picking the single best
| option from the field. But that's not how real life works.
| You don't have to pick the single best option for a job or
| spouse or dentist or whatever; you'll likely be satisfied
| with anything in the top ten or quintile or somewhere around
| there. It's interesting mathematically, but it's seldom
| really applicable to real life.
| chrisweekly wrote:
| That reminds me of a great book -- "Algorithms to Live By" --
| which examines precisely this kind of question. Useful,
| educational, and entertaining = highly recommended.
| ckcheng wrote:
| Another place this technique was used and discussed recently:
|
| Caches: LRU vs. Random (2014) (danluu.com)
|
| https://news.ycombinator.com/item?id=39093109
| Night_Thastus wrote:
| It _feels_ subjectively like Caching and Load Balancing both
| belong to the same 'class' of problems at least at some level.
| I'm curious if there's a term that encompasses both.
| dahart wrote:
| The most interesting tidbit, I think, is cut off from the right
| side of the plot -- eventually with enough delay, 1 random choice
| is the winner, for exactly the same reason that 3 random choices
| wins for small delay, and 2 random choices wins for medium delay.
|
| I kinda wonder now if the crossover points between best, 3, 2,
| and 1 are equidistant on a log-scale plot. If I squint a little,
| I could maybe see 4, 16, and 64 being the approximate crossover
| points, especially if there was a little less noise.
| mjb wrote:
| Yes, it's super interesting when systems move into the regime
| where stale information becomes worse than no information at
| all. Part of what makes best-of-2 interesting is how long it
| resists falling into this pit, but it does eventually happen
| (as, I suspect, it must with _any_ distributed placement/load
| balancing algorithm).
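|
| A toy model makes the crossover easy to reproduce (a rough
| sketch; 16 servers at roughly 88% utilization, and the router
| only sees a load snapshot refreshed every `staleness` ticks --
| all parameters are arbitrary):
|
|     import random
|
|     def simulate(k, staleness, n_servers=16, ticks=5_000,
|                  arrivals_per_tick=14):
|         queues = [0] * n_servers
|         snapshot = list(queues)
|         worst = 0
|         for t in range(ticks):
|             if t % staleness == 0:
|                 snapshot = list(queues)   # refresh the (stale) view
|             for _ in range(arrivals_per_tick):
|                 picks = random.sample(range(n_servers), k)
|                 target = min(picks, key=lambda s: snapshot[s])
|                 queues[target] += 1
|             # each server completes one request per tick
|             queues = [max(0, q - 1) for q in queues]
|             worst = max(worst, max(queues))
|         return worst
|
|     for k in (1, 2, 3):
|         print(f"k={k}, worst queue at staleness 1/10/100:",
|               [simulate(k, s) for s in (1, 10, 100)])
|
| With fresh information more choices help; as the snapshot gets
| staler, the higher-k policies herd onto whichever servers
| looked quiet and eventually do worse than plain random.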
| amluto wrote:
| > Using stale data for load balancing leads to a herd behavior,
| where requests will herd toward a previously quiet host for much
| longer than it takes to make that host very busy indeed.
|
| This phenomenon is common with cars using traffic-aware
| navigation, too.
___________________________________________________________________
(page generated 2024-02-08 23:02 UTC)