[HN Gopher] Taming High Cardinality by sharding a stream
       ___________________________________________________________________
        
       Taming High Cardinality by sharding a stream
        
       Author : trojanalert
       Score  : 20 points
       Date   : 2023-08-21 05:58 UTC (17 hours ago)
        
 (HTM) web link (last9.io)
 (TXT) w3m dump (last9.io)
        
       | thamer wrote:
       | If you're considering sharding a database, please spend some time
       | finding the best key distribution strategy, and don't just use
       | `key % shard_count` as if it were automatically the right way to
       | do it. The values on the left side of the mod operator are often
       | not uniformly distributed, so they will not necessarily spread
       | evenly over the shards.
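
A minimal sketch of how the skew can show up, assuming a made-up workload where IDs are handed out in steps of 4 and there are 8 shards. The names `shard_for`, `keys`, and `shard_count` are illustrative only and do not come from the linked article.

```python
from collections import Counter


def shard_for(key: int, shard_count: int) -> int:
    """Naive placement: the raw key modulo the shard count."""
    return key % shard_count


# Hypothetical keys: 1000 integer IDs allocated in steps of 4.
keys = range(0, 4000, 4)
shard_count = 8

# Count how many keys land on each shard.
load = Counter(shard_for(k, shard_count) for k in keys)
for shard in range(shard_count):
    print(f"shard {shard}: {load[shard]} keys")
```

Because every key is a multiple of 4, only shards 0 and 4 ever receive data and the other six stay empty. Any regularity in the keys that shares a factor with the shard count produces this kind of imbalance.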
       | 
       | Some will add a hash function around the key, but this only
       | addresses part of the problem. For example, if you started with N
       | shards and ever need to add one, all but roughly 1/(N+1) of the
       | keys will have to move to a different shard. And it's not just
       | about growing the number of shards permanently: other maintenance
       | operations, such as replacing a host, can require you to
       | redistribute the data depending on how replication is set up.
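
The resharding cost can be estimated with a short simulation. This is a sketch under assumptions: keys are hypothetical strings of the form `user-<i>`, SHA-256 stands in for whatever hash a real system would use, and the helper names (`stable_hash`, `shard_for_hashed`, `moved_fraction`) are invented for the example.

```python
import hashlib


def stable_hash(key: str) -> int:
    """Deterministic 64-bit hash (Python's built-in hash() is salted per process)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def shard_for_hashed(key: str, shard_count: int) -> int:
    """Hash-then-modulo placement."""
    return stable_hash(key) % shard_count


def moved_fraction(keys, old_count: int, new_count: int) -> float:
    """Fraction of keys whose shard changes when the shard count changes."""
    moved = sum(
        1
        for k in keys
        if shard_for_hashed(k, old_count) != shard_for_hashed(k, new_count)
    )
    return moved / len(keys)


keys = [f"user-{i}" for i in range(20000)]
fraction = moved_fraction(keys, 10, 11)
print(f"{fraction:.1%} of keys change shard when going from 10 to 11 shards")
```

With a well-mixed hash, the expected fraction that stays put is about 1/(N+1), so going from 10 to 11 shards should relocate close to 10/11 of the keys.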
       | 
       | Consistent hashing can often help drastically reduce the number
       | of keys to shuffle, but in any case it's something worth spending
       | some time on early on and getting right rather than having to pay
       | later for having overlooked its impact.
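
For comparison, here is a minimal consistent-hashing sketch: a hash ring with virtual nodes. It is one common way to build such a ring, not a description of any particular database. The `HashRing` class, the `vnodes=100` default, and the `shard-<i>` and `user-<i>` names are assumptions made for the example.

```python
import bisect
import hashlib


def stable_hash(value: str) -> int:
    """Deterministic 64-bit position on the ring."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Toy hash ring; each shard owns `vnodes` points on the ring."""

    def __init__(self, shards, vnodes: int = 100):
        self._vnodes = vnodes
        self._points = []  # sorted list of (position, shard) tuples
        for shard in shards:
            self.add(shard)

    def add(self, shard: str) -> None:
        """Place the shard's virtual nodes on the ring, keeping it sorted."""
        for i in range(self._vnodes):
            bisect.insort(self._points, (stable_hash(f"{shard}#{i}"), shard))

    def lookup(self, key: str) -> str:
        """Return the shard owning the first point at or after the key's position."""
        idx = bisect.bisect_left(self._points, (stable_hash(key),))
        # Wrap around to the first point when the key hashes past the last one.
        return self._points[idx % len(self._points)][1]


keys = [f"user-{i}" for i in range(20000)]
ring = HashRing([f"shard-{i}" for i in range(10)])

before = {k: ring.lookup(k) for k in keys}
ring.add("shard-10")
after = {k: ring.lookup(k) for k in keys}

moved = [k for k in keys if before[k] != after[k]]
fraction = len(moved) / len(keys)
print(f"{fraction:.1%} of keys moved after adding an 11th shard")
```

Only keys that now fall just before one of the new shard's points change owner, and they all move to the new shard. The moved fraction should be in the neighbourhood of 1/11, rather than the roughly 10/11 seen with hash-then-modulo. The exact figure depends on where the virtual nodes happen to land.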
        
       ___________________________________________________________________
       (page generated 2023-08-21 23:01 UTC)