[HN Gopher] Scaling Replicated State Machines with Compartmentalization
___________________________________________________________________
Scaling Replicated State Machines with Compartmentalization [pdf]
Author : luu
Score : 19 points
Date : 2021-01-12 07:49 UTC (15 hours ago)
(HTM) web link (mwhittaker.github.io)
(TXT) w3m dump (mwhittaker.github.io)
| electricshampo1 wrote:
| I read through this, and it's unclear to me what the point
| is. Most real-world RSM deployments (Spanner, DynamoDB,
| etc.) are composed of many separate RSMs, one for each
| shard/partition. The requisite performance is readily
| provided by the aggregate throughput of these essentially
| independent RSMs. I don't see any real situations where
| adding hardware to make a single RSM faster makes sense.
| Any ideas?
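The many-independent-RSMs layout described above can be sketched roughly as follows. This is a toy illustration, not any real system's API: each shard stands in for its own consensus group (Raft/Paxos), and every name here (`ShardedStore`, `_shard_for`, etc.) is hypothetical.

```python
import hashlib

class ShardedStore:
    """Sketch of a keyspace split across independent replicated
    state machines: each shard would run its own consensus group,
    and a request only touches the shard that owns its key."""

    def __init__(self, num_shards):
        # Stand-ins for independent RSMs (one Raft/Paxos group
        # each in a real deployment); here just plain dicts.
        self.shards = [{} for _ in range(num_shards)]

    def _shard_for(self, key):
        # Hash-partition the key space so load spreads evenly
        # across shards.
        h = int(hashlib.sha256(key.encode()).hexdigest(), 16)
        return self.shards[h % len(self.shards)]

    def put(self, key, value):
        # Sequenced only by the owning shard's leader, so shard
        # throughputs add up instead of funneling through one log.
        self._shard_for(key)[key] = value

    def get(self, key):
        return self._shard_for(key).get(key)

store = ShardedStore(num_shards=4)
store.put("user:us:42", "alice")
store.put("user:eu:7", "bob")
print(store.get("user:us:42"))  # -> alice
```

The aggregate-throughput argument falls out of the routing: writes to different shards never contend on a single sequencer.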
| mjb wrote:
| Sharding is a great way to scale, but comes with costs. One
| cost is that you need a separate transaction layer "on top of"
| the shards if you want to offer transactions. If you want to
| offer strong isolation (such as snapshot isolation or
| serializability), good performance, and fault tolerance, that
| transaction layer can be quite complex. Scaling a single RSM
| is much cleaner
| conceptually, and (depending on the model) can offer very
| strong semantics (strict serializability would be typical).
| Having a single system image (rather than trying to simulate
| one through distributed transaction protocols) also removes
| complexity in other areas, such as making it easier to do large
| read queries MVCC-style.
|
| I don't think that being able to scale an RSM up makes anything
| possible that wasn't possible before, but it is convenient in
| all kinds of ways.
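A hedged sketch of why that cross-shard transaction layer exists: even the simplest coordinator has to run a prepare/commit round across every shard a transaction touches. This is a toy two-phase commit, with all names hypothetical; real systems also need coordinator recovery, timeouts, and the isolation machinery on top.

```python
class Shard:
    """Toy participant: stages writes on prepare, applies on commit."""
    def __init__(self):
        self.data = {}
        self.staged = {}

    def prepare(self, txid, writes):
        # A real shard would take locks or validate versions and
        # durably log the staged writes before voting yes.
        self.staged[txid] = writes
        return True

    def commit(self, txid):
        self.data.update(self.staged.pop(txid))

    def abort(self, txid):
        self.staged.pop(txid, None)


def two_phase_commit(txid, writes_by_shard):
    """Apply writes atomically across several shards, or not at all."""
    prepared = []
    # Phase 1: every touched shard must vote yes.
    for shard, writes in writes_by_shard.items():
        if shard.prepare(txid, writes):
            prepared.append(shard)
        else:
            # Any "no" vote aborts the transaction everywhere.
            for s in prepared:
                s.abort(txid)
            return False
    # Phase 2: all votes were yes, so commit on every shard.
    for shard in writes_by_shard:
        shard.commit(txid)
    return True

a, b = Shard(), Shard()
ok = two_phase_commit("tx1", {a: {"x": 1}, b: {"y": 2}})
print(ok, a.data, b.data)  # True {'x': 1} {'y': 2}
```

Contrast with the single-RSM case, where the same two writes would simply be adjacent entries in one log, with no coordination protocol at all.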
| electricshampo1 wrote:
| I agree with basically everything you said. However, this
| would only apply in relatively narrow cases where your
| workload can be supported by a single RSM. There is an
| absolute throughput limit on sequencing alone through a
| single leader, even after you apply the tricks mentioned in
| the paper. Geo optimizations (US users' data stored in the
| US, EU users' data stored in the EU, etc.) for latency while
| maintaining a single database view also become challenging
| (impossible?) without sharding. Fault tolerance is also
| slightly improved with many shards, because you expose a
| smaller fraction of your database to transient downtime
| (re-election, catch-up) or slowness when the node containing
| the leader fails or becomes slow.
| hodgesrm wrote:
| Same here. One of the main benefits of sharding is that it
| creates independent scopes for transactions of all kinds, not
| just those managed through Paxos. It's a natural way to
| parallelize everything from transaction processing to backup.
___________________________________________________________________
(page generated 2021-01-12 23:01 UTC)