[HN Gopher] Scalable but Wasteful, or why fast replication protocols are slow
___________________________________________________________________
Scalable but Wasteful, or why fast replication protocols are slow
Author : hugofirth
Score : 40 points
Date : 2021-07-12 06:20 UTC (1 days ago)
(HTM) web link (charap.co)
(TXT) w3m dump (charap.co)
| mistralefob wrote:
| So, why?
| Dylan16807 wrote:
| They're fast when the bottleneck is each machine's CPU, but
| slow when the bottleneck is the cluster's total CPU.
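Dylan16807's point can be made concrete with a toy cost model. The numbers and the overhead factor below are my own illustrative assumptions, not from the article: a leader-based protocol bottlenecks on one node but does less total work per operation, while a leaderless one can use every node's capacity at a higher total cost per operation.

```python
# Toy cost model (illustrative assumptions, not from the article).
# Work is measured in abstract "message-handling units" per operation.

def leader_based_throughput(n_nodes, per_node_capacity):
    # The leader exchanges messages with every follower, so it does
    # roughly n-1 units of work per op and saturates first; follower
    # capacity is largely left idle.
    leader_cost_per_op = n_nodes - 1
    return per_node_capacity / leader_cost_per_op

def leaderless_throughput(n_nodes, per_node_capacity, overhead=3.0):
    # Any node can coordinate, so all capacity is usable -- but each op
    # now costs more in total (dependency tracking, larger quorums),
    # modelled here as a constant overhead factor.
    total_capacity = n_nodes * per_node_capacity
    cost_per_op = overhead * (n_nodes - 1)
    return total_capacity / cost_per_op

n, cap = 5, 1000.0
print(leader_based_throughput(n, cap))  # limited by the one leader machine
print(leaderless_throughput(n, cap))    # higher, but burns 3x the CPU per op
```

Under this model the leaderless protocol wins when the constraint is per-machine capacity, yet spends three times the total CPU per operation, which is exactly the "scalable but wasteful" trade-off in the title.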
| eternalban wrote:
| A leaderless protocol is more complex and requires greater IO
| and CPU resources per unit of work. This is the hidden cost.
| The question is whether that cost is reasonably offset by the
| higher throughput leaderless consensus offers. OP is arguing
| that outside of academia the benefit is not deemed worth the
| cost, and that this is why e.g. EPaxos has not been adopted.
| nine_k wrote:
| But doesn't a leaderless protocol also give you more
| resilience against failures? Or, in other words, can it be
| that the higher cost buys you not better throughput but
| faster reconciliation when connectivity is poor? Not in a
| data center but on a mobile network?
| hugofirth wrote:
| Another thing that complicates the Raft/Paxos vs new-consensus-
| algorithm comparisons is caching.
|
| If your Raft state machines are doing IO via some write-through
| cache (which they often are) then having specific machines do
| specific jobs can increase cache quality. I.e. your leader
| node can have a better cache for your write workload, whilst your
| follower nodes can have better caches for your read workload.
|
| This may lead to higher throughput (yay) but then also leave you
| vulnerable to significant slow-downs after leader elections
| (boo).
|
| What makes sense will depend on your use case, but I personally
| agree with the author that multiple simple Raft/Paxos groups
| scheduled across nodes by some workload-aware component might be
| the best of both worlds.
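A minimal sketch of what such a workload-aware component might do. The function name and the one-number-per-group load model are my own assumptions: run many small Raft groups and greedily place each group's leader on the currently least write-loaded node, spreading leader work (and the leader's write-optimised cache) across the cluster.

```python
# Hypothetical scheduler for many small Raft/Paxos groups: place each
# group's leader on the node with the least accumulated write load.
# Modelling each group's workload as a single number is a deliberate
# simplification.

def place_leaders(group_write_load, nodes):
    """group_write_load: {group_id: expected writes/sec}.
    Returns {group_id: node chosen to host that group's leader}."""
    load = {node: 0.0 for node in nodes}
    placement = {}
    # Place the heaviest groups first so the greedy choice balances well.
    for gid, writes in sorted(group_write_load.items(), key=lambda kv: -kv[1]):
        target = min(load, key=load.get)  # least-loaded node so far
        placement[gid] = target
        load[target] += writes
    return placement

# Three groups end up with their leaders on three different nodes:
print(place_leaders({"g1": 10.0, "g2": 7.0, "g3": 5.0}, ["a", "b", "c"]))
```

A real scheduler would also weigh read load, data locality, and re-placement cost after failures, but the greedy shape is the same.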
| LAC-Tech wrote:
| > The protocol presents a leader-less solution, where any node
| can become an opportunistic coordinator for an operation.
|
| Does leader = master here? My first reaction is that this is a
| multi-master system but I can't quite unpack "opportunistic
| coordinator".
| toolslive wrote:
| None of this actually matters. Consensus algorithms allow you to
| achieve consensus. Period. There's no requirement whatsoever on
| what you're getting consensus on. A consensus value could be
| _one_ database update, but it doesn't need to be. It can also
| consist of 666 database transactions across 42 different
| namespaces.
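toolslive's point, sketched in code (the batching helper is illustrative, not any real library's API): the value a consensus round agrees on is opaque to the protocol, so batching many updates into one value amortises the per-round cost.

```python
# Illustrative batching: one consensus round can carry many updates,
# because the protocol treats the agreed-upon value as an opaque blob.

def make_batches(updates, batch_size):
    """Group individual updates into values, each proposed in one round."""
    return [updates[i:i + batch_size]
            for i in range(0, len(updates), batch_size)]

# The 666 transactions from the comment above need only 7 consensus
# rounds at a batch size of 100, instead of 666 one-update rounds.
batches = make_batches(list(range(666)), 100)
print(len(batches))  # 7
```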
| luhn wrote:
| Honestly I think the answer is simpler: people don't _need_
| better algorithms. Paxos and Raft are generally used to build
| service discovery and node coordination; these are not demanding
| workloads, and they are overwhelmingly read-heavy. Even the
| largest deployments can probably be serviced by a set of
| modestly-sized VMs. Paxos and Raft are well-understood algorithms
| with a choice of battle-tested implementations, so why would
| anyone choose differently?
|
| The whole section on "bin-packing Paxos/Raft is more efficient"
| is strange, because people don't generally bin-pack Paxos/Raft --
| the bin-packing orchestrators are themselves built on top of
| Paxos/Raft!
| rdw wrote:
| The hypothesis in the article could be correct (that industry is
| not adopting new academic innovations because they fail in the
| real world). Based on my experience in this industry, though, it
| could just be that there isn't a super strong connection between
| academia and the people implementing these kinds of systems. I've
| had many conversations with my academically-minded friends where
| they're astonished that we haven't jumped on some latest
| innovation, and I have to let them down by saying that the
| problem that paper was addressing is super far down our list of
| fires to put out. Maybe there are places where teams of top-tier
| engineers are free to spend 6 months every year rewriting
| critical core systems to use un-battle-scarred new algorithms that
| might have 20% performance improvements, but most places I've
| worked would achieve the same result for far less money by
| spending 20% more on hardware.
| anonymousDan wrote:
| Yes, for that kind of improvement I think academia would be
| better served by trying to see whether the techniques can be
| incorporated into an existing, well-tested implementation.
___________________________________________________________________
(page generated 2021-07-13 23:00 UTC)