[HN Gopher] Metastable Failures in Distributed Systems [pdf]
___________________________________________________________________
Metastable Failures in Distributed Systems [pdf]
Author : zekrioca
Score : 70 points
Date : 2021-10-04 17:52 UTC (5 hours ago)
(HTM) web link (sigops.org)
(TXT) w3m dump (sigops.org)
| ctlachance wrote:
| This paper introduced me to a new concept in system architecture.
| Thanks for posting it!
| mjb wrote:
| I think this paper is super important, and anybody who designs or
| runs big systems should read it and take the core point to heart.
| As system designers, we're very used to thinking about systems as
| 'stable' and 'unstable', where stability is good and instability
| is bad. What this paper points out is that many kinds of
| distributed systems have multiple 'stable' modes, some of which
| are stable in a control theory sense but do no useful work from
| the client's perspective. This is dangerous, because the system
| won't kick itself out of this 'stable but down' mode without
| something changing: human input, a control plane taking action,
| etc.
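|
| A minimal sketch of such a 'stable but down' mode, not taken from
| the paper: a toy retry-storm model. Every name here (goodput,
| simulate, retry_prob, spike_load) and the overload curve
| capacity^2 / offered are assumptions chosen for illustration, not
| a model of any real system.

```python
def goodput(offered, capacity):
    """Useful work per tick for a given offered load.

    Assumed toy curve: below capacity everything succeeds; above it,
    useful work shrinks as capacity**2 / offered (wasted effort).
    """
    if offered <= capacity:
        return offered
    return capacity * capacity / offered


def simulate(ticks, base_load=60.0, capacity=100.0, retry_prob=0.95,
             spike_ticks=range(0), spike_load=0.0):
    """Return the goodput per tick; failed requests are retried next tick."""
    retries = 0.0
    history = []
    for t in range(ticks):
        extra = spike_load if t in spike_ticks else 0.0  # temporary trigger
        offered = base_load + retries + extra
        served = goodput(offered, capacity)
        # A fraction of the failed requests comes back as retries.
        retries = retry_prob * (offered - served)
        history.append(served)
    return history


healthy = simulate(300)
spiked = simulate(300, spike_ticks=range(10, 20), spike_load=200.0)
print("healthy run, final goodput:", healthy[-1])
print("after a 10-tick spike, final goodput:", round(spiked[-1], 2))
```

| In this sketch both runs end with the same base load, but the run
| that saw a short spike stays at a small fraction of the healthy
| goodput, because retries keep the offered load above capacity. It
| only recovers if something changes, for example lowering
| retry_prob or shedding load.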
|
| I don't think this paper covers anything particularly new, but
| writing it down in this form, with the evidence they present, is
| very valuable. Hopefully this paper will deepen the conversation
| about applying control theory to distributed systems design and
| control problems, and allow a more theoretical approach to be
| taken to the design of these systems to avoid common causes of
| instability and bistability.
|
| One of the authors has a great summary of the paper on his blog:
| http://charap.co/metastable-failures-in-distributed-systems/
|
| I wrote a summary and discussion too:
| https://brooker.co.za/blog/2021/05/24/metastable.html
| dang wrote:
| Discussed 4 months ago:
|
| _Metastable Failures in Distributed Systems_ -
| https://news.ycombinator.com/item?id=27506167 - June 2021 (11
| comments)
|
| ...but on a day like today we dare not mark it as a dupe.
___________________________________________________________________
(page generated 2021-10-04 23:00 UTC)