[HN Gopher] Jepsen: Radix DLT 1.0-Beta.35.1
___________________________________________________________________
Jepsen: Radix DLT 1.0-Beta.35.1
Author : aphyr
Score : 103 points
Date : 2022-02-05 14:17 UTC (8 hours ago)
(HTM) web link (jepsen.io)
(TXT) w3m dump (jepsen.io)
| redwood wrote:
| I have a pretty much opposite perspective on Jepsen than most of
| the folks here. My feeling is that essentially no distributed
| system is perfect or without trade offs (and certainly no single
| node system is perfect and without tradeoffs) and Jepsen posts
| basically make that clear over and over again like some form of
| techie outrage porn... but the tone and implication is as if
| somehow there is some alternative panacea without tradeoffs...
| and I find that a little bit misleading.
|
| Basically an astute reader of Jepsen testing may deduce "I need
| to use a single node system" which is one option but without the
| availability characteristics modern users usually want
| TheDong wrote:
| If you want to look at a Jepsen test which shows a system
| mostly working as designed, the etcd 3.4.3 one
| (https://jepsen.io/analyses/etcd-3.4.3) is a much nicer read.
| Quoting from the discussion section, "etcd 3.4.3 lived up to
| its claims for key-value operations: we observed nothing but
| strict-serializable consistency for reads, writes, and even
| multi-key transactions, during process pauses, crashes, clock
| skew, network partitions, and membership changes"
|
| That's what a successful test looks like. Sure, they found
| correctness issues with the lock api, but they were able to
| lend some confidence to etcd's core api. There might be bugs,
| but jepsen didn't observe any.
|
| Sure, that's not proof that systems don't have tradeoffs, but
| it's proof that jepsen doesn't literally always find
| consistency issues or other core problems. Said another way,
| Jepsen is testing for correctness issues in systems. They
| either find some or they don't. These results are interesting,
| even if they are not a panacea.
|
| There are other tradeoffs, but for most systems, correctness is
| important enough it merits consideration in its own right.
|
| I don't think readers of jepsen misunderstand what's being
| tested or what it means, nor do they misunderstand that there
| are other tradeoffs to consider, so I think your comment is off
| the mark.
|
| By analogy, if we were reading a post about "I load-tested this
| bridge which claims to support 50 tons of weight, and it broke
| at 5 tons", no one would be saying "yeah, but there are
| tradeoffs for bridges. This post just makes it clear that
| there's tradeoffs. If you make the bridge stronger, it would be
| more expensive, and to imply there's not tradeoffs is
| misleading. An astute reader might deduce that they should
| never drive on bridges again".
|
| I don't think that would be a reasonable interpretation of such
| a post, nor do I think the interpretation you espouse here
| portrays an accurate sentiment.
| belter wrote:
| That is not what Jepsen demonstrates. Instead it shows systems
| making claims that are not supported by their implementation
| and/or algorithms. If you want to take on your claim/suggestion
| then those systems just have to stop making the claims he is
| testing for.
|
| PS: There is also another trend, which is systems claiming they
| fixed the issues Jepsen finds, without submitting themselves
| again for analysis...but I digress now...
| redwood wrote:
| I respectfully disagree. Human language and characteristics
| around distributed data stores in the real world have
| inherent ambiguity implicit in them that Jepsen in my view
| pretends isn't the case.
|
| Here's an analogy that may help me communicate how I feel
| since I realize my message is not landing: Let's say I'm
| buying a condo in the San Francisco Bay area. And let's say
| the building that I'm looking to buy in advertises that it
| is historic but seismically retrofitted. Then say Jepsen-
| earthquake-test comes in and shakes the ground beneath the
| building and shows that the building indeed collapses with
| enough of an earthquake: would that or would that not be
| enough information for me to decide whether or not the
| seismically retrofitted building is good enough for my needs?
| There's a lot of ambiguity in answering that question.
| dmitriid wrote:
| > I respectfully disagree. Human language and
| characteristics around distributed data stores in the real
| world have inherent ambiguity implicit in them that Jepsen
| in my view pretends isn't the case.
|
| We've been doing distributed systems for over 50 years now.
| There's nothing ambiguous in either the language Jepsen
| uses or the claims he is examining. The crypto shills chose
| to pretend the language is ambiguous and invent their own
| definitions _on purpose_.
|
| Including the ridiculous "nah, everyone understands that
| when we speak in present tense it means we mean some
| unspecified point in a nebulous future".
| belter wrote:
| Although I agree with your point, on language and its
| ambiguity, I would argue that is a different claim from the
| one you made above, and that I replied to.
|
| When he demonstrated that Riak was dropping 30-70% of
| writes, even with the strongest consistency settings, or
| that Mongo had multiple scenarios of data loss, we are not
| talking about the subtleties of the English idiom
| redwood wrote:
| But the contrived nature of Jepsen tests is so different
| than the real world. In the real world no system behaves
| exactly the same as it was designed to behave; the real
| world has cosmic rays, earthquakes and everything in
| between. So no statement about a software system can ever
| be simply accepted as a fundamental fact and so to prove
| that software systems break is misleading for what most
| users need to understand about said systems.
|
| A Jepsen style test optimized not for bending to show
| where things break but instead for showing likely real
| world style situations with an eval of which are most
| likely to arise would be far more valuable for people
| Rapzid wrote:
| Jepsen is all about verifying claims and communicating
| limitations. Companies hire them for this as due
| diligence. They fix real world bugs based on the results.
|
| > what most users need to understand about said systems
|
| The target audience of these reports is the system
| builders and the software engineers building services on
| top of them; not end-users consuming higher level
| services.
| dmitriid wrote:
| > But the contrived nature of Jepsen tests is so
| different than the real world.
|
| The real world is significantly more "contrived" than
| anything Jepsen can come up with.
|
| Transactions timing out, nodes losing transactions even
| after they've been acknowledged etc.? None of this is
| contrived.
| camgunz wrote:
| You make some reasonable points, but I would say a couple
| things here:
|
| - If we're sticking with your example further up-thread,
| you'd buy a house that was advertised as "can withstand a
| 4.0 magnitude earthquake", that then failed when
| subjected to a 4.0 magnitude earthquake.
|
| - 4.0 magnitude earthquakes happen all of the time [0].
|
| More or less what I'm saying is, engineers generally
| don't think DBs lose data, and when they start coming up
| with ways that might happen (node failure, network
| failure, clock desync), distributed DBs assure them with
| algorithms and configuration knobs. Aphyr puts those
| assurances to the test, which is so, so valuable to us
| all.
|
| It's also worth saying that this space is pretty
| technically complicated. All the DB engineers I know use
| some form of Jepsen-style testing (or Jepsen itself)
| because it's amazingly great.
|
| [0]: https://research.google/pubs/pub40801/
| eklavya wrote:
| We do know that some databases fare much better than
| others and that's useful to many. In your analogy it
| would be many builders claiming their buildings to be
| earthquake proof with only some actually being it. Thanks
| to Jepsen, customers know where to buy.
| TheDong wrote:
| Note, this analogy is slightly off for using the phrase
| "some actually being it". jepsen cannot prove "earthquake
| proof" (correctness).
|
| Rather, it would be "Builders claim their buildings are
| earthquake proof, and jepsen was able to show a subset
| collapse in earthquakes. The rest may or may not be
| earthquake proof".
|
| That's still very valuable. It's really valuable to know
| when something is wrong. It would be more valuable to
| know that something is definitely right (correct /
| "earthquake proof"), but jepsen cannot prove that.
| redwood wrote:
| I'm genuinely curious to hear which database is the
| winner from your perspective based on Jepsen's insight?
| cmiles74 wrote:
| I disagree that the analysis articles from Jepsen, taken as
| a whole, are arguing for a "perfect distributed system without
| trade offs". In my experience these articles are fairly even-
| handed, indeed, many products make some changes based on the
| analysis and then contract for a follow-up. That the "tone and
| implication is as if somehow there is some alternative panacea
| without tradeoffs..." is not something I observed in this
| analysis or the previous ones that I have read.
|
| In my opinion the RethinkDB 2.1.5 report went fairly well for
| that project. If the claims are aligned with the reality of the
| product, it's clear the Jepsen report will highlight that.
|
| https://jepsen.io/analyses/rethinkdb-2-1-5
| ceroxylon wrote:
| Classic blockchain startup in the 2020s: put a cool name on your
| version of sharding, act as if it were already live and being
| used by governments / corporations, and hope enough people put
| rockets in the discord to keep the VC money coming in.
| AtlasBarfed wrote:
| OH YEAH JEPSEN IS TEARING APART CRYPTO!!!!
|
| Aphyr is the best. No matter your fave distributed tech and all
| its CAP-don't-matter stuff, he shows that... CAP does very much
| matter and it's really hard.
|
| Cassandra? Kafka? MongoDB (bwahahahah)? It's all got edge cases.
|
| He should be getting paid a million bucks a year by various
| auditing/accounting firms and the FTC/SEC for validating crypto
| claims. It would be a massive public service.
|
| I like that it's being treated like a database, and that the
| safety/correctness of any blockchain has to be viewed from a
| distributed database standpoint.
| bogomipz wrote:
| Jepsen is most certainly not "tearing apart crypto" nor is that
| even remotely the intention of this post.
|
| From the very first paragraph:
|
| >"This work was funded by Radix Tokens (Jersey) Limited, and
| conducted in collaboration with RDX Works Ltd ..."
|
| The post also links to RDX Works Ltd's blog post on this
| collaboration. Also in the first paragraph.
| wmf wrote:
| That's the best part: they _paid him_ to throw such savage
| shade on their blockchain. One transaction per second, one
| million transactions per second, what's the difference? It's
| just the roadmap, man.
| raphlinus wrote:
| Makes you wonder why they ended the collaboration in
| November and didn't continue to retain him to validate that
| the new stuff lives up to its claims, doesn't it?
| Mleekko wrote:
| It's not a secret at all. The response from the Radix
| CEO: "We re-used the part of his testing harness, and
| then re-created the other critical tests to determine if
| the errors detected were still present. He is
| fantastically expensive and booked 3-6 months in advance
| on average. We'll definitely re-deploy this kind of testing
| again, but we'll save it for another bigger release,
| rather than a patch where the identified errors can be
| tested against."
| raphlinus wrote:
| I look forward to reading that report!
| raphlinus wrote:
| It's funny how the comments here are polarized, some of them
| claiming that Jepsen slaughtered RDX, others that it proved that
| the consensus layer is rock solid.
|
| Let's appreciate this for what it is. Blockchains are, at their
| heart, a type of database (or at the very least a ledger which
| can be the foundation on which some subset of database semantics
| can be layered). Performance and reliability are empirical claims
| which can be tested empirically, using the kind of methodology
| that Jepsen has been innovating for many years. It is very much
| to the credit of RDX Works that they subjected their product to
| this type of testing. I'm not saying anything about the way they
| _use_ the test results in their blog post and marketing
| materials, though.
|
| What I'd like to see going forward is that it's routine for
| blockchain-based databases to be tested the same way as real
| databases, based on actual shipping product rather than
| speculative goals. Whether you think this would be validating or
| devastating reveals quite a bit about your preconceptions, but
| either way would be a win for truth and progress.
| [deleted]
| NelsonMinar wrote:
| I admire aphyr's ability to dive into a new and complex
| distributed systems technology and understand it enough to
| evaluate it for correctness. I hope the Radix developers have
| ears to listen, the comment in the report "RDX Works informed
| Jepsen that the blockchain/DLT community had developed
| idiosyncratic definitions of safety and liveness" is not
| encouraging.
| daenney wrote:
| > To Jepsen's surprise, RDX Works asserted that phenomena such
| as aborted read, intermediate read, and lost writes do not
| constitute safety violations (in the blockchain sense). RDX
| Works claims that to describe these errors as safety violations
| would not be understood by readers from a blockchain
| background; this report is therefore "factually incorrect". On
| these grounds, RDX Works requested that Jepsen delete any
| mention of our findings from the abstract of this report.
|
| That certainly does not inspire any confidence.
|
| > Jepsen respectfully declines to do so.
|
| Thank you for sticking to that.
| belter wrote:
| Jepsen is the Carl Jung of Distributed Systems...
| Mleekko wrote:
| What Radix say is true though.
|
| In private DBs, reads from the DB node are considered
| transactions and need to follow the same rules as writes. But
| on public blockchains(ledgers) only state manipulation is
| what matters. For example, Metamask obtaining an address
| balance would be a transaction, but no one calls it that way
| because it doesn't modify the state.
| daenney wrote:
| Sure. But they could have asked for the additional
| clarifications or context to be added to make this clear,
| instead of requesting a bunch of stuff be removed because
| they're concerned it'll paint them in a bad light.
| faraz85 wrote:
| As I understand from the report, no request was made to
| remove the content but to leave terms like "liveness
| break" and "safety break" out of the abstract until those
| terms were defined in the main report.
| aphyr wrote:
| You understand incorrectly.
| faraz85 wrote:
| Pleased they ignored that request too, although I can see
| where RDX are coming from. In a distributed ledger it's all
| about state. The consensus layer of the architecture is rock
| solid according to this report.
| smw wrote:
| Are you (new account) failing to disclose your conflict of
| interest here?
| faraz85 wrote:
| Happy to share I'm an advocate of this project and follow
| its developments closely
| antocv wrote:
| aphyr wrote:
| The core ledger system lost committed transactions by
| choosing not to write them to disk before acknowledgement.
| Mleekko wrote:
| Yeah, the issue is serious but at the same time: "This
| problem occurred only in cases where every node was
| killed at roughly the same time". And there are 100 nodes
| on the network. (and it is fixed now)
| NelsonMinar wrote:
| Well, Radix says it is fixed now. That's the gist of
| their response to this report, "we fixed a lot of things
| and no one's tested the new code but trust us!"
| Mleekko wrote:
| nope, if you read it one more time, for this particular
| issue Jepsen confirms the fix.
| Smaug123 wrote:
| > The consensus layer of the architecture is rock solid
| according to this report.
|
| That's quite a stretch. The report states explicitly a) the
| usual proviso that they can only prove the presence of
| bugs, not their absence, but more pertinently b) that their
| methodology is more usually applied to lower-latency
| databases, with the implication that they are less
| confident of their conclusions in this new regime:
|
| > Radix's low throughput and high latency may have masked
| safety violations. In particular, our tests required
| several hours to reproduce e.g. aborted read (#13).
|
| Note also that they didn't even attempt to test what
| happens in the presence of malicious nodes!
| vessenes wrote:
| I like the deadpan 'radix claims 1.4mm tps; in testing,
| transactions at more than 1/s can cause slowdowns' - impressive
| lack of eye rolling.
| faraz85 wrote:
| Slightly out of context, the 1.4mm tps test was during an
| earlier iteration of the sharded architecture (coming in 2023)
| whereas Jepsen were testing the unsharded mainnet.
| boardwaalk wrote:
| It may imply they'd need 1.4 million (mm is millimeters...)
| shards to meet that test number, and scale linearly while
| doing it, though.
| coolsunglasses wrote:
| mm is mille mille, meaning a thousand thousand. 1,000 *
| 1,000 is 1 million. 1mm to mean 1 million is common outside
| of the United States.
| gmalette wrote:
| I live outside the United States and have never seen mm
| mean anything other than millimeter
| Smaug123 wrote:
| Meanwhile I live in the UK and it's completely standard
| for monetary quantities.
| boardwaalk wrote:
| I've seen that use for sure. Perhaps a little telling
| we're talking about blockchain tech and that's what
| people are using. Is it really about a fast distributed
| transaction log, or... is it about the money first.
| bastawhiz wrote:
| It's a bold claim to say that your system can handle six
| orders of magnitude more load than it is actually capable of
| because next year you might release some software that does
| better.
| twsted wrote:
| This is really hilarious:
|
| > When asked, RDX Works executives informed Jepsen that
| blockchain/DLT readers would normally understand present-tense
| English statements like these to be referring to potential future
| behavior, rather than the present.
|
| > Jepsen is no stranger to ambitious claims, and aims to
| understand, analyze, and report on systems as they presently
| behave--in the context of their documentation, marketing, and
| community understanding. Jepsen encourages vendors and community
| members alike to take care with this sort of linguistic
| ambiguity.
| faraz85 wrote:
| I like to think most people are able to read marketing material
| in the context of the roadmap to understand the difference in
| performance claimed vs. measured - apples and oranges comes to
| mind
| lostdog wrote:
| Do you have any affiliation with the Radix project that you'd
| like to disclose?
| camgunz wrote:
| Hey did you uh, make an account for this thread?
| hnuser123456 wrote:
| The future is now.
|
| (sorry for relatively low effort comment.)
| cormacrelf wrote:
| I wonder if Finnish startups benefit from a language that
| does not have a future tense. (This is not a Finnish
| startup.)
___________________________________________________________________
(page generated 2022-02-05 23:01 UTC)