[HN Gopher] Jepsen: Redpanda 21.10.1
___________________________________________________________________
Jepsen: Redpanda 21.10.1
Author : aphyr
Score : 134 points
Date : 2022-04-29 14:02 UTC (8 hours ago)
(HTM) web link (jepsen.io)
(TXT) w3m dump (jepsen.io)
| dhshshdhdgfff wrote:
| The first half is jepsen team trying to divine some actual
| testable guarantees from a pile of blog posts and a random Google
| doc. What a mess.
| jeffbee wrote:
| Total mess. It's a real indictment of Kafka, more than it is
| anything about redpanda in the first half.
| [deleted]
| excuses_ wrote:
| I wonder if Redpanda thinks about or offers some alternative
| protocol that would be better defined in terms of transaction
| guarantees. At this point it looks like Kafka's protocol was a
| nice try but it needs a major refactoring.
| rystsov wrote:
| Documentation is a bit confusing: the protocol evolved over
| time (new KIPs) and there is a mismatch between the database
| model and the Kafka model. But we see a lot of potential in
| the Kafka transactional protocol.
|
| At Redpanda we were able to push to 5k distributed
| transactions across replicated shards. It would be mind-blowing
| for a database to achieve the same result.
|
| Also, the Kafka transactional protocol works at a low level, so
| it's very easy to build systems on top of it. For example, it's
| very easy to build a Calvin-inspired system
| http://cs.yale.edu/homes/thomson/publications/calvin-
| sigmod1...
| rystsov wrote:
| The mess is mostly the result of the mismatch between the
| classic database transactional model and the Kafka transactional
| model (G0 anomaly). If you read the documentation without a
| database background it seems OK, but once you notice the
| differences between the models it becomes hard to tell whether
| something is a bug or a property of the Kafka protocol.
|
| There is a lot of research happening around this area even in
| the database world. The list of the isolation levels isn't
| final and some of the recent developments include PC-PSI and
| NMSI which also seem to "violate" the order. I hope one day we
| get the formal academic description of the Kafka model. It
| looks very promising.
| btown wrote:
| Are there good research groups or journals to follow to keep
| apprised of the state of the art here?
| rystsov wrote:
| I created this list a while ago:
| https://github.com/redpanda-data/awesome-distributed-
| transac.... Maybe it's time to update it.
|
| Usually I start with a couple of seed papers, then follow
| the references and look at the other papers the authors wrote.
| When a PhD student explores an area they write several
| papers on the topic, so there is a lot of material to read. But
| the real gem is the thesis: it has depth, context and a lot
| of links to other work in the area.
| mandevil wrote:
| I was unfamiliar with Redpanda, and now I know and trust it.
| Whatever marketing budget Redpanda spent to get a Jepsen report
| was well worth it.
| belter wrote:
| Agree. Nowadays, I view anything that has not gone through Jepsen
| with suspicion. It forces me to do triple the technical due
| diligence.
| hardwaresofton wrote:
| One of the clearest indications I've ever seen that prices for a
| service should be raised.
|
| Can we get patio11 in here to say the thing?
| titanomachy wrote:
| Do you have any information on what Jepsen charges? For all
| we know, it could be precisely the right amount.
| agallego wrote:
| kyle is very friendly and I recommend reaching out. we
| can't and wouldn't disclose any pricing that is not public
| information. would be unethical on my part. all i can say
| is we wish to continue our work with him indefinitely as
| long as we keep making progress on the product :)
| agallego wrote:
| interested! :D
| divan wrote:
| I happened to know the Redpanda founder back in the days when he
| was at Concord.io (as a founder and main dev). This guy's level
| of obsession with performance and optimization was insane. He's
| not only extremely skilled with C++, but also very passionate
| about rethinking large and complex systems and rebuilding them
| to enable 10-100x speed improvements. It's like his personal
| hobby: take a piece of software everyone uses and optimize it to
| the limits of physics, usually by implementing a better version
| from scratch himself :) Plus, he's an excellent communicator.
| Watching how their team worked, I always thought that successful
| companies can be built only with that level of passion and
| expertise as a single package.
| debarshri wrote:
| The thing that got my attention was that it has inline transform
| functions that can be added as a wasm binary.
| gigatexal wrote:
| If your DB doesn't pass the Jepsen tests it's not worth using.
| Kudos to both teams.
| newman314 wrote:
| Redpanda (back when they were VectorizedIO) spammed my work email
| after I starred one of their repos, denied it after I called them
| out on it and I just noticed that they had deleted their response
| to me.
|
| Pretty sneaky to go back and delete the tweets first denying and
| then apologizing.
|
| Receipts: https://twitter.com/d11cc3s/status/1447573471152656389
| https://twitter.com/d11cc3s/status/1450906855115354116
| agallego wrote:
| hi newman314 - i mentioned in the tweet this was a mistake and
| offered an apology there, an sdr reached out to you, when i
| realized that i apologize. no ill intent. feel free to test
| this with a fake github account. my tweets automatically delete
| after 6mo, all of them on a rolling window. nothing special
| about this interaction. there is no sneaky-ness, though feel
| free to disagree.
| staticassertion wrote:
| Sounds like you have a personal, singular issue with them that
| I can't imagine anyone else cares about.
| doommius wrote:
| Always great to read this. I performed a Jepsen test on
| Microsoft internal infra and it was a huge insight. From an
| academic side it's just as interesting to look into the lack of
| standards around consistency and its definitions.
| rystsov wrote:
| Cool! What did you test? I played with Jepsen and Cosmos DB
| when I was at Microsoft, but we had to ditch ssh, write a custom
| agent and inject faults with PowerShell cmdlets.
| titanomachy wrote:
| The level of intellectual discipline and competence on display
| here is inspiring.
|
| I'd love to take one of the Jepsen courses, but it seems they're
| offered only as corporate training. Maybe my employer will agree
| to bring them in.
|
| For now I'll have to satisfy myself with the YouTube videos.
| rystsov wrote:
| Hey folks, I was working with Kyle Kingsbury on this report from
| the Redpanda side and I'm happy to help if you have questions
| cgaebel wrote:
| Thanks for working with Jepsen. Being willing to subject your
| product to their testing is a huge boon for Redpanda's
| credibility.
|
| I have two questions:
|
| 1. How surprising were the bugs that Jepsen found?
|
| 2. Besides the obvious regression tests for bugs that Jepsen
| found, how did this report change Redpanda's overall approach
| to testing? Were there classes of tests missing?
| rystsov wrote:
| It wasn't a big surprise for us. Redpanda is a complex
| distributed system with multiple components even at the core
| level (consensus, idempotency, transactions), so we were
| prepared for something to be off (but we were pleased to find
| that all the safety issues were in things that were behind
| feature flags at the time).
|
| Also, we have internal chaos tests, and by the time the
| partnership with Kyle started we had already identified half of
| the consistency issues and sent PRs with fixes. The issues got
| into the report because the fixes weren't released yet when we
| started. This is acknowledged in the report:
|
| > The Redpanda team already had an extensive test suite--
| including fault injection--prior to our collaboration. Their
| work found several serious issues including duplicate writes
| (#3039), inconsistent offsets (#3003), and aborted
| reads/circular information flow (#3036) before Jepsen
| encountered them
|
| We missed the other issues because we hadn't exercised some
| scenarios. As soon as Kyle found the issues we were able to
| reproduce them with the in-house chaos tests and fix them. This
| dual testing approach (Jepsen + existing chaos harness) was
| very beneficial. We were able to check the results and give
| feedback to Kyle on whether he had found a real issue or
| something that looked more like expected behavior.
|
| We fixed all the consistency (safety) issues, but there are
| several unresolved availability dips. We'll stick with Jepsen
| (the framework) until we're sure we've fixed them too. After
| that we'll probably rely just on the in-house tests.
|
| Clojure is a very powerful language and I was truly amazed how
| fast Kyle was able to adjust his tests to new information, but
| we don't have Clojure expertise and even simple tasks take
| time. So it's probably wiser to use what we already know, even
| if it is a bit more verbose.
| polio wrote:
| A complete nit, but the testimonial from the CTO of The Seventh
| Sense on https://redpanda.com/ spells Redpanda as "Redpand".
| northstar702 wrote:
| Thank you. Fixed.
| CJefferson wrote:
| This isn't anything against Redpanda, but I'm always amazed how
| badly all these distributed databases do in Jepsen.
|
| What would one use them for in practice that wouldn't be better
| served by, say, PostgreSQL with streaming replication (the thing
| I've used) in case the server goes down? (I'm not suggesting
| there isn't a good application, just that I'm not knowledgeable
| enough to know of one.)
| agallego wrote:
| totally different approaches tho. people have tried what you
| proposed many times before and for some scale succeeded. hard
| to compare at all when you dig into the details.
|
| expect a companion post. this was super fun to partner with
| kyle on this. +1 would recommend to anyone building a storage
| system.
| jandrewrogers wrote:
| When a distributed database is designed, you must navigate and
| optimize several complex technical tradeoffs to meet the
| architecture and product objectives. The specific set of
| tradeoffs made -- and they are different for every platform --
| will determine the kinds of data models and workloads that the
| database will be suitable for, especially if performance and
| scalability are critical as in this case.
|
| The reason distributed databases tend to be buggy, especially
| in the first iterations, is straightforward if not simple to
| address. While it is convenient to describe technical design
| tradeoffs as a set of discrete, independent things, in real
| implementation they are all interconnected in subtle, complex,
| nuanced ways. Modifying one design tradeoff in code can have
| unanticipated consequences for other intended tradeoffs. In
| other words, there isn't a _set_ of simple tradeoffs, there is
| a single _extremely high-dimensionality_ tradeoff that is being
| optimized. Not only are complex high-dimensionality design
| elements difficult to reason about when writing code the first
| time, any changes to the code may shift how the tradeoffs
| interact in non-obvious ways. Humans have finite cognitive
| budgets, so unless it is obvious that a code change has the
| potential to have unintended side effects, we generally don't
| spend the time to fully verify this fact.
|
| I can't tell you how many times I've seen tiny innocuous code
| changes alter the behavior of distributed databases in
| surprising ways. This is also why once the core code seems to
| be correct, people are reluctant to modify it if that can be
| avoided at all.
| rystsov wrote:
| Different systems solve different problems and have different
| functional characteristics. Actually, one of the things that
| Kyle highlighted in his report is write cycles (the G0 anomaly).
| It isn't a problem with the Redpanda implementation but a
| fundamental property of the Kafka protocol. Records in the Kafka
| protocol don't have preconditions and they don't overwrite each
| other (unlike database operations), so it doesn't make sense
| to enforce an order on the transactions and it's possible to run
| them in parallel. That gives enormous performance benefits and
| doesn't compromise safety.
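
A minimal sketch of the point above, not Redpanda or Kafka code: two hypothetical transactions, T1 and T2, each append one record to two partitions, modelled here as plain Python lists. The appends interleave so that T1 precedes T2 on partition p0 while T2 precedes T1 on p1, which is a write cycle (G0) in database terms, yet every record survives because appends never overwrite one another. The names `append`, `write_order` and `has_write_cycle` are illustrative assumptions.

```python
from itertools import combinations


def append(log, txn, value):
    """Append a record to a partition log; nothing is ever overwritten."""
    log.append((txn, value))


def write_order(log):
    """Return the (earlier, later) transaction pairs implied by one log."""
    txns = [txn for txn, _ in log]
    return {(a, b) for a, b in combinations(txns, 2) if a != b}


def has_write_cycle(logs):
    """True if two logs disagree about the order of some pair of transactions."""
    edges = set()
    for log in logs:
        edges |= write_order(log)
    return any((b, a) in edges for a, b in edges)


# Two partitions, two transactions, interleaved appends.
p0, p1 = [], []
append(p0, "T1", "a")
append(p1, "T2", "x")
append(p0, "T2", "b")
append(p1, "T1", "y")

cycle = has_write_cycle([p0, p1])
values = sorted(v for log in (p0, p1) for _, v in log)
print(cycle, values)
```

In this toy model the cycle is detected, but all four values are still present in the logs, which is the sense in which the ordering anomaly does not by itself lose data.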
| georgelyon wrote:
| I'm constantly surprised more folks don't use FoundationDB, I'm
| pretty sure the Jepsen folks said something to the tune of the
| way FoundationDB is tested is far beyond what Jepsen does (Good
| talk on FDB testing:
| https://www.youtube.com/watch?v=4fFDFbi3toc).
|
| My read is that most use cases just need something that works
| _enough_ at scale that the product doesn't fall over and any
| issues introduced by such bugs can be addressed manually (i.e.
| through customer support, or just sufficient ad-hoc error
| handling). Couple that with the investment some of these
| databases have put into onboarding and developer-acquisition,
| and you have something that can be quite compelling even
| compared to something which is fundamentally more correct.
| staticassertion wrote:
| Having looked at FoundationDB a bit it wasn't clear why I
| would choose it. It has transactions, which is nice, but not
| that big of a deal despite how much time they put into
| talking about it. I actually don't even need transactions
| since all of my writes commute, so it's particularly
| uninteresting to me.
|
| They say they're fast, but I didn't find a ton of information
| about that.
|
| Ultimately the sell seemed to be "nosql with transactions"
| and I just couldn't justify putting more time into it. I did
| watch their excellent talk on testing, and I respect that
| they've put that level of effort into it, and it was why I
| even considered it, but yeah, what am I missing?
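
For readers unfamiliar with the phrase, writes commute when applying them in any order gives the same final state. A minimal sketch, assuming the writes are per-key increments (an illustrative choice, not something stated in the comment); `apply_writes` is a hypothetical helper:

```python
from itertools import permutations


def apply_writes(writes):
    """Apply (key, delta) increments to an empty store and return the result."""
    store = {}
    for key, delta in writes:
        store[key] = store.get(key, 0) + delta
    return store


writes = [("a", 1), ("b", 5), ("a", 2)]

# Apply the same writes in every possible order and compare the outcomes.
results = [apply_writes(order) for order in permutations(writes)]
all_equal = all(r == results[0] for r in results)
print(all_equal, results[0])
```

When every ordering converges like this, there is no conflict for a transaction to arbitrate, which is why transactional guarantees matter less for such a workload.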
| jwr wrote:
| As someone who is switching to FoundationDB: because it's not
| easy. It doesn't look like other databases, it isn't in
| fashion (yes, these things matter), and it requires thinking
| and adapting your application to really use it to its full
| potential. It could also benefit from a bit more developer
| marketing.
|
| But it's the best thing out there.
| claytonjy wrote:
| There's a lot of different ways to answer this, but I think
| about it as a different architectural paradigm. Yes you can do
| stream-ish things with Postgres but at some level of scale
| you'd be putting a square peg in a round hole.
|
| What opened my eyes to this world is this post from Martin
| Kleppmann on turning the database inside out:
| https://martin.kleppmann.com/2015/03/04/turning-the-database...
| antonmry wrote:
| This report seems to have some wrong insights. Auto-committing
| offsets doesn't imply data loss if records are processed
| synchronously. It is the safest way to test Kafka, rather than
| committing offsets manually.
| rystsov wrote:
| Can you clarify what you mean? AFAIK with manual commit you
| have the most control over when the commit happens
|
| Look at this blog post describing a data loss caused by auto-
| commit: https://newrelic.com/blog/best-practices/kafka-
| consumer-conf...
|
| There may also be more subtle issues with auto-commit:
| https://github.com/edenhill/librdkafka/issues/2782
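
The failure mode being debated can be sketched without a broker. This is a toy model, not the Kafka client API: `poll`, `run` and the `auto_commit` flag are hypothetical stand-ins. It assumes auto-commit advances the committed offset to the end of the polled batch before processing has finished (as can happen when records are handed off for asynchronous processing) and that the consumer crashes partway through a batch.

```python
def poll(log, position, max_records=3):
    """Return the next batch of (offset, record) pairs starting at position."""
    return list(enumerate(log))[position:position + max_records]


def run(log, auto_commit, crash_after):
    """Consume until a simulated crash; return (processed, committed offset)."""
    processed, committed, position = [], 0, 0
    while position < len(log):
        batch = poll(log, position)
        position += len(batch)
        if auto_commit:
            # Assumption: committed in the background, before processing completes.
            committed = position
        for offset, record in batch:
            if len(processed) == crash_after:
                return processed, committed  # simulated crash
            processed.append(record)
            if not auto_commit:
                # Manual commit only after the record is fully processed.
                committed = offset + 1
    return processed, committed


log = ["r0", "r1", "r2", "r3", "r4", "r5"]

for auto in (True, False):
    done, committed = run(log, auto_commit=auto, crash_after=4)
    # A restarted consumer resumes from the committed offset, so anything
    # below it that was never processed is skipped for good.
    lost = [r for r in log[:committed] if r not in done]
    print("auto" if auto else "manual", "committed:", committed, "lost:", lost)
```

Under these assumptions the auto-commit run skips the records that were committed but never processed, while the manual run resumes exactly where processing stopped. If processing is strictly synchronous and the commit only ever covers records already handled, the gap in this model does not arise.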
| dstroot wrote:
| > A KafkaConsumer, by contrast, will happily connect to a jar of
| applesauce14 and return successful, empty result sets for every
| call to consumer.poll. This makes it surprisingly difficult to
| tell the difference between "everything is fine and I'm up to
| date" versus "the cluster is on fire", and led to significant
| confusion in our tests.
|
| This tickled my funny bone. Never expected humor in a Jepsen
| writeup. Kudos!
| staticassertion wrote:
| > Never expected humor in a Jepsen writeup
|
| Jepsen reports are often pretty funny, some famously so
| cwillu wrote:
| Wait until you find out why it's called "Jepsen"
| toolz wrote:
| please tell me it has something to do with carly jepsens song
| "call me maybe"
___________________________________________________________________
(page generated 2022-04-29 23:00 UTC)