[HN Gopher] Data Consistency Is Overrated
___________________________________________________________________
Data Consistency Is Overrated
Author : bo0tzz
Score : 39 points
Date : 2023-02-18 15:22 UTC (7 hours ago)
(HTM) web link (two-wrongs.com)
(TXT) w3m dump (two-wrongs.com)
| ddulaney wrote:
| I think there are two different kinds of consistency, and it's
| important to not conflate them.
|
| There's consistency that's _internal_ to a system. Do all of the
| foreign keys line up correctly? Have I lost any data that was
| provided to me? Here, we can aspire to be 100% correct. I don't
| think the examples in this article conflict with that.
|
| Then there's consistency that's _external_ to a system. This can
| be between this system and other systems, or between this system
| and reality. Did the operator enter the correct data for this
| item? Did the external system change and fail to tell me? Here
| there is no way to be 100% correct using only the tools that are
| inside the system. You need external audits, reality checks,
| periodic reconciliation.
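|
| A toy illustration of what a periodic reconciliation pass can
| look like (Python; the record shapes are invented for the
| example):
|
|     def reconcile(ours, theirs, key="id"):
|         # Diff our view of the world against an external
|         # source and report what a reconciliation job would
|         # flag for follow-up.
|         mine = {r[key]: r for r in ours}
|         other = {r[key]: r for r in theirs}
|         missing_here = other.keys() - mine.keys()
|         missing_there = mine.keys() - other.keys()
|         mismatched = [k for k in mine.keys() & other.keys()
|                       if mine[k] != other[k]]
|         return missing_here, missing_there, mismatched
|
|     ours = [{"id": 1, "qty": 3}, {"id": 2, "qty": 5}]
|     theirs = [{"id": 1, "qty": 3}, {"id": 2, "qty": 4},
|               {"id": 9, "qty": 1}]
|     print(reconcile(ours, theirs))  # ({9}, set(), [2])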
|
| Critically, all of the author's examples are about external
| consistency (accounting matching reality; inter-system
| communication), but their conclusion seems to be that because
| external consistency can't be fully achieved, it's OK to abandon
| internal consistency within single systems. I think
| that's too strong a conclusion.
| josephg wrote:
| Hm, I'd cut the cake a little differently. I think there are two
| kinds of consistency we can strive for:
|
| - _Strict_ consistency. Every pointer in my b-tree _must_ point
| to a b-tree node, and not random data which could cause the
| program to crash. In a financial world, a bank should never
| print money.
|
| - _Fuzzy_ "good enough" consistency. In the examples in the
| article, all of the financial transactions should end up close
| enough to being reconciled.
|
| There's value in both kinds of consistency. When talking about
| data entered by humans into a database, there's always going to
| be a bit of slop involved. Someone mistyped a digit. A few
| records weren't entered at all.
|
| But when building software systems, designing with strict
| consistency guarantees is such an unbelievably massive win. I
| can't overstate how important it is. The entire ladder of
| abstraction in modern computers from transistors all the way up
| to this website is only possible because each layer
| "underneath" the layer we're standing on is solid and
| deterministic. If CPUs made even 1 error in every billion
| operations, our computers wouldn't boot at all.
|
| Tony Hoare talks about his invention of null pointers as his
| "billion dollar mistake". Null pointers take something that
| should be strictly consistent (references) and make it fuzzy.
| Rust is exciting lots of people in the systems programming
| space because it takes things that are fuzzy in C (aliasing,
| memory management, thread-safe variables, etc) and makes them
| strictly consistent. All the value in unit tests comes from how
| they make our systems more strictly correct.
|
| Every time I've relaxed consistency guarantees internally in
| systems I've worked on (or heard coworkers doing the same)
| we've come to regret it. At a startup several years ago, we
| needed to build an external search index for a database. The
| database updated live - and we had a change feed that updated
| the browsers live as records changed. The engineer in charge
| did a "good enough" job - he wrote a scrappy script that was
| only mostly correct. But it sometimes left the index
| inconsistent with the data. We got constant reports from our
| users about items not showing up in the search results. He
| would dutifully go back and fiddle with things to try and fix
| the problem. Eventually one of our senior engineers went in and
| rewrote the whole indexing script to be strictly correct. We
| never heard a peep about it after that - it just worked, every
| time. Even putting aside the frustration of our users, writing
| it correctly was a big win for us in terms of maintenance. Once
| it was correct, we didn't need to keep pulling engineering time
| away to fix problems.
|
| Maybe data consistency is overrated in databases. But I think
| if anything, consistency internally in computing systems is
| underrated. We take for granted how well computers work. But
| our capacity to make computers _do anything_ depends entirely
| on those consistency guarantees. It seems ridiculous to
| disregard its importance.
| taeric wrote:
| I think I agree with you. That said, internal and external are
| a touch inadequate.
|
| Specifically, for a large enough system, internal consistency
| will look more like external from a smaller system's
| perspective.
|
| To that end, it is all about costs. If the cost of keeping
| things consistent is within the budget, do so.
| gfody wrote:
| including unknowable cost/opportunity risk
| stevesimmons wrote:
| There's a great book "Data and Reality" that discusses these
| subtle but very crucial differences. Discussed here on HN a
| year ago:
|
| https://news.ycombinator.com/item?id=30251747
| GauntletWizard wrote:
| There's a ton of places where foreign keys are used wrong -
| deletion of the referenced row is acceptable, and good error
| handling for that case is the right thing to build anyway.
|
| That's one of the key things non-consistency advocates argue
| for. If you build a music playlist and one of the mp3 files it
| references goes missing, the program should skip that track,
| not crash. I've heard too many people argue that they should
| be allowed to emit nasal demons if there's a data error, and
| they're wrong even in many well-structured and tightly
| integrated datasets, but especially wrong in "web-scale"
| datasets.
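|
| A toy sketch of that skip-don't-crash behaviour (Python; the
| print calls stand in for real playback and logging):
|
|     from pathlib import Path
|
|     def play_playlist(track_paths):
|         # Play every track we can reach; a dangling reference
|         # skips that one track instead of crashing the player.
|         for raw in track_paths:
|             path = Path(raw)
|             if not path.is_file():
|                 print(f"skipping missing track: {path}")
|                 continue
|             print(f"playing {path}")  # stand-in for playback
|
|     play_playlist(["intro.mp3", "deleted-years-ago.mp3"])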
| bcrosby95 wrote:
| Deletion is acceptable, and if you keep everything in a
| consistent system, the reference and the thing it points to will
| either both exist or both be gone.
|
| "Nonconsistency advocacy" doesn't make a lot of sense to me
| here. Are you advocating not relying on consistency in
| systems that guarantee consistency? That is a waste of time.
|
| Are you advocating eschewing consistent systems? Well, then you
| have more work, so I'd only do that if I needed to. And yes, you
| should handle data errors here, because inconsistency is
| consistent with the system you've chosen.
| dgb23 wrote:
| I needed to read this, but I'm not sure what to make of it yet.
|
| The reason is that I've been struggling with the idea that
| (bi-)temporal, consistent data is great, as it provides a ton of
| leverage both for users and for auditing and debugging. For some
| problems it's the cleanest, most general solution.
|
| What irks me is that the code that validates, consumes,
| transforms and displays the data lives in its own time model
| (git).
|
| Philosophically you're not really looking into the past. You're
| looking at an echo of the past that's displayed in a current
| form.
|
| And more practically it's simply tough to draw the line between
| assuring that historical data is still handled and rendered
| correctly and getting rid of overhead and complexity of a growing
| and evolving application.
|
| The article talks about eventual consistency, but I rarely have
| these kinds of problems, and when I do, I fully agree with the
| author, as long as it's very clear to the user whether something
| is consistent and when it will be.
| refset wrote:
| > practically it's simply tough to draw the line between
| assuring that historical data is still handled and rendered
| correctly and getting rid of overhead and complexity of a
| growing and evolving application
|
| Agreed, the issue of maintaining accurate data (bitemporal or
| otherwise) in the context of evolving schema and code feels
| like a real puzzle to solve with commonplace tools like git and
| SQL.
| gorbachev wrote:
| I once worked on a platform producing analytics using data that,
| at its source, was manually typed in by people.
|
| My product managers would insist we do distinct counts on the
| aggregates instead of using probabilistic algorithms, because we
| "needed" absolutely 100% accurate output. No matter how many
| times I explained that the data was never 100% accurate to begin
| with and that the error rate from HyperLogLog wouldn't make an
| ounce of difference, we were never allowed to do it. As a
| consequence the performance of the system when doing distinct
| counts on interactive queries was about 100x worse. This didn't
| seem to bother anyone. I never understood why.
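|
| For illustration, a toy version of the probabilistic counter in
| question (a minimal HyperLogLog in Python, without the
| small-range corrections of the full algorithm; the bucket count
| and hash choice are just for the sketch):
|
|     import hashlib
|
|     class HyperLogLog:
|         # Estimates a distinct count using 2^p counters, with a
|         # relative error of roughly 1.04 / sqrt(2^p).
|         def __init__(self, p=14):
|             self.p = p
|             self.m = 1 << p
|             self.buckets = [0] * self.m
|
|         def add(self, item):
|             digest = hashlib.sha1(str(item).encode()).digest()
|             h = int.from_bytes(digest[:8], "big")
|             idx = h >> (64 - self.p)   # top p bits pick a bucket
|             rest = h & ((1 << (64 - self.p)) - 1)
|             rank = (64 - self.p) - rest.bit_length() + 1
|             self.buckets[idx] = max(self.buckets[idx], rank)
|
|         def count(self):
|             alpha = 0.7213 / (1 + 1.079 / self.m)
|             return int(alpha * self.m * self.m /
|                        sum(2.0 ** -b for b in self.buckets))
|
|     hll = HyperLogLog()
|     for i in range(1_000_000):
|         hll.add(f"user-{i}")
|     print(hll.count())  # typically within ~1% of 1,000,000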
| crazygringo wrote:
| 100x is not that big of a difference, and I've been burned by
| "clever" algorithms before because they weren't coded _exactly_
| right. If they drift into a broken state, nobody can tell, while
| a simple sum is unlikely to fail in subtle ways.
|
| There are benefits to simplicity. If there isn't a reason to
| need 100x faster performance, then why complicate things? That
| would be my why.
| dgb23 wrote:
| That's a great point.
|
| But I want to add that anecdotally users do care about
| performance and they will often thank you for anything that
| is noticeable. Sometimes just because it feels nice, but it
| can also enable a fast feedback loop, which increases
| immersion and productivity.
|
| In a broader sense, we use computers not only because they
| can perform work autonomously but also because they are fast,
| correct and remember details almost perfectly.
| dgb23 wrote:
| I can imagine some reasons:
|
| - They don't care about performance and think their users don't
| care about performance.
|
| - They have sold something that doesn't make sense but sounds
| good.
|
| - They are afraid of their users.
|
| - Their users don't understand the problem you're describing or
| simply don't believe it.
|
| - The business domain of their users actually knows about the
| issue, but they still want or need the exact counts.
| readthenotes1 wrote:
| Wouldn't it be funny if the post changed every few hours?
| LAC-Tech wrote:
| "Network partitioning can completely destroy mutual consistency
| in the worst case, and this fact has led to a certain amount of
| restrictiveness, vagueness, and even nervousness in past
| discussions, of how it may be handled. In some environments it is
| desirable or necessary to permit users to continue modifying
| resources such as files when the network is partitioned. A
| network operating system would be a good example. In such
| environments mutual inconsistency becomes a fact of life which
| must be dealt with."[0]
|
| TL;DR - you either store your data in one place, or you store it
| in multiple places but disallow writes when there's a network
| split and wait for consensus on all nodes... or you have
| inconsistency and you have to deal with it.
|
| [0] _Detection of Mutual Inconsistency in Distributed Systems_,
| 1983. Great paper, fairly readable, and still relevant.
| legulere wrote:
| Data consistency comes at a cost, but eschewing it does as well.
| Most of the time the performance impact of data consistency does
| not matter, but potentially introducing heisenbugs into your
| system can come at a huge cost.
| layer8 wrote:
| And also, eventual consistency means _eventual_ consistency,
| not abandoning consistency.
| StreamBright wrote:
| Data consistency is overrated if your business is ok with that.
| Many businesses are not ok with that. Example: an airline's
| booking process.
| fbdab103 wrote:
| Airlines are notorious for over-booking available seats and
| dealing with the fallout.
| StreamBright wrote:
| Which is done purposefully, and not because of data
| inconsistency at all.
| macintux wrote:
| I would wager, in complete ignorance of the real
| implementation, that the knowledge that they _can_ overbook
| means that they can relax some of their requirements. If
| two servers can't talk to each other to coordinate for a
| while, they could still each sell tickets.
| taeric wrote:
| Bad examples. Airlines are notorious for having incoherent
| booking policies - overselling flights and such.
| crazygringo wrote:
| Overbooking is intentional policy. It has nothing to do with
| data consistency.
|
| To the contrary, consistency is extremely important so that
| they overbook by exactly the right amount, to compensate for
| the statistically expected no-shows.
| taeric wrote:
| It is a form of coherence for the system, though, and in the
| spirit of the same ideas. That is, it is an intentional policy
| for databases, too.
|
| And there is no "exactly right amount" that makes it work.
| They keep options for forcing people off flights if they
| planned it wrong.
|
| It's also why they don't let gate agents oversell a flight. They
| keep a stronger consistency on that, for this exact reason.
| Overselling would fit into what someone else called external
| consistency.
| bjornsing wrote:
| Durability is also overrated.
|
| Jokes aside, I've been responsible for a system that processed ~1
| billion monetary transactions per day. Even with a fanatical
| focus on consistency and correctness, it was still always off.
| With the philosophy promoted in the OP, the chaos would have
| been complete... is my gut feeling at least.
| mousetree wrote:
| Out of interest, what system processed 1 billion monetary
| transactions a day?
| bjornsing wrote:
| Bookkeeping / billing / analytics system for a high volume
| online API.
| playingalong wrote:
| High-frequency something would be my guess;)
| aljungberg wrote:
| Within a closed system, consistency (and verifying it) is a fail
| fast mechanism. For example, it's better to crash on a constraint
| failure when attaching a doodad to a non-existent user account
| than to figure out where all these orphan doodads came from next
| year.
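|
| A minimal illustration of that fail-fast behaviour, using
| SQLite as the closed system (the doodad schema here is made up
| to match the example):
|
|     import sqlite3
|
|     conn = sqlite3.connect(":memory:")
|     conn.execute("PRAGMA foreign_keys = ON")  # off by default
|     conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY)")
|     conn.execute("""CREATE TABLE doodads (
|         id INTEGER PRIMARY KEY,
|         user_id INTEGER NOT NULL REFERENCES users(id))""")
|
|     conn.execute("INSERT INTO users (id) VALUES (1)")
|     conn.execute("INSERT INTO doodads (user_id) VALUES (1)")
|     try:
|         # Attaching a doodad to a non-existent account fails
|         # now, not as an orphan row discovered next year.
|         conn.execute("INSERT INTO doodads (user_id) VALUES (42)")
|     except sqlite3.IntegrityError as exc:
|         print("failed fast:", exc)  # FOREIGN KEY constraint failed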
___________________________________________________________________
(page generated 2023-02-18 23:00 UTC)