[HN Gopher] Strict-serializability, but at what cost, for what p...
___________________________________________________________________
Strict-serializability, but at what cost, for what purpose?
Author : ingve
Score : 32 points
Date : 2022-08-04 17:50 UTC (5 hours ago)
(HTM) web link (muratbuffalo.blogspot.com)
(TXT) w3m dump (muratbuffalo.blogspot.com)
| Gwypaas wrote:
| Don't really have an axe to grind regarding strict-
| serializability, serializability, snapshot isolation or whatever.
| Start at the highest level of consistency and consciously step
| away if you encounter issues.
|
| The Postgres default of "Read committed", on the other hand,
| gives me the shivers...
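|
| To make the shivers concrete, here is a tiny hypothetical sketch
| (plain Python standing in for two concurrent sessions, no real
| database) of the lost-update pattern that Read Committed permits
| when you read, modify, and write back without SELECT ... FOR
| UPDATE, and that serializable isolation would abort:
|
|     # Both "sessions" read the balance, add 100 in application
|     # code, and write the result back. Read Committed lets both
|     # commit; one increment silently disappears.
|     balance = {"acct": 0}
|
|     def compute(read_value):
|         return read_value + 100    # SELECT ... then add 100
|
|     t1 = compute(balance["acct"])  # session 1 reads 0
|     t2 = compute(balance["acct"])  # session 2 also reads 0
|     balance["acct"] = t1           # session 1 commits 100
|     balance["acct"] = t2           # session 2 commits 100
|     print(balance["acct"])         # 100, not the expected 200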
|
| > If I were to make a concurrent map data structure whose
| operations are not linearizable and then put it on a library and
| give it to my users, any user of that library would come back to
| me and say that my library has a bug. They may not be able to
| tell me exactly what it is, but they'll understand that there is
| a bug in it.
|
| > However, if I take the same concurrent map data structure, and
| put it in an application and call that application a "Key-Value
| store" or a "Database" (DBMS) and give it to the typical database
| users, it seems they may certainly use it for several decades
| without ever complaining that this "Database" is not linearizable
| (or serializable as the DB folks call it).
|
| https://concurrencyfreaks.blogspot.com/2020/06/you-say-read-...
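|
| To make the quoted scenario concrete, here is a tiny hypothetical
| sketch of a non-linearizable "map": writes land on a primary
| copy, reads are served from a lazily refreshed replica, so a read
| that starts after a put has completed can still miss it:
|
|     class LaggyMap:
|         """Writes hit the primary; reads hit a stale replica."""
|         def __init__(self):
|             self.primary = {}
|             self.replica = {}
|
|         def put(self, k, v):
|             self.primary[k] = v  # the put "completes" here
|
|         def get(self, k):
|             return self.replica.get(k)  # may lag the primary
|
|         def sync(self):
|             self.replica = dict(self.primary)
|
|     m = LaggyMap()
|     m.put("x", 1)
|     print(m.get("x"))  # None: the put already finished, yet a
|                        # later read misses it - not linearizable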
| WJW wrote:
| I like my strictness just as much as anyone else, but this
| seems like a "works in practice" vs "works in theory" issue to
| me. Linearizability is just not as important as some people
| make it out to be.
|
| (On a slightly meta level, you can see this happening for a
| great many issues in programming. Memory safety is undeniably
| nice, but approximately none of the world's databases or kernels
| are written in memory safe languages. Type safety is nice but a
| huge majority of frontend code is untyped. Don't even get me
| started on the value of unit testing vs its absence in academic
| code)
| Gwypaas wrote:
| > Linearizability is just not as important as some people
| make it out to be.
|
| That's my entire point. Start with strong models and weaken
| when you decide you need it. Thus rule out entire classes of
| bugs for 99% of your code and closely scrutinize the 1% with
| weakened guarantees to know that it works. That is kind of
| the entire premise behind Rust. Make it extremely visible
| where the dragons may exist, because you have the key to
| unlock that door.
|
| Regarding memory safety, 70% of CVEs at Microsoft and in
| Chrome are memory safety bugs. I would say the choice of
| languages without memory safety is more based on legacy than
| some kind of insightful choice.
|
| https://msrc-blog.microsoft.com/2019/07/16/a-proactive-
| appro...
|
| https://security.googleblog.com/2022/03/whats-up-with-in-
| wil...
| WJW wrote:
| > I would say the choice of languages without memory safety
| is more based on legacy than some kind of insightful
| choice.
|
| You say this as if legacy choices are not some of the most
| persistent choices that exist in technology. I don't
| disagree that 70% of CVEs are memory safety related, but my
| previous point still stands that none of the popular kernels
| or databases are written in Rust.
|
| A new database developer with a bright idea is basically
| confronted with the choice to either:
|
| 1. Write it in C and get it merged into Postgres relatively
| quickly (within a year or so).
|
| 2. Write it in Rust and also have to implement the rest of
| a RDBMS, then have to find a user base.
|
| When confronted with that kind of choice, most devs would
| choose option 1 every time and I can't blame them for that
| even though I love memory safety.
| liuliu wrote:
| Read this post a few days ago. I am trying to articulate my
| answer properly, mostly from a client-side dev's perspective.
| Here are my half-baked thoughts:
|
| Strict-serializability pretty much guarantees that you don't need
| any extra business logic to resolve logical states you thought
| you couldn't end up in, but it turns out you can. If your changes
| go out in order over the wire, you know they will be applied in
| that order on the other end, whether that end is separated from
| you in time or in space.
|
| However, strict-serializability doesn't solve business logic
| errors or collaboration issues. If you have the wrong transaction
| wrapping (things that should happen in one transaction get
| split), or you have a race condition locally, SS doesn't help
| with that. SS doesn't solve collaboration issues because the
| real-time ordering means both of you will end up in the same
| state, but if naively implemented, that same state may not be
| what you would expect (you still need to resolve conflicts, and
| how you resolve a conflict has to match the user's expectation).
|
| Then it goes back to what I originally said: for client-side
| devs, what they are looking for is not having to write any extra
| business logic to handle states they know they shouldn't end up
| in, as long as their requests went out to the server in order.
| For many of these business cases, SS is too strong. A combination
| of serializability and linearizability, as the author suggested,
| could be sufficient. It is just hard to articulate what that
| combination should be for a broad spectrum of business
| applications.
| truncate wrote:
| I feel this is kind of similar to the static vs dynamic type
| system argument. Dynamic type systems can work great if you have
| strong contracts and everyone understands how to define and
| maintain them, but as the team grows, you get people with gaps
| in their understanding of those contracts who eventually break
| them, leading to bugs. Enforcing rules keeps the rules alive in
| everyone's head, since it is usually harder to get around them.
| jwr wrote:
| I've spent enough time dealing with problems caused by the lack
| of strict serializability. Theoretical discussions are great, but
| eventually your data model isn't just a bunch of separate key-
| value pairs. The usual approach is to avoid problems by
| redesigning your application and doing gymnastics with your data.
| But what's the point if we have FoundationDB (which I'm in the
| process of migrating to)?
| siddontang wrote:
| Disclosure: The maintainer of TiDB here.
|
| When we started to build the distributed database TiDB, we had
| a similar discussion - did we need to provide such strong
| consistency for our customers? Inspired by Google Spanner, we
| also wanted to let application programmers take it easy and not
| struggle with consistency problems in the application layer by
| themselves.
|
| So we decided to provide strong consistency. But unlike Google
| Spanner, we couldn't use the TrueTime API, so we introduced a
| standalone timestamp service named TSO to allocate timestamps.
| Using a TSO makes it simple to guarantee the order of all
| transactions, but it introduces another problem - if TiDB is
| deployed across different regions, transaction latency is high.
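|
| Roughly, the TSO is a central counter that every transaction
| asks for its begin and commit timestamps, which is why each
| cross-region round trip to it shows up directly in transaction
| latency. A toy sketch (not our actual implementation):
|
|     import itertools, threading
|
|     class TimestampOracle:
|         """Hands out strictly increasing timestamps."""
|         def __init__(self):
|             self._counter = itertools.count(1)
|             self._lock = threading.Lock()
|
|         def get_ts(self):
|             with self._lock:  # single allocator = single order
|                 return next(self._counter)
|
|     tso = TimestampOracle()
|     begin_ts = tso.get_ts()   # a network round trip in reality
|     # ... execute the transaction's reads and writes ...
|     commit_ts = tso.get_ts()  # and another round trip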
|
| Another example concerning consistency comes from our realtime
| analytical processing design. TiDB is an HTAP (hybrid
| transactional/analytical processing) database. It is normal to
| keep consistency in the TP part, but for the AP part, is it
| necessary? If there is an update to a table, does the customer
| really need to get the latest data from the AP part? Here we
| still chose "YES": we want our customers to focus on their
| business and not worry about whether the query result is
| consistent or not.
|
| Of course, keeping consistency in the database comes at a cost.
| We have been trying our best to optimize the latency and
| performance problems :sob:
|
| IMO, we chose the right way. We supported strong consistency
| first, and we can provide looser consistency for performance
| later; on the other hand, if we had only built a loosely
| consistent database at first, it would be hard to provide strong
| consistency for our customers.
| deathanatos wrote:
| I'm just sitting here wanting "read-your-writes" from systems.
| richieartoul wrote:
| Funny enough, this actually cuts to the core of why strict
| serializability is so important.
|
| "Technically speaking, every read-only transaction could return
| the empty-set (which was the initial state of the database) and
| not be in violation of the serializability guarantee --- this
| is legal since the system can bring read-only transactions
| "back in time" and get a slot in the serial order before any
| transactions that write to the database."
|
| https://fauna.com/blog/serializability-vs-strict-serializabi...
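|
| A small sketch of that "back in time" loophole (hypothetical
| names, not Fauna's code): plain serializability only needs
| _some_ serial order to exist, so a read-only transaction may be
| ordered before every write; strict serializability additionally
| requires the order to respect real time:
|
|     def respects_real_time(order, commit, start):
|         """Reject serial orders that place a txn before one
|         that had already committed when it started."""
|         for i, a in enumerate(order):
|             for b in order[i + 1:]:
|                 if commit.get(b, float("inf")) < start[a]:
|                     return False
|         return True
|
|     start  = {"T1": 0, "R": 20}  # R begins after T1 committed
|     commit = {"T1": 10}          # read-only R constrains no one
|     print(respects_real_time(["R", "T1"], commit, start))
|     # False: fine under serializability, illegal under strict
|     print(respects_real_time(["T1", "R"], commit, start))
|     # True: the only order strict serializability allows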
| richieartoul wrote:
| I've had this debate with people numerous times and I think the
| "you don't need strong consistency" crowd has really gotten it
| backwards. These same people will also agree with you that
| premature optimization is bad, but then argue that application
| developers should be forced to grapple with looser isolation
| models for vague reasons like "latency" and "costs". Most
| applications don't push their databases that hard, and the ones
| that do usually have their most expensive queries as large "read
| only" queries where strict serializability of other transactions
| adds little additional overhead.
|
| It's already hard enough to write correct software, we should be
| pushing hard problems like transaction isolation deep into the
| database so they can be solved correctly once, instead of 1000s
| of times incorrectly.
|
| Most applications that depend on strong consistency in their
| database _could_ be rewritten to require less strict guarantees,
| but why?
|
| Strict serializability should be the default. Engineers working
| in performance-sensitive settings who _really_ know what
| they're doing can always loosen this constraint in the narrow
| part of the application where it matters.
|
| It's crazy to me that anyone would ask a developer to justify why
| they need strict serializability. It should be the opposite: you
| should be forced to justify why you're using a looser isolation
| level for performance reasons just like we force developers to
| justify why they're introducing more performant, but less
| readable code!
| liuliu wrote:
| The post has more to it than you give it credit for. The author
| thinks serializability is table stakes, and that linearizability
| can be implemented cheaply (through sharding). Strict
| serializability, however, cannot really break the "scale-out"
| barrier without some serious compromises (either you time-bound
| it, as with Spanner's 7ms bound, or you use a central ordering
| service that cannot scale out). It additionally provides a proof
| that strict-serializable != serializable + linearizable, and then
| goes on to ask the question you are responding to: is strict
| serializability worth all this trouble?
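|
| For a toy picture of that barrier (hypothetical code, just to
| make the point concrete): per-shard counters make each shard
| cheaply linearizable on its own, but they say nothing about
| cross-shard real-time order, which is the extra property strict
| serializability demands:
|
|     # Two shards, each with its own commit counter.
|     shard_clock = {"users": 5, "orders": 0}
|
|     def commit_on(shard):
|         shard_clock[shard] += 1
|         return shard_clock[shard]  # shard-local timestamp
|
|     ts_a = commit_on("users")   # commits first in real time: 6
|     ts_b = commit_on("orders")  # starts after A finished: 1
|     print(ts_a > ts_b)  # True: B appears "before" A, violating
|                         # real-time order; closing that gap needs
|                         # a central sequencer or bounded clocks.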
| aidenn0 wrote:
| Is scale-out something most services need, though? TFA points
| out that likely only a subset of the data needs strict
| serializability; the flip side is that likely only a subset of
| the data needs scale-out.
| richieartoul wrote:
| In my opinion the answer is unequivocally, yes, it is worth
| it, and developers should accept nothing less from modern
| databases. Even with strict serializability, developers will
| still be surprised by transactional behavior in distributed
| environments, but at least strict serializability provides a
| relatively straightforward framework for developers to
| reason about with almost no "ifs, ands, or buts" except for
| some confusing behavior around idempotency and transaction
| retries. As soon as you drop below strict serializability,
| the cognitive load of reasoning about your application's
| correctness increases dramatically. Most developers just
| accept this, but it's not the right trade off for most
| applications IMO.
|
| If this were some theoretical question where we hadn't yet
| figured out whether it was practical or possible, I would have
| more sympathy for this line of thinking, but it's not anymore.
|
| Spanner, CockroachDB, and FoundationDB have all proven that
| these systems can be scaled, and economically so.
|
| FoundationDB scales _incredibly_ well with great performance
| characteristics. That's evidence enough to me that the cost
| of strict serializability is worth it for 99% of
| applications.
| _benedict wrote:
| CockroachDB is _not_ strict serializable. It is
| linearizable+serializable+(some other guarantee I forget).
|
| Only Spanner currently offers properly scalable strict
| serializability, as FoundationDB clusters have a size limit
| (that is very forgiving, and enough for most use cases).
|
| Apache Cassandra is working on providing scalable (without
| restriction) strict serializable transactions[1], and they
| should arrive fairly soon. So far as I am aware, at this
| time it will be the only distributed database besides
| Spanner offering fully scalable and fast global
| transactions with this level of isolation.
|
| [1] https://cwiki.apache.org/confluence/display/CASSANDRA/C
| EP-15... (Disclaimer) I'm one of the authors
| bitbckt wrote:
| FaunaDB also offers scalable strict serializability.
|
| Disclosure: I work on FaunaDB.
| _benedict wrote:
| I believe Fauna inherits some scalability issues from
| Calvin, in that, given the NxN communication necessary for
| transactions to be processed, a catastrophic failure in
| one shard can bring the entire database offline. Is that
| correct?
|
| It may not happen in practice today, but as your clusters
| grow your exposure to such a failure is increased, and
| the NxN communication overhead grows, does it not?
|
| It's certainly more scalable than other systems offering
| this isolation today (besides Spanner), but
| algorithmically at least I believe the proposal we are
| developing for Cassandra is strictly more scalable, as
| the cost and failure exposure grow only with the
| transaction scope, not the cluster.
|
| Not to ding FaunaDB, though. It probably is the most
| scalable database offering this level of isolation that
| is deployable on your own hardware - assuming it is? I
| know it is primarily a managed service, like Spanner.
|
| I'm also not aware how large any real world clusters have
| gotten with FaunaDB in practice as yet. Do you have any
| data on that?
| bitbckt wrote:
| > I believe Fauna inherits some scalability issues from
| Calvin, in that given NxN communication is necessary for
| transactions to be processed, a catastrophic failure in
| one shard can bring the entire database offline. Is that
| correct?
|
| No, that's not correct. FaunaDB is inspired by Calvin,
| but is not a direct implementation of Calvin.
| Transactions are applied independently on each storage
| host, though checking OCC locks may involve peers within
| a single replica.
|
| > assuming it is
|
| We used to offer on-premises delivery, but we are
| strictly a managed service now.
|
| > I'm also not aware how large any real world clusters
| have gotten with FaunaDB in practice as yet. Do you have
| any data on that?
|
| I'm not at liberty to say.
| _benedict wrote:
| > Transactions are applied independently on each storage
| host
|
| I'm not talking about transaction application, but
| obtaining your slot in the global transaction log. How
| does a replica know it isn't missing a transaction from
| some other shard if that shard is offline come its turn
| to declare transactions in the log? It must at least
| receive a message saying no transactions involving it
| were declared by that shard, no?
|
| This is pretty core to the Calvin approach, unless I
| misunderstand it.
|
| > I'm not at liberty to say.
|
| I think scalability is something that is a function of
| both theoretical expectations and practical
| demonstration.
| bitbckt wrote:
| > How does a replica know it isn't missing a transaction
| from some other shard if that shard is offline come its
| turn to declare transactions in the log? It must at least
| receive a message saying no transactions involving it
| were declared by that shard, no?
|
| A transaction is committed to the log with sufficient
| information for the log (and each storage host)
| to know whether any particular host is involved in that
| transaction. There's no need for a message to prove a
| host's absence from that transaction - it has all it
| needs in the log to determine that it isn't involved.
| There is some complexity around how that information maps
| to the cluster's physical topology, but hosts which
| aren't involved in a transaction don't process that
| transaction and need not check with any other host to
| know that fact.
|
| [ETA] That information is derived from the read set when
| constructing the transaction to propose to the log.
| Logical partitions that weren't a part of that read set
| don't need to know that they weren't involved - that fact
| is obvious from their absence.
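|
| A toy illustration (not FaunaDB's actual code, just the general
| shape): each log entry names the logical partitions it touches,
| so a host replays only the entries that mention one of its own
| partitions and needs no extra message to learn it was uninvolved
| in the rest:
|
|     log = [
|         {"txn": "T1", "parts": {"p1", "p2"}},
|         {"txn": "T2", "parts": {"p3"}},
|     ]
|
|     my_parts = {"p1"}
|
|     for entry in log:
|         if entry["parts"] & my_parts:
|             pass  # apply this transaction's writes locally
|         # otherwise: absence from entry["parts"] is the proof;
|         # no extra round trip is needed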
|
| > I think scalability is something that is a function of
| both theoretical expectations and practical
| demonstration.
|
| I agree with you, though as a commercial product operated
| as a service, that practical demonstration is naturally
| limited in this forum.
| _benedict wrote:
| You misunderstand my point. A replica may of course
| safely assume it is not involved in a transaction it
| hasn't witnessed so long as it has a "complete" log (as
| in, the portion involving it).
|
| > that fact is obvious from their absence.
|
| Yes, but absence from what? There must be some message
| that contains the relevant portion of the log declared by
| each other shard. If that shard is offline, so it cannot
| declare its transactions at all, how does the system
| continue? This absence is one step back from the one you
| are discussing.
|
| If a shard's leader is offline, a new leader must be
| elected before its slot comes around for processing - and
| until this happens _all_ transactions in later slots must
| wait, as it _might_ have declared a transaction that
| interferes with those a later slot would declare.
|
| If no leader can be elected (because a majority of
| replicas are offline) then the entire log stops, no?
| bitbckt wrote:
| It's true: I've had trouble determining what exactly your
| point is, as you seem to think you know more about
| FaunaDB's architecture than you do.
|
| You're under the mistaken impression that we have the
| same data/log architecture as described in the Calvin
| paper, which requires that every host talk with every
| other host. That is not true of FaunaDB.
| _benedict wrote:
| What you describe is a very important distinction from
| Calvin, as this mechanism for deriving a global log is
| core to its design. Have you published this new approach
| in any public venue?
|
| If not, we're now in an unfortunate situation where
| neither your practical nor theoretical capabilities are
| public knowledge.
|
| > every host talk with every other host
|
| Only every shard
|
| > you seem to think you know more about FaunaDB's
| architecture than you do
|
| You are perhaps being uncharitable. The only public
| statements I am aware of about Fauna's capabilities
| relate to its Calvin heritage. I have only been asking if
| my understanding is correct regarding both Calvin and its
| application to Fauna. However, none of the prior issues
| you responded to were problems with the original Calvin
| paper either, at least by my reading.
| bitbckt wrote:
| LMGTFY:
|
| https://fauna.com/blog/consistency-without-clocks-
| faunadb-tr...
| richieartoul wrote:
| You're right about CockroachDB, my mistake.
|
| I think your characterization of Cassandra as "fully
| scalable" in a way that other systems are not is
| misleading, but I won't argue it with you :p
| _benedict wrote:
| I'm not talking about Cassandra being more generally
| scalable than other systems, only that no system as
| scalable as Cassandra offers strict serializable
| transactions besides Spanner (including Cassandra, today)
| mamcx wrote:
| It is important to keep the SIZE in context.
|
| A lot of the ink spilled here is wrongly interpreted because
| people don't think about the SIZE of the data/problem.
|
| MOST, if not ALL (99%), of the thinking about this stuff is for
| the very far end of the scale: terabytes, dozens/thousands of
| machines, insane latency, etc.
|
| MOST, if not ALL (99%), of the ACTUAL data people deal with, the
| data that matters, is in the low gigabytes at best. One, maybe
| two servers, answering in a second(ish), is more than OK.
|
| You can fit most databases on an iPhone with room to spare.
|
| Instead, the real trouble people face is wrong schema/query
| designs, and introducing bugs and other stuff that the RDBMS
| thankfully corrects/protects against.
|
| Making the primary store MORE strict is a big win for
| alleviating the real issues.
| adamnemecek wrote:
| What's crazy to me is that the idea of linearizability is
| essentially the same as the idea of fixed-points or
| dimensionality reduction.
|
| I have been collecting some links on this topic
|
| https://github.com/adamnemecek/adjoint
| someon3 wrote:
| So you can serialize stuff?
| kevincox wrote:
| The TL;DR is that every anomaly that your database can experience
| is one thing that _every_ developer needs to consider for _every_
| change they make. This is a huge overhead. Most importantly every
| developer _won't_ correctly consider these anomalies for every
| change because humans aren't perfect. This means that you _will_
| encounter bugs from time to time.
|
| So that is the purpose. If you solve this at the database layer
| you relieve your developers of this burden and reduce bugs. You
| can argue that the cost is too high (it may be) but there is
| clearly a benefit to balance it out. Especially if the cost keeps
| dropping (as it seems to be doing), the balance tips ever further
| in favour of stricter consistency models.
___________________________________________________________________
(page generated 2022-08-04 23:01 UTC)