[HN Gopher] Strict-serializability, but at what cost, for what p...
       ___________________________________________________________________
        
       Strict-serializability, but at what cost, for what purpose?
        
       Author : ingve
       Score  : 32 points
       Date   : 2022-08-04 17:50 UTC (5 hours ago)
        
 (HTM) web link (muratbuffalo.blogspot.com)
 (TXT) w3m dump (muratbuffalo.blogspot.com)
        
       | Gwypaas wrote:
       | Don't really have an axe to grind regarding strict-
       | serializability, serializability, snapshot isolation or whatever.
       | Start at the highest level of consistency and consciously step
       | away if you encounter issues.
       | 
       | The Postgres default of "Read committed" gives me shivers on the
       | other hand...
       | 
        | > If I were to make a concurrent map data structure whose
        | operations are not linearizable and then put it in a library and
        | give it to my users, any user of that library would come back to
       | me and say that my library has a bug. They may not be able to
       | tell me exactly what it is, but they'll understand that there is
       | a bug in it.
       | 
        | > However, if I take the same concurrent map data structure, and
        | put it in an application and call that application a "Key-Value
        | store" or a "Database" (DBMS) and give it to typical database
        | users, it seems they may certainly use it for several decades
       | without ever complaining that this "Database" is not linearizable
       | (or serializable as the DB folks call it).
       | 
       | https://concurrencyfreaks.blogspot.com/2020/06/you-say-read-...
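        | 
        | To make "not linearizable" concrete, a toy sketch (illustrative
        | Python, not any real library's API):
        | 
        |     class StaleMap:
        |         """Toy map where writes are acknowledged before they are
        |         visible, so a read that starts after put() has returned
        |         can still miss the value - a linearizability violation."""
        |         def __init__(self):
        |             self._live = {}     # what readers see
        |             self._pending = {}  # acknowledged, not yet visible
        | 
        |         def put(self, k, v):
        |             self._pending[k] = v   # ack before it is visible
        | 
        |         def get(self, k):
        |             return self._live.get(k)
        | 
        |         def flush(self):           # background "replication"
        |             self._live.update(self._pending)
        | 
        |     m = StaleMap()
        |     m.put("x", 1)                  # put() has returned...
        |     assert m.get("x") is None      # ...yet the read misses it
        |     m.flush()
        |     assert m.get("x") == 1         # only eventually visible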
        
         | WJW wrote:
         | I like my strictness just as much as anyone else, but this
         | seems like a "works in practice" vs "works in theory" issue to
         | me. Linearizability is just not as important as some people
         | make it out to be.
         | 
         | (On a slightly meta level, you can see this happening for a
         | great many issues in programming. Memory safety is undeniably
          | nice, but approximately none of the world's databases or kernels
          | are written in memory-safe languages. Type safety is nice, but a
         | huge majority of frontend code is untyped. Don't even get me
         | started on the value of unit testing vs its absence in academic
         | code)
        
           | Gwypaas wrote:
           | > Linearizability is just not as important as some people
           | make it out to be.
           | 
            | That's my entire point. Start with strong models and weaken
            | them only when you decide you need to. That rules out entire
            | classes of bugs for 99% of your code, and you can closely
            | scrutinize the 1% with weakened guarantees to make sure it
            | works. That is kind of the entire premise behind Rust: make
            | it extremely visible where the dragons may exist, because you
            | have the key to unlock that door.
           | 
            | Regarding memory safety, 70% of CVEs at Microsoft and in
            | Chrome are memory-safety bugs. I would say the choice of
            | languages without memory safety is based more on legacy than
            | on any insightful choice.
           | 
           | https://msrc-blog.microsoft.com/2019/07/16/a-proactive-
           | appro...
           | 
           | https://security.googleblog.com/2022/03/whats-up-with-in-
           | wil...
        
             | WJW wrote:
             | > I would say the choice of languages without memory safety
             | is more based on legacy than some kind of insightful
             | choice.
             | 
             | You say this as if legacy choices are not some of the most
             | persistent choices that exist in technology. I don't
              | disagree that 70% of CVEs are memory-safety related, but
              | my previous point still stands that none of the popular
              | kernels or databases are written in Rust.
             | 
             | A new database developer with a bright idea is basically
             | confronted with the choice to either:
             | 
             | 1. Write it in C and get it merged into Postgres relatively
              | quickly (within a year or so).
             | 
              | 2. Write it in Rust, but also have to implement the rest of
              | an RDBMS and then find a user base.
             | 
             | When confronted with that kind of choice, most devs would
             | choose option 1 every time and I can't blame them for that
             | even though I love memory safety.
        
       | liuliu wrote:
       | Read this post a few days ago. I am trying to articulate my
       | answer properly, mostly from a client-side dev's perspective.
        | Here are my half-baked thoughts:
        | 
        | Strict serializability pretty much guarantees that you don't need
        | any extra business logic to resolve logical states you thought
        | you could never end up in, but somehow did. If your changes go
        | out in order over the wire, you know they will be applied in that
        | order on the other end, whether that end is removed in time or in
        | space.
       | 
        | However, strict serializability solves neither business logic
        | errors nor collaboration issues. If your transaction wrapping is
        | wrong (things that should happen in one transaction got split),
        | or you have a local race condition, SS doesn't help with that. SS
        | doesn't solve collaboration issues because real-time ordering
        | means both of you will end up in the same state, but if naively
        | implemented, that state may not be what you would expect (you
        | still need to resolve conflicts, and how you resolve a conflict
        | has to match the user's expectation).
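        | 
        | A minimal illustration of the "wrong transaction wrapping" case,
        | using SQLite just because it is handy (the same applies under any
        | isolation level, however strong):
        | 
        |     import sqlite3
        | 
        |     db = sqlite3.connect(":memory:", isolation_level=None)
        |     db.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, bal INT)")
        |     db.executemany("INSERT INTO accounts VALUES (?, ?)",
        |                    [("alice", 100), ("bob", 0)])
        | 
        |     # Debit and credit committed as two transactions instead of one.
        |     # No isolation level can help: between the two commits the
        |     # invariant "total balance == 100" is simply false in the DB.
        |     db.execute("BEGIN")
        |     db.execute("UPDATE accounts SET bal = bal - 40 WHERE name='alice'")
        |     db.execute("COMMIT")
        |     total = db.execute("SELECT SUM(bal) FROM accounts").fetchone()[0]
        |     print(total)  # 60 - a perfectly serializable read of a broken state
        |     db.execute("BEGIN")
        |     db.execute("UPDATE accounts SET bal = bal + 40 WHERE name='bob'")
        |     db.execute("COMMIT")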
       | 
        | Then it goes back to what I originally said: what client-side
        | devs are looking for is to not have to write any extra business
        | logic to handle states they know they shouldn't end up in, given
        | that their requests went out to the server in order. For much of
        | this business logic, SS is too strong. A combination of
        | serializability and linearizability, as the author suggested,
        | could be sufficient. It is just hard to articulate what that
        | combination should be across a broad spectrum of business
        | applications.
        
         | truncate wrote:
          | I feel this is kind of similar to the static vs dynamic type
          | system argument. Dynamic type systems can work great if you
          | have strong contracts and everyone understands how to define
          | and maintain them, but as the team grows, you get people with
          | gaps in their understanding of those contracts who eventually
          | break them, leading to bugs. Enforcing the rules sometimes is
          | what keeps them alive in everyone's head, as it is usually
          | harder to get around them.
        
       | jwr wrote:
       | I've spent enough time dealing with problems caused by lack of
       | strict serializability. Theoretical discussions are great, but
       | eventually your data model isn't just a bunch of separate key-
       | value pairs. The usual approach is to avoid problems by
       | redesigning your application and doing gymnastics with your data.
       | But what's the point if we have FoundationDB (which I'm in the
       | process of migrating to)?
        
       | siddontang wrote:
       | Disclosure: The maintainer of TiDB here.
       | 
        | When we started to build the distributed database TiDB, we had a
        | similar discussion - do we need to provide such strong
        | consistency for our customers? Inspired by Google Spanner, we
        | also wanted to let application programmers take it easy and not
        | have to struggle with consistency problems in the application
        | layer by themselves.
       | 
        | So we decided to provide strong consistency. But unlike Google
        | Spanner, we couldn't use the TrueTime API, so we introduced a
        | standalone timestamp service named TSO to allocate timestamps.
        | Using a TSO makes it simple to guarantee the order of all
        | transactions, but it introduces another problem - if TiDB is
        | deployed across different regions, transaction latency is high.
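        | 
        | Very roughly, the idea (a toy sketch, not TiDB's actual code):
        | 
        |     import threading
        | 
        |     class TSO:
        |         """Toy centralized timestamp oracle: hands out strictly
        |         increasing timestamps, giving every transaction in the
        |         cluster a global order. The price is a round trip to this
        |         one service, which hurts for cross-region deployments."""
        |         def __init__(self):
        |             self._lock = threading.Lock()
        |             self._ts = 0
        | 
        |         def next_ts(self):
        |             with self._lock:
        |                 self._ts += 1
        |                 return self._ts
        | 
        |     tso = TSO()
        |     start_ts = tso.next_ts()   # transaction begin
        |     # ... reads / writes tagged with start_ts ...
        |     commit_ts = tso.next_ts()  # transaction commit, globally ordered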
       | 
        | Another example of consistency comes from our realtime analytical
        | processing design. TiDB is an HTAP (hybrid transactional/
        | analytical processing) database. It is normal to keep consistency
        | in the TP part, but is it necessary to keep consistency in the AP
        | part too? If there is an update to a table, does the customer
        | really need to get the latest data from the AP part? Here we
        | still chose "YES": we want our customers to focus on their
        | business and not worry about whether a query result is consistent
        | or not.
       | 
       | Of course, keeping consistency in the database comes at a cost.
       | Now we have been trying our best to optimize the latency and
       | performance problems :sob:
       | 
        | IMO, we chose the right way. We supported strong consistency
        | first, and we can also provide looser consistency for performance
        | later; on the other hand, if we had only built a loosely
        | consistent database at first, it would be hard to provide strong
        | consistency for our customers.
        
       | deathanatos wrote:
       | I'm just sitting here wanting "read-your-writes" from systems.
        
         | richieartoul wrote:
         | Funny enough, this actually cuts to the core of why strict
         | serializability is so important.
         | 
         | "Technically speaking, every read-only transaction could return
         | the empty-set (which was the initial state of the database) and
         | not be in violation of the serializability guarantee --- this
         | is legal since the system can bring read-only transactions
         | "back in time" and get a slot in the serial order before any
         | transactions that write to the database."
         | 
         | https://fauna.com/blog/serializability-vs-strict-serializabi...
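          | 
          | A tiny illustration of that "back in time" loophole (toy
          | Python, not any real database's behavior):
          | 
          |     # Real-time order: T1 writes and commits, THEN T2 (read-only)
          |     # runs. Serializability only requires the result to match
          |     # SOME serial order; [T2, T1] is legal, so T2 may see the
          |     # initial empty state. Strict serializability also requires
          |     # the serial order to respect real time, forcing [T1, T2].
          |     initial = {}
          | 
          |     def T1(state):             # writer
          |         return {**state, "x": 1}
          | 
          |     def T2(state):             # read-only
          |         return state.get("x")
          | 
          |     print(T2(initial))         # None: legal under serializability
          |     print(T2(T1(initial)))     # 1: what strict serializability forces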
        
       | richieartoul wrote:
       | I've had this debate with people numerous times and I think the
       | "you don't need strong consistency" crowd has really gotten it
       | backwards. These same people will also agree with you that
       | premature optimization is bad, but then argue that application
       | developers should be forced to grapple with looser isolation
       | models for vague reasons like "latency" and "costs". Most
       | applications don't push their databases that hard, and the ones
       | that do usually have their most expensive queries as large "read
       | only" queries where strict serializability of other transactions
       | adds little additional overhead.
       | 
        | It's already hard enough to write correct software; we should be
        | pushing hard problems like transaction isolation deep into the
       | database so they can be solved correctly once, instead of 1000s
       | of times incorrectly.
       | 
       | Most applications that depend on strong consistency in their
       | database _could_ be rewritten to require less strict guarantees,
       | but why?
       | 
       | Strict serializability should be the default. Engineers working
       | in performance sensitive settings and who _really_ know what
       | they're doing can always loosen this constraint in the narrow
       | part of the application where it matters.
       | 
        | It's crazy to me that anyone would ask a developer to justify why
        | they need strict serializability. It should be the opposite: you
        | should be forced to justify why you're using a looser isolation
        | level for performance reasons, just like we force developers to
        | justify why they're introducing more performant but less readable
        | code!
        
         | liuliu wrote:
          | The post has more legs than you give it credit for. The author
          | thinks serializability is table stakes, and linearizability can
          | be implemented cheaply (through sharding). Strict
          | serializability, however, cannot really break the "scale-out"
          | barrier without some serious compromises (either you time-bound
          | it (Spanner's 7ms bound), or you use a central ordering service
          | that cannot scale out). It additionally provides a proof that
          | strict-serializable != serializable + linearizable and then
          | goes on to question exactly what you responded to: is strict
          | serializability worth all this trouble?
        
           | aidenn0 wrote:
            | Is scale-out something most services need, though? TFA points
            | out that likely only a subset of the data needs strict
            | serializability; the flip side of that is that likely only a
            | subset of the data needs scale-out.
        
           | richieartoul wrote:
           | In my opinion the answer is unequivocally, yes, it is worth
           | it, and developers should accept nothing less from modern
           | databases. Even with strict serializability, developers will
           | still be surprised by transactional behavior in distributed
           | environments, but at least strict serializability provides a
            | relatively straightforward framework for developers to
            | reason about, with almost no "ifs, ands, or buts" except for
            | some confusing behavior around idempotency and transaction
            | retries. As soon as you drop below strict serializability,
            | the cognitive load of reasoning about your application's
            | correctness increases dramatically. Most developers just
            | accept this, but it's not the right trade-off for most
            | applications IMO.
           | 
            | If this were some theoretical question where we hadn't yet
            | figured out whether it was practical or possible, I would
            | have more sympathy for this line of thinking, but it's not
            | anymore.
           | 
           | Spanner, CockroachDB, and FoundationDB have all proven that
           | these systems can be scaled, and economically so.
           | 
           | FoundationDB scales _incredibly_ well with great performance
           | characteristics. That's evidence enough to me that the cost
           | of strict serializability is worth it for 99% of
           | applications.
        
             | _benedict wrote:
             | CockroachDB is _not_ strict serializable. It is
             | linearizable+serializable+(some other guarantee I forget).
             | 
             | Only Spanner currently offers properly scalable strict
             | serializability, as FoundationDB clusters have a size limit
             | (that is very forgiving, and enough for most use cases).
             | 
             | Apache Cassandra is working on providing scalable (without
             | restriction) strict serializable transactions[1], and they
             | should arrive fairly soon. So far as I am aware, at this
             | time it will be the only distributed database besides
             | Spanner offering fully scalable and fast global
             | transactions with this level of isolation.
             | 
             | [1] https://cwiki.apache.org/confluence/display/CASSANDRA/C
             | EP-15... (Disclaimer) I'm one of the authors
        
               | bitbckt wrote:
               | FaunaDB also offers scalable strict serializability.
               | 
               | Disclosure: I work on FaunaDB.
        
               | _benedict wrote:
                | I believe Fauna inherits some scalability issues from
                | Calvin, in that NxN communication is necessary for
                | transactions to be processed, so a catastrophic failure
                | in one shard can bring the entire database offline. Is
                | that correct?
                | 
                | It may not happen in practice today, but as your clusters
                | grow, your exposure to such a failure increases, and the
                | NxN communication overhead grows with it, does it not?
               | 
               | It's certainly more scalable than other systems offering
               | this isolation today (besides Spanner), but
               | algorithmically at least I believe the proposal we are
               | developing for Cassandra is strictly more scalable, as
               | the cost and failure exposure grow only with the
               | transaction scope, not the cluster.
               | 
               | Not to ding FaunaDB, though. It probably is the most
               | scalable database offering this level of isolation that
               | is deployable on your own hardware - assuming it is? I
               | know it is primarily a managed service, like Spanner.
               | 
               | I'm also not aware how large any real world clusters have
               | gotten with FaunaDB in practice as yet. Do you have any
               | data on that?
        
               | bitbckt wrote:
               | > I believe Fauna inherits some scalability issues from
               | Calvin, in that given NxN communication is necessary for
               | transactions to be processed, a catastrophic failure in
               | one shard can bring the entire database offline. Is that
               | correct?
               | 
               | No, that's not correct. FaunaDB is inspired by Calvin,
               | but is not a direct implementation of Calvin.
               | Transactions are applied independently on each storage
               | host, though checking OCC locks may involve peers within
               | a single replica.
               | 
               | > assuming it is
               | 
               | We used to offer on-premises delivery, but we are
               | strictly a managed service now.
               | 
               | > I'm also not aware how large any real world clusters
               | have gotten with FaunaDB in practice as yet. Do you have
               | any data on that?
               | 
               | I'm not at liberty to say.
        
               | _benedict wrote:
               | > Transactions are applied independently on each storage
               | host
               | 
               | I'm not talking about transaction application, but
               | obtaining your slot in the global transaction log. How
               | does a replica know it isn't missing a transaction from
               | some other shard if that shard is offline come its turn
               | to declare transactions in the log? It must at least
               | receive a message saying no transactions involving it
               | were declared by that shard, no?
               | 
               | This is pretty core to the Calvin approach, unless I
               | misunderstand it.
               | 
               | > I'm not at liberty to say.
               | 
               | I think scalability is something that is a function of
               | both theoretical expectations and practical
               | demonstration.
        
               | bitbckt wrote:
               | > How does a replica know it isn't missing a transaction
               | from some other shard if that shard is offline come its
               | turn to declare transactions in the log? It must at least
               | receive a message saying no transactions involving it
               | were declared by that shard, no?
               | 
               | A transaction is committed to the log with sufficient
               | information necessary for the log (and each storage host)
               | to know whether any particular host is involved in that
               | transaction. There's no need for a message to prove a
               | host's absence from that transaction - it has all it
               | needs in the log to determine that it isn't involved.
               | There is some complexity around how that information maps
               | to the cluster's physical topology, but hosts which
               | aren't involved in a transaction don't process that
               | transaction and need not check with any other host to
               | know that fact.
               | 
               | [ETA] That information is derived from the read set when
               | constructing the transaction to propose to the log.
               | Logical partitions that weren't a part of that read set
               | don't need to know that they weren't involved - that fact
               | is obvious from their absence.
               | 
               | > I think scalability is something that is a function of
               | both theoretical expectations and practical
               | demonstration.
               | 
               | I agree with you, though as a commercial product operated
               | as a service, that practical demonstration is naturally
               | limited in this forum.
        
               | _benedict wrote:
               | You misunderstand my point. A replica may of course
               | safely assume it is not involved in a transaction it
               | hasn't witnessed so long as it has a "complete" log (as
               | in, the portion involving it).
               | 
               | > that fact is obvious from their absence.
               | 
               | Yes, but absence from what? There must be some message
               | that contains the relevant portion of the log declared by
               | each other shard. If that shard is offline, so it cannot
               | declare its transactions at all, how does the system
               | continue? This absence is one step back from the one you
               | are discussing.
               | 
               | If a shard's leader is offline, a new leader must be
               | elected before its slot comes around for processing - and
               | until this happens _all_ transactions in later slots must
               | wait, as it _might_ have declared a transaction that
               | interferes with those a later slot would declare.
               | 
               | If no leader can be elected (because a majority of
               | replicas are offline) then the entire log stops, no?
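                | 
                | To sketch what I mean (a toy model of the Calvin-paper
                | sequencing layer as I read it, not a claim about Fauna's
                | internals):
                | 
                |     def drain_log(epoch_batches, shards):
                |         # An epoch's slice of the global log can only be
                |         # consumed once EVERY shard has declared its
                |         # batch for that epoch (an empty batch still
                |         # counts). A shard that cannot declare stalls
                |         # everything behind it.
                |         ordered = []
                |         for epoch in sorted(epoch_batches):
                |             batches = epoch_batches[epoch]
                |             if set(batches) != set(shards):
                |                 break          # a shard is silent: wait
                |             for shard in shards:
                |                 ordered.extend(batches[shard])
                |         return ordered
                | 
                |     shards = ["s1", "s2", "s3"]
                |     log = {
                |         1: {"s1": ["t1"], "s2": [], "s3": ["t2"]},
                |         2: {"s1": ["t3"], "s3": ["t4"]},  # s2 is down
                |     }
                |     print(drain_log(log, shards))  # ['t1', 't2']; t3/t4 wait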
        
               | bitbckt wrote:
               | It's true: I've had trouble determining what exactly your
               | point is, as you seem to think you know more about
               | FaunaDB's architecture than you do.
               | 
               | You're under the mistaken impression that we have the
               | same data/log architecture as described in the Calvin
               | paper, which requires that every host talk with every
               | other host. That is not true of FaunaDB.
        
               | _benedict wrote:
               | What you describe is a very important distinction from
               | Calvin, as this mechanism for deriving a global log is
               | core to its design. Have you published this new approach
               | in any public venue?
               | 
               | If not, we're now in an unfortunate situation where
               | neither your practical nor theoretical capabilities are
               | public knowledge.
               | 
               | > every host talk with every other host
               | 
               | Only every shard
               | 
               | > you seem to think you know more about FaunaDB's
               | architecture than you do
               | 
               | You are perhaps being uncharitable. The only public
               | statements I am aware of about Fauna's capabilities
               | relate to its Calvin heritage. I have only been asking if
               | my understanding is correct regarding both Calvin and its
               | application to Fauna. However, none of the prior issues
               | you responded to were problems with the original Calvin
               | paper either, at least by my reading.
        
               | bitbckt wrote:
               | LMGTFY:
               | 
               | https://fauna.com/blog/consistency-without-clocks-
               | faunadb-tr...
        
               | richieartoul wrote:
               | You're right about CockroachDB, my mistake.
               | 
               | I think your characterization of Cassandra as "fully
               | scalable" in a way that other systems are not is
               | misleading, but I won't argue it with you :p
        
               | _benedict wrote:
               | I'm not talking about Cassandra being more generally
               | scalable than other systems, only that no system as
               | scalable as Cassandra offers strict serializable
               | transactions besides Spanner (including Cassandra, today)
        
           | mamcx wrote:
            | It is important to take the SIZE into context.
            | 
            | A lot of ink is wrongly spilled because people don't think
            | about the SIZE of the (data/problem).
            | 
            | MOST, if not ALL (99%), of the thinking about this stuff is
            | for the very far end of the scale: terabytes, dozens/
            | thousands of machines, insane latency, etc.
            | 
            | MOST, if not ALL (99%), of the ACTUAL data people deal with,
            | the data that matters, is in the low gigabytes at best. One,
            | maybe two, servers answering in a second(ish) is more than
            | OK.
            | 
            | You can fit most databases on an iPhone with room to spare.
            | 
            | Instead, the real trouble people face is wrong schema/query
            | design, and introducing bugs and other things that the RDBMS
            | thankfully corrects/protects against.
            | 
            | Making the primary store MORE strict is a big win for
            | alleviating the real issues.
        
       | adamnemecek wrote:
       | What's crazy to me is that the idea of linearizability is
       | essentially the same as the idea of fixed-points or
       | dimensionality reduction.
       | 
       | I have been collecting some links on this topic
       | 
       | https://github.com/adamnemecek/adjoint
        
       | someon3 wrote:
        | So you can serialize stuff?
        
       | kevincox wrote:
        | The TL;DR is that every anomaly that your database can experience
        | is one more thing that _every_ developer needs to consider for
        | _every_ change they make. This is a huge overhead. Most
        | importantly, every developer _won't_ correctly consider these
        | anomalies for every change, because humans aren't perfect. This
        | means that you _will_ encounter bugs from time to time.
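        | 
        | For one concrete anomaly of this kind, here is write skew under
        | snapshot isolation, sketched as toy Python (a serializable or
        | strict-serializable database would abort one of the two
        | transactions instead):
        | 
        |     # Invariant: at least one doctor must stay on call. Both
        |     # transactions read the same snapshot (two on call), each
        |     # decides it is safe to leave, and both commit because they
        |     # write *different* rows - the invariant silently breaks.
        |     snapshot = {"alice": "on-call", "bob": "on-call"}
        | 
        |     def go_off_call(who, snap):
        |         on_call = sum(1 for v in snap.values() if v == "on-call")
        |         if on_call >= 2:              # looks safe from my snapshot
        |             return {who: "off-call"}  # write set: only my own row
        |         return {}
        | 
        |     w1 = go_off_call("alice", snapshot)
        |     w2 = go_off_call("bob", snapshot)  # same snapshot, disjoint writes
        |     final = {**snapshot, **w1, **w2}
        |     print(final)                       # nobody is on call any more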
       | 
        | So that is the purpose. If you solve this at the database layer,
        | you relieve your developers of this burden and reduce bugs. You
        | can argue that the cost is too high (it may be), but there is
        | clearly a benefit to balance it against. Especially if the cost
        | keeps dropping (as it seems to be doing), the balance tips ever
        | further in favour of stricter consistency models.
        
       ___________________________________________________________________
       (page generated 2022-08-04 23:01 UTC)