[HN Gopher] PostgreSQL Subtransactions Considered Harmful
       ___________________________________________________________________
        
       PostgreSQL Subtransactions Considered Harmful
        
       Author : samokhvalov
       Score  : 97 points
       Date   : 2021-08-31 22:14 UTC (11 hours ago)
        
 (HTM) web link (postgres.ai)
 (TXT) w3m dump (postgres.ai)
        
       | vvern wrote:
       | cockroachdb implements subtransactions and I don't think any of
       | these issues apply to cockroach's implementation.
        
         | samokhvalov wrote:
         | They have an interesting section, "Beware: don't (over) use
         | nested transactions", in their blog post:
         | https://www.cockroachlabs.com/blog/nested-transactions-in-co...
         | 
         | > ...we do not recommend their use in new applications.
         | 
         | > The reality is that nested transactions are a product of the
         | early days of software engineering, in the 1990s, back when
         | systems were tightly coupled and the Internet and the Cloud
         | were not yet very relevant. ...
         | 
         | > Nowadays, such tight coupling has a bad rep. This is because
         | two additional decades of software engineering have taught us
         | that implicit global state really, really does not play well
         | with distributed services where some components may fail even
         | as part of normal load ...
         | 
         | > Additionally, nested transactions can amplify performance
         | anomalies ...
         | 
         | > Finally, nested transactions can run afoul of correctness in
         | distributed apps. In fact, the idea of multi-component
         | transactions in client code really evokes the idea of a bull in
         | a china shop. As long as all is well and the transaction is due
         | to commit, the idea somewhat makes sense. However, what happens
         | when the database (and not the client) decides the transaction
         | is un-committable and must be aborted, for example because of a
         | serializability conflict or a node restarting for maintenance?
         | It is not just the database state that must be rolled back; all
         | the possible side effects performed by the components holding
         | the transaction must also be rolled back.
         | 
         | They implemented it, but do not recommend using it.
        
           | cakoose wrote:
           | Does anybody here understand the argument they're making? The
           | language is vague and most examples seem to apply to un-
           | nested transactions as well.
           | 
           | > Additionally, nested transactions can amplify performance
           | anomalies...
           | 
           | This is a function of a transaction's duration and volume of
           | writes, right? Does it matter whether there is nesting?
           | 
           | > It is not just the database state that must be rolled back;
           | all the possible side effects performed by the components
           | holding the transaction must also be rolled back.
           | 
           | Doesn't this apply to un-nested transactions as well? Having
           | external API calls run "within" a DB transaction is common
           | source of inconsistency.
        
       | ltbarcly3 wrote:
       | Using subtransactions appropriately is unlikely to trigger any
       | noticeable performance differences, any more than 'having less
       | transactions' overall will.
       | 
       | When doing multiple steps that need to track external state,
       | subtransactions can greatly simplify the schema you need as well
       | as make the database interactions more efficient. You need to
       | safely record progress at each step, and you can't afford to lose
       | previous steps because a transaction gets rolled back
       | unexpectedly. You could start a transaction and commit between
       | each step, but this gets pretty annoying if you need to SELECT
       | FOR UPDATE or otherwise make sure what you are doing is
       | concurrency safe, since this would require you to track all the
       | resources you need as you go, which would greatly complicate the
       | design of your non-sql code. Not to mention the added failure
       | modes of not being able to reacquire those resources after step N
       | for whatever reason.
       | 
       | Where things get stupid is ORMs and frameworks that start sub
       | transactions all over the place when you don't need them and
       | aren't even trying to use them. I'm looking at you, Django
       | transaction.atomic(), you absolute idiot.
        
         | samokhvalov wrote:
         | What does "appropriately" mean in this context?
        
           | ltbarcly3 wrote:
           | Not automatically inserting subtransactions that aren't
           | necessary, especially when you aren't setting your
           | application up to ever actually use them, as a side effect of
           | something else. Like Django's transaction.atomic(), which at
           | the top nesting level starts and commits a transaction, but
           | which any dynamically nested calls start and commit sub
           | transactions. This is almost never what anybody wants, I can
           | count on one hand the number of times I've seen code that
           | handles failed sub transactions and continues the overall
           | transaction, and which most programmers don't even know how
           | it works or what it does.
        
         | SPBS wrote:
         | > When doing multiple steps that need to track external state
         | 
         | Do you have an example of this? Many explanations for SAVEPOINT
         | say it is useful when 'recalculation is deemed too expensive'
         | but I cannot come up with a satisfactory example where this is
         | truly the case.
        
       | Cullinet wrote:
       | I was confused enough by the word subtransactions to ask DDG for
       | help. This discussion is #1 and #3 is
       | https://core.ac.uk/download/pdf/14445799.pdf which is a more
       | direct overview of where subtransactions are at. The paper's
       | title Transactions Design For Databases With High Performance and
       | Availability isn't desperately helpful, but the tabular
       | isolation, concurrency and atomicity survey is.
       | 
       | EDIT :
       | 
       | the very first reference in the above paper A Critique of ANSI
       | isolation levels, which is from 1995 and I have only just started
       | with but looks like important reading, is here
       | https://www.microsoft.com/en-us/research/wp-content/uploads/...
        
       | justinludwig wrote:
       | Lame title, but a really good, in-depth examination of the
       | lurking issues with subtransactions on postgres.
        
       | monkey26 wrote:
       | Really dislike these "considered harmful" type titles.
        
         | [deleted]
        
         | simonw wrote:
         | You may enjoy this classic then:
         | 
         | "Considered Harmful" Essays Considered Harmful
         | https://meyerweb.com/eric/comment/chech.html
        
         | ahachete wrote:
         | I'm the co-author of one of such ones: "Schema Later Considered
         | Harmful" [1]. I don't see what's wrong with this "type titles",
         | may you elaborate? :)
         | 
         | [1] https://www.enterprisedb.com/blog/schema-later-considered-
         | ha...
        
           | ketralnis wrote:
           | It reads to me the same as "is deprecated" or neovim's claim
           | of "literally the future of vim": it makes bold claims about
           | consensus with no support
        
           | stonecolddevin wrote:
           | They're hyperbolic and don't actually give you any useful
           | information on why they're "harmful" or what's wrong with the
           | subject being discussed.
        
             | ahachete wrote:
             | That they are hyperbolic, maybe. I disagree with the rest,
             | I believe there are tons of information on both examples
             | about why they are harmful.
             | 
             | (edit, typo)
        
               | detaro wrote:
               | Not in the _titles_.  "X is harmful" could instead be "X
               | is unsafe", "X is often slow", "X has non-obvious corner
               | cases", "X led us to maintenance hell", ...
               | 
               | "Harmful" is less information and kind of suggests an
               | overall judgement for all cases vs "things to consider"
               | (which was e.g. also a criticism of the original letter,
               | that it lead to rules like "never use goto" which forced
               | people to do bad workarounds instead of discussion on
               | where to use it and where not)
        
               | kbenson wrote:
               | The problem with titles like that is that they don't
               | convey the same thing. "considered harmful" has entered
               | the lexicon at this point, so you know what you're going
               | to get when you see it, which is someone attempting to
               | make a reasoned case about why some behavior that people
               | do and consider okay is actually worse than people may
               | suspect.
               | 
               | In that way, it's much more informative and less likely
               | to have people arguing pedantic points about the title
               | being "correct" than "is harmful" or "is often slow" or
               | "has non-obvious corner cases" or "can create maintenance
               | problems", which might _all_ be items together in a  "is
               | considered harmful" article.
               | 
               | In other words, it's come full circle. There was the
               | first, then there was the numerous copycats, then the
               | backlash, and now we're all the way around to the point
               | there it's a fairly succinct way to describe exactly the
               | type of article people use it on and most people know
               | what it means and what to expect when they see it.
               | 
               | All that's left is for people to realize it's not going
               | anywhere and actually has some beneficial use and that
               | there's no point in complaining about it anymore, but
               | that might be asking too much.
        
               | ahachete wrote:
               | Fair enough. Yet it might be difficult to summarize the
               | tradeoffs succinctly enough for a title.
               | 
               | "Harmful" conveys enough information to raise awareness
               | of a topic which might not be usually considered. Both
               | examples show this pattern: schema-less is often
               | considered a good thing, when it is not (in Stonebraker's
               | and my opinion); also subtransactions are usually
               | considered a good thing, while the OP shows they can have
               | notable negative effects (of diverse nature) in the
               | database.
        
             | smsm42 wrote:
             | You mean, you actually have to read the article?
        
           | Waterluvian wrote:
           | In my opinion:
           | 
           | It's smug.
           | 
           | You're publishing an opinion with a title that suggests an
           | authority has formed a consensus.
        
           | monkey26 wrote:
           | I've seen plenty over the years as a professional developer
           | of 20+ years and they are opinion pieces for the most part.
           | I'd rather see a title along the lines of "Why you should
           | avoid subtransactions".
           | 
           | Drinking bleach is harmful.
        
             | cakoose wrote:
             | The original was Dijkstra's "Go To Statement Considered
             | Harmful". Many considered that title unnecessarily
             | provocative, but it ended up creating a new CS/programming
             | meme.
             | 
             | Now, I see "Considered Harmful" as just a concise way of
             | alluding to that style of article, basically saying "here
             | are the not-well-known downsides of a commonly-used thing".
             | Many "Considered Harmful" articles do not live up to that
             | promise, but I think this one does.
             | 
             | Another title meme: "Zen and the Art of ...".
        
               | unportant wrote:
               | The funny thing is that Dijkstra named this article "A
               | Case Against the Goto Statement". ACM editor didn't find
               | that clickbaity enough and changed the title.
        
               | akvadrako wrote:
               | That original title is so much better. If articles like
               | this would use it instead of the clickbait one it would
               | give them more credability.
        
       | ahachete wrote:
       | This is a deep and excellent post. Thanks for publishing it and
       | all the research required.
       | 
       | Maybe too much 30,000 feet perspective, but in general everything
       | that is "state" in a session is a potential limit to performance
       | and scalability.
       | 
       | In this blog post, it is the "state" associated with
       | subtransactions. In another area, it may be Prepared Statements.
       | Often regarded as better performing than non prepared (and they
       | strictly are, generally), they impose hard limits into connection
       | scalability. Connection poolers cannot hold this state and thus
       | you cannot use transaction pooling, which boosts performance and
       | resource usage much more than prepared vs non-prepared.
       | 
       | Anyway, that's another topic. But session state --is something
       | I'm becoming more and more against. YMMV.
        
       | jakswa wrote:
       | Unique constraints roll your transaction back. Put your `INSERT`
       | in a nested transaction and you can fall back to `SELECT`, and
       | keep going without throwing away previous work. In multi-table,
       | write-heavy workloads, this insert-or-select path begs for nested
       | transactions so you can lean on uniqueness constraints while
       | keeping your transaction afloat. So far I'm not convinced to
       | abandon that pattern by this article, but I'll keep an eye out
       | for a happier path I guess.
        
         | gurjeet wrote:
         | If that's what you want, you can happily use Postgres' UPSERT
         | implementation, and abandon subtransactions very easily.
         | 
         | https://www.postgresql.org/docs/current/sql-insert.html#SQL-...
        
           | jakswa wrote:
           | It's a good idea to revisit, but I think I need SELSERT or
           | something. Are you saying the `DO NOTHING` clause can still
           | result in a row being returned with `RETURNING *`? My
           | impression is that it can't but I'd be excited to be wrong!
           | 
           | edit: Ah, I bet the implicit suggestion was to do `INSERT ...
           | ON CONFLICT DO NOTHING RETURNING *` and if no row was
           | returned, you hope to know/guess the conflict and do an extra
           | `SELECT`. I'll think more on this, thanks.
        
             | crateless wrote:
             | > The SET and WHERE clauses in ON CONFLICT DO UPDATE have
             | access to the existing row using the table's name (or an
             | alias), and to rows proposed for insertion using the
             | special excluded table.
             | 
             | Therefore you can do DO UPDATE SET table_name.column =
             | Excluded.column WHERE table_name.column IS DISTINCT FROM
             | Excluded.column RETURNING *
        
               | jakswa wrote:
               | Am I right to interpret this as "do a non-changing UPDATE
               | so that you get a returned row"? It's another great idea
               | I hadn't considered, thank you. I'm worried this causes
               | the row to get re-written on disk (an actual UPDATE, even
               | though values aren't changing). That could be a moot
               | concern, I'll dig more.
        
               | crateless wrote:
               | Yeah, that's a valid concern and perhaps you might find
               | an answer here
               | https://stackoverflow.com/questions/34708509/how-to-use-
               | retu...
        
       | SPBS wrote:
       | > It is recommended to learn if subtransactions are used in your
       | systems. If they are, it does not immediately mean that they need
       | to be eliminated - it all depends on the risks of your particular
       | case. Evaluating such risks is a complex task that requires deep
       | Postgres expertise, so if you need such an analysis, consider
       | hiring a PostgreSQL expert.
       | 
       | It is quite disheartening that this essentially echoes this quote
       | from the related blog post
       | https://buttondown.email/nelhage/archive/22ab771c-25b4-4cd9-...:
       | 
       | > It is entirely possible to operate Postgres safely and with
       | high performance and throughput at scale ... but essentially the
       | only way to do it is to have ready access to deep PostgreSQL
       | experience and expertise on your team; people who Just Know where
       | the landmines are because they've seen them before.
       | 
       | And https://blog.nelhage.com/post/some-opinionated-sql-takes/:
       | 
       | > My personal choice: MySQL
       | 
       | > The devil is in the details, but for me, as a default, and
       | despite all these pet peeves, I would start with MySQL.
       | 
       | > As for Postgres, I have enormous respect for it and its
       | engineering and capabilities, but, for me, it's just too damn
       | operationally scary. In my experience it's much worse than MySQL
       | for operational footguns and performance cliffs, where using it
       | slightly wrong can utterly tank your performance or availability.
       | In addition, because MySQL is, in my experience, more widely
       | deployed, it's easier to find and hire engineers with experience
       | deploying and operating it. Postgres is a fine choice, especially
       | if you already have expertise using it on your team, but I've
       | personally been burned too many times.
       | 
       | Why should you need expert consultation in order to use core
       | features, when other databases "just work" without all these
       | caveats? Transaction ID exhaustion and VACUUM tuning comes to
       | mind here.
        
       | rattray wrote:
       | Apparently Django uses subtransactions by default when you use
       | atomic=True which can result in scary behavior
        
       | jrockway wrote:
       | I agree with this. I think we get baited into using
       | subtransactions by how we structure our code. Each function feels
       | like a transaction -- it gets its own local variables, and if it
       | fails, it doesn't have any effect on the rest of the program.
       | (Not strictly true, of course, I'm sure some failing functions
       | modify global state, or their receiver.)
       | 
       | We then mindlessly copy that to our database code -- each
       | mutation function takes a "database object", which could be a
       | direct database connection, or it could be an in-progress
       | transaction. It's generic so that you don't have to care.
       | Functions that think their stuff needs to be a transaction just
       | start one, and if it errors out, hey, it's rolled back.
       | 
       | (Whenever you have a "don't care" type, it means half the
       | functions will be documented "// must be run in a transaction"
       | and the other half will pessimistically create a transaction
       | "just in case" it was invoked with a raw database connection
       | instead of a transaction object.)
       | 
       | Thinking about it more critically, 100% of the times I've wanted
       | to write this, I've wanted to abort the parent transaction as
       | soon as the first child fails. I tend to retry transactions, and
       | doing that twice doesn't make a lot of sense (parent transaction
       | starts, calls a helper function, that starts a transaction, it
       | has a conflict and has to be rolled back, helper function is re-
       | run, that ends up committing, parent transaction fails because of
       | a conflict... and the update gets rolled back anyway).
       | 
       | I basically structure my database APIs to take transaction
       | objects, and make each public API member a transactional unit.
       | Then, the very top level creates and commits the transaction, and
       | can add whatever retry logic it deems necessary.
       | 
       | (Using the classic example, TransferFunds() would be public, and
       | addMoney() and withdrawMoney() would be private. That way, the
       | runner of a transaction can't do "doTx(addMoney);
       | doTx(withdrawMoney)", it would be forced to do
       | "doTx(TransferFunds)". And, all three would take a Transaction
       | object instead of a TransactionOrDatabase object, so the type
       | system enforces the transactional expectations of a money
       | transfer operation.)
        
         | spencercw wrote:
         | I do basically the same thing. If a method tries to start a
         | transaction when a transaction is already open then it's
         | immediately rejected.
         | 
         | I don't really like automatic transaction joining behaviour
         | because it makes it hard to reason about the application
         | behaviour. Will this transactional method commit when it
         | returns? It's impossible to tell without looking at who's
         | calling it.
         | 
         | It also encourages annoying behaviour like, oh this method uses
         | the database, better make it @Transactional.
        
       | captaincrowbar wrote:
       | "One of the basic issues with subtransactions is that they
       | increment XID - the global transaction ID."
       | 
       | If the subtransaction is later rolled back, this is starting to
       | sound like the kind of speculative execution information leak
       | that led to the Spectre attack. I wonder if you could base a
       | similar attack on this?
        
       | gigatexal wrote:
       | Why is the XID a 32bit number? Surely we can move to 64bit and
       | punt the wraparound issue a bit?
        
         | jakswa wrote:
         | My fuzzy take on a concern here could be: storage costs and
         | memory usage. My impression is that it is written/copied to
         | actual rows as xmin
         | (https://www.postgresql.org/docs/current/ddl-system-
         | columns.h...). So loading X rows from disk into memory is
         | loading X transaction IDs into memory.
         | 
         | That said, I'm not sure the concern holds water. At 2bil rows
         | we're talking 7.4GiB of disk vs 15GiB. Depending on what % of
         | your database you want to keep in memory, maybe it goes ok.
        
       | eyelidlessness wrote:
       | This isn't wrong at all, but...
       | 
       | 1. A lot of it is heavily dependent on implementation details.
       | (Yeah we have to care about XID, but we shouldn't have to care
       | about it exponentially for nested scope workloads once that scope
       | exits.)
       | 
       | 2. Actual nested transactions could help alleviate that without
       | too much burden on the MVCC implementation. (This would provide a
       | mechanism for isolating and cleaning up scoped XIDs.)
        
       ___________________________________________________________________
       (page generated 2021-09-01 10:01 UTC)