[HN Gopher] PostgreSQL Subtransactions Considered Harmful
___________________________________________________________________
PostgreSQL Subtransactions Considered Harmful
Author : samokhvalov
Score : 97 points
Date : 2021-08-31 22:14 UTC (11 hours ago)
(HTM) web link (postgres.ai)
(TXT) w3m dump (postgres.ai)
| vvern wrote:
| cockroachdb implements subtransactions and I don't think any of
| these issues apply to cockroach's implementation.
| samokhvalov wrote:
| They have an interesting section, "Beware: don't (over) use
| nested transactions", in their blog post:
| https://www.cockroachlabs.com/blog/nested-transactions-in-co...
|
| > ...we do not recommend their use in new applications.
|
| > The reality is that nested transactions are a product of the
| early days of software engineering, in the 1990s, back when
| systems were tightly coupled and the Internet and the Cloud
| were not yet very relevant. ...
|
| > Nowadays, such tight coupling has a bad rep. This is because
| two additional decades of software engineering have taught us
| that implicit global state really, really does not play well
| with distributed services where some components may fail even
| as part of normal load ...
|
| > Additionally, nested transactions can amplify performance
| anomalies ...
|
| > Finally, nested transactions can run afoul of correctness in
| distributed apps. In fact, the idea of multi-component
| transactions in client code really evokes the idea of a bull in
| a china shop. As long as all is well and the transaction is due
| to commit, the idea somewhat makes sense. However, what happens
| when the database (and not the client) decides the transaction
| is un-committable and must be aborted, for example because of a
| serializability conflict or a node restarting for maintenance?
| It is not just the database state that must be rolled back; all
| the possible side effects performed by the components holding
| the transaction must also be rolled back.
|
| They implemented it, but do not recommend using it.
| cakoose wrote:
| Does anybody here understand the argument they're making? The
| language is vague and most examples seem to apply to un-
| nested transactions as well.
|
| > Additionally, nested transactions can amplify performance
| anomalies...
|
| This is a function of a transaction's duration and volume of
| writes, right? Does it matter whether there is nesting?
|
| > It is not just the database state that must be rolled back;
| all the possible side effects performed by the components
| holding the transaction must also be rolled back.
|
| Doesn't this apply to un-nested transactions as well? Having
| external API calls run "within" a DB transaction is common
| source of inconsistency.
| ltbarcly3 wrote:
| Using subtransactions appropriately is unlikely to trigger any
| noticeable performance differences, any more than 'having less
| transactions' overall will.
|
| When doing multiple steps that need to track external state,
| subtransactions can greatly simplify the schema you need as well
| as make the database interactions more efficient. You need to
| safely record progress at each step, and you can't afford to lose
| previous steps because a transaction gets rolled back
| unexpectedly. You could start a transaction and commit between
| each step, but this gets pretty annoying if you need to SELECT
| FOR UPDATE or otherwise make sure what you are doing is
| concurrency safe, since this would require you to track all the
| resources you need as you go, which would greatly complicate the
| design of your non-sql code. Not to mention the added failure
| modes of not being able to reacquire those resources after step N
| for whatever reason.
|
| Where things get stupid is ORMs and frameworks that start sub
| transactions all over the place when you don't need them and
| aren't even trying to use them. I'm looking at you, Django
| transaction.atomic(), you absolute idiot.
| samokhvalov wrote:
| What does "appropriately" mean in this context?
| ltbarcly3 wrote:
| Not automatically inserting subtransactions that aren't
| necessary, especially when you aren't setting your
| application up to ever actually use them, as a side effect of
| something else. Like Django's transaction.atomic(), which at
| the top nesting level starts and commits a transaction, but
| which any dynamically nested calls start and commit sub
| transactions. This is almost never what anybody wants, I can
| count on one hand the number of times I've seen code that
| handles failed sub transactions and continues the overall
| transaction, and which most programmers don't even know how
| it works or what it does.
| SPBS wrote:
| > When doing multiple steps that need to track external state
|
| Do you have an example of this? Many explanations for SAVEPOINT
| say it is useful when 'recalculation is deemed too expensive'
| but I cannot come up with a satisfactory example where this is
| truly the case.
| Cullinet wrote:
| I was confused enough by the word subtransactions to ask DDG for
| help. This discussion is #1 and #3 is
| https://core.ac.uk/download/pdf/14445799.pdf which is a more
| direct overview of where subtransactions are at. The paper's
| title Transactions Design For Databases With High Performance and
| Availability isn't desperately helpful, but the tabular
| isolation, concurrency and atomicity survey is.
|
| EDIT :
|
| the very first reference in the above paper A Critique of ANSI
| isolation levels, which is from 1995 and I have only just started
| with but looks like important reading, is here
| https://www.microsoft.com/en-us/research/wp-content/uploads/...
| justinludwig wrote:
| Lame title, but a really good, in-depth examination of the
| lurking issues with subtransactions on postgres.
| monkey26 wrote:
| Really dislike these "considered harmful" type titles.
| [deleted]
| simonw wrote:
| You may enjoy this classic then:
|
| "Considered Harmful" Essays Considered Harmful
| https://meyerweb.com/eric/comment/chech.html
| ahachete wrote:
| I'm the co-author of one of such ones: "Schema Later Considered
| Harmful" [1]. I don't see what's wrong with this "type titles",
| may you elaborate? :)
|
| [1] https://www.enterprisedb.com/blog/schema-later-considered-
| ha...
| ketralnis wrote:
| It reads to me the same as "is deprecated" or neovim's claim
| of "literally the future of vim": it makes bold claims about
| consensus with no support
| stonecolddevin wrote:
| They're hyperbolic and don't actually give you any useful
| information on why they're "harmful" or what's wrong with the
| subject being discussed.
| ahachete wrote:
| That they are hyperbolic, maybe. I disagree with the rest,
| I believe there are tons of information on both examples
| about why they are harmful.
|
| (edit, typo)
| detaro wrote:
| Not in the _titles_. "X is harmful" could instead be "X
| is unsafe", "X is often slow", "X has non-obvious corner
| cases", "X led us to maintenance hell", ...
|
| "Harmful" is less information and kind of suggests an
| overall judgement for all cases vs "things to consider"
| (which was e.g. also a criticism of the original letter,
| that it lead to rules like "never use goto" which forced
| people to do bad workarounds instead of discussion on
| where to use it and where not)
| kbenson wrote:
| The problem with titles like that is that they don't
| convey the same thing. "considered harmful" has entered
| the lexicon at this point, so you know what you're going
| to get when you see it, which is someone attempting to
| make a reasoned case about why some behavior that people
| do and consider okay is actually worse than people may
| suspect.
|
| In that way, it's much more informative and less likely
| to have people arguing pedantic points about the title
| being "correct" than "is harmful" or "is often slow" or
| "has non-obvious corner cases" or "can create maintenance
| problems", which might _all_ be items together in a "is
| considered harmful" article.
|
| In other words, it's come full circle. There was the
| first, then there was the numerous copycats, then the
| backlash, and now we're all the way around to the point
| there it's a fairly succinct way to describe exactly the
| type of article people use it on and most people know
| what it means and what to expect when they see it.
|
| All that's left is for people to realize it's not going
| anywhere and actually has some beneficial use and that
| there's no point in complaining about it anymore, but
| that might be asking too much.
| ahachete wrote:
| Fair enough. Yet it might be difficult to summarize the
| tradeoffs succinctly enough for a title.
|
| "Harmful" conveys enough information to raise awareness
| of a topic which might not be usually considered. Both
| examples show this pattern: schema-less is often
| considered a good thing, when it is not (in Stonebraker's
| and my opinion); also subtransactions are usually
| considered a good thing, while the OP shows they can have
| notable negative effects (of diverse nature) in the
| database.
| smsm42 wrote:
| You mean, you actually have to read the article?
| Waterluvian wrote:
| In my opinion:
|
| It's smug.
|
| You're publishing an opinion with a title that suggests an
| authority has formed a consensus.
| monkey26 wrote:
| I've seen plenty over the years as a professional developer
| of 20+ years and they are opinion pieces for the most part.
| I'd rather see a title along the lines of "Why you should
| avoid subtransactions".
|
| Drinking bleach is harmful.
| cakoose wrote:
| The original was Dijkstra's "Go To Statement Considered
| Harmful". Many considered that title unnecessarily
| provocative, but it ended up creating a new CS/programming
| meme.
|
| Now, I see "Considered Harmful" as just a concise way of
| alluding to that style of article, basically saying "here
| are the not-well-known downsides of a commonly-used thing".
| Many "Considered Harmful" articles do not live up to that
| promise, but I think this one does.
|
| Another title meme: "Zen and the Art of ...".
| unportant wrote:
| The funny thing is that Dijkstra named this article "A
| Case Against the Goto Statement". ACM editor didn't find
| that clickbaity enough and changed the title.
| akvadrako wrote:
| That original title is so much better. If articles like
| this would use it instead of the clickbait one it would
| give them more credability.
| ahachete wrote:
| This is a deep and excellent post. Thanks for publishing it and
| all the research required.
|
| Maybe too much 30,000 feet perspective, but in general everything
| that is "state" in a session is a potential limit to performance
| and scalability.
|
| In this blog post, it is the "state" associated with
| subtransactions. In another area, it may be Prepared Statements.
| Often regarded as better performing than non prepared (and they
| strictly are, generally), they impose hard limits into connection
| scalability. Connection poolers cannot hold this state and thus
| you cannot use transaction pooling, which boosts performance and
| resource usage much more than prepared vs non-prepared.
|
| Anyway, that's another topic. But session state --is something
| I'm becoming more and more against. YMMV.
| jakswa wrote:
| Unique constraints roll your transaction back. Put your `INSERT`
| in a nested transaction and you can fall back to `SELECT`, and
| keep going without throwing away previous work. In multi-table,
| write-heavy workloads, this insert-or-select path begs for nested
| transactions so you can lean on uniqueness constraints while
| keeping your transaction afloat. So far I'm not convinced to
| abandon that pattern by this article, but I'll keep an eye out
| for a happier path I guess.
| gurjeet wrote:
| If that's what you want, you can happily use Postgres' UPSERT
| implementation, and abandon subtransactions very easily.
|
| https://www.postgresql.org/docs/current/sql-insert.html#SQL-...
| jakswa wrote:
| It's a good idea to revisit, but I think I need SELSERT or
| something. Are you saying the `DO NOTHING` clause can still
| result in a row being returned with `RETURNING *`? My
| impression is that it can't but I'd be excited to be wrong!
|
| edit: Ah, I bet the implicit suggestion was to do `INSERT ...
| ON CONFLICT DO NOTHING RETURNING *` and if no row was
| returned, you hope to know/guess the conflict and do an extra
| `SELECT`. I'll think more on this, thanks.
| crateless wrote:
| > The SET and WHERE clauses in ON CONFLICT DO UPDATE have
| access to the existing row using the table's name (or an
| alias), and to rows proposed for insertion using the
| special excluded table.
|
| Therefore you can do DO UPDATE SET table_name.column =
| Excluded.column WHERE table_name.column IS DISTINCT FROM
| Excluded.column RETURNING *
| jakswa wrote:
| Am I right to interpret this as "do a non-changing UPDATE
| so that you get a returned row"? It's another great idea
| I hadn't considered, thank you. I'm worried this causes
| the row to get re-written on disk (an actual UPDATE, even
| though values aren't changing). That could be a moot
| concern, I'll dig more.
| crateless wrote:
| Yeah, that's a valid concern and perhaps you might find
| an answer here
| https://stackoverflow.com/questions/34708509/how-to-use-
| retu...
| SPBS wrote:
| > It is recommended to learn if subtransactions are used in your
| systems. If they are, it does not immediately mean that they need
| to be eliminated - it all depends on the risks of your particular
| case. Evaluating such risks is a complex task that requires deep
| Postgres expertise, so if you need such an analysis, consider
| hiring a PostgreSQL expert.
|
| It is quite disheartening that this essentially echoes this quote
| from the related blog post
| https://buttondown.email/nelhage/archive/22ab771c-25b4-4cd9-...:
|
| > It is entirely possible to operate Postgres safely and with
| high performance and throughput at scale ... but essentially the
| only way to do it is to have ready access to deep PostgreSQL
| experience and expertise on your team; people who Just Know where
| the landmines are because they've seen them before.
|
| And https://blog.nelhage.com/post/some-opinionated-sql-takes/:
|
| > My personal choice: MySQL
|
| > The devil is in the details, but for me, as a default, and
| despite all these pet peeves, I would start with MySQL.
|
| > As for Postgres, I have enormous respect for it and its
| engineering and capabilities, but, for me, it's just too damn
| operationally scary. In my experience it's much worse than MySQL
| for operational footguns and performance cliffs, where using it
| slightly wrong can utterly tank your performance or availability.
| In addition, because MySQL is, in my experience, more widely
| deployed, it's easier to find and hire engineers with experience
| deploying and operating it. Postgres is a fine choice, especially
| if you already have expertise using it on your team, but I've
| personally been burned too many times.
|
| Why should you need expert consultation in order to use core
| features, when other databases "just work" without all these
| caveats? Transaction ID exhaustion and VACUUM tuning comes to
| mind here.
| rattray wrote:
| Apparently Django uses subtransactions by default when you use
| atomic=True which can result in scary behavior
| jrockway wrote:
| I agree with this. I think we get baited into using
| subtransactions by how we structure our code. Each function feels
| like a transaction -- it gets its own local variables, and if it
| fails, it doesn't have any effect on the rest of the program.
| (Not strictly true, of course, I'm sure some failing functions
| modify global state, or their receiver.)
|
| We then mindlessly copy that to our database code -- each
| mutation function takes a "database object", which could be a
| direct database connection, or it could be an in-progress
| transaction. It's generic so that you don't have to care.
| Functions that think their stuff needs to be a transaction just
| start one, and if it errors out, hey, it's rolled back.
|
| (Whenever you have a "don't care" type, it means half the
| functions will be documented "// must be run in a transaction"
| and the other half will pessimistically create a transaction
| "just in case" it was invoked with a raw database connection
| instead of a transaction object.)
|
| Thinking about it more critically, 100% of the times I've wanted
| to write this, I've wanted to abort the parent transaction as
| soon as the first child fails. I tend to retry transactions, and
| doing that twice doesn't make a lot of sense (parent transaction
| starts, calls a helper function, that starts a transaction, it
| has a conflict and has to be rolled back, helper function is re-
| run, that ends up committing, parent transaction fails because of
| a conflict... and the update gets rolled back anyway).
|
| I basically structure my database APIs to take transaction
| objects, and make each public API member a transactional unit.
| Then, the very top level creates and commits the transaction, and
| can add whatever retry logic it deems necessary.
|
| (Using the classic example, TransferFunds() would be public, and
| addMoney() and withdrawMoney() would be private. That way, the
| runner of a transaction can't do "doTx(addMoney);
| doTx(withdrawMoney)", it would be forced to do
| "doTx(TransferFunds)". And, all three would take a Transaction
| object instead of a TransactionOrDatabase object, so the type
| system enforces the transactional expectations of a money
| transfer operation.)
| spencercw wrote:
| I do basically the same thing. If a method tries to start a
| transaction when a transaction is already open then it's
| immediately rejected.
|
| I don't really like automatic transaction joining behaviour
| because it makes it hard to reason about the application
| behaviour. Will this transactional method commit when it
| returns? It's impossible to tell without looking at who's
| calling it.
|
| It also encourages annoying behaviour like, oh this method uses
| the database, better make it @Transactional.
| captaincrowbar wrote:
| "One of the basic issues with subtransactions is that they
| increment XID - the global transaction ID."
|
| If the subtransaction is later rolled back, this is starting to
| sound like the kind of speculative execution information leak
| that led to the Spectre attack. I wonder if you could base a
| similar attack on this?
| gigatexal wrote:
| Why is the XID a 32bit number? Surely we can move to 64bit and
| punt the wraparound issue a bit?
| jakswa wrote:
| My fuzzy take on a concern here could be: storage costs and
| memory usage. My impression is that it is written/copied to
| actual rows as xmin
| (https://www.postgresql.org/docs/current/ddl-system-
| columns.h...). So loading X rows from disk into memory is
| loading X transaction IDs into memory.
|
| That said, I'm not sure the concern holds water. At 2bil rows
| we're talking 7.4GiB of disk vs 15GiB. Depending on what % of
| your database you want to keep in memory, maybe it goes ok.
| eyelidlessness wrote:
| This isn't wrong at all, but...
|
| 1. A lot of it is heavily dependent on implementation details.
| (Yeah we have to care about XID, but we shouldn't have to care
| about it exponentially for nested scope workloads once that scope
| exits.)
|
| 2. Actual nested transactions could help alleviate that without
| too much burden on the MVCC implementation. (This would provide a
| mechanism for isolating and cleaning up scoped XIDs.)
___________________________________________________________________
(page generated 2021-09-01 10:01 UTC)