[HN Gopher] Reads Causing Writes in Postgres
___________________________________________________________________
Reads Causing Writes in Postgres
Author : thunderbong
Score : 200 points
Date : 2024-12-22 10:24 UTC (4 days ago)
(HTM) web link (jesipow.com)
(TXT) w3m dump (jesipow.com)
| buglungtung wrote:
| Greate article! I have learned about block/page long time ago
| when I needed to debug performance issue but not as deep as this
| article. Will share it with my teammate and its funny to see
| their emotional face :D
| cube2222 wrote:
| TLDR: it can be caused by hint bit updates, as well as page
| pruning - both can be kicked off by a select query, and will be
| counted as part of the query's statistics.
|
| However, the article as a whole is both a much wider and deeper
| dive. I recommend giving it a read in full!
| vichle wrote:
| Thanks, a TLDR should be mandatory for articles of this length
| :)
| stronglikedan wrote:
| As articles (especially about postgres) go, this isn't that
| long, but you can always get your own AI summary if it's too
| long for you.
| SoftTalker wrote:
| Firefox reader mode (necessary to read this, as the font
| size and color choices are poor) estimated this at a 30+
| minute read. It would be a courtesy to readers for authors
| to provide a summary. That way people can decide if they
| want to spend time reading further. This is why academic
| papers have an abstract up front.
| makeitdouble wrote:
| > AI summary
|
| This is one of the AI side effect that I fear the most.
|
| We're not there, and perhaps will never be, but I imagine a
| point where information organization becomes fully
| neglected because an AI tools can do something about it.
|
| We have a taste of it with emailing that became a wasteland
| as we're supposed to filter and search it either way, and
| mail notifications have only a on/off button and nothing
| in-between.
|
| Not reading emails is I think close to the norm, and I
| guess "TLDR" will stop being an expression and just a fact
| of life ?
| chasil wrote:
| In Oracle, this happens because uncommitted transactions are
| found to be committed by a later reader, which cleans them out.
|
| https://www.databasejournal.com/oracle/delayed-block-cleanou...
| rpcope1 wrote:
| Things get even weirder when you use extensions. I remember being
| profoundly confused using Timescale 1 and doing a lot of
| concurrent writes on a hypertable with a foreign key (while also
| inserting into the other table) when I would get transaction
| deadlocks even in scenarios where it wouldn't normally be
| possible. This is how I found out doing DML on a "hypertable"
| actually does DDL under the hood, with all of the associated
| problems that brings.
| efxhoy wrote:
| That's confusing. What DDL did it do? Create new partitions?
| juxhindb wrote:
| Likely creating child tables for the various chunks that kick
| in periodically (e.g., depending on your hypertable chunking
| policy). Used to hit these all the time, quite annoying.
| madars wrote:
| Similar things can also happen with file systems: ext4 mounted -o
| ro will let the driver do filesystem recovery even if userspace
| writes are prevented.
| sneak wrote:
| That seems like it violates the principle of least surprise.
| Sayrus wrote:
| At the same time, you want to be able to read files in normal
| use-case. Being able to read them (after recovery) only if
| mounted read-write seems counterintuitive. This is the kind
| of times where right or wrong depends on the use.
| lazide wrote:
| Also how you can end up with silly things like ro-but-i-
| really-mean-it-this-time flags
| poincaredisk wrote:
| The forensics people I know don't worry about flags, and
| just use a write blocker for everything.
| lazide wrote:
| Yeah and clone everything before even touching (the copy)
| too.
| numpad0 wrote:
| Do changes need to go on disk for that to work?
| mort96 wrote:
| Hmmm yes and no. If I set / to mount read-only in some
| embedded Linux system context, my intention is just that the
| contents of disk shouldn't change just because some program
| decided to write something somewhere; I would be quite
| surprised if some recoverable metadata bit flip or something
| caused the system to irrecoverably fail to boot just because
| the readonly flag also prevented fsck from fixing errors.
|
| However if I have a faulty drive that I connect to my system
| to recover data from it and I don't want it to experience any
| more writes because I'm worried further writes may break it
| further, I would be quite surprised if 'mount -o ro' caused
| the driver to write to it.
| vbezhenar wrote:
| Recovery and mounting should be separate operations. If
| filesystem is not clean, it should not be allowed to mount
| at all.
| epcoa wrote:
| "Recovering" an otherwise error free journaled or logged
| filesystem is considered a normal operation. Unclean just
| doesn't mean an error. That's how this works and I don't
| see very many interested in changing this behavior.
| Joe_Cool wrote:
| You can disable the journal. It should(! haven't checked
| !) not touch the recovery information then. You also need
| this when you have a decade of version difference and an
| error on mount: `mount -oro,noload`
| bobmcnamara wrote:
| > I would be quite surprised if some recoverable metadata
| bit flip or something caused the system to irrecoverably
| fail to boot just because the readonly flag also prevented
| fsck from fixing errors.
|
| This is exactly what happens maintaining bootloaders. As
| time goes on, the amount of configuration to get ext4 to
| reliably read a possibly dirty filesystem without modifying
| it has skyrocketed to the point where I started putting
| /boot on ext2 again.
| pm90 wrote:
| Trying to reason about postgres is somewhat of an enigma when you
| are forced to do it; generally the only reason as a programmer
| you have to is because something went wrong, and then the mindset
| is a mix of nervousness and panic; then incredulity at some of
| the seemingly unintuitive behaviors. I suspect this might be true
| of any large, complex system at the edges.
| refset wrote:
| Interesting! MVCC mechanics aside, it's also worth remembering
| that work_mem is only 4MB by default [0], so large intermediate
| results will likely spill to disk (e.g. external sorts for ORDER
| BY operations).
|
| [0] https://www.postgresql.org/docs/current/runtime-config-
| resou...
| indulona wrote:
| Haha
___________________________________________________________________
(page generated 2024-12-26 23:01 UTC)