[HN Gopher] Optimizing Postgres Queries at Scale
___________________________________________________________________
Optimizing Postgres Queries at Scale
Author : kmdupree
Score : 66 points
Date : 2021-12-28 17:31 UTC (5 hours ago)
(HTM) web link (heap.io)
(TXT) w3m dump (heap.io)
| the_duke wrote:
| This reads like SEO spam without any notable content.
| mistrial9 wrote:
| I recently found this paper (2015) on optimizers, focusing on
| postgresql.
|
| https://www.vldb.org/pvldb/vol9/p204-leis.pdf
| jackhodkinson wrote:
| Curious how other people run these kinds of experiments on their
| production data in a safe way that does not affect performance of
| prod. E.g. how do you avoid CPU spikes in prod that might affect
| users?
| samokhvalov wrote:
| Thin clones can help:
| https://github.com/postgres-ai/database-lab-engine
| devmunchies wrote:
| postgres can collect stats across all queries against the
| database[1]. GCP Cloud SQL has this enabled by default.
|
| You can do `select * from pg_stat_user_tables` to see, per
| table, (1) how many sequential scans it has had, (2) how many
| rows those sequential scans have read, (3) how many index
| scans it has had, and (4) how many rows those index scans have
| fetched.
|
| You can also do `select * from pg_stat_user_indexes` to see (1)
| which indexes have or have not been used, (2) how many times
| they've been used, and (3) how many records have been crawled
| using each index.
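|
| For example, roughly (a sketch using the standard columns of
| those two views):
|
|       -- tables ranked by rows read via sequential scans
|       select relname, seq_scan, seq_tup_read,
|              idx_scan, idx_tup_fetch
|       from pg_stat_user_tables
|       order by seq_tup_read desc
|       limit 20;
|
|       -- indexes that are rarely or never used
|       select relname, indexrelname, idx_scan, idx_tup_fetch
|       from pg_stat_user_indexes
|       order by idx_scan asc
|       limit 20;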
|
| Using this information, you could deduce which indexes to
| add/remove, but you would still need to figure out _which_
| queries are not hitting indexes (an exercise for the reader).
|
| If you save these stats once a day (e.g. using `pg_cron` to
| copy them to a new table with a timestamp), you will be able
| to monitor over time whether an index should be added or
| removed.
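|
| A minimal sketch of that daily snapshot (assuming a recent
| pg_cron is installed; the history table name here is made up):
|
|       -- one-time setup: history table with a capture timestamp
|       create table stat_user_tables_history as
|       select now() as captured_at, *
|       from pg_stat_user_tables
|       limit 0;
|
|       -- copy the stats once a day at midnight
|       select cron.schedule(
|         'snapshot-table-stats',
|         '0 0 * * *',
|         $$insert into stat_user_tables_history
|           select now(), * from pg_stat_user_tables$$
|       );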
|
| [1]: https://www.postgresql.org/docs/14/monitoring-stats.html
| capableweb wrote:
| Start a new DB instance and restore all production data from
| the latest backup (you are doing regular backups riiight?).
| This serves two purposes: it lets you debug against production
| data without touching actual production, and it verifies that
| your production backups/restores actually work in practice,
| not just in theory.
|
| If the database is too big, just use the last N rows from the
| database instead of a full dump.
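|
| A rough sketch of that for a single big table (the table and
| column names here are made up):
|
|       -- copy the most recent 10M rows into a test table
|       create table events_sample as
|       select *
|       from events
|       order by created_at desc
|       limit 10000000;
|
|       -- refresh planner statistics so plans are meaningful
|       analyze events_sample;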
|
| Bonus points if you log the actual SQL queries that happen in
| production to a log file, so you can then re-run the last X
| queries against your new test-production database and log the
| total run time. After doing a bunch of optimizations, you can
| re-run your little benchmark and see how it compares.
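|
| One way to do the capture side is to log every statement with
| its duration and replay the captured statements against the
| clone (a 0 ms threshold logs everything, so watch log volume):
|
|       -- on production: log every statement and how long it took
|       alter system set log_min_duration_statement = 0;
|       select pg_reload_conf();
|
| The captured statements can then be fed to the test instance
| with psql and timed the same way there.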
| polskibus wrote:
| What do you guys use for large postgresql backups? pg_dump?
| Setting up replicas? Coming from ms sql, backups seem to be a
| pain in the postgresql world, especially on a live database
| that is receiving updates.
| merb wrote:
| pgbackrest and/or wal-e/wal-g. Btw, it's important to take WAL
| archive backups if you have a big database:
| https://www.postgresql.org/docs/14/continuous-archiving.html
|
| So most of the time, both of those are good options.
|
| there is also:
|
| - Barman
|
| - pg_probackup
|
| (P.S: with archiving you can even have a "delayed"
| downstream cluster)
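|
| As a sketch, the archiving side with wal-g looks roughly like
| this (assuming wal-g is installed and configured with its
| storage credentials; changing archive_mode needs a restart):
|
|       alter system set archive_mode = on;
|       alter system set archive_command = 'wal-g wal-push %p';
|
| Base backups are then taken periodically with wal-g's
| backup-push command, and a restore replays a base backup plus
| the archived WAL up to a chosen point in time.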
| __s wrote:
| Not from Heap, but I use https://github.com/wal-g/wal-g
| stickfigure wrote:
| Our hosted postgres instance (GCP) gets nightly backups
| automatically. Restoring to a new instance is a few button
| clicks in the UI.
|
| This is infrastructure level stuff that someone else should
| run.
| samokhvalov wrote:
| > If the database is too big, just use the last N rows from
| the database instead of a full dump.
|
| For SQL optimization purposes, you cannot do that - there's a
| high chance you'll get different query plans.
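|
| For example, running the same query (hypothetical schema) on
| both the full-size copy and a sampled copy can show the
| difference:
|
|       explain (analyze, buffers)
|       select *
|       from events
|       where user_id = 12345
|       order by created_at desc
|       limit 100;
|
| On full-size data the planner might pick an index scan, while
| on a small sample it may prefer a sequential scan, so plans
| and timings from the sample don't transfer to prod.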
| capableweb wrote:
| Yeah, if your database is under 1TB, it's relatively easy and
| practical to just clone the full thing. But above that, it
| starts becoming a hassle and you might need to strip away some
| stuff you don't care about (and that won't affect the query
| plans), otherwise it can take a long time to backup/restore
| into a new instance.
| Normal_gaussian wrote:
| This is how it's done.
|
| Be careful with the last N rows approach: you need to
| understand how the query scales to do this effectively (i.e.
| your N rows are at an appropriate scale to optimise against).
| I assume that if you are at a small enough scale that you are
| asking this question, restoring the whole backup is the best
| option.
| sergiotapia wrote:
| We used to spin up a production-like Postgres instance and
| restore from latest backup to test SQL optimizations.
| samokhvalov wrote:
| It's great that you can do that. An experimental environment
| must be of the same size.
|
| Ideally, it should be physically identical, to keep the same
| physical layout as on prod - bloat, etc. - though sometimes
| that's not allowed.
|
| Another problem here is that for each experiment, you might
| need a new environment. Plus, in larger teams, many people
| might want to do this work at the same time - so you end up
| combining efforts, synchronizing them, sharing environments.
| It slows you down. Or increases your cloud spending a lot.
|
| This is the kind of problem Steve Jobs described in his famous
| 1980 speech [1] about Apple: if we give each person their own
| computer, something special happens compared to the case where
| 10 people share a single computer. Here it is the same: if we
| can give each engineer their own full-size DB copy for
| experiments, something special happens. They start moving
| much, much faster. SQL quality improves. DB changes (DDL,
| massive DML) stop failing. Finally, engineers start learning
| SQL; it's an eye-opener - they can now see how it works on
| large volumes (because they didn't have good environments for
| experiments before!).
|
| This is what we (Postgres.ai) have learned over the last
| couple of years developing Database Lab Engine [2]. It's an
| open-source tool for superfast cloning of Postgres databases
| of any size, based on copy-on-write provided by ZFS or LVM.
| On a single machine, you can have dozens of full-size clones
| up and running, conduct a lot of experiments at the same time,
| and still pay for only one machine. And clone provisioning
| takes only a few seconds - it feels like magic. But the main
| magic is how engineering processes change.
|
| One of the great use cases is how GitLab uses it for Postgres
| query optimization and change control [3].
|
| [1] https://youtu.be/0lvMgMrNDlg?t=586
|
| [2] https://github.com/postgres-ai/database-lab-engine
|
| [3] https://postgres.ai/resources/case-studies/gitlab
___________________________________________________________________
(page generated 2021-12-28 23:01 UTC)