[HN Gopher] Optimizing Postgres Queries at Scale
       ___________________________________________________________________
        
       Optimizing Postgres Queries at Scale
        
       Author : kmdupree
       Score  : 66 points
       Date   : 2021-12-28 17:31 UTC (5 hours ago)
        
 (HTM) web link (heap.io)
 (TXT) w3m dump (heap.io)
        
       | the_duke wrote:
       | This reads like SEO spam without any notable content.
        
       | mistrial9 wrote:
        | I recently found this paper (2015) on query optimizers, with a
        | focus on PostgreSQL.
       | 
       | https://www.vldb.org/pvldb/vol9/p204-leis.pdf
        
       | jackhodkinson wrote:
        | Curious how other people run these kinds of experiments on their
        | production data in a safe way that does not affect performance
        | of prod. E.g. how do you avoid CPU spikes in prod that might
        | affect users?
        
         | samokhvalov wrote:
          | Thin clones can help:
          | https://github.com/postgres-ai/database-lab-engine
        
         | devmunchies wrote:
          | Postgres can collect stats across all queries against the
          | database[1]. GCP Cloud SQL has this enabled by default.
          | 
          | You can do `select * from pg_stat_user_tables` to see, for
          | each table, (1) how many full sequential scans it has had, (2)
          | how many rows those sequential scans have read, (3) how many
          | index scans it has had, and (4) how many rows have been
          | fetched using indexes.
         | 
          | You can also do `select * from pg_stat_user_indexes` to see
          | (1) which indexes have or have not been used, (2) how many
          | times they've been used, and (3) how many rows have been
          | fetched using each index.
         | 
          | Using this information, you can deduce which indexes to
          | add/remove, but you would still need to figure out _which_
          | queries are not hitting indexes (an exercise for the reader).
         | 
          | If you save these stats once a day (e.g. using `pg_cron` to
          | copy them to a new table with a timestamp), you can monitor
          | over time whether an index should be added or removed.
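          | 
          | A minimal sketch of both ideas (the `stat_snapshots` table
          | name is made up, and `pg_cron` must be installed separately):
          | 
          |     -- tables ordered by rows read via sequential scans
          |     select relname, seq_scan, seq_tup_read, idx_scan
          |     from pg_stat_user_tables
          |     order by seq_tup_read desc;
          | 
          |     -- indexes never used since stats were last reset
          |     select indexrelname, idx_scan
          |     from pg_stat_user_indexes
          |     where idx_scan = 0;
          | 
          |     -- daily snapshot of table stats at midnight via pg_cron
          |     create table if not exists stat_snapshots as
          |       select now() as snapshot_at, *
          |       from pg_stat_user_tables limit 0;
          |     select cron.schedule('0 0 * * *',
          |       $$insert into stat_snapshots
          |         select now(), * from pg_stat_user_tables$$);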
         | 
         | [1]: https://www.postgresql.org/docs/14/monitoring-stats.html
        
         | capableweb wrote:
          | Start a new DB instance and restore all production data from
          | the latest backup (you are doing regular backups riiight?).
          | This serves two purposes: it enables debugging against
          | production data without actually touching production, and it
          | verifies that your production backups/restores actually work
          | in practice, not just in theory.
         | 
          | If the database is too big, just use the last N rows from the
          | database instead of a full dump.
         | 
          | Bonus points if you log the actual SQL queries that happen in
          | production to a log file, so you can then re-run the last X
          | queries against your new test-production database and log the
          | total run time. After doing a bunch of optimizations, you can
          | re-run your little benchmark and see how it compares.
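          | 
          | A minimal sketch of the capture-and-replay idea (the 250 ms
          | threshold and the `events` query are just examples):
          | 
          |     -- on prod: log every statement taking longer than 250 ms
          |     alter system set log_min_duration_statement = 250;
          |     select pg_reload_conf();
          | 
          |     -- on the clone: time a captured query in detail
          |     explain (analyze, buffers)
          |       select * from events where user_id = 42;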
        
           | polskibus wrote:
            | What do you guys use for large PostgreSQL backups? pg_dump?
            | Setting up replicas? Coming from MS SQL, backups seem to be
            | a pain in the PostgreSQL world, especially on a live
            | database that is receiving updates.
        
             | merb wrote:
              | pgbackrest and/or wal-e/wal-g. Btw, it's important to make
              | WAL archive backups if you have a big database:
              | https://www.postgresql.org/docs/14/continuous-archiving.html
              | 
              | So most of the time, doing both (base backups plus WAL
              | archiving) is good; see the sketch at the end of this
              | comment.
             | 
             | there is also:
             | 
             | - Barman
             | 
             | - pg_probackup
             | 
              | (P.S.: with archiving you can even have a "delayed"
              | downstream cluster)
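              | 
              | A minimal sketch of the archiving side (assumes wal-g is
              | installed and configured via environment variables; the
              | delay value is just an example):
              | 
              |     -- requires a server restart to take effect
              |     alter system set archive_mode = 'on';
              |     alter system set archive_command =
              |       'wal-g wal-push %p';
              | 
              |     -- on a standby: hold replay back to get a "delayed"
              |     -- downstream cluster
              |     alter system set recovery_min_apply_delay = '4h';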
        
             | __s wrote:
             | Not from Heap, but I use https://github.com/wal-g/wal-g
        
             | stickfigure wrote:
              | Our hosted Postgres instance (GCP) gets nightly backups
              | automatically. Restoring to a new instance is a few button
              | clicks in the UI.
             | 
              | This is infrastructure-level stuff that someone else
              | should run.
        
           | samokhvalov wrote:
            | > If the database is too big, just use the last N rows from
            | the database instead of a full dump.
           | 
            | For SQL optimization purposes, you cannot do that - there's
            | a high chance you'll get different query plans.
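            | 
            | A toy illustration: the same single-row lookup flips from a
            | sequential scan to an index scan as the table grows:
            | 
            |     create table t as
            |       select i as id from generate_series(1, 100) i;
            |     create index on t (id);
            |     analyze t;
            |     explain select * from t where id = 42;
            |     -- Seq Scan: the whole table fits in one page
            | 
            |     insert into t
            |       select i from generate_series(101, 1000000) i;
            |     analyze t;
            |     explain select * from t where id = 42;
            |     -- now an Index Scan wins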
        
             | capableweb wrote:
             | Yeah, if your database is under 1TB, it's relatively easy
             | and usable to just clone the full thing. But above that, it
             | starts becoming a hassle and you might need to strip away
             | some stuff you don't care about (and won't affect the query
             | plans), otherwise it can take a long time to backup/restore
             | into new instance.
        
           | Normal_gaussian wrote:
            | This is how it's done.
           | 
            | Be careful with the last-N-rows approach: you have to
            | understand how the query scales to do this effectively
            | (i.e. your N rows are at an appropriate scale to optimise
            | against). If you are at a small enough scale to be asking
            | this question, I assume restoring the whole backup is the
            | best option.
        
         | sergiotapia wrote:
          | We used to spin up a production-like Postgres instance and
          | restore from the latest backup to test SQL optimizations.
        
           | samokhvalov wrote:
            | It's great that you can do that. An experimental environment
            | must be of the same size.
           | 
            | Ideally, it should be physically identical, keeping the same
            | physical layout as on prod - bloat, etc. - though sometimes
            | that's not allowed.
           | 
            | Another problem here is that for each experiment, you might
            | need a new environment. Plus, in larger teams, many people
            | might want to do this work at the same time - so you end up
            | combining efforts, synchronizing them, sharing environments.
            | It slows you down, or increases your cloud spending a lot.
           | 
            | This is the kind of problem Steve Jobs described in his
            | famous 1980 speech [1] about Apple: if we give each person
            | their own computer, something special happens compared to
            | 10 people sharing a single computer. Here it is the same:
            | if we can give each engineer their own full-size DB copy
            | for experiments, something special happens. They start
            | moving much, much faster. SQL quality improves. DB changes
            | (DDL, massive DML) stop failing. Finally, engineers start
            | learning SQL - it's an eye-opener - they can now see how it
            | works on large volumes (because they didn't have good
            | environments for experiments before!)
           | 
           | This is what we (Postgres.ai) have learned over the last
           | couple of years developing Database Lab Engine [2]. It's an
           | open-source tool for superfast cloning of Postgres databases
           | of any size, based on copy-on-write provided by ZFS or LVM.
            | On a single machine, you can have dozens of full-size
            | clones up and running and conduct a lot of experiments at
            | the same time, while still paying for only 1 machine. Clone
            | provisioning takes only a few seconds - it feels like
            | magic. But the main magic is how engineering processes
            | change.
           | 
            | One of the great use cases is how GitLab uses it for
            | Postgres query optimization and change control [3].
           | 
           | [1] https://youtu.be/0lvMgMrNDlg?t=586
           | 
           | [2] https://github.com/postgres-ai/database-lab-engine
           | 
           | [3] https://postgres.ai/resources/case-studies/gitlab
        
       ___________________________________________________________________
       (page generated 2021-12-28 23:01 UTC)