[HN Gopher] How to get the most out of Postgres memory settings
       ___________________________________________________________________
        
       How to get the most out of Postgres memory settings
        
       Author : samaysharma
       Score  : 203 points
       Date   : 2024-06-11 05:40 UTC (1 days ago)
        
 (HTM) web link (tembo.io)
 (TXT) w3m dump (tembo.io)
        
       | sgarland wrote:
       | Aside from these (most of which have reasonably sane defaults
       | with RDS / Aurora), the biggest thing you can do to help the
       | existing RAM you have is to stop making stupid indexing choices.
       | 
       | You probably don't need (or want) an index on every column.
       | 
       | You shouldn't index a UUIDv4 column if you can ever help it (and
       | by extension, you shouldn't use them as a PK).
       | 
        | Finally, you almost certainly need to REINDEX what you have
        | periodically. I've seen indices shrink to a quarter of their
        | size, giving back GBs of RAM. Want a quick win for speed at
        | zero cost? Gain back enough RAM to have your entire working
        | set cached.
        
         | VWWHFSfQ wrote:
          | You should never use uuidv4 (the builtin gen_random_uuid())
          | for identifiers anyway, since it clusters horribly, and
          | complete randomness is the absolute enemy of anything you
          | want to "look up".
         | 
          | uuid_generate_v1mc() from uuid-ossp is a much better choice
          | if you really want a uuid identifier, since it will at
          | least cluster much better.
         | 
          | And yes, absolutely _never_ make it your primary key. I've
          | seen that mistake far too many times, and it's always a
          | disaster: a nightmare to fix once there's a non-trivial
          | amount of data, and the database becomes miserable to work
          | with.
        
           | robertlagrant wrote:
           | I like it as a primary key because you don't have to worry
           | about clashes if you rejig data. But I don't work at scale
           | with data, so performance hasn't been a blocker for this
           | approach.
        
             | cuu508 wrote:
             | What is rejigging data? Combining tables into one but
             | keeping original PKs?
        
               | NortySpock wrote:
               | Yes. I found it often necessary at my last job, either
               | due to loading client data from various datasets and
               | staging tables (ideally in an idempotent way to prevent
               | mistakes or double-entries) , or because someone asked
               | "Hey if we already have the data in system a, can we just
               | copy the data to system b?"
        
             | sgarland wrote:
             | If you're not working at scale, IMO this is also not a
             | difficult problem to overcome with integer IDs. Dump the DB
             | into a CSV and increment the ID column by N, then reload.
             | Or copy the table to a new table, add a trigger to copy
             | from the original to the new with an ID increment, then do
             | `UPDATE <temp_table> SET id = id + N WHERE id <=
             | $MAX_ID_ORIGINALLY_SEEN`.
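              | 
              | A sketch of the copy (every name and the offset here
              | are hypothetical):
              | 
              |     -- Merge system B's rows into system A's table by
              |     -- shifting B's integer IDs past A's highest ID:
              |     BEGIN;
              |     INSERT INTO customer (id, name)
              |         SELECT id + 1000000, name FROM customer_b;
              |     -- keep the sequence ahead of the shifted IDs:
              |     SELECT setval('customer_id_seq',
              |                   (SELECT max(id) FROM customer));
              |     COMMIT;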
        
               | NortySpock wrote:
               | Yeah, but that requires basically a special favor from
               | the DBA to create the trigger for what feels like a
               | basic, common-enough data-load task...
               | 
               | What I normally see though is that the auto-incrementing
               | sequence (at least in MS SQL Server or Oracle) isn't
               | clever enough to say "wow, that id already exists on your
               | table somehow? Here let me bump the sequence again and
               | you try again with a higher ID..."
               | 
               | Instead you get a 2am alarm because the generated
               | sequence somehow ran into the block of IDs that you
               | inserted into the table and crashed with a PK unique
               | constraint violation.
               | 
               | Hence UUIDv7 or ULIDs being easy to insert from the temp
               | table into the main table.
        
           | adamcharnock wrote:
            | I think that is putting it rather strongly. I use UUIDv7
            | (for its better clustering) for anything that 1) may end
            | up having a lot of rows (1M+), and 2) can tolerate
            | leaking the creation time in the UUID.
            | 
            | Otherwise I use UUIDv4.
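            | 
            | For context on the leak: a v7 UUID's first 48 bits are a
            | Unix-epoch millisecond timestamp, so anyone holding the
            | ID can recover the creation time. A sketch, with a
            | made-up v7 value:
            | 
            |     SELECT to_timestamp(
            |         ('x' || substr(replace(
            |             '01900000-1234-7abc-8def-0123456789ab',
            |             '-', ''), 1, 12))::bit(48)::bigint / 1000.0);
            | 
            | which yields a timestamp in June 2024.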
        
             | sgarland wrote:
             | > where leaking the creation time in the UUID is
             | acceptable.
             | 
             | I've seen tons of people cite this as a problem for various
             | reasons, but not a single person has ever given an actual
             | example in the wild of it causing problems. This isn't war;
             | the German Tank Problem is not nearly as applicable as
             | people think.
             | 
              | Worse (from a DB perspective), it's often used as an
              | excuse for poor API design. "I have to have a random
              | ID, because the API is /api/user/<user-id>" and the
              | like. OK, so don't do that? Put the user ID into a JWT
              | (encrypted first if you'd like). Or use an external ID
              | that _is_ random, plus an internal, sequential ID that
              | maps to the external one. Use the internal ID as the
              | PK.
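              | 
              | A sketch of that split (names hypothetical):
              | 
              |     CREATE TABLE app_user (
              |         id  bigint GENERATED ALWAYS AS IDENTITY
              |             PRIMARY KEY,        -- internal, for joins
              |         external_id uuid NOT NULL
              |             DEFAULT gen_random_uuid() UNIQUE,
              |         email text NOT NULL
              |     );
              |     -- The API resolves /api/user/<external_id> via
              |     -- the UNIQUE index; foreign keys elsewhere
              |     -- reference the compact bigint id.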
        
               | NortySpock wrote:
               | One year later, in a News Article:
               | 
               | "$person_name's lawyer could not be reached for comment,
               | but it is noted that the userid mentioned in the lawsuit
               | was created on 2022-12-03, the same day as the photos
               | were uploaded, and the same day as the arrest warrant was
               | issued..."
               | 
                | Using the id directly as the pk admittedly just
                | makes things a bit simpler: one less bit of
                | information to track down. I know bridge tables are
                | not hard, but it's just one extra step that the
                | developer has to be aware of and plan ahead for (and
                | possibly all your teams and customer support people
                | need to be aware of it when they're tracking down an
                | issue with an account).
               | 
               | UUIDv4 lets you move fast now and pay your performance-
               | piper next year rather than now.
        
           | patmorgan23 wrote:
           | >randomness is the absolute enemy of anything you want to
           | "look up".
           | 
            | Absolutely true that having data ordered on a column
            | that's one of the search predicates makes lookups
            | faster. But aren't there many situations where the PK is
            | essentially random even if it's an automatically
            | incrementing integer?
           | 
            | Like in a customers table, is the order in which someone
            | became your customer ever really relevant for most
            | operations? It's essentially a random id, especially if
            | you're only looking up a single customer (or things
            | related to that single customer).
           | 
            | Insert performance I could see being an issue. I'm not
            | super familiar with Postgres performance tuning, but if
            | the engine handles a hot last page better than inserts
            | spread across all pages (which I think is the case, due
            | to how the WAL operates), I can see that being a
            | compelling reason to go with a sequential PK.
        
             | sgarland wrote:
             | You're correct that for simple one-off cases, in a new-ish
             | table, the performance difference is tiny if you can
             | measure it at all.
             | 
             | As tables and indices age, they'll bloat, which can cause
             | additional latency. Again, you may not notice this for
             | small queries, but it can start becoming obvious in more
             | complex queries.
             | 
             | The main issue I've seen is that since software and
             | hardware is so absurdly fast, in the beginning none of this
             | is noticed. It isn't until you've had to upsize the
             | hardware multiple times that someone stops to think about
             | examining schema / query performance, and at that point,
             | they're often unwilling to do a large refactor. Then it
             | becomes my problem as a DBRE, and I sigh and explain the
             | technical reasons why UUIDs suck in a DB, and inevitably
             | get told to make it work as best I can.
             | 
             | > Insert performance I could see being an issue.
             | 
             | And UPDATE, since Postgres doesn't actually do that (it
             | does an INSERT + DELETE).
        
           | Vadim_samokhin wrote:
           | > And yes, absolutely _never_ make it your primary key.
           | 
            | If I don't have lots of range queries, why not then? Only
            | because of bloating (I think fragmentation is a more
            | precise term here)?
        
             | blackenedgem wrote:
              | Because there's a good chance down the line you will
              | need to do some sort of range query. Let's say you want
              | to add and backfill a column. Not too bad: you create a
              | partial index where the column is null and use that for
              | backfilling data.
             | 
             | But at a certain scale that starts taking too long and a
             | bigint column would be quicker. Or you decide you need to
             | periodically scan the table in batches for some reason.
             | Perhaps to export the contents to a data warehouse as part
             | of an initial snapshot.
             | 
             | You can skip enumerating these possibilities by having a
             | bigint surrogate key from the get go. There's other
             | advantages as well like better joins and temporal locality
             | when the bigint index can be used rather than the uuid.
        
         | djbusby wrote:
         | What's the problem using UUID as a PK? I really like 128bit
         | numbers for PK, more than 64bit-int or anything text. I'm
         | generally putting a ULID or UUIDV7/8 in there.
        
           | VWWHFSfQ wrote:
            | 128-bit numbers are absolutely humongous, and by virtue
            | of being a primary key they get duplicated into every
            | single relation. So if your user ID, for instance, is a
            | 128-bit uuid, any column that references the user ID is
            | now also a 128-bit uuid. It's not a problem if you have
            | very small amounts of data, but once your tables start
            | to grow it will become misery to work with.
           | 
           | In my opinion, the best pattern is to have just a regular
           | bigserial for primary keys to use for internal database
           | relations since postgres is extremely good at generating and
           | optimizing PK sequences (You're taking away all that
           | optimization just because you want an enormous random number
           | to be the primary key).
           | 
           | Then just have a uuid field for application-level identifiers
           | and natural keys. You'll save yourself a whole lot of
           | headache with this pattern.
        
             | londons_explore wrote:
                | UUIDs are for when an auto-incrementing ID can't be
                | used, usually due to scale (if you will be inserting
                | millions of records per second from millions of
                | clients).
                | 
                | However, postgres isn't suited to such scale
                | (something like Spanner IS).
                | 
                | Therefore, you probably shouldn't be using UUIDs
                | with postgres.
                | 
                | The other reason for UUIDs is that the system is
                | badly architected and, for example, the client gets
                | to choose their own ID, which is written into a DB.
                | Such systems are generally best avoided.
        
               | bruce511 wrote:
                | Also think about systems that are "beyond one
                | database" - data that lives in multiple places at
                | the same time (in phone-local storage, in a
                | database, consolidated and synced with other
                | databases, etc.).
                | 
                | In other words, once you move beyond '1 sql
                | database' they become important.
                | 
                | (I'm not downvoting you, just disagreeing with you.)
        
               | djbusby wrote:
               | This is why I asked at the start of the thread. We *need*
               | 128bit numbers to make the uniqueness work across loads
               | of systems (like a distributed/federated type thing)
        
             | pdhborges wrote:
             | > But once your tables start to grow it will become misery
             | to work with.
             | 
              | I buy the data layout argument for databases with
              | clustered tables and no semi-sequential uuid support.
              | But the storage argument looks vanishingly applicable
              | to me: if someone needs to add a column to one of
              | these tables, it basically offsets a 4 byte
              | optimization already.
        
               | bruce511 wrote:
                | 8 bytes (assuming 128-bit instead of 64-bit), but
                | yeah.
               | 
               | It's not quite as simple as saving 8 bytes per row
               | though. It's 8 bytes for the UUID, plus 8 for at least
               | the PK, plus 8 more for any other keys the UUID is in.
               | 
               | Then you need to do that for any foreign-key-fields and
               | keys those are in as well.
               | 
               | However, unless your table has very few non-uuid columns,
               | the rest of the table size will dwarf the extra n*8 bytes
               | here. And if you are storing any blobs (like pictures,
               | documents etc) then frankly all the uuids won't add up to
               | anything.
               | 
                | In summary, whether using uuids or not is right for
                | you depends a lot on your context. An insert-only
                | log table with a billion rows is a very different
                | use case from, say, a list of employees or
                | customers.
                | 
                | Generally I'm very comfortable with uuids for tables
                | of, say, less than 100m rows. Very high insert
                | rates, though, suggest tables bigger than that,
                | which perhaps benefit from different strategies.
                | 
                | When it comes to data stored in multiple places
                | (think on-phone first, synced to cloud, exported
                | from multiple places to consolidated BI systems),
                | uuids solve a lot of problems.
               | 
               | Context is everything.
        
             | xyzzy_plugh wrote:
              | They're not "absolutely humongous" but merely twice as
              | large as a 64-bit number. That's nothing. They're not
              | even that bad for performance, assuming you have
              | random writes regardless. UUIDs work incredibly well.
              | 
              | Where they _don't_ work well is when they're stored in
              | a textual encoding (base16 with dashes), which I've
              | encountered far more often than I'd like. Random ones
              | also don't work well for append-only insertions like
              | events, for which ULIDs (or newer UUID variants) are
              | more appropriate.
             | 
             | But for primary keys? Totally fine.
             | 
             | > extremely good at generating and optimizing PK sequences
             | (You're taking away all that optimization just because you
             | want an enormous random number to be the primary key).
             | 
             | Please share the benchmarks to back this up. UUIDs optimize
             | down perfectly fine.
             | 
             | If you're trying to squeeze _every last ounce_ of
             | performance out of the database then certainly this is a
             | yak you can shave. But most Postgres users have 1001
              | problems that are much bigger than UUIDs. If you've
             | managed to avoid folks writing shitty queries and this is
             | the bottleneck you're facing then I salute you.
             | 
             | Otherwise, don't worry. UUIDs are fine.
        
               | sgarland wrote:
               | > but merely two times as large as a 64-bit number.
               | That's nothing.
               | 
               | It's literally 2x as large. This starts to matter quite a
               | bit at the millions/billions of rows scale.
               | 
               | > They're not even that bad for performance, assuming you
               | have random writes regardless.
               | 
               | > Please share the benchmarks to back this up.
               | 
               | I've done [0] benchmarks, with [1] source to recreate it.
               | This was for loading data, but it demonstrates quite
               | nicely the problem of massive B+tree splits. An integer
               | PK loads data ~2x as quickly as a UUIDv4 PK stored as
               | `uuid` type, and ~3x as fast as one stored as `text`
               | type.
               | 
               | > But most Postgres users have 1001 problems that are
               | much bigger than UUIDs.
               | 
               | All of which become compounded by the use of non-k-
               | sortable keys.
               | 
               | > If you've managed to avoid folks writing shitty queries
               | and this is the bottleneck you're facing then I salute
               | you.
               | 
               | Using UUIDs as a PK without having an extremely good
               | reason to do so demonstrates a lack of understanding of
               | how the underlying technology works. Worse, it
               | demonstrates a lack of understanding of a basic data
               | structure, something that devs should have near and dear
               | to their heart.
               | 
               | With Postgres specifically, the use of UUIDv4 (or ULID)
               | PKs will also cause an enormous amount of WAL bloat [2],
               | unless you happen to be rolling your own DB and are using
               | ZFS for the file system, and you've disabled
               | `full_page_writes` (which is the only safe way to do so).
               | I have personally seen this take down prod, because the
               | other nodes couldn't keep up with the changes being
               | issued (fun fact, most AWS instances under `.4xlarge` in
               | size have a baseline network speed that's much lower than
               | rated).
               | 
               | [0]: https://gist.github.com/stephanGarland/ee38c699a9bb9
               | 99894d76...
               | 
               | [1]: https://gist.github.com/stephanGarland/fe0788cf2332d
               | 6e241ff3...
               | 
               | [2]: https://www.2ndquadrant.com/en/blog/on-the-impact-
               | of-full-pa...
        
               | xyzzy_plugh wrote:
               | > I've done [0] benchmarks, with [1] source to recreate
               | it.
               | 
               | Where's the source for the data files? Without that it's
               | not possible to make sense of these benchmarks. Are the
               | keys sorted?
               | 
               | Comparing random-insert to insert-at-end, if that is what
               | you are indeed doing, is silly and not representative of
               | real-world scenarios.
               | 
               | Even then, it's _less than_ a 2x overhead, which is far
               | from an order-of-magnitude difference and squarely within
               | the realm of "I don't care."
               | 
               | > All of which become compounded by the use of non-k-
               | sortable keys.
               | 
               | But that's the thing -- they don't! Bad queries, like
               | missing an index, improperly pruning via predicate
               | pushdown and recursive queries will all dominate
               | performance regardless of what you do here. We're not
               | talking _quadratic_ behavior here. The compounding effect
               | you are worried about is negligible for the vast, vast
               | majority of workloads.
               | 
               | > With Postgres specifically, the use of UUIDv4 (or ULID)
               | PKs will also cause an enormous amount of WAL bloat [2],
               | 
               | Again, only for _random insertions_, which is not a
               | problem specific to UUIDs. Yes I agree that if you are
               | rewriting all of your tables all the time you're going to
               | have a bad time. However UUIDs are not inherently
               | problematic as UUIDv7 or their predecessor ULIDs
               | facilitate temporal ordering, which nearly eliminates the
               | disadvantages discussed here.
               | 
               | I fail to see how ULIDs inherently cause WAL bloat, and
               | checking my notes (read: databases) shows they do not.
               | Why do _you_ believe otherwise?
               | 
               | > Using UUIDs as a PK without having an extremely good
               | reason to do so demonstrates a lack of understanding of
               | how the underlying technology works. Worse, it
               | demonstrates a lack of understanding of a basic data
               | structure, something that devs should have near and dear
               | to their heart.
               | 
               | What _is_ problematic is not understanding the basic data
               | structures and the underlying technology, which is the
               | root of most database issues, which is my point: you
               | almost always have bigger fish to fry. Whether or not you
               | opt to wield UUIDs will not save you here.
               | 
               | In fact, 2ndquadrant agrees with me[0]:
               | 
               | > Chances are your database schema is more complicated
               | and uses various other indexes, so maybe the issues due
               | to UUIDs are fairly negligible in the bigger picture.
               | 
               | [0]: https://www.2ndquadrant.com/en/blog/sequential-uuid-
               | generato...
        
             | moomoo11 wrote:
              | Quick question. So you're saying use bigserial for the
              | PK (instead of uuid) and have another column where I
              | set a uuid as the application-level identifier.
              | 
              | But wouldn't that require an index on it if it's going
              | to be an identifier that's used to query?
        
               | VWWHFSfQ wrote:
                | Yes, but then you have only a single index
                | containing the UUIDs, instead of them being
                | duplicated everywhere they're used as a foreign key
                | relation. Those indexes quickly become explosive in
                | relations like m2m tables.
        
           | sgarland wrote:
           | UUIDv4 specifically (though ULID [EDIT: this is incorrect, I
           | was thinking of NanoID] has the same problem). Anything non-
           | k-sortable will bloat a B+tree index, causing far more reads
           | than should be required.
           | 
           | MySQL (technically InnoDB) and SQL Server have a huge problem
           | with them due to their design as a clustering index - since
           | the tuples are physically located, or clustered, around the
           | PK, the random nature means that doing a simple query like
           | "SELECT id FROM customer WHERE date_created > NOW() -
           | INTERVAL 1 WEEK" means that the hits are going to be on tons
           | of different pages. Even if everything is cached, that's
           | still a lot of extra work.
           | 
           | Postgres has its own problems with them as well. In the above
           | query, it's possible that it could be an index-only scan.
           | Great, you don't have to dereference the pointer to find the
           | tuple in the heap! Except you still have to go through the
           | Visibility Map, and it's going to be extremely bloated with a
           | UUIDv4 PK. Additionally, due to the combination of how
           | Postgres manages the A in ACID (via `full_page_writes`) and
           | its MVCC implementation, a change to _any_ indexed column in
           | a table results in the entire row being re-written. While
            | logically sequential tuples may be physically co-located
            | (which would be ideal from a write perspective), the
            | index pages are highly unlikely to be if the keys are
            | non-k-sortable, so now you've bloated the WAL as well.
           | 
            | If you _must_ use a UUID (I maintain you generally don't
            | - PlanetScale uses integers internally [0], so clearly
            | it can be done at scale), UUIDv7 is probably the best
            | from a DB performance perspective.
           | 
           | [0]:
           | https://github.com/planetscale/discussion/discussions/366
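            | 
            | And if you do end up needing a UUID PK, a k-sortable
            | generator keeps inserts near the btree's right edge. A
            | sketch (uuidv7() ships with PostgreSQL 18; older
            | versions need an extension or app-side generation):
            | 
            |     CREATE TABLE event (
            |         id      uuid PRIMARY KEY DEFAULT uuidv7(),
            |         payload jsonb
            |     );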
        
             | auspiv wrote:
             | UUID7 basic information/examples - https://uuid7.com
        
               | Yasuraka wrote:
               | So it will be a formal spec version of TSIDs?
               | 
               | https://www.foxhound.systems/blog/time-sorted-unique-
               | identif...
        
             | quadrature wrote:
             | >though ULID has the same problem
             | 
             | I thought that ULID was meant to solve these issues by
             | being sortable, is that not right ?.
        
               | sgarland wrote:
               | You're correct; I was thinking of NanoID, and updated my
               | post to reflect this.
        
             | arethuza wrote:
             | That presumably explains why some of Microsoft's own
             | products use sequentially assigned GUIDs in SQL Server
             | rather than random ones - using NEWSEQUENTIALID() rather
             | than NEWID()?
        
             | kbolino wrote:
             | PostgreSQL also supports a hash index type, which should be
             | able to handle random (v4) UUIDs pretty well. Since there's
             | no meaning to the sort order on random UUIDs anyway, the
             | limitation to exact comparisons (=) shouldn't be a problem.
             | 
             | It doesn't seem like you can specify what index type is
             | used to implement the PRIMARY KEY constraint, though, so
             | this is really only useful for certain situations.
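              | 
              | A sketch (names hypothetical) - since the PK itself
              | has to stay btree-backed, the hash index goes on a
              | secondary lookup column:
              | 
              |     CREATE TABLE session (
              |         id    bigint GENERATED ALWAYS AS IDENTITY
              |               PRIMARY KEY,
              |         token uuid NOT NULL
              |               DEFAULT gen_random_uuid(),
              |         data  jsonb
              |     );
              |     CREATE INDEX session_token_hash
              |         ON session USING hash (token);
              |     -- usable for equality only: WHERE token = $1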
        
             | aidos wrote:
             | I don't think the visibility map suffers from bloat. It's a
             | bitmap with 1 bit per page.
             | 
             | You might have more pages due to the extra size of each
             | tuple, but it's not likely to be a big difference.
        
               | sgarland wrote:
               | > 1 bit per page
               | 
               | Close, it's 2 bits/page :-)
               | 
               | It's not that it bloats, it's that the hits are scattered
               | among many pages. My wording wasn't great on that point.
               | 
               | Some tests demonstrating this are here [0].
               | 
               | [0]: https://www.cybertec-postgresql.com/en/unexpected-
               | downsides-...
        
               | aidos wrote:
               | Ah nice. Thanks for the extra info. I'm weirdly
               | fascinated by the inner workings of Postgres.
        
             | srcreigh wrote:
             | > MySQL (technically InnoDB) and SQL Server have a huge
             | problem with them due to their design as a clustering index
             | - since the tuples are physically located, or clustered,
             | around the PK, the random nature means that doing a simple
             | query like "SELECT id FROM customer WHERE date_created >
             | NOW() - INTERVAL 1 WEEK" means that the hits are going to
             | be on tons of different pages. Even if everything is
             | cached, that's still a lot of extra work.
             | 
             | How do you fix this problem in Postgres?
        
               | Izkata wrote:
               | The postgres query planner has a correlation statistic
               | for how well indexes match the order of the table data on
               | disk, and if it's bad and the query is expected to return
               | enough rows, it'll skip the index entirely. Instead it'll
               | do a sequential scan and sort, to work with the disk
               | cache instead of against it.
               | 
               | Postgres provides a CLUSTER command to reorder the table
               | data on disk to match a given index, resulting in a high
               | correlation and allowing it to do an index scan without
               | the penalty for random disk access. If you're on an SSD
               | or know you have enough memory to have all the data
               | cached you can also disable the random access penalty in
               | the configuration; it exists mainly for spinning disks.
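                | 
                | Roughly (table and index names hypothetical):
                | 
                |     -- how well heap order tracks the column
                |     -- (1.0 / -1.0 = perfectly correlated):
                |     SELECT attname, correlation FROM pg_stats
                |     WHERE tablename = 'customer'
                |       AND attname = 'date_created';
                | 
                |     -- rewrite the table in index order
                |     -- (takes an ACCESS EXCLUSIVE lock):
                |     CLUSTER customer
                |         USING customer_date_created_idx;
                |     ANALYZE customer;  -- refresh the statistic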
        
         | yeswecatan wrote:
         | How do you know when you need to reindex?
        
           | sgarland wrote:
           | If you have pgstattuple [0], you can check the bloat of
           | indices. Otherwise, you can just make it a cron on a monthly
           | / quarterly / semi-annually / whatever basis. Since PG12 you
           | can do `REINDEX INDEX CONCURRENTLY` with zero downtime, so it
           | really doesn't hurt to do it more often than necessary. Even
           | before PG12, you can do an atomic version of it:
           | 
           | `CREATE INDEX new_<index_name> CONCURRENTLY;`
           | 
           | `RENAME INDEX <index_name> TO old_<index_name>;`
           | 
           | `RENAME INDEX new_<index_name> TO <index_name>;`
           | 
           | `DROP INDEX CONCURRENTLY old_<index_name>;`
           | 
           | [0]: https://www.postgresql.org/docs/current/pgstattuple.html
        
             | blackenedgem wrote:
             | I find average leaf density to be the best metric of them
             | all. Most btree indexes with default settings (fill factor
             | 90%) will converge to 67.5% leaf density over time. So
             | anything below that is bloated and a candidate for
             | reindexing.
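              | 
              | With pgstattuple installed, the check looks like this
              | (index name hypothetical):
              | 
              |     CREATE EXTENSION IF NOT EXISTS pgstattuple;
              |     SELECT avg_leaf_density
              |     FROM pgstatindex('customer_email_idx');
              |     -- well below ~67.5 on a default (fillfactor 90)
              |     -- btree suggests it's worth a REINDEX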
        
           | singron wrote:
           | You can measure "bloat" in the index. It's essentially the
           | wasted space in pages.
           | 
           | You can also have bloat in the heap for the same reasons.
           | 
            | You may also want to cluster if your pg_stats.correlation is
           | low since that indicates your heap isn't in the same order as
           | your index anymore. pg_repack can do all of this without
           | blocking, but you can reindex just an index concurrently on
           | version >= 12.
           | 
           | https://wiki.postgresql.org/wiki/Show_database_bloat
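The correlation statistic lives in the pg_stats view; a rough check, with the table name as a placeholder:

```
-- Values near 1.0 or -1.0 mean the heap is close to this column's
-- sort order; values near 0 suggest CLUSTER (or pg_repack) may help.
SELECT tablename, attname, correlation
FROM pg_stats
WHERE tablename = 'events';
```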
        
         | mnahkies wrote:
         | I've got some queries like this saved that I find pretty
         | handy for working out when things are getting bloated and
         | need some maintenance:
         | https://wiki.postgresql.org/wiki/Show_database_bloat
        
           | lomereiter wrote:
           | This is even handier:
           | https://github.com/NikolayS/postgres_dba
        
         | bonesmoses wrote:
         | I'm not against using UUIDs, but it has to be done in a
         | balanced manner. Normal numeric IDs for all surrogates, and
         | then a UUID as an external lookup attribute. You expose that
         | one externally for API calls, tokens, etc. so you don't have
         | to worry about leaking sequential values; since it's used for
         | lookups, it should be indexed.
         | Otherwise, it's purely a lookup to key you into the JOIN chain
         | for everything else.
         | 
         | That said, now that UUIDv7 is available, switching to that
         | should be a priority.
        
       | rc_mob wrote:
       | Well.. that was very detailed. Will bookmark for when I'm ready
       | to optimize.
        
       | nextaccountic wrote:
       | Why doesn't Postgres autotune those settings? Determining this
       | experimentally seems to be very fragile
        
         | drewcoo wrote:
          | There is a certain appeal for Javaheads.
          | 
          |     java -Xms$FOO -Xmx$BAR
         | 
         | I've worked on things where we had to know the correct heap
         | incantation to run a service.
        
           | tempest_ wrote:
            | The exact settings for those params are always more
        
             | Izkata wrote:
              | With a caveat: if you cross a 32G heap size, the JVM
              | loses compressed object pointers, so it starts using
              | more memory per object and you effectively have less
              | memory available until you get to about 48G.
             | 
             | https://confluence.atlassian.com/jirakb/do-not-use-heap-
             | size...
             | 
              | We had this problem for a while in one of our (non-
              | Atlassian) systems and were running it with Xmx at
              | 31.5G.
        
       | limaoscarjuliet wrote:
       | Do not forget huge pages configuration in kernel - Postgres will
       | use them and it will save a lot of RAM for large databases, esp
       | with many connections.
       | 
        | One way to configure it is to keep increasing vm.nr_hugepages
        | in the kernel, restarting PG and running your workload until
        | you see HugePages_Free staying above zero (i.e. more pages
        | reserved than PG actually needs).
       | 
       | E.g.:
       | 
       | sysctl.d/10-hugepages.conf:vm.nr_hugepages=20000
       | 
       | See how these are used:
       | 
        | [root@potato ~]# cat /proc/meminfo | grep -i huge
        | AnonHugePages:     4411392 kB
        | ShmemHugePages:          0 kB
        | FileHugePages:           0 kB
        | HugePages_Total:     20000
        | HugePages_Free:       3103
        | HugePages_Rsvd:        143
        | HugePages_Surp:          0
        | Hugepagesize:         2048 kB
        | Hugetlb:          40960000 kB
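On Postgres 15 and later there is a shortcut for the sizing loop described above: with the server stopped, it can report exactly how many huge pages its shared memory segment needs (the data directory path below is an assumption):

```
postgres -D /var/lib/postgresql/data -C shared_memory_size_in_huge_pages
```

Set vm.nr_hugepages a little above the reported value, and set huge_pages = on in postgresql.conf if you want startup to fail loudly rather than silently fall back to regular pages (the default is 'try').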
        
         | tanelpoder wrote:
         | Also, for multiprocess database engines like Postgres (and
         | Oracle), starting up and tearing down connections will be much
         | faster with hugepages too, as the shared memory attach and
         | detach system calls have less pagetable setup work to do in the
         | kernel.
        
       | rtp4me wrote:
       | For our production PGSQL databases, we use a combination of
       | PGTuner[0] to help estimate RAM requirements and PGHero[1] to get
       | a live view of the running DB. Furthermore, we use ZFS with the
       | built-in compression to save disk space. Together, these three
       | utilities help keep our DBs running very well.
       | 
       | [0] https://pgtune.leopard.in.ua
       | 
       | [1] https://github.com/ankane/pghero
        
         | anotherhue wrote:
          | Is the old ZFS block size trick still necessary (since DBs
          | internalise so much of what OSes tend to provide)?
        
           | rtp4me wrote:
           | It is - depending on the read-vs-write workload. For our
           | workload, we landed on a record size (blocksize) of 128K
           | which gives us 3x-5x compression. Contrary to the 8KB/16KB
           | suggestions on the internet, our testing indicated 128K was
           | the best option. And, using compression allows us to run much
           | smaller storage volume sizes in Azure (thus, saving money).
           | 
           | We did an exhaustive test of our use-cases, and the best ZFS
           | tuning options with Postgres we found (again, for our
            | workload):
            | 
            |     * Enable ZFS on-disk compression
            |     * Disable ZFS in-memory compression (enabling this
            |       option costs us a 30% perf penalty)
            |     * Enable primary caching
            |     * Limit read-ahead caching
            | 
            | Edit: Forgot to add, here are the required PGSQL options
            | when using ZFS:
            | 
            |     * full_page_writes = off
            |     * wal_compression = off
           | 
           | Once the above options were set, we were getting close to
           | EXT4 read/write speeds with the benefit of compression.
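For reference, one way to express those settings with OpenZFS (the pool/dataset name is hypothetical, and module parameter names can vary by version, so treat this as a sketch rather than a definitive recipe):

```
# On-disk compression, 128K records, primary (ARC) caching enabled
zfs set compression=lz4 tank/pgdata
zfs set recordsize=128K tank/pgdata
zfs set primarycache=all tank/pgdata

# "Disable in-memory compression": keep the ARC uncompressed
echo 0 > /sys/module/zfs/parameters/zfs_compressed_arc_enabled

# "Limit read-ahead": disable ZFS file-level prefetch
echo 1 > /sys/module/zfs/parameters/zfs_prefetch_disable
```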
        
         | candiddevmike wrote:
         | Why did you choose to run PG on ZFS? DBs on CoW FS aren't
         | usually ideal.
        
           | rtp4me wrote:
           | We were running very large storage volumes in Azure (+2TB)
           | and wanted to leverage ZFS compression to save money. After
           | running some performance testing, we landed on a good balance
           | of PGSQL and ZFS options that worked well for us.
        
       | mannyv wrote:
       | One other sneaky way to do this is to get the parameters from an
       | AWS RDS instance of Postgres that matches your server and use
       | those. It's general-purpose, but it's a good starting point...and
       | probably much better than the default Postgres settings.
        
       | tiffanyh wrote:
       | I've always found that the config generator below gets you 90%
       | of the optimizations you'll likely need.
       | 
       | https://pgtune.leopard.in.ua
        
       | lazyant wrote:
       | I find the "pg connections take a lot of memory" to be mostly a
       | myth. The number floating around is 10 - 20MB per connection but
       | it looks it's more like 2MB per connection
       | https://blog.anarazel.de/2020/10/07/measuring-the-memory-ove...
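One way to sanity-check this on your own system: on PG14+, the pg_backend_memory_contexts view shows what the current connection's backend has actually allocated (shared memory excluded):

```
-- Total memory allocated by this connection's backend process (PG14+)
SELECT pg_size_pretty(sum(total_bytes)) AS backend_memory
FROM pg_backend_memory_contexts;
```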
        
       ___________________________________________________________________
       (page generated 2024-06-12 23:00 UTC)