[HN Gopher] Partitioning in Postgres, 2022 Edition
       ___________________________________________________________________
        
       Partitioning in Postgres, 2022 Edition
        
       Author : plaur782
       Score  : 67 points
       Date   : 2022-10-05 19:02 UTC (3 hours ago)
        
 (HTM) web link (brandur.org)
 (TXT) w3m dump (brandur.org)
        
       | nullwarp wrote:
       | We use Postgres partitioning quite successfully to handle our
       | customer based data. Everything is partitioned by customer and
       | some customers are partitioned further.
       | 
       | One gotcha to be careful with is that if you run a query spanning
       | multiple partitions, it will run them all at once and if your
       | database isn't super big - will bring it to its knees.
       | 
       | Outside of that really no issues. We also use Timescale quite
       | heavily, which also works fantastic.
        
         | makr17 wrote:
         | IIRC, it won't run them all at the _same time_, it will run
         | them in parallel, which ends up being limited by the value of
         | `max_parallel_workers` in your config. So if you find that
         | parallel operations are swamping your DB, maybe trying stepping
         | that config down a bit. In general, queries that span several
         | partitions shouldn't be a problem. Hitting _all_ partitions for
         | a really big table, that's another thing entirely. But since
         | you're partitioned you really ought to be limiting by the value
         | of the partition key in your queries anyways...
        
         | Ozzie_osman wrote:
         | Do you have any notes on how you set this up? Did you use
         | extensions and if so which?
        
       | didgetmaster wrote:
       | I am building my own database engine using some data objects
       | (key-value stores) I invented to form columnar store tables. It
       | has some really fast query speeds and analytic features (e.g.
       | pivot tables) that test favorably compared to Postgres and other
       | RDBMS offerings. (https://www.youtube.com/watch?v=OVICKCkWMZE)
       | 
       | Partitioning a big table is definitely on my TODO list. How big
       | does a typical table need to grow before partitioning is seen as
       | a 'necessity'? What are some ways current partitioning strategies
       | have made things too difficult?
        
         | eatonphil wrote:
         | That's awesome. Link to your repo or any posts you've written
         | about your progress?
        
           | didgetmaster wrote:
           | I haven't decided yet which parts I am going to open source
           | or which license I am going to use; but the binaries are
           | available for free download on our website
           | https://www.Didgets.com and I have a newsletter at
           | Didgets.substack.com
        
       | eatonphil wrote:
       | Is Postgres ever going to allow automatic partitioning?
       | 
       | Something like:                   CREATE TABLE logs (
       | time TIMESTAMPTZ,           data JSONB           ) PARTITION ON
       | time IN INTERVAL '1 day'
       | 
       | And it just creates/manages these partitions for every day?
       | 
       | If you can already manage partitions like this manually it feels
       | like the next step is to just have it be automatic. So you have
       | less of a need to switch to Timescale or ClickHouse or whatever
       | other database as the amount of data you're storing/querying
       | grows. (Yeah that's a handwave-y suggestion but at least you
       | could stick with Postgres for longer.)
        
         | yawgmoth wrote:
         | pg_partman, which is included in most distributions of Postgres
         | (including RDS) can handle some cases
         | 
         | - https://github.com/pgpartman/pg_partman
         | 
         | -
         | https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Postg...
        
           | boomskats wrote:
           | pg_partman is awesome. The SQL syntax for partitioning isn't
           | dissimilar to your example (since PG11?), but like you say,
           | it's the automatic creation and cleanup of partitions that's
           | it's missing. pg_partman used to have to do a lot more
           | things, but nowadays you can just have it do the job of
           | creating X future partitions for you at a given interval, and
           | truncating partitions older than retention period Y.
           | 
           | It's a great extension. We have used it in production for a
           | few years now, as part of a commercial on-prem observability
           | tool (an app that keeps around 1TB of data at any one time,
           | written at avg 15mb/sec, maybe 500ish partitions, user
           | configurable retention strategy). It is extremely reliable.
           | Support for Keith F working on it is one of the reasons I
           | like Crunchy.
           | 
           | edit: I clearly didn't read the link before writing this
        
           | eatonphil wrote:
           | Timescale is just an extension too, but still I wonder why
           | Postgres should not support automatic partition management
           | themselves.
        
             | CSDude wrote:
             | Timescale is not just an extension. It offers many features
             | and a restrictive license. Partman is just a utility. But I
             | agree something like partman should be part of core
             | Postgres
        
               | xdanger wrote:
               | It's just an extension. You can build the Apache 2.0 part
               | of the extension and ignore the Timescale License parts.
               | Those parts are mainly about multinode, compression and
               | continuous aggregates.
               | 
               | TSL is mainly about not competing with their cloud
               | offerings. So you can't run a database-as-a-service for
               | time series data with it.
               | 
               | More about the license:
               | https://www.timescale.com/blog/building-open-source-
               | business...
               | 
               | Comparison about the open source and community editions: 
               | https://docs.timescale.com/timescaledb/latest/timescaledb
               | -ed...
        
       ___________________________________________________________________
       (page generated 2022-10-05 23:01 UTC)