[HN Gopher] Transforming Postgres into a Fast OLAP Database
       ___________________________________________________________________
        
       Transforming Postgres into a Fast OLAP Database
        
       Author : philippemnoel
       Score  : 127 points
       Date   : 2024-02-05 17:24 UTC (2 days ago)
        
 (HTM) web link (blog.paradedb.com)
 (TXT) w3m dump (blog.paradedb.com)
        
       | winrid wrote:
       | (in Rust) :)
        
         | dang wrote:
         | It's nice to have a title that doesn't say that for once!
        
       | brand wrote:
       | How does this handle sort ordering / collations?
        
         | retakeming wrote:
          | Sort ordering is handled automatically by DataFusion.
         | 
         | Collations at the column/operation level are not yet supported
         | but we're working on it.
        
       | datadrivenangel wrote:
       | Interesting and smart choice to use Apache DataFusion as their
       | query engine.
        
         | riku_iki wrote:
          | which is a point of worry, because it's not a mature, widely
          | adopted product.
        
         | philippemnoel wrote:
          | ParadeDB maker here -- DataFusion is amazing! Apple used it to
          | rebuild part of Spark and open-sourced it as Comet too:
         | https://github.com/apache/arrow-datafusion-comet/pull/1
         | 
         | It's a lot more mature than people think :)
        
       | HermitX wrote:
        | Thanks for sharing; I really like Postgres. However, I generally
       | use Postgres for OLTP work. I would like to point out two things:
       | 
       | 1. Based on Clickbench results, pg_analytics is still far from
       | top-tier performance. If you're looking for a high-performance
       | OLAP database, you should consider the top-ranked products.
       | 
        | 2. The queries tested by Clickbench are really simple, far from
        | real data analysis scenarios. You're surely aware of the TPC-DS
        | and TPC-H benchmarks; ClickHouse simply cannot run these test
        | suites, which is why their Clickbench does not include those
        | datasets.
       | 
        | Lastly, I want to say that if your enterprise is of a certain
        | size, separating OLTP and OLAP into two databases is the right
        | choice, because you will have two different teams responsible
        | for two different tasks. By the way, I use StarRocks for OLAP
        | work.
        
         | ComputerGuru wrote:
         | > Based on Clickbench results, pg_analytics is still far from
         | top-tier performance
         | 
         | You touch on this in your next sentence, but really, how many
         | people need that kind of performance?
        
           | HermitX wrote:
           | As long as you own it, you will love it. Trust me.
        
         | retakeming wrote:
         | One of the authors of pg_analytics here.
         | 
          | 1. On Clickbench, make sure you're doing an apples-to-apples
          | comparison by comparing scores from the same instance type. We
          | used the most common choice, c6a.4xlarge. While a few
          | databases like DuckDB rank higher, the performance of
          | DataFusion (our underlying query engine) is constantly
          | improving, and pg_analytics inherits those improvements.
         | 
         | Then again, people only care about performance and benchmarks
         | up to a certain threshold. The goal of pg_analytics is not to
         | displace something like StarRocks, but to enable analytical
         | workloads that require both row and column-oriented data or
         | Postgres transactions.
         | 
         | 2. We're working on TPC-H benchmarks. They're good for
         | demonstrating JOIN performance and we'll have them published
         | early next week.
        
           | HermitX wrote:
            | Thank you for your great work!
        
           | neeleshs wrote:
           | +1 for this.
           | 
            | Most of the time, all that matters in terms of performance
            | is the user's tolerance. Once that is reached, operational
            | complexity becomes a lot more important. We use raw Postgres
            | for analytics, knowing that projects like these and cloud
            | offerings like AlloyDB will make our lives easier (in terms
            | of performance) as time goes on.
           | 
            | pg_bm25 looks awesome too! Next up, take FDWs to the level
            | of Trino/Drill, and we won't need anything other than
            | Postgres and its extensions!
        
         | mritchie712 wrote:
          | Here's a link comparing ParadeDB to DuckDB, ClickHouse and
          | Snowflake:
         | 
         | https://benchmark.clickhouse.com/#eyJzeXN0ZW0iOnsiQWxsb3lEQi...
        
         | chuckhend wrote:
         | Clickbench is really nice because it is so easy to compare and
         | contribute benchmarks. Is there anything out there like
         | Clickbench, but for the TPC-DS and TPC-H benchmarks?
        
           | philippemnoel wrote:
            | The TPC publishes some TPC-H results on its website, but
            | they're for specific hardware systems. It's certainly not as
            | user-friendly and modern as ClickBench. I'm not sure if it's
            | possible to make something better without their consent,
            | though.
        
       | zX41ZdbW wrote:
       | There are many Postgres forks for OLAP workloads, such as:
       | 
       | - Greenplum;
       | 
       | - Citus with cstore_fdw;
       | 
       | - IMCS (Konstantin Knizhnik);
       | 
       | - Hydra;
       | 
       | - AlloyDB;
       | 
        | Greenplum, for example, is one of the earliest and fairly
        | mature, but it has been abandoned.
        
         | philippemnoel wrote:
         | That's right! Greenplum is still active, but falling off.
         | 
         | We see pg_analytics as the next-generation Citus columnar, with
         | much better performance and integration into the wider data
          | ecosystem via Delta Lake, and eventually Iceberg.
        
       | ComputerGuru wrote:
       | If you're looking for pg_analytics (the product in TFA) in the
       | charts and can't find it, it's because the chart calls it
       | ParadeDB (after the company, instead of the product).
        
         | philippemnoel wrote:
         | One of the ParadeDB makers here -- Yes, we release our Helm
         | charts under the `paradedb` name.
         | 
          | We also ship our individual extensions (pg_analytics and
          | pg_bm25) pre-compiled as .deb packages under our GitHub
          | Releases, and the full product, ParadeDB, as a Dockerfile and
          | Helm chart.
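          | 
          | If you go the .deb route, enabling the extensions should
          | follow the standard Postgres workflow. A minimal sketch
          | (exact package and extension names are in the release notes;
          | pg_analytics may also need to be added to
          | shared_preload_libraries):
          | 
          |     -- in psql, after installing the packages (sketch)
          |     CREATE EXTENSION pg_analytics;
          |     CREATE EXTENSION pg_bm25;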
        
       | pletnes wrote:
       | Is this deltalake the same as Databricks deltalake? Is it
       | compatible?
        
         | retakeming wrote:
         | Yes. delta-rs is a Rust-based implementation of Delta Lake.
         | 
         | The existing version of pg_analytics uses delta-rs to manage
         | Parquet files stored within Postgres. In the future, we plan on
         | integrating external object stores. This means that you'll be
         | able to query any Delta Lake directly from Postgres.
         | 
         | Iceberg support will come later, once the Rust implementation
         | of Iceberg matures.
        
           | pletnes wrote:
           | Now this is cool. Making data available to applications (and
           | other nice things).
        
       | slt2021 wrote:
       | clickhouse is phenomenal
        
       | ZeroCool2u wrote:
       | If I combined ParadeDB with a common postgres extension like
       | PostGIS, would I get the benefits of both?
        
         | retakeming wrote:
         | Yes and no, depending on what you mean.
         | 
         | The custom types/indexes introduced by PostGIS won't work with
         | deltalake tables. Even if it were possible, the benefits of
          | using deltalake tables to execute geospatial queries are
         | unclear, since PostGIS indexes are already optimized for this
         | task.
         | 
          | But what ParadeDB does enable is for geospatial tables and
          | deltalake tables to coexist within the same database.
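          | 
          | A sketch of what that looks like (table definitions are
          | illustrative, and it's worth verifying how joins across the
          | two access methods behave):
          | 
          |     -- ordinary heap table with a PostGIS geometry column
          |     CREATE TABLE stores (
          |         id   INT,
          |         geom geometry(Point, 4326)
          |     );
          | 
          |     -- columnar deltalake table for high-volume events
          |     CREATE TABLE pageviews (
          |         store_id INT,
          |         ts       TIMESTAMP
          |     ) USING deltalake;
          | 
          |     -- both live side by side in the same database
          |     SELECT s.id, count(*)
          |     FROM pageviews p JOIN stores s ON s.id = p.store_id
          |     GROUP BY s.id;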
        
       | pberton wrote:
       | Nice benchmark with amazing results! Congrats!
        
       | DrDroop wrote:
        | Interesting stuff keeps happening in PostgresLand. This uses
        | Apache DataFusion, which is the first I've heard of it, but from
        | reading the docs it seems like it already offers a built-in SQL
        | interface atop Apache Parquet and Apache Arrow. What is the
        | difference between these two versions of SQL? Can I still use my
        | favourite pl/sql functions together with a deltalake table, and
        | does this play nice with other kinds of tables?
        
         | retakeming wrote:
          | DataFusion's SQL dialect has some slight quirks that pertain
          | to DataFusion-specific functionality, like the ability to
          | create an external table.
          | 
          | As for ParadeDB, we rely on the DataFusion SQL parser, which
          | can transform the Postgres SQL dialect into a DataFusion
          | logical plan that DataFusion can execute. We actually have an
          | open PR that adds support for user-defined functions... it
          | will likely get merged within a few days.
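          | 
          | For example, registering a Parquet file as a queryable table
          | is pure DataFusion dialect, with no Postgres equivalent (a
          | sketch; the file path is made up):
          | 
          |     -- DataFusion dialect: expose a Parquet file as a table
          |     CREATE EXTERNAL TABLE hits
          |     STORED AS PARQUET
          |     LOCATION '/data/hits.parquet';
          | 
          |     -- after which it's queryable with ordinary SQL
          |     SELECT count(*) FROM hits;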
        
           | DrDroop wrote:
            | Thanks for your reply. So Postgres is taking care of the
            | Parquet files in the background in some way, I guess. Would
            | it be possible to combine this with something like Neon?
        
             | philippemnoel wrote:
              | We haven't tested it. Neon has a pretty unique storage
             | architecture, so I suspect it wouldn't work out-of-the-box,
             | but could with some modifications on our end. We're open to
             | this type of partnership eventually.
             | 
             | If you decide to try it, I believe they now have a way to
             | load in arbitrary Postgres extensions!
        
               | nikita wrote:
               | We are happy to try for sure. Analytics workloads really
               | benefit from separation of storage and compute!
        
       | ggregoire wrote:
        | Slightly related, but it's so interesting to me how all these
        | new small tech companies basically provide value on top of
        | Postgres! They usually blog about some obscure Postgres topic
        | and one of their articles gets on the front page of HN. Good
        | marketing and very cool stuff!
        
       | rywalker wrote:
       | Excited to see ParadeDB make progress in this area :)
        
       | tomcam wrote:
       | OLAP was my favorite character in Frozen
        
       | Xcelerate wrote:
        | This is pretty cool. It seems like there are a lot of hard
        | database problems that still need to be solved, but lately,
        | finding the right database system (or extension) seems about as
        | difficult as finding a research article and implementing the
        | solution yourself. Integrating with Postgres, which is widely
        | used and supported, seems like a step in the right direction.
       | 
       | Case in point regarding OLAP in particular, I am currently trying
       | to solve a problem where I have a high number of categorical
       | dimensions, and I want to perform a "count distinct" over any
       | combination of dimensions, grouped by any other combination of
        | dimensions, filtered by specific values in each dimension. E.g.:
        | 
        |     select count(distinct (a.col1, b.col2)),
        |            count(distinct a.col1),
        |            count(distinct b.col3)
        |     from a join b using (id)
        |     group by a.col4, b.col7;
       | 
       | Sounds obscure when I word it that way, but this is actually a
       | pretty "generic" problem that appears whenever you want to filter
       | and count the number of distinct property combinations that occur
       | within a fact dataset of transactions or events that has been
       | joined with other dimensional datasets. A naive implementation is
       | exorbitantly expensive (and impractical) if you have to join many
       | large tables before grouping and performing count distinct.
       | 
       | However, this specific problem manifests in various equivalent
       | forms mathematically: model counting of boolean expressions, low
       | rank factorization of sparse high dimensional boolean tensors
       | (each row in your transaction dataset corresponds to a value of
       | "true" in a sparse tensor with dimensions indexed by the values
       | of your columns), minimal hypergraph covering set, etc.
       | 
       | Is there a database already out there that's optimized for this
       | fairly common business problem? Maybe...? I searched for a while
       | but couldn't easily separate the startup database hype from the
       | actual capabilities of a particular offering. Plus, even if the
       | ideal "hypergraph counting database" exists, it's not like my
       | company is just going to replace its standard cloud SQL platform
       | that serves as the backbone of our entire product with a niche
       | and fragile experimental database with questionable long-term
       | support. It's much easier to just translate one of the latest
       | tensor factoring research papers into a Python script, plop that
       | into the data processing pipeline, and output the simple factored
       | form of the transactions dataset into a new table that can be
       | easily queried in the standard way.
        
         | neeleshs wrote:
         | Is a sparse high dimensional boolean tensor modeled as you
          | describe roughly the same as an inverted index in a search engine
         | like ElasticSearch/Solr?
        
       | yunyu wrote:
        | Congrats on launching - we just spent quite a bit of time
        | replicating/transforming our primary database into ClickHouse
        | for OLAP use cases, and it would have been way easier if there
        | were a Postgres-native solution. Hoping the managed hosting
        | providers catch on.
        
       | caust1c wrote:
       | How do you find working with Arrow / Parquet?
       | 
        | Personally, I found it very hard to reason about and thought
        | that ClickHouse's strategy for managing columnar data was much
        | more reasonable.
        
         | retakeming wrote:
          | DataFusion and Delta Lake abstract away most of Arrow/Parquet,
          | and those APIs were very nice to work with.
        
         | _m_benjamin wrote:
         | Do you have an example where CH was more reasonable? I don't
         | have enough experience here, so would greatly appreciate
          | hearing what you've seen. Thank you!
        
       | tomkaos wrote:
        | That looks like what Microsoft has done with Microsoft Fabric:
        | SQL Server going full Parquet for table storage.
        
         | polskibus wrote:
          | I thought going full Parquet for storage was something that
          | Snowflake did first?
        
       | DanielLapidus wrote:
       | Nice, congrats! How does this compare to Hydra in philosophy and
       | use cases?
        
       | pgaddict wrote:
       | Looks very nice. I have three random questions:
       | 
       | 1) How does this deal with backups? Presumably the deltalake
       | tables can't be backed up by Postgres itself, so I guess there's
        | some special way to do backups?
       | 
        | 2) Similarly for replication (physical or logical). Presumably
        | that's not supported, right? I guess logical replication is more
        | useful for OLTP databases from which the data flows to data
        | lakes, so that's fine. But what's the HA story without physical
        | replication?
       | 
       | 3) Presumably all the benefits are from compression at the
       | storage level? Or are there some tweaks to the executor to do
       | columnar stuff? I looked at the hooks in pg_analytics, but I see
        | only stuff to handle DML. But I don't speak Rust, so maybe I
        | missed something.
        
       ___________________________________________________________________
       (page generated 2024-02-07 23:00 UTC)