[HN Gopher] Pg_lake: Postgres with Iceberg and data lake access
___________________________________________________________________
Pg_lake: Postgres with Iceberg and data lake access
Author : plaur782
Score : 241 points
Date : 2025-11-04 16:12 UTC (6 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| ozgune wrote:
| This is huge!
|
| When people ask me what's missing in the Postgres market, I used
| to tell them "open source Snowflake."
|
| Crunchy's Postgres extension is by far the most advanced
| solution on the market.
|
| Huge congrats to Snowflake and the Crunchy team on open sourcing
| this.
| gigatexal wrote:
| Honestly, just pay Snowflake for the amazing DB and ecosystem
| it is, and then go build cool stuff. Unless your value-add to
| customers is infra, let them handle all that.
| ayhanfuat wrote:
| With S3 Table Buckets, Cloudflare R2 Data Catalog and now this,
| Iceberg seems to be winning.
| dkdcio wrote:
| I was going to ask if you could then put DuckDB over Postgres for
| the OLAP query engine -- looks like that's already what it does!
| very interesting development in the data lake space alongside
| DuckLake and things
| pgguru wrote:
| You create foreign tables in postgres using either the
| pg_lake_table wrapper or pg_lake_iceberg.
|
| Once those tables exist, queries against them either push down
| entirely to the remote tables, using a Custom Scan to execute
| and pull results back into postgres, or we transform/extract
| the pieces that can be executed remotely using a FDW and then
| treat it as a tuple source.
|
| In both cases, the user does not need to know any of the
| details and just runs queries inside postgres as they always
| have.
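|
| As a rough sketch of what that looks like from the user's side
| (the server name and option below are assumptions for
| illustration, not the exact pg_lake spelling; see the docs for
| the real syntax):
|
|     -- Hypothetical: expose Parquet data in S3 as a foreign table
|     -- and query it like any other Postgres table.
|     CREATE FOREIGN TABLE events_raw (
|         event_type text,
|         created_at timestamptz
|     )
|     SERVER pg_lake
|     OPTIONS (path 's3://my-bucket/events/*.parquet');
|
|     -- Eligible pieces of this query get pushed down to DuckDB.
|     SELECT count(*) FROM events_raw WHERE event_type = 'signup';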
| spenczar5 wrote:
| I think I don't understand postgres enough, so forgive this
| naive question, but what does pushing down to the remote
| tables mean? Does it allow parallelism? If I query a very
| large iceberg table, will this system fan the work out to
| multiple duckdb executors and gather the results back in?
| pgguru wrote:
| In any query engine you can execute the same query in
| different ways. The more restrictions that you can apply on
| the DuckDB side the less data you need to return to
| Postgres.
|
| For instance, you could compute a `SELECT COUNT(*) FROM
| mytable WHERE first_name = 'David'` by querying all the
| rows from `mytable` on the DuckDB side, returning all the
| rows, and letting Postgres itself count the number of
| results, but this is extremely inefficient, since that same
| value can be computed remotely.
|
| In a simple query like this with well-defined semantics
| that match between Postgres and DuckDB, you can run the
| query entirely on the remote side, just using Postgres as a
| go-between.
|
| Not all functions and operators work in the same way
| between the two systems, so you cannot just push things
| down unconditionally; `pg_lake` does some analysis to see
| what can run on the DuckDB side and what needs to stick
| around on the Postgres side.
|
| There is only a single "executor" from the perspective of
| pg_lake, but the pgduck_server embeds a multi-threaded
| duckdb instance.
|
| How DuckDB executes the portion of the query it gets is up
| to it; it often will involve parallelism, and it can use
| metadata about the files it is querying to speed up its own
| processing without even needing to visit every file. For
| instance, it can look at the `first_name` in the incoming
| query and just skip any files which do not have a
| min_value/max_value that would contain that.
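|
| One way to see which case you hit is EXPLAIN (the plan node
| names here are an assumption, since they depend on pg_lake's
| implementation details):
|
|     -- The example from above: with matching semantics the whole
|     -- aggregate can run on the DuckDB side, and Postgres just
|     -- relays the result.
|     EXPLAIN (VERBOSE)
|     SELECT count(*) FROM mytable WHERE first_name = 'David';
|     -- Expect a Custom Scan node when the query is fully pushed
|     -- down, rather than a plain scan plus a Postgres aggregate.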
| spenczar5 wrote:
| Thanks for the detailed answer!
|
| I use DuckDB today to query Iceberg tables. In some
| particularly gnarly queries (huge DISTINCTs, big sorts,
| even just selects that touch extremely heavy columns) I
| have sometimes run out of memory in that DuckDB instance.
|
| I run on hosts without much memory because they are
| cheap and easy to launch, giving me isolated query
| parallelism, which is hard to achieve on a single giant
| host.
|
| To the extent that it's possible, I dream of being able to
| spread those gnarly OOMing queries across multiple hosts;
| perhaps the DISTINCTs can be merged for example. But this
| seems like a pretty complicated system that needs to be
| deeply aware of Iceberg partitioning ("hidden" in
| pg_lake's language), right?
|
| Is there some component in the postgres world that can
| help here? I am happy to continue over email, if you
| prefer, by the way.
| pgguru wrote:
| Well, dealing with large analytics queries will always
| perform better with larger amounts of memory... :D You
| can perhaps tune things based on the amount of system
| memory (IME 80% is what DuckDB targets if not otherwise
| configured). Your proposed system does sound like it
| introduces quite a bit of complexity that would be better
| served just by using hosts with more memory.
|
| As far as Iceberg is concerned, DuckDB has its own
| implementation, but we do not use that; pg_lake has its
| own iceberg implementation. The partitioning is "hidden"
| because it is separated out from the schema definition
| itself and can be changed gradually without the query
| engine needing to care about the details of how things
| are partitioned at read time. (For writes, we respect
| the latest partitioning spec and always write according
| to that.)
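|
| For what it's worth, DuckDB itself exposes a memory_limit
| setting; the value below is just an example, and whether
| pgduck_server passes it through unchanged is an assumption:
|
|     -- Cap DuckDB's memory usage instead of the ~80%-of-RAM
|     -- default.
|     SET memory_limit = '8GB';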
| beoberha wrote:
| Curious why pgduck_server is a totally separate process?
| dkdcio wrote:
| from the README:
|
| > This separation also avoids the threading and memory-safety
| limitations that would arise from embedding DuckDB directly
| inside the Postgres process, which is designed around process
| isolation rather than multi-threaded execution. Moreover, it
| lets us interact with the query engine directly by connecting
| to it using standard Postgres clients.
| beoberha wrote:
| Thanks! Didn't scroll down far enough
| rmnclmnt wrote:
| The README explains it:
|
| > This separation also avoids the threading and memory-safety
| limitations that would arise from embedding DuckDB directly
| inside the Postgres process, which is designed around process
| isolation rather than multi-threaded execution. Moreover, it
| lets us interact with the query engine directly by connecting
| to it using standard Postgres clients.
| pgguru wrote:
| Beyond what has been pointed out from the README, also:
|
| - Separation of concerns, since with a single external process
|   we can share object store caches without complicated locking
|   dances between multiple processes.
| - Memory limits are easier to reason about with a single
|   external process.
| - Postgres backends end up being more robust, as you can
|   restart the pgduck_server process separately.
| dharbin wrote:
| Why would Snowflake develop and release this? Doesn't this
| cannibalize their main product?
| kentm wrote:
| It's not going to scale as well as Snowflake, but it gets you
| into an Iceberg ecosystem which Snowflake can ingest and
| process at scale. Analytical data systems are typically
| trending toward heterogeneous compute with a shared storage backend
| -- you have large, autoscaling systems to process the raw data
| down to something that is usable by a smaller, cheaper query
| engine supporting UIs/services.
| hobs wrote:
| But if you are used to this type of compute per dollar what
| on earth would make you want to move to Snowflake?
| kentm wrote:
| Different parts of the analytical stack have different
| performance requirements and characteristics. Maybe none of
| your stack needs it and so you never need Snowflake at all.
|
| More likely, you don't need Snowflake to process queries
| from your BI tools (Mode, Tableau, Superset, etc), but you
| _do_ need it to prepare data for those BI tools. It's
| entirely possible that you have hundreds of terabytes, if
| not petabytes, of input data that you want to pare down to
| < 1 TB datasets for querying, and Snowflake can chew
| through those datasets. There's also third party
| integrations and things like ML tooling that you need to
| consider.
|
| You shouldn't really consider analytical systems the same
| as a database backing a service. Analytical systems are
| designed to funnel large datasets that cover the entire
| business (cross cutting services and any sharding you've
| done) into subsequently smaller datasets that are cheaper
| and faster to query. And you may be using different compute
| engines for different parts of these pipelines; there's a
| good chance you're not using _only_ Snowflake but Snowflake
| and a bunch of different tools.
| barrrrald wrote:
| One thing I admire about Snowflake is a real commitment to
| self-cannibalization. They were super out front with Iceberg
| even though it could disrupt them, because that's what
| customers were asking for and they're willing to bet they'll
| figure out how to make money in that new world
|
| Video of their SVP of Product talking about it here:
| https://youtu.be/PERZMGLhnF8?si=DjS_OgbNeDpvLA04&t=1195
| qaq wrote:
| Have you interacted with Snowflake teams much? We are using
| external iceberg tables with snowflake. Every interaction
| pretty much boils down to "you really should not be using
| iceberg, you should be using snowflake for storage". It's also
| pretty obvious some things are strategically not implemented
| to push you very strongly in that direction.
| ozkatz wrote:
| Out of curiosity - can you share a few examples of
| functionality currently not supported with Iceberg but that
| works well with their internal format?
| qaq wrote:
| Even partition elimination is pretty primitive. For the query
| optimizer, Iceberg is really not a primary target. The overall
| interaction, even with technical people, gives off a strong
| "this is a sales org that happens to own an OLAP db product"
| vibe.
| barrrrald wrote:
| Not surprised - this stuff isn't fully mature yet. But I
| interact with their team a lot and know they have a
| commitment to it (I'm the other guy in that video)
| 999900000999 wrote:
| It'll probably be really difficult to set up.
|
| If it's anything like Supabase, you'll question the existence
| of God when trying to get it to work properly.
|
| You pay them to make it work right.
| pgguru wrote:
| For testing, we at least have a Dockerfile to automate the
| setup of the pgduck_server and a minio instance so it Just
| Works(tm) with the extensions installed in your local
| Postgres cluster (after installing the extensions).
|
| The configuration mainly involves just defining the default
| iceberg location for new tables, pointing it to the
| pgduck_server, and providing the appropriate auth/secrets for
| your bucket access.
| mslot wrote:
| When we first developed pg_lake at Crunchy Data and defined
| GTM, we considered whether it could be a Snowflake competitor,
| but we quickly realised that did not make sense.
|
| Data platforms like Snowflake are built as a central place to
| collect your organisation's data, do governance, large scale
| analytics, AI model training and inference, share data within
| and across orgs, build and deploy data products, etc. These are
| not jobs for a Postgres server.
|
| Pg_lake foremost targets Postgres users who currently need
| complex ETL pipelines to get data in and out of Postgres, and
| accidental Postgres data warehouses where you ended up
| overloading your server with slow analytical queries, but you
| still want to keep using Postgres.
| mberning wrote:
| Does anyone know how access control works to the underlying s3
| objects? I didn't see anything regarding grants in the docs.
| pgguru wrote:
| Hi, one of the developers here. You define credentials that can
| access the S3 buckets and use those as DuckDB secrets, usually
| in an init script for pgduck_server. (You can see some examples
| of this in the testing framework.)
|
| I'll see if we can improve the docs or highlight that part
| better, if it is already documented--we did move some things
| around prior to release.
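|
| For reference, an S3 secret in a DuckDB init script looks
| roughly like this (placeholder values; whether pgduck_server
| accepts exactly this form is an assumption):
|
|     CREATE SECRET lake_s3 (
|         TYPE S3,
|         KEY_ID 'AKIA...',
|         SECRET '...',
|         REGION 'us-east-1'
|     );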
| onderkalaci wrote:
| Maybe this could help: https://github.com/Snowflake-
| Labs/pg_lake?tab=readme-ov-file...
| mberning wrote:
| Interesting. I am working on a project to integrate access
| management to iceberg/parquet files for sagemaker.
| Controlling what users logged into sagemaker studio have
| access to in s3. It's fine using static policies for mvp, but
| eventually it needs to be dynamic and integrated into
| enterprise iam tools. Those tools generally have great
| support for managing sql grants. Not so much for s3 bucket
| policies.
| pgguru wrote:
| DuckDB secrets management supports custom IAM roles and the
| like; at this point we are basically treating the
| pgduck_server external system as a black box.
|
| For the postgres grants themselves, we provide privs to
| allow read/write to the remote tables, which is done via
| granting the `pg_lake_read`, `pg_lake_write` or
| `pg_lake_read_write` grants. This is a blanket all-or-
| nothing grant, however, so would need some design
| work/patching to support per-relation grants, say.
|
| (You could probably get away with making roles in postgres
| that have the appropriate read/write grant, then only
| granting those specific roles to a given relation, so it's
| probably doable though a little clunky at the moment.)
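|
| In other words, something along these lines (the role names
| 'analyst' and 'etl_runner' are hypothetical):
|
|     -- Blanket lake access via the built-in grants:
|     GRANT pg_lake_read TO analyst;
|     GRANT pg_lake_read_write TO etl_runner;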
| mslot wrote:
| There are Postgres roles for read/write access to the S3 object
| that DuckDB has access to. Those roles can create tables from
| specific files or at specific locations, and can then assign
| more fine-grained privileges to other Postgres roles (e.g. read
| access on a specific view or table).
| chaps wrote:
| I love postgres and have created my own "data lake" sorta systems
| -- what would this add to my workflows?
| gajus wrote:
| Man, we are living in the golden era of PostgreSQL.
| anentropic wrote:
| When Snowflake bought Crunchy Data I was hoping they were going
| to offer a managed version of this
|
| It's great that I can run this locally in a Docker container, I'd
| love to be able to run a managed instance on AWS billed through
| our existing Snowflake account
| oulipo2 wrote:
| Interesting! How does it compare with DuckLake?
| mslot wrote:
| You could say:
|
| With DuckLake, the query frontend and query engine are DuckDB,
| and Postgres is used as a catalog in the background.
|
| With pg_lake, the query frontend and catalog are Postgres, and
| DuckDB is used as a query engine in the background.
|
| Of course, they also use different table formats (though
| similar in data layer) with different pros and cons, and the
| query frontends differ in significant ways.
|
| An interesting thing about pg_lake is that it is effectively
| standalone, no external catalog required. You can point Spark
| et al. directly to Postgres with pg_lake by using the Iceberg
| JDBC driver.
| inglor wrote:
| This is really nice, though looking at the code, a lot of the
| postgres types are missing, as well as a lot of the newer
| parquet logical types - but this is a great start and a nice
| use of FDW.
| inglor wrote:
| Also, any planned support for more catalogs?
| pgguru wrote:
| I think we have recently merged (or are getting ready to
| merge) REST catalog support, so that will open some things up
| in this department.
| pgguru wrote:
| Hi, what types are you expecting to see that aren't supported?
| I believe we had support for most/all builtin postgres types.
| iamcreasy wrote:
| Very cool! Was there any inherent limitation with postgresql or
| its extension system that forced pg_lake to use duckdb as the
| query engine?
| mslot wrote:
| I gave a talk on that at Data Council, then still discussing
| the pg_lake extensions as part of Crunchy Data Warehouse.
|
| https://youtu.be/HZArjlMB6W4?si=BWEfGjMaeVytW8M1
|
| Also, nicer recording from POSETTE:
| https://youtu.be/tpq4nfEoioE?si=Qkmj8o990vkeRkUa
|
| It comes down to the trade-offs made by operational and
| analytical query engines being fundamentally different at every
| level.
| pgguru wrote:
| DuckDB provided a lot of infrastructure for reading/writing
| parquet files and other common formats here. It is also
| inherently multi-threaded and supports being embedded in a
| larger program (similar to SQLite), which made it a good basis
| for something that could work outside of the traditional
| process model of Postgres.
|
| Additionally, the postgres extension system supports most of
| the current project, so I wouldn't say it was forced in this
| case; it was a design decision. :)
| spenczar5 wrote:
| Very cool. One question that comes up for me is whether pg_lake
| expects to control the Iceberg metadata, or whether it can be
| used purely as a read layer. If I make schema updates and
| partition changes to iceberg directly, without going through
| pg_lake, will pg_lake's catalog correctly reflect things right
| away?
| pgguru wrote:
| We have some level of external iceberg table read-only support,
| but it is limited at the moment. See this example/caveat:
| https://github.com/Snowflake-Labs/pg_lake/blob/main/docs/fil...
| mslot wrote:
| You can use it as a read layer for a specific metadata JSON
| URL or a table in a REST catalog. The latter got merged quite
| recently and is not yet in the docs.
| boshomi wrote:
| Why not just use Ducklake?[1] That reduces complexity[2] since
| only DuckDB and PostgreSQL with pg_duckdb are required.
|
| [1] https://ducklake.select/
|
| [2] DuckLake - The SQL-Powered Lakehouse Format for the Rest of
| Us by Prof. Hannes Muhleisen:
| https://www.youtube.com/watch?v=YQEUkFWa69o
| pgguru wrote:
| Boils down to design decisions; see:
| https://news.ycombinator.com/item?id=45813631
| mslot wrote:
| DuckLake is pretty cool, and we obviously love everything the
| DuckDB team is doing. It's what made pg_lake possible, and what
| motivated part of our team to step away from Microsoft/Citus.
|
| DuckLake can do things that pg_lake cannot do with Iceberg, and
| DuckDB can do things Postgres absolutely can't (e.g. query data
| frames). On the other hand, Postgres can do a lot of things
| that DuckDB cannot do. For instance, it can handle >100k single
| row inserts/sec.
|
| Transactions don't come for free. Embedding the engine in the
| catalog rather than the catalog in the engine enables
| transactions across analytical and operational tables. That way
| you can do a very high rate of writes in a heap table, and
| transactionally move data into an Iceberg table.
|
| Postgres also has a more natural persistence & continuous
| processing story, so you can set up pg_cron jobs and use
| PL/pgSQL (with heap tables for bookkeeping) to do
| orchestration.
|
| There's also the interoperability aspect of Iceberg being
| supported by other query engines.
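|
| A minimal sketch of that transactional hand-off (table names
| are hypothetical; assumes 'events' is an Iceberg table managed
| by pg_lake and 'events_staging' is a plain heap table):
|
|     BEGIN;
|     WITH moved AS (
|         DELETE FROM events_staging RETURNING *
|     )
|     INSERT INTO events SELECT * FROM moved;
|     COMMIT;
|
| The pg_cron example further down the thread is the same pattern
| run on a schedule.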
| jabr wrote:
| How does this compare to https://www.mooncake.dev/pgmooncake?
| It seems there are several projects like this now, with each
| taking a slightly different approach optimized for different
| use cases?
| j_kao wrote:
| FYI the mooncake team was acquired by Databricks so it's
| basically vendors trying to compete on features now :)
| mslot wrote:
| Definitely similar goals, from the Mooncake author:
| https://news.ycombinator.com/item?id=43298145
|
| I think pg_mooncake is still relatively early stage.
|
| There's a degree of maturity to pg_lake resulting from our
| team's experience working on extensions like Citus,
| pg_documentdb, pg_cron, and many others in the past.
|
| For instance, in pg_lake all SQL features and transactions
| just work, the hybrid query engine can delegate different
| fragments of the query into DuckDB if the whole query
| cannot be handled, and having a robust DuckDB integration
| with a _single DuckDB instance_ (rather than 1 per session)
| in a separate server process helps make it production-
| ready. It is used in heavy production workloads already.
|
| Making no compromises on Postgres features is especially hard
| to achieve, but after a decade of trying to get there with
| Citus, we knew we had to get that right from day 1.
|
| Basically, we could speed run this thing into a
| comprehensive, production-ready solution. I think others
| will catch up, but we're not sitting still either. :)
| anktor wrote:
| What does data frames mean in this context? I'm used to them
| in spark or pandas but does this relate to something in how
| duckDB operates or is it something else?
| dunefox wrote:
| It's a Python data frame (e.g. pandas), which DuckDB can
| query directly.
| mritchie712 wrote:
| > For instance, it can handle >100k single row inserts/sec.
|
| DuckLake already has data inlining for the DuckDB catalog; it
| seems this will be possible once it's supported in the pg
| catalog.
|
| > Postgres also has a more natural persistence & continuous
| processing story, so you can set up pg_cron jobs and use
| PL/pgSQL (with heap tables for bookkeeping) to do
| orchestration.
|
| This is true, but it's not clear where I'd use this in
| practice. e.g. if I need to run a complex ETL job, I probably
| wouldn't do it in pg_cron.
| derefr wrote:
| > This is true, but it's not clear where I'd use this in
| practice. e.g. if I need to run a complex ETL job, I
| probably wouldn't do it in pg_cron.
|
| Think "tiered storage."
|
| See the example under https://github.com/Snowflake-
| Labs/pg_lake/blob/main/docs/ice...:
|
|     select cron.schedule('flush-queue', '* * * * *', $$
|       with new_rows as (
|         delete from measurements_staging returning *
|       )
|       insert into measurements
|       select * from new_rows;
|     $$);
|
| The "continuous ETL" process the GP is talking about would
| be exactly this kind of thing, and just as trivial. (In
| fact it would _be_ this exact same code, just with your
| mental model flipped around from "promoting data from a
| staging table into a canonical iceberg table" to "evicting
| data from a canonical table into a historical-archive
| table".)
| darth_avocado wrote:
| This is so cool! We have files in Iceberg that we move data
| to/from a PG db using a custom utility. It always felt more
| like a workaround that didn't fully use the capabilities of
| both technologies. Can't wait to try this out.
| fridder wrote:
| I love this. There are definitely shops where the data is a bit
| too much for postgres but something like Snowflake would be
| overkill. Wish this was around a couple years ago lol
| drchaim wrote:
| More integrations are great. Anyway, the "this is awesome" moment
| (for me) will be when you can mix row- and column-oriented
| tables in Postgres, a bit like Timescale but native Postgres and
| well done. Hopefully one day.
| pgguru wrote:
| Hypertables definitely had array columns auto-expanding
| with the custom node type. Not sure what else it would look
| like for what you describe.
|
| That said, don't sleep on the "this is awesome" parts in this
| project... my personal favorite is the automatic schema
| detection:
|
| ```
| CREATE TABLE my_iceberg_table () USING iceberg
|   WITH (definition_from = 's3://bucket/source_data.parquet');
| ```
| harisund1990 wrote:
| This is cool to see! Looks like it competes with pg_mooncake,
| which Databricks acquired. But how is this different from
| pg_duckdb?
| apexalpha wrote:
| I'm not super into the Data sphere but my company relies heavily
| on Snowflake which is becoming an issue.
|
| This announcement seems huge to me, no?!
|
| Is this really an open source Snowflake covering most use cases?
| taude wrote:
| there are also plenty of other options for warehouse/compute
| processing of iceberg data storage.
|
| I think this is a pretty big deal, though.
|
| Snowflake does a lot more, though, especially around sharing
| data across company boundaries.
| lysecret wrote:
| Nice, does this also allow me to write to parquet from my Postgres
| table?
| lysecret wrote:
| Omg yes, it works. This would have made my past job so much
| easier.
| mslot wrote:
| Yes, just COPY table TO 's3://mybucket/data.parquet'
|
| Or COPY table TO STDOUT WITH (format 'parquet') if you need it
| on the client side.
| claudeomusic wrote:
| Can someone dumb this down a bit for a non data-engineer? Hard to
| fully wrap my head around who this is/isn't best suited for.
| lysecret wrote:
| One use case we have (we built it ourselves) is to periodically
| offload data from Postgres to partitioned lakehouse data on
| GCS. The way I see it, this can now be done with a single
| query. Another one is the other way around: using Postgres as
| a query engine, or merging offloaded data with your live data.
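|
| E.g. something in the spirit of the COPY example earlier in
| the thread (the bucket path is a placeholder, and whether
| gs:// URLs behave the same as the s3:// example is an
| assumption):
|
|     COPY (
|       SELECT * FROM orders
|       WHERE created_at < now() - interval '90 days'
|     ) TO 'gs://my-bucket/orders_archive.parquet';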
| scirob wrote:
| Crunchy Data did it first :) but nice to get more options
| mslot wrote:
| It's the same team and same project :). Crunchy Data was
| acquired by Snowflake.
| scirob wrote:
| Holy shit thats amazing!! Congrats to the team
| iamcreasy wrote:
| If anyone from Supabase is reading, it would be awesome to have
| this extension!
| pjd7 wrote:
| This is awesome, I will be trying this out in the coming months.
| It's just made it to the top of my R&D shortlist for things that
| could massively simplify our data stack for a b2b saas.
___________________________________________________________________
(page generated 2025-11-04 23:00 UTC)