[HN Gopher] Show HN: Pontoon - Open-source customer data syncs
       ___________________________________________________________________
        
       Show HN: Pontoon - Open-source customer data syncs
        
        Hi HN,
        
        We're Alex and Kalan, the creators of Pontoon
        (https://github.com/pontoon-data/Pontoon). Pontoon is an
        open-source data export platform that makes it easy to create
        data syncs and send data to your enterprise customers. Check
        out our demo here: https://app.storylane.io/share/onova7c23ai6
        or try it out with Docker:
        https://pontoon-data.github.io/Pontoon/getting-started/quick...
        
        In our prior roles as data engineers, we both felt the pain of
        data APIs. We either had to spend weeks building data pipelines
        in house or spend a lot on ETL tools like Fivetran
        (https://www.fivetran.com/). However, a few companies offered
        data syncs directly into our data warehouse (e.g. Redshift,
        Snowflake), and when that was an option, we always chose it.
        That led us to wonder: "Why don't more companies offer data
        syncs?" It turns out that building reliable cross-cloud data
        syncs is difficult. That's why we built Pontoon.
        We designed Pontoon to be:
        
        - Easily deployed: we provide a single, self-contained Docker
          image for easy deployment, and Docker Compose for larger
          workloads
          (https://pontoon-data.github.io/Pontoon/getting-started/quick...)
        - Warehouse-native: we support syncing to/from Snowflake,
          BigQuery, Redshift, and Postgres
        - Cross-cloud: sync from BigQuery to Redshift, Snowflake to
          BigQuery, Postgres to Redshift, etc.
        - Developer friendly: data syncs can also be built via the API
          (see the sketch after this list)
        - Open source: Pontoon is free for anyone to use
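        
        For example, here's a minimal sketch of creating a sync via the
        API. The endpoint and payload fields are illustrative
        placeholders, not the exact API shape - see the docs for the
        real thing:
        
            import requests
        
            # Hypothetical payload: the URL and field names below are
            # placeholders, not Pontoon's actual API schema.
            resp = requests.post(
                "http://localhost:3000/api/syncs",  # assumed local deployment
                json={
                    "source": "warehouse-prod",
                    "destination": "customer-acme-redshift",
                    "model": "usage_events",
                    "schedule": "0 6 * * *",  # daily at 06:00 UTC
                },
            )
            resp.raise_for_status()
            print(resp.json())
        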
        Under the hood, we use Apache Arrow
        (https://arrow.apache.org/) to move data between sources and
        destinations. Arrow is very performant - we wanted a library
        that could handle moving millions of records per minute.
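        
        As a minimal sketch of the Arrow record-batch streaming pattern
        we rely on (the toy reader and the file sink below stand in for
        real source and destination connectors):
        
            import pyarrow as pa
        
            schema = pa.schema([("tenant_id", pa.string()),
                                ("amount", pa.float64())])
        
            def read_batches():
                # Stand-in for a warehouse reader; a real source would
                # yield many Arrow record batches.
                yield pa.record_batch(
                    [pa.array(["acme", "globex"]), pa.array([12.5, 99.0])],
                    schema=schema,
                )
        
            # Stream batch-by-batch so the full table is never
            # materialized in memory.
            with pa.OSFile("sync_output.arrow", "wb") as sink:
                with pa.ipc.new_stream(sink, schema) as writer:
                    for batch in read_batches():
                        writer.write_batch(batch)
        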
        In the shorter term, there are several improvements we want to
        make, like:
        
        - Adding support for dbt models, to make adding data models
          easier
        - UX improvements, like better error messaging and monitoring
          of data syncs
        - More sources and destinations (S3, GCS, Databricks, etc.)
        - Improving the API for a more developer-friendly experience
          (it's currently tied pretty closely to the front end)
        In the longer term, we want to make data sharing as easy as
        possible. As data engineers, we sometimes felt like
        second-class citizens with how we were told to get the data we
        needed: "just loop through this API 1000 times", "you probably
        won't get rate limited" (we did), "we can schedule an email to
        send you a CSV every day". We want to change how modern data
        sharing is done and make it simple for everyone.
        
        Give it a try: https://github.com/pontoon-data/Pontoon. Cheers!
        
       Author : alexdriedger
       Score  : 33 points
       Date   : 2025-08-01 15:28 UTC (7 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | melson wrote:
       | Is it like an offline sync?
        
         | kalanm wrote:
          | Kalan here - syncs are batch-based and scheduled, similar
          | to conventional ETL / data pipelines.
        
       | conormccarter wrote:
       | Congrats on the launch! I'm one of the cofounders of Prequel (I
       | saw our name in the feature grid - small nit: we do support self-
       | hosting). This is definitely a problem worth solving - the market
       | is still early and I'd bet the rising tide will help all of us
       | convince more teams to support this capability. I'm not a lawyer,
       | but the latest EU Data Act might even make it an obligation for
       | some software vendors?
       | 
       | Maybe I can save you a headache: Snowflake is actively
       | deprecating single-factor username/password auth in favor of key
       | pair auth, so the faster you support that, the fewer mandatory
       | migrations you'll be emailing users about.
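        | 
        | For reference, a minimal sketch of key-pair auth with
        | snowflake-connector-python (account, user, and key path are
        | placeholders):
        | 
        |     from cryptography.hazmat.primitives import serialization
        |     import snowflake.connector
        | 
        |     # Load the PEM private key and convert it to DER bytes,
        |     # which the connector's private_key parameter expects.
        |     with open("rsa_key.p8", "rb") as f:
        |         pkey = serialization.load_pem_private_key(f.read(),
        |                                                   password=None)
        | 
        |     conn = snowflake.connector.connect(
        |         account="my_account",  # placeholder
        |         user="my_user",        # placeholder
        |         private_key=pkey.private_bytes(
        |             encoding=serialization.Encoding.DER,
        |             format=serialization.PrivateFormat.PKCS8,
        |             encryption_algorithm=serialization.NoEncryption(),
        |         ),
        |     )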
        
         | kalanm wrote:
          | Thanks! Kalan here - I appreciate the nit! PR is already
          | merged. Definitely agreed on the market; it seems like
          | there's a ton of opportunity. And thanks for the heads up
          | re: Snowflake auth! We're actively working on that one, and
          | a few other auth modes for Redshift and BQ as well.
        
       | hiatus wrote:
       | What does the row "First-class Data Products" in the comparison
       | table entail?
        
         | alexdriedger wrote:
         | Great question. We think of data products as multi-tenant
         | tables that are created with the intention of sending that data
         | to a customer.
         | 
          | To compare with an ETL tool like Airbyte: it's really easy
          | to sync a full table somewhere with Airbyte, but it gets
          | more complicated if you have a multi-tenant table where you
          | want to sync only a subset of the data to each customer.
         | 
         | When you're setting up a data model with Pontoon, you just
         | define which column has the customer id (we call it a tenant
         | id) and it handles sending the right data to the right
         | customer.
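          | 
          | Conceptually (an illustrative sketch, not Pontoon's actual
          | code), the per-tenant slicing looks like:
          | 
          |     import pyarrow as pa
          |     import pyarrow.compute as pc
          | 
          |     # One multi-tenant table, sliced on the tenant id
          |     # column so each customer only gets their own rows.
          |     table = pa.table({
          |         "tenant_id": ["acme", "acme", "globex"],
          |         "events": [10, 20, 30],
          |     })
          | 
          |     for tenant in ["acme", "globex"]:
          |         subset = table.filter(
          |             pc.equal(table["tenant_id"], tenant))
          |         # ship `subset` to that tenant's destination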
        
       | a2128 wrote:
       | Not to be confused with Pontoon, a self-hostable translation
       | platform made by Mozilla: https://github.com/mozilla/pontoon
        
         | alexdriedger wrote:
          | Another great self-hostable platform. I'm not sure where
          | they got their name from, though - translations don't have
          | a connection to lakes the way data does...
        
       ___________________________________________________________________
       (page generated 2025-08-01 23:01 UTC)