[HN Gopher] pg_flo - Stream, transform, and re-route PostgreSQL ...
       ___________________________________________________________________
        
       pg_flo - Stream, transform, and re-route PostgreSQL data in real-
       time
        
       Author : shayonj
       Score  : 90 points
       Date   : 2024-11-03 17:02 UTC (5 hours ago)
        
 (HTM) web link (www.pgflo.io)
 (TXT) w3m dump (www.pgflo.io)
        
       | ibgeek wrote:
       | This is very cool!
        
         | shayonj wrote:
         | Thank you! Still very early days, would love to hear any
         | feedback.
        
       | rileymichael wrote:
       | Perfect timing -- I was just looking into similar tools.
       | 
       | If I want to do a bulk copy (say, nightly) with various
       | transformations but not continually stream after, is that
       | supported / a use case that'd be a good fit for your tool?
        
         | zorgmonkey wrote:
          | It looks like, with what is currently implemented, you'd have
          | to drop the tables from the destination if you want repeated
          | copies, which is probably not quite what you want. It's close
          | enough to your use case that it still might be worth testing
          | out, though.
        
           | rileymichael wrote:
           | yeah that's not an immediate deal breaker for me, I'm
            | essentially looking for pg_dump/pg_restore + transformations.
           | I'll give it a look and see how it performs
        
             | shayonj wrote:
             | Thank you for giving this a spin. That's correct, today
             | you'd need to drop the table before another sync with
              | pg_flo. That said, I have given delta syncs some thought,
              | and I'm also looking into a control plane that can make
              | some of these things easier. Would love to hear your
              | feedback.
        
               | shayonj wrote:
                | I think this might also be of interest to you: one-time
                | copies that apply transformations -
                | https://github.com/shayonj/pg_flo/issues/6. I will look
                | into shipping that very soon.
        
               | rileymichael wrote:
               | Oh awesome, yeah that'd work perfect then. I'd just prep
               | beforehand and run the one-time copy.
        
               | shayonj wrote:
               | amazing! Will plan a release by Tuesday latest. You can
                | experiment with `--copy-and-stream` in the meantime [1]
               | 
               | [1] https://github.com/shayonj/pg_flo?tab=readme-ov-
               | file#streami...
        
       | scirob wrote:
        | Cool, hope it can be an alternative to Debezium. I never liked
        | how Debezium must first copy the whole CDC state to Kafka. And
        | you must set the Kafka retention time to infinity, which many
        | Kafka-as-a-service systems don't allow anyway.
        
         | FridgeSeal wrote:
         | > how it first must copy the whole CDC state to kafka
         | 
         | There's a setting that controls whether it will do a snapshot
         | first. Turn it off and it will just start sending through new
         | cdc entries.
         | 
         | > you must set the kafka retention time to infinity
         | 
          | Is this a new requirement? I've never had to do this.
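
For reference, the setting mentioned above is Debezium's `snapshot.mode` connector property. Below is a minimal sketch of a partial connector registration payload with the initial snapshot turned off. It assumes the PostgreSQL connector and the `never` value (newer Debezium releases appear to name this `no_data`, so check the documentation for your version). The connector name is hypothetical and all connection properties are omitted.

```python
import json


def debezium_connector_payload(name: str, snapshot_mode: str = "never") -> dict:
    """Build a partial Kafka Connect registration payload for Debezium.

    Assumption: PostgreSQL connector. Connection properties (hostname,
    user, database, topic prefix, ...) are intentionally left out.
    """
    return {
        "name": name,  # hypothetical connector name
        "config": {
            "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
            # Skip the initial snapshot and stream only new changes.
            # Assumption: "never" on older releases, "no_data" on newer ones.
            "snapshot.mode": snapshot_mode,
        },
    }


if __name__ == "__main__":
    print(json.dumps(debezium_connector_payload("example-connector"), indent=2))
```

The mode is a parameter so the same sketch can be tried with whichever value a given Debezium version accepts.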
        
       | scirob wrote:
       | pglogical can live inside postgres, looks like pg_flo is an
       | external service not an extension.
       | 
       | Maybe a benefit actually. Do you think we could use pg_flo with
       | Postgres as a service instances like Azure postgres, Supabase,
       | Neon etc? Like you just read the WAL without needing to install
       | an extension that is not approved by the vendor.
        
         | shayonj wrote:
         | Yeah, absolutely! There are other benefits of being an
         | extension, and yes you can use pg_flo with any PostgreSQL
         | database or service since it uses logical replication to listen
         | on changes and CTIDs for bulk copies.
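
To illustrate what a CTID-based bulk copy can look like in general, here is a minimal sketch that splits a table into page ranges and builds one query per range. This is not pg_flo's actual implementation. The function names, the batch size, and the example table are hypothetical, and the page count is assumed to be obtained separately (for example from the relation size divided by the block size).

```python
from typing import Iterator, Tuple


def ctid_page_ranges(total_pages: int, pages_per_batch: int) -> Iterator[Tuple[int, int]]:
    """Yield half-open (start_page, end_page) ranges covering total_pages."""
    if pages_per_batch <= 0:
        raise ValueError("pages_per_batch must be positive")
    for start in range(0, total_pages, pages_per_batch):
        yield start, min(start + pages_per_batch, total_pages)


def ctid_batch_query(table: str, start_page: int, end_page: int) -> str:
    """Build a SELECT that reads rows stored in pages [start_page, end_page).

    Assumption: `table` is a trusted, already-quoted identifier.
    """
    return (
        f"SELECT * FROM {table} "
        f"WHERE ctid >= '({start_page},0)'::tid AND ctid < '({end_page},0)'::tid"
    )


if __name__ == "__main__":
    # Hypothetical table with 10 pages, copied 4 pages at a time.
    for start, end in ctid_page_ranges(total_pages=10, pages_per_batch=4):
        print(ctid_batch_query("public.users", start, end))
```

Because each range is independent, batches like these could be copied in parallel or resumed after a failure. Rows written after the page count was taken would have to be picked up separately, for example by the logical replication stream.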
        
       | oulipo wrote:
       | Would it be better to use replication rather than a backup on S3?
        
       ___________________________________________________________________
       (page generated 2024-11-03 23:00 UTC)