[HN Gopher] pg_flo - Stream, transform, and re-route PostgreSQL ...
___________________________________________________________________
pg_flo - Stream, transform, and re-route PostgreSQL data in real-time
Author : shayonj
Score : 90 points
Date : 2024-11-03 17:02 UTC (5 hours ago)
(HTM) web link (www.pgflo.io)
(TXT) w3m dump (www.pgflo.io)
| ibgeek wrote:
| This is very cool!
| shayonj wrote:
| Thank you! Still very early days, would love to hear any
| feedback.
| rileymichael wrote:
| Perfect timing -- I was just looking into similar tools.
|
| If I want to do a bulk copy (say, nightly) with various
| transformations but not continually stream after, is that
| supported / a use case that'd be a good fit for your tool?
| zorgmonkey wrote:
| It looks like, with what is currently implemented, you'd have to
| drop the tables from the destination if you want repeated copies,
| which is probably not quite what you want. It's close enough to
| your use case that it still might be worth testing out, though.
| rileymichael wrote:
| Yeah, that's not an immediate deal breaker for me. I'm
| essentially looking for pg_dump/pg_restore + transformations.
| I'll give it a look and see how it performs.
| shayonj wrote:
| Thank you for giving this a spin. That's correct, today
| you'd need to drop the table before another sync with
| pg_flo. That said, I have given delta syncs some thought,
| and I'm also looking into a control plane that can make some
| of these things easier. Would love to hear your feedback.
| shayonj wrote:
| I think this also might be of interest to you: one-time
| copies with transformations applied -
| https://github.com/shayonj/pg_flo/issues/6. I will look
| into shipping that very soon.
| rileymichael wrote:
| Oh awesome, yeah, that'd work perfectly then. I'd just prep
| beforehand and run the one-time copy.
| shayonj wrote:
| Amazing! Will plan a release by Tuesday at the latest. You can
| experiment with `--copy-and-stream` in the meantime [1]
|
| [1] https://github.com/shayonj/pg_flo?tab=readme-ov-file#streami...
| scirob wrote:
| Cool, hope it can offer an alternative to Debezium. I never liked
| how Debezium must first copy the whole CDC state to Kafka. And
| you must set the Kafka retention time to infinity, which many
| Kafka-as-a-service systems don't allow anyway.
| FridgeSeal wrote:
| > how it first must copy the whole CDC state to kafka
|
| There's a setting that controls whether it will do a snapshot
| first. Turn it off and it will just start sending through new
| CDC entries.
|
| > you must set the kafka retention time to infinity
|
| Is this a new requirement? I've never had to do this.
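
For readers unfamiliar with the setting mentioned above: in Debezium's PostgreSQL connector it is the `snapshot.mode` property. The sketch below only builds a Kafka Connect registration payload with that property set. The connector name is hypothetical, connection settings are omitted, and the accepted values of `snapshot.mode` vary by Debezium version (older releases used `never`; newer ones document `no_data`), so treat the value as an assumption and check the docs for your release.

```python
import json


def debezium_pg_payload(name: str, snapshot_mode: str) -> dict:
    """Build a Kafka Connect registration payload for Debezium's
    PostgreSQL connector with an explicit snapshot mode.

    Connection properties (host, user, database, ...) are deliberately
    left out of this sketch and must be added before registering.
    """
    return {
        "name": name,
        "config": {
            "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
            "plugin.name": "pgoutput",
            # Controls whether an initial snapshot is taken before streaming.
            # The valid value for "skip the snapshot" depends on the version.
            "snapshot.mode": snapshot_mode,
        },
    }


# "example-connector" is a hypothetical name used only for illustration.
payload = debezium_pg_payload("example-connector", "never")
print(json.dumps(payload, indent=2))
```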
| scirob wrote:
| pglogical can live inside Postgres; it looks like pg_flo is an
| external service, not an extension.
|
| Maybe a benefit actually. Do you think we could use pg_flo with
| Postgres-as-a-service instances like Azure Postgres, Supabase,
| Neon etc.? Like you just read the WAL without needing to install
| an extension that is not approved by the vendor.
| shayonj wrote:
| Yeah, absolutely! There are other benefits to being an
| extension, and yes, you can use pg_flo with any PostgreSQL
| database or service, since it uses logical replication to
| listen for changes and CTIDs for bulk copies.
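
As a rough illustration of what a CTID-based bulk copy can look like: a table's physical pages are split into ranges and each range is read with a `ctid` predicate, so chunks can be copied independently. This is a minimal sketch and not pg_flo's actual implementation. The function names, the table name, and the page counts are hypothetical, and the page total is assumed to come from something like `pg_class.relpages`, which is only an estimate.

```python
from typing import Iterator, Tuple


def ctid_ranges(total_pages: int, pages_per_chunk: int) -> Iterator[Tuple[str, str]]:
    """Yield (lower, upper) tid literals that cover pages [0, total_pages).

    A ctid is a (page, offset) pair and offsets start at 1, so "(n,0)"
    sorts before every tuple stored on page n.
    """
    if total_pages < 0 or pages_per_chunk <= 0:
        raise ValueError("total_pages must be >= 0 and pages_per_chunk > 0")
    for start in range(0, total_pages, pages_per_chunk):
        end = min(start + pages_per_chunk, total_pages)
        yield f"({start},0)", f"({end},0)"


def chunk_query(table: str, lower: str, upper: str) -> str:
    """Build the SELECT for one chunk.

    The table name is interpolated directly, so it must be a trusted
    identifier; this is a sketch, not injection-safe code.
    """
    return (
        f"SELECT * FROM {table} "
        f"WHERE ctid >= '{lower}'::tid AND ctid < '{upper}'::tid"
    )


# Hypothetical example: a 10-page table copied 4 pages at a time.
ranges = list(ctid_ranges(10, 4))
queries = [chunk_query("public.users", lo, hi) for lo, hi in ranges]
for q in queries:
    print(q)
```

Because the page count is an estimate and the table can grow during the copy, a real tool would also need to handle rows beyond the last range, for example by leaving the final chunk open-ended.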
| oulipo wrote:
| Would it be better to use replication rather than a backup on S3?
___________________________________________________________________
(page generated 2024-11-03 23:00 UTC)