[HN Gopher] A Technical Dive into PostgreSQL's replication mecha...
___________________________________________________________________
A Technical Dive into PostgreSQL's replication mechanisms
Author : jacobprall
Score : 138 points
Date : 2024-01-11 16:55 UTC (6 hours ago)
(HTM) web link (airbyte.com)
(TXT) w3m dump (airbyte.com)
| jacobprall wrote:
| This is a guide to logical replication in Postgres where I break
| down some of the internal components of the database to explain
| CDC. If you've ever wondered how WAL buffers work, or what
| happens when a transaction is executed, check it out!
| robertlagrant wrote:
| It is a cool idea to use Neon and Airbyte together, as database
| and pushing to analytics is a classic expensive-only use case.
|
| I don't know how expensive this would get, of course.
| jacobprall wrote:
| Should be relatively affordable for modest deployments - I
| would categorize Neon and Airbyte as "bang-for-your-buck"
| products (Airbyte vs Fivetran, + Neon's solid free tier).
| cowthulhu wrote:
| Sortof unrelated, I've been looking at moving some data from SQL
| Server to Postgres, and one of the big reasons is replication.
| SQL Server replication has been super brittle for me - it's
| always silently choking, getting desynchronized, or exhibiting
| weird locking behavior with no indication of the issue until you
| notice something downstream is broken. It's been tough to test
| Postgres replication though, since a lot these issues only occur
| at huge volumes of data. Anyone have any experience with the two
| they can pass on?
| staticlibs wrote:
| > moving some data from SQL Server to Postgres
|
| I don't have any first-hand experience with Postgres
| replication to share, just, when moving DB from MSSQL,
| Babelfish extensions for Postgres (https://babelfishpg.org/)
| may be of interest.
| cryptonector wrote:
| PG logical replication is rock solid. One annoyance is that you
| can't subscribe to a publication but using a different schema
| name, say.
| smartbit wrote:
| I learned Postgres replication by studying _PostgreSQL 16
| Administration Cookbook_ by _Simon Riggs et al_
|
| - Chap 11 Backup and Recovery
|
| - Chap 12 Replication and Upgrades
|
| Highly recommended!
|
| https://learning.oreilly.com/library/view/-/9781835460580/ or
| https://www.packtpub.com/product/postgresql-16-administratio...
| apnew wrote:
| Thanks for the book recommendations!
| alfor wrote:
| Unrelated to the post:
|
| Is there realtime features with postgres?
|
| It seem like a kludge to have to add redis, mqtt, or kafka to our
| application to get things as they change.
| defaultcompany wrote:
| There is NOTIFY [1] that might do what you are thinking of.
|
| [1] https://www.postgresql.org/docs/current/sql-notify.html
| devbug wrote:
| You can LISTEN/NOTIFY. Or you can use logical replication and a
| custom subscriber.[1] Supabase uses the latter.[2]
|
| [1]: https://www.postgresql.org/docs/current/logical-
| replication....
|
| [2]: https://github.com/supabase/realtime
| mritchie712 wrote:
| we use this feature in Supabase, works great.
| dventimi wrote:
| Supabase has arguably a better alternative, which uses
| logical replication and can be used outside of Supabase.
|
| https://github.com/supabase/realtime
| jacobprall wrote:
| Supabase realtime (especially if you want a managed
| backend) or other streaming CDC setups (like Decodable,
| which is Flink/Debezium under the hood) are also great
| choices for logical replication. Streaming tech will
| continue to get more cost-effective and simpler to
| implement in the coming year(s).
|
| I should note: I haven't used Decodable in production
| yet, I'm just a fan of Flink :)
| cryptonector wrote:
| One caveat about LISTEN/NOTIFY is that channels are not
| first-class objects, so there's no authorization associated
| with them, thus anyone who can login can also NOTIFY any
| payload to any channel.
| cpursley wrote:
| Yes, WalEx.
|
| Recently added the concepts of Destinations - where you can
| just configure it to send database change events to an Elixir
| module, webhook or EventRelay (the later two don't require
| Elixir know-how).
|
| https://github.com/cpursley/walex?tab=readme-ov-file#destina...
| cpursley wrote:
| Good stuff, I'm a little obsessed with Postgres replication and
| the WAL (write ahead log).
|
| If you're an Elixir user, you might find my library for
| subscribing to Postgres WAL events useful:
| https://github.com/cpursley/walex
|
| It's a lot easier to operate than the typical debezium setup
| (which is what I think Airbyte uses behind the scenes).
|
| I need to write a guide on how to use WalEx with Neon.
| jacobprall wrote:
| Looks cool. I appreciate the support for all replica identity
| settings!
| klysm wrote:
| Logical replication is an immensely powerful for integrating
| backend data.
| egnehots wrote:
| There was recently a very interesting overview of the different
| distributed PostgreSQL architectures:
|
| https://www.crunchydata.com/blog/an-overview-of-distributed-...
| anarazel wrote:
| > It even assists in PostgreSQL's implementation of Multiversion
| Concurrency Control (MVCC) - the WAL keeps a version history of
| data changes,
|
| That's not really correct - postgres' MVCC implementation doesn't
| read from the WAL. Sure, row changes are WAL logged, but that's
| not really related to MVCC.
| jacobprall wrote:
| That's an excellent point. My statement was unnecessarily
| confusing. I've changed it to simply reference another benefit
| of the WAL - optimizing I/O operations. That would make a good
| blog post in and of itself :D
___________________________________________________________________
(page generated 2024-01-11 23:00 UTC)