[HN Gopher] A Technical Dive into PostgreSQL's replication mecha...
       ___________________________________________________________________
        
       A Technical Dive into PostgreSQL's replication mechanisms
        
       Author : jacobprall
       Score  : 138 points
       Date   : 2024-01-11 16:55 UTC (6 hours ago)
        
 (HTM) web link (airbyte.com)
 (TXT) w3m dump (airbyte.com)
        
       | jacobprall wrote:
       | This is a guide to logical replication in Postgres where I break
       | down some of the internal components of the database to explain
       | CDC. If you've ever wondered how WAL buffers work, or what
       | happens when a transaction is executed, check it out!
        
       | robertlagrant wrote:
       | It is a cool idea to use Neon and Airbyte together, as database
       | and pushing to analytics is a classic expensive-only use case.
       | 
       | I don't know how expensive this would get, of course.
        
         | jacobprall wrote:
         | Should be relatively affordable for modest deployments - I
         | would categorize Neon and Airbyte as "bang-for-your-buck"
         | products (Airbyte vs Fivetran, + Neon's solid free tier).
        
       | cowthulhu wrote:
       | Sortof unrelated, I've been looking at moving some data from SQL
       | Server to Postgres, and one of the big reasons is replication.
       | SQL Server replication has been super brittle for me - it's
       | always silently choking, getting desynchronized, or exhibiting
       | weird locking behavior with no indication of the issue until you
       | notice something downstream is broken. It's been tough to test
       | Postgres replication though, since a lot these issues only occur
       | at huge volumes of data. Anyone have any experience with the two
       | they can pass on?
        
         | staticlibs wrote:
         | > moving some data from SQL Server to Postgres
         | 
         | I don't have any first-hand experience with Postgres
         | replication to share, just, when moving DB from MSSQL,
         | Babelfish extensions for Postgres (https://babelfishpg.org/)
         | may be of interest.
        
         | cryptonector wrote:
         | PG logical replication is rock solid. One annoyance is that you
         | can't subscribe to a publication but using a different schema
         | name, say.
        
       | smartbit wrote:
       | I learned Postgres replication by studying _PostgreSQL 16
       | Administration Cookbook_ by _Simon Riggs et al_
       | 
       | - Chap 11 Backup and Recovery
       | 
       | - Chap 12 Replication and Upgrades
       | 
       | Highly recommended!
       | 
       | https://learning.oreilly.com/library/view/-/9781835460580/ or
       | https://www.packtpub.com/product/postgresql-16-administratio...
        
         | apnew wrote:
         | Thanks for the book recommendations!
        
       | alfor wrote:
       | Unrelated to the post:
       | 
       | Is there realtime features with postgres?
       | 
       | It seem like a kludge to have to add redis, mqtt, or kafka to our
       | application to get things as they change.
        
         | defaultcompany wrote:
         | There is NOTIFY [1] that might do what you are thinking of.
         | 
         | [1] https://www.postgresql.org/docs/current/sql-notify.html
        
         | devbug wrote:
         | You can LISTEN/NOTIFY. Or you can use logical replication and a
         | custom subscriber.[1] Supabase uses the latter.[2]
         | 
         | [1]: https://www.postgresql.org/docs/current/logical-
         | replication....
         | 
         | [2]: https://github.com/supabase/realtime
        
           | mritchie712 wrote:
           | we use this feature in Supabase, works great.
        
             | dventimi wrote:
             | Supabase has arguably a better alternative, which uses
             | logical replication and can be used outside of Supabase.
             | 
             | https://github.com/supabase/realtime
        
               | jacobprall wrote:
               | Supabase realtime (especially if you want a managed
               | backend) or other streaming CDC setups (like Decodable,
               | which is Flink/Debezium under the hood) are also great
               | choices for logical replication. Streaming tech will
               | continue to get more cost-effective and simpler to
               | implement in the coming year(s).
               | 
               | I should note: I haven't used Decodable in production
               | yet, I'm just a fan of Flink :)
        
           | cryptonector wrote:
           | One caveat about LISTEN/NOTIFY is that channels are not
           | first-class objects, so there's no authorization associated
           | with them, thus anyone who can login can also NOTIFY any
           | payload to any channel.
        
         | cpursley wrote:
         | Yes, WalEx.
         | 
         | Recently added the concepts of Destinations - where you can
         | just configure it to send database change events to an Elixir
         | module, webhook or EventRelay (the later two don't require
         | Elixir know-how).
         | 
         | https://github.com/cpursley/walex?tab=readme-ov-file#destina...
        
       | cpursley wrote:
       | Good stuff, I'm a little obsessed with Postgres replication and
       | the WAL (write ahead log).
       | 
       | If you're an Elixir user, you might find my library for
       | subscribing to Postgres WAL events useful:
       | https://github.com/cpursley/walex
       | 
       | It's a lot easier to operate than the typical debezium setup
       | (which is what I think Airbyte uses behind the scenes).
       | 
       | I need to write a guide on how to use WalEx with Neon.
        
         | jacobprall wrote:
         | Looks cool. I appreciate the support for all replica identity
         | settings!
        
       | klysm wrote:
       | Logical replication is an immensely powerful for integrating
       | backend data.
        
       | egnehots wrote:
       | There was recently a very interesting overview of the different
       | distributed PostgreSQL architectures:
       | 
       | https://www.crunchydata.com/blog/an-overview-of-distributed-...
        
       | anarazel wrote:
       | > It even assists in PostgreSQL's implementation of Multiversion
       | Concurrency Control (MVCC) - the WAL keeps a version history of
       | data changes,
       | 
       | That's not really correct - postgres' MVCC implementation doesn't
       | read from the WAL. Sure, row changes are WAL logged, but that's
       | not really related to MVCC.
        
         | jacobprall wrote:
         | That's an excellent point. My statement was unnecessarily
         | confusing. I've changed it to simply reference another benefit
         | of the WAL - optimizing I/O operations. That would make a good
         | blog post in and of itself :D
        
       ___________________________________________________________________
       (page generated 2024-01-11 23:00 UTC)