[HN Gopher] PostgreSQL Streaming Replication (WAL); What It Is a...
___________________________________________________________________
PostgreSQL Streaming Replication (WAL); What It Is and How to
Configure One
Author : thunderbong
Score : 147 points
Date : 2024-10-12 11:52 UTC (11 hours ago)
(HTM) web link (mindhub365.com)
(TXT) w3m dump (mindhub365.com)
| pqdbr wrote:
| It's a great article, but I've always felt these are missing
| critical real-world application from the perspective of a full
| stack dev who also wants to manage their own databases.
|
| - How do I check how many seconds the replica is lagging behind
| master?
|
| - How would I monitor the replica? A simple cron task that pings
| a health check if everything is OK (lag is < x) would be a nice
| start.
|
| And then things get complicated quickly:
|
| - How do I failover to the replica if the primary goes down?
| There's pgBouncer, repmgr, patroni...
|
| - Should I have it automatically or manually?
|
| - Do I need 2 replicas to avoid the split brain scenario? my
| brain hurts already.
|
| - After a failover occurs (either automatically or manually), how
| in the world am I going to configure the primary to be the
| primary again, and the replica to act as the replica again, going
| back to the original scenario?
|
| I'd pay to learn this with confidence.
| forinti wrote:
| Lag is one little detail that I find lacking in Streaming
| Replication. If there was no transaction in the last n seconds,
| it's going to tell you that the lag is n seconds.
|
| Oracle's Dataguard will tell you there's no lag, because the
| databases are the same.
|
| In a development database, lag can get quite high, but it makes
| no sense to set off an alarm if lag gets too high because it
| might just be that there was nothing going on.
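|
| One workaround is to report zero lag whenever the replica has
| replayed everything it has received, e.g. (a sketch using the
| PG 10+ function names, run on the replica):
|
|     SELECT CASE
|       WHEN pg_last_wal_receive_lsn() = pg_last_wal_replay_lsn()
|         THEN 0
|       ELSE EXTRACT(EPOCH FROM
|                    now() - pg_last_xact_replay_timestamp())
|     END AS lag_seconds;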
|
| But I much prefer the simplicity and reliability of PostgreSQL
| over Oracle, any day of the week.
| WJW wrote:
| If it's really a problem, you can always use the pt-heartbeat
| tool from the percona toolkit:
| https://docs.percona.com/percona-toolkit/pt-heartbeat.html
|
| Conceptually it is very straightforward: it just updates the
| only row in a special table to the current time every
| --interval seconds on the main database, then reads that value
| on any followers. Subtracting the two timestamps gives the
| replication lag.
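|
| A hand-rolled version of the same idea (a sketch, not
| pt-heartbeat's actual schema) looks like:
|
|     -- once, on the primary:
|     CREATE TABLE heartbeat (ts timestamptz);
|     INSERT INTO heartbeat VALUES (now());
|
|     -- on the primary, from cron or a loop:
|     UPDATE heartbeat SET ts = now();
|
|     -- on the replica, replication lag in seconds:
|     SELECT EXTRACT(EPOCH FROM now() - ts) AS lag_seconds
|       FROM heartbeat;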
| forinti wrote:
| I'll check it out. Thanks.
| meow_catrix wrote:
| The modern way is to sidestep the issue altogether and use
| Kubernetes with a database designed to run on Kubernetes. You
| can get sharding, replication and leader election essentially
| for free - you can concentrate on using the database instead of
| running the database.
|
| Compute is really cheap compared to engineering man-hours.
| cpucycling7 wrote:
| > The modern way is to sidestep the issue altogether and use
| Kubernetes
|
| Kubernetes does require quite some time to learn/master. So
| you could say one replaces one time-consuming issue with
| another.
| remram wrote:
| What's "a database designed to run on Kubernetes"? Cassandra?
| fastest963 wrote:
| Something like Yugabyte or Cockroach
| linuxdude314 wrote:
| Pretty sure they were referring to the operators you can
| install that will stand up production ready replicated
| clusters.
|
| They work quite well.
| remram wrote:
| They replied to a comment mentioning "pgBouncer, repmgr,
| patroni" so supposedly not one of those.
| dionian wrote:
| it was a minor pain finding and setting up a postgres
| operator in k8s, but once i got it going it wasn't too
| horrible. are these other solutions that are more built for
| it significantly easier to manage?
| slig wrote:
| Which one did you end up choosing?
| kachapopopow wrote:
| https://artifacthub.io/packages/helm/bitnami/postgresql,
| postgres is fortunately one of them. Downside is that you
| cannot scale writes.
| yrro wrote:
| Does the Cloud Native PG operator count?
| cheboygan wrote:
| +1 on checking out cloud native PostgreSQL operator, or
| other PG operators like crunchy or zalando or ongres
| napsterbr wrote:
| > Do I need 2 replicas to avoid the split brain scenario? my
| brain hurts already.
|
| It will hurt even more.
|
| The recommended way is to set up a witness server. Yet another
| thing to manage in a properly designed Postgres cluster.
| Certainly not an easy/trivial thing to do, ops-wise.
|
| From [0]:
|
| > By creating a witness server in the same location (data
| centre) as the primary, if the primary becomes unavailable it's
| possible for the standby to decide whether it can promote
| itself without risking a "split brain" scenario: if it can't
| see either the witness or the primary server, it's likely
| there's a network-level interruption and it should not promote
| itself. If it can see the witness but not the primary, this
| proves there is no network interruption and the primary itself
| is unavailable, and it can therefore promote itself (and
| ideally take action to fence the former primary).
|
| An interesting acronym you'll hear is STONITH ("shoot the other
| node in the head"), used to fence the former primary.
|
| [0] - https://www.repmgr.org/docs/current/repmgrd-witness-
| server.h...
| nolist_policy wrote:
| You want STONITH anyway. It's all necessary complexity in an
| HA cluster.
| candiddevmike wrote:
| > I'd pay to learn this with confidence.
|
| Great, there's a whole industry of PostgreSQL
| consultants/platforms ready to take your money. You could put
| on your tinfoil hat and say PostgreSQL's lack of out-of-the-box,
| integrated, easy-to-use HA is by design to make money.
|
| I think most folks use Patroni with some kind of service
| discovery solution like Kubernetes or Consul and have it
| abstract all of this for them.
| lbriner wrote:
| You could also say that since the maintainers are doing this
| for free, it is OK that some of them work for businesses that
| provide (optional) technical support.
|
| However, your mileage may vary. We found someone listed as a
| postgres consultant to help us but aside from a load of
| (alleged) personal problems that made the job take ages, I
| realised that the optimisation he sent me was basically a
| rehashed version of something I found on the internet, which
| he was obviously misrepresenting as his work. He got really
| arsey with me when I told him that I was really disappointed
| with his service!
| egnehots wrote:
| > How do I check how many seconds the replica is lagging behind
| the master?
|
| Use PostgreSQL administrative functions, specifically:
| pg_last_xact_replay_timestamp.
| (https://www.postgresql.org/docs/current/functions-
| admin.html...)
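|
| For example, on the replica:
|
|     SELECT EXTRACT(EPOCH FROM
|                    now() - pg_last_xact_replay_timestamp())
|            AS lag_seconds;
|
| On the primary, pg_stat_replication also exposes write_lag,
| flush_lag and replay_lag per standby (PG 10+). Just keep in mind
| that on an idle primary the timestamp-based number keeps growing
| even though the replica isn't actually missing anything.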
|
| > How would I monitor the replica? A simple cron task that
| pings a health check if everything is OK (lag is < x) would be
| a good start.
|
| There are many solutions, highly dependent on your context and
| the scale of your business. Options range from simple cron jobs
| with email alerts to more sophisticated setups like ELK/EFK, or
| managed services such as Datadog.
|
| > How do I failover to the replica if the primary goes down?
|
| > Should I handle failover automatically or manually?
|
| > Do I need two replicas to avoid a split-brain scenario? My
| head hurts already.
|
| While it may be tempting to automate failover with a tool, I
| strongly recommend manual failover if your business can
| tolerate some downtime.
|
| This approach allows you to understand why the primary went
| down, preventing the same issue from affecting the replica.
| It's often not trivial to restore the primary or convert it to
| a replica. YOU become the consensus algorithm, the observer,
| deciding which instance becomes the primary.
|
| Two scenarios to avoid:
|
| * Failing over to a replica only for it to fail too (e.g., due
| to a full disk).
|
| * Successfully switching over so transparently that you will
| not notice that you're now running without a replica.
|
| > After a failover (whether automatic or manual), how do I
| reconfigure the primary to be the primary again, and the
| replica to be the replica?
|
| It's easier to switch roles and configure the former primary as
| the new replica. It will then automatically synchronize with
| the current primary.
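|
| One caveat: if the old primary accepted writes after the
| failover, it usually needs pg_rewind (or a fresh base backup)
| before it can follow the new primary. Roughly (paths and
| conninfo are placeholders; requires wal_log_hints=on or data
| checksums):
|
|     pg_rewind \
|       --target-pgdata=/var/lib/postgresql/data \
|       --source-server='host=new-primary user=postgres dbname=postgres'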
|
| You might also want to use the replica for:
|
| * Some read-only queries. However, for long-running queries you
| will need to configure how long replay may be delayed (e.g.
| max_standby_streaming_delay) so they don't get cancelled by
| replication conflicts.
|
| * Backups or point-in-time recovery.
|
| If you manage a database yourself, I strongly recommend gaining
| confidence first in your backups and your ability to restore
| them quickly. Then you can play with replication; there are tons
| of little settings to configure (async for performance, a large
| enough WAL size to restore quickly, ...).
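|
| Illustrative starting points for the primary (values are
| placeholders, tune per workload):
|
|     # postgresql.conf
|     wal_level = replica
|     max_wal_senders = 10
|     wal_keep_size = 2GB      # PG 13+; or use a replication slot
|     # leave synchronous_standby_names empty for async replication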
|
| It's not that hard, but you want to have the confidence and the
| procedure written down before you have to do it in a production
| incident.
| andix wrote:
| You're completely right. You need a full management solution
| around postgres to make it work, and I wouldn't recommend
| building it yourself.
|
| One solution is Kubernetes and one of the many postgres
| operators. Still not easy as pie.
| klaussilveira wrote:
| I am managing my own PostgreSQL cluster with Patroni and, so
| far, the experience has been a breeze. The initial learning
| curve is difficult, Patroni docs are not the best, and the
| community support around it is minimal. Naive questions are
| usually torpedoed by the maintainer or other experienced devs,
| which does not foster a good environment for people learning on
| their own.
|
| Luckily, the Percona Distribution for Postgres includes amazing
| documentation on how to setup and run Patroni, even if you
| choose not to use their distribution. I would highly recommend
| following their step by step:
| https://docs.percona.com/postgresql/17/solutions/ha-setup-ap...
|
| I have OpenTofu scripts for setting this cluster up, although
| they might not be useful to you since I am using CloudStack
| instead of other clouds.
| himinlomax wrote:
| > - How do I check how many seconds the replica is lagging
| behind master?
|
| > - How would I monitor the replica? A simple cron task that
| pings a health check if everything is OK (lag is < x) would be
| a nice start.
|
| No, you should use Patroni. It configures and monitors
| Postgres' native replication.
|
| https://patroni.readthedocs.io/en/latest/
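|
| A minimal patroni.yml is roughly shaped like this (a sketch;
| names, addresses and the DCS backend are placeholders):
|
|     scope: my-cluster
|     name: node1
|     restapi:
|       listen: 0.0.0.0:8008
|       connect_address: 10.0.0.1:8008
|     etcd3:
|       hosts: 10.0.0.10:2379
|     postgresql:
|       data_dir: /var/lib/postgresql/17/main
|       listen: 0.0.0.0:5432
|       connect_address: 10.0.0.1:5432
|       authentication:
|         replication:
|           username: replicator
|           password: secret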
| lbriner wrote:
| > How do I check replica lag? I use the prometheus exporter
| for postgres
|
| > How would I monitor the replica? Same. You can also use
| something like HA proxy calling a postgres CLI command to
| connect to the instance
|
| > How do I failover? Mostly, you probably want to do this
| manually because there can be data loss and you want to make
| sure the risk is worth it. I simply use repmgr for this.
|
| > Do I need 2 replicas? It's usually good to have at least 3 (1
| master and 2 slaves) but mostly so that if one fails, you still
| have 2 remaining i.e. time to get a 3rd back online
|
| > How do I failback? Again, very easy with repmgr, you just
| tell the primary to be the primary again. The failed over
| primary gets stopped, the original primary gets fast-forwarded
| and promoted to primary and everything else gets told to
| follow.
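|
| For reference, the switchover is roughly (run on the standby you
| want to promote, --dry-run first):
|
|     repmgr standby switchover -f /etc/repmgr.conf \
|       --siblings-follow --dry-run
|     repmgr standby switchover -f /etc/repmgr.conf --siblings-follow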
|
| I do agree that this space for postgres is very fragmented and
| some tools appear abandoned, but it's pretty straightforward
| with just postgres + barman + repmgr. I have a series of videos
| on YouTube if you are interested, but I am not a Postgres expert
| so please no hating :-) https://youtu.be/YM41mLZQxzE
| cheald wrote:
| +1 to all of this. The thing I'd add is that we use barman
| for our additional replicas; WAL streaming is very easy to do
| with Barman, and we stream to two backups (one onsite, one
| offsite). The only real costs are bandwidth and disk space,
| both of which are cheap. Compared to running a full replica
| (with its RAM costs), it's a very economical way to have a
| robust disaster recovery plan.
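|
| A barman server definition for streaming looks roughly like this
| (a sketch; hosts and users are placeholders):
|
|     ; /etc/barman.d/pg-primary.conf
|     [pg-primary]
|     description = "Primary DB"
|     conninfo = host=pg-primary user=barman dbname=postgres
|     streaming_conninfo = host=pg-primary user=streaming_barman
|     backup_method = postgres
|     streaming_archiver = on
|     slot_name = barman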
|
| If you're doing manual failover, you don't need an odd number
| of nodes in the cluster (since you aren't looking for quorum
| to automatically resolve split-brain like you would be with
| tools like Elasticsearch or redis-sentinel), so for us it's just a
| question of "how long does it take to get back online if we
| lose the primary" (answer: as long as it takes to determine
| that we need to do a switch and invoke repmgr switchover),
| and "how robust are we against catastrophic failure" (answer:
| we can recover our DB from a very-close-to-live barman backup
| from the same DC, or from an offsite DC if the primary DC got
| hit by an airplane or something).
| kachapopopow wrote:
| I see this as one of the reasons to use kubernetes (& helm).
|
| https://artifacthub.io/packages/helm/bitnami/postgresql
|
| Configures all of this for you with near zero additional
| configuration required. There's also postgres-ha, which handles
| zero-downtime failover by spawning a proxy that handles failures
| in a specialized way rather than just directly forwarding to a
| psql server.
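|
| Roughly (release name is arbitrary; value names can shift
| between chart versions):
|
|     helm install my-pg \
|       oci://registry-1.docker.io/bitnamicharts/postgresql \
|       --set architecture=replication \
|       --set readReplicas.replicaCount=2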
| andix wrote:
| The only real-world, easy-to-use solutions for postgres
| replication I've found are the kubernetes operators, for example
| CloudnativePG.
|
| It's not just replication that you need. It's failover, recovery,
| monitoring, self-healing, backups, and so on.
|
| Are there any other free/open implementations outside of
| kubernetes?
| xtracto wrote:
| I used the WAL feature with a bash script that compressed the
| WAL into .xz files (-9 compression) every 10,000 lines and
| stored those files in an s3fs folder. This served as a type of
| "online" backup solution.
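|
| The built-in hook for this kind of thing is archive_command; a
| sketch of the same idea (the s3fs mount path is a placeholder,
| and a real setup should also check the target file doesn't
| already exist):
|
|     # postgresql.conf
|     archive_mode = on
|     archive_command = 'xz -9 -c %p > /mnt/s3/wal/%f.xz'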
| andix wrote:
| Sure, but you probably want some automated snapshot recovery
| on a node that has a quite recent replica. Restoring a
| complete backup might take hours or days, and in a disaster
| scenario you want to be back online within a few minutes.
| Probably even without manual intervention.
|
| And at the same time you want to spin up a new replica to
| replace the lost one.
| klaussilveira wrote:
| For the K8S folks: https://stackgres.io
___________________________________________________________________
(page generated 2024-10-12 23:01 UTC)