[HN Gopher] Minimal downtime major PostgreSQL version upgrades w...
___________________________________________________________________
Minimal downtime major PostgreSQL version upgrades with
pg_easy_replicate
Author : shayonj
Score : 222 points
Date : 2023-06-20 15:46 UTC (7 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| jensenbox wrote:
| I know I am going to get fairly polarized responses on this
| one...
|
| Too bad it is written in Ruby.
| remram wrote:
| If you had included any content or justification in your
| comment, it might have gone differently. "I don't like software
| written in Ruby" is not interesting to anyone. No polarization.
| revskill wrote:
| What if it's written in Assembly (which is hard to read and
| maintain...)?
| mike256 wrote:
| I like code written in Assembly. No problem with that. If it's
| written in Assembly in the first place (not disassembled from
| another language), it's usually not that hard to read...
| mike_hock wrote:
| Based.
| jreed91 wrote:
| Be careful if you have sequences as those are not replicated.
| https://www.jacobreed.dev/blog/migrate-postgres-instances
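|
| If you're migrating by hand, a rough sketch of syncing them after
| the copy (untested; assumes sequence names match on both sides):
|
|   # Generate setval() calls from the source's current sequence
|   # values and replay them on the target (placeholder hosts).
|   psql "host=old-db dbname=app" -At -c "
|     SELECT format('SELECT setval(%L, %s, true);',
|                    schemaname || '.' || sequencename, last_value)
|     FROM pg_sequences WHERE last_value IS NOT NULL;
|   " | psql "host=new-db dbname=app"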
| shayonj wrote:
| I will update the readme to mention this - but the lib takes
| care of updating sequence with the data catchup:
| https://github.com/shayonj/pg_easy_replicate/blob/13ad6af51f...
| gfosco wrote:
| I've done minimal-downtime major version upgrades using standard
| replication, switching between two "master" servers, with
| downtime in the five-minute range.
| hinkley wrote:
| This is also a spot where having failover strategies and
| circuit breakers can help. Microservices make you think about
| these things but you don't have to wait for that forcing
| function to think about them.
|
| When the database goes down, you have to do something else. It
| could be that the server crashed, or it could just be a Postgres
| upgrade. Five minutes is just about the right amount of time for
| an open circuit to do its job.
| tudorg wrote:
| Looks very interesting!
|
| > The switch is made by putting the user on the source database
| in READ ONLY mode, so that it is not accepting any more writes
| and waits for the flush lag to be 0. It is up to the user to kick
| off a rolling restart of your application containers or failover
| DNS (more on these below in strategies) after the switchover is
| complete, so that your application isn't sending any read/write
| requests to the old/source database.
|
| This does mean that there will be a "downtime" during which the
| DB is in read-only mode, right? I understand this period can be
| really small, but that depends on how long the app deployment
| takes.
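|
| If I had to guess at the mechanics (the usual recipe, not
| necessarily this library's actual code):
|
|   # New sessions for the app user become read-only; existing
|   # sessions keep writing until they reconnect.
|   psql "$SRC" -c "ALTER USER app_user SET
|     default_transaction_read_only = on;"
|   # Then wait until the subscriber has flushed everything:
|   psql "$SRC" -At -c "SELECT pg_wal_lsn_diff(pg_current_wal_lsn(),
|     flush_lsn) FROM pg_stat_replication;"   # 0 = no flush lag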
| shayonj wrote:
| Thanks for taking a look!
|
| That's right, it was a typo. I meant zero data loss and minimal
| downtime. There are some strategies in the readme, like using
| weighted DNS failover, to reduce the downtime even further
| without requiring an application deployment.
| MuffinFlavored wrote:
| The title says "zero downtime" but the GitHub tagline says
| "minimal downtime"
|
| Zero... data loss maybe?
| tycoon177 wrote:
| Was just going to comment this. It's specifically:
|
| > This ensures zero data loss and minimal downtime for the
| application.
| cmdrriker wrote:
| Interesting; how does this compare to something like
| pg_auto_failover in a two or three-node PostgreSQL setup?
| pqdbr wrote:
| Let's say I have a Rails app and I want to do a major postgres
| version upgrade using this tool.
|
| I see one of the requirements is that 'Both databases should have
| the same schema'. How should I manually load the schema to the
| new database (running on port 5433 for instance)?
| shayonj wrote:
| That's a great question: internally, when preparing the new
| empty database, we also run a "bundle exec rake db:schema:load"
| against it. Depending on your setup, that can be hard to do.
|
| I am exploring some options like loading schema from a
| schema.sql file or something along the lines of what pg_dump
| does for schema export.
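|
| Roughly (placeholder connection strings; 5433 being the new
| cluster from the question above):
|
|   pg_dump --schema-only --no-owner --no-privileges \
|     "host=localhost port=5432 dbname=app" \
|     | psql "host=localhost port=5433 dbname=app"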
|
| Would love to hear any ideas.
| jrochkind1 wrote:
| The easy answer would be: on some non-production copy of the
| Rails app, point at the new db in database.yml and simply run
| `rails db:create; rails db:schema:load`, or some variation like
| `rails db:prepare`.
|
| Rather than this ad-hoc thing, you could also try to create a
| new rails "environment" for this use, I guess.
| jxcl wrote:
| Yeah I think the documentation is a bit ambiguous here.
| Postgres "schemas" have a very specific meaning, different from
| what most people think of when they hear the word. It's more
| like a table namespace.
|
| https://www.postgresql.org/docs/current/ddl-schemas.html
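|
| e.g. two tables with the same name can coexist in different
| schemas:
|
|   psql -c "CREATE SCHEMA billing; CREATE SCHEMA auth;"
|   psql -c "CREATE TABLE billing.accounts (id int);"
|   psql -c "CREATE TABLE auth.accounts (id int);"  # no name clash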
| hinkley wrote:
| I have historically had a "drop-create script" that could
| handle building me empty tables. Typically split up so that I
| can create tables, import (test) data from a snapshot or
| synthetic source, then enable constraints and indexes.
|
| So even if a tool like that needed a little boost at the
| beginning to get going, I'd have something that already took
| care of that.
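|
| Skeleton of what I mean (hypothetical file and table names):
|
|   #!/bin/sh
|   set -e
|   psql "$DB" -f drop_all.sql         # drop everything, in order
|   psql "$DB" -f create_tables.sql    # bare tables, no indexes yet
|   psql "$DB" -c "\copy users FROM 'users.csv' CSV HEADER"
|   psql "$DB" -f constraints_and_indexes.sql  # enable these last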
| ekiauhce wrote:
| Is there an alternative for MySQL? We want to upgrade from 5.7 to
| 8 pretty soon, so it would be nice to have the appropriate
| toolkit.
| hans_castorp wrote:
| There is also a new (open-source) Postgres extension that claims
| to provide multi-master replication based on logical replication,
| BDR, and pglogical
|
| https://pgedge.github.io/spock/
|
| which can also be used for online upgrades.
|
| However, it only supports Postgres 15 and the (not yet released)
| 16.
| srslack wrote:
| I did this the "hard way" a year ago. This blog post came in
| handy:
|
| https://jstaf.github.io/posts/pglogical/
|
| pglogical is magic. Keeping an eye on this project for the
| future, looks great.
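|
| The core of the pglogical route, from memory (hypothetical DSNs;
| the post above has the real details):
|
|   # On the provider (old cluster):
|   psql "$OLD" <<'SQL'
|   SELECT pglogical.create_node(node_name := 'provider',
|            dsn := 'host=old-db dbname=app');
|   SELECT pglogical.replication_set_add_all_tables('default',
|            ARRAY['public']);
|   SQL
|   # On the subscriber (new cluster, schema already loaded):
|   psql "$NEW" <<'SQL'
|   SELECT pglogical.create_node(node_name := 'subscriber',
|            dsn := 'host=new-db dbname=app');
|   SELECT pglogical.create_subscription(
|            subscription_name := 'upgrade_sub',
|            provider_dsn := 'host=old-db dbname=app');
|   SQL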
| sgt wrote:
| > Binary replication only works with DBs of the same major
| version. If you want to replicate to a different version,
| well... you can't.
|
| Is this true? I thought they fixed that at some point after
| PG11.
| shayonj wrote:
| Love pglogical. I was thinking it'd be nice to orchestrate this
| setup using pglogical too. I'd attempt it if there is interest,
| especially with a bi-directional replication setup.
| jrochkind1 wrote:
| I guess this is pretty similar to what Heroku does for pg
| updates?
|
| If using this tool for upgrades with minimal downtime, if I
| understand right, you need to orchestrate your db callers to
| switch over to the new primary at the right point, when it's
| ready? Tips for getting that to happen at just the right point to
| minimize downtime?
| shayonj wrote:
| Great question: you can watch the stats command. It returns JSON
| every second with a switchover_completed_at field, which is
| updated as soon as the switchover is complete.
|
| UPDATE: Some more reading material here:
| https://github.com/shayonj/pg_easy_replicate#switchover-stra...
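|
| e.g. a wait-then-flip loop (exact flags may differ, check the
| readme):
|
|   until pg_easy_replicate stats --group-name mygroup \
|       | jq -e '.switchover_completed_at != null' >/dev/null; do
|     sleep 1
|   done
|   # Now flip the app: rolling restart, DNS failover, etc.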
| remram wrote:
| The fastest way to switch over is probably adding an iptables
| rule on the old server to DNAT to the new server, unless you
| have a load balancer or other network middleware like
| Kubernetes' clusterIP services (which use iptables or IPVS
| internally).
|
| Waiting for a rolling restart or DNS cache expiration could take
| a while, during which the app is talking to a read-only database,
| if I understand correctly.
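|
| Something like this on the old box, assuming clients keep
| connecting to it (NEW_IP being the new server):
|
|   sysctl -w net.ipv4.ip_forward=1
|   iptables -t nat -A PREROUTING -p tcp --dport 5432 \
|     -j DNAT --to-destination "$NEW_IP":5432
|   iptables -t nat -A POSTROUTING -p tcp -d "$NEW_IP" --dport 5432 \
|     -j MASQUERADE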
| aidos wrote:
| I could definitely have done with that idea a couple of weeks
| back!
| shayonj wrote:
| Cool idea!
| bbarnett wrote:
| This is why you always give two IP addresses to such servers:
| one for the server, and one for the service.
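|
| The cutover is then just moving the service address (sketch;
| interface and address are placeholders):
|
|   # Old box: release the service IP.
|   ip addr del 10.0.0.100/24 dev eth0
|   # New box: claim it and announce it to the LAN.
|   ip addr add 10.0.0.100/24 dev eth0
|   arping -c 3 -U -I eth0 10.0.0.100   # gratuitous ARP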
| take-five wrote:
| Nice UX!
|
| I checked the code and noticed some things that you might want to
| address in the future (I've done major version upgrades via
| logical replication myself several times using a script similar
| to this).
|
| The "default" way of bringing the target DB in sync with the
| source one (CREATE PUBLICATION on the source then CREATE
| SUBSCRIPTION on the target) only works for moderately sized
| databases. Even on ~50GB I noticed that it may take hours to
| catch up with the origin. There are a few solutions:
|
| 1. Drop all indices except the primary keys before performing
| the initial copy, then restore them after all tables have been
| copied and the replication stream has started.
|
| 2. Instead of copying into a blank database, you could create a
| logical dump using pg_dump with snapshot isolation, restore it
| on the target DB using pg_restore, create a subscription in the
| disabled state, advance the subscription's replication origin to
| the LSN at which you created the logical dump, and then enable
| the subscription (condensed sketch after this list). This way
| your target database only has to process the delta between the
| time you created the dump and the current time. I have a snippet
| illustrating how we did it: https://gist.github.com/take-
| five/1dab3a99c8636a93fc69f36ff9....
|
| 3. We found out that pg_dump + pg_restore is still very slow on
| larger databases (~1TB), and it's untenable to wait several
| hours. Remember that while the dump is being restored, the
| source database is accumulating WAL that it will later have to
| send to the target DB. If left unchecked for 12-18 hours, that
| can lead to running out of disk space. This was unacceptable for
| us, so instead of creating the target DB from a logical dump, we
| created a copy of the source database from an AWS RDS snapshot,
| upgraded it using pg_upgrade, and then set up logical
| replication (similar to how it's described in the Percona blog:
| https://www.percona.com/blog/postgresql-logical-
| replication-...). Something like this can probably be achieved
| with ZFS snapshots.
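|
| A condensed sketch of option 2 (hypothetical names; the gist
| above has the working version):
|
|   # Source: publish the tables.
|   psql "$SRC" -c "CREATE PUBLICATION upgrade_pub FOR ALL TABLES;"
|
|   # Source, in a replication session that must STAY OPEN until
|   # the dump finishes; it returns the consistent LSN and an
|   # exported snapshot name:
|   #   CREATE_REPLICATION_SLOT upgrade_slot LOGICAL pgoutput
|   #     EXPORT_SNAPSHOT;
|
|   # Dump exactly that snapshot and restore it on the target.
|   pg_dump -d "$SRC" --snapshot="$SNAPSHOT_NAME" -Fc -f app.dump
|   pg_restore -d "$DST" --no-owner app.dump
|
|   # Target: attach a disabled subscription to the existing slot.
|   psql "$DST" <<SQL
|   CREATE SUBSCRIPTION upgrade_sub
|     CONNECTION '$SRC_DSN' PUBLICATION upgrade_pub
|     WITH (create_slot = false, slot_name = 'upgrade_slot',
|           enabled = false, copy_data = false);
|   SQL
|
|   # Target: advance the subscription's origin to the slot's
|   # consistent LSN, then start streaming the delta.
|   psql "$DST" <<SQL
|   SELECT pg_replication_origin_advance(
|     'pg_' || (SELECT oid FROM pg_subscription
|               WHERE subname = 'upgrade_sub'),
|     '$CONSISTENT_LSN');
|   ALTER SUBSCRIPTION upgrade_sub ENABLE;
|   SQL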
|
| --
|
| Otherwise, it was very similar to how you programmed your
| library.
|
| Recently, I saw an article here on HN on the same topic of
| migrating Postgres databases using logical replication. There was
| a very nice idea to set up _reverse_ logical replication after
| switchover so that it's possible to roll back quickly in case
| something breaks.
|
| EDIT: formatting
| shayonj wrote:
| These are great shouts! Thank you so much for sharing. One of
| the operations I ran with this was ~500GB, and yeah, it takes a
| few hours to catch up. For us it wasn't an issue, especially
| since we were on AWS Aurora. That said, I can totally see it not
| being feasible on other database engines/systems. I considered
| the idea of dropping all the indices and re-creating them once
| the catch-up is complete; I'd like pg_easy_replicate to support
| that. It should make the initial COPY a lot more swift. On a
| 200GB DB it cut the total time by 90%.
|
| Re: bi-directional replication, you read my mind :).
| https://github.com/shayonj/pg_easy_replicate/blob/8df94aa93f...
|
| These two features are at the top of my list to support in
| pg_easy_replicate.
| pelagicAustral wrote:
| Much appreciated. I was actually looking into a few options for
| this exact same thing. I will give it a shot right away.
| shayonj wrote:
| Thanks for giving it a try. Please don't hesitate to open
| bugs/feature requests.
___________________________________________________________________
(page generated 2023-06-20 23:00 UTC)