[HN Gopher] PostgreSQL 14
       ___________________________________________________________________
        
       PostgreSQL 14
        
       Author : jkatz05
       Score  : 815 points
       Date   : 2021-09-30 12:50 UTC (10 hours ago)
        
 (HTM) web link (www.postgresql.org)
 (TXT) w3m dump (www.postgresql.org)
        
       | streamofdigits wrote:
       | the sense of pride is palpable (and well deserved)
        
         | [deleted]
        
       | ggktk wrote:
        | PostgreSQL is my favorite database, but I wish it were possible to
       | use it as a library like sqlite. That would let me use it in a
       | lot more places.
        
       | stronglikedan wrote:
       | Anyone know where to download it? Looks like they haven't updated
       | the download pages, even though they link to them at the end of
       | this post.
        
         | veidelis wrote:
         | Here, for example,
         | https://www.enterprisedb.com/downloads/postgresql
        
       | mattashii wrote:
       | Once again, thanks to all the contributors that provided these
       | awesome new features, translations and documentation.
       | 
       | It's amazing what improvements we can get through public
       | collaboration.
        
       | garyclarke27 wrote:
       | Fantastic piece of software. The only major missing feature that
       | I can think of is Automatic Incremental Materialized View
       | Updates. I'm hoping that this good work in progress makes it to
       | v15 - https://yugonagata-
       | pgsql.blogspot.com/2021/06/implementing-i...
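        | 
        | Until then, the closest built-in option is a full refresh; a
        | minimal sketch with illustrative names:
        | 
        |     CREATE MATERIALIZED VIEW daily_totals AS
        |       SELECT order_date, sum(amount) AS total
        |       FROM orders GROUP BY order_date;
        |     -- CONCURRENTLY avoids blocking readers, but requires a
        |     -- unique index and still recomputes the whole query
        |     CREATE UNIQUE INDEX ON daily_totals (order_date);
        |     REFRESH MATERIALIZED VIEW CONCURRENTLY daily_totals;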
        
         | Tostino wrote:
          | This has been a major one I've wanted for a long time too,
          | but sadly its initial implementation will be too simple to
          | allow me to migrate any of my use cases to use it.
        
       | skrebbel wrote:
       | These changes look fantastic.
       | 
       | If I may hijack the thread with some more general complaints
       | though, I wish the Postgres team would someday prioritize
       | migration. Like make it easier to make all kinds of DB changes on
       | a live DB, make it easier to upgrade between postgres versions
       | with zero (or low) downtime, etc etc.
       | 
       | Warnings when the migration you're about to do is likely to take
       | ages because for some reason it's going to lock the entire table,
       | instant column aliases to make renames easier, instant column
       | aliases with runtime typecasts to make type migrations easier,
       | etc etc etc. All this stuff is currently extremely painful for,
       | afaict, no good reason (other than "nobody coded it", which is of
       | course a great reason in OSS land).
       | 
       | I feel like there's a certain level of stockholm syndrome in the
       | sense that to PG experts, these things aren't that painful
       | anymore because they know all the pitfalls and gotchas and it's
       | part of why they're such valued engineers.
        
         | bmcahren wrote:
         | We currently use MongoDB and while Postgres is attractive for
         | so many reasons, even with Amazon Aurora's Postgres we still
         | need legacy "database maintenance windows" in order to achieve
         | major version upgrades.
         | 
         | With MongoDB, you're guaranteed single-prior-version
         | replication compatibility within a cluster. This means you spin
         | up an instance with the updated version of MongoDB, it catches
         | up to the cluster. Zero downtime, seamless transition. There
         | may be less than a handful of cancelled queries that are
         | retryable but no loss of writes with their retryable writes and
         | write concern preferences. e.g. MongoDB 3.6 can be upgraded to
         | MongoDB 4.0 without downtime.
         | 
         | Edit: Possibly misinformed but the last deep dive we did
         | indicated there was not a way to use logical replication for
         | seamless upgrades. Will have to research.
        
           | ggregoire wrote:
           | MongoDB and Postgres are like apples and oranges tho.
           | 
           | I'm not gonna choose MongoDB if I need a relational model...
           | even if it offers zero downtime upgrades out-of-the-box.
        
           | [deleted]
        
           | TedShiller wrote:
           | The problem with MongoDB though is that you're on MongoDB
        
             | salil999 wrote:
             | And how exactly is that a problem?
        
               | tpxl wrote:
               | Some things seem to have changed from 2018, but MongoDB
               | was by far the worst database I ever had the displeasure
               | of using (and Amazon DocumentDB was even worse).
               | 
               | https://jepsen.io/analyses/mongodb-3-6-4
               | 
               | https://jepsen.io/analyses/mongodb-4.2.6
        
               | jd_mongodb wrote:
                | Posting old Jepsen analyses is like pointing at old bug
                | reports. Every time Jepsen finds a bug we fix it lickety-
               | split. I know it's not cool to focus on that fact, but it
               | is a fact. The Jepsen tests are part of the MongoDB test
               | suite so when we fix those problems they stay fixed.
               | 
               | I would love to hear your personal experience of MongoDB
               | as opposed to reposting old Jepsen reports. Perhaps there
               | is something that we can address in 5.1 that is still a
               | problem?
               | 
               | (I work in developer relations at MongoDB)
        
               | sigmonsays wrote:
                | where are the latest Jepsen test results published?
        
               | jenny91 wrote:
               | The latest "old Jepsen report" is barely a year old. It's
               | not like digging up dirt from years ago.
               | 
               | It also seems like there was quite a lot wrong even a
               | year ago, quoting from there:
               | 
               | > Roughly 10% of transactions exhibited anomalies during
               | normal operation, without faults.
               | 
               | It's just not a very reassuring response to say "when
               | someone goes to dig a bit and finds a lot of show-
               | stopping bugs, we address those specific bugs quickly".
               | 
               | To me it sounds like the architecture and care just isn't
               | there for a robust data storage layer?
        
               | mayankkaizen wrote:
               | That was sarcasm. But yeah, you can search for MongoDB
               | and you'll come across many many posts criticizing it.
               | 
                | It can be said MongoDB is hated as much as Postgres is
                | loved.
               | 
               | Personally I have no opinion about mongodb.
        
               | threeseed wrote:
               | You can find posts criticising every database.
               | 
               | Most of the ones for MongoDB are from a decade ago and
               | not at all relevant today.
        
               | The_Colonel wrote:
                | Yes, and most of these love/hate memes are blown out of
               | proportion by people who don't actually have any real
               | expertise in those technologies, but just parrot whatever
               | they've read in some memes.
        
               | staticassertion wrote:
               | You're exactly correct. Tons of "XYZ is bad" because of
               | some meme that they don't even understand or have context
               | on that hasn't been relevant for years.
               | 
               | I have no idea if MongoDB is good or bad at this point,
               | but the comments of "haha it's mongo" are completely
               | devoid of meaningful content and should be flagged.
        
               | operator9A wrote:
               | I was part of a team that operated a large Mongo cluster
               | for most of the last decade. I would not have advised
               | anyone to use Mongo as their durable source of truth
               | database then, and I still don't think it's advisable to
               | do so now. On numerous occasions, Mongo demonstrated the
               | consequences of poor engineering judgment and an addled
               | approach to logic in critical components responsible for
               | data integrity. In addition, Mongo internalized many poor
               | patterns with respect to performance and change
               | management. Mongo did not, and does not provide the data
               | integrity or performance guarantees that other databases
               | internalize by design (the WiredTiger transition helped,
               | but did not cure many of the issues).
               | 
               | PostgreSQL introduced JSONB GIN index support sometime
               | around 2015, making Postgres a better fit for most JSON-
               | based applications than Mongo.
        
             | kbenson wrote:
             | As someone that mostly shared that opinion for the last
             | decade or more, I recently set up a cluster for work, and
             | everything seems much more production level quality than I
             | remember or what I assumed it was going to be like. I'm not
             | the one using it for queries every day, but I did do a
             | bunch of testing for replication and failed nodes to
              | confirm that I understood (and could rely on) the claims of
             | robustness, and it seemed to be stable and with good
             | documentation of what to expect in different scenarios and
             | how to configure it (which is _not_ what I experienced
             | doing the same testing back in 2010-2011).
             | 
             | All in all, my impression of MongoDB now is that they're
             | one of those "fake it till you make it" _success_ stories,
             | where they leveraged their popularity into enough momentum
              | to fix most of their major problems.
        
               | dralley wrote:
               | >All in all, my impression of MongoDB now is that they're
               | one of those "fake it till you make it" success stories,
               | where they leveraged their popularity into enough
                | momentum to fix most of their major problems.
               | 
               | The downside being that their reputation is now somewhat
               | charred.
        
               | hodgesrm wrote:
               | > All in all, my impression of MongoDB now is that
               | they're one of those "fake it till you make it" success
               | stories, where they leveraged their popularity into
                | enough momentum to fix most of their major problems.
               | 
               | That's not all bad. The same could be said of MySQL. Both
                | DBMSs prioritized ease of use over data integrity in the
               | early going.
        
               | zozbot234 wrote:
               | And yet PostgreSQL making the exact opposite choice has
               | really paid off in the longer run. People used to dismiss
               | it as simply a toy for academics to play with, and look
               | where the project is today. It can easily surpass most
               | NoSQL databases on their home turf.
        
               | outworlder wrote:
               | One thing that turned me away from MongoDB was their
               | utter lack of care for your data integrity that they
               | displayed for years. Some of those instances were even
               | documented. Then there were some bad defaults - some
               | could _also_ cause data loss.
               | 
               | For any component that's viewed as a database (as opposed
               | to, say, cache), data integrity is one of the most
               | important metrics (if not THE most).
               | 
               | In contrast, PostgreSQL data loss bugs are rare - and are
               | treated extremely seriously. Defaults are sane and won't
               | lose data. It's one of the few databases I'm pretty
               | confident that data will be there even if you yank a
               | server power cord mid writes.
               | 
                | Has MongoDB improved? Yes, leaps and bounds (seems to
               | still fail Jepsen tests though). But I can't help but
               | feel that it should have been released as a beta product,
               | instead of claiming it was production ready. It wasn't.
               | Maybe it is now. I'd still evaluate other alternatives
               | before considering it.
               | 
               | That said, one thing that always amuses me is how MongoDB
               | gets mentioned in the same context as PostgreSQL. If
               | PostgreSQL would meet your needs, it's unlikely that
                | MongoDB would. And vice-versa (but maybe something else
               | like Cassandra would).
        
               | btown wrote:
                | Postgres with tables that are just an ID and a JSONB
                | column nowadays gives you practically everything you'd
                | want out of MongoDB.
               | 
               | You can add deep and customized indices as desired, you
               | can easily shard with Citus, and if you want to live
               | without transactions you'll see equally good if not
               | better performance - with the option to add ACID whenever
               | you want. The developer experience argument, where the
               | ->> operator was more confusing than brackets, is now
               | moot.
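                | 
                | A minimal sketch of that pattern (all names
                | illustrative):
                | 
                |     CREATE TABLE docs (
                |       id   bigserial PRIMARY KEY,
                |       body jsonb NOT NULL
                |     );
                |     -- GIN index speeds up containment queries
                |     CREATE INDEX ON docs USING gin (body);
                |     SELECT * FROM docs
                |      WHERE body @> '{"status": "active"}';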
               | 
               | As a former MongoDB user, there were good synergies
               | between MongoDB and Meteor back in the day, and I _loved_
               | that tech, but between Materialize and Supabase, you have
               | vastly more options for realtime systems in the Postgres
               | ecosystem.
        
               | threeseed wrote:
               | Those defaults were changed a decade ago and were never
                | an issue if you used a driver, e.g. Python's.
               | 
                | And the Jepsen tests are part of the core test suite,
                | so do you have any evidence they are still failing?
               | 
               | It's so ridiculous and pointless to be rehashing the same
               | issues a decade later.
               | 
               | Actually more a testament to the company that it's still
               | hugely successful and depended on by some very large
               | applications.
        
               | dralley wrote:
               | Perhaps, but mongodb was responsible for something I have
               | bookmarked as "the worst line of code ever".
               | 
               | Which decided whether or not to log connection warnings
               | based on Math.random()
               | 
               | https://github.com/mongodb/mongo-java-
               | driver/blob/1d2e6faa80...
        
               | threeseed wrote:
               | a) This is a line of code from 2013 and was fixed weeks
               | after.
               | 
               | b) Based on the JIRA [1] it was designed to only log 10%
               | of subsequent failures where there is no connection to
               | prevent log flooding. You would still get the initial
               | failure message.
               | 
               | Pretty reasonable technique and hardly the worst code
               | ever.
               | 
               | [1] https://jira.mongodb.org/browse/JAVA-836
        
               | __turbobrew__ wrote:
               | https://github.com/mongodb/mongo-java-
               | driver/commit/d51b3648...
        
           | outworlder wrote:
           | > Edit: Possibly misinformed but the last deep dive we did
           | indicated there was not a way to use logical replication for
           | seamless upgrades. Will have to research.
           | 
           | It is possible since PG10
           | 
           | https://severalnines.com/database-blog/how-upgrade-
           | postgresq...
        
           | zozbot234 wrote:
           | Logical replication across major releases for seamless
           | upgrades has been supported and documented since pgSQL 10.
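            | 
            | The rough shape, assuming you pre-create the schema on the
            | new server (names illustrative):
            | 
            |     -- on the old-version primary
            |     CREATE PUBLICATION upgrade_pub FOR ALL TABLES;
            |     -- on the new-version server
            |     CREATE SUBSCRIPTION upgrade_sub
            |       CONNECTION 'host=old-primary dbname=app'
            |       PUBLICATION upgrade_pub;
            | 
            | Once the subscriber has caught up, you repoint clients at
            | it and decommission the old primary.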
        
         | jeltz wrote:
         | Migrations are quite highly prioritized in PostgreSQL.
         | PostgreSQL has the by far best migration support of any
         | database I have worked with. There is of course a lot of work
         | left to do but that is true for many other areas in PostgreSQL.
         | 
         | Also I can't agree at all with "nobody coded it", patches in
          | this area are generally welcome. My first real patch for
          | PostgreSQL
         | was about improving migrations. And there has been talk about
         | several of the ideas you propose, but nobody is working on any
         | of them right now. So I mostly think it is a lack of resources
         | in general.
        
           | nix23 wrote:
           | > far best migration support of any database I have worked
           | with
           | 
           | BS, have you never worked with mysql or sqlite?
        
             | dijit wrote:
             | MySQL migrations are beyond a joke.
             | 
             | It's absolutely impossible that you're not trolling with
             | such a statement.
        
               | nix23 wrote:
                | Copy/hardlink the data files and start the new engine,
                | instead of having to make a dump and import it?
                | 
                | It's a joke that you have to make an offline dump and
                | import for upgrades with pgsql.
        
               | dijit wrote:
               | Did you miss this?
               | https://www.postgresql.org/docs/10/pgupgrade.html
        
           | oauea wrote:
           | DDL migrations are amazing. Migrations between different
           | postgres versions not so much.
        
           | aynyc wrote:
            | I love postgresql, as long as I don't have to do upgrades. I
           | have yet to see a successful zero downtime upgrade. That
           | being said, other databases aren't that much better. Maybe
           | except SQLite.
        
           | skrebbel wrote:
           | Sorry, I didn't mean offense. What I meant with "nobody coded
           | it" is that the migration DX features that don't exist yet,
           | likely don't exist simply because they haven't been made yet
           | (and not because eg they're architecturally impossible or
           | because the postgres team are stupid or sth).
           | 
            | It's hard to complain about OSS without attacking the
            | creators. I tried to do that right but clearly I failed
            | nevertheless :-) Thanks for your contributions!
        
             | nix23 wrote:
             | >Sorry, I didn't mean offense.
             | 
             | Don't be a SJP, you have the right to offend someone when
             | he talks bs.
        
               | burnished wrote:
               | What does SJP mean?
        
               | [deleted]
        
         | phamilton wrote:
         | They are slowly getting there. For example, postgres 12 added
         | REINDEX CONCURRENTLY. Under the hood it's just recreating the
         | index and then name swapping and dropping the old one.
         | Basically what pg_repack was doing.
         | 
         | There's a huge collection of tricks out there that just need to
         | become formal features. The trick I'm working with today is
         | adding a check constraint with NOT VALID and then immediately
         | calling VALIDATE because otherwise it takes a very aggressive
          | lock that blocks writes. That could easily become ALTER TABLE
         | CONCURRENTLY or something.
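          | 
          | For reference, the two-step trick looks something like this
          | (table and constraint names illustrative):
          | 
          |     -- takes a strong lock, but only very briefly, since
          |     -- existing rows are not checked yet
          |     ALTER TABLE orders
          |       ADD CONSTRAINT orders_amount_positive
          |       CHECK (amount > 0) NOT VALID;
          |     -- scans the table without blocking writes
          |     ALTER TABLE orders
          |       VALIDATE CONSTRAINT orders_amount_positive;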
        
           | skrebbel wrote:
           | Do you know a good resource with these tricks? I often
           | struggle to predict exactly which schema migrations will lock
           | tables aggressively and what the smartest workaround is.
        
             | williamdclt wrote:
             | We've had good success with
             | https://github.com/sbdchd/squawk to lint migrations. It
             | tells you if a query is going to lock your table (as long
             | as it's written in SQL, not some ORM DSL)
        
         | brightball wrote:
         | There's a Ruby gem called strong_migrations that does this.
         | It's fantastic and I include it on all of my Ruby projects.
        
           | why-el wrote:
           | The gem does not solve these issues, merely tells you about
           | them (and even then it can't catch all of them, only the ones
           | that Rails defines).
        
             | brightball wrote:
             | It gives you mitigation strategies.
             | 
             | The mere presence of it on the dev workflow keeps your
             | developers thinking about these types of issues as well,
             | which goes a long way.
        
         | dragonwriter wrote:
         | > I feel like there's a certain level of stockholm syndrome in
         | the sense that to PG experts, these things aren't that painful
         | anymore
         | 
         | I don't know if I'm a PG expert, but I just prefer "migration
         | tool" to be a separate thing, and for the DB server engine to
         | focus on being an excellent DB server engine, with the right
         | hooks to support a robust tooling ecosystem, rather than trying
         | to _be_ the tooling ecosystem.
        
           | JohnBooty wrote:
           | As a developer I fully support the notion of splitting the
           | tools out from the server engine, like things are today.
           | 
           | But, realistically, pg_upgrade's functionality would need to
           | be integrated into the server itself if we're ever going to
           | have zero-downtime upgrades, right?
           | 
            | I don't know how other RDBMSs handle this, if at all.
        
             | dragonwriter wrote:
             | Yeah, the comment I was responding to addressed two
             | different kinds of migration--schema migration and version
             | upgrades--and my comment really applies more to schema
             | migration than version upgrades; more support for smoothing
             | the latter in the engine makes sense.
        
               | JohnBooty wrote:
               | Agreed: Postgres' transactional schema migration is
               | freaking _sublime._
               | 
               | I used and abused it pretty hard at my previous gig and
               | now it's hard to imagine ever living without it.
               | 
               | At my gig before THAT, we had MySQL and schema migrations
               | were so very very painful.
        
         | znpy wrote:
          | My complaint would be that there is no standard multi-master
          | solution for Postgres, whereas mysql now has group replication
          | as a native multi-master solution.
        
         | throw0101a wrote:
         | > _If I may hijack the thread with some more general complaints
         | though, I wish the Postgres team would someday prioritize
         | migration._
         | 
         | A thing I'm interested in is a 'simple' replication setup to
         | reduce single points of failure. We currently use Galera with
          | MySQL/MariaDB/Percona and it's quite handy for HA-ish needs: we
         | can have two DBs and the garbd running on the web app server.
         | 
         | Pointers to tutorials for Debian/Ubuntu to accomplish something
         | similar would be appreciated. (We run things on-prem in a
         | private cloud.)
        
           | fanf2 wrote:
           | Take a look at Patroni, PAF, repmgr, Stolon, etc.
           | 
           | https://github.com/zalando/patroni
           | 
           | https://clusterlabs.github.io/PAF/
           | 
           | https://repmgr.org/
           | 
           | https://github.com/sorintlab/stolon
        
           | zozbot234 wrote:
           | There's no such thing as "simple" when it comes to HA setups,
           | the requirements are simply too varied. PostgreSQL has great
           | documentation for their HA featureset, but when it comes to
           | systems-level concerns (detecting that a primary is down and
           | arranging promotion of a replica to primary) you're expected
           | to address those on your own.
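            | 
            | (The promotion step itself is the easy part; since v12
            | it's a single call on the replica:
            | 
            |     SELECT pg_promote();
            | 
            | It's the failure detection and fencing around it that
            | nobody can solve generically for you.)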
        
             | throw0101a wrote:
              | With our Galera setup we have a _keepalived_ health check
              | look at the local system; if it fails or times out it
              | stops sending heartbeats so the other side takes over the
              | vIP. If one system crashes the vIP fails over as well.
        
         | harikb wrote:
         | Doesn't PG already support inplace version upgrade?
         | 
         | Also PG is one of the few that support schema/DDL statements
         | inside a transaction.
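          | 
          | For example, a migration that fails halfway can roll back
          | cleanly, DDL included (sketch):
          | 
          |     BEGIN;
          |     ALTER TABLE users ADD COLUMN age int;
          |     -- if anything later in the migration fails,
          |     -- the column is gone again after:
          |     ROLLBACK;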
        
           | skrebbel wrote:
           | "one of the few" is a pretty low bar though, I don't know a
           | DB that doesn't suck at this.
        
             | jeltz wrote:
             | Or maybe it is you who are underestimating the technical
             | complexity of the task? A lot of effort has been spent on
             | making PostgreSQL as good as it is on migrations. Yes, it
             | is not as highly prioritized as things like performance or
             | partitioning but it is not forgotten either.
        
         | tomhallett wrote:
         | An orthogonal migration issue which I'm hitting right now: we
         | need to migrate from heroku postgres to aws rds postgres, and
         | I'm stressed about the risk and potential downtime in doing so.
         | If there was a way to make a replica in rds based on heroku,
         | promote the rds replica to be the primary, hard switch our apps
         | over to rds, that'd be a lifesaver.
         | 
         | I'm working through this blog post [1] now, but there is still
         | a bit to be defined (including a dependency on heroku's support
         | team) to get this rolling.
         | 
          | Why is the migration required? Heroku postgres doesn't support
         | logical replication, and logical replication is required for
         | any ELT vendor (Fivetran, Stitch, Airbyte) to use Change Data
         | Capture to replicate data from postgres to snowflake (with
         | replicating deleted rows efficiently).
         | 
         | Note: I've also read this ebook [2], but this approach requires
         | downtime.
         | 
         | Note 2: I reached out to heroku support and asked if logical
         | replication was on their short term roadmap. They said they've
         | heard this quite a bit, but nothing tangible is on the roadmap.
         | 
         | If anyone has any thoughts on the above migration, I'd be all
         | ears. :)
         | 
         | 1) https://vericred.com/how-we-migrated-a-1tb-database-from-
         | her...
         | 
         | 2) https://pawelurbanek.com/heroku-migrate-postgres-rds
        
           | smileysteve wrote:
            | Logical replication would be the way to do it, but this is
            | one of the walls that Heroku creates.
        
             | tomhallett wrote:
             | Coming from the outside, with zero understanding of the
             | internal details, my _hunch_ is the same: lack of support
             | for logical replication is more of a business decision than
              | a technical decision. (But again, this is a hunch -- partially
             | based on how good heroku is from a technical perspective)
        
               | oauea wrote:
               | It's absolutely an evil business decision, and all the
               | clouds are playing this game. Don't ever use a hosted
               | database solution if you're thinking about storing any
               | significant amount of data. You will not be able to get
               | it out without downtime.
        
           | tomc1985 wrote:
           | When you subscribe to managed services instead of running the
           | software yourself, these are the kinds of trade-offs that get
           | made
        
           | bastawhiz wrote:
           | I did that exact migration. Unfortunately, to my knowledge,
           | there's no way to do it with zero downtime. You need to make
           | your app read only until the RDS instance has ingested your
           | data, then you can cut over. For me, that was roughly one
           | gigabyte of data and took about forty seconds.
           | 
           | My best advice is to automate the whole thing. You can
           | automate it with the Heroku and AWS CLIs. Test on your
           | staging site until you can run through the whole process end
           | to end a few times with no interruptions.
        
             | barrkel wrote:
             | There are other ways to handle this at the application
             | level, to be clear, using dual read & write and backfill.
             | More relevant when you have TB+++ of data.
        
               | tomhallett wrote:
               | Interesting. I've done dual-writes at the application
               | level to migrate the datastore for a smaller feature
               | (branch by abstraction), but never for an entire
               | application. And the code path was quite simple, so it
               | was easy to think about all of the edge cases at one time
               | in your head.
               | 
               | Do you have any resources which talk through the
               | read/write/backfill approach?
               | 
                | Here's what I found so far:
                | 
                | * https://medium.com/google-cloud/online-database-migration-by...
                | * https://aws.amazon.com/blogs/architecture/middleware-assiste...
        
             | laurent92 wrote:
             | So, basically, Postgres would have a replication port which
             | can be used for both replication/clustering and transfer
             | across cloud providers. And sharding. </dreaming>
        
             | tomhallett wrote:
             | Thank you for this - extremely helpful in validating the
             | current approach and de-risking the developer time.
        
             | oauea wrote:
             | Yep, absolutely garbage that these clouds (Azure is another
             | one) don't allow you to replicate with external systems.
             | Pretty much devalues their entire hosted postgresql
             | offering if you ask me, since it's just designed to keep
             | you locked in (duh).
             | 
             | If you have any significant amount of data where you're
             | worried about a migration, stay far away from hosted
             | postgres offerings. You'll never get your data out without
             | significant downtime.
        
           | sidmitra wrote:
           | One possible solution for the ETL stuff might be to use
           | Heroku Kafka for the Change Data Capture and then from that
           | Kafka you can move it someplace else.
           | 
           | See https://blog.heroku.com/streaming-data-connectors-beta
           | Heroku's own Kafka seems to have slightly more native support
            | than if you use a 3rd party like Confluent.
           | 
           | We've not yet tried any of this, but it's been bookmarked as
           | a possible solution to explore.
        
             | tomhallett wrote:
             | Interesting that you bring this up. I looked into heroku's
             | streaming connectors to facilitate an integration with
             | materialize.com, but Heroku's support team wasn't confident
             | we could sync all 187 postgres tables under 1 connection.
             | 
             | I thought about using Debezium and Kafka to roll my own
             | micro-batch ETL solution, but listening to this podcast
             | made me walk away slowly:
             | https://www.dataengineeringpodcast.com/datacoral-change-
             | data...
        
               | gunnarmorling wrote:
               | Interesting, what was it from that podcast that made you
               | reconsider? Always eager to learn about opportunities for
               | improving the experience of using Debezium.
               | 
               | Disclaimer: I work on Debezium
        
               | tomhallett wrote:
               | Oh wow, by "work on" you mean "the core maintainer of".
               | Thank you for replying. :)
               | 
                | The main part I reconsidered was the level of effort of
                | taking the data from Kafka and landing it into
                | Snowflake, especially around handling postgres schema
                | changes safely. I also have no experience with Kafka,
                | so I'd be out of my depth pretty quickly for a critical
               | part of the architecture. He also expressed the need for
               | building quality checks into the kafka to snowflake code,
               | but those details were a bit sparse (if i recall
               | correctly).
               | 
               | Note: all of the above are _probably_ outside the scope
               | of debezium. :)
               | 
               | Note 2: your article [1] on using cdc to build audit logs
               | w/ a "transactions" table blew my mind. Once I listened
               | to your data engineering podcast interview [2], I knew
               | there was some implementation of "event sourcing lite w/
               | a crud app" possible, so I was excited to see you had
               | already laid it out.
               | 
               | 1) https://debezium.io/blog/2019/10/01/audit-logs-with-
               | change-d...
               | 
               | 2) https://www.dataengineeringpodcast.com/debezium-
               | change-data-...
        
               | gunnarmorling wrote:
               | Gotcha, yeah, there's many things to consider indeed when
               | setting up end-to-end pipelines. Thanks for the nice
               | feedback, so happy to hear those resources are useful for
               | folks. As far as event sourcing is concerned, we got
               | another post [1] which might be interesting to you,
               | discussing how "true ES" compares to CDC, pros/cons of
               | either approach, etc.
               | 
               | [1] https://debezium.io/blog/2020/02/10/event-sourcing-
               | vs-cdc/
        
           | zozbot234 wrote:
            | > Why is the migration required? Heroku postgres doesn't
           | support logical replication
           | 
           | You could possibly hack together some form of higher-layer
           | logical replication via postgres_fdw and database triggers. A
           | comment ITT references this as a known technique.
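            | 
            | A rough sketch of that hack for one table (all names
            | illustrative; a real setup needs error handling, updates,
            | deletes and a backfill):
            | 
            |     CREATE EXTENSION postgres_fdw;
            |     CREATE SERVER target FOREIGN DATA WRAPPER postgres_fdw
            |       OPTIONS (host 'rds.example.com', dbname 'app');
            |     CREATE USER MAPPING FOR CURRENT_USER SERVER target
            |       OPTIONS (user 'app', password 'secret');
            |     CREATE SCHEMA mirror;
            |     IMPORT FOREIGN SCHEMA public LIMIT TO (orders)
            |       FROM SERVER target INTO mirror;
            |     -- forward new writes to the remote copy
            |     CREATE FUNCTION mirror_orders() RETURNS trigger AS $$
            |     BEGIN
            |       INSERT INTO mirror.orders VALUES (NEW.*);
            |       RETURN NEW;
            |     END $$ LANGUAGE plpgsql;
            |     CREATE TRIGGER orders_mirror AFTER INSERT ON orders
            |       FOR EACH ROW EXECUTE FUNCTION mirror_orders();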
        
           | craigkerstiens wrote:
           | We've moved a number of customers from Heroku over to Crunchy
           | Bridge with essentially no down time, am currently helping
           | one customer with 7TB through that process. It's not over to
           | RDS, but would be happy to talk through process if helpful.
           | And we do support logical replication and have many people
           | using wal2json/logical replication with us.
        
       | nicoburns wrote:
        | Does anyone know the status of the zheap project? I always hope
        | to see news in postgres release notes, but nothing so far.
        
         | murkt wrote:
         | https://www.cybertec-postgresql.com/en/postgresql-zheap-curr...
         | The last status update is from July, so seems like things are
         | progressing.
        
       | netcraft wrote:
       | And now the wait for RDS to support it. Thanks PG team!
        
         | rsanheim wrote:
          | Any ideas on the typical delay before it's supported in RDS?
        
           | bmdavi3 wrote:
           | https://www.brianlikespostgres.com/rds-aurora-release-
           | dates....
           | 
           | I gathered release dates a few weeks ago because I was
           | curious. There aren't that many data points, but 150 days
           | might be a good guess
        
           | netcraft wrote:
            | don't quote me on this, but I think it used to be quite a
           | while, but since 12 things have gotten a lot better. I think
           | they generally wait for the .1 patch and then get it in
           | pretty quick.
        
             | netcraft wrote:
             | Postgres 13 was released 2020-09-24
             | 
             | 13.1 was released 2020-11-12
             | 
             | 13.2 was released 2021-02-11
             | https://www.postgresql.org/docs/13/release-13-2.html
             | 
             | AWS supported 13 as of 2021-02-24
             | https://aws.amazon.com/about-aws/whats-new/2021/02/amazon-
             | rd...
        
       | hyper_reality wrote:
       | PostgreSQL is one of the most powerful and reliable pieces of
       | software I've seen run at large scale, major kudos to all the
       | maintainers for the improvements that keep being added.
       | 
       | > PostgreSQL 14 extends its performance gains to the vacuuming
       | system, including optimizations for reducing overhead from
       | B-Trees. This release also adds a vacuum "emergency mode" that is
       | designed to prevent transaction ID wraparound
       | 
       | Dealing with transaction ID wraparounds in Postgres was one of
       | the most daunting but fun experiences for me as a young SRE. Each
       | time a transaction modifies rows in a PG database, it increments
       | the transaction ID counter. This counter is stored as a 32-bit
        | integer and it's critical to the MVCC transaction semantics -
        | changes made by a transaction with a higher ID should not be
        | visible to a transaction with a lower ID. If the usable range
        | of about 2 billion IDs is exhausted and the counter
       | wraps around, disaster strikes as past transactions now appear to
       | be in the future. If PG detects it is reaching that point, it
       | complains loudly and eventually stops further writes to the
       | database to prevent data loss.
       | 
       | Postgres avoids getting anywhere close to this situation in
       | almost all deployments by performing routine "auto-vacuums" which
       | mark old row versions as "frozen" so they are no longer using up
       | transaction ID slots. However, there are a couple situations
       | where vacuum will not be able to clean up enough row versions. In
       | our case, this was due to long-running transactions that consumed
       | IDs but never finished. Also it is possible but highly
       | inadvisable to disable auto-vacuums. Here is a postmortem from
       | Sentry who had to deal with this leading to downtime:
       | https://blog.sentry.io/2015/07/23/transaction-id-wraparound-...
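        | 
        | You can watch how close a cluster is getting with something
        | like:
        | 
        |     -- age() counts XIDs consumed since each database was
        |     -- last fully frozen; autovacuum gets aggressive around
        |     -- autovacuum_freeze_max_age (200 million by default)
        |     SELECT datname, age(datfrozenxid)
        |     FROM pg_database
        |     ORDER BY 2 DESC;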
       | 
       | It looks like the new vacuum "emergency mode" functionality
       | starts vacuuming more aggressively when getting closer to the
       | wraparound event, and as with every PG feature highly granular
       | settings are exposed to tweak this behaviour
       | (https://www.postgresql.org/about/featurematrix/detail/360/)
        
         | darksaints wrote:
         | What is wrong with using a 64 bit, or even 128 bit transaction
         | id?
        
           | hyper_reality wrote:
           | It would increase disk usage by a significant amount, since
           | transaction IDs appear twice in tuple headers (xmin/xmax).
           | Essentially they are overhead on every database row. This
           | submission has a discussion on it:
           | https://news.ycombinator.com/item?id=19082944
        
             | namibj wrote:
              | Yeah, making this not be optional is the issue. Only some
              | situations can't get by with 32-bit txids, but imposing
              | the cost on everyone would be bad.
             | 
             | Oh, and C codebases make such changes far harder than more
             | rigid newtyping in e.g. Rust, which is why I assume no one
             | made the necessary patches yet.
        
             | polskibus wrote:
                | I wonder how MS SQL works differently.
        
               | outworlder wrote:
               | It has a similar concept if you need MVCC (with InnoDB).
                | It also has a concept of transaction IDs, and also needs
                | to clean them up (purge). They will both have table bloat
               | if not done.
               | 
               | Since details matter, there's a post that explains it far
               | better than I could:
               | 
               | https://www.enterprisedb.com/blog/mysql-vs-postgresql-
               | part-2...
        
               | polskibus wrote:
               | I asked about MS not My though.
        
               | sitharus wrote:
               | By default MS SQL uses pessimistic locking, depending on
               | isolation levels. There's only one version of the data on
               | disk and the isolation level of a transaction determines
               | what happens - for example if a transaction in
               | SERIALIZABLE reads a row SQL Server takes a shared read
               | lock on that row preventing any other transaction from
               | writing to it.
               | 
               | MS SQL also has snapshot (and read committed snapshot)
               | isolation levels. These are much more like the Postgresql
               | isolation levels - in fact Postgres only has two 'real'
               | isolation levels, read committed and serializable, you
               | get upgraded to the next higher level as permitted in the
               | spec.
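                | 
                | A quick sketch of that upgrading behaviour:
                | 
                |     -- Postgres accepts this, but actually runs the
                |     -- transaction at READ COMMITTED
                |     BEGIN TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;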
               | 
                | In snapshot isolation, instead of taking a lock SQL
                | Server copies the row to a version store in TempDB when
                | it would be overwritten; additionally it adds a 14-byte
                | row version
               | to each row written. There's a lot of detail here:
               | https://docs.microsoft.com/en-us/sql/relational-
               | databases/sq...
               | 
               | This is also why MS SQL maintains a clustered index on
               | the main table - the main table only contains the latest
               | globally-visible version, so it can be sorted. Postgres
               | stores all versions (until vacuum removes dead rows), so
               | the main table is a heap, only indexes are sorted.
        
           | SigmundA wrote:
           | I believe mostly how much code needs to be changed, which
           | they are working towards slowly, but there is more overhead
           | (memory / disk) associated with those larger data types which
           | are used everywhere:
           | 
           | https://wiki.postgresql.org/wiki/FullTransactionId
        
         | cutler wrote:
         | SRE?
        
           | grzm wrote:
           | Site Reliability Engineer
           | 
           | https://en.wikipedia.org/wiki/Site_reliability_engineering
        
         | jedberg wrote:
         | > Also it is possible but highly inadvisable to disable auto-
         | vacuums.
         | 
         | When I was running my first Postgres cluster (the reddit
         | databases), I had no idea what vacuuming was for. All I knew
         | was that every time it ran it slowed everything down. Being
         | dumb, I didn't bother to read the docs, I just disabled the
         | auto vacuum.
         | 
         | Eventually writes stopped and I had to take a downtime to do a
         | vacuum. Learned a few important lessons that day. I also then
         | set it up to do an aggressive vacuum every day at 3am, which
         | was the beginning of low traffic time, so that the auto-
         | vacuuming didn't have as much work to do during the day.
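          | 
          | (The moral equivalent today: leave autovacuum on as the
          | safety net and schedule something like
          | 
          |     VACUUM (FREEZE, ANALYZE);
          | 
          | in the nightly low-traffic window.)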
        
           | williamdclt wrote:
            | Every time I've seen people having "vacuuming too expensive"
           | problems, the solution was "more vacuum"!
        
             | infogulch wrote:
             | "vacuum during the times when I want instead of randomly at
             | peak traffic"
        
             | jedberg wrote:
             | As it turns out, yes!
        
         | mattashii wrote:
         | > Each time a transaction modifies rows in a PG database, it
         | increments the transaction ID counter.
         | 
         | It's a bit more subtle than that: each transaction that
         | modifies, deletes or locks rows will update the txID counter.
         | Row updates don't get their own txID assigned.
         | 
         | > It looks like the new vacuum "emergency mode" functionality
         | starts vacuuming more aggressively when getting closer to the
         | wraparound
         | 
         | When close to wraparound, the autovacuum daemon stops cleaning
         | up the vacuumed tables' indexes, yes. That saves time and IO,
         | at the cost of index and some table bloat, but both are
         | generally preferred over a system-blocking wraparound vacuum.
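          | 
          | The threshold for that emergency behaviour is a setting of
          | its own in 14:
          | 
          |     SHOW vacuum_failsafe_age;  -- default 1.6 billion XIDs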
        
         | MrWiffles wrote:
         | Thank you for this explanation!
        
       | TOMDM wrote:
       | PostgreSQL is one of those tools I know I can always rely on for
       | a new use-case. There are very few cases where it can't do
       | exactly what I need (large scale vector search/retrieval).
       | 
       | Congrats on the 14.0 release.
       | 
       | The pace of open source has me wondering what we'll be seeing 50
       | years from now.
        
         | gk1 wrote:
         | I recall seeing some library that adds vector search to
         | Postgres. Maybe https://github.com/ankane/pgvector?
         | 
         | Also there's Pinecone (https://www.pinecone.io) which can sit
         | alongside Postgres or any other data warehouse and ingest
         | vector embeddings + metadata for vector search/retrieval.
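          | 
          | Judging from its README, the pgvector basics look like this
          | (illustrative):
          | 
          |     CREATE EXTENSION vector;
          |     CREATE TABLE items (
          |       id bigserial PRIMARY KEY,
          |       embedding vector(3)
          |     );
          |     -- nearest neighbours by L2 distance
          |     SELECT id FROM items
          |     ORDER BY embedding <-> '[1, 2, 3]'
          |     LIMIT 5;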
        
           | TOMDM wrote:
           | I need a self managed solution, so I'm not sure Pinecone is
           | feasible and I don't think pgvector scales well enough for my
           | use-case (hundreds of millions of vectors).
           | 
           | So far I think I'm going to go with Milvus[1], ideally I'd
           | just have a foreign data wrapper for Milvus or FAISS.
           | 
           | [1] https://github.com/milvus-io/milvus
        
           | gavinray wrote:
           | ElasticSearch has "dense_vector" datatype and vector-specific
           | functions.                 https://www.elastic.co/guide/en/el
           | asticsearch/reference/current/dense-vector.html       https:/
           | /www.elastic.co/guide/en/elasticsearch/reference/current/quer
           | y-dsl-script-score-query.html#vector-functions
           | 
           | ZomboDB integrates ElasticSearch as a PG extension, written
           | in Rust:                 https://github.com/zombodb/zombodb
           | 
           | I dunno what exactly a "dense_vector" is, but if you can't
           | use the native "tsvector" maybe you could use this?
        
             | TOMDM wrote:
             | ZomboDB has always seemed really interesting to me.
             | 
              | Elastic's search with PostgreSQL's everything else sounds
              | like a perfect match, but I've never seen it used
              | anywhere, which is kind of spooky to me.
             | 
             | Do you know of any non-toy use cases I can look at?
        
               | gavinray wrote:
               | You can ask Eric Ridge on the ZomboDB Discord, he's
               | really nice:
               | 
               | https://discord.gg/hPb93Y9
               | 
               | Also probably one of the most knowledgeable people about
               | both Postgres and Rust I've ever met.
               | 
               | He makes his living primarily from supporting enterprises
               | using ZomboDB (I think), not sure how much he can say,
               | but it's worth asking.
        
               | TOMDM wrote:
               | You're not kidding, they were incredibly friendly and
               | helpful.
               | 
               | Makes me wish I had a big deployment so I could justify
               | sponsoring them.
               | 
               | I paid in cat pictures for now.
        
             | anentropic wrote:
             | I think a dense vector is the opposite of a sparse vector
             | 
             | i.e. in a dense vector every value in the vector is stored
             | 
             | whereas sparse vectors exist to save space when you have
             | large vectors where most of the values are usually zero -
             | they reconstruct the full vector by storing only the non-
             | zero values, plus their indices
        
               | gavinray wrote:
               | "a dense vector is the opposite of a sparse vector"
               | 
               | I think there's another thing besides vectors that are a
               | bit dense in the room here, eh? Yeah that makes sense
               | hahaha -- thank you.
        
       | hackandtrip wrote:
       | Any suggestions to learn and go deep in PostgreSQL for someone
       | who worked mostly on NoSQL (MongoDB)?
       | 
       | From the few days I have explored it, it is absolutely
       | incredible, so congratulations for the work done and good luck on
       | keeping the quality so high!
        
         | fragile_frogs wrote:
         | I found "The Art of PostgreSQL" quite helpful:
         | https://theartofpostgresql.com/
        
         | dgb23 wrote:
         | If you are a novice or even if just a bit rusty with relational
         | databases I recommend:
         | 
         | - Grab a good book or two. The Art of PostgreSQL is one. There
         | are _many_ others. Take a bit of time to find out what suits
         | you. I typically like reading a more theoretical one and a more
         | practical one (with actionable, concrete advice, actual code in
         | it and so on).
         | 
         | - Get a nice tool/IDE/editor plugin that can interactively work
         | with databases, create them, query them etc. Almost all of the
         | editor utilities that you expect from general purpose
         | programming also applies to programming with SQL.
         | 
         | - PostgreSQL is a very powerful, primarily relational database,
         | but there is very little it cannot do. For example it is
         | perfectly feasible to implement a document (JSON) store with
         | it, and convenient to boot. There are also great extensions
         | such as for temporal tables.
         | 
         | - There is a ton of advice, discussion etc. to find, especially
         | around data modelling. Word of caution: Some of it is
          | religious, at least slightly. The book I mentioned above is
          | too, to a certain degree, but it is still a very good book.
          | Realize
         | that a relational DB can give you a ton of stuff in a
         | performant, declarative manner. But not everything should live
         | in the DB, nor is everything relational in the strictest sense.
         | Be pragmatic.
         | 
          | Source: I've been diving deeper into this topic for a few
          | years now and still am. The more I learn and am able to apply,
          | the
         | more I understand why the relational paradigm is so dominant
         | and powerful.
        
         | paulryanrogers wrote:
         | Pg docs are so good I reference them whenever I want to check
         | the SQL standards, even if I'm working on another DB. (I prefer
         | standard syntax to minimize effort moving DBs.)
         | 
         | Otherwise maybe try it with a toy project.
        
           | JohnBooty wrote:
           | I stick to standard SQL syntax/features whenever possible as
           | well, but...
           | 
           | Honest question: how often do you switch databases?
           | 
           | I've never really found myself wanting or needing to do this.
           | 
            | The only time I could really see myself wanting to do this
            | is if I was writing some kind of commercial software (eg, a
            | database IDE like DataGrip) that needed to support multiple
            | databases simultaneously.
            | 
            | > MySQL
           | 
           | It feels particularly limiting to stick to "standard" SQL for
           | MySQL's sake, since they frequently lag behind on huge chunks
           | of "standard" SQL functionality anyway. For example, window
           | functions (SQL2003 standard) took them about a decade and a
           | half to implement.
        
             | paulryanrogers wrote:
             | I've done two moves, one from MySQL to Pg and the other
             | from Pg to MySQL. I'm not opposed to leveraging their
             | nonstandard parts where necessary.
             | 
             | Some companies end up with a mix of different DBs and it
             | can help to consolidate to share expertise or resources.
             | 
             | Though at this point both have grown much closer together
             | in capabilities and performance.
        
             | magicalhippo wrote:
              | We've got an application which has been developed for over 20
             | years with the mentality of "we're never switching db's".
             | 
             | Yet now we are, because some core customers demand MSSQL
             | support...
             | 
              | Of course the db we're using supports all kinds of non-
             | standard goodness that has been exploited all over. Gonna
             | be fun times ahead...
        
         | JohnBooty wrote:
         | As a young developer I stayed away from traditional relational
          | databases because I thought they were boring and restrictive.
          | Once I actually tried them, I quickly fell in love: I realized
          | they painlessly
         | provided all the things I was already doing with data and they
         | were doing it much faster and more reliably.
         | 
         | I hope you have the same kind of experience and enjoy Postgres!
         | 
         | As far as actual advice...
         | 
         | 1. The key concept of an RDBMS is that each table typically
         | specifies relationships to other tables. So in order to query
         | data you will typically be joining multiple tables. Sometimes
         | new DB programmers get a lot of enlightenment from the simple
         | Venn diagrams that illustrate the basic types of JOINs:
         | https://www.google.com/search?q=inner+join+diagram ...IMHO,
         | once you get this (fairly simple) paradigm the rest is easy.
         | 
         | 2. Database constraints are your friends. At a minimum you
         | specify the types of the columns, obviously, but you can go
         | much farther - specifying NOT NULL constraints, foreign keys,
         | and more complex check conditions. Do as much of this _as
         | feasible_ at the database level. Otherwise it is something you
         | need to code and enforce at the application level, where it
         | will generally be less performant and more prone to coder
          | error. Additionally, any constraints not enforced at the
          | database level will need to be implemented across multiple
          | applications if more than one app uses the database.
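          | 
          | A small illustrative example of pushing those checks into
          | the schema:
          | 
          |     CREATE TABLE accounts (
          |       id       bigserial PRIMARY KEY,
          |       owner_id bigint NOT NULL REFERENCES users (id),
          |       email    text NOT NULL UNIQUE,
          |       balance  numeric NOT NULL CHECK (balance >= 0)
          |     );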
         | 
         | The second one above is an example of something that sounds
         | boring and restrictive but really frees you up to do other
         | things as a developer. About a decade ago, the "trend" was to
         | treat RDBMSs as "dumb" storage and do _all_ that stuff at the
         | application level, in the name of being database-agnostic.
         | Opinions remain divided, but I think that was an objectively
         | bad trend. For one thing, folks noticed they hardly ever needed
         | to suddenly switch databases, but there are other reasons as
         | well.
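         | 
         | (To make both points concrete, here's a minimal sketch - the
         | table and column names are made up:
         | 
         |   CREATE TABLE authors (
         |       id   bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
         |       name text NOT NULL
         |   );
         | 
         |   CREATE TABLE books (
         |       id        bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
         |       author_id bigint NOT NULL REFERENCES authors (id),
         |       title     text NOT NULL,
         |       pages     int CHECK (pages > 0)
         |   );
         | 
         |   -- An inner join returns only books with a matching author:
         |   SELECT a.name, b.title
         |   FROM books b
         |   JOIN authors a ON a.id = b.author_id;
         | 
         | The REFERENCES, NOT NULL and CHECK constraints are enforced by
         | the database itself, no matter which application writes to
         | it.)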
        
       | rubyist5eva wrote:
       | Postgres is my bread and butter for pretty much every project.
       | Congratulations to the team, you work on and continue to improve
       | one of the most amazing pieces of software ever created.
        
       | 83457 wrote:
       | Can someone who uses Babelfish for PostgreSQL compatibility with
       | SQL Server commands please describe their experience, success,
       | hurdles, etc. We would move to PostgreSQL if made easier by such
       | a tool. Thanks!
        
         | mattashii wrote:
         | I haven't been able to play with it yet.
         | 
         | This was partly due to my lack of SQL Server projects, but
         | mostly due to the lack of availability of the supposedly
         | Apache-licenced sources and/or binaries (and not wanting to
         | configure AWS for 'just testing').
        
           | my123 wrote:
           | It's currently in a limited preview - not even openly
           | accessible to AWS customers.
        
             | mattashii wrote:
             | Yep. I'm quite interested in when the 'somewhere in 2021'
             | will be, as the last mention of Babelfish from an Amazon
             | employee on the mailing lists was at the end of March and
             | I haven't heard news about the project since.
        
       | PlugaruT wrote:
       | I'm trying to understand if with v14 I will be able to connect
       | Debezium to a "slave" node and not to the "master" in order to
       | read the WAL but can't figure it out. Can someone help me with
       | this?
        
         | anarazel wrote:
         | Unfortunately that didn't make it into 14, there were too many
         | rough edges to file off.
        
         | gunnarmorling wrote:
         | I was just yesterday talking to someone about this; they
         | mentioned that Patroni leverages some way of setting up
         | replication slots on replicas [1]. Haven't tried it myself
         | yet, but it seems worth exploring.
         | 
         | Something I'd like to dive into within Debezium is usage of
         | the pg_tm_aux extension [2], which supposedly allows setting
         | up replication slots "in the past", so you could use this to
         | have seamless failover to replicas without missing any events.
         | In any case, this entire area is of high importance for us and
         | we're keeping a close eye on any improvements, so I hope it
         | will be sorted out sooner or later.
         | 
         | [1] https://twitter.com/cyberdemn/status/1443130986116624388
         | [2] https://github.com/x4m/pg_tm_aux
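         | 
         | (For reference, creating a logical slot looks like this - the
         | slot name is just an example, and as of 14 this is still only
         | possible on a primary:
         | 
         |   SELECT pg_create_logical_replication_slot(
         |       'debezium_slot', 'pgoutput');
         | 
         | pg_tm_aux presumably builds on the same machinery.)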
        
         | zozbot234 wrote:
         | Whoa, I thought the proper terminology was more like
         | primary/replica. Are we talking Postgres databases or decades-
         | old clunky IDE hard drives?
        
           | slumpr wrote:
           | Both are valid. Master/slave is an older way of describing
           | how the two are related, but most DBMSes used this
           | terminology up until recently.
        
           | PlugaruT wrote:
           | Yeah, I should have used "primary"/"replica", I agree.
           | Somehow those words did not come to my mind. Thanks
        
       | maxpert wrote:
       | I converted from MySQL (before the whole MariaDB fork), and I've
       | been happier with every new version. My biggest moment of joy was
       | JSONB and it keeps getting better. Can we please make the
       | connections lighter so that I don't have to use stuff like
       | pgbouncer in the middle? I would love to see that in future
       | versions.
        
         | darksaints wrote:
         | There has been a lot of improvement in this release.
         | 
         | https://www.depesz.com/2020/08/25/waiting-for-postgresql-14-...
        
           | the_duke wrote:
           | The benchmark linked in the comments shows 7-38%
           | improvements. Nothing to scoff at, but if you need PGBouncer
           | that probably won't make enough of a difference.
        
             | darksaints wrote:
             | It certainly isn't much of an improvement in connection
             | latency (the connections are still pretty heavy), but it is
             | a massive improvement in transaction throughput with higher
             | numbers of connections. If you scroll down a bit, there is
             | now higher TPS even at 5000 connections than previously
             | could be had at 100 connections. That fixes a massive
             | amount of pain that previously only could be solved with
             | pgbouncer.
        
         | xwdv wrote:
         | Lighter connections would finally allow for using lambda
         | functions that access a Postgres database without needing a
         | dedicated pgbouncer server in the middle.
        
           | jeltz wrote:
           | Yes, but this patch does not make startup cheaper, it only
           | decreases the performance cost of concurrent open
           | connections.
        
           | plasma wrote:
           | Not sure if you're using this, but AWS has RDS Proxy as a
           | service, in case you don't want to host your own pooler.
        
         | jkatz05 wrote:
         | I'd be curious to see if the concurrency improvements in
         | PostgreSQL 14 help with increasing the threshold for when you
         | need to introduce a connection pooler.
        
         | paulryanrogers wrote:
         | FWIW Mysql 8 has gotten a lot better in standards compliance
         | and ironing out legacy quirks, with some config tweaks. While
         | my heart still belongs to PostgreSQL, things like the lack of
         | query hints, dead tuple bloat (maybe zheap will help?), less
         | robust replication (though getting better!), and costly
         | connections dampen my enthusiasm.
        
           | assface wrote:
           | > maybe zheap will help?
           | 
           | According to Robert Haas, the zheap project is dead.
        
             | Tostino wrote:
             | I'd love to see that info, been interested in why it was
             | taken up by Cybertec and EDB seemingly dropped it.
        
             | eurg wrote:
             | Where did Robert Haas say this? Quick search didn't surface
             | much, except for a blog post from July by Hans-Jurgen
             | Schonig: https://www.cybertec-postgresql.com/en/postgresql-
             | zheap-curr... Doesn't sound like they discontinued the
             | project.
        
       | buro9 wrote:
       | Suppose I had a "friend" with a PostgreSQL 9.6 instance (a large
       | single node)... what's the best way to upgrade to PostgreSQL 14?
        
         | jeltz wrote:
         | If you can afford a short downtime I would recommend just
         | running pg_upgrade and upgrading directly to 13. Preferably
         | test it out first before doing it in production.
        
         | jeremy_sikich wrote:
         | Pglogical, replicate to a new database server. There will be
         | minimal downtime and it allows you to test the process
         | thoroughly.
        
           | bananaoomarang wrote:
           | Yeah this is how we do it... Even with terabytes of data it
           | seems to be pretty efficient and robust, and it's easy to
           | just keep the replica running until you're satisfied it all
           | looks good.
        
         | briffle wrote:
         | We are currently using pg_upgrade to go from 9.6 to 13 in our
         | systems. It's supported and works well. We are using the
         | 'hardlinks' feature that makes it extremely fast as well.
        
           | zozbot234 wrote:
           | Do be aware that using the "hardlinks" feature will mean that
           | restarting the old database version will not work properly
           | and may not be safe. You should make sure to copy the old
           | directory beforehand if you might need that, or simply
           | restore from backup.
        
             | Tostino wrote:
             | I'd argue you shouldn't be performing an upgrade at all
             | without a proper backup. But yes, absolutely do not use a
             | hardlinks upgrade unless you have a backup laying around.
        
         | dylanz wrote:
         | If you're on a single node, probably something like this. This
         | is what my "friend" is doing later this week. He's migrating
         | from 9.5 to 13:
         | 
         |   sudo sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list'
         |   wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -
         |   sudo apt-get update
         |   sudo apt install postgresql-13
         |   sudo service postgresql stop
         |   sudo mkdir -p /secure/pgsql13/data
         |   sudo chown -R postgres:postgres /secure/pgsql13/data
         |   sudo -u postgres /usr/lib/postgresql/13/bin/initdb -D /secure/pgsql13/data
         |   sudo rm -rf /var/lib/postgresql/13/main
         |   sudo ln -s /secure/pgsql13/data /var/lib/postgresql/13/main
         |   sudo chown -hR postgres:postgres /var/lib/postgresql/13/main
         |   sudo ln -s /etc/postgresql/9.5/main/postgresql.conf /var/lib/postgresql/9.5/main/postgresql.conf
         |   sudo chown -hR postgres:postgres /etc/postgresql/9.5/main/postgresql.conf
         |   cd /tmp
         |   sudo -u postgres time /usr/lib/postgresql/13/bin/pg_upgrade \
         |     --old-bindir /usr/lib/postgresql/9.5/bin \
         |     --new-bindir /usr/lib/postgresql/13/bin \
         |     --old-datadir /var/lib/postgresql/9.5/main \
         |     --new-datadir /var/lib/postgresql/13/main \
         |     --link --check
         |   sudo -u postgres time /usr/lib/postgresql/13/bin/pg_upgrade \
         |     --old-bindir /usr/lib/postgresql/9.5/bin \
         |     --new-bindir /usr/lib/postgresql/13/bin \
         |     --old-datadir /var/lib/postgresql/9.5/main \
         |     --new-datadir /var/lib/postgresql/13/main \
         |     --link
         |   sudo service postgresql start 13
         |   sudo -u postgres /usr/lib/postgresql/13/bin/vacuumdb --all -j 32 --analyze-only
         |   sudo -u postgres ./delete_old_cluster.sh
        
           | sharadov wrote:
           | I second this; I've upgraded 100-odd instances using
           | pg_upgrade. It's lightning fast - it takes less than a
           | minute whether the db cluster is a couple of GB or a few TB.
           | Just make sure that you always run the check mode to catch
           | incompatibilities between versions.
        
           | throw0101a wrote:
           | > _wget --quiet -O -
           | https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo
           | apt-key add -_
           | 
           | Is there a reason to do this _rather than_ putting the key in
           | /etc/apt/trusted.gpg.d/?
        
             | idoubtit wrote:
             | Putting the key in /etc/apt/trusted.gpg.d/ gives it too
             | much power over all the repositories. A malicious (maybe
             | compromised) third-party repository could publish a
             | package that replaces an official Debian package.
             | 
             | So trusted.gpg.d/ is not the recommended method. For more
             | information, see the official Debian wiki which states "The
             | key MUST NOT be placed in /etc/apt/trusted.gpg.d"
             | https://wiki.debian.org/DebianRepository/UseThirdParty
        
               | foresto wrote:
               | And while you're there, pay attention to the signed-by
               | option. It's what allows you (the admin) to limit APT's
               | trust in the keys you add.
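               | 
               | A rough sketch of the recommended shape - the keyring
               | path and filename here are just examples:
               | 
               |   wget -qO- https://www.postgresql.org/media/keys/ACCC4CF8.asc | gpg --dearmor | sudo tee /usr/share/keyrings/pgdg.gpg > /dev/null
               |   echo "deb [signed-by=/usr/share/keyrings/pgdg.gpg] http://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" | sudo tee /etc/apt/sources.list.d/pgdg.list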
        
               | throw0101a wrote:
               | Actually it states:
               | 
               | > _The key MUST NOT be placed in /etc/apt/trusted.gpg.d
               | or loaded by apt-key add._
               | 
               | And yet the snippet that I quoted has the latter command.
               | 
               | Further: the link has the example " _[signed-by=
               | /usr/share/keyrings/deriv-archive-keyring.gpg]_". Perhaps
               | it's my BSD upbringing showing through, but shouldn't
               | only/mostly OS-provided stuff generally live in
               | /usr/share? Shouldn't locally-added stuff go into
               | /usr/local/share? Or perhaps creating a
               | /etc/apt/local.gpg.d would be appropriate?
        
         | sudhirj wrote:
         | With downtime, I guess the pg_upgrade tool works fine.
         | 
         | Without downtime / near-zero downtime is more interesting
         | though. Since this is an old version, something like Bucardo
         | maybe? It can keep another pg14 instance in sync with your old
         | instance by copying the data over and keeping it in sync. Then
         | you switch your app over to the new DB and kill the old one.
         | 
         | Newer versions make this even easier with logical replication -
         | just add the new DB as a logical replica of the old one, and
         | kill the old one and switch in under 5 seconds when you're
         | ready.
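         | 
         | (A minimal sketch of the logical-replication path - the names
         | and connection string are placeholders:
         | 
         |   -- on the old (v10+) primary:
         |   CREATE PUBLICATION upgrade_pub FOR ALL TABLES;
         | 
         |   -- on the new v14 instance, after restoring the schema
         |   -- with pg_dump --schema-only:
         |   CREATE SUBSCRIPTION upgrade_sub
         |       CONNECTION 'host=old-db dbname=app user=replicator'
         |       PUBLICATION upgrade_pub;
         | 
         | Once the new instance has caught up, cut the app over.)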
        
           | yarcob wrote:
           | I think I remember a talk where someone manually set up
           | logical replication with triggers and postgres_fdw to upgrade
           | from an old server with zero downtime.
        
         | nicoburns wrote:
         | Looks like the best approach might be to use the in-place
         | upgrade tool that ships with Postgres to upgrade to v10 (use
         | Postgres v10 to perform this upgrade). From there you'd be
         | able to create a fresh v14 instance and use logical
         | replication to sync over the v10 database, before briefly
         | stopping writes while you swap over to the v14 database.
         | 
         | EDIT: Looks like pg_upgrade supports directly upgrading across
         | multiple major versions. So maybe just use that if you can
         | afford the downtime.
        
           | yarcob wrote:
           | Why would you first upgrade to PG 10?
        
             | nicoburns wrote:
             | Because versions below 10 don't support logical
             | replication. The alternative would be to use a 3rd-party
             | extension for replication, which may well be a good
             | option, but I don't have any experience with that so I
             | can't really comment.
        
             | zozbot234 wrote:
             | Native logical replication (making it possible to upgrade
             | without downtime) was introduced in pgSQL 10. But if you're
             | going to have downtime anyway, there's no reason not to do
             | the upgrade in a single step. pg_upgrade should support
             | that quite easily.
        
         | yarcob wrote:
         | Depends on whether you can shut down the server for some time
         | or not. If you can live with some downtime, just use pg_upgrade
         | if you install the new server on the same machine:
         | https://www.postgresql.org/docs/current/pgupgrade.html
        
         | olavgg wrote:
         | This is one of the major pain points with PostgreSQL. Unless
         | you absolutely need the new features, and if uptime is
         | important, you can just continue using PostgreSQL 9.6 even
         | though it is EOL.
         | https://www.postgresql.org/support/versioning/ It will most
         | likely work great for many more years.
         | 
         | I wish future versions of PostgreSQL would have some backwards
         | compatibility for old system/data tables/data structures and
         | would be able to do a live migration when running a newer
         | release.
         | 
         | pg_upgrade will not work if the internal data structure for
         | your data changes. It only recreates the system tables.
         | https://www.postgresql.org/docs/14/pgupgrade.html
        
           | Tostino wrote:
           | This is not really true, for a number of reasons. pg_upgrade
           | is absolutely the preferred method if you are able to spare
           | a small amount of downtime. It will absolutely upgrade your
           | existing data structures to be compatible with the new
           | Postgres version, so I'm not sure what you mean.
           | 
           | It doesn't automatically re-create indexes (to take
           | advantage of new features), for example, but that is likely
           | something you don't want to do right away (incurring extra
           | downtime) when you are doing an upgrade anyway. You can
           | easily just REINDEX after the upgrade is complete.
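           | 
           | (For example, connected to each database in turn - the
           | database name is a placeholder:
           | 
           |   REINDEX DATABASE CONCURRENTLY mydb;
           | 
           | CONCURRENTLY keeps the tables readable and writable while
           | the indexes are rebuilt.)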
        
       | postgresapp wrote:
       | If you want to test the new features on a Mac, we've just
       | uploaded a new release of Postgres.app:
       | https://postgresapp.com/downloads.html
        
         | abdusco wrote:
         | I love this app on Mac, but I wonder if there is a similar app
         | for Windows (i.e. portable Postgres)?
        
         | TedShiller wrote:
         | This app is amazing. Highly recommend it.
        
       | throwawaybchr wrote:
       | Does anyone know when this will be available on RDS?
        
       | johnthuss wrote:
       | This looks like an amazing release! Here are my favorite features
       | in order:
       | 
       | * Up to 2x speedup when using many DB connections
       | * ANALYZE runs significantly faster. This should make PG version
       | upgrades much easier.
       | * Reduced index bloat. This has been improving in each of the
       | last few major releases.
       | * JSON subscript syntax, like column['key']
       | * date_bin function to group timestamps to an interval, like
       | every 15 minutes.
       | * VACUUM "emergency mode" to better prevent transaction ID
       | wraparound
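       | 
       | The new syntax for two of those, with made-up table and column
       | names:
       | 
       |   -- JSON subscripting:
       |   SELECT details['customer']['name'] FROM orders;
       |   UPDATE orders SET details['status'] = '"shipped"';
       | 
       |   -- date_bin, grouping timestamps into 15-minute buckets:
       |   SELECT date_bin(interval '15 minutes', created_at,
       |                   timestamp '2021-01-01') AS bucket, count(*)
       |   FROM orders
       |   GROUP BY bucket
       |   ORDER BY bucket;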
        
       | rolobio wrote:
       | Love the JSONB subscripts! It will be so much easier to remember!
       | I may not even have to reference the docs!
        
         | yarcob wrote:
         | Now if I only could remember how to convert a JSONB string to a
         | normal string (without quotes)...
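         | 
         | (One way, for what it's worth: the #>> operator with an empty
         | path extracts a scalar as text, e.g.
         | 
         |   SELECT '"hello"'::jsonb #>> '{}';  -- hello
         | 
         | and ->> does the same for object fields.)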
        
         | ape4 wrote:
         | What is JSONB...
         | https://stackoverflow.com/questions/22654170/explanation-of-...
         | 
         | (Don't upvote me, I just googled it)
        
       | ablekh wrote:
       | Congratulations and thanks to all involved! Do I understand
       | correctly that, at this time, while PG has data sharding and
       | partitioning capabilities, it does not offer some related
       | features found in Citus Open Source (shard rebalancer,
       | distributed SQL engine and transactions) and in Citus on Azure
       | aka Hyperscale (HA and streaming replication, tenant isolation -
       | I'm especially interested in the latter one)? Are there any plans
       | for PG to move toward this direction?
        
         | zozbot234 wrote:
         | Streaming replication _is_ supported as per
         | https://www.postgresql.org/docs/current/warm-standby.html#ST...
         | . You can likely build shard rebalancing and tenant isolation
         | on top of the existing logical replication featureset. There
         | are some groundwork features for distributed transactions
         | (PREPARE TRANSACTION, COMMIT PREPARED, ROLLBACK PREPARED), but
         | distributed transactions as such are not supported.
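         | 
         | (For the curious, the two-phase commit groundwork looks like
         | this - it requires max_prepared_transactions > 0, and the
         | table is hypothetical:
         | 
         |   BEGIN;
         |   UPDATE accounts SET balance = balance - 100 WHERE id = 1;
         |   PREPARE TRANSACTION 'xfer-42';
         | 
         |   -- later, possibly from a different session:
         |   COMMIT PREPARED 'xfer-42';
         |   -- or: ROLLBACK PREPARED 'xfer-42';
         | 
         | An external coordinator still has to drive the protocol.)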
        
           | ablekh wrote:
           | I see. Thank you for clarifications.
        
       | TOMDM wrote:
       | The query parallelism for foreign data wrappers brings
       | PostgreSQL one step closer to being the one system that can tie
       | all your different data sources together into one.
       | 
       | Really exciting stuff.
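       | 
       | (If I understand it right, you opt in per foreign server - a
       | sketch with a made-up host:
       | 
       |   CREATE EXTENSION postgres_fdw;
       |   CREATE SERVER shard1
       |       FOREIGN DATA WRAPPER postgres_fdw
       |       OPTIONS (host 'shard1.internal', dbname 'app',
       |                async_capable 'true');
       | 
       | so appends across several such servers can run their foreign
       | scans concurrently.)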
        
       | darksaints wrote:
       | I know this isn't even a big enough deal to mention in the news
       | release, but I am massively excited about the new multirange data
       | types. I work with spectrum licensing and range data types are a
       | godsend (for representing spectrum ranges that spectrum licenses
       | grant). However, there are so many scenarios where you want to
       | treat multiple ranges like a single entity (say, for example, an
       | uplink channel and a downlink channel in an FDD band). And there
       | are certain operations like range differences (e.g. '[10,100)' -
       | '[50,60)'), that aren't possible without multirange support. For
       | this, I am incredibly grateful.
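       | 
       | For example, that difference now just works:
       | 
       |   SELECT '{[10,100)}'::int4multirange
       |        - '{[50,60)}'::int4multirange;
       |   -- result: {[10,50),[60,100)}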
       | 
       | Also great is the parallel query support for materialized views,
       | connection scalability, query pipelining, and jsonb accessor
       | syntax.
        
         | jkatz05 wrote:
         | Multiranges are one of the lead items in the news release :) I
         | do agree that they are incredibly helpful and will help to
         | reduce the complexity of working with ranges.
        
       | mypastself wrote:
       | Somewhat related, but does anybody have suggestions for a quality
       | PostgreSQL desktop GUI tool, akin to pgAdmin3? Not pgAdmin 4,
       | whose usability is vastly inferior.
       | 
       | DBeaver is adequate, but not really built with Postgres in mind.
        
         | rancor wrote:
         | I've been using SQL Workbench/J [https://www.sql-workbench.eu/]
         | for quite a while now. Uses JDBC so it'll connect to anything,
         | good SQL formatter, builtin scripting commands, support for XLS
         | import/export, headless mode, and most importantly fast even
         | when loading massive result sets.
        
         | tfeldmann wrote:
         | TablePlus
        
         | kevinob11 wrote:
         | TablePlus is the only thing I've used cross-platform that is
         | even close to Postico. Postico is amazing.
        
         | illuminated wrote:
         | PgModeler [0], cannot recommend it enough. You have to compile
         | the free version yourself (it's open source - you pay if you
         | want to directly download the precompiled binaries) and it's a
         | bit of work if you want to include the query optimizer plugin,
         | but there's documentation and GitHub issues with solutions
         | already in place. Once you compile it for the first time and
         | start using it, you'll keep doing it again with each new
         | version. I haven't found a better Postgres tool yet.
         | 
         | edit: it's a cross-platform tool, supporting Linux, Mac and
         | Windows (32/64)
         | 
         | [0]: https://pgmodeler.io/
        
           | I_am_tiberius wrote:
           | Especially cool that you can buy a license using Bitcoin! Do
           | you know if the backup/restore functionality is reliable?
        
           | 0XAFFE wrote:
           | I really like the diff function to generate migrations. Make
           | changes to your model and generate a sql file that syncs your
           | database to the model.
        
         | Keyframe wrote:
         | DataGrip is cool.
        
         | JoniDS wrote:
         | For Windows https://www.heidisql.com/ is great
        
           | mypastself wrote:
           | Yeah, that is a good one. It's lightweight and supports a lot
           | of languages, but it lacks some of the nicer features of
           | Postgres-specific tools. I think I had difficulty listing
           | things by schema.
        
         | paozac wrote:
         | I like Postico, but it is mac-only and not free.
        
           | mypastself wrote:
           | Ah, I should have clarified that I'm using Linux and Windows.
        
         | alexashka wrote:
         | I use Valentina Studio (free license). Does everything I need.
        
         | anentropic wrote:
         | another vote for TablePlus
        
         | nicoburns wrote:
         | I second Postico on Mac. I usually recommend TablePlus for
         | other platforms.
        
         | ptrwis wrote:
         | When did you last use pgAdmin 4? It recently went through some
         | big changes; in my opinion it's now the best client for
         | PostgreSQL.
        
           | mypastself wrote:
           | A few months ago. I had semi-frequent troubles with starting
           | up and freezing (on both Linux and Windows).
           | 
           | I would also easily make accidental GUI changes that I could
           | only revert by reloading original settings.
           | 
           | A less galling example was the introduction of a highlight
           | that marked the line in a query where an error occurred. It
           | was a bright blue color, hard on the eyes, almost completely
           | obscuring the text. It's a comparatively minor issue, but
           | illustrative of how the tool's usability was steadily
           | declining.
        
         | malyk wrote:
         | It's paid, but I like SQL Pro for Postgres. Simple,
         | lightweight, etc.
        
         | mixmastamyk wrote:
         | Happy with dbeaver, what are its shortcomings re pg?
        
           | mypastself wrote:
           | I guess I'm still getting used to it, but it always takes a
           | few extra steps to open things like views and functions,
           | autocomplete needs configuring to work properly, and a couple
           | of other features I'm blanking on at the moment. It's all
           | small stuff, but it can add up.
        
             | mixmastamyk wrote:
             | Yes, they are in separate folders, but don't think it has
             | to do with PG per se. My autocomplete works well, don't
             | think I did anything special. Maybe added a schema to the
             | "search path?" May have been pgcli (which I also
             | recommend).
        
         | joshcrews wrote:
         | Postico is my favorite desktop software of any kind
        
         | cmg wrote:
         | I've been trying out Beekeeper Studio
         | [https://www.beekeeperstudio.io/] recently, and like that it
         | supports both MySQL and PostgreSQL (and others I don't use).
         | The interface takes a little bit to get used to, but it's been
         | pretty powerful for me.
         | 
         | Before that, or when I'm in a rush to just get something done
         | as I adjust to Beekeeper, I use Postbird (an Electron app)
         | [https://github.com/Paxa/postbird]
        
           | rathboma wrote:
           | Beekeeper Studio maintainer here. It's 100% FOSS and cross
           | platform (Windows, MacOS, Linux).
           | 
           | I love Postgres and started Beekeeper because I wanted
           | 'Sequel Pro for PSQL on Linux'. We have a lot of users so
           | seems like we're doing something right.
           | 
           | Very happy for new PSQL.
           | 
           | If you have questions, let me know!
        
             | cmg wrote:
             | Thank you for your work on it! I used Sequel Pro for a long
             | time for MySQL databases and occasionally use its successor
             | Sequel Ace as well, but I'm moving my workflow to Beekeeper
             | Studio as much as I can.
        
               | rathboma wrote:
               | Glad to hear! Please file issues if anything is confusing
               | or missing, it's a pretty friendly GitHub community.
        
           | e12e wrote:
           | In addition to the excellent Beekeeper, you might want to
           | check out https://dbeaver.io/.
           | 
           | And azure data studio now has some pg support:
           | https://docs.microsoft.com/en-us/sql/azure-data-
           | studio/exten...
           | 
           | Personally I've landed on just using pgcli.com (and pg_dump
           | etc).
        
         | ralfhn wrote:
         | https://dbeaver.io/ is pretty good IMO.
        
         | JohnBooty wrote:
         | pgAdmin 4 has made rapid strides. If you haven't checked it out
         | recently (~6 mos?) you may be pleasantly surprised. Many of the
         | rough edges have been sanded off.
        
           | mypastself wrote:
           | Interesting, it's been roughly that amount of time since I've
           | last used it. I might check it out again.
        
         | MrWiffles wrote:
         | Commercial, but worth every penny and then some in my opinion:
         | https://tableplus.com/
         | 
         | It looks good, performs very well, and supports WAY more than
         | just pg. It's an indispensable part of my dev tooling. Bought
         | it what last year, year before maybe? Very happy customer here,
         | zero problems, all gravy.
        
         | thornygreb wrote:
         | I pay for JetBrains DataGrip, worth every penny.
        
           | turbocon wrote:
           | Going to second this, however I will warn, at least in my
           | experience it is a little bit different from most DB IDEs. I
           | didn't like it at all first time I used it, then a friend
           | told me to give it another try. I've never looked back,
           | fantastic tool.
        
           | dpcx wrote:
           | One of my coworkers uses DataGrip. Needing to install MySQL-
           | specific tooling so that they can take a full database dump
           | is kind of frustrating. Many other tools can do it out of
           | the box, so why not DataGrip?
        
           | apocalyptic0n3 wrote:
           | Seconded. DataGrip is terrific and supports every database
           | type I have ever come into contact with. And it's all JDBC-
           | based so you can add new connectors pretty easily (from
           | within the app, no less. No fiddling with files necessary).
           | I had to do that to help on a proposal a few years ago, for
           | a project that had a Firebird database that DataGrip didn't
           | natively support.
        
         | pestkranker wrote:
         | Navicat for PostgreSQL
         | (https://www.navicat.com/en/products/navicat-for-postgresql)
        
       | unixhero wrote:
       | Congrats to the team. I cannot wait to work on v14.
        
       | candiddevmike wrote:
       | How does everyone do postgresql upgrades with the least amount of
       | downtime?
        
         | paulryanrogers wrote:
         | For Pg and MySQL I usually have to resort to replicating to a
         | newer instance then cutting over. Tools like pg_upgrade offer
         | promise but I rarely have the time or access to test with a
         | full production dataset. Hosting provider constraints sometimes
         | limit my options too, such as no SSH to underlying instances.
        
           | loopdoend wrote:
           | Bucardo is great for replication.
        
         | aeyes wrote:
         | Fastest non-intrusive way I know in RDS or any other
         | environment which allows you to spin up a new box:
         | 
         | * Set up a replica (physical replication)
         | 
         | * Open a logical replication slot for all tables on your old
         | master (WAL will start accumulating)
         | 
         | * Make your replica a master
         | 
         | * Upgrade your new master using pg_upgrade, run analyze
         | 
         | * On your new master subscribe to the logical replication slot
         | from your old master using the slot you created earlier,
         | logical replication will now replicate all changes that
         | occurred since you created the slot
         | 
         | * Take down your app, disable logical replication, switch to
         | the new master
         | 
         | You can do the upgrade with zero downtime using Bucardo Multi-
         | Master replication but the effort required is much much higher
         | and I'm not sure if this is really feasible for a big instance.
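         | 
         | (A sketch of the slot/subscription part - all names and the
         | connection string are placeholders:
         | 
         |   -- on the old master, before the promotion:
         |   SELECT pg_create_logical_replication_slot(
         |       'upgrade_slot', 'pgoutput');
         |   CREATE PUBLICATION upgrade_pub FOR ALL TABLES;
         | 
         |   -- on the upgraded new master:
         |   CREATE SUBSCRIPTION upgrade_sub
         |       CONNECTION 'host=old-master dbname=app'
         |       PUBLICATION upgrade_pub
         |       WITH (create_slot = false, slot_name = 'upgrade_slot',
         |             copy_data = false);
         | 
         | copy_data = false matters here, since the physical replica
         | already has every row up to the slot's starting point.)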
        
       | simonebrunozzi wrote:
       | Here we are, at a fantastic version 14, and still no sign of a
       | MySQL AB-like company able to provide support and extensions to
       | a great piece of open source software. There are a few small
       | ones, yes, but nothing at the billion-dollar size.
       | 
       | I am still unable to understand why.
        
         | MrWiffles wrote:
         | I thought this was what EnterpriseDB was? Or am I misinformed?
         | I concede that is indeed a possibility.
        
         | arwineap wrote:
         | Which extensions do you think would be enough of a value add to
         | support a billion dollar company?
        
         | mixmastamyk wrote:
         | Microsoft and Citus are "a little" over a billion?
        
           | threeseed wrote:
           | I hope you don't think Microsoft/Citus is going to continue
           | to support PostgreSQL installations on anything other than
           | Azure.
           | 
           | This is all about the battle of the clouds for them.
        
       | I_am_tiberius wrote:
       | Anyone know when it will be available on Azure Flexible Server?
       | Also, does anyone know when Flexible Server will leave Preview
       | status?
        
       | jonplackett wrote:
       | If you'd like to try out PostgreSQL in a nice friendly hosted
       | fashion then I highly recommend supabase.io
       | 
       | I came from MySQL and so I'm still just excited about the basic
       | stuff like authentication and policies, but I really like how
       | they've also integrated storage with the same permissions and
       | auth too.
       | 
       | It's also open source, so if you want to just host it yourself
       | you still can.
       | 
       | And did I mention they'll do your auth for you?
        
         | alberth wrote:
         | Would you mind expanding on what's so appealing about Supabase
         | (and Firebase-like services in general)?
         | 
         | I feel like I live in a cave, because I haven't quite
         | understood what problem Supabase/Firebase is solving.
        
           | kiwicopple wrote:
           | [supabase cofounder] While we position ourselves as a
           | Firebase alternative, it might be simpler for experienced
           | techies to think of us as an easy way to use Postgres.
           | 
           | We give you a full PG database for every project, and auto-
           | generated APIs using PostgREST [0]. We configure everything
           | in your project so that it's easy to use Postgres Row Level
           | Security.
           | 
           | As OP mentions, we also provide a few additional services
           | that you typically need when building a product - connection
           | pooling (pgbouncer), object storage, authentication + user
           | management, dashboards, reports, etc. You don't need to use
           | all of these - you can just use us as a "DBaaS" too.
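           | 
           | (To give a flavor of RLS - the table and setting names are
           | made up, and Supabase provides its own helpers for the
           | current user:
           | 
           |   ALTER TABLE profiles ENABLE ROW LEVEL SECURITY;
           | 
           |   CREATE POLICY "read own profile" ON profiles
           |       FOR SELECT
           |       USING (user_id = current_setting('app.user_id')::bigint);
           | 
           | Every query through the auto-generated API is then filtered
           | by policies like this.)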
           | 
           | [0] https://postgrest.org/
        
             | yawnxyz wrote:
             | as a product designer, I've been with Supabase since its
             | inception. You guys make projects so easy to start
             | prototyping, without me having to think about starting a
             | Postgres droplet or whatever. Thank you so much for making
             | Supabase better every day!!
        
               | kiwicopple wrote:
               | Thanks for joining us for the ride!
        
             | alberth wrote:
             | Thanks so much and really appreciate you taking the time to
             | respond here.
             | 
             | I think it's fantastic to make deploying existing
             | software/tools easier, and people are definitely willing to
             | pay for the ease, curious though - what prevents the
             | Postgres team from taking supabase contributions (since
             | it's Apache 2.0) and including it in core Postgres?
        
               | kiwicopple wrote:
               | > what prevents the Postgres team from taking supabase
               | contributions
               | 
               | Absolutely nothing - we are open source and we would
               | encourage it. However we're not modifying Postgres itself
               | at this stage, we're just providing tooling around it. We
               | have hired developers to work specifically on Postgres
               | but we will always attempt to upstream (if the PG
               | community wants it) rather than maintain a fork
        
         | zgn717 wrote:
       | Supabase looks nice. I'm also using Postgres hosted by
       | https://nhost.io/, which also does auth and storage. It looks
       | like Supabase is focused on REST and Nhost on GraphQL?
        
         | kiwicopple wrote:
         | [Supabase cofounder] thanks for the kind words.
         | 
         | We're about to wrap up Row Level Security on our real-time APIs
         | too[0] - soon you'll be able to use Policies universally on all
         | APIs.
         | 
         | [0] https://github.com/supabase/walrus
        
           | httgp wrote:
           | I cannot even begin to tell you how excited I am for this!
        
           | jonplackett wrote:
           | Congrats on an awesome product. It was _exactly_ what I was
           | looking for.
           | 
           | You also get a feeling about a company pretty quick from
           | their docs (I think it's the best way to judge a company) and
           | I have to say I had the same feeling as using Twilio or
           | Cloudflare for the first time - so you're in good company.
        
       ___________________________________________________________________
       (page generated 2021-09-30 23:01 UTC)