[HN Gopher] The world of PostgreSQL wire compatibility
___________________________________________________________________
The world of PostgreSQL wire compatibility
Author : EntICOnc
Score : 95 points
Date : 2022-02-10 10:42 UTC (12 hours ago)
(HTM) web link (datastation.multiprocess.io)
(TXT) w3m dump (datastation.multiprocess.io)
| chatmasta wrote:
| Shameless plug, but your list is missing Splitgraph [0] :)
|
| We've been based on Postgres from the beginning, and although the
| backend is a bit more complex at this point, we've kept the wire
| protocol intact. We're also heavily invested in FDWs, not only
| for federated queries (e.g. querying data at Snowflake - btw, you
| might enjoy our blog post on achieving a 100x speedup with
| aggregation pushdown [1]), but also for queries on warehoused
| data stored as Splitgraph images.
|
| By keeping Postgres compatibility as our guiding constraint,
| we've been able to build a lot of functionality on top of just a
| few simple abstractions. The result is something akin to a magic
| Postgres database - you can connect dozens of live sources to it
| using FDW plugins, or you can ingest from hundreds of data
| sources using Airbyte connectors, ultimately storing the data as
| immutable Splitgraph images in object storage. But as far as your
| standard Postgres client is concerned, you're just connected to a
| normal database with 40k tables on it.
|
| As for the wire protocol, our implementation is heavily reliant
| on (a forked version of) PgBouncer. Basically, a query arrives,
| we parse it for references to tables (which look like Docker
| image tags), and the proxy layer performs whatever orchestration
| is necessary to satisfy the query. That could mean instantiating
| a foreign server to a saved connection, loading some data from
| object storage, or even lazily loading only the requisite data
| (we call this "layered querying" since it's implemented similarly
| to AUFS). In the future, it could also mean delegating the query
| to a more specialized engine like Presto.
|
| Point is, by keeping the frontend intact, we're able to retain
| compatibility with all Postgres clients, but we're free to
| implement the backend in more scalable or domain-specific ways.
| For example, we're able to horizontally scale our query capacity
| by simply adding more "cache nodes" that perform the layered
| querying.
|
| We are definitely all-in on the Postgres wire protocol, and all
| the ecosystem compatibility that comes along with it. You can
| read our blog for more in-depth discussions of this, but I don't
| want to spam too many links here. :)
|
| [0] https://www.splitgraph.com
|
| [1] https://www.splitgraph.com/blog/postgresql-fdw-
| aggregation-p...
| eatonphil wrote:
| Hey folks! Author here. Happy for questions, suggestions, flames.
| lotsofspots wrote:
| > Proprietary databases like Oracle and SQL Server find value
| in developing their own drivers. They don't tend to publish
| their wire protocol.
|
| Just to note, MS have long published the details of their wire
| protocol, TDS - https://docs.microsoft.com/en-
| us/openspecs/windows_protocols...
| eatonphil wrote:
| Thanks I should be fairer to them. Updated the post to note
| this.
| gfody wrote:
| btw I think you meant to say "worth noting" instead if
| "worth nothing" ;)
| eatonphil wrote:
| lol, oops. Not a freudian slip. Edited, thank you.
| [deleted]
| patrickdevivo wrote:
| Thanks for this write up! I've been really interested in postgres
| compatibility in the context of a tool I maintain
| (https://github.com/mergestat/mergestat) that uses SQLite. I've
| been looking for a way to expose the SQLite capabilities over a
| more commonly used wire-protocol like postgres (or mysql) so that
| existing BI and visualization tools can access the data.
|
| This project is an interesting one:
| https://github.com/dolthub/go-mysql-server that provides a MySQL
| interface (wire and SQL) to arbitrary "backends" implemented in
| go.
|
| It's really interesting how compatibility with existing protocols
| has become an important feature of new databases - there's so
| much existing tooling that already speaks postgres (or mysql),
| being able to leverage that is a huge advantage IMO
| ryanbooz wrote:
| Timescale DevRel here!
|
| Great observations around how popular and useful it is for
| systems to have some level of Postgres line compatibility! Thanks
| for doing the writeup!
|
| One thing that's not totally clear in the comparison table is
| that most of these implementations are just that, databases
| providing an abstraction through the Postgres wire protocol over
| a different database architecture/parser & solution.
|
| TimescaleDB, however, is the only one listed that is built
| directly on Postgres as an extension. There is no abstraction
| layer over some other implementation of the query parser or
| storage layer.
|
| When it comes to application maintenance and developer
| expectations at query time, not all of these are apples-to-apples
| comparisons, as some of the other comments have noted.
| proddata wrote:
| CrateDB DevRel here :)
|
| > databases providing an abstraction through the Postgres wire
| protocol
|
| I would not call it an abstraction, if one has a full parser,
| analyzer, planner and execution engine. It is just a common
| language ;)
| eatonphil wrote:
| While the table doesn't say that much about intent (other than
| what you might infer from the SQL Compatibility, ACID and
| Category columns), I had hoped it was clear from the rest of
| the article that I was saying there is no relationship between
| the wire protocol and being PostgreSQL compatible in general.
|
| > It does NOT encompass the actual query language itself, let
| alone database semantics.
|
| > This doesn't mean that any PostgreSQL or MySQL query will
| work with them since, as mentioned previously, query language
| and database semantics are independent of the wire protocol.
|
| > To reiterate, the wire protocol doesn't specify anything
| about the query language.
|
| > Just because a database implements the PostgreSQL wire
| protocol does not mean it intends to be a drop-in replacement
| for PostgreSQL.
|
| > So just looking at parser support is not enough to talk about
| PostgreSQL query or semantic compatibility. But it is a
| starting point.
|
| But I guess it wasn't clear.
| epberry wrote:
| Well done. Couple of notes:
|
| - YugabyteDB is currently on Postgres 11.2 with plans to keep
| upgrading.
|
| - Cockroach is not pinned to any Postgres version so its
| compatibility is reliant on their implementation. Not a bad
| thing, just different than YB.
|
| - Like you said Timescale is in lockstep with latest Postgres (14
| at the time of this comment)
| https://blog.timescale.com/blog/massive-scale-for-time-serie...
|
| - Materialize is not really focused on scaling because that is
| not really what it is for, https://materialize.com/blog-roadmap/
|
| - I might consider moving Spanner to the Somewhat compatible
| category. Lack of stored procedures and triggers makes it not a
| drop in replacement for those moving from Oracle/MS SQL
| https://cloud.google.com/spanner/docs/reference/postgresql/o...
|
| - For native Postgres scaling strategies, we could call attention
| to Crunchy Data or Percona
|
| Very nice to lay this out and I pretty much agree with the labels
| and categories on here. I would be you will get some people
| reaching out since Postgres compatibility is something these
| vendors compete on.
| benesch wrote:
| - Materialize is not really focused on scaling because that is
| not really what it is for
|
| Horizontal scalability is actually something that Materialize's
| underlying incremental computation engine (timely/differential
| dataflow) excels at. We're actively working to expose that tech
| in Materialize. (As in: many engineers, including me, are
| working on this full time; happy to answer questions!)
|
| It is too bad that you got the wrong impression from that blog
| post! That blog post is about two years out of date. We're
| working on an updated roadmap blog post. The tl;dr is that
| we've since moved from step 3 on the roadmap ("Build a single-
| node database management system") to step 4 ("Build cloud-
| native elasticity and replication: Materialize Cloud").
| linsomniac wrote:
| Postgres wire compatibility does not imply Postgres
| compatibility.
|
| I recently got bit by this, I wanted to convert a mail address
| database from MySQL Galera to CockroachDB, because CockroachDB
| was just a bit easier to deal with.
|
| Unfortunately, Cockroach only support UTF-8, and Postgres only
| supports LATIN1 for the database (compiled into code). I believe
| that there are reasons for both sides to make this choice,
| unfortunately it turned what I hoped would be a slam dunk into a
| round peg in a square hole.
| linsomniac wrote:
| Correction, which I can't do because it's been downvoted,
| _POSTFIX_ only supports LATIN1, so I can 't use Cockroach with
| it.
| anarazel wrote:
| What do you mean with postgres only supporting LATIN1?
| linsomniac wrote:
| Dang, I meant Post _FIX_ only supports LATIN1.
___________________________________________________________________
(page generated 2022-02-10 23:01 UTC)