hngopher.com

       [HN Gopher] The world of PostgreSQL wire compatibility
       ___________________________________________________________________
        
       The world of PostgreSQL wire compatibility
        
       Author : EntICOnc
       Score  : 95 points
       Date   : 2022-02-10 10:42 UTC (12 hours ago)
        
 (HTM) web link (datastation.multiprocess.io)
 (TXT) w3m dump (datastation.multiprocess.io)
        
       | chatmasta wrote:
       | Shameless plug, but your list is missing Splitgraph [0] :)
       | 
       | We've been based on Postgres from the beginning, and although the
       | backend is a bit more complex at this point, we've kept the wire
       | protocol intact. We're also heavily invested in FDWs, not only
       | for federated queries (e.g. querying data at Snowflake - btw, you
       | might enjoy our blog post on achieving a 100x speedup with
       | aggregation pushdown [1]), but also for queries on warehoused
       | data stored as Splitgraph images.
       | 
       | By keeping Postgres compatibility as our guiding constraint,
       | we've been able to build a lot of functionality on top of just a
       | few simple abstractions. The result is something akin to a magic
       | Postgres database - you can connect dozens of live sources to it
       | using FDW plugins, or you can ingest from hundreds of data
       | sources using Airbyte connectors, ultimately storing the data as
       | immutable Splitgraph images in object storage. But as far as your
       | standard Postgres client is concerned, you're just connected to a
       | normal database with 40k tables on it.
       | 
       | As for the wire protocol, our implementation is heavily reliant
       | on (a forked version of) PgBouncer. Basically, a query arrives,
       | we parse it for references to tables (which look like Docker
       | image tags), and the proxy layer performs whatever orchestration
       | is necessary to satisfy the query. That could mean instantiating
       | a foreign server to a saved connection, loading some data from
       | object storage, or even lazily loading only the requisite data
       | (we call this "layered querying" since it's implemented similarly
       | to AUFS). In the future, it could also mean delegating the query
       | to a more specialized engine like Presto.
       | 
       | Point is, by keeping the frontend intact, we're able to retain
       | compatibility with all Postgres clients, but we're free to
       | implement the backend in more scalable or domain-specific ways.
       | For example, we're able to horizontally scale our query capacity
       | by simply adding more "cache nodes" that perform the layered
       | querying.
       | 
       | We are definitely all-in on the Postgres wire protocol, and all
       | the ecosystem compatibility that comes along with it. You can
       | read our blog for more in-depth discussions of this, but I don't
       | want to spam too many links here. :)
       | 
       | [0] https://www.splitgraph.com
       | 
       | [1] https://www.splitgraph.com/blog/postgresql-fdw-
       | aggregation-p...
        
       | eatonphil wrote:
       | Hey folks! Author here. Happy for questions, suggestions, flames.
        
         | lotsofspots wrote:
         | > Proprietary databases like Oracle and SQL Server find value
         | in developing their own drivers. They don't tend to publish
         | their wire protocol.
         | 
         | Just to note, MS have long published the details of their wire
         | protocol, TDS - https://docs.microsoft.com/en-
         | us/openspecs/windows_protocols...
        
           | eatonphil wrote:
           | Thanks I should be fairer to them. Updated the post to note
           | this.
        
             | gfody wrote:
             | btw I think you meant to say "worth noting" instead if
             | "worth nothing" ;)
        
               | eatonphil wrote:
               | lol, oops. Not a freudian slip. Edited, thank you.
        
         | [deleted]
        
       | patrickdevivo wrote:
       | Thanks for this write up! I've been really interested in postgres
       | compatibility in the context of a tool I maintain
       | (https://github.com/mergestat/mergestat) that uses SQLite. I've
       | been looking for a way to expose the SQLite capabilities over a
       | more commonly used wire-protocol like postgres (or mysql) so that
       | existing BI and visualization tools can access the data.
       | 
       | This project is an interesting one:
       | https://github.com/dolthub/go-mysql-server that provides a MySQL
       | interface (wire and SQL) to arbitrary "backends" implemented in
       | go.
       | 
       | It's really interesting how compatibility with existing protocols
       | has become an important feature of new databases - there's so
       | much existing tooling that already speaks postgres (or mysql),
       | being able to leverage that is a huge advantage IMO
        
       | ryanbooz wrote:
       | Timescale DevRel here!
       | 
       | Great observations around how popular and useful it is for
       | systems to have some level of Postgres line compatibility! Thanks
       | for doing the writeup!
       | 
       | One thing that's not totally clear in the comparison table is
       | that most of these implementations are just that, databases
       | providing an abstraction through the Postgres wire protocol over
       | a different database architecture/parser & solution.
       | 
       | TimescaleDB, however, is the only one listed that is built
       | directly on Postgres as an extension. There is no abstraction
       | layer over some other implementation of the query parser or
       | storage layer.
       | 
       | When it comes to application maintenance and developer
       | expectations at query time, not all of these are apples-to-apples
       | comparisons, as some of the other comments have noted.
        
         | proddata wrote:
         | CrateDB DevRel here :)
         | 
         | > databases providing an abstraction through the Postgres wire
         | protocol
         | 
         | I would not call it an abstraction, if one has a full parser,
         | analyzer, planner and execution engine. It is just a common
         | language ;)
        
         | eatonphil wrote:
         | While the table doesn't say that much about intent (other than
         | what you might infer from the SQL Compatibility, ACID and
         | Category columns), I had hoped it was clear from the rest of
         | the article that I was saying there is no relationship between
         | the wire protocol and being PostgreSQL compatible in general.
         | 
         | > It does NOT encompass the actual query language itself, let
         | alone database semantics.
         | 
         | > This doesn't mean that any PostgreSQL or MySQL query will
         | work with them since, as mentioned previously, query language
         | and database semantics are independent of the wire protocol.
         | 
         | > To reiterate, the wire protocol doesn't specify anything
         | about the query language.
         | 
         | > Just because a database implements the PostgreSQL wire
         | protocol does not mean it intends to be a drop-in replacement
         | for PostgreSQL.
         | 
         | > So just looking at parser support is not enough to talk about
         | PostgreSQL query or semantic compatibility. But it is a
         | starting point.
         | 
         | But I guess it wasn't clear.
        
       | epberry wrote:
       | Well done. Couple of notes:
       | 
       | - YugabyteDB is currently on Postgres 11.2 with plans to keep
       | upgrading.
       | 
       | - Cockroach is not pinned to any Postgres version so its
       | compatibility is reliant on their implementation. Not a bad
       | thing, just different than YB.
       | 
       | - Like you said Timescale is in lockstep with latest Postgres (14
       | at the time of this comment)
       | https://blog.timescale.com/blog/massive-scale-for-time-serie...
       | 
       | - Materialize is not really focused on scaling because that is
       | not really what it is for, https://materialize.com/blog-roadmap/
       | 
       | - I might consider moving Spanner to the Somewhat compatible
       | category. Lack of stored procedures and triggers makes it not a
       | drop in replacement for those moving from Oracle/MS SQL
       | https://cloud.google.com/spanner/docs/reference/postgresql/o...
       | 
       | - For native Postgres scaling strategies, we could call attention
       | to Crunchy Data or Percona
       | 
       | Very nice to lay this out and I pretty much agree with the labels
       | and categories on here. I would be you will get some people
       | reaching out since Postgres compatibility is something these
       | vendors compete on.
        
         | benesch wrote:
         | - Materialize is not really focused on scaling because that is
         | not really what it is for
         | 
         | Horizontal scalability is actually something that Materialize's
         | underlying incremental computation engine (timely/differential
         | dataflow) excels at. We're actively working to expose that tech
         | in Materialize. (As in: many engineers, including me, are
         | working on this full time; happy to answer questions!)
         | 
         | It is too bad that you got the wrong impression from that blog
         | post! That blog post is about two years out of date. We're
         | working on an updated roadmap blog post. The tl;dr is that
         | we've since moved from step 3 on the roadmap ("Build a single-
         | node database management system") to step 4 ("Build cloud-
         | native elasticity and replication: Materialize Cloud").
        
       | linsomniac wrote:
       | Postgres wire compatibility does not imply Postgres
       | compatibility.
       | 
       | I recently got bit by this, I wanted to convert a mail address
       | database from MySQL Galera to CockroachDB, because CockroachDB
       | was just a bit easier to deal with.
       | 
       | Unfortunately, Cockroach only support UTF-8, and Postgres only
       | supports LATIN1 for the database (compiled into code). I believe
       | that there are reasons for both sides to make this choice,
       | unfortunately it turned what I hoped would be a slam dunk into a
       | round peg in a square hole.
        
         | linsomniac wrote:
         | Correction, which I can't do because it's been downvoted,
         | _POSTFIX_ only supports LATIN1, so I can 't use Cockroach with
         | it.
        
         | anarazel wrote:
         | What do you mean with postgres only supporting LATIN1?
        
           | linsomniac wrote:
           | Dang, I meant Post _FIX_ only supports LATIN1.
        
       ___________________________________________________________________
       (page generated 2022-02-10 23:01 UTC)