[HN Gopher] Pg_GraphQL: A GraphQL Extension for PostgreSQL
___________________________________________________________________
Pg_GraphQL: A GraphQL Extension for PostgreSQL
Author : samwillis
Score : 319 points
Date : 2021-12-03 14:59 UTC (8 hours ago)
(HTM) web link (supabase.com)
(TXT) w3m dump (supabase.com)
| the_arun wrote:
| What are the use cases we could use a direct plugin to database
| for GraphQL? I mean where do we write business logic before data
| becomes GraphQL?
| warent wrote:
| Security definers are a powerful feature of postgres that allow
| a lot of business logic to live in the database. With an
| extension like this, I reckon postgres can theoretically become
| a first class public api for a surprising number of use cases.
| n0w wrote:
| As someone that has used Hasura very heavily in the past, I can
| see the operational benefit of not needing to manage one more
| service.
|
| The talk about memory constraints seems a bit odd though. The
| extension is still going to require memory and without profiling
| the alternatives it seems odd to say: "we won't use any more
| memory with an extension". Especially when you say "these
| established/stable alternatives fulfill all our feature
| requirements". In practice I always found Hasura to be reasonably
| lightweight (I can't speak for PostGraphile).
|
| I think there's an advantage to running this API layer as an
| additional service in front of the DB though. Then it can act as
| an API gateway to more than just your database.
|
| Also, the generation of a "Relay style" schema is off putting for
| me. I'm really not a fan of the style and was one of the biggest
| advantages for me in using Hasura.
| halfmatthalfcat wrote:
| It is nice Hasura offers it as well via the /v1beta1/relay
| endpoint[1].
|
| [1]
| https://hasura.io/docs/latest/graphql/core/databases/postgre...
| akulkarni wrote:
| As co-founder of another database company in the PostgreSQL
| ecosystem, I have to say that I'm really impressed with the
| quality and velocity of launches from the Supabase team. Nothing
| else to add, except please keep up the great work!
| kiwicopple wrote:
| thanks Ajay, we're also big fans of yours here at Supabase :)
| nextaccountic wrote:
| Hey it's unrelated but, what's the relationship between
| Supabase and Postgrest?
|
| My understanding is that Postgrest is a separate project that
| began development in an unrelated way to Supabase, and
| Supabase is just an user. Is that correct?
|
| But, Supabase is also a contributor to Postgrest, right? Does
| Supabase employs engineers dedicated to it?
| oliverrice wrote:
| > What's the relationship between Supabase and Postgrest?
|
| Supabase uses PostgREST for its automatic/reflected REST
| API
|
| > Supabase is just an user. Is that correct?
|
| yep!
|
| > Does Supabase employs engineers dedicated to it?
|
| https://supabase.com/blog/2020/06/15/supabase-steve-chavez
| nextaccountic wrote:
| Cool, thanks!
| nextaccountic wrote:
| What's your company?
| mrkurt wrote:
| Timescale. Another Postgres adjacent startup that's doing
| amazing work.
| AdriaanvRossum wrote:
| Click his name if you want to find out :)
| pictur wrote:
| supabase look like so helpful tool. which has anyone use in
| production? does it have a hasura like event trigger mechanism?
| oliverrice wrote:
| Yes, event triggers are built right into Postgres so they're
| supported out-of-the-box
|
| Event triggers are the mechanism pg_graphql uses to keep the
| GraphQL schema up-to-date with the SQL schema
| Apaec wrote:
| How does it compare with
| https://github.com/solidsnack/GraphpostgresQL? It's like 7 years
| older and it takes the same approach, it seems.
| oliverrice wrote:
| Very cool! I was not familiar with this project. It does takes
| a similar approach to query building. It is described in the
| README as an alpha POC
| [deleted]
| solidsnack9000 wrote:
| GraphpostgresQL is no longer maintained and hasn't been updated
| for many years.
|
| I am the person who wrote it. It was a proof-of-concept,
| written in PL/pgSQL. This made it fairly easy to set up and
| test but made maintenance and contributions very difficult.
| mritchie712 wrote:
| Never ceases to amaze me when someone pops up on HN with a
| "yeah, I made that, ask me anything".
|
| Nice work, looks like an awesome project.
| halfmatthalfcat wrote:
| Wow, with Relay built-in. Looks great!
| ancharm wrote:
| How long before someone just goes into the Postgres source code
| and creates true native support for GraphQL
| whazor wrote:
| An extension is the best next thing to native support and in my
| opinion a better place for such functionality.
| lucasyvas wrote:
| There is no way every cloud provider won't jump on this as soon
| as it's stable for their Postgres offerings.
| darksaints wrote:
| Hopefully so. It is awesome that supabase is releasing this as
| open source, as they could have easily kept this proprietary. I
| probably won't ever be able to use supabase for my primary job
| (the company is obnoxiously invested in an Azure/AWS hybrid
| cloud). But I do have some profitable side projects with the
| ability to choose my own service providers, and this sort of
| developer goodwill really goes a long way.
| yurisagalov wrote:
| Supabase is open source. I believe the hosted version is on
| AWS, but there's no reason you couldn't run it on your
| company's Azure/AWS directly? :)
| brap wrote:
| That's interesting. How does it deal with authentication and
| authorization? It it handles those well, it could be a real
| "serverless" solution for most CRUD apps.
| dljsjr wrote:
| They mention in the article that auth is handled by reusing the
| existing row level security you'd already use to secure the DB
| (i.e. `CREATE POLICY`).
| clessg wrote:
| As the sibling comments point out, it uses Postgres Row Level
| Security (RLS). For an approachable introduction to how
| Supabase's auth works:
|
| https://supabase.com/docs/guides/auth
|
| > 1. A user signs up. Supabase creates a new user in the
| auth.users table.
|
| > 2. Supabase returns a new JWT, which contains the user's
| UUID.
|
| > 3. Every request to your database also sends the JWT.
|
| > 4. Postgres inspects the JWT to determine the user making the
| request.
|
| > 5. The user's UID can be used in policies to restrict access
| to rows.
|
| > Supabase provides a special function in Postgres, auth.uid(),
| which extracts the user's UID from the JWT. This is especially
| useful when creating policies.
|
| For instance, say you have a `todos` table and want to make it
| so users can only read their own todos - you could have an RLS
| policy `todos.user_id = auth.uid()`. Afterward, `SELECT * FROM
| todos` will only return the authenticated user's todos.
| (Equivalent to manually issuing `SELECT * FROM todos WHERE
| todos.user_id = auth.uid()`.)
|
| There's also `auth.role()` so you can easily restrict access by
| role: `auth.role() = "admin"`
| majkinetor wrote:
| Is there integrated way to achieve claim based approach where
| access to a given resource is governed by both uid (lets say
| org uid, and all users belong to particular org) and set of
| claims (lets say there are at least CRUD claims for all
| tables and I can set for any user any combination of them, or
| create a role containing them then assign that role to a
| user) ?
| oliverrice wrote:
| The extension is compatible with your existing row level
| security policies. If you connect to the database as a role
| like `authenticated` then those policies will be applied.
|
| The columns and tables that are visible are also controlled by
| the role.
|
| One cool thing about that approach is you could run e.g. an
| admin API and a user facing API all from that same endpoint by
| executing as different postgres roles!
| ndejaco wrote:
| I see this as a neat way of exposing a graphql interface but a
| pattern with a number of limitations that are traded off against
| the goal of reducing the network latency. Definitely useful for
| users who want a single datasource and only want to expose access
| through GraphQL (cannot scale db resources indepedently of the
| api access layer). Wondering if the graphql engine could make use
| of the postgres stats collector for optimizing graphql resolver
| execution ex) resolver dispatch scheduling or pre-execution query
| optimization?
| andrewingram wrote:
| Whilst I don't like the paradigm of generating a GraphQL schema
| from your database (or vice-versa!), I appreciate this one for
| having a specific goal of being able to run as part of a small
| database VM. So congrats on the release!
| nick__m wrote:
| Ideally you don't expose the schema directly, you make views in
| another schema and expose that instead of exposing your real
| tables.
| cmrajan wrote:
| I've explored multiple Graphql solutions for PostgreSQL in the
| past and couldn't convince of the resolvers based approach and
| always felt that an extension would be the right solution. The
| closest I've come across and used so far is Graphjin
| https://github.com/dosco/graphjin . Thanks Supabase for the
| fantastic extension!
| PeterZaitsev wrote:
| Wow,
|
| Interesting - many extensions are coming up allowing different
| interfaces to PostgreSQL! Other "Protocol Convertors" launched
| recently
|
| MS SQL - https://babelfishpg.org/ MongoDB -
| https://www.ferretdb.io/
| oliverrice wrote:
| author here! happy to answer questions
| rubyist5eva wrote:
| how do you implement search/sort against something like this?
| thelastbender12 wrote:
| > After tallying the resources reserved for PostgreSQL,
| PostgREST, Kong, GoTrue, and a handful of smaller services, we
| were left with a total memory budget of ... 0 MB
|
| This was a really interesting bit to me, could you give detail
| on how the memory usage gets split up across these? Thank you.
| oliverrice wrote:
| sure, if we're talking free-tier:
|
| Postgres runs in a 1 GB VM all by itself with some
| optimizations to get the most that limited hardware
|
| All other services are in a second 1 GB VM.
|
| You can see a list of those services here https://github.com/
| supabase/supabase/blob/master/docker/dock...
|
| The memory use per process can differs by use-case, so its
| important to leave a bit of headroom. For example,
| PostgREST's memory consumption can grow if the amount of data
| being returned from its queries is large.
|
| By the time you include:
|
| - supabase studio
|
| - kong
|
| - auth
|
| - storage
|
| - meta
|
| if they each take only 100MB, its pretty snug on a 1 GB VM!
| andrew_ wrote:
| We use Postgraphile heavily, it drives the entirety of our API.
| A few feature questions; Is there any plan to support, or has
| there been any conversation around:
|
| - custom directives that run custom code
|
| - custom routes on the server that do custom things? Or perhaps
| proxying to another app to handle alternate/custom routes
|
| - schema injection, custom resolver logic
| httgp wrote:
| Very excited about!
|
| Do you have performance comparisons for the same datasets with
| Hasura / Graphile?
| oliverrice wrote:
| We don't have an apples-to-apples comparison for similar
| hardware and queries at this point but here are the links to
| performance figures for pg_graphql and grahile. Hasura does
| not publish throughput number afaik
|
| pg_graphql:
| https://supabase.github.io/pg_graphql/performance/
|
| graphile: https://www.graphile.org/postgraphile/performance/
|
| We'll certainly be keeping an eye on performance as it gets
| closer to GA
| samwillis wrote:
| Hi Oliver, Supabase team,
|
| Love what you are doing with Supabase!
|
| Quick question, have you considered building any kind of local
| mirroring system for offline mobile app/PWA, say on top of
| SQLite? Something like Realm for mongodb or PouchDB/Couchbase
| Lite for CouchDB/Couchbase.
|
| It would obviously need devs to add some extra columns to
| tables for tracking, and a way to define the merge/overture
| characteristics for each column, but it would be awesome to
| have something like that!
|
| It's something I have thought about building for a while but
| never found the time. (I want to combine it with Yjs for
| collaborative offline rich text editing)
| config_yml wrote:
| So you're calling Graphql via PostgREST, but I didn't ready
| about why you're including Graphql in the first place. Isn't it
| orthogonal to PostgREST which is already used in your stack?
| oliverrice wrote:
| PostgREST does solves a few of the core challenges that
| GraphQL is intended to address like over/under-selection and
| resource embedding (relationships).
|
| Even so, there has still been a lot of interest in GraphQL so
| users can leverage the growing ecosystem for things
| reflecting the data model/types for client usage, offline
| caching, etc
| config_yml wrote:
| Thanks, this was missing for me in the Motivation section
| on the blog post. It was a genuine question, not sure why
| it's downvoted.
| kiwicopple wrote:
| just want to give a big shout out to Oli - his implementation
| really ingeneous, leveraging multiple parts of the Supabase
| stack while also being agnostic enough to work natively inside
| Postgres (or with other tools)
|
| {supabase team}
| nwienert wrote:
| Congrats on release. We use Hasura but haven't been happy with
| their speed of iteration, lack of communication of various
| large bugs, and lack of M1 support / communication.
|
| But we do use a lot of their more advanced features like being
| able to use aggregates in sorts, aggregates in results, custom
| functions, etc. What are your plans there and will you have a
| public roadmap?
| oliverrice wrote:
| There is high level public roadmap that covers the minimum
| features required before we'd consider going GA here:
|
| https://supabase.github.io/pg_graphql/roadmap/
|
| Its early days, so the conversations around aggregates
| haven't happened yet, but I'm optimistic that they'll make an
| appearance in a future release
| ptrwis wrote:
| Am I correct in saying that for read queries, you essentially
| translate the GraphQL query into nested jsonb_agg calls?
| kaspermarstal wrote:
| I've been looking for something like this for a long time, so
| cool! Great work!
|
| Does it work with TimescaleDB?
| oliverrice wrote:
| It has not been tested, but I'm not aware of any differences
| Timescale introduces that would be a compatibility issue.
| rlili wrote:
| Does this work in YugabyteDB/CockroachDB?
| nikivi wrote:
| Curious how it compares to https://www.graphile.org/postgraphile/
| oliverrice wrote:
| its very similar in goals to postgraphile
|
| as samwillis mentioned, the memory footprint is tiny, which is
| a big perk for supabase's platform (or if you're self hosting)
| but its also fully language agnostic which opens up lots of
| options for extensibility:
|
| For simple use-cases you can expose the graphql functionality
| over http using a PostgREST as described here
| https://supabase.github.io/pg_graphql/quickstart/
|
| but, if you want more configuration like adding in middleware
| or wiring it up to an existing backend application, you can do
| that from any programming language that can connect to
| postgres, rather than only javascript
| underbluewaters wrote:
| I had to make sure someone had plugged postgraphile here. It's
| a great system. What impresses me most about it is how it grows
| with the complexity of your application. There are so many
| thoughtful points of extensibility built in, I always have
| confidence that I can go into the docs and find a way to do
| what I need to.
|
| That reminds me, I need to figure out how to get my employer to
| sponsor the project...
| gardnr wrote:
| You could hire one of the maintainers to consult on your
| project.
| samwillis wrote:
| The article suggests they built it so that by running it within
| the DB it would have less compute overhead and they could match
| GraphQL requests to SQL transactions 1:1.
| gregplaysguitar wrote:
| Just noting that Graphile also uses transactions - that's a
| given for this sort of tool I'd say.
| xrd wrote:
| I've been using Hasura extensively and love it. I'm curious what
| the delta is between the two. It's so simple to run Hasura on
| dokku and instantly get an amazing UI for postgres with the bonus
| of graphql. It would take a lot for me to switch.
| piaste wrote:
| > It would take a lot for me to switch.
|
| We were pretty happy with Hasura but had to switch to
| Postgraphile due to poor multi-database support, bummer.
|
| (Postgraphile is not as polished as Hasura in some ways, but
| since it can used as a library, it's easy to dynamically create
| N instances of it with different configurations at runtime.
| Hasura required our ops team to define a new instance of the
| service in the docker-compose.yml file for each database)
| kall wrote:
| Pretty recently, as of 2.0, hasura has added support for
| multiple databases.
| piaste wrote:
| It's a VERY fragile support:
|
| https://github.com/hasura/graphql-engine/issues/6648
|
| Basically, if two entities in the databases have the same
| names, Hasura fails unless you manually define a unique
| `custom_name` for each such entity.
|
| Given that the most common multiple database scenario
| involves different databases with either the exact same
| schema (one-db-per-tenant) or similar schemas (staging vs.
| production database), it forces you to painstakingly set a
| custom_name for basically every single entity in your db.
|
| Thankfully there is an API so in theory you could set this
| programmatically, but it still means that your client code
| needs to be manually kept in-sync with whatever custom name
| generator rule you used.
| jensneuse wrote:
| Hey, I'd be curious to hear your thoughts on our solution. It
| allows you to combine any number of Databases with
| PostgreSQL, MySQL, SQLite and SQLServer are supported, here's
| the full list of supported DataSources: [0]
|
| Our solution comes with a feature called Namespacing [1],
| which means, every API has its own namespace so there are 0
| collisions between the different types and fields. It even
| goes so far that we also namespace directives so you can have
| a combined schema of multiple GraphQL APIs and can still use
| the namespaced directives on fields from that particular
| upstream.
|
| Disclaimer, I'm the founder of WunderGraph.
|
| [0]: https://wundergraph.com/docs/overview/datasources/overvi
| ew#o...
|
| [1]: https://wundergraph.com/docs/overview/features/api_names
| paci...
| supa-osum wrote:
| How does it compare to Apache Age?
| oliverrice wrote:
| Apache Age is based around the openCypher graph query language
| rather than GraphQL
| wiradikusuma wrote:
| I use Dgraph, a "native" GraphQL DB. But I still transform the
| incoming/outgoing JSON to Java object and transforming the object
| again to/from the DB. Why? 2 things: Business Logic and
| Security/Access Control. Yes I can put (some) Business Logic in
| Dgraph DB, but it feels "leaky" like Stored Procedure (maybe
| because I come from Java). I feel it's tiring and stupid, but I
| don't know a better solution.
|
| I reckon for this extension, the business logic uses Stored
| Procedure and Access Control uses PG's user role? Many apps I
| know simply have 1 user "myappuser" (or even default user) to
| access its DB.
| Twisol wrote:
| > Many apps I know simply have 1 user "myappuser" (or even
| default user) to access its DB.
|
| Sure; either those apps don't need to differentiate access
| between their users, in which case one role is sufficient, or
| they reimplement their own auth system, in which case you'd use
| Postgres' own rather robust auth system instead. It comes down
| to the needs of the domain; you'll solve the problems
| differently depending on what approach you take, but you need
| to solve the same problems.
|
| Yes -- I've found it very tiring, as you put it, to keep
| reimplementing the same boilerplate in every API server just to
| lift the operations my database can already support out to an
| HTTP frontend. Postgres' auth means I don't have to make or
| press into service a separate auth system, and there are
| multiple ways to handle business logic orthogonally.
|
| Stored procedures and triggers work well, but are synchronous
| within the current transaction, and sometimes simply don't map
| well to the domain needs. You can also use the AWAIT and NOTIFY
| statements to set up asynchronous external workers. I find this
| has a positive effect on the data model, as you're forced to
| consider what states a system will pass through during an
| asynchronous flow.
| darksaints wrote:
| This is awesome! Really creative approach to the problem. Does
| this also support custom datatypes that can be marshalled to
| json? For example, PostGIS geometries?
| oliverrice wrote:
| Currently, any datatypes that are not in the allow list
|
| https://supabase.github.io/pg_graphql/reflection/#type-conve...
|
| are cast as strings, but prioritizing a JSON conversion if one
| is available is a great idea that we'll look into
| oreilles wrote:
| Don't know how your JSON generation works but when using
| `row_to_json()`, all PostGIS geometries are automatically
| casted to GeoJSON objects by default which is very
| convenient.
| darksaints wrote:
| Personally I like the idea of allowing the user to create a
| cast:
|
| https://www.postgresql.org/docs/current/sql-createcast.html
|
| There are some weird permissions issues to work out IIRC, but
| this allows the user to specify how the json should be
| marshalled, and then can be implicitly or explicitly used by
| any sql statements. This is how PostgREST solves this
| problem, and I currently have a few custom casts created for
| range and multirange datatypes through my PostgREST server.
| intrasight wrote:
| This runs within the database engine? Doesn't that adversely
| affect scalability?
| a-priori wrote:
| On the contrary, this could scale much better.
|
| If the GraphQL server is a separate component, then a GraphQL
| query needs to do its own query planning, then the resolvers
| turn that into SQL queries to the database, then the database
| does its own query planning and execution for each query.
|
| Since one GraphQL query will often turn into multiple SQL
| queries, there's likely to be duplicated work on the database
| side across those queries since they relate to the same data.
|
| By integrating the GraphQL server into the DBMS, it can do
| query planning once for the whole GraphQL query, which means
| that it can reuse parts of the plan and prevent duplicated work
| and N+1 queries.
|
| Either way it's going to have to query the database to get all
| the data, so the essential work is the same. But this way you
| have opportunities to reduce wasted work to process the whole
| GraphQL query.
| syastrov wrote:
| Most likely the actual work done by this extension is little
| compared to the actual work required to perform the
| queries...but...
|
| If you want to scale horizontally you can run multiple read
| replicas with something which distributes connections evenly to
| the replicas.
| mmargerum wrote:
| Thanks for giving this to the community. Will definitely be
| looking at you for a SASS app i'm building.
| mritchie712 wrote:
| They just released a free course you might like:
| https://egghead.io/courses/build-a-saas-product-with-next-js...
| sporkland wrote:
| I enjoyed Sam Newman's take on this capability [1]
|
| > Great to see AWS providing direct data coupling as a service.
| /s
|
| > This service allows you to directly map a GraphQL endpoint to a
| database table. It's like putting getters and setters on an
| object and claiming your encapsulating private variables. The end
| result is coupling between GraphQL clients and the underlying
| datasource.
|
| > Information hiding is a key concept in independent change. Can
| I change the provider (of the GraphQL) endpoint independent of
| the clients? Directly exposing internal data structure makes this
| very difficult.
|
| And [2]:
|
| > So a few people have asked why I have this snarky response.
| What is my problem with this service? Well, to be clear, it's not
| an issue with GraphQL, it's an issue with direct coupling with
| underlying datasources #thread
|
| > The service as advertised makes it simple to map a GraphQL
| definition against a database. Now, what's the problem with this?
| Well, the devil here is in the detail. But fundamentally it comes
| down to how important information hiding is to you.
|
| > ... see the thread for more ...
|
| [1] https://twitter.com/samnewman/status/1346541251617828877 [2]
| https://twitter.com/samnewman/status/1346749556583780352
| Twisol wrote:
| I don't think Sam is wrong, exactly -- and that second thread
| does add some of the missing nuance -- but I do think there's a
| key missing piece here.
|
| He's absolutely right that you shouldn't couple directly to the
| underlying representation. But Postgres lets you transparently
| define views that can be queried (and, with a little more elbow
| grease, updated) just like any other table. You can provide
| decoupling from within the database, and do so on-demand as
| your domain and your data model evolve.
|
| I don't enjoy planting a separate bespoke API server on top of
| the database. Usually you end up lifting many of the same
| capabilities the database already has (auth, batching, ...) to
| your custom API, so a lot of the server is just boilerplate.
| Many API operations are natural consequences of your data
| model; there's little business or engineering value-add once
| you've settled on the latter, you're just writing glorified FFI
| bindings.
|
| Lazy engineering will cause problems no matter what
| architectural stack you end up using. But a state-first
| architecture doesn't have to mean a complete loss of loose
| coupling -- it just means different techniques for achieving
| it.
| doctor_eval wrote:
| Totally agree.
|
| I consider views and functions to be my database's API, and
| when using a wrapping tool (I've used Hasura) I get it to use
| only the views in a dedicated API schema rather than the
| tables directly.
|
| In addition to coupling, you want control over what tables
| and fields you publish.
|
| A simple view is literally one line of SQL; it's hardly a
| burden.
| AtNightWeCode wrote:
| For front-end development where you already using GQL as a
| layer on top of content I can see the use case for this.
|
| A lot of old systems have this tight coupling between databases
| and back-end code. I would advice against going down that path.
|
| GraphQL was designed to solve the flaws in the utterly poor
| service design at Facebook. Do not forget that.
| lstamour wrote:
| We could use PG views or functions to make our own abstraction
| layer inside the database and handle security better, etc. But
| putting that complicated database rebuild aside, just because
| we can publish GraphQL APIs from the database doesn't mean we
| have to: it would be interesting to use GraphQL as a
| communications layer from backend to database too. Just as
| frontend-to-backend could use GraphQL, so too could backend-to-
| database communication, but with different schemas (data
| models) between frontend and backend. Having a backend GraphQL
| server between the frontend and database could allow for
| connection pooling, schema versioning and abstraction, for
| proxying multiple databases, handling business logic and
| perhaps validating inputs, caching or managing compute tasks.
| agilob wrote:
| PostgreSQL has the great shiny features, but still doesn't
| support zero downtime upgrades?
___________________________________________________________________
(page generated 2021-12-03 23:00 UTC)