[HN Gopher] Pg_jsonschema - JSON Schema Support for Postgres
       ___________________________________________________________________
        
       Pg_jsonschema - JSON Schema Support for Postgres
        
       Author : awalias
       Score  : 200 points
       Date   : 2023-03-22 08:32 UTC (14 hours ago)
        
 (HTM) web link (supabase.com)
 (TXT) w3m dump (supabase.com)
        
       | naedish wrote:
       | This is interesting. Would be curious to see if this can replace
       | pydantic for specific cases.
        
         | kkirsche wrote:
         | With pydantic's funding, it wouldn't surprise me if they enter
         | this space themselves (the company side not the existing python
         | library side)
         | 
         | Funding Ref: https://pydantic.dev/announcement/
        
       | mehdix wrote:
       | The article is also a joy to read. Well done.
        
       | Kydlaw wrote:
       | > 2022-08-19
       | 
       | Has anyone tried it? Any feedback on it?
        
         | wendyshu wrote:
         | It works just fine but it will only give a boolean value
         | valid/invalid, it won't tell you why it's invalid.
        
       | julian37 wrote:
       | I've found that this PL/pgSQL implementation also works well when
       | you don't need maximum performance:
       | 
       | https://github.com/gavinwahl/postgres-json-schema
        
       | marwis wrote:
       | The detail error from example is pretty awful and missing any
       | useful detail. Is it the limitation of the jsonschema library?
        
       | korijn wrote:
       | So how do you apply this in the real world when dealing with
       | schema versions and database migrations?
        
         | dtech wrote:
         | You could add a schema_version column and use one constraint
         | per schema version. It will only be per major version, if the
         | schema is backwards compatible there's no problem.
        
         | willmeyers wrote:
         | Add a new constraint that supports the new schema version,
         | migrate json columns to conform to new schema as need be, and
         | then drop old constraint. This assumes your schema is backwards
         | compatible. If it isn't I would probably not use a DB
         | constraint and just handle everything on the application layer.
        
           | korijn wrote:
           | Thanks, that makes sense.
        
           | alexvoda wrote:
           | If the schema is not backwards compatible it either means you
           | will have to do a data migration or it means you want to keep
           | track of the schema version along with the data. In both
           | cases it is much easier if the json columns are split off
           | into a separate table.
        
       | raydiatian wrote:
       | This looks rad.
        
       | hhthrowaway1230 wrote:
       | Ah yeah super nice always wanted schema support in progress love
       | to see them hammering away on top of the solid foundation of
       | postgres
        
       | onderkalaci wrote:
       | PG community had a similar patch, which got reverted from PG 15
       | on the last minute: https://www.depesz.com/2022/04/06/waiting-
       | for-postgresql-15-...
        
         | [deleted]
        
         | kiwicopple wrote:
         | this would be our preference. we'll try to support this for the
         | next commitfest, and if it gets merged then we will deprecate
         | our extension in favour of the native solution.
        
           | mkurz wrote:
           | Is there a chance it gets committed during the current
           | commitfest? See https://commitfest.postgresql.org/42/4086/
           | 
           | This years feature freeze will be on April 8th
           | (https://www.postgresql.org/message-id/flat/9fbe60ec-
           | fd1b-6ee...), so if it does not get committed within the next
           | two and a half weeks it will miss this years Postgres release
           | in September...
        
             | kiwicopple wrote:
             | I'll flag it with the team this week. I'm not sure what the
             | blocker was previously, but it might just be a matter of
             | submitting the patch again (with minor changes) so that
             | it's "in" for commitfest, with someone willing to own the
             | work over the next few months.
        
               | mkurz wrote:
               | I think the things that needed to be fixed from last year
               | are already committed (more general stuff not directly
               | related to the JSON patches). Also according to this
               | message at least partial stuff from the JSON patches
               | should be committed "...in the next few days..." however
               | that was two weeks ago:
               | https://www.postgresql.org/message-
               | id/454db29b-7d81-c97a-bc1...
               | 
               | I am a bit worried, even though the patches seem to be
               | "stable", they will miss the deadline... (since I would
               | need those features as well)
        
               | kiwicopple wrote:
               | I don't know any of the people involved in this patch, so
               | I've sent it to Alexander Korotkov to get his opinion.
               | I'll let you know his response after he has a chance to
               | look at it.
        
               | kiwicopple wrote:
               | Alexander's response:
               | 
               | > This is very long story starting from 2017. This patch
               | should finally be committed. Some preliminary
               | infrastructure already landed to PostgreSQL 16. Regarding
               | SQL/JSON itself I doubt it will be committed to
               | PostgreSQL 16, because feature freeze is coming soon.
               | It's likely be postponed to PostgreSQL 17.
               | 
               | > Regarding replacement for pg_jsonschema, I don't think
               | it will be a good replacement. Yes, one can construct a
               | jsonpath expression which checks if particular items have
               | particular data types. But I doubt that is nearly as
               | convenient as jsonschema.
               | 
               | It looks like there would still be some benefit for
               | pg_jsonschema, unless the community decided that they
               | wanted support jsonschema validation. We could propose
               | this, but I don't think it would arrive to pg core any
               | time soon.
        
           | debugdog wrote:
           | Well written article! Also very glad to hear your approach to
           | support the native implementation. For all of our projects
           | when we're integrating external services we usually keep the
           | relevant original JSON responses as a jsonb as kind of a
           | backup. Next to that we extract the data we'll be using to
           | queryable data. To be able to use those "dumps" directly
           | would be a nice thing to have.
        
       | Pxtl wrote:
       | I've tried to use JSON Schema and the big gap I couldn't figure
       | was how to handle polymorphism.
       | 
       | That is, if I have a case where object                   {
       | "Foo": {                 "Type":"Fred"                 "Bar":1
       | "Baz":2                 "Quux":2             }         }
       | 
       | and the object                   {             "Foo": {
       | "Type":"Waldo"                 "Bar":1                 "Corge":2
       | "Xyzzy":7             }         }
       | 
       | are both valid, without just allowing any object members or
       | allowing the union of their members.
       | 
       | I did a hack by multiplexing the types into a child-object, but
       | that was ugly and clumsy.
       | 
       | In XSD or any statically-typed programming language I could
       | handle this trivially using types and polymorphism, because
       | "Fred" and "Waldo" would be different types.
       | 
       | But I can't figure out how to do that in Json Schema.
        
         | mullsork wrote:
         | IIRC that's what oneOf is for. i.e. a discriminated union / sum
         | type. My experience with oneOf is that tooling support for it
         | is terrible.
        
           | Pxtl wrote:
           | Ah, I didn't realize you could use oneOf with refs or
           | objects, I've only ever seen it used with primitives. Thanks!
        
       | boomskats wrote:
       | This is long overdue and elegantly done. Great work!
       | 
       | On a personal note, it's great to finally see Neon & Supabase
       | playing with each other. Much more interesting to me than Hasura.
        
         | tough wrote:
         | The power of FOSS
        
           | thangngoc89 wrote:
           | Hasura is also FOSS
        
             | zwily wrote:
             | Hasura is open core. Given the massive price increase they
             | just did for their hosted version, I'd expect more and more
             | future features will not land in core at all, to push
             | people to pay.
        
               | tough wrote:
               | AFAIK I have not run supabase in my own infra but they
               | seem to allow you to do so and are quite good citizens
               | publishing all their built tools on top of pg or whatever
               | and as far as I remember with sane licenses.
               | 
               | I love supabase, neon are new-ish but a great alterantive
               | for hosted serverless databases (they also did a great
               | staging-db-for-pr's) when launched that we integrated at
               | work quite soon while on beta and saved a lot of
               | headaches of introducing new features that touched
               | database before
        
               | kiwicopple wrote:
               | > sane licenses
               | 
               | FYI, our policy is to use only MIT, Apache2, or
               | PostgreSQL licenses. You can run it all on your own
               | infra, with instructions here:
               | https://supabase.com/docs/guides/self-hosting
        
               | klabb3 wrote:
               | It's already quite bad unfortunately. Both support for
               | read replicas and metrics(!) are not in the open source
               | version. They have a prometheus exporter but not on free,
               | afaik.
               | 
               | I was also concerned with people reporting memory
               | consumption/leak issues, as I'm planning to have lots of
               | subscriptions. I don't know haskell well enough, but from
               | the outside it does match the symptoms of having dug
               | themselves into an architectural complexity hole, which I
               | assume is much harder to navigate with unorthodox tooling
               | and practices.
               | 
               | Im still rooting for hasura, I like their way of making
               | things dumb, simple and without too much technophilia for
               | its own sake. They genuinely want you to focus on your
               | business problems instead of blasting you with novel and
               | overengineered concepts, like others do.
        
               | zwily wrote:
               | Yeah, I love the product. I've been using it a few years
               | in a fairly complicated internal tool used by a dozen
               | people, and it's been amazing. But I'm pretty discouraged
               | by what I see as its future.
        
               | tango12 wrote:
               | (from Hasura)
               | 
               | Scaling subscriptions is hard, but we work with our
               | users/customers at scale to make sure settings are
               | tweaked correctly.
               | 
               | We have users running 100k - 1M concurrent users in
               | production for live-event type platforms. It's not
               | completely trivial to benchmark and setup because query
               | patterns, streaming vs live queries etc have an impact,
               | but it works very reliably. No missing events, no
               | problems disconnecting/reconnecting, no need for sticky
               | sessions and so on.
               | 
               | An initial POC benchmark [1] should be a quick affair so
               | if you're trying it out and run into any problems, please
               | hit me up! Email on my bio.
               | 
               | [1]: https://github.com/hasura/graphql-bench
        
               | tango12 wrote:
               | Hasura CEO here. Totally hear you.
               | 
               | We're working on a new serverless infrastructure layer
               | that'll make the pricing better for users compared to a
               | DIY API server or to a self-hosted Hasura.
               | 
               | It's a significant engineering lift on our side - at its
               | core we're engineering Hasura to achieve 90%+
               | infrastructure utlization (no cold-start, sub-millisecond
               | auto-scaling), and that's what will allow us to do this.
               | 
               | Not there yet, but we'll be demo-ing and talking about
               | the engineering at HasuraCon in June!
        
         | endisneigh wrote:
         | There's nothing stopping you from using this with Hasura. Don't
         | see the relevance.
        
       | [deleted]
        
       | rapfaria wrote:
       | Wouldn't you wanna save the json even if the structure is
       | incorrect?
        
         | rco8786 wrote:
         | If you want that behavior, just don't use json schema.
        
       | roenxi wrote:
       | I can see some technical advantages to supporting JSON schema
       | directly; but I suspect most people will be using this extension
       | because SQL is really ugly and they don't want to use it to set
       | up their schemas in pg proper.
       | 
       | It says a lot about how weak the SQL syntax is. An extension to
       | replace CREATE TABLE with a JSON schema construct would be wildly
       | popular.
        
       | anonu wrote:
       | well written article and looks like a great extension. However,
       | my only issue with JSON Schema is that it becomes unsupportable
       | once your JSON objects get too big: to many keys or too many
       | items in an array for example. If you are looking to find "where"
       | the issue is in your JSON object, most schema validators don't
       | provide enough guidance, they just say "something is wrong" as it
       | appears this one does.
        
       ___________________________________________________________________
       (page generated 2023-03-22 23:01 UTC)