[HN Gopher] Pg_jsonschema - JSON Schema Support for Postgres
___________________________________________________________________
Pg_jsonschema - JSON Schema Support for Postgres
Author : awalias
Score : 200 points
Date : 2023-03-22 08:32 UTC (14 hours ago)
(HTM) web link (supabase.com)
(TXT) w3m dump (supabase.com)
| naedish wrote:
| This is interesting. Would be curious to see if this can replace
| pydantic for specific cases.
| kkirsche wrote:
| With pydantic's funding, it wouldn't surprise me if they enter
| this space themselves (the company side, not the existing Python
| library side).
|
| Funding Ref: https://pydantic.dev/announcement/
| mehdix wrote:
| The article is also a joy to read. Well done.
| Kydlaw wrote:
| > 2022-08-19
|
| Has anyone tried it? Any feedback on it?
| wendyshu wrote:
| It works just fine, but it will only give a boolean
| valid/invalid result; it won't tell you why it's invalid.
| julian37 wrote:
| I've found that this PL/pgSQL implementation also works well when
| you don't need maximum performance:
|
| https://github.com/gavinwahl/postgres-json-schema
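| (For illustration, a minimal sketch of using it in a check
| constraint - assuming that extension's validate_json_schema(schema,
| data) function; the table and schema are made up:)
|
|       create table book (
|         id serial primary key,
|         attrs jsonb not null,
|         -- reject rows whose attrs don't match the (made-up) schema
|         constraint attrs_is_valid check (
|           validate_json_schema(
|             '{"type": "object", "required": ["title"]}', attrs)
|         )
|       );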
| marwis wrote:
| The error from the example is pretty awful and missing any
| useful detail. Is that a limitation of the jsonschema library?
| korijn wrote:
| So how do you apply this in the real world when dealing with
| schema versions and database migrations?
| dtech wrote:
| You could add a schema_version column and use one constraint
| per schema version. It only needs to be per major version; if the
| schema is backwards compatible there's no problem.
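| (A rough sketch of that approach - assuming pg_jsonschema's
| jsonb_matches_schema(schema, instance) function; the table and
| schemas are made up:)
|
|       create table events (
|         id bigint generated always as identity primary key,
|         schema_version int not null,
|         payload jsonb not null,
|         -- each constraint only fires for rows of its own version
|         constraint payload_v1_valid check (
|           schema_version <> 1 or jsonb_matches_schema(
|             '{"type": "object", "required": ["foo"]}', payload)
|         ),
|         constraint payload_v2_valid check (
|           schema_version <> 2 or jsonb_matches_schema(
|             '{"type": "object", "required": ["foo", "bar"]}', payload)
|         )
|       );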
| willmeyers wrote:
| Add a new constraint that supports the new schema version,
| migrate the json columns to conform to the new schema as needed,
| and then drop the old constraint. This assumes your schema is
| backwards compatible. If it isn't, I would probably not use a DB
| constraint and just handle everything in the application layer.
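| (A sketch of that flow against a hypothetical documents table
| with a jsonb data column and an existing data_matches_v1
| constraint, again assuming pg_jsonschema's jsonb_matches_schema.
| NOT VALID defers checking existing rows until after the backfill:)
|
|       -- 1. add the new-schema constraint; old rows aren't checked yet
|       alter table documents add constraint data_matches_v2 check (
|         jsonb_matches_schema(
|           '{"type": "object", "required": ["title", "author"]}', data)
|       ) not valid;
|
|       -- 2. backfill old rows so they conform to the new schema
|       update documents
|         set data = data || '{"author": "unknown"}'::jsonb
|         where not data ? 'author';
|
|       -- 3. verify the backfilled rows, then drop the old constraint
|       alter table documents validate constraint data_matches_v2;
|       alter table documents drop constraint data_matches_v1;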
| korijn wrote:
| Thanks, that makes sense.
| alexvoda wrote:
| If the schema is not backwards compatible it either means you
| will have to do a data migration or it means you want to keep
| track of the schema version along with the data. In both
| cases it is much easier if the json columns are split off
| into a separate table.
| raydiatian wrote:
| This looks rad.
| hhthrowaway1230 wrote:
| Ah yeah, super nice. Always wanted schema support in Postgres;
| love to see them hammering away on top of the solid foundation of
| Postgres.
| onderkalaci wrote:
| The PG community had a similar patch, which got reverted from PG
| 15 at the last minute: https://www.depesz.com/2022/04/06/waiting-
| for-postgresql-15-...
| [deleted]
| kiwicopple wrote:
| this would be our preference. we'll try to support this for the
| next commitfest, and if it gets merged then we will deprecate
| our extension in favour of the native solution.
| mkurz wrote:
| Is there a chance it gets committed during the current
| commitfest? See https://commitfest.postgresql.org/42/4086/
|
| This year's feature freeze will be on April 8th
| (https://www.postgresql.org/message-id/flat/9fbe60ec-
| fd1b-6ee...), so if it does not get committed within the next
| two and a half weeks, it will miss this year's Postgres release
| in September...
| kiwicopple wrote:
| I'll flag it with the team this week. I'm not sure what the
| blocker was previously, but it might just be a matter of
| submitting the patch again (with minor changes) so that
| it's "in" for commitfest, with someone willing to own the
| work over the next few months.
| mkurz wrote:
| I think the things that needed to be fixed from last year
| are already committed (more general stuff not directly
| related to the JSON patches). Also, according to this
| message, at least part of the JSON patches should be
| committed "...in the next few days...", however that was
| two weeks ago:
| https://www.postgresql.org/message-
| id/454db29b-7d81-c97a-bc1...
|
| I am a bit worried that, even though the patches seem to be
| "stable", they will miss the deadline... (since I would
| need those features as well)
| kiwicopple wrote:
| I don't know any of the people involved in this patch, so
| I've sent it to Alexander Korotkov to get his opinion.
| I'll let you know his response after he has a chance to
| look at it.
| kiwicopple wrote:
| Alexander's response:
|
| > This is a very long story starting from 2017. This patch
| should finally be committed. Some preliminary
| infrastructure already landed in PostgreSQL 16. Regarding
| SQL/JSON itself, I doubt it will be committed to
| PostgreSQL 16, because feature freeze is coming soon.
| It's likely to be postponed to PostgreSQL 17.
|
| > Regarding replacement for pg_jsonschema, I don't think
| it will be a good replacement. Yes, one can construct a
| jsonpath expression which checks if particular items have
| particular data types. But I doubt that is nearly as
| convenient as jsonschema.
|
| It looks like there would still be some benefit for
| pg_jsonschema, unless the community decided that they
| wanted to support jsonschema validation. We could propose
| this, but I don't think it would arrive in pg core any
| time soon.
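| (To illustrate Alexander's point: a check that a jsonschema
| document states declaratively has to be spelled out field by
| field in jsonpath. A small sketch with a made-up payload, using
| the built-in jsonb_path_exists and, assuming pg_jsonschema's
| jsonb_matches_schema function, the schema equivalent:)
|
|       -- jsonpath: assert that "age" exists and is a number
|       select jsonb_path_exists(
|         '{"name": "Ada", "age": 42}',
|         '$.age ? (@.type() == "number")'
|       );
|
|       -- jsonschema: the same rule as one declarative document
|       select jsonb_matches_schema(
|         '{"type": "object", "required": ["age"],
|           "properties": {"age": {"type": "number"}}}',
|         '{"name": "Ada", "age": 42}'
|       );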
| debugdog wrote:
| Well-written article! Also very glad to hear about your approach
| of supporting the native implementation. For all of our projects,
| when we're integrating external services we usually keep the
| relevant original JSON responses in a jsonb column as a kind of
| backup. Alongside that, we extract the data we'll be using into
| queryable columns. To be able to use those "dumps" directly
| would be a nice thing to have.
| Pxtl wrote:
| I've tried to use JSON Schema and the big gap I couldn't figure
| was how to handle polymorphism.
|
| That is, if I have a case where the object
|
|       { "Foo": { "Type": "Fred", "Bar": 1, "Baz": 2, "Quux": 2 } }
|
| and the object
|
|       { "Foo": { "Type": "Waldo", "Bar": 1, "Corge": 2, "Xyzzy": 7 } }
|
| are both valid, without just allowing any object members or
| allowing the union of their members.
|
| I did a hack by multiplexing the types into a child-object, but
| that was ugly and clumsy.
|
| In XSD or any statically-typed programming language I could
| handle this trivially using types and polymorphism, because
| "Fred" and "Waldo" would be different types.
|
| But I can't figure out how to do that in JSON Schema.
| mullsork wrote:
| IIRC that's what oneOf is for, i.e. a discriminated union / sum
| type. My experience with oneOf, though, is that tooling support
| for it is terrible.
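| (A rough sketch of that pattern for the Fred/Waldo case above -
| oneOf with a "const" discriminator on Type. The property names
| come from the parent comment; the value types are guesses:)
|
|       {
|         "type": "object",
|         "properties": {
|           "Foo": {
|             "oneOf": [
|               {
|                 "type": "object",
|                 "properties": {
|                   "Type": {"const": "Fred"},
|                   "Bar": {"type": "number"},
|                   "Baz": {"type": "number"},
|                   "Quux": {"type": "number"}
|                 },
|                 "required": ["Type", "Bar", "Baz", "Quux"],
|                 "additionalProperties": false
|               },
|               {
|                 "type": "object",
|                 "properties": {
|                   "Type": {"const": "Waldo"},
|                   "Bar": {"type": "number"},
|                   "Corge": {"type": "number"},
|                   "Xyzzy": {"type": "number"}
|                 },
|                 "required": ["Type", "Bar", "Corge", "Xyzzy"],
|                 "additionalProperties": false
|               }
|             ]
|           }
|         }
|       }
|
| Because each branch pins Type with "const" and sets
| additionalProperties to false, exactly one branch can match and
| the two member lists never bleed into each other.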
| Pxtl wrote:
| Ah, I didn't realize you could use oneOf with refs or
| objects, I've only ever seen it used with primitives. Thanks!
| boomskats wrote:
| This is long overdue and elegantly done. Great work!
|
| On a personal note, it's great to finally see Neon & Supabase
| playing with each other. Much more interesting to me than Hasura.
| tough wrote:
| The power of FOSS
| thangngoc89 wrote:
| Hasura is also FOSS
| zwily wrote:
| Hasura is open core. Given the massive price increase they
| just did for their hosted version, I'd expect more and more
| future features will not land in core at all, to push
| people to pay.
| tough wrote:
| AFAIK - I have not run Supabase in my own infra, but they
| seem to allow you to do so, and they are quite good citizens,
| publishing all the tools they build on top of pg with, as far
| as I remember, sane licenses.
|
| I love Supabase. Neon are new-ish but a great alternative
| for hosted serverless databases (they also did a great
| staging-db-for-PRs feature when they launched), which we
| integrated at work quite soon while it was in beta; it saved
| a lot of headaches when introducing new features that
| touched the database.
| kiwicopple wrote:
| > sane licenses
|
| FYI, our policy is to use only MIT, Apache2, or
| PostgreSQL licenses. You can run it all on your own
| infra, with instructions here:
| https://supabase.com/docs/guides/self-hosting
| klabb3 wrote:
| It's already quite bad, unfortunately. Both support for
| read replicas and metrics(!) are missing from the open source
| version. They have a Prometheus exporter, but not on the free
| tier, afaik.
|
| I was also concerned by people reporting memory
| consumption/leak issues, as I'm planning to have lots of
| subscriptions. I don't know Haskell well enough, but from
| the outside it does match the symptoms of having dug
| themselves into an architectural complexity hole, which I
| assume is much harder to navigate with unorthodox tooling
| and practices.
|
| I'm still rooting for Hasura; I like their way of making
| things dumb and simple, without too much technophilia for
| its own sake. They genuinely want you to focus on your
| business problems instead of blasting you with novel and
| overengineered concepts, like others do.
| zwily wrote:
| Yeah, I love the product. I've been using it a few years
| in a fairly complicated internal tool used by a dozen
| people, and it's been amazing. But I'm pretty discouraged
| by what I see as its future.
| tango12 wrote:
| (from Hasura)
|
| Scaling subscriptions is hard, but we work with our
| users/customers at scale to make sure settings are
| tweaked correctly.
|
| We have users running 100k - 1M concurrent users in
| production for live-event-type platforms. It's not
| completely trivial to benchmark and set up because query
| patterns, streaming vs live queries, etc. have an impact,
| but it works very reliably. No missing events, no
| problems disconnecting/reconnecting, no need for sticky
| sessions and so on.
|
| An initial POC benchmark [1] should be a quick affair so
| if you're trying it out and run into any problems, please
| hit me up! Email on my bio.
|
| [1]: https://github.com/hasura/graphql-bench
| tango12 wrote:
| Hasura CEO here. Totally hear you.
|
| We're working on a new serverless infrastructure layer
| that'll make the pricing better for users compared to a
| DIY API server or to a self-hosted Hasura.
|
| It's a significant engineering lift on our side - at its
| core we're engineering Hasura to achieve 90%+
| infrastructure utilization (no cold-start, sub-millisecond
| auto-scaling), and that's what will allow us to do this.
|
| Not there yet, but we'll be demo-ing and talking about
| the engineering at HasuraCon in June!
| endisneigh wrote:
| There's nothing stopping you from using this with Hasura. Don't
| see the relevance.
| [deleted]
| rapfaria wrote:
| Wouldn't you wanna save the json even if the structure is
| incorrect?
| rco8786 wrote:
| If you want that behavior, just don't use json schema.
| roenxi wrote:
| I can see some technical advantages to supporting JSON schema
| directly; but I suspect most people will be using this extension
| because SQL is really ugly and they don't want to use it to set
| up their schemas in pg proper.
|
| It says a lot about how weak the SQL syntax is. An extension to
| replace CREATE TABLE with a JSON schema construct would be wildly
| popular.
| anonu wrote:
| Well-written article, and it looks like a great extension. However,
| my only issue with JSON Schema is that it becomes unsupportable
| once your JSON objects get too big: too many keys or too many
| items in an array, for example. If you are looking to find "where"
| the issue is in your JSON object, most schema validators don't
| provide enough guidance; they just say "something is wrong", as it
| appears this one does.
___________________________________________________________________
(page generated 2023-03-22 23:01 UTC)