[HN Gopher] Mongo but on Postgres and with strong consistency be...
___________________________________________________________________
Mongo but on Postgres and with strong consistency benefits
Author : oskar_dudycz
Score : 150 points
Date : 2024-07-07 13:22 UTC (9 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| salomonk_mur wrote:
| What would be the advantage of using this instead of simple jsonb
| columns?
| imnotjames wrote:
| Looks like it matches the mongo node API
| joshmanders wrote:
| It uses JSONb under the hood. Just gives you a very "mongo"
| feel to using PostgreSQL. Not sure how I feel about it.
| CREATE TABLE IF NOT EXISTS %I (_id UUID PRIMARY KEY, data
| JSONB)
| wood_spirit wrote:
| Can they make it use uuid7 for ids for better
| insert_becomes_append performance?
| lgas wrote:
| Yes
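| To illustrate the uuid7 point: a UUIDv7 puts a 48-bit millisecond
| timestamp in its most significant bits, so newer ids sort after older
| ones and index inserts tend to land at the end. A minimal sketch,
| assuming the RFC 9562 layout; the uuid7 helper below is written by hand
| for illustration and is not part of Pongo.

```python
import os
import time
import uuid
from typing import Optional


def uuid7(unix_ms: Optional[int] = None) -> uuid.UUID:
    """Build a UUIDv7-style value: 48-bit ms timestamp followed by random bits."""
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000
    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits
    value = ((unix_ms & ((1 << 48) - 1)) << 80) | rand
    # Overwrite the version nibble (bits 76-79) with 7.
    value &= ~(0xF << 76)
    value |= 0x7 << 76
    # Overwrite the variant field (bits 62-63) with binary 10.
    value &= ~(0x3 << 62)
    value |= 0x2 << 62
    return uuid.UUID(int=value)


older = uuid7(1_000)
newer = uuid7(2_000)
print(older.version, older < newer)
```

| Because the timestamp leads, ids created in later milliseconds compare
| greater, both as integers and as their canonical strings.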
| oskar_dudycz wrote:
| Yes, I'm using JSONB underneath and translating the
| MongoDB syntax to native queries. As they're not super
| pleasant to deal with, I thought it'd be nice to offer
| the MongoDB API that is familiar to many.
|
| Regarding IDs, you can use any UUID-compliant format.
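| The translation idea described above can be sketched roughly as
| follows. This is not Pongo's actual code: mongo_filter_to_sql is a
| hypothetical helper, it handles only equality and four comparison
| operators, and the %s placeholders assume a psycopg-style driver. The
| @> (containment) and ->> (extract as text) operators are standard
| PostgreSQL JSONB operators.

```python
import json
from typing import Any, Dict, List, Tuple

# Mongo comparison operators mapped to their SQL counterparts.
_COMPARISON_OPS = {"$gt": ">", "$gte": ">=", "$lt": "<", "$lte": "<="}


def mongo_filter_to_sql(
    filter_doc: Dict[str, Any], column: str = "data"
) -> Tuple[str, List[Any]]:
    """Turn a small subset of Mongo filters into a WHERE clause plus parameters."""
    clauses: List[str] = []
    params: List[Any] = []
    for field, condition in filter_doc.items():
        is_operator_doc = (
            isinstance(condition, dict)
            and bool(condition)
            and all(key.startswith("$") for key in condition)
        )
        if is_operator_doc:
            for op, operand in condition.items():
                if op not in _COMPARISON_OPS:
                    raise ValueError(f"unsupported operator: {op}")
                # ->> yields text, so cast before comparing numerically.
                clauses.append(
                    f"({column} ->> %s::text)::numeric {_COMPARISON_OPS[op]} %s"
                )
                params.extend([field, operand])
        else:
            # Plain equality becomes a JSONB containment check.
            clauses.append(f"{column} @> %s::jsonb")
            params.append(json.dumps({field: condition}))
    return " AND ".join(clauses) or "TRUE", params


sql, params = mongo_filter_to_sql({"name": "Anita", "age": {"$gte": 18}})
print(sql)
print(params)
```

| Passing values as parameters rather than formatting them into the SQL
| string keeps user input out of the query text.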
| lopatin wrote:
| jsonb isn't web scale. Mongo is web scale.
| digger495 wrote:
| I see what you did there
| zulban wrote:
| Neat. When I migrated a project from mongo to postgres I took a
| similar approach, except I only implemented the mongo feel I
| needed within my own project instead of building a proper library
| as done here. I was surprised how much performance improved
| despite using a hacky wrapper.
|
| https://blog.stuartspence.ca/2023-05-goodbye-mongo.html
|
| Personally tho, I plan to just drop all similarity to mongo in
| future projects.
| oskar_dudycz wrote:
| Yup, I might not reach full compliance, but I will try to
| follow the Pareto principle. Thanks for the link and kind
| feedback!
| joeyagreco wrote:
| Good work! I would like to see a section on the README outlining
| the benefits of Pongo
| oskar_dudycz wrote:
| Thanks, I'll try to cover that, good call!
| Squarex wrote:
| How does it compare with FerretDB[0]?
|
| [0] https://www.ferretdb.com/
| aleksi wrote:
| (I'm FerretDB co-founder)
|
| As far as I can tell, Pongo provides an API similar to the
| MongoDB driver for Node that uses PostgreSQL under the hood.
| FerretDB operates on a different layer - it implements MongoDB
| network protocol, allowing it to work with any drivers and
| applications that use MongoDB without modifications.
| Keyframe wrote:
| Even monstache?
| aleksi wrote:
| That was the first time I heard about that project. Someone
| could check it using our guide:
| https://docs.ferretdb.io/migration/premigration-testing/ Or
| we will check it ourselves later:
| https://github.com/FerretDB/FerretDB/issues/4429
| Zambyte wrote:
| The posted project looks like a client that connects to pg but
| behaves like Mongo, where Ferret is a server that accepts Mongo
| client connections and uses pg as backend storage.
| oskar_dudycz wrote:
| Yes, I'm using MongoDB API in Pongo to keep the muscle memory.
| So, it's a library that translates the MongoDB syntax to native
| PostgreSQL JSONB queries.
| ramchip wrote:
| Have you tried it with CockroachDB?
| oskar_dudycz wrote:
| I did not, but I'm not using any fancy syntax so far besides
| JSONB operators. If it doesn't work, then I'm happy to adjust it
| to make it compliant.
| pipe_connector wrote:
| MongoDB has supported the equivalent of Postgres' serializable
| isolation for many years now. I'm not sure what "with strong
| consistency benefits" means.
| Izkata wrote:
| > MongoDB has supported the equivalent of Postgres'
| serializable isolation for many years now.
|
| That would be the "I" in ACID
|
| > I'm not sure what "with strong consistency benefits" means.
|
| Probably the "C" in ACID: Data integrity, such as constraints
| and foreign keys.
|
| https://www.bmc.com/blogs/acid-atomic-consistent-isolated-du...
| lkdfjlkdfjlg wrote:
| > Pongo - Mongo but on Postgres and with strong consistency
| benefits.
|
| I don't read this as saying it's "MongoDB but with...". I read
| it as saying that it's Postgres.
| throwup238 wrote:
| _> I'm not sure what "with strong consistency benefits"
| means._
|
| "Doesn't use MongoDB" was my first thought.
| zihotki wrote:
| Or is it? Jepsen reported a number of issues like "read skew,
| cyclic information flow, duplicate writes, and internal
| consistency violations. Weak defaults meant that transactions
| could lose writes and allow dirty reads, even downgrading
| requested safety levels at the database and collection level.
| Moreover, the snapshot read concern did not guarantee snapshot
| unless paired with write concern majority--even for read-only
| transactions."
|
| That report (1) is 4 years old, many things could have changed.
| But so far any reviewed version was faulty in regards to
| consistency.
|
| 1 - https://jepsen.io/analyses/mongodb-4.2.6
| endisneigh wrote:
| That's been resolved for a long time now (not to say that
| MongoDB is perfect, though).
| nick_ wrote:
| I just want to point out that 4 years is not a long time in
| the context of consistency guarantees of a database engine.
|
| I have listened to Mongo evangelists a few times despite my
| skepticism and been burned every time. Mongo is way
| oversold, IMO.
| vorticalbox wrote:
| That is for mongo 4.x, but the latest stable is 6.0.7, which
| notes "More resilient operations" and "Additional data security".
|
| https://www.mongodb.com/blog/post/big-reasons-upgrade-mongod...
| pipe_connector wrote:
| Jepsen found a more concerning consistency bug than the above
| results when Postgres 12 was evaluated [1]. Relevant text:
|
| We [...] found that transactions executed with serializable
| isolation on a single PostgreSQL instance were not, in fact,
| serializable
|
| I have run Postgres and MongoDB at petabyte scale. Both of
| them are solid databases that occasionally have bugs in their
| transaction logic. Any distributed database that is receiving
| significant development will have bugs like this. Yes, even
| FoundationDB.
|
| I wouldn't not use Postgres because of this problem, just
| like I wouldn't not use MongoDB because they had bugs in a
| new feature. In fact, I'm more likely to trust a company that
| is paying to consistently have their work reviewed in public.
|
| 1. https://jepsen.io/analyses/postgresql-12.3
| jokethrowaway wrote:
| Have you tried it in production? It's absolute mayhem.
|
| Deadlocks were common; it uses a system of retries if the
| transaction fails; we had to disable transactions completely.
|
| Next step is either writing a writer queue manually or
| migrating to postgres.
|
| For now we fly without transaction and fix the occasional
| concurrency issues.
| pipe_connector wrote:
| Yes, I have worked on an application that pushed enormous
| volumes of data through MongoDB's transactions.
|
| Deadlocks are an application issue. If you built your
| application the same way with Postgres you would have the
| same problem. Automatic retries of failed transactions with
| specific error codes are a driver feature you can tune or
| turn off if you'd like. The same is true for some Postgres
| drivers.
|
| If you're seeing frequent deadlocks, your transactions are
| too large. If you model your data differently, deadlocks can
| be eliminated completely (and this advice applies regardless
| of the database you're using). I would recommend you engage a
| third party to review your data access patterns before you
| migrate and experience the same issues with Postgres.
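| The "automatic retries of failed transactions" mentioned above
| usually amount to a loop like the one below. This is a minimal sketch
| under assumptions: TransientTransactionError and run_with_retry are
| hypothetical names, not taken from any MongoDB or Postgres driver, and
| a real driver decides retryability from specific error codes.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientTransactionError(Exception):
    """Hypothetical stand-in for a retryable error (deadlock, write conflict)."""


def run_with_retry(
    txn: Callable[[], T], max_attempts: int = 3, base_delay: float = 0.05
) -> T:
    """Run txn, retrying transient failures with jittered exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return txn()
        except TransientTransactionError:
            if attempt == max_attempts:
                raise  # out of attempts, surface the error to the caller
            # Sleep a random fraction of an exponentially growing delay.
            time.sleep(base_delay * (2 ** (attempt - 1)) * random.random())
    raise AssertionError("unreachable")


attempts = []


def flaky_transfer() -> str:
    attempts.append(1)
    if len(attempts) < 3:
        raise TransientTransactionError("deadlock detected")
    return "committed"


print(run_with_retry(flaky_transfer, max_attempts=5, base_delay=0.001))
```

| The callable must be safe to run more than once, which is one reason
| shorter transactions are easier to retry.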
| akoboldfrying wrote:
| >Deadlocks are an application issue.
|
| Not necessarily, and not in the very common
| single-writer-many-reader case. In that case, PostgreSQL's MVCC allows all
| readers to see consistent snapshots of the data without
| blocking each other or the writer. TTBOMK, any other
| mechanism providing this guarantee requires locking (making
| deadlocks possible).
|
| So: Does Mongo now also implement MVCC? (Last time I
| checked, it didn't.) If not, how does it guarantee that
| reads see consistent snapshots without blocking a writer?
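| For readers unfamiliar with the idea, a toy sketch of multi-version
| reads: writers append new versions and a reader pins the commit number
| it started at, so it never waits on the writer. VersionedStore and
| Snapshot are hypothetical names; this single-threaded model ignores
| write conflicts and cleanup of old versions, and is not how any real
| engine is implemented.

```python
from typing import Any, Dict, List, Tuple


class VersionedStore:
    """Toy multi-version store: every write appends a new version."""

    def __init__(self) -> None:
        self._commit_counter = 0
        # key -> list of (commit_number, value) in ascending commit order
        self._versions: Dict[str, List[Tuple[int, Any]]] = {}

    def write(self, key: str, value: Any) -> int:
        """Record a new version of key and return its commit number."""
        self._commit_counter += 1
        self._versions.setdefault(key, []).append((self._commit_counter, value))
        return self._commit_counter

    def versions_of(self, key: str) -> List[Tuple[int, Any]]:
        return self._versions.get(key, [])

    def snapshot(self) -> "Snapshot":
        """Pin the current commit number for a reader."""
        return Snapshot(self, self._commit_counter)


class Snapshot:
    """Read-only view that only sees versions committed before it was taken."""

    def __init__(self, store: VersionedStore, as_of: int) -> None:
        self._store = store
        self._as_of = as_of

    def read(self, key: str, default: Any = None) -> Any:
        # Walk versions newest-first and return the first visible one.
        for commit, value in reversed(self._store.versions_of(key)):
            if commit <= self._as_of:
                return value
        return default


store = VersionedStore()
store.write("balance", 100)
reader = store.snapshot()
store.write("balance", 50)  # the writer proceeds without blocking the reader
print(reader.read("balance"), store.snapshot().read("balance"))
```

| The pinned reader keeps seeing the older value while a fresh snapshot
| sees the new one.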
| karmakaze wrote:
| What makes mongo mongo is its distributed nature, without it you
| could just store json(b) in an RDBMS.
| richwater wrote:
| > store json(b) in an RDBMS
|
| I actually did this for a small HR application and it worked
| incredibly well. jsonb gin indexes are pretty nice once you get
| the hang of the syntax.
|
| And then, you also have all the features of Postgres as a
| freebie.
| eddd-ddde wrote:
| Personally, I like postgres json syntax much better than
| whatever mongo invented.
|
| Big fan of jsonb columns.
| oskar_dudycz wrote:
| I'm planning to add methods for raw JSON path or, in
| general, raw SQL syntax to enable such fine-tuning and not
| need to always use MongoDB API. I agree that for many
| people, this would be better.
| darby_nine wrote:
| but then you wouldn't have the joy of using the most awkward
| query language invented by mankind
| lkdfjlkdfjlg wrote:
| > What makes mongo mongo is its distributed nature, without it
| you could just store json(b) in an RDBMS.
|
| Welllllllll I think that's moving the goalposts. Being
| distributed might be a thing _now_ but I still remember when it
| was marketed as the thing to have if you wanted to store
| unstructured documents.
|
| Now that Postgres also does that, you're marketing Mongo as
| having a different unique feature. Moving the goalposts.
| thfuran wrote:
| It doesn't really seem reasonable to accuse someone of moving
| goalposts that you've just brought into the conversation,
| especially when they were allegedly set by a third party.
| coldtea wrote:
| Parent didn't "just bring them"; they merely referenced
| the pre-existing goalposts used to advocate for Mongo and
| the reasons devs adopted it.
| lkdfjlkdfjlg wrote:
| Exactly this, very eloquent, thank you.
|
| Yes, I'm still bitter because I was one of those tricked
| into it.
| zihotki wrote:
| But RDBMS'es are often also distributed. So what is mongo now?
| marcosdumay wrote:
| People don't usually distribute Postgres (unless you count
| read replicas and cold HA replicas). But well, people don't
| usually distribute MongoDB either, so no difference.
|
| In principle, a cluster of something like Mongo can scale
| much further than Postgres. In practice, Mongo is full of
| issues even before you replicate it, and you are better with
| something that abstracts a set of incoherent Postgres (or
| sqlite) instances.
| zozbot234 wrote:
| Postgres supports foreign data wrapper (FDW), which is the
| basic building block for a distributed DB. It doesn't
| support strong consistency in distributed settings as of
| yet, although it does provide two-phase commit which could
| be used for such.
| williamdclt wrote:
| > strong consistency in distributed settings
|
| I doubt it ever will. The point of distributing a data
| store is latency and availability, both of which would go
| down the drain with distributed strong consistency
| hibikir wrote:
| I think of the Stripe Mongo install, as it was a decade or
| so ago. It really was sharded quite wide, and relied on all
| shards having multiple replicas, as to tolerate cycling
| through them on a regular basis. It worked well enough to
| run as a source of truth for a financial company, but the
| database team wasn't small, dedicated to keeping all that
| machinery working well.
|
| Ultimately anyone doing things at that scale is going to
| run a small priesthood doing custom things to keep the
| persistence layer humming, regardless of what the
| underlying database is. I recall a project abstracting over
| the Mongo API, as to allow for swapping the storage layer
| if they ever needed to
| brabel wrote:
| Often?? In my experience it's really hard to do it and still
| maintain similar performance, which kind of voids any benefit
| you may be looking for.
| anonzzzies wrote:
| So how easy is it to distribute it? I don't have experience
| with it but the tutorials look terrible compared to, say,
| Scylla, Yuga, Cockroach, TiDB etc. Again, honest question?
| rad_gruchalski wrote:
| Pongo seems to be a middleware between your app and Postgres.
| So it will most certainly work absolutely fine on YugabyteDB,
| if one's okay with occasional latency issues.
|
| One could optimise it more for a distributed sql by
| implementing key partition awareness and connecting directly
| to a tserver storing the data one's after.
| oskar_dudycz wrote:
| Yes, as long as the database supports JSONB and JSON path
| syntax (so PG >= 12 compliant) you should be good to go :)
| rad_gruchalski wrote:
| It could work:
| https://docs.yugabyte.com/preview/explore/ysql-language-feat....
| theteapot wrote:
| Does "distributed" mean sharded or just replicated? In either
| case it's a bit quirky but easy enough.
|
| > Scylla, Yuga, Cockroach, TiDB etc.
|
| You have experience "distributing" all these DBs? That's
| impressive.
| rework wrote:
| > What makes mongo mongo is its distributed nature
|
| Since when? Mongo was popular because it gave the false
| perception it was insanely fast until people found out it was
| only fast if you didn't care about your data, and the moment
| you ensure write happened it ended up being slower than an
| RDB....
| jokethrowaway wrote:
| Since forever, sharding, distributing postgres / mysql was
| not easy. There were a few proprietary extensions. Nowadays
| it's more accessible.
|
| This was typical crap you had to say to pass fang style
| interview "oh of course I'd use mongo because this use case
| doesn't have relations and because it's easy to scale", while
| you know postgres will give you way less problems and allow
| you to make charts and analytics in 30m when finance comes
| around.
|
| I made the mistake of picking mongo for my own startup,
| because of propaganda coming from interviewing materials and
| I regretted it for the entire duration of the company.
| posix_monad wrote:
| Does MongoDB have serious market share compared to DynamoDB (and
| similar clones from Azure, GCP) at this point?
| dudeinjapan wrote:
| Totally. Many of the biggest tech companies are using it for core
| use cases. Stripe uses a modified version:
| https://stripe.com/blog/how-stripes-document-databases-suppo...
|
| We use MongoDB's cloud offering called Atlas as our core DB at
| TableCheck.
| cpursley wrote:
| Thanks, just added Pongo to the NoSQL section of my "Postgres Is
| Enough" gist:
|
| https://gist.github.com/cpursley/c8fb81fe8a7e5df038158bdfe0f...
| oskar_dudycz wrote:
| Thank you!
| revskill wrote:
| Genius.
| oskar_dudycz wrote:
| <3
| hdhshdhshdjd wrote:
| I use JSONB columns a lot, it has its place. It can fit certain
| applications, but it does introduce a lot of extra query
| complexity and you lose out on some ways to speed up query
| performance that you could get from a relational approach.
|
| Which is to say JSONB is useful, but I wouldn't throw the
| relational baby out with the bath water.
| oskar_dudycz wrote:
| I'm planning to add the possibility of using Generated Columns in the
| future https://www.postgresql.org/docs/current/ddl-generated-column...
| to allow more optimisations.
| doctor_eval wrote:
| I've been doing some reasonably serious playing with the idea
| of using jsonb columns as a kind of front end to relational
| tables. So basically, external interactions with the database
| are done using JSON, which gives end users some flexibility,
| but internally we effectively create a realtime materialised
| view of just those properties we need from the json.
|
| Anyone else tried this approach? Anything I should know about
| it?
| hdhshdhshdjd wrote:
| I do something similar, building a lightweight search index
| over very large relational datasets.
|
| So the tables are much simpler to manage, much more portable,
| so I can serve search off scalable hardware without
| disturbing the underlying source of truth.
|
| The downside is queries are more complex and slower.
| Zenzero wrote:
| My mental model has always been to only use JSONB for column
| types where the relations within the json object are of no
| importance to the DB. An example might be text editor markup.
| I imagine if you start wanting to query within the json
| object you should consider a more relational model.
| 314156 wrote:
| https://docs.oracle.com/en/database/oracle/mongodb-api/mgapi...
|
| Oracle database has had a MongoDB compatible API for a few years
| now.
| cyberpunk wrote:
| And it only costs 75k a seat per year per developer, with free
| bi yearly license compliance audits, a million in ops and
| hardware to get near prod and all the docu is paywalled. What a
| deal!
| slau wrote:
| A client had a DB hosted by Oracle. The client was doing most
| of their compute on AWS, and wanted to have a synchronised
| copy made available to them on AWS. Oracle quoted them a cool
| $600k/year to operate that copy, with a 3 year contract.
|
| DMS + Postgres did it for $5k/year.
| 314156 wrote:
| Maybe you are unaware of this?
| https://www.oracle.com/cloud/free/
| rework wrote:
| Looks sort of like MartenDB but trying to mimic mongo api, unsure
| why anyone would want to do that... mongo api is horrible...
| JanSt wrote:
| Wouldn't that allow you to switch from Mongo to Postgres without
| having to rewrite all of your app?
| oskar_dudycz wrote:
| Hint: I'm an ex-Marten maintainer, so the similarity is not
| accidental ;)
|
| As Op said, not needing to rewrite applications or using the
| muscle memory from using Mongo is beneficial. I'm not
| planning to be strict and support only MongoDB API; I will
| extend it when needed (e.g. to support raw SQL or JSON Path).
| But I plan to keep shim with compliant API for the above
| reasons.
|
| MongoDB API has its quirks but is also pretty powerful and
| widely used.
| rework wrote:
| Oh, so you are, then we can rest assured this will end up
| being a solid project!
|
| I personally can't stand mongodb, it's given me a lot of
| headaches. I joined a company, and the same week we
| lost a ton of data and the twat who set it up resigned in
| the middle of the outage. Got it back online and spent 6m
| moving to postgresql.
| Tao3300 wrote:
| Ditch the dalmatian before Disney rips your face off.
| harel wrote:
| I regularly find the hybrid model is a sweet spot. I keep core
| fields as regular columns and dynamic data structures as JSONB.
| It brings the best of both worlds together.
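| The hybrid layout described above can be sketched as a split at write
| time: a fixed set of core fields goes to regular columns and everything
| else is serialised into one JSONB value. split_document and the field
| names are hypothetical, chosen only for illustration.

```python
import json
from typing import Any, Dict, Iterable, Tuple


def split_document(
    doc: Dict[str, Any], core_fields: Iterable[str]
) -> Tuple[Dict[str, Any], str]:
    """Separate core column values from the dynamic remainder (as JSON text)."""
    core = set(core_fields)
    columns = {key: value for key, value in doc.items() if key in core}
    extras = {key: value for key, value in doc.items() if key not in core}
    # sort_keys keeps the serialised form stable across runs.
    return columns, json.dumps(extras, sort_keys=True)


columns, extras_json = split_document(
    {"id": 7, "name": "robot-1", "calibration": {"offset": 0.25}},
    core_fields=["id", "name"],
)
print(columns)
print(extras_json)
```

| The columns dict maps onto ordinary typed columns, while the JSON text
| would be bound to a single jsonb column.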
| Waterluvian wrote:
| I do this too with Postgres and it is just the best of both.
|
| A robot is a record. A sensor calibration is a record. A
| warehouse robot map with tens of thousands of geojson objects
| is a single record.
|
| If I made every map entity its own record, my database would be
| 10_000x more records and I'd get no value out of it. We're not
| doing spatial relational queries.
| willsmith72 wrote:
| It's technologically cool, but I would love a "why" section in
| the README. Is the idea you're a mongo Dev/love the mongo api and
| want to use it rather than switch to pg apis? Or want to copy
| some code over from an old project?
|
| I'm sure there are use cases, I'm just struggling to grasp them.
| Especially if it's about reusing queries from other projects, AI
| is pretty good at that
___________________________________________________________________
(page generated 2024-07-07 23:00 UTC)