[HN Gopher] MongoDB's new query engine
___________________________________________________________________
MongoDB's new query engine
Author : subset
Score : 114 points
Date : 2023-09-20 12:04 UTC (1 days ago)
(HTM) web link (laplab.me)
(TXT) w3m dump (laplab.me)
| tjpnz wrote:
| Does MongoDB still lose data?
| salil999 wrote:
| Providing some context would be nice
| endisneigh wrote:
| > Our most recent report on MongoDB 3.6.4 focused on causal
| consistency and linearizability in sharded collections. We
| found that sharded clusters appeared to offer linearizable
| reads, writes, and compare-and-set operations against single
| documents, so long as users ran with read concern
| linearizable and write concern majority. However, any weaker
| level of write concern resulted in the loss of committed
| writes. MongoDB's default level of write concern was (and
| remains) acknowledgement by a single node, which means
| MongoDB may lose data by default. Although the write concern
| documentation does not make this clear, the rollback
| documentation states:
|
| https://jepsen.io/analyses/mongodb-4.2.6
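The failure mode described in the quoted report can be illustrated with a toy model. This is only a sketch under simplifying assumptions: `ToyReplicaSet` and its methods are hypothetical names, replication is modelled as synchronous list appends, and the election rule (longest surviving log wins) is a stand-in rather than MongoDB's actual protocol. It shows why a write acknowledged by a single node can disappear after a failover, while a majority-acknowledged write survives.

```python
class ToyReplicaSet:
    """Hypothetical in-memory model of a replica set (not MongoDB's real protocol)."""

    def __init__(self, node_count=3):
        self.logs = [[] for _ in range(node_count)]  # one write log per node
        self.primary = 0

    def write(self, value, w):
        """Apply the write on the primary, copy it to just enough
        secondaries to satisfy write concern `w`, then acknowledge."""
        self.logs[self.primary].append(value)
        needed = len(self.logs) // 2 + 1 if w == "majority" else w
        acknowledged_by = 1  # the primary itself
        for node in range(len(self.logs)):
            if acknowledged_by >= needed:
                break
            if node != self.primary:
                self.logs[node].append(value)
                acknowledged_by += 1
        return True  # the client sees the write as successful

    def fail_primary_and_elect(self):
        """The primary crashes before replicating anything further;
        the surviving node with the longest log becomes primary."""
        survivors = [n for n in range(len(self.logs)) if n != self.primary]
        self.primary = max(survivors, key=lambda n: len(self.logs[n]))
        return self.logs[self.primary]


single = ToyReplicaSet()
single.write("order-1", w=1)
lost = "order-1" not in single.fail_primary_and_elect()

majority = ToyReplicaSet()
majority.write("order-1", w="majority")
kept = "order-1" in majority.fail_primary_and_elect()
```

In this model the `w=1` write was acknowledged to the client but exists only on the crashed node, which is the "loss of committed writes" the quote refers to.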
| beembeem wrote:
| This was prior to the introduction of transactions years
| ago. The versions stated in your comment are past EOL and
| not offered in Atlas today. Not relevant to today's
| reality.
| endisneigh wrote:
| sure, but I'm providing the context for why they'd even
| ask.
| jabradoodle wrote:
| Another case of bad defaults, seems to have changed since
| 5.0.
| jayd16 wrote:
| The backlash at the time was that some felt it wasn't
| just bad defaults but intentional benchmark juicing. Old
| news now, though.
| jabradoodle wrote:
| Your comment is ambiguous.
|
| Since 5.0 the default write concern is majority, if that's what
| you were referring to.
| cj wrote:
| No [0]
|
| [0] https://www.mongodb.com/docs/manual/reference/write-concern/...
| bit_flipper wrote:
| This article seems to have inspired others to look at MongoDB
| again, so I'll give my thoughts after using it recently.
|
| MongoDB Atlas is a surprisingly good managed database product.
| I'm not a huge fan of someone else running my databases, but I
| think it might be the best one you can run across any cloud. If
| you like MongoDB (and, ignore the memes, there is a lot to like
| nowadays), and are OK paying a bit more to have someone run your
| database, I'd strongly consider Atlas.
| avereveard wrote:
| Yeah atlas with federated queries and what not makes thinking
| about the entirety of an application storage layer a breeze.
| And gpt is even better at generating mongo queries than it is
| at sql, which is a nice unexpected facilitation in day to day
| usage.
| winrid wrote:
| The problem is when you grow. They can be really tough to work
| with on pricing. Also, their licensing does not allow servers
| past a certain size. Can you imagine Oracle telling the CIA
| they can't use servers with more than 256gb of ram? Just silly.
| bit_flipper wrote:
| I'm not sure what experience you have, but I've run both
| their Enterprise licensed database on prem as well as
| migrated to Atlas and there have never been any licensing
| issues preventing vertical scaling of databases. One of our
| clusters on Atlas right now has machines larger than 256GB of
| RAM -- you're more limited by what your cloud vendor has
| available than Atlas.
| winrid wrote:
| Actually yeah for Atlas, I guess they automatically bill
| you as if it was 2-3x Enterprise Advanced licenses so
| there's no discussion. I thought it was the same as
| Enterprise Advanced but I guess not. With EA each unit
| above 256gb is billed as an additional license. See [0] [1]
| [2].
|
| [0] https://www.mongodb.com/community/forums/t/for-mongodb-enter...
|
| [1] https://www.mongodb.com/community/forums/t/for-mongodb-enter...
|
| [2] https://www.linkedin.com/pulse/mongodb-sizing-guide-sepp-ren...
| coddles wrote:
| Oracle charges per 2 vCPU. This is quite standard.
| winrid wrote:
| In our case in the support call they just told us it
| wasn't allowed. Now I'm realizing they were wrong.
| [deleted]
| matthewcford wrote:
| can it sort by id?
| maxbond wrote:
| I'm sure it can[1], do you mean can you get documents in
| insertion order? The default _id is a random right-leaning
| identifier[2], so sorting by it would only get you an
| approximate insertion order (except in a deployment with a
| single server). However as the linked documentation
| demonstrates you could supply a strictly incrementing ID, if
| your architecture abides.
|
| [1]
| https://www.mongodb.com/docs/manual/reference/operator/aggre...
|
| [2]
| https://www.mongodb.com/docs/manual/reference/method/ObjectI...
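Why sorting by the default _id gives only an approximate insertion order can be sketched as follows. This assumes the documented 12-byte ObjectId layout (4-byte timestamp in seconds, 5-byte per-process random value, 3-byte counter); `ObjectIdLikeGenerator` is a hypothetical stand-in, not the driver's real class, and the fixed "random" bytes are chosen only to make the outcome deterministic.

```python
import itertools
import struct


class ObjectIdLikeGenerator:
    """Hypothetical generator mimicking the ObjectId byte layout."""

    def __init__(self, process_random: bytes, counter_start: int = 0):
        if len(process_random) != 5:
            raise ValueError("the per-process random value is 5 bytes")
        self._random = process_random
        self._counter = itertools.count(counter_start)

    def generate(self, unix_seconds: int) -> bytes:
        counter = next(self._counter) % (1 << 24)  # counter wraps at 3 bytes
        return (
            struct.pack(">I", unix_seconds)  # 4-byte big-endian timestamp
            + self._random                   # 5-byte per-process value
            + counter.to_bytes(3, "big")     # 3-byte incrementing counter
        )


def labels_sorted_by_id():
    # Two clients generate ids independently, as drivers typically do.
    client_a = ObjectIdLikeGenerator(b"\xff" * 5)
    client_b = ObjectIdLikeGenerator(b"\x00" * 5)
    inserted = [
        ("first", client_a.generate(1_700_000_000)),
        ("second", client_b.generate(1_700_000_000)),  # same second as "first"
        ("third", client_a.generate(1_700_000_001)),
    ]
    return [label for label, _ in sorted(inserted, key=lambda pair: pair[1])]
```

Within the same second, the order is decided by the per-process bytes rather than by arrival time, so "second" sorts before "first" here even though it was inserted later; across seconds the timestamp prefix keeps the order roughly right.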
| winrid wrote:
| no, that means you can sort on a max of 32 fields per sort
| request, like fieldOne, fieldTwo, fieldThree...
|
| There's no limit on number of sortable documents. Use an
| index to sort and it'll also be fast and use no extra memory.
| If you want to do an in memory sort there are memory limits,
| but you can also tell it to overflow to disk.
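A rough sketch of the two points above, written as a plain Python helper that only builds the command document and does not talk to a server. `build_sorted_aggregate` and the collection and field names are hypothetical; the `$sort` stage and the `allowDiskUse` option are the pieces the comment refers to, and the 32-key cap is taken from the comment itself.

```python
MAX_SORT_KEYS = 32  # per-sort field limit mentioned in the comment above


def build_sorted_aggregate(collection, sort_fields, allow_disk_use=False):
    """Build an `aggregate` command document with a single $sort stage.

    sort_fields is a list of (field_name, direction) pairs, where
    direction is 1 (ascending) or -1 (descending).
    """
    if not sort_fields:
        raise ValueError("at least one sort field is required")
    if len(sort_fields) > MAX_SORT_KEYS:
        raise ValueError(f"at most {MAX_SORT_KEYS} sort fields are allowed")

    sort_spec = {}
    for name, direction in sort_fields:
        if direction not in (1, -1):
            raise ValueError(f"invalid sort direction for {name!r}")
        sort_spec[name] = direction  # dicts preserve key order, which matters here

    return {
        "aggregate": collection,
        "pipeline": [{"$sort": sort_spec}],
        "cursor": {},
        # Lets a large in-memory sort spill to temporary files on disk.
        "allowDiskUse": allow_disk_use,
    }


command = build_sorted_aggregate(
    "events", [("fieldOne", 1), ("fieldTwo", -1)], allow_disk_use=True
)
```

If an index already covers the sort keys, the server can return documents in index order and the disk-spill option should not come into play.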
| maxbond wrote:
| Oh gotcha, thanks that makes sense. 32 did seem like a
| bizarrely low limit for how many documents you can sort,
| but as a number of fields it's plenty.
|
| I've removed that part of my comment. Appreciate the
| correction.
| nevi-me wrote:
| Heh, I missed that there's a new query engine.
|
| From [0] it looks like it was available from 5.1, also
| interesting is that you can't choose which engine to use, so I
| suppose it only works on subsets or queries that meet certain
| conditions.
|
| This isn't the first time I've heard something to the effect of "LLVM
| JIT is great, but it introduces a lot of query latency". I wonder
| if there are other JIT engines more suitable for compiling
| potentially small/simple queries.
|
| [0] https://www.mongodb.com/docs/v7.0/reference/sbe/#std-label-s...
| jandrewrogers wrote:
| LLVM has many advantages for JIT but it does not prioritize
| very low latency, choosing to optimize other properties
| instead. Consequently, LLVM tends to be more popular for
| systems targeted at analytical workloads where queries commonly
| have intrinsically high overhead latency. The latency isn't
| terrible but it is noticeable if you are running a low-latency
| "fast-twitch" workload.
|
| There are other specialized JIT compilers that are essentially
| purpose-built to provide very low latency; for systems like
| MongoDB which are rarely used for serious analytical
| processing, you'd probably want to use one of these instead.
| __s wrote:
| re LLVM: Yes, Cranelift was designed to address these issues
|
| https://github.com/bytecodealliance/wasmtime/blob/main/crane...
|
| https://blog.benj.me/2021/02/17/cranelift-codegen-primer
|
| https://jason-williams.co.uk/posts/a-possible-new-backend-fo...
|
| https://news.ycombinator.com/item?id=25130528 30% faster rust
| build time
| romanovcode wrote:
| Serious question: Why use MongoDB when Postgres supports indexed
| dynamic json?
| nemo44x wrote:
| Because the developer productivity is great because the
| language drivers that interface with the database are seamless
| with your code. MongoDB lets you burn down a backlog better
| than any other database. It's not right for every problem but
| when it is right it's the best tool because the language
| drivers are so good. That's their secret sauce.
| Hypocritelefty wrote:
| [dead]
| splix wrote:
| MongoDB is much easier to set up as an HA cluster
| threeseed wrote:
| MongoDB:
|
| a) has a proven, supported, easy-to-use horizontal scaling
| solution. PostgreSQL doesn't.
|
| b) is ridiculously faster than PostgreSQL at per-tuple document
| updates.
|
| c) has clients which are tailored for operations and data
| structures around documents.
|
| d) is easier to install, configure and manage.
| riku_iki wrote:
| > MongoDB:
|
| does mongodb support ACID transactions? I think this is one
| of the key questions in making a decision.
| varelaz wrote:
| All operations related to a single document are ACID.
| MongoDB supports multi-document transactions, but with
| higher performance cost.
| https://www.mongodb.com/docs/manual/core/transactions/
| riku_iki wrote:
| Your doc doesn't state ACID support, but only atomicity
| (first A in ACID).
| coldtea wrote:
| I thought all those points died along with the hype ("MongoDB
| is web scale"), as all were wrong.
|
| Those were the marketing points people mentioned in their
| "honeymoon phase" posts. Then after using it in production,
| actual benchmarks and comparisons coming in, came the regret
| and moving on posts. In fact, most of those mentioned moving
| from Mongo to Postgres, and there was a full-blown "yeah,
| NoSQL was a dumb idea for 99% of use cases, and Mongo even
| more so" discussion.
|
| In the end MongoDB was the butt of a joke, there were whole
| memes about it.
|
| So this comment is like a trip down memory lane, or into an
| alternate universe, where it's like 2012 and these things
| never happened.
| skywhopper wrote:
| It all depends on your data structure and problem space. It
| was silly to dismiss RDBMSes as obsolete 10 years ago, and
| it's silly to dismiss document databases or other NoSQL
| databases today. There are plenty of use cases for which
| Mongo is far superior to Postgres. A wise engineer chooses
| the best tool for the job.
| LtWorf wrote:
| They used it at my first full time job out of university.
|
| It was terrible.
|
| It was not fast; a regular SQL db would have been much
| better. Due to bugs in our code, the data in mongodb was
| not uniform, which led to failures in the code when the
| document didn't have the expected format.
|
| Cherry on top is that they were mostly growing documents,
| which is a very slow operation, that would have been done
| very fast with a many-to-many table in SQL instead.
| fkyoureadthedoc wrote:
| > due to bugs in our code, the data in mongodb was not
| uniform
|
| It supports schemas though right?
| https://www.mongodb.com/docs/manual/core/schema-validation/
| [deleted]
| isoprophlex wrote:
| You're wrong. MongoDB is web scale.
|
| /s
| romanovcode wrote:
| It's not 2012 anymore ;)
| re-thc wrote:
| But is it PlanetScale?
| seedless-sensat wrote:
| How is (a) wrong? Pg doesn't horizontally scale like Mongo
| does
| romanovcode wrote:
| Okay, maybe I do not understand something but how is
| https://neon.tech not auto-scaling? Or the more expensive
| option at AWS Aurora?
| endisneigh wrote:
| This is not Postgres. It's a managed instance, but isn't
| built in.
| re-thc wrote:
| > not auto-scaling?
|
| The writes aren't. There's no sharding for example. You
| can only ever have 1 primary writer.
| maxbond wrote:
| Fair, but this is getting fixed. Postgres 16 (which was
| just released) adds an origin tag to logical replication,
| so that you can know whether or not to pass on a change
| (and avoid having an infinite loop where a change is
| replicated back and forth). That's the first step to
| having multiple mains. I think in theory, if you had deep
| Postgres knowledge, you could set up a high touch,
| bespoke multi-main deployment today.
|
| I expect the replication story with Postgres will get
| much better over the next few years.
| re-thc wrote:
| > Pg doesn't horizontally scale like Mongo does
|
| At what cost though? Does 1 Pg server = 1 Mongo server
| for the same specs or do you need a lot more Mongo
| servers?
| coldtea wrote:
| Between plain old horizontal scaling via custom sharding,
| replicas, and extensions like Citus and Timescale that
| offer full horizontal scalability, Postgres is handling
| some of the biggest use cases in the world.
| endisneigh wrote:
| It's simply a fact that Postgres does not offer out of
| the box scaling like MongoDB.
|
| Postgres does have things like Citus, or even wire-
| protocol compatible things like CockroachDB that do
| scale, but those are _not_ Postgres.
|
| It's the same situation with MySQL vs. Vitess.
| riku_iki wrote:
| > but those are not Postgres.
|
| sure, citus is an extension of postgres and part of
| ecosystem.
| threeseed wrote:
| Citus was acquired by Microsoft so there are doubts about
| its future longevity given that so many similar promises
| of ongoing support are rarely maintained. Timescale is
| optimised for time-series so not sure it's applicable.
|
| Point still remains that PostgreSQL lacks a modern,
| built-in horizontal scalability solution.
| coldtea wrote:
| > _given that so many similar promises of ongoing support
| are rarely maintained_
|
| Microsoft on the other hand is among the best in keeping
| such promises.
|
| If Google had bought it, sure...
| maxthegeek1 wrote:
| Cockroachdb is a common solution here.
| aeyes wrote:
| It's not Postgres, it only speaks Postgres dialect.
| riku_iki wrote:
| but it is not pgsql, and doesn't have drop in
| compatibility..
| threeseed wrote:
| At least in my experience it is all still the case today.
|
| If you are dealing exclusively with documents I find
| MongoDB to be faster, better and easier to use. If my data
| model is hybrid or purely relational then I would use
| PostgreSQL.
|
| After all these years I am _still_ waiting for a
| horizontal scalability solution for PostgreSQL that is up
| to the level of a modern database. It's 2023 and for many
| of us in enterprise environments it's a mandatory NFR.
|
| But it seems like you're not interested in having an
| actual technical discussion.
| nesarkvechnep wrote:
| Well, your data model could become relational if you
| normalize it.
| skywhopper wrote:
| Normalizing your data is not always possible or
| desirable.
| prmoustache wrote:
| did you try citus?
| https://www.citusdata.com/product/community
| aeyes wrote:
| It has serious limitations and needs a lot of elbow
| grease to run at scale. With some of the alternatives
| (for example MongoDB) you basically just add a few more
| machines and you are good.
|
| In my opinion Citus is still only a hack which is only
| useful if for some reason you can't move away from
| Postgres but you have reached the limit of what you can
| get out of a single machine.
| [deleted]
| coldtea wrote:
| > _But it seems like you're not interested in having
| an actual technical discussion._
|
| You mean like the "technical" bullet points that could
| just as well be marketing copy?
| okr wrote:
| e) MongoDB is web-scale.
|
| ;->
|
| (personally i like the schema-lessness)
| sambeau wrote:
| Thanks. Just what I was looking for.
| varelaz wrote:
| PostgreSQL will scale horizontally if you can avoid joins
| and index range locks. But the thing is you don't want to, or
| why do you need an SQL database then? I found MongoDB good for
| cases when you need a lot of upserts, your data model
| looks like a document and you don't need joins. Like when you
| store events data for further processing or collect stats or
| counters with tags/indexes.
| dubcanada wrote:
| Why is there no conversation on the topic at hand and instead
| conversation on why use MongoDB.
|
| This author's post has nothing to do with pros/cons of mongodb,
| and is entirely around a new query engine and computer science.
| And instead we've devolved into a discussion on the pros and
| cons of PostgreSQL vs MongoDB.
| hot_gril wrote:
| The original question was fine, asking what's the reason to
| use MongoDB. Like, it got a new query engine, please convince
| me to try using it / how does the new engine compare to
| Postgres.
|
| The sub-questions about FOSS and font choice are annoying, in
| fact I removed my replies on them because I didn't want to
| feed the problem.
| sambeau wrote:
| It's completely understandable. This is a significant and
| interesting upgrade to MongoDB that naturally gets people
| interested in a, "Hmmm... maybe I should take another look at
| MongoDB" way. Before wasting time, it seems sensible to use
| this forum full of experts to ask whether it would be
| worthwhile, and what the advantages or disadvantages would
| be. I didn't sense any negativity in the question, and it was
| the exact question I had in my head too. After skim-reading
| the article I went straight to scrolling down the comments to
| look for this exact question.
| christophilus wrote:
| Same as it ever was. Someone posts a new JS framework, and
| the top 50 comments are about how bad JS is, and why don't we
| do RUST with WASM instead? Someone posts about a new Go
| release, and the top 50 comments are about how terrible Go
| is, and why don't we use a real language instead? Mongo...
| same thing.
|
| I get it. People don't like certain tools. But it would be
| nice for the rest of us to be able to actually discuss the
| topic at hand without all of the tangential critical noise.
| [deleted]
| h1fra wrote:
| I'm a postgres fan but I have to admit the JSON syntax is not
| the best and not well supported by all the tooling around this
| database. Doesn't mean it's bad but the DX could be better.
|
| Other than that, I have never seen a proper justification to
| use a schemaless db: good for prototyping and getting started,
| not incredible as a long-term solution.
| synthmeat wrote:
| Schema-less is a bit of a wrong term, since you always end up with
| one but, indeed, schema-on-read vs schema-on-write discussion
| for individual use cases is far from a settled thing, even
| though the zeitgeist is that schema-on-write won sometime in
| the 70s.
|
| You say "prototyping", but I would generalize that to "faster
| evolving" in the long term. Of course, not without tradeoffs.
|
| I personally have zero issues writing a custom marshaller
| when needed for any of the schema-less document collections I
| have. The constraints your application has on the data are a
| superset of the db schema anyway.
| jiripospisil wrote:
| I gave my answer a few years ago. The summary is that its query
| and update operations over documents are miles ahead of what
| PostgreSQL has to offer.
|
| https://news.ycombinator.com/item?id=23271085
| throwaway2990 wrote:
| 3 years... almost 4 years ago you commented about a few years
| ago. So not really relevant.
|
| For the idiots downvoting.
|
| 2018 was version 11. Right when PostgreSQL JSONB support was
| beginning to get good.
| heipei wrote:
| Serious counter-question: How do you run Postgres in a native
| and hands-off replicated failover setup? With MongoDB you
| create a three-node replica set and you're done. I've yet to
| see a simple guide for Postgres. But I might be wrong, so happy
| to hear how you would achieve the same with Postgres.
| dzogchen wrote:
| Why use a non-FOSS database at all?
| nemo44x wrote:
| Because just about no one cares and I want to burn down a
| backlog quickly and often mongo enables that. Developer
| productivity is far more valuable than open source purity.
| [deleted]
| KronisLV wrote:
| > Developer productivity is far more valuable than open
| source purity.
|
| I mean, at least you get access to MongoDB source code, so
| that is something.
|
| I remember working on a project that used Clusterpoint
| years ago and the problem was that it was basically an
| abandoned piece of software due to the company behind it no
| longer releasing new versions. It was running, yes, but
| getting data out of it and migrating to something else sure
| took work and was very annoying because a lot of
| functionality had been built around it.
|
| So in regards to something so critical as databases and
| data storage solutions, I'd err on the side of picking
| things that are open source, or have compatible
| alternatives with permissive licenses.
|
| If someone were to build their own business around MongoDB
| (say, offering it as a service) and then SSPL came along,
| they'd be done for, even if just because of getting caught
| up in things going on between the org and large cloud
| vendors or something. Same with how you have to be careful
| with AGPL, so something like MinIO might be a non-starter
| in some cases.
| nemo44x wrote:
| Yeah if your goal is to offer it as a service then you're
| out of luck. But if you plan on building anything else
| then you're fine.
|
| Mongo is a thriving database so I'm not worried about it
| being abandoned.
| KronisLV wrote:
| > Mongo is a thriving database so I'm not worried about
| it being abandoned.
|
| With this, I can totally agree! And as long as it's not
| AGPL, you should be fine using it for most regular
| development stuff, even for closed source projects.
| riku_iki wrote:
| > Developer productivity is far more valuable than open
| source purity.
|
| that's until the provider starts abusing its power: increasing
| rates, closing the source completely, etc.
| maxloh wrote:
| There is FerretDB. But it is not fully compatible with Mongo
| yet.
|
| https://www.ferretdb.io/
| malermeister wrote:
| Isn't mongodb open source?
|
| https://github.com/mongodb/mongo
|
| Edit: ah it's source-available, not open source.
| JackMorgan wrote:
| My question exactly. No sense building a house on someone
| else's land. Too many projects I've worked on were shattered
| because of technology that stopped being offered or got too
| expensive to justify.
| romanovcode wrote:
| Also, the main point for MongoDB 10 years ago was the JSON.
| But now?
| synthmeat wrote:
| Just to add to sibling comments, one killer feature for me is
| ChangeStreams[1]. It's miles ahead of what Postgres[2] offers,
| and it enables really interesting use cases. Some of my
| services built around ChangeStreams end up not doing a single
| query to the DB. Data is right there in the program memory,
| indexed how I need it to be, and program is immediately
| reactive to any changes in the DB.
|
| [1] https://www.mongodb.com/docs/manual/changeStreams/
|
| [2] https://blog.sequin.io/all-the-ways-to-capture-changes-in-po...
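The "data is right there in the program memory" pattern described above can be sketched as a function that folds change events into a local dictionary keyed by _id. This is a minimal illustration, not the commenter's actual service: `apply_change_event` is a hypothetical helper, the events are hand-written dictionaries rather than output from a live stream, and only top-level fields are handled (real `updatedFields` keys can be dotted paths into nested documents). The event field names follow the documented change event shape.

```python
def apply_change_event(cache, event):
    """Fold one change event into an in-memory {_id: document} cache."""
    operation = event["operationType"]
    doc_id = event["documentKey"]["_id"]

    if operation in ("insert", "replace"):
        cache[doc_id] = dict(event["fullDocument"])
    elif operation == "update":
        document = cache.setdefault(doc_id, {"_id": doc_id})
        description = event["updateDescription"]
        # Simplification: treat updated field names as top-level keys.
        document.update(description.get("updatedFields", {}))
        for field in description.get("removedFields", []):
            document.pop(field, None)
    elif operation == "delete":
        cache.pop(doc_id, None)
    # Other event types (drop, invalidate, ...) are ignored in this sketch.
    return cache


events = [
    {"operationType": "insert", "documentKey": {"_id": 1},
     "fullDocument": {"_id": 1, "status": "new", "note": "tmp"}},
    {"operationType": "update", "documentKey": {"_id": 1},
     "updateDescription": {"updatedFields": {"status": "paid"},
                           "removedFields": ["note"]}},
    {"operationType": "insert", "documentKey": {"_id": 2},
     "fullDocument": {"_id": 2, "status": "new"}},
    {"operationType": "delete", "documentKey": {"_id": 2}},
]

cache = {}
for event in events:
    apply_change_event(cache, event)
```

A service built this way reads from `cache` instead of querying the database, which is the trade the comment describes: memory and bookkeeping in exchange for immediate reactions to changes.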
| Lolol00 wrote:
| I remember adding changestreams a few years back to my
| project, they were amazing .
| jamil7 wrote:
| If you're only storing and querying JSON, then it's cheaper to
| run DynamoDB instead of RDS.
| salil999 wrote:
| This is completely false and has nothing to do with the
| article. DynamoDB and MongoDB might both serve JSON but
| they're used for wildly different things. DynamoDB is a key
| value store whereas MongoDB is much more than that.
| jamil7 wrote:
| Sorry you're right of course I messed up the service name.
| masukomi wrote:
| Dear Geeks with interesting things to say on your blog:
|
| Please, for the sake of the people you want to hear your words: DO
| NOT USE A SANS SERIF MONOSPACED FONT FOR ANYTHING BUT CODE.
|
| I get it. We stare at text (code) formatted in this kind of font
| all day long, and many of us find fonts that we truly enjoy. But,
| most of our monospace sans-serif fonts are designed to make sure
| individual characters aren't misread. We don't read prose the
| same way we read code. There is far more pattern recognition
| going on than actual parsing of individual letters, and
| monospaced fonts break that. We can debate the aesthetics of
| serifs but they actually do help provide context clues to the
| pattern recognition systems in our brain.
|
| Convincing you all to start using serif fonts on prose is not a
| battle I'm likely to win, but maybe I can convince you to only
| use monospaced fonts for your terminal, and your code.
|
| Please.
| weinzierl wrote:
| You exaggerate the issue. Proportional fonts were a necessity
| when lead, ink and paper were expensive. In lead typesetting
| they are easy to do. This caused the success of newspaper fonts
| like Times New Roman.
|
| Contemporary fonts have much less variance in the width of
| characters, except for a couple of outliers like the i. From
| there to a completely monospaced font is not as big a leap as
| you make it seem. For me, I'm fine with monospaced fonts for
| prose.
| TylerE wrote:
| Disagree. Reading monospaced prose sucks big time. For those
| of us with a little bit of vision deficiency the varying
| character widths are helpful.
|
| Also, the reduced character width (a monospaced font
| inevitably has to be spaced at what the widest characters,
| such as w, require) means that you have to use a smaller font
| size to get the same amount of info per scroll/line/page
| whatever, again compromising readability.
|
| Here's a comparison between the posted site, and the same
| site with the font size bumped up 4px, proportional (just
| browser default, I didn't cherry pick) and the ridiculously
| wide line spacing reduced.
|
| Even with all that, the easier to read version is quite a bit
| more compact. Could likely bump the font size 2 more px and
| still be smaller.
|
| https://i.imgur.com/DBFI2RU.png
| glitcher wrote:
| Sentences in all caps are much worse IMO.
| colesantiago wrote:
| Just use reader mode?
|
| All modern browsers have this.
| [deleted]
| ltbarcly3 wrote:
| I'm surprised when people notice fonts. If the font is weird,
| or very decorative then sure, I would notice a font in Papyrus
| or Comic Sans. On my Kindle I set it to the modern sans serif
| font because it's easier to read, for me. If there was no option
| to change the font I probably wouldn't have noticed, though.
|
| I don't want to say this in a way that comes off as insulting, but
| I'll just say it and please don't take it as a put down: if my
| kid came to me and expressed this much frustration and
| difficulty because of a sans serif or monospaced font I would
| be concerned they had a problem with their vision or an issue
| processing what they saw, and I might research and/or take them
| to get checked out.
| wheels wrote:
| I wouldn't be bothered about it in this case (I'm much more
| offended when the font size or margins are stupid), but to
| use your phrase "I don't want to say this in a way that comes
| off as insulting, but I'll just say it and please don't take
| it as a put down..."
|
| You probably just don't have a strong sense of aesthetics.
| Some people notice design-y stuff and some people don't. A
| lot of people do notice fonts, or misaligned margins, or
| whatever. And a lot of people don't. The latter category are
| probably never going to be good designers or artists or
| whatever. People are different.
| ltbarcly3 wrote:
| You are right, while I might notice that sort of thing you
| mention it wouldn't distract me much. There are other
| things that completely make it impossible for me to
| function, like direct lighting in or near my field of view.
| I could claim like you do that people who have a bright
| light bulb hanging right next to their tv are just blind to
| some deeper experience, but probably they just aren't
| bothered by it.
|
| What makes you think these proclivities about font choice
| are universal then? Maybe you are just insisting on what
| you find aesthetically appealing, while many people would
| equally validly feel the opposite?
| [deleted]
| oron wrote:
| I love Postgres and use it for different projects but for
| inboxes.com which has a very high insert rate coupled with auto
| delete by time stamp Mongo up until now has been very kind to me.
| We sometimes have 1000 incoming emails per second and high usage
| of our API and it just works.
| pphysch wrote:
| Did you actually migrate from PostgreSQL to MongoDB for
| inboxes.com prod?
| salil999 wrote:
| This is great and all but I'm curious about the performance
| improvements. I'm surprised there are no graphs or charts that
| show improvements in latency, CPU usage, disk reads, etc.
___________________________________________________________________
(page generated 2023-09-21 23:01 UTC)