[HN Gopher] Never use MongoDB (2013)
___________________________________________________________________
Never use MongoDB (2013)
Author : mikecarlton
Score : 122 points
Date : 2021-02-01 16:55 UTC (6 hours ago)
(HTM) web link (www.sarahmei.com)
(TXT) w3m dump (www.sarahmei.com)
| flowerlad wrote:
| I use MongoDB in my application. The approach I took is to store
| data in flat documents (relational style) and only de-normalize
| when necessary for performance. The relational model was invented
| for a reason -- it is flexible and it is easy to update data in
| one place and so on. The downside of relational is that joins
| will kill you when you have very large tables. To avoid joins, I
| use lookups when possible, and de-normalize only to the extent
| needed. I get the best of both worlds.
| Mavvie wrote:
| You can do all that with SQL too. And do you have any very
| large tables?
|
| > joins will kill you when you have very large tables
|
| nitpick, but table size doesn't directly matter (much). if your
| queries are very specific and only return a couple rows, then
| you can have huge tables and join across them without issue.
| Joins only get particularly painful if you're doing
| aggregation/reporting queries across large parts of it
| dwheeler wrote:
| > if your queries are very specific and only return a couple
| rows, then you can have huge tables and join across them
| without issue.
|
| I agree, _if_ you index your tables. Relational databases are
| very capable; when there's a performance problem, it's often
| due to simple things like failing to index what should have
| been indexed.
|
| No tool is perfect for all use cases. There are cases where
| relational databases won't work. But when I try to store
| data, I first consider storing it in files, and if that is
| unpleasant, I consider relational databases. These are both
| relatively simple time-tested solutions, and it's usually
| good to start with simple & time-tested _unless_ there's a
| reason it won't work well.
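| A minimal sketch of the indexing point above, using Python's bundled
| sqlite3 module rather than any particular production database. The
| users/orders tables and the index name idx_orders_user_id are made
| up for illustration, and the exact plan wording varies by SQLite
| version.

```python
import sqlite3

def query_plan(conn, sql, params=()):
    """Return the human-readable steps of SQLite's plan for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]  # the last column holds the description

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total REAL);
""")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [(i, f"user{i}") for i in range(1000)])
conn.executemany("INSERT INTO orders (user_id, total) VALUES (?, ?)",
                 [(i % 1000, 9.99) for i in range(20000)])

# A "very specific" join: one user and only that user's orders.
sql = """SELECT u.name, o.total
         FROM users u JOIN orders o ON o.user_id = u.id
         WHERE u.id = ?"""

plan_before = query_plan(conn, sql, (42,))
conn.execute("CREATE INDEX idx_orders_user_id ON orders (user_id)")
plan_after = query_plan(conn, sql, (42,))

print("without index:", plan_before)
print("with index:   ", plan_after)
```

| Once the index exists, the plan should mention idx_orders_user_id
| for the orders side of the join instead of scanning or building a
| temporary index per query.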
| threeseed wrote:
| I can also use Lists, Sets, Maps in my data structure.
|
| Doing the same in SQL requires a lot of intermediate tables.
| interlocutor wrote:
| SQL databases don't support easy schema evolution, sharding
| and so on.
| revscat wrote:
| And Mongo is even worse. You can easily change document
| structures, but now you have inconsistent data over time.
| SomeCallMeTim wrote:
| PostgreSQL directly supports table sharding.
| https://pgdash.io/blog/postgres-11-sharding.html
|
| Schema evolution is supported by pretty much every ORM
| you'd care to use; it's not the job of the SQL database to
| handle the migration. I'm using Prisma and you literally
| change the software spec for the schema and say "migrate",
| and it creates the migration SQL and applies it to the
| Postgres DB programmatically. That gets you _deterministic_
| schema evolution and not the "my schema isn't actually
| reliable" that NoSQL/no-schema databases rely on.
|
| And then you have CockroachDB/YugabyteDB that give you
| extreme horizontal scalability, with full PostgreSQL
| compatibility...
|
| And bang, the last reason to use MongoDB vanishes.
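| The "deterministic schema evolution" idea can be sketched without
| any ORM. This is not how Prisma works internally; it is a toy
| version of the same pattern, assuming SQLite and a hypothetical
| MIGRATIONS list, with the schema version kept in PRAGMA
| user_version.

```python
import sqlite3

# Hypothetical migration list: each entry moves the schema forward one version.
MIGRATIONS = [
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)",
    "ALTER TABLE users ADD COLUMN email TEXT",
    "CREATE UNIQUE INDEX idx_users_email ON users (email)",
]

def migrate(conn):
    """Apply every migration newer than the version stored in the database."""
    current = conn.execute("PRAGMA user_version").fetchone()[0]
    for version, statement in enumerate(MIGRATIONS[current:], start=current + 1):
        conn.execute(statement)
        # Record progress so a rerun starts after the last applied step.
        conn.execute(f"PRAGMA user_version = {version}")
        conn.commit()
    return conn.execute("PRAGMA user_version").fetchone()[0]

conn = sqlite3.connect(":memory:")
first_run = migrate(conn)   # applies all three steps
second_run = migrate(conn)  # nothing left to apply
columns = [row[1] for row in conn.execute("PRAGMA table_info(users)")]
print(first_run, second_run, columns)
```

| Every database that has run the same list ends up with the same
| columns, which is the property being contrasted with ad hoc
| document changes.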
| HeyImAlex wrote:
| I think a missing piece is making materialized views fast
| since there will always be cases where someone wants to
| denormalize to get around performance issues, even with
| well designed indexes.
| digitalsushi wrote:
| As a sysadmin that often gets the privilege of pretending to
| design the smallest of complex systems, one of my regular
| components has been CouchDB because of its built in http api.
|
| I've never been anti-Mongo, but this one little piece has made
| CouchDB an affordable choice for people like me who are not
| equipped to otherwise defend the choice.
|
| Is there a missing piece that could deal Mongo back in the next
| time I try to convince someone that there's as straight a path to
| the sysadmin solutions I generally compose?
| jacobwilliamroy wrote:
| mongo-express maybe?
|
| https://github.com/mongo-express/mongo-express
|
| But if couchDB works, it works. Personally I'd love to shoehorn
| LISP into everything I do, but most of the time I just use
| python and bash because things tend to get done faster when I
| do.
| [deleted]
| dang wrote:
| If curious see also
|
| 2016 https://news.ycombinator.com/item?id=12290739
|
| Discussed at the time:
| https://news.ycombinator.com/item?id=6712703
| [deleted]
| eatwater123 wrote:
| Unless you want to.
| nchase wrote:
| This article appears here pretty frequently:
| https://hn.algolia.com/?dateRange=all&page=0&prefix=false&qu...
| manishsharan wrote:
| plot twist... use MongoDB API/drivers on Document Layer on
| FoundationDB ;-)
| andikleen2 wrote:
| In the early tens we ran an engineering project to improve
| critical sections in applications using transactional memory. We
| had a PhD level intern applying our techniques to various open
| source projects. One target was MongoDB. After a few days of
| investigation of the Mongo source code, he had to give up because
| he couldn't even find the critical sections in the source. They
| had locking, but it was extremely convoluted.
|
| So yes I would agree with that. Never use MongoDB.
| AtlasBarfed wrote:
| Jepsen review 2020:
|
| https://jepsen.io/analyses/mongodb-4.2.6
|
| ... not good.
| the_duke wrote:
| I'm by no means a fan of Mongo, but the product has improved
| quite a lot over the years.
|
| Mongo now supports multi-document (and multi-node) transactions,
| joins, and has a decent storage engine.
|
| So you might even have a chance of keeping your data actually
| consistent.
| williesleg wrote:
| C'mon man!
| cyxxon wrote:
| > On a social network, however, nothing is that self-contained.
| Any time you see something that looks like a name or a picture,
| you expect to be able to click on it and go see that user, their
| profile, and their posts. A TV show application doesn't work that
| way. If you're on season 1 episode 1 of Babylon 5, you don't
| expect to be able to click through to season 1 episode 1 of
| General Hospital.
|
| That is exactly what I'd expect, and that is how small websites
| like IMDB work. I am on the page for a General Hospital episode,
| and via the actors in the episode or whatever other part I can
| click through to Babylon 5, or the other way around, or anywhere
| else.
| offtop5 wrote:
| Mongo and other nosql databases are still the absolute fastest
| way to get started, particularly if you don't know what your data is
| eventually going to look like.
| ogre_codes wrote:
| Before PostgreSQL added JSON support this may have been true,
| but now it's pretty easy to build an SQL database that you can
| extend arbitrarily just like with Mongo.
|
| You probably shouldn't use that too much, not for anything
| remotely production, but it's there if you need that
| flexibility or an easy place to dump javascript objects.
|
| When you are just getting started, it's super easy to
| manipulate SQL data structures regardless.
| setr wrote:
| I disagree -- an RDBMS is pretty much safe/sane for any design,
| though it may not be the fastest. NoSQL databases
| (specifically, eventually consistent DBs) are only safe/sane in
| specific scenarios.
|
| If you don't know what your needs are, you should _always_
| start with an RDBMS -- it's not that difficult to go "up" to a
| NoSQL db from there (you're only losing information, and if you
| can't safely move because of loss of ACID... you'd probably
| have been really fucked if you started off without it), but you
| can't easily migrate back "down" to the RDBMS -- a NoSQL
| database stores almost no information about your data or its
| constraints.
|
| And your application will almost always want transactional
| guarantees and to model relationships properly -- generally
| only small chunks of the design (design-wise; data-wise it
| might be 90% of the app) can be treated with eventual
| consistency and have real scaling needs, which you can shift
| over to your nosql system.
|
| Apps are generally just metadata tracking with a dash of real
| work.
| yawnxyz wrote:
| as a UX designer / product engineer, it's easier for me to
| jump into a nosql than to wire up a postgres instance for a
| small proof of concept project that I know won't really go
| anywhere
| nucleardog wrote:
| sqlite?
|
| All my proof of concept (and some production) stuff just
| uses that until I need features or concurrency that it
| can't provide.
| 35fbe7d3d5b9 wrote:
| Careful here: while this may be true of _some_ NoSQL stores
| that I've not used, this is a tarpit for the wide column
| datastores I'm familiar with (Dynamo, Cassandra, and others).
|
| It's very easy to get going quickly and find out that you've
| tripped over antipatterns, and now your database is "a
| database, a Ruby app, and 30ms of latency on every call". It's
| easy to think "I'll model this later" and end up hitting the
| database five times in a row to answer a question.
|
| With these systems, up front modeling of your data model and
| access patterns is _essential_ if the trade off you are trying
| to make (far less functionality for smooth performance at
| ridiculous scale) will ever make sense.
| adwww wrote:
| This is what the company I work for claimed. Now we are 10
| years old and have a monster of a relational database stored in
| MongoDB.
|
| There's never a good moment to switch database, but there is
| always a new problem caused by storing relational data in a
| document store.
| jayd16 wrote:
| Changing SQL schema is honestly just not that hard...
|
| but even still, you can always use json columns in postgres if
| you don't want the db to enforce a schema.
| vosper wrote:
| Agreed, I've never understood this. There are lots of good
| migrations tools around for SQL databases. If people mean
| "you don't need to run migrations with Mongo, you can just
| start adding fields to documents as they're accessed, and
| clean up as-needed" then... don't do that. That's how the
| Mongo DB I am responsible for was managed for most of the
| last 10 years. It's a nightmare now, we can barely change it
| at all.
| offtop5 wrote:
| If you're just prototyping you don't need a schema.
|
| Complaining mongo doesn't scale is like saying your Miata
| can't off-road
| jayd16 wrote:
| >If you're just prototyping you don't need a schema.
|
| Oh you always have a schema. It's just about whether you
| want to enforce it or not.
| nucleardog wrote:
| It's enforced regardless in most situations. The question
| is whether you want it enforced on write with the
| database throwing an error, or enforced on every read by
| your app crashing or behaving incorrectly when it gets
| wildly unexpected values back.
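| A small sketch of that write-time versus read-time distinction,
| assuming SQLite for the enforced case and a plain Python list
| standing in for a schemaless store. The episodes table and the
| misspelled "seasno" key are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE episodes (
        id     INTEGER PRIMARY KEY,
        title  TEXT    NOT NULL,
        season INTEGER NOT NULL CHECK (season > 0)
    )
""")

# Enforced on write: the bad row never reaches storage.
try:
    conn.execute("INSERT INTO episodes (title, season) VALUES (?, ?)",
                 ("Pilot", -1))
    write_rejected = False
except sqlite3.IntegrityError:
    write_rejected = True

# Not enforced on write: a schemaless store (a plain list here) accepts it...
documents = [{"title": "Pilot", "seasno": 1}]  # note the misspelled key

# ...and the problem surfaces later, in whatever code reads the record.
try:
    seasons = [doc["season"] for doc in documents]
    read_failed = False
except KeyError:
    read_failed = True

stored_rows = conn.execute("SELECT COUNT(*) FROM episodes").fetchone()[0]
print(write_rejected, read_failed, stored_rows)
```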
| Scarbutt wrote:
| If you are prototyping a local file will do, or store json
| in a text field in sqlite, or in a json column in postgres.
| You will even have more flexibility than mongo since you
| will not have the constraint of their document data model.
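| For the "json in a text field in sqlite" option, a sketch might
| look like the following. The things table and the save/load
| helpers are hypothetical names, and decoding is done in Python so
| nothing depends on SQLite's optional JSON functions.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# One fixed column for the key, one free-form column for everything else.
conn.execute("CREATE TABLE things (id INTEGER PRIMARY KEY, body TEXT NOT NULL)")

def save(conn, doc):
    """Store any JSON-serializable dict and return its generated id."""
    cur = conn.execute("INSERT INTO things (body) VALUES (?)", (json.dumps(doc),))
    return cur.lastrowid

def load(conn, doc_id):
    """Fetch a document by id, or None if it does not exist."""
    row = conn.execute("SELECT body FROM things WHERE id = ?", (doc_id,)).fetchone()
    return json.loads(row[0]) if row else None

# Two documents with different shapes live side by side.
first = save(conn, {"name": "Ada", "tags": ["admin", "beta"]})
second = save(conn, {"name": "Bob", "prefs": {"theme": "dark"}})
print(load(conn, first), load(conn, second))
```

| If a field later needs indexing or constraints, it can be promoted
| to a real column without moving to a different database.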
| 35fbe7d3d5b9 wrote:
| Echoing this.
|
| My prototyping always starts with DB =
| {}
|
| at the top of my file. Sometimes it grows to serializing
| to disk / loading the dumped object from disk. Often
| times it's all I need to know that my idea was crap and
| needs revised. And it always keeps me from faffing about
| with infrastructure.
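| The "DB = {}" approach, grown to the serialize-to-disk stage, could
| look roughly like this. It assumes Python and JSON; the save/load
| helpers and the db.json file name are illustrative, not taken from
| the comment above.

```python
import json
import os
import tempfile

DB = {}  # the whole "database" for a throwaway prototype

def save(path):
    """Write DB atomically: dump to a temp file, then rename over the target."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(DB, f)
    os.replace(tmp, path)

def load(path):
    """Replace DB's contents with what is on disk, if the file exists."""
    if os.path.exists(path):
        with open(path) as f:
            data = json.load(f)
        DB.clear()
        DB.update(data)

path = os.path.join(tempfile.mkdtemp(), "db.json")
DB["users"] = {"1": {"name": "Ada"}}
save(path)
DB.clear()   # simulate a restart
load(path)
print(DB)
```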
| offtop5 wrote:
| Only your last option will allow me to send a webpage
| link to a friend for him to try it.
|
| The idea is if you have just a couple of days to crank
| out an MVP you don't want to waste time with postgres or
| whatever. The problem ends up being then your boss is
| like, all right this works keep going with it
| setr wrote:
| I don't understand why postgres w/ json enables a
| website, where sqlite and text files don't. You just ship
| the sqlite/text file with the rest of the webserver code?
|
| It's less work than setting up mongo, postgres, or
| whatever long-running data store. Just deploy the web
| application... and you're done.
| offtop5 wrote:
| So that would make sense, if it's purely for show. For a
| recent project we actually used firebase but I wanted to
| enable myself to update the database without redeploying
| the website.
|
| I generally dislike when people try to write off a
| technology just because they don't know how to use it.
| Don't hammer in screws, but it doesn't mean you should
| throw out all of your hammers
| [deleted]
| yawnxyz wrote:
| yeah json columns are a game changer and will definitely
| "disrupt" nosql usage, at least for my own projects
| 01acheru wrote:
| This article keeps coming up every once in a while and reminds me
| of all those "Why JS sucks", "Never use PHP", "Java is enterprise
| only", "Ruby only works on hobby projects" etc...
|
| But then in real life people built great software with all the
| above, so I'll just say a great classic: pick something you know,
| use it well, build something good, end of story!
|
| No tool will fix wrong assumptions or bad design, we can dive
| into philosophy here but I'm more of a practical person so... :)
| jeff-davis wrote:
| I would argue that a lot of weaker elements in the stack (e.g.
| PHP) work well _because_ of a stronger database.
|
| A good database system does a lot to catch errors (including
| consistency problems), isolate them, and roll them back.
| Moreover, it will allow you to move performance problems into
| the database, where it's likely to be handled more efficiently
| with less application code (e.g. joining in the database is
| likely to be more efficient than a naive join algorithm
| implemented in the application).
|
| Some will argue that these are misfeatures and should be
| handled in the application. In some cases, that is true; but
| you are probably going to need some other aspects of the stack
| to be very robust and performant to get reasonable results.
|
| In other words (please excuse my examples as they are intended
| for illustration and not flamebait), PHP over Postgres might be
| fine; Haskell over MongoDB might be fine; but PHP over MongoDB
| is playing with fire.
|
| I'd still say that, in most cases, the database layer is the
| first place to start to work toward a robust system. Even a
| proven-correct Haskell program can fail miserably if there was
| a minor bug three versions ago that wrote some bogus data that
| wasn't caught by a good database layer.
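| As a rough illustration of a database catching an error and rolling
| it back, here is a sketch assuming SQLite via Python's sqlite3
| module. The accounts table and the transfer function are made up
| for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE accounts (
        name    TEXT PRIMARY KEY,
        balance INTEGER NOT NULL CHECK (balance >= 0)
    )
""")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move money between accounts; either both updates land or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?",
                         (amount, dst))
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?",
                         (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

ok = transfer(conn, "alice", "bob", 30)
rejected = transfer(conn, "alice", "bob", 500)  # would drive alice below zero
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, rejected, balances)
```

| The second transfer credits bob first and only then fails the CHECK
| on alice, yet bob's balance is unchanged afterwards because the
| whole transaction is undone.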
| bdcravens wrote:
| In each of those language examples, there are examples of great
| software that migrated away from those languages for different
| reasons.
| throw1234651234 wrote:
| The only practical application of MongoDB I can justify to this
| day, is if you have a form builder, that allows users to build
| completely custom forms. Forms.io gives you the form schema in
| such a scenario as JSON out of the box. That gets matched with
| answers as JSON.
|
| Saving this to MongoDB directly, rather than SQL seems to
| simplify things.
|
| With anything else, at the end of the day, you are enforcing FK
| constraints anyway, so might as well use SQL.
|
| I never had issues with MongoDB performance.
|
| One caveat to this is that I am yet to see a project that needs
| database sharding in real life, and I have worked on projects
| with millions of entries in a table and hundreds of writes a
| minute.
| CodesInChaos wrote:
| Most relational databases have some level of json support
| nowadays. So I'm not sure how much it simplifies in that
| case.
| throw1234651234 wrote:
| There is appeal to the json being directly searchable in
| the database, ala MongoDB.
|
| Specifically for analytics/reporting.
|
| The problem is that most analytics/reporting/BI tools SAY
| they support MongoDB and then they tell you to write a
| "connector" for each entity, at which point it's easier to
| just move it to SQL.
| larrik wrote:
| This advice is fine for programming languages, but not for data
| stores. Using them incorrectly (or in MongoDB's case and
| sometimes MySQL's, correctly) could lead to things like data
| loss or crashed servers. Simply building your product on one of
| them is not proof they are good enough.
| Geminidog wrote:
| No, there are programming languages that are just plain bad
| to use, his philosophy is wrong not only for datastores but
| for reality in general.
|
| All tools, including programming languages can be bad no
| matter the skill of the user.
| artificial wrote:
| Found the person that doesn't like BrainF*ck. /s
| redwall_hp wrote:
| With Excel as a data store!
| 01acheru wrote:
| Glad to see that you actually know what is right ;)
|
| All tools can be bad, but that means that probably you
| cannot use it well or build something good with it. Every
| good enough tool can be used to build something good
| enough, or else it wouldn't be good enough. And it's also
| realistic to assume that every tool popular enough is good
| enough for something.
|
| But you see, it's starting to get philosophical over here
| rowanG077 wrote:
| You can create the grand canyon with an infinite supply
| of plastic spatulas. Does that mean a plastic spatula is
| good tool for creating grand canyons? Of course not.
| TomSwirly wrote:
| It would be more useful if you addressed the specific issues
| brought up in the article, rather than generically tried to
| dismiss all articles critical of any programming language...
| [deleted]
| Geminidog wrote:
| Your post implies that there is no tool on earth that "sucks"
| and that it's not the tool, it's the person.
|
| It's impossible for EVERY tool to be good. This isn't reality.
| There has to be tools that are patently bad to use and people
| have used these bad tools to build great things. But it doesn't
| change the fact that a tool can be horrible to use.
|
| I would argue that at the time the article was written, Mongo
| was definitively a bad tool. Things have changed, but not all
| things.
| aeturnum wrote:
| > It's impossible for EVERY tool to be good.
|
| What does 'good' mean? Gcc is a good C compiler and a bad
| Java compiler. It's even worse at being a document database.
|
| I don't think 01acheru was saying that all tools are equally
| able to do all tasks. I read them as saying that people have
| used tools with recognized flaws to make good stuff and that
| being snobby about whether tools are "good" or "bad" in a
| general way isn't super useful for anyone. Instead, we should
| say specific things about specific flaws and let others
| decide if those flaws matter to them.
|
| In particular, this post isn't really saying that MongoDB
| doesn't work, it's saying that the MongoDB data model isn't
| useful for what the author was using it for. Even if you are
| sure that your app was "the perfect use case for MongoDB" all
| you can really speak about is your use case. The real
| headline for this article is "we couldn't make MongoDB work
| and we're skeptical anyone can," which is totally fair, but
| shies away from the grand claims that 01acheru (and I) are
| critiquing.
| systemvoltage wrote:
| Precisely. It is _extremely_ crucial to discuss shortcomings
| and criticize tools otherwise how are we going to improve if
| we all roll along singing the tunes of each other and never
| questioning anything?
|
| This kind of attitude and softness towards criticism - "All
| tools are great" is not how we need to operate. Professional,
| well articulated and constructive criticism needs to be on
| the table.
|
| I downvoted the GP for this reason.
|
| I advise everyone here to listen to criticisms and write them
| as well. Don't be afraid of some kind of a backlash, express
| freely.
| 01acheru wrote:
| Well I guess we are discussing different angles of this
| matter, I actually didn't think that my comment would go
| to the top since it was supposed to be a random naive
| statement.
|
| I'm not about making things that simple by default, but I
| don't like those absolutist titles like "Never use
| MongoDB", also because the years since 2013 actually proved
| the article to be kind of wrong.
|
| "Never do X" is what I tell to children about matters that
| they wouldn't be able to understand, and making a point
| like "we used an immature tool that wasn't the best choice
| for what we were building, and on top of that we used it
| wrong, so you random guy should never use it for anything,
| ever" sounds like fearmongering to me and it's not
| something suitable to my taste.
|
| But I get your downvote :+1:
| systemvoltage wrote:
| Indeed, never do X is pretty strong and needs equally
| strong arguments to back it up. But it can be true. There
| are tools that are superseded by better ones.
|
| Equally, it's important to also tone down "X is nothing
| but the best" and praises should also require equal and
| opposite constructivism.
| pdimitar wrote:
| Criticism must have context so your statement was indeed
| a bit naive and too generic.
|
| In the case of MongoDB, many people thought that with it
| you'll have all of the benefits of, say, PostgreSQL,
| without having to think much about your data schema.
| History has shown that this is not the case -- as usual,
| it's about tradeoffs. There are no absolute wins: if you
| want an RDBMS, you have to put more work in X, if you
| want a document store then you have to try really hard
| with Y.
|
| "MongoDB vs. RDBMS" is a very old and tired argument by
| now but it basically boils down to: people started using
| MongoDB wide-eyed, optimistically and with more
| enthusiasm than engineering skill and of course, there
| were harsh reality checks.
| jeff-davis wrote:
| I agree that absolute statements like "never" obscure
| useful discussion.
|
| I would reword your comment as: "If you are building
| something and excited, then keep going, don't stop
| because a blog told you 'never'". I think that's what
| your main point was, and it's a good one.
| alexfrydl wrote:
| I hate the logic that because people have successfully used
| something, it is good. Having used MongoDB extensively, it is
| absolutely true that MongoDB is powerful and useful. However,
| it is also true that I would never choose it over something
| else for a new project. It has likely improved since 2013,
| but so has everything else.
|
| Certainly I could endeavor to build amazing modern software
| in C, but unless for some reason C is an absolute must, I
| would rather try any other language first. I doubt this is a
| controversial statement, and yet someone will always be there
| to defend the opposite stance.
| daniel-grigg wrote:
| Now you're just repeating the same fallacy by assuming all
| these tools have improved at the same rate.
| at-fates-hands wrote:
| > Your post implies that there is no tool on earth that
| "sucks" and that it's not the tool, it's the person.
|
| I got more of a "stop complaining about tools. Pick one you
| like, use it and build something with _IT_ instead." I feel
| the same way. Developers are a fickle bunch. One tool works,
| but then in order to be "cool" you have to bag on it and
| then propose some other obscure tool you think is better that
| nobody has ever heard of.
|
| It seriously reminds me of people arguing over music. It's
| totally uncool to like a mainstream band because _everybody_
| else likes that band and it's not cool to like them. So then
| you have all the "cool" people who listen to all the obscure
| "awesome" bands who Rolling Stone magazine tells you to
| listen to, so then you go around telling people you listen to
| the Shithouse Rats. "Oh you've never heard of the Shithouse
| Rats? Well, they're kind of obscure." and now you're one of
| the cool kids.
| 01acheru wrote:
| Yeah that was my point, 100%!
|
| And by the way I really like your music analogy, and it's
| emblematic of something even larger: they read about
| Shithouse Rats on Rolling Stone, named after a song of Bob
| Dylan or Muddy Waters or the group itself (don't know which
| one) and all of them are quite famous and mainstream.
|
| You've got to love mankind, we are awesome!
| skrtskrt wrote:
| I feel there can be a fair reason behind this, which is
| basically that using a tool no one else uses, even if it's
| great for the use case, is likely to be a losing battle.
|
| It's hard to get approval to use it at work, less usage means
| fewer bugs are caught and features developed, etc.
|
| So people evangelize their favorite tools because it
| benefits them directly to have them adopted more widely.
| Geminidog wrote:
| Music implies sort of an everything is just an opinion
| thing so your analogy does not fit.
|
| Think of it like horse drawn wagon vs. a car. There might
| exist a guy who in his humble opinion thinks the wagon is
| better so he uses it to get places instead of a car but is
| that guy a reasonable guy? No.
|
| The analogy I mentioned above is more apt because Mongo was
| indeed at one point in time more of a wagon rather than an
| unpopular piece of music.
| jimbokun wrote:
| This is usually true.
|
| But every once in a while you have a case like WhatsApp,
| which sold to Facebook at a price of $500 million per
| engineer, which never could have happened without Erlang.
| karmakaze wrote:
| > at the time the article was written, Mongo was definitively
| a bad tool
|
| I concur. Version 3.0, dated March 3, 2015, uses the WiredTiger
| engine, which fixes much of the brokenness.
|
| I did some workaround work on a MongoDB v2.x app. It did suck
| and was inconvenient operationally, but it also did scale so
| had its uses.
|
| However the discussion now should be about whether to use it
| today, not back in 2013. So it's fair to say it did suck or
| that you shouldn't have used it, but that doesn't have much
| relevance now.
| pc86 wrote:
| They never said that every tool is good. They didn't even
| imply it.
| [deleted]
| worik wrote:
| "There has to be tools that are patently bad to use and
| people have used these bad tools to build great things."
|
| Don't those tools die out and get forgotten?
| pdimitar wrote:
| No. Sticking to what is known beats innovation most of the
| time.
|
| There is no correlation between something being widely used
| and it being good at its job.
| gher-shyu3i wrote:
| It really depends. Things can be built despite the
| technology. I saw this especially at an employer who was
| heavily invested in golang. Countless times I've thought to
| myself that they wouldn't be having the issues they were
| having, sunk costs, reinventing the wheel, etc. if they used
| a proven technology like the JVM instead of drinking the kool
| aid and using the latest fad of the day. Tons of money was
| sunk into it, and it was made to work by force, but not
| everyone would be able to bear that cost, and it still causes
| massive inefficiencies due to poor tooling.
| Geminidog wrote:
| >No tool will fix wrong assumptions or bad design, we can dive
| into philosophy here but I'm more of a practical person so...
| :)
|
| And no design can fix a bad tool, we can dive into the
| practicalities here but I'm more of a philosophical person
| so...:)
| jimbokun wrote:
| > "Why JS sucks", "Never use PHP", "Java is enterprise only",
| "Ruby only works on hobby projects"
|
| All of these have at least a little bit of truth to them, and
| you should know about the downsides of technologies, even if
| you decide to use them anyways.
| dudeinjapan wrote:
| MongoDB + Ruby sparks joy for me. It's come a long way since 2013
| and latest features like transactions (though nowhere near SQL
| level of robustness) are enough for my use cases. To each his or
| her own.
| jonstaab wrote:
| > On my laptop, PostgreSQL takes about a minute to get
| denormalized data for 12,000 episodes, while retrieval of the
| equivalent document by ID in MongoDB takes a fraction of a
| second.
|
| What? Her database can't possibly be indexed properly.
| kulig wrote:
| I remember reading some of her comments a while ago and she
| seems like a pretty arrogant person. Probably doesn't know how
| to use postgres properly.
| joshxyz wrote:
| in most cases it could also be 1.) how the user writes the
| code, and 2.) how the db api / library was coded
| mywittyname wrote:
| Yeah, this sounds like a design defect. But since the author
| doesn't really describe what they did, it is hard to really
| figure it out. I'm guessing this is some sort of query with a
| self-join going on, where the mongo request is a basic fetch by
| id.
| mbreese wrote:
| The use of the term "denormalized" suggests to me that it was
| a query that involved a lot of joins. Which is certainly
| something that could have been otherwise addressed with a
| different design.
|
| Comparing fetching from a normalized design to a denormalized
| one isn't really a fair comparison.
| dathinab wrote:
| I fear the problem is that the person doesn't do joins but
| separate sub-queries which then get recombined in RAM in
| the client. Given the software stack described there is a
| realistic chance of this happening implicitly due to ORM
| mappers.
|
| But then, since tables don't map well to 1-to-many mappings
| and joins still return data as tables, this can also be a
| problem, especially if a large field gets duplicated a lot.
| RDBMS really should go from 2d-Tables to proper nested types
| for _results_ IMHO.
| fabian2k wrote:
| That would be nice, though I suspect that it is really
| much more complicated than it seems. You can emulate this
| to some extent in Postgres with the various JSON
| functions and essentially return a tree from a single
| query. But my experience was that I quickly got to a
| point where the query plan got really complex and
| planning time started to dominate.
| wetmore wrote:
| Her
| jonstaab wrote:
| Thanks, corrected
| fabian2k wrote:
| Even without an index that sounds too long (though obviously
| hardware and Postgres itself both have come a long way since
| 2013). At 12,000 rows even a brute force query should be quick.
|
| I would suspect something like bad statistics or some other
| reason that caused a pathological query plan. In any case this
| is not a good representation of any potential performance
| difference between Postgres and MongoDB.
| jonstaab wrote:
| I could see it taking that long if she was using a full movie
| database with millions of rows in it.
| luhn wrote:
| Author is using Rails, so my guess would be the bottleneck is
| ActiveRecord. I've never used ActiveRecord, so I can't speak to
| it directly, but in my experience when dealing with large
| numbers of records in an ORM (and author easily is working with
| hundreds of thousands), things grind to a halt, even with eager
| loading. There's a lot more overhead to create thousands of ORM
| objects than it is to serialize an equivalent chunk of BSON.
| treeman79 wrote:
| Bulk operations are where a lot of rails programmers struggle.
|
| Active record is awesome in many ways, but it can shoehorn
| you into n+1 solutions.
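| For readers who haven't met the n+1 pattern, a sketch of it next to
| the single-join alternative. This uses plain sqlite3 rather than
| ActiveRecord; the shows/episodes tables and the counting run()
| helper are invented for illustration, with each call standing in
| for one database round trip.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE shows    (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE episodes (id INTEGER PRIMARY KEY, show_id INTEGER, title TEXT);
    CREATE INDEX idx_episodes_show ON episodes (show_id);
""")
conn.executemany("INSERT INTO shows VALUES (?, ?)",
                 [(i, f"show {i}") for i in range(50)])
conn.executemany("INSERT INTO episodes (show_id, title) VALUES (?, ?)",
                 [(i % 50, f"episode {i}") for i in range(500)])

stats = {"queries": 0}

def run(sql, params=()):
    """Execute a query and count it as one round trip."""
    stats["queries"] += 1
    return conn.execute(sql, params).fetchall()

# N+1: one query for the shows, then one more per show.
stats["queries"] = 0
n_plus_one = {}
for show_id, title in run("SELECT id, title FROM shows"):
    rows = run("SELECT title FROM episodes WHERE show_id = ? ORDER BY id",
               (show_id,))
    n_plus_one[title] = [r[0] for r in rows]
n_plus_one_queries = stats["queries"]

# Single join: the database does the matching, the app only groups rows.
stats["queries"] = 0
joined = {}
for show, episode in run("""SELECT s.title, e.title
                            FROM shows s JOIN episodes e ON e.show_id = s.id
                            ORDER BY s.id, e.id"""):
    joined.setdefault(show, []).append(episode)
join_queries = stats["queries"]

print(n_plus_one_queries, join_queries)
```

| Both approaches build the same mapping; the difference is 51 round
| trips versus 1, which is where network latency and per-object
| overhead start to add up.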
| alpineidyll3 wrote:
| For the record we use mdb in production, and it's been fine.
| franklyt wrote:
| SQL is a solution to a problem in the vein of prematurely
| optimizing for many use cases where mongodb is called for.
| petepete wrote:
| I first read this article while working on a project where the
| company had basically written a RDBMS using MongoDB. It was so
| many different kinds of bad I lost count.
| throwawayboise wrote:
| That's the problem with using document databases when you
| really need an RDBMS. You end up reimplementing an RDBMS,
| badly.
| mywittyname wrote:
| It's so painful to come onboard projects that should have
| been designed originally in an RDBMS, but that was never a
| consideration because, "they are slow." People fight the
| migration tooth and nail until it becomes nearly impossible
| to move forward.
| a13n wrote:
| We've used MongoDB at our SaaS and have grown to well over $1m
| ARR and never had an issue.
|
| Maybe if you're trying to build a massive ($B) company, starting
| with PostgreSQL makes more sense for you. For everything else,
| MongoDB works just fine.
|
| Just use the technologies that you know and can move fastest
| with. Startups rarely succeed/fail because of which technologies
| you choose to use.
| bdcravens wrote:
| If your data store is nothing but a persistence layer for an
| application, perhaps that makes sense.
|
| In many companies the need to regularly access the database for
| analytics and BI is a thing; this isn't limited to $B
| companies. Most of the tooling available works best with SQL
| databases. (Though the BI connector at
| https://docs.mongodb.com/bi-connector/current/ looks
| interesting for this purpose)
| nojvek wrote:
| I think there are two separate aspects that get conflated into
| one.
|
| 1) Document database - rather than a strict rigid schema, you can
| store nested json documents in tables/collections. Or the idea of
| soft schema where the whole database doesn't need to be blocked
| for a schema change and you have some leeway in integrity.
|
| 2) Relational database - Ability to make complex sql queries that
| join data from multiple tables.
|
| Mongodb has some support for joining but it doesn't have a sql
| variant. If your data is mostly key:val store then it's great.
| You can shard it, and have replicas. It's easy to make a fast
| reliable backend with mongodb. Many popular sites run on mongodb
| backend.
|
| However with new json types in MySQL and Postgres, it too has
| support for inserting documents and querying subkeys. It can be
| sharded and replicated (albeit with a bit more configuration).
|
| Couchbase, which is like Mongo (in its document store
| capabilities), has N1QL, which offers the agility of SQL and the
| flexibility of JSON.
|
| So like any tool, it has its tradeoffs.
|
| Then again kudos to the author for evoking our reptilian brains:
| "Never use MongoDB" incites emotions and gets you on top of HN.
| If it was called "When to use MongoDB", it wouldn't get the same
| reaction.
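
As a rough illustration of the point above about relational databases storing documents and querying subkeys, here is a minimal sketch. It uses SQLite's json_extract function through Python's standard sqlite3 module as a stand-in, and it assumes the SQLite build includes the JSON functions. Postgres and MySQL have their own JSON operators and column types, which are not shown here. The events table and its fields are hypothetical.

```python
import json
import sqlite3


def build_events_db():
    """Store JSON documents as text in a single column (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, doc TEXT)")
    docs = [
        {"level": "error", "ctx": {"user": "ann", "code": 500}},
        {"level": "info", "ctx": {"user": "bob"}},
        {"level": "error", "ctx": {"user": "cy", "code": 404}},
    ]
    conn.executemany(
        "INSERT INTO events (doc) VALUES (?)",
        [(json.dumps(doc),) for doc in docs],
    )
    return conn


def users_with_level(conn, level):
    """Filter on one subkey and project a nested subkey of each document."""
    sql = (
        "SELECT json_extract(doc, '$.ctx.user') FROM events "
        "WHERE json_extract(doc, '$.level') = ? ORDER BY id"
    )
    return [row[0] for row in conn.execute(sql, (level,))]
```

The documents need no shared schema beyond the keys the query touches, which is the flexibility being described.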
| ashtonkem wrote:
| I'll never ever use MongoDB again, because every single time I've
| ended up running a cluster it has always been the most troubled
| part of my stack. I've been burned _way_ too many times to ever
| consider touching that stove again.
| ryanianian wrote:
| What do you mean by "troubled," and what docs did you follow to
| set up and run the cluster?
|
| Caring and feeding for a database (of any type) with any type
| of HA has a learning curve. Hence the growing number of PaaS
| services that handle the setup and maintenance for you (AWS's
| RDS, Mongo's Atlas, etc)
| cj wrote:
| MongoDB's own management software (Cloud Manager, Ops
| Manager) is littered with bugs to the point where it's nearly
| unusable for certain actions. (One of which being restoring
| backups)
|
| It was really, really bad 3-4 years ago. Regularly entering
| irrecoverable error states while performing basic management
| operations via MongoDB's management GUI. I've noticed a
| significant improvement in the past ~1 year.
| ashtonkem wrote:
| By troubled I mean it would regularly go into what I called
| "three stooges" mode where each node swore that another was
| the master. Of course this would mean that writes would stop
| dead in their tracks.
|
| Given that this happened to me across companies, teams, and a
| half decade of time, I've decided that this is a case where
| the problem is Mongo and not me.
| ryanianian wrote:
| The only constant there is you which means it is perhaps
| you.
|
| Setting up and caring for a DB cluster is a complicated
| thing, to the point of there being non-BS certification
| courses for nearly every major HA database including Mongo.
| It very well could be that there was a well-documented flag
| that you never learned.
|
| This doesn't mean you're being unreasonable. I'd be cranky
| with any DB getting itself in a split-brain scenario, but
| my conclusion wouldn't be a bug in the software but rather
| it's a bug in my understanding. It's worth noting that
| getting in this state should either be impossible, or it
| should be obvious about how it arrived at such a state with
| links to relevant documentation.
|
| (There's also little incentive to make it super easy to run
| in production on your own. They sell that as a service
| after all. It wouldn't surprise me if the product had
| invariants that assumed production-level configurations and
| that nobody's tested with whatever configs ended up making
| it go nuts.)
| eecc wrote:
| Ok, so besides the technical faults of Mongo within the context
| of its category, what is the ideal use-case of a document
| oriented store?
|
| If you use metadata documents to model your relations you might
| get away with the most dangerous foot guns, but then why not jump
| straight into graph databases?
| fortran77 wrote:
| But it's web scale!
|
| http://www.mongodb-is-web-scale.com/
| jasondc wrote:
| This article from 8 years ago highlights how far MongoDB has
| come: transactions, left outer joins ($lookup), etc.
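
For readers unfamiliar with $lookup, the sketch below mimics its left-outer-join behaviour in plain Python: every document from the left collection is kept, and matching documents from the other collection are attached as an array, which is empty when nothing matches. The lookup helper and its parameter names are hypothetical and only model the semantics; this is not MongoDB's API or implementation.

```python
def lookup(left, right, local_field, foreign_field, as_field):
    """Attach matching `right` documents to each `left` document.

    Every left document survives, even with no match, which is what
    makes this a left outer join.
    """
    # Index the right-hand collection by the join key once.
    index = {}
    for doc in right:
        index.setdefault(doc.get(foreign_field), []).append(doc)

    joined = []
    for doc in left:
        merged = dict(doc)  # shallow copy so the input is not mutated
        merged[as_field] = list(index.get(doc.get(local_field), []))
        joined.append(merged)
    return joined


def demo():
    """Hypothetical customers/orders data for illustration."""
    customers = [{"_id": 1, "name": "Ann"}, {"_id": 2, "name": "Bob"}]
    orders = [
        {"_id": 10, "customer_id": 1, "total": 5},
        {"_id": 11, "customer_id": 1, "total": 7},
    ]
    return lookup(customers, orders, "_id", "customer_id", "orders")
```

In demo(), Bob has no orders but still appears in the output with an empty orders list, where an inner join would have dropped him.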
| ashtonkem wrote:
| The issue is that MongoDB isn't chasing a fixed target. RDBMS
| have gotten better in the meantime.
| nickkell wrote:
| How many years before it reaches feature parity with a
| traditional RDBMS though? And when will it get a query language
| as good as SQL?
|
| That said, I will admit the change streams feature is amazing.
| That completely changed the way I thought about building
| reactive applications
| leke wrote:
| > But there are actually very few concepts in the world that are
| naturally modeled as normalized tables. We use that structure
| because it's efficient, because it avoids duplication, and
| because when it does get slow, we know how to fix it.
|
| Urmmm...How?
| axegon_ wrote:
| Despite having used document oriented databases for many
| years (largely because they were shoved down my throat and I
| inherited someone else's architecture), I never really managed to
| figure out why people find them so compelling. There has been a
| shift in the last two years and people have started running away
| from them. Specifically the web-dev crowd adored them and I guess
| it's easy to fetch a document in the exact structure you need it
| but sooner or later you inevitably reach the point where you have
| to analyze data. And here Mongo (and all the similar alternatives)
| become the biggest pain in the a...neck you can think of.
| Couchbase tried to tackle this issue with n1ql to a certain
| degree but at large scale it is still not particularly useful. To
| my mind, having a relational database which has a good
| architecture can't be matched by any document oriented database.
| But getting a large system/database right does take more effort.
| There are numerous ways to make relational databases incredibly
| scalable but again, it takes a lot more effort.
| fendy3002 wrote:
| It's amazing for three things: search, logging and draft
| records.
|
| Search, with MongoDB you can do an $all query, which is hard to
| replicate at the SQL level without aggregation. However I'm still
| waiting for aggregate-level $elemAt.
|
| Logging, you can attach anything to a property, then it'll be
| queryable.
|
| Draft records, it's easy to just insert and insert the records
| because it's schema-less. Validate during creation and validate
| again during publishing or approval. It's queryable and you can
| use a generic collection for that.
|
| For logging and draft records, sql JSON field may be able to
| handle them, though I don't know how good it is at querying.
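
To show what "without aggregation" refers to in the $all remark above, here is a minimal sketch of one common relational workaround: tags live in a separate table, and a GROUP BY with HAVING COUNT(DISTINCT ...) keeps only the documents that carry every wanted tag. It runs against SQLite through Python's standard sqlite3 module. The doc_tags table and function names are hypothetical.

```python
import sqlite3


def build_tags_db():
    """One row per (document, tag) pair; hypothetical data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE doc_tags (doc_id INTEGER, tag TEXT)")
    conn.executemany(
        "INSERT INTO doc_tags VALUES (?, ?)",
        [
            (1, "db"), (1, "nosql"),
            (2, "db"),
            (3, "db"), (3, "nosql"), (3, "sql"),
        ],
    )
    return conn


def docs_with_all_tags(conn, wanted):
    """Return ids of documents that carry every tag in `wanted`.

    Roughly the relational counterpart of the MongoDB filter
    {tags: {$all: [...]}} on an array field.
    """
    wanted = sorted(set(wanted))
    if not wanted:
        return []
    placeholders = ",".join("?" for _ in wanted)
    sql = (
        "SELECT doc_id FROM doc_tags "
        f"WHERE tag IN ({placeholders}) "
        "GROUP BY doc_id "
        "HAVING COUNT(DISTINCT tag) = ? "
        "ORDER BY doc_id"
    )
    return [row[0] for row in conn.execute(sql, [*wanted, len(wanted)])]
```

The HAVING clause is the aggregation step: a document qualifies only if the number of distinct matching tags equals the number of tags requested.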
| jacobsenscott wrote:
| There was a time where adding a column to a database was a
| really big deal. You had to get it past the DBA, and there were
| real resource constraints on the database system. With a
| document store the schema is entirely in the hands of the
| developer.
|
| Also JSON became the standard way to ship data around, and
| RDBMS systems of the time couldn't really handle JSON. So you
| either write a bunch of code to map complex nested JSON to
| relational tables, or just dump it into an un-indexible text
| column.
|
| There was vendor hype, just like there was around Object
| databases in the pre-internet days.
|
| If you were starting a new project you needed to decide if you
| were going to use a document store and an RDBMS or just one or
| the other. If it was just one you would choose a document store
| if you anticipated you would need to handle a lot of
| unstructured data.
|
| Today the situation is reversed. A document store only does
| documents well. A good hybrid database like postgres gives you
| the best of both worlds. Throw in hosted database services and
| resource constraints are much less of an issue. So people
| aren't running back to an old school RDBMS. They are moving to
| a much superior and evolved data store.
| axegon_ wrote:
| That's only partially true. With document schemas, you simply
| eliminate the DBA since whatever you put in there is entirely
| up to you. In all fairness I've never dealt with DBAs - I've
| always managed to get a technological freedom and be able to
| design and organize my databases in whichever way I see fit.
| I'd generally hate to have to ask someone to clone a table
| for me or whatever.
|
| JSON is the standard way to ship data around the internet,
| yes. Though grpc is catching up and more and more often I see
| people relying on grpc in their architecture. And grpc
| conceptually is a lot closer to an RDBMS, given that you have a
| code generation step and everything in your data needs to be
| defined(aka statically typed).
|
| Recently I started several personal projects and though I
| struggle to find time and motivation to work on them on my
| own, document related databases are completely out of the
| question. Postgres and potentially Redis as a proxy for heavy
| loads and that's that. I wouldn't call Postgres a hybrid
| database. It does support JSON datatypes natively but at its
| core it is the definition of what an RDBMS is. The best example
| for a hybrid database (from a developer's perspective since it
| isn't open source and I do not work for google in any shape
| or form) is spanner.
| dgb23 wrote:
| I fairly recently _really_ started to understand how
| important historical reasoning and understanding is in the
| context of software, technology and science. Your comment is
| a great example. Tech developments, choices, trends and so on
| only really make sense in the context of history. And often
| we forget about history, start to reinvent things or even
| steer into a completely useless direction because we don't
| apply temporal reasoning or simply don't learn from the past.
|
| Another benefit of this kind of approach is starting to learn
| about a challenging subject. Say you want to deepen your
| knowledge in a branch of mathematics that you find
| interesting and useful. The history of that branch will tell
| you so much more than a typical lecture-style conglomerate of
| concepts. It provides a great overview of important actors,
| their relationships, cause and effect of discoveries, the
| culture, the problems and so on. On top of that it is easier
| to remember and internalize concepts if you know the story
| behind them.
| ashtonkem wrote:
| > There was a time where adding a column to a database was a
| really big deal. You had to get it past the DBA, and there
| were real resource constraints on the database system. With a
| document store the schema is entirely in the hands of the
| developer.
|
| That time is still here if you're running enough read nodes
| and QPS.
| hnarn wrote:
| > I never really managed to figure out why people find them so
| compelling.
|
| This might sound jaded but my feeling is that a lot of
| developers just looked at JSON objects that they were already
| working with and thought to themselves "actually, it would be
| cool to just store this directly".
|
| Which, in itself, isn't a bad idea but writing a completely new
| solution from scratch to a problem that's been solved for
| decades seems a bit like hubris.
|
| AFAIK many relational databases support JSON today, so I'm not
| sure what the argument would be to choose something like
| MongoDB today from scratch if you had the choice of anything.
| w0m wrote:
| > why people find them so compelling
|
| My theory is that it's easy to add a field by adding logic into
| the app instead of munging tables relationships. Moves the
| logic to where developers are more comfortable. Scalability/etc
| is irrelevant for most use cases anyway.
| hnarn wrote:
| > Scalability/etc is irrelevant for most use cases anyway.
|
| I literally can't parse what you mean by this
| whatsmyusername wrote:
| I have found 2 use cases, one of which I've never actually seen
| in the wild.
|
| The most common use case is, "I need to store data where the
| schema is unknown or can change without notice, and have my
| shit not break." This is what we used Mongo for.
|
| The other use case I could see (and this is pretty much only
| with Dynamo) is, "I want to build an application that's cross-
| region native. Most of my data is relatively static, so I
| accept eventual consistency on changes. I will have a separate
| data store for transactional data and data that cannot be
| eventually consistent." I want to build this project, but it
| will never happen because it's too easy to RDBMS in a single
| region to start.
| __float wrote:
| I have found myself enjoying using a document database as the
| online store, and then using a 'big data solution' (we use
| Presto) for any analytics queries later.
|
| Traditional migrations for relational databases are really
| painful. Document databases make this much easier, and if
| you've faced the operational pain of needing to migrate a large
| database (for example, it's so easy to accidentally lock an
| entire table in Postgres), you might be pretty compelled.
|
| (That said, I think the pendulum is swinging back away from
| document databases. So you're in luck ;))
| mohaine wrote:
| How do you normalize your Json structure so it can be
| queried? Do you enforce a schema on your JSON or do you morph
| it on export into a common structure.
|
| If you enforce a schema on the Json structure how do you
| handle the changes on the live system?
| thehappypm wrote:
| I think the parent is arguing that it's hard to do useful
| big-data analysis on highly nested structures of data. When
| your database is storing some immense JSON blob, it's hard to
| write SQL against it.
| e12e wrote:
| > Despite having used document oriented databases for many
| years (largely because they were shoved down my throat and I
| inherited someone else's architecture), I never really managed
| to figure out why people find them so compelling.
|
| Well, filesystems are pretty good. It's the only document store
| I use (and mostly enjoy).
|
| But then you look at the trade-off with something like just
| Maildir, and you really start to wonder if this schemaless
| document store thing is so great?
|
| I suppose the real shame is that proper object dbs like zodb or
| gemstone get much less attention - they too have big trade-offs
| - but I feel they at least give back in terms of consistency
| and simplicity.
___________________________________________________________________
(page generated 2021-02-01 23:03 UTC)