hngopher.com

       [HN Gopher] Never use MongoDB (2013)
       ___________________________________________________________________
        
       Never use MongoDB (2013)
        
       Author : mikecarlton
       Score  : 122 points
       Date   : 2021-02-01 16:55 UTC (6 hours ago)
        
 (HTM) web link (www.sarahmei.com)
 (TXT) w3m dump (www.sarahmei.com)
        
       | flowerlad wrote:
       | I use MongoDB in my application. The approach I took is to store
       | data in flat documents (relational style) and only de-normalize
       | when necessary for performance. The relational model was invented
       | for a reason -- it is flexible and it is easy to update data in
       | one place and so on. The downside of relational is that joins
       | will kill you when you have very large tables. To avoid joins, I
       | use lookups when possible, and de-normalize only to the extent
       | needed. I get the best of both worlds.
        
         | Mavvie wrote:
         | You can do all that with SQL too. And do you have any very
         | large tables?
         | 
         | > joins will kill you when you have very large tables
         | 
         | nitpick, but table size doesn't directly matter (much). if your
         | queries are very specific and only return a couple rows, then
         | you can have huge tables and join across them without issue.
         | Joins only get particularly painful if you're doing
         | aggregation/reporting queries across large parts of it
        
           | dwheeler wrote:
           | > if your queries are very specific and only return a couple
           | rows, then you can have huge tables and join across them
           | without issue.
           | 
           | I agree, _if_ you index your tables. Relational databases are
           | very capable; when there's a performance problem, it's often
           | due to simple things like failing to index what should have
           | been indexed.
           | 
           | No tool is perfect for all use cases. There are cases where
           | relational databases won't work. But when I try to store
           | data, I first consider storing it in files, and if that is
           | unpleasant, I consider relational databases. These are both
           | relatively simple time-tested solutions, and it's usually
           | good to start with simple & time-tested _unless_ there's a
           | reason it won't work well.
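
A minimal sketch of the indexing point above, assuming Python's built-in
sqlite3 module rather than any particular production database. The users
and orders tables and the idx_orders_user index are hypothetical names
chosen for illustration. It asks the query planner how it would run the
same selective join before and after the index exists:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total REAL);
""")

# A "very specific" query: one user and that user's orders.
QUERY = """
    SELECT u.name, o.total
    FROM users u JOIN orders o ON o.user_id = u.id
    WHERE u.id = 42
"""

def plan(connection, sql):
    """Return the planner's description of each step as one string."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column is the detail text

before = plan(conn, QUERY)
conn.execute("CREATE INDEX idx_orders_user ON orders(user_id)")
after = plan(conn, QUERY)

print("before:", before)
print("after: ", after)
```

The exact plan wording differs between SQLite versions, so the sketch
prints the plans rather than assuming their text. The part that matters
is that the second plan names idx_orders_user for the orders lookup.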
        
           | threeseed wrote:
           | I can also use Lists, Sets, Maps in my data structure.
           | 
           | Doing the same in SQL requires a lot of intermediate tables.
        
           | interlocutor wrote:
           | SQL databases don't support easy schema evolution, sharding
           | and so on.
        
             | revscat wrote:
             | And Mongo is even worse. You can easily change document
             | structures, but now you have inconsistent data over time.
        
             | SomeCallMeTim wrote:
             | PostgreSQL directly supports table sharding.
             | https://pgdash.io/blog/postgres-11-sharding.html
             | 
             | Schema evolution is supported by pretty much every ORM
             | you'd care to use; it's not the job of the SQL database to
             | handle the migration. I'm using Prisma and you literally
             | change the software spec for the schema and say "migrate",
             | and it creates the migration SQL and applies it to the
             | Postgres DB programmatically. That gets you _deterministic_
             | schema evolution and not the "my schema isn't actually
             | reliable" that NoSQL/no-schema databases rely on.
             | 
             | And then you have CockroachDB/YugabyteDB that give you
             | extreme horizontal scalability, with full PostgreSQL
             | compatibility...
             | 
             | And bang, the last reason to use MongoDB vanishes.
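
A rough sketch of what "deterministic schema evolution" can look like,
assuming Python's built-in sqlite3 module and SQLite's PRAGMA
user_version as the version counter. This is not how Prisma works
internally; the MIGRATIONS list and the migrate function are
hypothetical and only illustrate the general versioned-migration idea:

```python
import sqlite3

# Hypothetical, ordered migration list; a real tool would generate these.
MIGRATIONS = [
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)",
    "ALTER TABLE users ADD COLUMN email TEXT",
]

def migrate(conn):
    """Apply every migration newer than the version stored in the database."""
    current = conn.execute("PRAGMA user_version").fetchone()[0]
    for version, statement in enumerate(MIGRATIONS, start=1):
        if version <= current:
            continue  # already applied on an earlier run
        conn.execute(statement)
        # Record progress so the same statement is never applied twice.
        conn.execute(f"PRAGMA user_version = {version}")
        conn.commit()
    return conn.execute("PRAGMA user_version").fetchone()[0]

conn = sqlite3.connect(":memory:")
first = migrate(conn)    # applies both migrations
second = migrate(conn)   # nothing left to do
columns = [row[1] for row in conn.execute("PRAGMA table_info(users)")]
print(first, second, columns)
```

Running migrate a second time changes nothing, which is the property
being described: every database that reaches version 2 got there through
the same ordered statements.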
        
               | HeyImAlex wrote:
               | I think a missing piece is making materialized views fast
               | since there will always be cases where someone wants to
               | denormalize to get around performance issues, even with
               | well designed indexes.
        
       | digitalsushi wrote:
       | As a sysadmin that often gets the privilege of pretending to
       | design the smallest of complex systems, one of my regular
       | components has been CouchDB because of its built in http api.
       | 
       | I've never been anti-Mongo, but this one little piece has made
       | CouchDB an affordable choice for people like me who are not
       | equipped to otherwise defend the choice.
       | 
       | Is there a missing piece that could deal Mongo back in the next
       | time I try to convince someone that there's as straight a path to
       | the sysadmin solutions I generally compose?
        
         | jacobwilliamroy wrote:
         | mongo-express maybe?
         | 
         | https://github.com/mongo-express/mongo-express
         | 
         | But if couchDB works, it works. Personally I'd love to shoehorn
         | LISP into everything I do, but most of the time I just use
         | python and bash because things tend to get done faster when I
         | do.
        
       | [deleted]
        
       | dang wrote:
       | If curious see also
       | 
       | 2016 https://news.ycombinator.com/item?id=12290739
       | 
       | Discussed at the time:
       | https://news.ycombinator.com/item?id=6712703
        
       | [deleted]
        
       | eatwater123 wrote:
       | Unless you want to.
        
       | nchase wrote:
       | This article appears here pretty frequently:
       | https://hn.algolia.com/?dateRange=all&page=0&prefix=false&qu...
        
       | manishsharan wrote:
       | plot twist .. use MongoDB api /drivers on Document Layer on
       | FoundationDB ;-)
        
       | andikleen2 wrote:
       | In the early tens we ran an engineering project to improve
       | critical sections in applications using transactional memory. We
       | had a PhD level intern applying our techniques to various open
       | source projects. One target was MongoDB. After a few days of
       | investigation of the Mongo source code, he had to give up because
       | he couldn't even find the critical sections in the source. They
       | had locking, but it was extremely convoluted.
       | 
       | So yes I would agree with that. Never use MongoDB.
        
       | AtlasBarfed wrote:
       | Jepsen review 2020:
       | 
       | https://jepsen.io/analyses/mongodb-4.2.6
       | 
       | ... not good.
        
       | the_duke wrote:
       | I'm by no means a fan of Mongo, but the product has improved
       | quite a lot over the years.
       | 
       | Mongo now supports multi-document (and multi-node) transactions,
       | joins, and has a decent storage engine.
       | 
       | So you might even have a chance of keeping your data actually
       | consistent.
        
       | williesleg wrote:
       | C'mon man!
        
       | cyxxon wrote:
       | > On a social network, however, nothing is that self-contained.
       | Any time you see something that looks like a name or a picture,
       | you expect to be able to click on it and go see that user, their
       | profile, and their posts. A TV show application doesn't work that
       | way. If you're on season 1 episode 1 of Babylon 5, you don't
       | expect to be able to click through to season 1 episode 1 of
       | General Hospital.
       | 
       | That is exactly what I'd expect, and that is how small websites
       | like IMDB work. I am on the page for a General Hospital episode,
       | and via the actors in the episode or whatever other part I can
       | click through to Babylon 5, or the other way around, or anywhere
       | else.
        
       | offtop5 wrote:
       | Mongo and other nosql databases are still the absolute fastest
       | way to get started, particularly when you don't know what your
       | data is eventually going to look like.
        
         | ogre_codes wrote:
         | Before PostgreSQL added JSON support this may have been true,
         | but now it's pretty easy to build an SQL database that you can
         | extend arbitrarily just like with Mongo.
         | 
         | You probably shouldn't use that too much, not for anything
         | remotely production, but it's there if you need that
         | flexibility or an easy place to dump javascript objects.
         | 
         | When you are just getting started, it's super easy to
         | manipulate SQL data structures regardless.
        
         | setr wrote:
         | I disagree -- an RDBMS is pretty much safe/sane for any design,
         | though it may not be the fastest. NoSQL databases
         | (specifically, eventually consistent DBs) are only safe/sane in
         | specific scenarios.
         | 
         | If you don't know what your needs are, you should _always_
         | start with an RDBMS -- it's not that difficult to go "up" to a
         | NoSQL db from there (you're only losing information, and if you
         | can't safely move because of loss of ACID... you'd probably
         | have been really fucked if you started off without it), but you
         | can't easily migrate back "down" to the RDBMS -- a NoSQL
         | database stores almost no information about your data or its
         | constraints.
         | 
         | And your application will almost always want transactional
         | guarantees and to model relationships properly -- generally
         | only small chunks of the design (design-wise; data-wise it
         | might be 90% of the app) can be treated with eventual
         | consistency and have real scaling needs, which you can shift
         | over to your nosql system.
         | 
         | Apps are generally just metadata tracking with a dash of real
         | work.
        
           | yawnxyz wrote:
           | as a UX designer / product engineer, it's easier for me to
           | jump into a nosql than to wire up a postgres instance for a
           | small proof of concept project that I know won't really go
           | anywhere
        
             | nucleardog wrote:
             | sqlite?
             | 
             | All my proof of concept (and some production) stuff just
             | uses that until I need features or concurrency that it
             | can't provide.
        
         | 35fbe7d3d5b9 wrote:
         | Careful here: while this may be true of _some_ NoSQL stores
         | that I've not used, this is a tarpit for the wide column
         | datastores I'm familiar with (Dynamo, Cassandra, and others).
         | 
         | It's very easy to get going quickly and find out that you've
         | tripped over antipatterns, and now your database is "a
         | database, a Ruby app, and 30ms of latency on every call". It's
         | easy to think "I'll model this later" and end up hitting the
         | database five times in a row to answer a question.
         | 
         | With these systems, up front modeling of your data model and
         | access patterns is _essential_ if the trade off you are trying
         | to make (far less functionality for smooth performance at
         | ridiculous scale) will ever make sense.
        
         | adwww wrote:
         | This is what the company I work for claimed. Now we are 10
         | years old and have a monster of a relational database stored in
         | MongoDB.
         | 
         | There's never a good moment to switch database, but there is
         | always a new problem caused by storing relational data in a
         | document store.
        
         | jayd16 wrote:
         | Changing SQL schema is honestly just not that hard...
         | 
         | but even still, you can always use json columns in postgres if
         | you don't want the db to enforce a schema.
        
           | vosper wrote:
           | Agreed, I've never understood this. There are lots of good
           | migrations tools around for SQL databases. If people mean
           | "you don't need to run migrations with Mongo, you can just
           | start adding fields to documents as they're accessed, and
           | clean up as-needed" then... don't do that. That's how the
           | Mongo DB I am responsible for was managed for most of the
           | last 10 years. It's a nightmare now, we can barely change it
           | at all.
        
           | offtop5 wrote:
           | If you're just prototyping you don't need a schema.
           | 
           | Complaining mongo doesn't scale is like saying your Miata
           | can't off-road
        
             | jayd16 wrote:
             | >If you're just prototyping you don't need a schema.
             | 
             | Oh you always have a schema. It's just about whether you
             | want to enforce it or not.
        
               | nucleardog wrote:
               | It's enforced regardless in most situations. The question
               | is whether you want it enforced on write with the
               | database throwing an error, or enforced on every read by
               | your app crashing or behaving incorrectly when it gets
               | wildly unexpected values back.
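
The write-time versus read-time distinction above can be sketched as
follows, assuming Python's built-in sqlite3 module for the enforced case
and a plain list of dicts standing in for a schemaless store. The users
table and the email_domain helper are hypothetical:

```python
import sqlite3

# Schema enforced on write: the database rejects the bad row immediately.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")
try:
    conn.execute("INSERT INTO users (id, email) VALUES (1, NULL)")
    write_error = None
except sqlite3.IntegrityError as exc:
    write_error = str(exc)

# No schema on write: the bad document is accepted, and the failure
# surfaces later, in whatever code happens to read it.
documents = []
documents.append({"id": 1})  # email is silently missing

def email_domain(doc):
    """Reader that assumes every document has an email field."""
    return doc["email"].split("@")[1]

try:
    email_domain(documents[0])
    read_error = None
except KeyError as exc:
    read_error = f"KeyError: {exc}"

print("on write:", write_error)
print("on read: ", read_error)
```

In the first case the bad row never lands in the table. In the second it
sits in the store until some reader trips over it.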
        
             | Scarbutt wrote:
             | If you are prototyping a local file will do, or store json
             | in a text field in sqlite, or in a json column in postgres.
             | You will even have more flexibility than mongo since you
             | will not have the constraint of their document data model.
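
One way to read the "json in a text field in sqlite" suggestion, as a
sketch only: Python's built-in sqlite3 and json modules, with a
hypothetical docs table and hypothetical save/load helpers. The JSON is
encoded and decoded in the application, so it does not depend on any
JSON functions being compiled into SQLite:

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT NOT NULL)")

def save(doc):
    """Store an arbitrary JSON-serializable dict and return its row id."""
    cur = conn.execute("INSERT INTO docs (body) VALUES (?)", (json.dumps(doc),))
    return cur.lastrowid

def load(doc_id):
    """Return the stored dict, or None if the id is unknown."""
    row = conn.execute("SELECT body FROM docs WHERE id = ?", (doc_id,)).fetchone()
    return json.loads(row[0]) if row else None

# Documents with different shapes live side by side in one table.
a = save({"name": "Ada", "tags": ["admin"]})
b = save({"name": "Bob", "age": 41})
print(load(a), load(b))
```

Swapping ":memory:" for a file path gives a single-file store that ships
with the application.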
        
               | 35fbe7d3d5b9 wrote:
               | Echoing this.
               | 
               | My prototyping always starts with
               | 
               |     DB = {}
               | 
               | at the top of my file. Sometimes it grows to serializing
               | to disk / loading the dumped object from disk. Often
               | times it's all I need to know that my idea was crap and
               | needs revised. And it always keeps me from faffing about
               | with infrastructure.
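
A sketch of the "DB = {}" approach growing into "serializing to disk",
assuming Python and JSON as the on-disk format. The comment does not say
which language or format was meant; the dump and load helpers and the
db.json file name are hypothetical:

```python
import json
import os
import tempfile

DB = {}  # the whole "database" is one in-memory dict

def dump(path):
    """Write the current state to disk as JSON."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(DB, f)

def load(path):
    """Replace the in-memory state with what was saved on disk."""
    with open(path, encoding="utf-8") as f:
        saved = json.load(f)
    DB.clear()
    DB.update(saved)

DB["users"] = [{"name": "Ada"}]
path = os.path.join(tempfile.mkdtemp(), "db.json")
dump(path)

DB.clear()   # simulate a restart
load(path)
print(DB)
```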
        
               | offtop5 wrote:
               | Only your last option will allow me to send a webpage
               | link to a friend for him to try it.
               | 
               | The idea is that if you have just a couple of days to crank
               | out an MVP, you don't want to waste time with postgres or
               | whatever. The problem ends up being that your boss then says,
               | "all right, this works, keep going with it."
        
               | setr wrote:
               | I don't understand why postgres w/ json enables a
               | website, where sqlite and text files don't. You just ship
               | the sqlite/text file with the rest of the webserver code?
               | 
               | It's less work than setting up mongo, postgres, or
               | whatever long-running data store. Just deploy the web
               | application... and you're done.
        
               | offtop5 wrote:
               | So that would make sense, if it's purely for show. For a
               | recent project we actually used firebase but I wanted to
               | enable myself to update the database without redeploying
               | the website.
               | 
               | I generally dislike when people try to write off a
               | technology just because they don't know how to use it.
               | Don't hammer in screws, but it doesn't mean you should
               | throw out all of your hammers
        
               | [deleted]
        
           | yawnxyz wrote:
           | yeah json columns are a game changer and will definitely
           | "disrupt" nosql usage, at least for my own projects
        
       | 01acheru wrote:
       | This article keeps coming up every once in a while and reminds me
       | of all those "Why JS sucks", "Never use PHP", "Java is enterprise
       | only", "Ruby only works on hobby projects" etc...
       | 
       | But then in real life people built great software with all the
       | above, so I'll just say a great classic: pick something you know,
       | use it well, build something good, end of story!
       | 
       | No tool will fix wrong assumptions or bad design, we can dive
       | into philosophy here but I'm more of a practical person so... :)
        
         | jeff-davis wrote:
         | I would argue that a lot of weaker elements in the stack (e.g.
         | PHP) work well _because_ of a stronger database.
         | 
         | A good database system does a lot to catch errors (including
         | consistency problems), isolate them, and roll them back.
         | Moreover, it will allow you to move performance problems into
         | the database, where it's likely to be handled more efficiently
         | with less application code (e.g. joining in the database is
         | likely to be more efficient than a naive join algorithm
         | implemented in the application).
         | 
         | Some will argue that these are misfeatures and should be
         | handled in the application. In some cases, that is true; but
         | you are probably going to need some other aspects of the stack
         | to be very robust and performant to get reasonable results.
         | 
         | In other words (please excuse my examples as they are intended
         | for illustration and not flamebait), PHP over Postgres might be
         | fine; Haskell over MongoDB might be fine; but PHP over MongoDB
         | is playing with fire.
         | 
         | I'd still say that, in most cases, the database layer is the
         | first place to start to work toward a robust system. Even a
         | proven-correct Haskell program can fail miserably if there was
         | a minor bug three versions ago that wrote some bogus data that
         | wasn't caught by a good database layer.
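
To illustrate the earlier remark about joining in the database versus a
naive join in the application, here is a small sketch assuming Python's
built-in sqlite3 module. The authors and posts tables and their rows are
made up for the example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts   (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO authors VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO posts VALUES (10, 1, 'first'), (11, 2, 'second'), (12, 1, 'third');
""")

# Naive application-side join: pull both tables over, then nested loops.
authors = conn.execute("SELECT id, name FROM authors").fetchall()
posts = conn.execute("SELECT author_id, title FROM posts").fetchall()
app_join = sorted(
    (name, title)
    for author_id, name in authors
    for post_author_id, title in posts
    if post_author_id == author_id
)

# The same join pushed into the database: one query, less application
# code, and the planner is free to use indexes.
db_join = conn.execute("""
    SELECT a.name, p.title
    FROM authors a JOIN posts p ON p.author_id = a.id
    ORDER BY a.name, p.title
""").fetchall()

print(app_join == db_join, db_join)
```

Both produce the same rows. The nested loops compare every author with
every post, so the work grows with the product of the two table sizes,
while the database version leaves the strategy to the planner.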
        
         | bdcravens wrote:
         | In each of those language examples, there are examples of great
         | software that migrated away from those languages for different
         | reasons.
        
         | throw1234651234 wrote:
         | The only practical application of MongoDB I can justify to this
         | day, is if you have a form builder, that allows users to build
         | completely custom forms. Forms.io gives you the form schema in
         | such a scenario as JSON out of the box. That gets matched with
         | answers as JSON.
         | 
         | Saving this to MongoDB directly, rather than SQL seems to
         | simplify things.
         | 
         | With anything else, at the end of the day, you are enforcing FK
         | constraints anyway, so might as well use SQL.
         | 
         | I never had issues with MongoDB performance.
         | 
         | One caveat to this is that I am yet to see a project that needs
         | database sharding in real life, and I have worked on projects
         | with millions of entries in a table and hundreds of writes a
         | minute.
        
           | CodesInChaos wrote:
           | Most relational databases have some level of json support
           | nowadays. So I'm not sure how much it simplifies in that
           | case.
        
             | throw1234651234 wrote:
             | There is appeal to the json being directly searchable in
             | the database, ala MongoDB.
             | 
             | Specifically for analytics/reporting.
             | 
             | The problem is that most analytics/reporting/BI tools SAY
             | they support MongoDB and then they tell you to write a
             | "connector" for each entity, at which point it's easier to
             | just move it to SQL.
        
         | larrik wrote:
         | This advice is fine for programming languages, but not for data
         | stores. Using them incorrectly (or in MongoDB's case and
         | sometimes MySQL's, correctly) could lead to things like data
         | loss or crashed servers. Simply building your product on one of
         | them is not proof they are good enough.
        
           | Geminidog wrote:
           | No, there are programming languages that are just plain bad
           | to use, his philosophy is wrong not only for datastores but
           | for reality in general.
           | 
           | All tools, including programming languages can be bad no
           | matter the skill of the user.
        
             | artificial wrote:
             | Found the person that doesn't like BrainF*ck. /s
        
               | redwall_hp wrote:
               | With Excel as a data store!
        
             | 01acheru wrote:
             | Glad to see that you actually know what is right ;)
             | 
             | All tools can be bad, but that means that probably you
             | cannot use it well or build something good with it. Every
             | good enough tool can be used to build something good
             | enough, or else it wouldn't be good enough. And it's also
             | realistic to assume that every tool popular enough is good
             | enough for something.
             | 
             | But you see, it's starting to get philosophical over here
        
               | rowanG077 wrote:
               | You can create the grand canyon with an infinite supply
               | of plastic spatulas. Does that mean a plastic spatula is
               | a good tool for creating grand canyons? Of course not.
        
         | TomSwirly wrote:
         | It would be more useful if you addressed the specific issues
         | brought up in the article, rather than generically tried to
         | dismiss all articles critical of any programming language...
        
         | [deleted]
        
         | Geminidog wrote:
         | Your post implies that there is no tool on earth that "sucks"
         | and that it's not the tool, it's the person.
         | 
         | It's impossible for EVERY tool to be good. This isn't reality.
         | There has to be tools that are patently bad to use and people
         | have used these bad tools to build great things. But it doesn't
         | change the fact that a tool can be horrible to use.
         | 
         | I would argue that at the time the article was written, Mongo
         | was definitively a bad tool. Things have changed, but not all
         | things.
        
           | aeturnum wrote:
           | > It's impossible for EVERY tool to be good.
           | 
           | What does 'good' mean? Gcc is a good C compiler and a bad
           | Java compiler. It's even worse at being a document database.
           | 
           | I don't think 01acheru was saying that all tools are equally
           | able to do all tasks. I read them as saying that people have
           | used tools with recognized flaws to make good stuff and that
           | being snobby about whether tools are "good" or "bad" in a
           | general way isn't super useful for anyone. Instead, we should
           | say specific things about specific flaws and let others
           | decide if those flaws matter to them.
           | 
           | In particular, this post isn't really saying that MongoDB
           | doesn't work, it's saying that the MongoDB data model isn't
           | useful for what the author was using it for. Even if you are
           | sure that your app was "the perfect use case for MongoDB" all
           | you can really speak about is your use case. The real
           | headline for this article is "we couldn't make MongoDB work
           | and we're skeptical anyone can," which is totally fair, but
           | shies away from the grand claims that 01acheru (and I) are
           | critiquing.
        
           | systemvoltage wrote:
           | Precisely. It is _extremely_ crucial to discuss shortcomings
           | and criticize tools otherwise how are we going to improve if
           | we all roll along singing the tunes of each other and never
           | questioning anything?
           | 
           | This kind of attitude and softness towards criticism - "All
           | tools are great" is not how we need to operate. Professional,
           | well articulated and constructive criticism needs to be on
           | the table.
           | 
           | I downvoted the GP for this reason.
           | 
           | I advise everyone here to listen to criticisms and write them
           | as well. Don't be afraid of some kind of a backlash, express
           | freely.
        
             | 01acheru wrote:
             | Well I guess we are discussing different angles of this
             | matter, I actually didn't think that my comment would go
             | to the top since it was supposed to be a random naive
             | statement.
             | 
             | I'm not about making things that simple by default, but I
             | don't like those absolutist titles like "Never use
             | MongoDB", also because the years since 2013 actually proved
             | the article to be kind of wrong.
             | 
             | "Never do X" is what I tell to children about matters that
             | they wouldn't be able to understand, and making a point
             | like "we used an immature tool that wasn't the best choice
             | for what we were building, and on top of that we used it
             | wrong, so you random guy should never use it for anything,
             | ever" sounds like fearmongering to me and it's not
             | something suitable to my taste.
             | 
             | But I get your downvote :+1:
        
               | systemvoltage wrote:
               | Indeed, never do X is pretty strong and needs equally
               | strong arguments to back it up. But it can be true. There
               | are tools that are superseded by better ones.
               | 
               | Equally, it's important to also tone down "X is nothing
               | but the best" and praises should also require equal and
               | opposite constructivism.
        
               | pdimitar wrote:
               | Criticism must have context so your statement was indeed
               | a bit naive and too generic.
               | 
               | In the case of MongoDB, many people thought that with it
               | you'll have all of the benefits of, say, PostgreSQL,
               | without having to think much about your data schema.
               | History has shown that this is not the case -- as usual,
               | it's about tradeoffs. There are no absolute wins: if you
               | want an RDBMS, you have to put more work in X, if you
               | want a document store then you have to try really hard
               | with Y.
               | 
               | "MongoDB vs. RDBMS" is a very old and tired argument by
               | now but it basically boils down to: people started using
               | MongoDB wide-eyed, optimistically and with more
               | enthusiasm than engineering skill and of course, there
               | were harsh reality checks.
        
               | jeff-davis wrote:
               | I agree that absolute statements like "never" obscure
               | useful discussion.
               | 
               | I would reword your comment as: "If you are building
               | something and excited, then keep going, don't stop
               | because a blog told you 'never'". I think that's what
               | your main point was, and it's a good one.
        
           | alexfrydl wrote:
           | I hate the logic that because people have successfully used
           | something, it is good. Having used MongoDB extensively, it is
           | absolutely true that MongoDB is powerful and useful. However,
           | it is also true that I would never choose it over something
           | else for a new project. It has likely improved since 2013,
           | but so has everything else.
           | 
           | Certainly I could endeavor to build amazing modern software
           | in C, but unless for some reason C is an absolute must, I
           | would rather try any other language first. I doubt this is a
           | controversial statement, and yet someone will always be there
           | to defend the opposite stance.
        
             | daniel-grigg wrote:
             | Now you're just repeating the same fallacy by assuming all
             | these tools have improved at the same rate.
        
           | at-fates-hands wrote:
           | > Your post implies that there is no tool on earth that
           | "sucks" and that it's not the tool, it's the person.
           | 
           | I got more of a "stop complaining about tools. Pick one you
           | like, use it and build something with _IT_ instead" vibe. I
           | feel the same way. Developers are a fickle bunch. One tool
           | works, but then in order to be "cool" you have to bag on it
           | and then propose some other obscure tool you think is better
           | that nobody has ever heard of.
           | 
           | It seriously reminds me of people arguing over music. It's
           | totally uncool to like a mainstream band because _everybody_
           | else likes that band and it's not cool to like them. So then
           | you have all the "cool" people who listen to all the obscure
           | "awesome" bands who Rolling Stone magazine tells you to
           | listen to, so then you go around telling people you listen to
           | the Shithouse Rats. "Oh you've never heard of the Shithouse
           | Rats? Well, they're kind of obscure." and now you're one of
           | the cool kids.
        
             | 01acheru wrote:
             | Yeah that was my point, 100%!
             | 
             | And by the way I really like your music analogy, and it's
             | emblematic of something even larger: they read about
             | Shithouse Rats on Rolling Stone, named after a song of Bob
             | Dylan or Muddy Waters or the group itself (don't know which
             | one) and all of them are quite famous and mainstream.
             | 
             | You've got to love mankind, we are awesome!
        
             | skrtskrt wrote:
             | I feel there can be a fair reason behind this, which is
             | basically that using a tool no one else uses, even if it's
             | great for the use case, is likely to be a losing battle.
             | 
             | Hard to get using it approved at work, less usage means
             | fewer bugs are caught and features developed, etc.
             | 
             | So people evangelize their favorite tools because it
             | benefits them directly to have them adopted more widely.
        
             | Geminidog wrote:
             | Music implies sort of an everything is just an opinion
             | thing so your analogy does not fit.
             | 
             | Think of it like horse drawn wagon vs. a car. There might
             | exist a guy who in his humble opinion thinks the wagon is
             | better so he uses it to get places instead of a car but is
             | that guy a reasonable guy? No.
             | 
             | The analogy I mentioned above is more apt because Mongo was
             | indeed at one point in time more of a wagon rather than an
             | unpopular piece of music.
        
             | jimbokun wrote:
             | This is usually true.
             | 
             | But every once in a while you have a case like WhatsApp,
             | which sold to Facebook at a price of $500 million per
             | engineer, which never could have happened without Erlang.
        
           | karmakaze wrote:
           | > at the time the article was written, Mongo was definitively
           | a bad tool
           | 
           | I concur. Version 3.0, dated March 3, 2015, introduced the
           | WiredTiger engine, which fixes much of the brokenness.
           | 
           | I did some workaround work on a MongoDB v2.x app. It did suck
           | and was inconvenient operationally, but it also did scale so
           | had its uses.
           | 
           | However the discussion now should be about how it is to be
           | used/not today and not back in 2013. So fair to say it did
           | suck or you shouldn't have used it, but that doesn't have
           | much relevance.
        
           | pc86 wrote:
           | They never said that every tool is good. They didn't even
           | imply it.
        
           | [deleted]
        
           | worik wrote:
           | "There has to be tools that are patently bad to use and
           | people have used these bad tools to build great things."
           | 
           | Don't those tools die out and get forgotten?
        
             | pdimitar wrote:
             | No. Sticking to what is known beats innovation most of the
             | time.
             | 
             | There is no correlation between something being widely used
             | and it being good at its job.
        
         | gher-shyu3i wrote:
         | It really depends. Things can be built despite the
         | technology. I saw this especially at an employer who was
         | heavily invested in golang. Countless times I've thought to
         | myself that they wouldn't be having the issues they were
         | having, sunk costs, reinventing the wheel, etc. if they used
         | a proven technology like the JVM instead of drinking the kool
         | aid and using the latest fad of the day. Tons of money was
         | sunk into it, and it was made to work by force, but not
         | everyone would be able to bear that cost, and it still causes
         | massive inefficiencies due to poor tooling.
        
         | Geminidog wrote:
         | >No tool will fix wrong assumptions or bad design, we can dive
         | into philosophy here but I'm more of a practical person so...
         | :)
         | 
         | And no design can fix a bad tool, we can dive into the
         | practicalities here but I'm more of a philosophical person
         | so...:)
        
         | jimbokun wrote:
         | > "Why JS sucks", "Never use PHP", "Java is enterprise only",
         | "Ruby only works on hobby projects"
         | 
         | All of these have at least a little bit of truth to them, and
         | you should know about the downsides of technologies, even if
         | you decide to use them anyways.
        
       | dudeinjapan wrote:
       | MongoDB + Ruby sparks joy for me. It's come a long way since 2013
       | and latest features like transactions (though nowhere near SQL
       | level of robustness) are enough for my use cases. To each his or
       | her own.
        
       | jonstaab wrote:
       | > On my laptop, PostgreSQL takes about a minute to get
       | denormalized data for 12,000 episodes, while retrieval of the
       | equivalent document by ID in MongoDB takes a fraction of a
       | second.
       | 
       | What? Her database can't possibly be indexed properly.
        
         | kulig wrote:
         | I remember reading some of her comments a while ago and she
         | seems like a pretty arrogant person. Probably doesn't know how
         | to use postgres properly.
        
         | joshxyz wrote:
         | in most cases it could also be 1.) how the user writes the
         | code, and 2.) how the db api / library was coded
        
         | mywittyname wrote:
         | Yeah, this sounds like a design defect. But since the author
         | doesn't really describe what they did, it is hard to really
         | figure it out. I'm guessing this is some sort of query with a
         | self-join going on, where the mongo request is a basic fetch by
         | id.
        
           | mbreese wrote:
           | The use of the term "denormalized" suggests to me that it was
           | a query that involved a lot of joins. Which is certainly
           | something that could have been otherwise addressed with a
           | different design.
           | 
           | Comparing fetching from a normalized design to a denormalized
           | one isn't really a fair comparison.
        
             | dathinab wrote:
             | I fear the problem is that the person doesn't do joins but
             | separate sub-queries which then get recombined in RAM in
             | the client. Given the software stack described there is a
             | realistic chance of this happening implicitly due to ORM
             | mappers.
             | 
             | But then, given the way tables don't map well to 1-to-many
             | mappings, and joins still returning data as flat tables,
             | this can also be a problem. Especially if a large field gets
             | duplicated a lot. RDBMSs really should go from 2D tables to
             | proper nested types for _results_ IMHO.
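
A minimal sketch of the duplication described above, using Python's built-in sqlite3 rather than Postgres. The `shows` and `episodes` tables and their contents are hypothetical; nothing here comes from the article's actual schema. It shows a join repeating the parent's large field once per child row, and the client folding the flat rows back into a nested shape.

```python
import sqlite3

# Hypothetical schema: one show with a large text field and three episodes.
SYNOPSIS = "a long synopsis ..."
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE shows (id INTEGER PRIMARY KEY, title TEXT, synopsis TEXT)")
conn.execute("CREATE TABLE episodes (id INTEGER PRIMARY KEY, show_id INTEGER, name TEXT)")
conn.execute("INSERT INTO shows VALUES (1, 'Show A', ?)", (SYNOPSIS,))
conn.executemany(
    "INSERT INTO episodes VALUES (?, ?, ?)",
    [(10, 1, "Pilot"), (11, 1, "Second"), (12, 1, "Third")],
)

# A join returns a flat table: the parent's columns repeat once per child row.
rows = conn.execute(
    """
    SELECT s.id, s.title, s.synopsis, e.name
    FROM shows s JOIN episodes e ON e.show_id = s.id
    ORDER BY e.id
    """
).fetchall()

# Count how many times the large field was shipped to the client.
synopsis_copies = sum(1 for row in rows if row[2] == SYNOPSIS)

# The client then has to fold the flat rows back into a nested structure.
nested = {}
for show_id, title, synopsis, episode_name in rows:
    show = nested.setdefault(
        show_id, {"title": title, "synopsis": synopsis, "episodes": []}
    )
    show["episodes"].append(episode_name)

print(synopsis_copies)
print(nested[1]["episodes"])
```

With three episodes the synopsis travels three times, which is the cost the comment points at when the repeated field is large.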
        
               | fabian2k wrote:
               | That would be nice, though I suspect that it is really
               | much more complicated than it seems. You can emulate this
               | to some extent in Postgres with the various JSON
               | functions and essentially return a tree from a single
               | query. But my experience was that I quickly got to a
               | point where the query plan got really complex and
               | planning time started to dominate.
        
         | wetmore wrote:
         | Her
        
           | jonstaab wrote:
           | Thanks, corrected
        
         | fabian2k wrote:
         | Even without an index that sounds too long (though obviously
         | hardware and Postgres itself both have come a long way since
         | 2013). At 12,000 rows even a brute force query should be quick.
         | 
         | I would suspect something like bad statistics or some other
         | reason that caused a pathological query plan. In any case this
         | is not a good representation of any potential performance
         | difference between Postgres and MongoDB.
        
           | jonstaab wrote:
           | I could see it taking that long if she was using a full movie
           | database with millions of rows in it.
        
         | luhn wrote:
         | Author is using Rails, so my guess would be the bottleneck is
         | ActiveRecord. I've never used ActiveRecord, so I can't speak to
         | it directly, but in my experience when dealing with large
         | numbers of records in an ORM (and author easily is working with
         | hundreds of thousands), things grind to a halt, even with eager
         | loading. There's a lot more overhead to create thousands of ORM
         | objects than it is to serialize an equivalent chunk of BSON.
        
           | treeman79 wrote:
           | Bulk operations are where a lot of Rails programmers struggle.
           | 
           | Active record is awesome in many ways, but it can shoehorn
           | you into n+1 solutions.
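
For readers unfamiliar with the n+1 pattern mentioned above, here is a rough sketch in Python with the built-in sqlite3 module. It is not ActiveRecord; the tables, the `CountingRunner` helper, and both loader functions are hypothetical names made up for illustration. It only contrasts one query per parent row with a single batched query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE shows (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE episodes (id INTEGER PRIMARY KEY, show_id INTEGER, name TEXT);
    INSERT INTO shows VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO episodes VALUES (1, 1, 'a1'), (2, 1, 'a2'), (3, 2, 'b1');
    """
)


class CountingRunner:
    """Hypothetical helper that counts how many queries are issued."""

    def __init__(self, connection):
        self.connection = connection
        self.queries = 0

    def run(self, sql, params=()):
        self.queries += 1
        return self.connection.execute(sql, params).fetchall()


def load_n_plus_one(db):
    # One query for the parents, then one more query per parent row.
    result = {}
    for show_id, title in db.run("SELECT id, title FROM shows ORDER BY id"):
        rows = db.run(
            "SELECT name FROM episodes WHERE show_id = ? ORDER BY id", (show_id,)
        )
        result[title] = [name for (name,) in rows]
    return result


def load_batched(db):
    # One query for the parents, one query for all children, grouped in memory.
    titles = dict(db.run("SELECT id, title FROM shows ORDER BY id"))
    result = {title: [] for title in titles.values()}
    placeholders = ",".join("?" for _ in titles)
    rows = db.run(
        f"SELECT show_id, name FROM episodes "
        f"WHERE show_id IN ({placeholders}) ORDER BY id",
        list(titles),
    )
    for show_id, name in rows:
        result[titles[show_id]].append(name)
    return result


slow, fast = CountingRunner(conn), CountingRunner(conn)
slow_result = load_n_plus_one(slow)
fast_result = load_batched(fast)
print(slow.queries, fast.queries)
```

Both loaders return the same data; the first issues one query per show plus one, the second always issues two, which is the difference that grows with the number of parent rows.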
        
       | alpineidyll3 wrote:
       | For the record we use mdb in production, and it's been fine.
        
       | franklyt wrote:
       | SQL is a solution to a problem in the vein of prematurely
       | optimizing for many use cases where mongodb is called for.
        
       | petepete wrote:
       | I first read this article while working on a project where the
       | company had basically written a RDBMS using MongoDB. It was so
       | many different kinds of bad I lost count.
        
         | throwawayboise wrote:
         | That's the problem with using document databases when you
         | really need an RDBMS. You end up reimplementing an RDBMS,
         | badly.
        
           | mywittyname wrote:
           | It's so painful to come onboard projects that should have
           | been designed originally in an RDBMS, but that was never a
           | consideration because, "they are slow." People fight the
           | migration tooth and nail until it becomes nearly impossible
           | to move forward.
        
       | a13n wrote:
       | We've used MongoDB at our SaaS and have grown to well over $1m
       | ARR and never had an issue.
       | 
       | Maybe if you're trying to build a massive ($B) company, starting
       | with PostgreSQL makes more sense for you. For everything else,
       | MongoDB works just fine.
       | 
       | Just use the technologies that you know and can move fastest
       | with. Startups rarely succeed/fail because of which technologies
       | you choose to use.
        
         | bdcravens wrote:
         | If your data store is nothing but a persistence layer for an
         | application, perhaps that makes sense.
         | 
         | In many companies the need to regularly access the database for
         | analytics and BI is a thing; this isn't limited to $B
         | companies. Most of the tooling available works best with SQL
         | databases. (Though the BI connector at
         | https://docs.mongodb.com/bi-connector/current/ looks
         | interesting for this purpose)
        
       | nojvek wrote:
       | I think there are two separate aspects that get conflated into
       | one.
       | 
       | 1) Document database - rather than a strict rigid schema, you can
       | store nested json documents in tables/collections. Or the idea of
       | soft schema where the whole database doesn't need to be blocked
       | for a schema change and you have some leeway in integrity.
       | 
       | 2) Relational database - Ability to make complex sql queries that
       | join data from multiple tables.
       | 
       | Mongodb has some support for joining but it doesn't have a sql
       | variant. If your data is mostly key:val store then it's great.
       | You can shard it, and have replicas. It's easy to make a fast
       | reliable backend with mongodb. Many popular sites run on mongodb
       | backend.
       | 
       | However with the new json types in MySQL and Postgres, they too
       | have support for inserting documents and querying subkeys. They can
       | be sharded and replicated (albeit with a bit more configuration).
       | 
       | Couchbase, which is like mongo (in its document store
       | capabilities), has N1QL, which offers the agility of SQL and the
       | flexibility of JSON.
       | 
       | So like any tool, it has its tradeoffs.
       | 
       | Then again kudos to the author for evoking our reptilian brains:
       | "Never use MongoDB" incites emotions and gets you on top of HN.
       | If it was called "When to use MongoDB", it wouldn't get the same
       | reaction.
        
       | ashtonkem wrote:
       | I'll never ever use MongoDB again, because every single time I've
       | ended up running a cluster it has always been the most troubled
       | part of my stack. I've been burned _way_ too many times to ever
       | consider touching that stove again.
        
         | ryanianian wrote:
         | What do you mean by "troubled," and what docs did you follow to
         | set up and run the cluster?
         | 
         | Caring for and feeding a database (of any type) with any type
         | of HA has a learning curve. Hence the growing number of PaaS
         | services that handle the setup and maintenance for you (AWS's
         | RDS, Mongo's Atlas, etc)
        
           | cj wrote:
           | MongoDB's own management software (Cloud Manager, Ops
           | Manager) is littered with bugs to the point where it's nearly
           | unusable for certain actions. (One of which being restoring
           | backups)
           | 
           | It was really, really bad 3-4 years ago. Regularly entering
           | irrecoverable error states while performing basic management
           | operations via MongoDB's management GUI. I've noticed a
           | significant improvement in the past ~1 year.
        
           | ashtonkem wrote:
           | By troubled I mean it would regularly go into what I called
           | "three stooges" mode where each node swore that another was
           | the master. Of course this would mean that writes would stop
           | dead in their tracks.
           | 
           | Given that this happened to me across companies, teams, and a
           | half decade of time, I've decided that this is a case where
           | the problem is Mongo and not me.
        
             | ryanianian wrote:
             | The only constant there is you which means it is perhaps
             | you.
             | 
             | Setting up and caring for a DB cluster is a complicated
             | thing, to the point of there being non-BS certification
             | courses for nearly every major HA database including Mongo.
             | It very well could be that there was a well-documented flag
             | that you never learned.
             | 
             | This doesn't mean you're being unreasonable. I'd be cranky
             | with any DB getting itself in a split-brain scenario, but
             | my conclusion wouldn't be a bug in the software but rather
             | it's a bug in my understanding. It's worth noting that
             | getting in this state should either be impossible, or it
             | should be obvious about how it arrived at such a state with
             | links to relevant documentation.
             | 
             | (There's also little incentive to make it super easy to run
             | in production on your own. They sell that as a service
             | after all. It wouldn't surprise me if the product had
             | invariants that assumed production-level configurations and
             | that nobody's tested with whatever configs ended up making
             | it go nuts.)
        
       | eecc wrote:
       | Ok, so besides the technical faults of Mongo within the context
       | of its category, what is the ideal use-case of a document
       | oriented store?
       | 
       | If you use metadata documents to model your relations you might
       | get away with the most dangerous foot guns, but then why not jump
       | straight into graph databases?
        
       | fortran77 wrote:
       | But it's web scale!
       | 
       | http://www.mongodb-is-web-scale.com/
        
       | jasondc wrote:
       | This article from 8 years ago highlights how far MongoDB has
       | come: transactions, left outer joins ($lookup), etc.
        
         | ashtonkem wrote:
         | The issue is that MongoDB isn't chasing a fixed target. RDBMS
         | have gotten better in the meantime.
        
         | nickkell wrote:
         | How many years before it reaches feature parity with a
         | traditional RDBMS though? And when will it get a query language
         | as good as SQL?
         | 
         | That said, I will admit the change streams feature is amazing.
         | That completely changed the way I thought about building
         | reactive applications
        
       | leke wrote:
       | > But there are actually very few concepts in the world that are
       | naturally modeled as normalized tables. We use that structure
       | because it's efficient, because it avoids duplication, and
       | because when it does get slow, we know how to fix it.
       | 
       | Urmmm...How?
        
       | axegon_ wrote:
       | Despite having used document oriented databases for many
       | years (largely because they were shoved down my throat and I
       | inherited someone else's architecture), I never really managed to
       | figure out why people find them so compelling. There has been a
       | shift in the last two years and people have started running away
       | from them. Specifically the web-dev crowd adored them and I guess
       | it's easy to fetch a document in the exact structure you need it
       | but sooner or later you inevitably reach the point where you have
       | to analyze data. And here mongo (and all the similar alternatives)
       | becomes the biggest pain in the a...neck you can think of.
       | Couchbase tried to tackle this issue with n1ql to a certain
       | degree but at large scale it is still not particularly useful. To
       | my mind, having a relational database which has a good
       | architecture can't be matched by any document oriented database.
       | But getting a large system/database right does take more effort.
       | There are numerous ways to make relational databases incredibly
       | scalable but again, it takes a lot more effort.
        
         | fendy3002 wrote:
         | It's amazing for three things: search, logging and draft
         | records.
         | 
         | Search: mongodb can do an $all query, which is hard to
         | replicate at the SQL level without aggregation. However I'm still
         | waiting for aggregate-level $elemAt.
         | 
         | Logging, you can attach anything to a property, then it'll be
         | queryable.
         | 
         | Draft records, it's easy to just insert and insert the records
         | because it's schema-less. Validate during creation and validate
         | again during publishing or approval. It's queryable and you can
         | use a generic collection for that.
         | 
         | For logging and draft records, sql JSON field may be able to
         | handle them, though I don't know how good it is at querying.
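
A small sketch of the `$all` comparison made above, under stated assumptions. The Mongo side is emulated in plain Python (no driver and no server), and `matches_all` is a hypothetical helper that covers only the array-contains-every-value case. The SQL side uses sqlite3 with an assumed `doc_tags` table and needs `GROUP BY ... HAVING`, which is the aggregation the comment refers to.

```python
import sqlite3

REQUIRED = ["db", "nosql"]


def matches_all(document, field, required):
    """Simplified emulation of {field: {"$all": required}} for array fields."""
    return set(required).issubset(document.get(field, []))


# Hypothetical documents with an array field.
docs = [
    {"_id": 1, "tags": ["db", "nosql", "json"]},
    {"_id": 2, "tags": ["db", "sql"]},
    {"_id": 3, "tags": ["nosql", "db"]},
]
doc_hits = [d["_id"] for d in docs if matches_all(d, "tags", REQUIRED)]

# The same data normalized into one row per (document, tag) pair.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE doc_tags (doc_id INTEGER, tag TEXT)")
conn.executemany(
    "INSERT INTO doc_tags VALUES (?, ?)",
    [(d["_id"], tag) for d in docs for tag in d["tags"]],
)

# The relational version: filter to the wanted tags, then keep only the
# documents that matched every one of them.
sql_hits = [
    row[0]
    for row in conn.execute(
        """
        SELECT doc_id
        FROM doc_tags
        WHERE tag IN (?, ?)
        GROUP BY doc_id
        HAVING COUNT(DISTINCT tag) = ?
        ORDER BY doc_id
        """,
        (*REQUIRED, len(REQUIRED)),
    )
]

print(doc_hits, sql_hits)
```

Both approaches select documents 1 and 3 here; the difference is mostly in how much query machinery is needed to say it.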
        
         | jacobsenscott wrote:
         | There was a time where adding a column to a database was a
         | really big deal. You had to get it past the DBA, and there were
         | real resource constraints on the database system. With a
         | document store the schema is entirely in the hands of the
         | developer.
         | 
         | Also JSON became the standard way to ship data around, and
         | RDBMSs of the time couldn't really handle JSON. So you
         | either write a bunch of code to map complex nested JSON to
         | relational tables, or just dump it into an un-indexable text
         | column.
         | 
         | There was vendor hype, just like there was around Object
         | databases in the pre-internet days.
         | 
         | If you were starting a new project you needed to decide if you
         | were going to use a document store and an RDBMS or just one or
         | the other. If it was just one you would choose a document store
         | if you anticipated you would need to handle a lot of
         | unstructured data.
         | 
         | Today the situation is reversed. A document store only does
         | documents well. A good hybrid database like postgres gives you
         | the best of both worlds. Throw in hosted database services and
         | resource constraints are much less of an issue. So people
         | aren't running back to an old school RDBMS. They are moving to
         | a much superior and evolved data store.
        
           | axegon_ wrote:
           | That's only partially true. With document schemas, you simply
           | eliminate the DBA since whatever you put in there is entirely
           | up to you. In all fairness I've never dealt with DBAs - I've
           | always managed to get a technological freedom and be able to
           | design and organize my databases in whichever way I see fit.
           | I'd generally hate to have to ask someone to clone a table
           | for me or whatever.
           | 
           | JSON is the standard way to ship data around the internet,
           | yes. Though grpc is catching up and more and more often I see
           | people relying on grpc in their architecture. And grpc
           | conceptually is a lot closer to an RDBMS, given that you have a
           | code generation step and everything in your data needs to be
           | defined (aka statically typed).
           | 
           | Recently I started several personal projects and though I
           | struggle to find time and motivation to work on them on my
           | own, document related databases are completely out of the
           | question. postgres and potentially redis as a proxy for heavy
           | loads and that's that. I wouldn't call postgres a hybrid
           | database. It does support json datatypes natively but at its
           | core it is the definition of what RDBMSs are. The best example
           | for a hybrid database (from a developer's perspective since it
           | isn't open source and I do not work for google in any shape
           | or form) is spanner.
        
           | dgb23 wrote:
           | I fairly recently _really_ started to understand how
           | important historical reasoning and understanding is in the
           | context of software, technology and science. Your comment is
           | a great example. Tech developments, choices, trends and so on
           | only really make sense in the context of history. And often
           | we forget about history, start to reinvent things or even
           | steer into a completely useless direction because we don't
           | apply temporal reasoning or simply don't learn from the past.
           | 
           | Another benefit of this kind of approach is starting to learn
           | about a challenging subject. Say you want to deepen your
           | knowledge in a branch of mathematics that you find
           | interesting and useful. The history of that branch will tell
           | you so much more than a typical lecture-style conglomerate of
           | concepts. It provides a great overview of important actors,
           | their relationships, cause and effect of discoveries, the
           | culture, the problems and so on. On top of that it is easier
           | to remember and internalize concepts if you know the story
           | behind them.
        
           | ashtonkem wrote:
           | > There was a time where adding a column to a database was a
           | really big deal. You had to get it past the DBA, and there
           | were real resource constraints on the database system. With a
           | document store the schema is entirely in the hands of the
           | developer.
           | 
           | That time is still here if you're running enough read nodes
           | and QPS.
        
         | hnarn wrote:
         | > I never really managed to figure out why people find them so
         | compelling.
         | 
         | This might sound jaded but my feeling is that a lot of
         | developers just looked at JSON objects that they were already
         | working with and thought to themselves "actually, it would be
         | cool to just store this directly".
         | 
         | Which, in itself, isn't a bad idea but writing a completely new
         | solution from scratch to a problem that's been solved for
         | decades seems a bit like hubris.
         | 
         | AFAIK many relational databases support JSON today, so I'm not
         | sure what the argument would be to choose something like
         | MongoDB today from scratch if you had the choice of anything.
        
         | w0m wrote:
         | > why people find them so compelling
         | 
         | My theory is that it's easy to add a field by adding logic into
         | the app instead of munging table relationships. Moves the
         | logic to where developers are more comfortable. Scalability/etc
         | is irrelevant for most use cases anyway.
        
           | hnarn wrote:
           | > Scalability/etc is irrelevant for most use cases anyway.
           | 
           | I literally can't parse what you mean by this
        
         | whatsmyusername wrote:
         | I have found 2 use cases, one of which I've never actually seen
         | in the wild.
         | 
         | The most common use case is, "I need to store data where the
         | schema is unknown or can change without notice, and have my
         | shit not break." This is what we used Mongo for.
         | 
         | The other use case I could see (and this is pretty much only
         | with Dynamo) is, "I want to build an application that's cross-
         | region native. Most of my data is relatively static, so I
         | accept eventual consistency on changes. I will have a separate
         | data store for transactional data and data that cannot be
         | eventually consistent." I want to build this project, but it
         | will never happen because it's too easy to RDBMS in a single
         | region to start.
        
         | __float wrote:
         | I have found myself enjoying using a document database as the
         | online store, and then using a 'big data solution' (we use
         | Presto) for any analytics queries later.
         | 
         | Traditional migrations for relational databases are really
         | painful. Document databases make this much easier, and if
         | you've faced the operational pain of needing to migrate a large
         | database (for example, it's so easy to accidentally lock an
         | entire table in Postgres), you might be pretty compelled.
         | 
         | (That said, I think the pendulum is swinging back away from
         | document databases. So you're in luck ;))
        
           | mohaine wrote:
           | How do you normalize your Json structure so it can be
           | queried? Do you enforce a schema on your JSON or do you morph
           | it on export into a common structure.
           | 
           | If you enforce a schema on the Json structure how do you
           | handle the changes on the live system?
        
           | thehappypm wrote:
           | I think the parent is arguing that it's hard to do useful
           | big-data analysis on highly nested structures of data. When
           | your database is storing some immense JSON blob, it's hard to
           | write SQL against it.
        
         | e12e wrote:
         | > Despite having used document oriented databases for many
         | years(largely because they were shoved down my throat and I
         | inherited someone else's architecture), I never really managed
         | to figure out why people find them so compelling.
         | 
         | Well, filesystems are pretty good. It's the only document store
         | I use (and mostly enjoy).
         | 
         | But then you look at the trade-off with something like just
         | Maildir, and you really start to wonder if this schemaless
         | document store thing is so great?
         | 
         | I suppose the real shame is that proper object dbs like zodb or
         | gemstone get much less attention - they too have big trade-offs
         | - but I feel they at least give back in terms of consistency
         | and simplicity.
        
       ___________________________________________________________________
       (page generated 2021-02-01 23:03 UTC)