[HN Gopher] Thou shalt not run a database inside a container
       ___________________________________________________________________
        
       Thou shalt not run a database inside a container
        
       Author : swyx
       Score  : 38 points
       Date   : 2021-01-13 13:16 UTC (9 hours ago)
        
 (HTM) web link (patrobinson.github.io)
 (TXT) w3m dump (patrobinson.github.io)
        
       | rad_gruchalski wrote:
       | It works perfectly fine with REX-Ray + EBS-like volumes, or
       | Kubernetes PVs backed by EBS-like volumes. So why not?
        
       | lawwantsin17 wrote:
       | 2016 article? Is this still broken in 2021?
        
       | dastx wrote:
       | I remember when this article came up a long time ago. We had just
       | set up an in-house Kubernetes cluster running on AWS.
       | 
       | We were attempting to run Cassandra on top of it and let's just
       | say the majority of our time for a few months was spent fine
       | tuning the setup (iirc this was early days StatefulSets). In the
       | end we gave up and went to EC2 for Cassandra.
       | 
       | Persistent volumes have come a long way, and container technology
       | in general has rapidly evolved. I have yet to try any databases
       | on K8s in a production environment since that attempt, but I
       | don't believe that it's as bad as it was back then.
       | 
       | Many people are scared of it; one of the biggest reasons usually
       | given is that containers are meant to be ephemeral. While
       | containers allow applications to be ephemeral, and quite
       | frankly make that a lot easier, I don't believe that they're meant
       | to be ephemeral. At the end of the day, containers are just
       | somewhat isolated processes. Processes don't need to be
       | ephemeral, and often aren't. With the right setup, databases can
       | be run from containers, and I believe the rapid evolution of
       | container technology has allowed for this to be possible today.
        
         | mywittyname wrote:
         | What's the functional benefit of using containers over EC2
         | images for a database? I can't see why anyone would reasonably
         | consider trying such a thing.
         | 
         | Containers make a lot of sense when you're the one developing
         | the software, and you have a lot of micro-services to support,
         | none of which really scale to EC2 levels of utilization.
         | 
         | But EC2s seem better suited to running pre-installed software
         | that can, and will, fully utilize an entire VM. Such as a
         | database.
         | 
         | My intuition on this subject is, docker is the latest, trendy
         | hammer and so everything becomes a nail / container. Maybe
         | people aren't familiar with the tooling around EC2 image
         | creation and deployment.
        
         | qeternity wrote:
         | Containers are no more ephemeral than any other process. People
         | who say this I think are just saying that because of default
         | docker storage settings.
        
       | ghshephard wrote:
       | I'm very familiar with an environment that has about 120 TByte of
       | databases - Mongo and Postgres - 100% run in containers over
       | about 75 namespaces in GKE. Handles being restarted without
       | issue.
        
         | yen223 wrote:
         | What benefits do they get from running databases in Kubernetes?
        
           | ghshephard wrote:
           | Once you containerize every part of your application stack,
           | you can simplify support/deployment to a single model, and
           | you get the advantages of everything that containerization
           | has to offer - dynamic scaling, robust recovery, trivial
           | application migration, etc... without having to build in
           | special rules/processes for one-off elements. The database
           | then becomes yet one more component in your environment that
           | isn't treated any differently than any other component - with
           | the possible exception of requesting that it be scheduled
           | less ephemerally than other components.
           | 
           | I've seen it work in production for 3+ years, and I don't
           | recall there ever being an issue with the databases being in
           | a container - it's hard to imagine them being anywhere else.
        
       | randompwd wrote:
       | Why wouldn't you put the 2016 date in the title? That's standard
       | courtesy.
        
       | phnofive wrote:
       | (2016)
       | 
       | 2017 Followup: https://patrobinson.github.io/2017/12/16/should-i-
       | run-a-data...
       | 
       | Tone softens a bit, but still comes to the conclusion it's not a
       | good fit. Four years on, I can't imagine a benefit to running a
       | prod DB out of a container either.
        
         | ghshephard wrote:
         | The advantage is you deploy your entire application stack the
         | same way you would anything else. Zero effort to roll out new
         | environment. And if you need to scale it up, you just increase
         | your k8s request/limit on CPU/Memory and restart it and it gets
         | scheduled to a node as appropriate.
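
The scale-up step described in this comment (raise the CPU/memory
request and limit, then let the pod restart) can be sketched as a patch
body built in Python. This is only an illustration under assumptions:
the container name "postgres" and the sizes are hypothetical, and the
resulting JSON is meant as the kind of document one could hand to
`kubectl patch` for a StatefulSet or Deployment, not as the commenter's
actual tooling.

```python
import json


def resources_patch(container, cpu, memory):
    """Build a patch body that sets requests and limits for one container."""
    # Requests and limits are set to the same values here for simplicity.
    resources = {
        "requests": {"cpu": cpu, "memory": memory},
        "limits": {"cpu": cpu, "memory": memory},
    }
    # The pod template lives under spec.template.spec; the containers list
    # is matched by container name when applied as a strategic merge patch.
    return {
        "spec": {
            "template": {
                "spec": {
                    "containers": [{"name": container, "resources": resources}]
                }
            }
        }
    }


# Hypothetical container name and sizes, for illustration only.
print(json.dumps(resources_patch("postgres", "4", "16Gi"), indent=2))
```

Because the change touches the pod template, the controller replaces
the pod, and the scheduler picks a node with enough room for the new
request.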
        
         | knowhy wrote:
         | I don't see the benefit either. I believe Netflix made this
         | idea popular with sidecar containers as a way to add
         | 
         | > "non-intrusive platform capabilities" [0]
         | 
         | to container stacks.
         | 
         | I understand that the idea might be popular among developers as
         | it is easier to just add a database container to your stack
         | rather than dealing with the db admin. I don't know if Netflix
         | ever recommended databases as sidecar containers. But I have
         | seen it in the wild where devs followed the Netflix model. I
         | sometimes hear people arguing that they have to manage
         | containers anyway so it would be less overhead to manage the db
         | as a container as well.
         | 
         | 0: https://netflixtechblog.com/prana-a-sidecar-for-your-
         | netflix...
        
       | thehappypm wrote:
       | The beauty of containers is how ephemeral they are -- need more?
       | Here's more. Need less? Bye. Oh, this one died? Kill it.
       | 
       | Terrible fit for databases, inherently.
        
         | boardwaalk wrote:
         | You're making a feature into a disadvantage when you could just
         | not use the feature.
        
         | slaymaker1907 wrote:
         | That's only one way of looking at containers. Another way is to
         | see them as a modularization tool since they abstract most of
         | the machine away while being much lighter weight than a VM.
         | 
         | I've recently been looking into containerizing my personal
         | server. I use the cheapest VM on AWS (I work on Azure now, but
         | vendor lock in is real). Adding another VM would double my
         | costs, but with containers I would get more isolation and can
         | upgrade sites I run on this server independently.
         | 
         | I usually use sqlite, but I can see why containers would be
         | nice for similar reasons. Even if you aren't resource
         | constrained with VMs, running a DB via containers might be nice
         | to ensure a consistent dev setup even if you don't use
         | containers in prod.
        
           | jasonpeacock wrote:
           | I do this using Lightsail to run my personal blogs; each blog
           | is running in its own container w/ sqlite mounted from a local
           | directory for persistence (too lazy to do actual Docker volumes).
           | 
           | Ditto for Nginx, it's also containerized.
           | 
           | It's a great setup b/c I can also deploy the containers
           | locally (and SCP the sqlite DB locally) to run the blogs on
           | my laptop if I need to.
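
A minimal sketch of the setup described in this comment, assuming the
host directory is bind-mounted into the container (for example with
something like `docker run -v /srv/blog-data:/data ...`). The
`BLOG_DATA_DIR` variable, the `/data` path, the `blog.db` file name, and
the `posts` table are all hypothetical names chosen for illustration.

```python
import os
import sqlite3


def open_blog_db(data_dir=None):
    """Open (or create) the blog database inside the mounted data directory."""
    # The directory is expected to be a host path mounted into the container,
    # so the file outlives any individual container.
    data_dir = data_dir or os.environ.get("BLOG_DATA_DIR", "/data")
    os.makedirs(data_dir, exist_ok=True)
    conn = sqlite3.connect(os.path.join(data_dir, "blog.db"))
    conn.execute(
        "CREATE TABLE IF NOT EXISTS posts ("
        "id INTEGER PRIMARY KEY, title TEXT NOT NULL)"
    )
    conn.commit()
    return conn
```

Since all state sits in one file under the mounted directory, replacing
the container or copying the file to a laptop leaves the data intact.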
        
         | ghshephard wrote:
         | But, don't you want to code your application stack such that
         | your database being killed is a no-op? Shouldn't the state
         | mostly reside in your persistent store (which decidedly needs
         | to be robust)?
        
           | kjeetgill wrote:
           | Maybe I'm not newfangled enough but isn't the database your
           | persistent store? If not, maybe this is where the db in vs
           | out of container folks are talking past each other.
        
             | ghshephard wrote:
             | I'm calling out the difference between the database server
             | and the storage mechanism (SSDs, block storage, etc.), where
             | transactions and journal logs are dumped so as to allow
             | recovery if the database server goes down.
             | 
             | Once you design your database
             | server/application/transactions such that they don't care
             | whatsoever if the database server goes offline - you not
             | only get the ability to run fine in containers, you also
             | make your application stack just significantly more robust
             | for all sorts of reasons.
             | 
             | I have colleagues who bemoan a major application server
             | going down - which is a no-op in my world where the
             | ephemeral nodes rarely last more than a day or two in GKE -
             | so we're completely used to them going down.
             | 
             | Building transactions, idempotency, tasks in
             | rabbitmq/celery rather than memory - not only do they make
             | deploying in containers straightforward, they also give an
             | overall robustness win for all applications.
             | 
             | I mean, _physical_ servers go down as well.
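
The combination of transactions and idempotency mentioned in this
comment can be sketched roughly as follows. This is an illustration
under assumptions, not the commenter's code: it uses Python's built-in
`sqlite3` in place of a real database server, and the table names,
`make_db`, and `apply_payment` are hypothetical. The idea is that a
task redelivered after a crash or restart is detected by its ID and
becomes a no-op.

```python
import sqlite3


def make_db(path=":memory:"):
    """Create the demo schema: one account plus a ledger of processed task IDs."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS accounts ("
        "name TEXT PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS processed_tasks (task_id TEXT PRIMARY KEY)"
    )
    conn.execute("INSERT OR IGNORE INTO accounts VALUES ('alice', 0)")
    conn.commit()
    return conn


def apply_payment(conn, task_id, account, amount):
    """Credit the account at most once per task_id, even if the task is retried."""
    try:
        with conn:  # one transaction: commit on success, roll back on error
            # The primary key on task_id makes a redelivered task fail here,
            # before the balance is touched.
            conn.execute(
                "INSERT INTO processed_tasks (task_id) VALUES (?)", (task_id,)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, account),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # already applied earlier; the retry changes nothing
```

A queue worker (for instance a Celery task) could call `apply_payment`
with the message ID as `task_id`, so that a redelivery after a node is
killed does not double-apply the change.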
        
           | slaymaker1907 wrote:
           | While this is somewhat true, most databases are not designed
           | for _fast_ recovery.
        
       | slaymaker1907 wrote:
       | I help support SQL Server on Linux for which we provide a number
       | of official containers. There are some limitations, but these
       | have been pretty useful for a number of customers. Also, my
       | understanding is that volumes are pretty mature these days.
        
       | qeternity wrote:
       | 4 years old. The world is completely different today. We run a
       | number of HA Postgres setups on k8s and it works beautifully.
       | Local NVMe access with elections backed by k8s primitives.
        
       | wwarner wrote:
       | Isn't Vitess all about sharding and autoscaling mysql instances
       | with containers and kubernetes?
        
       ___________________________________________________________________
       (page generated 2021-01-13 23:02 UTC)