[HN Gopher] Docker in Production: A History of Failure (2016)
___________________________________________________________________
Docker in Production: A History of Failure (2016)
Author : flanfly
Score : 85 points
Date : 2021-07-27 15:21 UTC (7 hours ago)
(HTM) web link (thehftguy.com)
(TXT) w3m dump (thehftguy.com)
| debarshri wrote:
| It was the year 2016: kubernetes did not have jobs, cronjobs, or
| statefulsets. Pods would get stuck in terminating state or
| container creating state. Networking in kubernetes was wonky. AWS
| did not have support for EKS. It used to be painful.
|
| It is the year 2021: 1000s of new startups around kubernetes,
| more features, more resource types. Pods still get stuck in
| terminating state or container creating state. It's still pretty
| painful.
| HelloNurse wrote:
| > Networking in kubernetes was wonky.
|
| Can you elaborate? Does Kubernetes add some thrills to the
| relatively simple Docker network configuration?
| krab wrote:
| Yes, Kubernetes (actually its add-ons) provides a virtual
| network that unifies communication within the cluster, so you
| don't need to care which computer your service runs on.
| jrockway wrote:
| You can make it as complicated as you want it to be; part of
| setting up a cluster is picking the networking system
| ("CNI"). Cloud providers often have their own IPAM (i.e. on
| Amazon, you get this:
| https://docs.aws.amazon.com/eks/latest/userguide/pod-
| network... each Pod gets an IP from your VPC, resulting in
| weird limits like 17 pods per instance because that's how
| many IP addresses you can have for that particular instance
| type throughout EC2).
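|
| (If I have the published numbers right, the arithmetic for
| something like a t3.medium -- 3 ENIs, 6 IPv4 addresses per
| ENI -- works out as:
|
|     # max pods = ENIs * (IPs per ENI - 1) + 2
|     echo $(( 3 * (6 - 1) + 2 ))   # -> 17
|
| which is where that odd per-instance-type cap comes from;
| treat the ENI/IP counts here as illustrative, since they vary
| by instance type.)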
| kaidon wrote:
| Please take a seat and let the joys of k8s networking
| overwhelm your senses:
|
| https://kubernetes.io/docs/concepts/cluster-
| administration/n...
|
| And yes... Kubernetes network configuration is on a whole
| different level from docker networking.
| theptip wrote:
| To be fair, multi-node networking of any sort is on a
| different level than single-host docker networking.
|
| If you ever tried to use Docker Swarm to network multiple
| nodes, god help you.
|
| Also worth noting that almost all users of K8s don't
| actually need to operate a cluster; the hosted offerings
| handle all of that for you. You just need to understand the
| Service object, and maybe Ingress if you're trying to do
| some more advanced cert management or API gateway stuff.
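|
| (For the common case the Service bit really is a one-liner; a
| minimal sketch, assuming a hypothetical Deployment named "web"
| listening on 8080:
|
|     kubectl expose deployment web --port=80 \
|       --target-port=8080 --type=LoadBalancer
|
| and the hosted offering provisions the cloud load balancer
| behind it for you.)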
|
| It's a common meme around here to point in horror to the
| complexity that is abstracted away under the K8s cluster
| API, and claim that k8s is really hard to use. I think
| that's mostly misguided; the hosted offerings like GKE
| really do a good job of hiding away all that complexity
| from you.
|
| Honestly I think that it's defensible to say that the k8s
| networking model is in most cases _simpler_ than what you'd
| end up configuring in AWS / GCP to route traffic from the
| internet to multiple VM nodes.
| kazen44 wrote:
| > Honestly I think that it's defensible to say that the
| k8s networking model is in most cases _simpler_ than what
| you'd end up configuring in AWS / GCP to route traffic
| from the internet to multiple VM nodes.
|
| How is routing from the internet to multiple servers a
| problem?
|
| Usually, you have one of these setups:
|
| - you run a loadbalancer that distributes traffic across
| your nodes. (This loadbalancer could even be distributed
| thanks to BGP).
|
| - you either run your own firewall or have a managed one,
| in which you either announce your IP prefix yourself, or
| they are announced for you by your uplink provider.
|
| - you run an anycast setup (for, for example, globally
| distributed DNS). and announce multiples of the same
| prefix across the globe. Routing in the DFZ does the rest
| for you.
|
| Stretched L2 across the globe/internet is also possible
| (although not very performant), either by doing IPsec
| tunneling or by buying/setting up L2VPN services
| (either MPLS or VXLAN based).
| theptip wrote:
| I didn't say it was a problem. My claim was just that
| it's easier in GKE than in GCE/EC2.
|
| I only mentioned multi-node because exposing a single VM
| to the internet is trivial -- just give it a public IP --
| and thus is not an apples-to-apples comparison with the
| multi-node load balancing that you get from the entry-
| level k8s configuration of Service > Pod < Deployment.
| kazen44 wrote:
| The largest issue with kubernetes networking seems to be the
| lack of integration with modern datacenter networking
| technology.
|
| Things like VXLAN-EVPN are supported on paper, but are
| nowhere near mature compared to offerings from normal
| networking vendors. Heck, even the BGP support inside
| kubernetes is lacking, which is a great shame because it
| creates a barrier between pods and the physical world.
| (Getting a VXLAN VTEP mapped to a kubernetes node is a major
| PITA, for instance.)
|
| Most major cloud providers seem to have fixed this by
| building even more overlay networks (with the included
| inefficiencies).
| dehrmann wrote:
| Around 2015, I was at Spotify, and we were using a container
| orchestrator built in-house named Helios. They didn't build it
| because Kubernetes wasn't invented there; they built it because
| Kubernetes didn't exist, yet.
| lacion wrote:
| Kubernetes was released in 2014; in late 2015 I already had a
| cluster marked as a release candidate that was put into
| production in early 2016.
|
| So I'm guessing here that Spotify wrote their own because they
| had a specific requirement?
|
| Nomad was also early days in 2015.
| dehrmann wrote:
| I'm pretty sure the work started before Kubernetes was
| released, and even then, it wasn't clear that was going to
| be the de facto orchestrator.
| jrockway wrote:
| I haven't seen these failure modes in 2021. We do managed
| clusters at work and have created around 100,000 of them, and
| basically all the pods we intend to start start, and all the
| pods we intend to kill die -- even with autoscaling that
| provisions new instance types. Our biggest failure mode is TLS
| certificates failing to provision through Let's Encrypt, but
| that has nothing to do with Kubernetes (that is a layer above;
| what we _run_ in Kubernetes).
|
| EKS continues to be painful. It has gotten better over the
| years, but it is a chore compared to GKE. I like to imagine
| that Jeff Bezos walked into someone's office, said "you're all
| fired if I don't have Kubersomethings in two weeks", and that's
| what they launched.
| cpach wrote:
| You have probably thought about this already but I must admit
| I'm curious: If you're on AWS, can you not use Certificate
| Manager instead of Let's Encrypt?
| jjoonathan wrote:
| Certificate Manager pushes you (shoves you, really) in the
| direction of using AWS managed services. They make
| certificate installation/rotation really easy for their own
| services and unnecessarily difficult for any that you
| implement yourself.
|
| (This may have changed in the last year or two, but it was
| certainly this way when I tried it.)
| bashinator wrote:
| In my experience, it's difficult-to-impossible to use
| AWS' certificate management and LB termination in
| conjunction with Envoy-based networking like Istio or
| Ambassador.
| jjoonathan wrote:
| Yeah, the AWS LB has more issues than that, too. I'm
| pretty sure it's just nginx under the hood but they won't
| tweak the simplest parameters for you, even if you make a
| colossal stink, even if your company spends seven figures
| a year. I wonder if it isn't a decade-old duct-tape-and-
| baling-wire solution that shares the same config across
| literally every customer or something. Rolling our own
| was almost a relief -- the pile of awkward workarounds
| had grown pretty high by the point we bit the bullet.
| jrockway wrote:
| I'm actually not on AWS, just used EKS extensively at my
| last job (and we still manually test our software against
| it).
|
| AWS burned me hard with forgetting to auto-renew certs at
| my last job. It just stopped working, the deadline passed,
| and only a support ticket and manual hacking on their side
| could make it work. cert-manager has been significantly
| more reliable and at least transparent. The mistake we make
| right now is asking for certificates on demand in the
| critical path of running our app -- but since we control
| the domain name, we could easily have a pool of domain
| names and certificates ready to go. Our mistake is having
| not done that yet.
| shaicoleman wrote:
| What are the EKS pain points?
| jrockway wrote:
| The biggest pain point is having to manually use
| cloudformation to create node pools. This is especially
| irritating when you just need to roll the Linux version on
| nodes -- takes half a day to do right. In GKE, it's just a
| button in the UI (or better, an easy-to-use API), and you
| can schedule maintenance windows for security updates
| (which are typically zero downtime anyway, assuming you
| have the right PodDisruptionBudgets). I think AWS fixed
| that. I remember when I used it, they said they had some
| new tool that would handle that, but you had to re-create
| the cluster from scratch. This was a couple years ago, and
| is probably decent nowadays.
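|
| (For reference, the PodDisruptionBudget part is a one-liner; a
| rough sketch, assuming a hypothetical Deployment labeled
| app=web:
|
|     kubectl create poddisruptionbudget web-pdb \
|       --selector=app=web --min-available=1
|
| so a node roll never drains the last healthy replica.)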
|
| There are other warts, like certain storage classes being
| unavailable by default (gp3), the whole ENI thing for Pod
| IPs, the supported version being way out of date, etc. EKS
| has always felt like "minimum viable product" to me -- they
| really want you to use their proprietary stuff like
| ECS/Fargate, CloudFormation, etc. If you're already on AWS
| and want Kubernetes, it's just what you need. If you could
| pick any cloud provider for mainly Kubernetes, it wouldn't
| be my first choice.
|
| Having used EKS, GKE, and DOKS, I definitely prefer GKE.
| GKE is very feature-rich, and the API for managing clusters
| works well. The nodes are also cheaper than AWS. (I use
| DOKS for my personal stuff and I haven't had any problems,
| and it is free, but it's missing features like regional
| clusters that you probably want for things you make money
| off of.)
| bashinator wrote:
| For what it's worth, there's an off-the-shelf terraform
| module for EKS that is far simpler to use than AWS'
| cloudformation tooling and allows you to pass in a
| custom AMI and multiple nodegroup configurations as input
| parameters.
|
| https://registry.terraform.io/modules/terraform-aws-
| modules/...
| debarshri wrote:
| This year, I have seen those issues popping up in
| statefulsets a lot. I realised somebody on the team was force
| deleting them. It is actually well documented:
|
| https://kubernetes.io/docs/tasks/run-application/force-
| delet...
|
| I have seen a few scenarios where patching a statefulset
| actually screws up the volume mount. It is sometimes not
| evident where the error is -- for instance, whether it is in
| the CSI driver or the scheduler -- unless you deep dive into
| the issue.
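|
| (For context, the force delete in question is usually
| something along the lines of:
|
|     kubectl delete pod web-0 --grace-period=0 --force
|
| with "web-0" as a placeholder; it skips the StatefulSet's
| at-most-one-pod guarantee, which is exactly what that doc
| warns about.)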
| CSDude wrote:
| We spawn thousands of pods per day for jobs and never get them
| stuck, and that was not the case in 2018 either. Not sure what
| it is you are doing that causes this.
| halfmatthalfcat wrote:
| I have used Kubernetes extensively over the past couple years
| and have never seen pods stuck in terminating or creating state
| that didn't have to do with errors in container creation (your
| Dockerfile/bootstrapping is messed up) or issues with
| healthchecks.
| _joel wrote:
| In 2016 I started at a company that had no build procedures and
| deployed to a variety of linux versions, developed on windows. It
| was a nightmare for administration, no automation, no monitoring.
| I implemented containers, and most of the process was getting the
| developers on board: having technical sessions with them to
| understand what they needed and ease them into the plan so they
| felt enfranchised. Doing this vastly increased productivity;
| devs could take off-the-shelf compose files that were written for
| common projects (it was a GIS shop), which meant they could
| concentrate on delivering code. It helped no end.
|
| Sure, there are issues (albeit a lot fewer as time progressed)
| with docker, but for what it gained in productivity and
| developers' sanity, it was very welcome.
| zz865 wrote:
| Our big project has moved from physical servers to OpenShift.
| It's taken a lot of work, much more than expected. The best thing
| is that developers like it on their resume, which is a bigger
| benefit than you'd think, as we've kept some good people on the
| team. For users I see zero benefit. The CI pipeline is just more
| complicated and probably slower.
|
| Cost-wise it was cheaper for a while, but now Red Hat is bumping
| up licensing costs, so I think it's about the same cost now.
|
| Overall it seems like a waste of time, but has been interesting.
| mixermachine wrote:
| Moving from classic servers to containers you get:
|
| - Builds with fixed dependencies that never change. Rollback is
| easy
|
| - Easy deployment of a prod environment on a local machine
|
| - Fast deployment
|
| - Easy automation (use version X with config Y)
|
| With Kubernetes (or derivatives like OpenShift) you get:
|
| - Auto scaling
|
| - Fail over
|
| - Better resource usage if multiple environments are executed
|
| - Abstraction of infrastructure
|
| - Zero downtime deployment (biggest point for my company, we
| deploy >3 times per week)
|
| There are applications that do not need Kubernetes or even
| containers, but is this list really nothing? oO
|
| I can imagine that if you use Kubernetes just like a classic
| cluster it could seem like unnecessary added complexity, but
| you gain a lot of things.
| TheDong wrote:
| Each of those benefits is something I had before using
| containers or kubernetes, and the older solutions were simpler.
|
| > Builds with fixed dependencies that never change. Rollback
| is easy
|
| Any good build system already did this, such as Bazel, or a
| Gemfile.lock. We'd just snapshot AMIs to keep OS dependencies
| fixed... which is what Docker images effectively do. If you
| re-docker-build the same Dockerfile, it's not like you get
| the same result of "apt-get install libxml" the next time
| either.
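|
| (If you do want the base layer frozen, pinning by digest is
| the usual trick; a rough sketch, with the tag as a stand-in:
|
|     docker pull ubuntu:20.04
|     docker inspect --format '{{index .RepoDigests 0}}' \
|       ubuntu:20.04
|     # -> ubuntu@sha256:...  put that digest in FROM
|
| though that only freezes the base image, not whatever apt-get
| pulls on top of it.)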
|
| > Easy deployment of a prod environment on a local machine
|
| How containers are deployed varies wildly between prod and
| the local machine. All the things that were hard before are
| still hard. Things like secrets and external dependencies
| still usually vary.
|
| If prod is a kubernetes environment, getting a suitable k8s
| environment setup locally sucks, especially since it will
| probably have a different ingress controller, load balancer
| setup, storage classes available, resource requests, etc. If
| prod is kubernetes and local is docker-compose, creating a
| second way to run the stack honestly seems like just as much
| work as using a bash script + "npm start" or
| "bundle exec rails server" or whatever.
|
| Either way, it's not really a prod environment. It's hard to
| run identical-to-prod environments locally, and those
| problems are related to secrets and clouds and such, not due
| to the lack of containers, in my experience.
|
| > Fast deployment
|
| In my experience, containers haven't sped up deployment.
| Let's say you use ubuntu for your host and container's OS.
| Before containers, this meant you had to download one version
| of libssl ever, and that was it. If there was an update to
| libz, that didn't require a new download of libssl. After
| containers, if you build your container for app1 last week,
| and your container for app2 today, the "FROM ubuntu" likely
| resolves to a different image. Both your apps now have
| different "ubuntu" layers, which probably have the same
| version of libssl, but deduplication of downloads only
| happens if the whole layer is identical.
|
| In essence, we went from downloading 1 copy of libssl (for
| the host OS only) to 3 copies (host OS + 2 containers w/
| different ubuntu bases), and there's no deduplication.
|
| That by itself seems like it has to be slower since there's
| an inherent increase in network bandwidth that has to happen.
| Even if you have a shared base image, you're at least
| doubling the downloads of libssl since before you could use
| the host's copy only.
|
| All the items you listed under k8s are things I had before
| it, excluding "Abstraction of infrastructure". Frankly, if
| you have a well-made load balancer, it's hard not to have
| zero-downtime deployments and auto-scaling.
| mixermachine wrote:
| > We'd just snapshot AMIs to keep OS dependencies fixed
|
| This is a good solution, but I would not call it easier.
|
| Using a docker container feels like installing an app on my
| smartphone. I choose the version and it will always work
| like I built it at date X, without an additional system.
| Works for every programming language with every dependency
| out of the box: Python, Java, JavaScript, Go, OCaml, C, ...
|
| > How containers are deployed varies wildly between prod
| and the local machine
|
| I just brought a product of my company to Kubernetes.
|
| Run helm upgrade --install . -f dev-values.yaml for dev
|
| Run helm upgrade --install . -f prod-values.yaml for prod
| (of course you need the secrets there. Jenkins has them).
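|
| (Spelled out with a placeholder release and chart name, since
| helm upgrade wants both, that's roughly:
|
|     helm upgrade --install myapp ./myapp -f dev-values.yaml
|     helm upgrade --install myapp ./myapp -f prod-values.yaml
|
| where "myapp" stands in for whatever the real chart is
| called.)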
|
| My laptop does run an environment with all components of
| the prod env. Things like email and SAP services are of
| course mocked, but everything else? All on my machine. Why
| not?
|
| I can spin up a new test environment for customers with new
| settings on the same day.
|
| > Both your apps now have different "ubuntu" layers
|
| We use a base image that does not change that often. Even
| if it did: no problem, the registry is connected via 1000
| MBit/s and zero-downtime deployment does its magic, so I
| don't even notice if it takes one or two minutes.
|
| Another thing: my node (or VM) libs and the libs of my
| software should not be connected in any way (at least for
| me). I want to patch my nodes and my software
| independently. Different software should also not be bound
| to libs of another software.
|
| > All the items you listed under k8s are things I had
| before it
|
| - How do you easily scale up? Including starting new
| machines and spinning down machines that are no longer
| needed
|
| - How are multiple software parts executed on one host?
|
| - How do you do fail over?
|
| I know that everything can be done without Kubernetes. With
| enough time and money one can create large systems that do
| this.
|
| I spun up a new Kubernetes cluster and ported our product
| (already containerized) to the cluster in about three
| months.
|
| Really: I also love the classic dev ops and have a proxmox
| server at home, but Kubernetes just solves many problems at
| once in a short time.
| TheDong wrote:
| > How do you easily scale up? Including starting new
| machines and spinning down machines that are no longer
| needed
|
| AWS autoscaling groups + cloudwatch for adding and
| removing machines + checking them into load balancers is
| something that has worked for longer than K8s has been a
| thing.
|
| > How are multiple software parts executed on one host?
|
| systemd units, or for more resource hungry things,
| multiple autoscaling groups.
|
| The overhead of running the kubelet on each host + etcd
| cluster + apiserver means that I still end up with fewer
| hosts if I just run each component on every single host
| vs scaling different deployments independently.
|
| It is true that kubernetes might be more resource
| efficient with some combination of nodes and software, but
| at under 10 servers, I've always found the overhead of
| the etcd cluster + apiserver + kubelet to dwarf any
| savings over just running 10 copies of my software.
|
| > How do you do fail over?
|
| The AWS-managed load balancer can fail over based on
| health checks failing, metrics, or I can add/remove
| servers from it manually. You can also do DNS health
| checks, or add a layer of haproxy/nginx/whatever if you
| want.
|
| It's not like k8s has some magic ability to fail over
| under the hood. It's just using k8s service objects
| (probably LoadBalancer type), which does the same thing.
| secondcoming wrote:
| Correct. We use pretty much the same setup on GCP.
| All scaling is automatic. When we deploy new code we just
| run a jenkins job that creates an image from custom
| debian packages, push that to GCP, and it rolls it out
| automatically to all our DCs.
| zz865 wrote:
| Yeah we have fixed usage so scaling or easy failover is not
| something we need.
| mixermachine wrote:
| Then I can understand you well. Kubernetes then just
| provides zero-downtime and additional complexity. When you
| already have something like a deployment window (like 2 am
| to 3 am) then ZDT also does not matter.
| jstimpfle wrote:
| My gut feel is that Docker is part of a trend of decreasing
| software quality.
|
| When someone writes "fixed dependencies" I read "developers
| can more easily add more bloat before the house of cards
| tumbles". That happens, for example, when the "fixed
| dependencies" are upgraded.
|
| I am miserable having to touch all this junk. I feel a
| project is right when I can just git clone it (a few
| megabytes of data at most) and am left with a self contained
| repo that was written with minimal dependencies (optimally
| stored in-tree), and that can be easily built in seconds with
| a simple shell script on any reasonably modern system.
|
| The bare bones way takes a good amount of initial work, but
| mostly it's a learning experience. Once one understands a few
| principles of writing portable software, I'm sure it saves a
| huge amount of time compared to adding all these shells of
| junk.
|
| --
|
| Oh yeah, I have zero experience with integrating with
| Kubernetes or whatever. I've been a small-time user of
| Jenkins and CircleCI (involuntarily), and when I don't have
| to set it up and it actually works, it's alright and can help
| where the developer maybe lacks a bit of discipline (build
| all targets, run all the tests).
|
| _But_, I doubt these technologies are a replacement for an
| ergonomic build environment (with a simple python build script
| or even a crude Makefile). Is incremental building a thing on
| any of these CI pipelines? Because one thing I want is
| building really really fast, and it's already way too much
| overhead if I have to go through a git commit to check this
| stuff. Don't even think about requiring a full rebuild or
| Docker image build just to get some quick feedback on a code
| change.
| [deleted]
| geerlingguy wrote:
| OpenShift is about 10x more complex than basic Docker /
| containers, and probably 2-4x more complex than plain old
| Kubernetes.
|
| I've seen more success from organizations running smaller K3s
| or K8s clusters (if they need the orchestration) or just
| running small apps via Docker/Docker Compose separately, using
| a CI system (even as simple as GitHub Actions) to manage
| deployments.
| zz865 wrote:
| Yeah, it's also a problem that our org has infrastructure teams
| that manage the openshift clusters, and they are under-
| resourced, so they don't help or often can't figure out how to
| fix problems. Linux sysadmins know what they're doing, as the
| core infrastructure has been mostly the same for the last few
| decades.
| theamk wrote:
| Back in 2016 during the original discussion of this article,
| amount said it very well in [0]:
|
| "If you hit this many problems with any given tech, I would
| suggest you should be looking for outside help from someone that
| has experience in the area."
|
| - Yes, "clean old images" was not implemented back then. His hack
| is not that bad, and one can filter out in-use images if they
| want to pretty easily. Anyway, docker does have "docker image
| prune" now.
|
| - The storage driver history discussion is entirely incorrect.
| No, docker did not invent overlayfs nor overlay2. There was a
| whole big drama about aufs not being mainlined, but it was mostly
| in the context of live CDs, not docker.
|
| But the big missing thing is: you should not store important data
| in docker images; Docker is designed to work with transient
| containers. If you have a database, or a high-performance data
| store, you use volumes, and those _bypass_ docker storage drivers
| completely.
|
| - The database story is completely crazy... judging by their
| comments, they decided to store the database data in the docker
| container for some reason and got all the expected problems
| (unable to recover, hard to migrate, etc.). It is not clear
| why they didn't put the database data on a volume; there is a
| 2016 StackOverflow question discussing it [1].
|
| Also, "Docker is locking away [...] files through its abstraction
| [...] It prevents from doing any sort of recovery if something
| goes wrong." Really? I did recovery with docker, the files are
| under /var/lib/docker in the directory named with guid, a simple
| "find" command can locate them.
|
| - By default, Docker uses Linux networking and yes, the
| configuration is complex, so it adds overhead. That's why there
| is the --net=host option (which has been there for a long time),
| which just bypasses all of that.
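|
| (Combined with a volume, that looks something like:
|
|     docker run -d --net=host \
|       -v /srv/pgdata:/var/lib/postgresql/data \
|       postgres:13
|
| a rough sketch with made-up host paths -- the point being that
| both the data and the network bypass Docker's storage driver
| and NAT layers.)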
|
| [0] https://news.ycombinator.com/item?id=12872636
|
| [1] https://stackoverflow.com/questions/40167245/how-to-
| persist-...
| KronisLV wrote:
| The article seems to mention problems with AUFS, overlay and
| possibly overlay2 as well.
|
| However, one of the things that I haven't quite understood is why
| people use Docker volumes that much in the first place, or even
| think that they need to use additional volume plugins in most
| deployments.
|
| If it's a relatively simple deployment that has some persistent
| data, and it's clear on which nodes the containers could be
| scheduled (either by label or by hostname), what would prevent
| someone from just using bind mounts (
| https://docs.docker.com/storage/bind-mounts/ )?
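|
| (A bind mount is just an absolute host path on the left side
| of -v; a minimal sketch, with made-up paths and image name:
|
|     docker run -d \
|       -v /srv/app/config:/etc/myapp:ro \
|       -v /srv/app/data:/var/lib/myapp \
|       myapp:latest
|
| and everything stays inspectable on the host under /srv/app.)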
|
| And if you need to store it on a separate machine, why not just
| use NFS on the host OS to mount the directory which you will bind
| mount? Or, alternatively, why not just use GlusterFS or Ceph for
| that sort of stuff, instead of making Docker attempt to manage
| it?
|
| For example, Docker Swarm fails to launch containers if the bind
| mount path doesn't exist, but that can be addressed by creating
| the necessary directory structure with something like Ansible.
| Then you not only avoid worrying about volumes and the risk of
| them ever becoming corrupt, but you also have the ability to
| inspect the contents of the container storage on the actual
| host -- say, if there are some configuration files that need
| altering (seeing as not all of the containerized software out
| there follows 12 Factor principles for environment configuration
| either), or you just want to do granular backups of the data
| you've stored.
| pbecotte wrote:
| Even in 2016, I had been running production services in Docker
| successfully. It's interesting to me that they see the problem
| "Docker isn't designed to store data" without also seeing the
| solution "the docker copy-on-write filesystem isn't designed to
| be written to in production -- but volume mounts are". I hadn't
| seen docker crashing hosts (still haven't), but I'm guessing that
| was caused by using the storage drivers.
|
| The complaints about their development practices are valid (and
| haven't really improved), but even then the technology worked
| well so long as you understood its limitations.
| rubyist5eva wrote:
| Podman and Kubernetes are like a match made in heaven. Docker was
| a good first try for most people, but there is so much better
| technology that exists now.
| mianos wrote:
| It seems a lot has not changed:
|
| "Docker gradually exhausts disk space on BTRFS" -- an issue
| opened on 23 Oct 2016, still open:
| https://github.com/moby/moby/issues/27653
|
| There are still comments this week showing it still happens.
| clipradiowallet wrote:
| I know this article is from 2016... but my feelings about it (the
| article) are unchanged. Some people do not like new things, and
| they will blog about it in some form or fashion. Maybe their
| reasoning is valid, maybe it's not - it doesn't matter.
| Meanwhile... businesses have paid, and continue to pay, top $$$
| for people that will help them do these things. If you want to
| collect this $$$, get on board.
|
| In a few years, the things businesses want to pay $$$ for will
| change. New blog articles about "this new stuff is bad!" will
| appear, and new job postings paying above-market $$$ will appear
| as well. You can either rail on about the bad (or good) changes,
| and how it's just everything-old-is-new-again... or you can get
| with the program and get paid. In another few years, rinse and
| repeat.
| [deleted]
| jjnoakes wrote:
| It's all anecdotal. For example I know many folks who make $$$
| doing the boring old thing because it is reliable, it gets
| results quickly with low risk, the engineers know the tech
| inside and out, and not many other folks want to work with
| "boring tech".
| cpach wrote:
| Just out of curiosity, what are some examples of boring tech
| in this case?
| isoskeles wrote:
| Makes me think of stuff like managing WordPress although
| I'm not sure if that's an example they had in mind.
| aprdm wrote:
| Django, Rails, Postgres...
| benburleson wrote:
| PHP, MySQL
| dijit wrote:
| Perl, Postgres, Java, Solaris
| cpach wrote:
| Hm... Which companies still use Solaris (or Illumos)? I
| don't hear about it very often these days.
| NexRebular wrote:
| We swapped most of our linux and vmware platforms to
| Triton and SmartOS and been loving it ever since.
| Obviously there's still need to run linux in bhyves due
| to some specific software (e.g. docker) but generally
| services are on either lx-zones or smartmachines. It just
| works.
| dijit wrote:
| I am pretty sure my IT department is transitioning to
| Nexenta, which is an illumos distribution.
|
| Company before last was using some Solaris+nexenta.
|
| Samsung has an enormous smartos deployment that would
| rival all of Azure. (Based on what I learned about Azure)
| kazen44 wrote:
| smartos is a great operating system. Kind of a shame that
| kind of thinking hasn't caught on yet.
| ChrisArchitect wrote:
| Anything new since this?
|
| A history of re-submitted, previously discussed posts:
|
| https://news.ycombinator.com/item?id=12872304
| manishsharan wrote:
| This is a blog post from 2016. However, if we switch to more
| recent times, my experience with AWS ECS and Fargate has been
| fairly boring. There was a learning curve to get it to work with
| CloudFormation, VPCs, IAM and load balancers.
| esotericimpl wrote:
| Agreed, I don't see why ECS and Fargate aren't used everywhere.
|
| It takes a bit of a learning curve to understand how tasks,
| task definitions, clusters, and services all fit together, but
| once you do, it's pretty straightforward.
|
| I've run ECS in production for over 5 years, and I can count on
| one hand the times we had any issue related to docker or
| availability; all were caused by code updates we didn't test
| properly.
| jwildeboer wrote:
| Guy who claims to run systems in the hft space, responsible for
| millions of trades with high values, can't be bothered to
| actually pay for support, relies on community and blames everyone
| but himself for being left alone with his mess. Not sorry.
| user5994461 wrote:
| Original author from 5 years ago. Surprised to see this here 5
| years later.
|
| Docker really used to crash a lot back in the day, mostly due to
| buggy storage drivers. If you were on Debian or CentOS it's very
| likely that you experienced crashes (though a lot of developers
| didn't care or didn't understand why the system went
| unresponsive).
|
| There was notably a new version of Debian (with a newer kernel)
| published the year after my experience. It's a lot more stable
| now.
|
| My experience is that by 2018-2019, Docker had mostly vanished as
| a buzzword; people were only talking about Kubernetes and looking
| for kubernetes experience.
|
| edit: at that time Docker didn't have a way to clear
| images/containers; it was added after the article and follow-up
| articles. I will never know if it was a coincidence, but I like
| to think there is a link. I think writing the article was worth
| it if only for this reason.
| mberning wrote:
| Docker is the chaos monkey incarnate.
| stevebmark wrote:
| > Docker is meant to be stateless. Containers have no permanent
| disk storage, whatever happens is ephemeral and is gone when the
| container stops.
|
| It's interesting that this misconception made it into a clearly
| knowledgeable article. Containers have state on the writeable
| layer that is persisted between container stops and starts.
| plainnoodles wrote:
| But I think most people and tools consider "containers" to be
| volatile storage, like RAM. Non-volatile storage would be
| volumes.
|
| Honestly I think there is a lot to be said for making the
| writable layer of a container read-only. It makes sure that
| things like logging, if you care about them, go somewhere safe,
| or, if you don't, get turned off explicitly. It also prevents
| gotchas like "oops, wrote important data to /var/lib/notavolume
| when I meant to write to /var/lib/therightvolume" that show up
| at the worst times.
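|
| (Docker supports exactly that; a rough sketch, with
| placeholder names:
|
|     docker run -d --read-only --tmpfs /tmp \
|       -v logdata:/var/log/myapp \
|       myapp:latest
|
| so anything not explicitly a tmpfs or a volume fails loudly
| instead of silently landing in the container layer.)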
| beermonster wrote:
| I'm not sure it's a misconception. That's how they're intended
| to be used. Cattle, not pets. If you don't get used to treating
| them as throwaway, you can end up accidentally relying on some
| state. As you say, the top layer is read/write, but that
| doesn't mean you should be relying on what you write there.
| Quite the opposite - that state should be somewhere else unless
| you can afford to lose it.
|
| I usually start mine with --rm so they're removed on shutdown.
|
| I've seen people apply security updates via 'apt update; apt
| upgrade' within a running container. Guess what happens when
| that container is eventually destroyed?
| bfrog wrote:
| Heh, I just ran into an issue the other day with a coworker where
| Ubuntu auto-updated to a patched kernel and broke everything.
| Yep, it's a sand castle.
| crummybowley wrote:
| The issue is not docker; the issue is that you treat your servers
| like pets.
|
| Folks need to start building systems that destroy everything and
| re-image fresh. Any other way, you are just setting yourself up
| for failure.
| esotericimpl wrote:
| except the Database, don't re-image the database.
| Theodores wrote:
| Symfony console broke Magento 2 today. Same story.
| belter wrote:
| Posted many times before but this is the only one with comments:
|
| https://news.ycombinator.com/item?id=12872304
___________________________________________________________________
(page generated 2021-07-27 23:01 UTC)