[HN Gopher] Dear friend, you have built a Kubernetes
       ___________________________________________________________________
        
       Dear friend, you have built a Kubernetes
        
       Author : todsacerdoti
       Score  : 109 points
       Date   : 2024-11-24 05:03 UTC (17 hours ago)
        
 (HTM) web link (www.macchaffee.com)
 (TXT) w3m dump (www.macchaffee.com)
        
       | zug_zug wrote:
       | For what it's worth, I've worked at multiple places that ran
       | shell scripts just fine for their deploys.
       | 
       | - One had only 2 services [php] and ran over 1 billion requests a
       | day. Deploy was trivial, ssh some new files to the server and run
       | a migration, 0 downtime.
       | 
       | - One was in an industry that didn't need "Webscale" (retirement
       | accounts). Prod deploys were just docker commands run by jenkins.
        | We ran two servers per service from the day I joined to the
        | day I left 4 years later (3x growth), and ultimately removed one
       | service and one database during all that growth.
       | 
       | Another outstanding thing about both of these places was that we
       | had all the testing environments you need, on-demand, in minutes.
       | 
       | The place I'm at now is trying to do kubernetes and is failing
       | miserably (ongoing nightmare 4 months in and probably at least 8
       | to go, when it was allegedly supposed to only take 3 total). It
        | has one shared test environment that it takes 3 hours to see your
       | changes in.
       | 
       | I don't fault kubernetes directly, I fault the overall
       | complexity. But at the end of the day kubernetes feels like
       | complexity trying to abstract over complexity, and often I find
        | that's less successful than removing complexity in the first
       | place.
        
         | leetrout wrote:
         | Yea but that doesn't sound shiny on your resume.
        
           | nine_k wrote:
           | Depends on what kind of company you want to join. Some value
           | simplicity and efficiency more.
        
         | loftsy wrote:
         | Are you self hosting kubernetes or running it managed?
         | 
         | I've only used it managed. There is a bit of a learning curve
         | but it's not so bad. I can't see how it can take 4 months to
         | figure it out.
        
           | zug_zug wrote:
           | We are using EKS
           | 
           | > I can't see how it can take 4 months to figure it out.
           | 
           | Well have you ever tried moving a company with a dozen
           | services onto kubernetes piece-by-piece, with zero downtime?
           | How long would it take you to correctly move and test every
           | permission, environment variable, and issue you run into?
           | 
           | Then if you get a single setting wrong (e.g. memory size) and
           | don't load-test with realistic traffic, you bring down
           | production, potentially lose customers, and have to do a
           | public post-mortem about your mistakes? [true story for
           | current employer]
           | 
           | I don't see how anybody says they'd move a large company to
           | kubernetes in such an environment in a few months with no
           | screwups and solid testing.
        
             | tail_exchange wrote:
             | It largely depends how customized each microservice is, and
             | how many people are working on this project.
             | 
             | I've seen migrations of thousands of microservices
              | happening within the span of two years. Longer timeline, yes,
             | but the number of microservices is orders of magnitude
             | larger.
             | 
             | Though I suppose the organization works differently at this
              | level. The Kubernetes team built a tool to migrate the
             | microservices, and each owner was asked to perform the
             | migration themselves. Small microservices could be migrated
             | in less than three days, while the large and risk-critical
             | ones took a couple weeks. This all happened in less than
             | two years, but it took more than that in terms of
             | engineer/weeks.
             | 
             | The project was very successful though. The company spends
             | way less money now because of the autoscaling features, and
             | the ability to run multiple microservices in the same node.
             | 
             | Regardless, if the company is running 12 microservices and
             | this number is expected to grow, this is probably a good
             | time to migrate. How did they account for the different
             | shape of services (stateful, stateless, leader elected,
             | cron, etc), networking settings, styles of deployment
             | (blue-green, rolling updates, etc), secret management, load
             | testing, bug bashing, gradual rollouts, dockerizing the
              | applications, etc? If it's taking 4x longer than originally
             | anticipated, it seems like there was a massive failure in
             | project design.
        
               | hedora wrote:
               | 2000 products sounds like you made 2000 engineers learn
               | kubernetes (a week, optimistically, 2000/52 = 38 engineer
               | years, or roughly one wasted career).
               | 
               | Similarly, the actual migration times you estimate add up
               | to decades of engineer time.
               | 
               | It's possible kubernetes saves more time than using the
               | alternative costs, but that definitely wasn't the case at
               | my previous two jobs. The jury is out at the current job.
               | 
               | I see the opportunity cost of this stuff every day at
               | work, and am patiently waiting for a replacement.
        
               | tail_exchange wrote:
               | > 2000 products sounds like you made 2000 engineers learn
               | kubernetes (a week, optimistically, 2000/52 = 38 engineer
               | years, or roughly one wasted career).
               | 
               | Not really, they only had to use the tool to run the
               | migration and then validate that it worked properly. As
               | the other commenter said, a very basic setup for
                | kubernetes is not that hard; the difficult setup is left
               | to the devops team, while the service owners just need to
               | see the basics.
               | 
               | But sure, we can estimate it at 38 engineering years.
               | That's still 38 years for 2,000 microservices; it's way
               | better than 1 year for 12 microservices like in OP's
                | case. The savings we got were enough to offset those 38
                | years of work, so this project is now paying dividends.
        
               | mschuster91 wrote:
               | > 2000 products sounds like you made 2000 engineers learn
               | kubernetes (a week, optimistically, 2000/52 = 38 engineer
               | years, or roughly one wasted career).
               | 
                | Learning k8s enough to be able to work with it isn't
                | _that_ hard. In my experience it's enough to have a
                | centralized team write up a decent template for a CI/CD
                | pipeline, a Dockerfile for the most common stacks you
                | use, and a Helm chart with examples for a Deployment,
                | PersistentVolumeClaim, Service and Ingress; distribute
                | that, and be available for support whenever someone's
                | needs go beyond "we need 1-N pods for this service,
                | configured via some environment variables, and maybe a
                | Secret/ConfigMap if the application would rather have
                | its configuration in files".
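                | 
                | As a rough sketch of that template idea (names are made
                | up): helm's default scaffold already includes a
                | Deployment, Service and Ingress (among others), so the
                | platform team mostly curates values and adds e.g. a
                | PersistentVolumeClaim template on top.
                | 
                |     helm create service-template
                |     helm install my-service ./service-template \
                |       --set image.repository=registry.example.com/my-service \
                |       --set image.tag=1.0.0 \
                |       --set ingress.enabled=true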
        
               | relaxing wrote:
               | > Learning k8s enough to be able to work with it isn't
               | that hard.
               | 
               | I've seen a lot of people learn enough k8s to be
               | dangerous.
               | 
               | Learning it well enough to not get wrapped around the
               | axle with some networking or storage details is quite a
               | bit harder.
        
               | mschuster91 wrote:
               | For sure but that's the job of a good ops department -
               | where I work at for example, every project's CI/CD
               | pipeline has its own IAM user mapping to a Kubernetes
               | role that only has explicitly defined capabilities:
               | create, modify and delete just the utter basics. Even if
               | they'd commit something into the Helm chart that could
               | cause an annoyance, the service account wouldn't be able
               | to call the required APIs. And the templates themselves
               | come with security built-in - privileges are all
               | explicitly dropped, pod UIDs/GIDs hardcoded to non-root,
               | and we're deploying Network Policies at least for ingress
                | as well now. Only egress network policies are missing;
                | we haven't been able to make those work with
                | services.
               | 
               | Anyone wishing to do stuff like use the RDS database
               | provisioner gets an introduction from us on how to use it
               | and what the pitfalls are, and regular reviews of their
               | code. They're flexible but we keep tabs on what they're
                | doing, and when they have done something useful we aren't
                | shy about integrating whatever they have built into our
                | shared template repository.
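                | 
                | A minimal sketch of that kind of narrowly scoped CI
                | identity (namespace, role and group names are made up;
                | the IAM-to-group mapping is assumed to exist already):
                | 
                |     kubectl create namespace team-a
                |     kubectl create role deployer -n team-a \
                |       --verb=get,list,create,update,patch,delete \
                |       --resource=deployments,services,configmaps,secrets
                |     kubectl create rolebinding ci-deployer -n team-a \
                |       --role=deployer --group=ci-deployers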
        
             | jrs235 wrote:
             | > I don't see how anybody says they'd move a large company
             | to kubernetes in such an environment in a few months with
             | no screwups and solid testing.
             | 
             | Unfortunately, I do. Somebody says that when the culture of
             | the organization expects to be told and hear what they want
             | to hear rather than the cold hard truth. And likely the
              | person saying that says it from a perch up high, not
              | responsible for the day-to-day work of actually
             | implementing the change. I see this happen when the person,
             | management/leadership, lacks the skills and knowledge to
             | perform the work themselves. They've never been in the
             | trenches and had to actually deal face to face with the
             | devil in the details.
        
             | zdragnar wrote:
             | Comparing the simplicity of two PHP servers against a setup
             | with a dozen services is always going to be one sided. The
             | difference in complexity alone is massive, regardless of
             | whether you use k8s or not.
             | 
             | My current employer did something similar, but with fewer
             | services. The upshot is that with terraform and helm and
             | all the other yaml files defining our cluster, we have test
             | environments on demand, and our uptime is 100x better.
        
             | loftsy wrote:
             | Fair enough that sounds hard.
             | 
             | Memory size is an interesting example. A typical Kubernetes
             | deployment has much more control over this than a typical
              | non-container setup. It costs you something to figure out the
              | right setting, but in the long term you are rewarded with a
             | more robust and more re-deployable application.
        
             | sethammons wrote:
              | Took us three to four years to go from self-hosted multi-DC to
              | getting the main product almost fully in k8s (some parts
              | didn't make sense in k8s and were pushed to our geo-
              | distributed edge nodes). Dozens of services and teams, and
              | keeping the old stuff working while changing the tire on
              | the car while driving. All while the company continued to
              | grow and scale doubled every year or so. It takes maturity
              | in testing and monitoring, and it takes longer than everyone
              | estimates.
        
             | Cpoll wrote:
             | It sounds like it's not easy to figure out the permissions,
             | envvars, memory size, etc. of your _existing_ system, and
             | that 's why the migration is so difficult? That's not
             | really one of Kubernetes' (many) failings.
        
               | Vegenoid wrote:
               | Yes, and now we are back at the ancestor comment's
               | original point: "at the end of the day kubernetes feels
               | like complexity trying to abstract over complexity, and
                | often I find that's less successful than removing
               | complexity in the first place"
               | 
               | Which I understand to mean "some people think using
               | Kubernetes will make managing a system easier, but it
               | often will not do that"
        
               | Pedro_Ribeiro wrote:
               | Can you elaborate on other things you think Kubernetes
               | gets wrong? Asking out of curiosity because I haven't
               | delved deep into it.
        
             | malux85 wrote:
             | Canary deploy dude (or dude-ette), route 0.001% of service
             | traffic and then slowly move it over. Then set error
              | budgets. Then a bad service won't "bring down production".
              | 
              | That's how we did it at Google (I was part of the core team
             | responsible for ad serving infra - billions of ads to
             | billions of users a day)
        
           | pclmulqdq wrote:
           | Using microk8s or k3s on one node works fine. As the author
           | of "one big server," I am now working on an application that
           | needs some GPUs and needs to be able to deploy on customer
           | hardware, so k8s is natural. Our own hosted product runs on 2
           | servers, but it's ~10 containers (including databases, etc).
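            | 
            | For that kind of single-node setup, something like k3s is
            | close to a one-liner (this is the upstream installer;
            | review it before piping to sh, of course):
            | 
            |     curl -sfL https://get.k3s.io | sh -
            |     sudo k3s kubectl get nodes   # the single node should be Ready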
        
         | YZF wrote:
         | If your application doesn't need and likely won't need to scale
         | to large clusters, or multiple clusters, then there's nothing
          | wrong per se with your solution. I don't think k8s is that
         | hard but there are a lot of moving pieces and there's a bit to
         | learn. Finding someone with experience to help you can make a
         | ton of difference.
         | 
         | Questions worth asking:
         | 
         | - Do you need a load balancer?
         | 
         | - TLS certs and rotation?
         | 
         | - Horizontal scalability.
         | 
         | - HA/DR
         | 
         | - dev/stage/production + being able to test/stage your complete
         | stack on demand.
         | 
         | - CI/CD integrations, tools like ArgoCD or Spinnaker
         | 
         | - Monitoring and/or alerting with Prometheus and Grafana
         | 
         | - Would you benefit from being able to deploy a lot of off the
          | shelf software (let's say Elasticsearch, or some random database,
         | or a monitoring stack) via helm quickly/easily.
         | 
         | - "Ingress"/proxy.
         | 
         | - DNS integrations.
         | 
         | If you answer yes to many of those questions there's really no
         | better alternative than k8s. If you're building large enough
          | scale web applications then almost all of these will end up
         | being yes at some point.
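          | 
          | As a rough illustration of the "off-the-shelf via helm" point
          | (repo URLs and chart names are the commonly published ones;
          | versions and values omitted):
          | 
          |     helm repo add prometheus-community \
          |       https://prometheus-community.github.io/helm-charts
          |     helm repo add ingress-nginx \
          |       https://kubernetes.github.io/ingress-nginx
          |     helm install monitoring prometheus-community/kube-prometheus-stack
          |     helm install ingress ingress-nginx/ingress-nginx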
        
       | leetrout wrote:
       | > Spawning containers, of course, requires you to mount the
       | Docker socket in your web app, which is wildly insecure
       | 
       | Dear friend, you are not a systems programmer
        
         | pzmarzly wrote:
         | To expand on this, the author is describing the so-called
         | "Docker-out-of-Docker (DooD) pattern", i.e. exposing Docker's
         | Unix socket into the container. Since Docker was designed to
         | work remotely (CLI on another machine than DOCKER_HOST), this
         | works fine, but essentially negates all isolation.
         | 
          | For many years now, all major container runtimes have supported
          | nesting. Some make it easy (podman and runc just work), some
          | hard (systemd-nspawn requires setting many flags to work
          | nested). This is called "Docker-in-Docker (DinD)".
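          | 
          | Roughly, the two patterns look like this (image names other
          | than docker:dind are placeholders):
          | 
          |     # DooD: the app talks to the host daemon; anything it
          |     # spawns is a sibling container with host-level access
          |     docker run -v /var/run/docker.sock:/var/run/docker.sock \
          |       my-web-app
          | 
          |     # DinD: a nested daemon instead; the official dind image
          |     # still wants --privileged, rootless podman can avoid that
          |     docker run -d --privileged --name dind docker:dind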
        
       | rthnbgrredf wrote:
       | I think we need to distinguish between two cases:
       | 
       | For a hobby project, using Docker Compose or Podman combined with
       | systemd and some shell scripts is perfectly fine. You're the only
       | one responsible, and you have the freedom to choose whatever
       | works best for you.
       | 
       | However, in a company setting, things are quite different. Your
       | boss may assign you new tasks that could require writing a lot of
       | custom scripts. This can become a problem for other team members
       | and contractors, as such scripts are often undocumented and don't
       | follow industry standards.
       | 
       | In this case, I would recommend using Kubernetes (k8s), but only
       | if the company has a dedicated Kubernetes team with an
       | established on-call rotation. Alternatively, I suggest leveraging
       | a managed cloud service like ECS Fargate to handle container
       | orchestration.
       | 
       | There's also strong competition in the "Container as a Service"
       | (CaaS) space, with smaller and more cost-effective options
       | available if you prefer to avoid the major cloud providers.
       | Overall, these CaaS solutions require far less maintenance
       | compared to managing your own cluster.
        
         | chamomeal wrote:
         | How would you feel if bash scripts were replaced with Ansible
         | playbooks?
         | 
         | At a previous job at a teeny startup, each instance of the
         | environment is a docker-compose instance on a VPS. It works
         | great, but they're starting to get a bunch of new clients, and
         | some of them need fully independent instances of the app.
         | 
         | Deployment gets harder with every instance because it's just a
         | pile of bash scripts on each server. My old coworkers have to
         | run a build for each instance for every deploy.
         | 
         | None of us had used ansible, which _seems_ like it could be a
         | solution. It would be a new headache to learn, but it seems
         | like less of a headache than kubernetes!
        
           | klooney wrote:
           | Ansible ultimately runs scripts, in parallel, in a defined
           | order across machines. It can help a lot, but it's subject to
            | a lot of the same state bitrot issues as a pile of shell
           | scripts.
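            | 
            | For example, the compose-per-VPS deploy described above could
            | be pushed to every host in one pass; a minimal ad-hoc sketch,
            | with the inventory file and paths made up:
            | 
            |     ansible all -i inventory.ini --forks 5 -m shell \
            |       -a 'cd /srv/app && docker compose pull && docker compose up -d'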
        
           | rthnbgrredf wrote:
           | Ansible is better than Bash if your goals include:
           | 
           | * Automating repetitive tasks across many servers.
           | 
           | * Ensuring idempotent configurations (e.g., setting up web
           | servers, installing packages consistently).
           | 
           | * Managing infrastructure as code for better version control
           | and collaboration.
           | 
           | * Orchestrating complex workflows that involve multiple steps
           | or dependencies.
           | 
           | However, Ansible is not a container orchestrator.
           | 
           | Kubernetes (K8s) provides capabilities that Ansible or
           | Docker-Compose cannot match. While Docker-Compose only
           | supports a basic subset, Kubernetes offers:
           | 
           | * Advanced orchestration features, such as rolling updates,
           | health checks, scaling, and self-healing.
           | 
           | * Automatic maintenance of the desired state for running
           | workloads.
           | 
           | * Restarting failed containers, rescheduling pods, and
           | replacing unhealthy nodes.
           | 
           | * Horizontal pod auto-scaling based on metrics (e.g., CPU,
           | memory, or custom metrics).
           | 
           | * Continuous monitoring and reconciliation of the actual
           | state with the desired state.
           | 
           | * Immediate application of changes to bring resources to the
           | desired configuration.
           | 
           | * Service discovery via DNS and automatic load balancing
           | across pods.
           | 
           | * Native support for Persistent Volumes (PVs) and Persistent
           | Volume Claims (PVCs) for storage management.
           | 
           | * Abstraction of storage providers, supporting local, cloud,
           | and network storage.
           | 
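            | As a rough sketch of a few of these (rolling update,
            | rollback, autoscaling), assuming a Deployment named "web"
            | with a container of the same name already exists:
            | 
            |     # rolling update to a new image, watch it, roll back
            |     kubectl set image deployment/web \
            |       web=registry.example.com/web:1.2.4
            |     kubectl rollout status deployment/web
            |     kubectl rollout undo deployment/web
            |     # horizontal pod autoscaling on CPU
            |     kubectl autoscale deployment web \
            |       --min=2 --max=10 --cpu-percent=80
            | 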
           | If you need these features but are concerned about the
           | complexity of Kubernetes, consider using a managed Kubernetes
           | service like GKE or EKS to simplify deployment and
            | management. Alternatively, and this is my preferred option,
           | combining Terraform with a Container-as-a-Service (CaaS)
           | platform allows the provider to handle most of the
           | operational complexity for you.
        
         | vidarh wrote:
         | Up until a few thousand instances, a well designed setup should
         | be a part time job for a couple of people.
         | 
         | To that scale you can write a custom orchestrator that is
         | likely to be smaller and simpler than the equivalent K8S setup.
         | Been there, done that.
        
         | klooney wrote:
         | > dedicated Kubernetes team with an established on-call
         | rotation.
         | 
          | Using EKS or GKE is basically this. K8s is much nicer than ECS
         | in terms of development and packaging your own apps.
        
       | majkinetor wrote:
        | Highly amateurish take if you call shell spaghetti a Kubernetes,
        | especially if we compare the complexity of both...
        | 
        | You know what would be even worse? Introducing Kubernetes for
        | your non-Google/Netflix/WhateverPlanetaryScale App instead of
        | just writing a few scripts...
        
       | elktown wrote:
        | This is so unnuanced that it reads like rationalization to me.
        | People seem to get stuck on the mantra that simple things are
        | inherently fragile, which isn't really true; or at least they
        | aren't particularly more fragile than navigating a jungle of yaml
        | files and k8s cottage-industry products that link together in
        | arcane ways and tend to be very hard to debug, or even just to
        | understand all the moving parts involved in the flow of a request
        | and thus what can go wrong. I get the feeling that they mostly
        | just don't like that it doesn't have _professional aesthetics_.
        
         | nbk_2000 wrote:
         | This reminds me of the famous Taco Bell Programming post [1].
         | Simple can surprisingly often be good enough.
         | 
         | [1] http://widgetsandshit.com/teddziuba/2010/10/taco-bell-
         | progra...
        
         | TacticalCoder wrote:
         | > People seem to get stuck on mantras that simple things are
         | inherently fragile which isn't really true...
         | 
         | Ofc it isn't true.
         | 
         | Kubernetes was designed at Google at a time when Google was
         | already a behemoth. 99.99% of all startups and SMEs out there
         | shall _never ever_ have the same scaling issues and automation
         | needs that Google has.
         | 
         | Now that said... When you begin running VMs and containers,
         | even only a very few of them, you immediately run into issues
         | and then you begin to think: _" Kubernetes is the solution"_.
         | And it is. But it is also, in many cases, a solution to a
         | problem you created. Still... the justification for creating
          | that problem, if you're not Google scale, is highly
         | disputable.
         | 
         | And, deep down, there's another very fundamental issue IMO:
         | many of those "let's have only one process in one container"
         | solutions actually mean _" we're totally unable to write
         | portable software working on several configs, so let's start
         | with a machine with zero libs and dependencies and install
         | exactly the minimum deps needed to make our ultra-fragile piece
         | of shit of a software kinda work. And because it's still going
         | to be a brittle piece of shit, let's make sure we use
         | heartbeats and try to shut it down and back up again once it'll
         | invariably have memory leaked and/or whatnots"_.
         | 
         | Then you also gained the right to be sloppy in the software you
         | write: not respecting it. Treating it as cattle to be
         | slaughtered, so it can be shitty. But you've now added an
         | insane layer of complexity.
         | 
          | How do you like your uninitialized var when a container launches
         | but then silently doesn't work as expected? How do you like
          | them logs in that case? Someone here has described the lack of
         | instant failure on any uninitialized var as the "billion dollar
         | mistake of the devops world".
         | 
         | Meanwhile look at some proper software like, say, the Linux
         | kernel or a distro like Debian. Or compile Emacs or a browser
          | from source and _marvel_ at what's happening. Sure, there may
          | be hiccups but it works. On many configs. On many different
         | hardware. On many different architectures. These are robust
         | software that don't need to be "pid 1 on a pristine filesystem"
         | to work properly.
         | 
         | In a way this whole _" let's have all our software each as pid
         | 1 each on a pristine OS and filesystem"_ is an admission of a
         | very deep and profound failure of our entire field.
         | 
         | I don't think it's something to be celebrated.
         | 
          | And don't get me started on security: you now have ultra
          | complicated LANs and VLANs, with near impossible to monitor
          | traffic, with shitloads of ports open everywhere, the most
          | gigantic attack surface of them all, and heartbeats and
          | whatnots constantly polluting the network, where nobody
          | even knows anymore what's going on. Where the only
          | actual security seems to rely on the firewall being up and
          | correctly configured, which is incredibly complicated to do
          | given the insane network complexity you added to your stack. _"
          | Oh wait, I have an idea, let's make configuring the firewall a
          | service!"_ (and make sure not to forget to initialize one of
          | the countless vars or it'll all silently break and just not
          | configure firewalling for anything).
         | 
          | Now, tough love is true love: even at home I'm running a
          | hypervisor with VMs and OCI containers ; )
        
       | do_not_redeem wrote:
       | > The inscrutable iptables rules?
       | 
       | You mean the list of calls right there in the shell script?
       | 
       | > Who will know about those undocumented sysctl edits you made on
       | the VM?
       | 
       | You mean those calls to `sysctl` conveniently right there in the
       | shell script?
       | 
       | > your app needs to programmatically spawn other containers
       | 
       | Or you could run a job queue and push tasks to it (gaining all
       | the usual benefits of observability, concurrency limits, etc),
       | instead of spawning ad-hoc containers and hoping for the best.
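        | 
        | i.e. something like the following is hardly inscrutable or
        | undocumented; a minimal sketch with made-up values and paths:
        | 
        |     #!/usr/bin/env bash
        |     set -euo pipefail
        |     # host tweaks live right here, greppable and reviewable
        |     sysctl -w net.core.somaxconn=4096
        |     iptables -A INPUT -p tcp --dport 443 -j ACCEPT
        |     docker compose -f /srv/app/docker-compose.yml up -d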
        
         | jrs235 wrote:
         | "We don't know how to learn/read code we are unfamiliar with...
         | Nor do we know how to grok and learn things quickly. Heck, we
         | don't know what grok means "
        
           | ewuhic wrote:
           | Who do you quote?
        
       | diminish wrote:
        | One can build a better container orchestrator than Kubernetes;
       | things don't need to be that complex.
        
       | greenie_beans wrote:
       | i'm at this crossroads right now. somebody talk me out of
       | deploying a dagster etl on azure kubernetes service rather than
       | deploying all of the pieces onto azure container apps with my own
       | bespoke scripts / config
        
         | greenie_beans wrote:
         | writing this out helped me re-validate what i need to do
        
           | avandekleut wrote:
           | what did you decide to do?
        
       | kasey_junk wrote:
       | Both this piece and the piece it's imitating seem to have 2
       | central implicit axioms that in my opinion don't hold. The first,
       | that the constraints of the home grown systems are all cost and
       | the second that the flexibility of the general purpose solution
       | is all benefit.
       | 
       | You generally speaking do not want a code generation or service
       | orchestration system that will support the entire universe of
       | choices. You want your programs and idioms to follow similar
       | patterns across your codebase and you want your services
       | architected and deployed the same way. You want to know when
       | outliers get introduced and similarly you want to make it costly
       | enough to require introspection on if the value of the benefit
       | out ways the cost of oddity.
        
         | jesseendahl wrote:
         | outweighs*
         | 
         | Only offering the correction because I was confused at what you
         | meant by "out ways" until I figured it out.
        
         | jerf wrote:
         | I like to say, you can make anything look good by considering
         | only the benefits and anything look bad by considering only the
         | costs.
         | 
         | It's a fun philosophy for online debates, but an expensive one
         | to use in real engineering.
        
         | relaxing wrote:
         | > You generally speaking do not want a code generation or
         | service orchestration system that will support the entire
         | universe of choices.
         | 
         | This. I will gladly give up the universe of choices for a one
         | size fits most solution that just works. I will bend my use
         | cases to fit the mold if it means not having to write k8s
         | configuration in a twisty maze of managed services.
        
         | dogleash wrote:
         | The compiler one read to me like a reminder to not ignore the
         | lessons of compiler design. The premise being that even though
          | you have a small-scope project compared to a "real" compiler, you
          | will evolve towards analogues of those _design_ ideas. The
          | databases and k8s pieces are more like don't even try a small
         | scope project because you'll want the same _features_
         | eventually.
        
           | tptacek wrote:
           | I had a hard time putting my finger on what was so annoying
           | about the follow-ons to the compiler post, and this nails it
           | for me. Thanks!
        
       | incrudible wrote:
       | Dear friend, you have made a slippery slope argument.
        
         | nine_k wrote:
          | Yes, because the whole situation is a slippery slope (only
         | upwards). In the initial state, k8s is obviously overkill; in
         | the end state, k8s is obviously adequate.
         | 
         | The problem is choosing the point of transition, and allocating
         | resources for said transition. Sometimes it's easier to
         | allocate a small chunk to update your bespoke script right now
         | instead of sinking more to a proper migration. It's a typical
         | dilemma of taking debt vs paying upfront.
         | 
         | (BTW the same dilemma exists with running in the cloud vs
         | running on bare metal; the only time when a migration from the
         | cloud is easy is the beginning, when it does not make financial
         | sense.)
        
           | incrudible wrote:
           | Odds are you have 100 DAUs and your "end state" is an "our
           | incredible journey" blog post. I understand that people want
           | to pad their resume with buzzwords on the way, but I don't
           | accept making a virtue out of it.
        
       | hamilyon2 wrote:
       | For the uninitiated: how does k8s handle OS upgrades? If
       | development moves to next version of Debian, because it should
       | eventually, are upgrades, for example, 2x harder vs docker-
        | compose? 2x easier? About the same? Is it even the right question
       | ask?
        
         | JanMa wrote:
         | It doesn't. The usual approach is to create new nodes with the
         | updated OS, migrate all workloads over and then throw away the
         | old ones
        
         | Thiez wrote:
         | Your cluster consists of multiple machines ('nodes'). Upgrading
         | is as simple as adding a new, upgraded node, then evicting
          | everything from one of the existing nodes, then taking it down.
         | Repeat until every node is replaced.
         | 
          | Downtime is the same as with a deployment, so if you run at
         | least 2 copies of everything there should be no downtime.
         | 
         | As for updating the images of your containers, you build them
         | again with the newer base image, then deploy.
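          | 
          | The per-node rotation is roughly this (node name made up;
          | assumes a replacement node has already joined the cluster):
          | 
          |     kubectl cordon old-node-1      # stop scheduling onto it
          |     kubectl drain old-node-1 \
          |       --ignore-daemonsets --delete-emptydir-data
          |     kubectl delete node old-node-1 # then decommission the machine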
        
       | JanMa wrote:
       | Dear friend, you should first look into using Nomad or Kamal
       | deploy instead of K8S
        
         | mdaniel wrote:
         | You mean the rugpull-stack? "Pray we do not alter the deal
         | further when the investors really grumble"
         | https://github.com/hashicorp/nomad/blob/v1.9.3/LICENSE
         | 
         | As for Kamal, I shudder to think of the hubris required to say
         | "pfft, haproxy is for lamez, how hard can it be to make my own
         | lb?!" https://github.com/basecamp/kamal-proxy
        
       | signal11 wrote:
       | Dear Friend,
       | 
       | This fascination with this new garbage-collected language from a
       | Santa Clara vendor is perplexing. You've built yourself a COBOL
       | system by another name.
       | 
       | /s
       | 
       | I love the "untested" criticism in a lot of these use-k8s
       | screeds, and also the suggestion that they're hanging together
       | because of one guy. The implicit criticism is that doing your own
       | engineering is bad, really, you should follow the crowd.
       | 
       | Here's a counterpoint.
       | 
        | Sometimes just writing YAML is enough. Sometimes it's not. E.g.
        | there are times when managed k8s is just not on the table, say
        | because of compliance or business issues. Then you have to think
       | about self-managed k8s. That's rather hard to do well. And often,
       | you don't need all of that complexity.
       | 
       | Yet -- sometimes availability and accountability reasons mean
       | that you need to have a really deep understanding of your stack.
       | 
       | And in those cases, having the engineering capability to
       | orchestrate isolated workloads, move them around, resize them,
       | monitor them, etc is imperative -- and engineering capability
       | means understanding the code, fixing bugs, improving the system.
       | Not just writing YAML.
       | 
       | It's shockingly inexpensive to get this started with a two-pizza
       | team that understands Linux well. You do need a couple really
       | good, experienced engineers to start this off though. Onboarding
       | newcomers is relatively easy -- there's plenty of mid-career
       | candidates and you'll find talent at many LUGs.
       | 
       | But yes, a lot of orgs won't want to commit to this because they
       | don't want that engineering capability. But a few do - and having
       | that capability really pays off in the ownership the team can
       | take for the platform.
       | 
       | For the orgs that do invest in the engineering capability, the
       | benefit isn't just a well-running platform, it's having access to
       | a team of engineers who feel they can deal with anything the
       | business throws at them. And really, creating that high-
       | performing trusted team is the end-goal, it really pays off for
       | all sorts of things. Especially when you start cross-pollinating
       | your other teams.
       | 
       | This is definitely not for everyone though!
        
       | Spivak wrote:
       | Infra person here, this is such the wrong take.
       | 
       | > Do I really need a separate solution for deployment, rolling
       | updates, rollbacks, and scaling.
       | 
       | Yes it's called an ASG.
       | 
       | > Inevitably, you find a reason to expand to a second server.
       | 
       | ALB, target group, ASG, done.
       | 
       | > Who will know about those undocumented sysctl edits you made on
       | the VM
       | 
       | You put all your modifications and CIS benchmark tweaks in a repo
       | and build a new AMI off it every night. Patching is switching the
       | AMI and triggering a rolling update.
       | 
       | > The inscrutable iptables rules
       | 
       | These are security groups, lord have mercy on anyone who thinks
       | k8s network policy is simple.
       | 
       | > One of your team members suggests connecting the servers with
       | Tailscale: an overlay network with service discovery
       | 
       | Nobody does this, you're in AWS. If you use separate VPCs you can
       | peer them but generally it's just editing some security groups
       | and target groups. k8s is forced into needing to overlay on an
       | already virtual network because they need to address pods rather
       | than VMs, when VMs are your unit you're just doing basic
       | networking.
       | 
       | You reach for k8s when you need control loops beyond what ASGs
       | can provide. The magic of k8s is "continuous terraform," you will
       | know when you need it and you likely never will. If your infra
       | moves from one static config to another static config on deploy
       | (by far the usual case) then no k8s is fine.
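        | 
        | The AMI-swap step can look roughly like this (names are
        | placeholders; assumes the ASG's launch template is set to track
        | its latest version):
        | 
        |     aws ec2 create-launch-template-version \
        |       --launch-template-name app-lt \
        |       --launch-template-data '{"ImageId":"<new-ami-id>"}'
        |     aws autoscaling start-instance-refresh \
        |       --auto-scaling-group-name app-asg \
        |       --preferences '{"MinHealthyPercentage":90}'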
        
         | SahAssar wrote:
         | I'm sure the American Sewing Guild is fantastic, but how do
         | they help here?
        
           | atsaloli wrote:
           | ASG = Auto-Scaling Group
           | 
           | https://docs.aws.amazon.com/autoscaling/ec2/userguide/auto-s.
           | ..
        
       | jpgvm wrote:
       | k8s is the API. Forget the implementation, it's really not that
       | important.
       | 
       | Folks that get tied up in the "complexity" argument are forever
       | missing the point.
        
         | mbrumlow wrote:
         | The thing that the k8s api does is force you to do good
         | practices, that is it.
        
       | lttlrck wrote:
       | I thought k8s might be a solution so I decided to learn through
       | doing. It quickly became obvious that we didn't need 90% of its
       | capabilities but more important it'd put undue load/training on
       | the rest of the team. It would be a lot more sensible to write
       | custom orchestration using the docker API - that was
       | straightforward.
       | 
       | Experimenting with k8s was very much worthwhile. It's an amazing
       | thing and was in many ways inspirational. But using it would have
       | been swimming against the tide so to speak. So sure I built a
       | mini-k8s-lite, it's better for us, it fits better than wrapping
       | docker compose.
       | 
       | My only doubt is whether I should have used podman instead but at
       | the time podman seemed to be in an odd place (3-4 years ago now).
       | Though it'd be quite easy to switch now it hardly seems
       | worthwhile.
        
       | lousken wrote:
        | why add complexity when many services don't even need horizontal
        | scaling? Servers are powerful enough that, as long as you don't
        | write horrible code, they're fine for millions of requests a day
        | without much work.
        
       | stickfigure wrote:
       | Dear Amazon Elastic Beanstalk, Google App Engine, Heroku, Digital
       | Ocean App Platform, and friends,
       | 
       | Thank you for building "a kubernetes" for me so I don't have to
       | muck with that nonsense, or have to hire people that do.
       | 
       | I don't know what that other guy is talking about.
        
       | cedws wrote:
       | Now compare cloud bills.
        
       | nanomcubed wrote:
       | Like, okay, if that's how you see it, but what's with the tone
       | and content?
       | 
       | The tone's vapidity is only comparable to the content's.
       | 
       | This reads like mocking the target audience rather than showing
       | them how you can help.
       | 
       | A write up that took said "pile of shell scripts that do not
       | work" and showed how to "make it work" with your technology of
       | choice would have been more interesting than whatever this is.
        
       | hamdouni wrote:
        | I was using some ansible playbook scripts to deploy some web app
        | to production. One day the scripts stopped working because of a
        | boring error about a Python version mismatch.
        | 
        | I rewrote all the deployment scripts in bash (took less than an
        | hour) and have never had a problem since.
        | 
        | Moral: it's hard to find the right tool for the job.
        
       | mbrumlow wrote:
       | Most of the complaints in this fun post are just bad practice,
       | and really nothing to do with "making a Kubernetes".
       | 
       | Sans bad engineering practices, if you built a system that did
       | the same things as kubernetes I would have no problem with it.
       | 
       | In reality I don't want everybody to use k8s. I want people
       | finding different solutions to solve similar problems.
        | Homogenized ecosystems create walls that block progress.
       | 
        | One of the big things that is overlooked when people move to k8s,
       | and why things get better when moving to k8s, is that k8s made a
       | set of rules that forced service owners to fix all of their bad
       | practices.
       | 
       | Most deployment systems would work fine if the same work to
       | remove bad practices from their stack occurred.
       | 
       | K8s is the hot thing today, but mark my words, it will be
       | replaced with something far more simple and much nicer to
       | integrate with. And this will come from some engineer "creating a
       | kubernetes"
       | 
       | Don't even get me started on how crappy the culture of "you are
       | doing something hard that I think is already a solved problem"
        | is. This goes for compilers and databases too. None of these are
        | hard, and neither is k8s, and any good engineer tasked with
        | making one should be able to do so.
        
         | Kinrany wrote:
         | So you're saying companies should move to k8s and then
         | immediately move to bash scripts
        
           | mbrumlow wrote:
           | No. I am saying that companies should have their engineers
           | understand why k8s works and make those reasons an
           | engineering practice.
           | 
            | As it is today, the pattern is: spend a ton of money moving to
            | k8s (mostly costly managed solutions), fixing all the bad
            | engineering patterns along the way because k8s forces you to.
            | Then an engineer saves the company money by moving back to a
            | more home-grown solution, one that fits the company's needs,
            | something that would only be possible once the engineering
            | practices were fixed.
        
       | danjl wrote:
       | I love that the only alternative is a "pile of shell scripts".
       | Nobody has posted a legitimate alternative to the complexity of
        | K8S or the simplicity of docker compose. Certainly feels like
       | there's a gap in the market for an opinionated deployment
       | solution that works locally and on the cloud, with less
       | functionality than K8S and a bit more complexity than docker
       | compose.
        
         | drewbailey wrote:
         | K8s just drowns out all other options. Hashicorp Nomad is
         | great, https://www.nomadproject.io/
        
           | marvinblum wrote:
           | Thumbs up for Nomad. We've been running it for about 3 years
           | in prod now and it hasn't failed us a single time.
        
         | jedberg wrote:
         | I hate to shill my own company, but I took the job because I
         | believe in it.
         | 
         | You should check out DBOS and see if it meets your middle
         | ground requirements.
         | 
         | Works locally and in the cloud, has all the things you'd need
         | to build a reliable and stateful application.
         | 
         | [0] https://dbos.dev
        
           | justinclift wrote:
           | Looks interesting, but this is a bit worrying:
           | ... build reliable AI agents with automatic retries and no
            | limit on how long they can run for.
           | 
           | It's pretty easy to see how that could go badly wrong. ;)
           | 
           | (and yeah, obviously "don't deploy that stuff" is the
           | solution)
           | 
           | ---
           | 
           | That being said, is it all OSS? I can see some stuff here
           | that seems to be, but it mostly seems to be the client side
           | stuff?
           | 
           | https://github.com/dbos-inc
        
             | jedberg wrote:
             | Maybe that is worded poorly. :). It's supposed to mean
             | there are no timeouts -- you can wait as long as you want
             | between retries.
             | 
             | > That being said, is it all OSS?
             | 
             | The Transact library is open source and always will be.
              | That is what gets you the durability, statefulness,
             | some observability, and local testing.
             | 
             | We also offer a hosted cloud product that adds in the
             | reliability, scalability, more observability, and a time
             | travel debugger.
        
           | danjl wrote:
           | Nice, but I like my servers and find serverless difficult to
           | debug.
        
             | jedberg wrote:
             | That's the beauty of this system. You build it all locally,
             | test it locally, debug it locally. Only then do you deploy
             | to the cloud. And since you can build the whole thing with
             | one file, it's really easy to reason about.
             | 
             | And if somehow you get a bug in production, you have the
             | time travel debugger to replay exactly what the state of
             | the cloud was at the time.
        
               | danjl wrote:
               | Great to hear you've improved serverless debugging. What
               | if my endpoint wants to run ffmpeg and extract frames
                | from video? How does that work on serverless?
        
         | iamsanteri wrote:
         | Docker Swarm mode? I know it's not as well maintained, but I
         | think it's exactly what you talk about here (forget K3s, etc).
         | I believe smaller companies run it still and it's perfect for
         | personal projects. I myself run mostly docker compose + shell
         | scripts though because I don't really need zero-downtime
         | deployments or redundancy/fault tolerance.
        
         | kikimora wrote:
          | While not opinionated, you can go with cloud-specific tools
         | (e.g. ECS in AWS).
        
           | danjl wrote:
           | Sure, but those don't support local deployment, at least not
           | in any sort of easy way.
        
         | sc68cal wrote:
         | Ansible and the podman Ansible modules
        
         | dijit wrote:
         | I coined a term for this because I see it so often.
         | 
         | "People will always defend complexity, stating that the only
         | alternative is shell scripts".
         | 
         | I saw people defending docker this way, ansible this way and
         | most recently systemd this way.
         | 
         | Now we're on to kubernetes.
        
           | d--b wrote:
           | At least I never saw anyone arguing that the only alternative
           | to git was shell scripts.
           | 
           | Wait. Wouldn't that be a good idea?
        
         | nicodjimenez wrote:
         | Agreed, something simpler than Nomad as well hopefully.
        
         | czhu12 wrote:
          | This is basically exactly what we needed at the startup I
         | worked at, with the added need of being able to host open
         | source projects (airbyte, metabase) with a reasonable level of
         | confidence.
         | 
         | We ended up migrating from Heroku to Kubernetes. I tried to
         | take some of the learnings to build
         | https://github.com/czhu12/canine
         | 
         | It basically wraps Kubernetes and tries to hide as much
         | complexity from Kubernetes as possible, and only expose the
          | good parts that will be enough for 95% of web application
          | workloads.
        
       | highspeedbus wrote:
       | >Tired, you parameterize your deploy script and configure
       | firewall rules, distracted from the crucial features you should
       | be working on and shipping.
       | 
       | Where's your Sysop?
        
       | physicsguy wrote:
       | Kubernetes biggest competitor isn't a pile of bash scripts and
       | docker running on a server, it's something like ECS which comes
        | with a lot of the benefits but a hell of a lot less complexity.
        
         | FridgeSeal wrote:
         | FWIW I've been using ECS at my current work (previously K8s)
         | and to me it feels just flat worse:
         | 
         | - only some of the features
         | 
         | - none of the community
         | 
         | - all of the complexity but none of the upsides.
         | 
         | It was genuinely a bit shocking that it was considered a
         | serious product seeing as how chaotic it was.
        
           | avandekleut wrote:
           | Can you elaborate on some of the issues you faced? I was
           | considering deploying to ECS fargate as we are all-in on AWS.
        
             | FridgeSeal wrote:
             | Any kind of git-ops style deployment was out.
             | 
              | ECS merges "AWS config" and "app/deployment config"
              | together, so it was difficult to separate what should go
              | in TF from what is runtime app configuration. In
             | comparison this is basically trivial ootb with K8s.
             | 
             | I personally found a lot of the moving parts and names
             | needlessly confusing. Tasks e.g. were not your equivalent
             | to "Deployment".
             | 
             | Want to just deploy something like Prometheus Agent? Well,
             | too bad, the networking doesn't work the same, so here's
             | some overly complicated guide where you have to deploy some
             | extra stuff which will no doubt not work right the first
             | dozen times you try. Admittedly, Prom can be a right pain
             | to manage, but the fact that ECS makes you do _extra_ work
             | on top of an already fiddly piece of software left a bad
             | taste in my mouth.
             | 
              | I think ECS gets a lot of airtime because of Fargate, but
             | you can use Fargate on K8s these days, or, if you can
             | afford the small increase in initial setup complexity, you
              | can just have Fargate's less-expensive, less-restrictive,
             | better sibling: Karpenter on Spot instances.
        
             | andycowley wrote:
              | If your workloads are fairly static, ECS is fine. Bringing
             | up new containers and nodes takes ages with very little
             | feedback as to what's going on. It's very frustrating when
             | iterating on workloads.
             | 
             | Also fargate is very expensive and inflexible. If you fit
             | the narrow particular use case it's quicker for bringing up
             | workloads, but you pay extra for it.
        
         | jbmsf wrote:
         | Can confirm. I've used ECS with Fargate successfully at
         | multiple companies. Some eventually outgrew it. Some failed
         | first. Some continue to use ECS happily.
         | 
         | Regardless of the outcome, it always felt more important to
         | keep things simple and focus on product and business needs.
        
       | marcusestes wrote:
       | You did a no-SQL, you did a serverless, you did a micro-services.
       | This makes it abundantly clear you do not understand the nature
       | of your architectural patterns and the multiplicity of your
       | offenses.
        
       | alganet wrote:
       | Why do I feel this is not so simple as the compiler scenario?
       | 
       | I've seen a lot of "piles of YAML", even contributed to some.
       | There were some good projects that didn't end up in disaster, but
       | to me the same could be said for the shell.
        
       | czhu12 wrote:
       | I think one thing that is under appreciated with kubernetes is
       | how massive the package library is. It becomes trivial to stand
       | up basically every open source project with a single command via
       | helm. It gets a lot of hate but for medium sized deployments,
       | it's fantastic.
       | 
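        | For example (chart repos are the community-published ones as far
        | as I know; check each project's docs for the current location):
        | 
        |     helm repo add sentry https://sentry-kubernetes.github.io/charts
        |     helm repo add airbyte https://airbytehq.github.io/helm-charts
        |     helm install sentry sentry/sentry -n sentry --create-namespace
        |     helm install airbyte airbyte/airbyte -n airbyte --create-namespace
        | 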
       | Before helm, just trying to run third party containers on bare
       | metal resulted in constant downtime when the process would just
        | hang for no reason, and an engineer would have to SSH and
       | manually restart the instance.
       | 
        | We used this at a previous startup to host metabase, sentry and
       | airbyte seamlessly, on our own cluster. Which let us break out of
       | the constant price increases we faced for hosted versions of
       | these products.
       | 
       | Shameless plug: I've been building
       | https://github.com/czhu12/canine to try to make Kubernetes easier
       | to use for solo developers. Would love any feedback from anyone
       | looking to deploy something new to K8s!
        
         | tptacek wrote:
         | Right, but this isn't a post about why K8s is _good_ , it's a
         | post about why K8s is _effectively mandatory_ , and it isn't,
         | which is why the post rankles some people.
        
           | czhu12 wrote:
            | Yeah I mostly agree. I'd even add that even K8s YAMLs are not
           | trivial to maintain, especially if you need to have them be
           | produced by a templating engine.
        
       | mildred593 wrote:
        | I started with a large shell script; the next iteration was
        | written in Go and less specific. I still think that for some
        | things, k8s is just too much.
       | 
       | https://github.com/mildred/conductor.go/
        
       ___________________________________________________________________
       (page generated 2024-11-24 23:00 UTC)