[HN Gopher] Dear friend, you have built a Kubernetes
___________________________________________________________________
Dear friend, you have built a Kubernetes
Author : todsacerdoti
Score : 109 points
Date : 2024-11-24 05:03 UTC (17 hours ago)
(HTM) web link (www.macchaffee.com)
(TXT) w3m dump (www.macchaffee.com)
| zug_zug wrote:
| For what it's worth, I've worked at multiple places that ran
| shell scripts just fine for their deploys.
|
| - One had only 2 services [php] and ran over 1 billion requests a
| day. Deploy was trivial, ssh some new files to the server and run
| a migration, 0 downtime.
|
| - One was in an industry that didn't need "Webscale" (retirement
| accounts). Prod deploys were just docker commands run by jenkins.
| We ran two servers per service from the day I joined to the
| day I left 4 years later (3x growth), and ultimately removed one
| service and one database during all that growth.
|
| Another outstanding thing about both of these places was that we
| had all the testing environments you need, on-demand, in minutes.
|
| The place I'm at now is trying to do kubernetes and is failing
| miserably (ongoing nightmare 4 months in and probably at least 8
| to go, when it was allegedly supposed to only take 3 total). It
| has one shared test environment that it takes 3 hours to see your
| changes in.
|
| I don't fault kubernetes directly, I fault the overall
| complexity. But at the end of the day kubernetes feels like
| complexity trying to abstract over complexity, and often I find
| that's less successful than removing complexity in the first
| place.
| leetrout wrote:
| Yea but that doesn't sound shiny on your resume.
| nine_k wrote:
| Depends on what kind of company you want to join. Some value
| simplicity and efficiency more.
| loftsy wrote:
| Are you self hosting kubernetes or running it managed?
|
| I've only used it managed. There is a bit of a learning curve
| but it's not so bad. I can't see how it can take 4 months to
| figure it out.
| zug_zug wrote:
| We are using EKS
|
| > I can't see how it can take 4 months to figure it out.
|
| Well have you ever tried moving a company with a dozen
| services onto kubernetes piece-by-piece, with zero downtime?
| How long would it take you to correctly move and test every
| permission, environment variable, and issue you run into?
|
| Then if you get a single setting wrong (e.g. memory size) and
| don't load-test with realistic traffic, you bring down
| production, potentially lose customers, and have to do a
| public post-mortem about your mistakes? [true story for
| current employer]
|
| I don't see how anybody says they'd move a large company to
| kubernetes in such an environment in a few months with no
| screwups and solid testing.
| tail_exchange wrote:
| It largely depends how customized each microservice is, and
| how many people are working on this project.
|
| I've seen migrations of thousands of microservices
| happening within the span of two years. Longer timeline, yes,
| but the number of microservices is orders of magnitude
| larger.
|
| Though I suppose the organization works differently at this
| level. The Kubernetes team built a tool to migrate the
| microservices, and each owner was asked to perform the
| migration themselves. Small microservices could be migrated
| in less than three days, while the large and risk-critical
| ones took a couple weeks. This all happened in less than
| two years, but it took more than that in terms of
| engineer/weeks.
|
| The project was very successful though. The company spends
| way less money now because of the autoscaling features, and
| the ability to run multiple microservices in the same node.
|
| Regardless, if the company is running 12 microservices and
| this number is expected to grow, this is probably a good
| time to migrate. How did they account for the different
| shape of services (stateful, stateless, leader elected,
| cron, etc), networking settings, styles of deployment
| (blue-green, rolling updates, etc), secret management, load
| testing, bug bashing, gradual rollouts, dockerizing the
| containers, etc? If it's taking 4x longer than originally
| anticipated, it seems like there was a massive failure in
| project design.
| hedora wrote:
| 2000 products sounds like you made 2000 engineers learn
| kubernetes (a week, optimistically, 2000/52 = 38 engineer
| years, or roughly one wasted career).
|
| Similarly, the actual migration times you estimate add up
| to decades of engineer time.
|
| It's possible kubernetes saves more time than using the
| alternative costs, but that definitely wasn't the case at
| my previous two jobs. The jury is out at the current job.
|
| I see the opportunity cost of this stuff every day at
| work, and am patiently waiting for a replacement.
| tail_exchange wrote:
| > 2000 products sounds like you made 2000 engineers learn
| kubernetes (a week, optimistically, 2000/52 = 38 engineer
| years, or roughly one wasted career).
|
| Not really, they only had to use the tool to run the
| migration and then validate that it worked properly. As
| the other commenter said, a very basic setup for
| kubernetes is not that hard; the difficult setup is left
| to the devops team, while the service owners just need to
| know the basics.
|
| But sure, we can estimate it at 38 engineering years.
| That's still 38 years for 2,000 microservices; it's way
| better than 1 year for 12 microservices like in OP's
| case. The savings we got were enough to offset these 38
| years of work, so this project is now paying dividends.
| mschuster91 wrote:
| > 2000 products sounds like you made 2000 engineers learn
| kubernetes (a week, optimistically, 2000/52 = 38 engineer
| years, or roughly one wasted career).
|
| Learning k8s enough to be able to work with it isn't
| _that_ hard. In my experience it's enough to have a
| centralized team write up a decent template for a CI/CD
| pipeline, a Dockerfile for the most common stacks you use,
| and a Helm chart with an example Deployment,
| PersistentVolumeClaim, Service and Ingress; distribute
| that, and be available for support whenever a team's needs
| go beyond "we need 1-N pods for this service, they get
| their configuration from some environment variables, and
| maybe a Secret/ConfigMap if the application would rather
| be configured through files".
| relaxing wrote:
| > Learning k8s enough to be able to work with it isn't
| that hard.
|
| I've seen a lot of people learn enough k8s to be
| dangerous.
|
| Learning it well enough to not get wrapped around the
| axle with some networking or storage details is quite a
| bit harder.
| mschuster91 wrote:
| For sure, but that's the job of a good ops department.
| Where I work, for example, every project's CI/CD
| pipeline has its own IAM user mapping to a Kubernetes
| role that only has explicitly defined capabilities:
| create, modify and delete just the utter basics. Even if
| they'd commit something into the Helm chart that could
| cause an annoyance, the service account wouldn't be able
| to call the required APIs. And the templates themselves
| come with security built-in - privileges are all
| explicitly dropped, pod UIDs/GIDs hardcoded to non-root,
| and we're deploying Network Policies at least for ingress
| as well now. Only egress network policies aren't
| available, we haven't been able to make these work with
| services.
|
| Anyone wishing to do stuff like use the RDS database
| provisioner gets an introduction from us on how to use it
| and what the pitfalls are, and regular reviews of their
| code. They're flexible but we keep tabs on what they're
| doing, and when they have done something useful we aren't
| shy about integrating whatever they have done into our
| shared template repository.
| jrs235 wrote:
| > I don't see how anybody says they'd move a large company
| to kubernetes in such an environment in a few months with
| no screwups and solid testing.
|
| Unfortunately, I do. Somebody says that when the culture of
| the organization expects to be told what it wants to hear
| rather than the cold hard truth. And the person saying it
| is likely saying it from a perch up high, not responsible
| for the day-to-day work of actually implementing the
| change. I see this happen when the person,
| management/leadership, lacks the skills and knowledge to
| perform the work themselves. They've never been in the
| trenches and had to actually deal face to face with the
| devil in the details.
| zdragnar wrote:
| Comparing the simplicity of two PHP servers against a setup
| with a dozen services is always going to be one sided. The
| difference in complexity alone is massive, regardless of
| whether you use k8s or not.
|
| My current employer did something similar, but with fewer
| services. The upshot is that with terraform and helm and
| all the other yaml files defining our cluster, we have test
| environments on demand, and our uptime is 100x better.
| loftsy wrote:
| Fair enough that sounds hard.
|
| Memory size is an interesting example. A typical Kubernetes
| deployment has much more control over this than a typical
| non-container setup. It costs you effort to figure out the
| right setting, but in the long term you are rewarded with a
| more robust and more re-deployable application.
| sethammons wrote:
| Took us three to four years to go from self-hosted multi-DC
| to getting the main product almost fully in k8s (some parts
| didn't make sense in k8s and were pushed to our geo-
| distributed edge nodes). Dozens of services and teams, and
| keeping the old stuff working while changing the tire on
| the car while driving. All while the company continues to
| grow and scale doubles every year or so. It takes maturity
| in testing and monitoring and it takes longer than everyone
| estimates
| Cpoll wrote:
| It sounds like it's not easy to figure out the permissions,
| envvars, memory size, etc. of your _existing_ system, and
| that's why the migration is so difficult? That's not
| really one of Kubernetes' (many) failings.
| Vegenoid wrote:
| Yes, and now we are back at the ancestor comment's
| original point: "at the end of the day kubernetes feels
| like complexity trying to abstract over complexity, and
| often I find that's less successful than removing
| complexity in the first place"
|
| Which I understand to mean "some people think using
| Kubernetes will make managing a system easier, but it
| often will not do that"
| Pedro_Ribeiro wrote:
| Can you elaborate on other things you think Kubernetes
| gets wrong? Asking out of curiosity because I haven't
| delved deep into it.
| malux85 wrote:
| Canary deploy dude (or dude-ette), route 0.001% of service
| traffic and then slowly move it over. Then set error
| budgets. Then a bad service won't "bring down production".
|
| That's how we did it at Google (I was part of the core team
| responsible for ad serving infra - billions of ads to
| billions of users a day)
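|
| As a minimal illustration of that weighted routing (not how any
| particular load balancer does it; the backend URLs and starting
| weight below are hypothetical), the core of a canary split is
| just a biased coin flip per request:
|
|   import random
|
|   STABLE = "https://stable.internal.example"  # hypothetical
|   CANARY = "https://canary.internal.example"  # hypothetical
|
|   def pick_backend(canary_weight: float = 0.001) -> str:
|       # Send a tiny, configurable fraction of requests to the
|       # canary; ramp the weight up (0.001 -> 0.01 -> 0.1 -> 1.0)
|       # and halt the rollout if the canary burns its error budget.
|       return CANARY if random.random() < canary_weight else STABLE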
| pclmulqdq wrote:
| Using microk8s or k3s on one node works fine. As the author
| of "one big server," I am now working on an application that
| needs some GPUs and needs to be able to deploy on customer
| hardware, so k8s is natural. Our own hosted product runs on 2
| servers, but it's ~10 containers (including databases, etc).
| YZF wrote:
| If your application doesn't need and likely won't need to scale
| to large clusters, or multiple clusters, then there's nothing
| wrong per se with your solution. I don't think k8s is that
| hard but there are a lot of moving pieces and there's a bit to
| learn. Finding someone with experience to help you can make a
| ton of difference.
|
| Questions worth asking:
|
| - Do you need a load balancer?
|
| - TLS certs and rotation?
|
| - Horizontal scalability.
|
| - HA/DR
|
| - dev/stage/production + being able to test/stage your complete
| stack on demand.
|
| - CI/CD integrations, tools like ArgoCD or Spinnaker
|
| - Monitoring and/or alerting with Prometheus and Grafana
|
| - Would you benefit from being able to deploy a lot of off-the-
| shelf software (let's say Elasticsearch, or some random
| database, or a monitoring stack) via helm quickly/easily.
|
| - "Ingress"/proxy.
|
| - DNS integrations.
|
| If you answer yes to many of those questions there's really no
| better alternative than k8s. If you're building large enough
| scale web applications, the answer to most of these will end up
| being yes at some point.
| leetrout wrote:
| > Spawning containers, of course, requires you to mount the
| Docker socket in your web app, which is wildly insecure
|
| Dear friend, you are not a systems programmer
| pzmarzly wrote:
| To expand on this, the author is describing the so-called
| "Docker-out-of-Docker (DooD) pattern", i.e. exposing Docker's
| Unix socket into the container. Since Docker was designed to
| work remotely (CLI on another machine than DOCKER_HOST), this
| works fine, but essentially negates all isolation.
|
| For many years now, all major container runtimes have supported
| nesting. Some make it easy (podman and runc just work), some
| make it hard (systemd-nspawn requires setting many flags to work
| nested). This is called "Docker-in-Docker" (DinD).
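|
| For illustration, this is roughly what the DooD pattern looks
| like from inside the web app, assuming the host's
| /var/run/docker.sock is mounted in and the Python docker SDK is
| installed; the "child" is really a sibling with access to the
| host daemon, which is why it negates isolation:
|
|   import docker
|
|   # Talks to the *host* daemon through the mounted Unix socket.
|   client = docker.from_env()
|
|   sibling = client.containers.run(
|       "alpine:3.20",  # hypothetical image
|       ["sh", "-c", "echo hello from a sibling container"],
|       detach=True,
|       auto_remove=True,
|   )
|   print(sibling.id)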
| rthnbgrredf wrote:
| I think we need to distinguish between two cases:
|
| For a hobby project, using Docker Compose or Podman combined with
| systemd and some shell scripts is perfectly fine. You're the only
| one responsible, and you have the freedom to choose whatever
| works best for you.
|
| However, in a company setting, things are quite different. Your
| boss may assign you new tasks that could require writing a lot of
| custom scripts. This can become a problem for other team members
| and contractors, as such scripts are often undocumented and don't
| follow industry standards.
|
| In this case, I would recommend using Kubernetes (k8s), but only
| if the company has a dedicated Kubernetes team with an
| established on-call rotation. Alternatively, I suggest leveraging
| a managed cloud service like ECS Fargate to handle container
| orchestration.
|
| There's also strong competition in the "Container as a Service"
| (CaaS) space, with smaller and more cost-effective options
| available if you prefer to avoid the major cloud providers.
| Overall, these CaaS solutions require far less maintenance
| compared to managing your own cluster.
| chamomeal wrote:
| How would you feel if bash scripts were replaced with Ansible
| playbooks?
|
| At a previous job at a teeny startup, each instance of the
| environment is a docker-compose instance on a VPS. It works
| great, but they're starting to get a bunch of new clients, and
| some of them need fully independent instances of the app.
|
| Deployment gets harder with every instance because it's just a
| pile of bash scripts on each server. My old coworkers have to
| run a build for each instance for every deploy.
|
| None of us had used ansible, which _seems_ like it could be a
| solution. It would be a new headache to learn, but it seems
| like less of a headache than kubernetes!
| klooney wrote:
| Ansible ultimately runs scripts, in parallel, in a defined
| order across machines. It can help a lot, but it's subject to
| a lot of the same state bitrot issues as a pile of shell
| scripts.
| rthnbgrredf wrote:
| Ansible is better than Bash if your goals include:
|
| * Automating repetitive tasks across many servers.
|
| * Ensuring idempotent configurations (e.g., setting up web
| servers, installing packages consistently).
|
| * Managing infrastructure as code for better version control
| and collaboration.
|
| * Orchestrating complex workflows that involve multiple steps
| or dependencies.
|
| However, Ansible is not a container orchestrator.
|
| Kubernetes (K8s) provides capabilities that Ansible or
| Docker-Compose cannot match. While Docker-Compose only
| supports a basic subset, Kubernetes offers:
|
| * Advanced orchestration features, such as rolling updates,
| health checks, scaling, and self-healing.
|
| * Automatic maintenance of the desired state for running
| workloads.
|
| * Restarting failed containers, rescheduling pods, and
| replacing unhealthy nodes.
|
| * Horizontal pod auto-scaling based on metrics (e.g., CPU,
| memory, or custom metrics).
|
| * Continuous monitoring and reconciliation of the actual
| state with the desired state.
|
| * Immediate application of changes to bring resources to the
| desired configuration.
|
| * Service discovery via DNS and automatic load balancing
| across pods.
|
| * Native support for Persistent Volumes (PVs) and Persistent
| Volume Claims (PVCs) for storage management.
|
| * Abstraction of storage providers, supporting local, cloud,
| and network storage.
|
| If you need these features but are concerned about the
| complexity of Kubernetes, consider using a managed Kubernetes
| service like GKE or EKS to simplify deployment and
| management. Alternatively, and this is my preferred option,
| combining Terraform with a Container-as-a-Service (CaaS)
| platform allows the provider to handle most of the
| operational complexity for you.
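|
| As a rough sketch of a few of the primitives listed above
| (desired replica count, rolling updates, health checks) via the
| official Python client; the image, names and probe path below
| are hypothetical, and most teams would write the same thing as a
| plain manifest or Helm chart instead:
|
|   from kubernetes import client, config
|
|   config.load_kube_config()  # or load_incluster_config()
|
|   web = client.V1Container(
|       name="web",
|       image="registry.example.com/web:1.2.3",  # hypothetical
|       ports=[client.V1ContainerPort(container_port=8080)],
|       liveness_probe=client.V1Probe(  # health check
|           http_get=client.V1HTTPGetAction(path="/healthz",
|                                           port=8080),
|           period_seconds=10))
|
|   deployment = client.V1Deployment(
|       metadata=client.V1ObjectMeta(name="web"),
|       spec=client.V1DeploymentSpec(
|           replicas=3,  # desired state the controller maintains
|           selector=client.V1LabelSelector(
|               match_labels={"app": "web"}),
|           strategy=client.V1DeploymentStrategy(  # rolling updates
|               type="RollingUpdate",
|               rolling_update=client.V1RollingUpdateDeployment(
|                   max_surge=1, max_unavailable=0)),
|           template=client.V1PodTemplateSpec(
|               metadata=client.V1ObjectMeta(
|                   labels={"app": "web"}),
|               spec=client.V1PodSpec(containers=[web]))))
|
|   client.AppsV1Api().create_namespaced_deployment(
|       "default", deployment)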
| vidarh wrote:
| Up until a few thousand instances, a well designed setup should
| be a part time job for a couple of people.
|
| At that scale you can write a custom orchestrator that is
| likely to be smaller and simpler than the equivalent K8S setup.
| Been there, done that.
| klooney wrote:
| > dedicated Kubernetes team with an established on-call
| rotation.
|
| Using EKS or GKE is basically this. K8s is much nicer than ECS
| in terms of development and packaging your own apps.
| majkinetor wrote:
| Highly amateurish take if you call shell spaghetti a Kubernetes,
| especially if we compare the complexity of both...
|
| You know what would be even worse? Introducing Kubernetes for
| your non-Google/Netflix/WhateverPlanetaryScale App instead of
| just writing a few scripts...
| elktown wrote:
| This is so unnuanced that it reads like rationalization to me.
| People seem to get stuck on mantras that simple things are
| inherently fragile which isn't really true, or at least not
| particularly more fragile than navigating a jungle of yaml files
| and k8s cottage industry products that link together in arcane
| ways and tend to be very hard to debug, or just to understand all
| the moving parts involved in the flow of a request and thus what
| can go wrong. I get the feeling that they mostly just don't like
| that it doesn't have _professional aesthetics_.
| nbk_2000 wrote:
| This reminds me of the famous Taco Bell Programming post [1].
| Simple can surprisingly often be good enough.
|
| [1] http://widgetsandshit.com/teddziuba/2010/10/taco-bell-
| progra...
| TacticalCoder wrote:
| > People seem to get stuck on mantras that simple things are
| inherently fragile which isn't really true...
|
| Ofc it isn't true.
|
| Kubernetes was designed at Google at a time when Google was
| already a behemoth. 99.99% of all startups and SMEs out there
| shall _never ever_ have the same scaling issues and automation
| needs that Google has.
|
| Now that said... When you begin running VMs and containers,
| even only a very few of them, you immediately run into issues
| and then you begin to think: _"Kubernetes is the solution"_.
| And it is. But it is also, in many cases, a solution to a
| problem you created. Still... the justification for creating
| that problem, if you're not Google scale, is highly
| disputable.
|
| And, deep down, there's another very fundamental issue IMO:
| many of those "let's have only one process in one container"
| solutions actually mean _" we're totally unable to write
| portable software working on several configs, so let's start
| with a machine with zero libs and dependencies and install
| exactly the minimum deps needed to make our ultra-fragile piece
| of shit of a software kinda work. And because it's still going
| to be a brittle piece of shit, let's make sure we use
| heartbeats and try to shut it down and back up again once it'll
| invariably have memory leaked and/or whatnots"_.
|
| Then you also gained the right to be sloppy in the software you
| write: not respecting it. Treating it as cattle to be
| slaughtered, so it can be shitty. But you've now added an
| insane layer of complexity.
|
| How do you like your uninitialized var when a container launches
| but then silently doesn't work as expected? How do you like
| them logs in that case? Someone here has described the lack of
| instant failure on any uninitialized var as the "billion dollar
| mistake of the devops world".
|
| Meanwhile look at some proper software like, say, the Linux
| kernel or a distro like Debian. Or compile Emacs or a browser
| from source and _marvel_ at what's happening. Sure, there may
| be hiccups but it works. On many configs. On many different
| hardware. On many different architectures. These are robust
| software that don't need to be "pid 1 on a pristine filesystem"
| to work properly.
|
| In a way this whole _"let's have all our software run as pid
| 1, each on a pristine OS and filesystem"_ is an admission of a
| very deep and profound failure of our entire field.
|
| I don't think it's something to be celebrated.
|
| And don't get me started on security: you now have ultra-
| complicated LANs and VLANs, with near-impossible-to-monitor
| traffic, with shitloads of ports open everywhere, the most
| gigantic attack surface of them all, and heartbeats and
| whatnots constantly polluting the network, where nobody
| even knows anymore what's going on. Where the only
| actual security seems to rely on the firewall being up and
| correctly configured, which is incredibly complicated to do
| given the insane network complexity you added to your stack.
| _"Oh wait, I have an idea, let's make configuring the firewall
| a service!"_ (and make sure not to forget to initialize one of
| the countless vars or it'll all silently break and just not
| configure firewalling for anything).
|
| Now, tough love is true love: even at home I'm running a
| hypervisor with VMs and OCI containers ; )
| do_not_redeem wrote:
| > The inscrutable iptables rules?
|
| You mean the list of calls right there in the shell script?
|
| > Who will know about those undocumented sysctl edits you made on
| the VM?
|
| You mean those calls to `sysctl` conveniently right there in the
| shell script?
|
| > your app needs to programmatically spawn other containers
|
| Or you could run a job queue and push tasks to it (gaining all
| the usual benefits of observability, concurrency limits, etc),
| instead of spawning ad-hoc containers and hoping for the best.
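|
| To make that concrete: a sketch assuming a Redis-backed queue
| (RQ here) and a hypothetical process_upload task; workers pull
| jobs off the queue instead of the web app shelling out to
| docker run:
|
|   # tasks.py -- hypothetical worker-side task
|   def process_upload(upload_id: str) -> None:
|       # ...the heavy lifting that would have run in an ad-hoc
|       # container goes here...
|       print(f"processing {upload_id}")
|
|   # web.py -- inside the request handler
|   from redis import Redis
|   from rq import Queue
|
|   from tasks import process_upload
|
|   queue = Queue("uploads", connection=Redis())
|
|   # Enqueue instead of spawning a container; the queue gives you
|   # retries, concurrency limits and visibility into failures.
|   queue.enqueue(process_upload, "upload-123", job_timeout=600)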
| jrs235 wrote:
| "We don't know how to learn/read code we are unfamiliar with...
| Nor do we know how to grok and learn things quickly. Heck, we
| don't know what grok means."
| ewuhic wrote:
| Who do you quote?
| diminish wrote:
| One can build a better container orchestrator than kubernetes;
| things don't need to be that complex.
| greenie_beans wrote:
| i'm at this crossroads right now. somebody talk me out of
| deploying a dagster etl on azure kubernetes service rather than
| deploying all of the pieces onto azure container apps with my own
| bespoke scripts / config
| greenie_beans wrote:
| writing this out helped me re-validate what i need to do
| avandekleut wrote:
| what did you decide to do?
| kasey_junk wrote:
| Both this piece and the piece it's imitating seem to have 2
| central implicit axioms that in my opinion don't hold. The first,
| that the constraints of the home grown systems are all cost and
| the second that the flexibility of the general purpose solution
| is all benefit.
|
| You generally speaking do not want a code generation or service
| orchestration system that will support the entire universe of
| choices. You want your programs and idioms to follow similar
| patterns across your codebase and you want your services
| architected and deployed the same way. You want to know when
| outliers get introduced and similarly you want to make it costly
| enough to require introspection on if the value of the benefit
| out ways the cost of oddity.
| jesseendahl wrote:
| outweighs*
|
| Only offering the correction because I was confused at what you
| meant by "out ways" until I figured it out.
| jerf wrote:
| I like to say, you can make anything look good by considering
| only the benefits and anything look bad by considering only the
| costs.
|
| It's a fun philosophy for online debates, but an expensive one
| to use in real engineering.
| relaxing wrote:
| > You generally speaking do not want a code generation or
| service orchestration system that will support the entire
| universe of choices.
|
| This. I will gladly give up the universe of choices for a one
| size fits most solution that just works. I will bend my use
| cases to fit the mold if it means not having to write k8s
| configuration in a twisty maze of managed services.
| dogleash wrote:
| The compiler one read to me like a reminder to not ignore the
| lessons of compiler design. The premise being that even though
| you have a small-scope project compared to a "real" compiler, you
| will evolve towards analogues of those _design_ ideas. The
| databases and k8s pieces are more like don't even try a small
| scope project because you'll want the same _features_
| eventually.
| tptacek wrote:
| I had a hard time putting my finger on what was so annoying
| about the follow-ons to the compiler post, and this nails it
| for me. Thanks!
| incrudible wrote:
| Dear friend, you have made a slippery slope argument.
| nine_k wrote:
| Yes, because the whole situation is a slippery slope (only
| upwards). In the initial state, k8s is obviously overkill; in
| the end state, k8s is obviously adequate.
|
| The problem is choosing the point of transition, and allocating
| resources for said transition. Sometimes it's easier to
| allocate a small chunk to update your bespoke script right now
| instead of sinking more into a proper migration. It's a typical
| dilemma of taking debt vs paying upfront.
|
| (BTW the same dilemma exists with running in the cloud vs
| running on bare metal; the only time when a migration from the
| cloud is easy is the beginning, when it does not make financial
| sense.)
| incrudible wrote:
| Odds are you have 100 DAUs and your "end state" is an "our
| incredible journey" blog post. I understand that people want
| to pad their resume with buzzwords on the way, but I don't
| accept making a virtue out of it.
| hamilyon2 wrote:
| For the uninitiated: how does k8s handle OS upgrades? If
| development moves to the next version of Debian, because it
| should eventually, are upgrades, for example, 2x harder vs
| docker-compose? 2x easier? About the same? Is it even the right
| question to ask?
| JanMa wrote:
| It doesn't. The usual approach is to create new nodes with the
| updated OS, migrate all workloads over and then throw away the
| old ones
| Thiez wrote:
| Your cluster consists of multiple machines ('nodes'). Upgrading
| is as simple as adding a new, upgraded node, then evicting
| everything from one of the existing nodes, then taking it down.
| Repeat until every node is replaced.
|
| Downtime is the same as with a deployment, so if you run at
| least 2 copies of everything there should be no downtime.
|
| As for updating the images of your containers, you build them
| again with the newer base image, then deploy.
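|
| A rough sketch of the cordon-and-drain half of that cycle with
| the Python client (node name is hypothetical; in practice most
| people just run kubectl cordon / kubectl drain, which also
| respects PodDisruptionBudgets and skips DaemonSet pods):
|
|   from kubernetes import client, config
|
|   config.load_kube_config()
|   v1 = client.CoreV1Api()
|
|   old_node = "node-debian11-a"  # hypothetical node being retired
|
|   # Cordon so nothing new gets scheduled onto the old node.
|   v1.patch_node(old_node, {"spec": {"unschedulable": True}})
|
|   # Naive drain: delete each pod on the node so its controller
|   # (Deployment/ReplicaSet) reschedules it onto the new nodes.
|   pods = v1.list_pod_for_all_namespaces(
|       field_selector=f"spec.nodeName={old_node}")
|   for pod in pods.items:
|       v1.delete_namespaced_pod(pod.metadata.name,
|                                pod.metadata.namespace)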
| JanMa wrote:
| Dear friend, you should first look into using Nomad or Kamal
| deploy instead of K8S
| mdaniel wrote:
| You mean the rugpull-stack? "Pray we do not alter the deal
| further when the investors really grumble"
| https://github.com/hashicorp/nomad/blob/v1.9.3/LICENSE
|
| As for Kamal, I shudder to think of the hubris required to say
| "pfft, haproxy is for lamez, how hard can it be to make my own
| lb?!" https://github.com/basecamp/kamal-proxy
| signal11 wrote:
| Dear Friend,
|
| This fascination with this new garbage-collected language from a
| Santa Clara vendor is perplexing. You've built yourself a COBOL
| system by another name.
|
| /s
|
| I love the "untested" criticism in a lot of these use-k8s
| screeds, and also the suggestion that they're hanging together
| because of one guy. The implicit criticism is that doing your own
| engineering is bad, really, you should follow the crowd.
|
| Here's a counterpoint.
|
| Sometimes just writing YAML is enough. Sometimes it's not. E.g.
| there are times when managed k8s is just not on the table, e.g.
| because of compliance or business issues. Then you have to think
| about self-managed k8s. That's rather hard to do well. And often,
| you don't need all of that complexity.
|
| Yet -- sometimes availability and accountability reasons mean
| that you need to have a really deep understanding of your stack.
|
| And in those cases, having the engineering capability to
| orchestrate isolated workloads, move them around, resize them,
| monitor them, etc is imperative -- and engineering capability
| means understanding the code, fixing bugs, improving the system.
| Not just writing YAML.
|
| It's shockingly inexpensive to get this started with a two-pizza
| team that understands Linux well. You do need a couple really
| good, experienced engineers to start this off though. Onboarding
| newcomers is relatively easy -- there's plenty of mid-career
| candidates and you'll find talent at many LUGs.
|
| But yes, a lot of orgs won't want to commit to this because they
| don't want that engineering capability. But a few do - and having
| that capability really pays off in the ownership the team can
| take for the platform.
|
| For the orgs that do invest in the engineering capability, the
| benefit isn't just a well-running platform, it's having access to
| a team of engineers who feel they can deal with anything the
| business throws at them. And really, creating that high-
| performing trusted team is the end-goal, it really pays off for
| all sorts of things. Especially when you start cross-pollinating
| your other teams.
|
| This is definitely not for everyone though!
| Spivak wrote:
| Infra person here, this is such the wrong take.
|
| > Do I really need a separate solution for deployment, rolling
| updates, rollbacks, and scaling.
|
| Yes it's called an ASG.
|
| > Inevitably, you find a reason to expand to a second server.
|
| ALB, target group, ASG, done.
|
| > Who will know about those undocumented sysctl edits you made on
| the VM
|
| You put all your modifications and CIS benchmark tweaks in a repo
| and build a new AMI off it every night. Patching is switching the
| AMI and triggering a rolling update.
|
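| Roughly, assuming boto3, a launch-template-backed ASG that
| tracks the $Latest template version, and placeholder IDs:
|
|   import boto3
|
|   ec2 = boto3.client("ec2")
|   asg = boto3.client("autoscaling")
|
|   # Point the launch template at last night's AMI build...
|   ec2.create_launch_template_version(
|       LaunchTemplateId="lt-0123456789abcdef0",       # placeholder
|       SourceVersion="$Latest",
|       LaunchTemplateData={"ImageId": "ami-0123456789abcdef0"})
|
|   # ...and let the ASG roll the fleet onto it.
|   asg.start_instance_refresh(
|       AutoScalingGroupName="web-asg",                # placeholder
|       Preferences={"MinHealthyPercentage": 90,
|                    "InstanceWarmup": 120})
|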
| > The inscrutable iptables rules
|
| These are security groups, lord have mercy on anyone who thinks
| k8s network policy is simple.
|
| > One of your team members suggests connecting the servers with
| Tailscale: an overlay network with service discovery
|
| Nobody does this, you're in AWS. If you use separate VPCs you can
| peer them but generally it's just editing some security groups
| and target groups. k8s is forced into needing to overlay on an
| already virtual network because they need to address pods rather
| than VMs; when VMs are your unit you're just doing basic
| networking.
|
| You reach for k8s when you need control loops beyond what ASGs
| can provide. The magic of k8s is "continuous terraform"; you will
| know when you need it, and you likely never will. If your infra
| moves from one static config to another static config on deploy
| (by far the usual case), then going without k8s is fine.
| SahAssar wrote:
| I'm sure the American Sewing Guild is fantastic, but how do
| they help here?
| atsaloli wrote:
| ASG = Auto-Scaling Group
|
| https://docs.aws.amazon.com/autoscaling/ec2/userguide/auto-s.
| ..
| jpgvm wrote:
| k8s is the API. Forget the implementation, it's really not that
| important.
|
| Folks that get tied up in the "complexity" argument are forever
| missing the point.
| mbrumlow wrote:
| The thing that the k8s api does is force you to do good
| practices, that is it.
| lttlrck wrote:
| I thought k8s might be a solution so I decided to learn through
| doing. It quickly became obvious that we didn't need 90% of its
| capabilities but, more importantly, it'd put undue load/training
| on the rest of the team. It would be a lot more sensible to write
| custom orchestration using the docker API - that was
| straightforward.
|
| Experimenting with k8s was very much worthwhile. It's an amazing
| thing and was in many ways inspirational. But using it would have
| been swimming against the tide so to speak. So sure I built a
| mini-k8s-lite, it's better for us, it fits better than wrapping
| docker compose.
|
| My only doubt is whether I should have used podman instead but at
| the time podman seemed to be in an odd place (3-4 years ago now).
| Though it'd be quite easy to switch now it hardly seems
| worthwhile.
| lousken wrote:
| why add complexity when many services don't even need
| horizontal scaling? Servers are powerful enough that, if you're
| not stupid enough to write horrible code, it's fine for millions
| of requests a day without much work
| stickfigure wrote:
| Dear Amazon Elastic Beanstalk, Google App Engine, Heroku, Digital
| Ocean App Platform, and friends,
|
| Thank you for building "a kubernetes" for me so I don't have to
| muck with that nonsense, or have to hire people that do.
|
| I don't know what that other guy is talking about.
| cedws wrote:
| Now compare cloud bills.
| nanomcubed wrote:
| Like, okay, if that's how you see it, but what's with the tone
| and content?
|
| The tone's vapidity is only comparable to the content's.
|
| This reads like mocking the target audience rather than showing
| them how you can help.
|
| A write up that took said "pile of shell scripts that do not
| work" and showed how to "make it work" with your technology of
| choice would have been more interesting than whatever this is.
| hamdouni wrote:
| I was using some ansible playbook scripts to deploy some web app
| to production. One day the scripts stopped working because of a
| boring error about a Python version mismatch.
|
| I rewrote all the deployment scripts in bash (took less than an
| hour) and never had a problem since.
|
| Moral: it's hard to find the right tool for the job
| mbrumlow wrote:
| Most of the complaints in this fun post are just bad practice,
| and really nothing to do with "making a Kubernetes".
|
| Sans bad engineering practices, if you built a system that did
| the same things as kubernetes I would have no problem with it.
|
| In reality I don't want everybody to use k8s. I want people
| finding different solutions to solve similar problems.
| Homogenized ecosystems create walls that block progress.
|
| One of the big things that is overlooked when people move to k8s,
| and why things get better when moving to k8s, is that k8s made a
| set of rules that forced service owners to fix all of their bad
| practices.
|
| Most deployment systems would work fine if the same work to
| remove bad practices from their stack occurred.
|
| K8s is the hot thing today, but mark my words, it will be
| replaced with something far more simple and much nicer to
| integrate with. And this will come from some engineer "creating a
| kubernetes"
|
| Don't even get me started on how crappy the culture of "you are
| doing something hard that I think is already a solved problem"
| is. This goes for compilers and databases too. None of these are
| hard, and neither is k8s, and any good engineer tasked with
| making one should be able to do so.
| Kinrany wrote:
| So you're saying companies should move to k8s and then
| immediately move to bash scripts
| mbrumlow wrote:
| No. I am saying that companies should have their engineers
| understand why k8s works and make those reasons an
| engineering practice.
|
| As it is today, the pattern is: spend a ton of money moving to
| k8s (mostly costly managed solutions), fixing all the bad
| engineering patterns in the process because k8s forces it, and
| then have an engineer save the company money by moving back to
| a more home-grown solution, a solution that fits the company's
| needs and saves money, something that would only be possible
| once the engineering practices were fixed.
| danjl wrote:
| I love that the only alternative is a "pile of shell scripts".
| Nobody has posted a legitimate alternative to the complexity of
| K8S or the simplicity of docker compose. Certainly feels like
| there's a gap in the market for an opinionated deployment
| solution that works locally and on the cloud, with less
| functionality than K8S and a bit more complexity than docker
| compose.
| drewbailey wrote:
| K8s just drowns out all other options. Hashicorp Nomad is
| great, https://www.nomadproject.io/
| marvinblum wrote:
| Thumbs up for Nomad. We've been running it for about 3 years
| in prod now and it hasn't failed us a single time.
| jedberg wrote:
| I hate to shill my own company, but I took the job because I
| believe in it.
|
| You should check out DBOS and see if it meets your middle
| ground requirements.
|
| Works locally and in the cloud, has all the things you'd need
| to build a reliable and stateful application.
|
| [0] https://dbos.dev
| justinclift wrote:
| Looks interesting, but this is a bit worrying:
| ... build reliable AI agents with automatic retries and no
| limit on how long they can run for.
|
| It's pretty easy to see how that could go badly wrong. ;)
|
| (and yeah, obviously "don't deploy that stuff" is the
| solution)
|
| ---
|
| That being said, is it all OSS? I can see some stuff here
| that seems to be, but it mostly seems to be the client side
| stuff?
|
| https://github.com/dbos-inc
| jedberg wrote:
| Maybe that is worded poorly. :). It's supposed to mean
| there are no timeouts -- you can wait as long as you want
| between retries.
|
| > That being said, is it all OSS?
|
| The Transact library is open source and always will be.
| That is what gets you the durability, statefulness,
| some observability, and local testing.
|
| We also offer a hosted cloud product that adds in the
| reliability, scalability, more observability, and a time
| travel debugger.
| danjl wrote:
| Nice, but I like my servers and find serverless difficult to
| debug.
| jedberg wrote:
| That's the beauty of this system. You build it all locally,
| test it locally, debug it locally. Only then do you deploy
| to the cloud. And since you can build the whole thing with
| one file, it's really easy to reason about.
|
| And if somehow you get a bug in production, you have the
| time travel debugger to replay exactly what the state of
| the cloud was at the time.
| danjl wrote:
| Great to hear you've improved serverless debugging. What
| if my endpoint wants to run ffmpeg and extract frames
| from video? How does that work on serverless?
| iamsanteri wrote:
| Docker Swarm mode? I know it's not as well maintained, but I
| think it's exactly what you talk about here (forget K3s, etc).
| I believe smaller companies run it still and it's perfect for
| personal projects. I myself run mostly docker compose + shell
| scripts though because I don't really need zero-downtime
| deployments or redundancy/fault tolerance.
| kikimora wrote:
| While not opinionated, you can go with cloud-specific tools
| (e.g. ECS in AWS).
| danjl wrote:
| Sure, but those don't support local deployment, at least not
| in any sort of easy way.
| sc68cal wrote:
| Ansible and the podman Ansible modules
| dijit wrote:
| I coined a term for this because I see it so often.
|
| "People will always defend complexity, stating that the only
| alternative is shell scripts".
|
| I saw people defending docker this way, ansible this way and
| most recently systemd this way.
|
| Now we're on to kubernetes.
| d--b wrote:
| At least I never saw anyone arguing that the only alternative
| to git was shell scripts.
|
| Wait. Wouldn't that be a good idea?
| nicodjimenez wrote:
| Agreed, something simpler than Nomad as well hopefully.
| czhu12 wrote:
| This is basically exactly what we needed at the start up I
| worked at, with the added need of being able to host open
| source projects (airbyte, metabase) with a reasonable level of
| confidence.
|
| We ended up migrating from Heroku to Kubernetes. I tried to
| take some of the learnings to build
| https://github.com/czhu12/canine
|
| It basically wraps Kubernetes and tries to hide as much
| complexity from Kubernetes as possible, and only expose the
| good parts that will be enough for 95% of web application work
| loads.
| highspeedbus wrote:
| >Tired, you parameterize your deploy script and configure
| firewall rules, distracted from the crucial features you should
| be working on and shipping.
|
| Where's your Sysop?
| physicsguy wrote:
| Kubernetes biggest competitor isn't a pile of bash scripts and
| docker running on a server, it's something like ECS which comes
| with a lot of the benefits but a hell of a lot less complexity
| FridgeSeal wrote:
| FWIW I've been using ECS at my current work (previously K8s)
| and to me it feels just flat worse:
|
| - only some of the features
|
| - none of the community
|
| - all of the complexity but none of the upsides.
|
| It was genuinely a bit shocking that it was considered a
| serious product seeing as how chaotic it was.
| avandekleut wrote:
| Can you elaborate on some of the issues you faced? I was
| considering deploying to ECS fargate as we are all-in on AWS.
| FridgeSeal wrote:
| Any kind of git-ops style deployment was out.
|
| ECS merges "AWS config" and "app/deployment config
| together" so it was difficult to separate "what should go
| in TF, and what is a runtime app configuration. In
| comparison this is basically trivial ootb with K8s.
|
| I personally found a lot of the moving parts and names
| needlessly confusing. Tasks, for example, were not the
| equivalent of a "Deployment".
|
| Want to just deploy something like Prometheus Agent? Well,
| too bad, the networking doesn't work the same, so here's
| some overly complicated guide where you have to deploy some
| extra stuff which will no doubt not work right the first
| dozen times you try. Admittedly, Prom can be a right pain
| to manage, but the fact that ECS makes you do _extra_ work
| on top of an already fiddly piece of software left a bad
| taste in my mouth.
|
| I think ECS gets a lot of airtime because of Fargate, but
| you can use Fargate on K8s these days, or, if you can
| afford the small increase in initial setup complexity, you
| can just have Fargate's less-expensive, less-restrictive,
| better sibling: Karpenter on Spot instances.
| andycowley wrote:
| If your workloads are fairly static, ECS is fine. Bringing
| up new containers and nodes takes ages with very little
| feedback as to what's going on. It's very frustrating when
| iterating on workloads.
|
| Also fargate is very expensive and inflexible. If you fit
| the narrow particular use case it's quicker for bringing up
| workloads, but you pay extra for it.
| jbmsf wrote:
| Can confirm. I've used ECS with Fargate successfully at
| multiple companies. Some eventually outgrew it. Some failed
| first. Some continue to use ECS happily.
|
| Regardless of the outcome, it always felt more important to
| keep things simple and focus on product and business needs.
| marcusestes wrote:
| You did a no-SQL, you did a serverless, you did a micro-services.
| This makes it abundantly clear you do not understand the nature
| of your architectural patterns and the multiplicity of your
| offenses.
| alganet wrote:
| Why do I feel this is not as simple as the compiler scenario?
|
| I've seen a lot of "piles of YAML", even contributed to some.
| There were some good projects that didn't end up in disaster, but
| to me the same could be said for the shell.
| czhu12 wrote:
| I think one thing that is under appreciated with kubernetes is
| how massive the package library is. It becomes trivial to stand
| up basically every open source project with a single command via
| helm. It gets a lot of hate but for medium sized deployments,
| it's fantastic.
|
| Before helm, just trying to run third party containers on bare
| metal resulted in constant downtime when the process would just
| hang for no reason, and an engineer would have to SSH and
| manually restart the instance.
|
| We used this at a previous startup to host metabase, sentry and
| airbyte seamlessly, on our own cluster. Which let us break out of
| the constant price increases we faced for hosted versions of
| these products.
|
| Shameless plug: I've been building
| https://github.com/czhu12/canine to try to make Kubernetes easier
| to use for solo developers. Would love any feedback from anyone
| looking to deploy something new to K8s!
| tptacek wrote:
| Right, but this isn't a post about why K8s is _good_, it's a
| post about why K8s is _effectively mandatory_, and it isn't,
| which is why the post rankles some people.
| czhu12 wrote:
| Yeah I mostly agree. I'd even add that even k8s YAMLs are not
| trivial to maintain, especially if you need to have them be
| produced by a templating engine.
| mildred593 wrote:
| Started with a large shell script; the next iteration was written
| in Go and is less specific. I still think that for some things,
| k8s is just too much
|
| https://github.com/mildred/conductor.go/
___________________________________________________________________
(page generated 2024-11-24 23:00 UTC)