[HN Gopher] The hater's guide to Kubernetes
___________________________________________________________________
The hater's guide to Kubernetes
Author : paulgb
Score : 217 points
Date : 2024-03-03 16:44 UTC (6 hours ago)
(HTM) web link (paulbutler.org)
(TXT) w3m dump (paulbutler.org)
| t3rabytes wrote:
| My current company is split... maybe 75/25 (at this point)
| between Kubernetes and a bespoke, Ansible-driven deployment
| system that manually runs Docker containers on nodes in an AWS
| ASG and will take care of deregistering/reregistering the nodes
| with the ALB while the containers on a given node are getting
| futzed with. The Ansible method works remarkably well for its
| age, but the big thing I use to convince teams to move to
| Kubernetes is that we can take your peak deploy times from, say,
| a couple hours down to a few minutes, and you can autoscale far
| faster and more efficiently than you can with CPU-based scaling
| on an ASG.
|
| From service teams that have done the migrations, the things I
| hear consistently though are:
|
| - when a Helm deploy fails, finding the reason why is a PITA (we
| run with --atomic so it'll roll back on a failed deploy. What
| failed? Was it bad code causing a pod to crash loop? Failed k8s
| resource create? Who knows! Have fun finding out! The usual
| incantations we reach for are at the end of this comment.)
|
| - they have to learn a whole new way of operating, particularly
| around in-the-moment scaling. A team today can go into the AWS
| Console at 4am during an incident and change the ASG scaling
| targets, but to do that with a service running in Kubernetes
| means making sure they have kubectl (and its deps, for us that's
| aws-cli) installed and configured, AND remembering the `kubectl
| scale deployment X --replicas X` syntax.
|
| [Both of those things are very much fixable]
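|
| (For the first point, the usual incantations -- assuming a
| release and namespace both called `myapp`, names made up -- are
| roughly:
|
|     kubectl -n myapp get events --sort-by=.lastTimestamp
|     kubectl -n myapp describe pod <crashing-pod>
|     kubectl -n myapp logs <crashing-pod> --previous
|     helm -n myapp history myapp
|
| run quickly, before --atomic rolls everything back out from
| under you.)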
| dpflan wrote:
| HPAs and VPAs are useful k8s concepts for your auto-scaling
| needs.
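|
| For anyone who hasn't seen one, an HPA is a pretty small object.
| A minimal sketch (deployment name and numbers made up):
|
|     # scale the `web` Deployment between 2 and 20 pods at ~70% CPU
|     apiVersion: autoscaling/v2
|     kind: HorizontalPodAutoscaler
|     metadata:
|       name: web
|     spec:
|       scaleTargetRef:
|         apiVersion: apps/v1
|         kind: Deployment
|         name: web
|       minReplicas: 2
|       maxReplicas: 20
|       metrics:
|       - type: Resource
|         resource:
|           name: cpu
|           target:
|             type: Utilization
|             averageUtilization: 70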
| t3rabytes wrote:
| HPA is useful until your maxReplicas count is set too low and
| you're already tapped out.
| cogman10 wrote:
| Sort of a learning thing though, right? Like, if you find
| maxReplicas is too low you move that number up until it
| isn't, right?
|
| This is different from waking people up at 4am frequently
| to bump up the number of replicas.
| dpflan wrote:
| You can edit your HPA live, in maybe as many commands or
| keystrokes as manually scaling...until you commit the
| change to your repo of configs.
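|
| E.g., assuming an HPA named `web`, the 4am version is roughly
|
|     # bump the ceiling in place; reconcile with the repo later
|     kubectl patch hpa web --type merge \
|       -p '{"spec":{"maxReplicas":30}}'
|
| (or just `kubectl edit hpa web`).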
| makestuff wrote:
| I haven't used kubernetes in a few years, but do they have a
| good UI for operations? Something like your example of the AWS
| console, where you can just log in and scale something in the
| UI, but for kubernetes. We run something similar on AWS right
| now; during
| an incident we log into the account with admin access to modify
| something and then go back to configure that in the CDK post
| incident.
| t3rabytes wrote:
| AWS has a UI for resources in the cluster but it relies on
| the IAM role you're using in the console to have configured
| perms in the cluster, and our AWS SSO setup prevents that
| from working properly (this isn't usually the case for AWS
| SSO users, it's a known quirk of our particular auth setup
| between EKS and IAM -- we'll fix it sometime).
| adhamsalama wrote:
| https://k8slens.dev
| cogman10 wrote:
| For scaling, have you tried using either an HPA or keda?
|
| We've had pretty good success with simple HPAs.
| t3rabytes wrote:
| Yep, I'd say >half of the teams with K8s services have
| adopted KEDA, but we've got some HPA stragglers for sure.
| dpflan wrote:
| I have to say that when you have more buy-in from delivery
| teams and adoption of HPAs, your system can become more
| harmonious overall. Each team can monitor and tweak their
| services, and many services are usually connected upstream
| or downstream. When more components can ebb and flow
| according to the compute context then the system overall
| ebbs and flows better. #my2cents
| freedomben wrote:
| Personally, I don't like Helm. I think for the vast majority of
| use cases where all you need is some simple
| templating/substitution, it just introduces way more complexity
| and abstraction than it is worth.
|
| I've been really happy with just using `envsubst` and
| environment variables to generate a manifest at deploy time.
| It's easy with most CI systems to "archive" the manifest, and
| it can then be easily read by a human or downloaded/applied
| manually for debugging. Deploys are also just `cat
| k8s/${ENV}/deploy.yaml | envsubst > output.yaml && kubectl apply
| -f output.yaml`
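|
| For the curious, the manifest side is nothing fancy either -- a
| sketch of the kind of thing I mean, with made-up names:
|
|     # k8s/prod/deploy.yaml -- ${VARS} are filled in by envsubst
|     apiVersion: apps/v1
|     kind: Deployment
|     metadata:
|       name: ${APP_NAME}
|     spec:
|       replicas: ${REPLICAS}
|       selector:
|         matchLabels:
|           app: ${APP_NAME}
|       template:
|         metadata:
|           labels:
|             app: ${APP_NAME}
|         spec:
|           containers:
|           - name: ${APP_NAME}
|             image: registry.example.com/${APP_NAME}:${IMAGE_TAG}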
|
| I've also experimented with using terraform. It's actually been
| a good enough experience that I may go fully with terraform on
| a new project and see how it goes.
| linuxftw wrote:
| You might like kubernetes kustomize if you don't care for
| helm (IMO, just embrace helm, you can keep your charts very
| simple and it's straightforward). Kustomize takes a little
| getting used to, but it's a nice abstraction and widely used.
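|
| The entry point is just a kustomization.yaml next to your plain
| manifests -- roughly like this (paths/names made up):
|
|     # overlays/prod/kustomization.yaml
|     apiVersion: kustomize.config.k8s.io/v1beta1
|     kind: Kustomization
|     resources:
|     - ../../base
|     patches:
|     - path: replica-count.yaml
|     images:
|     - name: myapp
|       newTag: v1.2.3
|
| rendered with `kubectl kustomize overlays/prod` or applied with
| `kubectl apply -k overlays/prod`.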
|
| I cannot recommend terraform. I use it daily, and daily I
| wish I did not. I think Pulumi is the future. Not as battle
| tested, but terraform is a mountain of bugs anyway, so it
| can't possibly be worse.
|
| Just one example where terraform sucks: You cannot both
| deploy a kubernetes cluster (say an EKS/AKS cluster) and then
| use kubernetes_manifest provider in a single workspace. You
| must do this across two separate terraform runs.
| jonathaneunice wrote:
| The problem with bespoke, homegrown, and DIY isn't that the
| solutions are bad. Often, they are quite good--excellent, even,
| within their particular contexts and constraints. And because
| they're tailored and limited to your context, they can even be
| quite a bit simpler.
|
| The problem is that they're custom and homegrown. Your
| organization alone invests in them, trains new staff in them,
| is responsible for debugging and fixing when they break, has to
| re-invest when they no longer do all the things you want. DIY
| frameworks ultimately end up as byzantine and labyrinthine as
| Kubernetes itself. The virtue of industry platforms like
| Kubernetes is, however complex and only half-baked they start,
| over time the entire industry trains on them, invests in them,
| refines and improves them. They benefit from a long-term
| economic virtuous cycle that DIY rarely if ever can. Even the
| longest, strongest, best-funded holdouts for bespoke languages,
| OSs, and frameworks--aerospace, finance, miltech--have largely
| come 'round to COTS first and foremost.
| api wrote:
| I understand where most of the complexity in K8S comes from, but
| it still horrifies and offends me and I hate it. But I don't
| think it's Kubernetes' fault _directly_. I think the problem is
| deeper in the foundation. It comes from the fact that we are
| trying to build modern, distributed, high availability,
| incrementally upgradeable, self-regulating systems on a
| foundation of brittle clunky 1970s operating systems that are not
| designed for any of that.
|
| The whole thing is a bolt-on that has to spend a ton of time
| working around the limitations of the foundation, and it shows.
|
| Unfortunately there seems to be zero interest in fixing _that_
| and so much sunk cost in existing Unix/Posix designs that it
| seems like we are completely stuck with a basic foundation of
| outdated brittleness.
|
| What I think we need:
|
| * An OS that runs hardware-independent code (WASM?) natively and
| permits things like hot updates, state saving and restoration,
| etc. Abstract away the hardware.
|
| * Native built-in support for clustering, hot backups, live
| process migration between nodes, and generally treating hardware
| as a pure commodity in a RAIN (redundant array of inexpensive
| nodes) configuration.
|
| * A modern I/O API. Posix I/O APIs are awful. They could be
| supported for backward compatibility via a compatibility library.
|
| * Native built-in support for distributed clustered storage with
| high availability. Basically a low or zero config equivalent of
| Ceph or similar built into the OS as a first class citizen.
|
| * Immutable OS that installs almost instantly on hardware, can be
| provisioned entirely with code, and where apps/services can be
| added and removed with no "OS rot." The concept of installing
| software "on" the OS needs to be killed with fire.
|
| * Shared distributed network stack where multiple machines can
| have the same virtual network interfaces, IPs, and open TCP
| connections can migrate. Built-in load balancing.
|
| I'm sure people around here can think of more ideas that belong
| in this list. These are not fringe things that are impossible to
| build.
|
| Basically you should have an immutable image OS that turns many
| boxes into one box and you don't have to think about it. Storage
| is automatically clustered. Processes automatically restart or,
| if a hardware fault is detected in time, automatically _migrate_.
|
| There were efforts to build such things (Mosix, Plan 9, etc.) but
| they were bulldozed by the viral spread of free Unix-like OSes
| that were "good enough."
|
| Edit:
|
| That being said, I'm not saying Kubernetes is good software
| either. The core engine is actually decent and as the OP said has
| a lot of complexity that's needed to support what it does. The
| ugly nasty disgusting parts are the config interface, clunky shit
| like YAML, and how generally arcane and unapproachable and _ugly_
| the thing is to actually use.
|
| I just _loathe_ software like this. I feel the same way about
| Postgres and Systemd. "Algorithmically" they are fine, but the
| interface and the way you use them is arcane and makes me feel
| like I'm using a 70s mainframe on a green VT220 monitor.
|
| Either these things are designed by the sorts of "hackers" who
| _like_ complexity and arcane-ness, or they're hacks that went
| viral and matured into global infrastructure without planning. I
| think it's a mix of both... though in the case of Postgres it's
| also that the project is legitimately old. It feels like old-
| school Unix clunkware because it is.
| tayo42 wrote:
| What would you fix if you could?
| happymellon wrote:
| I'm not entirely convinced that there isn't a better way. With
| AWS Lambda and alternatives able to run containers on demand,
| and OpenFaas, they all point to "a better way".
|
| [Edit] Parent comment is almost entirely different after that
| edit from what I responded to. But I think my point
| still stands. One day, hopefully in my lifetime, we shall see
| it.
| api wrote:
| Yeah, I do think lambda-style coding, where you move away from
| the idea of _processes_ toward functions and data, is another
| possibly superior way.
|
| The problem is that right now this gets you lock-in to a
| proprietary cloud. There are some loose standards but the
| devil's in the details and once you are deployed somewhere
| it's damn hard to impossible to move without serious downtime
| and fixing.
| Too wrote:
| How about Erlang?
|
| I can't say I know it myself. It always looks good on
| paper. Strangely nobody uses it. There must be a catch that
| detracts from it?
| api wrote:
| People want to program in their preferred language, not
| be forced to use one language to have these benefits.
| Izkata wrote:
| I don't know it either, but a vague understanding I got
| in the past was the language itself wasn't very user-
| friendly. I think Elixir was supposed to solve that.
| happymellon wrote:
| Completely agree, but that's where OpenFaas (or another
| open standard) comes in.
|
| Hopefully we should get OpenFaas and Lambda, in the same
| way we have ECS and EKS. Standardised ways to complete
| tasks, rather than managing imaginary servers.
|
| We are still early in the cycle.
| throwaway892238 wrote:
| Agreed. If Linux were a distributed OS, people would just be
| running a distro with systemd instead of K8s. (Of course,
| systemd is just another kubernetes, but without the emphasis on
| running distributed systems)
| p_l wrote:
| CoreOS tried to distribute systemd and it seemed it wasn't
| working all that well compared to just optimizing for k8s
| geodel wrote:
| Maybe they folded before their ideas could take root and be
| backed by a decent implementation.
| throwaway892238 wrote:
| That whole concept is bizarre. It's like wanting to fly, so
| rather than buy a plane, you take a Caprice Classic and try
| to make it fly.
|
| If CoreOS actually wanted to make distributed computing
| easier, they'd make patches for the Linux kernel (or make
| an entirely different kernel). See the many distributed OS
| kernels that were made over 20 years ago. But that's a lot
| of work. So instead they tried to go the cheap and easy
| route. But the cheap and easy route ends up being much
| shittier.
|
| There's no commercial advantage to building a distributed
| OS, which is why no distributed OS is successful today. You
| would need a crazy person to work for 10 years on a pet
| project until it's feature-complete, and then all of a
| sudden everyone would want to use it. But until it's
| complete, nobody would use it, and nobody would spend time
| developing it. Even once it's created, if it's not popular,
| still nobody will use it (you can use Plan9 today, but
| nobody does).
|
| https://en.wikipedia.org/wiki/Distributed_operating_system
| kiitos wrote:
| > These are not fringe things that are impossible to build.
|
| Maybe not, but I'm confident that the system you're describing
| is impossible to build in a way that is both general and
| efficient.
| Spivak wrote:
| This matches our experience as well. As long as you treat your
| managed k8s cluster as autoscaling-group as-a-service you'll do
| fine.
|
| k8s's worst property is that it's a cleverness trap. You can do
| anything in k8s whether it's sane to do so or not. The biggest
| guardrail against falling into it is managing your k8s with
| something terraform-ish, so that you don't find yourself in a
| spot where "effort to do it right" >> "effort to hack it in
| YAML" and find your k8s cluster becoming spaghetti.
| x86x87 wrote:
| Why not just use an autoscaling group?
|
| Re: cleverness trap. I feel like this is the tragedy of
| software development. We like to be seen as clever. We are
| doing "hard" things. I have way more respect for engineers that
| do "simple" things that just work using boring tech and factor
| in the whole lifecycle of the product.
| p_l wrote:
| > Why not just use an autoscaling group?
|
| Not everyone has money to burn, even back in the ZIRP era.
|
| And before you trot out wages for an experienced operations team
| - I've regularly dealt with it being cheaper to pay for one or
| two very experienced people than to deal with the AWS bill.
|
| For the very simple reason that cloud providers' prices are
| scaled to the US market and not everyone has US money levels.
| Spivak wrote:
| Sorry, I could have explained that better. The biggest value
| add that k8s has is that it gives you as many or as few
| autoscaling groups as you need at a given time using only a
| single pool (or at least fewer pools) of heterogeneous
| servers. There's lots of fine print here but it really does
| let you run the same workloads on less hardware and to me
| that's the first and last reason you should be using it.
|
| I wouldn't start with k8s and instead opt for ASGs until you
| reach the point where you look at your AWS account and see a
| bunch of EC2 instances sitting underutilized.
| treesciencebot wrote:
| > Above I alluded to the fact that we briefly ran ephemeral,
| interactive, session-lived processes on Kubernetes. We quickly
| realized that Kubernetes is designed for robustness and
| modularity over container start times.
|
| Is there a clear example of this? E.g. is kubernetes inherently
| unable to start a pod (assuming the same sequence of events, e.g.
| warm/cold image with streaming enabled) under 500ms, 1s etc?
|
| I am asking this as someone who spent quite a bit of time and
| wasn't able to bring it below the 2s mark, which eventually led
| us to rewrite the latency-sensitive parts to use Nomad. But we
| are currently in a state where we are re-considering kubernetes
| for its auxiliary tooling benefits and would love to learn more if
| anyone had experiences with starting and stopping thousands of
| pods with the lowest possible latencies without caring for
| utilization or placement but just observable boot latencies.
| p_l wrote:
| You'd have to ensure that:
|
| a) all images are preloaded, of course
|
| b) there are enough nodes with enough capacity
|
| c) the pods don't use anything with possibly longer latency
| (high-latency CSI etc.)
|
| d) you might want to write a custom scheduler for your workloads
| (it could take into account what images are preloaded where,
| etc)
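|
| For (a), one common trick is a tiny DaemonSet that pulls the
| image on every node and then just sleeps -- a sketch, image name
| made up:
|
|     apiVersion: apps/v1
|     kind: DaemonSet
|     metadata:
|       name: prepull-worker
|     spec:
|       selector:
|         matchLabels:
|           app: prepull-worker
|       template:
|         metadata:
|           labels:
|             app: prepull-worker
|         spec:
|           containers:
|           - name: prepull
|             # pulling this keeps the image cached on the node
|             image: registry.example.com/worker:latest
|             command: ["sleep", "infinity"]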
| paulgb wrote:
| I do believe that with the right knowledge of Kubernetes
| internals it's _probably_ possible to get k8s cold start times
| competitive with where we landed without Kubernetes (generally
| subsecond, often under 0.5s depending on how much the container
| does before passing a health check), but we'd have to
| understand k8s internals really well and would have ended up
| throwing out much of what already existed. And we'd probably
| end up breaking most of the reasons for using Kubernetes in the
| first place in the process.
| p_l wrote:
| Not much internals needed, but an actual in-depth understanding
| of the Pod kube-api, plus at least the basics of how the
| scheduler, kubelet, and kubelet drivers interact.
|
| Big possible win is custom scheduling, but barely anyone
| seems to know it exists
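|
| (The hook for it is just a field on the pod spec -- the default
| scheduler leaves any pod that names a different scheduler alone,
| so the two can coexist:
|
|     # pod spec fragment; the scheduler name is made up
|     spec:
|       schedulerName: my-fast-scheduler
|
| and something you wrote watches for pods with that name and does
| the binding.)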
| paulgb wrote:
| Yeah, looking into writing a scheduler was basically where
| we stepped back and said "if we write this ourselves, why
| not the rest, too". As I see it, the biggest gains that we
| were able to get were by making things happen in parallel
| that would by default happen in sequence, and optimizing
| for the happy path instead of optimizing for reducing
| failure. In Kubernetes it's reasonable to have to wait for
| a dozen things to serially go through RAFT consensus in
| etcd before the pod runs, but we don't want that.
|
| (I made up the dozen number, but my point is that that
| design would be perfectly acceptable given Kubernetes'
| design constraints)
| cogman10 wrote:
| Not surprising to me. People are complaining about how
| difficult it is to know k8s when you talk about the basic
| default objects. Getting into the weeds of how the api and
| control plane work (especially since it has little impact
| on day to day dev) is something devs tend to just avoid.
| p_l wrote:
| Honestly, devs of the applications that run on top
| probably should not have to worry about it. Instead have
| a platform team provide the necessary features.
| hobofan wrote:
| Yeah, with plain Kubernetes I'd also see the practical limit
| around ~0.5s. If you are on GKE Autopilot where you also have
| little control over node startup there is likely also a lot
| more unpredictability.
|
| Something like Knative can allow for faster startup times if
| you follow the common best-practices (pre-fetching images,
| etc.), but I'm not sure if it supports enough of the
| session-related features that you were probably looking for to
| be a stand-in for Plane.
| fifilura wrote:
| There is nothing wrong with k8s it is a nice piece of technology.
|
| But the article trending here a couple of days ago describes it
| well. https://www.theolognion.com/p/company-forgets-why-they-
| exist...
|
| It is complex enough to make the k8s maintainers the heroes of
| the company. And this is where things tend to go sideways.
|
| It has enough knobs and levers to distract the project from what
| they are actually trying to achieve.
| cedws wrote:
| I see Kubernetes the same way as git. Elegant fundamental
| design, but the interface to it is awful.
|
| Kubernetes is designed to solve big problems and if you don't
| have those problems, you're introducing a tonne of complexity
| for very little benefit. An ideal orchestrator would be more
| composable and not introduce more complexity than needed for
| the scale you're running at. I'd really like to see a modern
| alternative to K8S that learns from some of its mistakes.
| pphysch wrote:
| Git is a much more subtle abstraction than k8s though. You
| can be blissfully unaware that a directory is a git repo, and
| still read/patch files.
|
| You cannot pretend k8s doesn't exist in a k8s system.
| PUSH_AX wrote:
| I was once talking to an ex google site reliability engineer. He
| said there are maybe a handful of companies in the world that
| _need_ k8s. I tend to agree. A lot of people practice hype driven
| development.
| x86x87 wrote:
| I tend to agree. K8s makes a lot of sense if you are running
| your own bare metal servers at scale.
|
| If you are already using the cloud, maybe leverage abstraction
| already available in that context.
| candiddevmike wrote:
| You either recreate a less reliable version of kubernetes for
| workload ops or you go all in on your cloud provider and hope
| they'll be responsible for your destiny.
|
| Vanilla Kubernetes is just enough abstraction to avoid both
| of those situations.
| x86x87 wrote:
| You cannot really be cloud agnostic these days - even when
| using k8s. So learning to use the capabilities the cloud
| provides is key.
| p_l wrote:
| Doesn't really mesh with my experience, especially the
| longer k8s has been out.
|
| It can be _cheaper_ to depend on cloud provider to ship
| some features, but with tools like crossplane you can
| abstract that out so developers can just "order" a
| database service etc. for their application.
| PUSH_AX wrote:
| Is "hope" the new replacement for SLAs? Or am I missing
| something with that statement?
| k8sToGo wrote:
| SLAs do not prevent something from breaking,
| unfortunately. They are just a blame construct.
| p_l wrote:
| "Hope" that your cloud provider matches as well your
| needs as you thought, that vendor lock-in doesn't let
| them milk you with high prices, etc. etc.
|
| None of that is prevented with SLA
| PUSH_AX wrote:
| This requires the same skill and experience as figuring
| out if k8s is going to be a good fit.
|
| Arguably if you can't evaluate the raw cloud offerings
| and jump on a supposed silver bullet you need to stop
| immediately.
| p_l wrote:
| At this point I found out that k8s knowledge is more
| portable, whereas your trove of $VENDOR_1 knowledge might
| suddenly have issues because, for reasons outside of your
| capacity to control, there's now a big spending contract
| signed with $VENDOR_2 and a mandate to move.
|
| And with smaller companies I tend to find k8s way more
| cost effective. I pulled off things I wouldn't be able to fit
| in a budget otherwise.
| k8sToGo wrote:
| I joined a team that used AWS without kubernetes. Thousands
| of fragile weird python and bash scripts. Deployment was
| always such a headache.
|
| A few months later I transitioned the team to use containers
| with proper CI/CD and EKS with Terraform and Argo CD. The
| team and also the managers like it, since we could deploy
| quite quickly.
| evantbyrne wrote:
| This is an apples-to-oranges comparison. You would still
| have to write and maintain glue without the presence of a
| proper CD.
| PUSH_AX wrote:
| Thanks for the anecdote k8sToGo
| misiti3780 wrote:
| if not k8s, what would other people be using? ECS?
| k8sToGo wrote:
| From my experience, classical VMs with self written Bash
| scripts. The horror!
| kenhwang wrote:
| If you're on AWS, yeah, I'd say just use ECS until you need
| more complexity. Our ECS deployments have been unproblematic
| for years now.
|
| Our K8s clusters never go more than a couple of days without
| some sort of strange issue popping up. Arguably it could be
| because my company outsourced maintenance of it to an army of
| idiots. But K8s is a tool that is only as good as the
| operator, and competence can be hard to come by at some
| companies.
| p_l wrote:
| K8s or no K8s, outsource to the lowest bidder and you'll get
| an unworkable platform :|
| kenhwang wrote:
| Agreed. But if you're already on AWS, I'd say the quality
| floor is already higher than the potential at 95%+ of
| other companies.
|
| So I say unless you're at a company that pays top
| salaries for the top 5% of engineering talent, you're
| probably better off just using the AWS provided service.
| p_l wrote:
| I used to have a saying back when Heroku was more in
| favour: you use Heroku because you want to go
| bankrupt. AWS is at times similar.
|
| Depending on your local market, AWS bills might be way
| worse than the cost of a few bright ops people who will let
| you choose from offerings including running dev envs on a
| random assortment of dedicated servers and local e-waste
| escapees.
| liveoneggs wrote:
| ECS is so nice and simple. http://kubernetestheeasyway.com
| nprateem wrote:
| Cloud run, etc, but there seem to be some biggish gaps in
| what those tools can do (probably because if deploying a
| container was too easy the cloud providers would lose loads
| of profit).
| evantbyrne wrote:
| I honestly think docker compose is the best default option
| for single-machine orchestration. The catch is that you
| either need to do some scripting to get fully automated zero
| downtime deploys. I have to imagine someone will eventually
| figure out a way to trivialize that, if they haven't already.
| Or, you could just do the poor man's zero downtime deploy:
| run two containers, deploy container a, wait for it to be
| ready, then deploy container b, and let the reverse proxy do
| the rest.
| KronisLV wrote:
| Docker Swarm takes the Compose format to multi-node clusters
| with load balancing, while keeping things pretty simple and
| manageable, especially with something like Portainer!
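|
| The on-ramp is pleasantly short too -- roughly (stack/service
| names made up):
|
|     docker swarm init
|     docker stack deploy -c docker-compose.yml myapp
|     docker service scale myapp_web=3
|
| with `docker swarm join` on the other nodes.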
|
| For larger scale orchestration, Hashicorp Nomad can also be
| a notable contender, while in some ways still being simpler
| than Kubernetes.
|
| And even when it comes to Kubernetes, distros like K3s and
| tools like Portainer or Rancher can keep managing the
| cluster easy.
| geodel wrote:
| And that hype is in large part created by Google and other
| cloud vendors.
|
| To be honest I hardly see any reasonable/actionable advice from
| Cloud/SAAS vendors. Either it is to sell their stuff or generic
| stuff like "One should be securing / monitoring their stuff
| running in prod". Oh wow, never thought or done any such thing
| before.
| imiric wrote:
| That might be true, but unfortunately the state of the art
| infrastructure tooling is mostly centered around k8s. This
| means that companies choose k8s (or related technologies like
| k3s, Microk8s, etc.) not because they strictly _need_ k8s, but
| because it improves their workflows. Otherwise they would need
| to invest a disproportionate amount of time and effort adopting
| and maintaining alternative tooling, while getting an inferior
| experience.
|
| Choosing k8s is not just based on scaling requirements anymore.
| There are also benefits of being compatible with a rich
| ecosystem of software.
| PUSH_AX wrote:
| Can you specify what state of the art infra tooling you mean?
| imiric wrote:
| Continuous deployment systems like ArgoCD and Flux, user
| friendly local development environments with tools like
| Tilt, novel networking, distributed storage, distributed
| tracing, etc. systems that are basically plug-and-play,
| etc. Search for "awesome k8s" and you'll get many lists of
| these.
|
| It's surely possible to cobble all of this together without
| k8s, but k8s' main advantage is exposing a standardized API
| that simplifies managing this entire ecosystem. It often
| makes it worth the additional overhead of adopting,
| understanding and managing k8s itself.
| k8sToGo wrote:
| I push for k8s because I _know_ it. Why not use something that
| I know how to use? I know how to quickly set up a cluster, what
| to deploy, and teach other team members about fundamentals.
|
| How many people out there really _need_ C# or object oriented
| programming?
|
| The argument you present might be valid if you decide to use a
| tech stack prior to having much experience with it.
| nprateem wrote:
| Yeah that's the point. You know it and stuff everyone else.
| p_l wrote:
| Some custom bash/python/ansible monstrosity is only going
| to be known by a few brains in the world.
|
| With k8s it is remarkably easier to retain institutional
| knowledge, as well as to spread it.
| nprateem wrote:
| If you're expecting app/FE devs to have to learn it
| you're putting a ton of barriers in their way in terms of
| deploying. Just chucking a container on a non-k8s managed
| platform (e.g. Cloud Run) would be much simpler, and no
| pile of bash scripts.
| p_l wrote:
| PaaSes are for companies with money to burn, most of the
| time. A good k8s team (even a single person, to be quite
| honest) is going to work towards providing your
| application teams with simple templates to let them
| deploy their software easily. Just let them do it.
|
| Also, in my experience, you either have to spend
| ridiculous amounts of money on SaaS/PaaS, or you find
| that you have to host a lot more than just your
| application and suddenly the deployment story becomes
| more complex.
|
| Depending on where you are and how much you're willing to
| burn money, you might find out that k8s experts are
| cheaper than the money saved by not going PaaS.
| foverzar wrote:
| > If you're expecting app/FE devs to have to learn it
|
| Why would anyone expect it? It's not their job, is it? We
| don't expect backend devs to know frontend and vice-
| versa, or any of them to have AWS certification. Why
| would it be different with k8s?
|
| > Just chucking a container on a non-k8s managed platform
| (e.g. Cloud Run) would be much simpler, and no pile of
| bash scripts.
|
| Simpler to deploy, sure, but not to actually run it
| seriously in the long term. Though, if we are talking
| about A container (as in singular), k8s would indeed be
| some serious over-engineering
| k8sToGo wrote:
| If it's about the knowledge of everyone else, why was I
| hired as a _cloud_ engineer? Everyone else in my team was
| more R&D.
| planetafro wrote:
| Just a thought as well in my corpo experience: Unfortunately,
| there are some spaces that distribute solutions as k8s-only...
| Which sucks. I've noticed this mostly in the data
| science/engineering world. These are solutions that could be
| easily served up in a small docker compose env. The
| complexity/upsell/devops BS is strong.
|
| To add insult to injury, I've seen more than one use IaC cloud
| tooling as an install script vs a maintainable and idempotent
| solution. It's all quite sad really.
| p_l wrote:
| There's a difference between _need it or you don't survive_ and
| _it improves our operations_.
|
| The former is a very small set involving having huge amounts of
| bare metal systems.
|
| The latter is a surprisingly large set of companies, sometimes
| even with one server.
| Thaxll wrote:
| It's a dumb statement, especially from an SRE; it's typically a
| comment from people that don't understand k8s and think that
| k8s is only there to have the SLA of Google.
|
| For most use cases k8s is not there to give you HA but to give
| you a standard way of deploying a stack, be that on the
| cloud or on prem.
| PUSH_AX wrote:
| He understood it fully; he was running a multi-day course on
| it when I spoke to him. He was candid about the tech; most of
| us were there at the behest of our orgs.
| p_l wrote:
| In my personal experience, Google SREs as well as k8s devs
| sometimes didn't grok how wide k8s usability was - they
| also can be blind to financial aspects of companies living
| outside of Silly Valley.
| throwawaaarrgh wrote:
| Most companies in the world don't need to develop software.
| Software development itself is hype. But there's lots of money
| in it, despite no actual value being created most of the time.
| rwmj wrote:
| > _It's also worth noting that we don't administer Kubernetes
| ourselves_
|
| This is the key point. Even getting to the point where I could
| install Kubernetes myself on my own hardware took weeks, just
| understanding what hardware was needed and which of the (far too
| many) different installers I had to use.
| LegibleCrimson wrote:
| I found k3s pretty easy to spin up.
| hobofan wrote:
| OT: Can something be done about HN commenting culture so that the
| comments stay more on topic?
|
| Some technologies (like Kubernetes) tend to attract discussions
| where half of the commenters completely ignore the original
| article, so we end up having a weekly thread about Kubernetes
| where the points of the article (which are interesting) can't
| be discussed because they are drowned out by the same
| unstructured OT discussions.
|
| At the time of this posting there are ~20 comments with ~2
| actually having anything to do with the points of the article
| rather than Kubernetes in general.
| cogman10 wrote:
| Having read the article, isn't the point of the article
| kubernetes in general and what the author prescribes you sign
| up for/avoid?
|
| Discussions of k8s pitfalls and successes in general seem to
| be very much in line with what the article is advocating. And,
| to that point, there's frankly just not a whole lot interesting
| in this article for discussion: "We avoid yaml and operators"...
| Neat.
| hobofan wrote:
| > Having read the article, isn't the point of the article
| kubernetes in general and what the author prescribes you sign
| up for/avoid?
|
| Yeah, and I think that provides a good basis to discussion,
| where people can critique/discuss whether the evaluations that
| the author has made are correct (which a few comments are
| doing). At the same time a lot of that discussion is being
| displaced by what I would roughly characterize as "general
| technology flaming" which isn't going anywhere productive.
| geodel wrote:
| Huh, this article has hardly anything deep, technical, thought
| provoking or unique compared to ten thousand other Kubernetes
| articles.
|
| I am rather happy that people are having general purpose
| discussion about K8s.
| freedomben wrote:
| What you're seeing is the early crowd. With most (not all)
| posts, comments will eventually rise to the top that are more
| what you're looking for. IME it usually takes a couple hours.
| If it's a post where I really want to read the relevant
| comments, I'll usually come back at least 8 to 12 hours later
| and there's usually some good ones to choose from. Even topics
| like Apple that attract the extreme lovers and haters tend to
| trend this direction
| pvg wrote:
| The solution to that is to flag boring/generic articles and/or
| post/upvote more specific, interesting articles. Generic
| articles produce generic, mostly repetitive comments, but then
| again that's the material the commenters are given.
| Kab1r wrote:
| I almost feel attacked for using plain yaml, helm, cert-manager
| AND the ingress api just for personal homelab shenanigans.
| cogman10 wrote:
| Yeah, I disagree with the OP on the dangers there. They work
| fairly well for us and aren't a source of headaches. Though, I
| still try and teach my dev teams that "just because bitnami
| puts in variables everywhere, doesn't mean you need to. We
| aren't trying to make these apps deployable on homelabs."
| __MatrixMan__ wrote:
| > But we often do multiple deploys per day, and when our products
| break, our customer's products break for their users. Even a
| minute of downtime is noticed by someone.
|
| Kubernetes might be the right tool for the job if we accept that
| this is a necessary evil. But maybe it's not? The idea that I
| might fail to collaborate with you because a third party failed
| because a fourth party failed kind of smells like a recipe for
| software that breaks all the time.
| paulgb wrote:
| It really comes down to, I don't ever want to have the
| conversation "is this a good time to deploy, or should we wait
| until tonight when there's less usage". We have had some
| periods where our system was more fragile, and planning our
| days around the least-bad deployment window was a time suck,
| and didn't scale to our current reality of round-the-clock
| usage.
| hellcow wrote:
| You can achieve this without k8s, though. If your goal is, "I
| want zero-downtime deploys," that alone is not sufficient
| reason to reach for something as massively complex as k8s.
| Set up a reverse proxy and do blue-green deploys behind it.
| paulgb wrote:
| > Set up a reverse proxy and do blue-green deploys behind
| it.
|
| That's what I currently use Kubernetes for. What stack are
| you proposing instead?
| sureglymop wrote:
| If you only need zero downtime deployments, compose and
| traefik/caddy are enough.
|
| If you need to replicate storage, share networks and
| otherwise share resources across multiple hosts,
| kubernetes is better suited.
|
| But you'll also have much less control with compose, e.g.
| no limiting of egress/ingress and more.
| paulgb wrote:
| As I see it, managed Kubernetes basically gives me the
| same abstraction I'd have with Compose, except that I can
| add nodes easily, have some nice observability through
| GKE, etc. Compose might be simpler if I were running the
| cluster myself, but because GKE takes care of that, it's
| one less thing that I have to do.
| danenania wrote:
| "Set up a reverse proxy and do blue-green deploys behind
| it."
|
| I think this already introduces enough complexity and edge
| cases to make reinventing the wheel a bad idea. There's a
| lot involved in doing it robustly.
|
| There are alternatives to Kubernetes (I prefer ECS/Fargate
| if you're on AWS), but trying to do it yourself to a
| production-ready standard sets you up for a lot of
| unnecessary yak shaving imho.
| boxed wrote:
| For small scales you can use Dokku. I do. It's great and
| simple.
| freedomben wrote:
| This sounds like terrible advice. Managing a reverse proxy
| with blue-green deploys behind it is not going to be
| trivial, and you have to roll most of that yourself. The
| deployment scripts alone are going to be hairy. Getting the
| same from K8s requires having a deploy.yaml file and a
| `kubectl apply -f <file>`. K8s is way less complex.
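|
| And the zero-downtime part is mostly just two stanzas in that
| deploy.yaml -- a rolling-update strategy plus a readiness probe
| (numbers and paths made up):
|
|     spec:
|       strategy:
|         rollingUpdate:
|           maxUnavailable: 0   # never take old pods down early
|           maxSurge: 1         # bring up one new pod at a time
|       template:
|         spec:
|           containers:
|           - name: app
|             image: myapp:v2
|             readinessProbe:   # only route traffic once healthy
|               httpGet:
|                 path: /healthz
|                 port: 8080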
| hellcow wrote:
| I ran such a system in prod over 7 years with >5 nines of
| uptime, multiple deploys per day, and millions of users
| interacting with it. Our deploy scripts were ~10 line
| shell scripts, and any more complex logic (e.g. batching,
| parallelization, health checks) was done in a short Go
| program. Anyone could read and understand it in full. It
| deployed much faster than our equivalent stack on k8s.
|
| k8s is a large and complex tool. Anyone who's run it in
| production at scale has had to deal with at least one
| severe outage caused by it.
|
| It's an appropriate choice when you have a team of k8s
| operators full-time to manage it. It's not necessarily an
| appropriate choice when you want a zero-downtime deploy.
| freedomben wrote:
| > _It 's an appropriate choice when you have a team of
| k8s operators full-time to manage it._
|
| Are you talking about a full self-run type of scenario
| where you setup and administer k8s entirely yourself, or
| a managed system or semi-managed (like OpenShift)?
| Because if the former then I would agree with you,
| although I wouldn't recommend a full self-run unless you
| were a big enough corp to have said team. But if you're
| talking about even a managed service, I would have to
| disagree. I've been running for years on a managed
| service (as the only k8s admin) and have never had a
| severe outage caused by K8s
| esafak wrote:
| Is your short Go program public? I'm curious how you
| handled progressive rollouts, and automated rollbacks.
| hellcow wrote:
| It isn't, sadly, but the logic is straightforward. Have a
| set of IPs you target, iterate with your deploy script
| targeting each, check health before continuing. If
| anything doesn't work (e.g. health check fails), stop the
| deploy to debug. There's no automated rollback--simply
| `git revert` and run the deploy script again.
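|
| The shape of it was basically this (hosts, paths and ports here
| are made up; ours had a bit more around batching):
|
|     #!/usr/bin/env bash
|     set -euo pipefail
|     for host in app1 app2 app3; do
|       scp ./build/server "$host":/srv/app/server.new
|       ssh "$host" 'mv /srv/app/server.new /srv/app/server'
|       ssh "$host" 'systemctl restart app'
|       # block until the node is healthy again; `set -e` stops
|       # the whole deploy if this times out
|       timeout 60 bash -c \
|         "until curl -fsS http://$host:8080/healthz; do sleep 1; done"
|     done
|
| -- readable in full by anyone, like I said.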
| zer00eyz wrote:
| >> Managing a reverse proxy with blue-green deploys
| behind it is not going to be trivial, and you have to
| roll most of that yourself.
|
| There are a lot of reverse proxies that will do this.
| Traditionally this was the job of a load balancer. With
| that being done by "software" you get the fun job of
| setting it up!
|
| The hard part is doing it the first time, and having a
| sane strategy. What you want to do is identify and
| segment a portion of your traffic. Mostly this means
| injecting a cookie into the segmented traffic's HTTP(S)
| requests. If you don't have a group of users consistently
| on the new service you get some odd behavior.
|
| The deployment part is easy. Because you're running things
| concurrently, ports matter. Just have the alternate
| version deployed on a different port. This is not a big
| deal and is super easy to do. In fact your deployments
| are probably set up to swap ports anyway. So all you're
| doing is not committing to a final step in that process.
|
| But... what if it is a service-to-service call inside
| your network? That too should be easy. You're passing IDs
| around between calls for tracing, right? Rather than
| "random cookie" you're just going to route based on
| these. Again easy to do in a reverse proxy, easier in a
| load balancer.
|
| It's not like easy blue green deploys are some magic of
| Kubernetes. We have been doing them for a long time.
| They were easy to do once set up (and highly scripted as
| a possible path for any normal deployment).
|
| Kubernetes is to operations what Rails is to
| programming... It's good, fast, helpful... till it isn't, and
| then you're left having buyer's remorse.
| teeray wrote:
| Is there something like a k1s? What I'd love is "run this set of
| containers on this machine. If the machine goes down, I don't
| care--I will fix it." If it were wired into nginx or caddy as
| well, so much the better. Something like that for homelab use
| would be
| wonderful.
| eropple wrote:
| You've basically described k3s, I think. I run it in my homelab
| (though I am enough of a tryhard to have multiple control
| planes) as well as on a couple of cloud servers as container
| runtimes (trading some overhead for consistency).
|
| k3s really hammers home the "kubernetes is a set of behaviors,
| not a set of tools" stuff when you realize you can ditch etcd
| entirely and use sqlite if you really want to, and is a good
| learning environment.
| szszrk wrote:
| That's basically just a docker-compose.
|
| If you want something crazy all-in-one for homelab check out
| https://github.com/azukaar/Cosmos-Server
| silverquiet wrote:
| Docker Compose probably fits the bill for that. They also have
| a built in minimalist orchestrator called Swarm if you do want
| to extend to multiple machines. I suppose it's considered
| "dead" since Kubernetes won mindshare, but it still gets
| updates.
| morbicer wrote:
| For homelab, Docker compose should be enough
|
| For something more production oriented
| https://github.com/basecamp/kamal
| pheatherlite wrote:
| Docker bare bones or docker compose. Run as systemd services
| and have docker run the container as a service account. Manual
| orchestration is all you need. Anything else like rancher or
| whatever is just fluff.
| 7sidedmarble wrote:
| That's called docker compose
| blopker wrote:
| I run all my projects on Dokku. It's a sweet spot for me
| between a barebones VPS with Docker Compose and something a lot
| more complicated like k8s. Dokku comes with a bunch of solid
| plugins for databases that handle backups and such. Zero
| downtime deploys, TLS cert management, reverse proxies, all out
| of the box. It's simple enough to understand in a weekend and
| has been quietly maintained for many years. The only downside
| is it's meant mostly for single server deployments, but I've
| never needed another server so far.
|
| https://dokku.com/
| josegonzalez wrote:
| Just a note: Dokku has alternative scheduler plugins, the
| newest of which wraps k3s to give you the same experience
| you've always had with Dokku but across multiple servers.
| boxed wrote:
| Dokku really is a game changer for small business. It makes
| me look like a magician with deploys in < 2m (most of which
| is waiting for GitHub Actions to run the tests first!) and no
| downtime.
| throwaway892238 wrote:
| When people call Kubernetes a "great piece of technology", I find
| it the same as people saying the United States is the "greatest
| country in the world". Oh yeah? Great in what sense? Large?
| Absolutely. Powerful? Definitely. But then the adjectives sort of
| take a turn... Complicated? Expensive? Problematic? Threatening?
| A quagmire? You betcha.
|
| If there were an alternative to Kubernetes that were just 10%
| less confusing, complicated, opaque, monolithic, clunky, etc, we
| would all be using it. But because Kubernetes exists, and
| everyone is using it, there's no point in trying to make an
| alternative. It would take years to reach feature parity, and
| until you do, you can't really switch away. It's like you're
| driving an 18-wheeler, and you think it kinda sucks, but you
| can't just buy and then drive a completely different 18 wheeler
| for only a couple of your deliveries.
|
| You probably will end up using K8s at some point in the next 20
| years. There's not really an alternative that makes sense. As
| much as it sucks, and as much as it makes some things both more
| complicated and harder, if you actually need everything it
| provides, it makes no sense to DIY, and there is no equivalent
| solution.
| p_l wrote:
| People forgot just how much of a mess the Mesos environment was
| in comparison.
|
| And the often-pushed Nomad to this day surprises me by randomly
| missing a feature or two that turns out to be impactful enough
| to make the extra complexity worth it, because ultimately the
| result was less complexity in total.
| madduci wrote:
| I don't get most of the blame and reasoning.
|
| Sure, everyone has their own product and experience and it's fine
| to express it, but I don't get the usage of other decisions such
| as "no to services meshes", "no to helm" and many more.
|
| You don't want to ideally reinvent the wheel for every workload
| you need (say you need a OIDC endpoint, an existing application):
| you are tempted to write everything from scratch by yourself,
| which is also fine, but the point is: why?
|
| Many products deliver their own Helm package. And if you are sick
| of writing YAML, I would look for Terraform over Pulumi, for the
| reason that you use the same tool for bringing up Infrastructure
| and then workloads.
|
| Kubernetes itself isn't easy to be used, in many cases you don't
| need it, but it might bring you nice things straight out of the
| box with less pain than other tooling (e.g. zero downtime
| deployments)
| p_l wrote:
| The problem with Helm is that it did the one thing you should
| not do, and refused to fix it even when they promised to.
|
| They do text-replacement templating for YAML.
|
| I have once spent a month, being quite experienced k8s
| wrangler, trying to figure out why Helm 2 was timing out, only
| to finally trace it down to how sometimes we would get wrong
| number of spaces in some lines.
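|
| For anyone who hasn't hit it, the classic sharp edge is a
| template line along the lines of
|
|     env:
|     {{ toYaml .Values.env | indent 2 }}
|
| -- the engine is just pasting text, so the `indent` count (and
| any whitespace around the `{{ }}` itself) decides whether you
| get the YAML you meant, different-but-still-valid YAML, or not
| YAML at all.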
| eropple wrote:
| I admit that I use some Helm stuff in my home environment, but
| for production I'm genuinely worried about the need to support
| whatever they've thrown into it. At minimum I'm going to have
| to study the chart and understand exactly what they propose to
| open-palm slam into my cluster, and for many/most applications
| at that point it might genuinely be worth just writing a
| manifest myself. Not always. Some applications are genuinely
| complex and need to be! But often, this has been the case for
| me. For all my stuff, though, I use kustomize and I'm pretty
| happy with it; it's too stupid for me to be clever, and this is
| good.
|
| Service meshes are a different kettle of fish. They add exciting
| new points of failure where they need not exist, and while
| there are definitely use cases for them, I'd default to
| avoiding them until somebody proves the need for one.
| pheatherlite wrote:
| Why are people still scared of k8s? At certain workload
| thresholds, it is worth every ounce of effort to maintain it.
| Better yet, go managed.
| cogman10 wrote:
| I honestly don't understand it either. Familiarity? K8s has
| like, what, 5 big concepts to know and once you are there the
| other concepts (generally) just build from there.
|
| - Containers
|
| - Pods
|
| - Deployments
|
| - Services
|
| - Ingresses
|
| There are certainly other concepts you can learn, but you
| aren't often dealing with them (just like you aren't dealing
| with them when working with something like docker compose).
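|
| And the happy path really is just stacking those: a Deployment
| runs the pods, a Service sits in front of them, an Ingress sits
| in front of that. A minimal sketch (names, image and ports made
| up):
|
|     apiVersion: apps/v1
|     kind: Deployment
|     metadata:
|       name: web
|     spec:
|       replicas: 3
|       selector:
|         matchLabels: {app: web}
|       template:
|         metadata:
|           labels: {app: web}
|         spec:
|           containers:
|           - name: web
|             image: myorg/web:1.0
|             ports:
|             - containerPort: 8080
|     ---
|     apiVersion: v1
|     kind: Service
|     metadata:
|       name: web
|     spec:
|       selector: {app: web}
|       ports:
|       - port: 80
|         targetPort: 8080
|     ---
|     apiVersion: networking.k8s.io/v1
|     kind: Ingress
|     metadata:
|       name: web
|     spec:
|       rules:
|       - host: web.example.com
|         http:
|           paths:
|           - path: /
|             pathType: Prefix
|             backend:
|               service:
|                 name: web
|                 port: {number: 80}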
| nprateem wrote:
| Good luck fixing etcd when a major version upgrade breaks. It
| took all weekend to fight that fire when it happened to us.
| p_l wrote:
| Been there, done that, didn't get a t-shirt but got to yell
| at some people for setting up with undersized VMs and
| forgetting to note it anywhere.
|
| Haven't had an issue once I fixed sizing.
| Thaxll wrote:
| Use managed k8s, problem solved.
| nprateem wrote:
| That problem is solved, but there are plenty of other things
| hiding in that 'simple' setup of just 5 concepts.
| jakupovic wrote:
| People don't understand k8s and are thus hating. K8s is a
| wonderful tool for most things many teams need. It may not be
| useful for homelab type of stuff as the learning curve is
| steep, but for professional use it cannot be beat currently.
| Just a bunch of I know what I'm doing and don't need this
| complicated thing I don't understand. Pretty simple and
| especially in a forum such as HN where we all are "experts" and
| need to explain to ourselves, and crucially others, why we are
| right not to use k8s. Bunch of children really.
| axpy906 wrote:
| I clicked expecting something a bit more detailed here.
|
| What are the best resources to learn simple k8s in 2024?
| jakupovic wrote:
| Try putting a simple https app on a managed k8s cluster and use
| google/whatever to figure it out, that should get you started.
| k8sToGo wrote:
| Interesting that they avoid helm. It is the "plug and play"
| solution for Kubernetes. However, that is only in theory. My
| experience with most operators out there was clunky, buggy, or
| very limited and did not expose everything needed. But I still
| end up using helm itself in combination with ArgoCD.
| habitue wrote:
| Helm is just a mess. If you're going to deploy something from
| helm, you're better off taking it apart and reconstructing it
| yourself, rather than depending on it to work like a package
| manager
| hobofan wrote:
| In my experience, if you use first-party charts (= published
| by the same people that publish the packaged software) that
| are likely also provided to enterprise customers you'll have
| a good time (or at least a good starting point). For third-
| party charts, especially for more niche software I'd also
| rather avoid them.
| szszrk wrote:
| I think the important detail here is that he mentions he
| doesn't use it because of operators. That may mean they tried
| it in a previous major version which used Tiller. That was quite
| a long time ago.
|
| That being said, helm templates are disgusting and I absolutely
| hate how easily developers complicate their charts. Even the
| default empty chart has helpers. Why, on Earth, why?
|
| I almost fully relate to OP's approach to k8s, but I think with
| their simplified approach helm (the current one) could work
| quite well.
| stackskipton wrote:
| We avoided Helm as well. We found that Kustomize provides
| enough templating to cover almost all the common use cases and
| it's very easy for anyone to check their work: `kubectl
| kustomize > compiled.yaml`. FluxCD handles postbuild find and
| replace.
|
| At most places, your cluster configuration is probably pretty
| set in stone and doesn't vary a ton.
| neya wrote:
| There were some "hype cycles" (in Gartner's lingo) that I avoided
| during my career. The first one was the MongoDB/NoSQL hype -
| "Let's use NoSQL for everything!" trend. I tried it in a medium
| sized project and burnt my finger and it was right around when HN
| was flooded with full of "Why we migrated to MongoDB" stories.
|
| The next one was Microservices. Everyone was doing something with
| microservices and I was just on a good 'ole Ruby on Rails
| monolith. Again, the HN stories came and went "Why we broke down
| our simple CRUD app into 534 microservices".
|
| The final one was Kubernetes. I was a Cloud consultant in my past
| life and had to work with a lot of my peers who had the freedom
| to deploy in any architecture they saw fit. A bunch of them were
| on Kubernetes and I was just on a standard Compute VM for my
| clients.
|
| We had a requirement from our management that all of us had to
| take some certification courses so we would be easier to pitch
| to clients. So, I prepped for one, read about Kubernetes, and
| tried deploying a bunch of applications, only to realize it was a
| very complex set of moving parts - unnecessarily so, I may add. I
| was never able to understand why this was pushed as normal. It
| made my decision to not use it only stronger.
|
| Over the course of the 5 year journey, my peers' apps would
| randomly fail and they would sometimes be pulled in over the
| weekends to push fixes to avert a P1 situation whilst I would
| be casually chilling in a bar with my friends. My Compute Engine
| VM, to date, to its credit, has had only one P1 situation.
| And that was because the client forgot to renew their domain
| name.
|
| Out of all the 3 hype cycles that I avoided in my career,
| Kubernetes is the one I am most thankful for evading.
| This sort of complexity should not be normalised. I know this
| may be an unpopular opinion on HN, but I am willing to bite the
| bullet and save my time and my clients' money. So, thanks for the
| hater's guide. But, I prefer to remain one. I'd rather call a
| spade a spade.
| dminor wrote:
| Early on in the container hype cycle we decided to convert some
| of our services from VMs to ECS. It was easy to manage and the
| container build times were so much better than AMI build times.
|
| Some time down the road we got acquired, and the company that
| acquired us ran their services in their own Kubernetes cluster.
|
| When we were talking with their two person devops team about
| our architecture, I explained that we deployed some of our
| services on ECS. "Have you ever used it?" I asked them.
|
| "No, thank goodness" one of them said jokingly.
|
| By this time it was clear that Kubernetes had won and AWS was
| planning its managed Kubernetes offering. I assumed that after
| I became familiar with Kubernetes I'd feel the same way.
|
| After a few months though it became clear that all these guys
| did was babysit their Kubernetes cluster. Upgrading it was a
| routine chore and every crisis they faced was related to some
| problem with the cluster.
|
| Meanwhile our ECS deploys continued to be relatively hassle
| free. We didn't even have a devops team.
|
| I grew to understand that managing Kubernetes was fun for them,
| despite the fact that it was overkill for their situation. They
| had architected for scale that didn't exist.
|
| I felt much better about having chosen a technology that didn't
| "win".
| jakupovic wrote:
| So you don't use things you don't understand, valid point.
| But, saying others are using k8s as a way to use up free time
| is pretty useless too as we have managed k8s offerings and
| thus don't need the exercise. If you don't need k8s don't use
| it, thanks. Pretty useless story honestly
| p_l wrote:
| A lot depended on whether ECS fit what you needed. ECSv1,
| even with Fargate, was so limited that my first k8s use was
| pretty much impossible on it at sensible price points, for
| example.
| therealfiona wrote:
| Something struck me here that I've been thinking about. OP
| says a human should never wait for a pod. Agreed, it is annoying
| and sometimes means waiting for an EC2 and the pod.
|
| We have jobs that users initiate that use 80+GB of memory and a
| few dozen cores. We run only one pod per node because the next
| size up EC2 costs a fortune and performance tops out on our
| current size.
|
| These jobs are triggered via a button click that triggers a
| lambda that submits a job to the cluster. If it is a fresh node,
| the user has to wait for the 1GB container image to download from
| ECR. But it is the same image that the automated jobs that kick
| off every few minutes also use, so rarely is there any waiting.
| But sometimes there is.
|
| Should we be running some sort of clustering job scheduler that
| gets the job request and distributes work amongst long-running
| pods in the cluster? My fear is that we just create another layer
| of complexity and still end up waiting for the EC2 instance,
| waiting for the pod image to download, waiting for the agent now
| running on this pod to join the work distribution cluster.
|
| However, we probably could be more proactive with this because we
| could spin up an extra pod+EC2 when the work cluster is 1:1
| job:ec2.
|
| Thoughts?
|
| We're in the process of moving to Karpenter, so all this may be
| solved for us very soon with some clever configuration.
| p_l wrote:
| If you don't want to change the setup too much, consider
| running your nodes off an AMI with the image pre-loaded. Also
| check how exactly the images are layered, so that if necessary
| you can reduce the amount of "first boot patch" download.
| Too wrote:
| There is a difference between waiting and waiting.
|
| For an hourly batch job that already takes 10 minutes to run,
| the extra time for pod scheduling and container downloading is
| negligible anyway.
|
| What you shouldn't do is put pod scheduling in places where
| thousands of users per minute expect sub-second latency.
|
| In your case, if the time for starting up the EC2 instance becomes
| a bigger factor than the job itself, you can add placeholder pods
| that just sleep, requiring exactly that machine config but
| requesting 0 CPUs, just to make sure the node stays online.
| the_duke wrote:
| I know it's fashionable to hate on Kubernetes these days, and it
| is overly complex and has plenty of problems.
|
| But what other solution allows you to:
|
| * declaratively define your infrastructure
|
| * gives you load balancing, automatic recovery and scaling
|
| * provides great observability into your whole stack (kubectl,
| k9s, ...)
|
| * has a huge amount of pre-packaged software available (helm
| charts)
|
| * and most importantly: allows you to stand up mostly the same
| infrastructure in the cloud, on your own servers (k3s), and
| locally (KIND), and thus doesn't tie you into a specific cloud
| provider
|
| The answer is: there isn't any.
|
| Kubernetes could have been much simpler, and probably was
| intentionally built to not be easy to use end to end.
|
| But it's still by far the best we've got.
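|
| To make the first two bullets concrete: "load balance me, keep me
| alive, scale me" is literally just objects you declare. A sketch,
| here via Pulumi's TypeScript SDK (plain YAML is equivalent), which
| assumes a Deployment named "web" with pods labelled app: web
| already exists:
|
|     import * as k8s from "@pulumi/kubernetes";
|
|     // Load balancing: a Service spreads traffic over "web" pods.
|     new k8s.core.v1.Service("web", {
|       spec: {
|         type: "LoadBalancer",
|         selector: { app: "web" },
|         ports: [{ port: 80, targetPort: 8080 }],
|       },
|     });
|
|     // Scaling: an HPA grows/shrinks the Deployment on CPU load.
|     new k8s.autoscaling.v2.HorizontalPodAutoscaler("web", {
|       spec: {
|         scaleTargetRef: {
|           apiVersion: "apps/v1",
|           kind: "Deployment",
|           name: "web",
|         },
|         minReplicas: 2,
|         maxReplicas: 10,
|         metrics: [{
|           type: "Resource",
|           resource: {
|             name: "cpu",
|             target: {
|               type: "Utilization",
|               averageUtilization: 70,
|             },
|           },
|         }],
|       },
|     });
|
| Recovery comes along for free: the Deployment's ReplicaSet
| replaces pods that die, and the kubelet restarts containers that
| fail their liveness probes.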
| dijit wrote:
| Cloud services and Terraform give you those.
|
| You're right that Kubernetes is a bit batteries-included, and
| for that it's tempting to take it off the shelf because it "does
| a lot of needed things", but you don't _need_ one tool to do
| all of those things.
|
| It is ok to have domain specific processes or utilities to
| solve those.
| theossuary wrote:
| You missed what I think is the most important point in OP's
| list: it does all of the above in a cloud agnostic way. If I
| want to move clouds with TF I'm rewriting everything to fit
| into a new cloud's paradigm. With Kubernetes there's a dozen
| providers built in (storage, loadbalancing, networking, auto
| scaling, etc.) or easy to pull in (certificates, KMS secrets,
| DNS); and they make moving clouds (and more importantly)
| running locally much easier.
|
| Kubernetes is currently the best way to wrap up workloads in
| a cloud agnostic way. I've written dozens of services for K8s
| using different deployment mechanisms (Helm, Carvel's kapp,
| Flux, Kustomize) and I can run them just as easily in my home
| K8s cluster and in GCP. It's honestly incredible; I don't
| know of any other cloud tech that lets me do that.
|
| One thing I think a lot of people miss too, is how good the
| concepts around Operators in Kubernetes are. It's hard to see
| unless you've written some yourself, but the theory around
| how operators work is very reminiscent of reactive coding in
| front end frameworks (or robotics closed loop control, what
| they were originally inspired by). When written well they're
| _extremely_ resilient and incredibly powerful, and a lot of
| that power comes from etcd and the established patterns they're
| written with.
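|
| The core of the pattern fits in a few lines. A toy sketch, with
| hypothetical callbacks standing in for the real API machinery:
|
|     // Toy level-triggered reconcile loop - the shape every
|     // operator shares. The three callbacks are hypothetical
|     // stand-ins for real Kubernetes API calls.
|     type State = { replicas: number };
|
|     async function reconcile(
|       getDesired: () => Promise<State>,  // read the CR
|       getObserved: () => Promise<State>, // list objects we own
|       scaleTo: (n: number) => Promise<void>,
|     ): Promise<void> {
|       const desired = await getDesired();
|       const observed = await getObserved();
|       if (observed.replicas !== desired.replicas) {
|         await scaleTo(desired.replicas); // converge toward desired
|       }
|       // Re-run on every watch event or timer tick; because it
|       // compares state instead of reacting to single events, a
|       // missed or duplicated event is harmless.
|     }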
|
| I think Kubernetes is really painful sometimes, and huge
| parts of it aren't great due to limitations of the language
| it's written in; but I also think it's the best thing
| available that I can run locally and in a cloud with a FOSS
| license.
| dijit wrote:
| > it does all of the above in a cloud agnostic way.
|
| I'll give you the benefit of the doubt here and say that
| some of the basics are indeed cloud agnostic.
|
| However, it's plain for many or most to see that outside of
| extremely "toy" workloads you will be learning a specific
| "flavour" of Kubernetes. EKS/GKE/AKS etc; They have, at
| minimum, custom resource definitions to handle a lot of
| things and at their worst have implementation specific
| (hidden) details between equivalent things (persistent
| volume claims on AWS vs GCP, for example, are quite
| substantially different).
| theossuary wrote:
| For multicloud I usually think of my local K8s cluster
| and GKE, it's been a few years since I touched EKS. I'd
| love to hear your opinions on the substantive differences
| you run into. When switching between clouds I'm usually
| able to get away with only changing annotations on
| resources, which is easy enough to put in a values.yml
| file. I can't remember the last time I had to use a
| cloud-specific CRD. What CRDs do you have to reach for
| commonly?
|
| Thinking about it, the things I see as very cloud
| agnostic: Horizontal pod autoscaling, Node autoscaling,
| Layer 4 load balancing, Persistent volumes, Volume
| snapshots, Certificate management, External DNS, External
| secrets, and Ingress (when run in cluster, not through a
| cloud service).
|
| That ends up covering a huge swath of my use cases,
| probably 80-90%. The main pain points I usually run into
| are IAM and trying to use cloud layer 7 ingress (app
| load balancers?).
|
| I totally agree the underlying implementation of
| resources can be very different, but that's not the fault
| of Kubernetes; it's an issue with the implementation from
| the operator of the K8s cluster. All abstractions are
| going to be leaky at this level. But for PVCs I feel like
| StorageClasses capture that well, and can be used to pick
| the level of performance you need per cloud, without
| having to rewrite the common provisioning of block
| devices.
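|
| Concretely, the claim never changes; only the class name
| does. A sketch (the class name is just an example, e.g.
| GKE's "premium-rwo"), via Pulumi's TypeScript SDK:
|
|     import * as k8s from "@pulumi/kubernetes";
|
|     // Same claim on every cluster; only the class differs,
|     // e.g. "premium-rwo" on GKE vs. a local-path class on
|     // a home lab.
|     new k8s.core.v1.PersistentVolumeClaim("data", {
|       spec: {
|         accessModes: ["ReadWriteOnce"],
|         storageClassName: "premium-rwo",
|         resources: { requests: { storage: "50Gi" } },
|       },
|     });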
| elktown wrote:
| Something feels very off and mantra-like about how often
| cloud migration benefits are presented as very important,
| compared to how often such migrations actually happen in
| practice. Not to mention that it also assumes that simpler
| setups are automatically harder to move around between
| clouds, or at least that there is a significant difference
| in required effort.
| theossuary wrote:
| When I say it's easy to move between clouds, I'm not
| referring to an org needing to pick up everything and
| move from AWS to GCP. That is rare, and takes quite a bit
| of rearchitecting no matter what.
|
| When I say something is easy to move, I mean that when I
| build on top of it, it's easy for users to run it in
| their cloud of choice with changes in config. It also
| means I have flexibility with where I choose to run
| something after I've developed it. For example I develop
| most stuff against minikube, then deploy it to GCP or a
| local production k8s. If I was using Terraform I couldn't
| do that.
| the_duke wrote:
| > Cloud and terraform gives you those
|
| * your stack almost always ends up closely tied to one cloud
| provider. I've done and seen cloud migrations. They are so
| painful and costly that they often just aren't attempted.
|
| * Cloud services make it much harder to run your stack
| locally and on CI. There are solutions and workarounds, but
| they are all painful. And you always end up tied to the
| behaviour of the particular cloud services
|
| > but you don't need one tool to do all of those things
|
| To get the same experience, you do. And I don't see why you
| would want multiple tools.
|
| If anything, Kubernetes isn't nearly integrated and full-
| featured enough, because it has too many pluggable parts
| leading to too much choice and interfacing complexity. Like
| pluggable ingress, pluggable state database, pluggable
| networking stack, no simple "end to end app" solution
| (Knative, etc.)... This overblown flexibility is what leads
| to most of the pain and perceived complexity, IMO.
| osigurdson wrote:
| Perhaps a little on the tinfoil hat side of things, but it
| isn't completely unreasonable to think that some of the FUD
| could originate from cloud providers. Kubernetes is a
| commoditizing force to some extent.
| foverzar wrote:
| > This overblown flexibility is what leads to most of the
| pain and perceived complexity, IMO.
|
| Huh, I guess you are spot on. My first experience with
| Kubernetes was k3s, and for a long time I couldn't figure
| out what all the fuss was about and where all that
| complexity people talk so much about was. But then I tried
| vanilla Kubernetes.
| Too wrote:
| Far from it. TF is mostly writing static content, maybe reading
| one or two things. It's missing the runtime aspect, as are most
| cloud offerings without excessive configuration: rollouts, health
| probes, logs, service discovery, just to name a few.
| treflop wrote:
| Aren't you just describing the basic features of an
| orchestrator?
|
| Docker Swarm has all those features for example.
|
| (Not that I am recommending Docker Swarm.)
| mad_vill wrote:
| "Automatic recovery"
|
| That's a joke.
| yjftsjthsd-h wrote:
| The thing is, "kubernetes" doesn't give you that either. You
| want a LB? Here's a list of them that you can add to a cluster.
| But actually pick multiple, because the one you picked in AWS
| doesn't support bare metal.
| hobofan wrote:
| > because the one you picked in AWS doesn't support bare
| metal
|
| That's just because AWS's Kubernetes offering is laughably
| bad.
|
| There is a huge difference in your experience depending on
| whether you use Kubernetes via GKE (Autopilot) or any other
| solution (at least as long as you don't have a dedicated
| infrastructure team).
| davkan wrote:
| Bare metal Kubernetes is certainly a lot less complete out of
| the box when it comes to networking and storage, but people
| can, and often should, use a managed k8s service which
| provides all those things out of the box. And if you're on
| bare metal, once the infra team has abstracted away everything
| into LoadBalancers and StorageClasses, it's basically the same
| experience for end users of the cluster.
| eropple wrote:
| If you're talking about OpenShift on rented commodity
| compute, maybe. If you're talking about GKE/AKS/EKS or
| similar, I disagree wholeheartedly; you're then paying
| several multiples on the compute _and_ a little extra for
| Kubernetes.
| osigurdson wrote:
| Naw, just use systemd, HAProxy and bash scripts. That is much
| "simpler" (for some definition of simple).
|
| Kidding of course. If you need anything approximating
| Kubernetes, use it. If you just need one machine maybe don't.
| p-o wrote:
| I like to think that most people who are upset at Kubernetes
| don't hate all of it. I think the configuration aspect (YAML)
| and the very high level of abstraction are what get people
| lost, and as a result they get frustrated by it. I've certainly
| fallen into that category while trying to learn how to operate
| multiple clusters using different topologies and cloud
| providers.
|
| But from an operational standpoint, when things are working, it
| usually behaves very well until you hit some rough edge cases
| (upgrades were much harder to achieve a couple of years back).
| But rough edges exist everywhere, and when I get to a point
| where K8s hits a problem, I would think that it would be much
| worse if I wasn't using it.
| koolba wrote:
| > I like to think that most people who are upset at
| Kubernetes don't hate on all of it. I think the configuration
| aspect (YAML) ...
|
| I question the competence of anyone who does not question
| (and rag on) the prevalence of templating YAML.
|
| > But rough edges exist everywhere, and when I get to a point
| where K8s hits a problem, I would think that it would be much
| worse if I wasn't using it.
|
| Damn straight. It's only bad because everything else is
| strictly worse.
| dfee wrote:
| Helm isn't YAML. It's a go template file that should
| compile to YAML, masquerading as YAML with that extension.
|
| So yaml formatters break it, humans struggle to generate
| code with proper indents, and it's an insane mess. It's
| horrendous.
| garrettgrimsley wrote:
| >I think the configuration aspect (YAML)
|
| What are the reasons to _not_ use JSON rather than YAML? From
| my admittedly-shallow experience with k8s, I have yet to
| encounter a situation in which I couldn't use JSON. Does
| this issue only pop up once you start using Helm charts?
| kbar13 wrote:
| at the surface level yaml is a lot easier to read and write
| for a human: fewer quotes. but once you start using it for
| complex configuration it becomes unwieldy, and at that point
| json is not any better than yaml.
|
| after using cdk i think that writing typescript to define
| infra is a significantly better experience
| smokel wrote:
| One of the most annoying limitations of JSON is that it
| does not allow for comments.
| mgaunard wrote:
| ever heard of nomad?
| politelemon wrote:
| > and thus doesn't tie you into a specific cloud provider
|
| It ties you to k8s instead, and it ties you to a few company-wide
| heroes, and that is not the 'benefit' it's being touted as here.
|
| Being tied to a cloud is not a horrible situation either. I
| suspect "being tied to a cloud" is a boogeyman that k8s
| proponents would like to spread, but just like with k8s, with
| the right choices, cloud integration is a huge benefit.
| kortilla wrote:
| Being tied to the cloud is fine if you don't care about
| money. Eventually companies do
| matrss wrote:
| > * declarative define your infrastructure
|
| > [...]
|
| > * has a huge amount of pre-packaged software available (helm
| charts)
|
| > * and most importantly: allows you to stand up mostly the
| same infrastructure in the cloud, on your own servers (k3s),
| and locally (KIND), and thus doesn't tie you into a specific
| cloud provider
|
| NixOS. I have no clue about kubernetes, but I think NixOS even
| goes much deeper in these points (e.g. kubernetes is at the
| "application layer" and doesn't concern itself with
| declaratively managing the OS underneath, if I understand
| right). The other points seem much more situational, and if
| needed kubernetes might well be worth it. For something that
| could be a single server running a handful of services, NixOS
| is amazing.
| the_duke wrote:
| I use NixOS, both on servers and on my machines, but it
| solves a completely orthogonal problem.
|
| Kubernetes manages a cluster, NixOS manages a single machine.
| matrss wrote:
| I wouldn't say completely orthogonal. E.g. the points I've
| cited are areas of overlap between the two, and ultimately
| both are meant to host some kind of services. But yes, NixOS
| by itself manages a single machine (although combined with
| Terraform it can become very convenient to also manage a
| fleet of NixOS machines). Kubernetes manages services on a
| cluster, but given how powerful a single machine can be, I
| do think that many of those clusters could also just be one
| beefy server (and maybe a second one with some failover
| mechanism, if needed).
|
| If the cluster is indeed necessary though, I think NixOS
| can be a great base to stand up a Kubernetes cluster on top
| of.
| pxc wrote:
| There are lots of native NixOS tools for managing whole
| clusters (NixOps, Disnix, Colmena, deploy-rs, Morph, krops,
| Bento, ...). Lots of people deploy whole fleets of NixOS
| servers or clusters for specific applications without
| resorting to Kubernetes. (Kube integrations are also
| popular, though.) Some of those solutions are very old,
| too.
|
| Disnix has been around for a long time, probably since
| before you ever heard of NixOS.
| Lucasoato wrote:
| There is no easy solution for managing services and
| infrastructure: people who hate Kubernetes' complexity often
| underestimate the effort of developing on your own all the
| features that k8s provides.
|
| At the same time, people who suggest everyone use Kubernetes
| independently of the company's maturity often forget how easy it
| is to run a service on a simple virtual machine.
|
| In the multidimensional space that contains every software
| project, there is no hyperplane that separates when it's worth
| using Kubernetes from when it's not. It depends on the company,
| the employees, the culture, the business.
|
| Of course there are general best practices, like for example if
| you're just getting started with kubernetes, and already in the
| cloud, using a managed k8s service from your cloud provider
| could be a good idea. But again, even for this you're going to
| find opposing views online.
| g9yuayon wrote:
| When I reflect on what Netflix did back in 2010ish on AWS:
|
| * The declarative infra was EC2/ASG configurations plus Jenkins
| configurations
|
| * Client-side load balancing
|
| * ASG for autoscaling and recovery
|
| * Amazing observability with a home-grown monitoring system built
| by 4 amazing engineers
|
| Most of all, each of the above items was built and run by one or
| two people, except the observability stack with four. Oh, and
| standing up a new region was truly a non-event. It just happened,
| and as a member of the cloud platform team I couldn't even recall
| what I did for the project. It's not that Netflix's infra was
| better or worse than using k8s. I'm just amazed at how happy I
| have been with an infra built more than 10 years ago, and how
| simple it was for end users. In that regard, I often ask myself
| what I have missed in the whole movement of k8s platform
| engineering, other than that people do need a robust solution to
| orchestrate containers.
| p_l wrote:
| A big chunk was companies that don't have netflix-money
| having to bin-pack compute for efficiency.
|
| Or at least that's how I got into k8s, because it allowed me
| to ship for 1/10th the price of my competitor.
| throwawaaarrgh wrote:
| Yes... And? We don't have to be happy with our lot if it sucks.
| kiitos wrote:
| There are an enormous number of tools that meet these
| requirements, most obviously Nomad. But really any competently-
| designed system, defined in terms of any cloud-agnostic
| provisioning system (Chef, Puppet, Salt, Ansible, home-grown
| scripts) would qualify.
|
| And, for the record, observability is something very much
| unrelated to kubectl or k9s.
| liampulles wrote:
| Good article. I used to be a k8s zealot (both CKAD and CKA
| certified) but have come to think that the good parts of k8s are
| the bare essentials (deployments, services, configmaps) and the
| rest should be left for exceptional circumstances.
|
| Our team is happy to write raw YAML and use kustomize, because we
| prefer keeping the config plain and obvious, but we otherwise
| pretty much follow everything here.
| izietto wrote:
| > Hand-writing YAML. YAML has enough foot-guns that I avoid it as
| much as possible. Instead, our Kubernetes resource definitions
| are created from TypeScript with Pulumi.
|
| LOL. So, rather than linting YAML, bring in a whole programming
| language runtime plus a third-party library, adding yet another
| vendor lock-in, having to maintain versions, compile the project,
| moving away from K8S, adding mental overhead...
| p_l wrote:
| Managing structures in a programming language is easier than
| dealing with a finicky _optional_ serialization format.
|
| I have drastically reduced the amount of errors, mistakes, bugs,
| and plain old wtf-induced hair pulling by just mandating avoidance
| of YAML (and Helm) and using Jsonnet. Sure, there was some
| up-front work to write library code, but afterwards? I had people
| introduced to Jsonnet with an example deployment on one day, and
| shipping a production-ready deployment for another app the next
| day.
|
| Something we couldn't get with YAML.
| paulgb wrote:
| We use Pulumi for IaC of non-k8s cloud resources too, so it
| doesn't introduce anything extra. In reality all but the smallest
| Kubernetes services will want _something_ other than hand-written
| YAML: Helm-style templating, HCL, etc. TypeScript gives us type
| safety, and _composable_ type safety. E.g. we have a function that
| encapsulates our best practices for speccing a deployment, and we
| get type safety for free across that function call boundary. You
| can't do that with YAML.
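|
| Not our actual helper - just a sketch of the shape, with invented
| names and defaults, to show what "composable type safety" buys:
|
|     import * as k8s from "@pulumi/kubernetes";
|
|     // Hypothetical helper: every service gets probes, resource
|     // requests and labels by default, and callers can't pass a
|     // malformed spec; the compiler rejects it.
|     interface ServiceSpec {
|       name: string;
|       image: string;
|       port: number;
|       replicas?: number;
|     }
|
|     function deployService(s: ServiceSpec) {
|       const labels = { app: s.name };
|       return new k8s.apps.v1.Deployment(s.name, {
|         spec: {
|           replicas: s.replicas ?? 2,
|           selector: { matchLabels: labels },
|           template: {
|             metadata: { labels },
|             spec: {
|               containers: [{
|                 name: s.name,
|                 image: s.image,
|                 ports: [{ containerPort: s.port }],
|                 readinessProbe: {
|                   httpGet: { path: "/healthz", port: s.port },
|                 },
|                 resources: {
|                   requests: { cpu: "100m", memory: "128Mi" },
|                 },
|               }],
|             },
|           },
|         },
|       });
|     }
|
|     deployService({
|       name: "api",
|       image: "ghcr.io/acme/api:1.2.3", // hypothetical image
|       port: 8080,
|     });
|
| If a team forgets the port or typos a field name, it fails at
| compile time instead of at 2am.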
| Aurornis wrote:
| Most devops disaster stories I've heard lately are the result
| of endless addition of new tools. People join the company, see
| a problem, and then add another layer of tooling to address it,
| introducing new problems in the process. Then they leave the
| company, new people join, see the problems from that new
| tooling, add yet another layer of tooling, continuing the
| cycle.
|
| I was talking to someone from a local startup a couple weeks
| ago who was trying to explain their devops stack. The number of
| different tools and platforms they were using was in the range
| of 50 different things, and they were asking for advice about
| how to integrate yet another thing to solve yet another self-
| inflicted problem.
|
| It was as though they forgot what the goal was and started
| trying to collect as much experience with as many different
| tools as they could.
| izietto wrote:
| Would you believe that there is a company using cdk8s to
| handle its K8S configuration, and that this "infrastructure
| as code" repo ("infrastructure as code", this is the current
| hype) counts 76k lines of YAML and 24k lines of TypeScript to
| manage a bunch of Rails apps together with their related
| services? Like, some of those apps have fewer lines of code
| than that.
| bananapub wrote:
| yaml is objectively a bad language for complicated
| configurations, and once you add string formatting on top of
| it, you now have a complicated and shitty system, yay.
|
| hopefully jsonnet or that apple thing will get more traction
| and popularity.
| nusl wrote:
| k8s is really about you and if it makes sense for your use case.
| It's not universally bad or universally good, and I don't feel
| that there is a minimum team size required for it to make sense.
|
| Managing k8s, for me at least, is a lot easier than juggling
| multiple servers with potentially different hardware, software,
| or whatever else. It's rare that businesses will have machines
| that are all identical. Trying to keep adding machines to a pool
| that you manage manually and keep them running can be very messy
| and get out of control if you're not on top of it.
|
| k8s can also get out of control, though it's easier to reason
| about and understand in this context. E.g. you have eight machines
| of varying specs but all they really have installed is what's
| required to run k8s, so you haven't got as much divergence there.
| You can then use k8s to schedule work across them or ask
| questions about the machines.
| liveoneggs wrote:
| We've found kubernetes to be surprisingly fragile, buggy,
| inflexible, and strictly imperative.
|
| People make big claims but then it's not declarative enough to
| look up a resource or build a dependency tree and then your
| context deadline is exceeded.
| elktown wrote:
| I think an underestimated issue with k8s (et al) is on a cultural
| level. Once you let in complex generic things, it doesn't stop
| there. A chain reaction has started, and before you know it,
| you've got all kinds of components reinforcing each other, that
| are suddenly required due to some real, or just perceived,
| problems that are only there in the first place because of a
| previous step in the chain reaction.
|
| I remember back when the Cloud first started getting a foothold
| that what people were drawn to was that it would enable _reducing_
| the complexity of managing the most frustrating things, like the
| load-balancer and the database, albeit at a price of course, but
| it was still worth it.
|
| Stateless app servers, however, were certainly not a large
| maintenance problem. But somehow we've managed to squeeze things
| like k8s in there anyway; we just needed to evangelize
| microservices to create a problem that didn't exist before. Now
| that this is part of the "culture" it's hard to even get beyond
| hand-wavy rationalizations that microservices are a must,
| presumably because they were the initial spark that triggered the
| whole chain reaction of complexity.
| jupp0r wrote:
| Cloud providers automate things like lease renewals, dealing
| with customs and part time labor contract compliance disputes
| for that datacenter in that Asian country that you don't know
| the language of.
|
| I'm constantly fascinated by how people handwavingly
| underestimate the cost and headaches of actually running
| on-prem global infrastructure.
| robertlagrant wrote:
| > I'm constantly fascinated
|
| To call a halt to your constant fascination: they don't all
| have that problem. They still get the complexity of cloudy
| things regardless when they use one.
| jupp0r wrote:
| They also get some of the complexity of cloudy things when
| they run their own datacenter. In the end you find stuff
| like OpenStack which becomes its own nightmare universe.
| eropple wrote:
| YMMV, but more and more I see people moving to k8s to get
| _away_ from OpenStack, to varying but generally positive
| success.
| ricardobeat wrote:
| There are at least five shades in between on-prem and a
| managed k8s cloud.
| d0mine wrote:
| Could you mention three?
| pmalynin wrote:
| colo racks, rented dedicated servers, ec2 / managed vm
| offerings?
| hipadev23 wrote:
| And colo providers solved those hurdles for decades. Let's
| not act like the only options are cloud, or build your own
| datacenter.
| bradfox2 wrote:
| My startup hosts our own training servers in a colo-ed
| space 10 min from our office. Took less than 40 hours to
| get moved in, with most of the time tinkering with
| fortigate network appliance settings.
|
| Cloudflare zero trust for free is a huge timesaver
| kortilla wrote:
| I'm constantly fascinated by people who think they need
| on-prem global infrastructure when the vast majority of
| applications either have very loose latency requirements
| (multiple seconds) or no users outside of the home country.
|
| Two datacenters on opposite sides of the US from different
| providers will get you more uptime than a cloud provider and
| is super simple.
| jupp0r wrote:
| While some of the complexity goes away when it's on prem in
| two parts of the US, having to order actual hardware, put it
| into racks, and hire, train, and retain the people there to
| debug actual hardware issues when they arise, plus dealing
| with HVAC concerns, etc., is a lot of complexity that's
| probably completely outside of your core business expertise
| but that you'll have to spend mental cycles on when actually
| operating your own data center.
|
| It's totally worth it for some companies to do that, but
| you need to have some serious size to be concerned with
| spending your efforts on lowering your AWS bill by
| introducing details like that into your own organization
| when you could alternatively spend those dollars to make
| your core business run better. Usually your efforts are
| better spent on the latter unless you are Netflix or Amazon
| or Google.
| protomikron wrote:
| Why is it always public cloud (AWS, GCP, Azure) vs.
| "bring your own hardware and deploy it in racks"?
|
| There are multiple providers that offer VPS and
| ingress/egress for a fraction of the cost of public
| clouds and they mostly have good uptime.
| pclmulqdq wrote:
| I recently rented a rack with a telecom and put some of
| my own hardware in it (it's custom weird stuff with
| hardware accelerators and all the FIPS 140 level 4
| requirements), but even the telecom provider was offering
| a managed VPS product when I got on the phone with them.
|
| The uptime in these DCs is very good (certainly better
| than AWS's us-east-1), and you get a very good price with
| tons of bandwidth. Most datacenter and colo providers can
| do this now.
|
| I think people believe that "on prem" means actually
| racking the servers in your closet, but you can get
| datacenter space with fantastic power, cooling, and
| security almost anywhere these days.
| Zircom wrote:
| >I think people believe that "on prem" means actually
| racking the servers in your closet, but you can get
| datacenter space with fantastic power, cooling, and
| security almost anywhere these days.
|
| That's because that is what on prem means. What you're
| describing is colocating.
| pclmulqdq wrote:
| When clouds define "on-prem" in opposition to their
| services (for sales purposes), colo facilities are lumped
| into that bucket. They're not exactly wrong, except a
| rack at a colo is an extension of your premises with a
| landlord who understands your needs.
| jupp0r wrote:
| It's a spectrum:
|
| On top is AWS lambda or something where you are
| completely removed from the actual hardware that's
| running your code.
|
| At the bottom is a free acre of land where you start
| construction and talk to utilities to get electricity and
| water there. You build your own data center, hire people
| to run and extend it, etc.
|
| There is tons of space in between where compromises are
| made by either paying a provider to do something for you
| or doing it yourself. Is somebody from the datacenter
| where you rented a rack or two going in and pressing a
| reset button after you called them a form of cloud
| automation? How about you renting a root VM at Hetzner?
| Is that VM on prem? People who paint these tradeoffs in a
| black and white manner and don't acknowledge that there
| are different choices for different companies and
| scenarios are not doing the discussion a service.
|
| On the other hand, somebody who built their business on
| App Engine or Cloudflare Workers could look at that other
| company that is renting a pet pool of EC2 instances and
| ask if they are even in the cloud or if they are just
| simulating on-prem.
| cangeroo wrote:
| Because their arguments are disingenuous.
|
| It reads like propaganda sponsored by the clouds.
| Scaremongering.
|
| Clouds are incredibly lucrative.
|
| But don't worry. You can make the prices more reasonable
| by making a 3-year commitment to run old outdated
| hardware.
| jupp0r wrote:
| There are tons of examples where low latency is good for
| business, even for small businesses. I'm sure you've seen
| the studies from Amazon that every 100ms of page load
| latency costs them 1% of revenue, etc. Also, anything
| communication-related is very latency sensitive.
|
| Of course there are plenty of scenarios where latency does
| not matter at all.
| groestl wrote:
| So you can trade off 300ms of additional roundtrip time
| (on anything non-CDNable) at a cost of 3% revenue and
| reduce your infrastructure complexity a lot.
| throwaway22032 wrote:
| Not every business is based on impulse buys. Amazon is a
| pretty biased sample there.
| jupp0r wrote:
| Is this agreeing with "Of course there are plenty of
| scenarios where latency does not matter at all." or are
| you trying to make a point?
| threeseed wrote:
| That latency is correlated with revenue is not exclusive
| to Amazon.
|
| And many people who aren't impulse buying will not stick
| around on slow sites.
| pdimitar wrote:
| Disagreed. Once we're not talking about a worldwide shop
| for non-critical buys like Amazon, the picture changes
| dramatically. Many people in local markets have no choice
| and will stick around no matter how slow the service is.
|
| Evidence: my wife buying our groceries for delivery at
| home. We have 4-5 choices in our city. All their websites
| are slow as hell, and I mean adding an item to a cart
| takes a good 5-10 seconds. Search takes 20+ seconds.
|
| She curses at them every time, yet there's nothing we can
| do. The alternative is for both of us to walk 20 minutes
| to the local mall and wait in queues, 2-3 times a week.
| She figured the slow websites are the lesser evil.
| threeseed wrote:
| > Vast majority of applications have no users outside home
| country
|
| Any evidence to back this up? Because on the surface it
| seems like a ridiculous statement.
| elktown wrote:
| Not sure how you could jump all the way to running your
| own Asian datacenter from my post. A bit amusing though :). I
| even wrote that it's worth running the LB/DB in the Cloud?
| jupp0r wrote:
| Oh it was more of an addition to your point about "reducing
| complexity of managing the most frustrating things like the
| load-balancer and the database, albeit at a price of
| course". There is a whole mountain of complexity that most
| software engineers never think about when they dream about
| going back to the good old on prem days.
| elktown wrote:
| Alright, it just feels like taking it a bit too far into the
| exceptions. Even back then only large companies would
| consider that. Renting servers, renting a server rack
| (co-location), or even just an in-office server rack would
| do for what would be a startup today.
| aeturnum wrote:
| > _if a human is ever waiting for a pod to start, Kubernetes is
| the wrong choice._
|
| As someone who is always working "under" a particular set of
| infrastructure choices, I want people who write this kind of
| article to understand something: the people who dislike
| particular infrastructure systems are by and large those who are
| working under sub-optimal uses of them. No one who has the space
| to think about whether their infrastructure choices will work out
| in the future hates any infrastructure system. Their life is
| good. They get to choose, and most everyone agrees that any
| system can be done well.
|
| The haters come from being in situations where a system has not
| been done well - where for whatever combination of reasons they
| are stuck using a system that's the wrong mix of complex /
| monitorable / fragile / etc. It's true enough that, if that
| system had been built with more attention to its needs, people
| would not hate it - but that's just not how people come to hate
| k8s (or any other tool).
| shrubble wrote:
| If Kubernetes is the answer ... you very likely asked the wrong
| questions.
|
| Reading about JamSocket and what it does, it seems that it
| essentially lets you run Docker instances inside the Jamsocket
| infrastructure.
|
| Why not just take Caddy in a clustered configuration, add some
| modules to control Docker startup/shutdown and reduce your
| services usage by 50%? As one example.
| paulgb wrote:
| I'm not sure what you mean by that reducing service usage.
|
| The earliest version of the product really was just nginx
| behind some containers, but we outgrew the functionality of
| existing proxies pretty quickly. See e.g. keys
| (https://plane.dev/developing/keys) which would not be possible
| with clustered Caddy alone.
| shrubble wrote:
| My understanding was that K8s itself has overhead, which
| ultimately has to be paid for, even if using a managed
| service (it might be included in the cost of what you pay, of
| course).
|
| I did add the caveat of "with modules" and the idea of
| sharing values around to different servers would be easy to
| do, since you have Postgres around as a database to hold
| those values/statuses.
| paulgb wrote:
| HTTP proxying is not much of our codebase. I wouldn't want
| to shoehorn what we're doing into being a module of a proxy
| service just to avoid writing that part. That proxy doesn't
| run on Kubernetes currently anyway, so it wouldn't change
| anything we currently use Kubernetes for.
| siliconc0w wrote:
| IMO the big win with Kubernetes is Helm and operators. If you're
| going to pay the complexity costs you might as well get the wins,
| which are essentially a huge 'app store' of popular infrastructure
| components and an entirely programmatic way to manage your
| operations (deployments, updates, fail-overs, backups, etc).
|
| For example, if you want to set up something complex like Ceph,
| Rook is a really nice way to do that. It's a very leaky
| abstraction, so you aren't hiding all the complexity of Ceph, but
| the declarative interface is generally a much nicer way to manage
| Ceph than a boatload of Ansible scripts or whatever we had before.
| The key thing to understand is that Helm and operators don't
| magically turn infrastructure into managed 'turn-key' appliances;
| you generally still need to understand how the thing works.
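|
| To make "declarative interface" concrete: the cluster itself ends
| up as one custom resource that the Rook operator reconciles. An
| abridged sketch via Pulumi's TypeScript SDK (field values are
| illustrative; check Rook's docs for the real spec):
|
|     import * as k8s from "@pulumi/kubernetes";
|
|     // Abridged CephCluster custom resource: you declare the
|     // shape of the cluster and the Rook operator does the
|     // actual Ceph work.
|     new k8s.apiextensions.CustomResource("rook-ceph", {
|       apiVersion: "ceph.rook.io/v1",
|       kind: "CephCluster",
|       metadata: { namespace: "rook-ceph" },
|       spec: {
|         cephVersion: { image: "quay.io/ceph/ceph:v18" },
|         dataDirHostPath: "/var/lib/rook",
|         mon: { count: 3 },
|         storage: { useAllNodes: true, useAllDevices: true },
|       },
|     });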
| jakupovic wrote:
| This article talks about using k8s while trying not to use it as
| much as possible. The first example is operators: they are the
| underlying mechanism that makes k8s possible. To me, taking a
| stance of using k8s but not using operators is less than optimal,
| or plainly stupid. The whole stack is built on operators, which
| you inherently trust by using k8s, yet you choose not to use them
| yourself. Sorry, but this is hard to read.
|
| The only thing I learned is about Caddy as a cert-manager
| replacement, even though I have used, extended and been pretty
| happy with cert-manager. The rest is hard to read ;(.
| hintymad wrote:
| When I checked out an operator repo for some stateful service,
| say, Elasticsearch, the repo would most likely contain tens of
| thousands of lines of YAML and tens of thousands of lines of Go
| code. Is this due to the essential complexity of implementing
| auto-pilot for a complex service, or is it due to massive
| integration with k8s' operator framework?
| patmcc wrote:
| Missed opportunity to title this the h8r's guide to k8s.
| kube-system wrote:
| The people who dislike kubernetes are, in my experience, people
| who don't need to do all of the things kubernetes does. If you
| just need to run an application, it's not what you want.
| erulabs wrote:
| I suppose I'm the guy pushing k8s on midsized companies. If there
| have been unhappy engineers along the way, they've by and large
| stayed quiet and lied about being happier on surveys.
|
| Yes, k8s is complex. The tool matches the problem: complex. But
| having a standard is so much better than having a somewhat
| simpler undocumented chaos. "Kubectl explain X" is a thousand
| times better than even AWS documentation, which in turn was a
| game changer compared to that-one-whiteboard-above-Dave's-desk.
| Standards are tricky, but worth the effort.
|
| Personally I'm also very judicious with operators and CRDs - both
| can be somewhat hidden to beginners. However, the operator pattern
| is wonderful. Another amazing feature is ultra-simple leader
| election - genuinely difficult outside of k8s, a 5-minute task
| inside. I agree with Paul's take here, though, of at least being
| extremely careful about which operators you introduce.
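|
| (For the curious: leader election in k8s boils down to a compare-
| and-swap on a coordination.k8s.io Lease; client-go ships the loop
| ready-made in k8s.io/client-go/tools/leaderelection. A toy sketch
| with hypothetical read/write helpers in place of a real client:)
|
|     // Toy Lease-based leader election. readLease/writeLease are
|     // hypothetical stand-ins for coordination.k8s.io API calls;
|     // writeLease resolves false if someone else wrote first.
|     interface Lease { holder: string; renewedAt: number }
|
|     async function tryLead(
|       me: string,
|       ttlMs: number,
|       readLease: () => Promise<Lease | null>,
|       writeLease: (l: Lease) => Promise<boolean>,
|     ): Promise<boolean> {
|       const now = Date.now();
|       const lease = await readLease();
|       const expired = !lease || now - lease.renewedAt > ttlMs;
|       if (lease && lease.holder !== me && !expired) {
|         return false; // someone else holds a fresh lease
|       }
|       // Take or renew it. The API server's optimistic concurrency
|       // (resourceVersion) is what makes the write safe to race.
|       return writeLease({ holder: me, renewedAt: now });
|     }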
|
| At any rate, yes, k8s is more complex than your bash deploy
| script, of course it is. It's also much more capable and works the
| same way as it did at all your developers' previous jobs. Velocity
| is the name of the game!
| paulgb wrote:
| Good point about k8s vs. AWS docs -- a lot of the time people
| say "just use ECS" or the AWS service of the day, and it will
| invariably be more confusing to me and more vendor-tied than
| just doing the thing in k8s.
| p_l wrote:
| And then if you're unlucky you might hit one of the areas
| where the AWS documentation has a "teaser" about some
| functionality that is critical for your project, you spend
| months looking for the rest of the documentation when initial
| foray doesn't work, and the highly paid AWS-internal
| consultants disappear into thin air when asked about the
| features.
|
| So nearly a year later you end up writing the whole feature
| from scratch yourself.
| pclmulqdq wrote:
| I have to say that I don't believe the problem is all that
| complex unless you make it hard. But on the flip side, if
| you're a competent Kubernetes person, the correct Kubernetes
| config is also not that complex.
|
| I think a lot of the reaction here is a result of the age-old
| issues of "management is pushing software on me that I don't
| want" and people adopting it without knowing how to use it
| because it's considered a "best practice."
|
| In other words, the reaction you probably have to an Oracle
| database is the same reaction that others have to Kubernetes
| (although Oracle databases are objectively crappy).
| throwitaway222 wrote:
| Article makes a good point.
|
| Allow k8s; disallow any service meshes.
| bedobi wrote:
| the "best" infra i ever had was a gig where we
|
| * built a jar (it was a jvm shop)
|
| * put it on a docker image
|
| * put that on an ami
|
| * then had a regular aws load balancer that just combined the ami
| with the correctly specced (for each service) ec2 instances to
| cope with load
|
| it was SIMPLE + it meant we could super easily spin up the
| previous version ami + ec2s in case of any issues on deploys (in
| fact, when deploying, we could keep the previous ones running and
| just repoint the load balancer to them)
|
| ps putting the jar on a docker image was arguably unnecessary, we
| did it mostly to avoid "it works on my machine" style problems
___________________________________________________________________
(page generated 2024-03-03 23:01 UTC)