[HN Gopher] What would a Kubernetes 2.0 look like
___________________________________________________________________
What would a Kubernetes 2.0 look like
Author : Bogdanp
Score : 121 points
Date : 2025-06-19 12:00 UTC (10 hours ago)
(HTM) web link (matduggan.com)
(TXT) w3m dump (matduggan.com)
| zdw wrote:
| Related to this, a 2020 take on the topic from the MetalLB dev:
| https://blog.dave.tf/post/new-kubernetes/
| jauntywundrkind wrote:
| 152 comments on _A Better Kubernetes, from the Ground Up,_
| https://news.ycombinator.com/item?id=25243159
| zug_zug wrote:
| Meh, imo this is wrong.
|
| What Kubernetes is missing most is a 10 year track record of
| simplicity/stability. What it needs most to thrive is a better
| reputation of being hard to foot-gun yourself with.
|
| It's just not a compelling business case to say "Look at what you
| can do with kubernetes, and you only need a full-time team of 3
| engineers dedicated to this technology at the cost of a million a
| year to get bin-packing to the tune of $40k."
|
| For the most part Kubernetes is becoming the common-tongue,
| despite all the chaotic plugins and customizations that interact
| with each other in a combinatoric explosion of
| complexity/risk/overhead. A 2.0 would be what I'd propose if I
| were trying to kill Kubernetes.
| candiddevmike wrote:
| Kubernetes is what happens when you need to support everyone's
| wants and desires within the core platform. The abstraction
| facade breaks and ends up exposing all of the underlying pieces
| because someone needs feature X. So much of Kubernetes'
| complexity is YAGNI (for most users).
|
| Kubernetes 2.0 should be a boring pod scheduler with some RBAC
| around it. Let folks swap out the abstractions if they need it
| instead of having everything so tightly coupled within the core
| platform.
| echelon wrote:
| Kubernetes doesn't need a flipping package manager or charts.
| It needs to do one single job well: workload scheduling.
|
| Kubernetes clusters shouldn't be bespoke and weird with
| behaviors that change based on what flavor of plugins you
| added. That is antithetical to the principle of the workloads
| you're trying to manage. You should be able to headshot the
| whole thing with ease.
|
| Service discovery is just one of many things that should be a
| different layer.
| KaiserPro wrote:
| > Service discovery is just one of many things that should
| be a different layer.
|
| Hard agree. It's like Jenkins: good idea, but it's not
| portable.
| 12_throw_away wrote:
| > It's like Jenkins
|
| Having regretfully operated both k8s and Jenkins, I fully
| agree with this, they have some deep DNA in common.
| sitkack wrote:
| Kubernetes is when you want to sell complexity because
| complexity makes money and naturally gets you vendor
| lock-in even while being ostensibly vendor neutral. Never
| interrupt the customer while they are foot gunning
| themselves.
|
| Swiss Army Buggy Whips for Everyone!
| wredcoll wrote:
| Not really. Kubernetes is still wildly simpler than what
| came before, especially accounting for the increased
| capabilities.
| cogman10 wrote:
| Yup. Having migrated from a puppet + custom scripts
| environment and terraform + custom scripts, K8S is a
| breath of fresh air.
|
| I get that it's not for everyone, I'd not recommend it
| for everyone. But once you start getting a pretty diverse
| ecosystem of services, k8s solves a lot of problems while
| being pretty cheap.
|
| Storage is a mess, though, and something that really
| needs to be addressed. I typically recommend people
| wanting persistence to not use k8s.
| mdaniel wrote:
| > Storage is a mess, though, and something that really
| needs to be addressed. I typically recommend people
| wanting persistence to not use k8s.
|
| I have come to wonder if this is actually an
| _AWS_ problem, and not a _Kubernetes_ problem. I mention
| this because the CSI controllers seem to behave sanely,
| but they are only as good as the requests being fulfilled
| by the IaaS control plane. I secretly suspect that EBS
| just wasn't designed for such a hot-swap world.
|
| Now, I posit this because I haven't had to run clusters
| in Azure or GCP to know if my theory has legs.
|
| I guess the counter-experiment would be to forego the AWS
| storage layer and try Ceph or Longhorn but no company
| I've ever worked at wants to blaze trails about that, so
| they just build up institutional tribal knowledge about
| treating PVCs with kid gloves
| wredcoll wrote:
| Honestly this just feels like kubernetes just solving the
| easy problems and ignoring the hard bits. You notice the
| pattern a lot after a certain amount of time watching new
| software being built.
| mdaniel wrote:
| Apologies, but what influence does Kubernetes have over
| the way AWS deals with attach and detach behavior of EBS?
|
| Or is your assertion that Kubernetes should be its own
| storage provider and EBS can eat dirt?
| wredcoll wrote:
| I was tangenting, but yes, kube providing no storage
| systems has led to a lot of annoying 3rd party ones
| oceanplexian wrote:
| > Yup. Having migrated from a puppet + custom scripts
| environment and terraform + custom scripts, K8S is a
| breath of fresh air.
|
| Having experience in both the former and the latter (in
| big tech) and then going on to write my own controllers
| and deal with fabric and overlay networking problems, not
| sure I agree.
|
| In 2025 engineers need to deal with persistence, they
| need storage, they need high performance networking, they
| need HVM isolation and they need GPUs. When a philosophy
| starts to get in the way of solving real problems and
| your business falls behind, that philosophy will be left
| on the side of the road. IMHO it's destined to go the way
| as OpenStack when someone builds a simpler, cleaner
| abstraction, and it will take the egos of a lot of people
| with it when it does.
| wredcoll wrote:
| > simpler, cleaner abstraction
|
| My life experience so far is that "simpler and cleaner"
| tends to be mostly achieved by ignoring the harder bits
| of actually dealing with the real world.
|
| I used Kubernetes' (lack of) support for storage as an
| example elsewhere, it's the same sort of thing where you
| can look really clever and elegant if you just ignore the
| 10% of the problem space that's actually hard.
| KaiserPro wrote:
| the fuck it is.
|
| The problem is k8s is both an orchestration system and a
| service provider.
|
| Grid/batch/tractor/cube are all much, much simpler to
| run at scale. Moreover, they can support complex
| dependencies (but mapping storage is harder).
|
| but k8s fucks around with DNS and networking, disables
| swap.
|
| Making a simple deployment is fairly simple.
|
| But if you want _any_ kind of ci/cd you need flux, any
| kind of config management you need helm.
| JohnMakin wrote:
| > But if you want _any_ kind of ci/cd you need flux, any
| kind of config management you need helm.
|
| Absurdly wrong on both counts.
| jitl wrote:
| K8s has swap now. I am managing a fleet of nodes with
| 12TB of NVMe swap each. Each container gets (memory limit
| / node memory) * (total swap) swap limit. No way to
| specify swap demand on the pod spec yet so needs to be
| managed "by hand" with taints or some other correlation.
| wredcoll wrote:
| What does swap space get you? I always thought of it as a
| bit of an anachronism, to be honest.
| mdaniel wrote:
| The comment you replied to cited 12TB of NVMe, and I can
| assure you that 12TB of ECC RAM costs way more than NVMe
| selcuka wrote:
| > Let folks swap out the abstractions if they need it instead
| of having everything so tightly coupled within the core
| platform.
|
| Sure, but then one of those third party products (say, X)
| will catch up, and everyone will start using it. Then job ads
| will start requiring "10 years of experience in X". Then X
| will replace the core orchestrator (K8s) with their own
| implementation. Then we'll start seeing comments like "X is a
| horribly complex, bloated platform which should have been
| just a boring orchestrator" on HN.
| dijit wrote:
| Honestly; make some blessed standards easier to use and maintain.
|
| Right now running K8S on anything other than cloud providers and
| toys (k3s/minikube) is a disaster waiting to happen unless you're a
| really seasoned infrastructure engineer.
|
| Storage/state is decidedly not a solved problem, debugging
| performance issues in your longhorn/ceph deployment is just
| _pain_.
|
| Also, I don't think we should be removing YAML, we should instead
| get better at using it as an ILR (intermediate language
| representation) and generating the YAML that we want instead of
| trying to do some weird in-place generation (Argo/Helm
| templating) - Kubernetes sacrificed a lot of simplicity to be
| eventually consistent with manifests, and our response was to
| ensure we use manifests as little as possible, which feels
| incredibly bizarre.
|
| Also, the design of k8s networking feels like it fits ipv6 really
| well, but it seems like nobody has noticed somehow.
| lawn wrote:
| k3s isn't a toy though.
| dijit wrote:
| * Uses Flannel bi-lateral NAT for SDN
|
| * Uses local-only storage provider by default for PVC
|
| * Requires entire cluster to be managed by k3s, meaning no
| freebsd/macos/windows node support
|
| * Master TLS/SSL Certs not rotated (and not talked about).
|
| k3s is _very much_ a toy - a nice toy though, very fun to
| play with.
| zdc1 wrote:
| I like YAML since anything can be used to read/write it. Using
| Python / JS / yq to generate and patch YAML on-the-fly is quite
| nifty as part of a pipeline.
|
| My main pain point is, and always has been, helm templating.
| It's not aware of YAML or k8s schemas and puts the onus of
| managing whitespace and syntax onto the chart developer. It's
| pure insanity.
|
| At one point I used a local Ansible playbook for some
| templating. It was great: it could load resource template YAMLs
| into a dict, read separately defined resource configs, and then
| set deeply nested keys in said templates and spit them out as
| valid YAML. No helm `indent` required.
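|
| The same trick works as a tiny standalone script; a minimal
| sketch with PyYAML (the file name, image, and replica count
| here are made up):
|
|     import yaml
|
|     with open("deployment.tmpl.yaml") as f:
|         manifest = yaml.safe_load(f)
|
|     # set deeply nested keys directly on the dict
|     containers = manifest["spec"]["template"]["spec"]["containers"]
|     containers[0]["image"] = "registry.example.com/app:1.2.3"
|     manifest["spec"]["replicas"] = 3
|
|     # always emits valid YAML, no `indent` gymnastics
|     print(yaml.safe_dump(manifest, sort_keys=False))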
| pm90 wrote:
| yaml is just not maintainable if you're managing lots of apps
| for, e.g., a midsize company or larger. Upgrades become
| manual/painful.
| jcastro wrote:
| For the confusion around verified publishing, this is something
| the CNCF encourages artifact authors and their projects to set
| up. Here are the instructions for verifying your artifact:
|
| https://artifacthub.io/docs/topics/repositories/
|
| You can do the same with just about any K8s related artifact. We
| always encourage projects to go through the process but sometimes
| they need help understanding that it exists in the first place.
|
| Artifacthub is itself an incubating project in the CNCF, ideas
| around making this easier for everyone are always welcome,
| thanks!
|
| (Disclaimer: CNCF Staff)
| calcifer wrote:
| > We always encourage projects to go through the process but
| sometimes they need help understanding that it exists in the
| first place.
|
| Including ingress-nginx? Per OP, it's not marked as verified.
| If even the official components don't bother, it's hard to
| recommend it to third parties.
| johngossman wrote:
| Not a very ambitious wishlist for a 2.0 release. Everyone I talk
| to complains about the complexity of k8s in production, so I
| think the big question is could you do a 2.0 with sufficient
| backward compatibility that it could be adopted incrementally and
| make it simpler. Back compat almost always mean complexity
| increases as the new system does its new things _and_ all the old
| ones.
| herval wrote:
| The question is always what part of that complexity can be
| eliminated. Every "k8s abstraction" I've seen to date either
| only works for a very small subset of stuff (eg the heroku-like
| wrappers) or eventually develops a full blown dsl that's as
| complex as k8s (and now you have to learn that job-specific
| dsl)
| mdaniel wrote:
| Relevant: _Show HN: Canine - A Heroku alternative built on
| Kubernetes_ - https://news.ycombinator.com/item?id=44292103 -
| June, 2025 (125 comments)
| herval wrote:
| yep, that's the latest of a long lineage of such projects
| (one of which I worked on myself). Others include kubero,
| dokku, porter, kr0, etc. There was a moment back in 2019
| where every big tech company was trying to roll out their
| own K8s DSL (I know of Twitter, Airbnb, WeWork, etc).
|
| For me, the only thing that really changed was LLMs -
| chatgpt is exceptional at understanding and generating
| valid k8s configs (much more accurately than it can do
| coding). It's still complex, but it feels like I have a second
| brain to look at it now.
| mrweasel wrote:
| What I would add is "sane defaults", as in unless you pick
| something different, you get a good enough load
| balancer/network/persistent storage/whatever.
|
| I'd agree that YAML isn't a good choice, but neither is HCL. Ever
| tried reading Terraform? Yeah, that's bad too. Inherently we need
| a better way to configure Kubernetes clusters and changing out
| the language only does so much.
|
| IPv6, YES, absolutely. Everything Docker, container and
| Kubernetes should have been IPv6-only internally from the start.
| Want IPv4? That should be handled by a special-case ingress
| controller.
| zdw wrote:
| Sane defaults is in conflict with "turning you into a customer
| of cloud provider managed services".
|
| The longer I look at k8s, the more I see it as "batteries not
| included" around storage, networking, etc., with the result
| being that the batteries come with a bill attached from AWS,
| GCP, etc. K8s is less of an open source project and more of a
| way to encourage dependency on these extremely lucrative gap
| filler services from the cloud providers.
| JeffMcCune wrote:
| Except you can easily install calico, istio, and ceph on used
| hardware in your garage and get an experience nearly
| identical to every hyper scaler using entirely free open
| source software.
| zdw wrote:
| Having worked on on-prem K8s deployments, yes, you can do
| this. But getting it to production grade is very different
| than a garage-quality proof of concept.
| mdaniel wrote:
| I think OP's point was: but how much of that production
| grade woe is the fault of _Kubernetes_ versus, sure,
| turns out booting up a PaaS from scratch is hard as
| nails. I think that k8s's pluggable design also blurs that
| boundary in most people's heads. I can't think of the
| last time the _control plane_ shit itself, versus
| everyone and their cousin has a CLBO story for the
| component controllers installed on top of k8s
| zdw wrote:
| CLBO?
| mdaniel wrote:
| Crash Loop Back Off
| ChocolateGod wrote:
| I find it easier to read Terraform/HCL over YAML for the simple
| fact that it doesn't rely on me trying to process invisible
| characters.
| tayo42 wrote:
| > where k8s is basically the only etcd customer left.
|
| Is that true? Is no one else really using it?
|
| I think one thing k8s would need is some obvious answer for
| stateful systems (at scale, not MySQL at a startup). I think
| there are some ways to do it? Where I work, basically
| everything is on k8s, but all the databases are on their own
| crazy special systems to support; they insist it's impossible
| and costs too much. I work in the worst of all worlds now,
| supporting this.
|
| Re: comments that k8s should just schedule pods: Mesos with
| Aurora or Marathon was basically that. If people wanted that,
| those would have done better. The biggest users of Mesos
| switched to k8s.
| haiku2077 wrote:
| I had to go deep down the etcd rabbit hole several years ago.
| The problems I ran into:
|
| 1. etcd did an fsync on every write and required all nodes to
| complete a write to report a write as successful. This was not
| configurable and far higher a guarantee than most use cases
| actually need - most Kubernetes users are fine with snapshot +
| restore an older version of the data. But it really severely
| impacts performance.
|
| 2. At the time, etcd had a hard limit of 8GB. Not sure if this
| is still there.
|
| 3. Vanilla etcd was overly cautious about what to do if a
| majority of nodes went down. I ended up writing a wrapper
| program to automatically recover from this in most cases, which
| worked well in practice.
|
| In conclusion there was no situation where I saw etcd used that
| I wouldn't have preferred a highly available SQL DB. Indeed,
| k3s got it right using sqlite for small deployments.
| nh2 wrote:
| For (1), I definitely want my production HA databases to
| fsync every write.
|
| Of course configurability is good (e.g. for automated fast
| tests you don't need it), but safe is a good default here,
| and if somebody sets up a Kubernetes cluster, they can and
| should afford enterprise SSDs where fsync of small data is
| fast and reliable (e.g. 1000 fsyncs/second).
| haiku2077 wrote:
| > I definitely want my production HA databases to fsync
| every write.
|
| I didn't! Our business DR plan only called for us to
| restore to an older version with short downtime, so fsync
| on every write on every node was a reduction in performance
| for no actual business purpose or benefit. IIRC we modified
| our database to run off ramdisk and snapshot every few
| minutes which ran way better and had no impact on our
| production recovery strategy.
|
| > if somebody sets up a Kubernetes cluster, they can and
| should afford enterprise SSDs where fsync of small data is
| fast and reliable
|
| At the time one of the problems I ran into was that public
| cloud regions in southeast asia had significantly worse
| SSDs that couldn't keep up. This was on one of the big
| three cloud providers.
|
| 1000 fsyncs/second is a tiny fraction of the real world
| production load we required. An API that only accepts 1000
| writes a second is very slow!
|
| Also, plenty of people run k8s clusters on commodity
| hardware. I ran one on an old gaming PC with a budget SSD
| for a while in my basement. Great use case for k3s.
| dilyevsky wrote:
| 1 and 2 can be overridden via flag. 3 is practically the
| whole point of the software
| haiku2077 wrote:
| With 3 I mean that in cases where there was an
| unambiguously correct way to recover from the situation,
| etcd did not automatically recover. My wrapper program
| would always recover from those situations. (It's been a
| number of years and the exact details are hazy now,
| though.)
| dilyevsky wrote:
| If the majority of quorum is truly down, then you're
| down. That is by design. There's no good way to recover
| from this without potentially losing state so the system
| correctly does nothing at this point. Sure you can force
| it into working state with external intervention but
| that's up to you
| haiku2077 wrote:
| Like I said I'm hazy on the details, this was a small
| thing I did a long time ago. But I do remember our on-
| call having to deal with a lot of manual repair of etcd
| quorum, and I noticed the runbook to fix it had no steps
| that needed any human decision making, so I made that
| wrapper program to automate the recovery. It wasn't
| complex either, IIRC it was about one or two pages of
| code, mostly logging.
| dilyevsky wrote:
| That is decidedly not true. A number of very large companies
| use etcd directly for various needs
| rwmj wrote:
| Make there be one, sane way to install it, and make that method
| work if you just want to try it on a single node or single VM
| running on a laptop.
| mdaniel wrote:
| My day job makes this request of my team right now, and yet
| when trying to apply this logic to a container _and_ cloud-
| native control plane, there are a lot more devils hiding in
| those details. Use MetalLB for everything, even if NLBs are
| available? Use Ceph for storage even if EBS is available?
| Definitely don't use Ceph on someone's 8GB laptop. I can keep
| listing "yes, but" items that make doing such a thing
| impossible to troubleshoot because there's not one consumer
|
| So, to circle back to your original point: rke2 (Apache 2) is a
| fantastic, airgap-friendly, intelligence community approved
| distribution, and pairs fantastically with Rancher Desktop (also
| Apache 2). It's not the _kubernetes_ part of that story which
| is hard, it's the "yes, but" part of the Lego build.
|
| -
| https://github.com/rancher/rke2/tree/v1.33.1%2Brke2r1#quick-...
|
| - https://github.com/rancher-sandbox/rancher-desktop/releases
| fatbird wrote:
| How many places are running k8s without OpenShift to wrap it and
| manage a lot of the complexity?
| jitl wrote:
| I've never used OpenShift nor do I know anyone irl who uses it.
| Sample from SF where most people I know are on AWS or GCP.
| coredog64 wrote:
| You can always go for the double whammy and run ROSA: RedHat
| OpenShift on AWS
| raincom wrote:
| OpenShift, if IBM and Red Hat want to milk the license and
| support contracts. There are other vendors that sell k8s:
| Rancher, for instance. SUSE bought Rancher.
| Melatonic wrote:
| MicroVMs
| geoctl wrote:
| I would say k8s 2.0 needs:
|
| 1. gRPC/proto3-based APIs, to make controlling k8s clusters
| easier from any programming language, not just (practically
| speaking) Golang as is the case currently. This could also
| make dealing with k8s controllers easier and more manageable,
| even though it admittedly might complicate things on the API
| server side when it comes to CRDs.
|
| 2. PostgreSQL or a pluggable storage backend by default
| instead of etcd.
|
| 3. A clear identity-based, L7-aware, ABAC-based access
| control interface that can be implemented by CNIs, for
| example.
|
| 4. Applying userns by default.
|
| 5. An easier pluggable per-pod CRI system where microVMs and
| container-based runtimes can easily co-exist based on the
| workload type.
| jitl wrote:
| All the APIs, including CRDs, already have a well described
| public & introspectable OpenAPI schema you can use to generate
| clients. I use the TypeScript client generated and maintained
| by Kubernetes organization. I don't see what advantage adding a
| binary serialization wire format has. I think gRPC makes sense
| when there's some savings to be had with latency, multiplexing,
| streams etc but control plane things like Kubernetes don't seem
| to me like it's necessary.
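|
| For what it's worth, the same generated-client story exists in
| other languages too; a minimal sketch with the official Python
| client (assumes `pip install kubernetes` and a reachable
| cluster with a kubeconfig):
|
|     from kubernetes import client, config
|
|     config.load_kube_config()  # or load_incluster_config()
|     apps = client.AppsV1Api()
|
|     for d in apps.list_namespaced_deployment("default").items:
|         print(d.metadata.name, d.status.ready_replicas)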
| geoctl wrote:
| I haven't used CRDs myself for a few years now (probably
| since 2021), but I still remember developing CRDs was an ugly
| and hairy experience to say the least, partly due to the
| flaws of Golang itself (e.g. no traits like in Rust, no
| macros, no enums, etc...). With protobuf you can easily
| compile your definitions to any language with clear enum,
| oneof implementations, you can use the standard protobuf
| libraries to do deepCopy, merge, etc... for you and you can
| also add basic validations in the protobuf definitions and so
| on. gRPC/protobuf will basically allow you to develop k8s
| controllers very easily in any language.
| mdaniel wrote:
| CRDs are not tied to golang in any way whatsoever;
| <https://www.redhat.com/en/blog/writing-custom-controller-
| pyt...> and <https://metacontroller.github.io/metacontrolle
| r/guide/create...> are two concrete counter-examples, with
| the latter being the most "microservices" extreme. You can
| almost certainly implement them in bash if you're trying to
| make the front page of HN
| geoctl wrote:
| I never said that CRDs are tied to Golang. I said that
| the experience of compiling CRDs to Golang types (with
| controller-gen, or whatever is being used these days) was
| simply ugly, partly due to the flaws of the language
| itself. What I mean is that gRPC can
| standardize the process of compiling both k8s's own
| resource definitions as well as CRDs to make the process
| of developing k8s controllers in any language simply much
| easier. However this will probably complicate the logic
| of the API server trying to understand and decode the
| binary-based protobuf resource serialized representations
| compared to the current text-based JSON representations.
| znpy wrote:
| > have a well described public & introspectable OpenAPI
| schema you can use to generate clients.
|
| Last time I tried loading the openapi schema in the swagger
| ui on my work laptop (this was ~3-4 years ago, and I had an
| 8th gen core i7 with 16GB RAM) it hung my browser, leading to
| a tab crash.
| mdaniel wrote:
| Loading it in what? I just slurped the 1.8MB openapi.json
| for v1.31 into Mockoon and it fired right up instantly.
| ofrzeta wrote:
| I think the HTTP API with OpenAPI schema is part of what's so
| great about Kubernetes and also a reason for its success.
| dilyevsky wrote:
| 1. The built-in types are already protos. Imo gRPC wouldn't be
| a good fit - actually it will make the system harder to use.
|
| 2. Already can be achieved today via kine[0].
|
| 3. Couldn't you build this today via regular CNI? Cilium
| NetworkPolicies and others basically do this already.
|
| 4, 5 probably don't require 2.0 - they can be easily added
| within the existing API via KEP (cri-o already does userns
| configuration based on annotations).
|
| [0] - https://github.com/k3s-io/kine
| geoctl wrote:
| Apart from 1 and 3, probably everything else can be added
| today if the people in charge have the will to do that, and
| that's assuming that I am right and these points are actually
| that important to be standardized. However the big
| enterprise-tier money in Kubernetes is made from dumbing down
| the official k8s interfaces, especially those related to
| access control (e.g. k8s's own NetworkPolicy compared to
| Istio's access control related resources).
| pm90 wrote:
| Hard disagree with replacing yaml with HCL. Developers find HCL
| very confusing. It can be hard to read. Does it support imports
| now? Errors can be confusing to debug.
|
| Why not use protobuf, or similar interface definition languages?
| Then let users specify the config in whatever language they are
| comfortable with.
| geoctl wrote:
| You can very easily build and serialize/deserialize HCL, JSON,
| YAML or whatever you can come up with outside Kubernetes from
| the client-side itself (e.g. kubectl). This has actually
| nothing to do with Kubernetes itself at all.
| Kwpolska wrote:
| There aren't that many HCL serialization/deserialization
| tools. Especially if you aren't using Go.
| dilyevsky wrote:
| Maybe you know this but Kubernetes interface definitions are
| already protobufs (except for crds)
| cmckn wrote:
| Sort of. The hand-written go types are the source of truth
| and the proto definitions are generated from there, solely
| for the purpose of generating protobuf serializers for the
| hand-written go types. The proto definition is used more as
| an intermediate representation than an "API spec". Still
| useful, but the ecosystem remains centered on the go types
| and their associated machinery.
| dilyevsky wrote:
| Given that I can just take generated.proto and ingest it in
| my software then marshal any built-in type and apply it via
| standard k8s api, why would I even need all the boilerplate
| crap from apimachinery? Perfectly happy with existing
| rest-y semantics - full grpc would be going too far
| dangus wrote:
| Confusing? Here I am working on the infrastructure side
| thinking that I'm working with a baby configuration language
| for dummies who can't code when I use HCL/Terraform.
|
| The idea that someone who works with JavaScript all day might
| find HCL confusing seems hard to imagine to me.
|
| To be clear, I am talking about the syntax and data types in
| HCL, not necessarily the way Terraform processes it, which I
| admit can be confusing/frustrating. But Kubernetes wouldn't
| have those pitfalls.
| mdaniel wrote:
| orly, what structure does this represent?
|     outer {
|       example { foo = "bar" }
|       example { foo = "baz" }
|     }
|
| it reminds me of the insanity of toml [lol]
|     [[whut]]
|     foo = "bar"
|     [[whut]]
|     foo = "baz"
|
| only at least with toml I can $(python3.13 -c 'import
| tomllib, sys; print(tomllib.loads(sys.stdin.read()))') to
| find out, but with hcl too bad
| icedchai wrote:
| Paradoxically, the simplicity itself can be part of the
| confusion: the anemic "for loop" syntax, crazy conditional
| expressions to work around the lack of "if" statements,
| combine this with "count" and you can get some weird stuff.
| It becomes a flavor all its own.
| znpy wrote:
| > Hard disagree with replacing yaml with HCL.
|
| I see some value instead. Lately I've been working on Terraform
| code to bring up a whole platform in half a day (aws sub-
| account, eks cluster, a managed nodegroup for karpenter,
| karpenter deployment, ingress controllers, LGTM stack,
| public/private dns zones, cert-manager and a lot more) and I
| did everything in Terraform, including Kubernetes resources.
|
| What I appreciated about creating Kubernetes resources (and
| helm deployments) in HCL is that it's typed and has a schema,
| so any ide capable of talking to an LSP (language server
| protocol - I'm using GNU Emacs with terraform-ls) can provide
| meaningful auto-completion as well as proper syntax checking (I
| don't need to apply something to see it fail, emacs (via the
| language server) can already tell me what I'm writing is
| wrong).
|
| I really don't miss having to switch between my ide and the
| Kubernetes API reference to make sure I'm filling each field
| correctly.
| wredcoll wrote:
| But... so do yaml and json documents?
| NewJazz wrote:
| I do something similar except with pulumi, and as a result I
| don't need to learn HCL, and I can rely on the excellent
| language servers for e.g. Typescript or python.
| vanillax wrote:
| Agree HCL is terrible. K8s YAML is fine. I have yet to hit a
| use case that can't be solved with its types. If you are doing
| too much perhaps a config map is the wrong choice.
| ofrzeta wrote:
| It's just easier to shoot yourself in the foot with no proper
| type support (or enforcement) in YAML. I've seen Kubernetes
| updates fail when the version field was set to 1.30 and it
| got interpreted as a float 1.3. Sure, someone made a mistake,
| but the config language should/could stop you from making
| them.
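|
| That particular footgun is easy to reproduce (a sketch using
| PyYAML; YAML 1.1 resolves an unquoted 1.30 to a float):
|
|     import yaml
|
|     print(yaml.safe_load("version: 1.30"))
|     # {'version': 1.3}    <- silently a float
|     print(yaml.safe_load('version: "1.30"'))
|     # {'version': '1.30'} <- quoting keeps it a string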
| dochne wrote:
| My main beef with HCL is a hatred for how it implemented for
| loops.
|
| Absolutely loathsome syntax IMO
| mdaniel wrote:
| It was mentioned pretty deep in another thread, but this is
| just straight up user hostile:
|     variable "my_list" { default = ["a", "b"] }
|     resource whatever something { for_each = var.my_list }
|
| The given "for_each" argument value is unsuitable: the
| "for_each" argument must be a map, or set of strings, and you
| have provided a value of type tuple.
| acdha wrote:
| > Developers find HCL very confusing. It can be hard to read.
| Does it support imports now? Errors can be confusing to debug.
|
| This sounds a lot more like "I resented learning something new"
| than anything about HCL, or possibly confusing learning HCL
| simultaneously with something complex as a problem with the
| config language rather than the problem domain being
| configured.
| aduwah wrote:
| Issue is that you do not want a dev learning hcl. Same as you
| don't want your SRE team learning Next.js and React out of
| necessity.
|
| The ideal solution would be to have an abstraction that is
| easy to use and does not require learning a whole new concept
| (especially an ugly one as hcl). Also learning hcl is simply
| just the tip of the iceberg, with sinking into the
| dependencies between components and outputs read from a bunch
| of workspaces etc. It is simply wasted time to have the devs
| keeping up with all the terraform heap that SREs manage and
| keep evolving under the hood. The same dev time better spent
| creating features.
| acdha wrote:
| Why? If they can't learn HCL, they're not going to be a
| successful developer.
|
| If your argument is instead that they shouldn't learn
| infrastructure, then the point is moot because that applies
| equally to every choice (knowing how to indent YAML doesn't
| mean they know what to write). That's also wrong as an
| absolute position but for different reasons.
| aduwah wrote:
| You don't see my point. The ideal solution would be
| something that can be learned easily by both the dev and
| infra side without having to boil the ocean on one side
| or the other. Something like protobuf was mentioned above
| which is a better idea than hcl imho
| darkwater wrote:
| I totally dig the HCL request. To be honest I'm still mad at
| GitHub, which initially used HCL for GitHub Actions and then
| ditched it for YAML when they went stable.
| carlhjerpe wrote:
| I detest HCL, the module system is pathetic. It's not
| composable at all and you keep doing gymnastics to make sure
| everything is known at plan time (like using lists where you
| should use dictionaries) and other anti-patterns.
|
| I use Terranix to make config.tf.json which means I have the
| NixOS module system that's composable enough to build a Linux
| distro at my fingertips to compose a great Terraform
| "state"/project/whatever.
|
| It's great to be able to run some Python to fetch some data,
| dump it in JSON, read it with Terranix, generate config.tf.json
| and then apply :)
| jitl wrote:
| What's the list vs dictionary issue in Terraform? I use a lot
| of dictionaries (maps in tf speak), terraform things like
| for_each expect a map and throw if handed a list.
| carlhjerpe wrote:
| Internally a lot of modules cast dictionaries to lists of
| the same length because the keys of the dict might not be
| known at plan time or something. The Terraform AWS VPC
| module does this internally for many things.
|
| I couldn't tell you exactly, but modules always end up
| either not exposing enough or exposing too much. If I were
| to write my module with Terranix I can easily replace any
| value in any resource from the module I'm importing using
| "resource.type.name.parameter = lib.mkForce
| "overridenValue";" without having to expose that parameter
| in the module "API".
|
| The nice thing is that it generates
| "Terraform"(config.tf.json) so the supremely awesome state
| engine and all API domain knowledge bound in providers work
| just the same and I don't have to reach for something as
| involved as Pulumi.
|
| You can even mix Terranix with normal HCL since
| config.tf.json is valid in the same project as HCL. A great
| way to get started is to generate your provider config and
| other things where you'd reach to Terragrunt/friends. Then
| you can start making options that makes resources at your
| own pace.
|
| The terraform LSP sadly doesn't read config.tf.json yet so
| you'll get warnings regarding undeclared locals and such
| but for me it's worth it, I generally write tf/tfnix with
| the provider docs open, and the languages (Nix and HCL) are
| easy enough to write without a full LSP.
|
| https://terranix.org/ says it better than me, but by doing
| it with Nix you get programmatic access to the biggest
| package library in the world to use at your discretion
| (Build scripts to fetch values from weird places, run
| impure scripts with null_resource or its replacements) and
| an expressive functional programming language where you can
| do recursion and stuff, you can use derivations to run any
| language to transform strings with ANY tool.
|
| It's like Terraform "unleashed" :) Forget "dynamic" blocks,
| bad module APIs and hacks (While still being able to use
| existing modules too if you feel the urge).
| mdaniel wrote:
| Sounds like the kustomize mental model: take code you
| potentially don't control, apply patches to it until it
| behaves like you wish, apply
|
| If the documentation and IDE story for kustomize was
| better, I'd be its biggest champion
| carlhjerpe wrote:
| You can run Kustomize in a Nix derivation with inputs
| from Nix and apply the output using Terranix and the
| kubectl provider, gives you a very nice reproducible way
| to apply Kubernetes resources with the Terraform state
| engine. I like how Terraform manages the lifecycle of CRUD
| with cascading changes and replacements, which is often
| pretty optimal-ish at least.
|
| And since it's Terraform, you can use any provider in the
| registry to create resources according to your Kubernetes
| objects too; it can technically replace things like
| external-dns and similar
| controllers that create stuff in other clouds, but in a
| more "static configuration" way.
|
| Edit: This works nicely with Gitlab Terraform state
| hosting thingy as well.
| jitl wrote:
| I think Pulumi is in a similar spot, you get a real
| programming language (of your choice) and it gets to use
| the existing provider ecosystem. You can use the
| programming language composition facilities to work
| around the plan system if necessary, although their plans
| allow more dynamic stuff than Terraform.
|
| The setup with Terranix sounds cool! I am pretty
| interested in build system type things myself, I recently
| wrote a plan/apply system too that I use to manage SQL
| migrations.
|
| I want learn nix, but I think that like Rust, it's just a
| bit too wide/deep for me to approach on my own time
| without a tutor/co-worker or forcing function like a work
| project to push me through the initial barrier.
| carlhjerpe wrote:
| Yep it's similar, but you bring all your dependencies
| with you through Nix rather than a language specific
| package manager.
|
| Try using something like devenv.sh initially just to
| bring tools into $PATH in a distro agnostic & mostly-ish
| MacOS compatible way (so you can guarantee everyone has
| the same versions of EVERYTHING you need to build your
| thing).
|
| Learn the language basics after it brings you value
| already, then learn about derivations and then the module
| system which is this crazy composable multilayer
| recursive magic merging type system implemented on top of
| Nix, don't be afraid to clone nixpkgs and look inside.
|
| Nix derivations are essentially Dockerfiles on steroids,
| but Nix language brings /nix/store paths into the
| container, sets environment variables for you and runs
| some scripts, and all these things are hashed so if any
| input changes it triggers automatic cascading rebuilds,
| but also means you can use a binary cache as a kind of
| "memoization" caching thingy which is nice.
|
| It's a very useful tool, it's very non-invasive on your
| system (other than disk space if you're not managing
| garbage collection) and you can use it in combination
| with other tools.
|
| Makes it very easy to guarantee your DevOps scripts runs
| exactly your versions of all CLI tools and build systems
| and whatever even if the final piece isn't through Nix.
|
| Look at "pgroll" for Postgres migrations :)
| jitl wrote:
| pgroll seems neat but I ended up writing my own tools for
| this one because I need to do somewhat unique shenanigans
| like testing different sharding and resource allocation
| schemes in Materialize.com (self hosted). I have 480
| source input schemas (postgres input schemas described
| here if you're curious, the materialize stuff is brand
| new https://www.notion.com/blog/the-great-re-shard) and
| manage a bunch of different views & indexes built on top
| of those; create a bunch of different copies of the
| views/indexes striped across compute nodes, like right
| now I'm testing 20 schemas per whole-aws-instance node,
| versus 4 schemas per quarter-aws-node, M/N*Y with
| different permutations of N and Y. With the plan/apply
| model I just need to change a few lines in TypeScript and
| get the minimal changes to all downstream dependencies
| needed to roll it out.
| Groxx wrote:
| Internally... in what? Not HCL itself, I assume? Also I'm
| not seeing much that implies HCL has a "plan time"...
|
| I'm not familiar with HCL so I'm struggling to find much
| here that would be conclusive, but a lot of this thread
| sounds like "HCL's features that YAML does not have are
| sub-par and not sufficient to let me only use HCL" and...
| yeah, you usually can't use YAML that way either, so I'm
| not sure why that's all that much of a downside?
|
| I've been idly exploring config langs for a while now,
| and personally I tend to just lean towards JSON5 because
| comments are absolutely required... but support isn't
| anywhere near as good or automatic as YAML :/ HCL has
| been on my interest-list for a while, but I haven't gone
| deep enough into it to figure out any real opinion.
| mdaniel wrote:
| > Allow etcd swap-out
|
| From your lips to God's ears. And, as they correctly pointed out,
| this work is already done, so I just do not understand the
| holdup. Folks can continue using etcd if it's their favorite, but
| _mandating_ it is weird. And I can already hear the
| butwhataboutism yet there is already a CNCF certification process
| and a whole subproject just for testing Kubernetes itself, so do
| they believe in the tests or not?
|
| > The Go templates are tricky to debug, often containing complex
| logic that results in really confusing error scenarios. The error
| messages you get from those scenarios are often gibberish
|
| And they left off that it is crazypants to use a _textual_
| templating language for a _whitespace sensitive, structured_ file
| format. But, just like the rest of the complaints, it's not like
| we don't already have replacements, but the network effect is
| very real and very hard to overcome
|
| That barrier of "we have nicer things, but inertia is real"
| applies to so many domains, it just so happens that helm impacts
| a much larger audience
| jonenst wrote:
| What about kustomize and kpt? I'm using them (instead of helm),
| but:
|
| * kpt is still not 1.0
|
| * both kustomize and kpt require complex setups to
| programmatically generate configs (even for simple things like
| replicas = replicasx2)
| jitl wrote:
| I feel like I'm already living in the Kubernetes 2.0 world
| because I manage my clusters & its applications with Terraform.
|
| - I get HCL, types, resource dependencies, data structure
| manipulation for free
|
| - I use a single `tf apply` to create the cluster, its underlying
| compute nodes, related cloud stuff like S3 buckets, etc; as well
| as all the stuff running on the cluster
|
| - We use terraform modules for re-use and de-duplication,
| including integration with non-K8s infrastructure. For example,
| we have a module that sets up a Cloudflare ZeroTrust tunnel to a
| K8s service, so with 5 lines of code I can get a unique public
| HTTPS endpoint protected by SSO for _whatever_ running in K8s.
| The module creates a Deployment running cloudflared as well as
| configures the tunnel in the Cloudflare API.
|
| - Many infrastructure providers ship signed well documented
| Terraform modules, and Terraform does reasonable dependency
| management for the modules & providers themselves with lockfiles.
|
| - I can compose Helm charts just fine via the Helm terraform
| provider if necessary. Many times I see Helm charts that are just
| "create namespace, create foo-operator deployment, create custom
| resource from chart values" (like Datadog). For these I opt to
| just install the operator & manage the CRD from terraform
| directly or via a thin Helm pass-through chart that just echoes
| whatever HCL/YAML I put in from Terraform values.
|
| Terraform's main weakness is orchestrating the apply process
| itself, similar to k8s with YAML or whatever else. We use
| Spacelift for this.
| ofrzeta wrote:
| In a way it's redundant to have the state twice: once in
| Kubernetes itself and once in the Terraform state. This can
| lead to problems when resources are modified through mutating
| webhooks or similar. Then you need to mark your properties as
| "computed fields" or something like that. So I am not a fan of
| managing applications through TF. Managing clusters might be
| fine, though.
| moomin wrote:
| Let me add one more: give controllers/operators a defined
| execution order. Don't let changes flow both ways. Provide better
| ways for building things that don't step on everyone else's toes.
| Make whatever replaces helm actually maintain stuff rather than
| just splatting it out.
| clvx wrote:
| This is a hard no for me. This is the whole point of the
| reconciliation loop. You can just push something to the
| api/etcd and eventually it will become ready when all the
| dependencies exist. Now, rejecting manifests because CRDs
| don't exist yet is a different discussion. I'm down to have a
| cache of manifests to be deployed waiting for the CRD, but if
| the CRD isn't deployed, then a garbage-collection-like tool
| removes them from the cache. This is what fluxcd and argocd
| already do in a way, but I would like to have it natively.
| recursivedoubts wrote:
| please make it look like old heroku for us normies
| dzonga wrote:
| I thought this would be written along the lines of an LLM going
| through your code - spinning up a railway file, then say having
| tf for a few of the manual dependencies etc. that can't be
| easily inferred.
|
| & get automatic scaling out of the box etc. - a more simplified
| flow rather than wrangling yaml or hcl.
|
| In short, imagine if k8s was a 2-3 (max 5) line docker-compose-
| like file.
| singularity2001 wrote:
| More like wasm?
| mdaniel wrote:
| As far as I know one can do that right now, since wasmedge
| (Apache 2) exposes a CRI interface
| https://wasmedge.org/docs/develop/deploy/oci-runtime/crun#pr...
| (et al)
| nunez wrote:
| I _still_ think Kubernetes is insanely complex, despite all that
| it does. It seems less complex these days because it's so
| pervasive, but complex it remains.
|
| I'd like to see more emphasis on UX for v2 for the most common
| operations, like deploying an app and exposing it, then doing
| things like changing service accounts or images without having to
| drop into kubectl edit.
|
| Given that LLMs are it right now, this probably won't happen, but
| no harm in dreaming, right?
| Pet_Ant wrote:
| Kubernetes itself contains so many layers of abstraction. There
| are pods, which is the core new idea, and it's great. But now
| there are deployments, and replica sets, and namespaces... and it
| makes me wish we could just use Docker Swarm.
|
| Even Terraform seems to live on just a single-layer and was
| relatively straight-forward to learn.
|
| Yes, I am in the middle of learning K8s so I know exactly how
| steep the curve is.
| jakewins wrote:
| The core idea isn't pods. The core idea is reconciliation
| loops: you have some desired state - a picture of how you'd
| like a resource to look or be - and little controller loops
| that indefinitely compare that to the world, and update the
| world.
|
| Much of the complexity then comes from the enormous amount of
| resource types - including all the custom ones. But the basic
| idea is really pretty small.
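|
| A toy version of that loop is only a few lines (purely
| illustrative; real controllers use watches and work queues
| against the API server rather than polling a dict):
|
|     import time
|
|     desired = {"web": {"replicas": 3}}   # what we want
|     world = {"web": {"replicas": 1}}     # what currently exists
|
|     def reconcile():
|         for name, spec in desired.items():
|             if world.get(name) != spec:
|                 # stand-in for "update the world"
|                 print(f"converging {name} -> {spec}")
|                 world[name] = dict(spec)
|
|     while True:
|         reconcile()
|         time.sleep(5)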
|
| I find terraform much more confusing - there's a spec, and
| the real world.. and then an opaque blob of something I don't
| understand that terraform sticks in S3 or your file system
| and then.. presumably something similar to a one-shot
| reconciler that wires that all together each time you plan
| and apply?
| vrosas wrote:
| Someone saying "This is complex but I think I have the core
| idea" and someone to responding "That's not the core idea
| at all" is hilarious and sad. BUT ironically what you just
| laid out about TF is exactly the same - you just manually
| trigger the loop (via CI/CD) instead of a thing waiting for
| new configs to be loaded. The state file you're referencing
| is just a cache of the current state and TF reconciles the
| old and new state.
| jauco wrote:
| Always had the conceptual model that terraform executes
| something that resembles a merge using a three way diff.
|
| There's the state file (base commit, what the system
| looked like the last time terraform successfully
| executed). The current system (the main branch, which
| might have changed since you "branched off") and the
| terraform files (your branch)
|
| Running terraform then merges your branch into main.
|
| Now that I'm writing this down, I realize I never really
| checked if this is accurate, tf apply works regardless of
| course.
| mdaniel wrote:
| and then the rest of the owl is working out the merge
| conflicts :-D
|
| I don't know how to have a cute git analogy for "but
| first, git deletes your production database, and then
| recreates it, because some attribute changed that made
| the provider angry"
| mdaniel wrote:
| > a one-shot reconciler that wires that all together each
| time you plan and apply?
|
| You skipped the { while true; do tofu plan; tofu apply;
| echo "well shit"; patch; done; } part since the providers
| do fuck-all about actually, no kidding, saying whether the
| plan could succeed
| jonenst wrote:
| To me the core of k8s is pod scheduling on nodes,
| networking ingress (e.g. nodeport service), networking
| between pods (everything addressable directly), and
| colocated containers inside pods.
|
| Declarative reconciliation is (very) nice but not
| irreplaceable (and actually not mandatory, e.g. kubectl run
| xyz)
| throwaway5752 wrote:
| I've come to think that it is a case of "the distinctions
| between types of computer programs are a human construct"
| problem.
|
| I agree with you on a human level. Operators and controllers
| remind me of COM and CORBA, in a sense. They are highly
| abstract things that are intrinsically so flexible that they
| allow judgement (and misjudgement) in design.
|
| For simple implementations, I'd want k8s-lite, that was more
| opinionated and less flexible. Something which doesn't allow
| for as much shooting ones' self in the foot. For very complex
| implementations, though, I've felt existing abstractions to be
| limiting. There is a reason why a single cluster is sometimes
| the basis for cell boundaries in cellular architectures.
|
| I sometimes wonder if one single system - kubernetes 2.0 or
| anything else - can encompass the full complexity of the
| problem space while being tractable to work with by human
| architects and programmers.
| nine_k wrote:
| > _I 'd want k8s-lite, that was more opinionated and less
| flexible_
|
| You seem to want something like https://skateco.github.io/
| (still compatible to k8s manifests).
|
| Or maybe even something like https://uncloud.run/
|
| Or if you still want real certified Kubernetes, but small,
| there is https://k3s.io/
| mdaniel wrote:
| Ah, so that explains it: https://github.com/skateco/skate#:
| ~:text=leverages%20podman%...
| NathanFlurry wrote:
| We're still missing a handful of these features, but this is
| the end goal with what we're building over at Rivet:
| https://github.com/rivet-gg/rivet
|
| This whole thing started scratching my own itch of wanting an
| orchestrator that I can confidently stand up, deploy to, then
| forget about.
| coderatlarge wrote:
| Where is that in the design space relative to where Google's
| internal cluster management has converged after the many
| years and the tens of thousands of engineers who have sanded
| it down under heavy fire since the original Borg?
| mdaniel wrote:
| I recognize that I'm biased, but you'll want to strongly
| consider whether https://rivet.gg/docs/config is getting your
| audience where they can be successful, as compared to (e.g.)
| https://kubernetes.io/docs/reference/generated/kubernetes-
| ap...
| stackskipton wrote:
| Ops type here, after looking at Rivet, I've started doing The
| Office "Dear god no, PLEASE NO"
|
| Most people are looking for Container Management runtime with
| HTTP(S) frontend that will handle automatic certificate from
| Let's Encrypt.
|
| I don't want Functions/Actors or require this massive suite:
|
| FoundationDB: Actor state
|
| CockroachDB: OLTP
|
| ClickHouse: Developer-facing monitoring
|
| Valkey: Caching
|
| NATS: Pub/sub
|
| Traefik: Load balancers & tunnels
|
| This is just trading Kubernetes cloud lock-in (with KEDA and
| some other more esoteric operators) for Rivet Cloud lock-in. At
| least Kubernetes is slightly more portable than this.
|
| Oh yeah, I don't know what ClickHouse is doing with monitoring,
| but the Prometheus/Grafana suite called and said they would love
| for you to come home.
| mountainriver wrote:
| We have started working on a sort of Kubernetes 2.0 with
| https://github.com/agentsea/nebulous -- still pre-alpha
|
| Things we are aiming to improve:
|
| * Globally distributed
|
| * Lightweight, can easily run as a single binary on your laptop
| while still scaling to thousands of nodes in the cloud
|
| * Tailnet as the default network stack
|
| * Bittorrent as the default storage stack
|
| * Multi-tenant from the ground up
|
| * Live migration as a first class citizen
|
| Most of these needs were born out of building modern machine
| learning products, and the subsequent GPU scarcity. With ML
| taking over the world though this may be the norm soon.
| hhh wrote:
| Wow... Cool stuff, the live migration is very interesting. We
| do autoscaling across clusters across clouds right now based on
| pricing, but actual live migration is a different beast
| znpy wrote:
| > * Globally distributed
|
| Non-requirement?
|
| > * Tailnet as the default network stack
|
| That would probably be the first thing I look to rip out if I
| ever was to use that.
|
| Kubernetes assuming the underlying host only has a single NIC
| has been a plague for the industry, setting it back ~20 years
| and penalizing everyone that's not running on the cloud. Thank
| god there are multiple CNI implementations.
|
| Only recently, with Multus
| (https://www.redhat.com/en/blog/demystifying-multus), has some
| sense seemed to be coming back into that part of the
| infrastructure.
|
| > * Multi-tenant from the ground up
|
| How would this be any different from kubernetes?
|
| > * Bittorrent as the default storage stack
|
| Might be interesting, unless you also mean seeding public
| container images. Egress traffic is crazy expensive.
| nine_k wrote:
| > _Non-requirement_
|
| > _the first thing I look to rip out_
|
| This only shows how varied the requirements are across the
| industry. One size does not fit all, hence multiple
| materially different solutions spring up. This is only good.
| znpy wrote:
| > One size does not fit all, hence multiple materially
| different solutions spring up.
|
| Sooo... like what kubernetes does today?
| mountainriver wrote:
| >> * Globally distributed
|
| > Non-requirement?
|
| It is a requirement because you can't find GPUs in a single
| region reliably and Kubernetes doesn't run on multiple
| regions.
|
| >> * Tailnet as the default network stack
|
| > That would probably be the first thing I look to rip out if
| I ever was to use that.
|
| This is fair, we find it very useful because it easily scales
| cross clouds and even bridges them locally. It was the
| simplest solution we could implement to get those properties,
| but in no way would we need to be married to it.
|
| >> * Multi-tenant from the ground up
|
| > How would this be any different from kubernetes?
|
| Kubernetes is deeply not multi-tenant; anyone who has tried
| to make a multi-tenant solution over kube has dealt with
| this. I've done it at multiple companies now, it's a mess.
|
| >> * Bittorrent as the default storage stack
|
| > Might be interesting, unless you also mean seeding public
| container images. Egress traffic is crazy expensive.
|
| Yeah, egress cost is a concern here, but it's lazy so you don't
| pay for it unless you need it. This seemed like the lightest
| solution to sync data when you do live migrations cross
| cloud. For instance, I need to move my dataset and ML model
| to another cloud, or just replicate it there.
| stackskipton wrote:
| What is the use case for multiple NICs outside bonding for
| hardware failure?
|
| Every time I've had multiple NICs on a server with different
| IPs, I've regretted it.
| mdaniel wrote:
| I'd guess management access, or the old school way of doing
| vLANs. Kubernetes offers Network Policies to solve the risk
| of untrusted workloads in the cluster accessing both pods
| and ports on pods that they shouldn't
| https://kubernetes.io/docs/concepts/services-
| networking/netw...
|
| Network Policies are also defense in depth, since another
| Pod would need to know its sibling Pod's name or IP to
| reach it directly, the correct boundary for such things is
| not to expose management toys in the workload's Service,
| rather create a separate Service that just exposes those
| management ports
|
| Akin to:
|     interface Awesome { String getFavoriteColor(); }
|     interface Management { void setFavoriteColor(String value); }
|     class MyPod implements Awesome, Management {}
|
| but then only make either Awesome, or Management, available
| to the consumers of each behavior
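|
| A rough sketch of that defense-in-depth idea with the Python
| client (the namespace, labels, and the 8080/9090 ports here
| are made up): the app port stays open, while the management
| port only accepts traffic from pods labeled role=admin.
|
|     from kubernetes import client, config
|
|     config.load_kube_config()
|     allow_admin = client.V1NetworkPolicyIngressRule(
|         _from=[client.V1NetworkPolicyPeer(
|             pod_selector=client.V1LabelSelector(
|                 match_labels={"role": "admin"}))],
|         ports=[client.V1NetworkPolicyPort(port=9090)])
|     allow_app = client.V1NetworkPolicyIngressRule(
|         ports=[client.V1NetworkPolicyPort(port=8080)])
|     policy = client.V1NetworkPolicy(
|         metadata=client.V1ObjectMeta(name="mypod-mgmt-lockdown"),
|         spec=client.V1NetworkPolicySpec(
|             pod_selector=client.V1LabelSelector(
|                 match_labels={"app": "mypod"}),
|             policy_types=["Ingress"],
|             ingress=[allow_app, allow_admin]))
|     client.NetworkingV1Api().create_namespaced_network_policy(
|         "default", policy)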
| znpy wrote:
| A NIC dedicated to SAN traffic, for example. People who are
| serious about networked storage don't run their storage
| network I/O on the same NIC where they serve traffic.
| Thaxll wrote:
| This is not Kubernetes, this is a custom-made solution to run
| GPUs.
| nine_k wrote:
| Since it still can consume Kubernetes manifests, it's of
| interest for k8s practitioners.
|
  | Since k8s manifests are a _language_, there can be multiple
| implementations of it, and multiple dialects will necessarily
| spring up.
| mountainriver wrote:
    | Which is the future of everything, and something Kubernetes
    | does a very bad job at.
| mdaniel wrote:
| You stopped typing; what does Kubernetes do a bad job at
| with relation to scheduling workloads that declare they
| need at least 1 GPU resource but should be limited to no
| more than 4 GPU resources on a given Node?
| https://kubernetes.io/docs/tasks/manage-gpus/scheduling-
| gpus...
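      |
      | For reference, the declaration that page describes boils
      | down to one extended resource on the container. A rough
      | Python sketch (the image tag is illustrative; it assumes a
      | GPU device plugin on the nodes and PyYAML being available):
      |     import yaml
      |
      |     pod = {
      |         "apiVersion": "v1",
      |         "kind": "Pod",
      |         "metadata": {"name": "cuda-example"},
      |         "spec": {"containers": [{
      |             "name": "cuda",
      |             "image": "nvidia/cuda:12.4.0-base-ubuntu22.04",
      |             # GPUs are requested under limits; the
      |             # scheduler only places the Pod on a node
      |             # with a free nvidia.com/gpu
      |             "resources": {"limits": {"nvidia.com/gpu": 1}},
      |         }]},
      |     }
      |     print(yaml.safe_dump(pod))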
| mdaniel wrote:
| heh, I think you didn't read the room given this directory
| https://github.com/agentsea/nebulous/blob/v0.1.88/deploy/cha...
|
| Also, ohgawd please never ever do this ever ohgawd
| https://github.com/agentsea/nebulous/blob/v0.1.88/deploy/cha...
| mountainriver wrote:
| Why not? We can run on Kube and extend it to multi-region
| when needed, or we can run on any VM as a single binary, or
| just your laptop.
|
| If you mean Helm, yeah I hate it but it is the most common
| standard. Also not sure what you mean by the secret, that is
| secure.
| mdaniel wrote:
| Secure from what, friend? It's a credential leak waiting to
| happen, to say nothing of the need to now manage IAM Users
| in AWS. That is the 2000s way of authenticating with AWS,
| and reminds me of people who still use passwords for ssh.
| Sure, it works fine, until some employee leaves and takes
| the root password with them
| Dedime wrote:
| From someone who was recently tasked with "add service mesh" -
| make service mesh obsolete. I don't want to install a service
| mesh. mTLS or some other form of encryption between pods should
| just happen automatically. I don't want some janky ass sidecar
| being injected into my pod definition a la linkerd, and now I've
| got people complaining that cilium's god mode is too permissive.
| Just have something built-in, please.
| mdaniel wrote:
| For my curiosity, what threat model is mTLS and encryption
| between pods driving down? Do you run untrusted workloads in
| your cluster and you're afraid they're going to exfil your ...
| I dunno, SQL login to the in-cluster Postgres?
|
| As someone who has the same experience you described with janky
| sidecars blowing up normal workloads, I'm violently anti
| service-mesh. But, cert expiry and subjectAltName management is
| already hard enough, and you would want that to happen for
| _every pod_? To say nothing of the TLS handshake for every
| connection?
| ahmedtd wrote:
  | Various supporting pieces for pod-to-pod mTLS are slowly being
  | brought into the main Kubernetes project.
|
| Take a look at https://github.com/kubernetes/enhancements/tree/
| master/keps/..., which is hopefully landing as alpha in
| Kubernetes 1.34. It lets you run a controller that issues
| certificates, and the certificates get automatically plumbed
| down into pod filesystems, and refresh is handled
| automatically.
|
| Together with ClusterTrustBundles (KEP 3257), these are all the
| pieces that are needed for someone to put together a controller
| that distributes certificates and trust anchors to every pod in
| the cluster.
| benced wrote:
| I found Kubernetes insanely intuitive coming from the frontend
| world. I was used to writing code that took in data and made the
| UI react to that - now I write config that the control plane
| uses to reconcile resources.
| znpy wrote:
| I'd like to add my points of view:
|
| 1. Helm: make it official, ditch the text templating. The helm
| workflow is okay, but templating text is cumbersome and error-
| prone. What we should be doing instead is patching objects. I
| don't know how, but I should be setting fields, not making sure
| my values contain text that is correctly indented (how many
| spaces? 8? 12? 16?). There's a rough sketch of what I mean at
| the end of this comment.
|
| 2. Can we get a rootless kubernetes already, as a first-class
| citizen? This opens a whole world of possibilities. I'd love to
| have a physical machine at home where I'm dedicating only an
| unprivileged user to it. It would have limitations, but I'd be
| okay with it. Maybe some setuid-binaries could be used to handle
| some limited privileged things.
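|
| For point 1, something in this direction (a toy Python sketch
| of "set fields on an object" instead of "template text"; the
| Deployment, the values and the PyYAML dependency are all
| illustrative):
|     import yaml
|
|     # a base object, however it was produced (file, chart, ...)
|     base = {
|         "apiVersion": "apps/v1",
|         "kind": "Deployment",
|         "metadata": {"name": "myapp"},
|         "spec": {
|             "replicas": 1,
|             "template": {"spec": {"containers": [
|                 {"name": "myapp", "image": "myapp:latest"},
|             ]}},
|         },
|     }
|
|     # "values" become field assignments, not indented text
|     base["spec"]["replicas"] = 3
|     containers = base["spec"]["template"]["spec"]["containers"]
|     containers[0]["image"] = "myapp:1.2.3"
|     print(yaml.safe_dump(base))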
| d4mi3n wrote:
| I agree with the author that YAML as a configuration format
| leaves room for error, but please, for the love of whatever god
| or ideals you hold dear, do not adopt HCL as the configuration
| language of choice for k8s.
|
| While I agree type safety in HCL beats that of YAML (a low bar),
| it still leaves a LOT to be desired. If you're going to go
| through the trouble of considering a different configuration
| language anyway, let's do ourselves a favor and consider things
| like CUE[1] or Starlark[2] that offer either better type safety
| or much richer methods of composition.
|
| 1. https://cuelang.org/docs/introduction/#philosophy-and-
| princi...
|
| 2. https://github.com/bazelbuild/starlark?tab=readme-ov-
| file#de...
| mdaniel wrote:
| I repeatedly see this "yaml isn't typesafe" claim but have no
| idea where it's coming from since all the Kubernetes APIs are
| OpenAPI, and thus JSON Schema, and since YAML is a superset of
| JSON it is necessarily typesafe
|
  | Every JSON Schema aware tool in the universe will instantly
  | know this PodSpec is wrong:
  |     kind: 123
  |     metadata: [ {you: wish} ]
|
  | I think what is very likely happening is that folks are --
  | rightfully! -- angry about using a _text_ templating language
  | to try and produce structured files. If they picked jinja2
  | they'd have the same problem -- it does not consider _any_
  | textual output as "invalid", so jinja2 thinks this is a-ok:
  |     jinja2.Template("kind: {{ youbet }}").render(youbet=True)
|
| I am aware that helm does *YAML* sanity checking, so one cannot
| just emit whatever crazypants yaml they wish, but it does not
| then go one step further to say "uh, your json schema is fubar
| friend"
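  |
  | To make that concrete, a toy sketch with the Python
  | jsonschema package and a hand-written fragment standing in
  | for the real OpenAPI document:
  |     from jsonschema import ValidationError, validate
  |
  |     pod_schema = {
  |         "type": "object",
  |         "properties": {
  |             "kind": {"type": "string"},
  |             "metadata": {"type": "object"},
  |         },
  |     }
  |
  |     bad = {"kind": 123, "metadata": [{"you": "wish"}]}
  |     try:
  |         validate(instance=bad, schema=pod_schema)
  |     except ValidationError as e:
  |         print(e.message)  # e.g. "123 is not of type 'string'"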
| fragmede wrote:
| Instead of yaml, json, or HCL, how about starlark? It's a
| stripped down Python, used in production by bazel, so it's
| already got the go libraries.
| fjasdfwa wrote:
| kube-apiserver uses a JSON REST API. You can use whatever
| serializes to JSON. YAML is the most common and already works
| directly with kubectl.
|
| I personally use TypeScript since it has unions and structural
| typing with native JSON support but really anything can work.
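  |
  | A minimal sketch of that idea in Python (the ConfigMap is
  | just a stand-in):
  |     import json
  |
  |     cm = {
  |         "apiVersion": "v1",
  |         "kind": "ConfigMap",
  |         "metadata": {"name": "example"},
  |         "data": {"greeting": "hello"},
  |     }
  |     print(json.dumps(cm))
  |     # python gen.py | kubectl apply -f -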
| mdaniel wrote:
| Fun fact, while digging into the sibling comment's complaint
| about the OpenAPI spec, I learned that it actually advertises
    | multiple content-types:
    |     application/json
    |     application/json;stream=watch
    |     application/vnd.kubernetes.protobuf
    |     application/vnd.kubernetes.protobuf;stream=watch
    |     application/yaml
|
| which I _presume_ all get coerced into protobuf before being
| actually interpreted
| mdaniel wrote:
| As the sibling comment points out, I think that would be a
| perfectly fine _helm_ replacement, but I would never ever want
| to feed starlark into k8s apis directly
| NathanFlurry wrote:
| The #1 problem with Kubernetes is it's not something that "Just
| Works." There's a very small subset of engineers who can stand up
| services on Kubernetes without having it fall over in production
| - not to mention actually running & maintaining a Kubernetes
| cluster on your own VMs.
|
| In response, there's been a wave of "serverless" startups because
| the idea of running anything yourself has become understood as
| (a) a time sink, (b) incredibly error prone, and (c) very likely
| to fail in production.
|
| I think a Kubernetes 2.0 should consider what it would look like
| to have a deployment platform that engineers can easily adopt and
| feel confident running themselves - while still maintaining
| itself as a small-ish core orchestrator with strong primitives.
|
| I've been spending a lot of time building Rivet to scratch my
| own itch for an orchestrator & deployment platform that I can
| self-host and scale trivially: https://github.com/rivet-gg/rivet
|
| We currently advertise as the "open-source serverless platform,"
| but I often think of the problem as "what does Kubernetes 2.0
| look like." People are already adopting it to push the limits
| into things that Kubernetes would traditionally be good at. We've
| found the biggest strong point is that you're able to build
| roughly the equivalent of a Kubernetes controller trivially. This
| unlocks features like more complex workload orchestration (game
| servers, per-tenant deploys), multitenancy (vibe coding per-
| tenant backends, LLM code interpreters), per-tenant metered
| billing, more powerful operators, etc.
| stuff4ben wrote:
| I really dislike this take and I see it all the time. Also I'm
| old and I'm jaded, so it is what it is...
|
| Someone decides X technology is too heavy-weight and wants to
| just run things simply on their laptop because "I don't need
| all that cruft". They spend time and resources inventing
| technology Y to suit their needs. Technology Y gets popular and
| people add to it so it can scale, because no one runs shit in
| production off their laptops. Someone else comes along and
| says, "damn, technology Y is too heavyweight, I don't need all
| this cruft..."
|
| "There are neither beginnings nor endings to the Wheel of Time.
| But it was a beginning."
| adrianmsmith wrote:
| It's also possible for things to just be too complex.
|
| Just because something's complex doesn't necessarily mean it
| has to be that complex.
| mdaniel wrote:
| IMHO, the rest of that sentence is "be too complex for some
| metric within some audience"
|
| I can assure you that trying to reproduce kubernetes with a
| shitload of shell scripts, autoscaling groups, cloudwatch
| metrics, and hopes-and-prayers is too complex for my metric
| within the audience of people who know Kubernetes
| wongarsu wrote:
        | Or too generic. A lot of the complexity is from trying to
| support all use cases. For each new feature there is a
| clear case of "we have X happy users, and Y people who
| would start using it if we just added Z". But repeat that
        | often enough and the whole thing becomes so complex and
| abstract that you lose those happy users.
|
| The tools I've most enjoyed (including deployment tools)
| are those with a clear target group and vision, along with
| leadership that rejects anything that falls too far outside
| of it. Yes, it usually doesn't have _all_ the features I
        | want, but it also doesn't have a myriad of features I
| don't need
| supportengineer wrote:
| Because of promo-driven, resume-driven culture, engineers
| are constantly creating complexity. No one EVER got a
| promotion for creating LESS.
| NathanFlurry wrote:
| I hope this isn't the case here with Rivet. I genuinely
| believe that Kubernetes does a good job for what's on the tin
| (i.e. container orchestration at scale), but there's an
| evolution that needs to happen.
|
| If you'll entertain my argument for a second:
|
| The job of someone designing systems like this is to decide
| what are the correct primitives and invest in building a
| simple + flexible platform around those.
|
| The original cloud primitives were VMs, block devices, LBs,
| and VPCs.
|
| Kubernetes became popular because it standardized primitives
| (pods, PVCs, services, RBAC) that containerized applications
| needed.
|
| Rivet's taking a different approach of investing in different
| three primitives based on how most organizations deploy their
| applications today:
|
| - Stateless Functions (a la Fluid Compute)
|
| - Stateful Workers (a la Cloudflare Durable Objects)
|
| - Containers (a la Fly.io)
|
| I fully expect to raise a few hackles claiming these are the
| "new primitives" for modern applications, but our experience
| shows it's solving real problems for real applications today.
|
| Edit: Clarified "original _cloud_ primitives "
| RattlesnakeJake wrote:
| See also: JavaScript frameworks
| themgt wrote:
| The problem Kubernetes solves is "how do I deploy this" ... so
| I go to Rivet (which does look cool) docs, and the options are:
|
| * single container
|
| * docker compose
|
| * manual deployment (with docker run commands)
|
| But erm, realistically how is this a viable way to deploy a
| "serverless infrastructure platform" at any real scale?
|
| My gut response would be ... how can I deploy Rivet on
| Kubernetes, either in containers or something like kube-virt to
| run this serverless platform across a bunch of physical/virtual
| machines? How is docker compose a better more reliable/scalable
| alternative to Kubernetes? So alternately then you sell a cloud
| service, but ... that's not a Kubernetes 2.0. If I was going to
| self-host Rivet I'd convert your docs so I could run it on
| Kubernetes.
| NathanFlurry wrote:
| Our self-hosting docs are very rough right now - I'm fully
| aware of the irony given my comment. It's on our roadmap to
| get them up to snuff within the next few weeks.
|
    | If you're curious about the details, we've put a lot of work
    | into making sure that there are as few moving parts as
    | possible:
|
| We have our own cloud VM-level autoscaler that's integrated
    | with the core Rivet platform - no k8s or other orchestrators
| in between. You can see the meat of it here:
| https://github.com/rivet-
| gg/rivet/blob/335088d0e7b38be5d029d...
|
| For example, Rivet has an API to dynamically spin up a
| cluster on demand: https://github.com/rivet-
| gg/rivet/blob/335088d0e7b38be5d029d...
|
| Once you start the Rivet "seed" process with your API key,
| everything from there is automatic.
|
| Therefore, self-hosted deployments usually look like one of:
|
| - Plugging in your cloud API token in to Rivet for
| autoscaling (recommended)
|
| - Fixed # of servers (hobbyist deployments that were manually
| set up, simple Terraform deployments, or bare metal)
|
| - Running within Kubernetes (usually because it depends on
| existing services)
| hosh wrote:
  | It's been my experience that nothing in infra and ops will
  | ever "just work". Even something like Heroku will run into
  | scaling issues, and questions of how much you are willing to
  | pay for it.
|
  | If people's concern is that they want a deployment platform
  | that can be easily adopted and used, it's better to understand
  | Kubernetes as the primitives on top of which the PaaS that
  | people want can be built.
|
| Having said all that, Rivet looks interesting. I recognize some
| of the ideas from the BEAM ecosystem. Some of the appeal to me
| has less to do with deploying at scale, and more to do with
| resiliency and local-first.
| nikisweeting wrote:
| It should natively support running docker-compose.yml configs,
| essentially treating them like swarm configurations and
| "automagically" deploying them with sane defaults for storage and
| network. Right now the gap between compose and full-blown k8s is
| too big.
| mdaniel wrote:
| So, what I'm hearing is that it should tie itself to a
| commercial company, who now have a private equity master to
| answer to, versus an open source technology run by a foundation
|
| Besides, easily half of this thread is whining about helm for
| which docker-compose has _no_ answer whatsoever. There is no
  | $(docker compose run oci://example.com/awesome --version 1.2.3
  | --set-string root-user=admin)
| ChocolateGod wrote:
| > Right now the gap between compose and full-blown k8s is too
| big.
|
  | It's Hashicorp so you have to be wary, but Nomad fills this
| niche
|
| https://developer.hashicorp.com/nomad
| woile wrote:
| What bothers me:
|
| - it requires too much RAM to run on small machines (1GB RAM). I
| want to start small but not have to worry about scalability.
| docker swarm was nice in this regard.
|
| - use KCL lang or CUE lang to manage templates
| otterley wrote:
| First, K8S doesn't force anyone to use YAML. It might be
| idiomatic, but it's certainly not required. `kubectl apply` has
| supported JSON since the beginning, IIRC. The endpoints
| themselves speak JSON and grpc. And you can produce JSON or YAML
| from whatever language you prefer. Jsonnet is quite nice, for
| example.
|
| Second, I'm curious as to why dependencies are a thing in Helm
| charts and why dependency ordering is being advocated, as though
| we're still living in a world of dependency ordering and service-
| start blocking on Linux or Windows. One of the primary idioms in
| Kubernetes is looping: if the dependency's not available, your
| app is supposed to treat that as a recoverable error and try
| again until the dependency becomes available. Or, crash, in which
| case, the ReplicaSet controller will restart the app for you.
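|
| In code, that idiom is just a retry loop around the dependency
| check before you start serving, roughly like this (the host,
| port and timings are made up):
|     import socket
|     import sys
|     import time
|
|     def wait_for(host, port, attempts=30, delay=2):
|         for _ in range(attempts):
|             try:
|                 socket.create_connection((host, port), 2).close()
|                 return
|             except OSError:
|                 time.sleep(delay)
|         # or just crash: the ReplicaSet controller restarts us
|         sys.exit(1)
|
|     wait_for("postgres", 5432)
|     # ... start serving ...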
|
| You can't have dependency conflicts in charts if you don't have
| dependencies (cue "think about it" meme here), and you install
| each chart separately. Helm does let you install multiple
| versions of a chart if you must, but woe be unto those who do
| that in a single namespace.
|
| If an app _truly_ depends on another app, one option is to
| include the dependency in the same Helm chart! Helm charts have
| always allowed you to have multiple application and service
| resources.
| Arrowmaster wrote:
  | You say supposed to. That's great when building your own
  | software stack in house, but how much software is available
  | that can run on Kubernetes but was created before it existed?
  | Somebody figured out it could run in Docker, and then later
  | someone realized it's not that hard to make it run in
  | Kubernetes because it already runs in Docker.
|
| You can make an opinionated platform that does things how you
| think is the best way to do them, and people will do it how
| they want anyway with bad results. Or you can add the features
| to make it work multiple ways and let people choose how to use
| it.
| delusional wrote:
| > One of the primary idioms in Kubernetes is looping
|
| Indeed, working with kubernetes I would argue that the primary
| architectural feature of kubernetes is the "reconciliation
| loop". Observe the current state, diff a desired state, apply
| the diff. Over and over again. There is no "fail" or "success"
| state, only what we can observe and what we wish to observe.
| Any difference between the two is iterated away.
|
| I think it's interesting that the dominant "good enough
| technology" of mechanical control, the PID feedback loop, is
| quite analogous to this core component of kubernetes.
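  |
  | The skeleton of that loop is tiny; everything interesting
  | lives in the observe/desired/diff/apply functions, which are
  | left as stubs in this sketch:
  |     import time
  |
  |     def reconcile(observe, desired, diff, apply, interval=5):
  |         # no terminal "success" or "failure": just converge,
  |         # over and over again
  |         while True:
  |             current = observe()
  |             for change in diff(current, desired()):
  |                 apply(change)
  |             time.sleep(interval)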
| tguvot wrote:
    | I developed a system like this (with a reconciliation loop,
    | as you call it) some years ago. There is most definitely a
    | failed state (for multiple reasons), but as part of the
    | "loop" you can have logic to fix it up in order to bring it
    | to the desired state.
|
| we had integrated monitoring/log analysis to correlate
| failures with "things that happen"
| cyberax wrote:
| I would love:
|
| 1. Instead of recreating the "gooey internal network" anti-
| pattern with CNI, provide strong zero-trust authentication for
| service-to-service calls.
|
| 2. Integrate with public networks. With IPv6, there's no _need_
| for an overlay network.
|
| 3. Interoperability between several K8s clusters. I want to run a
| local k3s controller on my machine to develop a service, but this
| service still needs to call a production endpoint for a dependent
| service.
| mdaniel wrote:
| To the best of my knowledge, nothing is stopping you from doing
| any of those things right now. Including, ironically,
| authentication for pod-to-pod calls, since that's how Service
| Accounts work today. That even crosses the Kubernetes API
| boundary thanks to IRSA and, if one were really advanced, any
| OIDC compliant provider that would trust the OIDC issuer in
| Kubernetes. The eks-anywhere distribution even shows how to
| pull off this stunt _from your workstation_ via publishing the
| JWKS to S3 or some other publicly resolvable https endpoint
|
| I am not aware of any reason why you couldn't connect directly
| to any Pod, which necessarily includes the kube-apiserver's
| Pod, from your workstation except for your own company's
| networking policies
| solatic wrote:
| I don't get the etcd hate. You can run single-node etcd in simple
| setups. You can't easily replace it because so much of the
| Kubernetes API is a thin wrapper around etcd APIs like watch that
| are quite essential to writing controllers and don't map cleanly
| to most other databases, certainly not sqlite or frictionless
| hosted databases like DynamoDB.
|
| What actually makes Kubernetes hard to set up by yourself are a)
| CNIs, in particular if you both intend to avoid cloud-provider
| specific CNIs, support all networking (and security!) features,
| and still have high performance; b) all the cluster PKI with all
| the certificates for all the different components, which
| Kubernetes made an absolute requirement because, well,
| production-grade security.
|
| So if you think you're going to make an "easier" Kubernetes, I
| mean, you're avoiding all the lessons learned and why we got here
| in the first place. CNI is hardly the naive approach to the
| problem.
|
| Complaining about YAML and Helm is dumb. Kubernetes doesn't
| force you to use either. The API server anyway expects JSON at
| the end. Use whatever you like.
| mdaniel wrote:
| > I don't get the etcd hate.
|
| I'm going out on a limb to say you've only ever used hosted
| Kubernetes, then. A sibling comment mentioned their need for
| vanity tooling to babysit etcd and my experience was similar.
|
| If you are running single node etcd, that would also explain
| why you don't get it: you've been very, very, very, very lucky
| never to have that one node fail, and you've never had to
| resolve the very real problem of ending up with _just two_ etcd
| nodes running
| mootoday wrote:
| Why containers when you can have Wasm components on wasmCloud
| :-)?!
|
| https://wasmcloud.com/
| 0xbadcafebee wrote:
| > Ditch YAML for HCL
|
| _Hard_ pass. One of the big downsides to a DSL is it's
| linguistic rather than programmatic. It depends on a human to
| learn a language and figuring out how to apply it correctly.
|
| I have written a metric shit-ton of terraform in HCL. Yet even I
| struggle to contort my brain into the shape it needs to think of
| _how the fuck_ I can get Terraform to do what I want with its
| limiting logic and data structures. I have become almost
| completely reliant on saved snippet examples, Stackoverflow, and
| now ChatGPT, just to figure out how to deploy the right resources
| with DRY configuration in a multi-dimensional datastructure.
|
| YAML isn't a configuration format (it's a data encoding format)
| but it does a decent job at _not being a DSL_, which makes
| things way easier. Rather than learn a language, you simply fill
| out a data structure with attributes. Any human can easily follow
| documentation to do that without learning a language, and any
| program can generate or parse it easily. (Now, the specific
| configuration schema of K8s does suck balls, but that's not
| YAML's fault)
|
| > I still remember not believing what I was seeing the first time
| I saw the Norway Problem
|
| It's not a "Norway Problem". It's a PEBKAC problem. The "problem"
| is literally that the user did not read the YAML spec, so they
| did not know what they were doing, then did the wrong thing, and
| blamed YAML. It's wandering into the forest at night, tripping
| over a stump, and then blaming the stump. Read the docs. YAML is
| not crazy, it's a pretty simple data format.
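|
| The whole behavior fits in two lines of PyYAML (which follows
| YAML 1.1's boolean rules); quote the value and you get a string
| back:
|     import yaml
|
|     print(yaml.safe_load("country: no"))    # {'country': False}
|     print(yaml.safe_load("country: 'no'"))  # {'country': 'no'}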
|
| > Helm is a perfect example of a temporary hack that has grown to
| be a permanent dependency
|
| Nobody's permanently dependent on Helm. Plenty of huge-ass
| companies don't use it at all. This is where you proved you
| really don't know what you're talking about. (besides the fact
| that helm is a _joy_ to use compared to straight YAML or HCL)
| rcarmo wrote:
| One word: Simpler.
| aranw wrote:
| YAML and Helm are my two biggest pain points with k8s and I would
| love to see them replaced with something else. CUE for YAML would
| be really nice. As for replacing Helm, I'm not too sure really.
| Perhaps with YAML being replaced by CUE maybe something more
| powerful and easy to understand could evolve from using CUE?
| fideloper wrote:
| "Low maintenance", welp.
|
| I suppose that's true in one sense - in that I'm using EKS
| heavily, and don't maintain cluster health myself (other than all
| the creative ways I find to fuck up a node). And perhaps in
| another sense: It'll try its hardest to run some containers no
| matter how many times I make it OOMkill itself.
|
| Buttttttttt Kubernetes is almost pure maintenance in reality.
| Don't get me wrong, it's amazing to just submit some yaml and get
| my software out into the world. But the trade off is pure
| maintenance.
|
| The workflows to setup a cluster, decide which chicken-egg trade-
| off you want to get ArgoCD running, register other clusters if
| you're doing a hub-and-spoke model ... is just, like, one single
| act in the circus.
|
| Then there's installing all the operators of choice from
| https://landscape.cncf.io/. I mean that page is a meme, but how
| many of us run k8s clusters without at least 30 pods running
| "ancillary" tooling? (Is "ancillary" the right word? It's stuff
| we need, but it's not our primary workloads).
|
| A repeat circus is spending hours figuring out just the right
| values.yaml (or, more likely, hours templating it, since we're
| ArgoCD'ing it all, right?)
|
| > As an aside, I once spent HOURS figuring out how to (incorrectly)
| pass boolean values around from a Secrets Manager Secret, to a
| k8s secret - via External Secrets, another operator! - to an
| ArgoCD ApplicationSet definition, to another values.yaml file.
|
| And then you have to operationalize updating your clusters - and
| all the operators you installed/painstakingly configured. Given
| the pace of releases, this is literally pure maintenance that is
| always present.
|
| Finally, if you're autoscaling (Karpenter in our case), there's a
| whole other act in the circus (wait, am I still using that
| analogy?) of replacing your nodes "often" without downtime, which
| gets fun in a myriad of interesting ways (running apps with state
| is fun in kubernetes!)
|
| So anyway, there's my rant. Low fucking maintenance!
| ljm wrote:
| I've been running k3s on hetzner for over 2 years now with 100%
| uptime.
|
| In fact, it was so low maintenance that I lost my SSH key for
| the master node and I had to reprovision the entire cluster.
| Took about 90 mins including the time spent updating my docs.
| If it was critical I could have got that down to 15 mins tops.
|
| 20EUR/mo for a k8s cluster using k3s, exclusively on ARM, 3
| nodes 1 master, some storage, and a load balancer with
| automatic dns on cloudflare.
| Bombthecat wrote:
| Yeah, as soon as you got your helm charts and node
| installers.
|
| Installing is super fast.
|
    | We don't do backups of the cluster, for example, for that
    | reason (except databases etc.) - we just reprovision the
    | whole cluster.
| verst wrote:
| How often do you perform version upgrades? Patching of the
      | operating system of the nodes or control plane etc? Things
| quickly get complex if application uptime / availability is
| critical.
| aljgz wrote:
| "Low Maintenance" is relative to alternatives. In my
| experience, any time I was dealing with K8s I needed much lower
| maintenance to get the same quality of service (everything from
  | [auto]scaling, to failover, deployment, rollback, disaster
| recovery, DevOps, ease of spinning up a completely independent
| cluster) compared to not using it. YMMV.
| turtlebits wrote:
| Sounds self inflicted. Stop installing so much shit. Everything
| you add is just tech debt and has a cost associated, even if
| the product is free.
|
| If autoscaling doesn't save more $$ than the tech
| debt/maintenance burden, turn it off.
| ozim wrote:
| I agree with your take.
|
    | But I think a lot of people are in a state where they need to
| run stuff the way it is because "just turn it off" won't
| work.
|
    | Like a system that after years on k8s is coupled to its
    | quirks. People not knowing how to set up and run stuff
    | without k8s.
| pnathan wrote:
| vis-a-vis running a roughly equivalent set of services cobbled
  | together, it's wildly low maintenance to the point of fire and
| forget.
|
| you do have to know what you're doing and not fall prey to the
| "install the cool widget" trap.
| hosh wrote:
| While we're speculating:
|
| I disagree that YAML is so bad. I don't particularly like HCL.
| The tooling I use doesn't care though -- as long as I can still
| specify things in JSON, I can generate (not template) what I
| need. It would be more difficult to generate HCL.
|
| I'm not a fan of Helm, but it is the de facto package manager.
| The main reason I don't like Helm has more to do with its
| templating system. Templated YAML is very limiting, when compared
| to using a full-fledged language platform to generate a
| datastructure that can be converted to JSON. There are some
| interesting things you can do with that. (cdk8s is like this, but
| it is not a good example of what you can do with a generator).
|
| On the other hand, if HCL allows us to use modules, scoping, and
| composition, then maybe it is not so bad after all.
| mikeocool wrote:
| How about release 2.0 and then don't release 2.1 for a LONG time.
|
| I get that in the early days such a fast paced release/EOL
| schedule made sense. But now something that operates at such a
| low level shouldn't require non-security upgrades every 3 months
| and have breaking API changes at least once a year.
___________________________________________________________________
(page generated 2025-06-19 23:00 UTC)