[HN Gopher] So you wanna write Kubernetes controllers?
___________________________________________________________________
So you wanna write Kubernetes controllers?
Author : gokhan
Score : 77 points
Date : 2025-01-22 22:33 UTC (4 days ago)
(HTM) web link (ahmet.im)
| Vampiero wrote:
| Why do devops keep piling abstractions on top of abstractions?
|
| There's the machine. Then the VM. Then the container. Then the
| orchestrator. Then the controller. And it's all so complex that
| you need even more tools to generate the configuration files for
| the former tools.
|
| I don't want to write a Kubernetes controller. I don't even know
| why it should exist.
| GiorgioG wrote:
| I don't want Kubernetes period. Best decision we've made at
| work is to migrate away from k8s and onto AWS ECS. I just want
| to deploy containers! DevOps went from something you did when
| standing up or deploying an application, to an industry-wide
| jobs program. It's the TSA of the software world.
| mugsie wrote:
| That's great if that works for you, and for a lot of people and
| teams. You have just shifted the complexity of networking,
| storage, firewalling, IP management, L7 proxying to AWS, but
| hey, you do have click ops there.
|
| > DevOps went from something you did when standing up or
| deploying an application, to an industry-wide jobs program.
| It's the TSA of the software world.
|
| DevOps was never a job title, or process, it was a way of
| working, that went beyond yeeting to prod, and ignoring it.
|
| From that one line, you never did devops - you did dev, with
| some deployment tools (that someone else wrote?)
| ninjha wrote:
| You can have Click-Ops on Kubernetes too! Everything has a
| schema so it's possible to build a nice UI on top of it
| (with some effort).
|
| My current project is basically this, except it edits your
| git-ops config repository, so you can click-ops while you
| git-ops.
| k8sToGo wrote:
| You mean ArgoCD and Rancher? Both ready to do click ops!
| ninjha wrote:
| I mean you can edit a big YAML file inside ArgoCD, but
| what I'm building is an actual web form (e.g.
| `spec.rules[].http.paths[].pathType` is a dropdown of
| `Prefix`, `ImplementationSpecific`, `Exact`), and all
| your documentation inline as you're editing.
|
| People have tried this before but usually the UI version
| is not fully complete so you have to drop to YAML. Now
| that the spec is good enough it's possible to build a
| complete UI for this.
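|
| A minimal sketch of the idea above, not ninjha's actual project:
| walk an OpenAPI-style schema and emit one form widget per leaf
| field, turning `enum` into a dropdown and `description` into
| inline help. The schema fragment is hand-written for illustration
| and `form_fields` is a hypothetical helper.

```python
from __future__ import annotations

from typing import Iterator

# Hand-written fragment shaped like an OpenAPI v3 schema (illustrative only,
# not copied from the real Ingress definition).
PATH_SCHEMA = {
    "type": "object",
    "properties": {
        "path": {"type": "string", "description": "URL path to match."},
        "pathType": {
            "type": "string",
            "description": "How the path is matched.",
            "enum": ["Prefix", "ImplementationSpecific", "Exact"],
        },
    },
}
INGRESS_SCHEMA = {
    "type": "object",
    "properties": {"spec": {"type": "object", "properties": {
        "rules": {"type": "array", "items": {"type": "object", "properties": {
            "http": {"type": "object", "properties": {
                "paths": {"type": "array", "items": PATH_SCHEMA},
            }},
        }}},
    }}},
}


def form_fields(schema: dict, path: str = "") -> Iterator[dict]:
    """Yield one widget description per leaf field of the schema."""
    kind = schema.get("type")
    if kind == "object":
        for name, sub in schema.get("properties", {}).items():
            yield from form_fields(sub, f"{path}.{name}" if path else name)
    elif kind == "array":
        # Array items share one widget definition; "[]" marks repetition.
        yield from form_fields(schema.get("items", {}), path + "[]")
    else:
        yield {
            "path": path,
            "widget": "dropdown" if "enum" in schema else "text",
            "options": schema.get("enum", []),
            "help": schema.get("description", ""),
        }


if __name__ == "__main__":
    for field in form_fields(INGRESS_SCHEMA):
        print(field["path"], field["widget"], field["options"])
```

| A real UI would also need to handle `oneOf`, maps, and
| required fields, which this sketch ignores.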
| mugsie wrote:
| Yup, and it has the advantage of having an easily backed
| up state store to represent the actions of the GUI.
|
| I always liked the octant UI autogeneration for CRDs and
| the way it just parsed things correctly from the
| beginning, if they had an edit mode that would be perfect
| ninjha wrote:
| Is there anything in particular you like about what
| Octant does? I don't see anything that actually looks at
| the object spec, just the status fields / etc.
| k8sToGo wrote:
| Sounds great. An interactive Spec builder, if I
| understand correctly.
| frazbin wrote:
| If I may ask, just to educate myself
|
| where do you keep the ECS service/task specs and how do you
| mutate them across your stacks?
|
| How long does it take to stand up/decomm a new instance of
| your software stack?
|
| How do you handle application lifecycle concerns like
| database backup/restore, migrations/upgrades?
|
| How have you supported developer stories like "I want to test
| a commit against our infrastructure without interfering with
| other development"?
|
| I recognize these can all be solved for ECS but I'm curious
| about the details and how it's going.
|
| I have found Kubernetes most useful when maintaining lots of
| isolated tenants within limited (cheap) infrastructure, esp
| when velocity of software and deployments is high and has
| many stakeholders (customer needs their demo!)
| liveoneggs wrote:
| https://docs.aws.amazon.com/AmazonECS/latest/developerguide
| /...
|
| https://docs.aws.amazon.com/AmazonECS/latest/developerguide
| /...
|
| https://docs.aws.amazon.com/AmazonECS/latest/developerguide
| /...
|
| etc
| mugsie wrote:
| Yeah, that doesn't really answer the question at all...
| Do you just have a pile of cloudformation on your
| desktop? point and click? tf? And then none of the actual
| questions like
|
| > How do you handle application lifecycle concerns like
| database backup/restore, migrations/upgrades?
|
| were even touched.
| k8sToGo wrote:
| It is always this holier than thou attitude of Software
| engineers towards DevOps that is annoying. Especially if it
| comes from ignorance.
|
| These days often DevOps is done by former Software Engineers
| rather than "old fashioned" Sys admins.
|
| Just because you are ignorant on how to use AKS efficiently,
| doesn't mean your alternative is better.
| mugsie wrote:
| Yeah, DevOps was a culture not a job title, and then we let
| us software engineers in who just want to throw something
| into prod and go home on friday night, so they decided it
| was a task, and the lowest importance thing possible, but
| simultaneously, the devops/sre/prod eng teams needed to be
| perfect, because it's prod.
|
| it is a weird dichotomy I have seen, and it is getting
| worse. We let teams have access to argo manifests, and
| helm charts, and even let them do custom in repo charts.
|
| not one team in the last year has actually gone and looked
| at k8s docs to figure out how to do basic shit, they just
| dump questions into channels, and soak up time from people
| explaining the basics of the system their software runs on.
| sgarland wrote:
| > These days often DevOps is done by former Software
| Engineers rather than "old fashioned" Sys admins.
|
| Yes, and the world is a poorer place for it. Google's SRE
| model works in part because they have _both_ Ops and SWE
| backgrounds.
|
| The thing about traditional Ops is, while it may not scale
| to Google levels, it does scale quite well to the level
| most companies need, _and_ along the way, it forces people
| to learn how computers and systems work to a modicum of
| depth. If you're having to ssh into a box to see why a
| process is dying, you're going to learn something about
| that process, systemd, etc. If you drag the dev along with
| you to fix it, now two people have learned cross-areas.
|
| If everything is in a container, and there's an
| orchestrator silently replacing dying pods, that no longer
| needs to exist.
|
| To be clear, I _love_ K8s. I run it at home, and have used
| it professionally at multiple jobs. What I don't like is
| how it (and every other abstraction) has made it such that
| "infra" people haven't the slightest clue how infra
| actually operates, and if you sat them down in front of an
| empty, physical server, they'd have no idea how to
| bootstrap Linux on it.
| blazing234 wrote:
| Why don't you just deploy to cloud run on gcp and call it a
| day
| Spivak wrote:
| I'm so confused about the jobs program thing. I'm an infra
| engineer who has had the title devops for parts of my career.
| I feel like I've always been _desperately_ needed by teams of
| software devs that don't want to concern themselves with the
| gritty reality of actually running software in production.
| The job kinda sucks but for some reason jives with my brain.
| I take a huge amount of work and responsibility off the
| plates of my devs and my work scales well to multiple teams
| and multiple products.
|
| I've never seen an infra/devops/platform team not swamped
| with work and just spinning their tires on random unnecessary
| projects. We're more expensive on average than devs, harder
| to hire, and two degrees separated from revenue. We're not a
| typically overstaffed role.
| danielklnstn wrote:
| CRDs and their controllers are perhaps _the_ reason Kubernetes
| is as ubiquitous as it is today - the ability to extend
| clusters effortlessly is amazing and opens up the door for so
| many powerful capabilities.
|
| > I don't want to write a Kubernetes controller. I don't even
| know why it should exist.
|
| You can take a look at Crossplane for a good example of the
| capabilities that controllers allow for. They're usually
| encapsulated in Kubernetes add-ons and plugins, so much as you
| might never have to write an operating system driver yourself,
| you might never have to write a Kubernetes controller yourself.
| raffraffraff wrote:
| One of the first really pleasant surprises I got while
| learning was that the kubectl command itself was extended
| (along with tab completion) by CRDs. So install external
| secrets operator and you get tab complete on those resources
| and actions.
| mugsie wrote:
| Yeah, for a lot of companies, this is way overkill. That's fine,
| don't use it! In the places I have seen use it when it is
| actually needed, the controller makes a lot of work for teams
| disappear. It exists, because that's how K8S itself works? - how
| it translates from a deployment -> replica set -> pod ->
| container.
|
| Abstractions are useful to stop 100000s lines of boiler plate
| code. Same reason we have terraform providers, Ansible modules,
| and well, the same concepts in programming ...
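|
| The deployment -> replica set -> pod chain above is a stack of
| control loops: each one compares desired state with observed state
| and emits the actions that close the gap. A toy, in-memory sketch
| of one such loop follows. It makes no Kubernetes API calls, and
| the names `reconcile`, `apply_actions`, and the `web-N` pod naming
| are made up for illustration.

```python
from __future__ import annotations


def reconcile(desired_replicas: int, running: list[str],
              prefix: str = "web") -> list[tuple[str, str]]:
    """Return the (verb, pod_name) actions that move `running` toward the goal."""
    actions: list[tuple[str, str]] = []
    if len(running) < desired_replicas:
        taken = set(running)
        index = 0
        # Create pods under unused names until the count matches.
        while len(running) + len(actions) < desired_replicas:
            name = f"{prefix}-{index}"
            if name not in taken:
                actions.append(("create", name))
            index += 1
    else:
        # Too many pods (or exactly enough): delete the surplus, if any.
        for name in sorted(running)[desired_replicas:]:
            actions.append(("delete", name))
    return actions


def apply_actions(running: list[str],
                  actions: list[tuple[str, str]]) -> list[str]:
    """Pretend to be the cluster: apply actions and return the new pod list."""
    state = set(running)
    for verb, name in actions:
        if verb == "create":
            state.add(name)
        else:
            state.discard(name)
    return sorted(state)


if __name__ == "__main__":
    pods = ["web-0"]
    pods = apply_actions(pods, reconcile(3, pods))
    print(pods)
    print(reconcile(3, pods))  # nothing left to do once converged
```

| A custom controller is the same loop written for your own
| resource type instead of the built-in ones.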
| stouset wrote:
| Right now I'm typing on a glass screen that pretends to have a
| keyboard on it that is running a web browser developed with a
| UI toolkit in a programming language that compiles down to an
| intermediate bytecode that's compiled to machine code that's
| actually interpreted as microcode on the processor, half of it
| is farmed out to accelerators and coprocessors of various
| kinds, all assembled out of a gajillion transistors that neatly
| hide the fact that we've somehow made it possible to make sand
| think.
|
| The number of layers of abstraction you're already relying on
| just to post this comment is nigh uncountable. Abstraction is
| literally the only way we've continued to make progress in any
| technological endeavor.
| petercooper wrote:
| Then all of that data is turned into HTTP requests which turn
| into TCP packets distributed over IP over wifi over Ethernet
| over PPPoE over DSL and probably turned into light sent over
| fiber optics at various stages... :-)
| ok123456 wrote:
| The problem isn't abstractions. The problem is leaky
| abstractions that make it harder to reason about a system and
| add lots of hidden states and configurations of that state.
|
| What could have been a static binary running a system service
| has become a Frankenstein mess of opaque nested environments
| operated by action at a distance.
| zug_zug wrote:
| I think the point is that there are abstractions that require
| you to know almost nothing (e.g. that my laptop has a SSD
| with blocks that are constantly dying is abstracted to a
| filesystem that looks like a basic tree structure).
|
| Then there are abstractions that may actually _increase_
| cognitive load "What if instead of thinking about chairs, we
| philosophically think about ALL standing furniture types,
| stools, tables, etc. They may have 4 legs, 3, 6? What about
| car seats too?"
|
| AFAICT writing a kubernetes controller is probably an overkill,
| challenge-yourself-level exercise (e.g. a quine in BF)
| because odds are that any resource you've ever needed to
| manage somebody else has built an automated way to do it
| first.
|
| Would love to hear other perspectives though if anybody has
| great examples of when you really couldn't succeed without
| writing your own kubernetes controller.
| stouset wrote:
| Those only require you to understand them because you're
| working directly on top of them. If you were writing a
| filesystem driver you would _absolutely_ need to know those
| details. If you're writing a database backend, you probably
| need to know a lot about the filesystem. If you're writing
| an ORM, you need to know a lot about databases.
|
| Some of these abstractions are leakier than others. Web
| development coordinates a _lot_ of different technologies
| so often times you need to know about a wide variety of
| topics, and sometimes a layer below those. Part of it is
| that there's a lot less specialization in our profession
| than in others, so we need lots of generalists.
| zug_zug wrote:
| I think you're sort of hand-waving here.
|
| I think the concrete question is -- do you need to learn
| more or fewer abstractions to use kubernetes versus say
| AWS?
|
| And it looks like kubernetes is more abstractions in
| exchange for more customization. I can understand why
| somebody would roll their eyes at a system that has as
| much abstraction as Kubernetes does if their use-case is
| very concrete - they are scaling a web app based on
| traffic.
| zenethian wrote:
| Seemingly endlessly layered abstraction is also why phones
| and computers get faster and faster yet nothing seems to
| actually run better. Nobody wants to write native software
| anymore because there are too many variations of hardware and
| operating systems but everyone wants their apps to run on
| everything. Thus, we are stuck in abstraction hell.
|
| I'd argue the exact opposite has happened. We have made very
| little progress because everything is continually abstracted
| out to the least common denominator, leaving accessibility
| high but features low. Very few actual groundbreaking leaps
| have been accomplished with all of this abstraction; we've
| just made it easier to put dumb software on more devices.
| stouset wrote:
| I encourage you to actually work on a twenty year old piece
| of technology. It's easy to forget that modern computers
| are doing a _lot_ more. Sure, there's waste. But the
| expectations from software these days are exponentially
| greater than what we used to ship.
| solatic wrote:
| Current example from work: an extreme single-tenant
| architecture, deployed for large N number of tenants, which
| need both logical and physical isolation; the cost of the
| cloud provider's managed databases is considered Too Expensive
| to create one per tenant, so an open-source Kubernetes
| controller for the database is used instead.
|
| Not all systems are small-N modern multi-tenant architectures
| deployed at small scale.
| dijit wrote:
| > Why do devops keep piling abstractions on top of
| abstractions?
|
| Mostly, because developers keep trying to replace sysadmins
| with higher levels of abstraction. Then when they realise that
| they require (some new word for) sysadmins still, they pile on
| more abstractions again and claim they don't need them.
|
| The abstraction du-jour is not Kubernetes at the moment, it's
| FaaS. At some point managing those FaaS will require operators
| again and another abstraction on top of FaaS will exist, some
| kind of FaaS orchestrator, and the cycle will continue.
| robertlagrant wrote:
| I think it's clear that Kubernetes et al aren't trying to
| replace sysadmins. They're trying to massively increase the
| ratio of machine:sysadmin.
| antonvs wrote:
| If you're implementing a distributed system that needs to
| manage many custom resources (of whatever kind, not Kubernetes-
| specific), implementing a Kubernetes controller for it can save
| a great deal of development time and give you a better system
| in the end, with standard built-in observability,
| manageability, deployment automation, and a whole lot else.
|
| It's certainly true that some use of Kubernetes is overkill.
| But if you actually need what it offers, it can be a game-
| changer. That's a big reason why it caught on so fast in big
| enterprises.
|
| Don't fall into the trap of thinking that because you don't
| understand the need for something, that the need doesn't exist.
| clx75 wrote:
| At work we are using Metacontroller to implement our "operators".
| Quoted because these are not real operators but rather
| Metacontroller plugins, written in Python. All the watch and
| update logic - plus the resource caching - is outsourced to
| Metacontroller (which is written in Go). We define - via its
| CompositeController or DecoratorController CRDs - what kind of
| resources it should watch and which web service it should call
| into when it detects a change. The web service speaks plain HTTP
| (or HTTPS if you want).
|
| In case of a CompositeController, the web service gets the
| created/updated/deleted parent resource and any already existing
| child resources (initially none). The web service then analyzes
| the parent and existing children, then responds with the list of
| child resources whose existence and state Metacontroller should
| ensure in the cluster. If something is left out from the response
| compared to a previous response, it is deleted.
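|
| A rough sketch of what such a sync hook could look like as a pure
| function, before wrapping it in any HTTP server. The request and
| response keys (`parent`, `children`, `status`) follow my reading of
| the Metacontroller docs and should be checked against them. The
| `Project` shape, the `name`/`size` volume fields, and the
| cluster-scoped parent are assumptions made for illustration, not
| our actual implementation.

```python
from __future__ import annotations


def sync(request: dict) -> dict:
    """Map a hypothetical Project parent to the children that should exist.

    The function is pure: same parent in, same desired children out.
    Metacontroller is expected to create, update, or delete objects so the
    cluster matches the returned list.
    """
    parent = request["parent"]
    name = parent["metadata"]["name"]
    volumes = parent.get("spec", {}).get("volumes", [])

    # One namespace per project (assumes the parent is cluster-scoped).
    children = [
        {"apiVersion": "v1", "kind": "Namespace", "metadata": {"name": name}},
    ]
    # One PVC per declared volume; `name` and `size` are made-up fields.
    for vol in volumes:
        children.append({
            "apiVersion": "v1",
            "kind": "PersistentVolumeClaim",
            "metadata": {"name": f"{name}-{vol['name']}", "namespace": name},
            "spec": {
                "accessModes": ["ReadWriteMany"],
                "resources": {"requests": {"storage": vol["size"]}},
            },
        })

    return {"status": {"volumes": len(volumes)}, "children": children}


if __name__ == "__main__":
    request = {
        "parent": {
            "metadata": {"name": "demo"},
            "spec": {"volumes": [{"name": "data", "size": "10Gi"}]},
        },
        "children": {},
    }
    for child in sync(request)["children"]:
        print(child["kind"], child["metadata"]["name"])
```

| Dropping a volume from spec.volumes makes its PVC vanish from
| the response, which is how deletion happens in this model.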
|
| Things we implemented using this pattern:
|
| - Project: declarative description of a company project, child
| resources include a namespace, service account, IAM role,
| SMB/S3/FSX PVs and PVCs generated for project volumes (defined
| under spec.volumes in the Project CR), ingresses for a set of
| standard apps
|
| - Job: high-level description of a DAG of containers, the web
| service works as a compiler which translates this high-level
| description into an Argo Workflow (this will be the child)
|
| - Container: defines a dev container, expands into a pod running
| an sshd and a Contour HTTPProxy (TCP proxy) which forwards TLS-
| wrapped SSH traffic to the sshd service
|
| - KeycloakClient: here the web service is not pure - it talks to
| the Keycloak Admin REST API and creates/updates a client in
| Keycloak whose parameters are given by the CRD spec
|
| So far this works pretty well and makes writing controllers a
| breeze - at least compared to the standard kubebuilder approach.
|
| https://metacontroller.github.io/metacontroller/intro.html
| fsniper wrote:
| At work we are using nolar/kopf for writing controllers that
| provision/manage our kubernetes clusters. This also includes
| managing any infrastructure related apps that we deploy on
| them.
|
| We were using whitebox controller at the start, which, like
| metacontroller, runs your scripts on kubernetes events. That
| was easy to write. However, not having full control over the
| lifecycle of the controller code gets in the way from time to
| time.
|
| Considering you are also writing Python did you review kopf
| before deciding on metacontroller?
| ec109685 wrote:
| Curious why you use a controller for these aspects versus
| generating the K8s objects as part of your deployment pipeline
| that you just apply? The latter gives you versioned artifacts
| you can roll forward and back and independent deployment of
| these supporting pieces with each app.
|
| Is there runtime dynamism that you need the control loop to
| handle beyond what the built-in primitives can handle?
| neuroelectron wrote:
| No not really
___________________________________________________________________
(page generated 2025-01-26 23:00 UTC)