[HN Gopher] Accidental complexity, essential complexity, and Kub...
___________________________________________________________________
Accidental complexity, essential complexity, and Kubernetes
Author : paulgb
Score : 83 points
Date : 2022-09-05 19:47 UTC (3 hours ago)
(HTM) web link (driftingin.space)
(TXT) w3m dump (driftingin.space)
| theteapot wrote:
| > Through Brooks' accidental-vs-essential lens, a lot of
| discussion around when to use Kubernetes boils down to the idea
| that essential complexity can become accidental complexity when
| you use the wrong tool for the job. The number of states my
| microwave can enter is essential complexity if I want to heat
| food, but accidental complexity if I just want a timer. With that
| in mind ...
|
| I'm twisting my mind trying to grasp this interpretation in
| Brooks' complexity paradigm. I'm sure Brooks would be interested
| to learn there exist so-called wrong tools that can reduce
| essential complexity to accidental :). I think Brooks would put
| it that the essential complexity is the same and irreducible, but
| the accidental complexity is increased when the wrong tool is
| used.
| [deleted]
| threeseed wrote:
| > The fact that you need a special distribution like minikube,
| Kind, k3s, microk8s, or k0s to run on a single-node instance.
| (The fact that I named five such distributions is a canary of its
| own.)
|
| They serve completely different purposes though.
|
| Some are Docker-based and designed to be lightweight for
| testing, e.g. Kind; others are designed to scale to full
| clusters, e.g. k3s.
|
| And I think having the choice is a good thing as it proves that
| Kubernetes is vendor-agnostic.
| bob1029 wrote:
| Also see closely related paper:
|
| http://curtclifton.net/papers/MoseleyMarks06a.pdf
|
| This one inspired our most recent system architecture.
| candiddevmike wrote:
| Thank you for sharing this, much more insightful than the
| article
| orf wrote:
| > One way to think of Kubernetes is as a distributed framework
| for control loops. Broadly speaking, control loops create a
| declarative layer on top of an imperative system.
|
| Finally: a post about Kubernetes that actually understands what
| it fundamentally is.
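|
| To make that concrete, here's a minimal sketch of that
| declarative layer (names here are hypothetical): you declare a
| desired state, and the controller's reconciliation loop does the
| imperative work of converging the cluster towards it.
|       # minimal sketch: a declared desired state that the control
|       # loop continuously reconciles the cluster towards
|       apiVersion: apps/v1
|       kind: Deployment
|       metadata:
|         name: example-app
|       spec:
|         replicas: 3          # "I want three", not "start three"
|         selector:
|           matchLabels:
|             app: example-app
|         template:
|           metadata:
|             labels:
|               app: example-app
|           spec:
|             containers:
|               - name: example-app
|                 image: nginx:1.23    # placeholder image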
|
| > Kubernetes abstracts away the decision of which computer a pod
| runs on, but reality has a way of poking through. For example, if
| you want multiple pods to access the same persistent storage
| volume, whether or not they are running on the same node suddenly
| becomes your concern again.
|
| I'm not sure I agree that this itself constitutes a leaky
| abstraction. Kubernetes still abstracts away the decision of
| which computer a pod runs on; the persistent storage volume is
| just another declarative _constraint_ on where a pod can be
| scheduled. It's no different from specifying you need "40" CPU
| cores, and thus only being able to be scheduled on nodes with >40
| cores.
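|
| Both constraints are expressed the same way in the pod spec
| (a rough sketch, hypothetical names); the scheduler simply has
| fewer nodes to choose from:
|       # rough sketch: a CPU request and a volume claim are both
|       # declarative constraints on where the pod may run
|       spec:
|         containers:
|           - name: example-app
|             image: example-app:1.0    # hypothetical image
|             resources:
|               requests:
|                 cpu: "40"
|             volumeMounts:
|               - name: data
|                 mountPath: /data
|         volumes:
|           - name: data
|             persistentVolumeClaim:
|               # a ReadWriteOnce claim effectively pins every pod
|               # that mounts it to the same node
|               claimName: example-data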
|
| > There's no fundamental reason that I should need anything
| except my compiler to produce a universally deployable unit of
| code.
|
| I agree - but you can do that with 'containers'. You just create
| an empty "FROM scratch" container then copy your binary in. The
| end result is that your OCI image is just a single .tar.gz file
| containing your binary alongside a JSON document specifying
| things like the entrypoint and any labels. Just like a shipping
| container, this plugs into anything that ships things shaped like
| that.
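|
| Concretely, the whole build definition can be as small as this
| (a sketch; "my-app" stands in for a statically linked binary):
|       # sketch: a scratch-based image containing only the binary
|       FROM scratch
|       COPY my-app /my-app
|       ENTRYPOINT ["/my-app"]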
|
| You get a _bunch_ of nice stuff for free with this (tagged +
| content-addressable remote storage with garbage collection!
| inter-image layer caching!). Even if you're slinging about WASM
| binaries, I'd still package them as OCI images.
|
| > The use of YAML as the primary interface, which is notoriously
| full of foot-guns.
|
| There's a lot more to say here, and it's more of a legitimate
| criticism of Kubernetes than most of the "hurr dur k8s complex"
| criticisms you commonly see. The ecosystem has kind of centered
| around Helm as a way of templating resources, but it's...
| horrible. It's all so horrible. A huge step up from ktml or other
| rubbish from the past, but Go's template language isn't fun.
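|
| For the unfamiliar, a typical Helm chart ends up interleaving Go
| template directives with YAML, roughly like this (a contrived
| sketch; the .Values keys are hypothetical):
|       apiVersion: apps/v1
|       kind: Deployment
|       metadata:
|         name: {{ .Release.Name }}-web
|       spec:
|         replicas: {{ .Values.replicaCount | default 1 }}
|         selector:
|           matchLabels:
|             app: {{ .Release.Name }}-web
|         template:
|           metadata:
|             labels:
|               app: {{ .Release.Name }}-web
|           spec:
|             containers:
|               - name: web
|                 image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"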
|
| But I'm not sure how it could have gone any differently. JSON is
| as standard and simple as it gets and is easy to auto-generate,
| but it's not user-friendly. So either the Kubernetes getting-
| started guide starts with "first go auto-generate your JSON using
| some tool we don't distribute", or you provide a more user-
| friendly way for people to onboard.
|
| YAML is a good middle ground here between giving users the
| ability to auto-generate resources from other systems (i.e.
| spitting out JSON and shovelling it into a k8s API) and something
| user-friendly for people to interact with and view using
| kubectl/k9s/whatever.
| soco wrote:
| I'm confused; I thought JSON was created because XML wasn't
| user-friendly. I, on the other hand, see no user-friendliness in
| either YAML, JSON, or XML. Just formatted text with or without
| tabs. Users don't like standards, so whatever you choose,
| somebody will complain.
| dinosaurdynasty wrote:
| I often wonder how many footguns could've been removed if
| projects like k8s/ansible used TOML instead of YAML.
| bvrmn wrote:
| TOML is an awful substitute in the Ansible context. Let's play
| a game: how about I give you a real-life playbook and you
| translate it to TOML?
| [deleted]
| mati365 wrote:
| Kubernetes and its complexity again...
| bvrmn wrote:
| > The number of states my microwave can enter is essential
| complexity
|
| Two dials to set power and a clockwork timer are enough.
| Microwaves with a digital panel and an LED display are a perfect
| example of accidental complexity.
| hbrn wrote:
| The more I think about it, the more I realize that a generic
| declarative style, despite sounding very promising, might not be
| a good fit for modern deployment, mainly due to technology
| fragmentation.
|
| There's no one true way to deploy an abstract app, each tech
| stack is fairly unique. Two apps can look exactly the same from
| the desired state perspective, but have two extremely different
| deployment processes. They might even have exactly the same tech
| stack, but different reliability requirements.
|
| Somehow you need to be able to accommodate those differences
| in your declarative framework. So you'll pay the abstraction
| costs (complexity). But you will only reap the benefits if you're
| constantly switching out the components of your architecture. And
| that typically doesn't happen: by the time you get to a point
| where you need a deployment framework, your architecture is
| fairly rigid.
|
| Maybe k8s makes a lot of sense if you're Google. But 99.99% of
| companies are not Google.
| jrockway wrote:
| I agree very much about the accidental complexity of containers.
| Ignoring the runtime concerns (cgroups, namespaces, networking,
| etc.), the main problem seems to have been "we can't figure out
| how to get Python code to production", and the solution was "just
| ship a disk image of some developer's workstation to production".
| To do this, they created a new file format ("OCI image format",
| though not called that at the time), a new protocol for
| exchanging files between computers ("OCI distribution spec"), a
| new way to run executables on Linux ("OCI runtime spec"), and
| various proprietary versions of Make, with a caching mechanism
| that isn't aware of the actual dependencies that go into your
| final product. The result is a staggering level of complexity,
| all to work around the fact that nobody even bothered trying to
| add a build system to Python.
|
| Like the author, I tend to write software in a language that
| produces a single statically linked binary, so I don't need any
| of this distribution stuff. I don't need layers, I don't need a
| Make-alike, I don't need a build cache, but I still have to go
| out of my way to wrap the generated binary and push it to a super
| special server. Imagine a world where we just skipped all of
| this, and your k8s manifest looks like:
|       containers:
|         - name: foo-app
|           image:
|             - architecture: amd64
|               os: linux
|               binary: https://github.com/whatever/download/foo-app/foo-app-1.0.0-linux-amd64
|               checksum: sha256@1234567890abcdef
|             - architecture: arm64
|               os: linux
|               binary: https://github.com/whatever/download/foo-app/foo-app-1.0.0-linux-arm64
|               checksum: sha256@abdcef1234567890
|
| Meanwhile, if you don't want to orchestrate your deployment and
| just want to run your app:
|       wget https://github.com/whatever/download/foo-app/foo-app-1.0.0-linux-amd64
|       chmod a+x foo-app-1.0.0-linux-amd64
|       ./foo-app-1.0.0-linux-amd64
|
| I dunno. I feel like, as an industry, we spent billions of
| dollars, founded brand new companies, created a new job title
| ("Devops", don't get me started), all so that we could avoid
| making Python output a single binary. I'm not sure that random
| chaos did the right thing, and you're right to be bitter about
| it.
| tsimionescu wrote:
| The article is pretty light, though I agree with most of the
| points.
|
| However, I think that, especially in the context of
| Kubernetes, this part is completely wrong:
|
| > Containers were a great advance, but much of their complexity
| falls into the "accidental" category. There's no fundamental
| reason that I should need anything except my compiler to produce
| a universally deployable unit of code.
|
| Containers are not used in Kubernetes or other similar
| orchestrators because of their support for bundling dependencies
| - that is a bonus at best.
|
| Instead, they are used because they are a standard way of using
| cgroups to tightly control what pieces of the system a process
| has access to, so that multiple processes running on the same
| system can't accidentally affect each other, and so that a
| process can't accidentally depend on the system it is running on
| (including things like open ports). These are key properties for
| a system that seeks to efficiently distribute workloads on a
| number of computers without having to understand the specifics of
| what workload it's running.
|
| They are also used because container registries are a ready-made,
| secure, Linux-distribution-agnostic way of retrieving software and
| referring to it with a unique name. quay.io/python:3.7 will work
| on Ubuntu, SuSE, RedHat or any other base system, unlike relying
| on apt/yum/etc.
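|
| Both points surface directly in the pod spec. A rough sketch
| (reusing the image mentioned above; the resource values are
| hypothetical):
|       spec:
|         containers:
|           - name: web
|             # the registry reference works on any host distro
|             image: quay.io/python:3.7
|             # the declarative surface over cgroup limits
|             resources:
|               limits:
|                 cpu: "1"
|                 memory: 512Mi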
| paulgb wrote:
| > Containers are not used in Kubernetes or other similar
| orchestrators because of their support for bundling
| dependencies - that is a bonus at best.
|
| > Instead, they are used because they are a standard way of
| using cgroups to tightly control what pieces of the system a
| process has access to
|
| Well, sure, you get both. But the point is that needing a post-
| build step just to get that _is_ accidental complexity. I'm by
| no means a Java fan, but the way the JVM gives you jars as a
| compile target and lets you run them with some measure of
| isolation is an example of how things could be.
| threeseed wrote:
| With Maven, Gradle, etc. you can output a Docker container from
| a single package command.
|
| And I don't understand this idea of using code bundles as the
| deployment artefact.
|
| They don't specify _how_ the code should be run, e.g. Java
| version, set of SSL certificates in your keystore, etc.
|
| Do you really want Kubernetes to have hundreds of options for
| each language? Or is it better to just leave that up to the
| user?
| anonymous_sorry wrote:
| I think the suggestion is to use native Linux binaries with
| static linking of any libraries and resources in the
| executable.
|
| Java programs would need to be compiled as a standard ELF.
| I think GraalVM can do this.
| threeseed wrote:
| Only some Java/Scala programs can be compiled as a single
| binary.
|
| It's a concept that has been around for years but hasn't
| progressed to the point where it comes close to negating
| the need for the JVM in a production environment.
|
| In the meantime, containers work today and are
| significantly more powerful and flexible.
| jayd16 wrote:
| I just don't follow the argument. You can also use Java
| tooling to wrap up a container image. Why is it _accidental_
| complexity that we settled on a language agnostic target that
| also wraps up a lot of the process isolation metadata?
| paulgb wrote:
| Java is an outlier here. For most languages the process of
| dockerizing a codebase involves learning a separate tool
| (Docker). As someone who knows Docker, I get the temptation
| to say "who cares, just write a few lines of Dockerfile",
| but after talking to a bunch of developers about our
| container-based tool, having to leave their toolchain to
| build an image is a bigger sticking point than you might
| think.
| jayd16 wrote:
| But this is begging the question. If the work was more
| integrated with the compiler, it would still need to be
| learned. If your compiler of choice spit out a
| deployable unit akin to an image, you'd still need
| something akin to the Dockerfile to specify exposed ports
| and volumes and such, no?
| paulgb wrote:
| The Dockerfile doesn't specify volumes; those are
| configured at runtime. In the case of Kubernetes, they're
| configured in the pod spec.
|
| As for ports, EXPOSE in a Dockerfile is basically just
| documentation/metadata. In practice, ports are exposed at
| runtime (if using Docker) or by configuring Services (if
| using Kubernetes), and these are unaffected by whether a
| port is exposed in the Dockerfile.
|
| IMHO this is how it should be -- if I'm running a
| containerized database or web server, I want to be able
| to specify the port that the rest of the world sees it
| as, I don't want the container creator to decide that.
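|
| For instance (a rough sketch, hypothetical names), a Service
| lets me pick the port the rest of the cluster sees, independent
| of any EXPOSE in the image:
|       apiVersion: v1
|       kind: Service
|       metadata:
|         name: example-db
|       spec:
|         selector:
|           app: example-db
|         ports:
|           - port: 15432       # what the rest of the cluster sees
|             targetPort: 5432  # what the process actually listens on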
| candiddevmike wrote:
| Containers are a temporary solution while we wait for everyone
| to realize self-contained binaries with static linking are the
| real solution.
| rocmcd wrote:
| The executable piece is nice, but not the whole picture.
| Configuration and, more importantly, isolation of the
| runtime are also huge benefits that come with containers.
| tsimionescu wrote:
| No, containers give you things that static binaries just
| don't. How do you specify the maximum allowed memory for a
| static binary? The ports it opens? The amount of CPU usage?
| The locations it will access from the host file system?
|
| Also, how will you distribute this static binary? How do you
| check that the result of a download is the specified version,
| and the same that others are downloading? How will you
| specify its name and version?
|
| By the time you have addressed all of these, you will have
| re-implemented the vast majority of what standard container
| definitions and registries do.
| anonymous_sorry wrote:
| The orchestrator can still use cgroups for resource
| constraints and isolation. Or it could use virtualization -
| it would be an implementation detail. But devs would not
| have to build a container.
|
| Binary distribution, versioning and checksumming shouldn't
| need to be coupled to a particular format.
|
| Obviously docker solves a bunch of disparate problems.
| That's kind of the objection.
| threeseed wrote:
| That is only a real solution if everything is running on the
| same operating system in the same environment.
|
| What is the point of Kubernetes at all in your situation?
| candiddevmike wrote:
| Everything already is running on the same operating system
| with Kubernetes where it matters (kernel, runc/crun, etc).
| Containers are a band-aid to wrap an app with additional
| files/libraries/whatever. In Go for instance, I can include
| all this stuff at compile time (even conditionally!).
|
| I'd love to see Kubernetes be able to schedule executables
| directly, without containers, by way of systemd-nspawn
| or similar. You could have the "container feel" without the
| complexity of the toolchain required to
| build/deploy/run/validate containers.
| theteapot wrote:
| Sounds pretty much like containers are the solution to
| containers.
| [deleted]
| zamalek wrote:
| I think they mean different things to different people. From
| the problems I have faced with customer-controlled OS
| installations, the biggest thing that they offer is
| configuration isolation (or rather independence). I have seen
| some truly crazy shit done by customers' administrators and
| even crazier shit done by management software that they
| install. Taking away that autonomy is huge.
| tsimionescu wrote:
| Absolutely, but the context of the article was specifically
| Kubernetes.
| jfoutz wrote:
| After thinking about this for a few minutes, I think the author
| might be on the wrong track in connecting this with containers.
|
| But I think there's something really there about "environmental
| linting". I know deep in my bones I need write access to make
| log files, but I don't know how many times I've debugged
| systems lacking this permission.
|
| I know the log path won't be known until runtime, I know the
| port can be specified at runtime, but I think there's a ton of
| room for improvement around a tool that says - hey, you're
| making this set of assumptions about your environment, and you
| should have these health checks or tests or whatever.
|
| I agree with you that this is what containers give, but I
| think the author is really on to something about the dev
| tooling and environment warning about what sorts of permissions
| are needed to, like, work.
___________________________________________________________________
(page generated 2022-09-05 23:00 UTC)