[HN Gopher] Run Nix Based Environments in Kubernetes
___________________________________________________________________
Run Nix Based Environments in Kubernetes
Author : kelseyhightower
Score : 104 points
Date : 2025-11-10 15:27 UTC (6 days ago)
(HTM) web link (flox.dev)
(TXT) w3m dump (flox.dev)
| nrhrjrjrjtntbt wrote:
| How does this differ from the tooling that lets you build
| containers from nix?
| dlahoda wrote:
| seems similar to this
|
| https://github.com/pdtpartners/nix-snapshotter
|
| so things like pulling images from the nix store, mounting a
| shared per-node host nix store into each container, fast
| incremental rebuilds, and generating basic pod configs are good
| things.
|
| and local, CI and remote runs use the same flows and envs.
| setheron wrote:
| There was also Nixery paving the way
| ronef wrote:
| Jotting down a few quick thoughts here but we can totally go
| deep. This is something Michael Brantley started working on a
| few months ago to test out how to make it super easy to use
| and leverage existing Nix & Flox architecture. One of the core
| differences from my quick perspective is that it specifically
| leverages the unique way that Flox environments are rendered
| without performing a nix evaluation, making it safe and
| optimally performant for the k8s node to realize the packages
| directly on the node, outside of a container.
| zobzu wrote:
| I read this a few times but there's no info.
| Shebanator wrote:
| Wrong. If you know nix then you know "leverages the unique
| way that Flox environments are rendered without performing
| a nix evaluation" is a very significant statement.
| nixosbestos wrote:
| > leverages the unique way that Flox environments are
| rendered without performing a nix evaluation
|
| I'm curious! and ignorant! help!
|
| Is that via (centrally?) cached eval? or what? there's
| only so much room for magic in this arena.
| jeremy_flox wrote:
| Yeah, it's essentially cached eval, the key being
| where/how that eval is stored.
|
| When you create a Flox environment, we evaluate the Nix
| expressions once and store the concrete result (ie exact
| store paths) in FloxHub. The k8s node just fetches that
| pre-rendered manifest and bind-mounts the packages with
| no evaluation step at pod startup.
|
| It's like the difference between giving the node a recipe
| to interpret vs. giving it a shopping list of exact
| items. Faster, safer, and the node doesn't need to know
| how to cook (evaluate Nix). I don't know, there's a
| metaphor here somewhere, I'll find it.
|
| Only so much room for magic, for sure, but tons of room
| for efficiency and optimization.
| jeremy_flox wrote:
| Correction: we don't eval when you create environments.
|
| Our catalog continuously pre-evaluates nixpkgs in the
| background. 'flox install' just selects from pre-
| evaluated packages -- no eval needed, ever. The k8s node
| fetches the manifest and mounts the packages.
|
| Eval is done once, centrally, continuously. So... even
| more pre-val'd, so to speak.
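A minimal sketch of the "shopping list" idea described above, assuming a hypothetical manifest layout. The `packages` and `store_path` field names, the example store paths, and the `missing_store_paths` helper are invented for illustration and are not the Flox manifest format or API. The point is only that the node reads exact store paths and works out which ones it still needs to fetch, with no Nix evaluation involved.

```python
import json

# Hypothetical pre-rendered manifest: evaluation already happened elsewhere,
# so the node sees only concrete store paths (names and hashes are made up).
MANIFEST_JSON = """
{
  "packages": [
    {"name": "python3", "store_path": "/nix/store/aaaa-python3-3.12"},
    {"name": "glibc",   "store_path": "/nix/store/bbbb-glibc-2.40"},
    {"name": "curl",    "store_path": "/nix/store/cccc-curl-8.9"}
  ]
}
"""


def store_paths(manifest_text):
    """Return the exact store paths listed in the manifest, in order."""
    manifest = json.loads(manifest_text)
    return [pkg["store_path"] for pkg in manifest["packages"]]


def missing_store_paths(manifest_text, present_on_node):
    """Paths the node must still fetch; no expression is evaluated here."""
    return [p for p in store_paths(manifest_text) if p not in present_on_node]


if __name__ == "__main__":
    already_on_node = {"/nix/store/bbbb-glibc-2.40"}
    for path in missing_store_paths(MANIFEST_JSON, already_on_node):
        print("fetch", path)
```

In this toy model the node's work is a JSON parse and a set lookup, which is the "recipe vs. shopping list" contrast in miniature.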
| whazor wrote:
| When I worked on an enterprise data analytics platform, a big
| problem was docker image growth. People were using different
| python versions, different cuda versions, all kinds of libraries.
| With Cuda being over a gigabyte, this all explodes.
|
| The solution is to decompose the docker images and make sure that
| every layer is hash equivalent. So if people update their Cuda
| version, it results in a change within the Python layers.
|
| But it looks like Flox now simplifies this via Nix. Every Nix
| package already has a hash and you can combine packages however
| you would like.
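A rough sketch of why hash-addressed packages help with this kind of growth. The package names and sizes are made up (none of these numbers come from the thread), and the "monolithic" case assumes the worst case where images share no layers at all. Each environment is modeled as a set of package identifiers, so a shared store keeps each distinct package once.

```python
# Illustrative sizes in MiB, keyed by a made-up "hash-name" identifier.
SIZES = {
    "h1-cuda-12.4": 1500,
    "h2-cuda-12.6": 1600,
    "h3-python-3.11": 120,
    "h4-python-3.12": 125,
    "h5-numpy": 60,
}

ENVIRONMENTS = {
    "team-a": {"h1-cuda-12.4", "h3-python-3.11", "h5-numpy"},
    "team-b": {"h1-cuda-12.4", "h4-python-3.12", "h5-numpy"},
    "team-c": {"h2-cuda-12.6", "h4-python-3.12", "h5-numpy"},
}


def monolithic_size(envs, sizes):
    """Total size if every environment is one image sharing no layers."""
    return sum(sizes[p] for pkgs in envs.values() for p in pkgs)


def shared_store_size(envs, sizes):
    """Total size if each distinct package is stored once, keyed by hash."""
    distinct = set().union(*envs.values())
    return sum(sizes[p] for p in distinct)


if __name__ == "__main__":
    print("monolithic:", monolithic_size(ENVIRONMENTS, SIZES), "MiB")
    print("shared store:", shared_store_size(ENVIRONMENTS, SIZES), "MiB")
```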
| justincormack wrote:
| Yes, there were various attempts to do this in the container
| ecosystem, but there is a hard limit on layers on Docker images
| (because there are hard limits on overlay mounts; you don't
| really need to overlay all the Nix store mounts of course as
| they have different paths but the code is for the general
| case). So then there were various ways of bundling sets of
| packages into layers, but just managing it directly through Nix
| store is much simpler.
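One way to picture the "bundling sets of packages into layers" workaround, as a sketch under assumptions: `max_layers` stands in for whatever limit the runtime imposes, and the heuristic here (the most widely shared packages each get their own layer, everything else is lumped into a final layer) is an illustrative choice, not the method of any particular tool.

```python
from collections import Counter


def plan_layers(environments, max_layers):
    """Assign packages to at most max_layers layers.

    The most widely shared packages each get a dedicated layer (so they can
    be reused across images); everything else shares one final layer.
    """
    if max_layers < 1:
        raise ValueError("max_layers must be at least 1")
    usage = Counter(pkg for pkgs in environments.values() for pkg in pkgs)
    # Sort by how many environments use a package, then by name for stability.
    ranked = sorted(usage, key=lambda pkg: (-usage[pkg], pkg))
    if len(ranked) <= max_layers:
        return [[pkg] for pkg in ranked]
    dedicated = [[pkg] for pkg in ranked[: max_layers - 1]]
    return dedicated + [sorted(ranked[max_layers - 1:])]


EXAMPLE_ENVS = {
    "a": {"glibc", "python", "numpy"},
    "b": {"glibc", "python", "curl"},
    "c": {"glibc", "jq"},
}
```

With a limit of 3, `plan_layers(EXAMPLE_ENVS, 3)` gives `glibc` and `python` their own layers and bundles the rest, which shows the bookkeeping that a shared store avoids.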
| based2 wrote:
| https://github.com/pdtpartners/nix-snapshotter/blob/main/doc...
| ronef wrote:
| And I'm back in the land of the living. Can't really beat a
| response from Justin Cormack!
| __MatrixMan__ wrote:
| I was an early and enthusiastic adopter of docker. I really
| liked how it would let me use layers to keep track of
| dependency between files.
|
| After spending a few years using nix, the docker image
| situation looks pretty bonkers. If two files end up in separate
| layers, the system assumes dependency so if the lower file
| changes you need to build a separate copy of the higher one
| just in case there's actual dependency there.
|
| Within nix you can be more precise about what depends on what,
| which is nice, but you do have to be thoughtful about it or you
| can summon the same footgun that got you with docker, just in
| smaller form. Because a nix derivation, while a box with nicely
| labeled inputs and output, is still a black box. If you insert
| a readme as an input to a derivation that does a build, nix
| will assume that the compiled binary depends on it and when you
| fix a typo in the readme and rebuild you'll end up with a
| duplicate binary build in the nix store despite the contents of
| the binary not actually depending on the text of the readme.
|
| > you can combine packages however you would like
|
| So this is true, more or less, but be aware that while nix lets
| you do this in ways that don't force needless duplication, it
| doesn't force you to avoid that duplication. Things carelessly
| packaged with nix can easily recreate the problem you mentioned
| with docker.
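The README example can be sketched with a toy model of input-addressed builds. This is a simplification: real Nix hashes a full derivation description, while the hypothetical `output_path` helper below only hashes file names and contents. What it shows is that every declared input feeds the hash, whether or not the build actually reads it.

```python
import hashlib


def output_path(name, inputs):
    """Toy input-addressed path: hash every declared input, used or not."""
    digest = hashlib.sha256()
    for filename in sorted(inputs):
        digest.update(filename.encode())
        digest.update(b"\0")
        digest.update(inputs[filename])
        digest.update(b"\0")
    return f"/nix/store/{digest.hexdigest()[:32]}-{name}"


def build_inputs(files):
    """Keep only the files the compile step reads (illustrative filter)."""
    return {k: v for k, v in files.items() if k.endswith(".c")}


SOURCE = {"main.c": b"int main(void) { return 0; }\n"}
with_readme = {**SOURCE, "README.md": b"Teh project.\n"}
with_fixed_readme = {**SOURCE, "README.md": b"The project.\n"}

# Fixing the typo changes the path, so the binary is rebuilt and duplicated.
assert output_path("app", with_readme) != output_path("app", with_fixed_readme)

# Filtering the README out of the inputs keeps the path stable across the fix.
assert output_path("app", build_inputs(with_readme)) == output_path(
    "app", build_inputs(with_fixed_readme)
)
```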
| justincormack wrote:
| The problem is that whiteouts are not commutative. If the
| layers you build turn out to be bit for bit identical the
| layers will be shared anyway, but it's much more complex than
| Nix where the composition operation is commutative.
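A small sketch of the commutativity point, using a simplified model of layers (a dict of files plus a set of whiteouts, which is an illustration and not the OCI on-disk format). Applying two layers in a different order can give a different filesystem, whereas merging sets of store paths gives the same result in either order.

```python
def apply_layer(fs, layer):
    """Apply one simplified layer: delete whiteouts, then add its files."""
    result = {p: c for p, c in fs.items() if p not in layer["whiteouts"]}
    result.update(layer["files"])
    return result


def stack(layers):
    """Apply layers bottom to top, starting from an empty filesystem."""
    fs = {}
    for layer in layers:
        fs = apply_layer(fs, layer)
    return fs


adds_config = {"files": {"/etc/app.conf": "v1"}, "whiteouts": set()}
removes_config = {"files": {}, "whiteouts": {"/etc/app.conf"}}

# Order matters: the whiteout only hides files from layers beneath it.
assert stack([adds_config, removes_config]) == {}
assert stack([removes_config, adds_config]) == {"/etc/app.conf": "v1"}

# Store paths are unique per package, so merging them is a plain set union.
store_a = {"/nix/store/aaaa-glibc", "/nix/store/bbbb-python"}
store_b = {"/nix/store/aaaa-glibc", "/nix/store/cccc-curl"}
assert store_a | store_b == store_b | store_a
```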
| d3Xt3r wrote:
| > _If you insert a readme as an input to a derivation that
| does a build, nix will assume that the compiled binary
| depends on it and when you fix a typo in the readme and
| rebuild you'll end up with a duplicate binary build in the
| nix store despite the contents of the binary not actually
| depending on the text of the readme._
|
| One of my issues with Nix is the black box that is the store,
| and maybe it's just my system, but over time I find it full
| of redundant files / orphans and no obvious way to flatten it
| or clean it safely without breaking something.
|
| I wonder how flox solves this.
| jeremy_flox wrote:
| Both fair points. The README rebuild issue is a Nix hiccup
| we don't solve; our quantized catalog reduces cascading
| rebuilds from upstream churn, but input over-specification
| is still there.
|
| On store bloat: Flox makes it clearer what's in use
| (explicit environments vs. implicit dependencies), but you
| still need nix-collect-garbage.
|
| The store accumulates cruft, that's Nix reality, we haven't
| changed it.
| jeremy_flox wrote:
| Just to follow up on this, Flox puts packages in one
| group by default so they share dependencies, plus our
| quantized catalog means way less version spread than raw
| Nix. So I do think we still improve on the Nix story,
| here.
|
| We're also adding "stabilities" (downsample from daily to
| weekly/monthly snapshots) to reduce churn even more.
| Still need GC, but a lot fewer bags on trash day.
| __MatrixMan__ wrote:
| Have you found that nix-store --gc breaks things, or are
| you concerned that it's not an aggressive enough garbage
| collection?
|
| When I have space problems, I run that and they're gone,
| then later I do it again. It could be that I'm just
| avoiding functionality that it breaks though.
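For readers wondering why garbage collection is meant to be safe, here is a sketch of the general idea: anything reachable from a set of roots is kept, and the rest can be deleted. The graph, the path names, and the `collect_garbage` helper are all made up for illustration. This is not how `nix-store --gc` is implemented, only the reachability principle it relies on.

```python
def live_paths(roots, references):
    """Return every path reachable from the roots via references."""
    live = set()
    pending = list(roots)
    while pending:
        path = pending.pop()
        if path in live:
            continue
        live.add(path)
        pending.extend(references.get(path, ()))
    return live


def collect_garbage(store, roots, references):
    """Return the paths that no root depends on, directly or indirectly."""
    return set(store) - live_paths(roots, references)


REFERENCES = {
    "env-current": ["python-3.12", "curl"],
    "python-3.12": ["glibc"],
    "curl": ["glibc"],
    "python-3.11": ["glibc-old"],  # left over from a previous environment
}
STORE = {"env-current", "python-3.12", "python-3.11", "curl", "glibc", "glibc-old"}

if __name__ == "__main__":
    print(sorted(collect_garbage(STORE, ["env-current"], REFERENCES)))
```

In this model, things break only if something still in use was never registered as a root, which is one way to frame the "orphans" question above.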
| ronef wrote:
| Yes, this hits the nail on the head. We've seen the same
| explosion in image size and rebuild complexity, especially with
| AI/ML workloads where Python + CUDA + random pip wheels +
| system libs = image bloat and massive rebuilds.
|
| With the Kubernetes shim, you can run the hash-pinned
| environments without building or pulling an image at all. It
| starts the pod with a stub, then activates the exact runtime
| from a node-local store.
| CuriouslyC wrote:
| Too bad this isn't open source, I'm 3/4ths of the way through
| building pretty much this exact product in order to support my
| actual products.
| natebc wrote:
| Is it not GPL?
|
| The license file in their github seems to indicate that it is.
| https://github.com/flox/flox?tab=GPL-2.0-1-ov-file
| CuriouslyC wrote:
| Cool if so, I didn't see it prominently linked or mentioned
| on the landing page. Maintainers: being open source is a big
| feature, mention it prominently and have your repo links
| front and center.
| rootnod3 wrote:
| I used to love both, Kubernetes and Nix. But after a few years of
| using both I felt like the abstraction levels are a bit too deep.
|
| Sure, it's easy to stand up a mail server in NixOS, or to just
| use docker/kubernetes to deploy stuff. But after a few years it
| felt like I don't have a single understanding of the stack. When
| shit hits the fan, it makes it very difficult to troubleshoot.
|
| I am now back on running my servers on FreeBSD/OpenBSD and jails
| or VMM respectively. And also dumbing the stack down to just "run
| it in a jail, but set it up manually".
|
| The only outlier is Immich. For some reason they only officially
| support the docker images, with not a single clear instruction
| how to set it up manually. Sure, I could look at the Dockerfiles,
| but many of the scripts also expect docker to be present.
|
| And now that FreeBSD also has reproducible builds, it took one
| more stone away from Nix.
| ronef wrote:
| Going to sound weird but with both my hats on I super
| appreciate this perspective. I can only speak to some areas of
| Nix and Flox obviously and I know folks are looking into doing
| this to your point a whole lot better. Zooming in way more on
| solving for those of us who just want to run it and fix it fast
| when it breaks.
|
| Also, think it's a huge ecosystem win for FreeBSD pushing on
| reproducibility too. I think we are trending in a direction
| where this just becomes a critical principle for certain
| stacks. (also needed when you dive into AI stacks/infra...)
| rootnod3 wrote:
| Yes, but I also think that the BSDs are the last bastions you
| will find any AI usage in. And I for one am grateful for
| that.
|
| I like it when my system comes with a complete set of
| manpages and good docs.
|
| But you mentioned Flox, which I didn't even know about. First
| I thought that's what they renamed the Nix fork to after the
| schism, but now I see it's a paid product and yuck...just
| further deepens my belief in going more bare bones manual
| control, even if sometimes bothersome.
| antonvs wrote:
| Kubernetes can be a godsend at larger orgs.
|
| We have six dev teams and are just about done with migrating to
| k8s. It's an immense improvement over what we had before.
|
| It's a version of Greenspun's tenth rule: "Any sufficiently
| complicated distributed system contains an ad hoc, informally-
| specified, bug-ridden, slow implementation of half of
| Kubernetes."
| eep_social wrote:
| I think six dev teams is small in terms of kube. I wouldn't
| be surprised if that's close to the perfect size to move onto
| kube and create and adopt a standard set of platform idioms.
|
| at orgs significantly larger than that, the kube team has to
| aggressively spin out platform functions that enable further
| layering or risk getting overwhelmed trying to support and
| configure kube features to cover diverse team needs (that is,
| storage software doesn't have the same needs or concerns as
| middleware or the frontend). this incubator model isn't easy
| in practice. trying to adopt kube at this scale is very
| challenging because it requires the kube team to spin up and
| out sub-teams at a very high rate or risk slowing the
| migration down to a crawl or outright failure and purchasing
| e.g. off the shelf AWS because teams need to offboard their
| previous platform.
| throwaway838112 wrote:
| No, most large orgs do not need it.
| antonvs wrote:
| Completely unsupported assertions are the least interesting
| kinds of comment anyone can post.
|
| Besides, this particular comment would need to explain why
| it's likely that its unsupported opinion is correct, rather
| than all the counterexamples that exist in the actual
| highly competitive industry that's heavily using this
| system.
|
| There's a fundamental conflict there that doesn't work in
| favor of your inchoate opinion. The most likely conclusion
| is that there are factors at work here that you don't
| understand.
| ronef wrote:
| Ron from Flox here, woke up to feed a brand new 3 day old to see
| this here! On about 3 hours of sleep (over the last 48 hours) but
| excited to try and answer some questions! Feel free to also drop
| any below <3
|
| We did just launch this last week after a good bit of work from
| the team. Steve wrote up a deeper technical dive here if anyone
| is interested -
| https://flox.dev/blog/kubernetes-uncontained-explained-unloc...
| wathef wrote:
| congrats on the little one, here's to many wonderful moments.
| ronef wrote:
| online community love was not in my cards going into day 3 of
| a newborn but I'll take it + definitely needed! thank you!
| nixosbestos wrote:
| So, nix-snapshotter? Also, Flox going all in on "environments"
| seems like such a choice. I'm sure that Flox is not encouraging
| shipping a binary-in-a-devshell to Prod, so it seems an
| interesting branding decision.
|
| It's hard for me to understand if I should be excited about this.
| I think companies do themselves such huge disservices from not
| being transparent _to the nerds that WILL be the ones helping
| choose/implement these things_. Instead of the current feeling I
| have, there could be three sentences that explain what Flox is
| offering here beyond what *anyone* can go do right now with
| nix-snapshotter.
|
| If it's ecosystem stuff (you get Flox's CI, or CLI, or whatever
| else), that's not very well sold to me on the landing page.
| Otherwise I'm feeling left empty-handed.
| jeremy_flox wrote:
| Totally valid - we buried the lede here. Quick version:
|
| Not nix-snapshotter because we skip Nix eval entirely and get
| way better cache sharing across unrelated workloads (quantized
| catalog means everything shares base deps). On "environments":
| these aren't devshells-as-prod, they're the actual runtime;
| same as 'flox activate' works everywhere. You're shipping a
| declarative, hash-pinned runtime that happens to also work
| great in dev/CI.
|
| And yeah, we should have been upfront that this is alpha and
| we're planning to open source it after vetting at KubeCon.
|
| You're right that we're doing ourselves a disservice not being
| transparent with the technical crowd. What specific technical
| details would help you evaluate this?
| jeremy_flox wrote:
| Jeremy from Flox, here, I want to chime in here so Ron can be
| with his family, even though he will no doubt be right back on
| here:
|
| Re: Relationship to nix-snapshotter and prior art This is
| original work, though very much built on prior innovations. Our
| approach hooks into the upstream containerd runc shim to pull the
| FloxHub-managed environment and bind-mount the closure at
| startup. The key distinction is that we use how Flox environments
| are rendered to avoid Nix evaluation entirely, making it safe and
| fast for a k8s node to realize packages directly on the node.
| Less about images and containers, per se, and more about bringing
| the power of Flox and Nix at the buildtime end to the runtime end
| of SDLC.
|
| The cache story is surprisingly strong: nix store paths
| effectively behave like layers in the node's registry, but with
| dramatically higher hit rates -- often across entirely unrelated
| pod deployments. Because all pods rely on the same underlying
| system libraries drawn from the "quantized" Flox catalog,
| different environments naturally share glibc, core utilities, and
| common dependencies, where traditional containers typically share
| nothing.
|
| Tools like nix-snapshotter, Nixery, and others have pioneered
| this space and we're grateful for that work. This rising "post-
| Docker" tide raises all ships.
|
| Re: Open Source The software is brand new -- only slightly older
| than Ron's baby -- and currently in alpha. KubeCon was our first
| opportunity for broad feedback, and we uncovered a few issues
| we're still addressing. Our intent is to open-source the project
| once we've fully vetted the approach, ideally in the coming
| weeks.
|
| Yes, we launched early and the product is imperfect, but we're
| doing so transparently and with a commitment to getting it right
| and releasing it to the community. We will continue to release
| early and often.
|
| Re: Abstraction depth concerns I appreciate @rootnod3's point
| about deeper abstractions complicating debugging. We're thinking
| hard about how to keep things simple for people who need to run
| and fix systems quickly. It's encouraging to see the broader
| ecosystem--like FreeBSD--lean further into reproducibility,
| especially as AI-centric stacks make this increasingly important.
|
| Re: Nix vs traditional approaches Skilled Dockerfile authors can
| achieve great caching results -- and you can pin and you can
| prune registries, etc -- but our goal is to make these best
| practices the default. Nix enables finer-grained caching and a
| universal packaging format for building and consuming open source
| software.
|
| We see intrinsic value in Flox environments -- whether on the
| CLI, k8s, Nomad down the road, or other platforms. Our aim is for
| Flox environments to be as universal and natural as Nix packages
| themselves -- essentially extending "flox activate" into the k8s
| world.
|
| We likewise got a ton of valuable feedback at KubeCon, most of
| which was validating, all of which was very much in line with this
| conversation.
| robinhoodexe wrote:
| First, congrats on the release. I've looked at flox and devenv
| for nixifying our container builds. Our distribution of
| languages is about 40/30/20/10 of Python, F#, R and nodejs.
|
| A dilemma I'm facing is that the win from nix in terms of
| faster builds and smaller images would be largely from python
| and R images (where the average size is often 1Gi or larger).
| However, the developers that use Python or R are less likely to
| "get" the point of Nix and might have a steeper learning curve
| than F# developers (where the builds are quite efficient).
|
| That was the context, my question is, how's the integration
| with Flox and R/RStudio? I know there's Rix[1] for managing R
| packages with Nix.
|
| [1] https://github.com/ropensci/rix
| nickysielicki wrote:
| What constraints/coordination exists with this, in terms of host
| driver support? What enforces that Nix does not attempt to use a
| newer cuda toolkit on a host with an older cuda driver?
___________________________________________________________________
(page generated 2025-11-16 23:01 UTC)