hngopher.com

       [HN Gopher] Microsandbox: Virtual Machines that feel and perform...
       ___________________________________________________________________
        
       Microsandbox: Virtual Machines that feel and perform like
       containers
        
       Author : makeboss
       Score  : 369 points
       Date   : 2025-05-30 13:20 UTC (1 days ago)
        
 (HTM) web link (github.com)
        
       | appcypher wrote:
       | Thanks for sharing!
       | 
       | I'm the creator of microsandbox. If there is anything you need to
       | know about the project, let me know.
       | 
       | This project is meant to make creating microvms from your machine
       | as easy as using Docker containers.
       | 
       | Ask me anything.
        
         | esafak wrote:
         | Looks neat. If I understand correctly, I can use it to spin up
         | backends on the fly? You have an ambitious list of languages to
         | support:
         | https://github.com/microsandbox/microsandbox/tree/main/sdk
         | 
         | edit: A fleshed out contributors guide to add support for a new
         | language would help.
         | https://github.com/microsandbox/microsandbox/blob/main/CONTR...
        
           | appcypher wrote:
           | Yes. Self-hosting and using it on your own backend infra is
           | the main use-case. And JVM support should just work since it
           | is a Linux machine.
        
         | 0cf8612b2e1e wrote:
         | Only did a quick skim of the readme, but a few questions which
         | I would like some elaboration.
         | 
         | How is it so fast? Is it making any trade offs vs a traditional
         | VM? Is there potential the VM isolation is compromised?
         | 
         | Can I run a GUI inside of it?
         | 
         | Do you think of this as a new Vagrant?
         | 
         | How do I get data in/out?
        
           | appcypher wrote:
           | > How is it so fast? Is it making any trade offs vs a
           | traditional VM? Is there potential the VM isolation is
           | compromised?
           | 
           | It is a lightweight VM and uses the same technology as
           | Firecracker.
           | 
           | > Can I run a GUI inside of it?
           | 
           | It is planned but not yet implemented. But it is absolutely
           | possible.
           | 
           | > Do you think of this as a new Vagrant?
           | 
           | I would consider it Docker for VMs instead. In a similar way, it
           | focuses on DevOps-type use cases like deploying apps, etc.
           | 
           | > How do I get data in/out?
           | 
           | There is an SDK and server that help with that, and file
           | streaming is planned. But right now, you can execute commands
           | in the VM and get the result back via the server.
        
             | westurner wrote:
             | > _I would consider Docker for VMs instead._
             | 
             | Native Containers would probably solve this, too.
             | 
             | From https://news.ycombinator.com/item?id=43553198 :
             | 
             | >>> _ostree native containers are bootable host images that
             | can also be built and signed with a SLSA provenance
             | attestation;https://coreos.github.io/rpm-ostree/container/
             | _
             | 
             | And also from that thread:
             | 
             | > _How should a microkernel run (WASI) WASM runtimes?_
             | 
             | What is the most minimal microvm for WASM / WASI, and what
             | are the advantages to running WASM workloads with
             | firecracker or microsandbox?
        
               | appcypher wrote:
               | > What is the most minimal microvm for WASM / WASI,
               | 
               | By setting up an image with wasmtime for example.
               | 
               | > and what are the advantages to running WASM workloads
               | with firecracker or microsandbox?
               | 
               | I can think of stronger isolation or when you have legacy
               | stuff you need to run alongside.
        
               | westurner wrote:
               | From https://e2b.dev/blog/firecracker-vs-qemu
               | 
               | > _AWS built [Firecracker (which is built on KVM)] to
               | power Lambda and Fargate [2], where they need to quickly
               | spin up isolated environments for running customer code.
               | Companies like E2B use Firecracker_ to run AI generated
               | code securily in the cloud, _while Fly.io uses it to run
               | lightweight container-like VMs at the edge [4, 5]._
               | 
               | "We replaced Firecracker with QEMU" (2023)
               | https://news.ycombinator.com/item?id=36666782
               | 
               | "Firecracker's Kernel Support Policy" describes
               | compatible kernel configurations;
               | https://github.com/firecracker-
               | microvm/firecracker/blob/main...
               | 
               | /? wasi microvm kernel [github] https://www.google.com/se
               | arch?q=wasi+microvm+kernel+GitHub :
               | 
               | - "Mewz: Lightweight Execution Environment for
               | WebAssembly with High Isolation and Portability using
               | Unikernels" (2024) https://arxiv.org/abs/2411.01129
               | similar: https://scholar.google.com/scholar?q=related:b36
               | 57VNcyJ0J:sc...
        
         | hugs wrote:
         | Looks great! This might be extremely useful for a
         | distributed/decentralized software testing network I'm building
         | (called Valet Network)...
         | 
         | Question: How does networking work? Can I restrict/limit
         | microvms so that they can only access public IP addresses? (or
         | in other words... making sure the microvms can't access any
         | local network IP addresses)
        
           | appcypher wrote:
           | Yes! With the `scope` property.
           | 
           | https://github.com/microsandbox/microsandbox/blob/0c13fc27ab.
           | ..
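To make the question concrete, "public addresses only" can be expressed with Python's standard `ipaddress` module. This is only a sketch of the address classification, not how microsandbox's `scope` property is implemented; the helper name `is_public` is made up for illustration.

```python
import ipaddress


def is_public(addr: str) -> bool:
    """Return True if addr is a globally routable IP address.

    Hypothetical helper for illustration; not part of microsandbox.
    """
    return ipaddress.ip_address(addr).is_global


if __name__ == "__main__":
    for addr in ["8.8.8.8", "192.168.1.10", "10.0.0.5", "127.0.0.1"]:
        print(addr, "public" if is_public(addr) else "local/private")
```

A filter built on a check like this would drop traffic to RFC 1918 ranges, loopback, and link-local addresses while allowing everything else.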
        
             | hugs wrote:
             | thanks! have an example on how to use that in a
             | sandboxfile?
             | 
             | (also, this project is really cool. great work!)
        
               | appcypher wrote:
               | Yeah. I need to fix that in the docs!
        
               | hugs wrote:
               | no prob!
        
         | simonw wrote:
         | What's the story for macOS support?
        
           | appcypher wrote:
           | It uses libkrun which uses Hypervisor.framework on macOS.
        
         | wolfhumble wrote:
         | Can you use Microsandbox for everything you can use Docker for,
         | or are there cases where containers make more sense?
         | 
         | Congratulations on the launch!
        
           | appcypher wrote:
           | We want microsandbox to be usable for everything you can with
           | Docker.
           | 
           | That said, hosting microVMs requires dedicated hardware or VMs
           | with nested virt support. Containers don't have that problem.
        
         | nqzero wrote:
         | i'm on a mid-level laptop, at times with slow or expensive
         | internet, running ubuntu. i want to be able to run nominally-
         | isolated "copies" of my laptop at near-native speed
         | 
         | 1. each one should have its own network config, eg so i can
         | use wireguard or a vpn
         | 
         | 2. gui pass-through to the host, eg wayland, for trusted tools,
         | eg firefox, zoom or citrix
         | 
         | 3. needs to be lightweight. eg gnome-boxes is dead simple to
         | setup and run and it works, but the resource usage was
         | noticeably higher than native
         | 
         | 4. optional - more security is better (ie, i might run semi-
         | untrusted software in one of them, eg from a github repo or
         | npm), but i'm not expecting miracles and accept that escape is
         | possible
         | 
         | 5. optional - sharing disk with the host via COW would be nice,
         | so i'd only need to install the env-specific packages, not the
         | full OS
         | 
         | i'm currently working on a podman solution, and i believe that
         | it will work (but rebuilding seems to hammer the network - i'm
         | hoping i can tweak the layers to reduce this). does
         | microsandbox offer any advantages for this use case ?
        
           | appcypher wrote:
           | > 1. each one should have its own network config, eg so i
           | can use wireguard or a vpn
           | 
           | This is possible right now but the networking is not where I
           | want it to be yet. It uses libkrun's default TSI impl;
           | performant and simplifies setup but can be inflexible. I plan
           | to implement an alternative user-space networking stack soon.
           | 
           | > 2. gui pass-through to the host, eg wayland, for trusted
           | tools, eg firefox, zoom or citrix
           | 
           | We don't have GUI passthrough. VNC?
           | 
           | > 3. needs to be lightweight. eg gnome-boxes is dead simple
           | to setup and run and it works, but the resource usage was
           | noticeably higher than native
           | 
           | It is lightweight in the sense that it is not a full vm
           | 
           | > 4. optional - more security is better (ie, i might run
           | semi-untrusted software in one of them, eg from a github repo
           | or npm), but i'm not expecting miracles and accept that
           | escape is possible
           | 
           | The security guarantees are similar to what typical VMs
           | support. It is hardware-virtualized so I would say you should
           | be fine.
           | 
           | > 5. optional - sharing disk with the host via COW would be
           | nice, so i'd only need to install the env-specific packages,
           | not the full OS
           | 
           | Yeah. It uses virtio-fs and has overlayfs on top of that for
           | COW.
        
         | simonw wrote:
         | I'm trying this out now and it's very promising. One problem
         | I'm running into with the Python library is that I'd like to
         | keep that sandbox running for several minutes while I do things
         | like set variables in one call and then use them for stuff
         | several calls later. I keep seeing this error intermittently:
         | 
         |     Error: Sandbox is not started. Call start() first
         | 
         | Is there a suggested way of keeping a sandbox around for
         | longer?
         | 
         | The documented code pattern is this:
         | 
         |     async def main():
         |         async with PythonSandbox.create(name="my-sandbox") as sb:
         |             exec = await sb.run("print('Hello, World!')")
         |             print(await exec.output())
         | 
         | Due to the way my code works I want to instantiate the sandbox
         | once for a specific class and then have multiple calls to it by
         | class methods, which isn't a clean fit for that "async with"
         | pattern.
         | 
         | Any recommendations?
        
           | appcypher wrote:
           | Right. You can skip the `with` context manager and call start
           | and stop yourself.
           | 
           | There is an example of that here:
           | 
           | https://github.com/microsandbox/microsandbox/blob/0c13fc27ab.
           | ..
        
           | gcharbonnier wrote:
           | async with is just syntactic sugar. You could very well call
           | __aenter__ and __aexit__ manually. You could also use an
           | AsyncExitStack, call __aenter__ manually, then
           | enter_async_context, and call aclose when you're done. Since
           | aclose method exists I guess this is not an anti-pattern.
           | 
           | https://docs.python.org/3/library/contextlib.html#contextlib.
           | ..
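A minimal sketch of the `AsyncExitStack` approach described above, for the "create once, call from several methods" case. `FakeSandbox` and `create_sandbox` are stand-ins invented for this example so it runs without microsandbox installed; they are not the real SDK, and the real start/stop calls may differ.

```python
import asyncio
from contextlib import AsyncExitStack, asynccontextmanager


class FakeSandbox:
    """Stand-in for a sandbox object; NOT the microsandbox API."""

    def __init__(self):
        self.variables = {}

    async def run(self, name, value=None):
        # Store a value when one is given, then return the current value.
        if value is not None:
            self.variables[name] = value
        return self.variables[name]


@asynccontextmanager
async def create_sandbox():
    # Hypothetical factory that mimics an `async with ...create()` API.
    sandbox = FakeSandbox()
    try:
        yield sandbox
    finally:
        sandbox.variables.clear()  # pretend to shut the sandbox down


class Session:
    """Keeps one sandbox open across several method calls."""

    def __init__(self):
        self._stack = AsyncExitStack()
        self._sandbox = None

    async def start(self):
        # Equivalent to entering `async with create_sandbox() as sb:`.
        self._sandbox = await self._stack.enter_async_context(create_sandbox())

    async def set(self, name, value):
        return await self._sandbox.run(name, value)

    async def get(self, name):
        return await self._sandbox.run(name)

    async def stop(self):
        # Runs the context manager's exit logic, like leaving the block.
        await self._stack.aclose()
        self._sandbox = None


async def demo():
    session = Session()
    await session.start()
    await session.set("x", 41)
    result = await session.get("x")
    await session.stop()
    return result


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The stack owns the context manager, so `stop()` performs the same cleanup the `async with` block would have done, without calling `__aenter__` or `__aexit__` by hand.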
        
         | codethief wrote:
         | Hi appcypher, very cool project! Does the underlying MicroVM
         | feature provide an OCI runtime interface, so that it could be
         | used as a replacement for runc/crun in Docker/Podman?
        
           | Nypro wrote:
           | No. Not yet. Would be nice to have
        
             | codethief wrote:
             | Thanks for your response!
             | 
             | One more question: What syscalls do I need to have access
             | to in order to run a MicroVM? I'm asking because ideally
             | I'd like to run container workloads inside existing
             | containers (self-hosted GitLab CI runners) whose
             | configuration (including AppArmor) I don't control.
        
         | Hilift wrote:
         | Are you ready for the deluge of networking questions for all
         | the buck wild configurations?
        
           | Nypro wrote:
           | Lol. I should brace for impact.
           | 
           | Networking continues to be a pain but I'm open to
           | suggestions.
        
         | catlifeonmars wrote:
         | How does the microvm architecture compare with firecracker?
        
           | appcypher wrote:
           | They are similar. We use libkrun under the hood. Firecracker
           | team seems not to be interested in a macOS implementation
        
         | nulld3v wrote:
         | Cool project. Off topic question: Are the images in the "Use
         | Cases" section in the README from a real app? I like the clean
         | UI design.
        
           | appcypher wrote:
           | No they are not.
        
         | spicybright wrote:
         | I like the idea. But when you say "bullet proof" security,
         | there are exploits to break out of VMs that exist. Have you
         | looked into those?
        
           | appcypher wrote:
           | Will fix the docs
        
         | meander_water wrote:
         | Can you explain how this compares to Kata Containers? [0] That
         | also supports OCI to run microVMs. You can also choose
         | different hypervisors such as firecracker to run it on.
         | 
         | [0] https://katacontainers.io/
        
           | appcypher wrote:
           | Katacontainers is an interesting project. Microsandbox is a
           | more opinionated project with a UX that focuses on getting up
           | and running with microVMs quickly. I want this experience for
           | Linux, macOS and Windows users.
           | 
           | More important is making sandboxing really accessible to AI
           | devs with `msb server`.
        
         | nikolamus wrote:
         | Think I can build a notebook on top of this ? Jupyter client
         | has been a pain to manage
        
           | appcypher wrote:
           | Not sure what that entails. You can try and I can help along
           | the way
        
         | int_19h wrote:
         | This is very neat tech, but I think you might want to wait
         | until you actually have Windows covered before making claims
         | like
         | https://github.com/microsandbox/microsandbox/blob/main/MSB_V...
        
           | appcypher wrote:
           | What do you mean?
        
       | Tsarp wrote:
       | Wow. This looks awesome.
       | 
       | Can we build our own python sandbox using the sandboxfile spec?
       | This is if I want to add my own packages. Would this be just
       | having my own requirements file here -
       | https://github.com/microsandbox/microsandbox/blob/main/MSB_V...
        
         | appcypher wrote:
         | Thank you!
         | 
         | > Can we build our own python sandbox using the sandboxfile
         | spec?
         | 
         | Yes and I plan to make that work with the SDK.
         | 
         | PS: Multi-stage build is WIP.
        
           | Tsarp wrote:
           | Great will join the discord. Is this embeddable? Will it work
           | with a cross platform desktop app(Tauri)?
        
             | apitman wrote:
             | An embeddable library that lets you launch Linux VMs that
             | works across Windows, MacOS, and Linux hosts would be
             | incredible.
        
             | appcypher wrote:
             | If by embeddable, you mean having the vm run in the same
             | process, then no. The vm aborts its process when it's done
             | so it has to run as a separate process.
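The constraint can be illustrated with plain Python, under the assumption that "aborts its process" means the process exits without returning to the caller. The child below is an ordinary Python process standing in for the VM, not microsandbox itself, and the exit code 3 is arbitrary.

```python
import subprocess
import sys

# Child program that ends by terminating its whole process, the way the
# comment above describes the VM doing. The exit code 3 is arbitrary.
CHILD_CODE = "import os; print('guest work done', flush=True); os._exit(3)"


def run_isolated() -> subprocess.CompletedProcess:
    """Run the child in its own process so its exit cannot take us down."""
    return subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        capture_output=True,
        text=True,
    )


if __name__ == "__main__":
    result = run_isolated()
    # The host process is still alive and can inspect how the child ended.
    print(result.stdout.strip(), "| exit code:", result.returncode)
```

Had the same `os._exit` call run inside the host process, the host application would have terminated with it, which is why an in-process embedding does not work here.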
        
       | jauntywundrkind wrote:
       | Why not some of the existing microvm efforts?
       | 
       | Cloud Hypervisor and Firecracker both have an excellent
       | reputation for ultra lightweight VM's. Both are usable in the
       | very popular Kata Containers project (as well as other upstart
       | VM's Dragonball, & StratoVirt). In use by, for example, the CNCF
       | Confidential Containers https://github.com/kata-containers/kata-
       | containers/blob/main... https://confidentialcontainers.org/
       | 
       | There's also smaller efforts such as firecracker-containerd or
       | Virtink, both which bring OCI powered microvms into a Docker like
       | position (easy to slot into Kubernetes), via Firecracker and
       | Cloud Hypervisor respectively.
       | https://github.com/smartxworks/virtink
       | https://github.com/firecracker-microvm/firecracker-container...
       | 
       | Poking around under the hood, microsandbox appears to use krun.
       | There is krunvm for OCI support (includes MacOS/arm64 support!).
       | https://github.com/containers/krunvm https://github.com/slp/krun
       | 
       | The orientation as a safe sandbox for AI / MCP tools is a very
       | nicely packaged looking experience, and very well marketed.
       | Congratulations! I'm still not sure why this warrants being its
       | own project.
        
         | simonw wrote:
         | If we get enough of these sandboxes, maybe we will finally get
         | one that's easy for me to run on my own machines.
        
           | mike_hearn wrote:
           | Which platforms do you use?
        
             | simonw wrote:
             | macOS on my laptop, anything that runs in a container for
             | when I deploy things.
        
               | tough wrote:
               | I had luck using ALVM, which uses Apple's Hypervisor
               | framework, while exploring Linux micro-VMs on macOS fwiw
               | https://github.com/mathetake/alvm
        
               | simonw wrote:
               | That looks _really cool_ , but it's missing the one
               | feature I want most from anything that runs a sandbox (or
               | any security-related software): I need something which a
               | billion dollar company with a professional security team
               | is running in production on a daily basis.
               | 
               | So many of the solutions to this stuff I see come from a
               | GitHub repo with a few dozen commits and often a README
               | that says "do not rely on this software yet".
               | 
               | Definitely going to play with it a bit though, I love the
               | idea of hooking into Apple's Hypervisor.framework (which
               | absolutely fits my billion-dollar-company requirement.)
        
               | ericb wrote:
               | Working gVisor Mac install instructions here.
               | 
               | https://dev.to/rimelek/using-gvisors-container-runtime-
               | in-do...
               | 
               | After this is done, it is:
               | 
               | docker run --rm --runtime=runsc hello-world
        
               | mike_hearn wrote:
               | If you use macOS then it has a great sandboxing system
               | built in (albeit, undocumented). Anthropic are starting
               | to experiment with using it in Claude Code to eliminate
               | permission prompts. Claude can choose to run commands
               | inside the sandbox, in which case they execute
               | immediately.
               | 
               | I've thought about making one of these for other coding
               | agents. It's not quite as trivial as it looks and I know
               | how to do it, also on Windows, although it seems quite a
               | few coding agents just pretend Windows doesn't exist
               | unfortunately.
        
               | simonw wrote:
               | The lack of documentation for that system is so
               | frustrating! Security features are the one thing where
               | great documentation should be table stakes, otherwise we
               | are left just wildly guessing how to keep our system
               | secure!
               | 
               | I'm also disheartened by how the man pages for some of
               | the macOS sandboxing commands have declared them
               | deprecated for at least the last five years:
               | https://7402.org/blog/2020/macos-sandboxing-of-
               | folder.html
        
               | mike_hearn wrote:
               | It's an internal system that exposes implementation
               | details all over the place, so I understand why they do
               | it that way. You have to know a staggering amount about
               | the architecture of macOS to use it correctly. This isn't
               | a reasonable expectation to have of developers, hence why
               | the formal sandbox API is exposed via a set of
               | permissions you request and the low level SBPL is for
               | exceptions, sandboxing OS internals and various other
               | special cases.
               | 
               | Is AI a special case? Maybe! I have some ideas about how
               | to do AI sandboxing in a way that works more with the
               | grain of macOS, though god knows when I'll find the time
               | for it!
        
           | tough wrote:
           | would you be OK with a -hardened- with default profiles
           | docker containers one?
        
             | appcypher wrote:
             | I don't understand what you mean? Can you clarify?
        
               | tough wrote:
               | sorry i meant to ask simon directly if they require a
               | non-docker solution
               | 
               | im working on a wrapper that lets you swap runtimes and
               | my first implementation is mostly a wrapper around docker
               | containers
               | 
               | planning to add firecracker next
               | 
               | will explore adding microsandbox too cool stuff!
        
               | simonw wrote:
               | My ideal solution is non-Docker purely because I build
               | software for other people to use. I don't want to have to
               | tell my users "step 1: install Docker" if I can avoid it.
        
               | tough wrote:
               | that does make sense, sadly firecracker seems to be
               | mostly relegated to linux for now so there's no good
               | multi-arch story i'm aware of
        
           | appcypher wrote:
           | That's the plan lol. There is too much friction setting up
           | existing solutions.
        
           | hobofan wrote:
           | Exactly my thoughts when I read the headline, after having
           | read a similar one every few months.
           | 
           | However, by looking at it and playing with a few simple
           | examples, I think this is the one that looks the closest so
           | far.
           | 
           | Definitely interested to see the FS support, and also some
           | instruction on how to customize the images to e.g. pre-
           | install common Python packages or Rust crates. As an example,
           | I tried to use the MCP with some very typical use-cases for
           | code-execution that OpenAI/Anthropic models would generate
           | for data analysis, and they almost always include using numpy
           | or an Excel library, so you very quickly hit a wall here
           | without the ability to include libraries.
        
         | appcypher wrote:
         | Because those have different directions than microsandbox and
         | you've already mentioned one. I want easy secure sandboxes for
         | AI builders. IMHO, microsandbox is easier to get started with.
         | 
         | That said I don't think either Kata Containers or Cloud
         | Hypervisor has first-class support for macOS.
        
       | dataflow wrote:
       | Tangential question: why does it normally take so long to start
       | traditional VMs in the first place? At least on Windows, if you
       | start a traditional VM, it takes several seconds for it to start
       | running _anything_.
       | 
       | Edit: when I say _anything_ , I'm not talking user programs. I
       | mean as in, before even the first instruction of the firmware --
       | before even the virtual disk file is zeroed out, in cases where
       | it needs to be. You literally can't pause the VM during this
       | interval because the window hasn't even popped up yet, and even
       | when it has, you still can't for a while because it literally
       | hasn't started running _anything_. So the kernel and even
       | firmware initialization slowness are entirely irrelevant to my
       | question.
       | 
       | Why is that?
        
         | diggan wrote:
         | I mean it is basically booting a computer from scratch, kind of
         | makes sense. You have to allocate memory, start virtual CPUs,
         | initialize devices, run BIOS/UEFI checks, perform hardware
         | enumeration, all that jazz while emulating all of it, which
         | tends to be slower than "real" implementations. I guess there
         | are a bunch of processes for security as well, like zeroing
         | pages and similar things that take additional time.
         | 
         | If I let a VM use most of my hardware, it takes a few seconds
         | from start to login prompt, which is the same time it takes for
         | my Arch desktop to boot from pressing the button to seeing the
         | login prompt.
        
           | dataflow wrote:
           | > You have to allocate memory, start virtual CPUs, initialize
           | devices, run BIOS/UEFI checks, perform hardware enumeration,
           | all that jazz while emulating all of it, which tends to be
           | slower than "real" implementations.
           | 
           | That's not what I'm asking.
           | 
           | I'm saying it takes a long time for it to even execute a
           | single instruction, in the BIOS itself. Even for the window
           | to pop up, before you can even pause the VM (because it
           | hasn't even started yet). What you're describing comes after
           | all that, which I already understand and am not asking about.
        
             | drewg123 wrote:
             | Without any context in terms of what the VM is doing or
             | what VMM software you use, my best guess is that the OS/VMM
             | are pre-allocating memory for the VM. This might involve
             | paging out other processes' memory, which could take some
             | time.
             | 
             | I think task manager would tell you if there is a blip of
             | memory usage and paging activity at the time. And I'm sure
             | windows itself has profilers that can tell you what is
             | happening when the VM is started..
        
               | dataflow wrote:
               | VirtualBox on Windows, primarily. Though I feel like I
               | haven't seen other VMs in the past start up a whole ton
               | faster (maybe somewhat) (ignoring WSL2). Page files are
               | already disabled, there's plenty of free RAM, and it
               | makes no difference how little RAM the guest is allocated
               | (even if it's 256MB). So no, those are not the issues.
               | VirtualBox itself seems to be doing something slow during
               | that time and I don't know what that is.
        
               | mynameisvlad wrote:
               | So the issue is pretty clearly with VirtualBox itself,
               | but you are making it sound like it's an issue with VMs
               | on Windows or in general.
        
               | gopher_space wrote:
               | I remembered something about VirtualBox not playing
               | nicely with Hyper-V on Windows, and dug up a possibly
               | relevant post[0] on their forums. IIRC we ended up moving
               | a few build systems to Docker and dropping VirtualBox
               | because of hyper-v related issues, but it's been a few
               | years.
               | 
               | [0] https://forums.virtualbox.org/viewtopic.php?t=112113
        
               | dataflow wrote:
               | That's the unrelated green-turtle issue. It's only
               | relevant after the guest has actually started running
               | instructions. I'm talking about before that point.
        
               | gopher_space wrote:
               | I'm not aware of any turtles, that was just the first
               | thing I found when trying to see if VirtualBox and
               | Hyper-V were still a problematic combo.
               | 
               | Again, it was a few years ago, but we didn't solve the
               | problem or identify an actual root cause. We stopped
               | banging our heads against that particular wall and
               | switched technologies.
        
               | mgerdts wrote:
               | What is your definition of free memory? If the system has
               | read a lot of data, the page cache is probably occupying
               | most of the RAM you consider free. Look at cache and
               | standby counters.
               | 
               | I've noticed that windows can only evict data from the
               | page cache at about 5 GB/s. I do not know if this zeros
               | the memory or that would need to be done in the
               | allocation path.
               | 
               | A couple years ago I tracked down a long pause while
               | starting qemu on Linux to it zeroing the 100s of GB of
               | RAM given to the VM as 1 GB huge pages.
               | 
               | These may or may not be big contributors to what you are
               | seeing, depending on the VM's RAM size.
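A back-of-the-envelope version of that estimate, assuming the roughly 5 GB/s figure quoted above also applies to whatever eviction or zeroing work happens at VM start. That rate and the larger RAM sizes are assumptions for illustration, not measurements.

```python
def seconds_to_process(ram_gb: float, rate_gb_per_s: float = 5.0) -> float:
    """Time to evict or zero ram_gb of memory at a fixed rate."""
    return ram_gb / rate_gb_per_s


if __name__ == "__main__":
    # 0.25 GB matches the 256MB guest mentioned earlier in the thread;
    # the larger sizes are illustrative.
    for ram_gb in [0.25, 8, 200]:
        print(f"{ram_gb:>6} GB -> {seconds_to_process(ram_gb):.2f} s")
```

At this rate a 256MB guest costs about 0.05 s, so memory work alone would not explain a multi-second delay for small guests, while a guest with hundreds of GB could plausibly pause for tens of seconds.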
        
               | drewg123 wrote:
               | For some reason I can't reply to your reply. I'd strongly
               | suggest that you profile virtual box. It beats
               | speculation..
        
               | HumanOstrich wrote:
               | I experienced something similar back when Microsoft
               | decided to usurp all hypervisors made for Windows and
               | make Windows itself run as a VM on Hyper-V running as a
               | Type 1 hypervisor on the hardware. That made it so other
               | VMs could only run on Hyper-V alongside Windows or with
               | nested virtualization.
               | 
               | So this meant VMWare, VirtualBox, etc as they were would
               | no longer work on Windows. Microsoft required all of them
               | to switch to using Hyper-V libs behind the scenes to
               | launch Hyper-V VMs and then present them as their own
               | (while hiding them from the Hyper-V UI).
               | 
               | VirtualBox was slow, hot garbage on its own before this
               | happened, but now it's even worse. They didn't optimize
               | their Hyper-V integration as well as VMWare (eventually)
               | did. VMWare is still worse off than it was though since
               | it has to inherit all of Hyper-V's problems behind the
               | scenes.
               | 
               | Hope this brings some clarity.
        
             | bityard wrote:
             | In defense of the replies, your initial question was very
             | vague and left people to assume you meant the obvious
             | thing.
        
               | dataflow wrote:
               | Sure, that's why I clarified.
        
         | jeroenhd wrote:
         | You can optimize a lot to start a Linux kernel in under a
         | second, but if you're using a standard kernel, there are all
         | manner of timeouts and poll attempts that make the kernel
         | waste time booting. There's also a non-trivial amount of time
         | the VM spends in the UEFI/CSM system preparing the virtual
         | hardware and initializing the system environment for your
         | bootloader. I'm pretty sure WSL2 uses a special kernel to avoid
         | the unnecessary overhead.
         | 
         | You also need to start OS services, configure filesystems,
         | prepare caches, configure networking, and so on. If you're not
         | booting UKIs or similar tools, you'll also be loading a
         | bootloader, then loading an initramfs into memory, then loading
         | the main OS and starting the services you actually need, with
         | each step requiring certain daemons and hardware probes to work
         | correctly.
         | 
         | There are tools to fix this problem. Amazon's Firecracker can
         | start a Linux VM in a time similar to that of a container
         | (milliseconds) by basically storing the initialized state of
         | the VM and loading that into memory instead of actually
         | performing a real boot. https://firecracker-microvm.github.io/
         | 
         | On Windows, I think it depends on the hypervisor you use.
         | Hyper-V has a pretty slow UEFI environment, its hard disk access
         | always seems rather slow to me, and most Linux distros don't
         | seem to package dedicated minimal kernels for it.
        
           | dataflow wrote:
           | That's not what I'm asking about.
           | 
           | I'm saying it takes a long time for it to even execute a
           | single instruction, in the BIOS itself. Even for the window
           | to pop up, before you can even pause the VM (because it
           | hasn't even started yet). What you're describing comes after
           | all that, which I already understand and am not asking about.
        
             | hnuser123456 wrote:
             | probably the intel ME setting up for virtualization in a
             | way that it can infiltrate
        
               | LoganDark wrote:
               | Ah yes, the source of all slowness in the CPU: hostile
               | backdoors taking their time to compromise the work.
               | Classic...
        
             | bonki wrote:
             | I have always wondered the same, never tried looking into
             | it but I wouldn't be surprised if Defender at least played
             | a part in it. Defender is a huge source for general
             | slowness on Windows from my experience.
        
             | zbentley wrote:
             | Unsubstantiated hunch: the hypervisor is doing a shitload
             | of probes against the host system before
             | allocating/configuring virtual hardware devices/behaviors.
             | Since the host's hardware/driver/kernel situation can
             | change between hypervisor invocations, it might have to re-
             | answer a ton of questions about the host environment in
             | order to provide things like "the VM/host USB bridge uses
             | so-and-so optimized host kernel/driver functionality to
             | speed up accesses to a VM-attached USB device". Between
             | running such checks for all behaviors the VM needs, and the
             | possibility that wasteful checks (e.g. for rare VM
             | behaviors or virtual hardware that's not in use) are also
             | performed, that could take some time.
             | 
             | On the other hand, it could just as easily be something
             | simple, like setting up hugepages or checksumming virtual
             | hard disk image files.
             | 
             | Both are total guesses, though. Could be anything!
        
         | orev wrote:
         | I think you need to provide more details on what VM software
         | you're using. On VirtualBox what you describe is very
         | noticeable, and it didn't have that delay in older versions. So
         | it could be just an issue with that VM software and not a
         | general "traditional VMs" issue.
        
           | dataflow wrote:
           | Yup I'm asking about VirtualBox mainly, I just don't
           | understand what the heck it's doing during that time that
           | takes so long. Although I don't recall other VMs (like say,
           | Hyper-V) being dramatically different either (ignoring WSL2
           | here).
        
             | _factor wrote:
             | Try disabling Windows Defender and trying again.
        
               | dataflow wrote:
               | Are you just guessing or have you actually seen the delay
               | I'm talking about disappear as a result of this (or as a
               | result of anything else for that matter)? Because I've
               | already done this (yes, entirely, even the kernel mode
               | drivers) and it's definitely not the issue.
        
               | hinkley wrote:
               | There was a release of subversion back in the day that
               | reduced the number of files that were opened during a
               | repo action like pull, and the number of times any one
               | file got opened. On Linux it ran about 2-3x faster. Very
               | nice change.
               | 
               | On Windows it was almost 10x faster. On the project where
               | this change was released, my morning ritual was to come
               | in, log on, run an svn pull command, lock my screen and
               | go get coffee. I had at least ten minutes to kill after I
               | got coffee, if the pot wasn't empty when I got there.
               | 
               | Windows is hot garbage about fopen particularly when
               | virus scanning is on.
        
             | icedchai wrote:
             | Linux KVM/qemu VMs start pretty fast.
        
         | speed_spread wrote:
         | Creating the VM itself is fast. It depends on what you run in
         | it. Unikernel VMs can start in a few milliseconds. For example,
         | check out OSv.
        
           | dataflow wrote:
           | You're saying this is true on a _Windows_ host?
        
             | akdev1l wrote:
             | Yes. The delay you're complaining about happens because you
             | are looking at general hypervisors which also come with
             | virtualized hardware and need to mimic a bunch of stuff so
             | that most software will work as usual.
             | 
             | For example: your VM starts up with the CPU in 16 bit mode
             | because that's just how things work in x86 and then it
             | waits for the guest OS to set the CPU into 64 bit mode.
             | 
             | This is completely unnecessary if you just want to run
             | x86-64 code in a virtualized environment and you control
             | the guest kernel and can just assume things are in 64bit
             | mode because it's not the 70s or whatever
             | 
             | The guest OS would also need to probe a few ports to get a
             | bootable disk. If you control the kernel then you can just
             | not do that and boot directly.
             | 
             | There's a ton of stuff that isn't needed
        
               | dataflow wrote:
               | The 16 bit mode stuff and the guest OS probes are after
               | what I'm asking, not before.
        
               | akdev1l wrote:
               | No it is not. The "first instruction in the BIOS" is 16
               | bit mode code when dealing with an x86 VM.
               | 
               | A virtual environment doesn't even really need any BIOS
               | or anything like that.
               | 
               | You can feel free to test with qemu direct kernel booting
               | to see this skips a lot of delay without even having to
               | use a specialized hypervisor like firecracker
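
A minimal sketch of the direct kernel boot mentioned above, assuming a
qemu-system-x86_64 binary on the PATH. The file names vmlinuz and
initrd.img are placeholders, not paths from this thread. With -kernel,
qemu loads the kernel image itself instead of going through a bootloader
on a disk image. The helper only builds the command line; it does not
launch anything.

```python
import shlex


def qemu_direct_boot_cmd(kernel, initrd=None, cmdline="console=ttyS0",
                         memory_mb=512, use_kvm=False):
    """Build a qemu argv that boots a kernel image directly."""
    cmd = ["qemu-system-x86_64", "-m", str(memory_mb), "-nographic",
           "-kernel", kernel, "-append", cmdline]
    if initrd is not None:
        cmd += ["-initrd", initrd]
    if use_kvm:
        # Only meaningful on a Linux host with KVM available.
        cmd.append("-enable-kvm")
    return cmd


# Placeholder file names; substitute a real kernel and initramfs.
cmd = qemu_direct_boot_cmd("vmlinuz", initrd="initrd.img")
print(shlex.join(cmd))
```

Timing this against a normal disk-image boot of the same guest is one way
to see how much of the delay comes from firmware and bootloader stages.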
        
               | speed_spread wrote:
               | A bare VM may not have a BIOS, it's just partitioning
               | supported by the host CPU and OS. The emulation of the
               | legacy PC hardware stack for conventional OS
               | compatibility is a separate thing. If the guest OS is
               | custom-designed to launch in a bare VM with known
               | topology it can boot very, very fast.
        
         | dist-epoch wrote:
         | Sounds like a VirtualBox problem.
         | 
         | I'm using Hyper-V and I can connect through XRDP to a GUI
         | Ubuntu 22 in 10 seconds and I can SSH into a Ubuntu 22 server
         | in 3 seconds after start.
        
         | akdev1l wrote:
         | The answer is that it doesn't have to be like that.
         | 
         | In practice virtual machines are trying to emulate a lot of
         | stuff that isn't really needed but they're doing it for
         | compatibility.
         | 
         | If one builds a hypervisor which is optimized for startup speed
         | and doesn't need to support generalized legacy software then
         | you can:
         | 
         | > Unlike traditional VMs that might take several seconds to
         | start, Firecracker VMs can boot up in as little as 125ms.
        
         | jiggawatts wrote:
         | Try Windows Server Core on an SSD. I've seen VMs launch in low
         | single-digit seconds. You can strip it down even further by
         | removing non-64-bit support, Defender, etc...
        
         | BobbyTables2 wrote:
         | In Linux, VM memory allocations can be slow if it tries to
         | allocate GBs of RAM using 4K pages. There are ways to help it
         | allocate 1GB at a time which vastly speeds it up.
         | 
         | Windows probably has an equivalent.
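
To put rough numbers on the page-size point above, here is a small
arithmetic sketch. The 100 GiB guest size is an illustrative assumption,
and the code only counts pages; it says nothing about how long any
particular kernel takes to allocate or zero them.

```python
KIB = 1024
GIB = 1024 ** 3


def pages_needed(ram_bytes, page_bytes):
    """Number of pages required to back ram_bytes, rounded up."""
    return -(-ram_bytes // page_bytes)


ram = 100 * GIB  # hypothetical guest RAM size
small_pages = pages_needed(ram, 4 * KIB)
huge_pages = pages_needed(ram, 1 * GIB)

print(f"4 KiB pages: {small_pages:,}")
print(f"1 GiB pages: {huge_pages:,}")
```

The same guest needs millions of 4 KiB pages but only a hundred 1 GiB
pages, which is why the page size can matter for allocation time.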
        
           | pdimitar wrote:
           | Is this specifically for during boot time? Also, any links?
        
       | Jayakumark wrote:
       | Windows support? And can we VNC into the sandbox and stream it?
        
         | appcypher wrote:
         | Windows support is a work in progress. I haven't tested using
         | VNC yet but it should be possible.
        
       | h1fra wrote:
       | Can't wait to test, if it's really what's advertised it would be
       | much easier to use than workerd or firecracker
        
       | McAlpine5892 wrote:
       | This looks awesome. The amount of super lightweight and almost-
       | disposable VM options in recent years is crazy. I remember when
       | VMs were slow, clunky, and generally painful.
       | 
       | I wonder how this compares to Orbstack's [0] tech stack on macOS,
       | specifically the "Linux machines" [1] feature. Seems like Orb
       | might reuse a single VM?
       | 
       | ---
       | 
       | [0] https://orbstack.dev
       | 
       | [1] https://docs.orbstack.dev/machines/
        
       | jbverschoor wrote:
       | Related, https://github.com/jrz/container-shell which uses docker
       | to create adhoc shells / chroots in the current directory.
        
       | manveru wrote:
       | Are the SDKs AI generated? I looked at the Crystal, Ruby, and Zig
       | ones and all they contain is a hello world example with some docs
       | that have little to do with the code. Sorry if this comment seems
       | rude, just curious.
        
         | appcypher wrote:
         | The other SDKs are generated hello-worlds at the moment. I will
         | get to them one by one, but I welcome and appreciate any
         | contributions to them.
        
       | jmehman wrote:
       | I've been looking for something I could host for this kind of
       | thing - for LLM agents. Ended up on https://www.daytona.io/ as I
       | couldn't find anything suitable to self host and realised it was
       | a complex thing to manage. It seems Daytona is open source,
       | including the server platform, but there is no documentation for
       | the server element. Azure also seem to offer a service for this,
       | it's a space that is growing rapidly.
        
         | appcypher wrote:
         | Microsandbox is for people that would like to maintain their
         | own infra. I'm not going to stop trying to make it better to
         | self-host.
        
           | jmehman wrote:
           | Yeah, it looks great, makes me reconsider the self hosted
           | route
        
       | patrick4urcloud wrote:
       | Very nice! I will definitely try.
        
       | ATechGuy wrote:
       | Congrats on launching! Booting VMs in milliseconds is certainly
       | important, but it can also be achieved with
       | CloudHypervisor/Firecracker. Where Containers beat VMs is runtime
       | perf. The overhead in case of VMs stems from emulation of IO
       | devices. I believe the overhead will become noticeable for AI
       | agentic use cases. Any plans to address perf issues?
        
         | appcypher wrote:
         | You are right. We leverage libkrun. Libkrun uses virtio-mmio
         | transport for block, vsock and virtio-fs to keep overhead
         | minimal so we basically depend on any perf improvement made
         | upstream.
         | 
         | Firecracker is no different btw and E2B uses that for agentic
         | AI workloads. Anyway, I don't have any major plan except fix
         | some issues with the filesystem rn.
        
       | SwiftyBug wrote:
       | Kind of almost off-topic: I'm working on a project where I must
       | run possibly untrusted JavaScript code. I want to run it in an
       | isolated environment. This looks like a very nice solution as I
       | could spin up a microsandbox and securely run the code. I could
       | even have a pool of live sandboxes so I wouldn't even experience
       | the 200ms starts. Because this is OCI-compatible, I could even
       | provide a whole sandboxed environment on which to run that code.
       | Would that be a good use case for this? Are there better
       | alternatives?
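
The pool idea above could be sketched roughly like this. It is not the
microsandbox SDK: WarmPool, top_up and fake_start_sandbox are hypothetical
names, and the factory is a stand-in for whatever call actually starts a
sandbox. Sandboxes are treated as single-use, since they run untrusted
code, so nothing is returned to the pool.

```python
import itertools
import queue


class WarmPool:
    """Keep a few pre-started sandboxes ready so callers skip startup."""

    def __init__(self, factory, size):
        self._factory = factory
        self._size = size
        self._idle = queue.Queue()
        self.top_up()

    def top_up(self):
        """Start sandboxes until the pool is back at its target size."""
        while self._idle.qsize() < self._size:
            self._idle.put(self._factory())

    def acquire(self):
        """Hand out a warm sandbox, or start one if the pool is empty."""
        try:
            return self._idle.get_nowait()
        except queue.Empty:
            # Pool exhausted: the caller pays the startup cost this time.
            return self._factory()

    def idle_count(self):
        return self._idle.qsize()


# Stand-in factory; a real one would start a sandbox and return a handle.
_ids = itertools.count(1)


def fake_start_sandbox():
    return f"sandbox-{next(_ids)}"


pool = WarmPool(fake_start_sandbox, size=2)
first = pool.acquire()
pool.top_up()  # call this off the hot path, e.g. from a background task
print(first, pool.idle_count())
```

Calling top_up() outside the request path is the design choice that keeps
the startup latency away from the caller.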
        
         | appcypher wrote:
         | > Would that be a good use case for this?
         | 
         | That is an ideal use case
         | 
         | > Are there better alternatives?
         | 
         | Created microsandbox because I didn't find any
        
           | SwiftyBug wrote:
           | Awesome. This is really good timing. I'm going to give it a
           | try.
        
         | ericb wrote:
         | runsc / gVisor is interesting also as the runsc engine can be
         | run from within Docker/Docker Desktop.
         | 
         | gVisor has performance problems, though. Their data shows 1/3rd
         | the throughput vs. docker runtime for concurrent network calls
         | --if that's an issue for your use-case.
        
         | apitman wrote:
         | You might be able to get away with running QuickJS compiled to
         | WebAssembly: https://til.simonwillison.net/npm/self-hosted-
         | quickjs
        
         | arjunbajaj wrote:
         | I recommend trying Javy[0]. Javy allows you to build a WASM
         | file that includes Javy's JS interpreter along with your JS
         | source code. Note that Javy is a heavily sandboxed environment
         | so it doesn't have access to the internet, or npm modules, a
         | desirable feature for running user code.
         | 
         | We're building an IoT Cloud Platform, Fostrom[1] where we're
         | using Javy to power our Actions infrastructure. But instead of
         | compiling each Action's JS code to a Javy WASM module, I
         | figured out a simpler way by creating a single WASM module with
         | our wrapper code (which contains some further isolation and
         | helpful functions), and we provide the user code as an input
         | while executing the single pre-compiled WASM module.
         | 
         | [0] https://github.com/bytecodealliance/javy
         | 
         | [1] https://fostrom.io
        
       | hinkley wrote:
       | How's performance? What's the overhead versus docker? Terraform
       | or Pulumi integration on the horizon?
        
         | appcypher wrote:
         | Wow. Just seeing this. I've not done proper benchmarking yet
         | but rn we are lagging behind in file I/O for the OverlayFS impl
        
           | hinkley wrote:
           | There was a period where NFS was faster, particularly on
           | windows and OSX where you were paying a double indirection.
           | 
           | Overlays are always tough because docker doesn't like you
           | writing to the filesystem in the first place. The weapon of
           | first resort is deflection; tell them not to do it.
           | 
           | I had to put up with an old docker version that leaked
           | overlay data for quite a while before we moved off prem.
        
       | elwebmaster wrote:
       | One topic I am not finding anything about is networking. Can
       | these microsandbox instances listen on ports? How is the port
       | forwarding configured? Can they access the internet or any
       | resources on the host?
        
         | appcypher wrote:
         | They can. I need to improve the doc. Working on that right now
        
       | zackmorris wrote:
       | This is great!
       | 
       | I'd like to see a formal container security grade that works
       | like:
       | 
       |   1) Curate a list of all known (container) exploits
       |   2) Run each exploit in environments of increasing security
       |      like permissions-based, jail, Docker and emulator
       |   3) The percentage of prevented exploits would be the score
       |      from 0-100%
       | 
       | Under this scheme, I'd expect naive attempts at containerization
       | with permissions and jails to score around 0%, while Docker might
       | be above 50% and Microsandbox could potentially reach 100%.
       | 
       | This might satisfy some of our intuition around questions like
       | "why not just use a jail?". Also the containers could run on a
       | site on the open web as honeypots with cash or crypto prizes for
       | pwning them to "prove" which containers achieve 100%.
       | 
       | We might also need to redefine what "secure" means, since
       | exploits like Rowhammer and Spectre may make nearly all
       | conventional and cloud computing insecure. Or maybe it's a moving
       | target, like how 64 bit encryption might have once been
       | considered secure but now we need 128 bit or higher.
       | 
       | Edit: the motivation behind this would be to find a container
       | that's 100% secure without emulation, for performance and cost-
       | savings benefits, as well as gaining insights into how to secure
       | operating systems by containerizing their various services.
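
The scoring step of the scheme above can be sketched as follows. The
exploit names and outcomes are made up for illustration, not measured
results, and security_score is a hypothetical helper name.

```python
def security_score(results):
    """Percentage of exploits that the environment prevented.

    results maps an exploit name to True if it was blocked.
    """
    if not results:
        raise ValueError("need at least one exploit result")
    prevented = sum(1 for blocked in results.values() if blocked)
    return 100.0 * prevented / len(results)


# Hypothetical outcomes for one environment under test.
example_results = {
    "exploit-a": True,
    "exploit-b": False,
    "exploit-c": True,
    "exploit-d": True,
}
score = security_score(example_results)
print(f"{score:.0f}%")
```

Running the same exploit list against each environment would give one
comparable number per environment.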
        
         | bjackman wrote:
         | You cannot build a secure container runtime (against malicious
         | containers) because underlying it is the Linux kernel.
         | 
         | The only way to make Linux containers a meaningful sandbox is
         | to drastically restrict the syscall API surface available to
         | the sandboxee, which quickly reduces its value. It's no longer
         | a "generic platform that you can throw any workload onto" but
         | instead a bespoke thing that needs to be tuned and reconfigured
         | for every usecase.
         | 
         | This is why you need virtualization. Until we have a properly
         | hardened and memory safe OS, it's the only way. And if we do
         | build such an OS it's unclear to me whether it will be faster
         | than running MicroVMs on a Linux host.
        
           | Veserv wrote:
           | You cannot build a secure virtualization runtime because
           | underlying it is the VMM. Until you have a secure VMM you are
           | subject to precisely the same class of problems plaguing
           | container runtimes.
           | 
           | The only meaningful difference is that Linux containers
           | target partitioning Linux kernel services which is a shared-
           | by-default/default-allow environment that was never designed
           | for and has never achieved meaningful security. The number of
           | vulnerabilities resulting from, "whoopsie, we forgot to
           | partition shared service 123" would be hilarious if it were
           | not a complete lapse of security engineering in a product
           | people are convinced is adequate for security-critical
           | applications.
           | 
           | Present a vulnerability assessment demonstrating a team of 10
           | with 3 years time (~10-30 M$, comparable to many
           | commercially-motivated single-victim attacks these days) can
           | find no vulnerabilities in your deployment or a formal proof
           | of security and correctness otherwise we should stick with
           | the default assumption that software is easily hacked instead
           | of the extraordinary claim that demands extraordinary
           | evidence.
        
             | nyrikki wrote:
             | While VMs do have an attack surface, it is vastly different
             | than containers, which as you pointed out are not really a
             | security system, but simply namespaces.
             | 
             | Seccomp, capabilities, SELinux, AppArmor, etc. can help
             | harden containers, but most of the popular containers don't
             | even drop root for services, and I was one of the people
             | who tried to even get Docker/Moby etc. to let you disable
             | the privileged flag...which they refused to do.
             | 
             | While some CRIs make this easier, any agent that can spin
             | up a container should be considered a super user.
             | 
             | With the docker --privileged flag I could read the host's
             | root volume or even install efi bios files just using mknod
             | etc, walking /sys to find the major/minor numbers.
             | 
             | Namespaces are useful in a comprehensive security plan, but
             | as you mentioned, they are not jails.
             | 
             | It is true that both VMs and containers have attack
             | surfaces, but the size of the attack surface on containers
             | is much larger.
        
             | transpute wrote:
             | _> You cannot build a secure virtualization runtime because
             | underlying it is the VMM_
             | 
             | There are VMMs (e.g. pKVM in upstream Linux) with small
             | SLoC that are isolated by silicon support for nested
             | virtualization. This can be found on recent Google Pixel
             | phones/tablets with strong isolation of untrusted Debian
             | Arm Linux "Terminal" VM.
             | 
             | A similar architecture was shipped a decade ago by Bromium
             | and now on millions of HP business laptops, including
             | hypervisor isolation of firmware, _" Hypervisor Security :
             | Lessons Learned -- Ian Pratt, Bromium -- Platform Security
             | Summit 2018_", https://www.youtube.com/watch?v=bNVe2y34dnM
             | 
             | Christian Slater, HP cybersecurity ("Wolf") edutainment on
             | nested virt hypervisor in printers,
             | https://www.youtube.com/watch?v=DjMSq3n3Gqs
        
               | delusional wrote:
               | > silicon support for nested virtualization
               | 
               | Is there any guarantee that this "silicon support" is any
               | safer than the software? Once we break the software
               | abstraction down far enough it's all just configuring
               | hardware. Conversely, once you start baking significant
               | complexity into hardware (such as strong security
               | boundaries) it would seem like hardware would be subject
               | to exactly the same bugs as software would, except it
               | will be hard to update of course.
        
               | transpute wrote:
               | _> Is there any guarantee that this  "silicon support" is
               | any safer than the software?_
               | 
               | Safety and security claims are only meaningful in the
               | context of threat models. As described in the Xen/uXen/AX
               | video, pKVM and AWS Nitro security talks, one goal is to
               | reduce the size, function and complexity of open-source
               | code running at the highest processor privilege levels
               | [1], minimizing dependency on closed
               | firmware/SMM/TrustZone. Nitro moved some functions (e.g.
               | I/O virtualization) to separate processors, e.g.
               | SmartNIC/DPU. Apple used an Arm T2 secure enclave
               | processor for encryption and some I/O paths, when their
               | main processor was still x86. OCP Caliptra RoT requires
               | OSS firmware signed by both the OEM and hyperscaler
               | customer. It's a never-ending process of reducing attack
               | surface, prioritized by business context.
               | 
               |  _> hardware would be subject to exactly the same bugs as
               | software would, except it will be hard to update of
               | course_
               | 
               | Some "hardware" functions can be updated via microcode,
               | which has been used to mitigate speculative execution
               | vulnerabilities, at the cost of performance.
               | 
               | [1] https://en.wikipedia.org/wiki/Protection_ring
               | 
               | [2] https://en.wikipedia.org/wiki/Transient_execution_CPU
               | _vulner...
        
             | bjackman wrote:
             | I see your point but even if your VMM is a zillion lines of
             | C++ with emulated devices there are opportunities to secure
             | it that don't exist with a shared-monolithic-kernel
             | container runtime.
             | 
             | You can create security boundaries around (and even
             | within!) the VMM. You can make it so an escape into the VMM
             | process has only minimal value, by sandboxing the VMM
             | aggressively.
             | 
             | Plus you can absolutely escape the model of C++ emulating
             | devices. Ideally I think VMMs should do almost nothing but
             | manage VF passthroughs. Of course then we shift a lot of
             | the problem onto the inevitably completely broken device
             | firmware but again there are more ways to mitigate that
             | than kernel bugs.
        
               | delusional wrote:
               | Could you elaborate on how you could secure those
               | architectures better? It's unclear to me how being in
               | device firmware or being a VMM provides you with any
               | further abilities. Surely you still have the same
               | fundamental problem of being a shared resource.
               | 
               | Intuitively there are differences. The Linux kernel is
               | fucking huge, and anything that could bake the "shared
               | resources" down to less than the entire kernel would be
               | easier to verify, but that would also be true for an
               | entirely software based abstraction inside the kernel.
               | 
               | In a way it's the whole micro kernel discussion again.
        
               | bjackman wrote:
               | When you escape a container generally you can do whatever
               | the kernel can do. There is no further security boundary.
               | 
               | If you escape into a VMM you can do whatever the VMM can
               | do. You can build a system where it can not do very much
               | more than the VM guest itself. By the time the guest
               | boots the process containing the vCPU threads has already
               | lost all its interesting privileges and has no
               | credentials of value.
               | 
               | Similar with device passthrough. It's not very
               | interesting if the device you're passing through
               | ultimately has unchecked access to PCIe but if you have a
               | proper IOMMU set up it should be possible to have a
               | system where pwning the device firmware is just a small
               | step rather than an immediate escalation to root-
               | equivalent. (I should say, I don't know if this system
               | actually exists today, I just know it's possible).
               | 
               | With a VMM escape your next step is usually to exploit
               | the kernel. But if you sandbox the VMM properly there is
               | very limited kernel attack surface available to it.
               | 
               | So yeah you're right it's similar to the microkernel
               | discussion. You could develop these properties for a
               | shared-kernel container runtime... By making it a
               | microkernel.
               | 
               | It's just that isn't a path with any next steps in the
               | real world. The road from Docker to a secure VM platform
               | is rich with reasonable incremental steps forward
               | (virtualization is an essential step but it's still just
               | one of many). The road from Docker to a microkernel is...
               | Rewrite your entire platform and every workload!
        
               | delusional wrote:
               | > It's just that isn't a path with any next steps in the
               | real world.
               | 
               | It appears we find ourselves at the Theory/Praxis
               | intersection once again.
               | 
               | > The road from Docker to a secure VM platform is rich
               | with reasonable incremental steps forward
               | 
               | The reason it seems so reasonable is that it's well
               | trodden. There were an infinity of VM platforms before
               | Docker, and they were all discarded for pretty well known
               | engineering reasons mostly to do with performance, but
               | also for being difficult for developers to reason about.
               | I have no doubt that there's still dialogue worth having
               | between those two approaches, but cgroups isn't a
               | "failed" VM security boundary any more than Linux is a
               | failed microkernel. It never aimed to be a VM-like
               | security boundary.
        
           | akdev1l wrote:
           | One can definitely build a container runtime that uses
           | virtualization to protect the host
           | 
           | For example there is Kata containers
           | 
           | https://katacontainers.io/
           | 
           | This can be used with regular `podman` by just changing the
           | container runtime, so there's not even a need for any extra
           | tooling
           | 
           | In theory you could shove the container runtime into
           | something like k8s
        
             | bjackman wrote:
             | > container runtime that uses virtualization to protect the
             | host
             | 
             | True, by "container" I really meant "shared-kernel
             | container".
             | 
             | > In theory you could shove the container runtime into
             | something like k8s
             | 
             | Yeah this is actually supported by k8s.
             | 
             | Whether that means it's actually reasonable to run
             | completely untrusted workloads on your own cluster is
             | another question. But it definitely seems like a really
             | good defense-in-depth feature.
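
The Kubernetes support mentioned above is exposed through the RuntimeClass resource. The following is a minimal sketch, not anything posted in the thread: it builds the two manifests as Python dicts and prints them as JSON. The handler name "kata", the pod name, and the image are assumptions; the real handler name depends on how the node's container runtime is configured.

```python
import json

# Assumption: the node's container runtime has a handler named "kata"
# configured. The handler name is cluster-specific.
HANDLER = "kata"


def runtime_class(name: str, handler: str) -> dict:
    """Build a RuntimeClass manifest mapping a name to a runtime handler."""
    return {
        "apiVersion": "node.k8s.io/v1",
        "kind": "RuntimeClass",
        "metadata": {"name": name},
        "handler": handler,
    }


def sandboxed_pod(name: str, image: str, runtime_class_name: str) -> dict:
    """Build a Pod manifest that opts in to the given RuntimeClass."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "runtimeClassName": runtime_class_name,
            "containers": [{"name": name, "image": image}],
        },
    }


if __name__ == "__main__":
    # Print both manifests for inspection.
    print(json.dumps(runtime_class("kata", HANDLER), indent=2))
    print(json.dumps(sandboxed_pod("untrusted", "alpine:3.20", "kata"), indent=2))
```

Pods that omit `runtimeClassName` keep using the cluster's default runtime, so the VM-backed runtime can be reserved for the workloads that need it.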
        
           | ignoramous wrote:
           | > _... drastically restrict the syscall API surface available
           | to the sandboxee, which quickly reduces its value ..._
           | 
           | Depends I guess as Android has had quite a bit of success
           | with seccomp-bpf & Android-specific flavour of SELinux [0]
           | 
           | > _Until we have a properly hardened and memory safe OS ...
           | faster than running MicroVMs on a Linux host._
           | 
           | Andy Tanenbaum might say, Micro Kernels would do just as
           | well.
           | 
           | [0] https://youtu.be/WxbOq8IGEiE
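
To make the "restrict the syscall surface" idea concrete, here is a rough sketch that generates a deny-by-default seccomp profile in the JSON format Docker accepts. The syscall list is an illustrative assumption and is far too small for a real workload; it is not Android's policy.

```python
import json


def allowlist_profile(allowed_syscalls):
    """Return a seccomp profile that fails every syscall except the given ones."""
    return {
        # Any syscall not listed below returns an error to the caller.
        "defaultAction": "SCMP_ACT_ERRNO",
        "syscalls": [
            {"names": sorted(set(allowed_syscalls)), "action": "SCMP_ACT_ALLOW"}
        ],
    }


# Illustrative only: a real program needs many more syscalls than this.
EXAMPLE_ALLOWED = ["read", "write", "exit", "exit_group", "brk", "mmap", "munmap"]

if __name__ == "__main__":
    print(json.dumps(allowlist_profile(EXAMPLE_ALLOWED), indent=2))
```

A profile like this can be saved to a file and passed with `docker run --security-opt seccomp=profile.json ...`. The shorter the allowlist, the fewer unmodified programs will run under it, which is the trade-off being debated here.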
        
             | carlhjerpe wrote:
             | You also have gVisor, which runs all syscalls through some
             | Go history that's supposedly safe enough for Google.
        
               | bjackman wrote:
               | gVisor uses virtualization
        
             | bjackman wrote:
             | > Android
             | 
             | Exactly. Android pulls this off by being extremely
             | constrained. It's dramatically less flexible than an OCI
             | runtime. If you wanna run a random unenlightened workload
             | on it you're probably gonna have a hard time.
             | 
             | > Micro Kernels would do just as well.
             | 
             | Yea this goes in the right direction. In the end a lot of
             | kernel work I look at is basically about trying to retrofit
             | benefits of microkernels onto Linux.
             | 
             | Saying "we should just use an actual microkernel" is a bit
             | like "Russia and Ukraine should just make peace" IMO
             | though.
        
         | tptacek wrote:
         | The issue, at least with multitenant workloads, isn't
         | "container vulnerabilities" as such; it's that standard
         | containers are premised on sharing a kernel, which makes every
         | kernel LPE a potential container escape --- there's a long
         | history of those bugs, and they're only rarely flagged as
         | "container escapes"; it's just sort of understood that a kernel
         | LPE is going to break containers.
        
           | delusional wrote:
           | > it's just sort of understood that a kernel LPE is going to
           | break containers.
           | 
           | I think it's generally understood that any sort of kernel LPE
           | can potentially (and therefore is generally considered to)
           | lead to breaking all security boundaries on the local
           | machine, since the kernel contains no internal security
           | boundaries. That includes not only containers, but also
           | everything else, such as user separation, hardware
           | virtualization controlled by the local kernel, and kernel
           | private secrets.
        
             | transpute wrote:
             | _> hardware virtualization controlled by the local kernel_
             | 
             | In some architectures, kernel LPE does not break platform
             | (L0/EL2) virtualization,
             | https://news.ycombinator.com/item?id=44141164
             | 
             |     L0/EL2  L1/EL1
             |     pKVM    KVM
             |     AX      Hyper-V / Xen / ESX
        
               | tptacek wrote:
               | Most Linux kernel LPEs --- in fact, the overwhelming
               | majority of them --- don't threaten KVM hosts when
               | exploited in KVM guests.
        
               | billywhizz wrote:
               | is there anything good written up on this?
        
             | zrm wrote:
             | A large proportion of LPE vulnerabilities are in the nature
             | of "perform a syscall to pass specially crafted data to the
             | kernel and trigger a kernel bug". For containers, the
             | kernel is the host kernel and now the host is compromised.
             | For VMs, the kernel is the guest kernel and now the guest
             | is compromised, but not the host. That's a much narrower
             | compromise and in security models where root on the guest
             | is already expected to be attacker-controlled, isn't even a
             | vulnerability.
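
The argument above can be written down as a toy model. This is only an illustration of the reasoning, with hypothetical names throughout; it says nothing about how likely either class of bug is.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sandbox:
    name: str
    # Which kernel services the workload's syscalls: "host" or "guest".
    syscall_kernel: str


def blast_radius(sandbox: Sandbox, bug: str) -> str:
    """Return what an attacker controls after exploiting the given bug class."""
    if bug == "kernel_lpe":
        # A syscall-triggered kernel bug compromises whichever kernel
        # handled the syscall.
        return sandbox.syscall_kernel
    if bug == "vmm_escape":
        # Only meaningful when there is a VMM between workload and host.
        return "host" if sandbox.syscall_kernel == "guest" else "not applicable"
    raise ValueError(f"unknown bug class: {bug}")


container = Sandbox("container", syscall_kernel="host")
microvm = Sandbox("microvm", syscall_kernel="guest")

if __name__ == "__main__":
    for sandbox in (container, microvm):
        for bug in ("kernel_lpe", "vmm_escape"):
            print(f"{sandbox.name:9} {bug:10} -> {blast_radius(sandbox, bug)}")
```

In this model a kernel LPE reaches the host only for the shared-kernel container; for the microVM the attacker still needs a separate VMM or hypervisor escape.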
        
               | tptacek wrote:
               | Yes, what they just said here. ^^ ^^
        
               | Veserv wrote:
               | VM sandbox escape is just "perform a hypercall/trap to
               | pass specially crafted data to the hypervisor and trigger
               | a hypervisor bug". For virtual machines, the hypervisor
               | is the privileged host and now the host is compromised.
               | 
               | There is no inherent advantage to virtualization, the
               | only thing that matters is the security and robustness of
               | the privileged host.
               | 
               | The only reason there is any advantage in common use is
               | that the Linux Kernel is a security abomination designed
               | for default-shared/allow services that people are now
               | trying to kludge into providing multiplexed services. But
               | even that advantage is minor in comparison to modern,
               | commonplace threat actors who can spend millions to tens
               | of millions of dollars finding security vulnerabilities
               | in core functions and services.
               | 
               | You need privileged manager code that a highly skilled
               | team of 10 with 3 years to pound on it cannot find _any_
               | vulnerabilities in to reach the minimum bar to be secure
               | against prevailing threat actors, let alone near-future
               | threat actors.
        
               | zrm wrote:
               | The syscall interface has a lot more attack surface than
               | the hypercall interface. If you want to run existing
               | applications, you have to implement the existing syscall
               | interface.
               | 
               | The advantage to virtualization is that the syscall
               | interface is being implemented by the guest kernel at a
               | lower privilege level instead of the host kernel at a
               | higher privilege level.
        
               | tptacek wrote:
               | If this were true, it would be easy to support the claim
               | with evidence. What were the last three Linux LPEs that
               | could be used in a realistic scenario (an attacker with
               | shell, root, full control of guest kernel) to compromise
               | a KVM host? There are dozens of published LPEs every
               | year, so this should be easy for you.
        
         | Etheryte wrote:
         | In a way, containers already run as honeypots with cash or
         | crypto prizes, it's called production code and plenty of people
         | are looking for holes day and night. While this setup sounds
         | like a nice idea conceptually, the monetary incentives it could
         | offer would surely be minuscule compared to real targets.
        
         | godelski wrote:
         | Importantly I'd like to see the configurations of the machines.
         | There's a lot you can do to docker or systemd spawns that
         | greatly vary the security levels. This would really help show
         | what needs to be done and what configurations lead to what
         | risks.
         | 
         | Basically I'd love to see a giant ablation
        
       | rbitar wrote:
       | Looks great and excited to try this out. We've also had success
       | using CodeSandbox SDK and E2B, can you share some thoughts on how
       | you compare or future direction? Do you also use Firecracker
       | under the hood?
        
         | pkkkzip wrote:
         | I can't tell if it uses Firecracker, but that's my main question
         | too. I'm curious as to whether microsandbox will be maintained
         | and proper auditing will be done.
         | 
         | I welcome alternatives. It's been tough wrestling with
         | Firecracker and OCI images. Kata Containers is also tough.
        
           | appcypher wrote:
           | It will be maintained as I will be using it for some other
           | product. And it will be audited in the future, but it's still
           | early days.
        
           | pdimitar wrote:
           | I wanted to try Kata containers soon. What difficulties do
           | you have with them?
        
         | appcypher wrote:
         | > can you share some thoughts on how you compare or future
         | direction?
         | 
         | Microsandbox does not offer a cloud solution. It is self-
         | hosted, designed to do what E2B does, to make it easier working
         | with microVM-based sandboxes on your local machine whether that
         | is Linux, macOS or Windows (planned) and to seamlessly
         | transition to prod.
         | 
         | > Do you also use Firecracker under the hood?
         | 
         | It uses libkrun.
        
           | rbitar wrote:
           | Self-hosting is definitely something we are keen to explore
           | as most of the cloud solutions have resource constraints (i.e.,
           | total active MicroVMs and/or specs per VM) and managing
           | billing gets complicated even with hibernation features.
           | Great project and we'll definitely take it for a spin
        
       | sureglymop wrote:
       | Always interested when things like this come up.
       | 
       | What I like about containers is how quickly I can run something,
       | e.g. `docker run --rm ...` without having to specify disk size,
       | amount of cpu cores, etc. I can then diff the state of the
       | container with the image (and other things) to see what some
       | program did while it ran.
       | 
       | So I basically want the same but instead with small vms to have
       | better sandboxing. Sometimes I also use bwrap but it's not really
       | intended to be used on the command line like that.
        
         | srmatto wrote:
         | It has a YAML config format to declare all of that so you could
         | just do that once, or template it, generate it on the fly,
         | fetch it from remote, or many other methods.
        
       | eamann wrote:
       | > Ever needed to run code you don't fully trust?
       | 
       | Then the installation instructions include piping a remote script
       | directly to Bash ... Oh irony ...
       | 
       | That said, the concept itself is intriguing.
        
         | appcypher wrote:
         | Your statement initially went over my head. Sorry lol. You can
         | always download the installer script and audit it yourself. I will
         | set up proper distribution later.
        
           | hakcermani wrote:
           | .. did exactly that and also changed the BINDIR and LIBDIR to
           | another location. BTW, amazing project from initial glance.
           | Will give it a detailed look this weekend!
        
           | raphinou wrote:
           | In case you're interested when you set up proper
           | distribution, I'm working on an open source solution aiming
           | to improve security of downloads from the internet. Our first
           | step is maintaining a mirror of checksums published in GitHub
           | releases at https://github.com/asfaload/checksums/. If you
           | publish a checksums file in your releases it can
           | automatically be mirrored. The checksums mirror is not our
           | end game, but it already protects against changes of released
           | files from the time the mirror was taken. For anyone
           | interested: https://asfaload.com/asfald/
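
For readers unfamiliar with checksum files, the basic check looks roughly like the sketch below. It assumes a `sha256sum`-style checksums file (one `<hex digest>  <filename>` entry per line) and is not the asfald implementation; the function names are made up for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large downloads need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_checksums(text: str) -> dict:
    """Parse sha256sum-style lines into a {filename: digest} mapping."""
    entries = {}
    for line in text.splitlines():
        parts = line.split(maxsplit=1)
        if len(parts) == 2:
            digest, name = parts
            # sha256sum marks binary-mode entries with a leading '*'.
            entries[name.strip().lstrip("*")] = digest.lower()
    return entries


def verify(path: Path, checksums_text: str) -> bool:
    """Return True only if the file is listed and its digest matches."""
    expected = parse_checksums(checksums_text).get(path.name)
    return expected is not None and expected == sha256_of(path)
```

The check is only as trustworthy as the source of the checksums text, which is why fetching it from an independent mirror rather than from the same server as the download adds protection.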
        
       | amelius wrote:
       | For my taste, container technology is pushing the OS too far. By
       | typing:
       | 
       |     mount
       | 
       | you immediately see what I mean. Stuff that should be hidden is
       | now in plain sight, and destroys the usefulness of simple system
       | commands. And worse, the user can fiddle with the data
       | structures. It's like giving the user peek and poke commands.
       | 
       | The idea of containers is nice, but they are a hack until kernels
       | are re-architected.
        
         | throwaway314155 wrote:
         | Sorry I am lacking the context to understand this post. What
         | does running mount inside a container do that's so egregious?
         | Are host mounts exposed to the container somehow? I thought
         | everything needed to be explicitly passed through to the
         | container (e.g. using a volume)?
        
           | remram wrote:
           | I think they mean that running `mount` on the host now lists
           | hundreds of mountpoints from containers, snaps, packagekit
           | etc.
        
         | topspin wrote:
         | On recent Linux, try:
         | 
         |     findmnt --real
         | 
         | It's part of util-linux, so it is generally available wherever
         | you have a shell. The legacy tools you have in mind aren't ever
         | going to be changed as you would wish, for reasons.
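
Where a filtering tool is not at hand, the same kind of view can be approximated by filtering the kernel's mount table directly. This is a rough sketch: the set of filesystem types to hide is an arbitrary assumption chosen to cut container and snap noise, and it does not reproduce the exact rules `findmnt --real` applies.

```python
# Hypothetical list of filesystem types to hide; adjust to taste.
NOISY_FS_TYPES = frozenset({
    "proc", "sysfs", "devtmpfs", "devpts", "tmpfs", "cgroup", "cgroup2",
    "overlay", "squashfs", "nsfs", "securityfs", "debugfs", "tracefs",
    "mqueue", "pstore", "bpf", "autofs", "fusectl", "configfs", "hugetlbfs",
})


def interesting_mounts(mounts_text, hidden=NOISY_FS_TYPES):
    """Return (device, mountpoint, fstype) tuples for mounts not in `hidden`.

    `mounts_text` uses the /proc/self/mounts layout:
    device mountpoint fstype options dump pass
    """
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        device, mountpoint, fstype = fields[:3]
        if fstype not in hidden:
            result.append((device, mountpoint, fstype))
    return result


if __name__ == "__main__":
    with open("/proc/self/mounts") as f:
        for device, mountpoint, fstype in interesting_mounts(f.read()):
            print(f"{device} on {mountpoint} type {fstype}")
```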
        
       | sbassi wrote:
       | There are Python and Node environments for this, so they are not
       | VMs in the sense that I can host an OS and arbitrary executables?
        
         | appcypher wrote:
         | They are Linux VMs and you can host any executable that can
         | work on that. The python/node environment you see is part of
         | what makes the SDK work. Really, it's very similar to Docker in
         | use.
        
           | sbassi wrote:
           | thank you. Is there any "docker host" or centralized repo
           | where I can pull VMs from?
        
             | appcypher wrote:
             | We support just Docker Hub for now. Let me know if you want
             | any other OCI-compatible registry.
             | 
             | PS: microsandbox will likely have its own OCI registry in
             | the future
        
       | airocker wrote:
       | Would love to hear Nix people's take on this.
        
         | mjrusso wrote:
         | As a Nix user, I'm actually really excited to try this out.
         | 
         | I want to run sandboxes based on Docker images that have Nix
         | pre-installed. (Once the VM boots, apply the project-specific
         | Flake, and then run Docker Compose for databases and other
         | supporting services.) In theory, an easy-to-use, fully isolated
         | dev environment that matches how I normally develop, except
         | inside of a VM.
        
           | airocker wrote:
           | but don't they have overlapping requirements of solving the
           | "works on my machine" problem?
        
             | mjrusso wrote:
             | Microsandbox's primary goal is to make it easy to build
             | environments for running untrusted code.
             | 
             | Nix, on the other hand, solves the problem of building
             | reproducible environments... but making said environments
             | safe for running untrusted code is left as an exercise for
             | the reader.
        
       ___________________________________________________________________
       (page generated 2025-05-31 23:01 UTC)