hngopher.com

       [HN Gopher] What is gVisor?
       ___________________________________________________________________
        
       What is gVisor?
        
       Author : yla92
       Score  : 92 points
       Date   : 2025-07-31 13:40 UTC (9 hours ago)
        
 (HTM) web link (blog.yelinaung.com)
 (TXT) w3m dump (blog.yelinaung.com)
        
       | ericpauley wrote:
       | One of the coolest things about gVisor to me is that it's the
       | ultimate implication of core computer engineering concepts like
       | "the OS is just software" and "network traffic is just bytes".
       | It's one thing to learn these ideas in theory, but it's another
       | altogether to be able to play with an entire network stack in
       | userspace and inject arbitrary behavior in the OSI stack. It's
       | also been cool to see what companies like Fly.io and Tailscale
       | can do with complete flexibility in the network, enabled by tools
       | like gVisor.
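
A minimal sketch of the "network traffic is just bytes" idea above, using only the Go standard library rather than gVisor's netstack. The `IPv4Header` type, the `ParseIPv4` function and the sample packet are hypothetical names and data made up for illustration; a real userspace stack would also verify checksums and handle options, fragments and the transport layer.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"net"
)

// IPv4Header holds a few fields decoded from the fixed part of an IPv4 header.
// Hypothetical type for illustration; it is not part of gVisor.
type IPv4Header struct {
	Version  uint8
	IHL      uint8 // header length in 32-bit words
	TotalLen uint16
	TTL      uint8
	Protocol uint8
	Src, Dst net.IP
}

// ParseIPv4 decodes the fixed 20-byte portion of an IPv4 header from raw bytes.
func ParseIPv4(b []byte) (IPv4Header, error) {
	if len(b) < 20 {
		return IPv4Header{}, errors.New("packet shorter than minimal IPv4 header")
	}
	h := IPv4Header{
		Version:  b[0] >> 4,
		IHL:      b[0] & 0x0f,
		TotalLen: binary.BigEndian.Uint16(b[2:4]),
		TTL:      b[8],
		Protocol: b[9],
		// Copy the address bytes so the header does not alias the packet buffer.
		Src: net.IP(append([]byte(nil), b[12:16]...)),
		Dst: net.IP(append([]byte(nil), b[16:20]...)),
	}
	if h.Version != 4 {
		return IPv4Header{}, fmt.Errorf("not IPv4: version %d", h.Version)
	}
	return h, nil
}

func main() {
	// Hand-written sample packet header (checksum left as zero).
	pkt := []byte{
		0x45, 0x00, 0x00, 0x28, // version 4, IHL 5, total length 40
		0x00, 0x00, 0x40, 0x00, // identification, flags (DF set)
		0x40, 0x06, 0x00, 0x00, // TTL 64, protocol 6 (TCP), checksum
		10, 0, 0, 1, // source address
		10, 0, 0, 2, // destination address
	}
	h, err := ParseIPv4(pkt)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s -> %s proto=%d ttl=%d len=%d\n", h.Src, h.Dst, h.Protocol, h.TTL, h.TotalLen)
}
```

Once a header is an ordinary struct in an ordinary process, injecting behavior (rewriting addresses, dropping packets, logging) is plain application code, which is roughly the flexibility the comment describes.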
        
         | sidewndr46 wrote:
         | I'm trying to understand the point you're making here but don't
         | really get it. The OS is just software, in most circumstances.
         | Most modern OSes require at least one binary blob that has to be
         | sent to some hardware device. This is mostly because the
         | device manufacturer didn't want to include NVRAM and at the end
         | of the day is usually just software as well.
        
           | bananapub wrote:
           | their point is that lots of things everyone thinks of as "OS"
           | things like "tcp" and "doing file IO" can just be done in
           | user space by some new program without the processes that
           | make use of these facilities knowing or caring.
        
             | surajrmal wrote:
             | The majority of any OS lives in user space though.
             | Intercepting syscalls is also not that weird of an idea,
             | that's how tools like strace work. Building out sufficient
             | kernel functionality without needing to forward calls to
             | the kernel is definitely impressive though.
        
               | tptacek wrote:
               | What do you mean by that? There's a notion of an
               | "operating system" that encompasses both the kernel and
               | all the userland tools (in this sense, each Linux
               | distribution is an "OS"), and there's a more common
               | notion of an OS that is just the kernel and any userland
               | services required for the kernel to function; the latter
               | is the more common definition.
        
         | leetrout wrote:
         | How does Fly use gVisor?
        
           | abound wrote:
           | I don't believe they do, they use Firecracker microVMs for
           | isolation: https://fly.io/docs/reference/architecture/
        
           | ericpauley wrote:
           | https://fly.io/blog/ssh-and-user-mode-ip-wireguard/
        
             | PhilippGille wrote:
             | Quote:
             | 
             | > And, long story short, we now have an implementation of
             | certificate-based SSH, running over gVisor user-mode
             | TCP/IP, running over userland wireguard-go, built into
             | flyctl.
        
               | tptacek wrote:
               | Also:
               | 
               | https://fly.io/blog/our-user-mode-wireguard-year/
               | 
               | https://fly.io/blog/jit-wireguard-peers/
               | 
               | This is another one of those things where the graph of
               | our happiness about a technical decision is sinusoidal.
               | :)
        
           | tptacek wrote:
           | We don't.
        
             | jchw wrote:
             | I think they mean to say that a part of gVisor is used by
             | Fly, because if I recall correctly flyctl did use the
             | gVisor user mode TCP stack for Wireguard tunneling.
        
               | tptacek wrote:
               | Ahh, that makes sense. Ok, revised answer: yes, we do. :)
        
         | quotemstr wrote:
         | Just wait until you read about Wine or captive NDIS. You'll
         | probably enjoy User Mode Linux most of all.
         | 
         | The concept of an OS still makes sense on a system with no
         | privilege level transitions and a single address space (e.g.
         | DOS, FreeRTOS): therefore, mystical low level register goo
         | isn't essential to the concept.
         | 
         | The boundary between the OS and everything else is a lot more
         | porous and a lot less arcane than people imagine. In the end,
         | it's just software.
        
           | jchw wrote:
           | I believe early on Linode used UML for their VPS hosting
           | offering. At that point in history, I recall solutions like
           | OpenVZ being pretty popular in the low end space, too.
           | 
           | gVisor's modular design seems to have been its strongest
           | point. It's not that nobody understood the OS is just
           | software or whatever, but actually ripping the Linux TCP
           | stack out and using it in userland isn't really that trivial.
           | Meanwhile though a lot of projects have made use of the
           | gVisor networking components, since they're pretty self-
           | contained.
           | 
           | I think gVisor is one of the coolest things written in Go,
           | although it's not really that easy to convey why.
           | 
           | Seriously, just check out the list of packages in the pkg
           | directory:
           | 
           | https://pkg.go.dev/gvisor.dev/gvisor
           | 
           | (I should acknowledge, though, that I don't know of that many
           | unique use cases for all of these packages; and while the TCP
           | stack is very useful, it's mainly used for Wireguard
           | tunneling and user mode TCP stacks are not particularly new.
           | Still, the gVisor network stack is nicer than hacked together
           | stuff using SLiRP-derived code imo.)
        
       | udev4096 wrote:
       | Moving to a unikernel [0] is the best way to get strong isolation
       | and high performance
       | 
       | [0] - https://unikraft.org
        
         | sidewndr46 wrote:
         | The last solution I looked at to do something like this was
         | using tap / tun devices for networking. How does unikraft
         | handle network isolation and virtualization?
        
           | udev4096 wrote:
           | From my limited understanding, it has the same isolation
           | advantages as that of a VM and therefore it's as strong as
           | the hypervisor you use
        
             | sidewndr46 wrote:
             | so does unikraft contain a "driver" for virtio networking?
        
           | johncolanduoni wrote:
           | It relies on your hypervisor and/or network hardware to
           | provide that. In an ideal circumstance (e.g. running on a
           | multiqueue NIC with VFIO or virtio acceleration), your VM can
           | talk directly to the network hardware. Major clouds will
           | provide something morally equivalent via their newer network
           | interfaces (gVNIC etc.).
        
         | mikepurvis wrote:
         | Absolutely, that reduces your surface area more than anything
         | else, but at an enormous cost to ergonomics.
         | 
         | Some of us are still fighting for docker images to not include
         | a vim install ("but it's so handy!") and here we've got madlads
         | building their app as its own bootable machine image.
        
         | johncolanduoni wrote:
         | It's not the best way to get low per-privilege domain overhead
         | and fungible resource allocation. You're ultimately limited by
         | your hypervisor on those fronts. gVisor containers are
         | ultimately a few Linux processes and mostly behave like one
         | from a CPU and memory allocation perspective.
        
         | eyberg wrote:
         | These people definitely do not understand security at all:
         | 
         | https://github.com/unikraft/unikraft/issues/414
         | 
         | Also - one needs to be careful because many of the workloads they
         | advertise on their site do not actually run under their kernel
         | - they run under Linux, which breaks a completely different type
         | of trust barrier.
         | 
         | As for trust/full disclosure - I'm with nanovms.com
        
           | tkz1312 wrote:
           | they acknowledged the issue and the fix was merged in 2022,
           | what exactly is the criticism here?
        
             | eyberg wrote:
             | No it wasn't - you can still easily replicate. I just did.
             | 
             | My point is that you shouldn't go around talking about how
             | "secure" you are when you have large gaping things like
             | this. This btw is not the only major security issue they
             | have.
        
         | kang1 wrote:
         | Not really, it's just attack surface reduction.
        
       | mikepurvis wrote:
       | I love the concept of gVisor; it's surprising to me that it
       | hasn't seemingly gotten more real world traction-- even GHA is
       | booting you a fresh machine for every build when probably 80%+ of
       | them could run just fine in a gVisor sandbox.
       | 
       | I'd be curious to hear from someone at Google if gVisor gets a
       | ton of internal use there, or it really was built mainly for
       | GCP/GKE
        
         | seabrookmx wrote:
         | Google Cloud Functions and Cloud Run both started as gVisor
         | sandboxes and now have "gen2" runtimes that boot a full VM.
         | 
         | Poor I/O performance and a couple of missing syscalls made it
         | hard to predict how your app was going to behave before you
         | deployed it.
         | 
         | Another example of a switch like this is WSL 1 to WSL 2 on
         | Windows.
         | 
         | It seems like unless you have a niche use case, it's hard to
         | truly replicate a full Linux kernel.
        
         | kang1 wrote:
         | gVisor is difficult to implement in practice. It's a syscall
         | proxy rather than a virtualization mechanism (even though it does
         | have KVM calls).
         | 
         | This causes a few issues:
         | 
         | - the proxying can be slightly slower
         | - it's not a VM, so you cannot use things such as confidential
         |   compute (memory encryption)
         | - you can't instrument all syscalls, actually (most work, but
         |   there are a few edge cases where it won't and a VM will work
         |   just fine)
         | 
         | On the flip side, some potential kernel vulnerabilities will be
         | blocked by gVisor but not by a VM (where it wouldn't be a
         | hypervisor escape, but you'd be able to run code as the
         | kernel).
         | 
         | This is to say: there are some good use cases for gVisor, and
         | there are fewer of these than for (micro) VMs in general.
         | 
         | Google developed both gVisor and crosvm (firecracker and others
         | are based on it) and uses both in different products.
         | 
         | AFAIK, there isn't a ton of gVisor use internally if it's not
         | already in the product, though some use it in Borg (they have a
         | "sandbox multiplexer" called vanadium where you can pick and
         | choose your isolation mechanism)
        
           | coppsilgold wrote:
           | It's not an actual [filtering] proxy. It re-implements an
           | increasing chunk of Linux syscalls with its own logic. It has
           | to invoke some Linux syscalls to do so but it doesn't just
           | pass them through.
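
A toy sketch of the distinction drawn above: instead of forwarding each call to the host kernel, the sandbox answers from state it owns and rejects anything it has not implemented. The `Sentry` type here, along with `NewSentry`, `Handle`, the error values and the string-based call format, is a hypothetical simplification for illustration and does not reflect gVisor's actual code or interfaces.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical error values standing in for errno results.
var (
	ErrNoSys = errors.New("ENOSYS: not implemented by the sandbox")
	ErrNoEnt = errors.New("ENOENT: no such file in the sandbox")
)

// Sentry is a toy userspace kernel: it keeps its own process and file state
// and serves syscalls from that state rather than passing them to the host.
type Sentry struct {
	pid   int
	files map[string]string
}

// NewSentry builds a sandbox with a made-up PID and one in-memory file.
func NewSentry() *Sentry {
	return &Sentry{
		pid:   1,
		files: map[string]string{"/etc/hostname": "sandbox\n"},
	}
}

// Handle services one simplified syscall using the sandbox's own logic.
func (s *Sentry) Handle(name, arg string) (string, error) {
	switch name {
	case "getpid":
		// The application sees the sandbox's PID, not a host PID.
		return fmt.Sprint(s.pid), nil
	case "read":
		// Reads are answered from the sandbox's in-memory file table.
		data, ok := s.files[arg]
		if !ok {
			return "", ErrNoEnt
		}
		return data, nil
	default:
		// Anything not re-implemented fails instead of reaching the host.
		return "", ErrNoSys
	}
}

func main() {
	s := NewSentry()
	calls := [][2]string{{"getpid", ""}, {"read", "/etc/hostname"}, {"io_uring_setup", ""}}
	for _, c := range calls {
		out, err := s.Handle(c[0], c[1])
		fmt.Printf("%s(%q) = %q, err = %v\n", c[0], c[1], out, err)
	}
}
```

The `default` branch is also where the "missing syscalls" complaints elsewhere in this thread would surface: a call the sandbox has not implemented fails even though the host kernel supports it.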
        
           | tptacek wrote:
           | I don't think this is really the case, if I'm reading it
           | right. Can you think of a vulnerability hypo where a KVM host
           | is vulnerable, but a gVisor host isn't? gVisor uses KVM.
        
         | dmoy wrote:
         | We used gvisor in Kythe (semantic indexer for the monorepo).
         | Like for the guts of running it on borg, not the open source
         | indexers part.
         | 
         | For indexing most languages, we didn't need it, because they
         | were pretty well supported on borg stack with all the Google
         | internals. But Kythe indexes 45 different languages, and so
         | inevitably we ran into problems with some of them. I think it
         | was the newer python indexer?
         | 
         | > really was mainly for GCP/GKE
         | 
         | I mean... I don't know. That could also be true. There's a
         | whole giant pile of internal software at Google that starts out
         | as "built for <XYZ>", but then it gets traction and starts being
         | used in a ton of other unrelated places. It's part of the glory
         | of the monorepo - visibility into tooling is good, and
         | reusability is pretty easy (and performant), because everyone
         | is on the same build system, etc.
        
           | mikepurvis wrote:
           | Dang, 45? I mean, I assume that's C++, Go, Python, Java, and
           | JavaScript/TypeScript. And languages for build scripts, plus
           | stuff like md and rst. And some shells. Probably embedded
           | languages like lua, sql, graphql, and maybe some shading
           | languages. Fortran and some assembly languages, a forth or
           | two for low level bringup or firmware. Dart of course.
           | 
           | But all of those is still less than 30. What am I missing?
        
       | gowld wrote:
       | What in this article is different from the gVisor intro docs
       | (where the gVisor pictures are plagiarized from)?
       | https://gvisor.dev/docs/
        
       | setheron wrote:
       | Is gVisor a libc LD_PRELOAD ?
        
         | kang1 wrote:
         | no ;) (though you could start it there if you wanted, but..
         | why)
         | 
         | LD_PRELOAD simply loads a library of your choice that executes
         | code in the process context, that's all. folks usually do this
         | when they cannot recompile or change the running binary, which
         | means they also hook and/or overwrite functions of the said
         | program.
         | 
         | generally folks will have gvisor calls integrated to their
         | sandbox code, before the target process starts, so no need for
         | preloading anything in most cases
        
       | lanigone wrote:
       | ask chatgpt to run dmesg via python and you'll find another use
       | of gvisor in prod...
        
       | sneak wrote:
       | I have wondered for a long time why we don't see more networking
       | in userspace for high security applications that don't require
       | high performance. I guess the answer is just that Linux has
       | enough features now to hook into the kernel with userspace code
       | that it usually isn't necessary to move the whole IP and TCP
       | stacks out.
        
       | illamint wrote:
       | gVisor also has a complete userspace networking stack that you
       | can pull in, which makes it a lot easier to do some neat things
       | like run an HTTP server responding to packets intercepted via
       | eBPF and sent to an AF_XDP socket, which would otherwise be a
       | pain.
        
         | tptacek wrote:
         | There's a separately-maintained fork of this (originally by the
         | Tailscale folks) at https://pkg.go.dev/inet.af/netstack.
        
       | spr-alex wrote:
       | We're adding support to gvisor for container plugins, it's a
       | reasonable approach for limiting the rich attack surface on linux
        
         | remram wrote:
         | Who is "we"? What are "container plugins"?
        
       | thundergolfer wrote:
       | We've run gVisor for over 2 years at Modal, and it's been a huge
       | unlock for us. We get a secure sandbox with GPU support that can
       | run on VMs. Just recently it allowed us to checkpoint/restore
       | containers AND their GPUs[1].
       | 
       | gVisor's Achilles heel is its missing or inaccurate syscalls,
       | but the gVisor team is first class in responding to GitHub issues
       | so it's really quite manageable in practice if you know how to
       | debug and hack on a userspace kernel.
       | 
       | 1. https://news.ycombinator.com/item?id=44747116
        
         | ignoramous wrote:
         | > _userspace kernel_
         | 
         | Is gVisor a Kernel or a syscall + select subsystems (like
         | network/gpu) proxy? In my head, a monolith Kernel (like Linux)
         | does more than just syscalls (like memory management, device
         | management, filesystems etc).
        
         | peterldowns wrote:
         | In the past I'd heard people recommend against gVisor, and
         | recommend looking at firecracker instead, because of I/O
         | overhead. Is that something you've noticed at Modal? Obviously
         | you're happy with gVisor, not suggesting you switch, just
         | curious about your experience.
        
         | tptacek wrote:
         | How are you handling the GPU isolation? (This was a big
         | challenge for us doing AMD-Vi KVM isolation).
        
       | Nican wrote:
       | Microsoft's blog post on Hyperlight got my attention a while ago:
       | https://opensource.microsoft.com/blog/2025/02/11/hyperlight-...
       | 
       | I am way out of my depth here, but can anyone make a comparison
       | with the "micro virtual machines" concept?
        
         | eyberg wrote:
         | microvms as espoused by things like firecracker offer full
         | machines but have tradeoffs like no gpu (which makes it boot
         | faster)
         | 
         | hyperlight shaves way more off - (eg: no access to various
         | devices that you'd find via qemu or firecracker) it does make
         | use of virtualization but it doesn't try to have a full blown
         | machine so it's better for things like embedding simple
         | functions - I actually think it's an interesting concept but it
         | is very different than what firecracker is doing
        
       | laurencerowe wrote:
       | TinyKVM [1] has similarities to the gVisor approach but runs at
       | the KVM level instead, proxying a limited set of system calls
       | through to the host.
       | 
       | EDIT: It seems that gVisor has a KVM mode too.
       | https://gvisor.dev/docs/architecture_guide/platforms/#kvm
       | 
       | I've been working on KVMServer [2] recently which uses TinyKVM to
       | run existing Linux server applications by intercepting epoll
       | calls. While there is a small overhead to crossing the KVM
       | boundary to handle syscalls, we get the ability to quickly reset
       | the state of the guest. This means we can provide per-request
       | isolation with an order of magnitude less overhead than
       | alternative approaches like forking a process or even spinning up
       | a v8 isolate.
       | 
       | [1] Previous discussion:
       | https://news.ycombinator.com/item?id=43358980
       | 
       | [2] https://github.com/libriscv/kvmserver
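
A conceptual sketch of the per-request reset described above, assuming guest state can be modeled as a plain map. The `Guest` type, `ServeIsolated` and `rememberLast` are hypothetical names for illustration; none of this is the TinyKVM or KVMServer API, and a real implementation would restore guest memory and registers rather than copy a Go map.

```go
package main

import "fmt"

// Guest is a hypothetical stand-in for a sandboxed program's mutable state.
type Guest struct {
	mem map[string]string
}

// snapshot returns a deep copy of the guest's current state.
func (g *Guest) snapshot() map[string]string {
	c := make(map[string]string, len(g.mem))
	for k, v := range g.mem {
		c[k] = v
	}
	return c
}

// ServeIsolated runs handler for one request, then resets the guest to the
// state it had before the request, so nothing leaks into the next request.
func (g *Guest) ServeIsolated(req string, handler func(*Guest, string) string) string {
	saved := g.snapshot()
	defer func() { g.mem = saved }() // reset even if the handler panics
	return handler(g, req)
}

// rememberLast is a sample handler that tries to leak state across requests:
// it returns the previously stored request and stores the current one.
func rememberLast(g *Guest, req string) string {
	prev := g.mem["last"]
	g.mem["last"] = req
	return prev
}

func main() {
	g := &Guest{mem: map[string]string{"config": "v1"}}
	for _, req := range []string{"alice", "bob"} {
		prev := g.ServeIsolated(req, rememberLast)
		fmt.Printf("request %q saw previous=%q\n", req, prev)
	}
	fmt.Printf("state after requests: %v\n", g.mem)
}
```

In this sketch the second request never observes what the first one wrote, which is the isolation property being described; the cost of the reset is what the comment claims is much lower at the KVM level than forking a process.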
        
       ___________________________________________________________________
       (page generated 2025-07-31 23:01 UTC)