[HN Gopher] What is gVisor?
___________________________________________________________________
What is gVisor?
Author : yla92
Score : 92 points
Date : 2025-07-31 13:40 UTC (9 hours ago)
(HTM) web link (blog.yelinaung.com)
| ericpauley wrote:
| One of the coolest things about gVisor to me is that it's the
| ultimate implication of core computer engineering concepts like
| "the OS is just software" and "network traffic is just bytes".
| It's one thing to learn these ideas in theory, but it's another
| altogether to be able to play with an entire network stack in
| userspace and inject arbitrary behavior in the OSI stack. It's
| also been cool to see what companies like Fly.io and Tailscale
| can do with complete flexibility in the network, enabled by tools
| like gVisor.
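|
| To make "network traffic is just bytes" concrete, here is a minimal
| sketch in Go that decodes an IPv4 header from a raw byte slice,
| entirely in userspace. It is not gVisor's netstack code; the
| IPv4Header type, the ParseIPv4 function, and the sample packet are
| made up for illustration.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"net"
)

// IPv4Header holds a few fields decoded from a raw IPv4 packet.
// Hypothetical type for illustration, not part of gVisor.
type IPv4Header struct {
	Version  uint8
	IHL      uint8 // header length in 32-bit words
	TotalLen uint16
	TTL      uint8
	Protocol uint8
	Src, Dst net.IP
}

// ParseIPv4 decodes the fixed 20-byte part of an IPv4 header.
// It does not verify the checksum or parse options.
func ParseIPv4(b []byte) (IPv4Header, error) {
	if len(b) < 20 {
		return IPv4Header{}, errors.New("packet shorter than minimum IPv4 header")
	}
	h := IPv4Header{
		Version:  b[0] >> 4,
		IHL:      b[0] & 0x0f,
		TotalLen: binary.BigEndian.Uint16(b[2:4]),
		TTL:      b[8],
		Protocol: b[9],
		// Copy the address bytes so the header does not alias the packet.
		Src: net.IP(append([]byte(nil), b[12:16]...)),
		Dst: net.IP(append([]byte(nil), b[16:20]...)),
	}
	if h.Version != 4 {
		return IPv4Header{}, fmt.Errorf("not IPv4: version %d", h.Version)
	}
	if h.IHL < 5 {
		return IPv4Header{}, fmt.Errorf("invalid header length: %d words", h.IHL)
	}
	return h, nil
}

func main() {
	// Hand-built sample packet header (checksum left as zero).
	pkt := []byte{
		0x45, 0x00, 0x00, 0x28, // version 4, IHL 5, TOS, total length 40
		0x00, 0x00, 0x40, 0x00, // ID, flags and fragment offset
		0x40, 0x06, 0x00, 0x00, // TTL 64, protocol 6 (TCP), checksum
		10, 0, 0, 1, // source address
		10, 0, 0, 2, // destination address
	}
	h, err := ParseIPv4(pkt)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("%s -> %s proto=%d ttl=%d len=%d\n",
		h.Src, h.Dst, h.Protocol, h.TTL, h.TotalLen)
}
```

| A real userspace stack also has to handle checksums, fragmentation,
| TCP state and timers, which is where most of the work is.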
| sidewndr46 wrote:
| I'm trying to understand the point you're making here but don't
| really get it. The OS is just software, in most circumstances.
| Most modern OSes require at least one binary blob that has to be
| sent to some hardware device. This is mostly because the
| device manufacturer didn't want to include NVRAM, and at the end
| of the day that blob is usually just software as well.
| bananapub wrote:
| their point is that lots of things everyone thinks of as "OS"
| things like "tcp" and "doing file IO" can just be done in
| user space by some new program without the processes that
| make use of these facilities knowing or caring.
| surajrmal wrote:
| The majority of any OS lives in user space though.
| Intercepting syscalls is also not that weird of an idea,
| that's how tools like strace work. Building out sufficient
| kernel functionality without needing to forward calls to
| the kernel is definitely impressive though.
| tptacek wrote:
| What do you mean by that? There's a notion of an
| "operating system" that encompasses both the kernel and
| all the userland tools (in this sense, each Linux
| distribution is an "OS"), and there's a more common
| notion of an OS that is just the kernel and any userland
| services required for the kernel to function; the latter
| is the more common definition.
| leetrout wrote:
| How does Fly use gVisor?
| abound wrote:
| I don't believe they do, they use Firecracker microVMs for
| isolation: https://fly.io/docs/reference/architecture/
| ericpauley wrote:
| https://fly.io/blog/ssh-and-user-mode-ip-wireguard/
| PhilippGille wrote:
| Quote:
|
| > And, long story short, we now have an implementation of
| certificate-based SSH, running over gVisor user-mode
| TCP/IP, running over userland wireguard-go, built into
| flyctl.
| tptacek wrote:
| Also:
|
| https://fly.io/blog/our-user-mode-wireguard-year/
|
| https://fly.io/blog/jit-wireguard-peers/
|
| This is another one of those things where the graph of
| our happiness about a technical decision is sinusoidal.
| :)
| tptacek wrote:
| We don't.
| jchw wrote:
| I think they mean to say that a part of gVisor is used by
| Fly, because if I recall correctly flyctl did use the
| gVisor user mode TCP stack for Wireguard tunneling.
| tptacek wrote:
| Ahh, that makes sense. Ok, revised answer: yes, we do. :)
| quotemstr wrote:
| Just wait until you read about Wine or captive NDIS. You'll
| probably enjoy User Mode Linux most of all.
|
| The concept of an OS still makes sense on a system with no
| privilege level transitions and a single address space (e.g.
| DOS, FreeRTOS): therefore, mystical low level register goo
| isn't essential to the concept.
|
| The boundary of the OS is a lot more porous and a lot less
| arcane than people imagine. In the end, it's just software.
| jchw wrote:
| I believe early on Linode used UML for their VPS hosting
| offering. At that point in history, I recall solutions like
| OpenVZ being pretty popular in the low end space, too.
|
| gVisor's modular design seems to have been its strongest
| point. It's not that nobody understood the OS is just
| software or whatever, but actually ripping the Linux TCP
| stack out and using it in userland isn't really that trivial.
| Meanwhile though a lot of projects have made use of the
| gVisor networking components, since they're pretty self-
| contained.
|
| I think gVisor is one of the coolest things written in Go,
| although it's not really that easy to convey why.
|
| Seriously, just check out the list of packages in the pkg
| directory:
|
| https://pkg.go.dev/gvisor.dev/gvisor
|
| (I should acknowledge, though, that I don't know of that many
| unique use cases for all of these packages; and while the TCP
| stack is very useful, it's mainly used for Wireguard
| tunneling and user mode TCP stacks are not particularly new.
| Still, the gVisor network stack is nicer than hacked together
| stuff using SLiRP-derived code imo.)
| udev4096 wrote:
| Moving to unikernel [0] is the best way to get strong isolation
| and high performance
|
| [0] - https://unikraft.org
| sidewndr46 wrote:
| The last solution I looked at to do something like this was
| using tap / tun devices for networking. How does unikraft
| handle network isolation and virtualization?
| udev4096 wrote:
| From my limited understanding, it has the same isolation
| advantages as that of a VM and therefore it's as strong as
| the hypervisor you use
| sidewndr46 wrote:
| so does unikraft contain a "driver" for virtio networking?
| johncolanduoni wrote:
| It relies on your hypervisor and/or network hardware to
| provide that. In an ideal circumstance (e.g. running on a
| multiqueue NIC with VFIO or virtio acceleration), your VM can
| talk directly to the network hardware. Major clouds will
| provide something morally equivalent via their newer network
| interfaces (gVNIC etc.).
| mikepurvis wrote:
| Absolutely, that reduces your surface area more than anything
| else, but at an enormous cost to ergonomics.
|
| Some of us are still fighting for docker images to not include
| a vim install ("but it's so handy!") and here we've got madlads
| building their app as its own bootable machine image.
| johncolanduoni wrote:
| It's not the best way to get low per-privilege domain overhead
| and fungible resource allocation. You're ultimately limited by
| your hypervisor on those fronts. gVisor containers are
| ultimately a few Linux processes and mostly behave like one
| from a CPU and memory allocation perspective.
| eyberg wrote:
| These people definitely do not understand security at all:
|
| https://github.com/unikraft/unikraft/issues/414
|
| Also - one needs to be careful because many of the workloads they
| advertise on their site do not actually run under their kernel
| - they run under Linux, which breaks a completely different type
| of trust barrier.
|
| As for trust/full disclosure - I'm with nanovms.com
| tkz1312 wrote:
| they acknowledged the issue and the fix was merged in 2022,
| what exactly is the criticism here?
| eyberg wrote:
| No it wasn't - you can still easily replicate. I just did.
|
| My point is that you shouldn't go around talking about how
| "secure" you are when you have large gaping things like
| this. This btw is not the only major security issue they
| have.
| kang1 wrote:
| not really, it's just attack surface reduction
| mikepurvis wrote:
| I love the concept of gVisor; it's surprising to me that it
| hasn't seemingly gotten more real world traction-- even GHA is
| booting you a fresh machine for every build when probably 80%+ of
| them could run just fine in a gVisor sandbox.
|
| I'd be curious to hear from someone at Google if gVisor gets a
| ton of internal use there, or it really was built mainly for
| GCP/GKE
| seabrookmx wrote:
| Google Cloud Functions and Cloud Run both started as gVisor
| sandboxes and now have "gen2" runtimes that boot a full VM.
|
| Poor I/O performance and a couple of missing syscalls made it
| hard to predict how your app was going to behave before you
| deployed it.
|
| Another example of a switch like this is WSL 1 to WSL 2 on
| Windows.
|
| It seems like unless you have a niche use case, it's hard to
| truly replicate a full Linux kernel.
| kang1 wrote:
| gvisor is difficult to implement in practice. it's a syscall
| proxy rather than a virtualization mechanism (even though it does
| have kvm calls).
|
| This causes a few issues:
| - the proxying can be slightly slower
| - it's not a vm, so you cannot use things such as confidential
|   compute (memory encryption)
| - you can't instrument all syscalls, actually (most work, but
|   there are a few edge cases where it won't and a vm will work
|   just fine)
|
| On the flip side, some potential kernel vulnerabilities will be
| blocked by gvisor, while they won't be in a vm (where it wouldn't
| be a hypervisor escape, but you'd be able to run code as the
| kernel).
|
| This is to say: there are some good use cases for gVisor, but
| there are fewer of these than for (micro) vms in general.
|
| Google developed both gVisor and crosvm (firecracker and others
| are based on it) and uses both in different products.
|
| AFAIK, there isn't a ton of gVisor use internally if it's not
| already in the product, though some use it in Borg (they have a
| "sandbox multiplexer" called vanadium where you can pick and
| choose your isolation mechanism)
| coppsilgold wrote:
| It's not an actual [filtering] proxy. It re-implements an
| increasing chunk of Linux syscalls with its own logic. It has
| to invoke some Linux syscalls to do so but it doesn't just
| pass them through.
| tptacek wrote:
| I don't think this is really the case, if I'm reading it
| right. Can you think of a vulnerability hypo where a KVM host
| is vulnerable, but a gVisor host isn't? gVisor uses KVM.
| dmoy wrote:
| We used gvisor in Kythe (semantic indexer for the monorepo).
| Like for the guts of running it on borg, not the open source
| indexers part.
|
| For indexing most languages, we didn't need it, because they
| were pretty well supported on borg stack with all the Google
| internals. But Kythe indexes 45 different languages, and so
| inevitably we ran into problems with some of them. I think it
| was the newer python indexer?
|
| > really was mainly for GCP/GKE
|
| I mean... I don't know. That could also be true. There's a
| whole giant pile of internal software at Google that starts out
| as "built for <XYZ>", but then it gets traction and starts being
| used in a ton of other unrelated places. It's part of the glory
| of the monorepo - visibility into tooling is good, and
| reusability is pretty easy (and performant), because everyone
| is on the same build system, etc.
| mikepurvis wrote:
| Dang, 45? I mean, I assume that's C++, Go, Python, Java, and
| JavaScript/TypeScript. And languages for build scripts, plus
| stuff like md and rst. And some shells. Probably embedded
| languages like lua, sql, graphql, and maybe some shading
| languages. Fortran and some assembly languages, a forth or
| two for low level bringup or firmware. Dart of course.
|
| But all of those are still fewer than 30. What am I missing?
| gowld wrote:
| What in this article is different from the gVisor intro docs
| (where the gVisor pictures are plagiarized from)?
| https://gvisor.dev/docs/
| setheron wrote:
| Is gVisor a libc LD_PRELOAD ?
| kang1 wrote:
| no ;) (though you could start it there if you wanted, but..
| why)
|
| LD_PRELOAD simply loads a library of your choice that executes
| code in the process context, that's all. folks usually do this
| when they cannot recompile or change the running binary, which
| means they also hook and/or overwrite functions of the said
| program.
|
| generally folks will have gvisor calls integrated into their
| sandbox code, before the target process starts, so no need for
| preloading anything in most cases
| lanigone wrote:
| ask chatgpt to run dmesg via python and you'll find another use
| of gvisor in prod...
| sneak wrote:
| I have wondered for a long time why we don't see more networking
| in userspace for high security applications that don't require
| high performance. I guess the answer is just that Linux has
| enough features now to hook into the kernel with userspace code
| that it usually isn't necessary to move the whole IP and TCP
| stacks out.
| illamint wrote:
| gVisor also has a complete userspace networking stack that you
| can pull in, which makes it a lot easier to do some neat things
| like run an HTTP server responding to packets intercepted via
| eBPF and sent to an AF_XDP socket, which would otherwise be a
| pain.
| tptacek wrote:
| There's a separately-maintained fork of this (originally by the
| Tailscale folks) at https://pkg.go.dev/inet.af/netstack.
| spr-alex wrote:
| We're adding support to gvisor for container plugins, it's a
| reasonable approach for limiting the rich attack surface on Linux
| remram wrote:
| Who is "we"? What are "container plugins"?
| thundergolfer wrote:
| We've run gVisor for over 2 years at Modal, and it's been a huge
| unlock for us. We get a secure sandbox with GPU support that can
| run on VMs. Just recently it allowed us to checkpoint/restore
| containers AND their GPUs[1].
|
| gVisor's Achilles heel is its missing or inaccurate syscalls,
| but the gVisor team is first class in responding to GitHub issues
| so it's really quite manageable in practice if you know how to
| debug and hack on a userspace kernel.
|
| 1. https://news.ycombinator.com/item?id=44747116
| ignoramous wrote:
| > _userspace kernel_
|
| Is gVisor a kernel or a syscall + select subsystems (like
| network/gpu) proxy? In my head, a monolithic kernel (like Linux)
| does more than just syscalls (like memory management, device
| management, filesystems etc).
| peterldowns wrote:
| In the past I'd heard people recommend against gVisor, and
| recommend looking at firecracker instead, because of I/O
| overhead. Is that something you've noticed at Modal? Obviously
| you're happy with gVisor, not suggesting you switch, just
| curious about your experience.
| tptacek wrote:
| How are you handling the GPU isolation? (This was a big
| challenge for us doing AMD-Vi KVM isolation).
| Nican wrote:
| Microsoft's blog post on Hyperlight got my attention a while ago:
| https://opensource.microsoft.com/blog/2025/02/11/hyperlight-...
|
| I am way out of my depth here, but can anyone make a comparison
| with the "micro virtual machines" concept?
| eyberg wrote:
| microvms as espoused by things like firecracker offer full
| machines but have tradeoffs like no gpu (which makes it boot
| faster)
|
| hyperlight shaves way more off - (eg: no access to various
| devices that you'd find via qemu or firecracker) it does make
| use of virtualization but it doesn't try to have a full blown
| machine so it's better for things like embedding simple
| functions - I actually think it's an interesting concept but it
| is very different than what firecracker is doing
| laurencerowe wrote:
| TinyKVM [1] has similarities to the gVisor approach but runs at
| the KVM level instead, proxying a limited set of system calls
| through to the host.
|
| EDIT: It seems that gVisor has a KVM mode too.
| https://gvisor.dev/docs/architecture_guide/platforms/#kvm
|
| I've been working on KVMServer [2] recently which uses TinyKVM to
| run existing Linux server applications by intercepting epoll
| calls. While there is a small overhead to crossing the KVM
| boundary to handle sys calls we get the ability to quickly reset
| the state of the guest. This means we can provide per-request
| isolation with an order of magnitude less overhead than
| alternative approaches like forking a process or even spinning up
| a v8 isolate.
|
| [1] Previous discussion:
| https://news.ycombinator.com/item?id=43358980
|
| [2] https://github.com/libriscv/kvmserver
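|
| The reset-per-request idea can be sketched in Go as a toy: capture
| the state once, let each request mutate it, then put it back. This
| only illustrates the pattern and is not how TinyKVM or KVMServer
| are implemented (those reset actual VM guest state); Guest,
| Snapshot, Capture and Serve are hypothetical names.

```go
package main

import "fmt"

// Guest is a toy stand-in for mutable guest state (hypothetical).
type Guest struct {
	Memory  []byte
	Counter int
}

// Snapshot is a private copy of the guest state taken at capture time.
type Snapshot struct {
	memory  []byte
	counter int
}

// Capture copies the guest's current state into a snapshot.
func Capture(g *Guest) Snapshot {
	return Snapshot{
		memory:  append([]byte(nil), g.Memory...),
		counter: g.Counter,
	}
}

// Restore overwrites the guest's state with the snapshot's contents.
func (s Snapshot) Restore(g *Guest) {
	g.Memory = append(g.Memory[:0], s.memory...)
	g.Counter = s.counter
}

// Serve handles one request, mutating the guest, and resets the guest
// to the snapshot before returning so no state leaks to the next request.
func Serve(g *Guest, snap Snapshot, req byte) int {
	defer snap.Restore(g)
	g.Memory[0] = req
	g.Counter++
	return g.Counter
}

func main() {
	g := &Guest{Memory: make([]byte, 4)}
	snap := Capture(g)

	for _, req := range []byte{7, 9} {
		result := Serve(g, snap, req)
		// After Serve returns, the guest is back to its captured state.
		fmt.Println("result:", result, "memory[0]:", g.Memory[0], "counter:", g.Counter)
	}
}
```

| Every request sees the same pristine state, so the cost of
| isolation becomes the cost of the reset.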
___________________________________________________________________
(page generated 2025-07-31 23:01 UTC)