[HN Gopher] We improved the performance of a userspace TCP stack...
       ___________________________________________________________________
        
       We improved the performance of a userspace TCP stack in Go
        
       Author : infomaniac
       Score  : 112 points
       Date   : 2024-06-05 16:13 UTC (6 hours ago)
        
 (HTM) web link (coder.com)
 (TXT) w3m dump (coder.com)
        
       | wmf wrote:
       | "Asking for elevated permissions inside secure clusters at
       | regulated financial enterprises or top secret government networks
       | is at best a big delay and at worst a nonstarter."
       | 
       | But exfiltrating data with a userspace VPN is totally fine?
       | 
       | I'm also wondering why not use TLS.
        
         | tazjin wrote:
         | Yeah, the optimisations are cool of course, but (maybe due to
         | being unfamiliar with the tool?!) I didn't understand why they
         | can't just `listen(2)`.
        
           | vlovich123 wrote:
           | It's answered in the opening paragraph although I'll admit
           | I'm still unclear.
           | 
           | > We are committed to keeping your data safe through end-to-
           | end encryption and to making Coder easy to run across a wide
           | variety of systems from client laptops and desktops to VMs,
           | containers, and bare metal. If we used the TCP implementation
           | in the OS, we'd need a way for the TCP packets to get from
           | the operating system back into Coder for encryption. This is
           | called a TUN device in unix-style operating systems and
           | creating one requires elevated permissions, limiting who can
           | run Coder and where. Asking for elevated permissions inside
           | secure clusters at regulated financial enterprises or top
           | secret government networks is at best a big delay and at
           | worst a nonstarter.
           | 
           | The specific part that's unclear is why encryption needs to
           | be applied at the TCP layer and, if they do need it at the
           | transport layer, why they're not using something like QUIC,
           | which has a much more mature user-space implementation.
        
             | immibis wrote:
             | Or TLS. It seems to be a remote cloud desktop type of
             | product, so why not use TLS like every other one?
        
             | neonsunset wrote:
             | The quote - is this yet another issue caused by abysmal FFI
             | overhead in Go?
        
               | tazjin wrote:
               | There's nothing related to FFI calls in this quote.
        
               | zer00eyz wrote:
               | https://www.reddit.com/r/golang/comments/12nt2le/when_dea
               | lin...
               | 
               | If your C doesn't fight the scheduler it isn't that bad.
        
             | cricketlover wrote:
             | Agree. Very unclear why they won't simply use a secure
             | socket or why a user space tunnel will be needed.
             | 
             | I surmise that the reason might be that a user space tunnel
             | might be faster (like maybe they can do UDP over TCP or
             | something to gain speed improvements).
             | 
             | Good post nevertheless.
        
             | dpeckett wrote:
             | I think the key insight behind this approach (and I'm
             | biased here having written something similar) is that the
             | difference between QUIC and (WireGuard + network stack) is
             | A LOT less than you might think.
        
         | tptacek wrote:
         | Every connection you make to a remote service "exfiltrates
         | data". Modern TLS is just as opaque to middleboxes as WireGuard
         | is, unless you add security telemetry directly to endpoints ---
         | and then you don't care about the network anyways, so just
         | monitor the endpoint.
         | 
         | The reason you'd use WireGuard rather than TLS is that it
         | allows you to talk directly to multiple services, using
         | multiple protocols (most notably, things like Postgres and
         | Redis) without having to build custom serverside "gateways" for
         | each of those protocols.
        
           | taeric wrote:
           | I think the point was more that doing this as a way to avoid
           | the red tape of getting permission to open a new connection
           | is odd?
        
             | tptacek wrote:
             | I understand the impulse, but I think it misconstrues the
             | "red tape" this method avoids. It's sidestepping a quirky
             | OS limitation, which dates back to an era of "privileged
             | ports" and multi-user machines. It's not really
             | sidestepping any sort of modern policy boundary. For
             | instance: you could do the exact same thing with WebSockets
             | (and people do).
        
               | taeric wrote:
               | I was thinking websockets; though, I thought those
               | largely hit the same criticisms? That is, tons of things
               | moved to them specifically to avoid any firewall rules
               | about what they were allowed to send over a network.
               | 
               | I'll fully grant that that seems to be the norm for
               | everything browser related. Policies made it difficult to
               | install new software, so you just point your browser to
               | this URL and call it a day.
        
       | convolvatron wrote:
       | Is this part of the open source releases? I looked at the
       | coder.com GitHub, but couldn't find it. I haven't written a
       | compatible TCP, but a different reliable transport in Go
       | userspace. Fairness aside, I wonder why we don't see this more
       | often. Would love to take a look.
        
         | tazjin wrote:
         | They upstreamed their gVisor changes:
         | https://github.com/google/gvisor/pull/10287
        
       | jijji wrote:
       | it's a solution looking for a problem
        
         | lxgr wrote:
         | gVisor definitely solves a problem for me:
         | https://news.ycombinator.com/item?id=39900329
        
       | pantalaimon wrote:
       | The obvious question is: How does it compare to the in-Kernel TCP
       | stack?
        
         | syzcowboy99 wrote:
         | gVisor's netstack is still much slower than the kernel's (and
         | likely always will be). The goal of this userspace netstack is
         | not to compete with the kernel on performance, but offer an
         | alternative that is more portable and secure.
        
       | nynx wrote:
       | Doesn't creating a raw socket need elevated permissions?
        
         | tptacek wrote:
         | They're not creating raw sockets+. The neat thing about
         | WireGuard is that it runs over vanilla UDP, and presents to the
         | "client" a full TCP/IP interface. We normally plug that
         | interface directly into the kernel, but you don't have to; you
         | can just write a userspace program that speaks WireGuard
         | directly, and through it give a TCP/IP stack interface directly
         | to your program.
         | 
         | + _I don't think? I didn't see them say that, and we do the
         | same thing and we don't create raw sockets._
        
           | vlovich123 wrote:
           | So it tunnels TCP/IP over WireGuard UDP?
        
             | tptacek wrote:
             | Correct (I mean, that's fundamentally what WireGuard is: a
             | UDP TCP/IP tunnel, with strong modern encryption).
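
The idea described in this sub-thread (inner packets, encrypted, carried over an ordinary UDP socket that needs no elevated permissions) can be sketched roughly as below. This is only an illustration under stated assumptions, not WireGuard and not Coder's code: AES-GCM from the Go standard library stands in for WireGuard's actual handshake and cipher, the "inner packet" is a placeholder byte string rather than output from a real userspace TCP/IP stack, and the names sealPacket, openPacket, and roundTrip are hypothetical.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"fmt"
	"net"
	"time"
)

// sealPacket encrypts an inner packet and prepends the random nonce.
// Assumption: AES-GCM is a stand-in here; WireGuard uses a different cipher.
func sealPacket(aead cipher.AEAD, inner []byte) ([]byte, error) {
	nonce := make([]byte, aead.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return aead.Seal(nonce, nonce, inner, nil), nil
}

// openPacket splits off the nonce, then authenticates and decrypts the rest.
func openPacket(aead cipher.AEAD, datagram []byte) ([]byte, error) {
	n := aead.NonceSize()
	if len(datagram) < n {
		return nil, fmt.Errorf("datagram too short: %d bytes", len(datagram))
	}
	return aead.Open(nil, datagram[:n], datagram[n:], nil)
}

// roundTrip sends one sealed packet between two ordinary loopback UDP
// sockets. No TUN device and no raw socket is involved.
func roundTrip(key, inner []byte) ([]byte, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	aead, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}

	// Port 0 asks the OS for any free unprivileged port.
	recv, err := net.ListenUDP("udp", &net.UDPAddr{IP: net.IPv4(127, 0, 0, 1), Port: 0})
	if err != nil {
		return nil, err
	}
	defer recv.Close()

	send, err := net.DialUDP("udp", nil, recv.LocalAddr().(*net.UDPAddr))
	if err != nil {
		return nil, err
	}
	defer send.Close()

	sealed, err := sealPacket(aead, inner)
	if err != nil {
		return nil, err
	}
	if _, err := send.Write(sealed); err != nil {
		return nil, err
	}

	// Avoid blocking forever if the datagram is lost.
	if err := recv.SetReadDeadline(time.Now().Add(2 * time.Second)); err != nil {
		return nil, err
	}
	buf := make([]byte, 2048)
	n, _, err := recv.ReadFromUDP(buf)
	if err != nil {
		return nil, err
	}
	return openPacket(aead, buf[:n])
}

func main() {
	key := make([]byte, 32) // hypothetical pre-shared key for the sketch
	if _, err := rand.Read(key); err != nil {
		panic(err)
	}
	inner := []byte("stand-in for an IP packet from a userspace TCP/IP stack")
	got, err := roundTrip(key, inner)
	if err != nil {
		panic(err)
	}
	fmt.Printf("recovered %d-byte inner packet: %q\n", len(got), got)
}
```

In a real deployment the receiving side would hand the decrypted bytes to a userspace network stack instead of printing them, which is the part the thread attributes to netstack.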
        
       | andrewstuart wrote:
       | If you're tunneling a better connection configuration, isn't the
       | tunnel what defines the latency?
        
       | andrewstuart wrote:
       | I have a problem right now which is that it's slow to copy large
       | files from one side of the earth to the other. Is this the basis
       | of a solution to that maybe?
        
         | dpe82 wrote:
         | What do you think are the current problems contributing to your
         | slow transfers?
        
           | andrewstuart wrote:
           | Window and buffer sizes are a problem on high latency links.
        
         | 392 wrote:
         | No. Profile first. Make sure you've tried tweaking params like
         | batch sizes.
        
       | parhamn wrote:
       | I don't know anything about Coder, but gVisor proliferation is
       | annoying. It's a boon for cloud providers, helping them find
       | another way to get a large multiple performance decrease per
       | dollar spent in exchange for questionable security benefits. And
       | I'm seeing it everywhere now.
        
         | kccqzy wrote:
         | There are still products from cloud providers that don't use
         | gVisor. Basics like EC2 or GCE. Sounds like you chose the wrong
         | cloud product.
        
         | tptacek wrote:
         | Are you referring to gVisor the container runtime, or
         | gVisor/netstack, the TCP/IP stack? I see more uptick in
         | netstack. I don't see proliferation of gVisor itself.
         | "Security" is much more salient to gVisor than it is to
         | netstack.
        
           | parhamn wrote:
           | On the issue of abysmal performance on cloud compute/PaaS, I'm
           | talking about the container runtime (most PaaS is gVisor or
           | Firecracker, no?): Cloud Run, DO, Modal, etc.
           | 
           | But given this article is about improving gVisor's userland
           | TCP performance significantly, it seems like the netstack
           | stuff causes major performance losses too.
           | 
           | I saw a GitHub link in another top article today
           | https://github.com/misprit7/computerraria where the README's
           | Pitch section feels very relevant to gVisor.
        
             | tptacek wrote:
             | I don't believe many PaaS providers run gVisor; a surprising
             | number just run multitenant Docker.
             | 
             | The netstack stuff here has nothing to do with the rest of
             | gVisor.
        
         | loosescrews wrote:
         | Can you elaborate on your concern? Is the issue that you don't
         | trust gVisor to keep the cloud provider secure?
        
           | parhamn wrote:
           | Providers managed secure shared environments for decades
           | before ultra inefficient wrappers and runtimes like gVisor
           | existed.
        
         | weitendorf wrote:
         | I don't understand - what do you suggest as an alternative to
         | gVisor?
         | 
         | > large multiple performance decrease per dollar spent
         | 
         | gVisor helps you offer multi-tenant products which can be
         | actually much cheaper to operate and offer to customers,
         | especially when their usage is lower than a single VM would
         | require. Also, a lot of applications won't see big performance
         | hits from running under gVisor depending on their resource
         | requirements and perf bottlenecks.
        
       | dpeckett wrote:
       | Really cool to see others hacking on netstack, bit of a shame
       | it's tied up in the gVisor monorepo (and all the Bazel
       | idiosyncrasies) but it's a very neat piece of kit.
       | 
       | I've actually been hacking on a similar FOSS project lately, with
       | a focus on building what I'm calling a layer 3 service mesh for
       | the edge. More or less came out of my learned hatred for managing
       | mTLS at scale and my dislike for shoving everything through a L7
       | proxy (insane protocol complexity, weird bugs, and you still have
       | the issue of authenticating you are actually talking to the proxy
       | you expect).
       | 
       | Last week I got the first release of the userspace router
       | shipped, worth taking a look if you want to play around with a
       | completely userspace and unprivileged WireGuard compatible VPN
       | server.
       | 
       | https://github.com/noisysockets/nsh/blob/main/docs/router.md
        
         | iangudger wrote:
         | If you want to use netstack without Bazel, just use the go
         | branch:
         | 
         | https://github.com/google/gvisor/tree/go
         | 
         | go get gvisor.dev/gvisor/pkg/tcpip@go
         | 
         | The go branch is auto generated with all of the generated code
         | checked in.
        
           | dave78 wrote:
           | I did this once for an experimental project and found it
           | really difficult to keep the version of gVisor I was using up
           | to date, since it seems like the API is extremely volatile.
           | Anyone else had this experience? If so, is there some way
           | around it that I don't know? Or did I just try it at a bad
           | point in the development timeline?
        
       ___________________________________________________________________
       (page generated 2024-06-05 23:00 UTC)