[HN Gopher] Our User-Mode WireGuard Year
       ___________________________________________________________________
        
       Our User-Mode WireGuard Year
        
       Author : xrd
       Score  : 258 points
       Date   : 2022-02-09 18:06 UTC (4 hours ago)
        
 (HTM) web link (fly.io)
 (TXT) w3m dump (fly.io)
        
       | born-jre wrote:
       | i had something similar in mind but using libp2p[0]. use as a
       | universal mesh network as library but without central control
       | server, better p2p/NAT traversal, no need to mess with keys*.
       | 
       | [0]: https://github.com/libp2p/
        
       | tedunangst wrote:
       | Man, where are all those virtualization fanboys who said usermode
       | linux was a dead end? :)
        
       | anderspitman wrote:
       | Usermode WireGuard would be a big deal. I maintain a list[0] of
       | tunneling solutions, and one of the only limitations of systems
       | built on WireGuard is the requirement for admin privileges. Even
       | with the performance hit from running outside the kernel, UDP-
       | based tunnels have a lot of advantages for multiplexing channels.
       | Pretty much your only mainstream options today are QUIC and
       | WireGuard, and only QUIC is intended to run in userspace.
       | 
       | I'd have to dig into the details more, but something like this
       | might allow you to implement a simple tunneling system based on
       | WireGuard that runs in the kernel if you have the privileges,
       | otherwise falls back to usermode and is no worse than QUIC in
       | terms of performance. That would be awesome.
       | 
       | [0]: https://github.com/anderspitman/awesome-tunneling
        
         | remram wrote:
         | It sounds like we could have a generic userspace tool that
         | proxies any connection to a WireGuard server. Similar to ssh
         | -L, it would listen on a TCP/UDP port locally (or talk the
         | SOCKS protocol) and convert that to IP packets over the
         | WireGuard connection (using a userspace TCP or UDP
         | implementation for that side).
         | 
         | It looks like Fly.io has all the bits, they just need to be
         | packaged as a stand-alone tool rather than built into flyctl
         | and only talk SSH.
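
A minimal sketch of the `ssh -L`-style forwarder described above, in Go. The `DialFunc` hook and the `forward`, `relay`, and `demo` names are hypothetical. In the real tool the dialer would be backed by a user-space TCP/IP stack running over WireGuard; this sketch plugs in the ordinary host dialer and a loopback echo server only so that it runs on its own.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net"
)

// DialFunc opens a connection to the remote side. Assumption: a user-space
// network stack over WireGuard would supply this; here it is just a hook.
type DialFunc func(ctx context.Context, network, addr string) (net.Conn, error)

// forward accepts local connections on ln and relays each one to target
// through dial, much like `ssh -L`. It returns when the listener is closed.
func forward(ln net.Listener, dial DialFunc, target string) error {
	for {
		local, err := ln.Accept()
		if err != nil {
			return err
		}
		go relay(local, dial, target)
	}
}

// relay copies bytes in both directions until either side finishes.
func relay(local net.Conn, dial DialFunc, target string) {
	defer local.Close()
	remote, err := dial(context.Background(), "tcp", target)
	if err != nil {
		return
	}
	defer remote.Close()
	done := make(chan struct{}, 2)
	go func() { io.Copy(remote, local); done <- struct{}{} }()
	go func() { io.Copy(local, remote); done <- struct{}{} }()
	<-done // the deferred Closes tear down the other direction
}

// demo wires a loopback echo server behind the forwarder and sends "ping".
func demo() (string, error) {
	echo, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer echo.Close()
	go func() {
		for {
			c, err := echo.Accept()
			if err != nil {
				return
			}
			go func(c net.Conn) { defer c.Close(); io.Copy(c, c) }(c)
		}
	}()

	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer ln.Close()
	var hostDialer net.Dialer // stand-in for the WireGuard-backed dialer
	go forward(ln, hostDialer.DialContext, echo.Addr().String())

	c, err := net.Dial("tcp", ln.Addr().String())
	if err != nil {
		return "", err
	}
	defer c.Close()
	if _, err := c.Write([]byte("ping")); err != nil {
		return "", err
	}
	buf := make([]byte, 4)
	if _, err := io.ReadFull(c, buf); err != nil {
		return "", err
	}
	return string(buf), nil
}

func main() {
	got, err := demo()
	if err != nil {
		fmt.Println("demo failed:", err)
		return
	}
	fmt.Println("echoed through the forwarder:", got)
}
```

Keeping the dialer behind a function type means the forwarding loop does not care whether the far side is reached through the host stack or a user-space one.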
        
           | mrkurt wrote:
           | Tailscale will do this!
           | 
           |     tailscaled --tun=userspace-networking --socks5-server=localhost:1081
        
             | sa1 wrote:
             | Sadly, on the only machine that I would have wanted this
             | on, where I didn't have root access, this has never worked
             | for me.
             | 
             | I should try to recreate the logs and issue for the
             | tailscale folks.
        
         | ffk wrote:
         | Very cool!
         | 
         | Found a gap, Linux Foundation's FD.io's VPP (a high performance
         | network virtual switch) has native wireguard support as well,
         | all in userspace. Support here means you can do full kernel
         | bypass from the app all the way down to the NIC card (e.g. via
         | DPDK).
         | 
         | https://docs.fd.io/vpp/20.09/d5/d54/wireguard_plugin_doc.htm...
         | 
         | I'll open a PR on this later.
        
         | wahern wrote:
         | > Pretty much your only mainstream options today are QUIC and
         | WireGuard, and only QUIC is intended to run in userspace.
         | 
         | Not sure it fits your "mainstream" qualification, but many
         | projects ago I used Airhook to help create a userspace,
         | application-layer('ish), multi-channel virtual network:
         | http://airhook.ofb.net/ (https://github.com/egnor/airhook)
         | 
         | Airhook is a relatively low-level library that handles framing
         | and flow control; it's not a functional solution on its own.
         | But that seems to be what you're getting at--something you can
         | deeply integrate into your application, not a separate service.
         | Though, I guess containers have sort of muddied that
         | distinction.
        
         | pmarreck wrote:
         | Is https://tailscale.com/ not "usermode WireGuard"? I've been
         | playing with it for a while now (it has a fairly generous free
         | tier) and am quite impressed. I can access any of my LAN
         | machines (my servers, my NAS, etc.) from anywhere that is also
         | connected to the same network, and the names work for DNS as
         | well.
        
           | anderspitman wrote:
           | It depends on what sort of tunneling you're doing. If you
           | just want a general-purpose private VPN, Tailscale is
           | amazing. That list is more focused on the use case where you
           | want to host a public server on a machine that isn't
           | accessible to the internet (NAT, corporate firewall, etc).
           | Think a shared Jellyfin server for your friends and family.
           | 
           | You can use Tailscale here but you'll need to separately run
           | a reverse-proxy on a public machine. There are more moving
           | pieces but if you're using Tailscale already then it's a good
           | option.
        
             | pmarreck wrote:
             | I wish I could run two separate Tailscale networks on a
             | single device, one for business and one for personal (for
             | example). Would make it tremendously more useful.
        
           | tptacek wrote:
           | 1. Tailscale is amazing. I hate them so much. (We use
           | Tailscale and are very happy with it.)
           | 
           | 2. Tailscale is user-mode WireGuard.
           | 
           | 3. "User-mode WireGuard" in the sense this post uses the term
           | is a misnomer and refers to the fact that we run TCP/IP
           | itself in userland (Tailscale normally runs through a tunnel
           | device and uses your native TCP/IP stack).
           | 
           | 4. But Tailscale also has code to do user-mode TCP/IP
           | (they've got it running in a browser with wasm).
        
             | pmarreck wrote:
             | Fascinating!
        
             | amscanne wrote:
             | I think Tailscale uses user-mode TCP/IP (also gVisor
             | netstack) for some client devices, like iOS? But could be
             | wrong here.
        
               | bradfitz wrote:
               | We use it on all platforms _except_ iOS, for binary
               | size/memory reasons.
               | 
               | (iOS 15 bumped the Network Extension memory limit to 50
               | MB, but we still need to be super trim for iOS 14's 15 MB
               | limit)
        
             | anderspitman wrote:
             | Last I heard[0] they were experimenting but hadn't shipped
             | it. AFAIK their client still requires root, no?
             | 
             | Running on wasm sounds awesome. This[1] looks like it. Do
             | you know how they're doing the actual networking? WebRTC
             | tunnel?
             | 
             | [0]: https://news.ycombinator.com/item?id=24483173
             | 
             | [1]: https://twitter.com/bradfitz/status/1451423386777751561?lang...
        
               | tptacek wrote:
               | Yeah, their client is always going to require privileges,
               | because it needs to enable every other program on the
               | system to interact directly with remote hosts
               | transparently. User-mode TCP/IP works for us because we
               | own the client-side program that our users run to talk to
               | stuff on Fly.io.
        
               | bradfitz wrote:
               | > Last I heard[0] they were experimenting but hadn't
               | shipped it. AFAIK their client still requires root, no?
               | 
               | Tailscale's gvisor/netstack-based userspace networking
               | mode has been supported and in wide use for quite some
               | time. It's the default on Synology DSM7, for instance.
               | 
               | You don't need root when you run tailscaled with
               | `--tun=userspace-networking`.
               | 
               | Peers can still connect inbound to the non-root
               | tailscaled, but to connect _out_ to other peers, you need
               | to use tailscaled's HTTP or SOCKS5 proxy, which are also
               | flags to tailscaled, to specify what port they listen on.
        
               | anderspitman wrote:
               | Thanks for the update!
               | 
               | Do you have any links that talk more about how the wasm
               | stuff works? I'd love to read more about that.
        
       | lyeager wrote:
       | > The Consul cluster would hold an Entmoot
       | 
       | Hah! That one got me. They know their audience :)
        
       | mwcampbell wrote:
       | > being able to pop a shell on a running app was table-stakes for
       | the platform.
       | 
       | Tangent: that's debatable IMO. In my company's current AWS
       | infrastructure, there's no shell access to either the production
       | containers or the host machines. I did write a script to create
       | an ephemeral container that lets me (and future staff) run a
       | shell inside the production network. And the thing I usually do
       | in that shell is run psql; I suppose that's not ideal for
       | auditability. But still, I can't poke around in the live
       | containers or the host machines; in theory they could be
       | distroless, with no shell at all. I'm trying to take immutable
       | infrastructure to the max here; this seems like a good thing for
       | security. It does mean that for debugging production problems, I
       | can only go by what I find in logs and the database. But that has
       | been acceptable so far.
       | 
       | Edit: Given tptacek's security background, I was surprised that
       | he considered production shell access essential for a new app
       | platform.
        
         | tptacek wrote:
         | It's funny you bring this up, because I had the same thought
         | when I was first implementing SSH (the initial implementation
         | relied on native WireGuard, and just plugged a client
         | certificate into your running SSH agent). I thought people
         | might not want to enable SSH access, and so I made the
         | provisioning of the root SSH certificate optional: you have to
         | run `flyctl ssh establish` to tell us to set up a root cert for
         | your organization.
         | 
         | It's turning out to have been a misfeature that confuses people
         | more than it helps anyone, and we may get to a place soon where
         | we just automatically provision a root cert for new
         | organizations.
        
           | mwcampbell wrote:
           | So do you think I'm being too extreme on this? Or did you
           | just implement SSH because customers want it?
        
             | tptacek wrote:
             | I think it's sensible to run an application fleet without
             | SSH access, but it's tough for a hosting provider that has
             | to support lots of different application fleets to not
             | offer a way to get a shell. Our authorization systems are
             | about to get sharply more interesting as we roll out
             | Macaroon-style tokens this quarter, so I'm optimistic we'll
             | get to a place where we make both styles of application
             | owners happy.
             | 
             | I was more on your side when I wrote the feature, and I'm
             | less on your side now. Also, I SSH into instances to debug
             | things all the time. :)
        
               | 10000truths wrote:
               | I think Linode's approach to this is best: they offer a
               | "virtual console" that basically consists of an SSH
               | gateway that pipes to your VPS's virtual serial port.
        
               | [deleted]
        
         | remram wrote:
         | It's useful when developing. Not being able to shell into the
         | production system is fine if you can shell into a
         | staging/development system.
         | 
         | If you are running Docker containers and you can shell into
         | local containers, that is usually "close enough" that you can
         | do useful troubleshooting. But fly.io (and CloudFlare workers,
         | etc) are different enough from off-the-shelf containers that it
         | is very important to be able to poke at containers when they
         | break, even if they are not the actual production containers.
        
       | drunner wrote:
       | How do they manage their mesh?
       | 
       | I've just been doing research on setting up my own wireguard mesh
       | (currently using a spoke/hub setup with pi-hole/pivpn).
       | 
       | I found https://github.com/HarvsG/WireGuardMeshes today which is
       | awesome, but I'm curious what fly.io / other readers here may be
       | using.
        
         | ohyeshedid wrote:
         | I usually build my own solutions, but I've played with Netmaker
         | and it seems solid.
         | 
         | https://github.com/gravitl/netmaker
        
       | tptacek wrote:
       | Because 'sho_hn brought this up, here's a stab at a pro/con list
       | of building TCP/IP directly into our API the way `flyctl` does:
       | 
       | Pro:
       | 
       | + Can just run "native" SSH directly over it (or, in our case,
       | use x/crypto/ssh, without modification).
       | 
       | + Lets `flyctl` offer a `flyctl proxy` command to users, so they
       | can plug their own programs into whatever application they need
       | to use, without asking us to change some proxy we run in our
       | infrastructure.
       | 
       | + Offers a single security and access control model (IPv6 private
       | networking), rather than something we have to think about on a
       | per-app basis.
       | 
       | + In theory, we get all this right and never have to think about
       | another network protocol in our infrastructure.
       | 
       | + Allows existing network management tools and libraries to
       | function directly with Fly.io infrastructure.
       | 
       | + With the WebSockets gateway, we can do all of this stuff
       | directly in browsers as well; that is, we can present TCP/IP as
       | an API to browser Javascript to do UI stuff (and we're doing more
       | and more UI stuff these days, in Elixir.)
       | 
       | + Puts more IPv6 in the world.
       | 
       | + Get to talk to Jason Donenfeld more.
       | 
       | + Get to write blog posts like this.
       | 
       | Con:
       | 
       | - Spends one (maybe multiple) innovation tokens or whatever you
       | want to call them.
       | 
       | - Way more things can go wrong; relies on state synchronization
       | and on a clear network path between our users and our gateways.
       | Right now, we have to care whether you can speak 51820/udp.
       | 
       | - User-mode TCP/IP via Netstack is probably significantly slower
       | than a simple TCP proxy would be.
       | 
       | - Required `flyctl` to run a background agent process to manage
       | multiple connections through WireGuard.
       | 
       | - The agent process adds to the list of things that can go wrong
       | (hopefully we've ironed most of them out now).
       | 
       | I can probably come up with more cons.
       | 
       | ...
       | 
       | It's buried in the middle of the post but I want to say it again
       | because I think it's important: this sort of started out as a
       | stunt; it's what I put together to allow people to SSH into their
       | instances without having to install WireGuard locally, and that's
       | _all_ it was. I don't have to write a soul-searching pro/con
       | list on stunts I use to give people SSH access, because lots of
       | providers have super janky "pop a private terminal" setups. But
       | all this stuff took on greater importance when we used it to run
       | Docker for our remote builders.
       | 
       | I like the approach we're taking a lot! I don't... regret it? I
       | don't think? I think I'm happy with it. But it's complicated.
        
         | tedunangst wrote:
         | How complete is the ssh implementation? I'm thinking I probably
         | want to at least run git/hg push, and maybe even do port
         | forwarding.
        
           | tptacek wrote:
           | Not very. You can scp and rsync over it. You can run with or
           | without a pty. That's pretty much it. It should work with
           | git!
           | 
           | You probably shouldn't do port forwarding on Fly.io; if
           | you're running into an actual need for that, we should talk
           | about extending our network access control model.
        
             | tedunangst wrote:
             | Perhaps atypical, but about 50% of my ssh use is port
             | forwarding to construct impoverished man's VPNs. Like I
             | send mail by forwarding localhost:25 to localhost:25 on the
             | mail server.
             | 
             | If I were running PoE (Postgres on Edge) I'd probably want
             | to connect a local client for poking around, but without
             | the bother of meshing my laptop into the cloud.
        
         | anderspitman wrote:
         | Question: ultimately all the packets are actually being sent
         | via a Golang net.UDPConn right? ie you're simulating raw
         | network packets by wrapping them in UDP packets, then running
         | TCP in Golang over those wrapped packets?
        
           | tptacek wrote:
           | That is effectively what we're doing, yeah.
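
The encapsulation idea in this exchange can be sketched in Go with two loopback `net.UDPConn` sockets. This is deliberately simplified: real WireGuard encrypts and frames each inner packet before it becomes a UDP payload, and a user-space TCP stack would produce and consume the inner packets. The `sendEncapsulated`, `recvEncapsulated`, and `loopbackDemo` names are hypothetical, and the "packet" here is just opaque bytes.

```go
package main

import (
	"bytes"
	"fmt"
	"net"
	"time"
)

// sendEncapsulated ships one inner packet as the payload of one UDP datagram.
// Assumption: encryption and WireGuard framing are omitted in this sketch.
func sendEncapsulated(conn *net.UDPConn, peer *net.UDPAddr, inner []byte) error {
	_, err := conn.WriteToUDP(inner, peer)
	return err
}

// recvEncapsulated reads one datagram and returns its payload, which a
// user-space TCP/IP stack would then treat as a received packet.
func recvEncapsulated(conn *net.UDPConn) ([]byte, error) {
	buf := make([]byte, 65535)
	if err := conn.SetReadDeadline(time.Now().Add(2 * time.Second)); err != nil {
		return nil, err
	}
	n, _, err := conn.ReadFromUDP(buf)
	if err != nil {
		return nil, err
	}
	return buf[:n], nil
}

// loopbackDemo sends one fake inner packet between two local sockets and
// reports whether it arrived unchanged.
func loopbackDemo() (bool, error) {
	loopback := &net.UDPAddr{IP: net.IPv4(127, 0, 0, 1)} // port 0: pick any
	a, err := net.ListenUDP("udp", loopback)
	if err != nil {
		return false, err
	}
	defer a.Close()
	b, err := net.ListenUDP("udp", loopback)
	if err != nil {
		return false, err
	}
	defer b.Close()

	inner := []byte("pretend this is a raw IP packet")
	if err := sendEncapsulated(a, b.LocalAddr().(*net.UDPAddr), inner); err != nil {
		return false, err
	}
	got, err := recvEncapsulated(b)
	if err != nil {
		return false, err
	}
	return bytes.Equal(got, inner), nil
}

func main() {
	ok, err := loopbackDemo()
	if err != nil {
		fmt.Println("demo failed:", err)
		return
	}
	fmt.Println("inner packet arrived intact:", ok)
}
```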
        
             | anderspitman wrote:
             | Cool, thanks!
        
         | alilleybrinker wrote:
         | Thanks for the pro-con list!
         | 
         | The experience of...
         | 
         | 1) build a thing because it's immediately useful for a specific
         | use-case, 2) someone reuses it for another use-case because
         | it's already there and saves some work, 3) time passes, 4) oops
         | really important stuff now relies on this thing in ways that
         | weren't originally intended
         | 
         | ... seems like a common pattern (see: JWT succeeding as the
         | format for interoperable tokens by dint of just being around).
         | 
         | In this case, it seems like the pros are basically user-
         | centered pros (`flyctl proxy`, existing tool interop, etc.) and
         | the cons are basically fly-centered cons (state
         | synchronization, maintaining the agent and making it work
         | right).
         | 
         | The cons that do affect users (slowness, maybe they can't speak
         | 51820/udp) seem _annoying_ but not deal-breaking for a lot of
         | use cases. If the slowness persists over a long time it will be
         | interesting to see how users opt to route around it (architect
         | applications / processes to not rely on this channel).
        
         | ericpauley wrote:
         | We recently moved our entire app deployment over to Fly and are
         | mostly loving it, but one of the mildly janky features is
         | hallpass. For instance, (1) connections often fail if you have
         | X forwarding enabled (even if you did no special config on
         | the machine), and (2) port forwarding doesn't work. While these
         | aren't really a big deal since (1) you can just disable X
         | forwarding in ssh_config and (2) port forwarding is unnecessary
         | if you can tunnel in via Wireguard, it makes me wonder why a
         | native SSH server isn't used with a small script to manage the
         | required config changes.
         | 
         | While on the topic, we're also eagerly awaiting improved
         | autoscaling (e.g., more responsive, using additional metrics,
         | and scaling down properly). I'd be really curious if you could
         | leverage the more detailed access to instance-level metrics to
         | implement some cool new queue-theoretic modeling: You know
         | roughly how long it takes for an app to launch, you know the
         | current request rate, and you know the time to service
         | requests. You could apply a lightweight Markov model to predict
         | the probability of a given queueing delay in each region within
         | the average launch time and, if so, preemptively launch a new
         | instance before queueing delays even occur. This could be
         | configured to balance a client's tolerance for queueing delays
         | with over-provisioning budget.
        
           | tptacek wrote:
           | Hallpass is a truly trivial piece of code --- it might be
           | less than 400 lines all in. All it really does is run
           | certificate authentication off of a root cert we store in
           | `_orgcert.internal` in DNS.
           | 
           | If you like, you can roll a Dockerfile that runs OpenSSH
           | directly on your internal network address (bind it to `fly-
           | local-6pn`), and then use native WireGuard to talk to it.
           | 
           | I've got a branch on hallpass that does port forwarding, but
           | I never merge it, because you're right: using port forwarding
           | on Fly.io is weird, because we already provide you direct
           | access to any port you're exposing, and you can't talk to any
           | of this stuff without WireGuard already. I think it would
           | just confuse people more if I made port forwarding work.
        
       | anonymousisme wrote:
       | Back in the day (nearly 30 years ago) people would run a user-
       | mode stack to obtain Internet connectivity via a (dial-up) Unix
       | shell account. The program was "slirp" which was named after
       | SLIP/CSLIP, but then upgraded to support PPP once that became a
       | thing.
       | 
       | https://en.wikipedia.org/wiki/Slirp
        
         | rhn_mk1 wrote:
         | History likes to repeat itself:
         | 
         | https://github.com/rootless-containers/slirp4netns
        
       | marcus_cemes wrote:
       | Fly.io's blog posts are incredible; they seem to really
       | enjoy what they do and want to share what they've made with
       | everyone else. I love them for that.
       | 
       | I wish that more companies could be like this and skip the
       | corporate BS, it shows that they really have something
       | outstanding to offer.
        
         | samwillis wrote:
         | I think their super power here is employing a renowned security
         | expert who is an incredibly good communicator!
         | 
         | (And happens to be a top HN contributor)
        
           | philosopher1234 wrote:
           | Nit: THE top HN contributor
        
             | purplerabbit wrote:
             | my goodness, you're right...
             | https://news.ycombinator.com/leaders (edit: by a factor of
             | ~2, no less!)
        
           | Karrot_Kream wrote:
           | I'm pretty sure their experience running an ISP helps too
           | heh.
        
             | tptacek wrote:
             | It's definitely all my talent that keeps this place
             | running. I'm definitely not just a noisy message board guy
             | who got hired after most of this infrastructure was built
             | and deployed and then just proceeded to make a bunch of
             | message board noise about it.
        
               | swyx wrote:
               | see, this is why the people love reading what you write.
               | keep giving credit but also having fun!
        
           | myth_drannon wrote:
           | And they also employ Phoenix framework creator.
        
         | mbesto wrote:
         | > I wish that more companies could be like this and skip the
         | corporate BS, it shows that they really have something
         | outstanding to offer.
         | 
         | The nature of a blog typically caters to its intended
         | audience.
         | 
         | The CIO of Disney doesn't give a sh*t if the protocol is called
         | WireGuard or OpenVPN, or whether it uses AES-256 encryption -
         | he/she wants someone to tell them that their developers are
         | securely accessing their infrastructure. Full stop. If/when Fly
         | gets to that level (let's say $500M in revenue) their blog tone
         | will likely change - their audience is almost primarily
         | developers and startup CTOs...for now.
        
           | tptacek wrote:
           | So Eden sank to grief, so dawn goes down to day.
        
           | mrkurt wrote:
           | For better or worse, I can guarantee you that we won't ever
           | write articles for the Disney CIO. Unless I get fired.
           | 
           | Whitepapers. They want whitepapers and magic quadrants.
        
         | Bayart wrote:
         | I think it's the only corporate blog I know of that's on my
         | much-read list (ie. every post goes to the top of the pile).
        
         | jcul wrote:
         | They really are. Feels like working there would be really fun.
        
         | [deleted]
        
         | dan-robertson wrote:
         | A lot of (small-medium sized, tech) companies just don't have a
         | process to get things out on their blog like this. It might be
         | that only a few senior people have the ability to write posts
         | and they are not interested or busy with other things, or it
         | might be that there is a slow review process for posts that
         | makes writing them unpleasant, or it might be that they don't
         | want to reveal IP or have an opinion and so have little to talk
         | about. Another company that does a good job of writing blog
         | posts, often timely posts about current (and relevant) events,
         | is Cloudflare, though their posts have quite a different energy
         | to Fly.io's.
        
         | [deleted]
        
       | iqanq wrote:
       | Can someone explain to me why wireguard is implemented as a
       | kernel module? Yes I get it, more performance. But isn't it
       | completely and absolutely insane to run a complicated piece of
       | software that is open to outside connections with kernel
       | privileges?
        
         | miloignis wrote:
         | Running complex software open to outside connections in the
         | kernel is pretty standard - the TCP/IP stack is in the kernel
         | too!
        
         | tptacek wrote:
         | The WireGuard protocol is deliberately designed to be
         | straightforward to run in the kernel. In steady state, it
         | doesn't even require dynamic memory allocation. It uses timers
         | in lieu of extra statekeeping. It has a simplified networking
         | model ("cryptokey routing") that defers to the host TCP/IP
         | stack a bunch of stuff that other VPN protocols take upon
         | themselves to build. It has just one keying mechanism and an
         | API to build more interesting authentication features (like SSO
         | integration) on top of it, rather than having it invade the
         | core design.
         | 
         | It helps that it was designed and implemented by a kernel
         | exploit author.
        
         | robertlagrant wrote:
         | Isn't most networking in the kernel? It's pretty complicated
         | already.
        
         | remram wrote:
         | Performance.
        
           | Hendrikto wrote:
           | It also helps with availability. If you got a recent kernel,
           | it's already there.
        
       | api wrote:
       | WireGuard is just a transport protocol, so of course you could
       | use it in place of SSL/TLS if you wanted. Interesting though, and
       | I prefer it to SSL/TLS because X509 certs suck.
        
         | derekzhouzhen wrote:
         | It is not just that. WireGuard is TCP tunneled through UDP. A
         | TCP connection tunneled through another layer of TCP will suck
         | badly performance wise. The sliding window algorithm in both
         | layers will fight each other.
        
         | tptacek wrote:
         | WireGuard isn't really the interesting bit here, it's running
         | TCP/IP over it in userland. You cannot straightforwardly do
         | that with SSL/TLS, but it is in fact the API that WireGuard
         | provides.
        
           | sho_hn wrote:
           | This is where the article lost me a little bit. I (think I)
           | technically got the part of running a TCP/IP stack in an
           | unprivileged user process, so you don't have to elevate
           | privilege for adding a network interface and using the host
           | OS TCP/IP stack. And maybe that's already very cool. But:
           | 
           | - What other benefits does it give you?
           | 
           | - This isn't a new problem and presumably has prior best
           | practices for mitigation. What is this replacing, what was
           | the landscape like before this? What was the most similar
           | already?
           | 
           | - Have people been looking for a better solution like this
           | for some time?
           | 
           | - What's the over/under on the maintenance cost? You added in
           | another TCP/IP stack to look after. You maybe save on static
           | configuration and can make your system more dynamic. Pros,
           | cons ... let's list them.
           | 
           | The article talks a bit about problems they were trying to address,
           | but not in a way that clearly answered the above for me. It's
           | of course valid to write a piece with a more informed
           | audience in mind, but in something that aims to spread the
           | virtues of an idea I think it could do more.
        
             | tptacek wrote:
             | This is a good question and part of the reason you didn't
             | get a clear answer from the article is that I'm not sure if
             | I have a clear answer.
             | 
             | What I think user-mode TCP/IP gives us is the ability to
             | build arbitrary services --- Postgres, Redis, SSH, network
             | management, whatever --- without having to make
             | infrastructure changes. We don't have to have some weird
             | API or application proxy that knows what's running and
             | who's allowed to run what. Instead, that's simply baked
             | into the network, and flyctl, by dint of netstack, can just
             | use it. If somebody comes up with a cool network service to
             | plug flyctl (or any other tool someone wants to write)
             | into, it will just work.
             | 
             | But things like the maintenance cost, well, yeah, that's
             | most of what the post is about. The maintenance cost was
             | not especially low.+
             | 
             | There's a natural inclination to read any post like this as
             | a kind of brag, but I'm really just experimenting with
             | trying to show the good with the bad here. User-mode TCP/IP
             | is a weird choice! Nobody else I know of does it! It might
             | have been the wrong choice! Even though I love it!
             | 
             | + _It's actually not low even right now; I'm spending the
             | first half of the day deploying code that relays stats from
             | Netlink on our gateways through our GraphQL API, so that
             | flyctl can check WireGuard gateway health. That is not a
             | thing we would be spending time on if we had just written
             | an explicit proxy for Postgres or whatever, rather than
             | providing a generic network transport._
        
               | sho_hn wrote:
               | That's cool, and I appreciate you sharing these
               | experiments.
               | 
               | I'm in automotive/embedded at the moment, and our daily
               | battle is making decisions on how static vs. dynamic we
               | want our system to be - static (e.g. resource allocations
               | or baked-in scheduling decisions) makes it easier to
               | reason about the system and provide guarantees, but
               | generally lowers efficiency at runtime. Dynamic can make
               | the system much better at serving a wide range of usage
               | scenarios, but makes it harder to eliminate the risk of
               | pathological cases. It's hard not to see things through
               | that lens. To me, the way to construct and run services
               | you've described here is an interesting option on that
               | type of axis, in the sense of where the costs/friction
               | go.
        
               | tptacek wrote:
               | I wish I'd thought of writing a simple pro/con list when
               | I wrote this post. I'll think about that!
        
             | zaphar wrote:
             | The problem isn't new, but the previous best practice
             | involved giving the tool superuser privileges, usually
             | through an installation process. See most other VPNs.
             | User-mode TCP/IP stacks aren't very common in practice.
             | This is why what fly.io did is interesting.
        
               | touisteur wrote:
               | Or network namespaces? User-space tcp might also make it
               | easier to do tcp checkpoint restore and container/app
               | migration. Interesting write-up.
               | 
               | I keep thinking of the nightmare of keeping up with the
               | world of Internet middleboxens, broken net layer
               | implementations and icmp hacks that the Linux kernel
               | supports and makes 'just work'. The jump to usermode tcp
               | seems interesting if you're not worried about that (and
               | I've been watching the formal-proven ip/tcp stack space
               | like a hawk for years), but I've been burned so many
               | times with non standard stacks and 'oh you need to
               | connect to that non-updated lynxos system and huh' or
               | 'hah could you enable ecn or this obscure tcp option
               | because... Legacy?'... And sometimes I need tc/netem and
               | netlink and I don't know...
        
         | qbasic_forever wrote:
         | Replacing SSL/TLS with wireguard is cool but aren't you just
         | going to run into the same headaches of rotating
         | certificates/keys? No one is really going to rely on using the
         | same wireguard key indefinitely, right?
        
           | tptacek wrote:
           | It's pretty easy for us to rotate keys now, since new
           | WireGuard peers are extremely cheap to bring up (part of the
           | point of the post is that for most of the last year, that was
           | the opposite of the case, and a new peer was a very painful
           | thing to ask for). But rotating WireGuard keys with Fly.io
           | makes about as much sense as rotating the OAuth2 API token
           | `flyctl` uses (the token is strictly more powerful than the
           | WireGuard key), and people generally don't do that.
        
       | ulzeraj wrote:
       | I was using wireguard-go in a FreeBSD jail running on top of an
       | APU2C2 board. Torrenting from my laptop caused high load, with
       | wireguard-go sitting at 30-50% CPU usage. Loading wireguard-kmod
       | on the host machine plus some devfs rules dropped the CPU load
       | to near zero.
       | 
       | Not sure what happened there. The processor seems to score less
       | than an RPi4 on Geekbench.
        
         | bityard wrote:
         | I use one of these as a firewall (running OPNsense) and they're
         | very nice but the CPU is indeed _slow_. It's plenty good enough
         | for everything the firewall does but booting it up takes
         | minutes and that's saying something for FreeBSD.
        
       ___________________________________________________________________
       (page generated 2022-02-09 23:00 UTC)