[HN Gopher] SSH and User-Mode IP WireGuard
___________________________________________________________________
SSH and User-Mode IP WireGuard
Author : BCM43
Score : 268 points
Date : 2021-03-02 14:34 UTC (8 hours ago)
(HTM) web link (fly.io)
(TXT) w3m dump (fly.io)
| azalemeth wrote:
| I know it's not really an HN thing to say, but _this is just
| cool_. Reverse ssh tunnels on Wireguard through my VPN are cool
| enough; the amount of _magic_ here (albeit I think perhaps not
| totally strictly required magic...) is definitely interesting++.
| willis936 wrote:
| This morning I set up openSSH + wsl + minicom to configure my
| router over LAN. Why? Because I could. Computers are fun.
|
| Also, it saves moving cables around when I break stuff.
| mrkurt wrote:
| It should be an HN thing to say. Unrestrained positivity is
| much more fun than kneejerk cynicism. :D
| rileymichael wrote:
| Pretty cool write up. It mentions that every host is running a
| DNS server that instances have access to, which is being utilized
| to store the public key (neat!)... is there any way for customers
| to consume this for other purposes, say out of the box service
| (instance) discovery?
| tptacek wrote:
| Yes; the original purpose of private DNS at Fly was for service
| discovery. `your-app.internal` is the AAAA's of every instance
| for your-app; `nrt.your-app.internal` every instance in Japan,
| `aws-rds-1._peer.internal` is AAAA for the other side of a
| WireGuard gateway you created to bridge your apps to an RDS
| database, etc.
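A minimal sketch of consuming the names described above from Go, using only the standard resolver. The helpers `internalName` and `lookupInstances` are hypothetical names introduced for illustration; the naming pattern is taken from the comment, and the lookup is assumed to run inside the private network (elsewhere it will simply fail).

```go
package main

import (
	"context"
	"fmt"
	"net"
	"time"
)

// internalName builds a private DNS name following the pattern in the
// comment above: "<app>.internal" for every instance, or
// "<region>.<app>.internal" for one region. Hypothetical helper.
func internalName(app, region string) string {
	if region == "" {
		return app + ".internal"
	}
	return region + "." + app + ".internal"
}

// lookupInstances asks the system resolver for the IPv6 (AAAA)
// addresses behind a name.
func lookupInstances(ctx context.Context, name string) ([]net.IP, error) {
	return net.DefaultResolver.LookupIP(ctx, "ip6", name)
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()

	name := internalName("your-app", "nrt")
	ips, err := lookupInstances(ctx, name)
	if err != nil {
		// Expected anywhere outside the private network.
		fmt.Println("lookup failed:", err)
		return
	}
	for _, ip := range ips {
		fmt.Println(name, "->", ip)
	}
}
```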
| kerng wrote:
| Isn't this bad for privacy? Encoding app, org and such
| information in IP address?
| tptacek wrote:
| No. These are internal IPv6 addresses in the ULA space. They're
| a part of our network fabric, not something the Internet sees.
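For readers unfamiliar with the "ULA space": RFC 4193 reserves `fc00::/7` for Unique Local Addresses, which are not routed on the public Internet. A small sketch of checking whether an address falls in that block, using Go's `net/netip`; the sample addresses are made up for illustration and are not real Fly addresses.

```go
package main

import (
	"fmt"
	"net/netip"
)

// ulaPrefix is the IPv6 Unique Local Address block from RFC 4193.
var ulaPrefix = netip.MustParsePrefix("fc00::/7")

// isULA reports whether s parses as an IPv6 address inside fc00::/7.
func isULA(s string) bool {
	addr, err := netip.ParseAddr(s)
	if err != nil {
		return false
	}
	return addr.Is6() && ulaPrefix.Contains(addr)
}

func main() {
	// Illustrative addresses only.
	for _, s := range []string{"fdaa:0:1::3", "2001:db8::1", "10.0.0.1"} {
		fmt.Printf("%-12s ULA=%v\n", s, isULA(s))
	}
}
```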
| spockz wrote:
| > We take Docker-type containers from users and transmogrify them
| into Firecracker micro-VMs
|
| What is the relationship with micro kernels? Is the feature
| available separate from the deployment/hosting?
| tptacek wrote:
| None. A Firecracker micro-vm is just a very small, very quick-
| to-start-up VM. It uses KVM, eliminates the BIOS, and
| implements only the minimal devices needed to boot and run
| server Linux. Amazon built the project for Lambda and Fargate.
| More about it here:
|
| https://fly.io/blog/sandboxing-and-workload-isolation/
| atonse wrote:
| Is this sort of like what MS is doing with Windows Subsystem
| for Linux, where they're able to "boot" that Linux in mere
| seconds?
|
| By the way, as an elixir developer Fly.io looks extremely
| cool. But my (mostly public sector) customers want to hear
| something similar to the words "AWS" when asked about hosting
| - so is it running on top of AWS or Azure or GCP? (instances
| look like they may be GCP, which is fine too).
| tptacek wrote:
| It runs on our own hardware. There's no AWS or GCP beneath
| it.
| atonse wrote:
| Ok thanks - this has been one of the rare things that
| some of those clients seem to care about (whether they're
| right or wrong, or rather, more conservative).
|
| I had another question - this seems similar to what
| Hashicorp is doing with Boundary. Have you looked at
| Boundary and how this potentially compares with that,
| from an architecture standpoint? Of course there are
| parts of this that are bespoke to your infrastructure,
| but I'm just more curious from a nerdy-aspect of it
| because we're evaluating boundary as a replacement to our
| current setup (Wireguard bastion host), for all the other
| benefits like auth and logging.
| tptacek wrote:
| There is a lot of Hashi in our stack already; we
| orchestrate with Nomad (we have our own Firecracker task
| driver), we backend our certificate system --- which is
| awesome, certificates _just work_ for Fly apps --- with
| Vault, and we use a lot of Consul.
|
| I think our take on end-user access management is lower-
| level than what Boundary is trying to do. Boundary, as I
| understand it, sees the world the way an IdP RP does,
| mostly in terms of bearer tokens. We see stuff as
| infrastructure; a static configuration on an EC2 instance
| or a CI container; "just Unix". If we weren't building a
| PAAS, we'd probably lean much more strongly towards
| Boundary's way of looking at things.
|
| As well, we care about minimizing and understanding as
| much of the code we expose as possible. For all the
| talking I've done about SSH here, the serverside of this
| feature is just a couple hundred lines of code; it is
| dwarfed by the clientside code. I couldn't say that about
| a Hashi product. (HashiCorp could though!)
| atonse wrote:
| Thanks. I actually feel like if Boundary had an
| experience more like Tailscale does, but on top of their
| stack (wireguard network, secrets in vault,
| server/service discovery in consul, etc), that would be
| really powerful and a no-brainer for those of us who also
| use a lot of Hashicorp products.
|
| But I'm still trying to fully understand what they're
| doing with Boundary. The abstractions just feel a bit off
| to me unlike other Hashicorp products (it's odd to me
| that you have to tell boundary to treat a database
| connection differently rather than just giving me any TCP
| or UDP access).
|
| But their team does great work and has elegant designs so
| I trust it's more likely that the lightbulb just hasn't
| gone off in my head yet with Boundary.
| chrisweekly wrote:
| Amazing. The client API is profoundly simple.
|
| Also, this post prompted me to look closer at Fly.io, and it's
| leapfrogged to the top of my shortlist for an imminent client
| "edge proxy" project.
| mrkurt wrote:
| I love proxies and think all problems should be solved with
| proxies. Which means - if you give Fly.io a try and need any
| help, you should let me know!
| [deleted]
| CyberRabbi wrote:
| Running networking stacks in user mode really opens up a lot of
| interesting solutions. Wireguard is sort of an enabling
| technology for this.
|
| Just realized this was written by security guru tptacek, nice.
| What is the contextual meaning of "AFFIANT SAYS NOTHING
| FURTHER."?
| boundlessdreamz wrote:
| Off-topic: What's the software used for the blog?
| michaeldwan wrote:
| Markdown + middleman. We use it for docs and landing pages in
| addition to the blog.
| vlmutolo wrote:
| > Normally, this big balloon thingy would be an elaborate scheme
| to get you to check out our product, but here it's just pointing
| out some new source code we haven't talked about elsewhere.
|
| I really enjoy this style of writing from a company.
|
| Regarding the article, it _seems_ like Fly has pulled off some
| insane networking nonsense, but I don't know enough about
| networking yet to understand it. Saving this page for later and
| gonna get back to the TCP/IP Guide.
| ignoramous wrote:
| > _Regarding the article, it seems like Fly has pulled off some
| insane networking nonsense_
|
| Fly is essentially building a Tailscale-esque infrastructure to
| service _one_ part of their cloud offering. It is indeed insane
| the amount of heavy-lifting they do to make it all work. They
| seem like a cross between packetfabric, gitops, docker, and
| hashicorp but with far fewer engineers on the team.
| brianm wrote:
| The technical heavy lift is rarely the success determinant,
| so it is pretty normal for a company (if it is full of good
| engineers) to implement half-baked versions of N related (but
| not yet mature) technologies, good enough for internal use but
| without the polish needed to support external customers, and
| to get an advantage from them.
|
| Most of the time these implementations are too tightly tied
| to the rest of the company's infra to be useful standalone.
| When one of those companies succeeds a common pattern is for
| engineers to cash out, leave, and build a new startup around
| one technology from the success story.
|
| I would not be surprised if this is one of the forces that
| drives the consumer -> infra -> consumer -> infra cycle. A
| consumer wave leads to inventing lots of interesting but
| bespoke infra while it is growing like crazy. When it
| plateaus, folks spin out the interesting infra bits until the
| next consumer wave (generally larger) starts rising.
| mrkurt wrote:
| To be fair, what Tailscale is doing is much harder than our
| private networking. They have to deal with NAT, mobile OSes,
| etc.
|
| We mostly just try to pick the right primitives. And
| frequently get that wrong. Like that time we wrote our own JS
| runtime ...
| tarasglek wrote:
| The prefix for the ssh command looks fine on the command line.
| However, is there a way to hide it with some settings in
| .ssh/config so one can have a normal-looking "ssh host" command
| line without special prefixes?
| mrkurt wrote:
| "flyctl ssh issue" will get you all set up for normal ssh
| access, and even store credentials in an agent if you have one
| running.
| im3w1l wrote:
| I read this but I didn't _get it_ at all. I can't see the forest
| for all the excited talk about particular trees. In simple words,
| what problem are they trying to solve?
| mrkurt wrote:
| We are a hosting company. Customer apps run in isolated private
| networks. We let them connect to these private networks with
| WireGuard. Customers _also_ want to do things like "launch a
| console", so we give them a mechanism for SSHing into their
| running containers over their private network (6PN).
|
| WireGuard is dead simple, but setting it up is extra cognitive
| friction if you've never dealt with it before (or if you're in
| an environment where you can't create a network interface).
| Jason Donenfeld did some magic with a Google user space
| networking stack that lets us "hide" the wireguard component.
| People using our CLI will soon be able to connect to their
| private network + SSH into a container with one command.
|
| Basically, WireGuard is cool and being able to connect into a
| wireguard network from a userland program is really helpful for
| building a straightforward UX.
| tptacek wrote:
| You need WireGuard to SSH to machines at Fly (that's a good
| thing). You don't have WireGuard installed on a particular
| machine. That's OK, because there's a portable, userland,
| Golang implementation of not only WireGuard but all of TCP/IP
| that can be imported into any Go program. Go programs can BYO
| network stacks. That's crazy. The end.
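To make "Go programs can BYO network stacks" concrete: Go's `http.Transport` takes a `DialContext` function, and whatever that function returns is the "network" as far as the HTTP client is concerned. The sketch below is not wireguard-go or gVisor netstack; it swaps in a toy in-memory network built on `net.Pipe` purely to show the plug-in point. The type `memNet` and the function `fetch` are hypothetical names for illustration.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net"
	"net/http"
)

// memNet is a toy in-process "network": DialContext hands one end of a
// net.Pipe to the caller and the other end to whoever is in Accept.
type memNet struct {
	conns chan net.Conn
}

func newMemNet() *memNet { return &memNet{conns: make(chan net.Conn)} }

// DialContext has the signature http.Transport expects. It ignores the
// requested address because this toy network has exactly one listener.
func (m *memNet) DialContext(ctx context.Context, network, addr string) (net.Conn, error) {
	client, server := net.Pipe()
	select {
	case m.conns <- server:
		return client, nil
	case <-ctx.Done():
		client.Close()
		server.Close()
		return nil, ctx.Err()
	}
}

// Accept, Close and Addr make memNet usable as a net.Listener.
func (m *memNet) Accept() (net.Conn, error) { return <-m.conns, nil }
func (m *memNet) Close() error              { return nil }
func (m *memNet) Addr() net.Addr            { return memAddr{} }

type memAddr struct{}

func (memAddr) Network() string { return "mem" }
func (memAddr) String() string  { return "mem" }

// fetch serves one handler on the toy network and GETs it with a stock
// http.Client whose only customization is the dial function.
func fetch(url string) (string, error) {
	n := newMemNet()
	go http.Serve(n, http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintf(w, "hello from %s", r.Host)
	}))

	client := &http.Client{Transport: &http.Transport{DialContext: n.DialContext}}
	resp, err := client.Get(url)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

func main() {
	// The hostname never touches the OS resolver or the OS TCP stack.
	fmt.Println(fetch("http://my-app.internal/"))
}
```

In the setup the thread describes, the dial function would presumably come from the userland WireGuard/TCP/IP stack instead of a pipe; everything above the dialer (HTTP, SSH, and so on) stays unchanged.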
| londons_explore wrote:
| Still seems like a downgrade for actual users... I just want
| to be able to type ssh instance7.service.zone.user.fly.io
| into my console, and be connected... I don't actually care
| about compiling my own custom ssh client written in go,
| however neat its implementation might be...
| mrkurt wrote:
| That's how it works now! You just have to set up wireguard
| first. You don't need to compile anything.
|
| This userland wireguard project was helpful for making
| "flyctl run console" work.
| kasey_junk wrote:
| But! They shipped that in _their_ go client program so you
| don't have to.
| ash wrote:
| Why is userland TCP/IP stack needed? I didn't get this part
| of the story.
| tptacek wrote:
| Work through it. You need WireGuard to talk to SSH on our
| instances; that can't change, it's a security rule. You can
| get userland WireGuard; that's how most people WireGuard.
| But you can't create an OS tun device: you need root to do
| that; you might as well just install WireGuard. Ok: you
| handshaked a WireGuard connection in Go. What's next?
|
| Let's simplify it: from your Go WireGuard connection, just
| do an HTTP GET. What's your next step?
| ash wrote:
| I think I got it now!
|
| I was confused because Tailscale does not bring its own
| userland TCP/IP. It can - as a VPN solution - rely on OS-
| provided TCP/IP stack, but you wanted to avoid having to
| hook up flyctl into OS as a virtual network interface,
| right?
| tptacek wrote:
| I think you've got it. Tailscale _is installing
| WireGuard_. You have to have privileges to install
| Tailscale. They can tell the OS to route packets through
| their virtual interface.
|
| We could too! This is all in `wireguard-go`. But we'd
| have to prompt users to escalate privileges every time
| they tried to SSH somewhere (or, worse, install a long-
| term resident thingy, _just to SSH to things_). We don't
| want to own your VPN connections!
|
| This is an end-run around all of that; we just take
| responsibility for all of TCP/IP, in our dumb little
| command line program.
| aleph- wrote:
| So I'm curious, is there any good documentation available
| for using wireguard-go as a lib? Or is it just read the
| source and also read through flyctl source?
|
| Curious about fiddling with something similar with
| firecracker at home.
|
| Think it'd be neat to spin up bespoke micro-vm's with
| wireguard enabled.
| 0xbadcafebee wrote:
| _> I've written a bunch about private networking at Fly. Long
| story short: it's like a simpler, IPv6 version of GCP or AWS
| "Virtual Private Clouds"; we call it "6PN". When an app instance
| (a Firecracker micro-VM) is started at Fly, we assign it a
| special IPv6 prefix; the prefix encodes the app's ID, the ID of
| its organization, and an identifier for the Fly hardware it's
| running on. We use a tiny bit of eBPF code to statically route
| those IPv6 packets along our internal WireGuard mesh, and to make
| sure that customers can't hop into different organizations._
|
| My first thought was _"Wow, can we make this _more_ complicated
| please?"_, and then I read the rest of the post.
|
| I hate technology.
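The quoted passage describes packing an app ID, an organization ID and a hardware identifier into an IPv6 prefix so that routing and isolation become simple prefix comparisons. The sketch below illustrates that idea in Go. The bit layout is invented for this example and is NOT Fly's actual layout, and the thread says the real enforcement is done in eBPF, not Go; `instancePrefix` and `sameOrg` are hypothetical names.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"net/netip"
)

// instancePrefix packs identifiers into a /96 under a ULA prefix.
// Hypothetical layout, NOT Fly's real one:
//
//	bytes 0-1   fixed ULA bytes 0xfd, 0xaa
//	bytes 2-5   organization ID
//	bytes 6-7   hardware (host) ID
//	bytes 8-11  app ID
//	bytes 12-15 left for the instance itself
func instancePrefix(org uint32, host uint16, app uint32) netip.Prefix {
	var b [16]byte
	b[0], b[1] = 0xfd, 0xaa
	binary.BigEndian.PutUint32(b[2:6], org)
	binary.BigEndian.PutUint16(b[6:8], host)
	binary.BigEndian.PutUint32(b[8:12], app)
	return netip.PrefixFrom(netip.AddrFrom16(b), 96)
}

// sameOrg is the kind of check a packet filter could make before
// forwarding: under this layout the first 48 bits (ULA bytes plus
// organization ID) of source and destination must match.
func sameOrg(src, dst netip.Addr) bool {
	p, err := src.Prefix(48)
	return err == nil && p.Contains(dst)
}

func main() {
	a := instancePrefix(7, 1, 100) // org 7, host 1, app 100
	b := instancePrefix(7, 2, 200) // same org, different host and app
	c := instancePrefix(8, 1, 100) // different org
	fmt.Println(a, b, c)
	fmt.Println("a->b allowed:", sameOrg(a.Addr(), b.Addr()))
	fmt.Println("a->c allowed:", sameOrg(a.Addr(), c.Addr()))
}
```

Because the destination host is readable straight from the address in a scheme like this, a router needs no per-instance state, which is what makes static routing possible.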
| anderspitman wrote:
| This is fantastic. I maintain a list[0] of tunneling software.
| One of the few downsides of WireGuard is the inability to run it
| in unprivileged situations. The complexity and performance
| overhead here might still be too much to edge out solutions like
| SSH tunnels, but I love that the space is being explored.
|
| I'm hopeful we'll also see some robust QUIC-based tunneling tools
| over the next couple years.
|
| [0]: https://github.com/anderspitman/awesome-tunneling
| russdill wrote:
| tunsocks[0] might be of interest to you. It's very similar to
| the software mentioned by OP except in C. It uses the lwIP
| usermode tcp/ip stack. It doesn't itself have any VPN or
| tunneling support, but instead relies on raw packets being
| passed into and out of a pipe. It can then provide access to
| that network via various proxies, port forwards, and even raw
| packets via NAT (very useful for VMs).
|
| [0]: https://github.com/russdill/tunsocks
| ptomato wrote:
| Not having been previously familiar with fly's network setup, I
| gotta say I find it delightful; derived-prefix IPv6 + WG to give
| you basically static routing + ability to auth on IP is very
| elegant. I've actually been working on a toy stupid-simple
| clustering thing that does something similar, and I'm absolutely
| going to steal the userspace tcp stack over wireguard thing for
| API access.
| tptacek wrote:
| I added some example code to the post, because, again, I kind of
| can't get over how easy this turns out to be. And if you follow
| the link into Jason's `wireguard-go` code, until you hit gVisor
| itself, it's not much more complicated under the hood.
|
| Having complete control of TCP/IP in userland like this, with so
| little code, is so valuable I feel like there needs to be some
| special name for the technique.
|
| The whole thing is kind of a vindication for Go's standard
| library network interface, which I have always hated.
| bluesign wrote:
| Why not something like:
|
| ssh dogmatic-potato-342@jump.fly.io
|
| And tunnel connection over wireguard on jump server
| tptacek wrote:
| Because then there would be some service exposed to the
| Internet (not over WireGuard; if you have WireGuard, you
| don't need a jump box) whose job it would be to hop 6PN
| networks. The only thing we have in our infra now that
| controls access to 6PN is eBPF code; we keep the system
| simple so we can reason about it.
| bluesign wrote:
| Fair point, but isn't this also losing "who connected to
| this server in my organization and when" information?
| tptacek wrote:
| We pipe logs from our instances to users (all logs,
| including your app's); you can see them in `flyctl`.
| (Certificate issuance is also logged in our API, and
| these certs are very short-lived).
| anderspitman wrote:
| > The whole thing is kind of a vindication for Go's standard
| library network interface, which I have always hated.
|
| Curious about this. I've generally found Go's net libs to be
| pretty pleasant. Can you compare/contrast it with others you
| like better?
| tptacek wrote:
| I'm just a 1990s BSD sockets, write-my-own-select-loop kind
| of programmer; the idea of an abstract `Dial` interface
| always seemed like just a performative Plan-9-ism (I
| assume?).
|
| Anyways. Wrong about that one! Movin' on!
| anderspitman wrote:
| Fair enough.
| jeffbee wrote:
| I hope people can mentally generalize this enthusiasm for user-
| mode wireguard in order to understand the value proposition of
| QUIC.
| tptacek wrote:
| QUIC was my plan B for this feature. :)
| ignoramous wrote:
| > _Having complete control of TCP/IP in userland like this,
| with so little code, is so valuable I feel like there needs to
| be some special name for the technique._
|
| Yes! Userspace TCP/IP is how we implement a firewall for Android
| (which doesn't expose iptables on non-root devices but lets you
| set up TUN interfaces via VPN APIs). Right now, we rely on LwIP
| (wrapped in golang) and it has worked wonderfully well;
| especially since it is light-weight without any locking-
| overheads (single-threaded) and that bodes well for battery-
| powered devices.
|
| > _The whole thing is kind of a vindication for Go's standard
| library network interface, which I have always hated._
|
| The Fuchsia team at Google is re-implementing netstack3 in Rust
| (and hence you're probably right to call it "gVisor netstack")
| due to what I presume are performance and efficiency reasons
| (which is of interest to us because we develop for
| smartphones). Of course, flyctl doesn't need that, but since
| you wrote about pulling in heavy dependencies, I am interested
| in your take on it.
| anderspitman wrote:
| Don't want to go OT but I'm super curious what your
| experience developing a network application for non-root
| Android devices has been?
|
| As a non-Android developer, I've been working on a project
| the last few months that involves running an HTTP server on
| the device and tunneling out so it can receive requests from
| the outside world, and the platform feels nerfed at every
| level from filesystem access to keeping your server from
| being battery-killed.
| ignoramous wrote:
| > _...and the platform feels nerfed at every level_
|
| Android development is a bit tedious compared to iOS due to
| having to support multiple API levels and having
| to account for subtleties across OEM implementations, but
| things have drastically improved in the last few years,
| especially after Oreo (Android 8).
|
| > _...from filesystem access_
|
| Watch out for tutorials still recommending workarounds that
| aren't necessarily needed due to Jetpack and friends:
| https://developer.android.com/modern-android-development.
|
| > _...to keeping your server from being battery-killed._
|
| See: https://dontkillmyapp.com/
|
| Process reaping is also, I believe, a problem on iOS? One
| way to keep a process out of OutOfMemory/LowMemoryKiller's
| reach is to make it a foreground service (what stuff like
| Music Players do) and generally be very stringent with
| resource use. It is easy to profile for resource usage
| thanks to Android Studio's built-in profiler and tools like
| https://perfetto.dev/
| anderspitman wrote:
| Oh iOS is way worse from what I can tell. I don't
| consider it a viable computing platform so haven't
| bothered trying to make my software run there.
|
| But Android seems to be working hard to "catch up" to
| iOS.
|
| I'm mostly comparing to native Linux development.
| Obviously you may need to make some changes for security,
| but I feel like they've gone way overboard with things
| like forcing the storage access framework/media storage
| APIs, killing even foreground services (doze mode etc),
| and so on.
|
| At the end of the day, if you're using software to
| purposefully limit what hardware is capable of, I think
| that's wrong. Even if you're worried about security, add
| a simple escape hatch for power users.
| prattmic wrote:
| This is awesome! In the post you mention "For a couple hundred
| lines of code (not counting the entire user-mode Linux you'll
| be pulling in from gVisor, HEY! Dependencies! What are you
| gonna do!) ..."
|
| I'll note that while all of gVisor's user-mode Linux is in the
| same Go module, we've actually gone to decent lengths to keep
| the network stack logically separate from the rest of the user-
| mode Linux code.
|
| So while go.sum might look a bit frightening, Brad's depaware
| shows that the extra code you pull in to binaries by using
| netstack is actually quite minimal:
| https://github.com/tailscale/tailscale/commit/5aa5db89d6a9a6....
| closeparen wrote:
| This is such an interesting marketing strategy, I had never
| thought of selling B2B production infrastructure under the
| aesthetic of, "Can you believe this shit actually works?"
| ryanmarsh wrote:
| This is going to be a great tweet for me and I will totally not
| give you credit.
| tptacek wrote:
| Yes! This person gets it.
| xbar wrote:
| I'm a fan of the writing style. It reminds me of smart people I
| know. I haven't bought any fly.io yet, so I don't know that
| I'm your target market. Still--well said, repeatedly.
| ficklepickle wrote:
| I've had a fly.io tab open for months. I haven't had time
| to use it yet, but something in me won't let me close that
| tab.
| tptacek wrote:
| I wish I could tell you it would eat a bunch of your
| time, engaging your curiosity and sense of wonder all the
| way, but really what's going to happen is you're going to
| install `flyctl`, go somewhere with a Dockerfile, do
| `flyctl app create` and then `flyctl app deploy` and it's
| going to just work. :)
| omribahumi wrote:
| Have you considered using ssh command's ProxyCommand option?
| It allows you to replace the TCP transport with communication
| over stdin/stdout.
|
| It could help you replace the TUN with something more cross
| platform, and possibly with less overhead. You can pass in
| the hostname using %h, so you can even have virtual DNS.
| tptacek wrote:
| How does that help us here? Without WireGuard, there's no
| channel with which you can talk to a Fly Hallpass instance,
| by design.
| majke wrote:
| This sounds very similar to my slirpnetstack, which uses gvisor
| netstack to do what I call translating L3 (packets) into L7
| (userspace syscalls like connect()):
|
| https://github.com/majek/slirpnetstack/
|
| (btw, gvisor netstack, while not without problems, is likely to
| be faster than libslirp, see benchmarks
| https://github.com/rootless-containers/rootlesskit/pull/101#... )
| tptacek wrote:
| It is extremely similar, and thanks for posting this, I had no
| idea.
| resoluteteeth wrote:
| Maybe you should put this up on github so everyone can use it
| rather than just talking about how easy it is?
| devwastaken wrote:
| I believe all informational blogs/guides should be backed up to a
| markdown file on GitHub or other. Over time playing with
| technologies I've found a lot of dead links to personal
| websites. This is of course because maintaining your own
| hosting can be cumbersome, domain name expiry, etc. Some
| valuable information gets lost with only waybackmachine to save
| the day. Someday wayback may no longer exist though.
| 177tcca wrote:
| Same with GitHub.
|
| Any Git opening will do, for private cloning.
| mrkurt wrote:
| The link to the source code is in the middle of the article,
| jeez: https://github.com/superfly/flyctl/pull/368
| sdevonoes wrote:
| Sounds very cool and all but at the same time it sounds like a
| terrible thing to maintain in the future.
|
| Perhaps it's just me, but this is something I would accept as a
| "hey, I was bored and worked on something on my free time. It's
| probably broken but nobody cares because it's a toy thing, but
| it's sooo cool". I wouldn't accept it as "Fly.io OKR 1.3 (2021):
| SSH and User-Mode IP WireGuard"... it sounds pretty much like a
| hack.
| tptacek wrote:
| This is called "coming to grips with the insanity that is
| gVisor, the Docker runtime for GKE that is also inexplicably
| just a Go import". I feel your pain.
|
| Wait until I find a reason to put a whole virtual memory
| manager into `flyctl`. I'll probably knock out a whole bunch of
| MBOs that way, and gVisor has me covered.
| bluesign wrote:
| Is this super complex infrastructure for a fairly simple thing or
| am I missing something?
| tptacek wrote:
| No, it is _extra-super complex infrastructure_ for a fairly
| simple thing.
|
| Normal SSH still works, and is usually going to be what people
| end up using. You just have to have WireGuard installed and
| running.
|
| The product feature here is less interesting than how we did
| it.
| jrockway wrote:
| It's not clear to me how much day-to-day use of Wireguard
| being a Fly customer requires, but I can't help but wonder if
| you guys should collaborate with Tailscale to make all of the
| micro-VMs appear on a Tailscale network, and authorize access
| between humans and the VMs that way.
|
| (I admit that I haven't looked much into mesh networking /
| edge servers, so I don't know what the problems are. I always
| preferred Internet -> Identity Aware Proxy type thing -> mTLS
| mesh that is useless to humans. And, I don't ssh to stuff
| much anymore... I have my software collect debugging
| information and send it to something I can access through a
| browser or API, and control that software through an API. So
| everything is editing config files, basically, not SSHing
| places ;)
| tptacek wrote:
| We'd love to; we don't want to build a crappy version of
| 1/11th of Tailscale ourselves.
| ignoramous wrote:
| > _...I can't help but wonder if you guys should
| collaborate with Tailscale..._
|
| I imagine a merger! Tailscale's mission is to "simplify the
| long tail of software development", and coincidentally, fly
| does just that (if only for server-side apps right now).
| Panino wrote:
| Please keep writing about WireGuard. If it wasn't already
| magical enough for its stated purpose (VPN), maybe the
| "truly" interesting thing is how it can enable tech that
| wasn't previously envisioned. After using WireGuard for a
| couple years I'm _still_ excited about it because I feel like
| I've only glimpsed a small piece of the things that can be
| done with it.
___________________________________________________________________
(page generated 2021-03-02 23:00 UTC)