[HN Gopher] JIT WireGuard
       ___________________________________________________________________
        
       JIT WireGuard
        
       Author : Lwrless
       Score  : 410 points
       Date   : 2024-03-13 06:13 UTC (16 hours ago)
        
 (HTM) web link (fly.io)
 (TXT) w3m dump (fly.io)
        
       | irjustin wrote:
       | For everyone else, I'll shamelessly use this to plug Netmaker[0].
       | 
       | Not affiliated, just a satisfied guy that needs access to private
       | AWS VPCs across multiple accounts and would love to see them be
       | more widely adopted.
       | 
       | [0] https://www.netmaker.io/
        
         | porker wrote:
         | Is Netmaker like Tailscale? From their site I'm unclear what
         | the distinguishing factor is.
        
           | thunfischbrot wrote:
           | Roughly, yes. Netmaker has a self-hostable server though.
           | With Tailscale of course the 3rd-party headscale is
           | available. Netbird also seems promising. See
           | https://github.com/cedrickchee/awesome-wireguard for more
           | alternatives.
        
             | computeinbrain wrote:
             | Yes. Netbird is very recommendable: https://netbird.io/
        
               | sneak wrote:
               | ...until you read the app privacy label. This app
               | collects a lot of your data for being privacy software.
               | 
               | Tell me again why p2p privacy software needs to phone
               | home?
               | 
               | https://apps.apple.com/us/app/netbird-p2p-vpn/id6469329339
        
               | AnarchismIsCool wrote:
               | Using it in prod right now, not a fan. It will gleefully
               | create a root ssh tunnel for you through its daemon if
               | the box on the website is checked.
        
         | denkmoon wrote:
         | Can't you do that "AWS native" with private link or vpc
         | peering? I'm a noob with these so I don't understand the
         | benefit of netmaker
        
           | irjustin wrote:
           | The goal isn't to make the networks seem like one and connect
           | resources across accounts, which is what those products do.
           | 
           | My goal is to access private resources via SSH bastion/jump
           | machines in a specific account. There's a few ways to do this
           | in AWS, but all of them are more costly by a pretty wide
           | margin.
        
             | vasco wrote:
             | AWS VPN is pretty cost effective, we used it for a few
             | years with a multi account setup. And it's pretty much zero
             | work.
        
               | apitman wrote:
               | Unless you're transferring a lot of data, such as video
               | streaming or copying files. Looks like $90/TB for egress.
        
             | seany wrote:
             | You can forward ssh through ssm, and dump that into your
             | ssh config file. Works pretty nicely with some of the sso
             | automation for the cli that's around these days.
        
               | rmccue wrote:
               | Is this using SSM's raw port forwarding support? From
               | what I've seen, their protocols seem to lack binary
               | safety (we get weird encoding issues).
        
               | zbentley wrote:
               | Without knowing the specifics of your situation, I would
               | slightly suspect client/configuration if you're
               | encountering encoding issues. In my experience,
               | integrating ssh and ssm is quite stable (provided you're
               | using OpenSSH and not a specific language client's own
               | implementation of the protocol).
        
             | verticalscaler wrote:
             | https://docs.aws.amazon.com/systems-
             | manager/latest/userguide...
             |
             |     aws ssm start-session --target $instance-id
        
               | poxrud wrote:
               | This is the best way to connect to your instances.
               | However you still need the SSM agent installed and the
               | right IAM permissions.
        
         | mtmk wrote:
         | Thanks, didn't know about this. I guess Netmaker (or similar)
         | manage the keys for you which would make the admin a lot
         | easier. In a previous job we set up and managed wg across a few
         | Windows and Linux boxes using Ansible. It was OK but was
         | getting a little messy in the end.
        
         | remram wrote:
         | What is it, a generic VPN platform? Similar to Tailscale etc?
         | 
         | Their site is extremely vague.
        
       | rubatuga wrote:
       | What's stopping the initial handshake packet from being replayed
       | into the network stack? Seems like there would be no packets lost
       | that way. Also, what is the purpose of checking for "udp[8] = 1"
       | in the eBPF filter?
        
         | zekica wrote:
         | udp[8] = 1 filters only handshake packets. Without it, data
         | packets would also be sent to the userspace daemon.
         | 
         | I'm not sure if initial handshakes can be replayed, but since
         | WireGuard ignores unknown clients, it might be possible.
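
In pcap filter syntax, `udp[8]` is the first byte after the 8-byte UDP header, i.e. the first byte of the WireGuard message, and message type 1 is a handshake initiation. A minimal Python sketch of the same check follows. The function names are hypothetical, and the extra checks on the three reserved bytes and the 148-byte initiation length go beyond the single-byte test discussed above.

```python
import struct

UDP_HEADER_LEN = 8
WG_HANDSHAKE_INITIATION = 1   # WireGuard message type for a handshake initiation
WG_INITIATION_LEN = 148       # fixed size of an initiation message, in bytes


def is_wg_initiation(udp_datagram: bytes) -> bool:
    """Return True if a UDP header + payload looks like a WireGuard initiation."""
    payload = udp_datagram[UDP_HEADER_LEN:]
    if len(payload) != WG_INITIATION_LEN:
        return False
    # Byte 0 is the message type (what udp[8] = 1 tests);
    # bytes 1-3 are reserved and are zero in a well-formed message.
    return payload[0] == WG_HANDSHAKE_INITIATION and payload[1:4] == b"\x00\x00\x00"


def make_udp_datagram(src_port: int, dst_port: int, payload: bytes) -> bytes:
    """Build a UDP header (checksum left at 0) plus payload, for demonstration only."""
    header = struct.pack("!HHHH", src_port, dst_port, UDP_HEADER_LEN + len(payload), 0)
    return header + payload


# Fake messages: only the type byte and the length are realistic.
initiation = make_udp_datagram(23456, 51820, b"\x01\x00\x00\x00" + bytes(144))
transport = make_udp_datagram(23456, 51820, b"\x04\x00\x00\x00" + bytes(60))
print(is_wg_initiation(initiation), is_wg_initiation(transport))
```

Data packets use message type 4, so a filter like this leaves them on the kernel fast path and only hands initiations to userspace.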
        
         | tptacek wrote:
         | Nothing. It's a good idea. (As the sibling comment points out:
         | the BPF filter snags just initiation packets, which is what we
         | want; it's the WireGuard equivalent of sniffing for TCP SYNs to
         | see connections starting).
        
         | yencabulator wrote:
         | Yeah, sounds like an NFQUEUE helper that releases the packet
         | after it has added the keys.
        
           | tptacek wrote:
           | Say more!
        
             | yencabulator wrote:
             | So NFQUEUE is an nftables verdict that puts the packet into
             | a numbered queue going to userspace. A userspace process
             | gets packets from the queue, and at some later point issues
             | a verdict on each packet, which can be drop or allow the
             | packet to pass. You can also pcap-style ask to see just
             | first N bytes of the packet, to decrease overhead.
             | 
             | Out of that, you can construct a userspace process that
             | reads a packet from the queue, decrypts the noise
             | initiation, requests configuration, adds wireguard config
             | via netlink, and then releases the packet. And you can do
             | that with multiple packets in flight concurrently.
             | 
             | It also allows things like fail-open if the userspace
             | process is broken or overwhelmed, which would be useful
             | here (already in-kernel wg peers would keep working), and
             | spreading load over multiple workers.
             | 
             | https://netfilter.org/projects/nftables/manpage.html
             | (search for queue)
             | 
             | https://wiki.nftables.org/wiki-
             | nftables/index.php/Queueing_t...
             | 
             | Here's a nice & simple Rust library for the userspace part,
             | to give an idea of what the shape of the API is:
             | https://docs.rs/nfq/latest/nfq/
             | 
             | Also, I'm available for contract work ;) Say hi to Ben from
             | Tv.
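
The hold-and-release flow described above can be modeled in a few lines of Python. This is a toy model, not the NFQUEUE API: the class, the packet shape, and the lookup callback are all hypothetical, and the initiator's public key is handed over directly instead of being decrypted from the Noise initiation.

```python
from collections import deque
from typing import Callable, Deque, List, Optional, Set, Tuple

Packet = Tuple[int, str]  # (packet id, initiator public key); hypothetical shape


class HoldAndReleaseQueue:
    """Toy stand-in for a verdict queue: packets wait here until judged."""

    def __init__(self, max_held: int) -> None:
        self.max_held = max_held
        self.held: Deque[Packet] = deque()
        self.delivered: List[int] = []  # ids of packets released to "the kernel"

    def enqueue(self, packet: Packet) -> None:
        if len(self.held) >= self.max_held:
            # Fail open: an overwhelmed helper passes the packet unexamined,
            # so peers that are already installed keep working.
            self.delivered.append(packet[0])
        else:
            self.held.append(packet)

    def process_one(self, is_authorized: Callable[[str], bool],
                    installed: Set[str]) -> Optional[str]:
        """Judge the oldest held packet; return the verdict, or None if idle."""
        if not self.held:
            return None
        packet_id, public_key = self.held.popleft()
        if public_key in installed or is_authorized(public_key):
            installed.add(public_key)          # stand-in for adding the peer
            self.delivered.append(packet_id)   # release only after the install
            return "accept"
        return "drop"


authorized = {"alice-key"}
installed: Set[str] = set()
queue = HoldAndReleaseQueue(max_held=2)
queue.enqueue((1, "alice-key"))
queue.enqueue((2, "mallory-key"))
queue.enqueue((3, "bob-key"))  # queue is full, so this one fails open
verdicts = [queue.process_one(lambda k: k in authorized, installed) for _ in range(2)]
print(verdicts, queue.delivered, installed)
```

The ordering is the point: the peer is installed before the held initiation is released, so the kernel sees a packet from a peer it already knows.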
        
       | akira2501 wrote:
       | I always felt the disappointment of wireguard was wrapping it up
       | into an opinionated network interface. It really should have been
       | a generic "filter" that you could attach to any type of file
       | handle. Then the configuration would be far less strongly
       | coupled, less weirdly communicated to the kernel, and the status
       | of your connection more immediately obvious.
       | 
       | Plus, you could have wireguard files on your local or remote
       | filesystems, or any character device, or named pipes if you felt
       | like it. You could use a "jit" daemon to build tap or other
       | interfaces for you, or just do it individually at the application
       | layer. You could have pre registered keys with the kernel, or you
       | could manage that directly, or generate them randomly.
       | 
       | It's always been a weird smelling underspecified IPSEC clone to
       | me, when it could have been so much more.
        
         | mellutussa wrote:
         | But why not use the nice smelling IPSEC if that ticks your
         | boxes?
        
           | akira2501 wrote:
           | It doesn't. It just foresaw the need to be able to
           | dynamically configure tunnels on first connection and
           | specified all of that. Which seems to me is a lot of what fly
           | io has just mostly reimplemented here.
           | 
           | In any case the point is I would prefer to just have the
           | basic components available and let me piece them together
           | however I want. Mostly to allow using the underlying
           | technology in more contexts than it is currently available
           | in.
        
             | medstrom wrote:
             | Heh, it really sounds like your needs would be better
             | served with IPSec or something. WireGuard was born
             | precisely because they saw that the whole problem making
             | other existing solutions difficult to audit and insecure-
             | in-practice was their thousand ways to configure. So they
             | did the opposite. Low lines of code, few possibilities.
             | 
             | In software you often choose between a small monolith and a
             | big kitchen sink. Once you have 1 more need than the
             | monolith covers, you have to go over to the kitchen sink.
        
         | d-z-m wrote:
         | > I always felt the disappointment of wireguard was wrapping it
         | up into an opinionated network interface.
         | 
         | Why? WireGuard is a VPN, it's pretty normal for VPN solutions
         | to expose themselves as a network interface.
         | 
         | > It really should have been a generic "filter" that you could
         | attach to any type of file handle.
         | 
         | What's the use-case you had in mind here? I'm not sure how
         | generifying it to a "filter" on any type of file descriptor
         | looks for an interactive protocol like wireguard.
         | 
         | > It's always been a weird smelling underspecified IPSEC clone
         | to me[...]
         | 
         | Just because there isn't an RFC? I've always found the
         | wireguard paper[0] to be quite readable and thorough in its
         | specification of the protocol.
         | 
         | [0]:https://www.wireguard.com/papers/wireguard.pdf
        
           | justsomehnguy wrote:
           | Not the OP, but the main problem with WireGuard[0] is not the
           | protocol[1], it's good, but the opinionated tooling around
           | it, be it .INI style configurations (god I hate it), mutual
           | incompatibility of wg and wg-quick or just outright stupid
           | decisions around storing config files and interacting with
           | the user in the Windows client.
           | 
           | Though there are some nuances with the routing
           | selection/filtering too, which gets troublesome when you just
           | need a pipe and run proper routing protocols over it. ::/0
           | solves most of it but still there are some rough edges.
           | 
           | [0] well, for *me*
           | 
           | [1] One of the amusing things I discovered was that I have a full
           | 10MB/s+ to the SMB server in the DC over the WireGuard tunnel
           | (and that's because it's 100Mbit/s uplink), while the
           | Synology which sits _on the same router on a 1Gbit port_ only
           | makes 3-5MB /s.
        
             | aragilar wrote:
             | You don't ever need to use the configuration files, netlink
             | on linux or the cross platform interface documented at
             | https://www.wireguard.com/xplatform/ mean you can write
             | your own tooling around an existing interface.
             | 
             | The cryptokey routing is pretty fundamental to wireguard,
             | I'm not sure you could have one without the other.
        
             | vbezhenar wrote:
             | wg is low-level binary, wg-quick is high-level script
             | wrapper. They're not supposed to have any compatibility at
             | all. You can build any kind of high-level wrapper for wg.
             | One of those wrappers is Network Manager, for example.
             | 
             | My issue with wireguard is that it's not enough for full-
             | fledged VPN solution, for example there's no way to push
             | routes to the client or DNS configuration or something like
             | that. Those are very basic needs. If you have 100 users and
             | you decide to change the routing scheme, well, you're in
             | trouble. It is supposed to be solved by higher-level
             | protocols, but I'm not aware of any open standard-de-facto
             | ones with quality implementations.
        
               | tuetuopay wrote:
               | nothing prevents you from running bgp over wireguard for
               | route exchange. and there are many quality bgp daemons
               | available*
               | 
               | (* for linux and bsd)
        
               | vbezhenar wrote:
               | How do I configure my iPhone with BGP routes? Write my
               | own app for VPN? Android, Windows? Linux users who have
               | no idea what BGP is? That won't work, if you're small.
        
               | tuetuopay wrote:
               | huh yeah, too focused on my infra side of things; mobile
               | vpn is another can of worms.
               | 
               | though I don't know what would prevent any bgp daemon
               | from running in e.g. the wireguard iOS app? there are bgp
               | daemons like gobgp that can be easily integrated in other
               | software.
               | 
               | but this was more meant to be a joke than anything.
               | wireguard is batteries definitely not included, and is
               | why tailscale and the like do exist.
        
               | LeBit wrote:
               | You mean "dynamically" push routes and DNS configurations
               | while the tunnel is already up?
               | 
               | Because you can definitely configure routes and DNS at
               | connection time.
        
               | ta1243 wrote:
               | What you'd normally think of as Wireguard allows routes
               | at connection time sure, however OP wants a VPN which
               | allows peer B ("server") to define a route and advertise
               | that route to peer A ("client"). So one day the client
               | would route 10.1.0.0/24 down the wireguard tunnel, but
               | not 10.2.0.0/24, the next day however from changing peer
               | B, the config on peer A would change.
               | 
               | Obviously there are many things you could do to allow
               | this (run a routing protocol, build a custom client which
               | gets route information, etc), but the "out of the box"
               | wireguard is a kernel interface, a wg command, and a
               | utility script (wg-quick). I think there are some gui
               | based clients for non-linux based OSes, but it's the same
               | principle.
               | 
               | DNS is nothing to do with the wireguard kernel or
               | userspace, it's configured in the "wg-quick script"
               | (there's a bash function called set_dns), but you can do
               | that however you want.
               | 
               | Wireguard alone isn't what an enterprise would consider
               | to be a "VPN solution", it doesn't push configs from a
               | central location, it's very much a peer-to-peer tool. You
               | can build "enterprise" features like centrally defined
               | routes or DNS on top of that, or not, it's not
               | opinionated.
        
               | vbezhenar wrote:
               | I'm generally comparing it with OpenVPN and it allows to
               | do all that.
        
               | justsomehnguy wrote:
               | > there's no way to push routes to the client or DNS
               | configuration
               | 
               | Yep, this one too.
        
         | api wrote:
         | Wireguard is more or less just Noise (IK I think) for IP
         | packets. It wouldn't make much sense for files since there is
         | no counterparty. File encryption is done differently.
         | 
         | Edit: http://noiseprotocol.org/
        
           | actionfromafar wrote:
           | Aren't encrypted files like noise?
        
       | d-z-m wrote:
       | > We can install the peer as if we're the initiator, and flyctl
       | is the responder. The Linux kernel will initiate a WireGuard
       | connection back to flyctl. This works; the protocol doesn't care
       | a whole lot who's the server and who's the client. We get new
       | connections established about as fast as they can possibly be
       | installed.
       | 
       | Is this (effectively) adding a half round-trip to the handshake?
       | i.e.
       |
       |     1. -> flyctl sends Initiation
       |     2. <- peer is added via netlink (which causes a new
       |           Initiation to be sent)
       |     3. -> Response from flyctl
        
         | OJFord wrote:
         | My reading was that both peers end up 'thinking' they
         | initiated, but it doesn't matter. i.e. (3) either doesn't
         | happen or just doesn't need to be waited for, or that they
         | could even block (2-new initiation) and then it definitely
         | wouldn't.
        
         | saclark11 wrote:
         | Pretty much, yes. If you imagine "Bob" has a policy that he can
         | only converse with numbers in his address book, then you could
         | think of it as:
         |
         |     1.   -> Alice calls Bob
         |     1.a.    Bob does not pick up the call, but adds the number
         |             shown from caller ID to his address book
         |     2.   <- Bob calls the number (Alice) back
         |     3.   -> Alice picks up and they talk happily
        
       | niz4ts wrote:
       | When I read this, I got a little too excited and thought they
       | managed to get wireguard connections happening in the browser
       | with webassembly (this isn't impossible, but the only attempt[0]
       | I know of so far only works because of extra things tailscale
       | has). It's an idea I've had for a project, but one I haven't had
       | time to dedicate to (yet).
       | 
       | In any case, really cool write-up! I wonder if they thought about
       | making `flyctl` do a check with their API for any command that
       | requires talking over wireguard to ensure the keys would be
       | installed in the gateway. Since `flyctl` knows when the last
       | command was run with it, it could do this only after some
       | inactivity. And on the gateway machines, they'd just clean up any
       | inactive peers with a cron (which they seem to be doing already).
       | 
       | Not a solution as elegant as the one they reached (which is super
       | cool), but I'm assuming the considerably lower effort would make
       | it appealing.
       | 
       | [0]: https://labs.leaningtech.com/blog/webvm-virtual-machine-
       | with...
        
         | apignotti wrote:
         | WebAssembly is not magic, the simple reality is that browsers
         | do not expose low level socket interfaces, so they cannot
         | connect to arbitrary services on the wider internet.
         | 
         | We choose to use Tailscale since they allow WebSocket-based
         | connections via their DERPs.
         | 
         | It is interesting that, originally, DERPs were intended to be a
         | solution for machines in extremely limited networking
         | environment where nothing but HTTP is allowed. Turns out
         | browsers are exactly one of those extremely limited networking
         | environments.
        
           | apitman wrote:
           | > We choose to use Tailscale since they allow WebSocket-based
           | connections via their DERPs.
           | 
           | Some context that I wasn't initially aware of: apignotti is
           | the CTO of Leaning Technologies, which is where the article
           | GP linked is from.
        
         | SparkyMcUnicorn wrote:
         | That's a separate blog post, and it definitely is pretty cool.
         | 
         | https://tailscale.com/blog/ssh-console
         | 
         | https://www.npmjs.com/package/@tailscale/connect
        
         | gz5 wrote:
         | For Chromium-based browsers, an option is to use BrowZer (built
         | on OpenZiti, Apache 2.0). Enables you to connect into a full
         | mesh private network (mTLS, e2e encrypted, no TLS man in middle
         | inspection). 3 examples below with well known apps. Disclosure,
         | I work on the project.
         | 
         | MSFT RDP (video): https://youtu.be/1NMrxRIowog
         | 
         | Private network for Grafana (video):
         | https://youtu.be/l5ktiI-j3eg
         | 
         | Private network for Plex (blog post):
         | https://blog.openziti.io/its-a-zitiful-life
         | 
         | Basically you decide what 'app' you want to deliver via the
         | overlay, e.g. Grafana, Plex, RDP. For those destinations, a
         | (one time) bootstrapping process (invisible to end user)
         | results in your browser receiving a <script> tag which includes
         | some configuration when the browser attempts to connect to the
         | destination (Grafana etc). This ultimately results in the
         | browser downloading some JavaScript and WA, and registering a
         | service worker (the wasm contains the PKI bits).
         | 
         | After successful auth, your browser can then open a websocket
         | to your private OpenZiti overlay network (distributed, OpenZiti
         | overlay network software routers, deployed where you want
         | them), and ultimately hit the app (which no longer needs to
         | listen to anything other than the overlay network; becomes
         | private).
         | 
         | Desktop Chrome is the most tested, followed by Android Chrome.
        
         | tptacek wrote:
         | Tailscale already got all this stuff working in WebAssembly in
         | the browser. :)
         | 
         | The way we think about things, if we were going to try to
         | provide a browser experience of doing something with WireGuard,
         | we'd probably just fork off a tiny Fly Machine VM to run it on.
         | Just a different vibe here.
        
       | protoman3000 wrote:
       | > The problem you quickly run into to build this design is that
       | Linux kernel WireGuard doesn't have a feature for installing
       | peers on demand.
       | 
       | I don't seem to understand. You can add peers at runtime, e.g.
       | 
       | https://serverfault.com/questions/1101002/wireguard-client-a...
       | 
       | Can somebody clarify?
       | 
       | EDIT: If I understand correctly, that step is already too late.
       | They want to authenticate a peer before adding it to the
       | interface in order to prevent stale entries on the interface.
       | 
       | They thus put a eBPF filter in front of the interface and do the
       | cryptokey-routing based association to an authorized counterpart
       | by themselves. If it checks out, then they add the peer to the
       | interface and remove it after a timeout.
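
The "remove it after a timeout" step could be driven from the output of `wg show <interface> latest-handshakes`, which prints one peer per line as a public key and a Unix timestamp. A minimal sketch follows. The function name and sample keys are made up, and treating a never-handshaked peer (timestamp 0) as stale is an assumption; a real reaper would likely give freshly installed peers a grace period.

```python
from typing import List


def stale_peers(latest_handshakes: str, now: int, max_idle_seconds: int) -> List[str]:
    """Return public keys whose last handshake is older than max_idle_seconds."""
    stale = []
    for line in latest_handshakes.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or unexpected lines
        public_key, timestamp = fields[0], int(fields[1])
        # Assumption: a timestamp of 0 (no handshake yet) also counts as stale.
        if timestamp == 0 or now - timestamp > max_idle_seconds:
            stale.append(public_key)
    return stale


# Placeholder keys; real output contains base64-encoded public keys.
sample = "AAAA=\t1710300000\nBBBB=\t1710310000\nCCCC=\t0\n"
to_remove = stale_peers(sample, now=1710310600, max_idle_seconds=3600)
print(to_remove)
```

Each returned key could then be passed to `wg set <interface> peer <key> remove`.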
        
         | api wrote:
         | Seems to me they did this to avoid the alternative of running
         | WG in user space. They wanted a feature the Linux kernel didn't
         | have to route by cryptographic address first but without
         | leaving the kernel so they hacked it in.???
         | 
         | JIT Wireguard is a weird way to frame this. My mind went to
         | "why? The performance bottleneck is the crypto and per client
         | JIT won't help with that."
         | 
         | I would have just gone user space. Use something like tokio-
         | uring or glommio to get the performance. If you keep going in
         | the kernel you are going to keep hitting limitations because
         | Linux is not built to serve millions and millions of active
         | tunnels. Even doing millions of TCP connections per kernel gets
         | hairy sometimes.
         | 
         | Every limitation will require a hack. Every hack will be some
         | system config that has to be applied and managed. The tool
         | chains for provisioning Linux metal boxes are vastly inferior
         | to the tooling for developing apps and services and managing
         | their config.
         | 
         | Or am I stupid and misunderstanding?
        
           | vidarh wrote:
           | It does not seem like they need huge numbers of _active_
           | tunnels per gateway.
           | 
           | And JIT just as in "just in time" _configuration_ of
           | WireGuard. Once the configuration has been done, their stack
           | stays out of it.
        
             | api wrote:
             | Ahh. In that case they are using the term JIT weirdly.
             | Usually that means just in time compilation of script or
             | byte code to machine code.
        
               | NoahKAndrews wrote:
               | The phrase "just-in-time" can be used for other things
               | besides compilation (it's often used for manufacturing,
               | for instance). I think it's a helpful way to describe
               | lots of things, and that we shouldn't try to limit its
               | usage in tech to just compilation.
        
               | gtirloni wrote:
               | Exactly. The very first time I heard about "JIT" was in
               | the context of manufacturing. The Toyota Production
               | System [0].
               | 
               | I think JIT compilation wasn't popular in ancient times,
               | so I never associated JIT with compilation by default.
               | 
               | 0 -
               | https://en.wikipedia.org/wiki/Toyota_Production_System
        
               | cchance wrote:
               | JIT compiling is the usage you're most used to, but JIT
               | has been around in other fields for longer, and just
               | means what it says... Just In Time :)
        
         | tptacek wrote:
         | What you want, and what in the medium term I think Jason plans
         | to provide, is a Netlink API from kernel WireGuard that just
         | gives you a feed of all the public keys the kernel has seen
         | from initiator messages. With that feed, you wouldn't have to
         | install a single WireGuard peer in advance. They could all just
         | live in a SQLite database (or something), and get installed on-
         | demand as clients try to connect with them.
         | 
         | If you're a VPN provider (for instance), the current API is a
         | little clunky. It's not just that at any given time only a
         | small fraction of your peers are actually in use, though that
         | is probably true. It's also that as the number of peers you
         | handle scales, from hundreds of thousands to tens of millions,
         | you lose the ability to store them all in a single instance of
         | the kernel at all; there are just too many. If peers have to be
         | pre-installed, a consequence of that is that they get locked to
         | specific server machines.
         | 
         | But, as the post points out, you can get a facsimile of the
         | interface you need today with simple packet capture. And Jason
         | set the API up so that you can --- very easily --- flip the
         | initiation from server to client so the connection experience
         | is seamless, even though the kernel dropped the first
         | initiation message (because the peer hadn't been installed).
         | 
         | So that's the idea here.
         | 
         | Jann Horn pointed out that we could have taken this a step
         | farther: we could have held on to the initiation packet that we
         | captured, and, once the peer was installed, replayed the packet
         | into the kernel. Which is also a neat idea.
         | 
         | I don't think there's much in this post that is going to change
         | your life. It's just a couple of neat tricks that we thought
         | people would like to know about.
         | 
         | (Though: the next step for us is to build on this to get
         | "floating peers", de-regionalizing them completely, so users no
         | longer have to think about what region their peers are
         | configured in, which I think will actually have a product
         | benefit for users, unlike this, which has primarily nerd
         | benefit.)
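
The "peers live in a SQLite database and get installed on demand" idea can be sketched as below. The schema, the key names, the addresses, and the `installed` dict standing in for the kernel's peer table are all hypothetical; this is not Fly.io's code, only an illustration of the lookup-then-install shape.

```python
import sqlite3
from typing import Dict, Optional


def open_peer_db() -> sqlite3.Connection:
    """Create an in-memory peer store with a made-up schema and sample rows."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE peers (public_key TEXT PRIMARY KEY, allowed_ip TEXT NOT NULL)"
    )
    db.executemany(
        "INSERT INTO peers VALUES (?, ?)",
        [("alice-key", "fdaa:0:1::2/128"), ("bob-key", "fdaa:0:1::3/128")],
    )
    db.commit()
    return db


def lookup_peer(db: sqlite3.Connection, public_key: str) -> Optional[str]:
    """Return the allowed IP for a known public key, or None if unknown."""
    row = db.execute(
        "SELECT allowed_ip FROM peers WHERE public_key = ?", (public_key,)
    ).fetchone()
    return row[0] if row else None


def on_initiation(db: sqlite3.Connection, installed: Dict[str, str],
                  public_key: str) -> bool:
    """Handle a key seen in an initiation; return True if the peer is installed."""
    if public_key in installed:
        return True
    allowed_ip = lookup_peer(db, public_key)
    if allowed_ip is None:
        return False  # unknown key: nothing to install
    installed[public_key] = allowed_ip  # stand-in for installing the kernel peer
    return True


db = open_peer_db()
installed: Dict[str, str] = {}
results = [on_initiation(db, installed, key)
           for key in ("alice-key", "mallory-key", "alice-key")]
print(results, installed)
```

Nothing is installed until a key actually shows up, so the kernel only ever holds peers that recently tried to connect.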
        
       | justsomehnguy wrote:
       | > and gateways with hundreds of thousands of peers that will
       | never be used again
       | 
       | My thoughts exactly as I was reading the first paragraphs.
       | 
       | > Note that there's no API call to subscribe for "incoming
       | connection attempt" events. That's OK! We can just make our own
       | events. WireGuard connection requests are packets, and they're
       | easily identifiable, so we can efficiently snatch them with a BPF
       | filter and a packet socket.
       | 
       | Nice idea.
       | 
       | > When we get an incoming initiation message, we have the 4-tuple
       | address of the desired connection, including the ephemeral source
       | port flyctl is using. We can install the peer as if we're the
       | initiator, and flyctl is the responder
       | 
       | And this works behind NAT?
        
         | chgs wrote:
         | If the packet goes back to the same ip/port and generated from
         | the same ip/port it will work through nat.
        
         | TheDong wrote:
         | > And this works behind NAT?
         | 
         | Sure, UDP NAT only knows the 4-tuple (say {wggwd.fly.io, 12345,
         | clientIP, 23456}).
         | 
         | Any UDP packet, whether it's a new "initiator" udp packet, or a
         | response to the outgoing initiation message, will look the
         | exact same to any UDP NAT in the way since it only has the
         | 4-tuple to go on, and the 4-tuple is the same.
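
A toy model of that behavior, assuming a NAT that filters on remote address and port and never looks at the payload. The class and the addresses are made up for illustration.

```python
from typing import Dict, Optional, Tuple

Endpoint = Tuple[str, int]  # (ip, port)


class ToyUdpNat:
    """Toy NAT that keys its state purely on addresses and ports."""

    def __init__(self, public_ip: str) -> None:
        self.public_ip = public_ip
        self._next_port = 40000
        self._out: Dict[Tuple[Endpoint, Endpoint], int] = {}  # (private, remote) -> public port
        self._in: Dict[Tuple[int, Endpoint], Endpoint] = {}   # (public port, remote) -> private

    def outbound(self, private: Endpoint, remote: Endpoint) -> Endpoint:
        """Translate an outgoing packet, creating a mapping on first use."""
        key = (private, remote)
        if key not in self._out:
            port = self._next_port
            self._next_port += 1
            self._out[key] = port
            self._in[(port, remote)] = private
        return (self.public_ip, self._out[key])

    def inbound(self, remote: Endpoint, public_port: int,
                payload: bytes) -> Optional[Endpoint]:
        """Return the private endpoint for an incoming packet, or None to drop."""
        # The payload is deliberately ignored: only the 4-tuple matters.
        return self._in.get((public_port, remote))


nat = ToyUdpNat("203.0.113.7")
gateway = ("198.51.100.1", 51820)
client = ("192.168.1.10", 23456)

public = nat.outbound(client, gateway)  # the client's initiation leaves the NAT
as_response = nat.inbound(gateway, public[1], b"\x02")    # a handshake response
as_initiation = nat.inbound(gateway, public[1], b"\x01")  # a fresh initiation
from_stranger = nat.inbound(("198.51.100.99", 51820), public[1], b"\x01")
print(as_response, as_initiation, from_stranger)
```

Both packets from the gateway reach the client because they share the mapped 4-tuple, while the same payload from a different source is dropped.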
        
           | justsomehnguy wrote:
           | The way it's written made me think it's flyctl sending its
           | own ip:port, which would be private ones behind the NAT. If
           | it's the received packet source:port then it's, obviously,
           | already translated.
        
           | ta1243 wrote:
           | Presumably you could have a situation where there's deep-
           | packet inspection on the traffic which would only allow a
           | "handshake response" to come back through, and drop your
           | attempt at an initialisation.
           | 
           | I doubt that happens much, and I assume you'll fall back
           | happily enough to the second time the "client" sends an init
           | packet and then you simply respond with the response packet.
        
       | tinco wrote:
       | While I generally agree with the idea that a direct HTTP request
       | for a single point to point message would be more reliable than
       | routing through a message queue, I'm a little bit surprised that
       | there would be so many messages lost by NATS it had significant
       | impact on their services.
       | 
       | Wouldn't a lost message just mean NATS would retry delivery until
       | successful? Anyone know why they would experience noticeable
       | unreliability?
        
         | dilyevsky wrote:
         | > Wouldn't a lost message just mean NATS would retry delivery
         | until successful? Anyone know why they would experience
         | noticeable unreliability?
         | 
         | I believe if you're using core nats (not JetStream) there's no
         | option for re-delivering like at all.
        
           | tinco wrote:
           | Well, if that's what they did, then using NATS over HTTP in
           | that scenario is just switching a single point of failure
           | for two single points of failure with no feedback on the
           | second point. Did they just pick it for the convenience of
           | the NATS interface and only later realize their mistake?
        
             | emmanueloga_ wrote:
             | I'm guessing they realized late that core nats is a message
             | broker without features like durability and at least once
             | delivery. Jetstream was released less than a year ago (in
             | nats 2.2) so perhaps by the time they made the switch it
             | was not an option.
        
               | caleblloyd wrote:
               | NATS 2.2 with initial JetStream support was actually
               | released 3 years ago.
        
         | klabb3 wrote:
         | I am also very curious to know more. I'm sure the NATS
         | maintainers are as well. Their architecture is extremely
         | intuitive and appealing to me, so I wonder where things went
         | south. NATS has a lot of tunable parameters with JetStream. For
         | instance, an in-memory stream with a time-based duplicate
         | detection window, both push and pull semantics, and
         | configurable re-delivery and ack policies.
         | 
         | The one thing where I can see an impedance mismatch is
         | ephemeral single-message connections which I don't _think_ it's
         | built for. In either case, more details would be invaluable.
         | (Fly folks, please consider sharing!)
        
         | tptacek wrote:
         | We're not dunking on NATS. We were probably holding it wrong.
         | But it turns out, we didn't need it; a message layer wasn't
         | really making things any more expressive, just harder to test
         | and monitor.
        
           | re5i5tor wrote:
           | "holding it wrong" OMG ;-)
        
       | dpeckett wrote:
       | Might as well take the opportunity to shill one of my recent
       | experimental projects. If you are interested in building Go apps
       | that act as userspace WireGuard peers take a look at
       | https://github.com/dpeckett/noisysockets
       | 
       | Based off the excellent work done by wireguard-go, but I've
       | attempted to simplify and make things a lot more idiomatic for
       | library use.
       | 
       | I reckon building a service mesh out of it would be interesting,
       | obviously supporting multiple languages would be hard but maybe
       | you could implement a sockets API. Though it might be hard to
       | compete performance wise with mTLS as I've not seen HW
       | acceleration for WireGuard's crypto yet.
       | 
       | FWIW, I'm currently on the market for freelancing roles, so if
       | you're interested in Golang freelancers in the high-speed/secure
       | networking space, please reach out (email is in my profile).
        
         | bscphil wrote:
         | This looks great, nice work!
         | 
         | I have a dream of taking a userspace Wireguard project like
         | this and gluing PAKE over a relay in front of it to exchange
         | Wireguard keys, followed by holepunching and establishing a
         | direct tunnel. Basically Magic Wormhole for arbitrary tunnels -
         | and hopefully vastly improved performance for the file transfer
         | case as well, rather than crapping out at 20-30 MB/s over long
         | fat networks.
        
           | xyzzy_plugh wrote:
           | You're basically describing Tailscale.
        
             | mbreese wrote:
             | Maybe, but I read it as wormholes at the application level.
             | Which would be awesome for me.
             | 
             | I do a lot of work on remote servers for heavy data
             | processing (large files). Sometimes I need to get these
             | files back to my local computer, or just view figures I've
             | generated using remote data. I have a small reverse tunnel
             | daemon setup where I can (in my remote terminal) send
             | arbitrary files or data back to my local computer.
             | $ rtun send file.txt
             | $ rtun view data.pdf
             | 
             | It is a daemon that listens locally and when I setup SSH, a
             | reverse tunnel is configured from one Unix socket to
             | another. These are multiuser servers and Unix sockets
             | handle authorization easily. The same setup could be
             | handled similarly with a zmodem-like process and a terminal
             | that is aware of it.
             | 
             | However, this setup is a bit cumbersome and I can't just
             | run `ssh host`. If I could have the same setup with an in-
             | app wireguard tunnel, with no other setup, it would be
             | amazing.
        
               | dpeckett wrote:
               | Hmm, interesting use case. I know a lot of data science
               | folks are fans of SSHFS but it sounds like you have kind
               | of got an inversion of control thing going on. Is there a
               | reason you aren't mounting the remote filesystem on your
               | local machine and manipulating files/data via that?
               | 
               | I've been looking for interesting applications for
               | noisysockets, and I'm obsessed with network attached
               | storage so this is a fun one for me.
        
               | mbreese wrote:
               | I have issues trusting SSHFS. It's never been stable
               | enough for me. Maybe it's because I have to go through at
               | least one ssh proxy, in addition to a VPN. Maybe it's
               | that the remote filesystem is slow enough, so trying to
               | do anything remotely is very slow.
               | 
               | But really, I think it's that I'm already in a terminal
               | connected to a remote system. I don't want to have to go
               | to a different terminal to try and transfer data that I'm
               | already looking at. And trying to use a Finder window (or
               | explorer) to navigate a complex remote filesystem
               | hierarchy isn't fun.
               | 
               | Occasionally I can do my work locally, but usually the
               | data is large enough that I have to do my work on a
               | remote server/cluster. When I generate figures describing
               | my data, I want to see those locally. This particular
               | use-case could be solved by using something like Xpdf,
               | but it's easier to send the figure back to my local
               | machine and view it with Preview.app. If I could view a
               | PDF figure in my terminal -- I'd probably just do that.
               | 
               | I also sometimes do need to send datafiles back to my
               | local computer. In these cases, I could use sshfs (but
               | don't like the duelling terminals) or scp (but my file
               | paths can be long and complicated, so typing out paths is
               | a pain). I used to actually just handle this with
               | Dropbox. I'd have a program that would send files to a
               | specific Dropbox folder and that would then sync to my
               | local computer. That worked well, but the delay between
               | syncing was an issue. I've also tried using a remote
               | tunnel back to an SSH server running on my laptop. That
               | was equally cumbersome.
               | 
               | The two other limits I would like to avoid: 1) no ports,
               | unix sockets only, and 2) no coordinating servers. I try
               | to use only unix sockets so that I can use normal unix
               | permissions to protect the socket. I'd rather not also
               | include an authentication layer if I can avoid it. I'd
               | also like to not need a third-party server to coordinate
               | the connection between the server and my local computer.
               | My remote servers are behind VPNs and while they can
               | access the internet, it's heavily firewalled. If I
               | already have an SSH connection to the remote server, I'd
               | like to tunnel through this if at all possible.
               | 
               | Here's the code/project I wrote to manage this:
               | https://github.com/mbreese/rtun
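               | 
               | The "normal unix permissions to protect the socket" point
               | can be sketched as follows. This is an illustration only,
               | not code from rtun: the function names are hypothetical,
               | and it assumes Linux, where connecting to a Unix socket
               | requires write permission on the socket file (some other
               | systems ignore that mode, so a private parent directory
               | is the more portable guard).

```python
import os
import socket
import stat


def listen_private(path: str) -> socket.socket:
    """Bind a Unix stream socket that only the owning user can connect to."""
    if os.path.exists(path):
        os.unlink(path)  # remove a stale socket left by a previous run
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    # The socket file's mode is derived from the umask at bind() time, so
    # tighten the umask temporarily to have the file created as 0600.
    old_umask = os.umask(0o177)
    try:
        server.bind(path)
    except OSError:
        server.close()
        raise
    finally:
        os.umask(old_umask)
    server.listen(1)
    return server


def socket_mode(path: str) -> int:
    """Return the permission bits of the socket file."""
    return stat.S_IMODE(os.stat(path).st_mode)
```

               | No separate authentication layer is involved here: the
               | kernel's file permission check decides who may connect.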
        
               | apitman wrote:
               | I maintain this list:
               | 
               | https://github.com/anderspitman/awesome-tunneling
               | 
               | Your use case sounds interesting and there may be a tool
               | out there that will do it, but I can't quite wrap my head
               | around your description of how everything is connected
               | and what runs where with your current setup.
               | 
               | I agree with sibling that my main question is what
               | prevents you from using SSHFS or similar?
        
             | ikiris wrote:
             | I for one would appreciate a tailscale that actually does
             | proper networking instead of their weird static routes but
             | not thing.
        
         | gtirloni wrote:
         | Pretty good. Is Noisy Transport something similar to Slack's
         | Nebula [0] in a way or am I mixing things up?
         | 
         | 0 - https://github.com/slackhq/nebula
        
           | dpeckett wrote:
           | There are some similarities and differences: instead of going
           | through a TUN/TAP device, each application communicates
           | directly with the mesh using its own userspace TCP/IP stack
           | (and also retains compatibility with WireGuard).
        
             | johnmaguire wrote:
             | (I am a Nebula maintainer.) We recently merged support for
             | gVisor-based services, although it's very new, and I don't
             | know of much experimentation that's been done with it yet:
             | https://github.com/slackhq/nebula/pull/965
        
               | dpeckett wrote:
               | Really cool! I guess there are few differences, other
               | than that noisysockets is explicitly WireGuard
               | compatible (but I actually prefer Nebula's tweaks to the
               | wire protocol).
               | 
               | I actually had some folks from OpenZiti reach out and
               | they have another similar tool in this space.
        
               | PLG88 wrote:
               | Some similarities: though when app embedded we are not
               | using a user space TCP stack, we have a suite of SDKs so
               | that you can embed the ingress/egress, authentication,
               | mTLS, E2EE, etc directly into your app, really easily -
               | https://openziti.io/docs/reference/developer/sdk/. When
               | using the Ziti tunnelers on host OS we use a TCP stack.
        
       | mdavid626 wrote:
       | What does this mean?
       | 
       | "We've gone a step beyond that: every time you run flyctl, our
       | lovable, sprawling CLI, it conjures a TCP/IP stack out of thin
       | air, with its own IPv6 address, and speaks directly to Fly
       | Machines running on our networks."
        
         | makeworld wrote:
         | They're saying Wireguard is used.
        
           | Spivak wrote:
           | More than that. They're saying that they're running a
           | userspace implementation of both the tcp/ip stack and
           | wireguard. Your machine isn't a peer to the wireguard tunnel,
           | only flyctl is.
        
             | dementik wrote:
             | Of course, your machine can be a peer also. You can create
             | peers to your organization with `flyctl wireguard create`.
        
               | carl_dr wrote:
               | But that isn't the case for all of the other uses of
               | flyctl.
               | 
               | You can't make connections with other processes over the
               | same connection flyctl makes when it's doing stuff.
               | 
               | That is what GP is saying.
        
         | zrail wrote:
         | They wrote a blog post all about it: https://fly.io/blog/our-
         | user-mode-wireguard-year/
        
         | ape4 wrote:
         | It's hard for me to think of a sprawling CLI that I love.
        
         | freedomben wrote:
         | This basically means they are using userspace Wireguard, such
         | as the go implementation. That is as opposed to the in-kernel
         | wireguard.
         | 
         | The reason to mention conjuring a TCP/IP stack out of thin air
         | is that most of the time the operating system provides the
         | TCP/IP stack as part of the kernel. With wireguard-go, the
         | TCP/IP stack runs in user space, meaning it can be created in a
         | normal user space process such as the flyctl command line
         | interface. This is indeed quite magical for people who have
         | been around for many years, as a usable in-process userspace
         | TCP/IP stack is a relatively new and novel thing.
        
           | cchance wrote:
           | I always felt weird about this. I still can't really wrap my
           | head around userspace TCP/IP stacks, it just feels... weird
        
             | _JamesA_ wrote:
             | There was a time that's all we had when using Windows [1].
             | It was weird.
             | 
             | [1]: https://retrocomputing.stackexchange.com/questions/589
             | 1/how-...
        
       | sneak wrote:
       | Fun fact: the WireGuard macOS client application cannot work on
       | macOS unless it's distributed via the App Store - Apple simply
       | will not provide the required VPN entitlements for web/self
       | distributed apps. You can use the commandline wg tools (which use
       | a different OS API) but not the GUI ones.
       | 
       | This means that WireGuard-the-org can't distribute the mac GUI
       | app directly, and if you want to use the official WireGuard app
       | (eg for accessing a VPN for privacy), somewhat ironically you
       | have to ID yourself (which requires a phone number) and your
       | computer's hardware serial to Apple first to download it via the
       | App Store. It smacks of the totalitarian rms "right to read"
       | essay - provide strong ID and hardware unique identifiers to be
       | allowed to download and use privacy software on your own
       | computer.
       | 
       | https://www.wireguard.com/install/
       | 
       | While not directly relevant to fly.io, I figured this might be
       | relevant in the context of Apple's other anticompetitive actions
       | this week related to web-based native app distribution on iOS in
       | the EU for the DMA. This issue with VPN apps has been true on
       | macOS for years.
       | 
       | I personally didn't know why WireGuard didn't offer direct
       | downloads, but I emailed Jason Donenfeld a couple years ago and
       | he let me know that Apple has been restricting these APIs in non-
       | AppStore apps, which was news to me, as I didn't know that Apple
       | had started any of their AppStore-only bullshit on macOS.
        
         | Corrado wrote:
         | This is not entirely accurate. macOS now has System
         | Extensions[0], which allow you to tie into the network to
         | provide VPN services. This is what Tailscale uses in their
         | non-App Store downloaded app.
         | 
         | [0] https://developer.apple.com/system-extensions/
        
           | sneak wrote:
           | https://developer.apple.com/documentation/bundleresources/en.
           | ..
           | 
           | It appears that they will now give out these entitlements for
           | non-MAS apps, but only to members of their developer program.
           | 
           | This means that you can't build one that will work on macOS
           | without IDing yourself to Apple, so you still can't build it
           | from scratch from source (as a non-developer-program person)
           | and expect it to work (without disabling SIP).
        
         | traceroute66 wrote:
         | > the WireGuard macOS client application cannot work on macOS
         | unless it's distributed via the App Store
         | 
         | That's bullshit and I'm pretty sure you know it. This is a
         | technical forum and you should know better than to spread
         | unsubstantiated Apple-bashing FUD.
         | 
         | MullvadVPN and ProtonVPN are two very well known VPN providers
         | whose VPN apps you will not find on the Apple App Store and
         | instead you are invited to download from their respective
         | websites. It has been that way basically forever (or at least a
         | very long time now we're in 2024), so it's not like it's a
         | reflection of a recent Apple change either.
         | 
         | I actually quite like the fact that WireGuard is distributed
         | via the Apple App Store and I thank them for doing that. It
         | makes updating much easier, rather than having a dozen apps
         | each with their own update process.
        
           | sneak wrote:
           | Those use different (less efficient, older, and perhaps
           | deprecated-in-the-future) APIs.
           | 
           | I encourage you to try to build a working copy of the
           | official WireGuard macOS app from source without an ADP
           | membership. You can't.
           | 
           | https://git.zx2c4.com/wireguard-apple/about/
           | 
           | https://git.zx2c4.com/wireguard-
           | apple/tree/Sources/WireGuard...
        
             | traceroute66 wrote:
             | > Those use different (less efficient, older, and perhaps
             | deprecated-in-the-future) APIs.
             | 
             | Your original statement is still bullshit.
             | 
             | You said "Apple simply will not provide the required VPN
             | entitlements for web/self distributed apps".
             | 
             | I've clearly demonstrated that statement is just plain
             | wrong. MullvadVPN and ProtonVPN are not distributed via the
             | AppStore and they work and are not blocked by macOS.
        
             | bananapub wrote:
             | > Those use different (less efficient, older, and perhaps
             | deprecated-in-the-future) APIs.
             | 
             | then correct your initial post, which says something
             | entirely different:
             | 
             | > Fun fact: the WireGuard macOS client application cannot
             | work on macOS unless it's distributed via the App Store -
             | Apple simply will not provide the required VPN entitlements
             | for web/self distributed apps. You can use the commandline
             | wg tools (which use a different OS API) but not the GUI
             | ones.
             | 
             | it really is increasingly annoying that people just post
             | nonsense on HN to back up their dumb prejudices.
        
         | apitman wrote:
         | Are you sure this isn't about Apple trying to prevent users
         | from accidentally installing backdoor VPNs from apps that Apple
         | hasn't vetted? I've run into the same thing on Windows. You
         | basically can't even run a Golang app unless it's signed, and
         | code signing is basically a racket (~$400/year) unless you
         | distribute through the official stores.
         | 
         | Not defending the current state of things, but I can
         | understand how we got here and it may not be malicious on
         | Apple's part.
        
       | tschumacher wrote:
       | Sounds like a whole lot of effort to avoid a GraphQL request each
       | time a flyctl client wants to connect.
        
         | sudhirj wrote:
         | Huh? They do make one to set it up. More a way to avoid having
         | the public keys of every single client ever loaded up into the
         | wireguard kernel module on the gateway all the time.
        
           | tschumacher wrote:
           | My implicit suggestion was that clients make a GraphQL
           | request not only before the first connection but before every
           | connection. The gateway server can insert the keys into the
           | kernel in response to an explicit GraphQL request instead of
           | in response to some complicated packet sniffing.
        
             | rudasn wrote:
             | What would the payload of the GraphQL request to fetch the
             | wg config for that peer look like, when they don't know
             | which peer the request is coming from?
        
             | mrkurt wrote:
             | This needs to support any ol' wireguard client. We use it
             | in `flyctl` but people also use it to create gateways so
             | they can, eg, peer with VPCs.
        
       | chairmanwow1 wrote:
       | My startup used Fly for almost a year. The core feature of code
       | to deployed code in less than a minute is beautiful. Spinning up
       | / down new nodes for backfills takes seconds.
       | 
       | But the company feels a little immature. Once our API server
       | became unreachable in Fly for 48 hours. I'm not sure if it was my
       | fault for getting config wrong or if they just had another
       | "silent" failure. They have a "db" product, but it's "not managed
       | postgres". Would get consistent disconnections from that. Just
       | feels weird for them to add a top level noun in their cli for
       | postgres and then limit the extent it's a feature they support.
       | 
       | API access to their core service would frequently go down and
       | leave us waiting to deploy new service fixes.
       | 
       | I miss the deployment experience, but I'm frankly happier with
       | Cloud Run on GCP. Just way fewer "surprises" and much more
       | complete documentation.
        
         | ZeroCool2u wrote:
         | Fly looks great to me, though I've never had a chance to use
         | it. For what it's worth though, Cloud Run on GCP is one of my
         | top 3 favorite infrastructure/deployment tools, so you're
         | setting the bar pretty high.
        
           | icedchai wrote:
           | Cloud Run is pretty slick... services, batch jobs, easy to
           | deploy, very flexible.
        
           | WuxiFingerHold wrote:
           | > so you're setting the bar pretty high
           | 
           | Why would anyone choose a provider that is inferior (apart
           | from toying)?
           | 
           | I mainly read two kind of stories from fly.io. Their
           | promotional, but well written and interesting technical blogs
           | like this one and stories about issues with their services
           | and miscommunication. So, despite liking their blogs I don't
           | consider using it.
        
             | apitman wrote:
             | Part of the reason I stick with Fly.io is because I want a
             | rock solid version of what they do to exist, and they're
             | the most likely people to eventually get there.
             | 
             | That said, I've had very few issues with their platform,
             | and I don't think it's ever caused downtime for my
             | (admittedly very small) service.
        
         | carderne wrote:
         | Same exact experience, a year on Fly but moved to GCP (GKE in
         | our case for reasons) a month or two ago. Super slick when it
         | worked, but that wasn't often enough...
        
         | apitman wrote:
         | The deployment experience is awesome, but for me[0] the killer
         | feature of Fly.io is their Anycast network and features such as
         | FLY_REPLAY and LiteFS that make clustering a breeze[1].
         | 
         | It's wild to me how little support VPS providers have for
         | reducing latency of backend services for users. None of them
         | support Anycast, and there are very few GeoDNS options (it adds
         | more complexity besides).
         | 
         | I just wish Fly.io had cheaper data transfer, since I'm
         | currently having to re-implement (poorly) a lot of their
         | features for an ngrok-like service I'm working on.
         | 
         | [0]: using them for https://lastlogin.io
         | 
         | [1]: Here's all the fly-specific code necessary to run
         | LastLogin in a globally distributed way:
         | https://github.com/lastlogin-io/obligator/blob/37f75cc861f1b...
        
       | cushpush wrote:
       | What a chart.
        
       | apitman wrote:
       | Interesting that they're defaulting to tunneling WireGuard over
       | WebSockets. Not great for performance but probably fine for the
       | DevOpsy stuff flyctl is used for. This is something I've wondered
       | about for the future of QUIC/HTTP3. There's a nonzero chance
       | network operators will just block UDP on port 443 altogether
       | rather than properly handling it.
        
         | tptacek wrote:
         | You can absolutely use native WireGuard, including from
         | `flyctl` (it's an option you can set). When UDP doesn't work,
         | it doesn't work at all, and it's hard to debug, so our default
         | is for the thing that we know will work.
         | 
         | (I say this ruefully, having lost the argument about what our
         | default should be.)
        
           | apitman wrote:
           | I'm assuming you decided doing some sort of happy eyeballs
           | type thing isn't worth the complexity?
        
       | unixhero wrote:
       | How might I take an arbitrary dockerized application and deploy
       | it on Fly.io? Please take my money
        
         | tptacek wrote:
         | cd directory/with/Dockerfile
         | flyctl launch
        
       ___________________________________________________________________
       (page generated 2024-03-13 23:01 UTC)