[HN Gopher] Hooking Go from Rust
       ___________________________________________________________________
        
       Hooking Go from Rust
        
       Author : carride
       Score  : 125 points
       Date   : 2022-08-26 14:52 UTC (8 hours ago)
        
 (HTM) web link (metalbear.co)
 (TXT) w3m dump (metalbear.co)
        
       | arriu wrote:
       | This is really cool. Thanks for sharing.
       | 
       | Does anyone have any recommendations on how to do something like
       | this but in reverse? Calling a Go function from Rust?
        
         | aviramha wrote:
         | Thanks!
         | 
         | Do you mean for an already compiled Go binary? If you can
         | recompile, you can use cgo.
         | 
         | If not, the following requirements come to mind (there are
         | probably more):
         | 
         | 1. Allocate g and m if they don't exist in the current process.
         | 2. Switch to the Go stack before the call.
         | 
         | Btw there's the cgocallback routine that manages C code that
         | calls into Go, so that's a good place to look to see what's
         | needed.
        
       | metadat wrote:
       | This is badass, thanks for sharing @carride. In celebration that
       | it's Friday, I'm going to test it out right now.
       | 
       | In case the author happens to see this and can respond, how did
       | you figure all of this out? Would really enjoy a deep dive
       | covering your process. This is brilliant.
       | 
       | I also wonder if it could be extended to Nim, Zig, or any other
       | less-straightforward hookable languages [than C/C++].
       | 
       | Edit: Apologies, maybe this is not that interesting of a
       | wonderment. I did a little research and both Nim and Zig
       | interface with libc.
       | 
       | I'm not yet clear on whether node.js or Deno use libc, if any
       | fellow HNers know please leave a reply! If they don't use libc,
       | they could be interesting targets.
        
         | aviramha wrote:
         | Thanks for the kind words. One of the authors here :) Actually
         | this blog post started from a Twitter thread (that I forgot to
         | link in the post
         | https://twitter.com/Aviramyh/status/1544964265961979905) - if
         | you have any more questions, feel free to ask. We didn't want
         | to go "too low level" so the common dev would enjoy this :)
         | 
         | Regarding extending, probably possible - I think it'd be easier
         | with Zig (no weird stacks AFAIK) but we don't need it as they
         | probably just use libc like most languages.
         | 
         | P.S. would love hearing your thoughts about mirrord
        
         | generichuman wrote:
         | > Edit: Apologies, maybe this is not that interesting of a
         | wonderment. I did a little research and both Nim and Zig
         | interface with libc.
         | 
         | Zig does not depend on libc on Linux, for example:
         | https://github.com/ziglang/zig/blob/master/lib/std/os/linux....
         | 
         | You can choose to link libc though.
        
         | infiniteregrets wrote:
         | Hey, seems like node uses libuv and in fact we did a blogpost
         | on it!
         | https://metalbear.co/blog/mirrord-internals-hooking-libc-fun...
        
       | ithrow wrote:
       | Why doesn't Go just use "syscalls" on macOS too?
        
         | aviramha wrote:
         | They tried, but macOS doesn't have a stable user<>kernel API,
         | so you have to rely on libSystem to provide it. (It broke very
         | often, so they changed it to use libSystem.)
        
         | infiniteregrets wrote:
         | > Go used to do raw system calls on macOS, and binaries were
         | occasionally broken by kernel updates. Now Go uses libc on
         | macOS, and binaries are forward compatible with future macOS
         | versions just like any other C/C++/ObjC/swift program. OS X
         | 10.10 (Yosemite) is the current minimum supported version.
         | 
         | also found a discussion -
         | https://news.ycombinator.com/item?id=18439100 that points to
         | why this might be the case
        
       | jgavris wrote:
       | How big is your Go codebase versus your Rust codebase? Why not
       | rewrite bits in Rust? This is very cool, but seems like a lot of
       | effort and ongoing maintenance.
        
         | aviramha wrote:
         | It's not the case actually. We're working on a dev tool called
         | mirrord that lets you create local processes in the context of
         | a remote environment, so the ergonomics are those of a local
         | setup with the benefits of leveraging real cloud environments.
         | The way we accomplish that is we hook syscalls and then choose
         | what happens locally and what happens remotely.
        
       | scottlamb wrote:
       | Isn't there something important missing here? My understanding is
       | that Go's non-standard ABI doesn't guarantee much available stack
       | space or the existence of guard pages. IIUC, this places a
       | standard-ABI-like stack frame on the Go stack for calling into
       | Rust, but what guarantee is there that the Rust/libc code won't
       | overflow the small stack? What happens if it does? AFAICT, the
       | answers are "none" and "memory corruption".
        
         | aviramha wrote:
         | We actually address it in the post -
         | https://metalbear.co/blog/hooking-go-from-rust-hitchhikers-g...
         | 
         | > "Goroutine stack is dynamic, i.e. it is constantly
         | expanding/shrinking depending on the current needs. This means
         | any common code that runs in system stack assumes it can grow
         | as it wishes (until it exceeds max stack size) while actually,
         | it can't unless using Go APIs for expanding. Our Rust code
         | isn't aware of it, so it uses parts of the stack that aren't
         | actually usable and causes stack overflow."
         | 
         | tl;dr - yes, you're correct and we replace Go stack with system
         | stack for the duration of the call, just like cgo does.
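
A minimal sketch, not taken from the post, of the stack growth described in the quoted passage. It assumes a 64-bit Go toolchain; `sumTo` and the recursion depth are hypothetical names and values chosen only for illustration. A goroutine starts with a small stack, and the checks the Go compiler emits in function prologues let the runtime grow it on demand. Foreign code such as Rust carries no such checks, which is why the hook has to switch stacks first.

```go
package main

import "fmt"

// sumTo recurses n levels deep. The addition happens after the
// recursive call returns, so every level needs its own stack frame.
func sumTo(n int64) int64 {
	if n == 0 {
		return 0
	}
	return n + sumTo(n-1)
}

func main() {
	done := make(chan int64)
	// Run the recursion on a fresh goroutine, which starts with a
	// small stack. The runtime grows it transparently as frames pile up.
	go func() { done <- sumTo(100000) }()
	fmt.Println(<-done)
}
```

The recursion completes only because the compiled Go code cooperates with the runtime. The same depth of native frames placed directly on a fresh goroutine stack would run past its end.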
        
           | scottlamb wrote:
           | Thanks, I somehow missed that whole section!
        
       | zasdffaa wrote:
       | > Golang doesn't use libc on Linux, and instead calls syscalls
       | directly
       | 
       | Could someone explain this? I'm not familiar with low-level Linux
       | stuff. Why would you choose not to use libc, what are the
       | implications?
        
         | rapidlua wrote:
         | This is mostly due to ease of distribution. The binary is self-
         | contained, no need to ship anything but the binary itself. You
         | can also compile locally and then run it elsewhere that is a
         | completely different flavor of Linux. Quite handy when
         | experimenting. Finally, no dependency on libc enables easy
         | cross-compilation. You might not even have libc installed for
         | the architecture you are targeting!
        
         | aviramha wrote:
         | Hey, one of the writers of the article here. This is a very
         | controversial topic in the Go ecosystem, and following all the
         | discussions that led to this decision is very hard. In short, I
         | think the main reasons are:
         | 
         | 1. libc is considered a hazard - security, comfort, runtime. It
         | needs to maintain support for so many flows and setups that it's
         | hard to make things better. For example, having `errno` as a
         | global is quite weird (instead of just returning it, which is a
         | whole discussion of its own).
         | 
         | 2. Golang devs like to re-invent the wheel, in a kind of
         | Apple-ish way - all others are doing it wrong, we're going to do
         | it better. Ofc it's debatable whether they're correct with their
         | approach.
         | 
         | 3. Given they use a Plan 9 system design, doing FFI from Go is
         | very expensive (need to switch stack, save context, etc. on each
         | call).
        
           | infiniteregrets wrote:
           | A relevant discussion:
           | https://news.ycombinator.com/item?id=25997506
        
             | metadat wrote:
             | ^ (Go 1.16 will make system calls through Libc on OpenBSD)
        
           | tialaramex wrote:
           | > For example, having `errno` as a global is quite weird
           | 
           | It was quite weird before we wrote concurrent software. Once
           | you're writing concurrent software it's _very_ weird because
           | as originally documented this single error location is shared
           | by all threads, so it necessarily gets concurrently modified
           | while being read, a Data Race.
           | 
           | Your 2022 C library typically defines it as Thread Local
           | instead, so when you call some_c_function() and then check
           | errno, you aren't confused by the fact meanwhile another
           | thread tried to open("nonexistent.file") and so _their_
           | thread set errno to report that the file doesn't exist.
           | 
           | Given that none of this fuss is how the actual system calls
           | work, it's understandable that people might want to sidestep
           | it.
        
             | aviramha wrote:
             | Thanks for elaborating! I totally agree there are valid
             | reasons to drop libc, though I'm not sure if that's the
             | _right way_. In most modern languages today you can just
             | return a tuple, without needing to mess with TLS, whereas
             | in the old days people didn't really like passing results
             | in out parameters (by ref), which makes sense, as it's less
             | ergonomic.
        
               | throwaway894345 wrote:
               | I much prefer the ergonomics of passing out-parameters
               | over TLS. Honestly, TLS has always struck me as a code
               | smell in any language.
        
           | throwaway894345 wrote:
           | > Golang devs like to re-invent the wheel, in a kind of
           | Apple-ish way - all others are doing it wrong, we're going to
           | do it better. Ofc it's debatable whether they're correct with
           | their approach.
           | 
           | I'm very happy with these tradeoffs. Not only is it a
           | pleasure not to have to deal with libc (truly static binaries
           | on Linux, 2mb docker images, etc), but because interop is
           | tedious/expensive, the ecosystem is largely pure-Go. This
           | means that we rarely (if ever) have to deal with C's abysmal
           | build tooling and (lack of) package managers and stuff like
           | cross compilation is trivial. Everything builds consistently
           | on every system, every time, and irrespective of target
           | platform (provided there are no C dependencies).
        
             | aviramha wrote:
             | My concern here is that many people are now writing
             | software in Go which can only be used from Go. Yes, you can
             | use cgo, but as you wrote the interop is tedious/expensive.
             | It's good we're rewriting old software, but this time in a
             | semi-closed garden...
        
               | throwaway894345 wrote:
               | I mean, the garden _has always been_ semi-closed, we've
               | never had a truly open garden, and nothing Go did was
               | going to change that. In other words, what language has
               | ever solved the problem of cross-language-reuse in a way
               | that was so simple that we didn't end up rewriting a huge
               | swath of the ecosystem in every new language? It's not
               | like Java or Rust came along and now we can stop
               | rewriting HTTP libraries in subsequent languages.
               | 
               | Easy interop has only ever been an illusion (we forget
               | about the difficulties of integrating distinct build
               | systems and memory management systems or the runtime
               | costs of marshaling data between memory layouts). Rather
               | than paying significant costs to preserve that illusion,
               | Go doubled down in order to realize some pretty
               | considerable gains: easy compilation even on niche
               | distros, even when cross compiling, truly static
               | binaries, etc.
        
               | aviramha wrote:
               | We're sliding into opinions but using C, Python, Rust in
               | each interop had some cost but I did it and got a lot of
               | benefits of using existing good software. Yes C lacks dev
               | ergonomics, but meson + ninja gave a very neat
               | experience..
        
               | throwaway894345 wrote:
               | > using C, Python, Rust in each interop had some cost but
               | I did it and got a lot of benefits of using existing good
               | software
               | 
               | I don't dispute this, but that is still a niche case. The
               | general case is that library software gets rewritten in
               | each language (and if it's not worth the trouble to
               | rewrite it, it's usually not worth the trouble to use and
               | maintain bindings). So yes, there is a tradeoff, I'm just
               | expressing my opinion that Go made the better tradeoff,
               | at least for a garbage collected language (since you
               | mentioned it, Python trades off a boatload of performance
               | and package management simplicity in its pursuit of easy
               | C interop).
               | 
               | > Yes C lacks dev ergonomics, but meson + ninja gave a
               | very neat experience..
               | 
               | My point about abysmal C build/package tooling wasn't
               | about the experience of a C developer, but the experience
               | of someone who is downstream of C packages. As a Python
               | developer, I don't get to choose how my C dependencies
               | are packaged for Python (which build system and package
               | manager they use)--I'm just at the mercy of whatever they
               | offer, and if I'm targeting some non-mainstream Linux
               | distro, then there's a good chance some build or runtime
               | dependency of some upstream C project is going to fail,
               | and now I need to grok that C project's build/linking
               | system to sort it out. This just doesn't happen in (pure)
               | Go because Go's build system and package manager are
               | reproducible (no implicit dependencies).
        
               | aviramha wrote:
               | Fair point.
        
       | nedsma wrote:
       | This is plain and simply incredible.
        
       ___________________________________________________________________
       (page generated 2022-08-26 23:01 UTC)