[HN Gopher] Using Firecracker and Go to run short, untrusted cod...
       ___________________________________________________________________
        
       Using Firecracker and Go to run short, untrusted code execution
       jobs (2021)
        
       Author : hasheddan
       Score  : 144 points
       Date   : 2022-08-02 12:01 UTC (10 hours ago)
        
 (HTM) web link (stanislas.blog)
 (TXT) w3m dump (stanislas.blog)
        
       | solarengineer wrote:
       | Illumos Zones are another worthwhile technology to explore.
        
         | tptacek wrote:
         | Illumos Zones are shared-kernel isolation, which isn't safe for
         | multitenant untrusted workloads. They're a great way to
         | segregate components of microservice ensembles from a single
         | tenant, to reduce blast radius.
        
       | AccountAccount1 wrote:
       | Isn't WebAssembly in the browser suited for these kinds of
       | problems? You could run the code, spit out the benchmarks and
       | save that (maybe saving the code and the submission time for
       | validation)?
       | 
       | Does anyone know about this?
        
         | paulgb wrote:
         | Even for languages that can compile to wasm (Go, Rust), the
         | compilation toolchain doesn't necessarily run in the browser.
         | AFAIK there's no good way to run LLVM in wasm yet. You could do
         | the compilation on the server and then send it back down to the
         | browser to run, though.
         | 
         | Theoretically, it should be possible to do it all in the
         | browser, just a lot more work given the state of things now.
        
           | steveklabnik wrote:
           | Someone did port clang to the browser, though I don't have
           | a moment to Google it right now. So LLVM has been done at
           | least once.
        
             | paulgb wrote:
             | Oh cool, so they did! https://binji.github.io/wasm-clang/
        
               | azakai wrote:
               | Here are some more ports of LLVM and clang to wasm/the
               | browser:
               | 
               | * https://github.com/kripken/llvm-js
               | 
               | * https://github.com/tbfleming/cib
               | 
               | * https://github.com/jprendes/emception
               | 
               | LLVM just does pure computation, really, so it's not hard
               | to port to wasm - much simpler than say Python (which has
               | also been ported several times). The only challenges with
               | LLVM are the build system (which has self-execution),
               | working around some issues like clang wanting to open a
               | subprocess, and adding some ifdefs.
        
               | nerpderp82 wrote:
               | You are much closer to Clang/LLVM than I am, how likely
               | is it that these changes would get upstreamed? I see the
               | clang.wasm ports, but really I'd like to get in-tree
               | builds of each commit. I find clang.wasm extremely
               | useful, but all I have are snapshots and I don't have
               | the skills (maybe the patience) to get main continuously
               | building to wasm.
        
               | azakai wrote:
               | Hmm, I'm not a regular LLVM committer myself, but I think
               | there's a chance.
               | 
               | The history here has been several ports "for fun", so no
               | one has tried to upstream anything. But if you have a
               | real use case that could benefit from this, we should
               | talk with the LLVM people and see. Feel free to file an
               | issue and cc me and we'll find the right people.
        
       | kodka wrote:
       | TIL about Firecracker, thank you for that! :) Great article and
       | project. I didn't understand the idea of using RabbitMQ and
       | fetching new events in a loop. I am not an architect, but I
       | was thinking that real-time databases are used for those
       | purposes?
       | 
       | About the containers - if you are not running them in privileged
       | mode, you should be pretty secure, especially if you limit which
       | binaries the containers have.
        
         | tptacek wrote:
         | Containers share kernels between tenants; that's the point of
         | the design. The shared kernel is a huge security problem; it's
         | easy to rattle off kernel LPEs that no realistic syscall filter
         | would have prevented and that would pop a container runtime.
         | 
         | It is not similarly easy to do that with a lightweight
         | hypervisor.
        
           | nerpderp82 wrote:
           | How so? virtio is pretty small surface area, go out far
           | enough and everything is pure computation.
        
             | tptacek wrote:
             | My argument is empirical. I can name recent LPEs that
             | bypass MAC and sandbox policies; it is not easy to name
             | comparable hypervisor escapes. Shared-kernel container
             | escapes are found so often they're not even all that
             | memorable.
        
               | nerpderp82 wrote:
               | > Shared-kernel container escapes are found so often
               | they're not even all that memorable.
               | 
               | Agreed. I realized I inverted your hypervisor comment.
               | Hypervisors have the compact contract that has any
               | reasonable chance at being audited. Container security is
               | basically a screen door.
        
         | paulgb wrote:
         | With a real-time database, each worker could subscribe to be
         | notified when the set of tasks changed, but it would require a
         | separate locking mechanism in the application to ensure that
         | each task is only attempted by one worker. (Imagine a scenario
         | in which a task arrives when there are multiple workers idle.)
         | 
         | With RabbitMQ, you ensure* that each task is only attempted by
         | one worker at a time, and you don't have to do anything special
         | to ensure that at the application level.
         | 
         | *I'm simplifying a bit, there are edge cases where e.g. you
         | lose a worker that has already started a task but not completed
         | it.
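
The delivery guarantee described above can be sketched in-process with a
Go channel standing in for the queue. This is an analogy under stated
assumptions, not RabbitMQ: the names runWorkers and the task strings are
hypothetical, and the edge case mentioned in the footnote (a worker dying
mid-task, which a real broker handles with acknowledgements and
redelivery) is not modeled.

```go
package main

import (
	"fmt"
	"sync"
)

// runWorkers delivers each task to exactly one of n workers through a
// shared channel and returns how many times each task was attempted.
func runWorkers(tasks []string, n int) map[string]int {
	queue := make(chan string)
	attempts := make(map[string]int)
	var mu sync.Mutex
	var wg sync.WaitGroup

	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// A receive hands the task to one worker only, so no
			// application-level lock is needed to claim it.
			for task := range queue {
				mu.Lock() // protects the result map, not the claim
				attempts[task]++
				mu.Unlock()
			}
		}()
	}

	for _, t := range tasks {
		queue <- t
	}
	close(queue)
	wg.Wait()
	return attempts
}

func main() {
	// More workers than tasks: the "multiple idle workers" scenario.
	attempts := runWorkers([]string{"job-1", "job-2", "job-3"}, 4)
	for task, count := range attempts {
		fmt.Printf("%s attempted %d time(s)\n", task, count)
	}
}
```

With a subscribe-and-notify database, all four idle workers would see the
new task and would need a separate lock to decide who takes it; with a
queue, the claim is part of the receive.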
        
         | eatonphil wrote:
         | What is a "real-time database"?
        
           | davidjfelix wrote:
           | I think they're referring to firebase which sells itself as a
           | "realtime database" or rephrased, a database designed with
           | "realtime" (automatically updating) websites/apps in mind.
           | 
           | The use of the word "realtime" for web tends to trigger a lot
           | of systems developers who use the term to indicate that you
           | can predict the actual real world time that something will
           | take to compute, typically used in automotive and robotic
           | settings. That being said, I didn't invent the word and words
           | have multiple meanings and contexts. In this context it
           | simply means push-delivered data that is pushed when updated.
           | /disclaimer
        
           | MattyMc wrote:
           | It's a term that Firebase uses. From [their
           | documentation](https://firebase.google.com/docs/database):
           | "The Firebase Realtime Database is a cloud-hosted database.
           | Data is stored as JSON and synchronized in realtime to every
           | connected client."
           | 
           | I believe "realtime" in this case pertains to the
           | synchronization of data amongst clients through websockets.
           | This is how I've seen the term "realtime database" most
           | commonly used.
        
           | seabrookmx wrote:
           | People mention firebase but another one was (is? haven't
           | followed it in recent years) RethinkDB.
        
       | okasaki wrote:
       | How do web sites like http://cpp.sh/ run code? Wouldn't it be
       | enough to forbid/intercept system calls somehow and set limits
       | with systemd-run or do you really need a VM?
        
         | petercooper wrote:
         | There's the source code for such a site, if that would help:
         | https://github.com/radian-software/riju
         | 
         | Docker + heavily restricted user + firewalls.. seems to get you
         | much of the way there. I am aware that some work was done back
         | in the pre-Docker day with Ruby's online sandbox to neuter
         | Ruby's ability to make certain syscalls, but I imagine Docker,
         | eBPF, or even using WebAssembly makes it a lot easier now.
        
         | coffeeblack wrote:
         | Would it be possible to compile the compiler to wasm and then
         | use it in the browser to compile the user code to wasm in the
         | browser and then run that?
        
         | Bayart wrote:
         | There's a talk by Matt Godbolt about the way
         | https://godbolt.org works :
         | https://www.youtube.com/watch?v=kIoZDUd5DKw
        
         | jrockway wrote:
         | Some options already listed in replies to this comment, but you
         | might consider gVisor as well: https://gvisor.dev/
         | 
         | gVisor is kind of Linux running in Linux. More isolation than
         | containers; less overhead than VMs (but less isolation, of
         | course).
        
           | jmillikin wrote:
           | > less overhead than VMs
           | 
           | Make sure to benchmark your workload first -- gVisor's I/O
           | subsystem is a lot slower than the Linux kernel's, so a VM
           | can be materially faster if you're doing a lot of filesystem
           | operations or file I/O.
           | 
           | One of the systems I built at a former employer supported
           | both gVisor and Firecracker for isolation, and the gVisor
           | version was 10-50x slower for a specific class of workload
           | that did ~millions of stat() calls at startup.
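
A rough way to check this for a given sandbox is to time a stat-heavy
loop inside each candidate environment and compare. The sketch below is
an assumption about what such a microbenchmark could look like (the
function name timeStats and the iteration count are made up), not the
workload described above.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

// timeStats calls os.Stat on path n times and returns the elapsed time.
func timeStats(path string, n int) (time.Duration, error) {
	start := time.Now()
	for i := 0; i < n; i++ {
		if _, err := os.Stat(path); err != nil {
			return 0, err
		}
	}
	return time.Since(start), nil
}

func main() {
	const n = 100000
	elapsed, err := timeStats(".", n)
	if err != nil {
		fmt.Fprintln(os.Stderr, "stat failed:", err)
		os.Exit(1)
	}
	fmt.Printf("%d stat calls in %v (%v per call)\n", n, elapsed, elapsed/time.Duration(n))
}
```

Running the same binary on the host, under gVisor, and inside a microVM
gives a per-call figure to compare before committing to one of them.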
        
             | [deleted]
        
             | jrockway wrote:
             | Yup, very good point. I think that something like gVisor
             | should probably be your second choice after you've
             | eliminated VMs for whatever reason.
        
         | jerf wrote:
         | "It's complicated."
         | 
         | One thing to bear in mind is that these sites use super-
         | paranoid security because it has been proved time and time
         | again that it is necessary. I wouldn't look at _any_ particular
         | solution for running arbitrary code from a user and assume that
         | it's actually 100% correct. I think this can help remove some
         | of the mystery of how they do it, which is that there is very
         | likely some way in which they actually aren't doing it. Once
         | you remove that idea from the possibility space, the ways it is
         | done start making much more sense. (And the idea becomes much
         | more scary.)
        
       | johnbellone wrote:
       | I've been meaning to take a look at Bottlerocket[^1] as an
       | alternative to a custom spin of Kubernetes but we haven't really
       | had a chance to dig into it. The folks over at Fly[^2] have built
       | an awesome edge platform out of Firecracker, and ultimately,
       | where I want to take the next generation of our internal compute
       | offering. I am eagerly looking forward to any and all
       | presentations they do on their work.
       | 
       | [^1]: https://github.com/bottlerocket-os/bottlerocket
       | 
       | [^2]: https://fly.io/docs/reference/architecture/
        
       | lovingCranberry wrote:
       | I was recently looking into an equivalent for V8 isolates. I'd
       | like something like this for python, but it looks like micro VMs
       | is my best bet here. For anyone working in this field or having
       | hands-on experience: is weave's ignite a choice if I'd want to
       | execute long-running python scripts of users? I remember that
       | there was a lot of overhead for I/O. Or should I just go with raw
       | Firecracker like Stan does in this article?
       | 
       | The scripts would be long-running, but don't require much
       | computational power. Just a few arithmetic operations on an array
       | every second. The array is being fed in via WebSocket.
        
         | rubenfiszel wrote:
         | For the open-source windmill project, we need to support
         | sandboxing of typescript (deno) and python. For deno we could
         | have just relied on v8 isolate and deno layer of isolation. But
         | for Python we could not anyway so we had to come up with a
         | common solution. We chose nsjail in the end and it works really
         | well. All the config files are here:
         | https://github.com/windmill-labs/windmill/tree/main/nsjail and
         | this is how it is spawned from within the Rust worker:
         | https://github.com/windmill-labs/windmill/blob/main/backend/...
         | 
         | Happy to expand more of my experience of making this work at
         | scale.
        
         | paulgb wrote:
         | I'm a fan of Firecracker, but your use case might be a better
         | fit for plain old containers because the tooling is currently
         | more mature. If it's the isolation that attracts you to
         | Firecracker, gVisor is an option: https://gvisor.dev/.
        
           | lovingCranberry wrote:
           | The isolation is the main reason I was looking at
           | Firecracker.
           | 
           | Thank you for linking gVisor, went through the docs real
           | quick and it looks very promising for my use case.
        
         | babelfish wrote:
         | Cloudflare will be opening their V8 isolate runtime soon (for
         | Cloudflare Workers)
        
           | asadlionpk wrote:
           | Can you expand on this (a relevant link will suffice)?
        
             | babelfish wrote:
             | https://blog.cloudflare.com/workers-open-source-announcement...
        
         | mrkurt wrote:
         | Weave is probably overkill. Firecracker or gVisor are where I'd
         | start.
         | 
         | Using Firecracker directly is pretty straightforward:
         | https://jvns.ca/blog/2021/01/23/firecracker--start-a-vm-in-l...
         | 
         | gVisor gives you all the container tooling, which may or may
         | not be useful.
         | 
         | And, because I'm a shill, we actually shipped an API
         | specifically for this kind of use case. So if you'd rather not
         | build it all yourself, we can help:
         | https://fly.io/blog/fly-machines/
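
For a sense of what "using Firecracker directly" involves, here is a
hedged Go sketch of the three API calls that configure and start a
microVM over Firecracker's Unix socket. The endpoint paths and JSON
fields follow Firecracker's API as far as I know it; the socket path,
kernel path, rootfs path, and the helper names bootCalls and send are
assumptions for illustration, and a firecracker process must already be
listening on the socket.

```go
package main

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"net"
	"net/http"
	"os"
)

// apiCall is one PUT request against the Firecracker API socket.
type apiCall struct {
	Path string
	Body map[string]interface{}
}

// bootCalls lists the calls that set the kernel, attach the root
// drive, and start the instance, in that order.
func bootCalls(kernel, rootfs string) []apiCall {
	return []apiCall{
		{"/boot-source", map[string]interface{}{
			"kernel_image_path": kernel,
			"boot_args":         "console=ttyS0 reboot=k panic=1 pci=off",
		}},
		{"/drives/rootfs", map[string]interface{}{
			"drive_id":       "rootfs",
			"path_on_host":   rootfs,
			"is_root_device": true,
			"is_read_only":   false,
		}},
		{"/actions", map[string]interface{}{"action_type": "InstanceStart"}},
	}
}

// send issues the calls in order over the Unix socket at socketPath.
func send(socketPath string, calls []apiCall) error {
	client := &http.Client{Transport: &http.Transport{
		DialContext: func(ctx context.Context, _, _ string) (net.Conn, error) {
			var d net.Dialer
			return d.DialContext(ctx, "unix", socketPath)
		},
	}}
	for _, c := range calls {
		body, err := json.Marshal(c.Body)
		if err != nil {
			return err
		}
		// The host part is ignored; the dialer always uses the socket.
		req, err := http.NewRequest(http.MethodPut, "http://localhost"+c.Path, bytes.NewReader(body))
		if err != nil {
			return err
		}
		req.Header.Set("Content-Type", "application/json")
		resp, err := client.Do(req)
		if err != nil {
			return err
		}
		resp.Body.Close()
		if resp.StatusCode >= 300 {
			return fmt.Errorf("PUT %s: %s", c.Path, resp.Status)
		}
	}
	return nil
}

func main() {
	// Hypothetical paths; adjust to your kernel image and rootfs.
	calls := bootCalls("./vmlinux", "./rootfs.ext4")
	if err := send("/tmp/firecracker.socket", calls); err != nil {
		fmt.Fprintln(os.Stderr, "boot failed:", err)
		os.Exit(1)
	}
}
```

Networking, vsock, and resource limits need further calls; this only
covers the minimum to get a guest booting.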
        
         | nerpderp82 wrote:
         | You want to run short lived Python in a trusted environment?
         | Can you go into more detail about your specific use case? How
         | much CPU, memory and IO do you need to do? Is it chatty over
         | the life of the execution or does it have all of its data up
         | front?
        
         | metadat wrote:
         | Related discussion from this past Saturday:
         | 
         | https://news.ycombinator.com/item?id=32289979 (70 points, 16
         | comments)
         | 
         | and
         | 
         | https://news.ycombinator.com/item?id=32287798
         | 
         | Both links include hot takes from a tech lead for CF Workers
         | (@kentonv).
        
         | tekknolagi wrote:
         | If you take a look at the Skybison Python runtime, I would be
         | happy to chat and help you poke around integrating it:
         | https://github.com/tekknolagi/skybison
        
       ___________________________________________________________________
       (page generated 2022-08-02 23:01 UTC)