[HN Gopher] WebAssembly and Back Again: Fine-Grained Sandboxing ...
___________________________________________________________________
WebAssembly and Back Again: Fine-Grained Sandboxing in Firefox 95
Author : feross
Score : 303 points
Date : 2021-12-06 13:33 UTC (9 hours ago)
(HTM) web link (hacks.mozilla.org)
(TXT) w3m dump (hacks.mozilla.org)
| pjmlp wrote:
| With the recompilation back to C I fail to see how RLBox
| prevents memory corruption due to lack of bounds checking, or UB
| being exploited by the optimizer.
| azakai wrote:
| There is that risk, yes. For wasm2c to be correct it must emit
| C code without undefined behavior. As best we know it does that
| properly today, and we've tested and fuzzed it quite a lot, but
| whenever you use an optimizing C compiler on the output there
| is no 100% guarantee.
| SAI_Peregrinus wrote:
| There's no guarantee even without optimization. Some
| optimizations exploit UB, but not all "unexpected" code
| generation in the presence of UB is due to optimization.
|
| Take signed int overflow on addition. On some platforms ADD
| wraps, on others it traps. Depending on target CPU you'll get
| very different behavior even for non-optimizing compilers
| that just emit an ADD instruction!
|
| WASM is just another target, with its own behavior.
| Tobu wrote:
| The host application is still C, and the final compilation of
| the mix of C and sanitized C still relies on the optimizer to
| be fast, so it might be possible for the untrusted library to
| reveal UB in the rest of the application. I can see how the
| whole approach (wasm + taint) would make compromise of the
| sanitized bits through crafted data harder, but I'm not sure
| it does enough for guarding against a supply chain compromise
| of the library.
| Tobu wrote:
| And I see the safe languages section[1] in the WasmBoxC
| post; that looks like it would address most of the
| remaining UB risk, but Firefox will take some time getting
| there.
|
| [1]: https://kripken.github.io/blog/wasm/2020/07/27/wasmboxc.html
| IshKebab wrote:
| WASM is always bounds checked and doesn't have UB. The C code
| produced by wasm2c should be the same.
| pjmlp wrote:
| WASM doesn't do bounds checking inside linear memory
| segments, good luck preventing corruption on data structures
| stored on the same segment.
|
| Unless the produced C code is validated against optimizations
| across every single compiler version, there are no guarantees
| of that actually being the case.
| afiori wrote:
| They specifically address this in the article; the
| objective is to be able to treat the component as untrusted
| code (the same as a website's js) and sanitize/check the
| component's output (the same as web APIs called by js).
|
| Corruption is not an issue when you assume the corrupted
| component is already attacker-controlled.
| masklinn wrote:
| > WASM doesn't do bounds checking inside linear memory
| segments, good luck preventing corruption on data
| structures stored on the same segment.
|
| Isn't the point that you don't care?
|
| Each untrusted library is compiled to wasm then C then
| native, they can corrupt their own datastructures but the
| point is to prevent that corruption from escaping those
| boundaries, or at least that's how I understand it.
| azakai wrote:
| Mostly that's the case, yes. The main benefit of wasm
| sandboxing in this situation is to keep any exploit of
| these libraries in the sandbox - no memory corruption
| outside. That's a big improvement on running the same
| code outside of a sandbox.
|
| (But in general corruption inside the sandbox is
| potentially dangerous too. You need to be careful about
| what you do with data you get from the sandboxed code.
| RLBox does help in that area as well.)
| pjmlp wrote:
| Just because corruption doesn't escape the sandbox
| doesn't mean it isn't exploitable.
|
| This is like attacking microservices: you have a module that
| exposes a set of interfaces and produces outputs when
| called with specific APIs.
|
| For the sake of example let's say you have an
| authentication module that says whether a given user id is
| root.
|
| Now imagine producing a sequence of API calls that
| triggers the side effect of _is_root(id)_ being true
| for an id that is a plain user.
|
| No sandbox escape took place, only corruption of
| internal bookkeeping structures that leads
| _is_root()_ to misbehave.
| davidkunz wrote:
| Can't wait for Firefox XP.
| kgeist wrote:
| I wonder how it deals with intrinsics for optimization (SSE and
| the like): does it fail to compile, or maybe WASM has some
| support, or is it completely lost in translation?
| azakai wrote:
| For something like SSE to work you'd need both wasm and wasm2c
| to support it.
|
| Wasm doesn't support all of SSE, but wasm does have SIMD
| support which is a portable subset of common SIMD instructions.
| You may lose some performance there, but wasm is adding more
| instructions to help (see "relaxed-simd"). There are also
| headers to help translate between SSE and wasm SIMD for
| existing code where possible.
|
| wasm2c does have support for wasm SIMD, although I believe it
| is not 100% complete yet.
| SubzeroCarnage wrote:
| Firefox appears to utilize a custom clang toolchain to enable
| this without documenting how to build such a toolchain (wasi
| sysroot), and expects you to just download the precompiled
| version from their servers.
|
| Fedora and Fennec F-Droid have since disabled this feature.
|
| https://src.fedoraproject.org/rpms/firefox/c/4cb1381d80a94c9...
|
| https://gitlab.com/relan/fennecbuild/-/commit/12cdb51bb045c3...
| fabrice_d wrote:
| Pretty sure you can build it yourself from
| https://github.com/WebAssembly/wasi-libc given that
| https://github.com/WebAssembly/wasi-libc/commit/ad5133410f66...
| is a contribution from a MoCo employee doing a lot of work
| around toolchains.
| floatboth wrote:
| There's also the https://github.com/WebAssembly/wasi-sdk repo
| which is kind of a meta-build-system for all this.
|
| But in FreeBSD we build all the pieces directly, here's our
| build recipes (with some hacks due to llvm's cmake code being
| stupid sometimes):
|
| compiler-rt (from llvm): https://github.com/freebsd/freebsd-
| ports/blob/main/devel/was...
|
| libc (from what you linked):
| https://github.com/freebsd/freebsd-
| ports/blob/main/devel/was...
|
| libc++ (from llvm): https://github.com/freebsd/freebsd-
| ports/blob/main/devel/was...
| rajanaccros wrote:
| How does this affect process isolation? If only _some_ components
| can be sandboxed at this fine-grained level, aren't we still
| subject to process isolation to sandbox everything else? It would
| seem like one still has to run _fission.autostart true_ to
| isolate the components that cannot be compiled in this way,
| therefore not gaining the benefit of less overhead as stated in
| the article.
| bholley wrote:
| The purpose of RLBox is to add an extra layer of component-
| level isolation on top of Firefox's process-based site-level
| isolation. The reduced overhead is relative to the hypothetical
| scenario in which we performed the component-level isolation
| with processes (rather than WebAssembly).
| rajanaccros wrote:
| Ohh I see. Not a replacement for process based site level
| isolation. I just wasn't wrapping my head around that. Makes
| much more sense now. Thanks for the explanation.
| majkinetor wrote:
| However, without removing processes it will still be as
| slow as it is today; removing them is something I really
| hope browsers will do.
| ekr____ wrote:
| I think a more likely way to think about it is that this
| allows us to sandbox things that would otherwise not be
| sandboxable. For a variety of reasons, it's
| probably not practical to remove the existing process
| sandboxes.
| jerheinze wrote:
| > Cross-platform sandboxing for Graphite, Hunspell, and Ogg is
| shipping in Firefox 95, while Expat and Woff2 will ship in
| Firefox 96.
|
| I wonder what the other "good candidates" that he referred to
| are.
| anonymousDan wrote:
| What's the performance overhead in comparison to the unsandboxed
| version I wonder?
| azakai wrote:
| There are detailed performance numbers here on a variety of
| real-world codebases:
|
| https://kripken.github.io/blog/wasm/2020/07/27/wasmboxc.html
|
| tl;dr Something like ~14% when using the best bounds checking
| strategy, or ~42% when using the most portable one. (There are
| options in the middle as well.)
| bholley wrote:
| It varies. Here's an example of some performance analysis I did
| on the expat port:
| https://bugzilla.mozilla.org/show_bug.cgi?id=1688452#c37
| Jyaif wrote:
| The downside to this technique is that wasm2c code is 50% slower,
| so (at least for now) process-isolation is still a win in some
| cases (when the overhead of process-isolation is small compared
| to the rest).
|
| Still, that's a very exciting development that could lead to a
| revolution in operating systems.
| floatboth wrote:
| 42% slower only in the worst case using the slowest (explicit)
| bounds checks, only 12% slower with a signal handler:
|
| https://kripken.github.io/blog/wasm/2020/07/27/wasmboxc.html
| chakkepolja wrote:
| I am probably missing something: why is WASM required here? Can't
| these analyses be done directly on LLVM IR?
| glandium wrote:
| LLVM IR is still CPU dependent, and is a moving target. WASM is
| also a moving target, but much more controlled.
| azakai wrote:
| You're right that this could be done on LLVM IR. The MinSFI
| project did basically exactly that several years ago, but
| sadly it did not see adoption.
|
| The benefit of wasm over LLVM IR is that wasm has already done
| the work to define the sandboxed format and build the tooling
| to compile to it. Wasm is also almost as fast as running
| normally. (Wasm is also portable and lacks undefined behavior,
| although for this use case those might matter less.)
|
| See the MinSFI section here which compares it directly to wasm
| for sandboxing:
|
| https://kripken.github.io/blog/wasm/2020/07/27/wasmboxc.html
|
| And the original MinSFI presentation is here:
|
| https://docs.google.com/presentation/d/1RD3bxsBfTZOIfrlq7HzG...
| ink404 wrote:
| It could be. It seems like they already had something in WASM
| for doing this, so it made more sense to use that than to redo
| it on LLVM IR.
| fooyc wrote:
| That's what I thought too. C compilers should be able to achieve
| that directly, and it's incredible that nobody has thought of
| doing so yet.
|
| What's great, though, is that they are achieving this with
| tools that are already available.
| throw10920 wrote:
| WebAssembly is kind of a hack here (although a clever hack that
| saves a lot of effort) - the essence of what the Mozilla folks
| have done isn't _WebAssembly_, it's a _trusted compiler_ - by
| which I mean a compiler that emits trustable code, regardless of
| how untrusted the source is. It's a really neat idea that I hope
| to see more adoption of, because our current security models for
| software _suck_.
|
| Security based on process isolation is extremely inefficient and
| coarse-grained - having a trusted compiler could (eventually)
| _massively_ increase performance by removing processes entirely
| (no more virtual memory! no more TLB flushes and misses! less
| task switch overhead!) and eliminating the kernel/user mode
| separation, with an _increase_ in security.
|
| "Could" because it's not clear to me if the reduction in
| expressiveness from our languages now to future languages with a
| theoretical trusted compiler (all jump targets have to be known
| at compile-time?) will be accepted by the majority of the
| populace. Look at how hard it is to get people to accept borrow-
| checkers...
| [deleted]
| formerly_proven wrote:
| In 2021, the world finally achieves AS/400 on the web.
| wffurr wrote:
| Can you expand on that? Was this a property of C compilers or
| other languages on IBM mainframes?
|
| I get that it's tongue in cheek, but it would probably be
| even funnier / ironic if I had more context to understand it.
|
| It's the same spirit as languages adopting functional
| programming techniques aka rediscovering Lisp.
| formerly_proven wrote:
| AS/400 had a machine-independent binary format which was
| translated ahead of execution by the system's specific
| compiler into machine code, and all applications ran in the
| same address space with zero memory protection because the
| code generated by the compiler ensured isolation.
| vanderZwan wrote:
| > _a compiler that emits trustable code, regardless of how
| untrusted the source is_
|
| Hasn't this been one of the goals of the design of WebAssembly
| since day one, and something that has been getting people
| excited about it too? Using something for one of its intended
| purposes isn't really a "hack", no?
| IshKebab wrote:
| Sort of. The main goal for WebAssembly is to be a fast
| platform-agnostic compilation target that you can use in
| websites. The "in websites" bit means that it has to be
| completely safe (i.e. no accessing outside memory etc.) but
| it's not the main goal.
|
| This _is_ a bit of a hack because Mozilla don't care about
| the platform-agnostic bit, so they're taking LLVM IR,
| compiling it to WebAssembly, then back to LLVM IR just so
| that they can ensure that the code is safe.
|
| WebAssembly comes with extra constraints that you probably
| don't care about if you're compiling to native (e.g. there's
| no 'goto') so you would get more efficient code (and probably
| faster compilation) if you just had some way of compiling
| LLVM IR directly to "safe binary".
|
| That would probably be a mountain of work though so it's
| understandable why they went with this. Would be nice if they
| said how much the performance was impacted.
| oefrha wrote:
| The hack is translating wasm back to C, then compiling again.
| It wouldn't be a hack if they were running wasm, but they're
| not.
| masklinn wrote:
| AFAIK the ability to compile wasm to native code was pretty
| much always part of the goal, running wasm in a VM was
| never the end-game.
|
| Compiling _via C_ might be considered a hack (especially in
| the sense that it introduces a potential weak link in the
| chain), but it makes a lot of sense since they want to
| integrate the result into the Firefox build artefacts, and
| compiling via C is not exactly novel either.
| madflame991 wrote:
| > having a trusted compiler could (eventually) massively
| increase performance by removing processes entirely (no more
| virtual memory! no more TLB flushes and misses! less task
| switch overhead!) and eliminating the kernel/user mode
| separation
|
| I saw a talk a while ago that was advocating for the same
| thing, except this was about JS and not webassembly. I can't
| find it tho - I remember it being related to the WAT js talk;
| It also mentioned that it would eliminate rings on the cpu (and
| simplify cpus) and context switches which would make execution
| faster; they were citing some MS research on the matter - damn
| I really wanna find the talk now...
|
| Edit: https://www.destroyallsoftware.com/talks/the-birth-and-
| death...
|
| thanks BoppreH
|
| MS research: "Hardware-based isolation incurs nontrivial
| performance costs (up to 25-33%) and complicates system
| implementations" (virtual memory and protection rings); I think
| MS knows what they're talking about here
| SigmundA wrote:
| Singularity was an experimental OS written in a variant of
| C# and .NET managed code by MS Research that ran using
| software isolated processes rather than hardware isolation;
| this is probably what they were referencing:
|
| https://en.wikipedia.org/wiki/Singularity_(operating_system)
| kaba0 wrote:
| http://joeduffyblog.com/2015/11/03/blogging-about-midori/
|
| There is also a really great blog about Singularity's
| "rebirth" experimental OS, Midori, that continued in its
| footsteps.
| [deleted]
| throw10920 wrote:
| Thanks for the link. I would argue that a true trusted
| compiler needs to accept an unmanaged language and emit code
| without a runtime, though. A runtime is cheating, because you
| can always make one that implements an iron-clad sandbox that
| doesn't require processes...by implementing a (very slow) VM.
|
| To put in another way - I don't think that security or
| performance are that hard to achieve on their own - the hard
| part is getting _both at once_. And then, adding
| expressiveness on top is even more difficult, as Rust has
| aptly demonstrated.
| kaba0 wrote:
| Rust is not secure at all in the sense used here --
| untrusted, arbitrary user code written in rust is a
| security threat.
| titzer wrote:
| > it's a trusted compiler
|
| Sorry I have to quibble here, but this term is already a thing,
| and typically has the opposite connotation: a compiler that
| _must be trusted_ because we cannot verify the output. It's
| trusted because we "trust" it (to not screw up).
|
| I would argue that this makes the compiler _untrusted_ -- we
| don't care what it does, whatever it outputs is going to be
| both statically and dynamically verified to not break the
| sandbox properties.
|
| > Security based on process isolation is extremely inefficient
| and coarse-grained - having a trusted compiler could
| (eventually) massively increase performance by removing
| processes entirely (no more virtual memory! no more TLB flushes
| and misses! less task switch overhead!) and eliminating the
| kernel/user mode separation, with an increase in security.
|
| I thought this right up until, well, about this time in 2017.
| Side-channel attacks are a real and bad thing. Our conclusion
| is that Spectre, in all its flavors, breaks confidentiality for
| in-process memory, regardless of sandboxing technology. On
| current hardware, there is no 100% bulletproof way to enforce
| isolation.
| zozbot234 wrote:
| We will always need process boundaries to separate information
| domains, because side-channel vulnerabilities are not addressed
| by having a "trusted" compile step.
| throw10920 wrote:
| When it comes to side-channel attacks, process boundaries
| aren't adequate either. Rowhammer, Meltdown/Spectre, and
| friends show that handily, and the RSA key leakage attack
| using SDR shows that even machine isolation isn't going to be
| enough for some things.
|
| I guess that the idea that trusted compilers are the way
| forward is predicated on the assumption that we've managed to
| mitigate most/all side-channel attacks, because there really
| isn't much you can do about those otherwise.
| lisper wrote:
| Why can't a trusted compiler prevent side-channel attacks?
| All you need to do is prevent the code from accessing the
| side channel. It seems to me that doing this at compile time
| would actually be easier than doing it at run time.
| titzer wrote:
| A key side-channel is execution time, and no, in general,
| you can't prevent a program from getting a clock. Even
| without a clock, one can construct one easily using shared
| memory and threads. Clocks are also easy to find. Even with
| low resolution clocks, timing differences can be amplified
| programmatically, making them observable.
| sfink wrote:
| It _is_ easy. You just have to forbid access to timers and
| loops. Where "timers" include anything that can count and
| store the count in shared memory.
|
| Alternatively, you could forbid branches (and therefore
| loops, implicitly).
| leni536 wrote:
| If you can use a loop for timing then you can use an
| unrolled loop for timing too.
| vlovich123 wrote:
| I can't find it right now but I read a paper that showed that the
| WASM security model is weaker than native compiled code in some
| cases. For example, due to compiler and OS hardening techniques,
| a libpng flaw wasn't exploitable natively but was when run in
| WASM. You couldn't escape the WASM sandbox but the application
| itself could be compromised.
|
| I'm sure that this approach is valid as a hardening measure but
| some of the enthusiasm in the post perhaps deserves tempering.
| This thunk through WASM can't protect against runtime heap
| overflows and such.
|
| > However, the transformation places two key restrictions on the
| target code: it can't jump to unexpected parts of the rest of the
| program, and it can't access memory outside of a specified region
|
| Oof. The paper I recall specifically called these out as not
| enforceable. The libpng example in the paper had an external
| request cause libpng to corrupt and access WASM memory other
| than its own (in this model that would be other in-process
| native memory, I think, or at least other code placed within
| the same heap region, unless each component gets its own
| region, which then means you need to have a fixed amount of
| memory allocated upfront...).
| miloignis wrote:
| Are you perhaps thinking of "Everything Old is New Again:
| Binary Security of WebAssembly"? (
| https://www.usenix.org/system/files/sec20-lehmann.pdf )
|
| In any case, I think you've misunderstood the security
| properties. WASM can have weaker security _within_ the sandbox
| because it doesn't have access to some of the more
| sophisticated mitigation measures that native code does, but
| the security of the sandbox boundary itself is _very solid_.
|
| The part of the article that you quote is accurate in the sense
| that I believe it was meant - the code cannot jump to
| unexpected parts of _the rest_ of the program (outside the
| sandbox) and cannot access memory outside of a specified region
| (the sandboxed memory). A vulnerability might allow the target
| code to jump to somewhat unexpected parts _inside_ the sandbox,
| or buffer overflows _inside_ the sandbox, but not outside.
|
| As such, it's actually a really effective application of a WASM
| sandbox!
| pjmlp wrote:
| Triggering a fire inside the castle might be enough to change
| the output of calls being done into the sandbox, it doesn't
| need to escape it to be exploitable.
| miloignis wrote:
| They're addressing this as well - from the article:
|
| > This, in turn, makes it easy to apply without major
| refactoring: the programmer only needs to sanitize any
| values that come from the sandbox (since they could be
| maliciously-crafted), a task which RLBox makes easy with a
| tainting layer.
| pjmlp wrote:
| Programmer only needs to sanitize....
|
| You mean like all those great programmers that keep
| introducing bugs like the one recently found by
| Project Zero?
| titzer wrote:
| > WASM can have weaker security within the sandbox because it
| doesn't have access to some of the more sophisticated
| mitigation measures that native code does,
|
| And that's mostly read-only data pages. The primary blocker
| there is how to integrate that capability with ArrayBuffer in
| the web platform, since Wasm memories can be exposed (or
| aliased) as ArrayBuffer objects, and most engines aren't
| prepared to encourage non-writable holes in ArrayBuffers.
| Deukhoofd wrote:
| > WASM security model is weaker than native compiled code in
| some cases. For example, due to compiler and OS hardening
| techniques, exploits of a libpng flaw weren't exploitable
| unless run in WASM.
|
| If I understand the post correctly, it's still native compiled
| code in the end, and it won't run through WASM. The goal of the
| approach sounds more like a code sanitizer tool to ensure the
| external library they're using isn't making calls outside of
| it, or requesting memory beyond the region it's given.
| Deukhoofd wrote:
| So if I'm understanding this correctly, Firefox compiles its
| dependencies as WASM, effectively blocking function calls to
| things it shouldn't and illegal memory access, and then
| translates it back to C so it can compile it normally? Sounds
| neat!
| fyrn- wrote:
| Not back into C, from WASM to native executable / asm / object
| code
| flohofwoe wrote:
| The solution described in the post actually translates C/C++
| to WASM, and then translates the WASM bytecode back to C (via
| a tool called wasm2c), which is then fed back into the C
| compiler again to compile to native code, all 'offline' in
| the Firefox build process.
| bholley wrote:
| Deukhoofd is correct -- we compile the WASM code back into C
| in order to reuse and reduce friction with our existing
| compilation pipeline.
| Hendrikto wrote:
| That was only for the prototype, the current implementation
| is:
|
| source code -> WASM -> C -> native code
| FpUser wrote:
| I am confused. Isn't WASM supposed to be eventually AOTed
| (Ahead-of-Time compiled) or at least JITed? Why this bizarre
| twist with WASM-C-NATIVE? The browser should do just that
| instead of these dances around it.
| floatboth wrote:
| This isn't for running WASM _from the web_. This has nothing
| whatsoever to do with the WASM JIT that's in Spidermonkey.
| This is sandboxing for internal components of the browser (or
| any application really). But, this _is_ a kind of AOT
| compilation involving WASM in the middle.
| mfrw wrote:
| This sandboxing is achieved via RLBox[0], which is a toolkit for
| sandboxing third-party libraries. It comprises a WASM sandbox
| and an API which existing applications can leverage. The
| research paper is at [1].
|
| [0]: https://plsyssec.github.io/rlbox_sandboxing_api/sphinx/
|
| [1]: https://arxiv.org/abs/2003.00572
| paulgdp wrote:
| How is it different from using Clang's CFI (control flow
| integrity)?
|
| I thought this was the same technique used in WebAssembly.
|
| Chromium is using this too, I think.
| azakai wrote:
| CFI helps with control flow exploits, but it doesn't prevent
| memory corruption for example.
|
| This sandboxing technique ensures that both control flow and
| memory accesses remain in the sandbox (except for when you
| explicitly allow otherwise).
| Jyaif wrote:
| > it can't access memory outside of a specified region
|
| How are segmentation faults handled?
| azakai wrote:
| Wasm is defined to trap when it accesses memory outside the
| sandbox (the embedder can decide how to handle that trap, say
| by shutting down that particular sandbox).
|
| With wasm2c the trapping can be implemented in a variety of
| ways, for example using the signal handler trick like wasm VMs
| do (~14% overhead) or manual bounds checks (~42% overhead, but
| fully portable).
| bholley wrote:
| I believe the implementation in Firefox masks off the high
| bits of pointers and adds the result to the base address
| before performing a load/store. This requires us to reserve a
| power-of-two-sized region of address space, but we can
| lazily/incrementally commit the pages as the sandboxed code
| invokes sbrk.
| azakai wrote:
| Thanks for the details bholley!
|
| Do you plan to use the signal handler trick eventually?
| Less portable but in my tests it shrinks the total overhead
| by half (from masking's 29% to 14%).
| jayd16 wrote:
| Reminds me a bit of Apple's AOT protections together with Unity's
| IL2CPP approach.
| makeworld wrote:
| I don't have a lot of knowledge in this area, but using WASM for
| forcing code to be safe seems bizarre. Why aren't there just
| compiler flags that can enforce the same restrictions they want?
| masklinn wrote:
| > Why aren't there just compiler flags that can enforce the
| same restrictions they want?
|
| Because they want to compile arbitrary code in order to sandbox
| it.
|
| The alternative is something like eBPF, but that imposes a
| limited subset of the source language, which would be unlikely
| to work with something like a video decoder.
| 7373737373 wrote:
| Why is it bizarre? The Wasm function interface seems perfect
| for sandboxing code. It's a "whitelist" system, the contained
| process can only call external functions that have been
| explicitly attached, perfect for implementing the capability
| security paradigm and progressively hollowing out the attack
| surface by separating functionality into several instances.
| Deukhoofd wrote:
| Requiring compilation to WASM to then translate it back to C
| and compile it again might be a bit strange. Clang obviously
| already has the tools to do the WASM sanitizing, it might be
| really cool to have a way to directly enforce those rules
| outside of WASM.
| bilkow wrote:
| As I understand it, clang doesn't have the tools to
| sanitize WASM. It just emits WASM, which, malicious or
| benevolent, can't access memory outside its designated
| memory regions.
|
| It's wasm2c's job to ensure that the generated C enforces the
| WASM memory rules, so I'd say the one sanitizing the code
| is not clang but wasm2c.
| the_duke wrote:
| Webassembly is much more restricted than regular machine code.
|
| It's a stack machine with a limited set of operations, no
| direct control over the stack/control flow and restricted
| access to memory.
|
| It's way easier to compile this limited set of operations to
| assembly (or C) that is guaranteed to not do things it
| shouldn't.
| IshKebab wrote:
| You definitely _could_ do that. It would just be a ton of work
| and nobody has done it.
| deian wrote:
| It is doable, but it's hard to make it fast on all platforms.
| See the SegmentZero32 description in <https://cseweb.ucsd.edu
| /~dstefan/pubs/kolosick:2022:isolatio...> for an example
| prototype.
| azakai wrote:
| For technical reasons adding compiler flags to do that is
| fairly hard. You'd need to handle a lot of things like
| compiling to the sandboxed format, system library support, the
| FFI to normal code, etc. It would be possible to do all that,
| but wasm has already done it - so compiling to wasm as an
| intermediary step is the most practical solution.
|
| (See also https://news.ycombinator.com/item?id=29460766)
| bholley wrote:
| Beyond the reasons others have mentioned, another key issue is
| that this isn't a transparent transformation. The sandboxed
| code can only access memory within a restricted subregion,
| which often requires some small code changes on both sides of
| the boundary (for example, copying input data into that memory
| region so that sandboxed code can operate on it).
|
| So implementing this in the compiler would entail some fairly
| involved handshaking between the code and the compiler beyond
| the normal scope of C/C++. Doing this in a library instead --
| and leaning on a well-understood and well-studied execution
| model -- makes everything a bit more natural to work with.
| Tobu wrote:
| NaCl (Native Client) sort of did this, but through an entirely
| separate toolchain. It's not an easy task.
| gostsamo wrote:
| This looks more like using the compilation to wasm and back to
| automatically rewrite the code of entire components in a manner
| that makes them safer.
| sdze wrote:
| What a stupid idea to run bytecode in a browser.
|
| We have gaping security holes with JavaScript already.
|
| Stop the madness.
| Jyaif wrote:
| It's compiling the webasm back to C, so it's not running
| bytecode.
| BoppreH wrote:
| Getting strong vibes of The Birth and Death of JavaScript (2014)
| [1], one of the numerous great talks by Gary Bernhardt.
|
| My engineer side is happy seeing how strong tooling enables such
| creative features with high assurances.
|
| My futurist side is dreading the day Intel launches their first
| Javascript/WebAssembly-only processor.
|
| [1] https://www.destroyallsoftware.com/talks/the-birth-and-
| death...
| MangoCoffee wrote:
| I don't think JavaScript is going to die, but it's time that we
| have another option for the web. JavaScript has its warts. Some
| people love JavaScript and some don't. It's not fair for
| JavaScript to be the only option. I see WebAssembly as an
| option for people who don't like JavaScript's warts to use
| their favorite language to develop for the web.
| Omnius wrote:
| I feel like the only people that "like" JavaScript are those
| that had it as their first language. It's needed, it's better
| than it was, but compared to just about any other language it's
| a total mess.
| k__ wrote:
| My career was C, Java, PHP, JavaScript.
|
| I like JS the most.
|
| It's flexible, lightweight, and omnipresent.
|
| The only other mainstream language that gives me that
| feeling is Rust.
| allisfalafel wrote:
| Rust is hardly omnipresent. I understand it has trouble
| with lesser-used architectures and operating systems.
| While yes, you probably don't use them, they do still
| exist.
| lelandfe wrote:
| Brilliant, hilarious talk, thanks for linking.
| mmastrac wrote:
| ARM already has Java and JavaScript extensions in their CPUs,
| so that day isn't completely off the horizon yet.
|
| I'm not even sure it would be a terrible idea, as we'd have a
| very interesting JS/WASM-like set of opcodes that we could
| target with _any_ compiler.
| hajile wrote:
| The "Javascript instruction" is a bit of a misnomer.
|
| JS accidentally got part of the x86 execution model for float
| conversion baked into the spec. ARM added an instruction to
| mimic the old x86 one. It's potentially useful in some other
| contexts too.
| mmastrac wrote:
| Regardless, FJCVTZS is still literally a "Javascript"
| instruction: "Floating-point Javascript Convert to Signed
| fixed-point, rounding toward Zero".
| dmix wrote:
| Here is an example of sandboxing a library and then calling
| functions:
| rlbox::rlbox_sandbox<rlbox_noop_sandbox> sandbox;
| sandbox.create_sandbox();
| sandbox.invoke_sandbox_function(hello);
|
| https://github.com/PLSysSec/rlbox_sandboxing_api/blob/master...
|
| Seems like it could get a bit verbose when used all over the
| place but I guess there's always a cost with security and having
| clearly defined risky parts also helps. Regardless I'm happy to
| see the effort being made beyond process isolation and OS
| capabilities.
| kevincox wrote:
| This is a really powerful tool and I hope we see this used more.
| Traditional process based sandboxing is very efficient inside the
| process, but IPC is very expensive. This approach flips the
| tradeoffs exactly backwards as the sandboxed code is slower, but
| IPC is nearly free. This means that it can cover exactly the
| space that was too expensive to sandbox before. The two
| approaches are perfect complements for each other. I now imagine
| that the vast majority of code can be put into one of these two
| groups leaving very little code that is unable to be sandboxed
| for performance reasons.
___________________________________________________________________
(page generated 2021-12-06 23:00 UTC)