[HN Gopher] Breaking the WASM/JS communication performance barrier
___________________________________________________________________
Breaking the WASM/JS communication performance barrier
Author : weinzierl
Score : 112 points
Date : 2025-07-23 07:11 UTC (3 days ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| andyferris wrote:
| The whole UTF-8 vs UTF-16 thing makes this way more messy than it
| should be.
|
| I'd love for some native way of handling UTF-8 in JavaScript and
| the DOM (no, TextEncoder/TextDecoder do not count). Even a kind
| of "mode" you could choose for the whole page would be a huge
| step forward for the "compile native language to WASM + web"
| thing.
| theSherwood wrote:
| 100%. If we could get a DomString8 (8-bit encoded) interface in
| addition to the existing DomString (16-bit encoded) and a way
| to wrap a buffer in a DomString8, we could have convenient and
| reasonably performant interfaces between WASM and the DOM.
| continuational wrote:
| The extra DOM complexity that would entail seems like a loss
| for the existing web.
| ethan_smith wrote:
| The TC39 proposals for "Resizable ArrayBuffer" and the
| "String.prototype.isWellFormed" method are steps in this
| direction, though we still need proper zero-copy UTF-8 string
| views.
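
To make the UTF-8 vs UTF-16 mismatch in this subthread concrete, here is a minimal sketch that compares the size of the same text in both encodings. The helper names `utf16Units` and `utf8Bytes` are made up for illustration; only `TextEncoder` and `String.prototype.length` are standard.

```typescript
// JS strings are sequences of UTF-16 code units; .length counts those units.
function utf16Units(s: string): number {
  return s.length;
}

// TextEncoder always encodes to UTF-8, the encoding most WASM-side
// languages (Rust, for example) use for their native strings.
function utf8Bytes(s: string): number {
  return new TextEncoder().encode(s).length;
}

// Escapes are used so the exact code points are unambiguous.
const samples = ["hello", "h\u00e9llo", "\uD83D\uDE00"];
for (const s of samples) {
  console.log(JSON.stringify(s), utf16Units(s), utf8Bytes(s));
}
// "hello"      -> 5 UTF-16 units, 5 UTF-8 bytes
// "h\u00e9llo" -> 5 UTF-16 units, 6 UTF-8 bytes
// U+1F600      -> 2 UTF-16 units, 4 UTF-8 bytes
```

Because the unit counts differ as soon as the text leaves ASCII, an offset or length computed on one side of the boundary cannot be reused on the other without conversion.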
| samwillis wrote:
| It's great that the rust community are finding ways to improve
| the performance of decoding strings from WASM to js, it's one of
| the major performance holes you hit when using WASM.
|
| The issue comes down to the fact that even if your WASM code can
| return a utf16 buffer, to use it as a string in JS code the
| engine needs to make a copy at some point. The TextDecoder api
| does a good first job of making this efficient, ensuring there is
| just a single copy, but it's still overhead.
|
| Ideally there should be a way to wrap an array buffer with a
| "String View", offloading the responsibility of ensuring it's
| utf16 to the WASM code, and there being no copy made. But that
| brings a ton of complexities as strings need to be immutable in
| js, but the underlying buffer could still be changed.
| breve wrote:
| The JS string built-ins proposal for WebAssembly:
|
| https://github.com/WebAssembly/js-string-builtins/blob/main/...
| samwillis wrote:
| Personally I feel this is backwards - I don't want access to
| js literals and objects from WASM, I just want a way to wrap
| an arbitrary array buffer that contains a utf16 string as a
| js string.
|
| It keeps WASM simple and provides a thin layer as an
| optimisation.
| vanderZwan wrote:
| > _It keeps WASM simple_
|
| At the cost of complicating JS string implementations,
| probably to the point of undoing the benefits.
|
| Currently JS strings are immutable objects, allowing for
| all kinds of optimization tricks (interning, ropes, etc.).
| Having one string represented by a mutable arraybuffer
| messes with that.
|
| There are probably also security concerns with allowing
| mutable access to string internals inside the JS engine
| side.
|
| So the simple-appearing solution you suggested would be
| rejected by all major browser vendors who back the various
| WASM and JS engines.
|
| Access to constant JS strings without any form of
| mutability is the only realistic option for accessing JS
| strings. And creating constant strings is the only one for
| sending them back.
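
The copy-versus-view tension in this subthread can be sketched as follows. This is an illustration under stated assumptions, not how any engine is implemented: the `memory` array is a stand-in for WASM linear memory, `decodeAt` is a hypothetical helper, and UTF-8 is used instead of UTF-16 to keep the example small.

```typescript
// Stand-in for WASM linear memory that already holds the bytes of "hello".
const memory = new Uint8Array(new TextEncoder().encode("hello"));
const decoder = new TextDecoder("utf-8");

// Hypothetical helper: decode `len` bytes starting at `ptr`.
// subarray() is only a view, but decode() produces a brand-new JS string,
// so the bytes are copied (and transcoded) at this point.
function decodeAt(mem: Uint8Array, ptr: number, len: number): string {
  return decoder.decode(mem.subarray(ptr, ptr + len));
}

const before = decodeAt(memory, 0, 5);

// The WASM side is free to overwrite its memory afterwards...
memory[0] = 0x6a; // ASCII 'j'

const after = decodeAt(memory, 0, 5);

// ...and the string decoded earlier is unaffected, because it is a copy.
console.log(before, after); // hello jello
```

A zero-copy "string view" would have to give up exactly this property: the write to `memory[0]` would become observable through a value that JS code expects to be immutable.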
| nhatcher wrote:
| I wrote a while back about a somewhat related issue:
|
| https://www.nhatcher.com/post/should_i_import_or_should_i_ro...
|
| The code is a bit outdated, but the principle of linking against
| the browser implementation stands
| bcardarella wrote:
| How does the performance compare to projects like Wasmtime?
| Evan-Almloff wrote:
| The two projects have different use cases so they can't be
| directly compared. Sledgehammer bindgen makes calling javascript
| from rust faster in the browser. Wasmtime is a native runtime
| for WASM outside of the browser
| CyanLite2 wrote:
| Sad that this isn't natively in browsers...
| vanderZwan wrote:
| > _Wasm-bindgen calls TextDecoder.decode for every string.
| Sledgehammer only calls TextDecoder.decode once per batch._
|
| So they decode one long concatenated string and then on the JS
| side split it into substrings? I wonder if that messes with the
| GC on the JS side of things.
| boomskats wrote:
| How would splitting it into substrings be different from
| decoding individual strings from an allocation/gc perspective?
| If anything I'd assume splitting a substring was more efficient
| - I expect there's a ton of optimisations in js for sliced
| strings or whatever as it's been around for ages.
| vanderZwan wrote:
| I imagine it's faster during creation because there are fewer
| allocations for a backing array for the string content (one,
| basically, unless they move stuff around). But then that can
| also mean holding on to the entire backing array even if only
| one of the strings is still "alive", unless there are
| optimizations for reclaiming memory in those situations too.
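
The "decode once, then split" idea discussed above can be sketched roughly like this. The layout is an assumption made for illustration and is not taken from the Sledgehammer source: one concatenated UTF-8 buffer plus a side list of lengths measured in UTF-16 code units, so that they can be applied directly to the decoded JS string. `decodeBatch` is a hypothetical name.

```typescript
// Decode the whole batch with a single TextDecoder.decode call, then cut the
// resulting JS string into pieces using lengths given in UTF-16 code units.
function decodeBatch(bytes: Uint8Array, lengths: number[]): string[] {
  const all = new TextDecoder("utf-8").decode(bytes);
  const out: string[] = [];
  let pos = 0;
  for (const len of lengths) {
    out.push(all.substring(pos, pos + len));
    pos += len;
  }
  return out;
}

// Simulate the producer side: concatenate the strings and record each length.
const parts = ["div", "class", "h\u00e9llo"];
const batchBytes = new TextEncoder().encode(parts.join(""));
const batchLengths = parts.map((p) => p.length);

const decoded = decodeBatch(batchBytes, batchLengths);
console.log(decoded.length); // 3
```

Whether each `substring` result gets its own storage or keeps a reference into the large parent string is an engine implementation detail, which is the memory-retention question raised in this subthread.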
| MuffinFlavored wrote:
| I think there is a ton of room left on the table here for
| innovation.
|
| Context: as far as I know Electron is still the king if you want
| to do (unsafe but performant) "IPC/RPC" between native and a
| webview.
|
| All of the other options that exist in other languages (Deno,
| Rust, you name it) do the same "stringified JSON back and forth"
| which really isn't great for performance in my opinion.
|
| It'd be cool if (obviously in a sandboxed or secure way) you
| could _opt in_ to something, albeit a bit reckless: some way
| to provide native methods for the WASM part of V8 and its WebView
| (thinking Electron-esque here) to call.
| boomskats wrote:
| I'm not sure if I'm understanding you correctly, but vanilla
| wasm ipc works by sharing linear memory, where it's up to the
| implementation to choose the data encoding
| (arrow/proto/whatever). In the case of wasm-bindgen's dom
| manipulation api, the implementation serialises individual
| commands and sends them over the boundary, with any string
| params for each command being deserialised individually, and
| this project improves on that by batching them all into one big
| string thus reducing the deserialisation overhead. However, the
| string encoding is specific to that use case - it's not a
| general wasm ipc mechanism.
|
| VSCode IPC is kinda similar as it's designed to facilitate
| comms over an enforced process isolation barrier to protect the
| main thread from slow extensions etc. but it's actually IPC
| there (as in, there are multiple processes at the os level).
| The wasm/js stuff is handled within the same v8 context - it's
| not actually ipc.
|
| (Happy to be corrected here, but this is my understanding)
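
As a rough illustration of "sharing linear memory, where it's up to the implementation to choose the data encoding", here is a sketch with an invented wire format: one opcode byte, a little-endian u32 payload length, then the payload. The plain `ArrayBuffer` stands in for a `WebAssembly.Memory` buffer, and `writeCommand` / `readCommand` are hypothetical helpers, not part of wasm-bindgen or Sledgehammer.

```typescript
// Hypothetical format: [u8 opcode][u32 LE payload length][payload bytes]
const HEADER_SIZE = 5;

// Writes one command at `offset` and returns the offset of the next command.
function writeCommand(buf: ArrayBuffer, offset: number, opcode: number, payload: Uint8Array): number {
  const view = new DataView(buf);
  view.setUint8(offset, opcode);
  view.setUint32(offset + 1, payload.length, true); // true = little-endian
  new Uint8Array(buf, offset + HEADER_SIZE, payload.length).set(payload);
  return offset + HEADER_SIZE + payload.length;
}

// Reads one command at `offset`. The payload is a view into the same buffer,
// so nothing is copied until the consumer decides to decode it.
function readCommand(buf: ArrayBuffer, offset: number): { opcode: number; payload: Uint8Array; next: number } {
  const view = new DataView(buf);
  const opcode = view.getUint8(offset);
  const len = view.getUint32(offset + 1, true);
  const payload = new Uint8Array(buf, offset + HEADER_SIZE, len);
  return { opcode, payload, next: offset + HEADER_SIZE + len };
}

const shared = new ArrayBuffer(64); // stand-in for linear memory
const end = writeCommand(shared, 0, 1, new TextEncoder().encode("div"));
const cmd = readCommand(shared, 0);
console.log(cmd.opcode, new TextDecoder().decode(cmd.payload), end); // 1 div 8
```

Both sides only need to agree on the byte layout; no process boundary is crossed, which matches the point above that this is not IPC in the operating-system sense.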
| MuffinFlavored wrote:
| https://github.com/webview/webview_deno
|
| Tell me how you'd do "native C/C++ FFI (to like a .so or
| .dylib or .dll)" between the webview using WASM or anything
| other than "WebKit's built in JSON-string based IPC"
|
| Like a <button> that triggers a DLL call. How would you
| achieve it with WASM? How does WASM act as the bridge to the
| DOM and/or native? It doesn't, right?
| Evan-Almloff wrote:
| Most of the optimizations apply equally to calling into
| webview JS from a native application for something like dom
| manipulation. There is a wry example that uses the system
| webview in the repo:
| https://github.com/ealmloff/sledgehammer_bindgen/blob/master...
| MuffinFlavored wrote:
| https://github.com/ealmloff/sledgehammer_bindgen/blob/master...
|
| This is fetch(), which I have to imagine is much less
| performant than Electron with a require('native-module')
___________________________________________________________________
(page generated 2025-07-26 23:00 UTC)