[HN Gopher] WebAssembly: TinyGo vs. Rust vs. AssemblyScript
___________________________________________________________________
WebAssembly: TinyGo vs. Rust vs. AssemblyScript
Author : buradol
Score : 87 points
Date : 2022-11-27 09:53 UTC (13 hours ago)
(HTM) web link (ecostack.dev)
(TXT) w3m dump (ecostack.dev)
| ungawatkt wrote:
| a little off topic, but something I've been wondering:
|
| With my small learning project using tinygo, disabling the GC in
| tinygo sped things up a good amount (unsurprisingly), and since
| the wasm part is in a webworker it seemed safe enough (the
| webworker terminating cleared the allocations as far as I could
| tell). It seems to work nicely for the project, basically "apply
| an image transform", but would this pattern be good in a more
| serious code base?
| aatd86 wrote:
| I love Go but I must confess I was a bit skeptical when I read
| the article:
|
| 1. In my own experience, Firefox is usually faster than Chrome at
| executing wasm.
|
| 2. I'm not a Rust programmer, but the performance of the Rust
| implementation is a bit surprising compared to TinyGo.
|
| Still, tinyGo is very impressive. Very glad we have this project.
| postalrat wrote:
| Was the test here on the author's understanding of data types and
| sort algorithms?
| techn00 wrote:
| Does this still hold up if you don't use fixed size arrays?
| qayxc wrote:
| Oh my, where to start... This little article shows everything
| wrong with software devs these days :) First of all, the
| implementations aren't doing the same thing, which is bad for a
| proper comparison; but others pointed that out already.
|
| What bugs me the most, though, is the atrocious JavaScript
| implementation. On my aging machine, none of the browsers I
| tried it on (Edge, Opera, Chrome, and Firefox) - not a single
| one! - even ran the original version. All of them refused to run
| the script and timed out.
|
| I did a very, very simple change and just used the proper tool
| for the job - TypedArrays:
|
|     function testSort() {
|       const length = 100_000
|       const arr = new Uint8Array(length)
|       for (let i = 0; i < arr.length; i++) {
|         arr[i] = Math.random() * 255 // the original code produced float values here
|       }
|       const temp = new Uint8Array(length)
|       for (let i = 0; i < 500; i++) {
|         temp.set(arr, 0) // shorter than a manual for-loop and about 6+% faster
|         temp.sort()
|       }
|     }
|
| And wouldn't you know it, Firefox now runs this function. I did
| my testing on https://jsbench.me, put the function in the setup
| section and the actual call in test cases.
|
| I added another function, this time using an Int32Array instead
| and random values between -1_000 and +1_000.
|
| Just to be thorough and because I found it interesting, I also
| added a Float32Array version this time using the result from
| Math.random() without any scaling.
|
| This leads to an interesting observation: there seems to be a
| huge difference in optimisation between Firefox and
| Chromium-based browsers with TypedArrays:
|
| Test           | Performance                     | Browser
| ---------------+---------------------------------+---------
| testSort_uint8 | 0.75 ops/s +-1.8% (~1.33s/call) | Edge
| testSort_uint8 | 29.33 ops/s +-2.3% (~34ms/call) | Firefox
| ---------------+---------------------------------+---------
| testSort_int32 | 0.64 ops/s +-1% (~1.56s/call)   | Edge
| testSort_int32 | 1.44 ops/s +-0.2% (~694ms/call) | Firefox
| ---------------+---------------------------------+---------
| testSort_f32   | 0.16 ops/s +-0.6% (~6.25s/call) | Edge
| testSort_f32   | 1.49 ops/s +-0.3% (~671ms/call) | Firefox
|
| In Firefox, the results were *really* interesting. The dramatic
| difference between uint8 and int32 in particular is quite
| astonishing. Unlike the author, I found zero difference between
| Edge, Opera, and Chrome in terms of performance, which doesn't
| surprise me as they all use the same engine.
|
| While Firefox's SpiderMonkey engine is faster in all scenarios,
| the uint8-version is faster by an order of magnitude, whereas V8
| shows comparatively little difference (~15%) in the integer case
| and absolutely tanks performance with float values.
|
| This last result in particular might explain why the original
| version was both unable to start on my machine and also slow in
| the author's browsers.
|
| I hope you found this as interesting as I did and maybe even find
| a place to apply these findings in your projects.
|
| Cheers.
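The Int32Array and Float32Array variants described in the comment above are not shown there. A minimal sketch of how they might look follows; `makeSortTest`, its parameters, and the exact formula for the -1000..1000 range are assumptions introduced here, not the commenter's code.

```javascript
// Hypothetical helper: builds a sort test for a given typed-array type.
// `fill` produces one random element; `length` and `rounds` default to
// the values used in the comment (100_000 elements, 500 sorts).
function makeSortTest(TypedArrayCtor, fill, length = 100_000, rounds = 500) {
  return function testSort() {
    const arr = new TypedArrayCtor(length)
    for (let i = 0; i < arr.length; i++) {
      arr[i] = fill()
    }
    const temp = new TypedArrayCtor(length)
    for (let i = 0; i < rounds; i++) {
      temp.set(arr, 0) // restore the unsorted data before each sort
      temp.sort()      // typed arrays sort numerically by default
    }
    return temp
  }
}

// Integers in [-1000, 1000] (assumed formula for the Int32Array variant).
const testSort_int32 = makeSortTest(Int32Array, () => Math.floor(Math.random() * 2001) - 1000)
// Unscaled Math.random() output, as described for the Float32Array variant.
const testSort_f32 = makeSortTest(Float32Array, () => Math.random())
```

Passing smaller `length` and `rounds` values makes the same functions quick enough to try in a console before running the full-size benchmark.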
| goliatone wrote:
| OT but I've been working with WASM and TinyGo and hit an
| unexpected roadblock: TinyGo lacks JSON support (because the
| native Go implementation uses reflection).
|
| That meant that I couldn't find a way to serialize structs
| between host and WASM compiled functions/modules.
|
| The initial implementation of what I wanted to do took me half an
| hour. Trying and testing serialization libraries (JSON, protobuf,
| msgpack) took me four days, and I still couldn't find one that
| worked (working on an Apple M1 made things more complicated). The
| one left to try is karmem but I just had to move on and be
| productive.
|
| I used goja instead to get unstuck but now I want to find a
| working solution :)
| usrusr wrote:
| Next step in result assessment would be a few less trivial
| examples and comparing how file sizes go up between languages: is
| the bulk of the observed "Rust tax" a one-time investment that
| won't grow for less trivial code, or will the size keep growing in
| an almost linear way, keeping the file size ratios roughly the same?
| zRedShift wrote:
| Your Go uses `pdqsort` to sort 4 byte ints from 0 to 100, while
| Rust uses a stable sort (`sort_unstable` is equivalent to
| `pdqsort`) on single byte integers from 0 to 255. Hardly a fair
| comparison.
| schemescape wrote:
| Seems more like a test of the speed of the random number
| generator. Never mind that the JS version doesn't use "slice" to
| copy the array...
|
| Edit: they're not even using the same data types in the arrays.
| Very misleading. I would caution against drawing any conclusions
| from this article.
|
| Edit again: someone pointed out that the random number generation
| happens in an outer loop, so probably not a big impact. The point
| about slice still stands.
| zRedShift wrote:
| It's also not measuring rng speed, it's measuring memcpy and
| sort speed.
| schemescape wrote:
| Not sure I follow. Math.random() is in testSort, so the time
| to generate the random numbers is part of the measurement
| (even though it almost certainly shouldn't be).
|
| Edit: my main point is that there are many flaws in this
| comparison, so I wouldn't draw any conclusions from the
| measurements in the article. They're pretty much meaningless.
| zRedShift wrote:
| I agree, but what I'm saying is, it only generates the
| numbers once and then sorts them 500 times. Yes it's a
| flawed measurement because it measures the generation time
| + 500 sorts, but the time to generate the numbers is
| probably minuscule compared to the sorting.
|
| There are many more flaws, as you say. The biggest flaw is
| the stable vs unstable sort comparison, but it looks like
| the article author (not OP) fixed it half an hour ago
| and updated the article.
| schemescape wrote:
| You're right, it uses the same array 500 times and then
| runs an outer loop (with a new array each time) 5 times.
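The point about `slice` raised in this subthread can be illustrated with a short sketch. `copyWithLoop` and `copyWithSlice` are hypothetical names, and the article's actual copy loop is not reproduced here.

```javascript
// Copy by hand, element by element (the pattern being criticised).
function copyWithLoop(src) {
  const dst = new Array(src.length)
  for (let i = 0; i < src.length; i++) {
    dst[i] = src[i]
  }
  return dst
}

// Copy with the built-in: slice() with no arguments returns a shallow copy.
function copyWithSlice(src) {
  return src.slice()
}

const original = [3, 1, 2]
const copy = copyWithSlice(original)
copy.sort((a, b) => a - b) // sorting the copy leaves the original untouched
console.log(original) // [ 3, 1, 2 ]
console.log(copy)     // [ 1, 2, 3 ]
```

Both functions return an independent array; the difference is only in how much work is left to the engine's built-in.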
| chrismandelics wrote:
| Plain JS numeric code _can_ be really fast if you stick to typed
| arrays.
|
| I tried the posted JS in Chrome on an aging thinkpad and each
| `testSort()` took as long as 125 seconds (32 seconds in Firefox).
| But when I replaced `Array` with `Float32Array` the runtime
| dropped to under 7 seconds (about 2 seconds in Firefox) -- in
| line with the wasm alternatives. Going further and filling an
| `Int32Array` with `(Math.random() * 2.0 - 1.0) * 100` (as in the
| AssemblyScript code) brought the Chrome runtime down to around
| 1.6 sec (still around 2 seconds in Firefox).
| chrismandelics wrote:
| Also worth noting: default JS `Array.sort()` converts
| everything to strings and compares them lexically -- so 80
| comes before 9. It's a little faster to sort them as numbers:
| `.sort((a, b) => a - b)` takes me around 40 seconds in Chrome
| (11 seconds in Firefox).
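The default string comparison mentioned above is easy to see on a tiny input. The values below are made up for illustration.

```javascript
const values = [80, 9, 100, 1]

// Default comparison converts elements to strings: "1" < "100" < "80" < "9".
const lexical = [...values].sort()
console.log(lexical) // [ 1, 100, 80, 9 ]

// A numeric comparator gives the expected order.
const numeric = [...values].sort((a, b) => a - b)
console.log(numeric) // [ 1, 9, 80, 100 ]

// Typed arrays compare numerically even without a comparator.
const typed = Int32Array.from(values).sort()
console.log(Array.from(typed)) // [ 1, 9, 80, 100 ]
```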
| ankrgyl wrote:
| The Rust vs. Go comparison has two key differences:
|
| - The Rust example uses 8 bit unsigned ints vs. Go example uses
| 32 bit signed ints
|
| - Rust's sort is stable _by default_ whereas Go's is not.
|
| If you tweak the Rust benchmark to use `i32` instead of `u8` and
| `sort_unstable` instead of `sort`, you should see ~3-4x faster
| performance.
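For readers unfamiliar with the distinction: a stable sort keeps elements that compare equal in their original relative order, which costs extra work that an unstable sort can skip. The sketch below shows the property in JavaScript rather than Rust or Go, since `Array.prototype.sort` is required to be stable as of ES2019; the records are made up.

```javascript
// Made-up records: two pairs share the same key.
const records = [
  { key: 2, label: 'a' },
  { key: 1, label: 'b' },
  { key: 2, label: 'c' },
  { key: 1, label: 'd' },
]

// Sort by key only. A stable sort keeps 'b' before 'd' and 'a' before 'c'.
const sorted = [...records].sort((x, y) => x.key - y.key)
console.log(sorted.map(r => r.label).join('')) // bdac
```

For plain integers, equal elements are indistinguishable, so stability cannot be observed in the output and an unstable sort gives the same result.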
| zRedShift wrote:
| Made a PR with the fixes, Rust is now 3 times faster than
| tinygo, and the wasm is almost 3 times smaller (wasm+js is
| twice as small) as expected.
|
| https://github.com/Ecostack/wasm-rust-go-asc/pull/1
|
| My first foray into wasm, so I probably missed some
| optimizations like wasm-opt.
| twp wrote:
| The Go version should also use `sort.Ints`.
| https://pkg.go.dev/sort#Ints
| miohtama wrote:
| Also I would assume different languages have different random()
| implementations which could contribute to the run time. So to
| make tests equal, you should not measure time to set up the
| array.
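One way to keep array setup out of the measurement is sketched below. `timeSorts` is a hypothetical harness, not code from the article, and the small sizes in the usage line are chosen only so it finishes quickly.

```javascript
// Hypothetical harness: build the input outside the timed region so that
// only the copy + sort work is measured.
function timeSorts(length = 100_000, rounds = 500) {
  // Setup (not timed): fill the input with random bytes.
  const input = new Uint8Array(length)
  for (let i = 0; i < input.length; i++) {
    input[i] = Math.floor(Math.random() * 256)
  }
  const scratch = new Uint8Array(length)

  // Timed region: restore the unsorted data, then sort it.
  const start = performance.now()
  for (let i = 0; i < rounds; i++) {
    scratch.set(input)
    scratch.sort()
  }
  const elapsedMs = performance.now() - start
  return { elapsedMs, sorted: scratch }
}

const { elapsedMs } = timeSorts(10_000, 10)
console.log(`10 sorts of 10000 bytes: ${elapsedMs.toFixed(1)} ms`)
```

The copy is still inside the timed loop, so this measures copy plus sort, which matches what the thread says the benchmark is actually exercising.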
| protoduction wrote:
| One can speed up the AssemblyScript implementation by using
| StaticArray and adding unchecked() around the indexing - this
| makes it some 25% faster.
|
| Further, by using the smallest stub runtime (= without garbage
| collector) one can drop filesize by around half, making the
| smallest implementation even smaller (4.7kb -> 2.8kb unminified).
| Quite a lot of that code is the random initialization. If you do
| that in JS it may just be smaller than the plain JS
| implementation.
|
| And still it's a flawed comparison: ASC is using a stable sorting
| algorithm (with a comparator function). It's difficult to compare
| approaches based on such small benchmarks; what's really being
| compared is the built-in sorting algorithm and the bundle size
| overhead of the different languages.
___________________________________________________________________
(page generated 2022-11-27 23:01 UTC)