[HN Gopher] WebAssembly: TinyGo vs. Rust vs. AssemblyScript
       ___________________________________________________________________
        
       WebAssembly: TinyGo vs. Rust vs. AssemblyScript
        
       Author : buradol
       Score  : 87 points
       Date   : 2022-11-27 09:53 UTC (13 hours ago)
        
 (HTM) web link (ecostack.dev)
        
       | ungawatkt wrote:
       | a little off topic, but something I've been wondering:
       | 
       | With my small learning project using tinygo, disabling the GC in
       | tinygo sped things up a good amount (unsurprisingly), and since
       | the wasm part is in a webworker it seemed safe enough (the
       | webworker terminating cleared the allocations as far as I could
       | tell). It seems to work nicely for the project, basically "apply
       | an image transform", but would this pattern be good in a more
       | serious code base?
        
       | aatd86 wrote:
       | I love Go but I must confess I was a bit skeptical when I read
       | the article:
       | 
       | 1. In my own experience, firefox is usually faster than chrome
       | at executing wasm.
       | 
       | 2. I'm not a rust programmer but the performance of the rust
       | implementation is a bit surprising compared to tinyGo.
       | 
       | Still, tinyGo is very impressive. Very glad we have this project.
        
       | postalrat wrote:
       | Was the test here on the author's understanding of data types and
       | sort algorithms?
        
       | techn00 wrote:
       | Does this still hold up if you don't use fixed size arrays?
        
       | qayxc wrote:
       | Oh my, where to start... This little article shows everything
       | wrong with software devs these days :) First of all, the
       | implementations aren't doing the same thing, which is bad for a
       | proper comparison; but others pointed that out already.
       | 
       | What bugs me the most, though, is the atrocious JavaScript
       | implementation. On my aging machine, none of the browsers I
       | tried it on (Edge, Opera, Chrome, and Firefox) - not a single
       | one! - even ran the original version. All of them refused to run
       | the script and timed out.
       | 
       | I did a very, very simple change and just used the proper tool
       | for the job - TypedArrays:
       | 
       |     function testSort() {
       |       const length = 100_000
       |       const arr = new Uint8Array(length)
       |       for (let i = 0; i < arr.length; i++) {
       |         arr[i] = Math.random() * 255 // the original code produced float values here
       |       }
       |       const temp = new Uint8Array(length)
       |       for (let i = 0; i < 500; i++) {
       |         temp.set(arr, 0) // shorter than a manual for-loop and about 6+% faster
       |         temp.sort()
       |       }
       |     }
       | 
       | And wouldn't you know it, Firefox now runs this function. I did
       | my testing on https://jsbench.me, put the function in the setup
       | section and the actual call in test cases.
       | 
       | I added another function, this time using an Int32Array instead
       | and random values between -1_000 and +1_000.
       | 
       | Just to be thorough and because I found it interesting, I also
       | added a Float32Array version this time using the result from
       | Math.random() without any scaling.
       | 
       | This leads to an interesting observation: there seems to be a
       | huge difference in optimisation between Firefox and
       | Chromium-based browsers with TypedArrays:
       | 
       |     Test           | Performance                    | Browser
       |     ---------------+--------------------------------+--------
       |     testSort_uint8 | 0.75 ops/s ±1.8% (~1.33s/call) | Edge
       |     testSort_uint8 | 29.33 ops/s ±2.3% (~34ms/call) | Firefox
       |     ---------------+--------------------------------+--------
       |     testSort_int32 | 0.64 ops/s ±1% (~1.56s/call)   | Edge
       |     testSort_int32 | 1.44 ops/s ±0.2% (~694ms/call) | Firefox
       |     ---------------+--------------------------------+--------
       |     testSort_f32   | 0.16 ops/s ±0.6% (~6.25s/call) | Edge
       |     testSort_f32   | 1.49 ops/s ±0.3% (~671ms/call) | Firefox
       | 
       | In Firefox, the results were *really* interesting. The dramatic
       | difference between uint8 and int32 in particular is quite
       | astonishing. Unlike the author, I found zero difference between
       | Edge, Opera, and Chrome in terms of performance, which doesn't
       | surprise me as they all use the same engine.
       | 
       | While Firefox's SpiderMonkey engine is faster in all scenarios,
       | the uint8-version is faster by an order of magnitude, whereas V8
       | shows comparatively little difference (~15%) in the integer case
       | and absolutely tanks performance with float values.
       | 
       | This last result in particular might explain why the original
       | version was both unable to start on my machine and also slow in
       | the author's browsers.
       | 
       | I hope you found this as interesting as I did and maybe even find
       | a place to apply these findings in your projects.
       | 
       | Cheers.
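
The Int32Array and Float32Array variants described above are not shown in the comment. A minimal sketch of how they might look, assuming they mirror the posted Uint8Array function: the helper `makeTestSort` and its parameters are hypothetical names introduced here, and only the names `testSort_int32` and `testSort_f32` come from the results table.

```javascript
// Hypothetical helper (not from the thread): builds a benchmark function
// for a given typed-array constructor and a function producing fill values.
function makeTestSort(TypedArrayCtor, fill, length = 100_000, rounds = 500) {
  return function testSort() {
    const arr = new TypedArrayCtor(length)
    for (let i = 0; i < arr.length; i++) {
      arr[i] = fill()
    }
    const temp = new TypedArrayCtor(length)
    for (let i = 0; i < rounds; i++) {
      temp.set(arr, 0) // restore the unsorted data before every sort
      temp.sort()      // typed arrays sort numerically by default
    }
    return temp
  }
}

// Integers between -1_000 and +1_000; storing a float in an Int32Array
// truncates it toward zero.
const testSort_int32 = makeTestSort(Int32Array, () => (Math.random() * 2 - 1) * 1_000)

// Unscaled Math.random() values in [0, 1).
const testSort_f32 = makeTestSort(Float32Array, () => Math.random())
```

On a site like jsbench.me the definitions would go in the setup section and a single call such as `testSort_int32()` in each test case, as the comment describes for the uint8 version.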
        
       | goliatone wrote:
       | OT but I've been working with WASM and TinyGo and hit an
       | unexpected roadblock: TinyGo lacks JSON support (because the
       | native Go implementation uses reflection).
       | 
       | That meant that I couldn't find a way to serialize structs
       | between host and WASM compiled functions/modules.
       | 
       | The initial implementation of what I wanted to do took me half an
       | hour. Trying and testing serialization libraries (JSON, protobuf,
       | msgpack) took me four days and I couldn't find one that worked
       | (working on an Apple M1 made things more complicated). The one
       | left to try is karmem but I just had to move on and be
       | productive.
       | 
       | I used goja instead to get unstuck but now I want to find a
       | working solution :)
        
       | usrusr wrote:
       | Next step in result assessment would be a few less trivial
       | examples and comparing how file sizes go up between languages: is
       | the bulk of the observed "Rust tax" a one-time investment that
       | won't grow for less trivial code, or will the size keep growing in
       | an almost linear way, keeping the file size ratios roughly the same?
        
       | zRedShift wrote:
       | Your go uses `pdqsort` to sort 4 byte ints from 0 to 100, while
       | rust uses a stable sort (`sort_unstable` is equivalent to
       | `pdqsort`) on single byte integers from 0 to 255. Hardly a fair
       | comparison.
        
       | schemescape wrote:
       | Seems more like a test of the speed of the random number
       | generator. Never mind that the JS version doesn't use "slice" to
       | copy the array...
       | 
       | Edit: they're not even using the same data types in the arrays.
       | Very misleading. I would caution against drawing any conclusions
       | from this article.
       | 
       | Edit again: someone pointed out that the random number generation
       | happens in an outer loop, so probably not a big impact. The point
       | about slice still stands.
        
         | zRedShift wrote:
         | It's also not measuring rng speed, it's measuring memcpy and
         | sort speed.
        
           | schemescape wrote:
           | Not sure I follow. Math.random() is in testSort, so the time
           | to generate the random numbers is part of the measurement
           | (even though it almost certainly shouldn't be).
           | 
           | Edit: my main point is that there are many flaws in this
           | comparison, so I wouldn't draw any conclusions from the
           | measurements in the article. They're pretty much meaningless.
        
             | zRedShift wrote:
             | I agree, but what I'm saying is, it only generates the
             | numbers once and then sorts them 500 times. Yes it's a
             | flawed measurement because it measures the generation time
             | + 500 sorts, but the time to generate the numbers is
             | probably minuscule compared to the sorting.
             | 
             | There are many more flaws, as you say, the biggest flaw is
             | the stable vs unstable sort comparison, but it looks like
             | the article author (not OP) has fixed it half an hour ago
             | and updated the article.
        
               | schemescape wrote:
               | You're right, it uses the same array 500 times and then
               | runs an outer loop (with a new array each time) 5 times.
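
The two points raised in this sub-thread (random number generation sitting inside the measured function, and the manual copy loop instead of `slice`) could be addressed along these lines. This is only a sketch under assumptions: `timeSorts` is a hypothetical helper name, the value range and round count are illustrative, and it is not the article author's code.

```javascript
// Hypothetical helper: times only the copy + sort work, not the data setup.
function timeSorts(source, rounds) {
  const start = performance.now()
  for (let i = 0; i < rounds; i++) {
    const copy = source.slice()  // slice() copies the array in one call
    copy.sort((a, b) => a - b)   // numeric comparator; the source stays unsorted
  }
  return performance.now() - start // elapsed time in milliseconds
}

// Setup happens outside the timed region, so RNG speed is not measured.
const length = 100_000
const data = new Array(length)
for (let i = 0; i < length; i++) {
  data[i] = Math.floor(Math.random() * 256)
}

const elapsedMs = timeSorts(data, 5)
console.log(`5 copy+sort rounds took ${elapsedMs.toFixed(1)} ms`)
```

Generating the data once and passing it in also makes it possible to feed identical input to every implementation being compared.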
        
       | chrismandelics wrote:
       | Plain JS numeric code _can_ be really fast if you stick to typed
       | arrays.
       | 
       | I tried the posted JS in Chrome on an aging thinkpad and each
       | `testSort()` took as long as 125 seconds (32 seconds in Firefox).
       | But when I replaced `Array` with `Float32Array` the runtime
       | dropped to under 7 seconds (about 2 seconds in Firefox) -- in
       | line with the wasm alternatives. Going further and filling an
       | `Int32Array` with `(Math.random() * 2.0 - 1.0) * 100` (as in the
       | AssemblyScript code) brought the Chrome runtime down to around
       | 1.6 sec (still around 2 seconds in Firefox).
        
         | chrismandelics wrote:
         | Also worth noting: default JS `Array.sort()` converts
         | everything to strings and compares them lexically -- so 80
         | comes before 9. It's a little faster to sort them as numbers:
         | `.sort((a, b) => a - b)` takes me around 40 seconds in Chrome
         | (11 seconds in Firefox).
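
A small illustration of the sorting behaviour described here, using made-up sample values: the default `Array` sort compares string forms, a numeric comparator fixes the order, and typed arrays sort numerically without one.

```javascript
const values = [9, 80, 100, 25]

// Default Array sort compares the string forms: "100" < "25" < "80" < "9".
const lexical = [...values].sort()

// A numeric comparator gives ascending numeric order.
const numeric = [...values].sort((a, b) => a - b)

// Typed arrays sort numerically even without a comparator.
const typed = Float32Array.from(values).sort()

console.log(lexical.join(", ")) // 100, 25, 80, 9
console.log(numeric.join(", ")) // 9, 25, 80, 100
console.log(typed.join(", "))   // 9, 25, 80, 100
```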
        
       | ankrgyl wrote:
       | The Rust vs. Go comparison has two key differences:
       | 
       | - The Rust example uses 8 bit unsigned ints vs. Go example uses
       | 32 bit signed ints
       | 
       | - Rust's sort is stable _by default_ whereas Go's is not.
       | 
       | If you tweak the Rust benchmark to use `i32` instead of `u8` and
       | `sort_unstable` instead of `sort`, you should see ~3-4x faster
       | performance.
        
         | zRedShift wrote:
         | Made a PR with the fixes, Rust is now 3 times faster than
         | tinygo, and the wasm is almost 3 times smaller (wasm+js is
         | twice as small) as expected.
         | 
         | https://github.com/Ecostack/wasm-rust-go-asc/pull/1
         | 
         | My first foray into wasm, so I probably missed some
         | optimizations like wasm-opt.
        
         | twp wrote:
         | The Go version should also use `sort.Ints`.
         | https://pkg.go.dev/sort#Ints
        
         | miohtama wrote:
         | Also I would assume different languages have different random()
         | implementations which could contribute to the run time. So to
         | make tests equal, you should not measure time to set up the
         | array.
        
       | protoduction wrote:
       | One can speed up the AssemblyScript implementation by using
       | StaticArray and adding unchecked() around the indexing - this
       | makes it some 25% faster.
       | 
       | Further, by using the smallest stub runtime (= without garbage
       | collector) one can drop filesize by around half, making the
       | smallest implementation even smaller (4.7kb -> 2.8kb unminified).
       | Quite a bit of that code is the random initialization; if you do
       | that in JS it may just be smaller than the plain JS
       | implementation.
       | 
       | And still it's a flawed comparison: ASC is using a stable sorting
       | algorithm (with a comparator function). It's difficult to compare
       | approaches based on such small benchmarks; what's really being
       | compared is the built-in sorting algorithm and the bundle size
       | overhead of the different languages.
        
       ___________________________________________________________________
       (page generated 2022-11-27 23:01 UTC)