[HN Gopher] Ruby 3.2 preview 1 with support for WASM compilation
___________________________________________________________________
Ruby 3.2 preview 1 with support for WASM compilation
Author : pvsukale3
Score : 228 points
Date : 2022-04-07 17:38 UTC (5 hours ago)
(HTM) web link (www.ruby-lang.org)
(TXT) w3m dump (www.ruby-lang.org)
| freedomben wrote:
| Timeouts for Regexp are quite interesting. The engineering purist
| in me is saddened by the thought, but it does seem highly
| practical.
|
| The syntax feels a little rough, although I have no idea how to
| make it better:
|
|     Regexp.timeout = 1.0
|     ...
|     /^a*b?a*$/ =~ "a" * 50000 + "x"
|
| I think I would favor the
|
|     long_time_re = Regexp.new("^a*b?a*$", timeout: 1.0)
|
| version instead but I use the `=~` almost entirely, so that would
| still be a big style change. Probably end up setting a global
| timeout per app and then overriding for individual checks as
| needed?
| speedgoose wrote:
| A timeout for regexps makes so much sense. And it would end all
| these denial of service security reports.
| Thaxll wrote:
| It does not make sense to me; the best solution is to have an
| implementation like re2 that does not have those problems.
|
| Adding a timeout is a bit strange, first because you don't
| know in advance how long a large search is going to take.
| The timeout is a failsafe against something that should be
| fixed in the first place.
| jatone wrote:
| That's the problem with all these syntax sugar features in
| languages. You literally can't change them without blowing
| up your entire ecosystem.
| djur wrote:
| How would this problem be different if Ruby did not have
| syntax support for regexes and instead offered a regex
| module in the standard library?
| JohnBooty wrote:
| You _can_ use re2 in Ruby if you like; it's just not the
| default. https://github.com/mudge/re2
|
| > best solution is to have an implementation like re2 that
| > does not have those problems.
|
| By design RE2 isn't fully compatible with Onigmo. As
| another poster mentioned, a hybrid "use RE2 when possible;
| fall back to Onigmo otherwise" approach was considered and
| rejected for well-explained reasons:
| https://bugs.ruby-lang.org/issues/18653
|
| Maybe in addition to `Regexp.timeout = 1.0` there could
| also be a `Regexp.parser = :re2` option with `:onigmo`
| being the default.
| messe wrote:
| I think a limit on stack/recursion/backtracking depth would
| be a tad more elegant than a timeout and would keep
| your code's behaviour the same between different machines.
| capableweb wrote:
| It'd be harder to control the perceived performance of
| user-facing applications, though. If I can set the timeout, I
| can guarantee that something happens within X seconds,
| instead of within X iterations, which could perform
| differently from machine to machine.
| MaxLeiter wrote:
| Is this not the halting problem? You can guarantee a
| certain depth isn't reached, but you can't guarantee a
| recursion will unwind or anything like that.
| __s wrote:
| It's not the halting problem; it's bounding computational
| depth
|
| Their suggestion is essentially making stack overflow a
| feature in regex, & then allowing that stack depth to be
| tuned
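A minimal Ruby sketch of the step-budget idea from this subthread. It is not how Onigmo works: `bt_match` and `match_within_steps` are hypothetical names, and the toy matcher only understands literal characters and `c*`, matched against the whole input. The pattern `a*a*b` is a stand-in chosen for simplicity, not the one quoted in the thread. The cutoff is counted in backtracking steps rather than seconds, so the same input is rejected at the same point on any machine.

```ruby
# Toy backtracking matcher: literal characters plus "c*" (zero or more
# of one character). The whole input must match.
# `budget` is a one-element array so the recursive calls share one counter.
def bt_match(pattern, input, pi, ii, budget)
  budget[0] -= 1
  throw :budget_exceeded if budget[0] < 0

  # Pattern exhausted: succeed only if the input is exhausted too.
  return ii == input.length if pi == pattern.length

  if pattern[pi + 1] == "*"
    # Greedy star: find the longest run, then back off one char at a time.
    max = ii
    max += 1 while max < input.length && input[max] == pattern[pi]
    max.downto(ii) do |k|
      return true if bt_match(pattern, input, pi + 2, k, budget)
    end
    false
  else
    ii < input.length && input[ii] == pattern[pi] &&
      bt_match(pattern, input, pi + 1, ii + 1, budget)
  end
end

# Returns true/false for a completed match, or :budget_exceeded when the
# matcher used more than max_steps recursive calls.
def match_within_steps(pattern, input, max_steps)
  budget = [max_steps]
  catch(:budget_exceeded) do
    return bt_match(pattern, input, 0, 0, budget)
  end
  :budget_exceeded
end

puts match_within_steps("a*a*b", "aaab", 1000).inspect
puts match_within_steps("a*a*b", "a" * 200, 1000).inspect
```

The second call needs on the order of n^2 steps for an input of n "a" characters, so it runs out of budget, while the first finishes in a handful of steps.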
| andreynering wrote:
| I wouldn't expect people to have to change that setting often,
| and 1 second seems very reasonable to me.
|
| So yes, `Regexp.timeout` is supposed to be a default setting
| for the app, and when really needed you can override it with
| the `timeout:` key.
| JohnBooty wrote:
| Yeah the implementation they've chosen seems totally perfect
| to me. Sane global default, easily overridable globally or
| locally.
|
| There's no easy way to override it locally when using =~, but I
| can't imagine too many cases where you would want to use a local
| timeout anyway. You can just switch away from the =~ syntax for
| those.
|
| This is mostly a denial-of-service mitigation tool, something
| you'd just want to apply globally to avoid disasters spawned
| by malformed or malicious input. In practice, it's hard to
| imagine a use case where you'd really want to be twiddling
| the knobs on a regexp-by-regexp basis.
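A hedged Ruby sketch of the "global default, local override" pattern described above. `Regexp.timeout=`, the `timeout:` keyword and `Regexp::TimeoutError` are the Ruby 3.2 additions under discussion; the helper names `regexp_timeout_supported?` and `guarded_match` are made up for this example, and the 1.0 and 5.0 second values are arbitrary. Whether a given pattern actually hits the timeout depends on the Ruby version and the input, so no timeout is forced here.

```ruby
# Sketch only: the timeout API needs Ruby 3.2 or newer. On older Rubies
# the timeout-specific parts are skipped.
def regexp_timeout_supported?
  Regexp.respond_to?(:timeout=)
end

# App-wide default, set once at boot.
Regexp.timeout = 1.0 if regexp_timeout_supported?

# Hypothetical helper: run a match and turn a timeout into a symbol
# instead of letting the exception propagate to the caller.
def guarded_match(regexp, input)
  regexp.match?(input)
rescue Regexp::TimeoutError
  :timed_out
end

# Regexp literals (and =~) pick up the global default.
puts guarded_match(/^a*b?a*$/, "aab").inspect

# A per-pattern override has to go through Regexp.new.
if regexp_timeout_supported?
  lenient = Regexp.new("^a*b?a*$", timeout: 5.0)
  puts lenient.timeout.inspect
end
```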
| freedomben wrote:
| Yes, good point. I was initially thinking that it would make
| sense to always ask yourself "how long should this take" and
| tune appropriately, but for the vast majority of regexes
| that's overkill, especially if you're not doing anything
| O(n^2). Sticking a 1-second timeout in there gives you a lot of
| headroom, and you can just get more specific for any
| exceptions.
| JohnBooty wrote:
| > I was initially thinking that it would make sense to always
| > ask yourself "how long should this take" and tune
| > appropriately, but for the vast majority of regexes that's
| > overkill
|
| More than being overkill, it's actually impossible, right?
|
| The execution time will also vary greatly based on base CPU
| performance, and current server load.
|
| A regexp that takes 10ms to process right now might take
| 500ms tomorrow when your server is under heavy load. So we
| can't predict how much time each regexp "needs."
|
| But, like you said, we can set a somewhat ridiculously high
| limit to help prevent regex-based _oopsies_ or re-based DoS
| attacks from dragging us down =)
| ainar-g wrote:
| I wonder why they didn't just include an option to use a non-
| backtracking algorithm, like re2's[1]. As far as I know, that
| would completely eliminate the possibility of catastrophic
| backtracking happening.
|
| [1]: https://github.com/google/re2
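To illustrate why a non-backtracking approach cannot blow up, here is a small Ruby sketch that tracks the set of live automaton states instead of backtracking. The automaton for the thread's pattern `^a*b?a*$` is written by hand (an assumption of this sketch; a real engine such as re2 compiles it from the pattern), `linear_match?` is a hypothetical name, and `$` is treated as "end of the whole string".

```ruby
require "set"

# State-set simulation for ^a*b?a*$ with a hand-built automaton:
#   :first  - inside the first a*
#   :second - past the optional b, inside the second a*
# Skipping the optional b is folded into the table, so :first can also
# move to :second on "a".
def linear_match?(input)
  transitions = {
    first:  { "a" => [:first, :second], "b" => [:second] },
    second: { "a" => [:second] }
  }
  accepting = Set[:first, :second]

  # Start in both states because a*, b? and a* can all match nothing.
  current = Set[:first, :second]

  input.each_char do |ch|
    following = Set.new
    current.each do |state|
      following.merge(transitions[state].fetch(ch, []))
    end
    current = following
    return false if current.empty? # no live state left: cannot match
  end

  current.any? { |state| accepting.include?(state) }
end

puts linear_match?("aab").inspect
puts linear_match?("a" * 50_000 + "x").inspect
```

Each input character is examined once against at most two states, so the work grows linearly with the input length regardless of what the input contains.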
| byroot wrote:
| It was explored but decided against, at least for now
| https://bugs.ruby-lang.org/issues/18653
| dragonwriter wrote:
| Wrapping RE2 with a fallback to the existing engine to try
| to maintain compatibility was explored; that, like the
| timeout approach, is pretty clearly a stopgap measure;
| actually implementing an RE2-style algorithm without the
| compatibility and toolchain warts of RE2 for Ruby's
| existing code and functionality is a bigger but more
| permanent solution that I don't think has really been
| ruled out or explored.
| riffraff wrote:
| if you break compatibility you might as well just use
| some re2 bindings[0]
|
| [0] https://github.com/mudge/re2
| exfascist wrote:
| Better arguably would be to use a generator or continuation.
| marcus_cemes wrote:
| I've been tempted to learn Ruby a couple of times, being
| exhausted by the JS ecosystem. Everybody who's used it seems to
| fall in love with it, but I can't get over just how slow it is...
| It takes a fresh installation of Discourse over 10 minutes to
| start up again on a small underpowered VM, and it uses 10x as much
| RAM as an alternative platform such as Flarum.
| inopinatus wrote:
| My developer experience is that the long initial start time (of
| Rails in particular) is more than offset by my productivity.
| freedomben wrote:
| I'm one of those people that fell in love with Ruby, and yeah
| the speed is the biggest downside. That said, a lot of the bulk
| is often Rails. I usually use Sinatra now and it's pretty
| light. On the smallest VM it usually starts quickly and runs
| fine for quite a while. One even survived an HN Hug. There are
| also some big improvements coming with Ruby 3 (if you haven't
| already upgraded to that) and more to come. But you definitely
| "pay" a fee in CPU/memory for the privilege of using Ruby. In
| most cases, it's way worth it IMHO. I've also been loving
| Elixir lately. It's got much the same feeling of beauty that
| Ruby does, and it's much lighter and lightning fast. I often
| measure response times in microseconds rather than
| milliseconds!
| syrusakbary wrote:
| This is super exciting!
|
| They also created an awesome playground to try Ruby online [1]...
| all powered by Wasmer/WASI [2]!
|
| [1] https://try.ruby-lang.org/playground/
|
| [2] https://wasmer.io
| eatonphil wrote:
| This looks awesome! I've already played around with pyodide and
| coldbrew doing the same thing for CPython. I use it for an in-
| memory playground [0] of an open-source desktop app I build [1].
| I've been waiting for Ruby, Julia, and R support to add them in
| too.
|
| That said, I am not seeing a link in here about how to actually
| use this code. Is there a good tutorial/example somewhere?
|
| [0] app.datastation.multiprocess.io
|
| [1] github.com/multiprocessio/datastation
| swlkr wrote:
| One notable thing is packaging Ruby apps in a single .wasm file. This
| may make Ruby CLI apps easier, as well as eventually replacing
| things like Docker or shipping your Ruby code to a server.
| exdsq wrote:
| Why would it replace docker? You still need the dependencies of
| the CLI app
|
| Edit: Ah I guess it's just the WASM vm if it includes
| everything
| cpuguy83 wrote:
| You don't just execute a .wasm file, it requires a runtime
| which will JIT compile the code into machine code and handle
| the (wasi) system interfaces (e.g. read, write, stat, etc).
| specialp wrote:
| Yes, this is true of all interpreted languages. But consider
| the use case the OP was contrasting with (Docker): that not only
| has the Ruby runtime, but an entire Linux OS as well.
| qbasic_forever wrote:
| I was thinking the same thing. Isn't Ruby particularly hard to
| package, as it doesn't support static compilation? It would be
| nice to just sidestep all of that with a hermetic little WASM
| distribution.
| alberth wrote:
| Does this imply that Rails apps could run as WASM server apps and
| receive a huge performance boost?
| teeray wrote:
| I've found that Rails is like the Crysis of Ruby. Usually the
| answer to "will X ruby runtime run Rails?" is "not yet."
| pqdbr wrote:
| Lol, good analogy. "But can it run Rails?"
| eatonphil wrote:
| There are many other existing/mature Ruby bytecode VMs/JITs you
| could switch to before a WASM bytecode VM/JIT.
| JohnBooty wrote:
| Yes, but they're not portable/interoperable in the way that a
| WASM version would be -- which is why the WASM version is
| exciting, right?
|
| (Somebody correct me if I'm wrong; I know what WASM is but
| I'm not sure how it's employed in practice outside of in-
| browser tech demos of games and things)
| alberth wrote:
| Cloudflare Workers allows you to deploy server side WASM
| apps.
|
| https://blog.cloudflare.com/webassembly-on-cloudflare-worker...
| Mikeb85 wrote:
| It doesn't compile Ruby code to wasm code; it compiles the Ruby
| interpreter to wasm, so it'll be roughly the same performance
| as the Ruby interpreter on Windows or Linux.
|
| Also Rails is plenty quick these days, tons of people running
| it at massive scale.
| the_duke wrote:
| The best case WASM performance is roughly 20-50% slower than
| native code, depending on the runtime and the type of code
| executed.
|
| In the browser you have to also factor in the warmup time.
|
| I'd imagine an interpreter will suffer a lot because certain
| C tricks like computed goto don't work directly. (This will
| hopefully be improved by future Wasm proposals)
|
| (Note: that's still plenty fast enough for most use cases,
| and performance will improve)
| titzer wrote:
| > The _best case_ WASM performance is roughly 20-50% slower
|
| It's more accurate to say that the _average case_ is 20-50%
| slower. The best case is on par, or slightly faster than
| native code[1].
|
| [1] Measurements from our original paper,
| https://dl.acm.org/doi/10.1145/3062341.3062363
|
| Engines are even faster now.
| bpicolo wrote:
| It's not comparatively quick, but you don't necessarily need
| quick to scale or be successful
| shafyy wrote:
| Why do you think this would give them a performance boost?
| anm89 wrote:
| This is by far my favorite tech talk of all time. It goes
| into why WASM can run faster than native code in many
| contexts, the reason being that it gets around the overhead
| of OS security rings.
|
| https://www.destroyallsoftware.com/talks/the-birth-and-death...
| georgyo wrote:
| The base argument here doesn't make sense.
|
| WASM requires an interpreter which must be native.
|
| The argument is that this interpreter can be smarter about
| what crosses OS security rings. But those same improvements
| could be done in the native compiler or interpreter.
|
| The next argument could be that many things using the WASM
| target would focus more effort on improving it so all WASM
| targets benefit outpacing their individual optimizations.
|
| This one is harder to dismiss outright, but instead of
| optimizing for machine code you are now optimizing your
| WASM output.
|
| Also this intermediate byte code representation already
| exists for both LLVM and JVM, which many languages target.
|
| It is difficult to see WASM magically improving performance
| at all and especially not dramatically enough to encourage
| people to switch to it for that reasoning.
| eatonphil wrote:
| That talk is from 2014 and Wikipedia says wasm was
| announced in 2017?
| time_to_smile wrote:
| While the parent does seem to be treading into Poe's
| law territory, it's not entirely correct to dismiss that
| talk's relationship to wasm based on the dates you're
| quoting.
|
| Bernhardt in the talk explicitly mentions asm.js which is
| the precursor to wasm (it's even mentioned in the
| Wikipedia article you skimmed a bit too quickly). asm.js
| was released February 2013.
|
| I'm surprised HN has such a short memory, but the impetus
| for that talk was a clearly disturbing trend at the time
| implying that everything should be done in javascript.
| Node.js was gaining rapid popularity, people were
| discussing javascript as the new C for using as the
| language to write example code in, and while things like
| asm.js were exciting, they seemed to point towards the
| hilariously nightmarish future Bernhardt is discussing
| there.
| dnsco wrote:
| This talk is about asm.js, which is a precursor technology
| to wasm; the parent's logic seems to be "wasm is an
| improvement on asm.js". I have no idea if the kernel
| isolation benefits the Gary Bernhardt talk is about
| apply.
| capableweb wrote:
| asm.js was first mentioned in 2013. asm.js was eventually
| superseded by wasm and is pretty much the beginning of
| wasm as we know it. Didn't watch the talk, but could
| asm.js be the thing the presenter was talking about?
| AprilArcus wrote:
| it post-dated and took inspiration from Mozilla's asm.js,
| which was highly influential on wasm.
| k__ wrote:
| Is it comparable to Opal?
| AprilArcus wrote:
| Not really. Opal is a source-to-source compiler that compiles
| Ruby to JavaScript. Ruby 3.2 compiles the whole Ruby VM and
| runtime to wasm, which then runs Ruby inside a real Ruby VM
| nested within the JS VM.
|
| A good analogy is that Opal is like PureScript, whereas Ruby
| 3.2 is like GHCJS.
| poisonta wrote:
| I hope it performs better than Opal.
___________________________________________________________________
(page generated 2022-04-07 23:00 UTC)