[HN Gopher] Maybe you don't need Rust and WASM to speed up your ...
___________________________________________________________________
Maybe you don't need Rust and WASM to speed up your JS (2018)
Author : zbentley
Score : 102 points
Date : 2021-06-24 12:42 UTC (10 hours ago)
(HTM) web link (mrale.ph)
(TXT) w3m dump (mrale.ph)
| edflsafoiewq wrote:
| Previously https://news.ycombinator.com/item?id=16413917
| jonnytran wrote:
| Is there a 2021 update to this topic? How has the state of WASM
| changed? How have JS engines changed? Is it still a good idea to
| make these changes to your JS code?
| pretentious7 wrote:
| https://blog.feather.systems/jekyll/update/2021/06/21/WasmPe...
|
| Here's some benchmarks. But in summary, for small simple hot
| loops, chrome is just as fast using js, but for firefox wasm
| gives big performance improvements.
|
| Of course, if you can use SIMD instructions, then wasm will
| win, but that's less fair.
| dang wrote:
| Discussed at the time:
|
| _Maybe you don't need Rust and WASM to speed up your JS_ -
| https://news.ycombinator.com/item?id=16413917 - Feb 2018 (181
| comments)
| cogman10 wrote:
| Javascript JITs are really good, you (likely) aren't going to see
| major performance improvements by dropping into WASM.
|
| That said, one major benefit of WASM that Javascript jits will
| have a hard time competing with is GC pressure. So long as your
| WASM lib focuses on stack allocations, it'll be real tough for a
| Javascript native algorithm doing the same thing to compete
| (particularly if there's a bunch of object/state management).
|
| For a hot math loop doing floating point calcs (mandelbrot calc),
| however, I've seen javascript end up with identical performance
| compared to WASM. It's really pretty nuts.
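The hot-loop case described above can be sketched as follows (a hypothetical example, not from the thread): an escape-time mandelbrot iteration that touches only local number variables, giving the JIT a monomorphic, allocation-free loop to compile.

```javascript
// Hypothetical sketch of the hot float loop described above: a
// mandelbrot escape-time iteration using only local numbers, so the
// JIT sees a monomorphic, allocation-free loop it can compile to
// tight machine code with no GC pressure.
function mandelbrotIter(cx, cy, maxIter) {
  let x = 0, y = 0, i = 0;
  // Iterate z = z^2 + c until |z|^2 > 4 or the budget runs out.
  while (x * x + y * y <= 4 && i < maxIter) {
    const xt = x * x - y * y + cx;
    y = 2 * x * y + cy;
    x = xt;
    i++;
  }
  return i; // escape iteration count (== maxIter if inside the set)
}

console.log(mandelbrotIter(0, 0, 1000)); // 1000 (never escapes)
console.log(mandelbrotIter(2, 2, 1000)); // 1 (escapes immediately)
```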
| pretentious7 wrote:
| https://blog.feather.systems/jekyll/update/2021/06/21/WasmPe...
|
| We did that! Yea chrome basically keeps right up, of course not
| accounting for SIMD
| tyingq wrote:
| >aren't going to see major performance improvements by dropping
| into WASM
|
| I think that might also change as they add proposal and roadmap
| items like "direct access to the DOM"[1] to WASM.
|
| Proposals: https://github.com/WebAssembly/proposals
|
| Roadmap: https://webassembly.org/roadmap/
|
| [1] https://github.com/WebAssembly/interface-
| types/blob/master/p...
|
| Edit: Overall, the proposals seem to be pushing WASM closer to
| being a general purpose VM (VM as in JVM, not KVM).
| jerf wrote:
| "Javascript JITs are really good, you (likely) aren't going to
| see major performance improvements by dropping into WASM."
|
| If that were true, nobody would be working on WASM because
| there would be no point.
|
| "For a hot math loop doing floating point calcs (mandelbrot
| calc), however, I've seen javascript end up with identical
| performance compared to WASM. It's really pretty nuts. "
|
| That's easy mode for a JIT. If it can't do that it's not even
| worth being called a JIT. That's not a criticism or a snark,
| it's a description of the problem space.
|
| The problem is people then assume that "tight loops of numeric
| computation" performance can be translated to "general purpose
| computing performance", and it can't. I have not seen
| performance numbers that suggest that Javascript on _general
| computation_ is at anything like C speed or anything similar. I
| see performance being more rule-of-thumb'd at 10x slower than
| C or comparable compiled languages. Now, that's pretty good for
| a dynamic scripting language, which with a naive-but-optimized
| interpreter tends to clock in at around 40-50x slower than C.
| The competition from the other JITs for dynamic scripting
| languages mostly haven't done as well (with the constant
| exception of LuaJIT). But JIT performance across the board,
| including Javascript, seems to have plateaued at much, _much_
| less than "same as compiled" performance and I see no reason
| to believe that's going to change.
| staticassertion wrote:
| > If that were true, nobody would be working on WASM because
| there would be no point.
|
| Wasm has much more potential than just optimization. It opens
| up capabilities like:
|
| * Using other languages besides Javascript, or languages that
| just compile to JS
|
| * Passing programs around vs passing data around
|
| * Providing a VM isolate abstraction that's embeddable in your
| existing language/runtime
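The embeddable-VM point can be illustrated with a minimal sketch: a tiny hand-assembled wasm module (just an `add` function) instantiated directly from JS, no toolchain involved.

```javascript
// Minimal sketch of the "embeddable VM" idea: a tiny hand-assembled
// wasm module exporting add(a, b), instantiated synchronously from
// JS. Runs in any modern browser or Node, no toolchain required.
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,       // magic + version
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00,                               // func 0 uses type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export "add" = func 0
  0x0a, 0x09, 0x01, 0x07, 0x00,                         // code section, one body
  0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b,                   // local.get 0/1, i32.add, end
]);

const { add } = new WebAssembly.Instance(new WebAssembly.Module(bytes)).exports;
console.log(add(2, 3)); // 5
```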
| dgb23 wrote:
| WASM however is a fairly high-level, restricted, stack-based
| bytecode. The baseline performance gain over JS is maybe 30%
| to 10x, not the 50-100x you'd expect.
|
| It's also new. Maybe there is potential here for WASM to get
| faster.
| masklinn wrote:
| > Javascript JITs are really good, you (likely) aren't going to
| see major performance improvements by dropping into WASM.
|
| The original essay shows exactly that: the one here shows
| pretty epic optimisations (and demands excellent knowledge of
| the platform) just to match "naive" WASM performance.
| julosflb wrote:
| "Javascript JITs are really good, you (likely) aren't going to
| see major performance improvements by dropping into WASM."
|
| WASM may not significantly outperform the JIT of one particular
| browser on a given scenario but you are more likely to get
| homogeneous performance across different browsers.
| OskarS wrote:
| > Javascript JITs are really good, you (likely) aren't going to
| see major performance improvements by dropping into WASM.
|
| A javascript JIT is never going to be able to compete with
| codegen from a low-level statically typed language running
| through an optimizing compiler. I mean, this very article
| contains the perfect example: by manually inlining the
| comparison function they got huge performance gains from a
| sorting function. That is child's play for GCC or LLVM (it's
| the whole point of std::sort in C++).
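The manual-inlining point can be sketched in JS (function names are made up for illustration, not from the article): the same insertion sort once with a comparator callback, once with the numeric comparison inlined. GCC/LLVM do the latter transformation automatically for `std::sort`.

```javascript
// Hypothetical illustration of the manually inlined comparator: the
// same insertion sort written generically (calls a comparator) and
// with the numeric comparison inlined. An optimizing C++ compiler
// inlines std::sort's comparator for free; a JS JIT may or may not,
// depending on call-site polymorphism.
function insertionSortGeneric(a, cmp) {
  for (let i = 1; i < a.length; i++) {
    const v = a[i];
    let j = i - 1;
    while (j >= 0 && cmp(a[j], v) > 0) { a[j + 1] = a[j]; j--; }
    a[j + 1] = v;
  }
  return a;
}

function insertionSortInlined(a) {
  for (let i = 1; i < a.length; i++) {
    const v = a[i];
    let j = i - 1;
    while (j >= 0 && a[j] > v) { a[j + 1] = a[j]; j--; } // comparison inlined
    a[j + 1] = v;
  }
  return a;
}

console.log(insertionSortInlined([3, 1, 2]));                  // [1, 2, 3]
console.log(insertionSortGeneric([3, 1, 2], (x, y) => x - y)); // [1, 2, 3]
```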
| cogman10 wrote:
| There are certainly nasty edges to javascript programming and
| I'm not trying to say that it will always be near the same
| performance as a WASM implementation or a statically compiled
| binary.
|
| What I'm saying is that you'll find more often than not that
| it gets close enough to not matter. You'll see javascript at
| or below 2->3x the runtime of a C++ or Rust statically
| compiled implementation in most cases. That is pretty close
| as far as languages go. Java is right in the same range
| (maybe a little lower).
| derefr wrote:
| I don't know about the "low-level" part. I have a feeling
| you'd get just as much of a win from a HLL statically-typed
| language, like Haskell.
|
| It's the static-typing, not the low-level-ness, doing most of
| the heavy lifting in making code JITable/WPOable. You don't
| _need_ to manually inline a comparison function, _if_ the JIT
| knows how to inline the larger class of thing that a
| comparison function happens to fall into, and _if_ the code
| is amenable to that particular WPO transformation.
|
| I would compare this to SQL: you don't optimize a SQL query
| plan by dropping to some lower level where you directly do
| the query planner's job for it. Instead, you just add
| information to the shape of your query such that it becomes
| _amenable_ to the optimization the query planner knows how to
| do. That will almost always get you 100% of the wins it's
| possible to get on the given query engine anyway, such that
| there'd be nothing to gain by writing a query plan yourself.
| cogman10 wrote:
| So, you'd think static typing was a major win but it
| actually isn't (surprisingly) for a JITed language. Most of
| the benefits of statically typed languages comes from
| memory layout optimizations. However, that sort of layout
| optimization is something that every fast JITed language
| ends up doing.
|
| This is why, for example, javascript in Graal ends up
| running nearly as fast as Java in the same VM. The reason
| it isn't just as fast is the VM has to insert constraint
| checks to deopt when an assumption about the type shape is
| violated.
|
| https://www.graalvm.org/javascript/
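The "type shape" point above can be sketched (a hypothetical example, names invented): engines infer hidden classes from property layout, so objects built with a consistent property order share a shape and keep call sites monomorphic, while mixed layouts force the shape checks and deopts described.

```javascript
// Sketch of the "type shape" point: JS engines infer hidden classes
// from property layout. Objects created with the same property order
// share a shape, so a call site that only ever sees that shape stays
// monomorphic; mixing layouts forces shape checks and possible deopts.
function makePointGood(x, y) {
  return { x, y };                  // every point gets the same hidden class
}

function makePointBad(x, y) {
  const p = {};
  if (x > y) { p.x = x; p.y = y; }  // two different property orders
  else       { p.y = y; p.x = x; }  // -> two hidden classes
  return p;
}

function dot(a, b) {                // stays monomorphic with makePointGood
  return a.x * b.x + a.y * b.y;
}

console.log(dot(makePointGood(1, 2), makePointGood(3, 4))); // 11
```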
| unrealhoang wrote:
| not really, Haskell is slower than C++/Rust. You get faster
| performance by being mechanically sympathetic, i.e. you care
| about how your data is laid out in memory for the CPU to
| process efficiently, when and how much to allocate, and which
| code paths to fold together (inlining) to create the smallest
| possible set of instructions.
|
| A JIT could theoretically figure all of that out and
| transform your code into an optimized form, but practically?
| We don't have a Sufficiently Smart Compiler[1] yet. Usually, a
| JIT is worse at restructuring your data layout than at
| figuring out which parts to inline.
|
| [1] https://wiki.c2.com/?SufficientlySmartCompiler
| ex3ndr wrote:
| Isn't Java's JIT better than C++'s static optimizations,
| because runtime data is also very important for
| optimizations? Also, inlining of comparison functions is
| usually already done in V8...
| cogman10 wrote:
| Depends. Java's JIT can handle cases of dynamic dispatch
| better than C++ because of runtime information. However,
| C++ has the advantage of time. It can do more heavy
| optimizations because the time budget to do such opts is
| much longer than what the JIT has available to it. C++
| compilers can be much more aggressive about things like
| inlining.
|
| That said, yes, the runtime information provides really
| valuable information. That's why PGO is a thing for C++.
| dcomp wrote:
| Regarding caching and memoisation: isn't the main benefit the
| memory usage savings? I wonder if it's possible to parse for
| speed and then do background deduplication for memory savings.
| (I don't know what the status of multithreading is in JS or
| wasm.)
| mamcx wrote:
| This nicely shows why Rust/WASM speeds things up.
|
| All that detective work? It is UNDOING what JavaScript does,
| idiomatically!
|
| You can argue that JS is "fast" after all that, but really it
| shows that Rust gives you this for free, because it is
| idiomatic there.
|
| The problem with JS and other mis-designed languages is that
| easy things are easy, but adding complexity is even easier.
| And the language does NOT HELP if you want to reduce it.
|
| In contrast, in Rust (largely thanks to its ML-like type
| system), the code YELLS the more complex it becomes. And Rust
| _helps_ you simplify things.
|
| That is IMHO the major advantage of a static type system,
| enriched by the ML family: the types are not there, per se, as
| a performance trick, but to guide how you model the app; then
| you can model performance!
|
| P.D: In other words? If JS were more like Rust (i.e. ML-ish,
| plus more use of vectors!) it would be far easier to a) not
| get into a bad performance spot in the first place and b)
| simplify things with type modeling!
| derefr wrote:
| I wouldn't call JS mis-designed. Most use of JS in the world is
| still for simple things, e.g. validating form inputs to put an
| X or checkmark beside each field. It's important for _those_
| cases that in JS "easy things are easy", even at the expense of
| developer productivity for complex apps. JavaScript (the
| syntax+semantics) was never _intended for_ complex apps.
|
| It would be better if we had JS _and another_ language that
| were both usable in browsers to interact with the DOM, where
| this other language was more low-level. And this _was_ the
| original plan -- the "other language" was originally going to
| be Java. (Java applets _were_ originally capable of
| manipulating the DOM, through the same API surface as
| JavaScript!) Then people stopped liking Java applets, so it
| moved to thinking about "pluggable" language runtimes, loadable
| through either NPAPI (Netscape /Firefox) or ActiveX (IE),
| enabling <script type="foo"> for arbitrary values of foo. This
| effort died too, both because those plugin systems are security
| nightmares (PPAPI came too late), and because browsers just
| didn't seem willing to standardize on a plugin ecosystem in a
| way that would allow websites to declare a plugin once that
| enables the same functionality on all past, present, and future
| browsers, the way JS does.
|
| Eventually, we acknowledged that all browsers had already
| implemented JavaScript engines (incl. legacy browsers on
| feature-phones) and so it would be basically impossible to
| achieve the same reach with a latecomer. So we switched to the
| strategy of making browsers' JavaScript engines work well when
| you use in-band signalling to (effectively) program them in
| another language; and we called the result WASM.
|
| This isn't the cleanest strategy. What's great about it,
| though, is that (the text format of) WASM will load and run in
| any browser that supports JavaScript itself.
| mamcx wrote:
| > And this was the original plan -- the "other language" was
| ...
|
| In short? Mis-designed. JS was built for one use case, and it
| has been pushed into more and more other use cases (even
| backend!). This is also the problem with HTML, CSS, and the
| DOM.
|
| Plus, some WATs come from rushing it, and others from the
| mismatch in use cases.
| handrous wrote:
| I think the fact that it features, very prominently, an OO
| model (prototypal) that's pretty firmly in "please, never
| actually use any of the notable features that differentiate
| this model from others" territory is enough to fairly label
| it mis-designed, however far it's come. It's no accident
| that prototypal-heavy JS is damn near at the bottom of
| paradigms for approaching JS programming, beneath a bunch
| of others, OO or otherwise, that brush right past it,
| pretending it's not there.
|
| I'd point to not making async calls synchronous-to-the-
| caller by default as another pretty bad design mistake. The
| way so many JS files start nearly every line (correctly!
| This isn't even counting mis-use of the feature, which is
| also widespread!) with `await` is evidence of this, and
| so's all the earlier thrashing before we had `await` to
| deal semi-sanely with this bad design decision.
|
| The original scoping system was just bad. We have better
| tools to make it suck less now, but it was designed _wrong_
| originally.
| 29athrowaway wrote:
| I disagree.
|
| JS optimizations are implementation defined. They are also
| largely undocumented. You will have to read source code if you
| want to know the truth, or write microbenchmarks to probe the VM
| behavior.
|
| Your optimization from today can be deoptimized tomorrow.
| There are no documented guarantees that certain code will
| remain fast.
|
| Knowing about inline caching, shapes, smis, deoptimizations
| and trampolines... does help. But the internals are
| unintuitive.
|
| Instead, I can save myself that time and use WASM instead.
|
| There are so many things that can trigger a deoptimization that I
| would rather ignore them all and do it in WASM instead.
| dboreham wrote:
| Why would someone think that the purpose of WASM is to speed up
| JavaScript?
| sam0x17 wrote:
| I read that sentence as replace the word "JavaScript" with
| "Frontend" and it made more sense.
|
| Generally though, I think wasm represents an opportunity to
| break the strangle-hold JavaScript has had on the frontend
| ecosystem for nearly 3 decades. Typed, compiled languages
| typically have huge advantages in terms of safety and
| performance over dynamically typed interpreted languages. And
| now a lot of the high-level features and syntactic sugar that
| used to be available only in dynamic languages are becoming
| available in systems languages.
| pornel wrote:
| There was a back-and-forth on optimizing this (see the link Steve
| found). The WASM version got faster too by adopting algorithmic
| changes proposed here, and the conclusion of Rust authors was
| that you do need WASM if you want _predictable_ performance.
|
| High-perf JS relies on staying on the JIT happy path, not
| upsetting GC, and all of this is informal trickery that's not
| easy to do for all JS engines. There's never any guarantee that
| JS that optimizes perfectly today won't hit a performance cliff
| in a new JIT implementation tomorrow.
| steveklabnik wrote:
| https://fitzgeraldnick.com/2018/02/26/speed-without-wizardry...
| eevilspock wrote:
| I know we are not supposed to talk about down voting but in
| this case I would really like to know why? The article Steve
| links to is great and even the GP (top comment) approves. Yet
| Steve's comment is now dimmed.
|
| I thought HN was supposed to be better.
| edflsafoiewq wrote:
| It isn't dimmed. It contains only a link and visited links
| are styled to be lighter gray than normal text. You can
| check this with the inspector: normal comments have the
| class c00 (color: #000000); dimmed comments have classes
| like c5a (color: #5a5a5a), c73, etc.
| eevilspock wrote:
| ok, my bad. too late to delete.
| derefr wrote:
| > There's never any guarantee that JS that optimizes perfectly
| today won't hit a performance cliff in a new JIT implementation
| tomorrow.
|
| I would say the opposite: there kind of _are_ such guarantees,
| to JIT development's detriment. Because JIT developers use the
| performance of known workloads under current state-of-the-art
| JITs as a regression-testing baseline. If their JIT
| underperforms the one it's aiming to supersede on a known
| workload, their job isn't done.
|
| This means that there's little "slack"
| (https://slatestarcodex.com/2020/05/12/studies-on-slack/) in
| JIT development -- it'll be unlikely that we'll see a JIT
| released (from a major team) that's an order-of-magnitude
| better at certain tasks, at the expense of being less-than-an-
| order-of-magnitude worse at others. ( _Maybe_ from an
| individual as code ancillary to a research paper, but never
| commercialized /operationalized.)
| heydenberk wrote:
| People are downvoting this comment but I think it's a
| valuable assertion, and the core fact (JS engine developers
| use current/real-world empirics to make optimization
| decisions) is indisputably true:
| https://v8.dev/blog/sparkplug
| astrange wrote:
| Well, it's a bit silly to quote SSC (a blog that rewrites
| basic philosophy so STEMheads can think they invented it)
| for a definition of "slack".
| refenestrator wrote:
| The only things philosophers do are think and write, and
| somehow they are ass bad at writing.
|
| SSC wouldn't have a niche if they could get concepts
| across without a bunch of obscurantist bullshit.
| Smaug123 wrote:
| I'll need someone to step in with a link to the actual
| post, because I can't think of the right search terms,
| but he has (of course) written about this in the past.
| His self-described schtick is to write about existing
| things in such a way that they reach a new audience, by
| dint of acting as a translator between some of the mental
| models that various people have. I'm not sure how much is
| gained by describing such an approach in the way that you
| do.
|
| If you'd simply omitted the bracketed clause, perhaps
| your comment might have been useful!
| astrange wrote:
| The mental model that causes you to need to get
| information from nerd essayists is probably not a good
| one though. You gotta try and be well read.
|
| Also, I thought it was bad when he said he was running
| the blog as a project to promote scientific racism
| (https://twitter.com/ToWit12/status/1362159170615537667).
| kannanvijayan wrote:
| I've spent the last decade working on and thinking about
| JITs, runtime hidden type modeling, and other optimization
| problems relating to reactive dynamic systems dealing with
| heavy polymorphism (specifically Javascript as a programming
| language).
|
| Now, different people in the industry will have different
| perspectives, so I'll preface this by saying that this is
| _my_ view on where the future of JIT compilation leads.
|
| The order of magnitude gains were mostly plumbed with the
| techniques JIT engineers have brought into play over the last
| decade or so.
|
| One aspect that remains relevant now is responsiveness to
| varying workloads, nimble reaction to polymorphic code, and
| finding the right balance between time spent compiling
| and the productivity of the optimized code that comes out the
| other end. There is significant work yet to be done in
| finding the right runtime type models that quickly and
| effectively distinguish between monomorphism, polymorphism,
| and megamorphism, and are able to respond appropriately.
|
| The other major avenue for development, and the one that I
| have yet to see a lot of talk about, is the potential to move
| many of these techniques and insights out of the domain of
| language runtime optimization, and into _libraries_, and
| allowing developers direct API-level access to the
| optimization strategies that have been developed.
|
| If you work on this stuff, you find very quickly that a huge
| amount of a JIT VM's backend infrastructure has nothing to do
| with _compilation_ per se, and much more to do with the
| support structures that allow for the discovery of fastpaths
| for operations over "stable" data structures.
|
| In Javascript, the data structure that is of most relevance
| is a linked list of property-value-maps (javascript objects
| linked together by proto chains). We use well-known
| heuristics about the nature of those structures (e.g. "many
| objects will share the _keyset_ component of the hashtable",
| and "the linked list of hashtables will organize itself into
| a tree structure"). Using that information, we factor out
| common bits of the data-structure "behind the scenes" to
| optimize object representation and capture shared structural
| information in hidden types, and then apply techniques (such
| as inline caches) to optimize on the back of that shared
| structure.
|
| There's no reason that this sort of approach cannot be
| applied to _user specified structures_ of different sorts.
| Different "skeletal shapes" that are highly conserved in
| programs.
|
| For me, the big promise that these technologies have brought
| to the fore is the possibility of effectively doing partial
| specialization of data structures at runtime. Reorganizing
| data structures under the hood of an implementation to
| deliver, transparently to the client developer, optimization
| opportunities that we simply don't even consider as possible
| today.
| miohtama wrote:
| Is there anything explicit in JavaScript, the language, to
| help with this? The new private attributes? Something similar
| to Python's __slots__?
|
| Basically, why guess the structures when the programmer could
| easily tell you the static bits and ask for them to be
| frozen?
| pornel wrote:
| Browsers have replaced their optimizing backends entirely a
| few times already. They all optimize similar kinds of code,
| but the details of when and how change all the time. Vendors
| have deprecated and regressed on whole benchmark suites when
| they've decided the tests no longer represent what they want
| to optimize for.
|
| The biggest problem is that JIT is guided by heuristics.
| Observations from interpreters decide when code is advanced
| to higher optimization tiers, and that varies between
| implementations and changes relatively often. Your hottest
| code is also subject to JIT cache pressure and eviction
| policy, so you can't even rely on it staying hot across
| different machines using the exact same browser.
| lhorie wrote:
| > There's never any guarantee that JS that optimizes perfectly
| today won't hit a performance cliff in a new JIT implementation
| tomorrow.
|
| Yep. As someone who played this game a few years ago, JIT
| implementations do change (e.g. function inlining based on
| function size was removed from V8, delete performance changed,
| number of allowed slots in hidden classes changed, etc).
|
| Also worth noting that just because an optimization works in
| V8, there's no guarantee that it will also work in another JS
| engine.
| masklinn wrote:
| > Also worth noting that just because an optimization works
| in V8, there's no guarantee that it will also work in another
| JS engine.
|
| The essay even covers that, when it mentions that matching
| arities had a significant impact on V8 (14% improvement) but
| not perceivable effect on SpiderMonkey.
| Hypx_ wrote:
| Though I wonder... Some decades ago people made similar
| arguments with regards to C/C++ versus native assembly. That
| the only way compiled languages could approach assembly
| languages on performance was if the code was written in a way
| that the compiler could optimize. If it couldn't, performance
| would often tank.
|
| But as compilers got better this became less and less of an
| issue. Today it is very rare to need to write assembly. So
| shouldn't history repeat itself? JIT compilers will become so
| good that it is extremely unlikely that you'll hit the
| performance cliff.
| TazeTSchnitzel wrote:
| The dancing required to stay within the happy paths, and the
| necessity of avoiding certain common abstractions in order to
| do so, leaves me convinced it must be much easier to write
| performant high-level code in C, C++, Rust etc than in
| something like JavaScript or Python. While it's _possible_ to
| write fast code in the latter, it doesn 't seem like much fun.
| pretentious7 wrote:
| (https://blog.feather.systems/jekyll/update/2021/06/21/WasmPe...)
| OP was very useful for us in optimizing our js vs wasm
| benchmarks. We were wondering what a very simple parallel problem
| like mandelbrot rendering would show about browser jits vs their
| wasm compilers.
|
| Our conclusion was that wasm was consistent across browsers
| whereas js wasn't. Further, if you can use SIMD, wasm is
| faster. Also, v8 is way faster than spidermonkey.
| lovasoa wrote:
| Your link gives a 404, I think there is a problem with the path
| in the URL.
|
| And since you are talking about the fastest mandelbrot on the
| web, here is my contribution https://mandelbrot.ophir.dev
|
| It renders in real time and is fully interactive, all in pure
| js.
| pretentious7 wrote:
| Does it still 404? Sorry, I made some mistake with it.
| There's a link in the post to an interactive demo.
|
| The reason I said fastest is cause I wrote the hot loop with
| SIMD instructions. I'll be adding in period checking and
| stuff in a couple months time, will be sure to refer your
| work then, thanks.
| recursivedoubts wrote:
| It is shocking how fast javascript is.
|
| Hyperscript is an interpreted programming language written on top
| of javascript with an _insane_ runtime that resolves promises at
| the _expression_ level to implement async transparency.
|
| Head to the playground and select the "Drag" example:
|
| https://hyperscript.org/playground/
|
| And drag that div around. That whole thing is happening in an
| interpreted event-driven loop:
|
|     repeat until event pointerup from document
|       wait for pointermove(pageX, pageY) or
|         pointerup(pageX, pageY) from document
|       add { left: `${pageX - xoff}`, top: `${pageY - yoff}` }
|     end
|
| The fact that the performance of this demo isn't absolutely
| CPU-meltingly bad is a dramatic testament to how smoking fast
| javascript is.
| nemetroid wrote:
| I cannot tell if you're being ironic or not. Being able to drag
| a static text box around is not a very impressive feat. And
| more importantly, it _is_ CPU-meltingly bad: when I drag the
| box around, my Firefox process jumps from 5% to a steady 35%
| (on a Ryzen 2600).
| ohgodplsno wrote:
| You're... dragging an empty div around. You know what else
| doesn't choke? Me dragging absolutely any window on my PC. It
| will happily do so at 120Hz without chugging. It even does
| crazy things, such as playing games with complex event loops at
| 120FPS.
|
| The fact that you're excited at how dragging a single div
| around actually works as expected is a testament to how
| horrifyingly low your expectations are.
| ex3ndr wrote:
| For JS it is the same - moving an empty div or moving a whole
| window - it is still the same amount of code, and the actual
| moving always happens on different threads. It works for
| native apps too.
| handrous wrote:
| Doesn't work on desktop Safari. Div "rises" on-click and the
| cursor changes as (evidently) intended, but the div cannot be
| dragged.
| flohofwoe wrote:
| Yes, Javascript can be made surprisingly fast, but it requires
| very detailed knowledge of the JS engine internals and may
| require giving up most high-level features that define the
| language (as demonstrated by asm.js). And the resulting carefully
| optimized JS code is often less readable / maintainable than
| relatively straightforward C code doing the same thing compiled
| to WASM.
| rikroots wrote:
| When I started developing my JS 2D canvas library (back in
| 2013) I worked on a desktop PC and used Firefox for most of my
| code testing work. The library always ran faster on Firefox
| than on IE or Chrome, which at the time I assumed was probably
| something to do with Firefox putting extra effort into the
| canvas part of their engine.
|
| Then in 2019 I rewrote the whole library from scratch (again),
| this time working on a MacBook using Chrome for most of my
| develop/test stuff. The library is now much faster on Chrome! I
| assume what happened was that I was constantly looking for ways
| to improve the code for speed (in canvas-world, if everything
| is not completing in under 16ms then you must own the
| humiliation) which meant that prior to 2019 I was
| subconsciously optimising for FF; after 2019 for Chrome.
|
| > it requires very detailed knowledge of the JS engine
| internals and may require giving up most high-level features
| that define the language
|
| I can't claim that I have that knowledge. Most of my
| optimisations have been basic JS 101 stuff - object pools (eg
| for canvas, vectors, etc) to minimise the stuff sent to
| garbage, doing the bulk of the work in non-DOM canvases,
| minimising prototype chain lookups as much as possible, etc.
| When I tried to add web workers to the mix I ended up slowing
| everything down!
|
| There is stuff in my library that I do want to port over to
| WASM, but my experiments in that direction so far have been
| less than successful - I need to learn a lot more
| WASM/WebAssembly/Rust before I make progress, but the learning
| is not as much fun as I had hoped it would be.
|
| > Javascript can be made surprisingly fast
|
| The speeds that can be achieved by browser JS engines still
| astonish me! For instance, shape and animate an image between
| two curved paths in real time in a 2D (not WebGL!) canvas:
|
| - CodePen - https://codepen.io/kaliedarik/pen/ExyZKbY
|
| - Demo with additional controls -
| https://scrawl-v8.rikweb.org.uk/demo/canvas-024.html
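The object-pool trick mentioned in the comment above can be sketched as follows (class and method names are illustrative, not from the Scrawl library): reuse vector objects across frames instead of allocating fresh ones, keeping GC pauses out of the 16ms budget.

```javascript
// Minimal sketch of an object pool for canvas vectors (names are
// illustrative, not from the library): reuse objects instead of
// allocating one per frame, minimising the garbage sent to the GC.
class VectorPool {
  constructor() { this.free = []; }
  acquire(x = 0, y = 0) {
    const v = this.free.pop() || { x: 0, y: 0 }; // reuse if available
    v.x = x; v.y = y;
    return v;
  }
  release(v) { this.free.push(v); }              // return for reuse
}

const pool = new VectorPool();
const a = pool.acquire(1, 2);
pool.release(a);
const b = pool.acquire(3, 4); // reuses the same object, no allocation
console.log(a === b); // true
```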
| sam0x17 wrote:
| Yeah, it strikes me as an odd choice that V8 doesn't just use C
| code for something as central and important as sorting, but
| maybe there's some technical reason.
| pornel wrote:
| For some functions, the speed-up from inlining and run-time
| specialization via the JIT outweighs the benefit of native
| code speed.
|
| There's similarly a cost in crossing JS<>WASM boundary, so
| WASM doesn't help for speeding up small functions and can't
| make DOM-heavy code faster.
| masklinn wrote:
| > maybe there's some technical reason
|
| Native code is opaque to the JIT, which means no inlining
| through native code (inlining being the driver for lots of
| optimisations) and no specialisation. This means if you have
| a JIT native code is fine for "leaf" functions but not great
| for intermediate ones, as it hampers everything.
|
| When ES6 compatibility was first released, the built-in array
| methods were orders of magnitude slower than hand-rolling
| pure JS versions.
|
| This issue is one of the things Graal/Truffle attempts to
| fix, by moving the native code inside the JIT.
| inbx0 wrote:
| There's probably some overhead for switching between a C++
| implementation and user-defined JS, and the sorter would need
| to do that a lot since most sort calls come with the compare-
| callback.
| derefr wrote:
| My understanding comes from Erlang's HiPE optimizer (which
| was AoT rather than a JIT), but I think you've nailed it.
| Most JITs can't optimize across a native-interpreted
| boundary, any more than compilers can optimize across a
| dynamic-linkage (dlopen(2)) boundary. When all the code is
| interpreted, you get something that's a lot more WPO-like.
|
| IIRC, some very clever JITs do a waltz here:
|
| 1. replace the user's "syscall" into the runtime, with
| emitted bytecode;
|
| 2. WPO the user's bytecode together with the generated
| bytecode;
|
| 3. Find any pattern that still resembles the post-
| optimization form of _part of_ the bytecode equivalent of
| the call into the runtime, and replace it _back_ with a
| "syscall" to the runtime, for doing that partial step.
|
| Maybe clearer with an example. Imagine there's a platform-
| native function sha256(byte[]). The JIT would:
|
| 1. Replace the call to sha256(byte[]) with a bytecode loop
| over the buffer, plus a bunch of logic to actually do a
| sha256 to a fixed-size register;
|
| 2. WPO the code with that bytecode loop embedded;
|
| 3. Replace the core of the remnants of the sha256-to-a-
| fixed-size-register code, with a call to a platform-native
| sha256(register) function.
___________________________________________________________________
(page generated 2021-06-24 23:01 UTC)