[HN Gopher] Web Workers API
___________________________________________________________________
Web Workers API
Author : KubikPixel
Score : 92 points
Date : 2021-12-03 08:48 UTC (1 day ago)
(HTM) web link (developer.mozilla.org)
(TXT) w3m dump (developer.mozilla.org)
| yutijke wrote:
| The fact that creating a worker requires you to pass a module
| (file) name makes them extremely unergonomic to use.
|
| This is clearly visible in the lack of any library ecosystem
| around them. We don't have any highly used threadpool or executor
| libraries using workers. Everyone seems to be manually setting up
| a worker and setting up the job scheduling logic from scratch.
|
| All the boilerplate and restrictions really limit their usage,
| IMO.
|
| I know you can pass a Blob URL, but you can't write libraries
| this way since you risk running afoul of the downstream
| consumer's browser CSP.
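|
| For reference, a minimal sketch of the Blob URL approach
| mentioned above, assuming an environment that provides Blob and
| URL.createObjectURL. The names workerSource and makeWorkerUrl
| are made up for illustration. The commented-out Worker call is
| the step a page's CSP can refuse unless worker-src (or one of
| its fallback directives) allows blob: URLs.

```typescript
// Worker body kept as text; a library would typically generate this at build time.
const workerSource = `
  self.onmessage = (event) => {
    self.postMessage(event.data * 2);
  };
`;

// Wrap the source in a Blob and return a blob: URL that a Worker could load.
function makeWorkerUrl(source: string): string {
  const blob = new Blob([source], { type: "text/javascript" });
  return URL.createObjectURL(blob);
}

const workerUrl = makeWorkerUrl(workerSource);

// In a browser the next step would be:
//   const worker = new Worker(workerUrl);
// A Content-Security-Policy that does not allow blob: for workers rejects
// that call, which is the risk for library authors described above.
// Call URL.revokeObjectURL(workerUrl) once the URL is no longer needed.
```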
| Jasper_ wrote:
| A big complaint I have is that the built-in scheduling for
| workers assumes a very specific scenario. You can send messages
| to a single worker, but the message loop has to be operated by
| the browser. You also don't know the length of the message
| queue, even though the browser has it available, so you can't
| easily send a message to the worker with the least work queued.
| If a worker starts working on low-priority items, and you want
| to interject with a high-priority message, you also can't
| interrupt the worker, nor can the worker loop the messages on
| its own accord. You also can't re-sort the message queue, it's
| FIFO.
|
| Basically, any sort of work scheduling that you would _like_ to
| do to queue high-priority messages, or have workers share a
| pool of work, is impossible to build with the onmessage-style
| of WebWorkers. It feels like an API made by someone who had
| read about threads once, rather than someone who's built an
| actual many-workers processing system like this. The event loop
| being unpumpable from user code feels like a giant kick in the
| face.
|
| My workaround for this was to kill workers working on
| low-priority items, relaunch them, and re-sort the queue, but
| that all feels like a mess. Also, at some point a Chrome update
| made this strategy crash. I ended up just removing the WebWorker
| code; it was more trouble than it was worth for marginal code
| improvements.
|
| I know that SAB and atomics add new low-level primitives to
| support this, but SAB is still poorly supported on widely
| deployed platforms. And you still need to do the serialization
| yourself.
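|
| One partial workaround, sketched under assumptions: keep the
| queue on the sending side and hand each worker only one job at
| a time, so the browser's FIFO never holds more than one message
| per worker and the local queue can be re-sorted freely.
| PriorityDispatcher, WorkerLike, submit and markDone are
| hypothetical names, not taken from any library in this thread.
| This does not interrupt a job that is already running.

```typescript
// Hypothetical minimal interface: only the part of Worker this sketch uses.
interface WorkerLike<Job> {
  postMessage(job: Job): void;
}

interface PendingJob<Job> {
  job: Job;
  priority: number;
  seq: number; // submission order, used to break priority ties
}

class PriorityDispatcher<Job> {
  private readonly queue: PendingJob<Job>[] = [];
  private readonly idle: WorkerLike<Job>[];
  private nextSeq = 0;

  constructor(workers: WorkerLike<Job>[]) {
    this.idle = [...workers];
  }

  // Queue a job. Higher priority runs first; ties run in submission order.
  submit(job: Job, priority = 0): void {
    this.queue.push({ job, priority, seq: this.nextSeq++ });
    this.queue.sort((a, b) => b.priority - a.priority || a.seq - b.seq);
    this.dispatch();
  }

  // Call from the worker's result handler once it reports a finished job.
  markDone(worker: WorkerLike<Job>): void {
    this.idle.push(worker);
    this.dispatch();
  }

  // Number of jobs still waiting on the sending side.
  get backlog(): number {
    return this.queue.length;
  }

  private dispatch(): void {
    while (this.idle.length > 0 && this.queue.length > 0) {
      const worker = this.idle.shift()!;
      const next = this.queue.shift()!;
      worker.postMessage(next.job);
    }
  }
}

// Usage with a stand-in worker that only records what it was sent.
const sent: string[] = [];
const fakeWorker: WorkerLike<string> = { postMessage: (job) => { sent.push(job); } };
const dispatcher = new PriorityDispatcher<string>([fakeWorker]);

dispatcher.submit("thumbnail-1");    // worker is idle, so this starts at once
dispatcher.submit("thumbnail-2");    // waits in the local queue
dispatcher.submit("user-click", 10); // jumps ahead of thumbnail-2
dispatcher.markDone(fakeWorker);     // sends "user-click"
dispatcher.markDone(fakeWorker);     // sends "thumbnail-2"
```

| Because the backlog lives in user code, its length is known,
| which also makes "send to the least busy worker" possible.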
| The_rationalist wrote:
| There are solutions to those problems:
| https://github.com/GoogleChromeLabs/comlink
| https://github.com/Bnaya/objectbuffer
| MuffinFlavored wrote:
| My bigger complaint is: how do you cleanly do synchronous
| RPC-style calls? Even with clever async/await tricks,
| serializing complicated input/output structs back and forth
| seems so expensive.
| csmpltn wrote:
| > "The fact that creating a worker requires you to pass a
| module (file) name makes them extremely unergonomic to use."
|
| Can anybody explain how this ever made it into the official
| spec and the default implementation?
|
| The browser will make a network call to fetch your Web Worker
| .js file on every instantiation of your Web Worker.
| Instantiations of the same Web Worker "module" aren't cached,
| so in a thread-pool scenario your browser would be fetching the
| same file over and over again.
|
| What were the people designing this even thinking? This is so
| wasteful and frustrating. The API simply sucks.
| dmitriid wrote:
| > Can anybody explain how this ever made it into the official
| spec and the default implementation?
|
| This and service workers. Both are... weird, to say the least
| csmpltn wrote:
| When I'm seeing half-arsed APIs like this being introduced
| into so-called "modern" incarnations of the web, I start
| doubting the technical chops of the people involved and the
| process as a whole. Is it truly possible that there wasn't
| anyone with sufficient experience in building
| concurrency/multi-threading/parallelism APIs involved in
| building this?
|
| Failing to account for the thread-pool scenario, as an
| example, is just mind boggling.
| TAForObvReasons wrote:
| "best guess": API decisions are opinionated, reflecting the
| designers' views of how users should write code.
|
| For example, FileReader API is async in the main thread.
| There's an equivalent FileReaderSync for sync operations but
| that is only available in Web Workers. Why isn't
| FileReaderSync available on the main thread? Because the
| designers didn't want people to do things that would block
| the main thread.
|
| The Web Worker argument probably went along the lines of "If
| we allow users to pass arbitrary function objects, they may
| try to do something that accesses local variables from the
| site where the worker is created. That is obviously an error,
| so we should design the API so that it can't happen. Creating
| a separate script creates a clear mental separation and
| avoids that class of error"
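|
| A small sketch of that class of error, assuming a helper that
| ships a function as source text and re-evaluates it elsewhere.
| rebuildElsewhere and runClosureDemo are hypothetical names, and
| new Function stands in for "evaluate this in another scope"; a
| real worker would receive the text via a script or Blob URL.

```typescript
// Ship only the function's source text and re-create it in a scope that
// does not contain the original's local variables.
function rebuildElsewhere<A, R>(fn: (arg: A) => R): (arg: A) => R {
  const source = fn.toString();
  return new Function(`return (${source});`)() as (arg: A) => R;
}

function runClosureDemo(): { pure: number; closureError: string } {
  const offset = 41; // local variable; not part of any function's source text

  const pureDouble = (n: number) => n * 2;     // depends only on its argument
  const usesLocal = (n: number) => n + offset; // silently depends on `offset`

  // The self-contained function survives the round trip.
  const pure = rebuildElsewhere(pureDouble)(21);

  // The closure does not: `offset` no longer exists where it runs.
  let closureError = "";
  try {
    rebuildElsewhere(usesLocal)(1);
  } catch (error) {
    closureError = error instanceof Error ? error.name : String(error);
  }
  return { pure, closureError };
}

const closureDemo = runClosureDemo();
```

| Here closureDemo.pure comes back as 42, while the second call
| fails with a ReferenceError instead of returning 42.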
| fabiospampinato wrote:
| Workers are awesome but you are right, working with them can be
| painful without the right tooling.
|
| Personally I've written my own libraries for abstracting all
| this away and I'm having a blast working with workers now,
| maybe check them out:
|
| - WorkTank [1]: This abstracts away the difference between
| browser workers and Node worker threads, it makes it easy to
| make worker pools, and it can transfer simple functions to a
| worker at runtime too.
|
| - WorkTank loader [2]: This abstracts away loading asynchronous
| functions from a worker. You just add ".worker" to your file
| name, and that file and all its dependencies are transparently
| moved to a worker (or worker pool). The rest of the app (TS
| types, for example) won't even notice anything happened.
|
| You might want to check out the more popular "comlink" library
| too, although it didn't work for me for whatever reason when I
| tried it, and it doesn't support worker pools I believe.
|
| [1]: https://github.com/fabiospampinato/worktank
|
| [2]: https://github.com/fabiospampinato/worktank-loader
| jcun4128 wrote:
| I have used them for keeping things running in tabs that are not
| focused, not losing time sync (not critical though).
| recursivedoubts wrote:
| Hyperscript supports inline web worker definitions:
|
| https://hyperscript.org/features/worker/
|
| We tried to improve the API to this very cool feature.
| pictur wrote:
| Are web workers really good for costly processes? Would they
| really be useful for, say, a dashboard with a live data flow?
| hotz wrote:
| We used web workers to shift a lot of processing off of the
| main thread. They helped in keeping the UI feeling responsive
| and less laggy.
| wackget wrote:
| What kind of processing do you mean? Unless you're developing
| a game or a media player, I can't think of a good example
| where processing might delay the UI.
| Klaster_1 wrote:
| Depends what's on the dashboard and how the data has to be
| transformed for display. I've worked on several such dashboards
| and never had to use Web Workers to solve the performance
| issues; these were usually caused by UI rendering not optimized
| for loads of data. Now, the workers can help with rendering
| too, for example to render a chart off the main thread, but the
| libraries that support this usually hide the worker magic
| behind an API.
| rezmason wrote:
| This past year I've live streamed the development of a small web
| app that benefits enormously from web workers:
|
| https://github.com/Rezmason/wireworld-player
|
| https://rezmason.github.io/wireworld-player
|
| As a simulation, the main thread asks a web worker to update the
| world state and then render the new state. By default, this
| happens once per requestAnimationFrame.
|
| But there's a "Turbo" mode (its UI toggle looks like a
| radioactive hazard symbol) that, when activated, tells the web
| worker to update as often as it can per requestAnimationFrame,
| speeding up the simulation around 72x while keeping the main
| thread 100% responsive.
|
| The decoupling of the synchronous number crunching work from the
| main thread has also given me a place to experiment with much
| more resource intensive algorithms, like
| https://jennyhasahat.github.io/hashlife.html , which fills an
| enormous cache and can advance the sim by exponential time steps.
|
| Modifying Hashlife to run in the main thread without freezing the
| app is possible, but it would have made the code much more
| complicated, run slower, and the other cores available to web
| workers would have gone unused.
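|
| A rough sketch of what a "Turbo" loop like this could look like
| inside the worker, assuming a fixed time budget per frame. This
| is not the code from the linked repository; runTurboFrame and
| its parameters are hypothetical, and the clock is injected so
| the loop can be exercised without a browser.

```typescript
// Run `step` repeatedly until `budgetMs` has elapsed; return how many steps ran.
// `now` would be () => performance.now() in a worker. At least one step always runs.
function runTurboFrame(step: () => void, budgetMs: number, now: () => number): number {
  const deadline = now() + budgetMs;
  let steps = 0;
  do {
    step();
    steps++;
  } while (now() < deadline);
  return steps;
}

// Example with a fake clock where every simulation step "costs" 2 ms.
let fakeTime = 0;
let generation = 0;
const stepsRun = runTurboFrame(
  () => { generation++; fakeTime += 2; },
  10,
  () => fakeTime,
);
// With a 10 ms budget and 2 ms steps, stepsRun is 5.
```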
| fabiospampinato wrote:
| Two lesser known really cool things about Web Workers:
|
| 1. They kind of allow for stopping synchronous operations. For
| example, some regexes have "catastrophic backtracking", and
| executing them can take a really long time. So what do you do if
| you have to execute user-provided regexes, especially on the
| server? Detecting potentially catastrophic regexes is tough, and
| reimplementing the regex engine so that it yields to the main
| thread frequently enough for you to stop it is super tough. The
| solution: execute the regex in a Web Worker, and if you haven't
| received a response within some set amount of time, just kill
| the Web Worker, effectively stopping the regex execution. Cool!
|
| 2. They kind of allow for blocking on promises. Normally you
| can't block the event loop while you are waiting for a promise
| to be resolved; in other words, you can't make an asynchronous
| function synchronous. With Web Workers you can: execute the
| asynchronous function you need on a worker, and then use
| Atomics.wait on the main thread to block (without melting the
| computer) until that function resolves. Super cool!
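|
| A minimal sketch of the timeout idea in point 1, under stated
| assumptions. TerminableWorker is a hypothetical interface
| modeled on the few Worker members used here (a real Worker
| would need a thin adapter or a cast), and runWithTimeout is a
| made-up name. Whether terminate() actually interrupts a regex
| match that is already running is questioned in the replies.

```typescript
// Hypothetical interface covering only what this sketch needs from a worker.
interface TerminableWorker<In, Out> {
  postMessage(input: In): void;
  onmessage: ((event: { data: Out }) => void) | null;
  terminate(): void;
}

// Start a fresh worker for one job and give up on it after `timeoutMs`.
function runWithTimeout<In, Out>(
  spawn: () => TerminableWorker<In, Out>,
  input: In,
  timeoutMs: number,
): Promise<Out> {
  return new Promise<Out>((resolve, reject) => {
    const worker = spawn();

    // No answer in time: kill the worker and report the failure.
    const timer = setTimeout(() => {
      worker.terminate();
      reject(new Error(`no result after ${timeoutMs} ms; worker terminated`));
    }, timeoutMs);

    // Answer arrived: cancel the timer and release the worker.
    worker.onmessage = (event) => {
      clearTimeout(timer);
      worker.terminate();
      resolve(event.data);
    };

    worker.postMessage(input);
  });
}

// Intended use (names assumed):
//   const matched = await runWithTimeout(spawnRegexWorker, { pattern, text }, 200);
```

| Spawning one worker per job keeps the sketch short; a pool would
| have to replace any worker that gets terminated.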
| jefftk wrote:
| In the specific case of regular expressions on the server, re2
| was written to handle exactly this.
| wffurr wrote:
| In #2 you use Atomics.wait in the Worker instead and then it
| can signal the main thread when done.
|
| We use this to convert async browser APIs to service
| synchronous callbacks that our C library compiled to
| WebAssembly expects.
| domenicd wrote:
| Atomics.wait() cannot be called on the main thread (i.e. when
| your global is a Window object).
| no_way wrote:
| There is Atomics.waitAsync which can be used on the main
| thread and just returns a promise. It has shipped at least in
| Chrome.
| fabiospampinato wrote:
| It works fine both in Node and Deno. I haven't tested this
| out on browsers though, is that supposed to throw even if
| SharedArrayBuffer is enabled?
| azakai wrote:
| Yes, browsers will not allow you to block on the main
| thread. Atomics.waitAsync is supposed to be used instead.
|
| This is a fairly difficult aspect of multithreading on the
| web, and it makes things more complicated than other
| platforms (like Node and Deno, as you mentioned). For
| example in emscripten's pthreads support layer there is
| code dedicated to do a sort of careful busy-wait when we
| have no other option, and all that is only for the case of
| the main thread.
|
| But your point is still very relevant, just not on the main
| thread: if you can run your application in a worker, then
| you _can_ block on Promises using another worker that does
| the async operation while the first worker is synchronous.
| And that's really useful!
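|
| To make the difference concrete, a small sketch assuming
| SharedArrayBuffer is available (in browsers that requires a
| cross-origin isolated page). tryBlockingWait is a hypothetical
| helper. In a worker, Node or Deno the wait blocks; on a browser
| main thread Atomics.wait throws a TypeError, which the helper
| reports as "refused".

```typescript
// Try to block until flag[0] stops being 0, or until the timeout passes.
// Returns Atomics.wait's own result, or "refused" where blocking is not allowed.
function tryBlockingWait(flag: Int32Array, timeoutMs: number): string {
  try {
    return Atomics.wait(flag, 0, 0, timeoutMs);
  } catch (error) {
    if (error instanceof TypeError) return "refused";
    throw error;
  }
}

const flag = new Int32Array(new SharedArrayBuffer(4));

// Nobody has written to the flag yet, so a permitted wait simply times out.
const firstWait = tryBlockingWait(flag, 10);

// This is what the other thread would do once its async work has finished.
Atomics.store(flag, 0, 1);
Atomics.notify(flag, 0);

// The flag no longer holds 0, so a permitted wait returns without blocking.
const secondWait = tryBlockingWait(flag, 10);
```

| Where blocking is refused, Atomics.waitAsync is the
| promise-based alternative mentioned above.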
| chrismorgan wrote:
| > _and then use Atomics.wait on the main thread to block_
|
| The main thread is not allowed to use Atomics.wait. I'm not
| certain what the implementation status of this is because I've
| never used it and I have a vague feeling I heard that _one_
| browser shipped it without that restriction, but at the very
| least you may get a TypeError in some user agents and you can
| expect to in all user agents at some point in the future when
| they become sterner about not blocking the main thread.
|
| As for stopping synchronous operations, I'm dubious that would
| actually work; without actually testing it (and I don't have
| time to test it now, though I'd be interested in the result,
| including across various platforms), I think it's more likely
| that the regexp match would be uninterruptible, and that it
| would just go on munching your CPU until it finished, and
| _then_ terminate once it returned from the native code to the
| JavaScript.
| fabiospampinato wrote:
| I should check what's the status with Atomics.wait on Chrome,
| currently it seems to work fine under both Node and Deno.
|
| I'll test out the regex thing in the following days as I need
| it for that exact use case, I just assumed killing the web
| worker would... work, hopefully that's the case otherwise I'm
| back to square 0 :D
|
| ---
|
| Edit: MDN says: "The terminate() method of the Worker
| interface immediately terminates the Worker. This does not
| offer the worker an opportunity to finish its operations; it
| is stopped at once." so if that's not the case either the
| engine is wrong or the docs are wrong.
| chrismorgan wrote:
| Worker.terminate() aborts the currently running script
| evaluation, see
| https://html.spec.whatwg.org/multipage/workers.html#dom-work...
| https://html.spec.whatwg.org/multipage/workers.html#terminat...
| https://html.spec.whatwg.org/multipage/webappapis.html#abort...
| but that's a fairly fuzzy definition, and it's not generally
| reasonable to expect
| that to interrupt a currently executing piece of _native_
| code, because interrupting that (e.g. by sending a signal
| to the thread and forcibly starting unwinding) could leave
| data structures in a memory-unsafe state (that is, in the
| improbable worst case this could be a vector for escaping
| the sandbox). It's possible they've come up with some way
| of working around this, but it's going to be _considerably_
| easier and safer to just treat native code as
| uninterruptible by default, and possibly get known-slow
| blocking operations to manually periodically check if
| they're being asked to stop.
|
| So I say my gut feeling is that the match operation won't
| actually be interrupted, and I wouldn't be inclined to
| _depend_ on it actually being interrupted without explicit
| documentation of what aborting does, even if all
| environments I cared about did abort it immediately.
| richardanaya wrote:
| A little known annoying detail of web workers is that passing
| large data to them via postMessage is incredibly slow. The
| browser has to convert your JavaScript object into some sort of
| internal binary format and it slows down the thread that's doing
| the sending.
| reginaldo wrote:
| Depends on the type of large data, but since 2011 there's
| "Transferable" [1], where objects like ArrayBuffer,
| MessagePort, and ImageBitmap can be transferred with a low
| overhead [2]. Now, if you're passing large arbitrary object
| graphs, you're out of luck.
|
| [1] https://developers.google.com/web/updates/2011/12/Transferab...
| [2] https://developer.mozilla.org/en-US/docs/Web/API/Worker/post...
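|
| A short sketch of copy versus transfer, assuming a runtime that
| provides structuredClone (current browsers, Node 17+). It is
| used here because it accepts the same kind of transfer list as
| postMessage and can be run without a worker. Variable names are
| illustrative only.

```typescript
// One 1 MiB buffer, filled through a typed-array view.
const original = new ArrayBuffer(1024 * 1024);
new Uint8Array(original).fill(7);

// Plain structured clone: the bytes are copied and the original stays usable.
const copy = structuredClone(original);
const sizeAfterCopy = original.byteLength;

// Transfer: ownership moves to the clone and the original is detached,
// so its byteLength drops to 0 and no bulk copy is needed.
const moved = structuredClone(original, { transfer: [original] });
const sizeAfterTransfer = original.byteLength;

// With a worker, the transfer list is the second argument:
//   worker.postMessage(buffer, [buffer]);
```

| After a transfer the sender can no longer read the buffer, so
| this fits hand-off patterns rather than shared state.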
| MuffinFlavored wrote:
| Any libraries that make this nicer/easier to do instead of
| rewriting the same thing over and over?
| The_rationalist wrote:
| https://github.com/Bnaya/objectbuffer and comlink
| lanecwagner wrote:
| I built a little golang playground using web workers and WASM.
| It's nice to not hang the UI thread.
| https://app.qvault.io/playground/go
| _squared_ wrote:
| If you're interested in leveraging web workers easily for
| repetitive compute-heavy tasks in a webapp, I've built a little
| library that takes care of launching and managing worker threads
| for you: https://github.com/GitSquared/rinzler
| wngr wrote:
| Nice. Do you also support SharedArrayBuffers or does everything
| need to be serializable that is sent to/from WebWorkers?
| wngr wrote:
| By the way, I built something similar (?): A Rust library
| that mimics the API of the `futures-executor` crate, but each
| worker thread is a single WebWorker.
|
| https://github.com/wngr/wasm-futures-executor
| KubikPixel wrote:
| Web Workers make it possible to run a script operation in a
| background thread separate from the main execution thread of a
| web application. The advantage of this is that laborious
| processing can be performed in a separate thread, allowing the
| main (usually the UI) thread to run without being blocked/slowed
| down.
| kitsunesoba wrote:
| That can be a great feature in some circumstances, but it also
| seems like it can undo much of the efficiency improvement
| browsers have accomplished by throttling and sleeping less used
| tabs. Do browsers implement any kind of leashing in terms of
| how many resources any particular site's workers can use,
| frequency of running, length of runs, etc?
| andybak wrote:
| Any context? Is there a specific reason to post this now? Web
| workers aren't new to Firefox, are they?
| k__ wrote:
| There was a discussion in another thread about Apple stifling
| PWAs on iOS.
|
| One point was, web workers aren't supported correctly.
| SahAssar wrote:
| IIRC normal workers are supported (dedicated workers) but
| shared workers are not. Service workers are "supported" but
| some features that are pretty critical (like push
| notifications) are not.
| styfle wrote:
| I believe this is the other thread:
| https://news.ycombinator.com/item?id=29440457
| rubyskills wrote:
| Also I think push notifications are not supported.
___________________________________________________________________
(page generated 2021-12-04 23:01 UTC)