[HN Gopher] Web Workers API
       ___________________________________________________________________
        
       Web Workers API
        
       Author : KubikPixel
       Score  : 92 points
       Date   : 2021-12-03 08:48 UTC (1 day ago)
        
 (HTM) web link (developer.mozilla.org)
 (TXT) w3m dump (developer.mozilla.org)
        
       | yutijke wrote:
       | The fact that creating a worker requires you to pass a module
       | (file) name makes them extremely unergonomic to use.
       | 
       | This is clearly visible in the lack of any library ecosystem
       | around them. We don't have any highly used threadpool or executor
       | libraries using workers. Everyone seems to be manually setting up
       | a worker and setting up the job scheduling logic from scratch.
       | 
       | All the boilerplate and restrictions really limit their usage
       | IMO.
       | 
       | I know you can pass a Blob URL but you can't write libraries this
       | way since you risk running foul of the downstream consumer's
       | browser CSP.
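The Blob URL route mentioned above can be sketched roughly as follows. The helper names `workerSourceFromFunction` and `inlineWorkerUrl` are hypothetical, and the sketch assumes the function handed in is fully self-contained, because it is shipped to the worker as source text.

```typescript
// Sketch only: `workerSourceFromFunction` and `inlineWorkerUrl` are
// hypothetical helper names, not part of any library.

// Turn a plain function or arrow function into worker source text. The
// function is stringified, so it cannot use variables from the calling scope.
function workerSourceFromFunction<In, Out>(fn: (input: In) => Out): string {
  return [
    `const handler = ${fn.toString()};`,
    `self.onmessage = (event) => {`,
    `  self.postMessage(handler(event.data));`,
    `};`,
  ].join("\n");
}

// Wrap the generated source in a Blob and return a blob: URL for it.
// In a browser this URL can be passed to `new Worker(url)`.
function inlineWorkerUrl<In, Out>(fn: (input: In) => Out): string {
  const blob = new Blob([workerSourceFromFunction(fn)], {
    type: "text/javascript",
  });
  return URL.createObjectURL(blob);
}
```

In a page, `new Worker(inlineWorkerUrl((n: number) => n * 2))` would start the worker, but only if the page's CSP allows `blob:` as a worker source, which is the library-author problem described above.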
        
         | Jasper_ wrote:
         | A big complaint I have is that the built-in scheduling for
         | workers assumes a very specific scenario. You can send messages
         | to a single worker, but the message loop has to be operated by
         | the browser. You also don't know the length of the message
         | queue, even though the browser has it available, so you can't
         | easily send a message to the worker with the least work queued.
         | If a worker starts working on low-priority items, and you want
         | to interject with a high-priority message, you also can't
         | interrupt the worker, nor can the worker loop the messages on
         | its own accord. You also can't re-sort the message queue, it's
         | FIFO.
         | 
         | Basically, any sort of work scheduling that you would _like_ to
         | do to queue high-priority messages, or have workers share a
         | pool of work, is impossible to build with the onmessage-style
         | of WebWorkers. It feels like an API made by someone who had
         | read about threads once, rather than someone who's built an
         | actual many-workers processing system like this. The event loop
         | being unpumpable from user code feels like a giant kick in the
         | face.
         | 
         | My workaround for this was to kill workers working on
         | low-priority items, relaunch them, and re-sort the queue, but that
         | all feels like a mess. Also, at some point a Chrome update made
         | this strategy crash. I ended up just removing the WebWorker
         | code; more trouble than it was worth for marginal code
         | improvements.
         | 
         | I know that SAB and atomics add new low-level primitives to
         | support this, but SAB is still poorly supported on widely
         | deployed platforms. And you still need to do the serialization
         | yourself.
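One partial workaround for the FIFO queue, short of killing workers, is to keep the queue on the sending side and hand each worker only one job at a time. The sketch below is a minimal illustration under that assumption; `PriorityPool` and `WorkerLike` are hypothetical names, and a job that is already running still cannot be interrupted.

```typescript
// Sketch only: `PriorityPool` and `WorkerLike` are hypothetical names.

// The minimal worker shape this sketch needs; fakes can implement it in tests.
interface WorkerLike {
  postMessage(message: unknown): void;
  onmessage: ((event: any) => void) | null; // event.data carries the result
}

interface Job {
  priority: number; // higher runs first
  payload: unknown;
  resolve: (result: unknown) => void;
}

class PriorityPool {
  private queue: Job[] = [];
  private idle: WorkerLike[];

  constructor(workers: WorkerLike[]) {
    this.idle = [...workers];
  }

  // Number of jobs not yet handed to any worker.
  get pending(): number {
    return this.queue.length;
  }

  submit(payload: unknown, priority = 0): Promise<unknown> {
    return new Promise((resolve) => {
      this.queue.push({ priority, payload, resolve });
      // Stable sort: equal priorities keep their submission order.
      this.queue.sort((a, b) => b.priority - a.priority);
      this.dispatch();
    });
  }

  // Hand queued jobs to idle workers, one job per worker at a time, so the
  // browser-side message queue never holds more than the job being run.
  private dispatch(): void {
    while (this.idle.length > 0 && this.queue.length > 0) {
      const worker = this.idle.pop()!;
      const job = this.queue.shift()!;
      worker.onmessage = (event) => {
        worker.onmessage = null;
        this.idle.push(worker);
        job.resolve(event.data);
        this.dispatch();
      };
      worker.postMessage(job.payload);
    }
  }
}
```

Because the queue lives in user code, its length is visible through `pending` and a late high-priority job overtakes everything that has not started yet.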
        
         | The_rationalist wrote:
         | There are solutions to those problems:
         | https://github.com/GoogleChromeLabs/comlink
         | https://github.com/Bnaya/objectbuffer
        
         | MuffinFlavored wrote:
         | My bigger complaint is... how do you cleanly do synchronous
         | RPC-style calling? Even with clever async/await tricks, the
         | serialization of complicated input/output structs to and from
         | the worker seems so expensive.
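For the async/await flavour of RPC mentioned above, the usual trick is to tag each request with an id and match replies against it. The sketch below assumes a hypothetical `{ id, method, args }` envelope and a hypothetical `createRpcClient` helper; it stays asynchronous, omits error replies and timeouts, and does nothing about the serialization cost.

```typescript
// Sketch only: `createRpcClient` and the { id, method, args } envelope are
// hypothetical, not a standard wire format.

interface PortLike {
  postMessage(message: unknown): void;
  onmessage: ((event: any) => void) | null;
}

function createRpcClient(port: PortLike) {
  let nextId = 0;
  const pending = new Map<number, (result: unknown) => void>();

  // Replies are matched to callers by the id echoed back from the worker.
  port.onmessage = (event) => {
    const { id, result } = event.data as { id: number; result: unknown };
    const resolve = pending.get(id);
    if (resolve) {
      pending.delete(id);
      resolve(result);
    }
  };

  return function call(method: string, ...args: unknown[]): Promise<unknown> {
    return new Promise((resolve) => {
      const id = nextId++;
      pending.set(id, resolve);
      port.postMessage({ id, method, args });
    });
  };
}
```

The worker side is assumed to reply with `{ id, result }` for every request it receives; a caller would then write something like `await call("add", 2, 3)`.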
        
         | csmpltn wrote:
         | > "The fact that creating a worker requires you to pass a
         | module (file) name makes them extremely unergonomic to use."
         | 
         | Can anybody explain how did this ever make it into the official
         | spec and the default implementation?
         | 
         | The browser will make a network call to fetch your Web Worker
         | .js file on every instantiation of your Web Worker.
         | Instantiations of the same Web Worker "module" aren't cached,
         | so in a thread-pool scenario your browser would be fetching the
         | same file over and over again.
         | 
         | What were the people designing this even thinking? This is so
         | wasteful and frustrating. The API simply sucks.
        
           | dmitriid wrote:
           | > Can anybody explain how did this ever make it into the
           | official spec and the default implementation?
           | 
           | This and service workers. Both are... weird, to say the least
        
             | csmpltn wrote:
             | When I'm seeing half-arsed APIs like this being introduced
             | into so-called "modern" incarnations of the web, I start
             | doubting the technical chops of the people involved and the
             | process as a whole. Is it truly possible that there wasn't
             | anyone with sufficient experience in building
             | concurrency/multi-threading/parallelism APIs involved in
             | building this?
             | 
             | Failing to account for the thread-pool scenario, as an
             | example, is just mind boggling.
        
           | TAForObvReasons wrote:
           | "best guess": API decisions are opinionated, reflecting the
           | designers' views of how users should write code.
           | 
           | For example, FileReader API is async in the main thread.
           | There's an equivalent FileReaderSync for sync operations but
           | that is only available in Web Workers. Why isn't
           | FileReaderSync available on the main thread? Because the
           | designers didn't want people to do things that would block
           | the main thread.
           | 
           | The Web Worker argument probably went along the lines of "If
           | we allow users to pass arbitrary function objects, they may
           | try to do something that accesses local variables from the
           | site where the worker is created. That is obviously an error,
           | so we should design the API so that it can't happen. Creating
           | a separate script creates a clear mental separation and
           | avoids that class of error"
        
         | fabiospampinato wrote:
         | Workers are awesome but you are right, working with them can be
         | painful without the right tooling.
         | 
         | Personally I've written my own libraries for abstracting all
         | this away and I'm having a blast working with workers now,
         | maybe check them out:
         | 
         | - WorkTank [1]: This abstracts away the difference between
         | browser workers and Node worker threads, it makes it easy to
         | make worker pools, and it can transfer simple functions to a
         | worker at runtime too.
         | 
         | - WorkTank loader [2]: This abstracts away loading asynchronous
         | functions from a worker, basically. You just add ".worker" to
         | your file name and that file and all its dependencies are
         | transparently moved to a worker (or worker pool). The rest of
         | the app (TS types, for example) won't even notice anything
         | happened; it just works.
         | 
         | You might want to check out the more popular "comlink" library
         | too, although it didn't work for me for whatever reason when I
         | tried it, and it doesn't support worker pools I believe.
         | 
         | [1]: https://github.com/fabiospampinato/worktank
         | 
         | [2]: https://github.com/fabiospampinato/worktank-loader
        
       | jcun4128 wrote:
       | I have used them for keeping things running in tabs that are not
       | focused, not losing time sync (not critical though).
        
       | recursivedoubts wrote:
       | Hyperscript supports inline web worker definitions:
       | 
       | https://hyperscript.org/features/worker/
       | 
       | We tried to improve the API to this very cool feature.
        
       | pictur wrote:
       | Are web workers really good for costly processing? Would they
       | really be useful for, say, a dashboard with a live data flow?
        
         | hotz wrote:
         | We used web workers to shift a lot of processing off of the
         | main thread. They helped in keeping the UI feeling responsive
         | and less laggy.
        
           | wackget wrote:
           | What kind of processing do you mean? Unless you're developing
           | a game or a media player, I can't think of a good example
           | where processing might delay the UI.
        
         | Klaster_1 wrote:
         | Depends what's on the dashboard and how the data has to be
         | transformed for display. I've worked on several such dashboards
         | and never had to use Web Workers to solve the performance
         | issues, these were usually caused by UI rendering not optimized
         | for loads of data. Now, the workers can help with rendering
         | too, for example to render a chart off the main thread, but the
         | libraries that support this usually hide the worker magic
         | behind an API.
        
       | rezmason wrote:
       | This past year I've live streamed the development of a small web
       | app that benefits enormously from web workers:
       | 
       | https://github.com/Rezmason/wireworld-player
       | 
       | https://rezmason.github.io/wireworld-player
       | 
       | It's a simulation: the main thread asks a web worker to update the
       | world state and then render the new state. By default, this
       | happens once per requestAnimationFrame.
       | 
       | But there's a "Turbo" mode (its UI toggle looks like a
       | radioactive hazard symbol) that, when activated, tells the web
       | worker to update as often as it can per requestAnimationFrame,
       | speeding up the simulation around 72x while keeping the main
       | thread 100% responsive.
       | 
       | The decoupling of the synchronous number crunching work from the
       | main thread has also given me a place to experiment with much
       | more resource intensive algorithms, like
       | https://jennyhasahat.github.io/hashlife.html , which fills an
       | enormous cache and can advance the sim by exponential time steps.
       | 
       | Modifying Hashlife to run in the main thread without freezing the
       | app is possible, but it would have made the code much more
       | complicated, run slower, and the other cores available to web
       | workers would have gone unused.
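The "Turbo" idea described above (update as often as possible between frames) can be approximated inside a worker with a time-budgeted loop. This is only an illustration, not code from wireworld-player: `stepUntilBudget` is a hypothetical helper, and a budget of about 16 ms (one frame at 60 Hz) is an assumption.

```typescript
// Sketch only: `stepUntilBudget` is a hypothetical helper, not code from
// wireworld-player.

// Run `step` repeatedly until roughly `budgetMs` has elapsed and report how
// many steps fit. `now` is injectable so the loop can be tested with a fake
// clock.
function stepUntilBudget(
  step: () => void,
  budgetMs: number,
  now: () => number = () => performance.now(),
): number {
  const deadline = now() + budgetMs;
  let steps = 0;
  do {
    step();
    steps++;
  } while (now() < deadline);
  return steps;
}
```

A worker's message handler could call this once per request from the main thread and post the resulting state back, so the main thread never does more than receive and draw.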
        
       | fabiospampinato wrote:
       | Two lesser known really cool things about Web Workers:
       | 
       | 1. They kind of allow for stopping synchronous operations. For
       | example, some regexes have "catastrophic backtracking", and
       | executing them will take a really long time. So what do you do if
       | you have to execute user-provided regexes, especially on the
       | server? Detecting potentially catastrophic regexes is tough, and
       | reimplementing the regex engine so that it yields to the main
       | thread frequently enough for you to stop it is super tough. The
       | solution? You can execute the regex in a Web Worker, and if you
       | haven't received a response within some set amount of time you
       | can just kill the Web Worker, effectively stopping the regex
       | execution, cool!
       | 
       | 2. They kind of allow for blocking on promises. Normally you
       | can't block the event loop while you are waiting for a promise
       | to be resolved; in other words, you can't make an asynchronous
       | function synchronous. With Web Workers you can: execute the
       | asynchronous function you need on a worker,
       | and then use Atomics.wait on the main thread to block (without
       | melting the computer) until that function resolves, super cool!
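The kill-on-timeout pattern from point 1 could look roughly like this. `runWithTimeout` and `TerminableWorker` are hypothetical names, and whether `terminate()` actually interrupts a regex match that is already running in native code is engine-dependent and not verified here.

```typescript
// Sketch only: `runWithTimeout` and `TerminableWorker` are hypothetical names.

interface TerminableWorker {
  postMessage(message: unknown): void;
  terminate(): void;
  onmessage: ((event: any) => void) | null;
}

// Send one message and wait for one reply; if none arrives within
// `timeoutMs`, terminate the worker and reject. The caller has to create a
// fresh worker afterwards, since a terminated one cannot be reused.
function runWithTimeout(
  worker: TerminableWorker,
  message: unknown,
  timeoutMs: number,
): Promise<unknown> {
  return new Promise((resolve, reject) => {
    const timer = setTimeout(() => {
      worker.terminate();
      reject(new Error(`no reply within ${timeoutMs} ms; worker terminated`));
    }, timeoutMs);

    worker.onmessage = (event) => {
      clearTimeout(timer);
      resolve(event.data);
    };
    worker.postMessage(message);
  });
}
```

The worker script itself is assumed to run the user-provided regex on the received input and post the match result back.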
        
         | jefftk wrote:
         | In the specific case of regular expressions on the server, re2
         | was written to handle exactly this.
        
         | wffurr wrote:
         | In #2 you use Atomics.wait in the Worker instead and then it
         | can signal the main thread when done.
         | 
         | We use this to convert async browser APIs to service
         | synchronous callbacks that our C library compiled to
         | WebAssembly expects.
        
         | domenicd wrote:
         | Atomics.wait() cannot be called on the main thread (i.e. when
         | your global is a Window object).
        
           | no_way wrote:
           | There is Atomics.waitAsync which can be used on the main
           | thread and just returns a promise. It is shipped at least in
           | Chrome.
        
           | fabiospampinato wrote:
           | It works fine both in Node and Deno. I haven't tested this
           | out on browsers though, is that supposed to throw even if
           | SharedArrayBuffer is enabled?
        
             | azakai wrote:
             | Yes, browsers will not allow you to block on the main
             | thread. Atomics.waitAsync is supposed to be used instead.
             | 
             | This is a fairly difficult aspect of multithreading on the
             | web, and it makes things more complicated than other
             | platforms (like Node and Deno, as you mentioned). For
             | example in emscripten's pthreads support layer there is
             | code dedicated to do a sort of careful busy-wait when we
             | have no other option, and all that is only for the case of
             | the main thread.
             | 
             | But your point is still very relevant, just not on the main
             | thread: if you can run your application in a worker, then
             | you _can_ block on Promises using another worker that does
             | the async operation while the first worker is synchronous.
             | And that's really useful!
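The worker-to-worker handshake described above can be sketched with a small shared mailbox. The function names and the two-slot layout are hypothetical; in browsers, SharedArrayBuffer additionally requires the page to be cross-origin isolated, and `waitForResult` is only meant to be called from a worker.

```typescript
// Sketch only: function names and the two-slot layout are hypothetical.

// Slot 0 of the shared Int32Array is the flag: 0 = pending, 1 = done.
// Slot 1 carries the result.
function createMailbox(): Int32Array {
  return new Int32Array(new SharedArrayBuffer(8));
}

// Called by the worker that performs the async operation, once it settles.
function publishResult(mailbox: Int32Array, value: number): void {
  Atomics.store(mailbox, 1, value);
  Atomics.store(mailbox, 0, 1);
  Atomics.notify(mailbox, 0); // wake any thread blocked in Atomics.wait
}

// Called by the worker that wants a synchronous answer. Blocks until the flag
// leaves 0 or a wait times out; browsers reject Atomics.wait on the main
// thread.
function waitForResult(
  mailbox: Int32Array,
  timeoutMs: number,
): number | undefined {
  while (Atomics.load(mailbox, 0) === 0) {
    if (Atomics.wait(mailbox, 0, 0, timeoutMs) === "timed-out") {
      return undefined;
    }
  }
  return Atomics.load(mailbox, 1);
}
```

The same underlying SharedArrayBuffer has to be posted to both workers first; the synchronous side then calls `waitForResult` while the other side awaits its promise and calls `publishResult`.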
        
         | chrismorgan wrote:
         | > _and then use Atomics.wait on the main thread to block_
         | 
         | The main thread is not allowed to use Atomics.wait. I'm not
         | certain what the implementation status of this is because I've
         | never used it and I have a vague feeling I heard that _one_
         | browser shipped it without that restriction, but at the very
         | least you may get a TypeError in some user agents and you can
         | expect to in all user agents at some point in the future when
         | they become sterner about not blocking the main thread.
         | 
         | As for stopping synchronous operations, I'm dubious that would
         | actually work; without actually testing it (and I don't have
         | time to test it now, though I'd be interested in the result,
         | including across various platforms), I think it's more likely
         | that the regexp match would be uninterruptible, and that it
         | would just go on munching your CPU until it finished, and
         | _then_ terminate once it returned from the native code to the
         | JavaScript.
        
           | fabiospampinato wrote:
           | I should check what's the status with Atomics.wait on Chrome,
           | currently it seems to work fine under both Node and Deno.
           | 
           | I'll test out the regex thing in the following days as I need
           | it for that exact use case, I just assumed killing the web
           | worker would... work, hopefully that's the case otherwise I'm
           | back to square 0 :D
           | 
           | ---
           | 
           | Edit: MDN says: "The terminate() method of the Worker
           | interface immediately terminates the Worker. This does not
           | offer the worker an opportunity to finish its operations; it
           | is stopped at once." so if that's not the case either the
           | engine is wrong or the docs are wrong.
        
             | chrismorgan wrote:
             | Worker.terminate() aborts the currently running script
             | evaluation; see:
             | 
             | https://html.spec.whatwg.org/multipage/workers.html#dom-work...
             | https://html.spec.whatwg.org/multipage/workers.html#terminat...
             | https://html.spec.whatwg.org/multipage/webappapis.html#abort...
             | 
             | But that's a fairly fuzzy definition, and it's not generally
             | reasonable to expect
             | that to interrupt a currently executing piece of _native_
             | code, because interrupting that (e.g. by sending a signal
             | to the thread and forcibly starting unwinding) could leave
             | data structures in a memory-unsafe state (that is, in the
             | improbable worst case this could be a vector for escaping
             | the sandbox). It's possible they've come up with some way
             | of working around this, but it's going to be _considerably_
             | easier and safer to just treat native code as
             | uninterruptible by default, and possibly get known-slow
             | blocking operations to manually periodically check if
             | they're being asked to stop.
             | 
             | So I say my gut feeling is that the match operation won't
             | actually be interrupted, and I wouldn't be inclined to
             | _depend_ on it actually being interrupted without explicit
             | documentation of what aborting does, even if all
             | environments I cared about did abort it immediately.
        
       | richardanaya wrote:
       | A little known annoying detail of web workers is that passing
       | large data to them via postMessage is incredibly slow. The
       | browser has to convert your javascript object into some sort of
       | internal binary format and it slows down the thread that's doing
       | the sending.
        
         | reginaldo wrote:
         | Depends on the type of large data, but since 2011 there's
         | "Transferable" [1], where objects like ArrayBuffer,
         | MessagePort, and ImageBitmap can be transferred with a low
         | overhead [2]. Now, if you're passing large arbitrary object
         | graphs, you're out of luck.
         | 
         | [1] https://developers.google.com/web/updates/2011/12/Transferab...
         | 
         | [2] https://developer.mozilla.org/en-US/docs/Web/API/Worker/post...
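The difference between copying and transferring can be observed without starting a worker. In this sketch `structuredClone` stands in for `postMessage`, since both accept a transfer list; with a real worker the equivalent call would be `worker.postMessage(buffer, [buffer])`.

```typescript
// Sketch only: `structuredClone` stands in for `worker.postMessage` so the
// effect can be observed without a worker.

const buffer = new ArrayBuffer(1024 * 1024);
const view = new Uint8Array(buffer);
view.fill(7);

// Copying: the receiver gets its own duplicate and the source stays usable.
const copied = structuredClone(buffer);

// Transferring: ownership of the memory moves and the source is detached.
const moved = structuredClone(buffer, { transfer: [buffer] });

console.log(copied.byteLength); // 1048576
console.log(moved.byteLength); // 1048576
console.log(buffer.byteLength); // 0, the original buffer is now detached
```

After a transfer the sender can no longer read the data, so this fits hand-off pipelines better than data both sides need to keep.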
        
           | MuffinFlavored wrote:
           | Any libraries that make this nicer/easier to do instead of
           | rewriting the same thing over and over?
        
             | The_rationalist wrote:
             | https://github.com/Bnaya/objectbuffer and comlink
        
       | lanecwagner wrote:
       | I built a little golang playground using web workers and WASM.
       | It's nice to not hang the UI thread.
       | https://app.qvault.io/playground/go
        
       | _squared_ wrote:
       | If you're interested in leveraging web workers easily for
       | repetitive compute-heavy tasks in a webapp, I've built a little
       | library that takes care of launching and managing worker threads
       | for you: https://github.com/GitSquared/rinzler
        
         | wngr wrote:
         | Nice. Do you also support SharedArrayBuffers or does everything
         | need to be serializable that is sent to/from WebWorkers?
        
           | wngr wrote:
           | By the way, I built something similar (?): A Rust library
           | that mimics the API of the `futures-executor` crate, but each
           | worker thread is a single WebWorker.
           | 
           | https://github.com/wngr/wasm-futures-executor
        
       | KubikPixel wrote:
       | Web Workers make it possible to run a script operation in a
       | background thread separate from the main execution thread of a
       | web application. The advantage of this is that laborious
       | processing can be performed in a separate thread, allowing the
       | main (usually the UI) thread to run without being blocked/slowed
       | down.
        
         | kitsunesoba wrote:
         | That can be a great feature in some circumstances, but it also
         | seems like it can undo much of the efficiency improvement
         | browsers have accomplished by throttling and sleeping less-used
         | tabs. Do browsers implement any kind of leashing in terms of
         | how many resources a particular site's workers can use, how
         | often they run, how long they run, etc.?
        
         | andybak wrote:
         | Any context? Is there a specific reason to post this now? Web
         | workers aren't new to Firefox, are they?
        
           | k__ wrote:
           | There was a discussion in another thread about Apple stifling
           | PWAs on iOS.
           | 
           | One point was, web workers aren't supported correctly.
        
             | SahAssar wrote:
             | IIRC normal workers are supported (dedicated workers) but
             | shared workers are not. Service workers are "supported" but
             | some features that are pretty critical (like push
             | notifications) are not.
        
             | styfle wrote:
             | I believe this is the other thread:
             | https://news.ycombinator.com/item?id=29440457
        
             | rubyskills wrote:
             | Also I think push notifications are not supported.
        
       ___________________________________________________________________
       (page generated 2021-12-04 23:01 UTC)