[HN Gopher] Web AI Model Testing: WebGPU, WebGL, and Headless Ch...
       ___________________________________________________________________
        
       Web AI Model Testing: WebGPU, WebGL, and Headless Chrome
        
       Author : kaycebasques
       Score  : 104 points
       Date   : 2024-01-16 19:16 UTC (3 hours ago)
        
 (HTM) web link (developer.chrome.com)
 (TXT) w3m dump (developer.chrome.com)
        
       | FL33TW00D wrote:
       | This can also be done in Rust using the excellent
       | `wasm_bindgen_test`!
        
       | not_a_dane wrote:
        | AFAIK, there is still a memory limit in Chrome which is set
        | to 4 GB per tab.
        
         | abxytg wrote:
         | I hate it so much. So arbitrary and capricious. I would say
         | this is currently the number one blocker for the web as a
         | serious platform. And they're doing it on purpose.
        
           | vicktorium wrote:
           | what apps can't run on 4GB?
           | 
           | games?
           | 
           | 3D?
           | 
           | Editing?
           | 
           | have you tried forking chrome and increasing this limit?
        
             | KeplerBoy wrote:
             | Some AI apps. You can't really load a capable LLM in 4 GB.
             | Or does this limit not apply when dealing with WASM and
             | WebGPU?
        
             | paulgb wrote:
             | Video editors are a big one. I've heard of people crashing
             | a browser tab with Figma as well.
             | 
             | For data exploration tools it's very easy to want to use
             | 4GB+ of memory. I found the limit cumbersome while working
             | on financial tools. It usually comes up in internal tools
             | where you reliably have a fast internet connection; it's
             | harder to reach the limit for public-facing tools because
             | there the slowness of sending 4GB+ to the browser is the
             | more limiting factor.
             | 
             | The annoying part isn't just that the limit is there, but
             | that you can't really handle it gracefully as the developer
             | -- when the browser decides you've hit the limit, it may
             | just replace the page with an error message.
        
               | 10000truths wrote:
               | For a video editor, only a small portion of the video
               | needs to be in memory at any given time. The rest can be
               | squirreled away in an IndexedDB store, which has no hard
               | size limits on most browsers.
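
The windowing idea described above can be sketched with a small cache that keeps only a handful of chunks in memory and evicts the rest to slower cold storage. This is an illustrative sketch, not code from the article: a plain `Map` stands in for an IndexedDB object store (a real implementation would use async IndexedDB reads/writes), and all names here are made up for the example.

```javascript
// Keep only a small window of decoded chunks in fast memory; evict the
// oldest to cold storage (IndexedDB in a real browser; a Map stands in
// for it here).
class ChunkCache {
  constructor(maxInMemory, coldStore) {
    this.maxInMemory = maxInMemory;
    this.hot = new Map();   // chunkId -> data, kept in insertion order
    this.cold = coldStore;  // stand-in for an IndexedDB object store
  }

  put(id, data) {
    this.hot.set(id, data);
    // Once the in-memory window is full, spill the oldest chunk.
    if (this.hot.size > this.maxInMemory) {
      const [oldestId, oldestData] = this.hot.entries().next().value;
      this.cold.set(oldestId, oldestData);
      this.hot.delete(oldestId);
    }
  }

  get(id) {
    if (this.hot.has(id)) return this.hot.get(id);
    const data = this.cold.get(id);             // would be an async read
    if (data !== undefined) this.put(id, data); // pull back into the window
    return data;
  }
}
```

With `maxInMemory` sized to a few seconds of video, resident memory stays roughly constant no matter how long the project is, which is the point of the comment above.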
        
           | whatshisface wrote:
           | I guess the policy is that tabs can use 100% of the available
           | resources on low end devices, but only 10% of the available
           | resources on high end devices.
        
             | skybrian wrote:
             | I think the desktop policy might be better. In the tablets
             | I've used, tabs sometimes get killed when I switch tabs and
             | visit another website with a lot of ads. It's an annoying
             | way to lose work in an unsubmitted form. It doesn't seem to
             | happen for desktop.
        
          | jsheard wrote:
          | At least on desktop you generally know where the line is. On
          | mobile there's a mystery limit you're not allowed to cross,
          | and you're also not allowed to know where the line is until
          | you reach it - which might gracefully throw an error or might
          | result in your tab being force-killed, and you're not allowed
          | to know which of those will happen either.
        
           | echelon wrote:
           | I'm building a list of "second class citizen" mobile web
           | issues for Android and Apple. I wasn't aware of this one! Do
           | you know of anything else like this?
        
             | jsheard wrote:
             | https://github.com/WebAssembly/design/issues/1397
             | 
             | > Currently allocating more than ~300MB of memory is not
             | reliable on Chrome on Android without resorting to Chrome-
             | specific workarounds, nor in Safari on iOS.
             | 
             | That's about allocating CPU memory but the GPU memory
             | situation is similar. The specs don't want to reveal
             | information about how much memory you're allowed to use
             | because it could be used for fingerprinting, but that means
             | that it's practically impossible to build reliable
             | applications which use (or can _optionally_ use) a lot of
             | memory. Every allocation you make past a few hundred MB
             | risks blowing up the app immediately, or putting it into
              | the danger zone where it's the first in line to get killed
             | when running in the background, either way without any
             | warning or last-chance opportunity to release memory to
             | avert getting killed.
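
The closest thing to discovering the limit is to probe it: grow a `WebAssembly.Memory` one 64 KiB page at a time until `grow()` throws. This is a minimal sketch of that approach, with the caveat the comment above makes clear - in a real mobile browser the tab can be OOM-killed outright before the exception ever fires, so the probe is only best-effort.

```javascript
// Grow a WebAssembly.Memory page by page until grow() throws, returning
// how many 64 KiB pages were reachable. `limitPages` caps the probe so
// the example stays cheap; in the wild you would probe toward the real
// (unknown) engine limit.
function probeMaxPages(limitPages) {
  const memory = new WebAssembly.Memory({ initial: 1, maximum: limitPages });
  let pages = 1;
  for (;;) {
    try {
      memory.grow(1); // one wasm page = 64 KiB
      pages += 1;
    } catch (e) {
      // RangeError: hit `maximum`, or the engine refused the allocation.
      return pages;
    }
  }
}
```

Even when the probe succeeds, the answer is only valid at that instant - background pressure from other tabs can shrink what you are actually allowed to keep.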
        
               | a_wild_dandan wrote:
               | Could the solution be a user permission dialog? Similar
               | to how browsers implement webcam/etc permissions: "Enable
               | <website> full GPU access? (Default: Off)"
        
           | fragmede wrote:
           | Mobile Safari just has a fixed limit of 500 tabs
        
             | Me1000 wrote:
             | If you switch to private browsing mode you can get an extra
             | 500 tabs. :)
        
               | wlesieutre wrote:
               | What about other tab groups?
        
             | paulgb wrote:
             | This is a user-facing limit; jsheard is talking about how
             | as an app developer you don't know whether your app is
             | below the limit or whether the next allocation will kill
             | the browser tab.
        
         | FL33TW00D wrote:
         | This is a 7B parameter model at int4, lots to play with!
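
The back-of-envelope arithmetic behind that claim (my own working, not from the thread): at int4, each weight takes half a byte, so 7B parameters come to about 3.26 GiB of weights - just under a 4 GB per-tab cap, though activations and runtime overhead still come on top.

```javascript
// Check that a 7B-parameter model quantized to int4 (4 bits = half a
// byte per weight) fits under a 4 GB per-tab memory cap, weights only.
const params = 7e9;
const bytesPerParam = 4 / 8;                // int4
const weightBytes = params * bytesPerParam; // 3.5e9 bytes
const gib = weightBytes / 2 ** 30;          // ~3.26 GiB
console.log(gib < 4);                       // true: the weights alone fit
```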
        
          | jmayes wrote:
          | Hello there, I am one of the authors of the piece. Fun fact:
          | just for the lols, we have tried running a 1.3B parameter
          | unoptimized TensorFlow.js model in this system just to see if
          | it would work (it could be much more memory efficient with
          | tweaks), and it does. It uses about 6GB RAM and 14GB VRAM when
          | using a V100 GPU on Colab (15GB VRAM limit) but runs pretty
          | fast otherwise once the initial load is complete. Obviously
          | there is plenty of room to make this use much less memory in
          | the future - we just wanted to check we could run such things
          | as a test for now.
        
        | sylware wrote:
        | Isn't that exactly the modern, AI-based, mouse and keyboard BOT?
        | (trained with click farms)
        
        | refulgentis wrote:
        | Real, but naive, question: does TensorFlow have meaningful share
        | outside Google? I've been in the HuggingFace ecosystem and it's
        | overwhelmingly PyTorch - IIRC 93% (I can't find the blog post
        | that said it, but I only gave it 2 minutes).
        
         | nwoli wrote:
         | Best alternative for web imo (perf generally beats onnx for
         | web)
        
         | hatthew wrote:
         | TF used to be the most popular framework by a large margin, so
         | a lot of things that were started 5+ years ago are still on it.
         | PyTorch is most popular in places that only started more
         | recently or have the ability to switch easily, e.g. new
         | startups, research, LLMs, education, and companies that have
         | the resources to do a migration project.
        
         | summerlight wrote:
         | A fun thing is that even in Google JAX is now preferred across
         | researchers and slowly taking over the share.
        
        | antimora wrote:
        | Great!
        | 
        | For the Burn project, we have a WebGPU example and I was looking
        | into how we could add automated tests in the browser. Now it
        | seems possible.
        | 
        | Here is the image classification example if you'd like to check
        | it out:
        | 
        | https://github.com/tracel-ai/burn/tree/main/examples/image-c...
        
       | lxe wrote:
       | I think better SIMD support for webassembly is more inclusive
       | than relying on / expecting WebGPU
        
          | jmayes wrote:
          | For this blog post we are using Chrome as the testing
          | environment, which has WebGPU turned on by default now, and
          | other common browsers should hopefully follow suit. Given we
          | are using Chrome here, we know WebGPU will be available if the
          | Web AI is using it (which many people are turning to for
          | diffusion models and LLMs, as it's so much faster to run those
          | types of models).
          | 
          | But yes, I am all for better support on all the things too; we
          | have many WASM users as well, and when anything new comes out
          | there, this set of instructions can still be used to leverage
          | testing that too, as it's essentially just Chrome running on
          | Linux with the right flags set.
        
       ___________________________________________________________________
       (page generated 2024-01-16 23:00 UTC)