Post #AUiRSbfoATDFitAGOG by corbin@defcon.social
(DIR) Post #AUiIKjrdGtjJBR9EMi by simon@fedi.simonwillison.net
2023-04-16T15:16:44Z
0 likes, 0 repeats
Web LLM runs the vicuna-7b Large Language Model entirely in your browser, and it’s very impressive https://simonwillison.net/2023/Apr/16/web-llm/
(DIR) Post #AUiIh0qenyW2EKJS4m by simon@fedi.simonwillison.net
2023-04-16T15:20:44Z
0 likes, 0 repeats
This is a full GPT language model that runs /entirely in the browser/ (using WebGPU, which is so new I had to use Chrome Canary on my M2 MacBook) - and it's pretty capable! It can handle summarization and invent puns, and it even generated me a passable rap battle between an otter and a pelican
(DIR) Post #AUiIt2J7CqwgEKMCMi by simon@fedi.simonwillison.net
2023-04-16T15:21:49Z
0 likes, 0 repeats
This is the latest in my series of posts on running Large Language Models on personal devices https://simonwillison.net/series/llms-on-personal-devices/
(DIR) Post #AUiJnXvsoMNSWyR7dA by simon@fedi.simonwillison.net
2023-04-16T15:33:03Z
0 likes, 0 repeats
Admittedly some answers were better than others!
(DIR) Post #AUiLCPmuKXSsSpf0PA by jeffgreco@indieweb.social
2023-04-16T15:48:17Z
0 likes, 0 repeats
@simon doesn’t seem “wrong” so much as “mocking you”
(DIR) Post #AUiMKxtLB4CuXo1zGq by corbin@defcon.social
2023-04-16T16:01:19Z
0 likes, 0 repeats
@simon I don't understand why this is desirable. As you yourself point out, the amount of data that has to be streamed and cached by the Web browser is unreasonable.
(DIR) Post #AUiMVAFs47kjmhQu8G by simon@fedi.simonwillison.net
2023-04-16T16:03:38Z
0 likes, 0 repeats
@corbin I wrote about that towards the end of my post - the browser has security features that are especially useful when working with LLMs
(DIR) Post #AUiMgmHX1UtOuLkOgK by simon@fedi.simonwillison.net
2023-04-16T16:04:06Z
0 likes, 0 repeats
@corbin you could wrap this whole thing in an Electron app or similar to avoid having to download the model over a network
(DIR) Post #AUiMqy6NvlV2SnOQ3E by corbin@defcon.social
2023-04-16T16:06:42Z
0 likes, 0 repeats
@simon I can trivially prove that my local LLaMA harness won't make any network calls. Doing the same for a browser is a massive headache. Sandboxes are an anti-pattern; they are what we use for untamed software. However, LLMs are brand-new and trivial to tame, so no sandboxes are required.
(DIR) Post #AUiR8H7AtWq6AbGoTY by simon@fedi.simonwillison.net
2023-04-16T16:55:29Z
0 likes, 0 repeats
@corbin the moment you start getting it to generate code for it to execute automatically (an increasingly popular pattern) you're going to want it to have access to a very robust sandbox
(DIR) Post #AUiRSbfoATDFitAGOG by corbin@defcon.social
2023-04-16T16:58:49Z
0 likes, 0 repeats
@simon Generate code in a language which denotes pure total functions. Then automatic execution can't do anything worse than waste a few moments of CPU time or a few GiB/min of RAM, and automatic analysis of code is relatively straightforward. People are mostly generating Python and ECMAScript. ECMAScript technically can be tamed, but Python can't. If you want to generate untrusted code and inspect it, then you need to avoid Turing-completeness. We've known this for like a century.
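
A minimal sketch of the restriction corbin describes: a tiny non-Turing-complete expression language whose programs always terminate, so generated code can be analyzed mechanically before it runs. The whitelist, function names, and numeric-only fragment below are illustrative assumptions, not anything specified in the thread.

    import ast

    # A tiny total expression language: arithmetic over numeric literals only.
    # No names, calls, loops, or recursion, so evaluation always terminates.
    ALLOWED = (ast.Expression, ast.BinOp, ast.UnaryOp, ast.Constant,
               ast.Add, ast.Sub, ast.Mult, ast.Div, ast.USub, ast.UAdd)

    def check(tree: ast.AST) -> None:
        """Reject any syntax outside the total fragment."""
        for node in ast.walk(tree):
            if not isinstance(node, ALLOWED):
                raise ValueError(f"disallowed construct: {type(node).__name__}")
            if isinstance(node, ast.Constant) and not isinstance(node.value, (int, float)):
                raise ValueError("only numeric literals are allowed")

    def evaluate(src: str) -> float:
        tree = ast.parse(src, mode="eval")
        check(tree)
        # Safe to run: the accepted fragment has no way to reach names,
        # attributes, imports, or calls.
        return eval(compile(tree, "<expr>", "eval"), {"__builtins__": {}})

    print(evaluate("2 * (3 + 4)"))       # 14
    # evaluate("__import__('os')")       # ValueError: disallowed construct

Since the fragment excludes even exponentiation, an accepted program can do nothing worse than the "waste a few moments of CPU time" bound corbin mentions.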
(DIR) Post #AUiTrPe2E7ipZ5I7W4 by simon@fedi.simonwillison.net
2023-04-16T17:25:51Z
0 likes, 0 repeats
@corbin Python can be tamed if you run it in a WebAssembly sandbox: https://til.simonwillison.net/webassembly/python-in-a-wasm-sandbox
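
The linked TIL describes running a WebAssembly build of CPython under wasmtime. A condensed sketch of that approach, assuming the wasmtime Python package and a local python-3.11.1.wasm build (for example the one published by VMware's Wasm Labs); the filename and output-capture details are assumptions, not verified against the post.

    from wasmtime import Engine, ExitTrap, Linker, Module, Store, WasiConfig

    def run_sandboxed(code: str) -> str:
        engine = Engine()
        linker = Linker(engine)
        linker.define_wasi()  # expose only WASI; no host network access
        module = Module.from_file(engine, "python-3.11.1.wasm")

        config = WasiConfig()
        config.argv = ["python", "-c", code]
        config.stdout_file = "out.txt"  # capture stdout via a scratch file

        store = Store(engine)
        store.set_wasi(config)
        instance = linker.instantiate(store, module)
        try:
            instance.exports(store)["_start"](store)
        except ExitTrap as trap:  # wasi-libc exits via proc_exit, which traps
            if trap.code != 0:
                raise
        return open("out.txt").read()

    print(run_sandboxed("print(1 + 1)"))  # 2

The interpreter sees only what the WasiConfig grants it - here a scratch file for stdout and nothing else - so code running inside it has no route to the network or the host filesystem.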
(DIR) Post #AUiWw1XxY8uY8XL8nQ by markus@hachyderm.io
2023-04-16T18:00:04Z
0 likes, 0 repeats
@simon I love how a passable rap battle between two animals has become a measure of software quality.
(DIR) Post #AUiePibtDfCvlxHime by corbin@defcon.social
2023-04-16T19:24:08Z
0 likes, 0 repeats
@simon I guess I ought to write a blog post explaining what taming is. There's an old E document, at least: http://www.erights.org/elib/legacy/taming.html Yes, WebAssembly is tamed. Yes, emulators written in tamed languages are freely tamed. No, Python's native type theory is not tamed simply by running in a managed runtime; for example, CPython is not tamed, although PyPy has object spaces which are somewhat tame.
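
For readers who don't follow the E link: taming roughly means code receives only the authorities explicitly handed to it, with no ambient ones like open() or socket(). A toy Python illustration of the discipline (all names hypothetical), which also shows why corbin says CPython itself is not tamed:

    def make_append_logger(path: str):
        """A capability granting append-only access to one file, nothing else."""
        def append(line: str) -> None:
            with open(path, "a") as f:
                f.write(line + "\n")
        return append

    def untrusted_plugin(log) -> None:
        # The plugin can call log(), but holds no handle on the filesystem,
        # the network, or even the path being written to.
        log("plugin ran")

    untrusted_plugin(make_append_logger("audit.log"))

Nothing in CPython stops the plugin from importing os and bypassing the discipline; in E, or a genuinely tamed runtime, the language itself closes that door.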
(DIR) Post #AUikF6j1oqAf6Uwndw by simon@fedi.simonwillison.net
2023-04-16T20:29:31Z
0 likes, 0 repeats
@corbin that's why I'm very happy to outsource that entire problem to WebAssembly, rather than worrying about it at the level of individual programming languages
(DIR) Post #AUk6GsCjAeqWnS2ulk by StuartGray@mastodonapp.uk
2023-04-17T12:10:50Z
0 likes, 0 repeats
@simon This is some very impressive work, along with their stable diffusion web app. I can't wait until they extend support to more models, preferably with better (non-academic) licences. The specs say you need a GPU with at least 6.4GB VRAM, presumably to fit the entire model. However, whilst I wouldn't recommend it, I did get this to work on an old 2008 Core 2 Quad with 8GB RAM and an AMD card with only 4GB VRAM - it generates about 2 tokens/sec of output.