Post AT0pdl9J60ddiCSaS8 by jamesravey@fosstodon.org
 (DIR) Post #AT0lyxV3yGeAHYUP4K by simon@fedi.simonwillison.net
       2023-02-24T17:29:11Z
       
       0 likes, 0 repeats
       
       New language models released by Meta Research: https://research.facebook.com/publications/llama-open-and-efficient-foundation-language-models/ From the paper: "For instance, LLaMA-13B outperforms GPT-3 on most benchmarks, despite being 10× smaller. We believe that this model will help democratize the access and study of LLMs, since it can be run on a single GPU."
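       A rough arithmetic sketch (an editorial illustration, not from the paper) of why a 13B-parameter model can fit on a single GPU, assuming fp16 or int8 weights and counting weights only:

           # Back-of-the-envelope memory footprint for a 13B-parameter model
           # (weights only; activations and the KV cache add more on top).
           params = 13e9
           fp16_gb = params * 2 / 1e9   # 2 bytes/param -> ~26 GB, fits a 40 GB A100
           int8_gb = params * 1 / 1e9   # 1 byte/param  -> ~13 GB, fits a 24 GB consumer card
           print(f"fp16: {fp16_gb:.0f} GB, int8: {int8_gb:.0f} GB")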
       
 (DIR) Post #AT0nOqKVLutxWkSPce by simon@fedi.simonwillison.net
       2023-02-24T17:45:17Z
       
       0 likes, 0 repeats
       
       I'm now thinking that we will be running language models with a sizable portion of the capabilities of ChatGPT on our own (top of the range) mobile phones and laptops within a year or two
       
 (DIR) Post #AT0o1GXeLSl7pIvXoe by laimis@mstdn.social
       2023-02-24T17:51:54Z
       
       0 likes, 0 repeats
       
       @simon yeah, that feels about right
       
 (DIR) Post #AT0pdl9J60ddiCSaS8 by jamesravey@fosstodon.org
       2023-02-24T18:09:52Z
       
       0 likes, 0 repeats
       
       @simon small LMs that run on a single machine (SLMs?) are so exciting! There are also other promising works that outperform GPT-3 at specific tasks after some fine tuning (e.g. Schick + Schutze PET and GenPET https://arxiv.org/pdf/2012.11926.pdf). PET will happily both train + infer on my 12GB VRAM TitanX card. I am excited about the potential of these techniques to truly democratise NLP, instead of "democratisation" based on choosing one of the proprietary APIs from 3-5 BigTechCos.
       
 (DIR) Post #AT0ppqVsygk3p7mh8K by simon@fedi.simonwillison.net
       2023-02-24T18:10:49Z
       
       0 likes, 0 repeats
       
       @jamesravey https://github.com/FMInference/FlexGen is a really interesting development in that space too!
       
 (DIR) Post #AT0pzlHzFw2wW70mLw by wordshaper@weatherishappening.network
       2023-02-24T18:11:24Z
       
       0 likes, 0 repeats
       
       @simon Yeah, definitely. I'm not sure whether that says more about the optimizability of these models or the frankly ludicrous amount of power in a modern phone or laptop, but either way we'll get usable models, probably as part of the base OS, soon.
       
 (DIR) Post #AT0qQM8Qt1D8K0Dvto by shajith@mastodon.social
       2023-02-24T18:19:13Z
       
       0 likes, 0 repeats
       
       @simon that should result in a rather different market structure from that of current cloud infrastructure, with its two or three big players. Seems like it would be better if this stuff is commoditized? Maybe it will run in browsers some day, like WebGL?
       
 (DIR) Post #AT0qcA8lJAPqb00X2W by JesseSkinner@toot.cafe
       2023-02-24T18:19:21Z
       
       0 likes, 0 repeats
       
       @simon Ooh, all the world's misinformation at our fingertips.
       
 (DIR) Post #AT0s1hHxIhOkb530KG by simon@fedi.simonwillison.net
       2023-02-24T18:37:26Z
       
       0 likes, 0 repeats
       
       @shajith https://whisper.ggerganov.com/talk/ runs GPT-2 and Whisper in the browser right now!
       
 (DIR) Post #AT0tFVwuMCoXR8M4gK by bartek@sfba.social
       2023-02-24T18:49:24Z
       
       0 likes, 0 repeats
       
       @simon Technically capable of? Maybe. Available in production phones? Doubtful, just look how long it took Apple to make offline Siri available.
       
 (DIR) Post #AT12NdY2t4jSuOfq1w by stabinger@fosstodon.org
       2023-02-24T20:29:30Z
       
       0 likes, 0 repeats
       
       @simon Unfortunately only available on request 😕
       
 (DIR) Post #AT18O85tuvrdN27y3E by Rob_Russell@mastodon.cloud
       2023-02-24T21:40:24Z
       
       0 likes, 0 repeats
       
       @simon @jamesravey FlexGen looks like a nice entrypoint here. I particularly like this very honest message while attempting to start the chat example: "If it seems to get stuck, you can monitor the progress by checking the memory usage of this process." 🤣
       
 (DIR) Post #AT1Ht3NQPCGdIw928m by corbin@defcon.social
       2023-02-24T23:26:45Z
       
       0 likes, 0 repeats
       
       @simon I was playing with Huggingface's question-answering pipeline this week. I fed it my documentation for my house, and it was able to answer basic questions in under a second, on my laptop. That said, I think that what might be more interesting for you is Petals, KoboldAI Horde, or other distributed approaches. On a phone, edge computing is at a premium; even if a model fits, it might be cheaper to run elsewhere.
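       A minimal sketch of the kind of Hugging Face question-answering pipeline call described above; the context string is a hypothetical stand-in, not the poster's actual house documentation:

           # Extractive question answering with Hugging Face's pipeline API.
           # The context below is a placeholder document for illustration.
           from transformers import pipeline

           qa = pipeline("question-answering")  # loads a default SQuAD-style model

           context = "The water shutoff valve is in the basement, next to the boiler."
           result = qa(question="Where is the water shutoff valve?", context=context)
           print(result["answer"], result["score"])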
       
 (DIR) Post #AT1OSEEmcKUYqfG63U by llimllib@vis.social
       2023-02-25T00:40:42Z
       
       0 likes, 0 repeats
       
       @simon I feel like such a worrywart, but I agree and I am super depressed about the prospect
       
 (DIR) Post #AT1TmvOcMbsmGUXxPk by b3n@g0v.social
       2023-02-25T01:39:37Z
       
       0 likes, 0 repeats
       
       @simon Great! More reasons for more people to frequently update their phones AND we can put them into weapons! </sarcasm>
       
 (DIR) Post #AT2KgtUdowSwCFNDyC by Dogzilla@mastodon.sdf.org
       2023-02-25T11:32:57Z
       
       0 likes, 0 repeats
       
       @simon We will still need the ability to train them though, and we’ll want control over the training process. I’m confident we’ll be running them locally, less confident how we’ll handle the training. Maybe it’ll fragment: some folks will use centrally-run ones, some will buy a home appliance that runs their “family ChatGPT”, some will download source and DIY it from start to finish?