 (DIR) Post #AcwTjOIdjn7lcSjkUi by simon@fedi.simonwillison.net
       2023-12-18T18:27:02Z
       
       0 likes, 0 repeats
       
        Many options for running Mistral models in your terminal using LLM
        
        I wrote about a whole bunch of different ways you can use my LLM tool to run prompts through Mistral 7B, Mixtral 8x7B and the new Mistral-medium from the terminal: https://simonwillison.net/2023/Dec/18/mistral/
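        
        A minimal sketch of the hosted-API route from the linked post, assuming the llm-mistral plugin registers the Mistral API models under names like mistral-tiny and mistral-medium (check `llm models` for the exact identifiers on your install):
        
            # Install the plugin that talks to Mistral's hosted API
            llm install llm-mistral
            # Store your Mistral API key (prompted for interactively)
            llm keys set mistral
            # Run a prompt through the hosted Mistral-medium model
            llm -m mistral-medium 'Write a haiku about the terminal'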
       
 (DIR) Post #AcwUTMmzgYpCFurEUC by simon@fedi.simonwillison.net
       2023-12-18T18:35:10Z
       
       0 likes, 0 repeats
       
        Noteworthy that Mistral 7B was released on September 26 and there are already seven LLM plugins that can execute it, either locally or via a hosted API:
        
        llm-mistral
        llm-llama-cpp
        llm-gpt4all
        llm-mlc
        llm-replicate
        llm-anyscale-endpoints
        llm-openrouter
        
        Mistral appears to be establishing itself as the default LLM alternative to OpenAI's models
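        
        A quick way to see what any one of those plugins adds, sketched here with llm-gpt4all (any of the plugins above would work the same way; the model names each one registers vary):
        
            # Install one of the local-execution plugins
            llm install llm-gpt4all
            # List installed plugins and their versions
            llm plugins
            # List every model now available, filtering for Mistral variants
            llm models | grep -i mistral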
       
 (DIR) Post #AcwUfdQde87DKxFcnI by osma@sigmoid.social
       2023-12-18T18:36:20Z
       
       0 likes, 0 repeats
       
        @simon Excellent as always! Thanks!
        
        Minor nitpick: You say that Mistral Small beats GPT-3.5 on every metric. But in the table it has slightly lower scores for WinoGrande and MT Bench.
       
 (DIR) Post #AcwV49XFvSZ9re2N8a by simon@fedi.simonwillison.net
       2023-12-18T18:37:05Z
       
       0 likes, 0 repeats
       
       @osma Oops good catch, thanks, I'll update the copy
       
 (DIR) Post #AcwX5db9q0QXIhhLM0 by xek@hachyderm.io
       2023-12-18T19:04:36Z
       
       0 likes, 0 repeats
       
       @simon FWIW, I tried following the instructions there in a fresh venv and got `Error: 'gguf' is not a known model`.  `llm --version` shows 0.12, not sure if I'm missing a plugin or something that adds this.
       
 (DIR) Post #AcwXRJ7sdLaaB6KwJU by simon@fedi.simonwillison.net
       2023-12-18T19:08:37Z
       
       0 likes, 0 repeats
       
        @xek Did you install the latest `llm-llama-cpp` plugin?
       
 (DIR) Post #AcwXhZcD5wmnx4HOXA by xek@hachyderm.io
       2023-12-18T19:11:35Z
       
       0 likes, 0 repeats
       
       @simon Ah, sorry, forgot to include that.  `llm plugins` shows it at 0.3.
       
 (DIR) Post #AcwiGpzzAl3wRPr1tY by simon@fedi.simonwillison.net
       2023-12-18T21:09:49Z
       
       0 likes, 0 repeats
       
       Added another option: you can run Mixtral as a llamafile and then configure my LLM tool to talk to it via its OpenAI-compatible localhost API endpoint https://simonwillison.net/2023/Dec/18/mistral/#llamafile-openai
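        
        A sketch of that setup, assuming LLM's extra-openai-models.yaml mechanism for registering OpenAI-compatible endpoints, a llamafile serving on its default port 8080, and an arbitrary model_id of llamafile:
        
            # The config file lives in the same directory as LLM's logs.db
            cd "$(dirname "$(llm logs path)")"
            # Register the llamafile's OpenAI-compatible endpoint
            printf -- '- model_id: llamafile\n  model_name: llamafile\n  api_base: http://localhost:8080/v1\n' > extra-openai-models.yaml
            # With the llamafile server running locally, prompt it through LLM
            llm -m llamafile 'Three good names for a pet pelican'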
       
 (DIR) Post #AcwkI1aVlmjKa9dHpA by dio@mastodon.online
       2023-12-18T21:32:06Z
       
       0 likes, 0 repeats
       
        @simon very cool! I have heard of llamafiles, but this is an awesome implementation
       
 (DIR) Post #AcxRHoxb105LRmZzma by jeremybmerrill@journa.host
       2023-12-19T03:56:37Z
       
       0 likes, 0 repeats
       
       @xek @simon Same error, actually. (Likewise llm 0.12, llm-llama-cpp 0.3) If it's helpful, `llm models` also doesn't show any gguf-related output.
       
 (DIR) Post #AcxRHqDaL2T9Lf4HLM by simon@fedi.simonwillison.net
       2023-12-19T05:34:32Z
       
       0 likes, 0 repeats
       
       @jeremybmerrill @xek what does "llm plugins" output?
       
 (DIR) Post #AcxWV3Ryx5oBvGvEOW by sharkjacobs@mastodon.social
       2023-12-19T06:33:00Z
       
       0 likes, 0 repeats
       
       @simon so many options! But which one’s the best? Is MLC the fastest?
       
 (DIR) Post #AcxYPDWulOjdM3ZcBc by gregors@mastodon.world
       2023-12-19T06:54:28Z
       
       0 likes, 0 repeats
       
        @simon you can use mixtral also via web UI over at https://labs.perplexity.ai
        
        It's not as cool as running it locally, but if you are thin on RAM, it's one of the options ;)
       
 (DIR) Post #AcyL8qQ1okyEnWAuP2 by jeremybmerrill@journa.host
       2023-12-19T16:00:04Z
       
       0 likes, 0 repeats
       
        @simon @xek from `$ llm plugins` I get `llm-llama-cpp` and `llm-gpt4all` (installed separately)
        
        xek's suggestion of `$ llm llama-cpp models` returns `{}` but fixes the problem, and I can now run `llm -m gguf ...` (well, I run out of RAM, but close enough and I should've anticipated that)
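        
        The workaround in sequence, assuming llm 0.12 with llm-llama-cpp 0.3 and a GGUF file already on disk (the model filename here is a placeholder):
        
            # Running this once registers the gguf model, even though it prints {}
            llm llama-cpp models
            # After that, the gguf model accepts a path to any local GGUF file
            llm -m gguf -o path ./mistral-7b-instruct-v0.1.Q4_K_M.gguf 'Say hello'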
       
 (DIR) Post #AcyMp2g0jYAqvnKtBg by simon@fedi.simonwillison.net
       2023-12-19T16:19:07Z
       
       0 likes, 0 repeats
       
        @jeremybmerrill @xek OK, I'll look into that. Needing to run "models" like that is weird and shouldn't be necessary
       
 (DIR) Post #AcyvE1TniZ4Wtrhxrc by endquote@mousectrl.org
       2023-12-19T22:44:29Z
       
       0 likes, 0 repeats
       
       @simon I get stuck on step 5, "Error: 'gguf' is not a known model"