Post AZSCDtp1dSPKXu7lEu by Volker@fosstodon.org
(DIR) Post #AZR1HA45yCCQEO0gbY by simon@fedi.simonwillison.net
2023-09-04T20:35:49Z
0 likes, 0 repeats
Big new release of my LLM CLI tool and Python library for working with large language models (Llama 2, GPT-4, etc.). LLM 0.9 adds support for embedding models, installed via plugins. If you aren't familiar with embeddings, I have a very detailed explanation of what they can do and how you can use them here: https://simonwillison.net/2023/Sep/4/llm-embeddings/
(DIR) Post #AZR3glc9hPUFpuHLt2 by simon@fedi.simonwillison.net
2023-09-04T21:03:03Z
0 likes, 0 repeats
Here's a fun example of something you can now do with LLM: search for every README.md file in your home directory and store embeddings for all of them in a collection called "readmes":
```
llm embed-multi readmes \
  --model sentence-transformers/all-MiniLM-L6-v2 \
  --files ~/ '**/README.md'
```
Then run a similarity search for "sqlite" like this:
```
llm similar readmes -c sqlite
```
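For readers new to embeddings, here is a rough sketch of what a similarity search over stored vectors can look like. It assumes cosine similarity as the ranking measure, which is a common choice; the function names, the document IDs, and the three-dimensional toy vectors are all invented for illustration and are not taken from the LLM tool itself.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def most_similar(query, collection, n=3):
    """Return up to n (id, score) pairs, best match first."""
    scored = [
        (doc_id, cosine_similarity(query, vector))
        for doc_id, vector in collection.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]


# Hypothetical collection: real embeddings have hundreds of dimensions.
toy_collection = {
    "project-a/README.md": [0.9, 0.1, 0.0],
    "project-b/README.md": [0.7, 0.3, 0.1],
    "project-c/README.md": [0.0, 0.2, 0.9],
}
toy_query = [1.0, 0.0, 0.0]  # stand-in for the embedding of "sqlite"

results = most_similar(toy_query, toy_collection, n=2)
for doc_id, score in results:
    print(f"{score:.3f}  {doc_id}")
```

In a real run, an embedding model would produce the query vector from the search text, and the stored vectors would come from the collection built in the first command.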
(DIR) Post #AZR43yAFCnBnQ3HsfI by xek@hachyderm.io
2023-09-04T21:07:11Z
0 likes, 0 repeats
@simon Now to add "a one-sentence summary of what it does" next to the search results, for those of us who can't keep track of all the arbitrary codenames for our myriad half-completed projects. 😅
(DIR) Post #AZR5Yu3kcHftcUZ3Vg by simon@fedi.simonwillison.net
2023-09-04T21:24:00Z
0 likes, 0 repeats
Also new today: the llm-cluster plugin, which derives clusters of documents from a collection of embeddings. A fun trick with that is that you can ask it to pass the items in each cluster through an LLM in order to generate titles for each cluster! https://github.com/simonw/llm-cluster
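To give a feel for what "deriving clusters from embeddings" can mean, here is a minimal k-means sketch in plain Python. The thread does not say which algorithm llm-cluster uses, so k-means is an assumption picked for illustration; the `kmeans` helper, the document IDs, and the two-dimensional vectors are all hypothetical. The title-generation step is not shown.

```python
import math


def kmeans(vectors, k, iterations=10):
    """Lloyd's algorithm; returns one cluster index per input vector."""
    # Naive initialisation (an assumption): use the first k vectors.
    centroids = [list(v) for v in vectors[:k]]
    assignments = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: attach each vector to its nearest centroid.
        for i, vector in enumerate(vectors):
            assignments[i] = min(
                range(k), key=lambda c: math.dist(vector, centroids[c])
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, a in zip(vectors, assignments) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignments


# Hypothetical embeddings, two obvious groups.
doc_ids = ["a", "b", "c", "d", "e"]
toy_vectors = [[0, 0], [10, 10], [0, 1], [10, 11], [1, 0]]

labels = kmeans(toy_vectors, k=2)

# Group document IDs by cluster index.
clusters = {}
for doc_id, label in zip(doc_ids, labels):
    clusters.setdefault(label, []).append(doc_id)
print(clusters)
```

Each group of IDs could then be handed to a language model along with a prompt asking for a short title, which is the trick described in the post.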
(DIR) Post #AZR5unKKKSEv6rkxFI by simon@fedi.simonwillison.net
2023-09-04T21:24:48Z
0 likes, 0 repeats
@band Oops, that should be "llm embed-multi" - will update, thanks!
(DIR) Post #AZSCDtp1dSPKXu7lEu by Volker@fosstodon.org
2023-09-05T10:13:23Z
0 likes, 0 repeats
@simon Can you further embed the embeddings returned from the LLMs using UMAP to get a graphical overview? (More of a rhetorical question, I think I know the answer... it would be nice to see in a tool.)
(DIR) Post #AZSgUVWsCp1OkcT9NI by simon@fedi.simonwillison.net
2023-09-05T15:52:27Z
0 likes, 0 repeats
I just released a new version of my Symbex tool, which finds functions and classes in a Python codebase. It can now export the code it finds in a format that can then be piped to "llm embed-multi" to embed those functions. https://github.com/simonw/symbex/releases/tag/1.4
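As a rough illustration of the idea (not Symbex's actual implementation or output format), the sketch below uses Python's standard `ast` module to pull top-level functions and classes out of source text and emit one JSON record per symbol. The `extract_symbols` helper and the `id`/`content` record shape are assumptions made for this example.

```python
import ast
import json


def extract_symbols(source, filename="example.py"):
    """Return one record per top-level function or class in the source."""
    tree = ast.parse(source)
    records = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            records.append(
                {
                    # Hypothetical ID scheme: file name plus starting line.
                    "id": f"{filename}:{node.lineno}",
                    # Exact source text of the definition.
                    "content": ast.get_source_segment(source, node),
                }
            )
    return records


sample_source = '''def add(a, b):
    return a + b

class Greeter:
    def hello(self):
        return "hi"
'''

records = extract_symbols(sample_source)

# Newline-delimited JSON, one symbol per line.
json_lines = "\n".join(json.dumps(record) for record in records)
print(json_lines)
```

Output shaped like this gives each function or class its own identifier and text, which is the kind of input an embedding step needs to store one vector per symbol.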
(DIR) Post #AZSm6UJ1haQtjQvEeW by impersonal@mastodon.social
2023-09-05T16:55:17Z
0 likes, 0 repeats
@simon Please excuse my ignorance, but what is the use case? (Genuine q, Python beginner)
(DIR) Post #AZSobx6CtoVHuy3bOq by simon@fedi.simonwillison.net
2023-09-05T17:23:35Z
0 likes, 0 repeats
@impersonal totally reasonable question! I'm still figuring out what you can do with embeddings of Python functions myself. My hunch is that they'll get interesting when combined with other tricks - like finding the example code most relevant to a question posed to an LLM
(DIR) Post #AZTKy1PV3aWFWUZ6gq by simon@fedi.simonwillison.net
2023-09-05T23:26:04Z
0 likes, 0 repeats
Ran "llm embed-multi" against a CSV file of all of my tweets, now I'm having fun running vibes-based searches against them