Post AnWqJ3ltOdbeMErJqa by admin@mastodon.foxfam.club
(DIR) Post #AnWoFBh9nL8hrjfame by markigra@sciences.social
2024-10-29T18:52:05Z
0 likes, 0 repeats
Considering getting a Mac Mini with 64GB RAM to run local #llm -- seems borderline for running the big models, but the alternatives are 2x the cost and/or require building a PC with custom hardware. I've seen published numbers like those below, but I don't have a sense of the usability for things like RAG & embedding-based analysis of texts that can't be shared. https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference
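For context on the "embedding-based analysis of texts that can't be shared" part: the whole retrieval step can run against a local inference server so nothing leaves the machine. A minimal sketch, assuming a local Ollama instance (which comes up later in the thread) listening on its default port and an embedding model already pulled; the model name and document strings are illustrative, not prescriptive.

    # Sketch: local embedding + nearest-neighbour retrieval, no external API calls.
    # Assumes an Ollama server on localhost:11434 with an embedding model pulled.
    import math
    import requests

    OLLAMA_URL = "http://localhost:11434"
    EMBED_MODEL = "nomic-embed-text"  # illustrative; any locally pulled embedding model

    def embed(text: str) -> list[float]:
        # POST /api/embeddings returns {"embedding": [...]} for a single prompt
        resp = requests.post(f"{OLLAMA_URL}/api/embeddings",
                             json={"model": EMBED_MODEL, "prompt": text})
        resp.raise_for_status()
        return resp.json()["embedding"]

    def cosine(a: list[float], b: list[float]) -> float:
        dot = sum(x * y for x, y in zip(a, b))
        return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

    documents = ["First confidential text...", "Second confidential text..."]
    doc_vectors = [embed(d) for d in documents]

    query = "What does the corpus say about X?"
    q_vec = embed(query)
    best = max(range(len(documents)), key=lambda i: cosine(q_vec, doc_vectors[i]))
    print("Most relevant document:", documents[best])

The retrieved chunk would then be passed to a locally hosted chat model as prompt context; the usability question in the post is mostly about how large that chat model can be on a given machine.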
(DIR) Post #AnWoFD4weNlI9no6V6 by markigra@sciences.social
2024-10-29T19:00:29Z
0 likes, 0 repeats
Alternatively, I could use online platforms, but I have a hard time figuring out which ones are going to keep the content that I use truly private. Would appreciate feedback from folks who have been using #llm in #sociology or #digitalhumanities
(DIR) Post #AnWoFE3D2Jy3Ai0EQy by fpianz@mstdn.social
2024-10-29T20:14:07Z
0 likes, 0 repeats
@markigra Ollama works well even with 16gb RAM https://ollama.com/
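Since Ollama is suggested here, a minimal sketch of prompting a locally running Ollama server over its HTTP API (default port 11434); the model tag is illustrative, and any model pulled with "ollama pull" would do.

    # Sketch: one-shot generation against a local Ollama server.
    import requests

    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": "llama3.1:8b",  # illustrative tag; use whatever is pulled locally
            "prompt": "Summarize the main argument of this text: ...",
            "stream": False,
        },
    )
    resp.raise_for_status()
    print(resp.json()["response"])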
(DIR) Post #AnWoFEVvJYw6blRABs by fredy_pferdi@social.linux.pizza
2024-10-29T20:49:57Z
0 likes, 0 repeats
@fpianz @markigra yeah but only ~ 8b models
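The "only ~8b models" point follows from rough memory arithmetic: weight memory is roughly parameter count times bits per weight, before KV cache and OS overhead. A back-of-envelope sketch; the bits-per-weight figures are rough assumptions for common GGUF quantizations, not exact values.

    # Back-of-envelope weight-memory estimate (ignores KV cache and runtime overhead).
    BITS_PER_WEIGHT = {"q8_0": 8.5, "q4_k_m": 4.8, "q4_0": 4.5}  # rough assumptions

    def weight_gb(params_billion: float, quant: str) -> float:
        return params_billion * 1e9 * BITS_PER_WEIGHT[quant] / 8 / 1e9

    for size in (8, 22, 70):
        print(f"{size}B @ q4_k_m ≈ {weight_gb(size, 'q4_k_m'):.1f} GB of weights")
    # ~8B at 4-bit is about 5 GB and fits comfortably in 16 GB alongside the OS;
    # ~70B needs roughly 40+ GB, hence the interest in 64 GB unified memory.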
(DIR) Post #AnWoFFCok5FOknVQ36 by admin@mastodon.foxfam.club
2024-10-30T09:49:25Z
0 likes, 0 repeats
@fredy_pferdi @fpianz @markigra best just to get a GPU for LLM work IMO... It's faster by far
(DIR) Post #AnWoHPgClsNJokHmSW by fredy_pferdi@social.linux.pizza
2024-10-30T09:49:51Z
0 likes, 0 repeats
@admin @fpianz @markigra what options are there to run 70b models, especially on mobile devices?
(DIR) Post #AnWoj7xIGOqkxFUgc4 by admin@mastodon.foxfam.club
2024-10-30T09:54:53Z
0 likes, 0 repeats
@fredy_pferdi @fpianz @markigra 70b on mobile?! Is that even possible on any consumer device?!
(DIR) Post #AnWoq4ZRFKrQU4eDlg by fredy_pferdi@social.linux.pizza
2024-10-30T09:56:06Z
0 likes, 0 repeats
@admin @fpianz @markigra M series Macs, but technically also on the Ryzen chips, if you can allocate enough VRAM (that's the gist of this whole discussion)
(DIR) Post #AnWpC6lNm64Hegk328 by admin@mastodon.foxfam.club
2024-10-30T10:00:07Z
0 likes, 0 repeats
@fredy_pferdi @fpianz @markigra I'll have to test that... I have a Ryzen 9 but I'm only allocating 4GB to the iGPU (out of 64GB total)... Using my 4090 for the heavy lifting since it has 24GB VRAM... but that only lets me run 22b models... I might have to add another 64GB to run this experiment 😁
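One common workaround for the 24GB VRAM ceiling mentioned here is partial GPU offload: put as many transformer layers as fit on the card and keep the rest in system RAM (slower, but larger models load). A sketch using llama-cpp-python; the model path and layer count are placeholders to be tuned against actual VRAM usage.

    # Sketch: split a large GGUF model between a 24 GB GPU and system RAM.
    from llama_cpp import Llama

    llm = Llama(
        model_path="/models/llama-3.1-70b-instruct-q4_k_m.gguf",  # hypothetical path
        n_gpu_layers=40,   # raise until VRAM is nearly full, then back off
        n_ctx=4096,
    )
    out = llm("Briefly explain retrieval-augmented generation.", max_tokens=200)
    print(out["choices"][0]["text"])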
(DIR) Post #AnWpIsN7gCDf34cbWy by fredy_pferdi@social.linux.pizza
2024-10-30T10:01:18Z
0 likes, 0 repeats
@admin @fpianz @markigra what mainboard and configuration do you run, and which Ryzen 9?
(DIR) Post #AnWq7zqndLvF0nsYFs by fredy_pferdi@social.linux.pizza
2024-10-30T10:10:32Z
0 likes, 0 repeats
@admin @fpianz @markigra would also be interesting to know how much vram you are capable of allocating.
(DIR) Post #AnWqJ3ltOdbeMErJqa by admin@mastodon.foxfam.club
2024-10-30T10:12:35Z
0 likes, 0 repeats
@fredy_pferdi @fpianz @markigra Aorus B650 motherboard, Ryzen 9 7900X3D (SMT is on - 24 threads), Gigabyte RTX 4090 24GB, 1TB NVMe. Running Fedora 40 (I'll upgrade to 41 next week)