[HN Gopher] Show HN: I've built a locally running perplexity clone
___________________________________________________________________
Show HN: I've built a locally running perplexity clone
The video demo runs a 7B model on a normal gaming GPU. I think it
already works quite well, given the limited hardware. :)
Author : nilsherzig
Score : 18 points
Date : 2024-04-03 21:27 UTC (1 hour ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| keyle wrote:
| Impressive, I don't think I've seen a local model call upon
| specialised modules yet (although I can't keep up with everything
| going on).
|
| I too use a local 7B OpenHermes model and it's really good.
| nilsherzig wrote:
| Thanks :). It's just a lot of prompting and string parsing.
| There are models like "Hermes-2-Pro-Mistral" (the one from the
| video) which are trained to work with function signatures and to
| output structured text. But in the end it's just strings in ->
| strings out, haha. It's fun (and sometimes frustrating) to use
| LLMs for flow control (conditions, loops...) inside your
| programs.
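A minimal sketch of that "strings in -> strings out" flow-control
idea, assuming a local Ollama-style HTTP endpoint on the default
port; the call_llm() and pick_tool_call() helpers, the model tag,
the web_search signature in the prompt and the JSON reply format
are all illustrative, not the project's actual implementation:

    import json
    import re
    import urllib.request

    def call_llm(prompt: str) -> str:
        # Assumption: a local Ollama server; any llama.cpp or
        # OpenAI-compatible endpoint would work the same way.
        body = json.dumps({
            "model": "hermes-2-pro-mistral",  # whatever tag you pulled
            "prompt": prompt,
            "stream": False,
        }).encode()
        req = urllib.request.Request(
            "http://localhost:11434/api/generate",
            data=body,
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            return json.loads(resp.read())["response"]

    def pick_tool_call(question: str) -> dict:
        # Describe the available "function" in plain text and ask the
        # model to answer with a JSON object naming the call it wants.
        prompt = (
            "You can call this function:\n"
            "  web_search(query: string) -> list of result snippets\n"
            "Reply with one JSON object like "
            '{"function": "web_search", "query": "..."} '
            "and nothing else.\n\n"
            f"User question: {question}\n"
        )
        raw = call_llm(prompt)
        # String parsing: pull the first JSON object out of the reply,
        # since small models often wrap it in extra prose.
        match = re.search(r"\{.*\}", raw, re.DOTALL)
        return json.loads(match.group(0)) if match else {"function": None}

    if __name__ == "__main__":
        print(pick_tool_call("Who won the 2022 World Cup?"))

The returned dict can then drive ordinary if/else branches or loops,
which is the "LLMs for flow control" part of the comment above.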
| nilsherzig wrote:
| Happy to answer any questions and open to suggestions :)
|
| It's basically an LLM with access to a search engine and the
| ability to query a vector DB.
|
| The top n results from each search query (initiated by the LLM)
| will be scraped, split into small chunks and saved to the vector
| DB. The LLM can then query this vector DB to get the relevant
| chunks. This obviously isn't as comprehensive as having a 128k-
| context LLM just summarize everything, but at least on local
| hardware it's a lot faster and far more resource-friendly. The
| demo on GitHub runs on a normal consumer GPU (an AMD RX 6700 XT)
| with 12 GB of VRAM.
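A rough, self-contained sketch of the scrape -> chunk -> store ->
retrieve loop described above. The chunker, the bag-of-words
"embedding" and the in-memory TinyVectorStore are stand-ins for a
real text splitter, embedding model and vector DB; only the overall
flow follows the comment, not the project's actual code:

    import math
    import re
    from collections import Counter

    def chunk(text: str, size: int = 300, overlap: int = 50) -> list[str]:
        # Split scraped page text into small overlapping word chunks,
        # roughly what a text splitter does in a RAG pipeline.
        words = text.split()
        step = size - overlap
        return [" ".join(words[i:i + size])
                for i in range(0, max(len(words) - overlap, 1), step)]

    def embed(text: str) -> Counter:
        # Stand-in embedding: a bag-of-words vector. A real setup would
        # use an embedding model and a proper vector DB instead.
        return Counter(re.findall(r"[a-z0-9]+", text.lower()))

    def cosine(a: Counter, b: Counter) -> float:
        dot = sum(a[t] * b[t] for t in a)
        na = math.sqrt(sum(v * v for v in a.values()))
        nb = math.sqrt(sum(v * v for v in b.values()))
        return dot / (na * nb) if na and nb else 0.0

    class TinyVectorStore:
        def __init__(self) -> None:
            self.items: list[tuple[Counter, str]] = []

        def add(self, text: str) -> None:
            self.items.append((embed(text), text))

        def query(self, question: str, k: int = 3) -> list[str]:
            q = embed(question)
            ranked = sorted(self.items,
                            key=lambda item: cosine(q, item[0]),
                            reverse=True)
            return [text for _, text in ranked[:k]]

    # Usage: scrape the top-n search results (not shown), chunk each
    # page, store the chunks, then hand the best-matching chunks back
    # to the LLM as context for its final answer.
    store = TinyVectorStore()
    for page_text in ["...scraped page one...", "...scraped page two..."]:
        for piece in chunk(page_text):
            store.add(piece)
    print(store.query("what does the page say about GPU memory?"))

Keeping only the top-k matching chunks instead of whole pages is
what lets a small-context 7B model answer without the 128k-context
summarization mentioned above.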
___________________________________________________________________
(page generated 2024-04-03 23:00 UTC)