[HN Gopher] Dot - A standalone open source app meant for easy use of local LLMs and RAG
___________________________________________________________________
Dot - A standalone open source app meant for easy use of local LLMs
and RAG
Author : irsagent
Score : 150 points
Date : 2024-04-07 00:41 UTC (22 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| NKosmatos wrote:
| Looks promising, especially if you can select just your docs and
| avoid interacting with Mistral. I'll give it a try to see how it
| performs. So far I've had mixed results with other similar
| solutions.
|
| https://news.ycombinator.com/item?id=39925316
|
| https://news.ycombinator.com/item?id=39896923
| reacharavindh wrote:
| I'm curious to try it out. There seem to be many options to
| upload a document and ask stuff about it.
|
| But the holy grail is an LLM that can successfully work on a
| large corpus of documents and data, like Slack history or huge
| wiki installations, and answer useful questions with proper
| references.
|
| I tried a few, but they don't really hit the mark. We need the
| usability of a simple search engine UI with private data sources.
| snowfield wrote:
| RAG is limited in that sense, since the amount of data you can
| send is still bounded by the number of tokens the LLM can
| process.
|
| But if all you want is a search engine, that's a bit easier.
|
| The problem is often that a huge wiki installation will have a
| lot of outdated data, which will still be an issue for an LLM.
| And if you had fixed the data, you might as well just search
| for the things you need, no?
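| Roughly, the token limit plays out like this: whatever the
| retriever finds has to be packed into a fixed budget, and
| everything past it is simply dropped. An illustrative sketch
| (the token counter is a crude stand-in for a real tokenizer,
| not anything Dot does):
|
|   def count_tokens(text):
|       # Crude approximation: ~4 characters per token.
|       return max(1, len(text) // 4)
|
|   def pack_context(ranked_chunks, budget=4096):
|       """Keep the highest-ranked chunks that fit; drop the rest."""
|       packed, used = [], 0
|       for chunk in ranked_chunks:  # assumed sorted by relevance
|           cost = count_tokens(chunk)
|           if used + cost > budget:
|               break
|           packed.append(chunk)
|           used += cost
|       return packed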
| rgrieselhuber wrote:
| This gets to the heart of it. Humans are good at keeping a
| working memory, as groups or as individuals, as lore.
| boredemployee wrote:
| I think it depends on what they want. A search is indeed an
| easy solution, but if they want a summary or a generated,
| straight answer, then things get a bit harder.
| HeavyStorm wrote:
| The LLM would have to be trained on the local data. Not
| impossible, but maybe too costly?
| snowfield wrote:
| It sounds nice in theory but your dataset is most likely
| too small for the LLM to "learn" anything.
| IanCal wrote:
| I'd like to play with giving it more turns. When answering a
| question, the more interesting ones require searching, reading,
| then searching again, reading more, etc.
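| A sketch of that kind of loop, with 'search' and 'ask_llm' as
| hypothetical stand-ins for a retriever and a local model (not
| anything Dot actually exposes):
|
|   def answer_iteratively(question, search, ask_llm, max_turns=3):
|       # search(query) -> list of passages; ask_llm(prompt) -> text
|       notes, query = [], question
|       for _ in range(max_turns):
|           notes.extend(search(query))
|           prompt = ("Question: " + question + "\n\n"
|                     "Notes so far:\n" + "\n".join(notes) + "\n\n"
|                     "Reply 'ANSWER: ...' if you can answer, or "
|                     "'SEARCH: <new query>' to look something up.")
|           reply = ask_llm(prompt)
|           if reply.startswith("ANSWER:"):
|               return reply[len("ANSWER:"):].strip()
|           query = reply[len("SEARCH:"):].strip()
|       # Out of turns: answer with whatever has been gathered.
|       return ask_llm(prompt + "\n\nAnswer as best you can.")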
| haizhung wrote:
| Differentiable search indices go in this direction:
| https://arxiv.org/abs/2202.06991
|
| The approach in the paper has rough edges, but the metrics are
| bonkers (double-digit percentage POINTS of improvement over dual
| encoders). This paper was written before the LLM craze, and I
| am not aware of any further developments in that area. I think
| this area might be ripe for some breakthrough innovation.
| verdverm wrote:
| https://www.kapa.ai/ seems to be the most popular SaaS for
| developer tools & docs. I'm seeing it all over the place.
| victor106 wrote:
| Used it; it's just glorified marketing, and among all the
| solutions we tried it ranked in the bottom three.
|
| The best at least for now is to just use OpenAI's custom gpt
| and with some clever (but not hard) it's quite good.
| CharlesW wrote:
| Assuming the word "prompting" got lost, can you share more
| about this?
| oulipo wrote:
| Have you tried https://markprompt.com/ ?
| PhilippGille wrote:
| Previous submission from 20 days ago:
| https://news.ycombinator.com/item?id=39734406
| eole666 wrote:
| Looks nice! But some information about the hardware requirements
| is often missing from this kind of project:
|
| - how much RAM is needed
|
| - what CPU you need for decent performance
|
| - can it run on a GPU? And if it does, how much VRAM do you
| need, and does it work only on Nvidia?
| prosunpraiser wrote:
| Not sure if this helps, but this is from tinkering with Mistral
| 7B on both my M1 Pro (10-core, 16 GB RAM) and WSL 2 w/ CUDA
| (Acer Predator 17, i7-7700HK, GTX 1070 Mobile, 16 GB DRAM, 8 GB
| VRAM):
|
| - Got 15-18 tokens/sec on WSL 2, slightly higher on the M1.
| Think of that as roughly 10-15 words per second. Both were
| using the GPU. Haven't tried CPU on the M1, but on WSL 2 it was
| low single digits - super slow for anything productive.
|
| - Used Mistral 7B via the llamafile cross-platform APE
| executable.
|
| - For local use I found that increasing the context size
| increased RAM usage a lot - but it's fast enough. I am
| considering adding another 16x1 or 8x2.
|
| Tinkering with building a RAG over some of my documents using
| the vector stores and chaining multiple calls now.
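| A minimal sketch of the model-call side of that, assuming the
| llamafile is running its built-in OpenAI-compatible server on
| the default http://localhost:8080 (prompt content and model
| name are placeholders):
|
|   import requests
|
|   def ask_local_model(prompt):
|       # POST to the llamafile server's chat completions endpoint.
|       resp = requests.post(
|           "http://localhost:8080/v1/chat/completions",
|           json={"model": "local",  # placeholder; one model is served
|                 "messages": [{"role": "user", "content": prompt}],
|                 "temperature": 0.2},
|           timeout=120)
|       resp.raise_for_status()
|       return resp.json()["choices"][0]["message"]["content"]
|
|   answer = ask_local_model("Summarise this chunk:\n(retrieved text)")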
| spxneo wrote:
| How does the 7B match up to Mixtral 8x7B?
|
| Coming from ChatGPT-4, it was a huge breath of fresh air to not
| have to deal with the Judeo-Christian-biased censorship.
|
| I think this is the ideal LocalLLaMA setup--uncensored,
| unbiased, unlimited (only by hardware) LLM+RAG.
| alexpinel wrote:
| Right now the minimum amount of RAM I would recommend is 16 GB.
| I think it can run with less memory, but that will require a
| few changes here and there (although they might reduce
| performance). I would also strongly recommend using a GPU over
| a CPU; in my experience it can make the LLM run twice as fast,
| if not more. Only Nvidia GPUs are supported for now, and the
| CUDA toolkit 12.2 is required to run Dot.
| logro wrote:
| I have a reasonably vast library of technical/scientific
| epubs/documents. Could I use this to import them and then quiz
| the books?
| alexpinel wrote:
| Yes! Of course, because the LLM is running locally it is not as
| advanced as bigger models like Claude or GPT, but you can
| definitely quiz the documents. From my experience it performs
| better with specific questions than with more ambiguous
| questions that require extensive understanding of the whole
| document.
| turnsout wrote:
| Curious about the choice of FAISS. It's a bit older now, and
| there are many more options for creating and selecting
| embeddings. Does FAISS still offer some advantages?
| simonw wrote:
| What options do you think work better?
| J_Shelby_J wrote:
| I'm trying to build an exhaustive list of realistic options.
| See the spreadsheet here:
|
| https://shelbyjenkins.github.io/blog/retrieval-is-all-you-
| ne...
| turnsout wrote:
| I don't have an opinion, just wondering why they didn't choose
| another option such as sentence embeddings, OpenAI embeddings,
| etc.
| simonw wrote:
| FAISS can be used with OpenAI embeddings (and any other
| embedding model).
|
| FAISS is technology for fast indexed similarity vector
| search - using it is an independent decision from which
| model you use to create those vectors.
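| To make that concrete: FAISS only ever sees float vectors, so
| the embedding model is interchangeable. A generic sketch with
| the raw faiss package (random placeholder vectors, not Dot's
| code):
|
|   import numpy as np
|   import faiss  # pip install faiss-cpu
|
|   # vectors: (n, d) float32 array from ANY embedding model
|   # (OpenAI, sentence-transformers, ...); random as a placeholder.
|   vectors = np.random.rand(100, 384).astype("float32")
|
|   index = faiss.IndexFlatL2(vectors.shape[1])  # exact L2 search
|   index.add(vectors)
|
|   # 5 nearest neighbours of the first vector: distances and ids.
|   distances, ids = index.search(vectors[:1], 5)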
| alexpinel wrote:
| Hi! I'm the guy who made Dot. I remember experimenting with a
| few different vector stores in the early stages of the project
| but decided to settle on FAISS. I mainly chose it because it
| made it easy to perform the whole embedding process locally,
| and also because it allows you to merge vector stores, which is
| what I use to load multiple types of documents at once. But I
| am definitely not an expert on the topic and would really
| appreciate suggestions on other alternatives that might work
| better! :)
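| For reference, the merging described above looks roughly like
| this with LangChain's FAISS wrapper (the embedding model and
| sample documents are placeholders, not necessarily what Dot
| uses):
|
|   from langchain_core.documents import Document
|   from langchain_community.embeddings import HuggingFaceEmbeddings
|   from langchain_community.vectorstores import FAISS
|
|   pdf_chunks = [Document(page_content="Revenue grew 12% in Q3.",
|                          metadata={"source": "report.pdf"})]
|   ppt_chunks = [Document(page_content="Roadmap: ship search in Q4.",
|                          metadata={"source": "deck.pptx"})]
|
|   embeddings = HuggingFaceEmbeddings(
|       model_name="sentence-transformers/all-MiniLM-L6-v2")
|   store_pdf = FAISS.from_documents(pdf_chunks, embeddings)
|   store_ppt = FAISS.from_documents(ppt_chunks, embeddings)
|
|   # Merge the PPT store into the PDF store so one index serves both.
|   store_pdf.merge_from(store_ppt)
|   hits = store_pdf.similarity_search("what does the roadmap say?", k=2)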
| bee_rider wrote:
| Imagine the marketing coup, when we're all saying "Machine
| learning? Eh, it's all just a bunch of Dot's products."
| pentagrama wrote:
| Not sure whether to install the Windows GPU or CPU app version [1].
|
| I have:
|
| Processor: Ryzen 5 3600
|
| Video card: GeForce GTX 1660 Ti 6 GB GDDR6 (Zotac)
|
| RAM: 16 GB DDR4 2666 MHz
|
| Any recommendations?
|
| [1] https://dotapp.uk/download.html
| alexpinel wrote:
| With those specs I would recommend the GPU version. CUDA
| acceleration really makes it faster, but keep in mind the CUDA
| toolkit 12.2 install will be some 3-4 GB.
| pentagrama wrote:
| Thank you!
| MasterYoda wrote:
| I have collected so much information in text files on my computer
| that it has become unmanageable to find anything. Now, with local
| AI solutions, I wondered if I could create a smart search engine
| that could provide answers from the information in my personal
| data.
|
| My questions are:
|
| 1 - Even if there is so much data that I can no longer find
| stuff, how much text data is needed to train an LLM to work OK?
| I'm not after an AI that can answer general questions, only one
| that can answer with what I already know exists in the data.
|
| 2 - I understand that the more structured the data are, the
| better, but how important is that when training an LLM? Does it
| mostly just figure things out anyway in a good way?
|
| 3 - Any recommendations on where to start, how to run an LLM
| locally, and how to train it on your own data?
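| A minimal sketch of the retrieval-first route that usually fits
| this use case (indexing and searching the files rather than
| training on them; sentence-transformers and faiss here are
| assumptions, any similar pair would do):
|
|   from pathlib import Path
|   import numpy as np
|   import faiss
|   from sentence_transformers import SentenceTransformer
|
|   model = SentenceTransformer("all-MiniLM-L6-v2")
|
|   # 1. Read and embed every .txt file under ~/notes (path is
|   #    illustrative).
|   paths = sorted(Path.home().glob("notes/**/*.txt"))
|   texts = [p.read_text(errors="ignore") for p in paths]
|   vectors = model.encode(texts, normalize_embeddings=True)
|
|   # 2. Build the index once; faiss.write_index can persist it.
|   index = faiss.IndexFlatIP(vectors.shape[1])  # inner product
|   index.add(np.asarray(vectors, dtype="float32"))
|
|   # 3. A natural-language query returns the most relevant files.
|   query = model.encode(["where did I note the router settings?"],
|                        normalize_embeddings=True)
|   _, ids = index.search(np.asarray(query, dtype="float32"), 5)
|   print([paths[i].name for i in ids[0]])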
| gavmor wrote:
| Thanks for sharing! I look forward to playing with this once I
| get off my phone. Took a look at the code, though, to see if
| you've implemented any of the tricks I've been too lazy to try.
|
| `text_splitter=RecursiveCharacterTextSplitter( chunk_size=8000,
| chunk_overlap=4000)`
|
| Does this simple numeric chunking approach actually work? Or are
| more sophisticated splitting rules going to make a difference?
|
| `vector_store_ppt=FAISS.from_documents(text_chunks_ppt,
| embeddings)`
|
| So we're embedding all 8000 chars behind a single vector index. I
| wonder if certain documents perform better at this fidelity than
| others. To say nothing of missed "prompt expansion"
| opportunities.
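| For anyone reading along, the two quoted lines fit together
| roughly like this; a self-contained sketch with the same
| LangChain pieces (the sample document and embedding model are
| placeholders, not Dot's actual configuration):
|
|   from langchain_text_splitters import RecursiveCharacterTextSplitter
|   from langchain_core.documents import Document
|   from langchain_community.embeddings import HuggingFaceEmbeddings
|   from langchain_community.vectorstores import FAISS
|
|   docs = [Document(page_content="(long presentation text here)",
|                    metadata={"source": "deck.pptx"})]
|
|   # 8000-char chunks with 4000-char overlap, as quoted above;
|   # each chunk (not the whole document) sits behind one vector.
|   text_splitter = RecursiveCharacterTextSplitter(chunk_size=8000,
|                                                  chunk_overlap=4000)
|   text_chunks_ppt = text_splitter.split_documents(docs)
|
|   embeddings = HuggingFaceEmbeddings(
|       model_name="sentence-transformers/all-MiniLM-L6-v2")
|   vector_store_ppt = FAISS.from_documents(text_chunks_ppt, embeddings)
|
|   # Retrieval then returns whole chunks to place in the prompt.
|   hits = vector_store_ppt.similarity_search("what was discussed?", k=3)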
___________________________________________________________________
(page generated 2024-04-07 23:01 UTC)