[HN Gopher] Show HN: IncarnaMind-Chat with your multiple docs us...
___________________________________________________________________
Show HN: IncarnaMind-Chat with your multiple docs using LLMs
Author : joeyxiong
Score : 25 points
Date : 2023-09-15 19:32 UTC (3 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| smcleod wrote:
 | Only supports private / closed LLMs like OpenAI and Claude. People
 | need to design for local LLMs first, then for-profit providers.
| joeyxiong wrote:
 | Yeah, this can definitely be used for local models, but the
 | problem is that most personal computers cannot host large LLMs,
 | and the cost is not cheaper than closed LLMs. But for
 | organisations, local LLMs are a better choice.
| gsuuon wrote:
| Those diagrams are nice! What did you use to make them? The
| sliding window mechanic is interesting but I'm not seeing how the
| first, second and third retrievers relate. Only the final medium
| chunks are used, but how are those arrived at?
| joeyxiong wrote:
| Hi, I created the diagrams using Figma.
|
 | The retrieval process consists of three stages. The first stage
 | retrieves small chunks from multiple documents and uses their
 | metadata to build a document filter. This filter is then
 | applied in the second stage to extract relevant large chunks,
 | essentially sections of documents, which further narrows the
 | search. Finally, using both the document and large-chunk
 | filters, the third stage retrieves the most pertinent
 | medium-sized chunks of information to be passed to the Language
 | Model, ensuring a focused and relevant response to your query.
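 |
 | Roughly, the flow looks like the sketch below; the Chunk layout
 | and the top_k() helper are only illustrative, not the actual
 | IncarnaMind implementation.
 |
 | ```
 | import math
 | from dataclasses import dataclass
 |
 | @dataclass
 | class Chunk:
 |     doc_id: str       # which document the chunk came from
 |     section_id: str   # which "large chunk" / section it belongs to
 |     size: str         # "small", "medium" or "large"
 |     text: str
 |     embedding: list
 |
 | def cosine(a, b):
 |     dot = sum(x * y for x, y in zip(a, b))
 |     na = math.sqrt(sum(x * x for x in a))
 |     nb = math.sqrt(sum(y * y for y in b))
 |     return dot / (na * nb + 1e-9)
 |
 | def top_k(chunks, query_emb, k, **filters):
 |     """Nearest neighbours among chunks whose metadata passes the filters."""
 |     pool = [c for c in chunks
 |             if all(getattr(c, key) in allowed
 |                    for key, allowed in filters.items())]
 |     pool.sort(key=lambda c: cosine(c.embedding, query_emb), reverse=True)
 |     return pool[:k]
 |
 | def retrieve(chunks, query_emb):
 |     # Stage 1: small chunks across all docs decide which documents matter.
 |     small = top_k([c for c in chunks if c.size == "small"], query_emb, 20)
 |     doc_filter = {c.doc_id for c in small}
 |
 |     # Stage 2: large chunks (sections) within those documents.
 |     large = top_k([c for c in chunks if c.size == "large"], query_emb, 5,
 |                   doc_id=doc_filter)
 |     section_filter = {c.section_id for c in large}
 |
 |     # Stage 3: medium chunks inside the surviving sections go to the LLM.
 |     return top_k([c for c in chunks if c.size == "medium"], query_emb, 8,
 |                  doc_id=doc_filter, section_id=section_filter)
 | ```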
| skeptrune wrote:
| Can we talk about how dynamic chunking works by any chance? That
| is the most interesting piece imo.
|
| We have a similar thing (w/ UIs for search/chat) at
| https://github.com/arguflow/arguflow .
|
| - nick@arguflow.gg
| all2 wrote:
| A team where I work recently rolled out a doc-answer LLM and
| context was an issue we ran into. Retrieved doc chunks didn't
| have nearly enough context to answer some of the broader
| questions well.
|
| Another issue I've run into with doc-answer LLMs is that they
| don't handle synonyms well. If I don't know the terminology for
| the tool, say llama-index [0], I can't ask around the concept to
 | see if something _like_ what I'm describing exists.
|
 | Part of me thinks a LangChain-style chain with the LLM in the
 | loop might be useful.
|
| Something like
|
| 1. User makes vague query "hey, llama-index, how do I create a
| moving chunk answer thing with llama-index?"
|
 | 2. Initial context comes back to the LLM, and the LLM determines
 | there is no straightforward answer to the question.
|
| 2a. The LLM might ask followup questions "when you say X, what do
| you mean?" to clarify terms it doesn't have ready answers for.
|
| 2b. The LLM says "hm, let me think about that. I'll email you
| when I have a good answer."
|
| 2c. The LLM reads the docs and relevant materials and attempts to
| solve the problem.
|
| 3. Email the user with a potential answer to the question.
|
| 4. Stashes the solution text in the docs if the user OKs the
| plan. Updates an embedding table to include words/terms used that
| the docs didn't contain.
|
| This last step is the most important. Some kind of method to
| capture common questions and answers, synonyms, etc. would ensure
| that the model has access to (potentially) increasingly robust
| information.
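 |
 | A rough sketch of that loop, with the LLM call, mailer, and doc
 | index passed in as plain callables/objects (every name below is
 | made up for illustration):
 |
 | ```
 | def handle_query(query, llm, doc_index, mailer, user_email):
 |     """llm: callable(str) -> str; doc_index: object with .search()/.add();
 |     mailer: callable(to, body). All of these are placeholders."""
 |     context = doc_index.search(query)
 |
 |     # Step 2: let the model decide whether the question is answerable as-is.
 |     verdict = llm(
 |         f"Context:\n{context}\n\nQuestion: {query}\n"
 |         "If any term in the question is not grounded in the context, "
 |         "reply exactly 'CLARIFY: <follow-up question>'. "
 |         "Otherwise answer the question."
 |     )
 |
 |     # Step 2a: surface follow-up questions instead of guessing.
 |     if verdict.startswith("CLARIFY:"):
 |         return verdict
 |
 |     # Steps 2b/2c/3: email a candidate answer instead of blocking the chat.
 |     mailer(user_email, verdict)
 |     return verdict
 |
 | def on_user_approval(query, answer, context, doc_index, synonym_table):
 |     # Step 4: stash the approved answer and record the user's wording so
 |     # future queries in the user's vocabulary still hit the right docs.
 |     doc_index.add(answer, metadata={"source": "qa-log", "question": query})
 |     for user_term, doc_term in propose_synonyms(query, context):
 |         synonym_table.setdefault(user_term, doc_term)
 |
 | def propose_synonyms(query, context):
 |     """Placeholder: another LLM call could align the user's terms with
 |     terminology that actually appears in the docs."""
 |     return []
 | ```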
| sergiotapia wrote:
 | You can have a pre-qualification step that sorts each question
 | into one of several highly specific categories. Each category
 | has highly tailored context that allows much better answers.
 |
 | Of course, you can only generate these categories once you see
 | what kinds of questions your users ask, but this means your
 | product can continuously improve.
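 |
 | As a sketch, with invented category names and an llm() callable
 | standing in for whatever chat API you use:
 |
 | ```
 | CATEGORIES = {
 |     "billing": "Context snippets about invoices, plans, and refunds ...",
 |     "api": "Context snippets about endpoints, auth, and rate limits ...",
 |     "other": "General product documentation ...",
 | }
 |
 | def answer_with_routing(question, llm):
 |     """llm: callable(str) -> str, e.g. a thin wrapper around a chat API."""
 |     # 1. Cheap classification call with a constrained output.
 |     label = llm(
 |         "Classify the question into exactly one of: "
 |         + ", ".join(CATEGORIES)
 |         + f"\n\nQuestion: {question}\nLabel:"
 |     ).strip().lower()
 |     context = CATEGORIES.get(label, CATEGORIES["other"])
 |
 |     # 2. Answer with the tailored context for that category.
 |     return llm(f"Context:\n{context}\n\nQuestion: {question}\nAnswer:")
 | ```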
| pstorm wrote:
 | I'm impressed by your chunking and retrieval strategies. I think
 | this aspect is often handled too simplistically.
|
| One aspect I don't quite understand is why you filter by the
| sliding window chunks vs just using the medium chunks? If I
| understand it correctly, you find the large chunks that contain
| the matched small chunks from the first retrieval. Then in the
| third retrieval, you are getting the medium chunks that comprise
| the large chunks? What extra value does that provide?
| joeyxiong wrote:
| Thank you for your comment. The sliding window approach allows
| me to dynamically identify relevant "large chunks," which can
| be thought of as sections in a document. Often, your questions
| may pertain to multiple such sections. Using only medium chunks
| for retrieval could result in sparse or fragmented information.
|
| The third retrieval focuses on "medium chunks" within these
| identified large chunks. This ensures that only the most
| relevant information is passed to the Language Model, enhancing
| both time efficiency and focus. For example, if you're asking
| for a paper summary, I can zero in on medium chunks within the
| Abstract, Introduction, and Conclusion sections, eliminating
| noise from other irrelevant sections. Additionally, this
| strategy helps manage token limitations, like GPT-3.5's
 | 4000-token cap, by selectively retrieving information.
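 |
 | A minimal sketch of that last filtering step, assuming the
 | medium chunks carry hypothetical 'section_id' and 'tokens'
 | fields and similarity scores are already computed:
 |
 | ```
 | def select_medium_chunks(medium_chunks, section_filter, query_scores,
 |                          budget_tokens=3000):
 |     """medium_chunks: dicts with 'id', 'section_id', 'text', 'tokens';
 |     query_scores: dict of chunk id -> similarity to the query."""
 |     candidates = [c for c in medium_chunks
 |                   if c["section_id"] in section_filter]
 |     candidates.sort(key=lambda c: query_scores[c["id"]], reverse=True)
 |
 |     picked, used = [], 0
 |     for c in candidates:
 |         if used + c["tokens"] > budget_tokens:
 |             continue  # skip chunks that would overflow the context window
 |         picked.append(c)
 |         used += c["tokens"]
 |     return picked
 | ```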
| pstorm wrote:
| Ah I see! So, the large/sliding window chunks act as a pre-
| filter for the medium chunks. That makes a lot of sense. I
 | appreciate the response.
| dilap wrote:
| I feel like an LLM trained on Slack could be something like the
| perfect replacement for trying to maintain docs.
| SamBam wrote:
 | Testing it out. I'm getting an error after I added my PDFs to the
 | data directory and then ran:
 |
 |     % python docs2db.py
 |     Processing files: 6%
 |     Traceback (most recent call last):
 |       File "[...]/IncarnaMind/docs2db.py", line 179, in process_metadata
 |         file_name = doc[0].metadata["source"].split("/")[-1].split(".")[0]
 |     IndexError: list index out of range
| joeyxiong wrote:
 | Hi, I've pushed a new commit to the main branch. Could you
 | please test it out? If it still throws this error, check whether
 | your doc actually has the relevant metadata by adding
 |
 | ```
 | for d in doc:
 |     print("metadata:", d.metadata)
 | ```
 |
 | right before the line
 |
 | ```
 | file_name = doc[0].metadata["source"].split("/")[-1].split(".")[0]
 | ```
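 |
 | For reference, one defensive variant of that line (a sketch, not
 | necessarily what the new commit does) would skip files whose
 | loader returned nothing:
 |
 | ```
 | import os
 |
 | def safe_file_name(doc, file_path):
 |     """doc: list of loaded pages; file_path: hypothetical fallback path."""
 |     if not doc:
 |         print(f"Skipping {file_path!r}: loader returned no pages")
 |         return None
 |     source = doc[0].metadata.get("source", file_path)
 |     return os.path.splitext(os.path.basename(source))[0]
 | ```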
| SamBam wrote:
| This looks awesome, and really useful.
|
 | A few weeks ago I asked on Hacker News "I'm in the middle of a
| graduate degree and am reading lots of papers, how could I get
| ChatGPT to use my whole library as context when answering
| questions?"
|
 | And I was told, basically, "It's really easy! First you just
 | extract all of the text from the arxiv PDFs, parse to
 | separate content from style, then store that in a DuckDB
 | database with zstd compression, then just use some encoder model
 | to process all of these texts into a Qdrant database. Then use
 | Vicuna or Guanaco 30b GPTQ, with langchain, and....."
|
| I was like, ok... guess I won't be asking ChatGPT where I can
| find which paper talked about which thing after all.
| jarvist wrote:
| https://github.com/whitead/paper-qa
|
| >This is a minimal package for doing question and answering
| from PDFs or text files (which can be raw HTML). It strives to
| give very good answers, with no hallucinations, by grounding
| responses with in-text citations.
| skeptrune wrote:
| I don't know why you need the "ask chatGPT" piece. Why not just
| semantic search on the documents?
|
| What is the value add of generative output?
| all2 wrote:
 | I think the value is "Hey, I remember a paper talking about X
 | topic with Y sentiment; it also mentioned data from <vague source>.
| Which paper was that?"
|
| If you're dealing with 100s of papers, then having a front
| end that can deal with vague queries would be a huge benefit.
| skeptrune wrote:
| You could just write "X topic with Y sentiment similar to
| foo/<vague-source>" into a search bar.
|
 | Then, plain old vector distance on your data would find the
 | relevant chunks. No need for generative AI.
|
| citation to prove this works: chat.arguflow.ai /
| search.arguflow.ai
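 |
 | A minimal sketch of that "plain old vector distance" search,
 | with a hypothetical embed() callable standing in for any
 | encoder model:
 |
 | ```
 | import math
 |
 | def cosine(a, b):
 |     dot = sum(x * y for x, y in zip(a, b))
 |     na = math.sqrt(sum(x * x for x in a))
 |     nb = math.sqrt(sum(y * y for y in b))
 |     return dot / (na * nb + 1e-9)
 |
 | def search(query, chunks, embed, top_k=5):
 |     """chunks: list of (text, embedding) pairs; embed: callable(str) -> vector."""
 |     q = embed(query)
 |     ranked = sorted(chunks, key=lambda c: cosine(c[1], q), reverse=True)
 |     return [text for text, _ in ranked[:top_k]]
 | ```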
___________________________________________________________________
(page generated 2023-09-15 23:00 UTC)