[HN Gopher] Show HN: Neum AI - Open-source large-scale RAG frame...
___________________________________________________________________
Show HN: Neum AI - Open-source large-scale RAG framework
Over the last couple of months we have been supporting developers in
building large-scale RAG pipelines that process millions of pieces
of data. We documented our approach in an HN post
(https://news.ycombinator.com/item?id=37824547) a couple of weeks
ago. Today, we are open-sourcing the framework we have developed.
The framework focuses on RAG data pipelines and provides scale,
reliability, and data-synchronization capabilities out of the box.
For those newer to RAG, it is a technique for providing context to
Large Language Models. It consists of grabbing pieces of
information (e.g. news articles, papers, descriptions) and
incorporating them into prompts to help contextualize the
responses. The harder part of the technique is finding the right
pieces of information to incorporate. The search for relevant
information is done with vector embeddings and vector databases.
The news articles, papers, etc. are transformed into vector
embeddings that represent the semantic meaning of the information.
These vector representations are organized into indexes where we
can quickly search for the pieces of information that most closely
resemble (from a semantic perspective) a given question or query.
For example, if I take news articles from this year, vectorize
them, and add them to an index, I can quickly search for pieces of
information about the US elections.
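Here is a minimal sketch of that loop, using sentence-transformers
and a brute-force cosine search (at scale, a vector database
replaces the brute-force part):

    # Embed documents, then find the closest ones to a query by
    # cosine similarity.
    import numpy as np
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("all-MiniLM-L6-v2")

    docs = [
        "Candidates announced their platforms for the US elections.",
        "A new paper improves protein folding predictions.",
        "The central bank held interest rates steady this quarter.",
    ]

    # Vectorize the corpus and the query into the same space.
    doc_vecs = model.encode(docs, normalize_embeddings=True)
    query_vec = model.encode("news about the US elections",
                             normalize_embeddings=True)

    # With normalized vectors, cosine similarity is a dot product.
    best = int(np.argmax(doc_vecs @ query_vec))

    # The best match is then placed into the LLM prompt as context.
    print(docs[best])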
To help achieve this, the Neum AI framework features the following.

Starting with built-in data connectors for common data sources,
embedding services, and vector stores, the framework provides the
modularity to build data pipelines to your specification. The
connectors support pre-processing capabilities to define loading,
chunking, and selecting strategies that optimize the content to be
embedded. This includes extracting the metadata that will be
associated with a given vector.

The generated pipelines support large-scale jobs through a
high-throughput distributed architecture. The connectors let you
parallelize tasks like downloading documents, processing them,
generating embeddings, and ingesting data into the vector DB.
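Sketched below with hypothetical stub functions (not our actual
API): each document flows through download -> chunk -> embed ->
ingest, and a worker pool parallelizes the stages:

    from concurrent.futures import ThreadPoolExecutor

    def download(url: str) -> str:
        return f"contents of {url}"      # stand-in for a real fetch

    def chunk(text: str, size: int = 500) -> list[str]:
        return [text[i:i + size] for i in range(0, len(text), size)]

    def embed(chunks: list[str]) -> list[list[float]]:
        return [[0.0] * 1536 for _ in chunks]  # stand-in embed call

    def ingest(vectors: list[list[float]]) -> None:
        pass                             # stand-in vector DB write

    def process_document(url: str) -> int:
        vectors = embed(chunk(download(url)))
        ingest(vectors)
        return len(vectors)

    urls = [f"https://example.com/doc/{i}" for i in range(1000)]

    # Threads parallelize the I/O-heavy stages; at larger scale the
    # same shape runs as distributed queue workers.
    with ThreadPoolExecutor(max_workers=32) as pool:
        total = sum(pool.map(process_document, urls))
    print(f"ingested {total} chunks")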
For data sources that change continuously, the framework supports
scheduling and data synchronization. This includes delta syncs
where only new data is pulled.
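A delta sync in miniature (hypothetical helpers): remember when the
last sync ran and only pull documents modified since then.

    from datetime import datetime, timezone

    last_synced = None        # persisted between runs in practice

    def fetch_updated_since(cutoff):  # stand-in source connector
        docs = [
            {"id": 1, "updated": datetime(2023, 11, 20,
                                          tzinfo=timezone.utc)},
            {"id": 2, "updated": datetime(2023, 11, 21,
                                          tzinfo=timezone.utc)},
        ]
        return [d for d in docs
                if cutoff is None or d["updated"] > cutoff]

    def sync() -> int:
        global last_synced
        changed = fetch_updated_since(last_synced)
        # re-chunk, re-embed, and upsert only the changed documents
        last_synced = datetime.now(timezone.utc)
        return len(changed)

    print(sync())  # first run pulls everything; later runs, deltas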
Once the data is in the vector database, the framework supports
querying it, including hybrid search using the metadata added
during pre-processing. As part of the querying process, the
framework provides capabilities to capture feedback on retrieved
data and to run evaluations against different pipeline
configurations.
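Hybrid search, sketched with illustrative data: metadata captured
during pre-processing filters the candidates, and vector similarity
ranks the survivors.

    import numpy as np

    index = [
        {"text": "election polling update", "year": 2023,
         "vec": np.array([0.9, 0.1])},
        {"text": "election history retrospective", "year": 1999,
         "vec": np.array([0.8, 0.2])},
        {"text": "quarterly earnings report", "year": 2023,
         "vec": np.array([0.1, 0.9])},
    ]

    def hybrid_search(query_vec, metadata_filter, top_k=1):
        candidates = [e for e in index
                      if all(e[k] == v
                             for k, v in metadata_filter.items())]
        candidates.sort(key=lambda e: float(e["vec"] @ query_vec),
                        reverse=True)
        return candidates[:top_k]

    # Only 2023 documents compete on similarity.
    print(hybrid_search(np.array([1.0, 0.0]), {"year": 2023}))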
Try it out, and if you are interested in chatting more about this,
shoot us an email: founders@tryneum.com
Author : picohen
Score : 66 points
Date : 2023-11-21 19:20 UTC (3 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| eigenvalue wrote:
| Cool. Do you do any of the relevance calculations directly, or is
| that all handled by Weaviate? If so, is there any way to
| influence that part of it, or is it something of a black box?
| picohen wrote:
| Relevance calculations are handled by the vector DB, but we try
| to improve relevance with the use of metadata. You will see that
| our components have "selectors" so that metadata can flow all
| the way to the vector database at the vector level and influence
| the results/scores retrieved at search time.
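|
| As a tiny sketch of the selector idea (hypothetical shape; see
| the repo for the real components):
|
|     record = {"title": "Q3 report", "author": "ana",
|               "body": "..."}
|     to_metadata = ["title", "author"]   # the "selector"
|     metadata = {k: record[k] for k in to_metadata}
|     # metadata is stored next to the vector and can filter or
|     # re-rank results at search time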
| westurner wrote:
| DAIR.AI > Prompt Engineering Guide > Techniques > Retrieval
| Augmented Generation (RAG)
| https://www.promptingguide.ai/techniques/rag
|
| https://github.com/topics/rag
| alchemist1e9 wrote:
| If someone is about to start their project using Haystack,
| would you suggest they instead look at NeumTry?
| picohen wrote:
| Well, of course I'm biased on the answer :). But to give a not-
| so-biased answer, I would first try to understand what the
| project is about and whether RAG is a priority in it. If the
| project is leveraging agents and LLMs without worrying too much
| about context or up-to-date data, then Haystack could be a good
| option. If the focus is to eventually use RAG, then our
| framework could help.
|
| Additionally, there might be a potential route where both are
| used, depending on the use case.
|
| Feel free to dm if you want to chat further on this!
| J_Shelby_J wrote:
| How does this improve upon retrieval compared to just using any
| vector DB and semantic search?
| ddematheu wrote:
| Co-founder here :)
|
| Today, it is mostly about convenience. We provide abstractions
| in the form of a pipeline that encompasses a data source, an
| embed definition, and a sink definition. This means that you
| don't have to think about embedding your query or about which
| class you used to add the data into the vector DB.
|
| In the future, we have some additional abstractions that we are
| adding that will bring more convenience. For example, we are
| working on a concept of pipeline collections so that you can
| search across multiple indexes but get unified results. We are
| also adding more automation around metadata: as part of the
| pipeline configuration we know what metadata was added, along
| with examples of it, so we can help translate queries into
| hybrid searches. Think of it as a self-query retriever from
| LangChain or LlamaIndex, but one that automatically has context
| on the data at hand (no need to provide attributes).
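|
| Roughly this shape (hard-coded here for illustration; in
| practice an LLM does the translation using the known metadata
| as context):
|
|     known_metadata = {"year": [2022, 2023],
|                       "source": ["news", "papers"]}
|
|     def to_hybrid_query(query: str) -> dict:
|         # an LLM call, given known_metadata, would produce this
|         return {"semantic": "US elections coverage",
|                 "filter": {"year": 2023, "source": "news"}}
|
|     print(to_hybrid_query("2023 news about the US elections"))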
|
| Are there any specific retrieval capabilities you are looking
| for?
| omarfarooq wrote:
| Have you guys connected with MemGPT?
| hrpnk wrote:
| Interesting to see that the semantic chunking in the tools
| library is a wrapper around GPT-4. It asks GPT for Python code
| and executes it:
| https://github.com/NeumTry/NeumAI/blob/main/neumai-tools/neu...
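|
| The pattern in miniature (illustrative only; the linked file is
| the real implementation):
|
|     from openai import OpenAI
|
|     client = OpenAI()
|     resp = client.chat.completions.create(
|         model="gpt-4",
|         messages=[{"role": "user", "content":
|                    "Write a Python function chunk(text) -> "
|                    "list[str] that splits text into semantically "
|                    "coherent chunks. Return only code."}],
|     )
|     scope = {}
|     # run the model-generated chunking code, then call it
|     exec(resp.choices[0].message.content, scope)
|     chunks = scope["chunk"]("some long document text ...")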
___________________________________________________________________
(page generated 2023-11-21 23:00 UTC)