[HN Gopher] Show HN: Neum AI - Open-source large-scale RAG frame...
___________________________________________________________________
Show HN: Neum AI - Open-source large-scale RAG framework
Over the last couple of months we have been supporting developers in
building large-scale RAG pipelines that process millions of pieces
of data. We documented our approach in an HN post
(https://news.ycombinator.com/item?id=37824547) a couple of weeks
ago. Today, we are open-sourcing the framework we have developed.
The framework focuses on RAG data pipelines and provides scale,
reliability, and data-synchronization capabilities out of the box.
For those newer to RAG, it is a technique for providing context to
Large Language Models. It consists of grabbing pieces of
information (e.g. news articles, papers, descriptions) and
incorporating them into prompts to help contextualize the
responses. The harder part of the technique is finding the right
pieces of information to incorporate. The search for relevant
information is done with vector embeddings and vector databases.
The news articles, papers, etc. are transformed into vector
embeddings that represent the semantic meaning of the information.
These vector representations are organized into indexes where we
can quickly search for the pieces of information that most closely
resemble (from a semantic perspective) a given question or query.
For example, if I take news articles from this year, vectorize
them, and add them to an index, I can quickly search for pieces of
information about the US elections.
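Here is a minimal sketch of that loop, using sentence-transformers
and a brute-force cosine search (at scale, a vector database
replaces the brute-force part):

    # Embed documents, then find the closest ones to a query by
    # cosine similarity.
    import numpy as np
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("all-MiniLM-L6-v2")

    docs = [
        "Candidates announced their platforms for the US elections.",
        "A new paper improves protein folding predictions.",
        "The central bank held interest rates steady this quarter.",
    ]

    # Vectorize the corpus and the query into the same space.
    doc_vecs = model.encode(docs, normalize_embeddings=True)
    query_vec = model.encode("news about the US elections",
                             normalize_embeddings=True)

    # With normalized vectors, cosine similarity is a dot product.
    best = int(np.argmax(doc_vecs @ query_vec))

    # The best match is then placed into the LLM prompt as context.
    print(docs[best])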
To help achieve this, the Neum AI framework features the following.

Starting with built-in data connectors for common data sources,
embedding services, and vector stores, the framework provides the
modularity to build data pipelines to your specification. The
connectors support pre-processing capabilities to define loading,
chunking, and selecting strategies that optimize the content to be
embedded. This includes extracting the metadata that will be
associated with a given vector.

The generated pipelines support large-scale jobs through a
high-throughput distributed architecture. The connectors let you
parallelize tasks like downloading documents, processing them,
generating embeddings, and ingesting data into the vector DB.
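Sketched below with hypothetical stub functions (not our actual
API): each document flows through download -> chunk -> embed ->
ingest, and a worker pool parallelizes the stages:

    from concurrent.futures import ThreadPoolExecutor

    def download(url: str) -> str:
        return f"contents of {url}"      # stand-in for a real fetch

    def chunk(text: str, size: int = 500) -> list[str]:
        return [text[i:i + size] for i in range(0, len(text), size)]

    def embed(chunks: list[str]) -> list[list[float]]:
        return [[0.0] * 1536 for _ in chunks]  # stand-in embed call

    def ingest(vectors: list[list[float]]) -> None:
        pass                             # stand-in vector DB write

    def process_document(url: str) -> int:
        vectors = embed(chunk(download(url)))
        ingest(vectors)
        return len(vectors)

    urls = [f"https://example.com/doc/{i}" for i in range(1000)]

    # Threads parallelize the I/O-heavy stages; at larger scale the
    # same shape runs as distributed queue workers.
    with ThreadPoolExecutor(max_workers=32) as pool:
        total = sum(pool.map(process_document, urls))
    print(f"ingested {total} chunks")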
For data sources that change continuously, the framework supports
scheduling and data synchronization. This includes delta syncs
where only new data is pulled.
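A delta sync in miniature (hypothetical helpers): remember when the
last sync ran and only pull documents modified since then.

    from datetime import datetime, timezone

    last_synced = None        # persisted between runs in practice

    def fetch_updated_since(cutoff):  # stand-in source connector
        docs = [
            {"id": 1, "updated": datetime(2023, 11, 20,
                                          tzinfo=timezone.utc)},
            {"id": 2, "updated": datetime(2023, 11, 21,
                                          tzinfo=timezone.utc)},
        ]
        return [d for d in docs
                if cutoff is None or d["updated"] > cutoff]

    def sync() -> int:
        global last_synced
        changed = fetch_updated_since(last_synced)
        # re-chunk, re-embed, and upsert only the changed documents
        last_synced = datetime.now(timezone.utc)
        return len(changed)

    print(sync())  # first run pulls everything; later runs, deltas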
Once the data is in the vector database, the framework supports
querying it, including hybrid search using the metadata added
during pre-processing. As part of the querying process, the
framework provides capabilities to capture feedback on retrieved
data and to run evaluations against different pipeline
configurations.
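Hybrid search, sketched with illustrative data: metadata captured
during pre-processing filters the candidates, and vector similarity
ranks the survivors.

    import numpy as np

    index = [
        {"text": "election polling update", "year": 2023,
         "vec": np.array([0.9, 0.1])},
        {"text": "election history retrospective", "year": 1999,
         "vec": np.array([0.8, 0.2])},
        {"text": "quarterly earnings report", "year": 2023,
         "vec": np.array([0.1, 0.9])},
    ]

    def hybrid_search(query_vec, metadata_filter, top_k=1):
        candidates = [e for e in index
                      if all(e[k] == v
                             for k, v in metadata_filter.items())]
        candidates.sort(key=lambda e: float(e["vec"] @ query_vec),
                        reverse=True)
        return candidates[:top_k]

    # Only 2023 documents compete on similarity.
    print(hybrid_search(np.array([1.0, 0.0]), {"year": 2023}))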
Try it out, and if you are interested in chatting more about this,
shoot us an email: founders@tryneum.com
Author : picohen
Score : 66 points
Date : 2023-11-21 19:20 UTC (3 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| eigenvalue wrote:
| Cool. Do you do any of the relevance calculations directly, or is
| that all handled by Weaviate? If so, is there any way to
| influence that part of it, or is it something of a black box?
| picohen wrote:
| Relevance calculations are handled by the vector DB, but we try
| to improve relevance with the use of metadata. You will see that
| our components have "selectors" so that metadata can flow all
| the way to the vector database at the vector level and influence
| the results/scores retrieved at search time.
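|
| As a tiny sketch of the selector idea (hypothetical shape; see
| the repo for the real components):
|
|     record = {"title": "Q3 report", "author": "ana",
|               "body": "..."}
|     to_metadata = ["title", "author"]   # the "selector"
|     metadata = {k: record[k] for k in to_metadata}
|     # metadata is stored next to the vector and can filter or
|     # re-rank results at search time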
| westurner wrote:
| DAIR.AI > Prompt Engineering Guide > Techniques > Retrieval
| Augmented Generation (RAG)
| https://www.promptingguide.ai/techniques/rag
|
| https://github.com/topics/rag
| alchemist1e9 wrote:
| If someone is about to start their project using Haystack,
| would you suggest they instead look at NeumTry?
| picohen wrote:
| Well, of course I'm biased on the answer :). But to give a not-
| so-biased answer, I would first try to understand what the
| project is about and whether RAG is a priority in it. If the
| project is leveraging agents and LLMs without worrying too much
| about context or up-to-date data, then Haystack could be a good
| option. If the focus is to eventually use RAG, then our
| framework could help.
|
| Additionally, there might be a potential route where both are
| used, depending on the use case.
|
| Feel free to dm if you want to chat further on this!
| J_Shelby_J wrote:
| How does this improve upon retrieval compared to just using any
| vector DB and semantic search?
| ddematheu wrote:
| Co-founder here :)
|
| Today, it is mostly about convenience. We provide abstractions
| in the form of a pipeline that encompasses a data source, an
| embed definition, and a sink definition. This means that you
| don't have to think about embedding your query or about which
| class you used to add the data into the vector DB.
|
| In the future, we have some additional abstractions that we are
| adding that will bring more convenience. For example, we are
| working on a concept of pipeline collections so that you can
| search across multiple indexes but get unified results. We are
| also adding more automation around metadata: as part of the
| pipeline configuration we know what metadata was added, along
| with examples of it, so we can help translate queries into
| hybrid searches. Think of it as a self-query retriever from
| LangChain or LlamaIndex, but one that automatically has context
| on the data at hand (no need to provide attributes).
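|
| Roughly this shape (hard-coded here for illustration; in
| practice an LLM does the translation using the known metadata
| as context):
|
|     known_metadata = {"year": [2022, 2023],
|                       "source": ["news", "papers"]}
|
|     def to_hybrid_query(query: str) -> dict:
|         # an LLM call, given known_metadata, would produce this
|         return {"semantic": "US elections coverage",
|                 "filter": {"year": 2023, "source": "news"}}
|
|     print(to_hybrid_query("2023 news about the US elections"))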
|
| Are there any specific retrieval capabilities you are looking
| for?
| omarfarooq wrote:
| Have you guys connected with MemGPT?
| hrpnk wrote:
| Interesting to see that the semantic chunking in the tools
| library is a wrapper around GPT-4. It asks GPT for Python code
| and executes it:
| https://github.com/NeumTry/NeumAI/blob/main/neumai-tools/neu...
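|
| The pattern in miniature (illustrative only; the linked file is
| the real implementation):
|
|     from openai import OpenAI
|
|     client = OpenAI()
|     resp = client.chat.completions.create(
|         model="gpt-4",
|         messages=[{"role": "user", "content":
|                    "Write a Python function chunk(text) -> "
|                    "list[str] that splits text into semantically "
|                    "coherent chunks. Return only code."}],
|     )
|     scope = {}
|     # run the model-generated chunking code, then call it
|     exec(resp.choices[0].message.content, scope)
|     chunks = scope["chunk"]("some long document text ...")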
___________________________________________________________________
(page generated 2023-11-21 23:00 UTC)