[HN Gopher] Show HN: SpRAG - Open-source RAG implementation for ...
___________________________________________________________________
Show HN: SpRAG - Open-source RAG implementation for challenging
real-world tasks
Hey HN, I'm Zach from Superpowered AI (YC S22). We've been working
in the RAG space for a little over a year now, and we've recently
decided to open-source all of our core retrieval tech. spRAG is a
retrieval system that's designed to handle complex real-world
queries over dense text, like legal documents and financial
reports. As far as we know, it produces the most accurate and
reliable results of any RAG system for these kinds of tasks. For
example, on FinanceBench, which is an especially challenging open-
book financial question answering benchmark, spRAG gets 83% of
questions correct, compared to 19% for the vanilla RAG baseline
(which uses Chroma + OpenAI Ada embeddings + LangChain). You can
find more info about how it works and how to use it in the
project's README. We're also very open to contributions. We
especially need contributions around integrations (e.g. adding
support for more vector DBs, embedding models, etc.) and around
evaluation.
Author : zmccormick7
Score : 40 points
Date : 2024-05-02 15:44 UTC (7 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| skanga wrote:
| You mentioned that spRAG uses OpenAI for embeddings, Claude 3
| Haiku for AutoContext, and Cohere for reranking. Can you explain
| why and how you made those choices?
| zmccormick7 wrote:
| Those are just the defaults, and spRAG is designed to be
| flexible in terms of the models you can use with it. For
| AutoContext (which is just a summarization task) Haiku offers a
| great balance of price and performance. Llama 3-8B would also
| be a great choice there, especially if you want something you
| can run locally. For reranking, the Cohere v3 reranker is by
| far the best performer on the market right now. And for
| embeddings, it's really a toss-up between OpenAI, Cohere, and
| Voyage.
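The swappable-defaults idea described above can be sketched as a small config object. These names are illustrative only, not spRAG's actual API; the point is that each stage (embedding, AutoContext summarization, reranking) is an independent, replaceable choice.

```python
from dataclasses import dataclass

@dataclass
class ModelConfig:
    """Hypothetical per-stage model configuration for a retrieval pipeline."""
    embedding_provider: str = "openai"   # alternatives: "cohere", "voyage"
    summarizer: str = "claude-3-haiku"   # AutoContext summarization model
    reranker: str = "cohere-rerank-v3"   # cross-encoder reranker

def describe(cfg: ModelConfig) -> str:
    """Render the configured stack as a one-line summary."""
    return (f"embed={cfg.embedding_provider} | "
            f"summarize={cfg.summarizer} | rerank={cfg.reranker}")

# Local-first variant: swap only the summarizer for Llama 3-8B,
# keeping the other stages unchanged.
local_cfg = ModelConfig(summarizer="llama-3-8b")
```

Because AutoContext is just a summarization task, it is the easiest stage to move to a locally hosted model.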
| Cheer2171 wrote:
| I bet you'll get a lot more adoption if you put info about
| using it with local self-hosted LLMs there. I'll never trust
| a cloud service with the documents I want to RAG.
| zmccormick7 wrote:
| Agreed. I've gotten a lot of feedback along those lines
| today, so that's my top priority now.
| cyanydeez wrote:
| I'm planning a RAG system, and this seems to implement the
| "AutoContext" step I was expecting to build myself.
|
| For the library I have, it's not just the file name that's
| descriptive: multiple folder names are too, especially if I build
| a data dictionary.
|
| Have you looked into some simple tagging/conversion dictionary
| that preprocesses the context?
| zmccormick7 wrote:
| In our AutoContext implementation, the document title gets
| included with the generated summary. So if you have files that
| are organized into nested folders with descriptive names, you
| can input that full file path as the `document_title`. I did
| this with one of our internal benchmarks and it worked really
| well.
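The trick described above, passing the full file path as the document title so descriptive folder names end up in the chunk context, can be sketched like this. The header format and function name are hypothetical, not spRAG's exact implementation.

```python
from pathlib import Path

def make_chunk_header(file_path: str, summary: str) -> str:
    """Build an AutoContext-style header: the full file path serves as
    the document title, followed by a generated summary.
    Sketch only; spRAG's real header format may differ."""
    title = Path(file_path).as_posix()  # nested folder names carry meaning
    return f"Document: {title}\nSummary: {summary}\n\n"

header = make_chunk_header(
    "contracts/employment/2023/acme_offer_letter.txt",
    "Offer letter covering salary, equity, and termination terms.",
)
# Each chunk is embedded with this header prepended, so the folder
# structure is visible to the retriever.
chunk_with_context = header + "Section 4.2: The employee is entitled to..."
```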
| bitshaker wrote:
| Amazing. I'm looking at building an app to look over the
| employment sections of the legal code and come back with results
| if things are allowed or not.
|
| Answering questions like:
|
| Can my employer do X? As an employee in this country, what's the
| minimum number of days off I can take?
|
| And so on.
|
| If your claims are true, then this will be exactly what I'm
| looking for.
| zmccormick7 wrote:
| I think spRAG should be pretty well suited for that use case. I
| think the biggest challenge will be generating specific search
| queries off of more general user inputs. You can look at the
| `auto_query.py` file for a basic implementation of that kind of
| system, but it'll likely require some experimentation and
| customization for your use case.
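The query-generation step mentioned above can be sketched as follows. The real `auto_query.py` prompts an LLM; here the LLM call is stubbed out with canned output so the shape of the interface is visible. All names besides the file reference are assumptions.

```python
def llm_generate_queries(prompt: str) -> list[str]:
    # Stand-in for a real LLM call; returns canned queries for the demo.
    return [
        "statutory minimum annual leave entitlement",
        "employer obligations for paid time off",
    ]

def auto_query(user_input: str, max_queries: int = 3) -> list[str]:
    """Turn a broad user question into short, specific search queries."""
    prompt = (
        "Rewrite the user's question as short, specific search queries "
        f"over legal text.\nQuestion: {user_input}"
    )
    return llm_generate_queries(prompt)[:max_queries]

queries = auto_query("What's the minimum number of days off I can take?")
```

In practice the prompt, the number of queries, and any domain-specific rewriting rules would need tuning for the legal-code use case.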
| esafak wrote:
| I'd replace the "challenging real-world tasks" in the title with
| "dense text, like financial reports and legal documents". It
| sounds less general but that's a good thing.
|
| The repo is only two weeks old, and looks it, so how do you think
| spRAG distinguishes itself? This is a crowded space with more
| established players.
|
| The "vanilla RAG" benchmark figure you cite is not convincing
| because it cannot be verified. Please share your benchmarking
| code.
| zmccormick7 wrote:
| That's great feedback. I actually went back and forth between
| those two descriptions. I agree that "dense text, like
| financial reports and legal documents" is more precise. Those
| are the kinds of use cases this project is built for.
|
| I want to keep this project tightly scoped to just retrieval
| over dense unstructured text, rather than trying to build a
| fully-featured RAG framework.
| serjester wrote:
| Does this mean you're winding down your business? Just curious
| what the motivation to open-source this was, given that it seems
| like your core value-add. Congrats on the launch.
| zmccormick7 wrote:
| That's a great question. I'll start with a little context: most
| of the users of our existing hosted platform are no-code/low-
| code developers who choose us because we're the simplest
| solution for building what they want to build (primarily
| because we have end-to-end workflows like Chat built-in). The
| improved retrieval performance is a nice-to-have for this
| group, but usually not the primary reason they choose us.
|
| For the larger companies and venture-backed startups we've
| talked to, they almost universally want to own their RAG stack
| in-house or build on open-source frameworks, rather than
| outsource it to an end-to-end solution provider. So open-
| sourcing our core retrieval tech is our bid to appeal to these
| developers.
| bashtoni wrote:
| Interested to see how this performs against RAPTOR, which does
| summarisation and clustering.
|
| https://github.com/profintegra/raptor-rag
| https://github.com/langchain-ai/langchain/blob/master/cookbo...
| jwuphysics wrote:
| How much do you expect auto-context and clustering+re-ranking to
| help for cases in which documents already have high-quality
| summaries? For context, I parse astrophysics research papers from
| arXiv and simply embed by paper abstracts (which _must_ be of a
| certain size), and then append (parts of) the rest of the paper
| for RAG.
| zmccormick7 wrote:
| The point of AutoContext is that you don't have to do that
| two-step process of first finding the right document and then
| finding the right section of it. I think it's cleaner this way,
| but it won't necessarily perform better or worse. spRAG also has
| the RSE component, which is what identifies the right section(s)
| of the document. Whether that helps in your case depends on how
| good a solution you already have for that step.
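One way to sketch the idea behind a segment-extraction step like RSE: score each chunk for relevance, subtract a threshold, and take the maximum-sum contiguous run of chunks (a Kadane-style scan), so adjacent moderately relevant chunks get stitched into one coherent segment. This is an illustrative reimplementation of the concept, not spRAG's actual RSE algorithm.

```python
def best_segment(scores: list[float], threshold: float = 0.2) -> tuple[int, int]:
    """Return (start, end) chunk indices (end exclusive) of the contiguous
    run maximizing total above-threshold relevance.
    Illustrative only; spRAG's real RSE logic may differ."""
    adjusted = [s - threshold for s in scores]  # below-threshold chunks cost
    best_sum, best_range = float("-inf"), (0, 0)
    cur_sum, cur_start = 0.0, 0
    for i, a in enumerate(adjusted):
        if cur_sum <= 0:           # restart the run at this chunk
            cur_sum, cur_start = a, i
        else:                      # extend the current run
            cur_sum += a
        if cur_sum > best_sum:
            best_sum, best_range = cur_sum, (cur_start, i + 1)
    return best_range

# Chunk relevance scores for one document; chunks 2-4 form the hot region.
seg = best_segment([0.1, 0.05, 0.9, 0.7, 0.8, 0.1, 0.05])  # → (2, 5)
```

The threshold controls how willing the extractor is to bridge weak chunks between two strong ones.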
| jwuphysics wrote:
| That makes sense and I'll run a few evals. Many thanks for
| open sourcing your work!
___________________________________________________________________
(page generated 2024-05-02 23:01 UTC)