[HN Gopher] RAGFlow is an open-source RAG engine based on OCR an...
___________________________________________________________________
RAGFlow is an open-source RAG engine based on OCR and document
parsing
Author : marban
Score : 80 points
Date : 2024-04-01 17:50 UTC (5 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| esafak wrote:
| Apparently "deep document understanding" refers to OCR and
| structured document parsing:
| https://github.com/infiniflow/ragflow/blob/main/deepdoc/READ...
|
| Since "deep document understanding" is not a term of art, I would
| have just said "OCR and document parsing".
|
| How well does it work? Please include benchmarks. You may be
| interested in
|
| https://paperswithcode.com/sota/optical-character-recognitio...
|
| https://paperswithcode.com/task/document-layout-analysis
|
| The models seem to be closed source, hosted here:
| https://huggingface.co/InfiniFlow/deepdoc
| dang wrote:
| Ok we've taken deep document understanding out of the title
| above. Thanks!
| kergonath wrote:
| I am curious about the performance of their OCR and layout and
| table detection. Hopefully it's on par with Amazon, Google, or
| Microsoft's tools.
| gardenfelder wrote:
| It seems to be limited to certain LLM servers, on of which is
| OpenAI, none of which includes e.g. Mystral and popular OSS LLMs.
|
| I wonder if that will change - eventually.
|
| Discord channels are named in Chinese, though there are English
| posts.
| shekhar101 wrote:
| It's trivial to run a proxy server that routes all OpenAi calls
| to another LLM, even local ones. See litellm-proxy.
| bschmidt1 wrote:
| I see a `LocalLLM` chat model where it looks like you can pass
| a host/port (for example, ollama's)
| bschmidt1 wrote:
| Is there a JavaScript library? Both LlamaIndex and Langchain have
| nice JS/TS packages on npm. Could thinly wrap a JS client around
| this Python API but the community aspect of having an official
| library is nice.
|
| Also might be helpful to have a simple example on the README
| showing how to fetch a document and start querying it. I would
| try it!
| NKosmatos wrote:
| If only they supported local LLMs out of the box. I have a very
| specific use case buy it needs to run locally offline only. Any
| suggestions/recommendations from fellow HN users are more than
| welcomed :-)
| mpeg wrote:
| Took me some time to figure out how to run it, but the layout
| recogniser model hosted on huggingface is pretty good!
|
| It correctly identifies tables that even paid models like the AWS
| Textract Document Analysis API fails to - for instance tables
| with one column which often confuse AWS even if they have a clear
| header and are labelled "Table" in the text.
|
| I would however love to know broadly what kind of document it was
| trained on, as my results could be pure luck, hard to say without
| a proper benchmark
|
| Very nice layout recognition, although I can't quite comment on
| the RAG performance itself - I think some of the architecture
| decisions are odd, it mixes a bunch of different PDF parsers for
| example which will all result in different quality and it's not
| clear to me which one it defaults to as it seems to be different
| in different places in the code (the simple parser defaults to
| pypdf2 which is not a great option)
___________________________________________________________________
(page generated 2024-04-01 23:00 UTC)