[HN Gopher] Autoflow, a Graph RAG based and conversational knowl...
       ___________________________________________________________________
        
       Autoflow, a Graph RAG based and conversational knowledge base tool
        
       Author : jinqueeny
       Score  : 209 points
       Date   : 2024-11-22 02:42 UTC (20 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | visarga wrote:
       | I'd love to see a GraphRAG browser that collects the pages I
       | visit automatically.
        
         | _flux wrote:
         | Many years ago there used to be a Firefox extension (..or might
         | have even been a Mozilla one..) that would store all the pages
         | I visit. I recall its name was Breadcrumbs but I could be
         | misremembering. Space is cheap, or at least affordable if one
         | would exclude videos, which are probably technically more
         | difficult to archive anyway, but sometimes one remembers having
         | seen content that is never to be found again.
         | 
         | I think it would be useful to have just a basic personal search
         | engine over that kind of content, but possibly a RAG or even a
         | fine-tuned LLM would be even cooler.
         | 
         | Actually, e.g. Firefox could do that at least for its bookmarks
         | and tabs, though it already does provide the function for
         | tagging bookmarks. And I think there's probably an extension
         | for searching tabs' contents..
        
           | fire_lake wrote:
           | Given how personal browsing history can be this is a great
           | use case for local LLMs. I would love for Mozilla to deliver
           | on this.
        
             | jumping_frog wrote:
             | Building a personal assistant could be beneficial to Mozilla,
             | given how much we do online. I would like to track changes
             | to my beliefs based on how I came across new information.
             | In the future, the AI could automatically shorten paragraphs
             | in essays about topics or terms I am already aware of, while
             | keeping newly introduced concepts fully expanded so that I
             | grok them better.
        
           | TiredOfLife wrote:
           | The original version of Read It Later (now the Mozilla-owned
           | Pocket) had that option, but they later removed it because it
           | went against their commercial interests.
        
             | monkeydust wrote:
             | Pocket is good. I use it across all my devices, simple and
             | works for me but do wonder if they could or should do more
             | with the data they collect from me which is all the things
             | I really care about.
        
               | 3abiton wrote:
               | What's the selling point for it though? I don't get it?
        
           | gazreese wrote:
           | I need this so much, someone please build it ASAP. This would
           | be so useful!
        
           | irthomasthomas wrote:
           | Not identical, but I started building a smart bookmark tool
           | that stores the content in vectors and a SQLite DB and hosts
           | them in GitHub issues with labels managed by the AI. Check it
           | out: https://undecidability.com and the code lives at
           | https://github.com/irthomasthomas/label-maker It's a bit
           | rough but there is a working CLI. It uses a local Jina
           | embeddings model but OpenAI logprobs to determine when to
           | create new labels.
        
         | TiredOfLife wrote:
         | According to HN and Reddit that would be spyware and you
         | are wrong for wanting that.
        
           | stogot wrote:
           | Only if it's turned on by default and uploaded to the cloud.
           | Privacy and user choice are what these readers want.
        
             | TiredOfLife wrote:
             | That's exactly what Recall is: offline and fully
             | customizable, but HN/Reddit went mad over it.
        
               | woodson wrote:
               | They got mad because you got Recall in an update, no
               | matter whether you wanted it or not, and after another
               | update you couldn't uninstall it anymore. No choice.
        
               | TiredOfLife wrote:
               | Recall isn't even released yet.
        
               | ubertaco wrote:
               | > offline and fully customizable, but HN/Reddit went mad
               | over it.
               | 
               | ...until it isn't.
               | 
               | A self-hosted open-source project you can download and
               | run (or compile yourself and _then_ run) is very
               | different from a closed-source OS-level component that's
               | developed by a for-profit company that makes at least
               | some portion of its revenue on ads.
               | 
               | Twitter was "the public square of the web", until it
               | wasn't. Google Reader was a best-in-class easy RSS
               | reader, until it wasn't.
               | 
               | If you don't have the source code, you don't own or
               | control the software. And when you don't own or control
               | the software, it's reasonable to have more-guarded views
               | on what data you're willing to give to that software.
               | 
               | If that software suddenly appears installed on your
               | machine, constantly recording your screen and running
               | entirely-opaque "AI processing" on it, unless you go
               | through a series of steps to opt out...it's reasonable to
               | be upset, because the opportunity to choose what you're
               | willing to share has been denied to you.
               | 
               | And since it's a closed-source OS component, it's only
               | something you can opt out from....until it isn't.
        
         | m-s-y wrote:
         | I'd love to see a brain interface so that all these pages we
         | visit can instantly become available to our own non-ai in-brain
         | all-human reasoning.
        
         | jpt4 wrote:
         | Local archiving tool I've been testing: webchiver.com
        
       | asabla wrote:
       | Oh, this looks pretty well made. Since it's using nextjs and
       | shadcn/ui, I wonder if they also used v0 to generate components.
       | 
       | Has anyone any experience with TiDB? Haven't heard about it
       | before this post
        
         | datadeft wrote:
         | Yes I have some experience with TiDB. It is pretty amazing
         | actually. They came up with a novel way of distributing data
         | across nodes and having strong consistency while also
         | maintaining great performance. We are recommending it to some
         | of our clients who are looking for an easy scaling option with
         | MySQL (TiDB is MySQL compatible on the connector level.)
        
       | kristjansson wrote:
       | FYI the 'StackVM' link that pops up appears to show all inbound
       | messages.
       | 
       | https://stackvm-ui.vercel.app/tasks/3710e8d2-fb66-4274-9f78-...
        
         | sykp241095 wrote:
         | Hi, this link is currently for demo purposes. With the help of
         | StackVM, we can DEBUG a RAG retrieval flow step by step and
         | reevaluate the retrieval plan.
        
           | kristjansson wrote:
           | Sure, security expectations for a demo are ~0, but "everyone
           | can see everyone else's inputs" is surprising even by demo
           | standards
        
       | thawab wrote:
       | Thanks a lot, this is the first time I've seen a RAG using DSPy. I
       | wanted to know about the expected cost. A few days ago,
       | fast-graphrag compared their implementation with Microsoft's:
       | 
       | > Using The Wizard of Oz, fast-graphrag costs $0.08 vs. graphrag
       | $0.48 -- a 6x costs saving that further improves with data size
       | and number of insertions.
        
       | silversmith wrote:
       | Is this wholly self-hostable? I'd be curious to run something
       | like this on a home server, have some small model via ollama
       | slowly chew through my documents / conversations / receipts /
       | .... and provide a chat-like search engine over the whole mess.
        
         | manishsharan wrote:
         | Here is how I am implementing something close to what you
         | mentioned. In my setup, I make sure to create a readme.md at
         | the root of every folder, a document for me as well as the
         | LLM that says what is inside the folder and how it is
         | relevant to my life or project. Kind of a drunken brain dump
         | for the folder.
         | 
         | I have a cron job that executes every night and iterates
         | through my filesystem looking for changes since the last time
         | it ran. If it finds new files or changes, it creates embeddings
         | and stores them in Milvus.
         | 
         | The chat with the LLM using embeddings is not that great yet. To
         | be fair, I have not yet tried to implement the GraphRAG or
         | Claude's contextual RAG approaches. I have a lot of code in
         | different programming languages, text documents, PDF bills, and
         | images. Not sure if one RAG can handle it all.
         | 
         | I am using AWS Bedrock APIs for Llama and Claude, and a locally
         | hosted Milvus.
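
The nightly "find what changed since the last run, then index it" loop described in the comment above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the commenter's actual code: `changed_files`, `run_once`, the state file, and the `index_file` callback are all hypothetical names, and the embedding and Milvus steps are deliberately left to the callback rather than guessed at.

```python
import os
import time
from pathlib import Path
from typing import Callable, List


def changed_files(root: Path, since: float) -> List[Path]:
    """Return files under `root` whose modification time is newer than `since`."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                if path.stat().st_mtime > since:
                    found.append(path)
            except OSError:
                # The file vanished or is unreadable; skip it this run.
                continue
    return sorted(found)


def run_once(root: Path, state_file: Path,
             index_file: Callable[[Path], None]) -> List[Path]:
    """Index every file changed since the previous run, then record this run.

    `index_file` is a hypothetical callback: it is where one would compute
    embeddings and write them to a vector store.
    """
    # Record the start time before scanning, so that files modified while
    # the scan is in progress are picked up again on the next run.
    started = time.time()
    try:
        last_run = float(state_file.read_text())
    except (FileNotFoundError, ValueError):
        last_run = 0.0  # First run (or unreadable state): index everything.

    # Skip the state file itself in case it lives under `root`.
    files = [p for p in changed_files(root, last_run) if p != state_file]
    for path in files:
        index_file(path)

    state_file.write_text(repr(started))
    return files
```

A cron entry would then call `run_once` once per night with a callback that embeds and stores each file. Comparing modification times is the simplest change detector; it misses deletions and files restored with old timestamps, so comparing content hashes is a common alternative.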
        
           | j45 wrote:
           | Wondering if you have tried AnythingLLM, and if so what you
           | thought of it.
        
             | manishsharan wrote:
             | I have not .. but this seems to be something I must try.
        
       | xianshou wrote:
       | I ask "what is TiDB" in the demo as suggested, and it takes 2
       | minutes to start responding in the midst of a multi-stage
       | workflow with several steps each of graph retrieval, vector
       | search, generation, and response combination.
       | 
       | Each of these is individually cool, but it strikes me as tragic
       | that so much effort has been put into an intricate workflow and
       | beautifully crafted UI only to culminate in a completely useless
       | hello-world example, which after 5+ minutes of successive
       | querying and response-building concludes with a network error.
       | 
       | I could use this to build exactly what I need...after stripping
       | out 80% of the features to make it streamlined and responsive.
       | 
       | Why isn't that minimal version the default?
        
         | andai wrote:
         | What would you remove?
        
         | striking wrote:
         | It appears to be much faster on more specific questions (like
         | the ones that are suggested after you ask it "what is TiDB"). I
         | got a response in about 40s on the question "How does TiDB's
         | cloud-native design enhance its scalability and reliability
         | compared to traditional MySQL databases?"
         | 
         | Also, what's wrong with a nice UI? It appears to mostly be
         | components from https://ui.shadcn.com/. Is there something
         | wrong with good frontend craft, especially for a demo where
         | you're trying to sell something?
         | 
         | It seems like something that is being offered as a self-
         | contained tool that's easy for end users to play with, which
         | isn't going to be the minimal version. I'm sure you could build
         | something that suits your needs exactly, but it would be hard
         | for someone else to predict your exact needs, and there's a
         | decent chance everyone needs or wants a slightly different set
         | of features, and that those things may not make for the most
         | ideal demo.
         | 
         | I am personally far from the typical profile of an AI booster,
         | but I can't help but say something about what I feel is a
         | middlebrow dismissal.
        
       | smcleod wrote:
       | It looked neat but relies on a cloud DB called TiDB. I checked out
       | its repo and it looks like you can self-host that as well, but
       | damn - it's a lot of containers. So yeah, it looks like self-hosting
       | is an option but likely a pain in the ass.
        
       ___________________________________________________________________
       (page generated 2024-11-22 23:00 UTC)