[HN Gopher] Show HN: In-Browser Graph RAG with Kuzu-WASM and WebLLM
       ___________________________________________________________________
        
       Show HN: In-Browser Graph RAG with Kuzu-WASM and WebLLM
        
       We show the potential of modern, embedded graph databases in the
       browser by demonstrating a fully in-browser chatbot that can
       perform Graph RAG using Kuzu (the graph database we're building)
       and WebLLM, a popular in-browser inference engine for LLMs. The
        demo retrieves from the graph via a Text-to-Cypher pipeline that
       translates a user question into a Cypher query, and the LLM uses
       the retrieved results to synthesize a response. As LLMs get better,
       and WebGPU and Wasm64 become more widely adopted, we expect to be
       able to do more and more in the browser in combination with LLMs,
       so a lot of the performance limitations we see currently may not be
       as much of a problem in the future.  We will soon also be releasing
       a vector index as part of Kuzu that you can also use in the browser
       to build traditional RAG or Graph RAG that retrieves from both
        vectors and graphs. The system has come a long way since we
        open-sourced it about 2 years ago, so please give us feedback
        about how it can be more useful!
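The two-step flow described above (question, then Cypher, then retrieved rows, then answer) can be sketched in plain JavaScript. The schema string, prompt wording, and helper names below are illustrative assumptions, not the demo's actual code:

```javascript
// Hypothetical sketch of the Text-to-Cypher Graph RAG flow.
// Step 1: build a Text-to-Cypher prompt from the graph schema and question.
function buildCypherPrompt(schema, question) {
  return [
    "You are a Cypher expert. Given this graph schema:",
    schema,
    `Translate the question into a single Cypher query: ${question}`,
  ].join("\n");
}

// Step 2: build the answer-synthesis prompt from the retrieved rows.
function buildAnswerPrompt(question, rows) {
  return [
    `Question: ${question}`,
    `Retrieved results: ${JSON.stringify(rows)}`,
    "Answer the question using only the results above.",
  ].join("\n");
}

// Usage with a hypothetical in-browser `generate(prompt)` LLM call:
//   const cypher = await generate(buildCypherPrompt(schema, question));
//   const rows = await conn.query(cypher);   // Kuzu-WASM connection (assumed)
//   const answer = await generate(buildAnswerPrompt(question, rows));
```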
        
       Author : sdht0
       Score  : 92 points
       Date   : 2025-03-10 15:12 UTC (7 hours ago)
        
 (HTM) web link (blog.kuzudb.com)
 (TXT) w3m dump (blog.kuzudb.com)
        
       | esafak wrote:
       | The example is not ideal for showcasing a graph analytics
       | database because they could have used a traditional relational
       | database to answer the same query, _Which of my contacts work at
       | Google?_
        
         | laminarflow027 wrote:
         | Hi, I work at Kuzu and can offer my thoughts on this.
         | 
          | You're making a fair observation here, and it's true for any
          | high-level query language: SQL and Cypher are interchangeable
          | unless the queries are recursive, in which case Cypher's graph
          | syntax (e.g., the Kleene star * or shortest paths) has several
          | advantages. One could argue that Cypher is easier for LLMs to
          | generate because the joins are less verbose (you simply
          | express the join as a query pattern). This post is not
         | necessarily about graph analytics. It's about demonstrating
         | that it's very simple to develop a relatively complex
         | application using LLMs and a database fully in-browser, which
          | can potentially open up new use cases. I'm sure many people
          | will come up with other creative ways of using these fully
          | in-browser technologies, both graph-specific and not, e.g.,
          | vector search-based retrieval. In fact, some of our users are
          | already doing this right now.
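To make the verbosity comparison concrete, here is the thread's example question in both languages. The table names, foreign keys, and node labels are assumptions for illustration, not the demo's actual schema:

```
-- SQL: the joins are spelled out explicitly
SELECT p.name
FROM person p
JOIN works_at w ON w.person_id = p.id
JOIN company c  ON c.id = w.company_id
WHERE c.name = 'Google';

// Cypher: the same joins expressed as a graph pattern
MATCH (p:Person)-[:WORKS_AT]->(c:Company {name: 'Google'})
RETURN p.name;
```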
        
           | echelon wrote:
           | This is really cool, but I'm super anxious about entering my
           | personal data, especially LinkedIn connections.
           | 
           | Is there some other demo you could do with public graph data?
           | It'd be just as cool of a demo, but with less fear of
           | information misuse.
           | 
            | I'm even more anxious about leaking information about my
            | professional connections than I am about leaking my own
            | data.
        
             | laminarflow027 wrote:
             | Your concern makes sense, but in the demo we show, all your
             | private data AND the graph database AND the LLM (basically,
             | everything) is confined to your client session in the
             | browser, and no data actually ever leaves your machine.
             | That's the whole point of Wasm!
             | 
             | The graph that you build is more for your own exploration
             | and not for sharing with the outside world.
        
               | w10-1 wrote:
                | Still, using a non-personal example would mean the user
               | wouldn't have to consider whether to trust you on that
               | point (or do the analysis), and would make the technology
               | demo friction-free.
               | 
               | imo, privacy shouldn't be the driver but the kicker,
               | because it's so inflammatory.
        
               | semihsalihoglu wrote:
               | I think having a version with a sample synthetic dataset
               | makes sense.
        
         | beefnugs wrote:
         | We wont be seeing any ai examples that actually are anywhere
         | near useful until we rewrite all serialization/de serialization
         | into "natural language" as well as create layers upon layers of
         | loops of frameworks with simulations and test cases around this
         | nonsense
        
       | srameshc wrote:
        | This is the first time I've heard of Kuzu, an embeddable graph
        | database, and the mix of WASM and LLMs makes it even better.
        
       | nattaylor wrote:
       | This is very cool. Kuzu has a ton of great blog content on all
        | the ways they make Kuzu light and fast. WebLLM (or in the future
       | chrome.ai.* etc) + embedded graph could make for some great UXes
       | 
       | At one time I thought I read that there was a project to embed
       | Kuzu into DuckDB, but bringing a vector store natively into kuzu
       | sounds even better.
        
         | laminarflow027 wrote:
          | Great point! Several years ago there was a project called
          | GRainDB, which, along with GraphflowDB (a purely in-memory
          | graph database), formed the basis of what is now Kuzu :)
         | 
         | https://graindb.github.io/
         | https://github.com/graphflow/graphflow-columnar-techniques
        
       | jasonthorsness wrote:
       | Don't the resource requirements from even small LLMs exclude most
       | devices/users from being able to use stuff like this?
        
         | laminarflow027 wrote:
          | True, but innovations are likely happening in multiple
          | dimensions all at once: WebGPU improvements that better
          | utilize a device's compute, Wasm64 adoption, and of course
          | LLMs turning into SLMs (smaller and smaller models) that can
          | do a surprisingly large variety of things well.
         | 
         | Putting aside LLMs for a minute, even applications that do not
         | need LLMs, but benefit from a graph database, can be unlocked
         | to help build interactive UIs and visualizations that retain
         | privacy on the client side without ever moving the data to a
         | server. Loads of possibilities!
        
       | mentalgear wrote:
       | Nice! You might also want to check out Orama - which is also an
       | open-source hybrid vector/full text search engine for any js
       | runtime.
        
       | willguest wrote:
       | I absolutely love this. I make VR experiences that run on the
       | ICP, which delivers wasm modules as smart contracts - I've been
       | waiting for a combo of node-friendly, wasm deployable tools and
       | webLLM. The ICP essentially facilitates self-hosting of data and
       | provides consensus protocols for secure messaging and
       | transactions.
       | 
       | This will make it super easy for me to add LLM functionality to
       | existing webxr spaces, and I'm excited to see how an intelligent
       | avatar or convo between them will play out. This is, very likely,
       | the thing that will make this possible :)
       | 
       | If anyone wants to collab, or contribute in some way, I'm open to
       | ideas and support. Search for 'exeud' to find more info
        
         | wkat4242 wrote:
          | Why the blockchain there? I don't really see the value, but
          | maybe I misunderstand. It's just that I tend to be pretty
          | dismissive of products mentioning blockchain, mostly from the
          | time when this tech was severely overhyped, like Metaverse
          | after it and now of course AI. I do know there are some use
          | cases for it, I just wonder what they are and why you chose
          | it.
         | 
         | I think I like the idea but I don't think I fully understand
         | what it is that you're doing :) But I love everything VR.
        
           | varelaseb wrote:
           | Take this with a grain of salt, as I run a startup in the
           | industry.
           | 
           | Blockchain has taken a weird path. It started with Bitcoin
           | offering something genuinely new - a Byzantine fault-tolerant
           | mechanism for decentralized value exchange without trusted
           | intermediaries. But the industry has drifted toward "web3"
           | hype where the technology often isn't necessary.
           | 
           | Companies pick tech stacks for all sorts of reasons beyond
           | technical merit - vendor relationships, development velocity,
           | legacy system compatibility, and UX considerations all factor
           | into these decisions.
           | 
           | Truth is, most blockchain companies today are solving
           | problems that could be handled just fine with traditional
           | databases and APIs. The industry is shifting toward
           | abstraction layers that hide the consensus mechanisms anyway,
           | focusing on user experience instead.
           | 
           | The project mentioned probably doesn't actually need a
           | blockchain backend for what it's doing, except maybe for
           | tradable collectibles on an ERC standard.
        
       | itissid wrote:
       | Since I already have a browser connected to the Internet where
       | this would execute, could one have the option of transparently
       | executing the webGPU + LLM in a cloud container communicating
       | with the browser process?
        
       | nsonha wrote:
       | Could someone please explain in-browser inference to me? So in
       | the context of OpenAI usage (WebLLM github), this means I will
       | send binary to OpenAI instead of text? And it will lower the cost
       | and run faster?
        
         | a-ungurianu wrote:
          | Not exactly. If you're referring to the following lines:
         | 
         | > Full OpenAI Compatibility
         | 
         | > WebLLM is designed to be fully compatible with OpenAI API.
         | 
          | It means that WebLLM exposes an API that is identical in
          | behaviour to the OpenAI one, so any tool built against that
          | API can also target WebLLM and still work.
         | 
          | By the looks of it, WebLLM runs the inference purely in the
          | browser. None of your data leaves your browser.
          | 
          | WebLLM does need to get a model from somewhere; the demo
          | linked here fetches the Llama 3.1 8B Instruct model{1}.
          | 
          | 1: https://huggingface.co/mlc-ai/Llama-3.1-8B-Instruct-q4f32_1-...
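A sketch of what that compatibility buys you: the request object below follows the OpenAI chat-completions shape, so (per WebLLM's docs) the same object can be passed to WebLLM's in-browser engine. The model ID and parameter values are illustrative assumptions:

```javascript
// Build an OpenAI-style chat completion request; the same object works
// against either the OpenAI API or WebLLM's in-browser engine.
function makeChatRequest(userText) {
  return {
    messages: [
      { role: "system", content: "You are a helpful assistant." },
      { role: "user", content: userText },
    ],
    temperature: 0.2,
  };
}

// With WebLLM (runs fully in the browser; model ID is an assumption):
//   import { CreateMLCEngine } from "@mlc-ai/web-llm";
//   const engine = await CreateMLCEngine("Llama-3.1-8B-Instruct-q4f32_1-MLC");
//   const reply = await engine.chat.completions.create(makeChatRequest("Hi"));
//   console.log(reply.choices[0].message.content);
```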
        
       ___________________________________________________________________
       (page generated 2025-03-10 23:00 UTC)