[HN Gopher] Grafeo - A fast, lean, embeddable graph database bui...
       ___________________________________________________________________
        
       Grafeo - A fast, lean, embeddable graph database built in Rust
        
       Author : 0x1997
       Score  : 161 points
       Date   : 2026-03-21 14:50 UTC (8 hours ago)
        
 (HTM) web link (grafeo.dev)
 (TXT) w3m dump (grafeo.dev)
        
       | satvikpendem wrote:
        | There seem to be a lot of these; how does it compare to Helix DB,
        | for example? Also, why would you ever want to query a _database_
        | with GraphQL, which was explicitly not made for that purpose?
        
       | adsharma wrote:
        | There are 25 graph databases, all going "me too" in the
        | AI/LLM-driven cycle.
       | 
       | Writing it in Rust gets visibility because of the popularity of
       | the language on HN.
       | 
       | Here's why we are not doing it for LadybugDB.
       | 
       | Would love to explore a more gradual/incremental path.
       | 
       | Also focusing on just one query language: strongly typed cypher.
       | 
       | https://github.com/LadybugDB/ladybug/discussions/141
        
         | tadfisher wrote:
         | Is LadybugDB not one of these 25 projects?
        
           | adsharma wrote:
           | LadybugDB is backed by this tech (I didn't write it)
           | 
           | https://vldb.org/cidrdb/2023/kuzu-graph-database-
           | management-...
           | 
            | You can judge for yourself what work has been done in the
            | last 5 months; there are many short videos here, and new
            | open source contributors I didn't know before are ramping
            | up.
           | 
           | https://youtube.com/@ladybugdb
        
       | Aurornis wrote:
       | Does anyone have any experience with this DB? Or context about
       | where it came from?
       | 
        | From the commit history it's obvious that this is an AI-coded
        | project. It was started a few months ago, 99% of commits are from
        | 1 contributor, and that 1 contributor has sometimes committed
        | 100,000 lines of code per week. (EDIT: 200,000 lines of code in
        | the first week)
       | 
       | I'm not anti-LLM, but I've done enough AI coding to know that one
       | person submitting 100,000 lines of code a week is not doing deep
       | thought and review on the AI output. I also know from experience
       | that letting AI code the majority of a complex project leads to
       | something very fragile, overly complicated, and not well thought
       | out. I've been burned enough times by investigating projects that
       | turned out to be AI slop with polished landing pages. In some
       | cases the claimed benchmarks were improperly run or just
       | hallucinated by the AI.
       | 
       | So is anyone actually using this? Or is this someone's personal
       | experiment in building a resume portfolio project by letting AI
       | run against a problem for a few months?
        
         | gdotv wrote:
         | Agreed, there's been a literal explosion in the last 3 months
         | of new graph databases coded from scratch, clearly largely LLM
         | assisted. I'm having to keep track of the industry quite a bit
         | to decide what to add support for on https://gdotv.com and
         | frankly these days it's getting tedious.
        
           | piyh wrote:
           | I'm turning off my brain and using neo4j
        
             | UltraSane wrote:
             | Neo4j is pretty nice.
        
             | gdotv wrote:
             | proof that Neo4j won the popularity contest!
        
           | aorth wrote:
           | Figurative!
        
         | jandrewrogers wrote:
         | That is a lot of code for what appears to be a vanilla graph
         | database with a conventional architecture. The thing I would be
         | cautious about is that graph database engines in particular are
         | known for hiding many sharp edges without a lot of subtle and
         | sophisticated design. It isn't obvious that the necessary level
         | of attention to detail has been paid here.
        
           | justonceokay wrote:
            | Yes, a graph database will happily lead you down an n^3 (or
            | worse!) path when trying to query for a single relation if
            | you are not wise about your indexes, etc.
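            | 
            | As a rough illustration (a toy Rust sketch, nothing to do
            | with Grafeo's internals): an unindexed plan rescans the edge
            | list for every hop, while an adjacency index turns each hop
            | into a lookup.
            | 
            |     use std::collections::HashMap;
            | 
            |     fn main() {
            |         // edges as a flat (src, dst) list, i.e. no index
            |         let edges: Vec<(u32, u32)> = vec![(1, 2), (2, 3), (2, 4), (4, 5)];
            | 
            |         // Naive: rescan the whole edge list to extend each path --
            |         // O(E^2) for two hops, O(E^3) for three.
            |         let mut naive = Vec::new();
            |         for &(a, b) in &edges {
            |             for &(b2, c) in &edges {
            |                 if b == b2 {
            |                     naive.push((a, b, c));
            |                 }
            |             }
            |         }
            | 
            |         // Indexed: build an adjacency map once, then each hop is a lookup.
            |         let mut adj: HashMap<u32, Vec<u32>> = HashMap::new();
            |         for &(src, dst) in &edges {
            |             adj.entry(src).or_default().push(dst);
            |         }
            |         let mut indexed = Vec::new();
            |         for &(a, b) in &edges {
            |             if let Some(cs) = adj.get(&b) {
            |                 for &c in cs {
            |                     indexed.push((a, b, c));
            |                 }
            |             }
            |         }
            | 
            |         assert_eq!(naive.len(), indexed.len());
            |         println!("{} two-hop paths", indexed.len());
            |     }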
        
             | adsharma wrote:
             | Are you talking about the query plan for scanning the rel
             | table? Kuzu used a hash index and a join.
             | 
             | Trying to make it optional.
             | 
              | Try:
              | 
              |     explain match (a)-[b]->(c) return a.rowid, b.rowid, c.rowid;
        
             | cluckindan wrote:
             | That sounds like a "graph" DB which implements edges as
             | separate tables, like building a graph in a standard SQL
             | RDB.
             | 
             | If you wish to avoid that particular caveat, look for a
             | graph DB which materializes edges within vertices/nodes.
             | The obvious caveat there is that the edges are not
             | normalized, which may or may not be an issue for your
              | particular application.
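              | 
              | Roughly, the two layouts look like this (a toy Rust
              | sketch, not any specific engine's storage format):
              | 
              |     fn main() {
              |         // Layout 1: edges in their own table, like an SQL join
              |         // table; traversal means scanning/joining that table.
              |         let edge_table: Vec<(u64, u64)> = vec![(1, 2), (1, 3), (2, 3)];
              |         let via_table: Vec<u64> = edge_table
              |             .iter()
              |             .filter(|(src, _)| *src == 1)
              |             .map(|(_, dst)| *dst)
              |             .collect();
              | 
              |         // Layout 2: edges materialized inside each vertex record,
              |         // so a traversal just follows the node's own (denormalized)
              |         // neighbor list.
              |         struct Node {
              |             id: u64,
              |             out: Vec<u64>,
              |         }
              |         let nodes = vec![
              |             Node { id: 1, out: vec![2, 3] },
              |             Node { id: 2, out: vec![3] },
              |         ];
              |         let via_node = nodes.iter().find(|n| n.id == 1).unwrap().out.clone();
              | 
              |         assert_eq!(via_table, via_node);
              |         println!("neighbors of 1: {:?}", via_node);
              |     }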
        
           | adsharma wrote:
            | Are you talking about the Andy Pavlo bet here?
           | 
           | https://news.ycombinator.com/item?id=29737326
           | 
            | The Kuzu folks took some of these discussions and
            | implemented them: SIP, ASP joins, factorized joins and WCOJ.
            | 
            | Internally it's structured very similarly to DuckDB, except
            | for the differences noted above.
           | 
           | DuckDB 1.5 implemented sideways information passing (SIP).
           | And LadybugDB is bringing in support for DuckDB node tables.
           | 
            | So the idea that graph databases have shaky internals stems
            | primarily from pre-2021 incumbents.
           | 
           | 4 more years to go to 2030!
        
             | adsharma wrote:
             | Source: https://www.theregister.com/2023/03/08/great_graph_
             | debate_we...
             | 
             | > There are some additional optimizations that are specific
             | to graphs that a relational DBMS needs to incorporate:
             | [...]
             | 
             | This is essentially what Kuzu implemented and DuckDB tried
             | to implement (DuckPGQ), without touching relational
             | storage.
             | 
             | The jury is out on which one is a better approach.
        
             | jandrewrogers wrote:
             | I wasn't referring to the Pavlo bet but I would make the
             | same one! Poor algorithm and architecture scalability is a
             | serious bottleneck. I was part of a research program
             | working on the fundamental computer science of high-scale
             | graph databases ~15 years ago. Even back then we could show
             | that the architectures you mention couldn't scale even in
             | theory. Just about everyone has been re-hashing the same
             | basic design for decades.
             | 
             | As I like to point out, for two decades DARPA has offered
             | to pay many millions of dollars to anyone who can
             | demonstrate a graph database that can handle a sparse
             | trillion-edge graph. That data model easily fits on a
             | single machine. No one has been able to claim the money.
             | 
             | Inexplicably, major advances in this area 15-20 years ago
             | under the auspices of government programs never bled into
             | the academic literature even though it materially improved
             | the situation. (This case is the best example I've seen of
             | obviously valuable advanced research that became lost for
             | mundane reasons, which is pretty wild if you think about
             | it.)
        
               | adsharma wrote:
               | > many millions of dollars to anyone who can demonstrate
               | a graph database that can handle a sparse trillion-edge
               | graph.
               | 
                | I wonder why no one has claimed it. It's possible to
                | compress large graphs to ~1 byte per edge via graph
                | reordering techniques, so a trillion-edge graph becomes
                | roughly 1 TB, which can fit on high-end machines.
                | 
                | Obviously that won't handle high write rates and
                | mutations well. But with Apache Arrow based compression,
                | it's certainly possible to handle read-only and
                | read-mostly graphs.
                | 
                | Also, the single-machine constraint feels artificial.
                | For any columnar database written in the last 5 years,
                | implementing object store support is table stakes.
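                | 
                | For intuition, here's a toy sketch of the usual gap +
                | varint trick that reordering enables (illustrative Rust
                | only, not any particular engine): after a good
                | reordering, a node's neighbor ids are close together, so
                | the sorted gaps mostly fit in a single byte each.
                | 
                |     fn encode_varint(mut v: u64, out: &mut Vec<u8>) {
                |         loop {
                |             let byte = (v & 0x7f) as u8;
                |             v >>= 7;
                |             if v == 0 {
                |                 out.push(byte);
                |                 return;
                |             }
                |             out.push(byte | 0x80);
                |         }
                |     }
                | 
                |     fn compress_adjacency(mut neighbors: Vec<u64>) -> Vec<u8> {
                |         neighbors.sort_unstable();
                |         let mut out = Vec::new();
                |         let mut prev = 0u64;
                |         for n in neighbors {
                |             encode_varint(n - prev, &mut out); // store gaps, not ids
                |             prev = n;
                |         }
                |         out
                |     }
                | 
                |     fn main() {
                |         // neighbors clustered by the reordering -> small gaps
                |         let neighbors: Vec<u64> = (1_000_000..1_000_050).step_by(3).collect();
                |         let bytes = compress_adjacency(neighbors.clone());
                |         println!("{} edges -> {} bytes", neighbors.len(), bytes.len());
                |     }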
        
               | jandrewrogers wrote:
               | Achieving adequate performance at 1T edges in one aspect
               | requires severe tradeoffs in other aspects, making every
                | implementation impractical at that scale. You touched on
                | a couple of the key issues from when I was working in
                | this domain.
               | 
               | There is no single machine constraint, just the
               | observation that we routinely run non-graph databases at
               | similar scale on single machines without issue. It
               | doesn't scale on in-memory supercomputers either, so the
               | hardware details are unrelated to the problem:
               | 
                | - A graph database with good query performance typically
               | has terrible write performance. It doesn't matter how
               | fast queries are if it takes too long to get data into
               | the system. At this scale there can be no secondary
               | indexing structures into the graph; you need a graph
               | cutting algorithm efficient for both scalable writes and
               | join recursion. This was solved.
               | 
                | - Graph workloads break cache replacement algorithms for
                | well-understood theoretical reasons. Avoiding disk just
                | removes one layer of broken caching among many but
                | doesn't address the abstract purpose for which a cache
                | exists. This is why in-memory systems still scale poorly.
                | We've known how to solve this in theory since at least
                | the 1980s. The caveat is that it is surprisingly
                | difficult to fully reduce to practice in software,
                | especially at scale, so no one really has. This is a
                | work in progress.
               | 
               | - Most implementations use global synchronization
               | barriers when parallelizing algorithms such as BFS, which
               | greatly increases resource consumption while throttling
               | hardware scalability and performance. My contribution to
               | research was actually in this area: I discovered a way to
               | efficiently use error correction algorithms to elide the
               | barriers. I think there is room to make this even better
               | but I don't think anyone has worked on it since.
               | 
               | The pathological cache replacement behavior is the real
               | killer here. It is what is left even if you don't care
               | about write performance or parallelization.
               | 
               | I haven't worked in this area for many years but I do
               | keep tabs on new graph databases to see if someone is
               | exploiting that prior R&D, even if developed
               | independently.
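                | 
                | To make the third point concrete, the conventional
                | level-synchronous pattern looks roughly like this (a toy
                | sequential Rust sketch, not any real engine): the
                | frontier is processed level by level, and in a parallel
                | implementation no worker may start level k+1 until every
                | worker has finished level k -- that wait is the global
                | barrier.
                | 
                |     use std::collections::HashMap;
                | 
                |     fn bfs(adj: &HashMap<u32, Vec<u32>>, start: u32) -> HashMap<u32, u32> {
                |         let mut dist = HashMap::from([(start, 0u32)]);
                |         let mut frontier = vec![start];
                |         let mut level = 0;
                |         while !frontier.is_empty() {
                |             let mut next = Vec::new();
                |             for &u in &frontier {
                |                 for &v in adj.get(&u).into_iter().flatten() {
                |                     if !dist.contains_key(&v) {
                |                         dist.insert(v, level + 1);
                |                         next.push(v);
                |                     }
                |                 }
                |             }
                |             // <- implicit barrier: no level-(k+1) work starts
                |             //    until all level-k expansions have finished
                |             frontier = next;
                |             level += 1;
                |         }
                |         dist
                |     }
                | 
                |     fn main() {
                |         let adj = HashMap::from([(1, vec![2, 3]), (2, vec![4]), (3, vec![4])]);
                |         println!("{:?}", bfs(&adj, 1));
                |     }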
        
               | rossjudson wrote:
               | I guess it all depends on the meaning of the word
               | "handle", and what the use cases are.
        
           | stult wrote:
           | It certainly does seem problematic to have a graph database
           | hiding edges, sharp or not
        
         | arthurjean wrote:
         | Sounds about right for someone who ships fast and iterates. 54
         | days for a v0 that probably needs refactoring isn't that crazy
         | if the dev has a real DB background. We've all seen open source
         | projects drag on for 3 years without shipping anything, that's
         | not necessarily better
        
           | Aurornis wrote:
           | 200,000 lines of code on week 1 is not a sign of a quality
           | codebase with careful thought put into it.
           | 
           | > We've all seen open source projects drag on for 3 years
           | without shipping anything, that's not necessarily better
           | 
           | There are more options than "never ship anything" and "use AI
           | to slip 200,000 lines of code into a codebase"
        
           | TheJord wrote:
           | shipping fast matters a lot less than shipping something you
           | actually understand. 200k lines in a week means nobody knows
           | what's in there, including the author. that's not a codebase,
           | it's a liability
        
         | ozgrakkurt wrote:
          | Using an LLM-coded database sounds like hell, considering
          | even major databases can have some rough edges and be painful
          | to use.
        
         | hrmtst93837 wrote:
         | Six figures a week is a giant red flag. That kind of commit log
         | usually means codegen slop or bulk reformatting, and even if
         | some of it works I wouldn't trust the design, test coverage, or
         | long-term maintenance story enough to put that DB anywhere near
         | prod.
        
       | measurablefunc wrote:
       | This looks like another avant-garde "art" project.
        
       | nexxuz wrote:
       | I was ready to learn more about this but I saw "written in Rust"
       | and I literally rolled my eyes and said never mind.
        
         | ComputerGuru wrote:
         | I think "written by genAI" should be a bigger turnoff than
         | "written in Rust".
        
           | andriy_koval wrote:
           | alternative opinion:
           | 
           | * it is possible to write high quality software using GenAI
           | 
            | * not using GenAI could mean the project won't be
            | competitive in the current landscape
        
             | quantumHazer wrote:
              | > not using GenAI could mean the project won't be
              | competitive in the current landscape
             | 
             | why? this is false in my opinion, iterating fast is not a
             | good indicator of quality nor competitiveness
        
               | andriy_koval wrote:
                | iterating fast on quality (e.g. refactoring, test
                | coverage, benchmarks, documentation, trying new
                | nontrivial ideas) is a good indicator of quality.
        
             | Aurornis wrote:
             | > * it is possible to write high quality software using
             | GenAI
             | 
              | From examining this codebase, it doesn't appear to be
              | written carefully _with_ AI.
              | 
              | It looks like code that was prompted into existence as
              | fast as possible.
        
               | andriy_koval wrote:
                | sure, there are bad genAI projects and there are good
                | genAI projects. You can remove the genAI term from the
                | previous sentence.
        
         | chuckadams wrote:
         | Too bad you don't do the same for commenting on HN.
        
       | OtomotO wrote:
        | Interesting... Need to check how this differs from agdb, with
        | which I had some success for a side project in the past.
       | 
       | https://github.com/agnesoft/agdb
       | 
       | Ah, yeah, a different query language.
        
       | cluckindan wrote:
       | The d:Document syntax looks so happy!
        
       | cjlm wrote:
       | Overwhelmed by the sheer number of graph databases? I released a
       | new site this week that lists and categorises them. https://gdb-
       | engines.com
        
         | dbacar wrote:
         | Did you generate the list using an LLM?
        
           | cjlm wrote:
           | I was inspired by https://arxiv.org/abs/2505.24758 and
           | collated their assessment into a table and then just kept
           | adding databases :)
           | 
           | Claude helped a lot but it's all reviewed and curated by me.
        
       | natdempk wrote:
        | Serious question: are there any actually good and useful graph
        | databases that people would trust in production at reasonable
        | scale and that are available from a vendor or as open source?
        | E.g. not Meta's TAO.
        
         | cjlm wrote:
         | Serious answer: limiting to just Open Source: JanusGraph,
         | DGraph, Apache AGE, HugeGraph, MemGraph and ArcadeDB all meet
          | those criteria.
        
           | adsharma wrote:
           | What is open source and what is a graph database are both
           | hotly debated topics.
           | 
            | The author of ArcadeDB critiques many nominally open source
            | licenses here:
           | 
           | https://www.linkedin.com/posts/garulli_why-arcadedb-will-
           | nev...
           | 
            | What is a graph database is also relevant:
            | 
            |     - Does it need index-free adjacency?
            |     - Does it need to implement compressed sparse rows?
            |     - Does it need to implement ACID?
            |     - Does translating Cypher to SQL count as a graph database?
        
         | pphysch wrote:
         | Yeah: Postgres, etc.
         | 
         | When you actually need to run _graph algorithms_ against your
         | relational data, you export the subset of that data into
         | something like Grafeo (embedded mode is a big plus here) and
         | run your analysis.
        
           | adsharma wrote:
            | That importing is expensive and prevents you from handling
            | billion-scale graphs.
            | 
            | It's possible to run Cypher against DuckDB (soon Postgres as
            | well, via DuckDB's Postgres extension) without having to
            | import anything. That's a game changer when everything is in
            | the same process.
        
         | szarnyasg wrote:
          | That's a difficult question and I would like to avoid giving a
          | direct answer (because I co-lead a nonprofit benchmarking graph
          | databases), but even knowing what you need from a graph
          | database can be a tricky decision. See my FOSDEM 2025 talk,
          | where I tried to make sense of the field:
         | 
         | https://archive.fosdem.org/2025/schedule/event/fosdem-2025-5...
        
         | adsharma wrote:
         | What people perceive as "Facebook production graph" is not just
         | TAO. There is an ecosystem around it and I wrote one piece of
         | it.
         | 
         | Full history here: https://www.linkedin.com/pulse/brief-
         | history-graphs-facebook...
        
         | gdotv wrote:
          | Plenty of those - I've had to work with dozens of different
          | graph databases while integrating them on https://gdotv.com.
          | Save for maybe 1-2 exceptions in the list of supported
          | databases on our website, they're all production ready and
          | either backed by a vendor or open-source (or sometimes both,
          | e.g. Apache AGE for Azure PostgreSQL). There are some
          | technologies that have been around for a long time but are
          | really flying under the radar, despite being used a lot in
          | enterprise (e.g. JanusGraph).
        
       | mark_l_watson wrote:
        | I just spent an hour with Grafeo, also trying to get the
        | associated library grafeo_langchain working with a local Ollama
        | model. Mixed results. I really like the Python Kuzu graph
        | database and still use it, even though the developers no longer
        | support it.
        
       | lmeyerov wrote:
        | Speaking of embeddable, we just announced Cypher syntax for
        | GFQL, making it the first OSS CPU/GPU Cypher query engine you
        | can use on dataframes.
       | 
        | It's typically used with scale-out DBs like Databricks & Splunk
        | for analytical apps: security/fraud/event/social data analysis
        | pipelines, ML+AI embedding & enrichment pipelines, etc. We
        | originally built it to fill the compute-tier gap here, to help
        | Graphistry users who are making embeddable interactive GPU graph
        | viz apps and dashboards and don't want to add an external graph
        | DB phase to their interactive analytics flows.
       | 
       | Single GPU can do 1B+ edges/s, no need for a DB install, and can
       | work straight on your dataframes / apache arrow / parquet:
       | https://pygraphistry.readthedocs.io/en/latest/gfql/benchmark...
       | 
        | We took a multilayer approach to the GPU & vectorization
        | acceleration, including a more parallelism-friendly core
        | algorithm. This makes fancy features pay-as-you-go vs dragging
        | everything down as in most columnar engines that are appearing.
        | Our vectorized core already conforms to over half of the TCK,
        | and we are working to add the trickier bits in different layers
        | now that the flow is established.
       | 
       | The core GFQL engine has been in production for a year or two now
       | with a lot of analyst teams around the world (NATO, banks, US
       | gov, ...) because it is part of Graphistry. The open-source
       | cypher support is us starting to make it easy for others to
       | directly use as well, including LLMs :)
        
       | xlii wrote:
       | I wonder if people are using (or intend to use) vibe-coded
       | projects like the one linked.
       | 
        | I mean - I understand, some people have fun looking at new tech
        | no matter the source, but my question is: is there a person who
        | would be designated to pick a graph database like this, would
        | ignore all the LLM flags, and would put it in production?
        
       | brunoborges wrote:
       | Why is everything "... built in Rust" trending so easily on HN?
        
         | IshKebab wrote:
         | Because Rust is an excellent language that pushes you into the
         | "pit of success", and consequently software written in Rust
         | tends to be fast, robust and easy to deploy.
         | 
         | There's no big mystery. No conspiracy or organised evangelism.
         | Rust is just really good.
        
         | mattvr wrote:
          | It implies high performance, reliability, and a higher degree
          | of mastery on the part of the developer.
          | 
          | (Which may not all be true, but perhaps more so than for your
          | average project)
        
       | foota wrote:
        | I added a super cheap and bad embedding database to a project;
        | it allows the agent to call a tool for searching all the
        | content it's built, and it seems to work pretty well! This way
        | the agent doesn't need to call a bunch of list tools (which I
        | was worried would introduce lots of data into the context), and
        | it can find things based on fuzzy search.
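        | 
        | For anyone curious, the "cheap and bad" version really is just
        | brute force (a toy Rust sketch; it assumes you already get
        | embedding vectors from whatever model you use):
        | 
        |     fn cosine(a: &[f32], b: &[f32]) -> f32 {
        |         let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
        |         let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
        |         let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
        |         dot / (na * nb)
        |     }
        | 
        |     fn main() {
        |         // (document, embedding) pairs kept in a plain Vec
        |         let store: Vec<(&str, Vec<f32>)> = vec![
        |             ("notes on graph dbs", vec![0.9, 0.1, 0.0]),
        |             ("grocery list", vec![0.0, 0.2, 0.9]),
        |             ("rust build tips", vec![0.4, 0.8, 0.1]),
        |         ];
        | 
        |         // "search" = embed the query, rank everything by cosine similarity
        |         let query = vec![0.8, 0.2, 0.1];
        |         let mut ranked: Vec<(&str, f32)> = store
        |             .iter()
        |             .map(|(doc, emb)| (*doc, cosine(&query, emb)))
        |             .collect();
        |         ranked.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
        | 
        |         for (doc, score) in ranked.iter().take(2) {
        |             println!("{score:.3}  {doc}");
        |         }
        |     }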
        
       ___________________________________________________________________
       (page generated 2026-03-21 23:00 UTC)