hngopher.com

       [HN Gopher] Show HN: Graphiti - LLM-Powered Temporal Knowledge G...
       ___________________________________________________________________
        
       Show HN: Graphiti - LLM-Powered Temporal Knowledge Graphs
        
       Hey HN! We're Paul, Preston, and Daniel from Zep. We've just
       open-sourced Graphiti, a Python library for building temporal
       Knowledge Graphs using LLMs.
       
       Graphiti helps you create and query graphs that evolve over time.
       Knowledge Graphs have been explored extensively for information
       retrieval. What makes Graphiti unique is its ability to build a
       knowledge graph while handling changing relationships and
       maintaining historical context.
       
       At Zep, we build a memory layer for LLM applications. Developers
       use Zep to recall relevant user information from past
       conversations without including the entire chat history in a
       prompt. Accurate context is crucial for LLM applications. If an AI
       agent doesn't remember that you've changed jobs or confuses the
       chronology of events, its responses can be jarring or irrelevant,
       or worse, inaccurate.
       
       Before Graphiti, our approach to storing and retrieving user
       "memory" was, in effect, a specialized RAG pipeline. An LLM
       extracted "facts" from a user's chat history. Semantic search,
       reranking, and other techniques then surfaced facts relevant to
       the current conversation back to a developer for inclusion in
       their prompt.
       
       We attempted to reconcile how new information may change our
       understanding of existing facts:
       
       Fact: "Kendra loves Adidas shoes"
       
       User message: "I'm so angry! My favorite Adidas shoes fell apart!
       Puma's are my new favorite shoes!"
       
       Facts:
       
       - "Kendra used to love Adidas shoes but now prefers Puma."
       - "Kendra's Adidas shoes fell apart."
       
       Unfortunately, this approach became problematic. Reconciling facts
       from increasingly complex conversations challenged even frontier
       LLMs such as gpt-4o. We saw incomplete facts, poor recall, and
       hallucinations. Our RAG search also failed at times to capture the
       nuanced relationships between facts, leading to irrelevant or
       contradictory information being retrieved.
       
       We tried fixing these issues with prompt optimization but saw
       diminishing returns on effort. We realized that a graph would help
       model a user's complex world, potentially addressing these
       challenges.
       
       We were intrigued by Microsoft's GraphRAG, which expanded on RAG
       text chunking with a graph to better model a document corpus.
       However, it didn't solve our core problem: GraphRAG is designed
       for static documents and doesn't natively handle temporality.
       
       So, we built Graphiti, which is designed from the ground up to
       handle constantly changing information, hybrid semantic and graph
       search, and scale:
       
       - Temporal Awareness: Tracks changes in facts and relationships
         over time. Graph edges include temporal metadata to record
         relationship lifecycles.
       - Episodic Processing: Ingests data as discrete episodes,
         maintaining data provenance and enabling incremental processing.
       - Hybrid Search: Semantic and BM25 full-text search, with the
         ability to rerank results by distance from a central node.
       - Scalable: Designed for large datasets, parallelizing LLM calls
         for batch processing while preserving event chronology.
       - Varied Sources: Ingests both unstructured text and structured
         data.
       
       Graphiti has significantly improved our ability to maintain
       accurate user context. It does a far better job of fact
       reconciliation over long, complex conversations. Node distance
       reranking, which places a user at the center of the graph, has
       also been a valuable tool. Quantitative data evaluation results
       may be a future ShowHN.
       
       Work is ongoing, including:
       
       1. Improving support for faster and cheaper small language models.
       2. Exploring fine-tuning to improve accuracy and reduce latency.
       3. Adding new querying capabilities, including search over
          neighborhood (sub-graph) summaries.
       
       ## Getting Started
       
       Graphiti is open source and available on GitHub:
       https://github.com/getzep/graphiti.
       
       We'd love to hear your thoughts. Please also consider
       contributing!
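
The post says graph edges carry temporal metadata recording a relationship's lifecycle. What follows is only a minimal sketch of that idea, not Graphiti's actual API: the `TemporalEdge` class, the `valid_at` / `invalid_at` field names, the `add_fact` and `facts_at` helpers, and the dates are all assumptions made for illustration. In Graphiti an LLM decides which existing facts a new fact contradicts; here the caller passes that in by hand.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class TemporalEdge:
    """Hypothetical edge type; field names are illustrative assumptions."""
    source: str
    name: str
    target: str
    fact: str
    valid_at: datetime
    invalid_at: Optional[datetime] = None  # None means "still believed true"


def add_fact(edges, new_edge, contradicts=()):
    """Append new_edge and close out any still-open edges it supersedes."""
    for old in contradicts:
        if old.invalid_at is None:
            old.invalid_at = new_edge.valid_at
    edges.append(new_edge)


def facts_at(edges, when):
    """Return the facts that were valid at a given point in time."""
    return [
        e.fact for e in edges
        if e.valid_at <= when and (e.invalid_at is None or when < e.invalid_at)
    ]


# The Kendra example from the post, with made-up dates.
edges = []
loves_adidas = TemporalEdge("Kendra", "LOVES", "Adidas shoes",
                            "Kendra loves Adidas shoes", datetime(2024, 1, 1))
add_fact(edges, loves_adidas)

prefers_puma = TemporalEdge("Kendra", "PREFERS", "Puma shoes",
                            "Kendra prefers Puma shoes", datetime(2024, 6, 1))
add_fact(edges, prefers_puma, contradicts=[loves_adidas])

print(facts_at(edges, datetime(2024, 3, 1)))
print(facts_at(edges, datetime(2024, 7, 1)))
```

The old edge is never deleted; it is only given an end time, so the graph can answer both "what does Kendra prefer now?" and "what did she prefer in March?".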
        
       Author : roseway4
       Score  : 61 points
       Date   : 2024-09-04 13:21 UTC (9 hours ago)
        
        
       | tcdent wrote:
       | Thank you for open sourcing this!
       | 
       | You are definitely onto something here.
        
         | roseway4 wrote:
         | Pleasure! We'd love feedback + suggestions should you try it
         | out.
        
       | spothedog1 wrote:
       | Looks cool, would love support for RDF Graphs. The reason I
       | prefer those is because the ontology is already well defined in a
       | lot of cases which is 80% of the battle with Knowledge Graphs in
       | my experience. Without a well defined Ontology I think LLM <> KG
       | integration will not live up to its potential. LLMs have to know
       | what nodes and edges really mean across diverse datasets.
        
         | prasmuss15 wrote:
         | Hey, thanks for the feedback! I'm one of the devs on Graphiti,
         | and adding support for custom schemas is high on our to-do
         | list. I agree that this is an important step in helping to
         | bridge the gap between structured and unstructured data, as
         | well as for refining the graph for specific use cases.
         | 
         | Currently, we do have some ways of helping the graph to
         | understand what nodes and edges "really mean." In addition to
         | the name of the relationship our edges also store a "hydrated"
         | version of the fact triple. For example, if Alice and Bob are
         | siblings you might see an edge with the name IS_SIBLING_OF
         | between the two. In addition to this, the edge also stores the
         | fact: "Alice is the sibling of Bob". This way we are storing
         | much of the semantic context on the nodes and edges themselves
         | in addition to the graph structure.
         | 
         | We also support ingesting structured JSON, and in those cases
         | the edges will be exactly the properties in the JSON doc.
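
The comment above describes edges that store both a relationship name and a "hydrated" natural-language fact. A rough sketch of what such a record might look like follows. The `FactEdge` class, the `search_facts` helper, and the second example edge are hypothetical and are not taken from Graphiti's codebase; the substring match stands in for whatever semantic or full-text search a real system would use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FactEdge:
    """Hypothetical edge record: a label plus the hydrated fact text."""
    source: str  # name of the source node
    name: str    # relationship label, e.g. IS_SIBLING_OF
    target: str  # name of the target node
    fact: str    # natural-language rendering of the triple


def search_facts(edges, term):
    """Naive case-insensitive substring match over the hydrated facts."""
    term = term.lower()
    return [e for e in edges if term in e.fact.lower()]


edges = [
    FactEdge("Alice", "IS_SIBLING_OF", "Bob", "Alice is the sibling of Bob"),
    # Illustrative extra edge, not from the thread.
    FactEdge("Alice", "WORKS_AT", "Acme", "Alice works at Acme"),
]

matches = search_facts(edges, "sibling")
for e in matches:
    print(e.source, e.name, e.target, "->", e.fact)
```

Keeping the sentence alongside the label means retrieval can operate on plain text even when no ontology defines what a label such as IS_SIBLING_OF means.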
        
       | fudged71 wrote:
       | Let's say I want to ingest information from a series of
       | interviews with multiple interviewees (multiple interviews per
       | interviewee). It's possible their opinions/facts change between
       | interviews; but also each interviewee is going to have different
       | opinions/facts.
       | 
       | Would it make most sense to capture this with multiple Graphiti
       | graphs? Or would it be possible to do this in one graph?
       | 
       | At the end of the day the analysis would be finding insights
       | across all interviewees and you want the cumulative knowledge...
        
         | roseway4 wrote:
         | You could achieve this with a single graph. Graphiti has a
         | "message" EpisodeType that expects transcripts in a "<user>:
         | <content>" format. When using this EpisodeType, Graphiti pays
         | careful attention to "users," creating nodes for them and
         | maintaining "fact" context for each user subgraph.
         | 
         | "Facts" shared across all users will also be updated
         | universally. Alongside Graphiti's search, you'd be able to use
         | Cypher to query Neo4j to, for example, find hub nodes (aka
         | highly-connected nodes), identifying common beliefs.
         | 
         | More here:
         | https://help.getzep.com/graphiti/graphiti/adding-episodes
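
To make the "<user>: <content>" transcript format mentioned above concrete, here is a small sketch that splits such a transcript into per-speaker turns. It only illustrates the input shape. The `parse_transcript` and `group_by_user` helpers and the sample interview lines are assumptions, and this is not how Graphiti itself processes message episodes.

```python
from collections import defaultdict


def parse_transcript(text):
    """Split '<user>: <content>' lines into (user, content) pairs.

    Lines without a speaker prefix or without content are skipped.
    """
    turns = []
    for line in text.splitlines():
        user, sep, content = line.partition(":")
        if sep and user.strip() and content.strip():
            turns.append((user.strip(), content.strip()))
    return turns


def group_by_user(turns):
    """Collect each speaker's statements, preserving their order."""
    by_user = defaultdict(list)
    for user, content in turns:
        by_user[user].append(content)
    return dict(by_user)


# Made-up interview excerpt in the "<user>: <content>" format.
transcript = """\
Interviewer: How do you feel about remote work?
Dana: I prefer working from home.
Lee: I'd rather be in the office.
"""

turns = parse_transcript(transcript)
by_user = group_by_user(turns)
for user, statements in by_user.items():
    print(user, statements)
```

Because every turn is attributed to a named speaker, one graph can hold several interviewees without mixing up whose opinion is whose.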
        
           | fudged71 wrote:
           | Oh that's excellent! Thank you
        
           | fudged71 wrote:
           | I see that you mention Microsoft's GraphRAG. My understanding
           | is that a key part of their approach is hierarchical
           | agglomeration of graph clusters to be able to answer wide
           | questions from the graph. Is that in the works?
        
             | prasmuss15 wrote:
             | Yes, that is in the works and is a high priority for us.
             | The major discussion point internally around implementing
             | this feature has been on the retrieval portion. In general
             | we want to provide many flexible search strategies that
             | return a variety of different information. We want to
             | organize search in such a way that it is flexible enough to
             | meet a variety of demands, while also being ergonomic
             | enough to be usable and understandable. We want to make
             | sure that we update our retrieval approach at the same time
             | as adding the community summaries so that it is easy to
             | make use of this additional information.
             | 
             | Our implementation will likely involve us adding community
             | nodes that will contain a summary of the nodes in that
             | community. Do you have any perspective or opinions on the
             | best ways to implement GraphRAG-style summarization?
        
         | thorax51 wrote:
         | Hey, I'm one of the developers on the Graphiti project.
         | 
         | Adding to Daniel's reply: ingesting a series of interviews is
         | definitely doable with one graph. Just make sure to add the
         | episodes from the interviews in chronological order.
         | 
         | After all the episodes are processed by Graphiti, you will be
         | able to retrieve the "complete picture" for every participant
         | in the interviews, reflecting any changes in their
         | views/opinions.
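
The advice above is to add episodes in chronological order. One way to guarantee that before ingestion is to sort on each episode's timestamp, as in the sketch below. The dictionary layout, the `reference_time` key, the `in_chronological_order` helper, and the sample data are assumptions for illustration; the loop only prints where a real pipeline would hand each episode to the graph.

```python
from datetime import datetime


def in_chronological_order(episodes):
    """Sort episodes by reference time; ties keep their input order."""
    return sorted(episodes, key=lambda ep: ep["reference_time"])


# Made-up interview episodes, deliberately listed out of order.
episodes = [
    {"name": "dana-interview-2", "reference_time": datetime(2024, 5, 1),
     "body": "Dana: I now prefer the office."},
    {"name": "dana-interview-1", "reference_time": datetime(2024, 2, 1),
     "body": "Dana: I prefer working from home."},
]

ordered = in_chronological_order(episodes)
for ep in ordered:
    # A real pipeline would add each episode to the graph here, one at a time.
    print(ep["reference_time"].date(), ep["name"])
```

Python's `sorted` is stable, so episodes that share a timestamp stay in the order they were supplied.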
        
       ___________________________________________________________________
       (page generated 2024-09-04 23:00 UTC)