[HN Gopher] Emerging Architectures for LLM Applications
       ___________________________________________________________________
        
       Emerging Architectures for LLM Applications
        
       Author : makaimc
       Score  : 94 points
       Date   : 2023-06-20 19:48 UTC (3 hours ago)
        
 (HTM) web link (a16z.com)
 (TXT) w3m dump (a16z.com)
        
       | RealQwertie wrote:
       | I think sidecar vector databases that work with existing dbs will
       | emerge as more prevalent than the pure vector DB. I also think
       | the vector & graph combo on highly interconnected data will have
       | additional benefits for those building a wide range of LLM
       | applications. A good example is the VectorLink architecture with
       | TerminusDB [1] which is based on Hierarchical Navigable Small
       | World graphs written in Rust.
       | 
       | [1] https://github.com/terminusdb-labs/terminusdb-semantic-
       | index...
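
The sidecar pattern described above can be sketched minimally: documents stay in an existing relational store, while a separate index maps the same primary keys to embedding vectors. Everything below is hypothetical toy data — the 3-d vectors and brute-force cosine scan stand in for real learned embeddings and an ANN structure such as the HNSW graphs mentioned above:

```python
import sqlite3
from math import sqrt

# The "existing db": documents live in SQLite as usual.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany("INSERT INTO docs (id, body) VALUES (?, ?)", [
    (1, "postgres outage runbook"),
    (2, "team offsite photos"),
    (3, "incident postmortem template"),
])

# Sidecar index: embeddings keyed by the same primary keys. These 3-d
# vectors are illustrative stand-ins for real embedding output.
index = {
    1: [0.9, 0.1, 0.0],
    2: [0.0, 0.2, 0.9],
    3: [0.8, 0.3, 0.1],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

def nearest(query_vec, k=2):
    """Brute-force scan of the sidecar index; returns the k closest ids."""
    return sorted(index, key=lambda i: cosine(query_vec, index[i]),
                  reverse=True)[:k]

# Search the sidecar, then join the hits back into the relational store.
ids = nearest([1.0, 0.2, 0.0])
placeholders = ",".join("?" * len(ids))
rows = conn.execute(
    f"SELECT id, body FROM docs WHERE id IN ({placeholders})", ids
).fetchall()
```

The appeal over a standalone vector DB is that hits come back as ordinary rows, so the existing database's joins, permissions, and transactions still apply.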
        
       | LeicaLatte wrote:
       | Any companies making vector databases for iOS or Android?
        
       | DethNinja wrote:
       | Do a16z invest in small scale AI companies? Or are they only
       | doing series B+ investments?
        
         | swyx wrote:
          | they do. you can't be their size and not do everything.
        
       | akisej wrote:
        | Great starting point! These diagrams notably miss an LLM
        | firewall layer, which in practice is critical for safe LLM
        | adoption.
       | Source: We work with thousands of users for logicloop.com/ai
        
         | applgo443 wrote:
         | What do you mean by firewall layer? What tools do you use here?
        
           | __loam wrote:
           | I imagine he's talking about preventing prompt injection (or
           | making shit up)
        
             | lukasb wrote:
             | That doesn't seem like the type of problem that can be
             | solved with a drop-in solution.
        
             | akisej wrote:
             | Yup, that's part of it but I mean it bidirectionally -
             | users can accidentally leak data to models too, which is
             | concerning to SecOps teams without a way to monitor / auto-
             | redact.
        
           | akisej wrote:
            | These common issues tend to prevent LLMs from being used in
            | the wild: data leakage, hallucination, prompt injection, and
            | toxicity.
            | 
            | So yes, it does include prompt injection, but it's a bit
            | broader. Data leakage is one that several customers have
            | called out, i.e. accidentally leaking PII to underlying
            | models when asking them questions about your data.
           | 
           | I'm evaluating tools like Private AI, Arthur AI etc. but
           | they're all fairly nascent.
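
For the bidirectional "firewall" idea above, the outbound half can be as simple as pattern-based redaction applied before a prompt leaves your network. A minimal sketch — the `redact` helper and its patterns are hypothetical, cover only regex-detectable PII, and a real deployment would layer NER-based detection (names, addresses) on top:

```python
import re

# Illustrative patterns only, not an exhaustive PII taxonomy.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def redact(prompt: str) -> str:
    """Replace detected PII with typed placeholders before the prompt
    is sent to the model provider."""
    for label, pattern in PII_PATTERNS.items():
        prompt = pattern.sub(f"[{label}]", prompt)
    return prompt

safe = redact("Why did jane.doe@acme.com (SSN 123-45-6789) get flagged?")
```

The inbound half — screening model responses for injection artifacts, hallucinated claims, or toxicity — would sit symmetrically on the response path.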
        
       | neilv wrote:
        | With everyone writing about LLMs, and no time to read even 1%
        | of it all, is the reason to read a16z its technical/analytical
        | merit, or an investment-pumping angle?
        
       | mikehollinger wrote:
       | How prescient is the "Hidden Technical Debt" [1] paper from ~8
       | yrs ago compared to this? See the top of pg4 for a figure that
       | I've personally found to be useful in explaining all the stuff
       | necessary to put together a reasonable app using ML/DL stuff (up
       | until today, anyway).
       | 
       | I see all the same bits called out:
       | 
       | - Data collection
       | 
       | - Machine /resource management
       | 
       | - Serving
       | 
       | - Monitoring
       | 
       | - Analysis
       | 
       | - Process mgt
       | 
       | - Data verification
       | 
        | There are some new concepts that aren't quite captured in the
        | original paper, like the "playground", though.
       | 
       | I've kind of been expecting a follow-up that shows an update to
       | that original paper.
       | 
       | [1]
       | https://proceedings.neurips.cc/paper_files/paper/2015/file/8...
        
       | killdozer wrote:
        | This blog post is way more complex than it needs to be. A lot
        | of what most people are doing with LLMs right now boils down to
        | using vector databases to provide the "best" info/examples to
        | your prompt. This is a slick marketing page, but I'm not sure
        | what they think they're providing beyond that.
        
       | bluecoconut wrote:
       | > So, agents have the potential to become a central piece of the
       | LLM app architecture (or even take over the whole stack, if you
       | believe in recursive self-improvement). ... . There's only one
       | problem: agents don't really work yet.
       | 
       | I really appreciate that they called out and separated some hype
       | vs. practice, specifically with regards to Agents. This is
       | something I keep hoping works better than it does, and in
       | practice every attempt I've taken in this direction leads to
       | disappointment.
        
         | liampulles wrote:
         | This mirrors my experience as well. I've tried to use them for
         | pretty straightforward support agent type tasks and found that
         | they very often go down wrong paths trying to solve the
         | problem.
        
         | majestic5762 wrote:
         | What vector DB are you using? What is the data structure that
         | you're vectorizing? What is your chunk size? Have you
         | implemented memory? What prompt or technique are you using
         | (ReACT, CoT, few-shot, etc)? Are you only using vector DBs? Do
         | you use sequential chains? Does it need tools? Depending on
         | your data, business case and what output you expect from the
         | agent, there is no one-size-fits-them-all.
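
To make one of those knobs concrete: chunk size and overlap are usually just sliding-window parameters over the source text. A toy sketch — the `chunk` helper and its defaults are hypothetical, not from any particular framework:

```python
def chunk(text, size=200, overlap=40):
    """Fixed-size sliding-window chunking: `size` and `overlap` are the
    knobs asked about above. Each window starts `size - overlap`
    characters after the previous one."""
    step = size - overlap
    return [text[i:i + size]
            for i in range(0, max(len(text) - overlap, 1), step)]

pieces = chunk("x" * 500)
# 500 chars with size=200 / overlap=40 -> windows at offsets 0, 160, 320
```

Larger chunks carry more context per retrieved hit but dilute the embedding; more overlap trades index size for fewer facts split across chunk boundaries.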
        
           | baner2022 wrote:
           | Ha, I had the exact same reaction
           | 
            | I feel like lots of papers are getting published and
            | reviewed, which is good, as bad ideas don't get to
            | propagate for ages.
        
         | jerpint wrote:
         | Similarly for retrieval augmented LLM agents, they break down
         | very quickly once the question is not directly found in the
         | documents
        
       | applgo443 wrote:
        | They mention the contextual stack is relatively underdeveloped.
        | Any ideas on what can be improved there?
        
         | jgraettinger1 wrote:
         | Building contexts for structured data used in AI / GPT tasks is
         | something I've seen little written about, but is obviously
         | quite important.
         | 
         | Confluent calls it a "customer 360" problem [1] and I don't
         | disagree.
         | 
         | We (Estuary) also wrote up a post showing an approach for Slack
         | => ChatGPT => Google Sheets [2], and have more content coming
         | for Salesforce, HubSpot, and some others.
         | 
         | [1] https://www.confluent.io/blog/chatgpt-and-streaming-data-
         | for...
         | 
         | [2] https://estuary.dev/gpt-real-time-pipeline/
        
       | arguflow wrote:
       | Should have Qdrant in the list of vector db's
        
       | zitterbewegung wrote:
        | This feels like exactly what we have done with full-stack
        | engineering, and the recommendation that everyone in the space
        | needs all of this...
        
       | nico wrote:
       | Reading the comments, it seems like we need better human-agent
       | interaction tools
       | 
       | Many are frustrated about not being able to better direct the
       | agents
       | 
       | It's like the agents have certain pre-learned things they can do,
       | but they aren't really learning how to apply those things to the
       | environments their human operators want them to develop in
       | 
       | Or at least it is not easy/straightforward how to teach the model
       | new tricks
        
       | lmeyerov wrote:
        | Something not obvious to me with these VC diagrams: should the
        | memory tier be just vector DBs, or also include knowledge
        | graphs?
       | 
       | Good: We're (of course) doing a lot of these architectures
       | behind-the-scenes for louie.ai and client projects around that.
       | Vector embeddings are an easy way to do direct recall for data
       | that's bigger-than-context. As long as the user has a simple
       | question that just needs recalling a text snippet that fairly
       | directly overlaps with the question, vector embeddings are
       | magical. Conversational memory for sharing DB queries across
       | teammates, simple discussion of decades of PDF archives and
       | internal wikis... amazing.
       | 
        | Not so good: What happens when the text data to answer your
        | question isn't a direct semantic search match away? "Why does
        | Team X have so many outages?" => "What projects is Team X on" +
        | "Outages for those projects" + "Analysis for outage". AFAICT,
        | this gets into:
       | 
       | A. Failure: Stick with query -> vector DB -> LLM summary and get
       | the wrong answer over the wrong data
       | 
        | B. AutoGPT: Getting into an AutoGPT-style LangChain loop that
        | iteratively queries the vector DB, reasons over results, and
        | plans, until it finds what it wants. But AutoGPT seems to be
        | more excitement than production use. Many open questions around
        | speed, cost, & quality...
       | 
        | C. Knowledge graphs: Using the LLM to generate a higher-quality
        | knowledge graph of the data that is more receptive to LLM
        | querying. The above question now becomes a simpler multi-hop
        | query over the KG, so it's both fast and cost-effective... if
        | you've indexed correctly and taught your LLM to generate the
        | right queries.
       | 
        | (Related: If you're into this kind of topic, we're hiring to
        | build out these systems and help apply them for our customers
        | in investigative areas like cyber, misinfo, & emergency
        | response. See new openings at
        | https://www.graphistry.com/careers !)
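
Option C above can be sketched with a toy graph: the multi-hop question decomposes into predicate-following steps. All entities and edges here are made up, and a plain dict stands in for a real graph database whose query language the LLM would emit:

```python
# Toy knowledge graph as (subject, predicate) -> objects.
KG = {
    ("team-x", "works_on"): ["proj-payments", "proj-search"],
    ("proj-payments", "had_outage"): ["out-17", "out-21"],
    ("proj-search", "had_outage"): ["out-25"],
    ("out-17", "root_cause"): ["config drift"],
    ("out-21", "root_cause"): ["connection pool exhaustion"],
    ("out-25", "root_cause"): ["config drift"],
}

def hop(nodes, predicate):
    """One traversal step: follow `predicate` edges out of `nodes`."""
    return [obj for n in nodes for obj in KG.get((n, predicate), [])]

# "Why does Team X have so many outages?" as a multi-hop query:
projects = hop(["team-x"], "works_on")
outages = hop(projects, "had_outage")
causes = hop(outages, "root_cause")
```

Each hop is a cheap indexed lookup, which is the speed and cost argument versus running an iterative agent loop over a vector store.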
        
       | fzliu wrote:
        | Bit of self-promotion, but Milvus (https://milvus.io) is
        | another open-source vector database option (I have a pretty
        | good idea as to why it isn't listed in a16z's blog post). We
        | also have milvus-lite, a pip-installable package that uses the
        | same API, for folks who don't want to stand up a local service:
       | pip install milvus
       | 
       | Other than that, it's great to see the shout-out for Vespa.
        
         | baner2022 wrote:
         | Appreciate sharing, will try Milvus
         | 
         | Vector database space is the Wild West, keep at it
        
         | tartakovsky wrote:
         | Why do you not think it's featured? Has a16z funded many of
         | those companies, lol? And somehow, rejected milvus?
        
       | swyx wrote:
       | interesting to see that the word "generative" does not appear in
       | this blogpost (apart from the tags). 6 months ago Generative AI
       | was all the rage: https://a16z.com/2023/01/19/who-owns-the-
       | generative-ai-platf...
       | 
        | I think this is a very well articulated breakdown of the "LLM
        | Core, Code Shell" (https://www.latent.space/p/function-
        | agents#%C2%A7llm-core-co...) view of the world, but it
        | undersells the agents piece, relegating it to a three-paragraph
        | "what about agents?" section at the end. The emerging
        | architecture of "Code Core, LLM Shell", decentralizing and
        | specializing the role of the LLM, will hopefully get more
        | airtime in the December a16z landscape chart!
        
         | rajko_rad wrote:
         | Hi @swyx, Thanks for the kind words!
         | 
          | We actually purposefully left that part a bit sparse because
          | we have something else coming up on the topic! I'm sure we
          | will be chatting through it soon :)
        
       | pryelluw wrote:
        | I covered the subject during a Python Atlanta talk last month.
        | There isn't much that's new at the moment, mostly because an
        | LLM can be considered a software agent. That may change soon as
        | things become more complex, though. Things like AWS's Kendra
        | show there are some new patterns in the pipeline.
        | 
        | I'll say this post is too shallow to be considered technical,
        | or even to fit the title.
        
       | Xen9 wrote:
        | In the future we will see a ton of similar charts borrowing
        | elements from graph and signal theory. There's no limit on the
        | number of different LLM multi-agent architectures.
        
         | czbond wrote:
         | I second the request for you to expand on that.....
        
         | killthebuddha wrote:
         | This sounds quite interesting, could you expand on it? What is
         | some of the low-hanging fruit in your opinion? Do you have any
         | examples of projects that are explicitly building on top of
         | these ideas?
        
       ___________________________________________________________________
       (page generated 2023-06-20 23:00 UTC)