[HN Gopher] Emerging Architectures for LLM Applications
___________________________________________________________________
Emerging Architectures for LLM Applications
Author : makaimc
Score : 94 points
Date : 2023-06-20 19:48 UTC (3 hours ago)
(HTM) web link (a16z.com)
(TXT) w3m dump (a16z.com)
| RealQwertie wrote:
| I think sidecar vector databases that work with existing DBs
| will emerge as more prevalent than pure vector DBs. I also think
| the vector & graph combo on highly interconnected data will have
| additional benefits for those building a wide range of LLM
| applications. A good example is the VectorLink architecture with
| TerminusDB [1] which is based on Hierarchical Navigable Small
| World graphs written in Rust.
|
| [1] https://github.com/terminusdb-labs/terminusdb-semantic-
| index...
| LeicaLatte wrote:
| Any companies making vector databases for iOS or Android?
| DethNinja wrote:
| Does a16z invest in small-scale AI companies? Or are they only
| doing Series B+ investments?
| swyx wrote:
| they do. you can't be their size and not do everything.
| akisej wrote:
| Great starting point! These diagrams notably miss an LLM
| firewall layer, which in practice is critical to safe LLM
| adoption.
| Source: We work with thousands of users for logicloop.com/ai
| applgo443 wrote:
| What do you mean by firewall layer? What tools do you use here?
| __loam wrote:
| I imagine he's talking about preventing prompt injection (or
| making shit up)
| lukasb wrote:
| That doesn't seem like the type of problem that can be
| solved with a drop-in solution.
| akisej wrote:
| Yup, that's part of it but I mean it bidirectionally -
| users can accidentally leak data to models too, which is
| concerning to SecOps teams without a way to monitor / auto-
| redact.
| akisej wrote:
| These common issues tend to prevent LLMs from being used in
| the wild:
|
| - Data Leakage
|
| - Hallucination
|
| - Prompt Injection
|
| - Toxicity
|
| So yes, it does include prompt injection, but it is a bit
| broader. Data Leakage is one that several customers have
| called out, i.e. accidentally leaking PII to the underlying
| model when asking it questions about your data.
|
| I'm evaluating tools like Private AI, Arthur AI etc. but
| they're all fairly nascent.
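The outbound half of that firewall idea can be sketched as a redaction pass over the prompt before it leaves your network. This is a toy, assuming regex-detectable PII; the pattern names and rules are illustrative only, and real tools in this space would presumably lean on NER models and policy engines rather than regexes:

```python
import re

# Illustrative patterns -- a real "LLM firewall" would cover many more
# PII classes and use ML-based detection, not just regexes.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def redact(prompt: str) -> str:
    """Replace detected PII with placeholder tokens so the raw values
    never reach the underlying model provider."""
    for label, pattern in PII_PATTERNS.items():
        prompt = pattern.sub(f"[{label}]", prompt)
    return prompt
```

The same pass can be logged for SecOps monitoring, which addresses the "no way to monitor / auto-redact" concern above.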
| neilv wrote:
| With everyone writing about LLMs, and no time to read even 1%
| of it all, is the reason to read a16z technical/analytical
| merit, or an investment-pumping angle?
| mikehollinger wrote:
| How prescient is the "Hidden Technical Debt" [1] paper from ~8
| yrs ago compared to this? See the top of pg4 for a figure that
| I've personally found to be useful in explaining all the stuff
| necessary to put together a reasonable app using ML/DL stuff (up
| until today, anyway).
|
| I see all the same bits called out:
|
| - Data collection
|
| - Machine /resource management
|
| - Serving
|
| - Monitoring
|
| - Analysis
|
| - Process mgt
|
| - Data verification
|
| There are some new concepts that aren't quite captured in the
| original paper, though, like the "playground".
|
| I've kind of been expecting a follow-up that shows an update to
| that original paper.
|
| [1]
| https://proceedings.neurips.cc/paper_files/paper/2015/file/8...
| killdozer wrote:
| This blog post is way more complex than it needs to be. A lot
| of what most people are doing with LLMs right now boils down
| to using vector databases to provide the "best" info/examples
| for your prompt. This is a slick marketing page, but I'm not
| sure what they think they're providing beyond that.
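That "vector DB feeds the prompt" pattern really is a few lines of code. In this sketch, brute-force cosine similarity stands in for the vector database, and the embeddings are assumed to come from some external embedding model; all names and the prompt template are illustrative:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def build_prompt(question_vec, question, docs, k=2):
    """docs: list of (embedding, text) pairs. Brute-force top-k recall
    stands in for the vector DB a production app would call."""
    ranked = sorted(docs, key=lambda d: cosine(question_vec, d[0]), reverse=True)
    context = "\n".join(text for _, text in ranked[:k])
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

Everything else in the diagram (caching, orchestration, monitoring) wraps around this core loop.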
| bluecoconut wrote:
| > So, agents have the potential to become a central piece of the
| LLM app architecture (or even take over the whole stack, if you
| believe in recursive self-improvement). ... There's only one
| problem: agents don't really work yet.
|
| I really appreciate that they called out and separated some hype
| vs. practice, specifically with regards to Agents. This is
| something I keep hoping works better than it does, and in
| practice every attempt I've taken in this direction leads to
| disappointment.
| liampulles wrote:
| This mirrors my experience as well. I've tried to use them for
| pretty straightforward support agent type tasks and found that
| they very often go down wrong paths trying to solve the
| problem.
| majestic5762 wrote:
| What vector DB are you using? What is the data structure that
| you're vectorizing? What is your chunk size? Have you
| implemented memory? What prompt or technique are you using
| (ReACT, CoT, few-shot, etc)? Are you only using vector DBs? Do
| you use sequential chains? Does it need tools? Depending on
| your data, business case, and what output you expect from the
| agent, there is no one-size-fits-all.
| baner2022 wrote:
| Ha, I had the exact same reaction
|
| I feel like lots of papers are getting published and reviewed,
| which is good, as bad ideas don't get to propagate for ages
| jerpint wrote:
| Similarly, retrieval-augmented LLM agents break down very
| quickly once the answer is not found directly in the
| documents
| applgo443 wrote:
| They mention the contextual stack is relatively
| underdeveloped. Any idea on what can be improved there?
| jgraettinger1 wrote:
| Building contexts for structured data used in AI / GPT tasks is
| something I've seen little written about, but is obviously
| quite important.
|
| Confluent calls it a "customer 360" problem [1] and I don't
| disagree.
|
| We (Estuary) also wrote up a post showing an approach for Slack
| => ChatGPT => Google Sheets [2], and have more content coming
| for Salesforce, HubSpot, and some others.
|
| [1] https://www.confluent.io/blog/chatgpt-and-streaming-data-
| for...
|
| [2] https://estuary.dev/gpt-real-time-pipeline/
| arguflow wrote:
| Should have Qdrant in the list of vector DBs
| zitterbewegung wrote:
| This feels like exactly what we have done with full-stack
| engineering, and it recommends that everyone in the space
| needs all of this...
| nico wrote:
| Reading the comments, it seems like we need better human-agent
| interaction tools
|
| Many are frustrated about not being able to better direct the
| agents
|
| It's like the agents have certain pre-learned things they can do,
| but they aren't really learning how to apply those things to the
| environments their human operators want them to develop in
|
| Or at least it is not easy/straightforward how to teach the model
| new tricks
| lmeyerov wrote:
| Something not obvious to me about these VC diagrams: why is
| the memory tier just vector DBs, rather than also including
| knowledge graphs?
|
| Good: We're (of course) doing a lot of these architectures
| behind-the-scenes for louie.ai and client projects around that.
| Vector embeddings are an easy way to do direct recall for data
| that's bigger-than-context. As long as the user has a simple
| question that just needs recalling a text snippet that fairly
| directly overlaps with the question, vector embeddings are
| magical. Conversational memory for sharing DB queries across
| teammates, simple discussion of decades of PDF archives and
| internal wikis... amazing.
|
| Not so good: What happens when the text data to answer your
| question isn't a direct semantic-search match away? "Why does
| Team X have so many outages?" => "What projects is Team X on"
| + "Outages for those projects" + "Analysis for each outage".
| AFAICT, this gets into:
|
| A. Failure: Stick with query -> vector DB -> LLM summary and get
| the wrong answer over the wrong data
|
| B. AutoGPT: Getting into an AutoGPT langchain that iteratively
| queries the vector DB, iteratively reasons over results, and
| iteratively plans until it finds what it wants. But AutoGPT
| seems to be more excitement than production use. Many open
| questions around speed, cost, & quality...
|
| C. Knowledge graphs: Using the LLM to generate a
| higher-quality knowledge graph of the data that is more
| receptive to LLM querying. The above question now becomes a
| simpler multi-hop query over the KG, so it is both fast and
| cost-effective... if you've indexed correctly and taught your
| LLM to generate the right queries.
|
| (Related: If you're into this kind of topic, we're hiring to
| build out these systems + help use them with our customers in
| investigative areas like cyber, misinfo, & emergency response.
| See new openings up @ https://www.graphistry.com/careers !)
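Option C can be illustrated with a toy adjacency-list graph: the outage question above becomes two cheap hops instead of a semantic-search round-trip. All entity and relation names here are made up; in a real system the LLM would extract the triples from source text and emit the query:

```python
# Toy knowledge graph as an adjacency list of labeled edges.
# Keys are (subject, relation); values are lists of objects.
KG = {
    ("team_x", "works_on"): ["project_a", "project_b"],
    ("project_a", "had_outage"): ["outage_1"],
    ("project_b", "had_outage"): ["outage_2", "outage_3"],
}

def hop(entities, relation):
    """Follow one labeled edge type from a set of entities."""
    return [obj for e in entities for obj in KG.get((e, relation), [])]

# "Why does Team X have so many outages?" decomposed into two hops:
projects = hop(["team_x"], "works_on")
outages = hop(projects, "had_outage")
```

Each hop is an O(1) dictionary lookup per entity, which is where the speed and cost advantage over iterative vector-DB querying comes from.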
| fzliu wrote:
| Bit of self-promotion, but Milvus (https://milvus.io) is
| another open-source vector database option (I have a pretty
| good idea as to why it isn't listed in a16z's blog post). We
| also have milvus-lite, a pip-installable package that uses the
| same API, for folks who don't want to stand up a local
| service.
| pip install milvus
|
| Other than that, it's great to see the shout-out for Vespa.
| baner2022 wrote:
| Appreciate you sharing it, will try Milvus
|
| The vector database space is the Wild West, keep at it
| tartakovsky wrote:
| Why do you think it's not featured? Has a16z funded many of
| those companies, lol? And somehow rejected Milvus?
| swyx wrote:
| interesting to see that the word "generative" does not appear in
| this blogpost (apart from the tags). 6 months ago Generative AI
| was all the rage: https://a16z.com/2023/01/19/who-owns-the-
| generative-ai-platf...
|
| I think this is a very well articulated breakdown of the "LLM
| Core, Code Shell" (https://www.latent.space/p/function-
| agents#%C2%A7llm-core-co...) view of the world. but it
| undersells the potential by leaving the agents stuff to a
| three-paragraph "what about agents?" piece at the end. the
| emerging architecture of "Code Core, LLM Shell",
| decentralizing and specializing the role of the LLM, will
| hopefully get more airtime in the december a16z landscape
| chart!
| rajko_rad wrote:
| Hi @swyx, Thanks for the kind words!
|
| we actually just purposefully left that part a bit sparse
| because we have something else coming up on the topic! I'm sure
| we will be chatting through it soon :)
| pryelluw wrote:
| I covered the subject during a python Atlanta talk last month.
| There isn't much that's new at the moment. Mostly because an LLM
| can be considered a software agent. That may change soon as
| things become more complex, though. Things like AWS's Kendra show
| there's some new patterns in the pipeline.
|
| I'll say this post is rather shallow to be considered technical
| or even fit the title
| Xen9 wrote:
| In the future we will see a ton of similar charts borrowing
| elements from graph and signal theory. There's no limit on
| the number of different LLM multi-agents.
| czbond wrote:
| I second the request for you to expand on that...
| killthebuddha wrote:
| This sounds quite interesting, could you expand on it? What is
| some of the low-hanging fruit in your opinion? Do you have any
| examples of projects that are explicitly building on top of
| these ideas?
___________________________________________________________________
(page generated 2023-06-20 23:00 UTC)