[HN Gopher] Building Boba AI: Lessons learnt in building an LLM-...
___________________________________________________________________
Building Boba AI: Lessons learnt in building an LLM-powered
application
Author : nalgeon
Score : 75 points
Date : 2023-06-29 17:19 UTC (5 hours ago)
(HTM) web link (martinfowler.com)
(TXT) w3m dump (martinfowler.com)
| selalipop wrote:
| I worked on something very much in this vein (notionsmith.ai) and
| feel like I should do a write up after reading this!
|
 | I think a lot of people are learning these lessons in isolation;
 | I do wish there were a centralized place where people working on
 | UX-focused, LLM-based apps could exchange lessons.
| tinco wrote:
| I think a lot of us are working heads down in isolation because
| we don't have a shareworthy project yet. In a week or two I
| think my system will be fancy enough to write a blog post about
| and maybe make open source.
|
| HN has been a pretty good source of exchanging knowledge so
| far, every couple days or so there's a write up like this that
| has some new tidbits or confirmations of ideas. If everyone
| keeps doing that we're doing great in my opinion. Looking
| forward to seeing your write up on here!
| ignoramous wrote:
| Things on the LLM front for utility apps are fairly nascent and
| by OpenAI's own admission, the current limitations are
| fleeting, as in, as a developer, you will soon not need the
| workarounds used today.
|
| Multi-modal models are going to change things even further.
| daviding wrote:
 | This is an interesting article, though a bit of a mish-mash of
 | UI conventions, application ideas for GPT, and actual patterns
 | for LLMs. I really do miss Martin Fowler's actual take on these
 | things, but using his name as some sort of gestalt brain for
 | Thoughtworks works too.
 |
 | It still feels like a bit of a Wild West for patterns in this
 | area, with a lot of people trying lots of things, and it might
 | be too soon to be defining terms. A useful resource is still
 | something like the OpenAI Cookbook, which is a decent collection
 | of a lot of the things in this article but with a more
 | implementation bent. [1]
|
 | The area that seems to get a lot of idea duplication currently
 | is in providing either a 'session' or a longer-term context for
 | GPT, be it with embeddings or rolling prompts for these apps.
 | Vector search over embedded chunks is something that seems to be
 | missing so far from vendors like OpenAI, and you can't help but
 | wonder whether they'll eventually move it behind their API with
 | a 'session id'. I think that was mentioned as being on their
 | roadmap for this year too. The lack of GPT-4 fine-tuning options
 | just seems to push people toward the Pinecone, Weaviate, etc.
 | stores and chaining up their own sequences to achieve some sort
 | of memory.
|
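A minimal sketch of the vector-search-as-memory approach described above: rank stored chunks by cosine similarity against the query and splice the winners into the prompt. The `embed` function here is a toy bag-of-characters stand-in for a real embedding model, and `ChunkStore` is an invented stand-in for a store like Pinecone or Weaviate:

```python
import math

# Hypothetical embedding function -- in practice this would call an
# embedding model. Here it maps text to a crude bag-of-characters vector.
def embed(text):
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

class ChunkStore:
    """Toy stand-in for a vector store like Pinecone or Weaviate."""
    def __init__(self):
        self.chunks = []  # (text, embedding) pairs

    def add(self, text):
        self.chunks.append((text, embed(text)))

    def top_k(self, query, k=2):
        q = embed(query)
        ranked = sorted(self.chunks, key=lambda c: cosine(q, c[1]),
                        reverse=True)
        return [text for text, _ in ranked[:k]]

# Retrieved chunks get spliced into the prompt as "memory".
store = ChunkStore()
store.add("The user prefers dark mode.")
store.add("The user's name is Ada.")
prompt_context = "\n".join(store.top_k("what is the user's name?", k=1))
```

A real version would swap in a proper embedding endpoint and an actual vector database, but the retrieve-then-prompt shape is the same.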
 | I've implemented features with GPT-4 and functions, and so far
 | it feels useful for 'data model'-like use (where you're bringing
 | JSON about a domain noun, e.g. 'Tasks', into the prompt) but
 | pretty hairy when it comes to pure functions - the tuning
 | they've done to get it to pick which function and which
 | parameters to use is still hard to get right, which means there
 | isn't a lot of trust that it is going to be usable. It feels
 | like there needs to be a set of patterns or categories for
 | 'business apps' that are heavily siloed into just a subset of
 | available functions, making them more task-specific rather than
 | the general chat agent we see a lot of. The difference in
 | approach between LangChain's Chain of Thought pattern and just
 | using OpenAI functions is sort of up in the air as well. Like I
 | said, it still all feels like we're in Wild West times, at least
 | as an app developer.
|
| [1] https://github.com/openai/openai-cookbook
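The "siloed subset of functions" idea above can be sketched roughly like this. The schema shape follows OpenAI's function-calling API, but the function name, arguments, and dispatcher are invented for illustration, and the model call itself is simulated rather than made over the network:

```python
import json

# Expose only a narrow, task-specific set of functions to the model.
FUNCTIONS = [
    {
        "name": "complete_task",
        "description": "Mark a task as complete",
        "parameters": {
            "type": "object",
            "properties": {"task_id": {"type": "string"}},
            "required": ["task_id"],
        },
    },
]

def dispatch(model_reply, allowed=FUNCTIONS):
    """Only execute calls to functions we explicitly exposed."""
    call = model_reply.get("function_call")
    if not call:
        return None
    names = {f["name"] for f in allowed}
    if call["name"] not in names:
        raise ValueError(f"model picked an unexposed function: {call['name']}")
    # Function arguments arrive as a JSON-encoded string.
    args = json.loads(call["arguments"])
    return call["name"], args

# A reply shaped like what the chat completions API returns:
reply = {"function_call": {"name": "complete_task",
                           "arguments": '{"task_id": "T-42"}'}}
name, args = dispatch(reply)
```

Keeping the allowed set small and validating the model's pick before executing anything is one way to make the "which function, which parameters" trust problem more tractable.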
| ignoramous wrote:
| > _A useful resource is still things like the OpenAI Cookbook,
| that is a decent collection of a lot of the things in this
| article_
|
| By far, the best resource I've found is the _Prompt Engineering
| Guide_ : https://www.promptingguide.ai/
|
 | > _you can't help but wonder that they'll move it behind their
 | API eventually with a 'session id' in the end_
 |
 | For in-context learning, I think it is fair to expect _100k_ to
 | _500k_ context windows sooner rather than later. OpenAI is
 | already at _32k_.
| daviding wrote:
| > By far, the best resource I've found is the Prompt
| Engineering Guide: https://www.promptingguide.ai/
|
| Agreed, that is a good resource for sure. For tooling I like
| https://promptmetheus.com/ but any pun name gets bonus points
| from me.
|
| > For in-context learning, I think it is fair to expect 100k
| to 500k context windows sooner. OpenAI is already at 32k.
|
 | It has been interesting to see that window increase so
 | quickly. For LLM context, the biggest constraint is pay-per-
 | token pricing if you don't run your own model, so you have to
 | wonder what that will look like in the future given how this
 | is trending. Throwing the entire context up with every call
 | seems to make it likely that OpenAI will encroach on the
 | storage side as well and offer sessions.
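A back-of-envelope sketch of the pay-per-token concern: if every turn re-sends the full conversation so far, cost grows superlinearly with conversation length. The per-1K-token prices below are illustrative placeholders, not actual list prices:

```python
# Illustrative per-1K-token prices (placeholders, not real list prices).
PROMPT_PRICE_PER_1K = 0.03
COMPLETION_PRICE_PER_1K = 0.06

def conversation_cost(turns, context_tokens_per_turn, reply_tokens):
    """Cost of a chat where each turn re-sends the whole history."""
    cost = 0.0
    context = context_tokens_per_turn
    for _ in range(turns):
        cost += context / 1000 * PROMPT_PRICE_PER_1K
        cost += reply_tokens / 1000 * COMPLETION_PRICE_PER_1K
        # The next turn re-sends everything plus the new exchange.
        context += context_tokens_per_turn + reply_tokens
    return cost
```

With these numbers, a 10-turn conversation costs well over ten times a single turn, which is exactly the pressure that makes server-side sessions or retrieval (sending only the relevant chunks) attractive.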
| akiselev wrote:
| _> Along the way, we've learned some useful lessons on how to
| build these kinds of applications, which we've formulated in
 | terms of patterns._
 |
 |   * Use a text template to enrich a prompt with context and
 |     structure
 |   * Tell the LLM to respond in a structured data format
 |   * Stream the response to the UI so users can monitor progress
 |   * Capture and add relevant context information to subsequent
 |     actions
 |   * Allow direct conversation with the LLM within a context
 |   * Tell the LLM to generate intermediate results while
 |     answering
 |   * Provide affordances for the user to have a back-and-forth
 |     interaction with the co-pilot
 |   * Combine the LLM with other information sources to access
 |     data beyond the LLM's training set
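The first two patterns in that list (template-enriched prompt, structured response format) can be sketched together; the template wording and function names here are invented for illustration:

```python
from string import Template

# A text template that enriches the prompt with retrieved context and
# instructs the model to reply in a structured format (a JSON array).
PROMPT = Template("""You are a brainstorming co-pilot.

Context:
$context

Task: $task

Respond ONLY with a JSON array of strings, one idea per element.""")

def build_prompt(task, context_chunks):
    context = "\n".join(f"- {c}" for c in context_chunks)
    return PROMPT.substitute(task=task, context=context)

prompt = build_prompt("Suggest product names",
                      ["The product is a bubble-tea delivery app"])
```

Asking for a machine-readable format up front is what makes the later patterns (streaming into the UI, feeding results into subsequent actions) practical to implement.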
| manojlds wrote:
| The short courses from dl.ai are better at driving these points
| - https://www.deeplearning.ai/short-courses/
| frankgrecojr wrote:
| > Stream the response to the UI so users can monitor progress
|
| This is a game changer to the UX
| jamifsud wrote:
| Anyone know of any good "tolerant" JSON parsers? I'd love to
| be able to stream a JSON response down to the client and have
| it be able to parse the JSON as it goes and handle the
| formatting errors that we sometimes see.
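One sketch of the kind of tolerant parsing asked about: close any unterminated string, then append whatever brackets are still open, and hand the repaired buffer to a strict parser. This is a best-effort illustration, not a named library; it ignores edge cases like truncated literals (`tru`) or a trailing comma at the cut point:

```python
import json

def parse_partial_json(buf):
    """Best-effort parse of a JSON document that may be cut off mid-stream."""
    stack, in_string, escaped = [], False, False
    for ch in buf:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch in "{[":
            stack.append("}" if ch == "{" else "]")
        elif ch in "}]" and stack:
            stack.pop()
    # Close an unterminated string, then any still-open containers.
    repaired = buf + ('"' if in_string else "") + "".join(reversed(stack))
    return json.loads(repaired)

# Each streamed chunk extends the buffer; parse whatever we have so far.
partial = '{"ideas": ["open a tea shop", "start a pod'
doc = parse_partial_json(partial)
```

Re-running this on every chunk lets the UI render a structured response incrementally instead of waiting for the closing brace.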
| senko wrote:
 | It's a crutch to minimize user annoyance at having to wait
 | up to a minute for the response. It sure beats the spinner,
 | but it's still a crutch.
| behnamoh wrote:
| Actually, it's annoying because as you start reading the
| first lines, the content keeps scrolling (often with jagged
| movements). I always have to scroll up immediately after the
| stream begins to disable this behavior.
| tobr wrote:
| That's totally fixable, though. ReadyRunner handles it
| simply by scrolling all the way from the start, leaving
| space for the message to grow.
| trafnar wrote:
| Hey, that's my app! https://www.readyrunner.ai
| huydotnet wrote:
 | Still not reasonable if you're expecting structured data in
 | the response, like JSON or something else you're required to
 | parse before showing it to the user.
| sgt101 wrote:
 | I find the whole idea of adding text into text to drive an
 | outcome pretty worrying if I have to rely on the output.
 |
 | If the probability of the model spitting out something bad is
 | 0.01%, will my testing find it? Probably not, but my users
 | certainly will.
| phillipcarter wrote:
| Well, it's a tool for ideation, not a strategy emitter. You
| don't rely on the output, you rely on the people who finalize
| and commit to a strategy.
| sgt101 wrote:
| Yeah - for an application like this I get it. But no one is
| getting rich or shifting the dial on scientific progress with
| this sort of thing.
| m3kw9 wrote:
 | LLM latency is a huge no-go for most apps except chat apps.
 | I've tried to build apps on OpenAI, and the latency itself
 | creates a bad experience no matter how much elevator
 | music/mirrors/spinners you add. Then you need proper error
 | correction when dealing with structured responses and
 | occasional hallucinations.
| mvdtnz wrote:
 | I am so despondent at the lack of creativity in most of the
 | (many, many) LLM-powered projects that are popping up. I have
 | seen hardly a single thing that goes beyond "it's a chatbot,
 | but with a special prompt". Like, is this the best we can
 | expect from this supposedly ground-breaking technology?
| bugglebeetle wrote:
 | Most of the stuff it's actually good at (like NLP tasks) is
 | both super boring and requires a secondary layer of processing
 | to catch hallucinations. Not as cool of a sales pitch to
 | everyone on the "it's alive!" hype train.
| pertymcpert wrote:
| Same. You just know most of the paid apps are going to be
| abandoned in a few months.
___________________________________________________________________
(page generated 2023-06-29 23:00 UTC)