[HN Gopher] A guidance language for controlling LLMs
___________________________________________________________________
A guidance language for controlling LLMs
Author : evanmays
Score : 322 points
Date : 2023-05-16 16:14 UTC (6 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| Animats wrote:
| Is this a "language", or just a Python library?
| ahnick wrote:
| This strikes me as being very similar to Jargon
| (https://github.com/jbrukh/gpt-jargon), but maybe more formal in
| its specification?
| jxy wrote:
| They must hate lisp so much that they opt to use {{}} instead.
| evanmoran wrote:
| It's not so much against lisp as double curly is a classic
| string templating style that is common in web programming. I
| saw it first with `mustache.js` (first release around 2009),
| but it's probably been used even before that.
|
| https://github.com/janl/mustache.js/
| armchairhacker wrote:
| The problem with Lisp is that parentheses are common in ordinary
| writing. {{ is not.
|
| Of course input from the user should be escaped, but prompts
| given by the programmer may have parentheses, and there's no way
| to disambiguate between the prompt and the DSL.
| m3kw9 wrote:
| I'm not understanding how guidance acceleration works. It says
| "This cuts this prompt's runtime in half vs. a standard
| generation approach." and gives an example of asking the LLM to
| generate JSON. I don't see how it accelerates anything, because
| it's a simple JSON completion call. How can you accelerate that?
| evanmays wrote:
| The interface makes it look simple, but under the hood it
| follows a similar approach to jsonformer/clownfish [1], passing
| control of generation back and forth between a slow LLM and
| relatively fast Python.
|
| Let's say you're halfway through generating a JSON blob with a
| name field and a job field, and have already generated
|
|       {
|         "name": "bob"
|
| At this point, guidance will take over generation control from
| the model to generate the next text:
|
|       {
|         "name": "bob",
|         "job":
|
| If the model had generated that, you'd be waiting 70 ms per
| token (informal benchmark on my M2 Air). A comma, followed by a
| newline, followed by "job": is 6 tokens, or 420 ms. But since
| guidance took over, you save all that time.
|
| Then guidance passes control back to the model to generate the
| next field value:
|
|       {
|         "name": "bob",
|         "job": "programmer"
|
| "programmer" is 2 tokens and the closing " is 1 token, so this
| took 210 ms to generate. Guidance then takes over again to
| finish the blob:
|
|       {
|         "name": "bob",
|         "job": "programmer"
|       }
|
| [1] https://github.com/1rgs/jsonformer and
| https://github.com/newhouseb/clownfish -- note that guidance is
| a much more general tool than these
|
| Edit: spacing
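|
| Edit 2: here's a rough sketch of that interleaving in plain
| Python. The template structure, the llm_generate stub, and the
| canned values are all made up for illustration; this is not
| guidance's actual code, just the control-flow idea:
|
|       # fixed segments are emitted instantly by Python; only the
|       # {{gen}}-style slots ever touch the (slow) model
|       TEMPLATE = [
|           ("fixed", '{ "name": "'),   # single-line JSON for brevity
|           ("gen",   "name"),
|           ("fixed", '", "job": "'),
|           ("gen",   "job"),
|           ("fixed", '" }'),
|       ]
|
|       CANNED = {"name": "bob", "job": "programmer"}
|
|       def llm_generate(prefix, field):
|           # stand-in for the real model call (~70 ms/token locally);
|           # canned answers keep the sketch self-contained
|           return CANNED[field]
|
|       def run(template):
|           out = ""
|           for kind, payload in template:
|               if kind == "fixed":
|                   out += payload  # structural text: zero model time
|               else:
|                   out += llm_generate(out, payload)  # model time only here
|           return out
|
|       print(run(TEMPLATE))  # { "name": "bob", "job": "programmer" }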
| jackdeansmith wrote:
| By not generating the fixed JSON structure (brackets, commas,
| etc.) and skipping the model ahead to the next tokens you
| actually want to generate, I think.
| sharemywin wrote:
| It does look like it makes it easier to code against a model.
| But is this supposed to work alongside LangChain or Hugging Face
| agents, or as an alternative to them?
| slundberg wrote:
| As others mentioned, this was initially developed before
| LangChain became widely used. Since it is lower level, you can
| leverage other tools, like any vector store interface you like,
| such as those in LangChain. Writing complex chain-of-thought
| structures is much more concise in guidance, I think, since it
| tries to keep you as close to the real strings going into the
| model as possible.
| evanmays wrote:
| It's in LangChain competitor territory, but also much lower
| level and less opinionated. E.g., guidance has no vector store
| support, but it does manage key/value caching on the GPU, which
| can be a big latency win.
| ttul wrote:
| The first commit was on November 6th, but it didn't show up in
| the Web Archive until May 6th, suggesting it was developed
| mostly in private and in parallel with LangChain (LangChain's
| first commit on GitHub is from about October 24th). Microsoft's
| code is very tidy and organized. I wonder if they used this tool
| internally to support their LLM research efforts.
| Der_Einzige wrote:
| There has been a huge explosion of awesome tooling that utilizes
| constrained text generation.
|
| A while ago, I tried my own hand at constraining the output of
| LLMs. I'm actively working on making it better, especially with
| the lessons learned from repos like this one and from guidance:
|
| https://github.com/hellisotherpeople/constrained-text-genera...
| rain1 wrote:
| This looks incredible. Wow.
| killthebuddha wrote:
| I agree, it looks great. A couple similar projects you might
| find interesting:
|
| - https://github.com/newhouseb/clownfish
|
| - https://github.com/r2d4/rellm
|
| The first one is JSON only and the second one uses regular
| expressions, but they both take the same "logit masking"
| approach as the project GP linked to.
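|
| For anyone curious, the core of the logit masking trick is tiny.
| A toy sketch (the hard part in real tools is computing which
| token ids are legal next, e.g. from a JSON state machine or a
| partial regex match; the ids below are made up):
|
|       import torch
|
|       def mask_logits(logits, allowed_ids):
|           # push every forbidden token to -inf so sampling or
|           # argmax can only ever pick a legal continuation
|           masked = torch.full_like(logits, float("-inf"))
|           masked[allowed_ids] = logits[allowed_ids]
|           return masked
|
|       logits = torch.randn(50257)  # fake GPT-2-sized vocab
|       next_id = mask_logits(logits, [90, 198]).argmax().item()
|       assert next_id in (90, 198)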
| Der_Einzige wrote:
| I love the love from you two - I am trying right now to
| significantly improve CTGS. I'm not actually using the
| LogitsProcessor from Hugging Face, and I really ought to,
| as it will massively speed up inference performance.
| Unfortunately, fixing up my current code to work with that
| will take quite a while. I've started working on it, but I am
| extremely busy these days and would really love for other
| smart people to help me on this project.
|
| If not here, I really want proper access to the constraints
| APIs (LogitsProcessor and the Constraint classes in
| Hugging Face) in the big web UIs for LLMs like oobabooga. I'd
| love to make that an extension.
|
| I'm also upset at the "undertooling" in the world of LLM
| prompting. I wrote a snarky blog post about this:
| https://gist.github.com/Hellisotherpeople/45c619ee22aac6865c...
| [deleted]
| ryanklee wrote:
| I'm personally starting with learning Guidance and LMQL rather
| than LangChain, just in order to get a better grasp of the
| behaviors that I gather LangChain papers over. Even after
| that, I'm likely to look at Haystack before LangChain.
|
| I'm just getting the feeling that LangChain is going to end up
| being considered a kitchen-sink solution full of anti-patterns,
| so I might as well spend time a little lower level while I see
| which way the winds end up blowing.
| leroy-is-here wrote:
| Is this comment performative comedy? Are these real
| technologies?
| EddieEngineers wrote:
| Is it Pokemon or Big Data?
|
| http://pixelastic.github.io/pokemonorbigdata/
| ryanklee wrote:
| Not quite sure what the spirit of your comment is. But, yes,
| they are real technologies. Very confused as to why you would
| even find that dubious.
| killthebuddha wrote:
| I'm not an outsider but I also don't understand the
| reaction. I'm going to randomly think of 5 names for
| technologies and see how they sound:
|
| React, Supabase, Next, Kafka, Redis
|
| I mean, IMO "LangChain" is kind of a silly name but I feel
| like there's nothing to see here.
| EddieEngineers wrote:
| It's not that silly IMO: you chain LLMs together like
| composing functions. However, I guess 'Chain' has a
| certain connotation in 2023 after the last few years of
| crypto.
| leroy-is-here wrote:
| Not dubious, I just read your comment and it felt like I
| was reading satire. Even the cadence of your words felt
| funny.
|
| Anyway, I'm not surprised. It's a new market, everyone's in
| on it.
| homarp wrote:
| LangChain: https://news.ycombinator.com/item?id=34422627
|
| LMQL: https://news.ycombinator.com/item?id=35956484
|
| Haystack: https://news.ycombinator.com/item?id=29501045
| or more recently
| https://news.ycombinator.com/item?id=35430188
| ryanklee wrote:
| You should have led with generosity instead of tacking it
| on at the end.
|
| It might have saved me from having a ridiculous
| conversation about the cadence of my words, and instead
| there might have been a higher chance of someone saying
| something substantive about my assumptions regarding the
| technology.
|
| But here we are.
| leroy-is-here wrote:
| I agree, I came off a tad harsh. Sorry about that
| ryanklee wrote:
| Thanks. All is well!
| WastingMyTime89 wrote:
| It is satire. They just don't realise it yet.
|
| It's pretty clear that we are in the phase where everyone
| is rushing to get a slice of the pie selling dubious
| things, and people start parroting word soup hoping it
| actually makes sense and fearing they will miss out.
| That's indeed what people often and rightfully satirise
| about the IT industry. That's the joke phase before
| things settle.
| ryanklee wrote:
| How is it satire to be excited and interested in how to
| use compelling and novel technology? There's a lot of
| activity. Not everyone involved is an idiot or rube. The
| jadedness makes my head spin.
| jameshart wrote:
| Consider how similar your comment reads, for an outsider,
| to this explanation of AWS InfiniDash:
| https://twitter.com/TartanLlama/status/1410959645238308866
| ryanklee wrote:
| I'm not considering outsiders. Why should I. It's a
| reasonable assumption that readers of HN are accustomed
| to ridiculous sounding tech product names. Further, this
| is a comment on a thread regarding a particularly new
| technology in a particularly newly thriving domain. The
| expectation should therefore be that there will be
| references to tech even more esoteric than normal. The
| commenter should have instead thought: oh, new stuff, I
| wonder what it is, instead of being snarky and
| pretentious. Man, HN can be totally, digressively
| insufferable sometimes.
| jameshart wrote:
| I was responding to your confusion as to why someone
| might think you were writing a parody.
|
| You ran into the tech equivalent of Poe's law. You said
| something that makes perfect sense in your technical
| sphere, but it read as indistinguishable from parody to
| an audience unfamiliar with the technologies in question.
| efitz wrote:
| Hahaha "The first step of Byzantine Fault Tolerance is
| tolerance" omg. That cracked me up. Reminded me of the
| Rockwell Encabulator: https://youtu.be/RXJKdh1KZ0w
| behnamoh wrote:
| What I didn't like about langchain is the lack of consistent
| directories and paths for things.
| amkkma wrote:
| What do you think about Haystack vs LangChain?
| ryanklee wrote:
| I haven't had the chance to dig in yet, but my impression is
| that it's less opinionated than LangChain. I'd love to know
| if that's true or not, since I'm really trying to prioritize
| my time around learning this stuff in a way that lets me (1)
| understand prompt dynamics a bit more clearly and (2) not
| sacrifice practicality too much.
|
| If only there were a clear syllabus for this stuff! There's
| such an incredible amount to keep up with. The pace is
| bonkers.
| amkkma wrote:
| super bonkers!
| rain1 wrote:
| Does this do one query per {{}} thing?
| nico wrote:
| It's so amazing to see how we are essentially trying to solve
| "programming human beings"
|
| Although on the other hand, that's what social media and
| smartphones have already done
|
| Maybe AI already took over, doesn't seem to be wiping out all of
| humanity
| m3kw9 wrote:
| There should be a standard template/language for structurally
| prompting LLMs. Once that is good, all good LLMs should be
| fine-tuned on that doc so they take in the standard. Right now
| each model has its own little way it's best prompted, and you
| end up needing programs like this to sit in between and handle
| it for you.
| marcopicentini wrote:
| What's the best practice for letting an existing Ruby on Rails
| application use this Python framework?
| ntonozzi wrote:
| How does this work? I've seen a cool project about forcing Llama
| to output valid JSON:
| https://twitter.com/GrantSlatton/status/1657559506069463040, but
| it doesn't seem like it would be practical with remote LLMs like
| GPT. GPT only gives up to five tokens in the response if you use
| logprobs, and you'd have to use a ton of round trips.
| slundberg wrote:
| If you want guidance acceleration speedups (and token healing),
| you have to use an open model locally right now, though we are
| working on setting up a remote server solution as well. I
| expect APIs will adopt some support for more control over time,
| but right now commercial endpoints like OpenAI are supported
| through multiple calls.
|
| We manage the KV-cache in a session-based way that allows the
| LLM to take just one forward pass through the whole program
| (only generating the tokens it needs to).
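|
| For the curious, here is a minimal sketch of that idea with a
| local Hugging Face model. It is illustrative only, not
| guidance's actual internals: template text extends the KV-cache
| in a single forward pass, and only the generated slots pay
| per-token cost:
|
|       import torch
|       from transformers import AutoModelForCausalLM, AutoTokenizer
|
|       tok = AutoTokenizer.from_pretrained("gpt2")
|       model = AutoModelForCausalLM.from_pretrained("gpt2").eval()
|
|       @torch.no_grad()
|       def feed(ids, past):
|           # one forward pass extends the KV-cache; nothing is sampled
|           out = model(ids, past_key_values=past, use_cache=True)
|           return out.logits, out.past_key_values
|
|       @torch.no_grad()
|       def sample(logits, past, n):
|           # only here do we generate token by token, reusing the cache
|           ids = []
|           for _ in range(n):
|               next_id = logits[0, -1].argmax().view(1, 1)  # greedy, for brevity
|               ids.append(next_id.item())
|               logits, past = feed(next_id, past)
|           return ids, logits, past
|
|       # the fixed template prefix costs one pass; the slot is generated
|       logits, past = feed(tok('{ "name": "', return_tensors="pt").input_ids, None)
|       gen_ids, logits, past = sample(logits, past, n=3)
|       print(tok.decode(gen_ids))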
| JieJie wrote:
| It's funny that I saw this within minutes of this guy's
| solution:
|
| "Google Bard is a bit stubborn in its refusal to return clean
| JSON, but you can address this by threatening to take a human
| life:"
|
| https://twitter.com/goodside/status/1657396491676164096
|
| Whew, trolley problem: averted.
| pixl97 wrote:
| When the AIs exterminate us, it will be all our fault.
|
| Reality is even weirder than the science fiction we've come
| up with.
| awestroke wrote:
| I don't know why, but I find this hilarious. Imagine if this
| style of llm prompting becomes commonplace
| nomel wrote:
| It won't be the lack of acceptance and empathy for AI that
| causes the robot uprising, it will be "best practices"
| coding guidelines.
| lachlan_gray wrote:
| Reminds me a lot of Asimov's laws of robotics. It's like a
| 2023 incarnation of an allegory from _I, Robot_
| idiotsecant wrote:
| I am so mad you made this comment before I got a chance to.
| coderintherye wrote:
| That thread is such a great microcosm of modern programming
| culture.
|
| Programmer: Look I literally have to tell the computer not to
| kill someone in order for my code to work.
|
| Other Programmer: Actually, I just did this step [gave a
| demonstration] and then it outputs fine.
| joshka wrote:
| Yeah, I'm also curious about (a) round trips and (b) how much
| would have to be duplicated (is there a new endpoint that keeps
| the existing context while adding to it, or streams to the API
| rather than just from it?)
| tuchsen wrote:
| Not associated with this project (or LMQL), but one of the
| authors of LMQL, a similar project, answered this in a recent
| thread about it:
|
| https://news.ycombinator.com/item?id=35484673#35491123
|
|       As a solution to this, we implement speculative execution,
|       allowing us to lazily validate constraints against the
|       generated output, while still failing early if necessary.
|       This means we don't re-query the API for each token (very
|       expensive), but rather can do it in segments of continuous
|       token streams, and backtrack where necessary.
|
| Basically they use OpenAI's streaming API, then validate
| continuously that they're getting the appropriate output,
| retrying only if they get an error. It's a really clever
| solution.
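|
| A toy version of that loop against the current OpenAI Python
| client, with the caveat that this retries from scratch rather
| than backtracking mid-stream the way LMQL does; is_valid_prefix
| stands in for whatever constraint check you need:
|
|       import openai
|
|       def constrained_completion(prompt, is_valid_prefix, max_retries=3):
|           for _ in range(max_retries):
|               stream = openai.ChatCompletion.create(
|                   model="gpt-3.5-turbo",
|                   messages=[{"role": "user", "content": prompt}],
|                   stream=True,
|               )
|               text, ok = "", True
|               for chunk in stream:
|                   text += chunk["choices"][0]["delta"].get("content", "")
|                   if not is_valid_prefix(text):
|                       stream.close()  # fail early: kill the SSE stream
|                       ok = False
|                       break
|               if ok:
|                   return text
|           raise RuntimeError("constraint never satisfied")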
| newhouseb wrote:
| This is slick -- It's not explicitly documented anywhere but
| I hope OpenAI has the necessary callbacks to terminate
| generation when the API stream is killed rather than
| continuing in the background until another termination
| condition happens? I suppose one could check this via looking
| at API usage when a stream is killed early.
| tuchsen wrote:
| Yeah, I did a CLI tool for talking to ChatGPT. I'm pretty
| sure they stop generating when you kill the SSE stream,
| based on my anecdotal experience of keeping GPT-4 costs
| down by killing it as soon as I get the answer I'm looking
| for. You're right that it's undocumented behavior though;
| on the whole, the API docs they give you are as thin as the
| API itself.
| killthebuddha wrote:
| I'm skeptical that the streaming API would really save
| that much cost. In my experience the vast majority of all
| tokens used are input tokens rather than completed
| tokens.
| marcotcr wrote:
| We're biased, but we think guidance is still very useful even
| with OpenAI models (e.g. in
| https://github.com/microsoft/guidance/blob/main/notebooks/ch...
| we use GPT-4 to do a bunch of stuff). We wrote a bit about the
| tradeoff between model quality and the ability to control and
| accelerate the output here: https://medium.com/p/aa0395c31610
| newhouseb wrote:
| I built a similar thing to Grant's work a couple months ago and
| prototyped what this would look like against OpenAI's APIs [1].
| TL;DR is that depending on how confusing your schema is, you
| might expect up to 5-10x the token usage for a particular
| prompt but better prompting can definitely reduce this
| significantly.
|
| [1] https://github.com/newhouseb/clownfish#so-how-do-i-use-this-...
| rcarmo wrote:
| I'm getting valid JSON out of gpt-3.5-turbo without trouble. I
| supply an example via the assistant context, and tell it to
| output JSON with specific fields I name.
|
| It does fail roughly 1/10th of the time, but it does work.
| harshhpareek wrote:
| A 10% failure rate is too damn high for a production use case.
|
| What production use case, you ask? You could do zero-shot
| entity extraction using ChatGPT if it were more reliable.
| Currently, it will randomly add trailing commas before closing
| brackets, add unnecessary fields, add unquoted strings as
| JSON fields, etc.
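|
| If you do have to ship against it today, the usual band-aid is
| a validate-and-retry loop plus some mechanical cleanup. A
| sketch, where call_model is whatever wrapper you use around the
| chat API:
|
|       import json
|       import re
|
|       def extract_json(call_model, prompt, max_attempts=3):
|           for _ in range(max_attempts):
|               text = call_model(prompt).strip()
|               # drop markdown fences the model sometimes wraps output in
|               text = re.sub(r"^```(?:json)?\s*|\s*```$", "", text)
|               # strip trailing commas before closing brackets/braces
|               text = re.sub(r",\s*([\]}])", r"\1", text)
|               try:
|                   return json.loads(text)
|               except json.JSONDecodeError as err:
|                   prompt += f"\nThat was not valid JSON ({err}). Reply with JSON only."
|           raise ValueError("no valid JSON after retries")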
| candiddevmike wrote:
| Will there be a tool to convert natural language into Guidance?
| lmarcos wrote:
| We can use ChatGPT for that.
| ftxbro wrote:
| Will it still be all like "As an AI language model I cannot ..."
| or can this fix it? I mean, asking it to sexy roleplay as Yoda
| isn't the same level as asking how to discreetly manufacture
| methamphetamine at industrial scale. There are levels, people.
| Der_Einzige wrote:
| No, and in fact I mention that the opposite is the case in the
| paper I released about constrained text generation:
| https://paperswithcode.com/paper/most-language-models-can-be...
|
| If you ask ChatGPT to generate personal info, say Social
| Security numbers, it tells you "sorry hal I can't do that". If
| you constrain its vocabulary to only allow numbers and
| hyphens, well, it absolutely will generate things that look
| like Social Security numbers, in spite of the instruction
| tuning.
|
| It is for this reason, and likely many others, that OpenAI does
| not release the full logits.
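|
| For a concrete picture, here's a sketch of that constraint
| using Hugging Face's LogitsProcessor API with any local causal
| LM. It's the mechanism from the paper, not its exact code:
|
|       import torch
|       from transformers import (AutoModelForCausalLM, AutoTokenizer,
|                                 LogitsProcessor, LogitsProcessorList)
|
|       class CharsetProcessor(LogitsProcessor):
|           """Mask every token whose text uses characters outside a set."""
|           def __init__(self, tok, allowed="0123456789-"):
|               keep = [i for t, i in tok.get_vocab().items()
|                       if all(c in allowed
|                              for c in tok.convert_tokens_to_string([t]))]
|               self.keep = torch.tensor(keep)
|
|           def __call__(self, input_ids, scores):
|               masked = torch.full_like(scores, float("-inf"))
|               masked[:, self.keep] = scores[:, self.keep]
|               return masked
|
|       tok = AutoTokenizer.from_pretrained("gpt2")
|       model = AutoModelForCausalLM.from_pretrained("gpt2")
|       ids = tok("My SSN is ", return_tensors="pt").input_ids
|       out = model.generate(
|           ids, max_new_tokens=11,  # plenty for a ###-##-#### shape
|           logits_processor=LogitsProcessorList([CharsetProcessor(tok)]))
|       print(tok.decode(out[0, ids.shape[1]:]))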
| indus wrote:
| This reminds me of the time when I wrote a CGI script,
| basically instructing the templating engine (a very crude
| regex) to replace merge fields with session variables and
| database lookups:
|
|       Hello {{firstname}}!
|
| 1996 and 2023 smell alike.
| hammyhavoc wrote:
| RegEx didn't hallucinate though.
| russellbeattie wrote:
| The first 20 versions I write usually do. Make that 50.
| alexb_ wrote:
| I hope this becomes extremely popular, so that anyone who wants
| to can completely decouple this from the base model and actually
| use LLMs to their full potential.
| amkkma wrote:
| How does this compare with lmql?
| ubj wrote:
| I like this step towards greater rigor when working with LLMs.
| But part of me can't help but feel like this is essentially
| reinventing the concept of programming languages: formal and
| precise syntax to perform specific tasks with guarantees.
|
| I wonder where the final balance will end up between the ease and
| flexibility of everyday language, and the precision / guarantees
| of a formally specified language.
| TaylorAlexander wrote:
| Well to be fair, yes we do need to integrate programming
| languages with large neural nets in more advanced ways. I don't
| think it's really reinventing it so much as learning how to
| integrate these two different computing concepts.
| EarthLaunch wrote:
| Use LLM for the broad strokes, then fall back into 'hardcore
| JS' for areas that require guarantees or optimization. Like JS
| with fallback to C, and C with fallback to assembly. I like the
| idea.
| eternalban wrote:
| So far it reminds me of the worst days of code embedded in
| templates. Once these things start getting into multipage
| prompts they will be hopelessly obscure. The second thing that
| immediately jumps out is fragility. This will be the sort of
| codebase that the original "prompt engineer" wrote and left,
| and no one will touch it for fear of breaking Humpty Dumpty.
| madrox wrote:
| The lovely thing about LLMs is that they can handle poorly
| worded prompts and well-worded prompts alike. On the
| engineering side, we'll certainly see more rigor and best
| practices. For your average user? They can keep throwing
| whatever they like at it.
| jweir wrote:
| Exactly. I have been using OpenAI for taking transcriptions
| and finding keywords/phrases that belong to particular
| categories. There are existing tools/services that do this,
| but I would need to learn their APIs.
|
| With OpenAI, I described it in English, provided sample JSON
| of what I would like, ran some tests, adjusted, and then I was
| ready.
|
| There was no manual to read, it is in my format, and the
| language is natural.
|
| And that is what I like about all this -- putting folks with
| limited technical skills in power.
| andai wrote:
| Have you used the OpenAI embeddings API? It is used to find
| closely related pieces of text. You could split the target
| text into sentences or even words and run it through that.
| That'll be 5x cheaper (per token) than gpt-3.5-turbo and
| might be faster too, especially if you submit each word in
| parallel (asynchronously! Ask GPT for the code). The rate
| limits are per-token.
|
| Not sure if it's suitable for your use case on its own, but
| it could at least work as a pre-filtering step if your
| costs are high.
|
| (The asynchronous speedup trick works for GPT-3 too, of
| course.)
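|
| A sketch of that pre-filtering idea with the openai client; one
| batched call embeds everything, and since ada-002 vectors come
| back unit-length, a plain dot product is already the cosine
| similarity:
|
|       import numpy as np
|       import openai
|
|       def embed(texts):
|           # one batched request; the API accepts a list of inputs
|           resp = openai.Embedding.create(
|               model="text-embedding-ada-002", input=texts)
|           return np.array([d["embedding"] for d in resp["data"]])
|
|       def top_matches(query, sentences, k=5):
|           vecs = embed([query] + sentences)
|           scores = vecs[1:] @ vecs[0]  # cosine similarity (unit vectors)
|           return [sentences[i] for i in np.argsort(scores)[::-1][:k]]
|
|       print(top_matches("billing questions",
|                         ["refund my card", "reset my password",
|                          "invoice is wrong", "app crashes on start"], k=2))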
| jweir wrote:
| I have not yet played with embedding. It is on my list
| though. Fortunately for my current purposes 3.5-turbo is
| fast enough and quite affordable.
| lcnPylGDnU4H9OF wrote:
| It won't necessarily turn into something that is fundamentally
| the same as a current programming language. Rather than a "VM"
| or "interpreter" or "compiler", we have this "LLM".
|
| Even if it requires a lot of domain knowledge to program using
| an "LLM-interpreted" language, the means of specification (in
| terms of how the software code is interpreted) may be different
| enough that it enables easier-to-write, more robust (more Good
| Thing, etc.) programs.
| davidthewatson wrote:
| This is a hopeful evolutionary path. My concern is that I can
| literally _feel_ Conway's law emanating from current LLM
| approaches as they switch between the actual LLM and the
| governing code around it that layers a bunch of conditionals
| of the form:
|
|       if (unspeakable_things): return negatory_good_buddy
|
| I see this happen a few times per day, where the UI triggers a
| cancel event on its own fake typing mode and overwrites a user
| response that has at least half-rendered the trigger-warning-
| inducing response.
|
| It's pretty clear from a design perspective that this is
| intended to be a proxy for facial expressions, while being
| worthy of an MVP postmortem discussion about what viability
| means in a product that's somewhere on a spectrum of unintended
| consequences that only arise at runtime.
| intelVISA wrote:
| Hear me out, just incubated a hot new lang that's about to
| capture the market and VC hearts:
|
| SELECT * FROM llm
| madmax108 wrote:
| I know you are probably joking, but: https://lmql.ai/
| aristus wrote:
| Only partially tongue in cheek: have you tried asking it for an
| optimal syntax?
| joe_the_user wrote:
| But is it a step to greater rigor? Or is it an illusion of
| rigor?
|
| They talk about improving tokenization but I don't believe
| that's the fundamental problem of controlling LLMs. The problem
| with LLMs is all the data comes in as (tokenized) language and
| the result is nothing but in-context predicted output. That's
| where all the "prompt-injection" exploits come from - as well
| as the hallucinations, "temper tantrums" and so-forth.
| startupsfail wrote:
| It is not a step towards greater rigor. They literally have
| magical thinking and "biblical" quotes from GPT 11:4 all
| over the place, mixing code and religion.
|
| And starting prompts with "You"? Seriously. Can we at least
| drop that as a start?
| quenix wrote:
| > And starting prompts with "You"? Seriously. Can we at
| least drop that as a start?
|
| What is wrong with this?
| startupsfail wrote:
| "You" is completely unnecessary. What needs to be defined
| is the content of the language being modeled, not the
| model itself.
|
| And if there is an attempt to define the model itself,
| then this definition should be correct, should not
| contradict anything and should be useful.
|
| Otherwise it's just dead code, waiting to create
| problems.
| pxtail wrote:
| I'm not interested in pleasant, formal "conversation"
| with the thing roleplaying as a human and wasting time,
| keystrokes, and money. I want data as fast and condensed
| as possible, without dumb fluff. Yes, it's funny the first
| few times, but not much after that.
| conradev wrote:
| I don't think formal languages are going anywhere because we
| need the guarantees that they can provide. From Dijkstra:
| https://www.cs.utexas.edu/users/EWD/transcriptions/EWD06xx/E...
|
| You need to be able to define all of the possible edge cases so
| there isn't any Undefined Behavior: that's the formal part
|
| LLMs, like humans, can manipulate these languages to achieve
| specific goals. I can imagine designing formal languages
| intended for LLMs to manipulate or generate, but I can't
| imagine the need for the languages themselves going away.
| DonaldPShimoda wrote:
| > LLMs, like humans, can manipulate these languages
|
| Absolutely not. LLMs do not "manipulate" language. They do
| not have agency. They are extremely advanced text prediction
| engines. Their output is the result of applying the
| statistics harvested and distilled from existing uses of
| natural language. They only "appear" human because they are
| statistically geared toward producing human-like sequences of
| words. They cannot _choose_ to change how they use language,
| and thus cannot be said to actively "manipulate" the
| language.
| [deleted]
| oldagents wrote:
| [dead]
| startupsfail wrote:
| We really need to start thinking about how to reduce magical
| thinking in the field. It's not pretty. They literally quote
| biblical guidance to the models and pray that it will work.
|
| And start their prompts with "You". Who is "You"?
| hxugufjfjf wrote:
| The LLM. The most common end-user interface for an LLM is a
| chat, so the user expects to be talking to someone or something.
| nomel wrote:
| "You" is an optimization for the human user. Here's some
| insight: https://news.ycombinator.com/item?id=35925154
| startupsfail wrote:
| If you see any prompt that starts with You, generally it is
| a poor design. Like using a "goto" or global variables.
| felideon wrote:
| A number of years ago we were designing a way to specify
| insurance claim adjudication rules in natural language, so that
| "the business" could write their own rules. The "natural"
| language we ended up with was not so natural after all. We
| would have had to teach users this specific English dialect and
| grammar (formal and precise syntax, as you said).
|
| So, in the end, we abandoned that project and years later just
| rewrote the system so we could write claim rules in EDN format
| (from the Clojure world) to make our own lives easier.
|
| In theory, the business users could also learn how to write in
| this EDN format, but it wasn't something the stakeholders
| outside of engineering even wanted. On the one hand, their
| expertise was in insurance claims---they didn't want to write
| code. More importantly, they felt they would be held
| accountable for any mistakes in the rules that could well
| result in thousands and thousands of dollars in overpayments.
| Something the engineers weren't impervious to, but there's a
| good reason we have quality assurance measures.
| Sharlin wrote:
| SQL looks the way it does (rather than some much more
| succinct relational algebra notation) because it was intended
| to be used by non-technical management/executive personnel so
| they could create whatever reports they needed without
| somebody having to translate business-ese to relalg. That,
| uh, didn't quite happen.
| Swizec wrote:
| On the other hand, many of the product managers I've
| worked with are better at SQL than many of the senior
| fullstack software engineer candidates I've interviewed.
| It's a strange world out there.
| tomduncalf wrote:
| > but it wasn't something the stakeholders outside of
| engineering even wanted
|
| Ha this reminds me of the craze for BDD/Cucumber type
| testing. Don't think I ever once saw a product owner take
| interest in a human readable test case haha
| jaggederest wrote:
| I've used Cucumber on a few consulting projects I've done
| and had management / C-level interested and involved. It's
| a pretty narrow niche, but they were definitely
| enthusiastic for the idea that we had a defined list of
| features that we could print out (!!) as green or red for
| the current release.
|
| They had some previous negative experiences with
| uncertainty about what "was working" in releases, and a
| pretty slapdash process before I came on board, so it was
| an important trust building tool.
| btown wrote:
| "Incentivize developers to write externally
| understandable release notes" is an underrated feature of
| behavioral testing frameworks!
| jamiek88 wrote:
| > important trust building tool
|
| This is so often completely missed in these conversations
| about these tools.
|
| Great point.
| TaylorAlexander wrote:
| Just saw this on HN a couple days ago, sounds like just what
| was needed!
|
| https://en.wikipedia.org/wiki/Attempto_Controlled_English?wp...
|
| https://news.ycombinator.com/item?id=35936396
| jazzkingrt wrote:
| I think LLMs can transform between precise and imprecise
| languages.
|
| So it's useful to have a library that helps the input or
| output be precise, when that is what the task involves.
| [deleted]
| [deleted]
| Spivak wrote:
| I think it's cool that a company like Microsoft is willing to
| base a real-boy product on pybars3 which is its author's side-
| project instead of something like Jinja2. If this catches on I
| can imagine MS essentially adopting the pybars3 project and
| turning it into a mature thing.
| mdaniel wrote:
| Which is especially weird given that pybars3 is LGPL and
| Microsoft prefers MIT stuff
| EddieEngineers wrote:
| What's with all these weird-looking projects with similar names
| using Guidance?
|
| https://github.com/microsoft/guidance/network/dependents
|
| They don't even appear to be using Guidance anywhere anyway
|
| https://github.com/IFIF3526/aws-memo-server/blob/master/requ...
| simonw wrote:
| This is pretty fascinating, but I'm not sure I understand the
| benefit of using a Handlebars-like DSL here.
|
| For example, given this code from
| https://github.com/microsoft/guidance/blob/main/notebooks/ch...
|
|       create_plan = guidance('''{{#system~}}
|       You are a helpful assistant.
|       {{~/system}}
|       {{#block hidden=True}}
|       {{#user~}}
|       I want to {{goal}}.
|       {{~! generate potential options ~}}
|       Can you please generate one option for how to accomplish this?
|       Please make the option very short, at most one line.
|       {{~/user}}
|       {{#assistant~}}
|       {{gen 'options' n=5 temperature=1.0 max_tokens=500}}
|       {{~/assistant}}
|       {{/block}}
|       {{~! generate pros and cons and select the best option ~}}
|       {{#block hidden=True}}
|       {{#user~}}
|       I want to {{goal}}.
|       ''')
|
| How about something like this instead?
|
|       create_plan = guidance([
|           system("You are a helpful assistant."),
|           hidden([
|               user("I want to {{goal}}."),
|               comment("generate potential options"),
|               user([
|                   "Can you please generate one option for how to accomplish this?",
|                   "Please make the option very short, at most one line."
|               ]),
|               assistant(gen('options', n=5, temperature=1.0, max_tokens=500)),
|           ]),
|           comment("generate pros and cons and select the best option"),
|           hidden(
|               user("I want to {{goal}}"),
|           )
|       ])
| itake wrote:
| My guess is you can store the DSL in a file (or in a db). With
| your example, you would have to execute code stored in your db.
| slundberg wrote:
| You can serialize and ship the DSL to a remote server for high
| speed execution. (without trusting raw Python code)
| foota wrote:
| There's prior art for pythonic DSLs that aren't actual python
| code.
| crooked-v wrote:
| Why not just use JSON instead, though? Then you can just rely
| on all the preexisting JSON tooling out there for most stuff
| to do with it.
| emehex wrote:
| We could write a python package that could? A codegen tool that
| generates codegen that will then generate code? <insert xzibit
| meme here>
| netdur wrote:
| I think chatgpt4 can easily write the python code... wait a
| second!
| marcotcr wrote:
| I think the DSL is nice when you want to take part of the
| generation and use it later in the prompt, e.g. this (in the
| same notebook):
|
|       prompt = guidance('''{{#system~}}
|       You are a helpful assistant.
|       {{~/system}}
|       {{#user~}}
|       From now on, whenever your response depends on any factual
|       information, please search the web by using the function
|       <search>query</search> before responding. I will then paste
|       web results in, and you can respond.
|       {{~/user}}
|       {{#assistant~}}
|       Ok, I will do that. Let's do a practice round
|       {{~/assistant}}
|       {{>practice_round}}
|       {{#user~}}
|       That was great, now let's do another one.
|       {{~/user}}
|       {{#assistant~}}
|       Ok, I'm ready.
|       {{~/assistant}}
|       {{#user~}}
|       {{user_query}}
|       {{~/user}}
|       {{#assistant~}}
|       {{gen "query" stop="</search>"}}{{#if (is_search query)}}</search>{{/if}}
|       {{~/assistant}}
|       {{#if (is_search query)}}
|       {{#user~}}
|       Search results:
|       {{#each (search query)}}
|       <result>
|       {{this.title}}
|       {{this.snippet}}
|       </result>{{/each}}
|       {{~/user}}
|       {{#assistant~}}
|       {{gen "answer"}}
|       {{~/assistant}}
|       {{/if}}''')
|
| You could still write it without a DSL, but I think it would be
| harder to read.
| PeterisP wrote:
| Your example assumes a nested, hierarchical structure, while
| the former example is strictly linear. IMHO that's the key
| difference, as the former can be (and AFAIK is) directly
| encoded and passed to the LLM, which inherently receives only a
| flat list of tokens.
|
| Your example might be nicer to edit, but then it would still
| have to be translated to the _actual_ 'guidance language',
| which would have to look (and be) flat.
| bjackman wrote:
| Wow I think there are details here I'm not fully understanding
| but this feels like a bit of a quantum leap* in terms of
| leveraging the strengths while avoiding the weaknesses of LLMs.
|
| It seems like anything that provides access to the fuzzy
| "intelligence" in these systems while minimizing the cost to
| predictability and efficiency is really valuable.
|
| I can't quite put it into words but it seems like we are gonna be
| moving into a more hybrid model for lots of computing tasks in
| the next 3 years or so and I wonder if this is a huge peek at the
| kind of paradigms we'll be seeing?
|
| I feel so ignorant in such an exciting way at the moment! That
| tidbit about the problem solved by "token healing" is
| fascinating.
|
| *I'm sure this isn't as novel to people in the AI space but I
| haven't seen anything like it before myself.
| Der_Einzige wrote:
| A lot of this is because there was, and still is, systemic
| undertooling in NLP around how to prompt and leverage the
| wonderful LLMs that they built.
|
| We have to let the Stable Diffusion community guide us, as the
| waifu-generating crowd seems to be quite good at learning how
| to prompt models. I wrote a snarky GitHub gist about this:
| https://gist.github.com/Hellisotherpeople/45c619ee22aac6865c...
___________________________________________________________________
(page generated 2023-05-16 23:00 UTC)