[HN Gopher] Structured Outputs in the API
___________________________________________________________________
Structured Outputs in the API
Author : davidbarker
Score : 369 points
Date : 2024-08-06 17:41 UTC (5 hours ago)
(HTM) web link (openai.com)
(TXT) w3m dump (openai.com)
| gamegoblin wrote:
| I'm glad they gave up on their "fine-tuning is all you need"
| approach to structured output. It's possible fine-tuning will
| work in the long term, but in the short-term, people are trying
| to build things, and fine-tuning wasn't cutting it.
|
| Surprised it took them so long -- llama.cpp got this feature 1.5
| years ago (actually an even more general version of it that
| allows the user to provide any context free grammar, not just
| JSON schema)
| chhabraamit wrote:
| How does llama.cpp's grammar adherence work?
|
| Does it keep validating the predicted tokens and backtrack when
| it's not valid?
| gamegoblin wrote:
| It's essentially an Earley Parser[0]. It maintains a set of
| all possible currently valid parses, and zeroes out the
| probability of any token that isn't valid in at least 1 of
| the current potential parse trees.
|
| There are contrived grammars you can give it that will make
| it use exponential memory, but in practice most real-world
| grammars aren't like this.
|
| [0] https://en.wikipedia.org/wiki/Earley_parser
| tcdent wrote:
| GPT is still a language model, so at some point it's still just
| tokens.
|
| Is this just a schema validation layer on their end to avoid
| the round trip (and cost) of repeating the call?
| gamegoblin wrote:
| Language models like GPT output a large vector of
| probabilities for the next token. Then a sampler decides
| which of those tokens to pick.
|
| The simplest algorithm for getting good quality output is to
| just always pick the highest probability token.
|
| If you want more creativity, maybe you pick randomly among
| the top 5 highest probability tokens or something. There are
| a lot of methods.
|
| All that grammar-constrained decoding does is zero out the
| probability of any token that would violate the grammar.
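|
| (A minimal sketch of that last step in plain Python/numpy; the
| is_valid_prefix grammar check is a stand-in, and real
| implementations precompute a token mask from the parser state
| rather than re-checking every candidate string.)
|
|     import numpy as np
|
|     def pick_token(logits, vocab, generated, is_valid_prefix):
|         # Mask out (-inf) every token that would break the grammar...
|         masked = np.where(
|             [is_valid_prefix(generated + tok) for tok in vocab],
|             logits, -np.inf)
|         # ...then sample as usual; here, greedy argmax over survivors.
|         return vocab[int(np.argmax(masked))]
|
|     # Toy usage: only digits are allowed after "id="
|     vocab = ["a", "7", "}", "3"]
|     logits = np.array([2.0, 1.0, 0.5, 0.3])
|     print(pick_token(logits, vocab, "id=",
|                      lambda s: s == "id=" or s[3:].isdigit()))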
| nickreese wrote:
| Thank you for this explanation. A few things just clicked
| for me.
| BoorishBears wrote:
| I was surprised it took so long until I reached this line:
|
| > The model can fail to follow the schema if the model chooses
| to refuse an unsafe request. If it chooses to refuse, the
| return message will have the refusal boolean set to true to
| indicate this.
|
| I'm not sure how they implemented that, maybe they've figured
| out a way to give the grammar a token or set of tokens that are
| always valid mid generation and indicate the model would rather
| not continue generating.
|
| Right now JSON generation is one of the most reliable ways to
| get around refusals, and they managed not to introduce that
| weakness into their model
| Der_Einzige wrote:
| For many things, fine-tuning as we know it will NEVER fully
| solve this; there's no hope. Even fine-tuning a model to an
| overwhelming degree not to use the letter "e" doesn't entirely
| prevent it, only reduces its chances to increasingly small
| amounts. Shameless self plug, and from before the ChatGPT era
| too! https://paperswithcode.com/paper/most-language-models-can-
| be...
| leetharris wrote:
| At the bottom:
|
| >Acknowledgements Structured Outputs takes inspiration from
| excellent work from the open source community: namely, the
| outlines, jsonformer, instructor, guidance, and lark libraries.
|
| It is cool to see them acknowledge this, but it's also lame for a
| company named "OpenAI" to acknowledge getting their ideas from
| open source, then contributing absolutely NOTHING back to open
| source with their own implementation.
| warkdarrior wrote:
| > it's also lame for a company named "OpenAI" to acknowledge
| getting their ideas from open source, then contributing
| absolutely NOTHING back to open source with their own
| implementation
|
| Maybe those projects were used as-is by OpenAI, so there was
| nothing new to contribute.
| reustle wrote:
| I think they may be alluding to sponsorships as well as code
| contributions.
|
| i.e. https://github.com/sponsors/jxnl
| spencerchubb wrote:
| Is offering gpt4o for free through chatgpt not enough of a
| contribution? They didn't release source code, but they made a
| product free to use
| mplewis wrote:
| No. If it were free you'd be able to use it as a programming
| API. It's not free and it's not unlimited - it's a time-
| limited marketing tool.
| spencerchubb wrote:
| How are you defining the word free?
| talldayo wrote:
| Free service != Open software
| notarobot123 wrote:
| This isn't generosity, it's a well known and much used
| strategy for market penetration. Free until-we-decide-
| otherwise is very much not the same as open source.
| rvense wrote:
| Insofar as it is a conscious strategy to make it more
| expensive at a later date, it is actually sort of the
| opposite of generosity.
| spencerchubb wrote:
| So if something is free but only temporarily, then that
| cancels out the generosity? Also, you and I have no idea
| how long the features will remain free. If anything,
| chatgpt has been making _more_ features and stronger models
| free over time.
| simonw wrote:
| Sometimes it does, yeah. It's not unheard of for
| companies to deliberately operate at a loss in order to
| drive out their competition, then raise prices again.
| This is known as "predatory pricing".
| echelon wrote:
| That can actually make competition from open source _harder_.
| New upstarts that are open source can't compete with a free
| service from OpenAI and can't make money to grow their
| development or offerings.
|
| OpenAI wants to kill everything that isn't OpenAI.
| ben_w wrote:
| New open source models* still wouldn't be able to compete
| even if OpenAI was forcibly shut down.
|
| Hardware's too expensive, and will be for a while, because
| _all_ the big players are trying to get in on it.
|
| * cue arguments: "'open weights' or 'training data'?"; "does
| the Meta offering count or are they being sneaky and
| evil?"; etc.
| spencerchubb wrote:
| So should OpenAI make their product less accessible, in
| order to make it easier for competition? That makes no
| sense
| oblio wrote:
| I call chicken. Let them make all their products paid.
|
| Hint: they won't, it would kill their company. The hype
| around OpenAI is based on people using it for free, at
| least at the start.
|
| Heck, even drug dealers know this trick!
| sirspacey wrote:
| You don't think anyone will use it to contribute to open source
| projects?
|
| Seems like an obvious net gain for the community.
| jjcm wrote:
| Interesting tidbit at the very end that's worth noting for anyone
| using the API today:
|
| > By switching to the new gpt-4o-2024-08-06, developers save 50%
| on inputs ($2.50/1M input tokens) and 33% on outputs ($10.00/1M
| output tokens) compared to gpt-4o-2024-05-13.
| scrollop wrote:
| From what I've learned from OpenAI, the "latest" "cheaper"
| model will perform worse than the previous model on various
| tasks (esp reasoning).
| ralusek wrote:
| I don't think it's been well enough acknowledged that all of
| the shortcuts LLMs have been taking to compress/refine/index
| the attention mechanism seem to result in dumber models.
|
| GPT 4 Turbo was more like GPT 3.9, and GPT 4o is more like
| GPT 3.7.
| scrollop wrote:
| Some commenters acknowledge it - and quantify it:
|
| https://www.youtube.com/watch?v=Tf1nooXtUHE&t=689s
| Der_Einzige wrote:
| They try to gaslight us and tell us this isn't true because
| of benchmarks, as though anyone has done anything but do
| the latent space exploration equivalent of throwing darts
| at the ocean from space.
|
| It's taken years to get even preliminary reliable decision
| boundary examples from LLMs because doing so is expensive.
| scrollop wrote:
| Also, is it a coincidence that a cheaper (potentially
| faster?) model has been released (just) before they roll out
| the "new" voice mode (which boasts very low latency)?
| codingwagie wrote:
| Its usually a distilled smaller model
| samstave wrote:
| Am I the only one that wants to know 1,000% *WHY* such
| things?
|
| Is it a natural function of how models evolve?
|
| Is it engineered as such? Why?
| Marketing/money/resources/what?
|
| WHO makes these decisions and why?
|
| ---
|
| I have been building a thing with a Claude 3.5 Pro account and
| it's an *utter fn garbage* of an experience.
|
| It lies, hallucinates, malevolently changes code it was
| already told was correct, removes features, and explicitly
| ignores project files. It has no search, no line numbers, and
| so much screen real-estate is consumed with useless empty space.
| It ignores stated style guides, gets CAUGHT forgetting about a
| premise we were actively working on, then condescendingly
| apologizes: "oh you're correct - I should have been using XYZ
| knowledge"
|
| It makes things FN harder to learn.
|
| If I had any claude engineers sitting in the room watching
| what a POS service it is from a project continuity point...
|
| Its evil. It actively f's up things.
|
| One should have the ability to CHARGE the model token credit
| when it Fs up so bad.
|
| NO FN SEARCH??? And when asked for line nums in its output -
| it's in txt...
|
| Seriously, I practically want not just a refund, I want
| claude to pay me for my time correcting its mistakes.
|
| ChatGPT does the same thing. It forgets things committed to
| memory - refactors successful things back out of files.
| Etc.
|
| It's been a really eye-opening and frustrating experience, and
| my squinty looks suggest that it's specifically
| intentional:
|
| They don't want people using a $20/month AI plan to actually
| be able to do any meaningful work and build a product.
| scrollop wrote:
| Use an API from the top models with a good frontend, then,
| and use precise instructions.
|
| It's odd, as many people praise claude's coding
| capabilities.
| minimaxir wrote:
| The new price is also now reflected on the pricing page:
| https://openai.com/api/pricing/
|
| It's weird that it's only a footnote when it's actually a major
| shift.
| sjnair96 wrote:
| I also looked up the same. I wonder why. They must have a
| subsequent announcement regarding this I'd expect.
| ComputerGuru wrote:
| If you use the undecorated gpt-4o do you automatically get the
| latest?
| tedsanders wrote:
| We'll update gpt-4o in 3 weeks. (We've always updated it a
| couple weeks after launch, so no one is immediately surprised
| by a new model drop.)
| OutOfHere wrote:
| For the record, you should never use that in an application.
| Always explicitly specify the full versioned model name. This
| will prevent bad surprises because not every new version is
| an improvement; sometimes they get worse, especially at
| specific tasks.
| voiper1 wrote:
| >We will give a 3-week notice before updating gpt-4o to point
| to the new snapshot gpt-4o-2024-08-06.
|
| Source: https://platform.openai.com/docs/models/gpt-4o
| nerdjon wrote:
| I have a bad feeling that this is just going to introduce more
| shovelware apps that try to shove AI use in without really
| understanding what they are going to get back.
|
| Yay I can now ensure the json object will look how I want, but
| let's completely disregard any concern of whether or not the data
| returned is valuable.
|
| I don't understand why we are already treating these systems as
| general purpose AI when they are not. (Ok I do understand it, but
| it is frustrating).
|
| Take the example given of "look up all my orders in May of last
| year that were fulfilled but not delivered on time".
|
| First, I have found these models incredibly dumb when it comes to
| handling time. But even beyond that, if you really are going to
| do this, I really hope you double-check the data before
| presenting what you get back as true. Worse, that is just
| double-checking that what it gives back to you is accurate, not
| checking for what it isn't telling you about.
|
| Every time I try to experiment with supplying data and asking for
| data back, they fall flat on their face before we even get to the
| JSON being formatted properly. Formatting was not the issue that
| needed solving while it still fundamentally messes up the data,
| often just returning wrong information. Sometimes it will be
| right though, but that is the problem. It may luck out and be
| right enough times that you gain confidence in it and stop double
| checking what it is giving back to you.
|
| I guarantee you someone is going to have a discussion about using
| this, feeding it data, and then storing the response in a
| database.
| titzer wrote:
| It's so wild that the bar for AI performance is both absurdly
| high and absurdly low at the same time. To specify an output
| format (language or grammar) for solving a computational problem
| is one of the oldest exercises around. On the one hand, it's
| breathtakingly mundane that the model can now do the most basic
| of tasks: conform to an output specification. It's weird reading
| the kind of self-congratulating blogpost about this, like OpenAI
| has just discovered flint knives. On the other hand, a computer
| system can process natural language with extremely ambiguous,
| open-ended problems, compute solutions to said problems, even
| correct its own mistakes-- _and then_ it can format the output
| correctly. And then on yet another hand, it only took about 10^25
| floating point operations (yeah, just ten trillion trillion,
| right!?) to get this outcome.
| codingwagie wrote:
| I think it will take a long time for the world at large to
| realize and then operationalize the potential of this "mundane"
| technology. It is revolutionary, and also sitting in plain
| sight. Such a huge technological shift that was considered
| decades out only a few years ago
| ben_w wrote:
| Although I am an optimist* about what this can do, I am very
| much aware -- from personal experience -- how easy it is to
| see more than is really there.
|
| The realisation of the tech might be fantastic new things...
| or it might be that people like me are Clever Hans-ing the
| models.
|
| * that may be the wrong word; "strong capabilities" is what I
| think is present, those can be used for ill effects which is
| pessimistic.
| srcreigh wrote:
| If I wanted to be a silly pedant, I'd say that Turing machines
| are language specifications and thus it's theoretically
| impossible for an LLM or any program to validate output formats
| in general.
| jes5199 wrote:
| in _general_ sure, but if you restricted each token to
| conform to a Kleene-star grammar you should be able to
| guarantee that you get something that parses according to a
| context-free grammar
| tommica wrote:
| For some reason it reminds me of my Civilization runs - rush to
| a certain high-level tech and only after that discover Writing
| :D
| thruway516 wrote:
| I dont understand your complaint at all. If you develop a new
| revolutionary technology called an automobile, developing
| steering, brakes, starter, mufflers for it is a pretty big deal
| even if reins, clamps, mufflers and keys are mundane and have
| existed for decades. Structured outputs are a pretty big step
| in making this magic actually usable by developers as opposed
| to generating impressive cat pictures or whatever has captured
| the public imagination.
| Bjartr wrote:
| I don't think it was a complaint, just an observation.
| thruway516 wrote:
| Yes, probably. But considering non-deterministic outputs are
| the nature of the beast with LLMs and we're (mostly)
| engineers here, calling any part of this mundane sounds
| almost more like fighting words than just an observation
| the8thbit wrote:
| Extremely pedantic, but is "non-deterministic" really the
| right language? The same input will always produce the
| same output, provided you haven't intentionally
| configured the system to use the model non-
| deterministically. It seems like the right way to
| describe it is as a chaotic deterministic system. The
| same input will always produce the same output, but small
| shifts in the input or weights can result in dramatic and
| difficult to predict changes in outputs.
| davedx wrote:
| LLMs are indeed non-deterministic
| visarga wrote:
| > The same input will always produce the same output
|
| Not guaranteed even with the same seed. If you don't
| perform all operations in exactly the same order, even a
| simple float32 sum, if batched differently, will result
| in different final value. This depends on the load factor
| and how resources are allocated.
| simonw wrote:
| Yeah, the fact that floating point multiplication isn't
| associative is a real pain for producing deterministic
| outputs - especially when you're running massively
| parallel computations on GPUs (or multiple GPUs) making
| the order of operations even less predictable.
| jappgar wrote:
| Structured outputs are hard... but they claimed to have
| solved this a year ago.
|
| They were lying, of course, and meanwhile charged output
| tokens for malformed JSON.
| ramraj07 wrote:
| This is like saying "we shouldn't be celebrating a computer
| that can talk, my parrot can do that!"
| throwawaymaths wrote:
| > On the one hand, it's breathtakingly mundane that the model
| can now do the most basic of tasks: conform to an output
| specification.
|
| I highly doubt it's the model that does this... It's very
| likely code injected into the token picker. You could put this
| into any model all the way down to gpt-2.
| crowcroft wrote:
| I wonder if you get 90% of the way with prompt engineering,
| and then the last 10% is just brute force, validate output,
| if it fails, rerun the prompt.
|
| My assumption is if that's all this is they would have done
| it a long time ago though.
| jeeceebees wrote:
| You can just mask the output probabilities for each token
| based on which options are valid according to a grammar.
|
| There are quite a few open source implementations of this
| e.g. https://github.com/outlines-dev/outlines
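|
| If I remember the outlines API correctly (method names may have
| shifted between versions, so treat this as a sketch), usage is
| roughly:
|
|     import outlines
|
|     # Any HF causal LM works; the model name is just an example.
|     model = outlines.models.transformers(
|         "microsoft/Phi-3-mini-4k-instruct")
|
|     schema = """{
|       "type": "object",
|       "properties": {"name": {"type": "string"},
|                      "age": {"type": "integer"}},
|       "required": ["name", "age"]
|     }"""
|
|     # The generator's sampler can only emit schema-valid tokens.
|     generator = outlines.generate.json(model, schema)
|     person = generator("Extract: Ada Lovelace, 36 years old.")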
| contravariant wrote:
| You could simply censor invalid tokens, but that does
| rely on 2 assumptions.
|
| 1. There is always a valid next token.
|
| 2. This greedy algorithm doesn't result in a
| qualitatively different distribution from a rejection
| sampling algorithm.
|
| The latter isn't too obvious, and may in fact be (very)
| false. Look up maze generation algorithms if you want
| some feeling for the effects this could have.
|
| If you just want a quick argument, consider what happens
| if picking the most likely token would increase the
| chance of an invalid token further down the line to
| nearly 100%. By the time your token-picking algorithm has
| any effect it would be too late to fix it.
| throwawaymaths wrote:
| Sorry, how could there not be a valid next token?
| Presumably your interface would generate a state machine
| with appropriate masking arrays, and iirc generally
| speaking all 256 byte choices are in the token list.
| There's no way to get stuck in a place where the JSON is
| invalid? Can you give an example?
|
| If you want to be really clever about your picker, a
| deterministic result would blat out all the known
| possible strings.
|
| For example, if you had an object with a defined
| set of properties, you could just go ahead and not bother
| generating tokens for all the properties and just
| tokenize, E.G. `{"foo":"` (6-ish tokens) without even
| passing through the LLM. As soon as an unescaped `"`
| arrives, you know the continuation must be `,"bar":"`,
| for example
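|
| Toy illustration of that shortcut (sample_value stands in for
| actual constrained sampling from the model; this is not anyone's
| real implementation):
|
|     import json, random
|
|     def generate_object(schema_props, sample_value):
|         out = "{"
|         for i, (name, typ) in enumerate(schema_props.items()):
|             # Property names and punctuation are fully determined
|             # by the schema: emit them directly, no forward pass.
|             out += ("," if i else "") + json.dumps(name) + ":"
|             # Only the value needs (constrained) sampling.
|             out += sample_value(typ)
|         return out + "}"
|
|     print(generate_object(
|         {"foo": "string", "bar": "number"},
|         lambda t: '"x"' if t == "string"
|                   else str(random.randint(0, 9))))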
|
| > This greedy algorithm doesn't result in a qualitatively
| different distribution from a rejection sampling
| algorithm.
|
| It absolutely will. But so will adding an extra newline
| in your prompt, for example. That sort of thing is part
| and parcel of how llms work
| contravariant wrote:
| Hmm, I think any example where it can get stuck is going
| to be a bit contrived since really it's a question of how
| easy it is to recognize a valid prefix. Say for example
| you want the LLM to generate a valid chess match and it
| ends up in a situation with just 2 kings left. If you're
| not careful with your definitions you could end up in a
| generation loop that never terminates.
|
| That said if you _know_ all valid prefixes in your
| language then you can always realise when a token leaves
| no valid continuations.
|
| > It absolutely will. But so will adding an extra newline
|
| A newline is less likely to dramatically drop the
| quality; a greedy method could easily end up driving itself
| into a dead end (if not grammatically then semantically).
|
| Say you want it to give a weather prediction consisting
| of a description followed by a tag 'sunny' or 'cloudy'
| and your model is on its way to generate
| { desc: "Strong winds followed by heavy
| rainfall.", tag: "stormy" }
|
| If it ever gets to the 's' in stormy it will be forced to
| pick 'sunny', even if that makes no sense in context.
| dilap wrote:
| You just sample from a grammar and you automatically get
| 100%; who knows but it seems the most likely thing they are
| doing. llama.cpp has supported this for a while ( using a
| BNF-style grammar -- https://github.com/ggerganov/llama.cpp
| /blob/master/grammars/... )
|
| edit: oh actually, we do sort of know -- they call out
| jsonformer as an inspiration in the acknowledgements
|
| https://github.com/1rgs/jsonformer
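|
| If memory serves, jsonformer's usage is roughly the following (it
| walks the schema itself and only asks the model for the values,
| so treat the exact argument names as approximate):
|
|     from transformers import AutoModelForCausalLM, AutoTokenizer
|     from jsonformer import Jsonformer
|
|     model = AutoModelForCausalLM.from_pretrained(
|         "databricks/dolly-v2-3b")
|     tokenizer = AutoTokenizer.from_pretrained(
|         "databricks/dolly-v2-3b")
|
|     schema = {
|         "type": "object",
|         "properties": {
|             "name": {"type": "string"},
|             "age": {"type": "number"},
|         },
|     }
|
|     former = Jsonformer(model, tokenizer, schema,
|                         "Generate a person matching the schema:")
|     print(former())  # a Python dict conforming to the schema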
| crowcroft wrote:
| Oh, thanks for the links. Super interesting!
| senko wrote:
| Using this in a naive way can easily degenerate into the
| LLM outputting syntactically/grammatically valid tokens
| that make no sense, like in this example:
| https://community.openai.com/t/json-format-causes-
| infinite-n...
|
| This might be even more pronounced when the output is
| restricted more using the JSON schema.
|
| So the heavy lifting here was most likely in aligning the
| model to avoid/minimize such outcomes, not in tweaking
| the token sampler.
| dilap wrote:
| Isn't your example showing an issue w/ the opposite
| approach, where someone is getting bad output w/ an
| earlier openAI json mode that worked via training rather
| than mechanical output restriction to conform to a
| schema?
|
| FWIW (not too much!) I have used llama.cpp grammars to
| restrict to specific formats (not particular json, but an
| expected format), fine-tuned phi2 models, and I didn't
| hit any issues like this.
|
| I am not intuitively seeing why restricting sampling to
| tokens matching a schema would cause the LLM to converge
| on valid tokens that make no sense...
|
| Are there examples of this happening w/ people using e.g.
| jsonformer?
| throwawaymaths wrote:
| Yeah but that's hugely wasteful of tokens.
| scarmig wrote:
| I have struggled writing valid YAML before (my tokenizer
| doesn't handle whitespace very well). And it probably takes me
| a quadrillion operations on the reals to get a minimal YAML
| file (I think your 10^25 fp ops is an overestimate--I think
| it's more like 10^18-10^19).
|
| It's kind of like an inverse Moravec's paradox.
| theturtle32 wrote:
| Relatable!!
| m3kw9 wrote:
| It's doing more: it is allowing the user to input using natural
| language, and the output is the JSON format that the API
| defines.
| raincole wrote:
| I don't know, it doesn't sound wild at all to me. Human
| languages are very imprecise, vague and error-tolerant, which
| is the opposite of an output format like JSON. So that a model
| can't do these two things well at the same time is quite an
| intuitive conclusion.
|
| The wild part is that a model trained with so much human
| language text can still output mostly compilable code.
| wewtyflakes wrote:
| Why would someone want `strict` to be anything other than `true`?
| davidkunz wrote:
| Maybe if you can't precisely model your structure with
| (OpenAI's subset of) JSON schema.
| ComputerGuru wrote:
| There are many reasons, though I am not sure which _they_ had
| in mind. One thing is that LLMs in general tend to do better
| when they can be more verbose in their output and sort of
| "think aloud" to reach an answer. Insisting on strict output
| format would rob it of the benefits (because it doesn't just
| not emit but completely skips those stages, or else you'd be
| paying for those elided output tokens).
| wewtyflakes wrote:
| But then why would someone specify that the response has to
| be in a given JSON schema (by presence of the schema itself),
| but then also not care if it is actually using that schema
| (by specifying `strict` as `false`)? That is the use-case I
| can't wrap my head around.
| tedsanders wrote:
| We didn't cover this in the announcement post, but there are a
| few reasons:
|
| - The first request with each JSON schema will be slow, as we
| need to preprocess the JSON schema into a context-free grammar.
| If you don't want that latency hit (e.g., you're prototyping,
| or have a use case that uses variable one-off schemas), then
| you might prefer "strict": false
|
| - You might have a schema that isn't covered by our subset of
| JSON schema. (To keep performance fast, we don't support some
| more complex/long-tail features.)
|
| - In JSON mode and Structured Outputs, failures are rarer but
| more catastrophic. If the model gets too confused, it can get
| stuck in loops where it just prints technically valid output
| forever without ever closing the object. In these cases, you
| can end up waiting a minute for the request to hit the
| max_token limit, and you also have to pay for all those useless
| tokens. So if you have a really tricky schema, and you'd rather
| get frequent failures back quickly instead of infrequent
| failures back slowly, you might also want "strict": false
|
| But in 99% of cases, you'll want "strict": true.
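|
| For anyone who hasn't seen the docs yet, this is roughly how the
| flag is passed in the Python SDK (the schema contents are just an
| example):
|
|     from openai import OpenAI
|
|     client = OpenAI()
|     resp = client.chat.completions.create(
|         model="gpt-4o-2024-08-06",
|         messages=[{"role": "user",
|                    "content": "Alice and Bob meet Friday at noon."}],
|         response_format={
|             "type": "json_schema",
|             "json_schema": {
|                 "name": "event",
|                 "strict": True,  # False for the cases above
|                 "schema": {
|                     "type": "object",
|                     "properties": {
|                         "name": {"type": "string"},
|                         "attendees": {"type": "array",
|                                       "items": {"type": "string"}},
|                     },
|                     "required": ["name", "attendees"],
|                     "additionalProperties": False,
|                 },
|             },
|         },
|     )
|     print(resp.choices[0].message.content)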
| _vaporwave_ wrote:
| Anyone else catch this reference in one of the examples?
|
| > 9.11 and 9.9 -- which is bigger
|
| https://community.openai.com/t/why-9-11-is-larger-than-9-9-i...
| jodacola wrote:
| Amusingly, I immediately thought 9.11 - but in the context of a
| newer version of software. Ever have those moments where you're
| so deep in context of some ecosystem that you skip right past
| the basics, like 9.9 being a larger number than 9.11?
| elpocko wrote:
| Doesn't the BNF grammar approach in llama.cpp solve this issue in
| a generic way that should work with any model? Why wouldn't they
| use that?
| ejones wrote:
| Similar approach to llama.cpp under the hood - they convert the
| schema to a grammar. Llama.cpp's implementation was specific to
| the ggml stack, but what they've built sounds similar to
| Outlines, which they acknowledged.
| HanClinto wrote:
| llama.cpp's GBNF grammar is generic, and indeed works with any
| model.
|
| I can't speak for other approaches, but -- while llama.cpp's
| implementation is nice in that it always generates grammar-valid
| output token-by-token (and doesn't require any backtracking) --
| it is tough in that, in the case of ambiguous grammars (where
| we're not always sure where we are in the grammar until it
| finishes generating), it keeps all valid parsing option
| stacks in memory at the same time. This is good for the no-
| backtracking case, but it adds a (sometimes significant) cost
| in terms of being rather "explosive" in the memory usage
| (especially if one uses a particularly large or poorly-formed
| grammar). Creating a grammar that is openly hostile and crashes
| the inference server is not difficult.
|
| People have done a lot of work to try and address some of the
| more egregious cases, but the memory load can be significant.
|
| One example of memory optimization:
| https://github.com/ggerganov/llama.cpp/pull/6616
|
| I'm not entirely sure what other options there are for
| approaches to take, but I'd be curious to learn how other
| libraries (Outlines, jsonformer) handle syntax validation.
| behnamoh wrote:
| Well, there goes one of the big advantages of open-source
| models...
|
| For a long time, I was relying on such guaranteed structured
| outputs as a "secret sauce" that only works using llama.cpp's
| GBNF grammars. Now OpenAI literally introduced the same concept
| but a bit more accessible (since you create a JSON and they
| convert it to a grammar).
|
| Those of you who have used GBNF, do you think it still has any
| advantage over what OpenAI just announced?
| ejones wrote:
| FWIW, llama.cpp has always had a JSON schema -> GBNF converter,
| although it launched as a companion script. Now I think it's
| more integrated in the CLI and server.
|
| But yeah I mean, GBNF or other structured output solutions
| would of course allow you to supply formats other than JSON
| schema. It sounds conceivable though that OpenAI could expose
| the grammars directly in the future, though.
| behnamoh wrote:
| I think for certain tasks it's still easier to write the
| grammar directly. Does converting from JSON to a CFG limit
| the capabilities of the grammar? i.e., are there things JSON
| can't represent that a context free grammar can?
| ejones wrote:
| You might be right that they're similarly powerful. In some
| cases, an arbitrary output format might in and of itself be
| desirable. Like it might result in token savings or be more
| natural for the LLM. For instance, generating code snippets
| to an API or plain text with constraints.
|
| And this is more esoteric, but technically in the case of
| JSON I suppose you could embed a grammar inside a JSON
| string, which I'm not sure JSON schema can express.
| J_Shelby_J wrote:
| JSON is a sub-set of what GBNF can do, so there are still
| advantages to that approach. But even GBNF doesn't go far
| enough. Ever try to restrict a model to a single sentence?
|
| root ::= \" \" item{{{min_count},{max_count}}}
|
| item ::= [A-Z]
| [^\\\r\\\n\\\x0b\\\x0c\\\x85\\\u2028\\\u2029.?!]+ [a-z] (\". \"
| | \"? \" | \"! \")
|
| This kinda works if you don't mind no abbreviations, but you
| can't do something like this with JSON grammars afaik.
| enobrev wrote:
| In a startup I was working on last year, I had a surprisingly
| good experience with using a json-schema in my prompt. I had to
| tweak the json response a bit because it was always invalid, but
| the issue was generally a missing colon or misplaced bracket.
| Data-wise it stuck to the schema very well, and cleaning up the
| json was simple enough that we got to zero parsing errors. I
| believe this was with 3.5.
|
| Sadly, that project was a final (relatively successful) attempt
| at getting traction before the startup was sold and is no longer
| live.
|
| Edit: Ouch, are the down-votes disbelief? Annoyance? Not sure
| what the problem is.
| nichochar wrote:
| I'm a little confused why you have to specify "strict: true" to
| get this behavior. It is obviously always desired; I would be
| surprised if people ever specified "strict: false". That API
| design leaves something to be desired.
|
| I also learned about constrained decoding[1], which they give a
| brief explanation of. This is a really clever technique! It
| will increase reliability as well as reduce latency (fewer tokens
| to pick from) once the initial artifacts are loaded.
|
| [1] https://www.aidancooper.co.uk/constrained-decoding/
| dgellow wrote:
| Could you elaborate a bit re: the API? What do you dislike other
| than the "strict: true"?
| athyuttamre wrote:
| Hi, I work on the OpenAI API -- structured outputs schemas have
| limitations (e.g. all fields must be required, no additional
| properties allowed):
| https://platform.openai.com/docs/guides/structured-
| outputs/s....
|
| If your schema is not supported, but you still want to use the
| model to generate output, you would use `strict: false`.
| Unfortunately we cannot make `strict: true` the default because
| it would break existing users. We hope to make it the default
| in a future API version.
| Der_Einzige wrote:
| You should also mention that before you had done custom
| alignment work accounting for this feature, it was an
| excellent alignment breaker (therefore a big no-no to release
| too early).
|
| For example, if I ask an LLM to generate social security
| numbers, it will give the whole "I'm sorry Hal, I can't do
| that". If I ban all tokens except numbers and hyphens, prior
| to your "refusal = True" approach, it was guaranteed that
| even "aligned" models would generate what appeared to be
| social security numbers.
| ethical_source wrote:
| And if LLMs can generate plausible social security numbers,
| our civilization will fall /s
|
| Christ, I hate the AI safety people who brain-damage models
| so that they refuse to do things trivial to do by other
| means. Is LLM censorship preventing bad actors from
| generating social security numbers? Obviously not. THEN WHY
| DOES DAMAGING AN LLM TO MAKE IT REFUSE THIS TASK MAKE
| CIVILIZATION BETTER OFF?
|
| History will not be kind to safetyist luddites.
| Terretta wrote:
| I'm less concerned with the AI teams lobotomizing
| utility, more concerned with damage to language,
| particularly redefining the term "safe" to mean something
| like "what we deem suitable".
|
| That said, a time when zero "safety" is at stake might be the
| best time to experiment with how to build and where to
| put safety latches, for when we get to a point where we mean
| actual safety. I'm even OK with models that default to
| parental control for practice provided it can be switched
| off.
| srcreigh wrote:
| > The tokens that are valid at the beginning of the output
| include things like {, {", {\n, etc. However, once the model has
| already sampled {"val, then { is no longer a valid token
|
| Oops, this is incorrect. {"val{":2} is valid json.
|
| (modulo iOS quotes lol)
| jhgg wrote:
| Valid JSON, sure, but that key does not conform to the schema
| provided in the example. The LLM must generate valid JSON that
| _also_ conforms to the provided schema.
| simonw wrote:
| The price decrease is particularly notable because it represents
| a 50% cut in the price to handle image inputs, across any OpenAI
| model.
|
| Previously image inputs on GPT-4o-mini were priced the SAME as
| GPT-4o, so using mini wouldn't actually save you any money on
| image analysis.
|
| This new gpt-4o-2024-08-06 model is 50% cheaper than both GPT-4o
| AND GPT-4o-mini for image inputs, as far as I can tell.
|
| UPDATE: I may be wrong about this. The pricing calculator for
| image inputs on https://openai.com/api/pricing/ doesn't indicate
| any change in price for the new model.
| minimaxir wrote:
| The calculator doesn't account for the fact that there are now
| two different prices in a given price matrix.
| jeffharris wrote:
| yep image input on the new model is also 50% cheaper
|
| and apologies for the outdated pricing calculator ... we'll be
| updating it later today
| cvhc wrote:
| I wonder why the top level has to be an object instead of an
| array... I have some pretty normal use cases where I expect the
| model to extract a list of objects from the text.
|
|     openai.BadRequestError: Error code: 400 - {'error':
|     {'message': 'Invalid schema for response_format
|     \'PolicyStatements\': schema must be a JSON Schema of \'type:
|     "object"\', got \'type: "array"\'.', 'type':
|     'invalid_request_error', 'param': 'response_format', 'code':
|     None}}
|
| I know I can always put the array into a single-key object but
| it's just so annoying I also have to modify the prompts
| accordingly to accommodate this.
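|
| For reference, the wrapper itself is tiny; a sketch with pydantic
| (field names are made up), which the new SDK helpers can
| reportedly take directly as the response_format:
|
|     from pydantic import BaseModel
|
|     class PolicyStatement(BaseModel):   # illustrative fields only
|         effect: str
|         action: str
|
|     class PolicyStatements(BaseModel):  # object root the API wants
|         statements: list[PolicyStatement]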
| manquer wrote:
| I can't say for OpenAI, but in general I have seen and used
| this design pattern to keep the root object output consistent
| and remove a lot of unnecessary validations and branching flows.
|
| Otherwise you will have to handle both scenarios in code
| everywhere if you don't know whether the root is an object or an
| array. If the root has a key that conforms to a known schema,
| then validation becomes easier to write for that scenario.
|
| It's for similar reasons that so many APIs wrap all responses
| with a key like 'data', 'value' or 'error', or that in RESTful
| HTTP APIs collection endpoints, say GET /v1/my-object, do not mix
| with resource URIs like GET /v1/my-object/1: the former is always
| an array, the latter is always an object.
| tomComb wrote:
| Well, this wouldn't be a very satisfying explanation, but these
| JSON objects are often represented as Python dictionaries and
| those can't have top level arrays.
| heliophobicdude wrote:
| Back in the old days, top level arrays were a security risk
| because the array constructor in JS could be redefined and do
| bad-guy stuff. I cannot think of any json parsing clients that
| are vulnerable to this.
| moritzwarhier wrote:
| It's a relatively common convention for JSON APIs.
|
| Possible reasons:
|
| - Extensibility without breaking changes
|
| - Forcing an object simplifies parsing of API responses,
| ideally the key should describe the contents, like additional
| metadata. It also simplifies validation, if considered separate
| from parsing
|
| - Forcing the root of the API response to be an object makes
| sure that there is a single entry point into consuming it.
| There is no way to place non-descript heterogenous data items
| next to each other
|
| - Imagine that you want to declare types (often generated from
| JSON schemas) for your API responses. That means you should
| refrain from placing different types, or a single too broad
| type in an array. Arrays should be used in a similar way to
| stricter languages, and not contain unexpected types. A top-
| level array invites dumping unspecified data to the client that
| is expensive and hard to process
|
| - The blurry line between arrays and objects in JS does not
| cleanly map to other languages, not even very dynamic ones like
| PHP or Python. I'm aware that JSON and JS object literals are
| not the same. But even the JSON subset of JS (apart from number
| types, where it's not a subset AFAIK) already creates
| interesting edge cases for serialization and deserialization
| simonw wrote:
| I've regretted designing APIs that return an array rather than
| an object in the past.
|
| It's all about the extensibility. If you return an object you
| can add extra keys, for things like "an error occurred, here
| are the details", or "this is truncated, here's how to paginate
| it", or a logs key for extra debug messages, or information
| about the currently authenticated user.
|
| None of those are possible if the root is an array.
| __jl__ wrote:
| There is another big change in gpt-4o-2024-08-06: It supports 16k
| output tokens compared to 4k before. I think it was only
| available in beta before. So gpt-4o-2024-08-06 actually brings
| three changes. Pretty significant for API users
|
| 1. Reliable structured outputs
|
| 2. Reduced costs: 50% for input, 33% for output
|
| 3. Up to 16k output tokens, compared to 4k
|
| https://platform.openai.com/docs/models/gpt-4o
| Culonavirus wrote:
| That's actually pretty impressive... if they didn't dumb it
| down that is, which only time will tell.
| santiagobasulto wrote:
| I've noticed that lately GPT has gotten more and more verbose.
| I'm wondering if it's a subtle way to "raise prices", as the
| average response is going to incur more tokens, which of course
| makes any API conversation keep growing in tokens (each
| IN message concatenates the previous OUT messages).
| sashank_1509 wrote:
| they also spend more to generate more tokens. The more
| obvious reason is it seems like people rate responses better
| the longer they are. Lmsys demonstrated that GPT tops the
| leaderboard because it tends to give much longer and more
| detailed answers, and it seems like OpenAI is optimizing or
| trying to maximize lmsys.
| throwaway48540 wrote:
| It's a subtle way to make it smarter. Making it write out the
| "thinking process" and decisions has always helped with
| reliability and quality.
| tedsanders wrote:
| GPT has indeed been getting more verbose, but revenue has
| zero bearing on that decision. There's always a tradeoff
| here, and we do our imperfect best to pick a default that
| makes the most people happy.
|
| I suspect the reason why most big LLMs have ended up in a
| pretty verbose spot is that it's easier for users to scroll &
| skim than to ask follow-up questions.
|
| With regard to this new gpt-4o model: you'll find it actually
| bucks the recent trend and is less verbose than its
| predecessor.
| sophiabits wrote:
| I've especially noticed this with gpt-4o-mini [1], and it's a
| big problem. My particular use case involves keeping a
| running summary of a conversation between a user and the LLM,
| and 4o-mini has a really bad tendency of inventing details in
| order to hit the desired summary word limit. I didn't see
| this with 4o or earlier models
|
| Fwiw my subjective experience has been that non-technical
| stakeholders tend to be more impressed with / agreeable to
| longer AI outputs, regardless of underlying quality. I have
| lost count of the number of times I've been asked to make
| outputs longer. Maybe this is just OpenAI responding to what
| users want?
|
| [1] https://sophiabits.com/blog/new-llms-arent-always-
| better#exa...
| surfingdino wrote:
| Is it still NTSAT (Never The Same Answer Twice)?
| H8crilA wrote:
| Yes, this happens by design.
| oblio wrote:
| Interesting, why? Is there no theoretical way to have stable
| models? Or some kind of executive decision?
| H8crilA wrote:
| How is this different from function calling?
| binarymax wrote:
| Function calling uses JSON mode. While it has been mostly
| correct, I do get an incorrectly formatted response sometimes
| (maybe 1 in 10k requests?). So it sounds like this fixes that
| bug.
| tedsanders wrote:
| Under the hood, it's quite similar to function calling. A few
| differences:
|
| - Structured Outputs is a bit more straightforward. e.g., you
| don't have to pretend you're writing a function where the
| second arg could be a two-page report to the user, and then
| pretend the "function" was called successfully by returning
| {"success": true}
|
| - Having two interfaces lets us teach the model different
| default behaviors and styles, depending on which you use
|
| - Another difference is that our current implementation of
| function calling can return both a text reply plus a function
| call (e.g., "Let me look up that flight for you"), whereas
| Structured Outputs will only return the JSON
|
| (I worked on this feature at OpenAI.)
| technics256 wrote:
| How can we enable the text reply with a function call?
| Usually the message returned is a tool call only when it
| calls a tool?
| tedsanders wrote:
| There's no special interface, but you can write an
| instruction in a system message in the first position.
| E.g., "Before each function call, explain to the user what
| you're about to do." It's not super reliable, but the model
| can do it. Few-shot prompting might help as well.
| paradite wrote:
| Really important update that was not mentioned:
|
| gpt-4o-2024-08-06 has 16,384 tokens output limit instead of 4,096
| tokens.
|
| https://platform.openai.com/docs/models/gpt-4o
|
| We don't need the GPT-4o Long Output anymore.
| OutOfHere wrote:
| But is this also the default or just the max? Is the default 4k
| or 16k?
|
| Also, the question of the default value applies both at the
| server level and at the SDK level.
| floam wrote:
| Long Output is 64K though.
| gdiamos wrote:
| We've had this for over 1 year in Lamini - https://lamini-
| ai.github.io/inference/json_output/
|
| Works with any open LLM, including Llama 3.1
| radarsat1 wrote:
| Looks useful!
| AStrangeMorrow wrote:
| Also the outlines library: https://github.com/outlines-
| dev/outlines
| gdiamos wrote:
| Yeah! - outlines, guidance, jsonformer were inspiring for
| this line of work
| HanClinto wrote:
| Also note llama.cpp with grammar support:
|
| https://github.com/ggerganov/llama.cpp/tree/master/grammars
|
| Supports an EBNF-like syntax, as well as JSON-Schema.
| msoad wrote:
| Why not JSON Schema?
| gdiamos wrote:
| We did some user studies and found that people found it less
| intuitive.
| zoogeny wrote:
| Totally tangential, totally not related to the post (unless you
| squint your eyes and really blur things) ...
|
| I was thinking about the old canard of the sufficiently smart
| compiler. It made me think about LLM output and how in some way
| the output of a LLM could be bytecode as much as it could be the
| English language. You have a tokenized input and the translated
| output. You have a massive and easily generatable training set. I
| wonder if, one day, our compilers will be LLMs?
| jcims wrote:
| You definitely could, not far removed from text to image or
| text to audio generators.
| pjc50 wrote:
| Why would you tolerate a nonreliable compiler with no assured
| relationship between its inputs and its outputs? Have people
| just got too comfortable with the C++ model of "UB means I can
| insert a security bug for you"?
| bigyikes wrote:
| In a hypothetical future where the reliability of LLMs
| improves, I can imagine the model being able to craft
| optimizations that a traditional compiler cannot.
|
| Like there are already cases where hand-rolling assembly can
| eke out performance gains, but few do that because it's so
| arduous. If the LLM could do it reliably it'd be a huge win.
|
| It's a big if, but not outside the realm of possibility.
| zoogeny wrote:
| I agree it is currently a pipe dream. But if I was looking
| for a doctoral research idea, it might be fun to work on
| something like that.
|
| Lots of potential avenues to explore, e.g. going from a
| high-level language to some IR, from some IR to bytecode,
| or straight from high-level to machine code.
|
| I mean, -O3 is already so much of a black box that I can't
| understand it. And the tedium of hand optimizing massive
| chunks of code is why we automate it at all. Boredom is
| something we don't expect LLMs to suffer, so having one
| pore over some kind of representation and apply
| optimizations seems totally reasonable. And if it had some
| kinds of "emergent behaviors" based on intelligence that
| allow it to beat the suite of algorithmic optimization we
| program into compilers, it could actually be a benefit.
| killthebuddha wrote:
| A function that implements natural language -> bytecode is IMO
| way more likely to be under the hood an LLM _operating a
| compiler_ (or maybe a compiler operating LLMs) rather than a
| "bare" LLM. From an end user's perspective maybe it won't
| matter but I think it's an important technical point. IMO
| there's no evidence that an LLM will ever be the best way to
| execute general purpose computations.
| thih9 wrote:
| I guess an actual compiler would be cheaper and more reliable.
|
| In theory we could do the same with mathematical computations,
| 2+2=4 and the like; but computing the result seems easier.
| adagradschool wrote:
| While text and image generation are getting cheaper at a
| significant rate, audio still seems to be just as expensive with
| ElevenLabs. I wonder why it is so.
| say_it_as_it_is wrote:
| This puts like a dozen popular python libraries out of business
| AStrangeMorrow wrote:
| At least depends on the approach and use: stuff like outlines
| (https://github.com/outlines-dev/outlines) that actually
| changes the sampling to adhere to a grammar and can be used
| with local/custom models shouldn't be too impacted. Those are
| not really used on top of openAI models
| sansseriff wrote:
| Preprocessing a new schema takes 'under 10 seconds'. That's... a
| huge range? Unless the preprocessing time is a small fraction of
| the inference time, I don't see the point.
|
| I'm working on an app that dynamically generates schema based on
| user input (a union of arbitrary types pulled from a library).
| The resulting schema is often in the 800 token range. Curious how
| long that would take to preprocess
| pton_xd wrote:
| Isn't "we hardcoded JSON into the latest model" kind of the
| opposite direction, strategically, from "we're on the way to AGI
| and I need 7 trillion to get there?"
| isoprophlex wrote:
| You are witnessing the final stages in the evolution of OpenAI
| from a messianic hype machine to Yet Another Product Company.
|
| Hence all the people leaving, too.
| gardenhedge wrote:
| I am ootl, employees are leaving openai?
| dangrossman wrote:
| > John Schulman, one of the co-founders of artificial
| intelligence company OpenAI, has left the ChatGPT maker for
| rival Anthropic, he said in a post on social media platform
| X late Monday.
|
| > OpenAI's President and co-founder Greg Brockman is also
| taking a sabbatical through the end of the year, he said in
| a X post late Monday.
|
| > Peter Deng, a vice-president of product, also left in
| recent months, a spokesperson said. And earlier this year,
| several members of the company's safety teams exited.
|
| That's after co-founder and Chief Scientist Ilya Sutskever
| left in May.
| oblio wrote:
| Are there any co-founders left?
| sashank_1509 wrote:
| sam Altman for one.
| KaiMagnus wrote:
| Yeah, definitely a way to end up with a Siri like mess if you
| do this long enough. The use case is there and it's going to be
| very useful, but the magic is wearing off.
| irgolic wrote:
| Wasn't this already fully supported in the tool calling API?
| ramoz wrote:
| Can someone explain how this is different/better than the current
| state of function calling (which I've been using to get a
| consistent json schema response without issue)?
| mrshu wrote:
| This post (from an OpenAI researcher) contains a bit more
| background: https://news.ycombinator.com/item?id=41174213
| jacobsimon wrote:
| For starters, the naming is much less confusing. But the
| behavior also appears to be enforced/validated at some layer
| (hopefully?), which function calling did not seem to be. I was
| experimenting with it a couple weeks ago and it would work like
| 75% of the time but would often give me invalid results for
| schemas with relatively simple nested objects.
| zbyforgotp wrote:
| This is guaranteed, function calling without it is not. The old
| way can work for you, but my experience is different,
| especially with complex schemas.
| Der_Einzige wrote:
| Now the question is when they will support soft constraints like
| this: https://huggingface.co/blog/constrained-beam-search
| agtech_andy wrote:
| I have had a lot of success using BoundaryML
| (https://www.boundaryml.com/) for this. They have also been super
| responsive for any of my questions.
| aaronvg wrote:
| thanks for the shoutout, we benchmarked our approach against
| other function-calling techniques and we've been able to beat
| all other approaches every time (even by 8%!) just by getting
| better at parsing the data and representing schemas with fewer
| tokens, using type definitions instead of JSON schema.
|
| You can take a look at our BFCL results on that site or the
| github: https://github.com/BoundaryML/baml
|
| We'll be publishing our comparison against OpenAI structured
| outputs in the next 2 days, and a deeper dive into our results,
| but we aim to include this kind of constrained generation as a
| capability in the BAML DSL anyway longterm!
| tarofchaos wrote:
| Two years too late. I think we are going through a bozo period at
| OpenAI where small things are being highlighted as achievements.
| damsta wrote:
| Can we get something like that in Gemini 1.5 Flash?
| mugivarra69 wrote:
| cohere had this like a while ago
| MattDaEskimo wrote:
| "We have ripped code from a bunch of open-source variations and
| slapped it behind our brutally abstracted API.
|
| Interoperable with other external models like the open source
| versions? What, are you mad?"
| jumploops wrote:
| By using JSON mode, GPT-4{o} has been able to do this reliably
| for months (100k+ calls).
|
| We use GPT-4o to build dynamic UI+code[0], and almost all of our
| calls are using JSON mode. Previously it mostly worked, but we
| had to do some massaging on our end (backtick removal, etc.).
|
| With that said, this will be great for GPT-4o-mini, as it often
| struggles/forgets to format things as we ask.
|
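| (For anyone comparing: JSON mode is just the flag below, which is
| meant to guarantee valid JSON but not any particular schema; the
| new Structured Outputs adds the "json_schema" response_format
| type on top of it. Model name and prompt here are placeholders.)
|
|     from openai import OpenAI
|
|     client = OpenAI()
|     resp = client.chat.completions.create(
|         model="gpt-4o",
|         messages=[
|             {"role": "system",
|              "content": "Reply as JSON with keys `title`, `steps`."},
|             {"role": "user",
|              "content": "How do I brew pour-over coffee?"},
|         ],
|         response_format={"type": "json_object"},  # JSON mode
|     )
|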
| Note: we haven't had the same success rate with function calling
| compared to pure JSON mode, as the function calling seems to add
| a level of indirection that can reduce the quality of the LLM's
| output, YMMV.
|
| Anyhow, excited for this!
|
| [0]https://magicloops.dev
| qwertox wrote:
| What a cool product! I was about to recommend you to submit it
| as a "Show HN", but it turns out that it already got submitted
| one year ago.
|
| Would you mind sharing a bit on how things have evolved?
| jumploops wrote:
| Thanks and great question :)
|
| When we first launched, the tool was very manual; you had to
| generate each step via the UI. We then added a "Loop Creator
| agent" that now builds Loops for you without intervention.
| Over the past few months we've mostly been fixing feature
| gaps and improving the Loop Creator.
|
| Based on recent user feedback, we've put a few things in
| motion:
|
| - Form generator (for manual loops)
|
| - Chrome extension (for local automations)
|
| - In-house Google Sheets integration
|
| - Custom outputs (charts, tables, etc.)
|
| - Custom Blocks (shareable with other users)
|
| With these improvements, you'll be able to create "single
| page apps" like this one I made for my wife's annual mango
| tasting party[0].
|
| In addition to those features, we're also launching a new
| section for Loop templates + educational content/how-tos, in
| an effort to help people get started.
|
| To be super candid, the Loop Creator has been a pain. We
| started at an 8% success rate and we're only just now at 25%.
| Theoretically we should be able to hit 80%+ based on existing
| loop requests, but we're running into limits with the current
| state of LLMs.
|
| [0]https://mangota.ngo
| gleb wrote:
| Where do you get such a large variety of mangoes?
| tomcam wrote:
| Asking the important questions
| jumploops wrote:
| My mother-in-law is the President of the Central Florida
| Fruit Society, and is in charge of sourcing mangoes for
| their annual party. She sends us all the excess mangoes
| :)
|
| As I understand it, this year's mangoes mostly came from
| Merritt Island, as there was some not-so-great weather in
| southern Florida.
| tomcam wrote:
| Very interesting. Did you build magicloops using this tech?
| jumploops wrote:
| We first built Magic Loops with GPT-4, about a year ago, well
| before JSON mode was a thing.
|
| We had to do a bunch of extra prompting to make it work, as
| GPT would often include backticks or broken JSON (most
| commonly extra commas). At the time, YAML was a much better
| approach.
|
| Thankfully we've been able to remove most of these hacks, but
| we still use a best effort JSON parser[0] to help stream
| partial UI back to the client.
|
| [0]https://www.npmjs.com/package/best-effort-json-parser
| diego_sandoval wrote:
| Can I use Magic Loops to generate Magic Loops for me?
| msp26 wrote:
| Is the JSON actually being fed into the LLM's context or is it
| still being converted into typescript?
|
| The previous setup didn't allow for custom types, only
| objects/string/num/bool.
|
| Are the enums put into context or purely used for constrained
| sampling?
| OutOfHere wrote:
| Using this feature will obviously "lock you in" to OpenAI,
| specifically to this model too, at least until other companies
| catch on. While text prompts can more easily be moved to other
| LLMs, this feature cannot currently be ported as such. I would
| use it only if a text prompt is insufficient despite retries.
| BoorishBears wrote:
| It's already supported by multiple other providers. Fireworks,
| Together, probably more.
| dtquad wrote:
| OpenAI-style JSON mode and function calling rapidly became the
| industry standard way of doing it. It will probably also happen
| for this feature.
| toomuchtodo wrote:
| "S3 compatible"
| PufPufPuf wrote:
| This feature has existed for quite some time in several
| inference libraries, like Outlines, under the names
| "constrained decoding" or "guided decoding". Some even include
| it in their OpenAI-compatible API in a very similar form
| (allowing to pass in a JSON Schema). All this required doing
| your own inference, though -- so the announcement really just
| brings this popular feature "to the masses".
| moralestapia wrote:
| OTOH, not using it could "lock you out" of building a cool
| product for your users, so ...
| faizshah wrote:
| The converse API in AWS bedrock lets you use function calling
| across a number of different providers (doesn't support
| OpenAI):
| https://docs.aws.amazon.com/bedrock/latest/userguide/convers...
|
| I have been using it so that my agents aren't specific to a
| particular model or api.
|
| Like others have said many other providers already have
| function calling and json schema for structure outputs.
| jappgar wrote:
| It's nice that they're not making me pay for broken json anymore
| but touting this as a "feature" is laughable.
|
| It's a bug fix. They should never have been charging for
| malformed responses in the first place!
| LAC-Tech wrote:
| Good to see JSON Schema being more widely adopted. I remember
| doing a project a few years ago in XML just because XML schemas
| were everywhere and JSON ones were still not really used.
| Brosper wrote:
| I think they would like to have something like artifacts in
| Claude
___________________________________________________________________
(page generated 2024-08-06 23:00 UTC)