[HN Gopher] Native JSON Output from GPT-4
___________________________________________________________________
Native JSON Output from GPT-4
Author : yonom
Score : 246 points
Date : 2023-06-14 19:07 UTC (3 hours ago)
(HTM) web link (yonom.substack.com)
(TXT) w3m dump (yonom.substack.com)
| aecorredor wrote:
| Newbie in machine learning here. It's crazy that this is the top
| post just today. I've been doing the intro to deep learning
| course from MIT this week, mainly because I have a ton of JSON
| files that are already classified, and want to train a model that
| can generate new JSON data by taking classification tags as
| input.
|
| So naturally this post is exciting. My main unknown right now is
| figuring out which model to train my data on. An RNN, a GAN, a
| diffusion model?
| ilaksh wrote:
| Did you read the article? To do it with OpenAI, you would just
| put a few output examples in the prompt and then give it a
| function that takes the class, and whose parameters correspond
| to the JSON format you want, or just a string containing JSON.
|
| You could also fine-tune an LLM like Falcon-7B, but that's
| probably not necessary and has nothing to do with OpenAI.
|
| You might also look into the OpenAI Embedding API as a third
| option.
|
| I would try the first option though.
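|
| A rough sketch of that first option with the new function-
| calling API (untested; the function name and schema here are
| made-up examples):
|
|     import json
|     import openai  # pre-1.0 openai library, as of June 2023
|
|     # Describe the JSON you want as a "function" the model
|     # can call; "parameters" is standard JSON Schema.
|     functions = [{
|         "name": "emit_record",
|         "description": "Produce a JSON record for a class",
|         "parameters": {
|             "type": "object",
|             "properties": {
|                 "class": {"type": "string"},
|                 "fields": {"type": "object"},
|             },
|             "required": ["class", "fields"],
|         },
|     }]
|
|     resp = openai.ChatCompletion.create(
|         model="gpt-3.5-turbo-0613",
|         messages=[{"role": "user", "content": "Class: invoice"}],
|         functions=functions,
|         function_call={"name": "emit_record"},  # force the call
|     )
|
|     # arguments is a JSON string that (usually) fits the schema
|     args = json.loads(resp["choices"][0]["message"]
|                       ["function_call"]["arguments"])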
| chaxor wrote:
| Is there a decent way of converting to a structure with a very
| constrained vocabulary? For example, given some input text,
| converting it to something like {"OID-189": "QQID-378",
| "OID-478":"QQID-678"}. Where OID and QQID dictionaries can be
| e.g. millions of different items defined by a description. The
| rules for mapping could be essentially what looks closest in
| semantic space to the descriptions given in a dictionary.
|
| I know this should be solvable with local LLMs and BERT cosine
| similarity (it isn't exactly, but it's a start on the idea),
| but is there a way to do this with decoder models rather than
| encoder models plus other logic?
| 037 wrote:
| I'm wondering if introducing a system message like "convert the
| resulting json to yaml and return the yaml only" would adversely
| affect the optimization done for these models. The reason is that
| yaml uses significantly fewer tokens compared to json. For the
| output, where data type specification or adding comments may not
| be necessary, this could be beneficial. From my understanding,
| specifying functions in json now uses fewer tokens, but I believe
| the response still consumes the usual amount of tokens.
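|
| A quick way to measure the difference yourself (a sketch;
| assumes the tiktoken and PyYAML packages, and toy data):
|
|     import json
|     import tiktoken
|     import yaml
|
|     enc = tiktoken.encoding_for_model("gpt-4")
|     data = {"dish": "spaghetti bolognese", "servings": 4,
|             "ingredients": ["pasta", "beef", "tomato"]}
|
|     # Token counts for the same structure in both formats
|     print(len(enc.encode(json.dumps(data))))  # JSON
|     print(len(enc.encode(yaml.dump(data))))   # YAML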
| lbeurerkellner wrote:
| I think one should not underestimate the impact the output
| format can have on downstream performance. From a modelling
| perspective it is unclear whether asking/fine-tuning the model
| to generate JSON (or YAML) output is really lossless with
| respect to the raw reasoning powers of the model (e.g. it may
| perform worse on tasks when asked/trained to always respond in
| JSON).
|
| I am sure they ran tests on this internally, but I wonder what
| the concrete effects are, especially comparing different output
| formats like JSON, YAML, different function calling conventions
| and/or forms of tool discovery.
| imranq wrote:
| Wouldn't this be possible with a solution like Guidance, where
| you have a pre-structured JSON format ready to go and all you
| need is text: https://github.com/microsoft/guidance
| swyx wrote:
| i think people are underestimating the potential here for agents
| building - it is now a lot easier for GPT4 to call other models,
| or itself. while i was taking notes for our emergency pod
| yesterday (https://www.latent.space/p/function-agents) we had
| this interesting debate with Simon Willison on just how many
| functions will be supplied to this API. Simon thinks it will be
| "deep" rather than "wide" - eg a few functions that do many
| things, rather than many functions that do few things. I think i
| agree.
|
| you can now trivially make GPT4 decide whether to call itself
| again, or to proceed to the next stage. it feels like the first
| XOR circuit from which we can compose a "transistor", from which
| we can compose a new kind of CPU.
| ilaksh wrote:
| The thing is the relevant context often depends on what it's
| trying to do. You can give it a lot of context in 16k but if
| there are too many different types of things then I think it
| will be confused or at least have less capacity for the actual
| selected task.
|
| So what I am thinking is that some functions might just be like
| gateways into a second menu level. So instead of just edit_file
| with the filename and new source, maybe only
| select_files_for_edit is available at the top level. In that
| case I can ensure it doesn't overwrite an existing file and
| lose important stuff that was already in there, by providing
| the requested files' existing contents along with the function
| allowing the file edit.
| naiv wrote:
| I think big context only makes sense for document analysis.
|
| For programming you want to keep it slim. Just like you
| should keep your controllers and classes slim.
|
| Also, people with 32k access report very long response times
| of up to multiple minutes, which is not feasible if you only
| want a smaller change or analysis.
| jonplackett wrote:
| It was already quite easy to get GPT-4 to output json. You just
| append 'reply in json with this format' and it does a really
| good job.
|
| GPT-3.5 was very haphazard though, needing extensive
| babysitting and reminding, so if this makes GPT-3.5 better then
| it's useful. It does have an annoying disclaimer though that
| 'it may not reply with valid json', so we'll still have to do
| some sense checks on the output.
|
| I have been using this to make a few 'choose your own
| adventure' type games and I can see there's a TONNE of
| potential useful things.
| reallymental wrote:
| Is there any publicly available resource to replicate your
| work?
| I would love to just find the right kind of "incantation" for
| the gpt-3.5-t or gpt-4 to output a meaningful story arc etc.
|
| Any examples of your work would be greatly helpful as well!
| SamPatt wrote:
| I'm not the person you're asking, but I built a site that
| allows you to generate fiction if you have an OpenAI API
| key. You can see the prompts sent in console, and it's all
| open source:
|
| https://havewords.ai/
| ignite wrote:
| > You just append 'reply in json with this format' and it
| does a really good job.
|
| It does an ok job. Except when it doesn't. Definitely misses
| a lot of the time, sometimes on prompts that succeeded on
| previous runs.
| bradly wrote:
| I could not get GPT-4 to reliably not give some sort of text
| response, even if it was just a simple "Sure" followed by the
| JSON.
| rytill wrote:
| Did you try using the API and providing a very clear system
| message followed by several examples that were pure JSON?
| cwxm wrote:
| Even with GPT-4, it hallucinates enough that it's not
| reliable, forgetting to open/close brackets and quotes. This
| sounds like it'd be a big improvement.
| ztratar wrote:
| Nah, this was solved by most teams a while ago.
| jonplackett wrote:
| Not that it matters now but just doing something like this
| works 99% of the time or more with 4 and 90% with 3.5.
|
| It is VERY IMPORTANT that you respond in valid JSON ONLY.
| Nothing before or after. Make sure to escape all strings.
| Use this format:
|
| {"some_variable": [describe the variable purpose]}
| SamPatt wrote:
| 99% of the time is still super frustrating when it fails,
| if you're using it in a consumer facing app. You have to
| clean up the output to avoid getting an error. If it goes
| from 99% to 100% JSON that is a big deal for me, much
| simpler.
| jonplackett wrote:
| Except it says in the small print to expect invalid JSON
| occasionally, so you have to write your error handling
| code either way
| davepeck wrote:
| Yup. Is there a good/forgiving "drunken JSON parser"
| library that people like to use? Feels like it would be a
| useful (and separable) piece?
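|
| In the meantime I've been hand-rolling something like this (a
| sketch; it only strips chatter before/after the JSON, and won't
| fix structural damage):
|
|     import json
|
|     def parse_drunken_json(text):
|         """Best-effort parse of JSON wrapped in chatter."""
|         try:
|             return json.loads(text)
|         except json.JSONDecodeError:
|             # Fall back to the outermost braces and retry
|             start, end = text.find("{"), text.rfind("}")
|             if start != -1 and end > start:
|                 return json.loads(text[start:end + 1])
|             raise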
| minimaxir wrote:
| "Trivial" is misleading. From OpenAI's docs and demos, the full
| ReAct workflow is an order of magnitude more difficult than
| typical ChatGPT API usage with a new set of constaints (e.g.
| schema definitions)
|
| Even OpenAI's notebook demo has error handling workflows which
| was actually necessary since ChatGPT returned incorrect
| formatted output.
| cjonas wrote:
| Maybe trivial isn't the right word, but it's still very
| straight-forward to get something basic, yet really
| powerful...
|
| ReAct Setup Prompt (goal + available actions) -> Agent
| "ReAction" -> Parse & Execute Action -> Send Action Response
| (success or error) -> Agent "ReAction" -> repeat
|
| As long as each action has proper validation and returns
| meaningful error messages, you don't even need to change the
| control flow. The agent will typically understand what went
| wrong, and attempt to correct it in the next "ReAction".
|
| I've been refactoring some agents to use "functions" and so
| far it seems to be a HUGE improvement in reliability vs the
| "Return JSON matching this format" approach. Most impactful
| is the fact that "3.5-turbo" will now reliably return JSON
| (before you'd be forced to use GPT-4 for a ReAct-style agent
| of modest complexity).
|
| My agents also seem to be better at following other
| instructions now that the noise of the response format is
| gone (of course it's still there, but in a way it has been
| specifically trained on). This could also just be a result of
| the improvements to the system prompt though.
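|
| The whole loop, roughly (a sketch against the June 2023 API;
| execute_action and the function list are whatever your agent
| defines):
|
|     import json
|     import openai
|
|     def run_agent(messages, functions, execute_action,
|                   max_steps=10):
|         for _ in range(max_steps):
|             resp = openai.ChatCompletion.create(
|                 model="gpt-3.5-turbo-0613",
|                 messages=messages,
|                 functions=functions,
|             )
|             msg = resp["choices"][0]["message"]
|             messages.append(msg)
|             call = msg.get("function_call")
|             if call is None:
|                 return msg["content"]  # no action: we're done
|             try:
|                 result = execute_action(
|                     call["name"], json.loads(call["arguments"]))
|             except Exception as e:
|                 result = f"Error: {e}"  # meaningful error back
|             messages.append({"role": "function",
|                              "name": call["name"],
|                              "content": str(result)})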
| [deleted]
| lbeurerkellner wrote:
| It's interesting to think about this form of computation (LLM +
| function call) in terms of circuitry. It is still unclear to
| me, however, whether the sequential form of reasoning imposed
| by a sequence of chat messages is the right model here. LLM
| decoding, and also more high-level "reasoning algorithms" like
| tree of thought, are not that linear.
|
| Ever since we started working on LMQL, the overarching vision
| all along was to get to a form of language model programming,
| where LLM calls are just the smallest primitive of the "text
| computer" you are running on. It will be interesting to see
| what kind of patterns emerge, now that the smallest primitive
| becomes more robust and reliable, at least in terms of the
| interface.
| moneywoes wrote:
| Wow your brand is huge. Crazy growth. i wonder how much these
| subtle mentions on forums help
| TeMPOraL wrote:
| They're the only commenter on HN I've noticed who keeps
| writing "smol" instead of "small", and is associated with
| projects with "smol" in their name. Surely I'm not the only
| one who missed it being a meme around 2015 or so, and finds
| this word/use jarring - and therefore very attention-grabbing?
| Wonder how much that helps with marketing.
|
| This is meant with no negative intentions. It's just that
| 'swyx was, in my mind, "that HN-er that does AI and keeps
| saying 'smol'" for far longer than I was aware of
| latent.space articles/podcasts.
| ftxbro wrote:
| > "you can now trivially make GPT4 decide whether to call
| itself again, or to proceed to the next stage."
|
| Does this mean the GPT-4 API is now publicly available, or is
| there still a waitlist? If there's a waitlist and you literally
| are not allowed to use it no matter how much you are willing to
| pay then it seems like it's hard to call that trivial.
| bayesianbot wrote:
| "With these updates, we'll be inviting many more people from
| the waitlist to try GPT-4 over the coming weeks, with the
| intent to remove the waitlist entirely with this model. Thank
| you to everyone who has been patiently waiting, we are
| excited to see what you build with GPT-4!"
|
| https://openai.com/blog/function-calling-and-other-api-
| updat...
| Tostino wrote:
| Not GP, but it's still the latter... I've been (im)patiently
| waiting.
|
| From their blog post the other day: With these updates, we'll
| be inviting many more people from the waitlist to try GPT-4
| over the coming weeks, with the intent to remove the waitlist
| entirely with this model. Thank you to everyone who has been
| patiently waiting, we are excited to see what you build with
| GPT-4!
| londons_explore wrote:
| If you put contact info in your HN profile - especially an
| email address that matches one you use to log in to OpenAI -
| someone will probably give you access...
|
| Anyone with access can share it with any other user via the
| 'invite to organisation' feature. Obviously that allows the
| invited person to make requests billed to the inviter, but
| since most experiments cost only a few cents, that doesn't
| really matter much in practice.
| majormajor wrote:
| GPT-4 was already a massive improvement on 3.5 in terms of
| replying consistently in a certain JSON structure - I often
| don't even need to give examples, just a sentence describing
| the format.
|
| It's great to see they're making it even better, but where I'm
| currently still hitting the limit in GPT-4 for "shelling out"
| is it being truly "creative" or "introspective" about "do I
| need to ask for clarifications" or "can I find a truly novel
| way around this task" type of things, vs "here's a possible
| but half-baked sequence I'm going to follow".
| babyshake wrote:
| What would be an example where there needs to be an arbitrary
| level of recursive ability for GPT4 to call itself?
| iamflimflam1 wrote:
| It's pretty interesting how the work they've been doing on
| plugins has fed into this.
|
| I suspect that they've managed to get a lot of good training data
| by calling the APIs provided by plugins and detecting when it's
| gone wrong from bad request responses.
| irthomasthomas wrote:
| It's a shame they couldn't use yaml instead. I compared them and
| yaml uses about 20% fewer tokens. However, I can understand
| accuracy, derived from frequency, being more important than token
| budget.
| IshKebab wrote:
| I would imagine JSON is easier for an LLM to understand (and
| for humans!) because it doesn't rely on indentation and
| confusing syntax for lists, strings, etc.
| nasir wrote:
| It's a lot more straightforward to use JSON programmatically
| than YAML.
| TeMPOraL wrote:
| It really shouldn't be, though. I.e. not unless you're
| parsing or emitting it ad-hoc, for example by assuming that
| an expression like:
|
|     "{" + $someKey + ":" + $someValue + "}"
|
| produces valid JSON. It does - sometimes - and then it's
| indeed easier to work with. It'll also blow up in your face.
| Using JSON the right way - via a proper parser and serializer
| - should be identical to using YAML or any other equivalent
| format.
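|
| In Python terms (same idea in any language):
|
|     import json
|
|     value = 'he said "hi"'
|
|     # Ad-hoc concatenation: silently invalid JSON here
|     broken = '{"note": "' + value + '"}'
|
|     # Proper serializer: escaping is handled for you
|     ok = json.dumps({"note": value})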
| AdrienBrault wrote:
| I think YAML actually uses more tokens than JSON without
| indents, especially with deep data. For example "," being a
| single token makes JSON quite compact.
|
| You can compare JSON and YAML on
| https://platform.openai.com/tokenizer
| rank0 wrote:
| OpenAI integration is going to be a goldmine for criminals in the
| future.
|
| Everyone and their momma is gonna start passing poorly
| validated/sanitized client input to shared sessions of a non-
| deterministic function.
|
| I love the future!
| zyang wrote:
| Is it possible to fine-tune with custom data to output JSON?
| edwin wrote:
| That's not the current OpenAI recipe. Their expectation is that
| your custom data will be retrieved via a function/plugin and
| then be subsequently processed by a chat model.
|
| Only the older completion models (davinci, curie, babbage, ada)
| are available for fine-tuning.
| jamesmcintyre wrote:
| In the OpenAI blog post they mention "Convert "Who are my top
| ten customers this month?" to an internal API call", but I'm
| assuming they mean GPT will respond with structured JSON
| (which we define via a schema in the function prompt) that we
| can use to more easily and programmatically make that API
| call?
|
| I could be confused but I'm interpreting this function calling as
| "a way to define structured input and selection of function and
| then structured output" but not the actual ability to send it
| arbitrary code to execute.
|
| Still amazing, just wanting to see if I'm wrong on this.
| williamcotton wrote:
| This does not execute code!
| jamesmcintyre wrote:
| Ok, yea this makes sense. Also, for others curious about the
| flow, here's a video walkthrough I just skimmed through:
| https://www.youtube.com/watch?v=91VVM6MNVlk
| smallerfish wrote:
| I will experiment with this at the weekend. One thing I found
| useful when supplying a JSON schema in the prompt was that I
| could supply inline comments and tell it when to leave a field
| null, etc. I found that much more reliable than describing
| these nuances elsewhere in the prompt. Presumably I can't do
| this with functions, but maybe I'll be able to work around it
| in the prompt (particularly now that I have more room to play
| with).
| loughnane wrote:
| Just this morning I wrote a JSON object. I told GPT to turn it
| into a schema. I tweaked that and then gave it a list of terms
| for which I wanted GPT to populate the schema accordingly.
|
| It worked pretty well without any functions, but I did feel like
| I was missing something because I was ready to be explicit and
| there wasn't any way for me to tell that to GPT.
|
| I look forward to trying this out.
| mritchie712 wrote:
| Glad we didn't get too far into adopting something like
| Guardrails. This sort of kills its main value prop for OpenAI.
|
| https://shreyar.github.io/guardrails/
| Blahah wrote:
| Luckily it's for LLMs, not openai
| swyx wrote:
| i mean only at the most superficial level. she has a ton of
| other validators that aren't superseded (eg SQL is validated by
| branching the database - we discussed on our pod
| https://www.latent.space/p/guaranteed-quality-and-structure)
| mritchie712 wrote:
| yeah, listened to the pod (that's how I found out about
| guardrails!).
|
| fair point, I should have said: "value prop for our use
| case"... the thing I was most interested in was how well
| Guardrails structured output.
| Kiro wrote:
| Can I use this to make it reliably output code (say JavaScript)?
| I haven't managed to do it with just prompt engineering as it
| will still add explanations, apologies and do other unwanted
| things like splitting the code into two files as markdown.
| minimaxir wrote:
| Here's a demo of some system prompt engineering which produced
| better results for the older ChatGPT:
| https://github.com/minimaxir/simpleaichat/blob/main/examples...
|
| Coincidentally, the new gpt-3.5-turbo-0613 model also has
| better system prompt guidance: with the demo above and some
| further prompt tweaking, it's possible to get ChatGPT to
| output code super reliably.
| williamcotton wrote:
| Here's an approach to return just JavaScript:
|
| https://github.com/williamcotton/transynthetical-engine
|
| The key is the addition of few-shot exemplars.
| sanxiyn wrote:
| Not this, but using the token selection restriction approach,
| you can have an LLM produce output that conforms to an
| arbitrary formal grammar completely reliably: JavaScript,
| Python, whatever.
| Xen9 wrote:
| Marvin Minsky was so damn far ahead of his time with Society of
| Mind.
|
| Engineering of cognitively advanced multiagent systems will
| become the area of research of this century / multiple decades.
|
| GPT-GPT > GPT-API in terms of power.
|
| The space of possible combinations of GPT multiagents goes
| beyond imagination, since even GPT-4 alone already does.
|
| Multiagent systems are best modeled with signal theory, graph
| theory and cognitive science.
|
| Of course "programming" will also play a role, in sense of
| abstractions and creation of systems of / for thought.
|
| Signal theory will be a significant approach for thinking about
| embedded agency.
|
| Complex multiagent systems approach us.
| edwin wrote:
| For those who want to test out the LLM-as-API idea, we are
| building a turnkey prompt-to-API product. Here's Simon's recipe
| maker deployed in a minute:
| https://preview.promptjoy.com/apis/1AgCy9 . Public preview to
| make and test your own API: https://preview.promptjoy.com
| yonom wrote:
| This is cool! Are you using one-shot learning under the hood
| with a user-provided example?
| edwin wrote:
| BTW: Here's a more performant version (fewer tokens)
| https://preview.promptjoy.com/apis/jNqCA2 that uses a smaller
| example but will still generate pretty good results.
| edwin wrote:
| Thanks. We find few-shot learning to be more effective
| overall. So we are generating additional examples from the
| provided example.
| darepublic wrote:
| I have been using GPT-4 to translate natural language to JSON
| already, and on v4 (not v3) it hasn't returned any malformed
| JSON, iirc.
| yonom wrote:
| - If the only reason you're using v4 over v3.5 is to generate
|   JSON, you can now use this API and downgrade for faster and
|   cheaper API calls.
|
| - Malicious user input may break your JSON (by asking GPT to
|   include comments around the JSON, as another user suggested);
|   this may or may not be an issue (e.g. if one user can
|   influence other users' experience).
| nocsi wrote:
| What if you ask it to include comments in the JSON explaining
| its choices?
| courseofaction wrote:
| Nice to have an endpoint which takes care of this. I've been
| doing this manually, it's a fairly simple process:
|
| * Add "Output your response in json format, with the fields 'x',
| which indicates 'x_explanation', 'z', which indicates
| 'z_explanation' (...)" etc. GPT-4 does this fairly reliably.
|
| * Validate the response, repeat if malformed.
|
| * Bam, you've got a json.
|
| I wonder if they've implemented this endpoint with validation and
| carefully crafted prompts on the base model, or if this is
| specifically fine-tuned.
| 037 wrote:
| It appears to be fine-tuning:
|
| "These models have been fine-tuned to both detect when a
| function needs to be called (depending on the user's input) and
| to respond with JSON that adheres to the function signature."
|
| https://openai.com/blog/function-calling-and-other-api-updat...
| wskish wrote:
| here is code (with several examples) that takes it a couple of
| steps further by validating the output JSON against a pydantic
| model and providing feedback to the LLM when it gets either of
| those wrong:
|
| https://github.com/jiggy-ai/pydantic-chatcompletion/blob/mas...
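|
| the core idea looks roughly like this (a sketch, pydantic v1
| style; the Contact model is just a placeholder):
|
|     import openai
|     from pydantic import BaseModel, ValidationError
|
|     class Contact(BaseModel):  # placeholder schema
|         name: str
|         email: str
|
|     def chat_to_model(messages, model_cls=Contact, retries=3):
|         for _ in range(retries):
|             resp = openai.ChatCompletion.create(
|                 model="gpt-3.5-turbo", messages=messages)
|             content = resp["choices"][0]["message"]["content"]
|             try:
|                 return model_cls.parse_raw(content)
|             except ValidationError as e:
|                 # feed the error back so the model can fix it
|                 messages.append({"role": "assistant",
|                                  "content": content})
|                 messages.append({"role": "user",
|                                  "content": f"Fix this: {e}"})
|         raise RuntimeError("no valid output after retries")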
| sublinear wrote:
| > The process is simple enough that you can let non-technical
| people build something like this via a no-code interface. No-code
| tools can leverage this to let their users define "backend"
| functionality.
|
| Early prototypes of software can use simple prompts like this one
| to become interactive. Running an LLM every time someone clicks
| on a button is expensive and slow in production, but _probably
| still ~10x cheaper to produce than code._
|
| Hah wow... no. Definitely not.
| social_ism wrote:
| [dead]
| thorum wrote:
| The JSON schema not counting toward token usage is huge; that
| will really help reduce costs.
| minimaxir wrote:
| That is up in the air and needs more testing. Field
| descriptions, for example, are important but extraneous input
| that would be tokenized and count toward the costs.
|
| At the least for ChatGPT, input token costs were cut by 25%,
| so it evens out.
| stavros wrote:
| > Under the hood, functions are injected into the system
| message in a syntax the model has been trained on. This means
| functions count against the model's context limit and are
| billed as input tokens. If running into context limits, we
| suggest limiting the number of functions or the length of
| documentation you provide for function parameters.
| yonom wrote:
| I believe functions do count in some way toward the token
| usage, but in a more efficient way than pasting raw JSON
| schemas into the prompt. Nevertheless, the token usage seems to
| be far lower than with previous alternatives, which is awesome!
| adultSwim wrote:
| _Running an LLM every time someone clicks on a button is
| expensive and slow in production, but probably still ~10x cheaper
| to produce than code._
| edwin wrote:
| New techniques like semantic caching will help. This is the
| modern era's version of building a performant social graph.
| daralthus wrote:
| What's semantic caching?
| edwin wrote:
| With LLMs, the inputs are highly variable so exact match
| caching is generally less useful. Semantic caching groups
| similar inputs and returns relevant results accordingly. So
| {"dish":"spaghetti bolognese"} and {"dish":"spaghetti with
| meat sauce"} could return the same cached result.
| m3kw9 wrote:
| Or store it as a sentence embedding and calculate the vector
| distance, but that creates many edge cases.
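|
| A sketch of that approach (the 0.95 threshold is a guess, and
| tuning it is exactly where the edge cases live):
|
|     import numpy as np
|     import openai
|
|     cache = []  # (embedding, cached_result) pairs
|
|     def embed(text):
|         resp = openai.Embedding.create(
|             model="text-embedding-ada-002", input=text)
|         return np.array(resp["data"][0]["embedding"])
|
|     def semantic_lookup(text, threshold=0.95):
|         v = embed(text)
|         for e, result in cache:
|             cos = np.dot(v, e) / (np.linalg.norm(v)
|                                   * np.linalg.norm(e))
|             if cos >= threshold:
|                 return result  # close enough: cache hit
|         return None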
| minimaxir wrote:
| After reading the docs for the new ChatGPT function calling
| yesterday, it's structured and/or typed data for GPT input or
| output that's the key feature of these new models. The ReAct flow
| of tool selection that it provides is secondary.
|
| As this post notes, you don't even need the full flow of
| passing a function result back to the model: getting structured
| data from ChatGPT in itself has a lot of fun and practical use
| cases. You could coax previous versions of ChatGPT to "output
| results as JSON" with a system prompt, but in practice results
| were mixed, although even with this finetuned model the docs
| warn that there could still be parsing errors.
|
| OpenAI's demo for function calling is not a Hello World, to put
| it mildly: https://github.com/openai/openai-
| cookbook/blob/main/examples...
| tornato7 wrote:
| IIRC, there's a way to "force" LLMs to output proper JSON by
| adding some logic to the top-token selection, i.e. in the
| sampling step (whose randomness OpenAI exposes as temperature)
| you'd never choose a next token that results in broken JSON.
| The only way it could fail would be if the output exceeds the
| token limit. I wonder if OpenAI is doing something like this.
| ManuelKiessling wrote:
| Note that you don't necessarily need to have the AI output
| any JSON at all -- simply have it answer when asked for the
| value of a specific JSON key, and handle the JSON structure
| part in your own, hallucination-free code:
| https://github.com/manuelkiessling/php-ai-tool-bridge
| naiv wrote:
| Thanks for sharing!
| woodrowbarlow wrote:
| the linked article hypothesizes:
|
| > I assume OpenAI's implementation works conceptually similar
| to jsonformer, where the token selection algorithm is changed
| from "choose the token with the highest logit" to "choose the
| token with the highest logit which is valid for the schema".
| senko wrote:
| It would seem not, as the official documentation mentions the
| arguments may be hallucinated or _be a malformed JSON_.
|
| (Unless they mean the JSON syntax is valid but may not conform
| to the schema; they're unclear on that.)
| sanxiyn wrote:
| For various reasons, token selection may be implemented as
| upweighting/downweighting instead of an outright ban of
| invalid tokens. (Maybe it helps training?) Then the model
| could generate malformed JSON. I think it is premature to
| infer from "can generate malformed JSON" that OpenAI is not
| using token selection restriction.
| sanxiyn wrote:
| Note that this (token selection restriction) is even
| available on OpenAI API as logit_bias.
| newhouseb wrote:
| But only for the whole generation. So if you want to constrain
| things one token at a time (as you would to force things to
| follow a grammar), you have to make fresh calls and request
| only one token each time, which makes things more or less
| impractical if you want true guarantees. A few months ago I
| built this anyway, to suss out how much more expensive it was
| [1]
|
| [1] https://github.com/newhouseb/clownfish#so-how-do-i-use-
| this-...
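|
| The shape of it, for the curious (a sketch; allowed_token_ids
| is the hard, grammar-specific part and is hypothetical here,
| and logit_bias only accepts a limited number of entries, which
| is the real killer):
|
|     import openai
|
|     def constrained_generate(prompt, allowed_token_ids,
|                              max_len=200):
|         out = ""
|         for _ in range(max_len):
|             ids = allowed_token_ids(out)  # hypothetical helper
|             resp = openai.Completion.create(
|                 model="text-davinci-003",
|                 prompt=prompt + out,
|                 max_tokens=1,  # one fresh call per token
|                 logit_bias={str(i): 100 for i in ids},
|             )
|             tok = resp["choices"][0]["text"]
|             if not tok:
|                 break
|             out += tok
|         return out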
| have_faith wrote:
| How would a tweaked temp enforce non-broken output, exactly?
| isoprophlex wrote:
| Not traditional temperature, maybe the parent worded it
| somewhat obtusely. Anyway, to disambiguate...
|
| I think it works something like this: You let something
| akin to a json parser run with the output sampler. First
| token must be either '{' or '['; then if you see [ has the
| highest probability, you select that. Ignore all other
| tokens, even those with high probability.
|
| Second token must be ... and so on and so on.
|
| Guaranteed non-broken (or at least parseable) JSON.
| sanxiyn wrote:
| It's not temperature, but sampling. Output of LLM is
| probabilistic distribution over tokens. To get concrete
| tokens, you sample from that distribution. Unfortunately,
| OpenAI API does not expose the distribution. You only get
| the sampled tokens.
|
| As an example, on the link JSON schema is defined such that
| recipe ingredient unit is one of
| grams/ml/cups/pieces/teaspoons. LLM may output the
| distribution grams(30%), cups(30%), pounds(40%). Sampling
| the best token "pounds" would generate an invalid document.
| Instead, you can use the schema to filter tokens and sample
| from the filtered distribution, which is grams(50%),
| cups(50%).
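|
| The renormalization step in code, with the numbers above:
|
|     def filter_distribution(dist, valid):
|         # dist: {token: probability}; valid: legal tokens
|         kept = {t: p for t, p in dist.items() if t in valid}
|         total = sum(kept.values())
|         return {t: p / total for t, p in kept.items()}
|
|     dist = {"grams": 0.3, "cups": 0.3, "pounds": 0.4}
|     units = {"grams", "ml", "cups", "pieces", "teaspoons"}
|     print(filter_distribution(dist, units))
|     # -> {'grams': 0.5, 'cups': 0.5}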
| behnamoh wrote:
| What's the implication of this new change for Microsoft
| Guidance, LMQL, Langchain, etc.? It looks like much of their
| functionality (controlling model output) just became obsolete.
| Am I missing something?
| [deleted]
| lbeurerkellner wrote:
| If anything this removes a major roadblock for
| libraries/languages that want to employ LLM calls as a
| primitive, no? Although, I fear the vendor lock-in
| intensifies here, also given how restrictive and specific the
| Chat API is.
|
| Either way, as part of the LMQL team, I am actually pretty
| excited about this, also with respect to what we want to
| build going forward. This makes language model programming
| much easier.
| koboll wrote:
| `Although, I fear the vendor lock-in intensifies here, also
| given how restrictive and specific the Chat API is.`
|
| Eh, would be pretty easy to write a wrapper that takes a
| functions-like JSON Schema object and interpolates it into
| a traditional "You MUST return ONLY JSON in the following
| format:" prompt snippet.
| londons_explore wrote:
| > Although, I fear the vendor lock-in intensifies here,
|
| The OpenAI API is super simple - any other vendor is free
| to copy it, and I'm sure many will.
| m3kw9 wrote:
| It works pretty well. You define a few "functions" and enter a
| description of what each does; when the user prompts, it will
| understand the prompt and tell you which "function" it will
| likely use, which is just the function name. I feel like this
| is a new way to program, a sort of fuzzy-logic type of
| programming.
| Sai_ wrote:
| > fuzzy logic
|
| Yes and no. While the choice of which function to call depends
| on an LLM, ultimately you control the function itself, whose
| output is deterministic.
|
| Even today, given an api, people can choose to call or not call
| based on some factor. We don't call this fuzzy logic. E.g.,
| people can decide to sell or buy stock through an api based on
| some internal calculations - doesn't make the system "fuzzy".
| jonplackett wrote:
| This is useful, but for me at least GPT-4 is unusable because
| it sometimes takes 30+ seconds to reply to even basic queries.
| m3kw9 wrote:
| Also, the rate limit is pretty bad if you want to release any
| type of app.
| emilsedgh wrote:
| Building agents that use advanced APIs was not really
| practical until now. Things like Langchain's Structured Agents
| worked somewhat reliably, but due to the massive token count
| they were so slow that the experience was _never_ going to be
| useful.
|
| With this change, the speed at which our agent processes
| results has improved 5-6x, and it actually does a pretty good
| job of keeping to the schema.
|
| One problem that is not resolved yet is that it still
| hallucinates a lot of attributes. For example, we have a tool
| that allows it to create contacts in the user's CRM. I ask it
| to:
|
| "Create contacts for the top 3 Barcelona players."
|
| It creates a structure like this:
|
| 1. Lionel Messi - Email: lionel.messi@barcelona.com - Phone
| Number: +1234567890 - Tags: Player, Barcelona
|
| 2. Gerard Pique - Email: gerard.pique@barcelona.com - Phone
| Number: +1234567891 - Tags: Player, Barcelona
|
| 3. Marc-Andre ter Stegen - Email: marc-terstegen@barcelona.com -
| Phone Number: +1234567892 - Tags: Player, Barcelona
|
| And you can see it hallucinated email addresses and phone
| numbers.
| 037 wrote:
| I would never rely on an LLM as a source of such information,
| just as I wouldn't trust the general knowledge of a human being
| used as a database. Does your workflow include a step for
| information search? With the new json features, it should be
| easy to instruct it to perform a search or directly feed it the
| right pages to parse.
| pluijzer wrote:
| ChatGPT can be usefully for many things, but you should really,
| not use it if you want to retrieve factual data. This might
| partly be resolved by querying the internet like bing does but
| purely on the language model side these hallucinations are just
| an unavoidable part of it.
| Spivak wrote:
| Yep, the answer is _always_ _always_ to have it write the
| code / query / function / whatever you need, which you then
| parse and use to retrieve the data from an external system.
| dang wrote:
| Recent and related:
|
| _Function calling and other API updates_ -
| https://news.ycombinator.com/item?id=36313348 - June 2023 (154
| comments)
| minimaxir wrote:
| IMO this isn't a dupe and shouldn't be penalized as a result.
| dang wrote:
| It's certainly not a dupe. It looks like a follow-up though.
| No?
| minimaxir wrote:
| More a very timely but practical demo.
| dang wrote:
| Ok, thanks!
| EGreg wrote:
| Actually I'm looking to take GPT-4 output and create file
| formats like Keynote presentations, or pptx. Is that currently
| possible with some tools?
| yonom wrote:
| I would recommend creating a simplified JSON schema for the
| slides (say, a presentation is an array of slides, each slide
| has a title, body, optional image, and optional diagram, and
| each diagram is one of pie, table, ...). Then use a library to
| generate the pptx file from the generated content.
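|
| For pptx specifically, python-pptx would work (a sketch;
| assumes the model returned the simplified structure above):
|
|     from pptx import Presentation
|
|     def build_pptx(slides, path="deck.pptx"):
|         prs = Presentation()
|         layout = prs.slide_layouts[1]  # title + body layout
|         for s in slides:
|             slide = prs.slides.add_slide(layout)
|             slide.shapes.title.text = s["title"]
|             slide.placeholders[1].text = s["body"]
|         prs.save(path)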
| EGreg wrote:
| Library? What library?
|
| It seems to me that a Transformer should excel at
| transforming, say, text into pptx or pdf or HTML with CSS,
| etc.
|
| Why don't they train it on that? So I don't have to sit there
| with manually written libraries. It can easily transform HTML
| to XML or text bullet points, so why not the other formats?
| yonom wrote:
| I don't think the name "Transformer" is meant in the sense
| of "transforming between file formats".
|
| My intuition is that LLMs tend to be good at things human
| brains are good at (e.g. reasoning), and bad at things
| human brains are bad at (e.g. math, writing pptx binary
| files from scratch, ...).
|
| Eventually, we might get LLMs that can open PowerPoint and
| quickly design the whole presentation using a virtual mouse
| and keyboard, but we're not there yet.
| EGreg wrote:
| It's just XML. They can produce HTML and transform Python
| into PHP, etc.
|
| So why not? It's easy for them, no?
| stevenhuang wrote:
| Apparently pandoc also supports pptx, so you can tell GPT-4 to
| output markdown, then use pandoc to convert that markdown to
| pptx or pdf.
| edwin wrote:
| Here you go: https://preview.promptjoy.com/apis/m7oCyL
___________________________________________________________________
(page generated 2023-06-14 23:00 UTC)