[HN Gopher] Harmony: OpenAI's response format for its open-weigh...
___________________________________________________________________
Harmony: OpenAI's response format for its open-weight model series
Author : meetpateltech
Score : 359 points
Date : 2025-08-05 16:07 UTC (6 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| throwaway314155 wrote:
| what's this for?
| babelfish wrote:
| read the README
| koakuma-chan wrote:
| Basically, LLMs are trained with a specific conversation
| format, and if your input does not follow that format, the LLM
| will perform poorly. We usually don't have to worry about this
| because their API automatically puts our input into the proper
| format, but I guess now that they open sourced a model, they
| are also releasing the corresponding format.
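|
| A Harmony-formatted exchange ends up looking roughly like this
| (my sketch from skimming the README, so the exact tokens and
| channel names may be off):
|
|     <|start|>user<|message|>What is 2 + 2?<|end|>
|     <|start|>assistant<|channel|>analysis<|message|>Just add.<|end|>
|     <|start|>assistant<|channel|>final<|message|>2 + 2 = 4.<|return|>
|
| The "analysis" channel carries the chain of thought and "final"
| carries the user-facing answer; the hosted API does this wrapping
| for you, but with open weights you have to render it yourself.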
| jfoster wrote:
| The page links to: https://gpt-oss.com/ and
| https://openai.com/open-models
|
| ... but these links aren't active yet. I presume they will be
| imminently, and I guess that means that OpenAI are releasing an
| open weights GPT model today?
| FergusArgyll wrote:
| None of their links work?
|
| - https://gpt-oss.com/ Auth required?
|
| - https://openai.com/open-models/ seems empty?
|
| - https://cookbook.openai.com/topic/gpt-oss 404
|
| - https://openai.com/index/gpt-oss-model-card/ empty page?
|
| Am I holding the internet wrong?
| FergusArgyll wrote:
| Does seem like we're gonna get open weights models today tho
| jfoster wrote:
| I think they're currently doing the release. I am guessing
| those will all be online soon.
| stillpointlab wrote:
| Also https://cookbook.openai.com/articles/openai-harmony is
| referenced 3 times in the README but it is 404
| Imustaskforhelp wrote:
| the link does work now, for what it's worth
| trenchpilgrim wrote:
| Presumably they use GitHub and their release process is delayed
| by the current GitHub outage.
| echelon wrote:
| Cosmically bad timing.
| skhameneh wrote:
| Apparently the issue was resolved, but there's no indication
| there was an outage in the last 24 hours when looking at
| status... https://www.githubstatus.com/
|
| Not a fan of this style of communication.
| skhameneh wrote:
| The status page now reflects an issue. When I wrote the above,
| the incident had already been resolved for almost an hour and
| there was still no indication of it.
| bbor wrote:
| IDK I think this is on purpose:
| https://nitter.net/sama/status/1952759361417466016#m
| we have a lot of new stuff for you over the next few days!
| something big-but-small today. and then a big upgrade
| later this week.
|
| EDIT: nevermind, I spoke too soon! I guess this was referring
| to GPT 5 later this week. https://openai.com/open-models/ is
| live
| rvz wrote:
| > Am I holding the internet wrong?
|
| The GitHub outage is delaying them on their release.
| minimaxir wrote:
| The new transformers release describes the model:
| https://github.com/huggingface/transformers/releases/tag/v4....
|
| > GPT OSS is a hugely anticipated open-weights release by
| OpenAI, designed for powerful reasoning, agentic tasks, and
| versatile developer use cases. It comprises two models: a big
| one with 117B parameters (gpt-oss-120b), and a smaller one with
| 21B parameters (gpt-oss-20b). Both are mixture-of-experts
| (MoEs) and use a 4-bit quantization scheme (MXFP4), enabling
| fast inference (thanks to fewer active parameters, see details
| below) while keeping resource usage low. The large model fits
| on a single H100 GPU, while the small one runs within 16GB of
| memory and is perfect for consumer hardware and on-device
| applications.
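|
| For reference, loading the small one with transformers should
| look roughly like this (a sketch, assuming the weights are
| published on the Hub as "openai/gpt-oss-20b"):
|
|     from transformers import pipeline
|
|     # "auto" lets transformers pick device placement and the
|     # dtype the checkpoint ships in
|     pipe = pipeline(
|         "text-generation",
|         model="openai/gpt-oss-20b",
|         torch_dtype="auto",
|         device_map="auto",
|     )
|     out = pipe(
|         [{"role": "user", "content": "Hello!"}],
|         max_new_tokens=64,
|     )
|     print(out[0]["generated_text"])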
| paxys wrote:
| I'm guessing someone published the github repo too early.
| echelon wrote:
| GitHub is having an outage.
|
| OpenAI might have tried coordinating the press release of
| their open model to counter Google Genie 3 news but got stuck
| in the middle of the outage.
| Bluestein wrote:
| GitHub got hugged to death by OpenAI :)
| guluarte wrote:
| https://ollama.com/library/gpt-oss
| wilg wrote:
| Every OpenAI announcement has threads of people complaining
| that the links don't work yet, as if you could trivially deploy
| 10 different interconnected websites all at once.
| stronglikedan wrote:
| > Am I holding the internet wrong?
|
| Likely, considering every single one opens right up for me.
| qsort wrote:
| pelican when
| MarcelOlsz wrote:
| What's pelican?
| babelfish wrote:
| @simonw asks every new foundation model to generate an SVG of
| a pelican riding a bicycle as a part of his review post
| echelon wrote:
| The foundation model companies should just learn that case
| and call it a day.
| schmidtleonard wrote:
| He spotted a pelican in a presentation the other week, so
| they're on to him and he's on to them.
| unglaublich wrote:
| Benchmark-driven development, like Dieselgate in
| automotive.
| pythonaut_16 wrote:
| Yes, they should definitely Goodhart the Pelican Test so
| we can... just have to invent a new test?
| Spivak wrote:
| Yes but then you can use the pelican test in all your
| marketing where you say that this is the < _apple slide
| deck voice_ > most capable model. ever. And then ignore
| the new test except as a footnote in some long dry boring
| evaluation.
| HaZeust wrote:
| wen pelican.... WEN BICYCLE
| righthand wrote:
| I hope this ends in well-poisoning, where all data about
| pelicans gets associated with bicycles in some way, so that you
| can't get any model to give you correct information about
| pelicans or bicycles, but you can always get a pelican riding a
| bicycle.
| deckar01 wrote:
| gpt-oss models are reportedly being hosted on huggingface.
|
| https://www.bleepingcomputer.com/news/artificial-intelligenc...
| bbor wrote:
| (as of 3 days ago)
| lajr wrote:
| This format, or similar formats, seems to be the standard now. I
| was just reading the "Lessons from Building Manus"[1] post, and
| they discuss the Hermes format[2], which seems similar in terms
| of being pseudo-XML.
|
| My initial thought was how hacky the whole thing feels, but the
| fact that something this simple works and gives rise to complex
| behaviour (like coercing specific tool selection in the Manus
| post) is actually quite elegant.
|
| As an aside, it appears that each standard tag is a single token
| in the OpenAI repo, which is good.
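|
| A quick way to check, assuming the "o200k_harmony" encoding has
| landed in a recent tiktoken release (the encoding name is my
| guess, not confirmed from the repo):
|
|     import tiktoken
|
|     enc = tiktoken.get_encoding("o200k_harmony")
|     tags = ["<|start|>", "<|channel|>", "<|message|>", "<|end|>"]
|     for tag in tags:
|         ids = enc.encode(tag, allowed_special="all")
|         print(tag, ids)  # each tag should come back as one id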
|
| [1] https://manus.im/blog/Context-Engineering-for-AI-Agents-
| Less... [2] https://github.com/NousResearch/Hermes-Function-
| Calling
| obviyus wrote:
| Links seem to be working now:
|
| - https://openai.com/index/introducing-gpt-oss/
|
| - https://cdn.openai.com/pdf/419b6906-9da6-406c-a19d-1bb078ac7...
| dr_dshiv wrote:
| Yesterday I gave a presentation on the role of harmony in AI --
| as a matter of philosophical interest. I'd previously written a
| large literature review on the concept of harmony (here:
| https://www.sciencedirect.com/science/article/pii/S240587262...).
| If you are curious about the slides, here: Bit.ly/ozora2025
|
| I assume they are using the concept of harmony to refer to the
| consistent response format? Or does the name reflect their
| intentions for the open-weights release?
| Scene_Cast2 wrote:
| I wonder how much performance is left on the table due to it not
| being zero-copy.
| irthomasthomas wrote:
| Prediction: GPT-5 will use a consortium of models for parallel
| reasoning, possibly including their oss versions. Each using
| different 'channels' from the harmony spec.
|
| I have a branch of llm-consortium where I was noodling with
| giving each member model a role. Only problem is it's expensive
| to evaluate these ideas so I put it on hold. But maybe now with
| oss models being cheap I can try it on those.
| Imustaskforhelp wrote:
| What are your thoughts on some other model like qwen using
| something like this?
|
| Pardon me, but are you thinking that this method is superior
| to mixture of experts? What are your thoughts?
| irthomasthomas wrote:
| I tested a consortium of qwens on the brainfuck test and it
| solved it, while the single models failed.
|
| MoEs are a single model: an 'expert' is a subset of the
| feed-forward layers chosen by a router for each token, which
| makes them run faster. A consortium is a type of parallel
| reasoning that uses multiple instances of the same or different
| models to generate parallel responses and pick the best one.
|
| All models have a jagged frontier with weird skill gaps. A
| consortium can bridge those gaps and increase performance on
| the frontier.
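|
| In code the idea is nothing fancy; roughly this shape (a
| hypothetical sketch, not llm-consortium's actual API, and the
| model ids are placeholders):
|
|     from openai import OpenAI
|
|     # point base_url at a local server (ollama, vllm, ...) for oss models
|     client = OpenAI()
|     members = ["gpt-oss-20b", "qwen3-32b"]  # placeholder model ids
|
|     def ask(model, prompt):
|         r = client.chat.completions.create(
|             model=model,
|             messages=[{"role": "user", "content": prompt}],
|         )
|         return r.choices[0].message.content
|
|     def consortium(prompt):
|         # fan out (sequential here for brevity), then arbitrate
|         answers = [ask(m, prompt) for m in members]
|         ballot = "\n\n".join(f"[{i}] {a}" for i, a in enumerate(answers))
|         judge = "Reply with the index of the best answer:\n\n" + ballot
|         pick = ask(members[0], judge)
|         return answers[int(pick.strip())]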
| onlyrealcuzzo wrote:
| Has anyone compared a consortium of leading-edge 3B-20B models
| to the most powerful models?
|
| I'd love to see how they performed.
| irthomasthomas wrote:
| Do you have a favourite benchmark? I may just have the
| budget for testing some 3B models.
| nxobject wrote:
| Computer science's favorite move: we've reached the limits of a
| scaling law meant to benefit single-threaded processes, so
| let's go parallel...
| 42lux wrote:
| we've been scaling in one direction for 2 years now...
| mindwok wrote:
| This is what Grok 4 Heavy does with apparent success.
| irthomasthomas wrote:
| They may have been inspired by it. It was shared by
| karpathy... https://x.com/karpathy/status/1870692546969735361
|
| I wish someone would extract the Grok Heavy prompts to
| confirm, but I guess those jailbreakers don't have the $200
| sub.
| gsibble wrote:
| It's weird to me that OpenAI would release a local model that
| you can't plug directly into their ChatGPT client... kind of
| defeats the purpose.
|
| Also creates a walled garden on purpose.
| accrual wrote:
| > The format enables the model to output to multiple different
| channels for chain of thought, and tool calling preambles along
| with regular responses
|
| That's pretty cool and seems like a logical next step to
| structure AI outputs. We started out with a stream of plaintext.
| In the future perhaps we'll have complex typed output.
|
| Humans also emit many channels of information simultaneously. Our
| speech, tone of voice, body language, our appearance - it all has
| an impact on how our information is received by another.
| citizensinan wrote:
| Same here - all those links are either broken or asking for auth.
| Classic case of announcing something before the infrastructure is
| ready.
|
| This kind of coordination failure is surprisingly common with AI
| releases lately. Remember when everyone was trying to access
| GPT-4 on launch day? Or when Anthropic's Claude had those random
| outages during their big announcements?
|
| Makes you wonder if they're rushing to counter Google's Genie 3
| news and got caught with their pants down during the GitHub
| outage. The timing seems too coincidental.
|
| At least when it does go live, having truly open weights models
| will be huge for the community. Just wish they'd test their
| deployment pipeline before hitting 'publish' on the blog post.
___________________________________________________________________
(page generated 2025-08-05 23:02 UTC)