[HN Gopher] Harmony: OpenAI's response format for its open-weight model series
       ___________________________________________________________________
        
       Harmony: OpenAI's response format for its open-weight model series
        
       Author : meetpateltech
       Score  : 359 points
       Date   : 2025-08-05 16:07 UTC (6 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | throwaway314155 wrote:
       | what's this for?
        
         | babelfish wrote:
         | read the README
        
         | koakuma-chan wrote:
         | Basically, LLMs are trained on a specific conversation
         | format, and if your input does not follow that format, the
         | LLM will perform poorly. We usually don't have to worry
         | about this because their API automatically puts our input
         | into the proper format, but I guess now that they've
         | open-sourced a model, they are also releasing the
         | corresponding format.
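         | 
         | Roughly, the idea is something like this minimal Python
         | sketch (the special-token names below are illustrative, not
         | the exact layout defined in the harmony spec):
         | 
         |     # Flatten a chat history into one string using special
         |     # tokens the model was trained on (token names made up
         |     # here for illustration).
         |     def render_chat(messages):
         |         parts = []
         |         for msg in messages:
         |             parts.append(
         |                 f"<|start|>{msg['role']}<|message|>"
         |                 f"{msg['content']}<|end|>"
         |             )
         |         parts.append("<|start|>assistant")  # cue the reply
         |         return "".join(parts)
         | 
         |     prompt = render_chat([
         |         {"role": "system", "content": "You are helpful."},
         |         {"role": "user", "content": "What is 2 + 2?"},
         |     ])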
        
       | jfoster wrote:
       | The page links to: https://gpt-oss.com/ and
       | https://openai.com/open-models
       | 
       | ... but these links aren't active yet. I presume they will be
       | imminently, and I guess that means that OpenAI are releasing an
       | open weights GPT model today?
        
       | FergusArgyll wrote:
       | None of their links work?
       | 
       | - https://gpt-oss.com/ Auth required?
       | 
       | - https://openai.com/open-models/ seems empty?
       | 
       | - https://cookbook.openai.com/topic/gpt-oss 404
       | 
       | - https://openai.com/index/gpt-oss-model-card/ empty page?
       | 
       | Am I holding the internet wrong?
        
         | FergusArgyll wrote:
         | Does seem like we're gonna get open weights models today tho
        
         | jfoster wrote:
         | I think they're currently doing the release. I am guessing
         | those will all be online soon.
        
         | stillpointlab wrote:
         | Also https://cookbook.openai.com/articles/openai-harmony is
         | referenced 3 times in the README but it returns a 404
        
           | Imustaskforhelp wrote:
           | The link does work now, for what it's worth
        
         | trenchpilgrim wrote:
         | Presumably they use GitHub and their release process is delayed
         | by the current GitHub outage.
        
           | echelon wrote:
           | Cosmically bad timing.
        
           | skhameneh wrote:
           | Apparently the issue was resolved, but looking at the
           | status page there's no indication there was an outage in
           | the last 24 hours... https://www.githubstatus.com/
           | 
           | Not a fan of how this was communicated.
        
             | skhameneh wrote:
             | The status page now reflects an incident; at the time
             | of writing, it had already been resolved for almost an
             | hour with no indication there had been an issue.
        
           | bbor wrote:
           | IDK I think this is on purpose:
           | https://nitter.net/sama/status/1952759361417466016#m
           | 
           |   we have a lot of new stuff for you over the next few
           |   days! something big-but-small today. and then a big
           |   upgrade later this week.
           | 
           | EDIT: nevermind, I spoke too soon! I guess this was referring
           | to GPT 5 later this week. https://openai.com/open-models/ is
           | live
        
         | rvz wrote:
         | > Am I holding the internet wrong?
         | 
         | The GitHub outage is delaying their release.
        
         | minimaxir wrote:
         | The new transformers release describes the model:
         | https://github.com/huggingface/transformers/releases/tag/v4....
         | 
         | > GPT OSS is a hugely anticipated open-weights release by
         | OpenAI, designed for powerful reasoning, agentic tasks, and
         | versatile developer use cases. It comprises two models: a big
         | one with 117B parameters (gpt-oss-120b), and a smaller one with
         | 21B parameters (gpt-oss-20b). Both are mixture-of-experts
         | (MoEs) and use a 4-bit quantization scheme (MXFP4), enabling
         | fast inference (thanks to fewer active parameters, see details
         | below) while keeping resource usage low. The large model fits
         | on a single H100 GPU, while the small one runs within 16GB of
         | memory and is perfect for consumer hardware and on-device
         | applications.
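         | 
         | For reference, loading the smaller model with transformers
         | should look roughly like this (a minimal sketch; the Hub id
         | "openai/gpt-oss-20b" is assumed from the release notes):
         | 
         |     from transformers import AutoModelForCausalLM, AutoTokenizer
         | 
         |     model_id = "openai/gpt-oss-20b"  # assumed Hub id
         |     tokenizer = AutoTokenizer.from_pretrained(model_id)
         |     model = AutoModelForCausalLM.from_pretrained(
         |         model_id, torch_dtype="auto", device_map="auto"
         |     )
         | 
         |     messages = [{"role": "user", "content": "Say hi."}]
         |     inputs = tokenizer.apply_chat_template(
         |         messages, add_generation_prompt=True,
         |         return_tensors="pt"
         |     ).to(model.device)
         |     out = model.generate(inputs, max_new_tokens=64)
         |     print(tokenizer.decode(out[0]))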
        
         | paxys wrote:
         | I'm guessing someone published the github repo too early.
        
           | echelon wrote:
           | GitHub is having an outage.
           | 
           | OpenAI might have tried coordinating the press release of
           | their open model to counter Google Genie 3 news but got stuck
           | in the middle of the outage.
        
             | Bluestein wrote:
             | GitHub got hugged to death by OpenAI :)
        
         | guluarte wrote:
         | https://ollama.com/library/gpt-oss
        
         | wilg wrote:
         | Every OpenAI announcement has threads of people complaining
         | that the links don't work yet, as if you could trivially
         | deploy 10 different interconnected websites all at the same
         | instant.
        
         | stronglikedan wrote:
         | > Am I holding the internet wrong?
         | 
         | Likely, considering every single one opens right up for me.
        
       | qsort wrote:
       | pelican when
        
         | MarcelOlsz wrote:
         | What's pelican?
        
           | babelfish wrote:
           | @simonw asks every new foundation model to generate an
           | SVG of a pelican riding a bicycle as part of his review
           | posts
        
             | echelon wrote:
             | The foundation model companies should just learn that case
             | and call it a day.
        
               | schmidtleonard wrote:
               | He spotted a pelican in a presentation the other week, so
               | they're on to him and he's on to them.
        
               | unglaublich wrote:
               | Benchmark-driven development, like Dieselgate in
               | automotive.
        
               | pythonaut_16 wrote:
               | Yes, they should definitely Goodhart the Pelican Test
               | so we can... just have to invent a new test?
        
               | Spivak wrote:
               | Yes but then you can use the pelican test in all your
               | marketing where you say that this is the < _apple slide
               | deck voice_ > most capable model. ever. And then ignore
               | the new test except as a footnote in some long dry boring
               | evaluation.
        
         | HaZeust wrote:
         | wen pelican.... WEN BICYCLE
        
         | righthand wrote:
         | I hope this ends in well-poisoning, where all data about
         | pelicans becomes associated with bicycles in some way, so
         | that you can't get any model to give you correct
         | information about pelicans or bicycles, but you can always
         | get a pelican riding a bicycle.
        
       | deckar01 wrote:
       | gpt-oss models are reportedly being hosted on huggingface.
       | 
       | https://www.bleepingcomputer.com/news/artificial-intelligenc...
        
         | bbor wrote:
         | (as of 3 days ago)
        
       | lajr wrote:
       | This format, or similar formats, seem to be the standard
       | now. I was just reading the "Lessons from Building Manus"[1]
       | post, and they discuss the Hermes format[2], which seems
       | similar in terms of being pseudo-XML.
       | 
       | My initial thought was how hacky the whole thing feels, but
       | the fact that something this simple works and gives rise to
       | complex behaviour (like coercing specific tool selection in
       | the Manus post) is quite elegant.
       | 
       | Also, as an aside, it's good that each standard tag appears
       | to be a single token in the OpenAI repo.
       | 
       | [1] https://manus.im/blog/Context-Engineering-for-AI-Agents-
       | Less...
       | 
       | [2] https://github.com/NousResearch/Hermes-Function-Calling
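       | 
       | For anyone who hasn't seen it, a Hermes-style tool call is
       | roughly pseudo-XML wrapping JSON. A minimal parsing sketch in
       | Python (tag name and JSON shape are from memory, so treat
       | them as illustrative rather than a spec):
       | 
       |     import json, re
       | 
       |     response = (
       |         "Checking the weather now. "
       |         '<tool_call>{"name": "get_weather", '
       |         '"arguments": {"city": "Oslo"}}</tool_call>'
       |     )
       | 
       |     # Pull the JSON payload out of the pseudo-XML wrapper.
       |     m = re.search(r"<tool_call>(.*?)</tool_call>", response,
       |                   re.DOTALL)
       |     if m:
       |         call = json.loads(m.group(1))
       |         print(call["name"], call["arguments"])
       | 
       | Coercing a specific tool then amounts to something like
       | prefilling the assistant turn up to the opening tag plus the
       | tool name, so the model can only complete the arguments.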
        
       | obviyus wrote:
       | Links seem to be working now:
       | 
       | - https://openai.com/index/introducing-gpt-oss/
       | 
       | - https://cdn.openai.com/pdf/419b6906-9da6-406c-a19d-1bb078ac7...
        
       | dr_dshiv wrote:
       | Yesterday I gave a presentation on the role of harmony in AI --
       | as a matter of philosophical interest. I'd previously written a
       | large literature review on the concept of harmony (here:
       | https://www.sciencedirect.com/science/article/pii/S240587262...).
       | If you are curious about the slides, here: Bit.ly/ozora2025
       | 
       | I assume they are using the concept of harmony to refer to the
       | consistent response format? Or is it their intention for an open
       | weights release?
        
       | Scene_Cast2 wrote:
       | I wonder how much performance is left on the table due to it not
       | being zero-copy.
        
       | irthomasthomas wrote:
       | Prediction: GPT-5 will use a consortium of models for parallel
       | reasoning, possibly including their oss versions. Each using
       | different 'channels' from the harmony spec.
       | 
       | I have a branch of llm-consortium where I was noodling with
       | giving each member model a role. Only problem is it's expensive
       | to evaluate these ideas so I put it on hold. But maybe now with
       | oss models being cheap I can try it on those.
        
         | Imustaskforhelp wrote:
         | What are your thoughts on another model like Qwen using
         | something like this?
         | 
         | Pardon me, but are you thinking that this method is
         | superior to mixture of experts?
        
           | irthomasthomas wrote:
           | I tested a consortium of Qwens on the brainfuck test and
           | it solved it, while the single models failed.
           | 
           | MoEs are a single model: an "expert" is a subset of
           | layers chosen by a router model for each token, which
           | makes them run faster. A consortium is a type of parallel
           | reasoning that uses multiple copies of the same model, or
           | different models, to generate responses in parallel and
           | find the best one.
           | 
           | All models have a jagged frontier with weird skill gaps.
           | A consortium can bridge those gaps and increase
           | performance on the frontier.
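           | 
           | A minimal sketch of the idea (query_model is a
           | hypothetical stand-in for whatever LLM client you use,
           | and a real implementation would parse the judge's verdict
           | more carefully):
           | 
           |     from concurrent.futures import ThreadPoolExecutor
           | 
           |     def query_model(model, prompt):
           |         # Stand-in: call your LLM client of choice here.
           |         raise NotImplementedError
           | 
           |     def consortium(prompt, members, judge):
           |         # Fan the same prompt out to every member model.
           |         with ThreadPoolExecutor() as pool:
           |             answers = list(
           |                 pool.map(lambda m: query_model(m, prompt),
           |                          members)
           |             )
           |         ballot = "\n\n".join(
           |             f"Candidate {i}:\n{a}"
           |             for i, a in enumerate(answers)
           |         )
           |         # A judge model picks the best candidate by index.
           |         verdict = query_model(
           |             judge,
           |             f"Question:\n{prompt}\n\n{ballot}\n\n"
           |             "Reply with only the number of the best "
           |             "candidate.",
           |         )
           |         return answers[int(verdict.strip())]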
        
             | onlyrealcuzzo wrote:
             | Has anyone compared a consortium of leading-edge 3B-20B
             | models to the most powerful single models?
             | 
             | I'd love to see how they performed.
        
               | irthomasthomas wrote:
               | Do you have a favourite benchmark? I may just have the
               | budget for testing some 3b models
        
         | nxobject wrote:
         | Computer science's favorite move: we've reached the limits of a
         | scaling law meant to benefit single-threaded processes, so
         | let's go parallel...
        
           | 42lux wrote:
           | we've been scaling in one direction for 2 years now...
        
         | mindwok wrote:
         | This is what Grok 4 Heavy does with apparent success.
        
           | irthomasthomas wrote:
           | They may have been inspired by it. It was shared by
           | karpathy... https://x.com/karpathy/status/1870692546969735361
           | 
           | I wish someone would extract the Grok Heavy prompts to
           | confirm, but I guess those jailbreakers don't have the $200
           | sub.
        
       | gsibble wrote:
       | It's weird to me that OpenAI would release a local model that
       | you can't plug directly into their client... kind of defeats
       | the purpose.
       | 
       | It also creates a walled garden on purpose.
        
       | accrual wrote:
       | > The format enables the model to output to multiple different
       | channels for chain of thought, and tool calling preambles along
       | with regular responses
       | 
       | That's pretty cool and seems like a logical next step to
       | structure AI outputs. We started out with a stream of plaintext.
       | In the future perhaps we'll have complex typed output.
       | 
       | Humans also emit many channels of information simultaneously.
       | Our speech, tone of voice, body language, our appearance - it
       | all has an impact on how our information is received by
       | another.
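       | 
       | A toy illustration of what "typed" multi-channel output could
       | look like once parsed (the channel names below are
       | illustrative, not taken from the spec):
       | 
       |     from dataclasses import dataclass, field
       | 
       |     @dataclass
       |     class ChanneledOutput:
       |         # Separate streams for reasoning, tool-call
       |         # preambles, and the user-facing answer.
       |         analysis: list = field(default_factory=list)
       |         commentary: list = field(default_factory=list)
       |         final: list = field(default_factory=list)
       | 
       |         def add(self, channel, text):
       |             getattr(self, channel).append(text)
       | 
       |     out = ChanneledOutput()
       |     out.add("analysis", "User wants weather; call a tool.")
       |     out.add("commentary", "Calling get_weather for Oslo...")
       |     out.add("final", "It's 14 degrees and raining in Oslo.")
       |     print(out.final[0])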
        
       | citizensinan wrote:
       | Same here - all those links are either broken or asking for auth.
       | Classic case of announcing something before the infrastructure is
       | ready.
       | 
       | This kind of coordination failure is surprisingly common with AI
       | releases lately. Remember when everyone was trying to access
       | GPT-4 on launch day? Or when Anthropic's Claude had those random
       | outages during their big announcements?
       | 
       | Makes you wonder if they're rushing to counter Google's Genie 3
       | news and got caught with their pants down during the GitHub
       | outage. The timing seems too coincidental.
       | 
       | At least when it does go live, having truly open weights models
       | will be huge for the community. Just wish they'd test their
       | deployment pipeline before hitting 'publish' on the blog post.
        
       ___________________________________________________________________
       (page generated 2025-08-05 23:02 UTC)