[HN Gopher] DeepSeekMath-V2: Towards Self-Verifiable Mathematica...
       ___________________________________________________________________
        
       DeepSeekMath-V2: Towards Self-Verifiable Mathematical Reasoning
        
       Author : victorbuilds
       Score  : 251 points
       Date   : 2025-12-01 08:54 UTC (14 hours ago)
        
 (HTM) web link (huggingface.co)
 (TXT) w3m dump (huggingface.co)
        
       | victorbuilds wrote:
       | Notable: they open-sourced the weights under Apache 2.0, unlike
       | OpenAI and DeepMind whose IMO gold models are still proprietary.
        
         | SilverElfin wrote:
         | If they open source just weights and not the training code and
         | data, then it's still proprietary.
        
           | mips_avatar wrote:
           | Yeah but you can distill
        
             | amelius wrote:
             | Is that the equivalent of decompile?
        
               | c0balt wrote:
               | No, that is the equivalent of lossy compression.
        
             | littlestymaar wrote:
             | You can distill closed weights models as well. (Just not
             | logit-distillation)
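              | 
              | For the curious, a minimal sketch of what logit
              | distillation looks like, assuming a PyTorch-style setup
              | where you have access to the teacher's logits; with a
              | closed-weights API you only get sampled text back, so
              | you can imitate outputs but not the full distribution:
              | 
              | ```python
              | import torch.nn.functional as F
              | 
              | def distill_loss(teacher_logits, student_logits, T=2.0):
              |     # Soften both distributions with temperature T and
              |     # match them with KL divergence: the student learns
              |     # the teacher's full next-token distribution, not
              |     # just its sampled text.
              |     p_t = F.softmax(teacher_logits / T, dim=-1)
              |     log_p_s = F.log_softmax(student_logits / T, dim=-1)
              |     return F.kl_div(log_p_s, p_t,
              |                     reduction="batchmean") * T * T
              | ```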
        
               | mips_avatar wrote:
               | Though it violates their terms of service
        
           | falcor84 wrote:
           | Isn't that a bit like saying that if I open source a tool,
           | but not a full compendium of all the code that I had read,
           | which led me to develop it, then it's not really open source?
        
             | fragmede wrote:
              | No. In that case, you're providing two things: a binary
              | version of your tool and the tool's source. Others can
              | inspect that source and build their own copy. However,
              | given just the weights, we don't have the source and
              | can't inspect what alignment went into the model. In the
              | case of DeepSeek, we know they purposefully caused their
              | model to treat Tiananmen Square as something it
              | shouldn't discuss. But without the source used to create
              | the model, we don't know what else is lurking inside it.
        
               | NitpickLawyer wrote:
               | > However, given just the weights, we don't have the
               | source
               | 
               | This is incorrect, given the definitions in the license.
               | 
               | > (Apache 2.0) "Source" form shall mean _the preferred
                | form for making modifications_, including but not
               | limited to software source code, documentation source,
               | and configuration files.
               | 
               | (emphasis mine)
               | 
               | In LLMs, the weights _are_ the preferred form of making
               | modifications. Weights are not _compiled_ from something
               | else. You start with the weights (randomly initialised)
               | and at every step of training you adjust the weights.
               | That is not akin to compilation, for many reasons (both
               | theoretical and practical).
               | 
                | In general, licenses do not give you rights over the
                | "know-how" or "processes" by which the licensed parts
                | were created. What you get is the ability to inspect,
                | modify, and redistribute the work as you see fit. And
                | most importantly, you modify the work just like the
                | creators modify the work (hence "the preferred form").
                | Just not with the same data (i.e. you can modify the
                | source of Chrome all you want, just not with the
                | "know-how and knowledge" of a Google engineer - the
                | license cannot offer that).
               | 
               | This is also covered in the EU AI act btw.
               | 
               | > General-purpose AI models released under free and open-
               | source licences should be considered to ensure high
               | levels of transparency and openness if their parameters,
               | including the weights, the information on the model
               | architecture, and the information on model usage are made
               | publicly available. The licence should be considered to
               | be free and open-source also when it allows users to run,
               | copy, distribute, study, change and improve software and
               | data, including models under the condition that the
               | original provider of the model is credited, the identical
               | or comparable terms of distribution are respected.
        
               | fragmede wrote:
               | > In LLMs, the weights are the preferred form of making
               | modifications.
               | 
               | No they aren't. We happen to be able to do things to
               | modify the weights, sure, but why would any lab ever
               | train something from scratch if editing weights was
               | _preferred_?
        
               | NitpickLawyer wrote:
                | Training _is_ modifying the weights. How you modify
                | them has never been the object of a license.
        
               | noodletheworld wrote:
               | > And most importantly, you modify the work _just like
               | the creators modify the work_
               | 
               | Emphasis mine.
               | 
               | Weights are not open source.
               | 
                | You can define terms to mean whatever you want, but
                | _fundamentally_, if you cannot modify the "output" the
                | way the original creators could, it's not in the
                | spirit of open source.
                | 
                | Isn't that _literally_ what you said?
                | 
                | How can you possibly claim both a) that you can modify
                | it the way the creators did, and b) that this is all
                | you need to be open source, while...
                | 
                | Also c) making the _categorically incorrect_ assertion
                | that the weights allow you to do this?
                | 
                | Whatever, I guess, but your argument is logically
                | wrong and philosophically flawed.
        
               | NitpickLawyer wrote:
               | > Weights are not open source.
               | 
               | If they are released under an open source license, they
               | are.
               | 
               | I think you are confusing two concepts. One is the
               | technical ability to modify weights. And that's what the
               | license grants you. The right to modify. The second is
               | the "know-how" on _how_ to modify the weights. That is
               | not something that a license has ever granted you.
               | 
               | Let me put it this way:
               | 
                | ```python
                | THRESHOLD = 0.73214
                | 
                | if float(input()) < THRESHOLD:
                |     print("low")
                | else:
                |     print("high")
                | ```
               | 
               | If I release that piece of code under Apache 2.0, you
               | have the right to study it, modify it and release it as
                | you see fit. But you _can not_ have the right (at least
                | the license doesn't deal with that) to know _how_ I
                | reached that threshold value. And me not telling you
                | does not in any way invalidate the license being
                | Apache 2.0. That's simply not something that licenses
                | do.
               | 
               | In LLMs the source is a collection of architecture (when
               | and how to apply the "ifs"), inference code (how to
               | optimise the computation of the "ifs") and hardcoded
               | values (weights). You are being granted a license to run,
               | study, modify and release those hardcoded values. You do
               | not, never had, never will in the scope of a license, get
               | the right to know how those hardcoded values were
               | reached. The process by which those values were found can
               | be anything from "dreamt up" to "found via ML". The fact
               | that you don't know _how_ those values were derived does
               | not in any way preclude you from exercising the rights
               | under the license.
        
               | roblabla wrote:
               | You are fundamentally conflating releasing a binary under
               | an open source license with the software being open
               | source. Nobody is saying that they're violating the
               | license of Apache2 by not releasing the training data.
                | What people are objecting to is calling this release
                | "open source" when the only thing covered by the open
                | source license is the weights; they consider that an
                | abuse of the meaning of "Open Source".
               | 
               | To give you an example: I can release a binary (without
               | sources) under the MIT - an open source license. That
               | will give you the rights to use, copy, modify, merge,
               | publish, distribute, sublicense, and/or sell copies of
               | said binary. In doing so, I would have released the
               | binary under an open source license. However, most people
               | would agree that the software would not be open source
               | under the conventional definition, as the sources would
               | not be published. While people could modify it by
               | disassembling it and modifying it, there is a general
               | understanding that Open Source requires distributing the
               | _sources_.
               | 
               | This is very similar to what is being done here. They're
               | releasing the weights under an open source license - but
               | the overall software is not open source.
        
               | v9v wrote:
               | Would you accept the argument that compiling is modifying
               | the bytes in the memory space reserved for an executable?
               | 
               | I can edit the executable at the byte level if I so
               | desire, and this is also what compilers do, but the
               | developer would instead be modifying the source code to
               | make changes to the program and then feed that through a
               | compiler.
               | 
               | Similarly, I can edit the weights of a neural network
               | myself (using any tool I want) but the developers of the
               | network would be altering the training dataset and the
               | training code to make changes instead.
        
               | NitpickLawyer wrote:
               | I think the confusion for a lot of people comes from what
               | they imagine compilation to be. In LLMs, the process is
               | this (simplified):
               | 
               | define_architecture (what the operations are, and the
               | order in which they're performed)
               | 
               | initialise_model(defined_arch) -> weights. Weights are
               | "just" hardcoded values. Nothing more, nothing less.
               | 
               | The weights are the result of the arch, at "compile"
               | time.
               | 
               | optimise_weights(weights, data) -> better_weights.
               | 
               | ----
               | 
               | You can, should you wish, totally release a model after
                | initialisation. It would be a useless model, but, again,
               | the license does not deal with that. You would have the
               | rights to run, modify and release the model, even if it
               | were a random model.
               | 
                | tl;dr: Licenses deal with _what_ you can do with a model.
               | You can run it, modify it, redistribute it. They do not
               | deal with _how_ you modify them (i.e. what data you use
               | to arrive at the  "optimal" hardcoded values). See also
               | my other reply with a simplified code example.
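                | 
                | To make that concrete, here's a minimal PyTorch sketch
                | of the flow above, with a toy architecture made up
                | purely for illustration (not DeepSeek's):
                | 
                | ```python
                | import torch
                | import torch.nn as nn
                | 
                | # define_architecture + initialise_model -> weights:
                | # constructing the module randomly initialises the
                | # weights; you could already release this (useless)
                | # model under whatever license you like.
                | model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(),
                |                       nn.Linear(32, 1))
                | 
                | # optimise_weights(weights, data) -> better_weights:
                | # each training step modifies the weights in place;
                | # nothing is "compiled" from a separate source form.
                | opt = torch.optim.SGD(model.parameters(), lr=1e-2)
                | x, y = torch.randn(8, 16), torch.randn(8, 1)
                | loss = nn.functional.mse_loss(model(x), y)
                | loss.backward()
                | opt.step()
                | ```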
        
               | falcor84 wrote:
               | The big difference that an Open Source license gives me
               | is that regardless of the tool I use to make the edits,
               | if I rewrite the bytes of the Linux kernel, I can freely
               | release my version with the same license, but if I
               | rewrite the bytes of Super Mario Odyssey and try to
               | release the modified version, I'll soon be having a very
               | fun time at the bankruptcy court.
        
             | nextaccountic wrote:
              | No, it's more like saying that releasing something under
              | the Apache license doesn't make it open source, even
              | though Apache is an open source license.
              | 
              | For something to be open source, it needs to have its
              | sources released. Sources are the things in the
              | preferred form for being edited. So the code used for
              | training is obviously source (people can edit the
              | training code to change something about the released
              | weights). So is the training data, under the same
              | rationale: people can select which data is used for
              | training to change the weights.
        
               | falcor84 wrote:
               | Well, this is just semantics. I can have a repo that
               | includes a collection of json files that I had generated
               | via a semi-manual build process that depends on
               | everything from the state of my microbiome to my cat's
               | scratching pattern during Mercury's last retrograde. If I
               | attach an open source license to it, then that's the
               | source - do with it what you will. Otherwise, I don't see
               | how this discussion doesn't lead to "you must first
               | invent the universe".
        
               | typ wrote:
                | The difference is whether you can customize/debug it
                | or not. You might say that a .EXE can be modified too,
                | but I don't think that's the conventional definition
                | of open source.
               | 
               | I understand that these days, businesses and hobbyists
               | just want to use free LLMs without paying subscriptions
               | for economic motives, that is, either saving money or
               | making money. They don't really care whether the _source_
               | is truly available or not. They are just end users of a
               | product, not open-source developers by any means.
        
               | nextaccountic wrote:
                | Not just semantics: the concept of open source
                | fundamentally depends on what the preferred form of
                | modification is.
               | 
               | https://opensource.org/ai/open-source-ai-definition
        
             | nurettin wrote:
             | Is this a troll? They don't want to reproduce your open
             | source code, they want to reproduce the weights.
        
               | falcor84 wrote:
               | What does open sourcing have to do with "reproducing"?
               | Last I checked, open sourcing is about allowing others to
               | modify and to distribute the modified version, which you
               | can do with these. Yes, having the full training data and
               | tooling would make it significantly easier, and it is a
               | requirement for GPL, but not for Open Source licenses in
               | general. You may add this as another argument in favor of
               | going back in time and doing more to support Richard
               | Stallman's vision, but this is the world in which we live
               | now.
        
               | nurettin wrote:
               | For obvious reasons, there is no world in which you can
               | "build" this kind of so-called open source project
               | without the data sets. Play around with words all you
               | want.
        
             | KaiserPro wrote:
              | No, it's like releasing a binary. I can hook into it and
              | its API and make it do other things, but I can't rebuild
              | it from scratch.
        
               | falcor84 wrote:
               | > rebuild it from scratch
               | 
               | That's beyond the definition of Open Source. Doing a bit
               | of license research now, only the GPL has such a
               | requirement - GPLv3:
               | 
               | > The "Corresponding Source" for a work in object code
               | form means all the source code needed to generate,
               | install, and (for an executable work) run the object code
               | and to modify the work, including scripts to control
               | those activities.
               | 
               | But all other Open Source compliant licenses I checked
               | don't, and just refer to making whatever is in the repo
               | available to others.
        
               | PunchyHamster wrote:
                | OK, but just the model isn't even close to anything
                | open; it's essentially a compiled binary, without even
                | the source data.
        
               | KaiserPro wrote:
                | If you distribute a binary to someone under GPLv2, you
                | should also, if asked, provide the source code used to
                | _build_ that binary. Other licenses differ: MIT, for
                | example, lets you do pretty much anything, so long as
                | you keep the MIT license and attribution public.
                | 
                | But when people talk about open source, they generally
                | mean "I can see the source code and build it myself",
                | rather than freeware, which is "I can run the binary
                | and not have to pay".
        
             | exe34 wrote:
             | "open source" as a verb is doing too much work here. are
             | you proposing to release the human readable code or the
             | object/machine code?
             | 
             | if it's the latter, it's not the source. it's free as in
             | beer. not freedom.
        
               | falcor84 wrote:
               | Yes, I 100% agree. Open Source is a lot more about not
               | paying than about liberty.
               | 
               | This is exactly the tradeoff that we had made in the
               | industry a couple of decades ago. We could have pushed
               | all-in on Stallman's vision and the FSF's definition of
               | Free Software, but we (collectively) decided that it's
               | more important to get the practical benefits of having
               | all these repos up there on GitHub and us not suing each
               | other over copyright infringement. It's absolutely
               | legitimate to say that we made the wrong choice, and I
               | might agree, but a choice was made, and Open Source !=
               | Free Software.
               | 
               | https://www.gnu.org/philosophy/open-source-misses-the-
               | point....
        
           | amelius wrote:
           | True. But the headline says open weights.
        
           | ekianjo wrote:
            | It's just open weights; the source has no place in this
            | expression.
        
           | jimmydoe wrote:
           | you are absolutely right. I'd rather use true closed models,
           | not fake open source ones from China.
        
         | PunchyHamster wrote:
         | I think we should treat copyright for the weights the same way
         | the AI companies treat source material ;)
        
           | littlestymaar wrote:
            | We don't even have to do that: since weights are entirely
            | machine-generated without human intervention, they are
            | likely not copyrightable in the first place.
            | 
            | In fact, we should collectively refuse to abide by these
            | fantasy licenses before weight copyrightability gets
            | created out of thin air just because such licenses have
            | been commonplace for long enough.
        
             | mitthrowaway2 wrote:
             | There's an argument by which machine-learned neural network
             | weights are a lossy compression of (as well as a smooth
             | interpolator over) the training set.
             | 
             | An mp3 file is also a machine-generated lossy compression
             | of a cd-quality .wav file, but it's clearly copyrightable.
             | 
             | To that extent, the main difference between a neural
             | network and an .mp3 is that the mp3 compression cannot be
             | used to interpolate between two copyrighted works to output
             | something in the middle. This is, on the other hand,
             | perhaps the most common use case for genAI, and it's
             | actually tricky to get it to not output something "in the
             | middle" (but also not impossible).
             | 
             | I think the copyright argument could really go either way
             | here.
        
               | littlestymaar wrote:
               | > An mp3 file is also a machine-generated lossy
               | compression of a cd-quality .wav file, but it's clearly
               | copyrightable.
               | 
                | Not the .mp3 itself, the creative piece of art that it
                | encodes.
                | 
                | You can't record Taylor Swift at a concert and claim
                | copyright on that. Nor can you claim copyright on an
                | mp3 re-encoding of old audio footage that belongs to
                | the public domain.
                | 
                | Whether LLMs are in the first category (copyright
                | infringement against the copyright holders of the
                | training data) or in the second (public domain or fair
                | use) is an open question that jurisprudence is slowly
                | resolving, depending on the jurisdiction, but that
                | doesn't address the question of the weights themselves.
        
               | mitthrowaway2 wrote:
                | Right, the .mp3 is machine-generated but from a
                | creatively generated input. The analogy I'm making is
                | that an LLM's weights (or, let's say, a diffusion
                | image model's) are also machine-generated (by the
                | training process) from the works in its training set,
                | many of which are creative works, and the neural
                | network encodes those creative works much like an mp3
                | file does.
                | 
                | In this analogy, distributing the weights would be
                | akin to distributing an mp3, and offering a genAI
                | service, like ChatGPT inference or a Stable Diffusion
                | API, would be akin to broadcasting.
        
               | littlestymaar wrote:
                | I'd be fine with this interpretation, but it would
                | definitely rule out fair use for training, and would
                | be even worse for LLM makers than having the weights
                | be non-copyrightable.
        
               | mitthrowaway2 wrote:
               | Oh yes, absolutely.
        
           | larodi wrote:
            | Of course we should! And everyone who says otherwise must
            | be delusional or something of a gaslighter, as this whole
            | "innovation" (or remix, or compression) is enabled by the
            | creative value of the source material. Given that AI
            | companies never respected this copyright, we should give
            | them similar treatment.
        
       | ilmj8426 wrote:
       | It's impressive to see how fast open-weights models are catching
       | up in specialized domains like math and reasoning. I'm curious if
       | anyone has tested this model for complex logic tasks in coding?
       | Sometimes strong math performance correlates well with debugging
       | or algorithm generation.
        
         | stingraycharles wrote:
         | kimi-k2 is pretty decent at coding but it's nowhere near the
         | SOTA models of Anthropic/OpenAI/Google.
        
           | tripplyons wrote:
           | Are you referring to the new reasoning version of Kimi K2?
        
         | alansaber wrote:
          | It makes complete sense to me: highly specific models don't
          | have much commercial value, and at-scale LLM training
          | favours generalism.
        
       | yorwba wrote:
       | Previous discussion:
       | https://news.ycombinator.com/item?id=46072786 218 points 3 days
       | ago, 48 comments
        
         | victorbuilds wrote:
         | Ah, missed that one. Thanks for the link.
        
       | terespuwash wrote:
       | Why isn't OpenAI's gold medal-winning model available to the
       | public yet?
        
         | esafak wrote:
          | 'coz it was an advertisement. They'll roll their lessons
          | into the next general-purpose model.
        
       | H8crilA wrote:
       | How do you run this kind of a model at home? On a CPU on a
       | machine that has about 1TB of RAM?
        
         | pixelpoet wrote:
         | Wow, it's 690GB of downloaded data, so yeah, 1TB sounds about
         | right. Not even my two Strix Halo machines paired can do this,
         | damn.
        
         | Gracana wrote:
         | You can do it slowly with ik_llama.cpp, lots of RAM, and one
         | good GPU. Also regular llama.cpp, but the ik fork has some
         | enhancements that make this sort of thing more tolerable.
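          | 
          | If you'd rather drive it from Python, the llama-cpp-python
          | bindings can do the same RAM-plus-one-GPU split. A rough
          | sketch, assuming a (quantised) GGUF conversion of the model
          | exists and works with llama.cpp; the file name and layer
          | count below are placeholders:
          | 
          | ```python
          | from llama_cpp import Llama  # pip install llama-cpp-python
          | 
          | # Placeholder path to your own (quantised) GGUF conversion.
          | # n_gpu_layers offloads only as many layers as fit on the
          | # GPU; the rest stay in system RAM, hence the ~1TB.
          | llm = Llama(model_path="deepseekmath-v2-q4_k_m.gguf",
          |             n_gpu_layers=20, n_ctx=4096)
          | 
          | out = llm("Prove that sqrt(2) is irrational.",
          |           max_tokens=512)
          | print(out["choices"][0]["text"])
          | ```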
        
         | bertili wrote:
          | Two 512GB Mac Studios connected with Thunderbolt 5.
        
       | sschueller wrote:
       | How is OpenAI going to be able to serve ads in chatgpt without
       | everyone immediately jumping ship to another model?
        
         | miroljub wrote:
         | I don't care about OpenAI even if they don't serve ads.
         | 
         | I can't trust any of their output until they become honest
         | enough to change their name to CloseAI.
        
         | Coffeewine wrote:
         | I suppose the hope is that they don't, and we wind up with
         | commodity frontier models from multiple providers at market
         | rates.
        
         | KeplerBoy wrote:
         | Google served ads for decades and no one ever jumped ship to
         | another search engine.
        
           | sschueller wrote:
           | Because Google gave the best results for a long time.
        
             | PunchyHamster wrote:
             | and now, when they are not, everyone else's results are
             | also pretty terrible...
        
           | bootsmann wrote:
            | They pay $30bn (more than OpenAI's lifetime revenue) each
            | year to make sure no one does.
        
             | KeplerBoy wrote:
             | What are you referring to?
        
               | rzerowan wrote:
                | Search deals with mobile OEMs and Apple (preferred
                | engine in all mobile browsers), plus paying off
                | Mozilla, for a start. Also, Google had a first-mover
                | moat for a while before DuckDuckGo came along.
        
         | dist-epoch wrote:
         | The same way people stayed on Google despite DuckDuckGo
         | existing.
        
         | PunchyHamster wrote:
          | By having datacenters with GPUs and an API everyone uses.
          | 
          | So they are either earning money directly or on the API
          | calls.
          | 
          | Now, competitors can come and compete on that, but OpenAI
          | will probably still be the first choice for the foreseeable
          | future.
        
         | astrange wrote:
         | ChatGPT is a website. There's nothing unusual about ads on a
         | website.
         | 
         | People use Instagram too.
        
       | simianwords wrote:
        | It's worth noting that this model is not general purpose,
        | whereas the ones Google and OpenAI used were.
        
         | mangolie wrote:
         | https://x.com/deepseek_ai/status/1995452646459858977
         | 
         | Boom
        
           | simianwords wrote:
           | Oh you may be correct. Are these models general purpose or
           | fine tuned for mathematics?
        
           | yorwba wrote:
           | That's a different model: https://huggingface.co/deepseek-
           | ai/DeepSeek-V3.2-Speciale
        
           | andy12_ wrote:
           | Do note that that is a different model. The one we are
           | talking about here, DeepSeekMath-V2, is indeed overcooked
            | with math RL. It's so eager to solve math problems that it
            | even comes up with random ones if you prompt it with
            | "Hello".
           | 
           | https://x.com/AlpinDale/status/1994324943559852326?s=20
        
         | yorwba wrote:
         | Both OpenAI and Google used models made specifically for the
         | task, not their general-purpose products.
         | 
         | OpenAI:
         | https://xcancel.com/alexwei_/status/1946477756738629827#m "we
         | are releasing GPT-5 soon, and we're excited for you to try it.
         | But just to be clear: the IMO gold LLM is an experimental
         | research model. We don't plan to release anything with this
         | level of math capability for several months."
         | 
         | DeepMind: https://deepmind.google/blog/advanced-version-of-
         | gemini-with... "we additionally trained this version of Gemini
         | on novel reinforcement learning techniques that can leverage
         | more multi-step reasoning, problem-solving and theorem-proving
         | data. We also provided Gemini with access to a curated corpus
         | of high-quality solutions to mathematics problems, and added
         | some general hints and tips on how to approach IMO problems to
         | its instructions."
        
           | simianwords wrote:
           | Not true
        
           | simianwords wrote:
           | https://x.com/sama/status/1946569252296929727
           | 
           | >we achieved gold medal level performance on the 2025 IMO
           | competition with a _general-purpose reasoning system!_ to
           | emphasize, this is an LLM doing math and not a specific
           | formal math system; it is part of our main push towards
           | general intelligence.
           | 
           | asterisks mine
        
             | yorwba wrote:
             | DeepSeekMath-V2 is also an LLM doing math and not a
             | specific formal math system. What interpretation of
             | "general purpose" were you using where one of them is
             | "general purpose" and the other isn't?
        
               | simianwords wrote:
                | This model can't be used for, say, questions on
                | biology or history.
        
               | yorwba wrote:
               | How do you know how well OpenAI's unreleased experimental
               | model does on biology or history questions?
        
               | simianwords wrote:
                | Sam specifically says it is general purpose, and also
                | this:
               | 
               | > Typically for these AI results, like in
               | Go/Dota/Poker/Diplomacy, researchers spend years making
               | an AI that masters one narrow domain and does little
               | else. But this isn't an IMO-specific model. It's a
               | reasoning LLM that incorporates new experimental general-
               | purpose techniques.
               | 
               | https://x.com/polynoamial/status/1946478250974200272
        
               | lossolo wrote:
                | You are overinterpreting what they said again. The
                | "Go/Dota/Poker/Diplomacy" systems do not use LLMs,
                | which is why they are not considered "general purpose"
                | by them. And to prove it to you, look at the OpenAI
                | IMO solutions on GitHub, which clearly show that it's
                | not a general-purpose trained LLM, given how the words
                | and sentences are generated there. These are models
                | specifically fine-tuned for math.
        
               | simianwords wrote:
                | They could not have been more clear - sorry, but are
                | you even reading?
        
               | lossolo wrote:
                | Clear about what? Do you know the difference between
                | an LLM based on transformer attention and a Monte
                | Carlo tree search system like the one used in Go? You
                | do not understand what they are saying. It was a
                | fine-tuned model, just as DeepSeekMath is an LLM
                | fine-tuned for math, which means it was a
                | special-purpose model. Read the OpenAI GitHub IMO
                | submissions for proof.
        
       | letmetweakit wrote:
       | Does anyone know if this will become available on OpenRouter?
        
       | WhitneyLand wrote:
       | Shouldn't there be a lot of skepticism here?
       | 
        | All the problems they claim to have solved are on the
        | Internet, and they explicitly say they crawled them. They do
        | not mention doing any benchmark decontamination or excluding
        | the 2024/2025 competition problems from training.
        | 
        | IIRC, OpenAI/Google did not have access to the 2025 problems
        | before testing their experimental math models.
        
       | LZ_Khan wrote:
       | Don't they distill directly off OpenAI/Google outputs?
        
       ___________________________________________________________________
       (page generated 2025-12-01 23:02 UTC)