[HN Gopher] DeepSeek-v3.2: Pushing the frontier of open large la...
       ___________________________________________________________________
        
       DeepSeek-v3.2: Pushing the frontier of open large language models
       [pdf]
        
       https://huggingface.co/deepseek-ai/DeepSeek-V3.2
       https://api-docs.deepseek.com/news/news251201
        
       Author : pretext
       Score  : 435 points
       Date   : 2025-12-01 15:48 UTC (7 hours ago)
        
        
       | nimchimpsky wrote:
       | Pretty amazing that a relatively small Chinese hedge fund can
       | build AI better than almost anyone.
        
       | BoorishBears wrote:
       | 3.2-Exp came out in September: this is 3.2, along with a special
       | checkpoint (DeepSeek-V3.2-Speciale) for deep reasoning that
       | they're claiming surpasses GPT-5 and matches Gemini 3.0
       | 
       | https://x.com/deepseek_ai/status/1995452641430651132
        
       | zparky wrote:
       | Benchmarks are super impressive, as usual. Interesting to note in
       | table 3 of the paper (p. 15), DS-Speciale is 1st or 2nd in
       | accuracy in all tests, but has much higher token output (50%
       | more, or 3.5x vs gemini 3 in the codeforces test!).
        
         | futureshock wrote:
         | The higher token output is not by accident. Certain kinds of
         | logical reasoning problems are solved by longer thinking
         | output. Thinking chain output is usually kept to a reasonable
         | length to limit latency and cost, but if pure benchmark
         | performance is the goal you can crank that up to the max until
         | the point of diminishing returns. DeepSeek being 30x cheaper
         | than Gemini means there's little downside to max out the
         | thinking time. It's been shown that you can further scale this
         | by running many solution attempts in parallel with max thinking
         | then using a model to choose a final answer, so increasing
         | reasoning performance by increasing inference compute has a
         | pretty high ceiling.
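The parallel-attempts idea described above can be sketched in a few lines (the `generate` and `judge` helpers are hypothetical stand-ins for real model calls; the judge here is a simple majority vote rather than an actual selector model):

```python
import random

def generate(problem, seed):
    # Hypothetical stand-in for one max-thinking-budget model call;
    # a real implementation would hit an inference API here.
    rng = random.Random(seed)
    return f"candidate-{rng.randint(0, 3)}"

def judge(problem, candidates):
    # Stand-in for the selector model: a simple majority vote
    # (self-consistency) over the parallel attempts.
    return max(set(candidates), key=candidates.count)

def best_of_n(problem, n=8):
    # Run n independent solution attempts (in parallel in practice),
    # then pick a final answer from among them.
    candidates = [generate(problem, seed) for seed in range(n)]
    return judge(problem, candidates)

answer = best_of_n("some hard competitive-programming problem")
```

The cheaper the per-attempt inference, the larger the n you can afford, which is the ceiling-raising effect the comment describes.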
        
       | jodleif wrote:
        | I genuinely do not understand the valuations of the US AI
        | industry. The Chinese models are so close, and far cheaper.
        
         | newyankee wrote:
         | Yet tbh if the US industry had not moved ahead and created the
         | race with FOMO it would not had been easier for Chinese
         | strategy to work either.
         | 
         | The nature of the race may change as yet though, and I am
         | unsure if the devil is in the details, as in very specific edge
         | cases that will work only with frontier models ?
        
         | jazzyjackson wrote:
          | Valuation is not based on what they have done but on what
          | they might do. I agree, though, that it's investment made
          | with very little insight into Chinese research. I guess it's
          | counting on DeepSeek being banned and all computers in
          | America refusing to run open software by the year 2030
          | /snark
        
           | bilbo0s wrote:
           | > _I guess it 's counting on deepseek being banned_
           | 
           | And the people making the bets are in a position to make sure
           | the banning happens. The US government system being what it
           | is.
           | 
           | Not that our leaders need any incentive to ban Chinese tech
           | in this space. Just pointing out that it's not _necessarily_
           | a  "bet".
           | 
            | "Bet" implies you don't know the outcome _and_ you have no
            | influence over the outcome. Even "investment" implies you
            | don't know the outcome. I'm not sure that's the case with
            | these people?
        
             | coliveira wrote:
             | Exactly. "Business investment" these days means that the
             | people involved will have at least some amount of power to
             | determine the winning results.
        
           | jodleif wrote:
           | > Valuation is not based on what they have done but what they
           | might do
           | 
            | Exactly what I'm thinking. Chinese models are catching up
            | rapidly, and will soon be on par with the big dogs.
        
             | ksynwa wrote:
             | Even if they do continue to lag behind they are a good bet
             | against monopolisation by proprietary vendors.
        
               | coliveira wrote:
               | They would if corporations were allowed to run these
               | models. I fully expect the US government to prohibit
               | corporations from doing anything useful with Chinese
               | models (full censorship). It's the same game they use
               | with chips.
        
         | jasonsb wrote:
          | It's all about the hardware and infrastructure. If you check
          | OpenRouter, no provider offers a SOTA Chinese model matching
          | the speed of Claude, GPT or Gemini. The Chinese models may
          | benchmark close on paper, but real-world deployment is
          | different. So you either buy your own hardware in order to
          | run a Chinese model at 150-200tps, or give up and use one of
          | the Big 3.
         | 
         | The US labs aren't just selling models, they're selling
         | globally distributed, low-latency infrastructure at massive
         | scale. That's what justifies the valuation gap.
         | 
         | Edit: It looks like Cerebras is offering a very fast GLM 4.6
        
           | csomar wrote:
           | According to OpenRouter, z.ai is 50% faster than Anthropic;
           | which matches my experience. z.ai does have frequent
           | downtimes but so does Claude.
        
           | jodleif wrote:
            | Assuming your hardware premise is right (and let's be
            | honest, nobody really wants to send their data to Chinese
            | providers), you can use a provider like Cerebras or Groq.
        
           | observationist wrote:
            | The network effects of consistently behaving models and
            | API coverage maintained between updates are valuable, too.
            | Presumably the big labs include their own domains of
            | competence in the training, so Claude is likely to remain
            | very good at coding and to behave in similar ways,
            | informed and constrained by their prompt frameworks, so
            | that interactions keep working predictably even after
            | major new releases, and upgrades can be clean.
           | 
           | It'll probably be a few years before all that stuff becomes
           | as smooth as people need, but OAI and Anthropic are already
           | doing a good job on that front.
           | 
           | Each new Chinese model requires a lot of testing and bespoke
           | conformance to every task you want to use it for. There's a
           | lot of activity and shared prompt engineering, and some
           | really competent people doing things out in the open, but
           | it's generally going to take a lot more expert work getting
           | the new Chinese models up to snuff than working with the big
           | US labs. Their product and testing teams do a lot of valuable
           | work.
        
           | kachapopopow wrote:
           | cerebras AI offers models at 50x the speed of sonnet?
        
           | DeathArrow wrote:
           | > If you check OpenRouter, no provider offers a SOTA chinese
           | model matching the speed of Claude, GPT or Gemini.
           | 
           | I think GLM 4.6 offered by Cerebras is much faster than any
           | US model.
        
             | jasonsb wrote:
             | You're right, I forgot about that one.
        
           | irthomasthomas wrote:
           | Gemini 3 = ~70tps https://openrouter.ai/google/gemini-3-pro-
           | preview
           | 
           | Opus 4.5 = ~60-80tps https://openrouter.ai/anthropic/claude-
           | opus-4.5
           | 
           | Kimi-k2-think = ~60-180tps
           | https://openrouter.ai/moonshotai/kimi-k2-thinking
           | 
           | Deepseek-v3.2 = ~30-110tps (only 2 providers rn)
           | https://openrouter.ai/deepseek/deepseek-v3.2
        
             | jasonsb wrote:
             | It doesn't work like that. You need to actually use the
             | model and then go to /activity to see the actual speed. I
             | constantly get 150-200tps from the Big 3 while other
             | providers barely hit 50tps even though they advertise much
             | higher speeds. GLM 4.6 via Cerebras is the only one faster
             | than the closed source models at over 600tps.
        
               | irthomasthomas wrote:
               | These aren't advertised speeds, they are the average
               | measured speeds by openrouter across different providers.
        
         | isamuel wrote:
         | There is a great deal of orientalism --- it is genuinely
         | unthinkable to a lot of American tech dullards that the Chinese
         | could be better at anything requiring what they think of as
         | "intelligence." Aren't they Communist? Backward? Don't they eat
         | weird stuff at wet markets?
         | 
         | It reminds me, in an encouraging way, of the way that German
         | military planners regarded the Soviet Union in the lead-up to
         | Operation Barbarossa. The Slavs are an obviously inferior race;
         | their Bolshevism dooms them; we have the will to power; we will
         | succeed. Even now, when you ask questions like what you ask of
         | that era, the answers you get are genuinely not better than
         | "yes, this should have been obvious at the time if you were not
         | completely blinded by ethnic and especially ideological
         | prejudice."
        
           | newyankee wrote:
            | But didn't the Chinese already surpass the rest of the
            | world in solar, batteries, and EVs, among other things?
        
             | cyberlimerence wrote:
             | They did, but the goalposts keep moving, so to speak. We're
             | approximately here : advanced semiconductors, artificial
             | intelligence, reusable rockets, quantum computing, etc.
             | Chinese will never catch up. /s
        
           | mosselman wrote:
           | Back when deepseek came out and people were tripping over
           | themselves shouting it was so much better than what was out
           | there, it just wasn't good.
           | 
           | It might be this model is super good, I haven't tried it, but
           | to say the Chinese models are better is just not true.
           | 
           | What I really love though is that I can run them (open
           | models) on my own machine. The other day I categorised images
           | locally using Qwen, what a time to be alive.
           | 
           | Further even than local hardware, open models make it
           | possible to run on providers of choice, such as European
           | ones. Which is great!
           | 
           | So I love everything about the competitive nature of this.
        
             | CamperBob2 wrote:
             | If you thought DeepSeek "just wasn't good," there's a good
             | chance you were running it wrong.
             | 
             | For instance, a lot of people thought they were running
             | "DeepSeek" when they were really running some random
             | distillation on ollama.
        
               | bjourne wrote:
               | WDYM? Isn't https://chat.deepseek.com/ the real DeepSeek?
        
               | CamperBob2 wrote:
               | Good point, I was assuming the GP was running local for
               | some reason. Hard to argue when it's the official
               | providers who are being compared.
               | 
               | I ran the 1.58-bit Unsloth quant locally at the time it
               | came out, and even at such low precision, it was _super_
               | rare for it to get something wrong that o1 and GPT4 got
               | right. I have never actually used a hosted version of the
               | full DS.
        
           | lukan wrote:
           | "It reminds me, in an encouraging way, of the way that German
           | military planners regarded the Soviet Union in the lead-up to
           | Operation Barbarossa. The Slavs are an obviously inferior
           | race; ..."
           | 
            | Ideology played a role, but the data they worked with was
            | the Finnish war, which was disastrous for the Soviet side.
            | Hitler later famously claimed it was all an intentional
            | distraction to make them believe the Soviet army was worth
            | nothing. (The real reasons were more complex, like the
            | earlier purges.)
        
           | littlestymaar wrote:
           | > It reminds me, in an encouraging way, of the way that
           | German military planners regarded the Soviet Union in the
           | lead-up to Operation Barbarossa. The Slavs are an obviously
           | inferior race; their Bolshevism dooms them; we have the will
           | to power; we will succeed
           | 
            | Though, because Stalin had decimated the Red Army
            | leadership (including most of the veteran officers who had
            | Russian civil war experience) during the Moscow trials
            | purges, the Germans almost succeeded.
        
             | gazaim wrote:
             | > Though, because Stalin had decimated the red army
             | leadership (including most of the veteran officer who had
             | Russian civil war experience) during the Moscow trials
             | purges, the German almost succeeded.
             | 
             | There were many counter revolutionaries among the
             | leadership, even those conducting the purges. Stalin was
             | like "ah fuck we're hella compromised." Many revolutions
             | fail in this step and often end up facing a CIA backed
             | coup. The USSR was under constant siege and attempted
             | infiltration since inception.
        
               | littlestymaar wrote:
               | > There were many counter revolutionaries among the
               | leadership
               | 
               | Well, Stalin was, by far, the biggest counter-
               | revolutionary in the Politburo.
               | 
               | > Stalin was like "ah fuck we're hella compromised."
               | 
                | There's no evidence that anything significant was
                | compromised at that point, and clear evidence that
                | Stalin was in fact clinically paranoid.
               | 
               | > Many revolutions fail in this step and often end up
               | facing a CIA backed coup. The USSR was under constant
               | siege and attempted infiltration since inception.
               | 
                | Can we please not recycle 90-year-old Soviet
                | propaganda? That the Moscow trials were irrational
                | self-harm was acknowledged by the USSR leadership as
                | early as the fifties...
        
           | breppp wrote:
            | Not sure how the entire Nazi comparison plays out, but at
            | the time there were good reasons to imagine the Soviets
            | would fall apart (as they initially did).
            | 
            | Stalin had just finished purging his entire officer corps,
            | which is not a good omen for war, and the USSR failed
            | miserably against the Finns, who were not the strongest of
            | nations, while Germany had just steamrolled France, a
            | country that was much more impressive in WW1 than the
            | Russians (who collapsed against Germany).
        
           | ecshafer wrote:
            | I don't think that anyone, much less someone working in
            | tech or engineering in 2025, could still believe that the
            | Chinese aren't capable scientists or engineers. I could
            | maybe give a (naive) pass to someone in 1990 thinking
            | China would never build more than junk. But in 2025 their
            | productive capacity, scientific advancement, and just the
            | number of us who have worked with extremely talented
            | Chinese colleagues should dispel those notions. I think
            | you are jumping to racism a bit fast here.
           | 
            | Germany was right in some ways and wrong in others about
            | the Soviet Union's strength. The USSR failed to conquer
            | Finland because of the military purges. German
            | intelligence vastly underestimated the number of tanks
            | and the general preparedness of the Soviet army (Hitler
            | was shocked the Soviets already had 40k tanks). The
            | Lend-Lease Act sent an astronomical amount of goods to the
            | USSR, which allowed them to fully commit to the war and
            | focus on increasing their weapons production; the numbers
            | on the tractors, food, trains, ammunition, etc. that the
            | US sent to the USSR are staggering.
        
           | gazaim wrote:
           | These Americans have no comprehension of intelligence being
           | used to benefit humanity instead of being used to fund a
           | CEO's new yacht. I encourage them to visit China to see how
           | far the USA lags behind.
        
         | espadrine wrote:
         | Two aspects to consider:
         | 
          | 1. Chinese models typically focus on text. US and EU models
          | also bear the cross of handling images, and often voice and
          | video. Supporting all of those is additional training cost
          | not spent on further reasoning: tying one hand behind your
          | back to be more generally useful.
          | 
          | 2. The gap seems small because so many benchmarks get
          | saturated so fast. But towards the top, every 1% increase
          | in benchmark accuracy removes a much larger share of the
          | remaining errors.
         | 
         | On the second point, I worked on a leaderboard that both
         | normalizes scores, and predicts unknown scores to help improve
         | comparisons between models on various criteria:
         | https://metabench.organisons.com/
         | 
         | You can notice that, while Chinese models are quite good, the
         | gap to the top is still significant.
         | 
         | However, the US models are typically much more expensive for
         | inference, and Chinese models do have a niche on the Pareto
         | frontier on cheaper but serviceable models (even though US
         | models also eat up the frontier there).
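The saturation point can be made concrete with a little arithmetic: the same one-point accuracy gain removes a far larger fraction of the remaining errors near the top of the scale (the numbers below are illustrative, not taken from any benchmark):

```python
def relative_error_reduction(acc_before, acc_after):
    # Fraction of the remaining errors eliminated by an accuracy gain.
    err_before = 1.0 - acc_before
    err_after = 1.0 - acc_after
    return (err_before - err_after) / err_before

# A 1-point gain in the middle of the scale barely moves the needle...
low = relative_error_reduction(0.50, 0.51)   # removes ~2% of errors
# ...but the same 1-point gain near saturation is a big step.
high = relative_error_reduction(0.95, 0.96)  # removes ~20% of errors
```

This is why two models a "mere" percentage point apart on a saturated benchmark can feel quite different in practice.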
        
           | jodleif wrote:
           | 1. Have you seen the Qwen offerings? They have great multi-
           | modality, some even SOTA.
        
             | brabel wrote:
              | Qwen Image and Image Edit were among the best image
              | models until Nano Banana Pro came along. I have tried
              | some open image models and can confirm: the Chinese
              | models are easily the best or very close to it, but
              | right now the Google model is even better... we'll see
              | if the Chinese catch up again.
        
               | BoorishBears wrote:
               | I'd say Google still hasn't caught up on the smaller
               | model side at all, but we've all been (rightfully) wowed
               | enough by Pro to ignore that for now.
               | 
                | Nano Banana Pro starts at 15 cents per image at <2k
                | resolution, and is not strictly better than Seedream
                | 4.0; yet the latter does 4K for 3 cents per image.
               | 
               | Add in the power of fine-tuning on their open weight
               | models and I don't know if China actually needs to catch
               | up.
               | 
               | I finetuned Qwen Image on 200 generations from Seedream
               | 4.0 that were cleaned up with Nano Banana Pro, and got
               | results that were as good _and more reliable_ than either
               | model could achieve otherwise.
        
           | torginus wrote:
           | Thanks for sharing that!
           | 
            | The scales are a bit murky here, but if we look at the
            | 'Coding' metric, we see that Kimi K2 outperforms Sonnet
            | 4.5 - and that's considered the price/perf darling even
            | today, I think?
            | 
            | I haven't tried these models, but in general there have
            | been lots of cases where a model performs much worse IRL
            | than the benchmarks would suggest (certain Chinese models
            | and GPT-OSS have been guilty of this in the past).
        
             | espadrine wrote:
              | Good question. There are two points to consider.
             | 
             | * For both Kimi K2 and for Sonnet, there's a non-thinking
             | and a thinking version. Sonnet 4.5 Thinking is better than
             | Kimi K2 non-thinking, but the K2 Thinking model came out
             | recently, and beats it on all comparable pure-coding
             | benchmarks I know: OJ-Bench (Sonnet: 30.4% < K2: 48.7%),
             | LiveCodeBench (Sonnet: 64% < K2: 83%), they tie at SciCode
             | at 44.8%. It is a finding shared by ArtificialAnalysis:
             | https://artificialanalysis.ai/models/capabilities/coding
             | 
             | * The reason developers love Sonnet 4.5 for coding, though,
             | is not just the quality of the code. They use Cursor,
             | Claude Code, or some other system such as Github Copilot,
             | which are increasingly agentic. On the Agentic Coding
             | criteria, Sonnet 4.5 Thinking is much higher.
             | 
             | By the way, you can look at the Table tab to see all known
             | and predicted results on benchmarks.
        
           | agumonkey wrote:
            | Forgive me for bringing politics into it, but are Chinese
            | LLMs more prone to censorship bias than US ones?
        
             | coliveira wrote:
              | Being open source, Chinese models are, I believe, less
              | prone to censorship, since US corporations can add
              | censorship in several ways simply by controlling a
              | closed model.
        
             | skeledrew wrote:
              | It's not about an LLM being prone to anything, but more
              | about the way an LLM is fine-tuned (which can be subject
              | to the requirements of those wielding political power).
        
               | agumonkey wrote:
               | that's what i meant even though i could have been more
               | precise
        
           | raincole wrote:
           | > video
           | 
           | Most of AI-generated videos we see on social media now are
           | made with Chinese models.
        
           | coliveira wrote:
           | Nothing you said helps with the issue of valuation. Yes, the
           | US models may be better by a few percentage points, but how
           | can they justify being so costly, both operationally as well
           | as in investment costs? Over the long run, this is a business
           | and you don't make money being the first, you have to be more
           | profitable overall.
        
             | ben_w wrote:
             | I think the investment race here is an "all-pay auction"*.
             | Lots of investors have looked at the ultimate prize --
             | basically winning something larger than the entire present
             | world economy forever -- and think "yes".
             | 
             | But even assuming that we're on the right path for that
             | (which we may not be) and assuming that nothing intervenes
             | to stop it (which it might), there may be only one winner,
             | and that winner may not have even entered the game yet.
             | 
             | * https://en.wikipedia.org/wiki/All-pay_auction
        
               | coliveira wrote:
               | > investors have looked at the ultimate prize --
               | basically winning something larger than the entire
               | present world economy
               | 
               | This is what people like Altman want investors to
               | believe. It seems like any other snake oil scam because
               | it doesn't match reality of what he delivers.
        
               | saubeidl wrote:
               | Yeah, this is basically financial malpractice/fraud.
        
         | Bolwin wrote:
          | Third-party providers rarely support caching.
          | 
          | With caching, the expensive US models end up being only
          | about 2x the price (e.g. Sonnet) and often much cheaper
          | (e.g. GPT-5 mini).
          | 
          | If the third-party providers start caching, then the US
          | companies will be completely outpriced.
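A rough sketch of why caching changes the effective price so much (the base price, cached fraction, and cache discount below are illustrative assumptions, not any vendor's actual rates):

```python
def effective_input_price(base_price, cached_fraction, cache_discount):
    # Blended per-million-token input price when part of the prompt
    # is served from cache at a discounted rate.
    cached = cached_fraction * base_price * cache_discount
    uncached = (1.0 - cached_fraction) * base_price
    return cached + uncached

# Illustrative assumptions only: a $3/Mtok model, 90% of the prompt
# cached (long agentic sessions reuse most of the context), cache
# reads billed at 10% of the base price.
effective = effective_input_price(3.0, 0.90, 0.10)  # -> ~$0.57/Mtok
```

Under these made-up numbers, a nominally 5x-more-expensive model narrows most of the gap once long shared prefixes are cached.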
        
         | beastman82 wrote:
         | Then you should short the market
        
         | fastball wrote:
         | They're not that close (on things like LMArena) and being
         | cheaper is pretty meaningless when we are not yet at the point
         | where LLMs are good enough for autonomy.
        
         | mrinterweb wrote:
         | I would expect one of the motivations for making these LLM
         | model weights open is to undermine the valuation of other
         | players in the industry. Open models like this must diminish
         | the value prop of the frontier focused companies if other
         | companies can compete with similar results at competitive
         | prices.
        
         | rprend wrote:
         | People pay for products, not models. OpenAI and Anthropic make
         | products (ChatGPT, Claude Code).
        
       | TIPSIO wrote:
        | It's awesome that stuff like this is open source, but even if
        | you have a basement rig with 4 NVIDIA GeForce RTX 5090
        | graphics cards (a $15-20k machine), can it even run with any
        | reasonable context window that isn't a crawling 10tps?
        | 
        | Frontier models are far exceeding even the most hardcore
        | consumer hobbyist's hardware. This one even more so.
        
         | bigyabai wrote:
         | People with basement rigs generally aren't the target audience
         | for these gigantic models. You'd get much better results out of
         | an MoE model like Qwen3's A3B/A22B weights, if you're running a
         | homelab setup.
        
           | Spivak wrote:
           | Yeah I think the advantage of OSS models is that you can get
           | your pick of providers and aren't locked into just Anthropic
           | or just OpenAI.
        
         | noosphr wrote:
          | Home rigs like that are no longer cost effective. You're
          | better off buying an RTX Pro 6000 outright. This holds for
          | the sticker price, the supporting hardware, the electricity
          | to run it, and the cooling for the room you use it in.
        
           | torginus wrote:
           | I was just watching this video about a Chinese piece of
           | industrial equipment, designed for replacing BGA chips such
           | as flash or RAM with a good deal of precision:
           | 
           | https://www.youtube.com/watch?v=zwHqO1mnMsA
           | 
           | I wonder how well the aftermarket memory surgery business on
           | consumer GPUs is doing.
        
             | ThrowawayTestr wrote:
             | LTT recently did a video on upgrading a 5090 to 96gb of ram
        
             | dotancohen wrote:
              | I wonder how well the ophthalmologist is doing. These
              | guys are going to be paying him a visit, playing around
              | with those lasers with no PPE.
        
           | mikae1 wrote:
           | Or perhaps a 512GB Mac Studio. 671B Q4 of R1 runs on it.
        
             | redrove wrote:
             | I wouldn't say runs. More of a gentle stroll.
        
               | storus wrote:
               | I run it all the time, token generation is pretty good.
               | Just large contexts are slow but you can hook a DGX Spark
               | via Exo Labs stack and outsource token prefill to it.
               | Upcoming M5 Ultra should be faster than Spark in token
               | prefill as well.
        
               | embedding-shape wrote:
               | > I run it all the time, token generation is pretty good.
               | 
               | I feel like because you didn't actually talk about prompt
               | processing speed or token/s, you aren't really giving the
               | whole picture here. What is the prompt processing tok/s
               | and the generation tok/s actually like?
        
               | storus wrote:
               | I addressed both points - I mentioned you can offload
               | token prefill (the slow part, 9t/s) to DGX Spark. Token
               | generation is at 6t/s which is acceptable.
        
               | hasperdi wrote:
               | With quantization, converting it to an MOE model... it
               | can be a fast walk
        
           | throw4039 wrote:
           | Yeah, the pricing for the rtx pro 6000 is surprisingly
           | competitive with the gamer cards (at actual prices, not
           | MSRP). A 3x5090 rig will require significant
           | tuning/downclocking to be run from a single North American
           | 15A plug, and the cost of the higher powered supporting
           | equipment (cooling, PSU, UPS, etc) needed will pay for the
           | price difference, not to mention future expansion
           | possibilities.
        
         | halyconWays wrote:
         | As someone with a basement rig of 6x 3090s, not really. It's
         | quite slow, as with that many params (685B) it's offloading
         | basically all of it into system RAM. I limit myself to models
         | with <144B params, then it's quite an enjoyable experience. GLM
         | 4.5 Air has been great in particular
        
         | tarruda wrote:
         | You can run at ~20 tokens/second on a 512GB Mac Studio M3
         | Ultra: https://youtu.be/ufXZI6aqOU8?si=YGowQ3cSzHDpgv4z&t=197
         | 
         | IIRC the 512GB mac studio is about $10k
        
           | hasperdi wrote:
            | and it can be faster if you can get an MoE model of it
        
             | dormento wrote:
             | "Mixture-of-experts", AKA "running several small models and
             | activating only a few at a time". Thanks for introducing me
             | to that concept. Fascinating.
             | 
             | (commentary: things are really moving too fast for the
             | layperson to keep up)
        
               | whimsicalism wrote:
                | That's not really a good summary of what MoEs are.
                | You can better think of them as sublayers that get
                | routed through (like how the brain only lights up
                | certain pathways) rather than as actual separate
                | models.
        
               | Mehvix wrote:
                | The gain from MoE is that you can have a large model
                | that's efficient; it lets you decouple #params from
                | computation cost. I don't see how anthropomorphizing
                | MoE <-> brain affords insight deeper than 'less
                | activity means less energy used'. These are totally
                | different systems; IMO this shallow comparison muddies
                | the water and does a disservice to each field of
                | study. There's been loads of research showing there's
                | redundancy in MoE models, i.e. Cerebras has a paper[1]
                | where they selectively prune half the experts with
                | minimal loss across domains -- I'm not sure you could
                | disable half the brain without a stupefying
                | difference.
               | 
               | [1] https://www.cerebras.ai/blog/reap
        
               | hasperdi wrote:
                | As pointed out by a sibling comment, MoE consists of a
                | router and a number of experts (e.g. 8). These experts
                | can be imagined as parts of the brain with
                | specialization, although in reality they probably don't
                | work exactly like that. They aren't separate models;
                | they are components of a single large model.
                | 
                | Typically, input gets routed to a subset of the
                | experts, e.g. the top 2, leaving the others inactive.
                | This reduces the number of activations / processing
                | requirements.
                | 
                | Mixtral is an example of a model designed like this.
                | Clever people have created converters to transform
                | dense models into MoE models. These days many popular
                | models are also available in MoE configurations.
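                | A minimal sketch of that top-k routing in NumPy (the
                | expert count, dimensions, and weights below are all
                | made-up toys for illustration, not any real model's
                | architecture):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes: 8 experts, route each token to the top 2 (all hypothetical).
NUM_EXPERTS, TOP_K, D_MODEL = 8, 2, 16

router_w = rng.standard_normal((D_MODEL, NUM_EXPERTS))           # one score per expert
expert_w = rng.standard_normal((NUM_EXPERTS, D_MODEL, D_MODEL))  # tiny "experts"

def moe_layer(x):
    """Run a single token through the MoE layer; only TOP_K experts fire."""
    logits = x @ router_w                # router score for each expert
    top = np.argsort(logits)[-TOP_K:]    # indices of the top-k experts
    w = np.exp(logits[top] - logits[top].max())
    w /= w.sum()                         # softmax over just the selected scores
    # Weighted sum of the chosen experts' outputs; the rest stay inactive.
    return sum(wi * (x @ expert_w[i]) for wi, i in zip(w, top))

out = moe_layer(rng.standard_normal(D_MODEL))
print(out.shape)  # (16,)
```

                | Only TOP_K of the NUM_EXPERTS matrices are multiplied
                | per token, which is where the compute savings come
                | from, even though all expert weights still exist.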
        
             | bigyabai wrote:
             | >90% of inference hardware is faster if you run an MOE
             | model.
        
             | miohtama wrote:
             | All modern models are MoE already, no?
        
         | reilly3000 wrote:
         | There are plenty of 3rd party and big cloud options to run
         | these models by the hour or token. Big models really only work
         | in that context, and that's ok. Or you can get yourself an H100
         | rack and go nuts, but there is little downside to using a cloud
         | provider on a per-token basis.
        
           | cubefox wrote:
           | > There are plenty of 3rd party and big cloud options to run
           | these models by the hour or token.
           | 
           | Which ones? I wanted to try a large base model for automated
           | literature (fine-tuned models are a lot worse at it) but I
           | couldn't find a provider which makes this easy.
        
         | potsandpans wrote:
         | I run a bunch of smaller models on a 12GB VRAM 3060 and it's
         | quite good. For larger open models I'll use OpenRouter. I'm
         | looking into on-demand instances with cloud/VPS providers, but
         | haven't explored the space too much.
         | 
         | I feel like private cloud instances that run on demand is still
         | in the spirit of consumer hobbyist. It's not as good as having
         | it all local, but the bootstrapping cost plus electricity to
         | run seems prohibitive.
         | 
         | I'm really interested to see if there's a space for consumer
         | TPUs that satisfy usecases like this.
        
         | seanw265 wrote:
         | FWIW it looks like OpenRouter's two providers for this model
         | (one of which is DeepSeek itself) are only running the model
         | at around 28 tps at the moment.
         | 
         | https://openrouter.ai/deepseek/deepseek-v3.2
         | 
         | This only bolsters your point. Will be interesting to see if
         | this changes as the model is adopted more widely.
        
       | red2awn wrote:
       | Worth noting this is not only good on benchmarks, but
       | significantly more efficient at inference
       | https://x.com/_thomasip/status/1995489087386771851
        
       | zug_zug wrote:
       | Well props to them for continuing to improve, winning on cost-
       | effectiveness, and continuing to publicly share their
       | improvements. Hard not to root for them as a force to prevent an
       | AI corporate monopoly/duopoly.
        
         | srameshc wrote:
         | As much as I agree with your sentiment, I doubt the intention
         | is singular.
        
           | echelon wrote:
           | I don't care if this kills Google and OpenAI.
           | 
           | I hope it does, though I'm doubtful because distribution is
           | important. You can't beat "ChatGPT" as a brand in laypeople's
           | minds (unless perhaps you give them a massive "Temu: Shop
           | Like A Billionaire" commercial campaign).
           | 
           | Closed source AI is almost by design morphing into an
           | industrial, infrastructure-heavy rocket science that
           | commoners can't keep up with. The companies pushing it are
           | building an industry we can't participate or share in.
           | They're cordoning off areas of tech and staking ground for
           | themselves. It's placing a steep fence around tech.
           | 
           | I hope every such closed source AI effort is met with
           | equivalent open source and that the investments made into
           | closed AI go to zero.
           | 
           | The most likely outcome is that Google, OpenAI, and Anthropic
           | win and every other "lab"-shaped company dies an expensive
           | death. RunwayML spent hundreds of millions and they're barely
           | noticeable now.
           | 
           | These open source models hasten the deaths of the second tier
           | also-ran companies. As much as I hope for dents in the big
           | three, I'm doubtful.
        
             | raw_anon_1111 wrote:
             | I can't think of a single company I've worked with as a
             | consultant that I could convince to use DeepSeek because of
             | its ties with China even if I explained that it was hosted
             | on AWS and none of the information would go to China.
             | 
             | Even when the technical people understood that, it would be
             | too much of a political quagmire within their company when
             | it became known to the higher ups. It just isn't worth the
             | political capital.
             | 
             | They would feel the same way about using xAI or maybe even
             | Facebook models.
        
               | siliconc0w wrote:
               | Even when self-hosting, there is still a real risk of
               | using Chinese models (or any provider you can't
               | trust/sue) because they can embed malicious actions into
               | the model. For example, a small random percentage of the
               | time, it could add a subtle security vulnerability to any
               | code generation.
               | 
                | This is a known playbook of China's, so it's pretty
                | likely that if they aren't already doing this, they
                | will eventually if the models see high adoption.
        
               | nagaiaida wrote:
               | on what hypothetical grounds would you be more
               | meaningfully able to sue the american maker of a self-
               | hosted statistical language model that you select your
               | own runtime sampling parameters for after random subtle
               | security vulnerabilities came out the other side when you
               | asked it for very secure code?
               | 
               | put another way, how do you propose to tell this subtle
               | nefarious chinese sabotage you baselessly imply to be
               | commonplace from the very real limitations of this
               | technology in the first place?
        
               | fragmede wrote:
               | This paper may be of interest to you:
               | https://arxiv.org/html/2504.15867v1
        
               | nagaiaida wrote:
               | the mechanism of action for that attack appears to be
               | reading from poisoned snippets on stackoverflow or a
               | similar site, which to my mind is an excellent example of
               | why it seems like it would be difficult to retroactively
               | pin "insecure code came out of my model" on the evil
               | communist base weights of the model in question
        
               | kriops wrote:
               | "Baselessly" - I'm sorry but realpolitik is plenty of
               | basis. China is a geopolitical adversary of both the EU
               | and the US. And China will be the first to admit this,
               | btw.
        
               | coliveira wrote:
               | Competitor != adversary. It is US warmongering ideology
               | that tries to equate these concepts.
        
               | kriops wrote:
               | That is just objectively incorrect, and fundamentally
               | misunderstanding the basics of statehood. China, the US,
               | and any other local monopoly on force would absolutely
               | take any chance they could get to extend their influence
               | and diminish the others. That is they are acting
               | rationally to at _minimum_ maximise the probability they
               | are able to maintain their current monopolies on force.
        
               | nagaiaida wrote:
               | sorry, is your contention here "spurious accusations
               | don't require evidence when aimed at designated state
               | enemies"? because it feels uncharitably rude to infer
               | that's what you meant to say here, but i struggle to
               | parse this in a different way where you say something
               | more reasonable.
        
               | kriops wrote:
               | I'm sorry you feel that way. It is however entirely
               | reasonable to assume that the comment I replied to was
               | made entirely in bad faith, seeing as it dismisses any
               | rational basis for the behaviour of the entities it is
               | making claims about.
        
               | StealthyStart wrote:
                | This is the real cause. At the enterprise level, trust
                | outweighs cost. My company hires agencies and
                | consultants who give the same advice as our internal
                | team. This isn't to imply that our internal team is
                | incorrect; rather, there's credibility in the fact
                | that if something goes wrong, the consequences of the
                | decision can be shifted. There's a reason companies
                | keep hiring the same four consulting firms: it's
                | trust, whether real or perceived.
        
               | 0xWTF wrote:
               | Children do the same thing intuitively: parents
               | continually complain that their children don't listen to
               | them. But as soon as someone else tells them to "cover
               | their nose", "chew with their mouth closed", "don't run
               | with scissors", whatever, they listen and integrate that
               | guidance into their behavior. What's harder to observe is
               | all the external guidance they get that they don't
               | integrate until their parents tell them. It's internal vs
               | external validation.
        
               | raw_anon_1111 wrote:
               | Or in many cases they go over to their grandparents house
               | and they let them run wild and all of the sudden your
               | parents have "McDonald's money" for their grandkids when
               | they never had it for you.
        
               | raw_anon_1111 wrote:
               | I have seen it much more nuanced than that.
               | 
               | 2020 - I was a mid level (L5) cloud consultant at AWS
               | with only two years of total AWS experience and that was
               | only at a small startup before then. Yet every customer
               | took my (what in hindsight might not have been the best)
               | advice all of the time without questioning it as long as
               | it met their business goals. Just because I had
               | @amazon.com as my email address.
               | 
               | Late 2023 - I was the subject matter expert in a niche of
               | a niche in AWS that the customer focused on and it was
               | still almost impossible to get someone to listen to a
               | consultant from a shitty third rate consulting company.
               | 
               | 2025 - I left the shitty consulting company last year
               | after only a year and now work for one with a much better
               | reputation and I have a better title "staff consultant".
               | I also play the game and be sure to mention that I'm
               | former "AWS ProServe" when I'm doing introductions. Now
               | people listen to me again.
        
               | coliveira wrote:
               | So much worse for American companies. This only means
               | that they will be uncompetitive with similar companies
               | that use models with realistic costs.
        
               | tokioyoyo wrote:
                | If the Chinese model becomes better than its
                | competitors, these worries will suddenly disappear.
                | Also, there are plenty of startups and enterprises
                | running fine-tuned versions of different OS models.
        
               | raw_anon_1111 wrote:
               | Yeah that's not how Big Enterprise works...
               | 
               | And most startups are just doing prompt engineering that
               | will never go anywhere. The big companies will just throw
               | a couple of developers at the feature and add it to their
               | existing business.
        
               | subroutine wrote:
               | As a government contractor, using a Chinese model is a
               | non-starter.
        
               | hhh wrote:
               | No... Nobody I work for will touch these models. The fear
               | is real that they have been poisoned or have some
               | underlying bomb. Plus y'know, they're produced by China,
               | so they would never make it past a review board in most
               | mega enterprises IME.
        
               | kriops wrote:
               | For good reason, too. Hostile governments have a much
               | easier time poisoning their "local" LLMs.
        
             | giancarlostoro wrote:
             | ChatGPT is like "Photoshop": people will call any AI
             | "ChatGPT".
        
           | twelvechairs wrote:
           | The bar is incredibly low considering what OpenAI has done as
           | a "not for profit"
        
         | jstummbillig wrote:
         | How could we judge if anyone is "winning" on cost-
         | effectiveness, when we don't know what everyones profits/losses
         | are?
        
           | ericskiff wrote:
           | I believe this was a statement on cost per token to us as
           | consumers of the service
        
           | rowanG077 wrote:
           | Well consumers care about the cost to them, and those we
           | know. And deepseek is destroying everything in that
           | department.
        
           | tedivm wrote:
           | If you're trying to build AI based applications you can and
           | should compare the costs between vendor based solutions and
           | hosting open models with your own hardware.
           | 
           | On the hardware side you can run some benchmarks on the
           | hardware (or use other people's benchmarks) and get an idea
           | of the tokens/second you can get from the machine. Normalize
           | this for your usage pattern (and do your best to implement
           | batch processing where you are able to, which will save you
           | money on both methods) and you have a basic idea of how much
           | it would cost per token.
           | 
           | Then you compare that to the cost of something like GPT5,
           | which is a bit simpler because the cost per (million) token
           | is something you can grab off of a website.
           | 
           | You'd be surprised how much money running something like
           | DeepSeek (or if you prefer a more established company, Qwen3)
           | will save you over the cloud systems.
           | 
           | That's just one factor though. Another is what hardware you
           | can actually run things on. DeepSeek and Qwen will function
           | on cheap GPUs that other models will simply choke on.
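            | The comparison above can be sketched as back-of-the-
            | envelope arithmetic. Every number below is a placeholder;
            | plug in your own benchmark results and your provider's
            | real pricing:

```python
# Rough self-hosting vs. per-token API comparison. All inputs are
# hypothetical placeholders, not real hardware costs or vendor prices.

def self_host_cost_per_mtok(hw_cost, lifetime_hours, power_kw,
                            kwh_price, tokens_per_sec):
    """Amortized hardware + electricity cost per million tokens."""
    hourly = hw_cost / lifetime_hours + power_kw * kwh_price
    return hourly / (tokens_per_sec * 3600) * 1_000_000

# Hypothetical rig: $30k, run 24/7 for 3 years, 2 kW draw, $0.15/kWh,
# sustaining 50 tokens/second after batching.
local = self_host_cost_per_mtok(30_000, 3 * 365 * 24, 2.0, 0.15, 50)
api = 10.0  # hypothetical cloud price in $ per million output tokens
print(f"self-hosted: ${local:.2f}/Mtok  vs  API: ${api:.2f}/Mtok")
```

            | Batching matters a lot here: it raises tokens_per_sec
            | without changing the fixed costs, which is why it helps on
            | both sides of the comparison.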
        
         | make3 wrote:
         | I suspect they will keep doing this until they have a
         | substantially better model than the competition. Sharing
         | methods to look good & allow the field to help you keep up with
         | the big guys is easy. I'll be impressed if they keep publishing
         | even when they do beat the big guys soundly.
        
         | catigula wrote:
         | To push back on the naivety I'm sensing here: I think it's a
         | little silly to see a Chinese Communist Party-backed
         | enterprise as somehow magnanimous and without ulterior, very
         | harmful motives.
        
           | jascha_eng wrote:
            | Oh, they need control of models to be able to censor and
            | to ensure whatever happens with AI inside the country
            | stays under their control. But the open-source part? Idk,
            | I think they do it to mess with US investment and for the
            | typical corporate open source reasons: community,
            | marketing, etc. But tbh, as a European with no serious
            | domestic competitor, it's especially the messing with the
            | US that I can get behind.
        
             | catigula wrote:
             | They're pouring money to disrupt American AI markets and
             | efforts. They do this in countless other fields. It's a
             | model of massive state funding -> give it away for cut-rate
             | -> dominate the market -> reap the rewards.
             | 
             | It's a very transparent, consistent strategy.
             | 
             | AI is a little different because it has geopolitical
             | implications.
        
               | ForceBru wrote:
               | When it's a competition among individual producers, we
               | call it "a free market" and praise Hal Varian. When it's
               | a competition among countries, it's suddenly threatening
               | to "disrupt American AI markets and efforts". The obvious
               | solution here is to pour money into LLM research too.
               | Massive state funding -> provide SOTA models for free ->
               | dominate the market -> reap the rewards (from the free
               | models).
        
               | catigula wrote:
               | We don't do that.
        
               | fragmede wrote:
               | It's not like the US doesn't face similar accusations.
               | One such case is the WTO accusing Boeing of receiving
               | illegal subsidies from the US government.
               | https://www.transportenvironment.org/articles/wto-says-
               | us-ga...
        
               | tokioyoyo wrote:
                | I can't believe I'm shilling for China in these
                | comments, but how different is it from company A
                | getting blank-check investments from VCs and wink-wink
                | support from the government in the West? And AI labs
                | in China have been getting internal company funding
                | for a while now, since before the LLM era.
        
             | ptsneves wrote:
             | This is the rare earth minerals dumping all over again.
             | Devalue to such a price as to make the market participants
             | quit, so they can later have a strategic stranglehold on
             | the supply.
             | 
              | This is using open source in a bit of a different spirit
              | than the hacker ethos, and I'm not sure how I feel about
              | it.
             | 
             | It is a kind of cheat on the fair market but at the same
             | time it is also costly to China and its capital costs may
             | become unsustainable before the last players fold.
        
               | jsiepkes wrote:
               | The way we fund the AI bubble in the west could also be
               | described as: "kind of cheat on the fair market". OpenAI
               | has never made a single dime of profit.
        
               | embedding-shape wrote:
               | > This is using open source in a bit of different spirit
               | than the hacker ethos, and I am not sure how I feel about
               | it.
               | 
               | It's a bit early to have any sort of feelings about it,
               | isn't it? You're speaking in absolutes, but none of this
               | is necessarily 100% true as we don't know their
                | intentions. And judging a group of individuals'
                | intentions based on what their country seems to want,
                | through the lens of a foreign country, usually doesn't
                | land you on the right interpretation.
        
               | CamperBob2 wrote:
               | Good luck making OpenAI and Google cry uncle. They have
               | the US government on their side. They will not be allowed
               | to fail, and they know it.
               | 
               | What I appreciate about the Chinese efforts is that they
               | are being forced to get more intelligence from less
               | hardware, and they are not only releasing their work
               | products but documenting the R&D behind them at least as
               | well as our own closed-source companies do.
               | 
               | A good reason to stir up dumping accusations and anti-
               | China bias would be if they stopped publishing not just
               | the open-source models, but the technical papers that go
               | with them. Until that happens, I think it's better to
               | prefer more charitable explanations for their posture.
        
               | tokioyoyo wrote:
               | I mentioned this before as well, but AI-competition
               | within China doesn't care that much about the western
               | companies. Internal market is huge, and they know winner-
               | takes-it-all in this space is real.
        
               | Jedd wrote:
               | > It is a kind of cheat on the fair market ...
               | 
               | I am very curious on your definition and usage of 'fair'
               | there, and whether you would call the LLM etc sector as
               | it stands now, but hypothetically absent deepseek say, a
               | 'fair market'. (If not, why not?)
        
               | jascha_eng wrote:
               | Do they actually spend that much though? I think they are
               | getting similar results with much fewer resources.
               | 
               | It's also a bit funny that providing free models is
               | probably the most communist thing China has done in a
               | long time.
        
               | josh_p wrote:
               | Isn't it already well accepted that the LLM market exists
               | in a bubble with a handful of companies artificially
               | inflating their own values?
               | 
               | ESH
        
               | DiogenesKynikos wrote:
               | Are you by chance an OpenAI investor?
               | 
               | We should all be happy about the price of AI coming down.
        
               | doctorwho42 wrote:
               | But the economy!!! /s
               | 
               | Seriously though, our leaders are actively throwing
               | everything and the kitchen sink into AI companies - in
               | some vain attempt to become immortal or own even more of
               | the nations wealth beyond what they already do, chasing
               | some kind of neo-tech feudalism. Both are unachievable
               | because they rely on a complex system that they clearly
               | don't understand.
        
               | coliveira wrote:
               | > cheat on the fair market
               | 
                | Can you really view this as cheating when the US is
                | throwing a trillion dollars in support of a supposedly
                | "fair market"?
        
           | gazaim wrote:
           | *Communist Party of China (CPC)
        
             | v0y4g3r wrote:
             | You nailed it
        
           | amunozo wrote:
            | The motive is to destroy American supremacy in AI; it's
            | not that deep. This is much easier to do by open-sourcing
            | the models than by competing directly, and it can have
            | good ramifications for everybody, even if the motive is
            | "bad".
        
         | paulvnickerson wrote:
         | If you value life in the West, you should not be rooting for a
         | Communist model or probably any state-backed model
         | https://venturebeat.com/security/deepseek-injects-50-more-se...
        
           | Lucasoato wrote:
           | > CrowdStrike researchers next prompted DeepSeek-R1 to build
           | a web application for a Uyghur community center. The result
           | was a complete web application with password hashing and an
           | admin panel, but with authentication completely omitted,
           | leaving the entire system publicly accessible.
           | 
           | > When the identical request was resubmitted for a neutral
           | context and location, the security flaws disappeared.
           | Authentication checks were implemented, and session
           | management was configured correctly. The smoking gun:
           | political context alone determined whether basic security
           | controls existed.
           | 
           | Holy shit, these political filters seem embedded directly in
           | the model weights.
        
           | amunozo wrote:
           | Should I root for the democratic OpenAI, Google or Microsoft
           | instead?
        
         | ActorNightly wrote:
         | >winning on cost-effectiveness
         | 
         | Nobody is winning in this area until these things run in full
         | on a single graphics card, which has sufficient compute for
         | even most complex tasks.
        
           | beefnugs wrote:
            | Why does that matter? They won't be making at-home
            | graphics cards anymore. Why would you, when you can be
            | pre-sold $40k servers for years into the future?
        
           | JSR_FDED wrote:
           | Nobody is winning until cars are the size of a pack of cards.
           | Which is big enough to transport even the largest cargo.
        
           | bbor wrote:
           | I mean, there are lots of models that run on home graphics
           | cards. I'm having trouble finding reliable requirements for
           | this new version, but V3 (from February) has a 32B parameter
           | model that runs on "16GB or more" of VRAM[1], which is very
           | doable for professionals in the first world. Quantization can
           | also help immensely.
           | 
           | Of course, the smaller models aren't as good at complex
           | reasoning as the bigger ones, but that seems like an
           | inherently-impossible goal; there will always be more
           | powerful programs that can run in datacenters (as long as our
           | techniques are constrained by compute, I guess).
           | 
           | FWIW, the small models of today are a lot better than
           | anything I thought I'd live to see as of 5 years ago! Gemma3n
           | (which is built to run on _phones_ [2]!) handily beats
           | ChatGPT 3.5 from January 2023 -- rank ~128 vs. rank ~194 on
           | LLMArena[3].
           | 
           | [1] https://blogs.novita.ai/what-are-the-requirements-for-
           | deepse...
           | 
           | [2] https://huggingface.co/google/gemma-3n-E4B-it
           | 
            | [3] https://lmarena.ai/leaderboard/text/overall
        
       | htrp wrote:
       | what is the ballpark vram / gpu requirement to run this ?
        
         | rhdunn wrote:
         | For just the model itself: 4 bytes per parameter at F32, 2 at
         | F16/BF16, or 1 at F8, e.g. 685GB at F8. It will be smaller
         | for quantizations, but I'm not sure how to estimate those.
         | 
         | For a Mixture of Experts (MoE) model you only need to have the
         | memory size of a given expert. There will be some swapping out
         | as it figures out which expert to use, or to change expert, but
         | once that expert is loaded it won't be swapping memory to
         | perform the calculations.
         | 
         | You'll also need space for the context window; I'm not sure how
         | to calculate that either.
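         | The bytes-per-parameter arithmetic above works out as follows
         | (the Q4 figure is a rough guess that ignores quantization
         | scales and other overhead):

```python
# Rough weight-only memory estimate from the bytes-per-parameter rule.
# Since 1e9 params * N bytes == N GB per billion params, GB = B-params * bytes.
BYTES_PER_PARAM = {"F32": 4.0, "F16/BF16": 2.0, "F8": 1.0,
                   "Q4": 0.5}  # Q4 is approximate; real quants carry overhead

def weight_gb(params_billions, dtype):
    """Gigabytes needed just for the weights, nothing else."""
    return params_billions * BYTES_PER_PARAM[dtype]

for dtype in BYTES_PER_PARAM:
    print(f"685B params at {dtype}: ~{weight_gb(685, dtype):,.0f} GB")
```

         | Note this covers weights only; the KV cache for the context
         | window is extra and grows with context length.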
        
           | petu wrote:
           | I think your idea of MoE is incorrect. Despite the name
           | they're not "expert" at anything in particular, used experts
           | change more or less on each token -- so swapping them into
           | VRAM is not viable, they just get executed on CPU
           | (llama.cpp).
        
             | jodleif wrote:
             | A common pattern is to offload (most of) the expert layers
             | to the CPU. This combination is still quite fast even with
             | slow system ram, though obviously inferior to a pure VRAM
             | loading
        
           | anvuong wrote:
            | I think your understanding of MoE is wrong. Depending on
            | the settings, each token can actually be routed to
            | multiple experts (the "expert choice" architecture). This
            | makes it easier to parallelize inference (each expert on a
            | different device, for example), but it's not simply
            | keeping one expert in memory.
        
       | lalassu wrote:
       | Disclaimer: I did not test this yet.
       | 
        | I don't want to make big generalizations. But one thing I
        | noticed with Chinese models, especially Kimi, is that they do
        | very well on benchmarks but fail on vibe testing. They feel a
        | little over-fitted to the benchmarks and less to the use
        | cases.
       | 
       | I hope it's not the same here.
        
         | vorticalbox wrote:
          | This used to happen with benchmarks on phones: manufacturers
          | would tweak Android so benchmarks ran faster.
          | 
          | I guess that's how it goes for any system that's trained to
          | do well on benchmarks: it does well on them but is rubbish
          | at everything else.
        
           | make3 wrote:
            | Yes, they turned off all energy-saving measures when
            | benchmarking software activity was detected, which
            | completely defeated the point of the benchmarks, because
            | your phone is useless if it's very fast but the battery
            | lasts one hour.
        
         | make3 wrote:
         | I would assume a huge amount is spent on frontier models just
         | making them nicer to interact with, as that is likely one of
         | the main things that drives user engagement.
        
         | not_that_d wrote:
         | What is "Vibe testing"?
        
           | BizarroLand wrote:
           | I would assume that it is testing how well and appropriately
           | the LLM responds to prompts.
        
           | catigula wrote:
           | He means capturing things that benchmarks don't. You can use
           | Claude and GPT-5 back-to-back in a field they score nearly
           | identically on, and you will notice several differences.
           | This is the "vibe".
        
         | msp26 wrote:
         | K2 Thinking has immaculate vibes. Minimal sycophancy and a
         | pleasant writing style while being occasionally funny.
         | 
         | If it had vision and was better on long context I'd use it so
         | much more.
        
         | catigula wrote:
         | This is why I stopped bothering checking out these models and,
         | funnily enough, grok.
        
       | spullara wrote:
       | I hate that their model ids don't change as they change the
       | underlying model. I'm not sure how you can build on that.
        | % curl https://api.deepseek.com/models \
        |     -H "Authorization: Bearer ${DEEPSEEK_API_KEY}"
        | {"object":"list","data":[{"id":"deepseek-chat","object":"model","owned_by":"deepseek"},{"id":"deepseek-reasoner","object":"model","owned_by":"deepseek"}]}
        
         | KronisLV wrote:
         | Oh hey, quality improvement without doing anything!
         | 
         | (unless/until a new version gets worse for your use case)
        
       | twistedcheeslet wrote:
       | How capable are these models at tool calling?
        
         | potsandpans wrote:
         | From some very brief experimentation with DeepSeek about 2
         | months ago, tool calling is very hit or miss. Claude appears
         | to be the absolute best.
        
       | Foobar8568 wrote:
       | At least, there is no doubt where he is from !
       | 
       | which version are you?
       | 
        | I am the latest version of the DeepSeek model! If you want to
        | know the specific version number, I suggest you:
        | 
        | Check the official documentation - the DeepSeek website and
        | docs have the most accurate version information
        | 
        | Follow official announcements - version updates are usually
        | published through official channels
        | 
        | Check the app store / web version - the interface usually
        | shows the current version
        | 
        | I have all of DeepSeek's latest features, including: strong
        | conversation and reasoning abilities, 128K context length,
        | file upload handling (images, documents, etc.), web search
        | (must be enabled manually), and completely free use.
        | 
        | If you need the exact version number for technical integration
        | or other specific purposes, it is best to consult the official
        | technical documentation directly, which has the most accurate
        | and detailed technical specifications.
        | 
        | Do you have any other questions I can help with?
        
         | schlauerfox wrote:
         | It's so strange when it obviously hits a preprogrammed
         | non-answer in these models; how can one ever trust them when
         | there is a babysitter that interferes with an actual answer?
         | I suppose asking it what version it is isn't a valid question
         | in its training data, so it's programmed to say "check the
         | documentation", but it's still definitely suspicious when it
         | gives a non-answer.
        
       | embedding-shape wrote:
       | > DeepSeek-V3.2 introduces significant updates to its chat
       | template compared to prior versions. The primary changes involve
       | a revised format for tool calling and the introduction of a
       | "thinking with tools" capability.
       | 
       | At first, I thought they had gone the route of implementing yet
       | another chat format that can handle more dynamic conversations
       | like that, instead of just using Harmony, but looking at the
       | syntax, doesn't it look exactly like Harmony? That's a good
       | thing, don't get me wrong, but why not mention straight up that
       | they've implemented Harmony, so people can already understand up
       | front that it's compatible with whatever parsing we're using for
       | GPT-OSS?
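For reference, a sketch of what the Harmony-style turn format looks like, using the roles, channels, and `<|...|>` special tokens from the published GPT-OSS Harmony spec. Whether DeepSeek-V3.2's template is token-for-token identical is exactly the open question above, so treat this as an illustration of Harmony, not of DeepSeek's template.

```python
# Harmony-style turns per the published GPT-OSS spec (assumed shape;
# not confirmed to match DeepSeek-V3.2's template token-for-token).

def user_turn(message: str) -> str:
    return f"<|start|>user<|message|>{message}<|end|>"

def assistant_turn(channel: str, message: str) -> str:
    # assistant turns carry a channel: "analysis" (reasoning),
    # "commentary" (tool calls), or "final" (user-visible answer)
    return f"<|start|>assistant<|channel|>{channel}<|message|>{message}<|end|>"

prompt = (
    user_turn("What is 2+2?")
    + assistant_turn("analysis", "Trivial arithmetic.")
    + assistant_turn("final", "4")
)
print(prompt)
```

The channel mechanism is what enables "thinking with tools": interleaved reasoning and tool-call turns inside a single assistant response.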
        
       | mcbuilder wrote:
        | After using it a couple of hours playing around, it is a very
        | solid entry, and very competitive with the big US releases.
        | I'd say it's better than GLM-4.6 and Kimi K2. Looking forward
        | to v4.
        
       | gradus_ad wrote:
       | How will the Google/Anthropic/OpenAI's of the world make money on
       | AI if open models are competitive with their models? What hurt
       | open source in the past was its inability to keep up with the
       | quality and feature depth of closed source competitors, but
       | models seem to be reaching a performance plateau; the top open
       | weight models are generally indistinguishable from the top
       | private models.
       | 
       | Infrastructure owners with access to the cheapest energy will be
       | the long run winners in AI.
        
         | tsunamifury wrote:
         | Pure models clearly aren't the monetization strategy; using
         | them on existing monetized surfaces is the core value.
         | 
         | Google would love a cheap high-quality model on its surfaces.
         | That just helps Google.
        
           | gradus_ad wrote:
           | Hmmm but external models can easily operate on any "surface".
           | For instance Claude Code simply reads and edits files and
           | runs in a terminal. Photo editing apps just need a photo
           | supplied to them. I don't think there's much juice to squeeze
           | out of deeply integrated AI as AI by its nature exists above
           | the application layer, in the same way that we exist above
           | the application layer as users.
        
         | dotancohen wrote:
         | People and companies trust OpenAI and Anthropic, rightly or
         | wrongly, with hosting the models and keeping their company data
         | secure. Don't underestimate the value of a scapegoat to point a
         | finger at when things go wrong.
        
           | reed1234 wrote:
           | But they also trust cloud platforms like GCP to host models
           | and store company data.
           | 
           | Why would a company use an expensive proprietary model on
           | Vertex AI, for example, when they could use an open-source
           | one on Vertex AI that is just as reliable for a fraction of
           | the cost?
           | 
           | I think you are getting at the idea of branding, but branding
           | is different from security or reliability.
        
         | jonplackett wrote:
         | Either...
         | 
         | Better (UX / ease of use)
         | 
         | Lock in (walled garden type thing)
         | 
         | Trust (If an AI is gonna have the level of insight into your
         | personal data and control over your life, a lot of people will
         | prefer to use a household name)
        
           | niek_pas wrote:
            | > Trust (If an AI is gonna have the level of insight into
            | your personal data and control over your life, a lot of
            | people will prefer to use a household name)
           | 
           | Not Google, and not Amazon. Microsoft is a maybe.
        
             | polyomino wrote:
             | The success of Facebook basically proves that public brand
             | perception does not matter at all
        
               | acephal wrote:
                | Facebook itself still has a big problem with its lack
                | of a youth audience, though. Zuck did capture the
                | boomers and older Gen X, which are among the biggest
                | living demographics.
        
             | reed1234 wrote:
             | People trust google with their data in search, gmail, docs,
             | and android. That is quite a lot of personal info, and
             | trust, already.
             | 
             | All they have to do is completely switch the google
             | homepage to gemini one day.
        
           | poszlem wrote:
            | Or lobbying for regulations. You know, the "only American
            | models are safe" kind of regulation.
        
         | iLoveOncall wrote:
         | > How will the Google/Anthropic/OpenAI's of the world make
         | money on AI if open models are competitive with their models?
         | 
          | They won't. Actually, even if open models aren't competitive,
          | they still won't. Hasn't this been clear for a while already?
          | 
          | There's no moat in models. Investment in pure models has only
          | been to chase AGI; all other investment (the majority, from
          | Google, Amazon, etc.) has been in products using LLMs, not
          | the models themselves.
         | 
         | This is not like the gold rush where the ones who made good
         | money were the ones selling shovels, it's another kind of gold
         | rush where you make money selling shovels but the gold itself
         | is actually worthless.
        
       | wosined wrote:
       | Remember: If it is not peer-reviewed, then it is an ad.
        
         | vessenes wrote:
         | I mean.. true. Also, DeepSeek has good cred so far on
         | delivering roughly what their PR says they are delivering. My
         | prior would be that their papers are generally credible.
        
       | orena wrote:
       | Any results on frontier math or arc ?
        
       ___________________________________________________________________
       (page generated 2025-12-01 23:00 UTC)