[HN Gopher] DeepSeek-v3.2: Pushing the frontier of open large la...
___________________________________________________________________
DeepSeek-v3.2: Pushing the frontier of open large language models
[pdf]
https://huggingface.co/deepseek-ai/DeepSeek-V3.2
https://api-docs.deepseek.com/news/news251201
Author : pretext
Score : 435 points
Date : 2025-12-01 15:48 UTC (7 hours ago)
(HTM) web link (huggingface.co)
(TXT) w3m dump (huggingface.co)
| nimchimpsky wrote:
| Pretty amazing that a relatively small Chinese hedge fund can
| build AI better than almost anyone.
| BoorishBears wrote:
| 3.2-Exp came out in September: this is 3.2, along with a special
| checkpoint (DeepSeek-V3.2-Speciale) for deep reasoning that
| they're claiming surpasses GPT-5 and matches Gemini 3.0
|
| https://x.com/deepseek_ai/status/1995452641430651132
| zparky wrote:
| Benchmarks are super impressive, as usual. Interesting to note in
| table 3 of the paper (p. 15), DS-Speciale is 1st or 2nd in
| accuracy in all tests, but has much higher token output (50%
| more, or 3.5x vs gemini 3 in the codeforces test!).
| futureshock wrote:
| The higher token output is no accident. Certain kinds of
| logical reasoning problems are solved by longer thinking
| output. Thinking-chain output is usually kept to a reasonable
| length to limit latency and cost, but if pure benchmark
| performance is the goal you can crank it up to the point of
| diminishing returns. DeepSeek being 30x cheaper than Gemini
| means there's little downside to maxing out the thinking
| time. It's been shown that you can scale this further by
| running many solution attempts in parallel with max thinking,
| then using a model to choose a final answer, so increasing
| reasoning performance by increasing inference compute has a
| pretty high ceiling.
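The parallel-attempts idea is easy to sketch. This is a toy version with a stub in place of the model call (the 70% per-sample accuracy and the answer "42" are made up for illustration), using simple majority voting as the final-answer selector:

```python
import random
from collections import Counter

def solve_once(problem, rng):
    # Stand-in for one high-budget "thinking" sample from a model;
    # in reality this would be an API call with a large reasoning-
    # token limit. Toy behavior: 70% chance of the right answer
    # ("42"), otherwise a random wrong one.
    return "42" if rng.random() < 0.7 else str(rng.randint(0, 41))

def best_of_n(problem, n=16, seed=0):
    rng = random.Random(seed)
    candidates = [solve_once(problem, rng) for _ in range(n)]
    # Self-consistency: pick the most common answer. A separate
    # judge model can replace this vote when answers are free-form
    # and not directly comparable.
    answer, votes = Counter(candidates).most_common(1)[0]
    return answer, votes / n

answer, agreement = best_of_n("toy problem", n=32)
```

Because wrong answers scatter while correct ones agree, accuracy rises with n until it plateaus, which is the "high ceiling" from more inference compute.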
| jodleif wrote:
| I genuinely do not understand the evaluations of the US AI
| industry. The chinese models are so close and far cheaper
| newyankee wrote:
| Yet tbh if the US industry had not moved ahead and created the
| race with FOMO, it would not have been easy for the Chinese
| strategy to work either.
|
| The nature of the race may yet change, though, and I am
| unsure if the devil is in the details, as in very specific edge
| cases that will only work with frontier models?
| jazzyjackson wrote:
| Valuation is not based on what they have done but on what they
| might do. I agree tho, it's investment made with very little
| insight into Chinese research. I guess it's counting on
| deepseek being banned and all computers in America refusing to
| run open software by the year 2030 /snark
| bilbo0s wrote:
| > _I guess it's counting on deepseek being banned_
|
| And the people making the bets are in a position to make sure
| the banning happens. The US government system being what it
| is.
|
| Not that our leaders need any incentive to ban Chinese tech
| in this space. Just pointing out that it's not _necessarily_
| a "bet".
|
| "Bet" implies you don't know the outcome _and_ you have no
| influence over the outcome. Even "investment" implies you
| don't know the outcome. I'm not sure that's the case with
| these people?
| coliveira wrote:
| Exactly. "Business investment" these days means that the
| people involved will have at least some amount of power to
| determine the winning results.
| jodleif wrote:
| > Valuation is not based on what they have done but what they
| might do
|
| Exactly what I'm thinking. Chinese models are catching up
| rapidly. Soon they'll be on par with the big dogs.
| ksynwa wrote:
| Even if they do continue to lag behind they are a good bet
| against monopolisation by proprietary vendors.
| coliveira wrote:
| They would if corporations were allowed to run these
| models. I fully expect the US government to prohibit
| corporations from doing anything useful with Chinese
| models (full censorship). It's the same game they use
| with chips.
| jasonsb wrote:
| It's all about the hardware and infrastructure. If you check
| OpenRouter, no provider offers a SOTA chinese model matching
| the speed of Claude, GPT or Gemini. The chinese models may
| benchmark close on paper, but real-world deployment is
| different. So you either buy your own hardware in order to run
| a chinese model at 150-200tps or give up and use one of the Big
| 3.
|
| The US labs aren't just selling models, they're selling
| globally distributed, low-latency infrastructure at massive
| scale. That's what justifies the valuation gap.
|
| Edit: It looks like Cerebras is offering a very fast GLM 4.6
| csomar wrote:
| According to OpenRouter, z.ai is 50% faster than Anthropic;
| which matches my experience. z.ai does have frequent
| downtimes but so does Claude.
| jodleif wrote:
| Assuming your hardware premise is right (and let's be honest,
| nobody really wants to send their data to chinese providers),
| you can use a provider like Cerebras or Groq?
| observationist wrote:
| The network effects of using consistently behaving models and
| maintaining API coverage between updates are valuable, too.
| Presumably the big labs are including their own domains of
| competence in the training, so Claude is likely to remain
| very good at coding and to behave in similar ways, informed
| and constrained by their prompt frameworks. That way,
| interactions will continue to work in predictable ways even
| after major new releases occur, and upgrades can be clean.
|
| It'll probably be a few years before all that stuff becomes
| as smooth as people need, but OAI and Anthropic are already
| doing a good job on that front.
|
| Each new Chinese model requires a lot of testing and bespoke
| conformance to every task you want to use it for. There's a
| lot of activity and shared prompt engineering, and some
| really competent people doing things out in the open, but
| it's generally going to take a lot more expert work getting
| the new Chinese models up to snuff than working with the big
| US labs. Their product and testing teams do a lot of valuable
| work.
| kachapopopow wrote:
| cerebras AI offers models at 50x the speed of sonnet?
| DeathArrow wrote:
| > If you check OpenRouter, no provider offers a SOTA chinese
| model matching the speed of Claude, GPT or Gemini.
|
| I think GLM 4.6 offered by Cerebras is much faster than any
| US model.
| jasonsb wrote:
| You're right, I forgot about that one.
| irthomasthomas wrote:
| Gemini 3 = ~70tps
| https://openrouter.ai/google/gemini-3-pro-preview
|
| Opus 4.5 = ~60-80tps
| https://openrouter.ai/anthropic/claude-opus-4.5
|
| Kimi-k2-think = ~60-180tps
| https://openrouter.ai/moonshotai/kimi-k2-thinking
|
| Deepseek-v3.2 = ~30-110tps (only 2 providers rn)
| https://openrouter.ai/deepseek/deepseek-v3.2
| jasonsb wrote:
| It doesn't work like that. You need to actually use the
| model and then go to /activity to see the actual speed. I
| constantly get 150-200tps from the Big 3 while other
| providers barely hit 50tps even though they advertise much
| higher speeds. GLM 4.6 via Cerebras is the only one faster
| than the closed source models at over 600tps.
| irthomasthomas wrote:
| These aren't advertised speeds, they are the average
| measured speeds by openrouter across different providers.
| isamuel wrote:
| There is a great deal of orientalism --- it is genuinely
| unthinkable to a lot of American tech dullards that the Chinese
| could be better at anything requiring what they think of as
| "intelligence." Aren't they Communist? Backward? Don't they eat
| weird stuff at wet markets?
|
| It reminds me, in an encouraging way, of the way that German
| military planners regarded the Soviet Union in the lead-up to
| Operation Barbarossa. The Slavs are an obviously inferior race;
| their Bolshevism dooms them; we have the will to power; we will
| succeed. Even now, when you ask questions like what you ask of
| that era, the answers you get are genuinely not better than
| "yes, this should have been obvious at the time if you were not
| completely blinded by ethnic and especially ideological
| prejudice."
| newyankee wrote:
| but didn't the Chinese already surpass the rest of the world
| in solar, batteries, and EVs, among other things?
| cyberlimerence wrote:
| They did, but the goalposts keep moving, so to speak. We're
| approximately here : advanced semiconductors, artificial
| intelligence, reusable rockets, quantum computing, etc.
| Chinese will never catch up. /s
| mosselman wrote:
| Back when deepseek came out and people were tripping over
| themselves shouting it was so much better than what was out
| there, it just wasn't good.
|
| It might be that this model is super good, I haven't tried it,
| but to say the Chinese models are better is just not true.
|
| What I really love though is that I can run them (open
| models) on my own machine. The other day I categorised images
| locally using Qwen, what a time to be alive.
|
| Even beyond local hardware, open models make it possible to
| run them on providers of your choice, such as European ones.
| Which is great!
|
| So I love everything about the competitive nature of this.
| CamperBob2 wrote:
| If you thought DeepSeek "just wasn't good," there's a good
| chance you were running it wrong.
|
| For instance, a lot of people thought they were running
| "DeepSeek" when they were really running some random
| distillation on ollama.
| bjourne wrote:
| WDYM? Isn't https://chat.deepseek.com/ the real DeepSeek?
| CamperBob2 wrote:
| Good point, I was assuming the GP was running local for
| some reason. Hard to argue when it's the official
| providers who are being compared.
|
| I ran the 1.58-bit Unsloth quant locally at the time it
| came out, and even at such low precision, it was _super_
| rare for it to get something wrong that o1 and GPT4 got
| right. I have never actually used a hosted version of the
| full DS.
| lukan wrote:
| "It reminds me, in an encouraging way, of the way that German
| military planners regarded the Soviet Union in the lead-up to
| Operation Barbarossa. The Slavs are an obviously inferior
| race; ..."
|
| Ideology played a role, but the data they worked with was
| the Finnish Winter War, which was disastrous for the Soviet
| side. Hitler later famously said it was all an intentional
| distraction to make them believe the Soviet army was worth
| nothing. (The real reasons were more complex, like the
| earlier purges.)
| littlestymaar wrote:
| > It reminds me, in an encouraging way, of the way that
| German military planners regarded the Soviet Union in the
| lead-up to Operation Barbarossa. The Slavs are an obviously
| inferior race; their Bolshevism dooms them; we have the will
| to power; we will succeed
|
| Though, because Stalin had decimated the Red Army leadership
| (including most of the veteran officers who had Russian civil
| war experience) during the Moscow trials purges, the Germans
| almost succeeded.
| gazaim wrote:
| > Though, because Stalin had decimated the Red Army
| leadership (including most of the veteran officers who had
| Russian civil war experience) during the Moscow trials
| purges, the Germans almost succeeded.
|
| There were many counter revolutionaries among the
| leadership, even those conducting the purges. Stalin was
| like "ah fuck we're hella compromised." Many revolutions
| fail in this step and often end up facing a CIA backed
| coup. The USSR was under constant siege and attempted
| infiltration since inception.
| littlestymaar wrote:
| > There were many counter revolutionaries among the
| leadership
|
| Well, Stalin was, by far, the biggest counter-
| revolutionary in the Politburo.
|
| > Stalin was like "ah fuck we're hella compromised."
|
| There's no evidence that anything significant was
| compromised at that point, and clear evidence that Stalin
| was in fact clinically paranoid.
|
| > Many revolutions fail in this step and often end up
| facing a CIA backed coup. The USSR was under constant
| siege and attempted infiltration since inception.
|
| Can we please not recycle 90-year-old Soviet propaganda?
| The Moscow trials being irrational self-harm was
| acknowledged by the USSR leadership as early as the
| fifties...
| breppp wrote:
| Not sure how the entire Nazi comparison plays out, but at the
| time there were good reasons to imagine the Soviets would
| fall apart (as they initially did).
|
| Stalin had just finished purging his entire officer corps,
| which is not a good omen for war, and the USSR failed
| miserably against the Finns, who were not the strongest of
| nations, while Germany had just steamrolled France, a country
| that was much more impressive in WW1 than the Russians (who
| collapsed against Germany).
| ecshafer wrote:
| I don't think that anyone, much less someone working in tech
| or engineering in 2025, could still hold beliefs about
| Chinese not being capable scientists or engineers. I could
| maybe give (the naive) pass to someone in 1990 thinking China
| will never build more than junk. But in 2025 their product
| capacity, scientific advancement, and just the amount of us
| who have worked with extremely talented Chinese colleagues
| should dispel those notions. I think you are jumping to
| racism a bit fast here.
|
| Germany was right in some ways and wrong in others about the
| Soviet Union's strength. The USSR failed to conquer Finland
| because of the military purges. German intelligence vastly
| underestimated the number of tanks and general preparedness
| of the Soviet army (Hitler was shocked the Soviets had 40k
| tanks already). The Lend-Lease Act sent an astronomical
| amount of goods to the USSR, which allowed them to fully
| commit to the war and focus on increasing their weapon
| production; the numbers on the amount of tractors, food,
| trains, ammunition, etc. that the US sent to the USSR are
| staggering.
| gazaim wrote:
| These Americans have no comprehension of intelligence being
| used to benefit humanity instead of being used to fund a
| CEO's new yacht. I encourage them to visit China to see how
| far the USA lags behind.
| espadrine wrote:
| Two aspects to consider:
|
| 1. Chinese models typically focus on text. US and EU models
| also bear the cross of handling images, and often voice and
| video. Supporting all of those is additional training cost
| not spent on further reasoning: tying one hand behind your
| back to be more generally useful.
|
| 2. The gap seems small, because so many benchmarks get
| saturated so fast. But towards the top, every 1% increase in
| benchmarks is significantly better.
|
| On the second point, I worked on a leaderboard that both
| normalizes scores, and predicts unknown scores to help improve
| comparisons between models on various criteria:
| https://metabench.organisons.com/
|
| You can notice that, while Chinese models are quite good, the
| gap to the top is still significant.
|
| However, the US models are typically much more expensive for
| inference, and Chinese models do have a niche on the Pareto
| frontier on cheaper but serviceable models (even though US
| models also eat up the frontier there).
| jodleif wrote:
| 1. Have you seen the Qwen offerings? They have great multi-
| modality, some even SOTA.
| brabel wrote:
| Qwen Image and Image Edit were among the best image models
| until Nano Banana Pro came along. I have tried some open
| image models and can confirm, the Chinese models are
| easily the best or very close to the best, but right now
| the Google model is even better... we'll see if the Chinese
| catch up again.
| BoorishBears wrote:
| I'd say Google still hasn't caught up on the smaller
| model side at all, but we've all been (rightfully) wowed
| enough by Pro to ignore that for now.
|
| Nano Banana Pro starts at 15 cents per image at <2k
| resolution, and is not strictly better than Seedream 4.0;
| yet the latter does 4K for 3 cents per image.
|
| Add in the power of fine-tuning on their open weight
| models and I don't know if China actually needs to catch
| up.
|
| I finetuned Qwen Image on 200 generations from Seedream
| 4.0 that were cleaned up with Nano Banana Pro, and got
| results that were as good _and more reliable_ than either
| model could achieve otherwise.
| torginus wrote:
| Thanks for sharing that!
|
| The scales are a bit murky here, but if we look at the
| 'Coding' metric, we see that Kimi K2 outperforms Sonnet 4.5 -
| that's considered to be the price-perf darling I think even
| today?
|
| I haven't tried these models, but in general there have been
| lots of cases where a model performs much worse IRL than the
| benchmarks would suggest (certain Chinese models and GPT-OSS
| have been guilty of this in the past).
| espadrine wrote:
| Good question. There are two points to consider.
|
| * For both Kimi K2 and for Sonnet, there's a non-thinking
| and a thinking version. Sonnet 4.5 Thinking is better than
| Kimi K2 non-thinking, but the K2 Thinking model came out
| recently, and beats it on all comparable pure-coding
| benchmarks I know: OJ-Bench (Sonnet: 30.4% < K2: 48.7%),
| LiveCodeBench (Sonnet: 64% < K2: 83%), they tie at SciCode
| at 44.8%. It is a finding shared by ArtificialAnalysis:
| https://artificialanalysis.ai/models/capabilities/coding
|
| * The reason developers love Sonnet 4.5 for coding, though,
| is not just the quality of the code. They use Cursor,
| Claude Code, or some other system such as Github Copilot,
| which are increasingly agentic. On the Agentic Coding
| criteria, Sonnet 4.5 Thinking is much higher.
|
| By the way, you can look at the Table tab to see all known
| and predicted results on benchmarks.
| agumonkey wrote:
| forgive me for bringing politics into it, but are chinese LLMs
| more prone to censorship bias than US ones?
| coliveira wrote:
| Being open source, Chinese models are, I believe, less prone
| to censorship, since US corporations can add censorship in
| several ways simply by controlling a closed model.
| skeledrew wrote:
| It's not about an LLM being prone to anything, but more
| about the way an LLM is fine-tuned (which can be subject to
| the requirements of those wielding political power).
| agumonkey wrote:
| that's what i meant even though i could have been more
| precise
| raincole wrote:
| > video
|
| Most of the AI-generated videos we see on social media now
| are made with Chinese models.
| coliveira wrote:
| Nothing you said helps with the issue of valuation. Yes, the
| US models may be better by a few percentage points, but how
| can they justify being so costly, both operationally and in
| investment? Over the long run, this is a business, and you
| don't make money by being first; you have to be more
| profitable overall.
| ben_w wrote:
| I think the investment race here is an "all-pay auction"*.
| Lots of investors have looked at the ultimate prize --
| basically winning something larger than the entire present
| world economy forever -- and think "yes".
|
| But even assuming that we're on the right path for that
| (which we may not be) and assuming that nothing intervenes
| to stop it (which it might), there may be only one winner,
| and that winner may not have even entered the game yet.
|
| * https://en.wikipedia.org/wiki/All-pay_auction
| coliveira wrote:
| > investors have looked at the ultimate prize --
| basically winning something larger than the entire
| present world economy
|
| This is what people like Altman want investors to
| believe. It seems like any other snake oil scam because
| it doesn't match the reality of what he delivers.
| saubeidl wrote:
| Yeah, this is basically financial malpractice/fraud.
| Bolwin wrote:
| Third party providers rarely support caching.
|
| With caching, the expensive US models end up being only about
| 2x the price (e.g. Sonnet) and often much cheaper (e.g. GPT-5
| mini).
|
| If the third parties start caching, then US companies will be
| completely outpriced.
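As a rough sketch of that arithmetic (all prices and the 90% cache-hit rate below are placeholders, not actual list prices), the effective cost per request with prompt caching looks like:

```python
def request_cost(mtok_in, cache_hit_rate, mtok_out, p_in, p_cached, p_out):
    # Cost in dollars for one request, with token counts in millions.
    # Cached input tokens bill at the discounted rate, the rest at the
    # full input rate, plus output tokens at the output rate.
    fresh = mtok_in * (1 - cache_hit_rate) * p_in
    cached = mtok_in * cache_hit_rate * p_cached
    return fresh + cached + mtok_out * p_out

# Placeholder prices in $/Mtok: a closed model at $3 in / $0.30
# cached / $15 out, vs a hypothetical open-weights provider at
# $0.30 in / $1.20 out with no cache discount.
closed = request_cost(1.0, 0.9, 0.1, p_in=3.00, p_cached=0.30, p_out=15.00)
open_ = request_cost(1.0, 0.9, 0.1, p_in=0.30, p_cached=0.30, p_out=1.20)
```

With a high cache-hit rate the closed model's input bill shrinks toward the cached rate, which is why the gap narrows so much on agentic workloads that resend the same long context every turn.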
| beastman82 wrote:
| Then you should short the market
| fastball wrote:
| They're not that close (on things like LMArena) and being
| cheaper is pretty meaningless when we are not yet at the point
| where LLMs are good enough for autonomy.
| mrinterweb wrote:
| I would expect one of the motivations for making these LLM
| model weights open is to undermine the valuation of other
| players in the industry. Open models like this must diminish
| the value prop of the frontier focused companies if other
| companies can compete with similar results at competitive
| prices.
| rprend wrote:
| People pay for products, not models. OpenAI and Anthropic make
| products (ChatGPT, Claude Code).
| TIPSIO wrote:
| It's awesome that stuff like this is open source, but even if
| you have a basement rig with 4 NVIDIA GeForce RTX 5090 graphics
| cards (a $15-20k machine), can it even run with any reasonable
| context window that isn't a crawling 10 tps?
|
| Frontier models are far exceeding even the most hardcore consumer
| hobbyist requirements. This is even further
| bigyabai wrote:
| People with basement rigs generally aren't the target audience
| for these gigantic models. You'd get much better results out of
| an MoE model like Qwen3's A3B/A22B weights, if you're running a
| homelab setup.
| Spivak wrote:
| Yeah I think the advantage of OSS models is that you can get
| your pick of providers and aren't locked into just Anthropic
| or just OpenAI.
| noosphr wrote:
| Home rigs like that are no longer cost effective. You're better
| off buying an RTX Pro 6000 outright. This holds for the sticker
| price, the supporting hardware, the electricity cost to run it,
| and the cooling of the room that you use it in.
| torginus wrote:
| I was just watching this video about a Chinese piece of
| industrial equipment, designed for replacing BGA chips such
| as flash or RAM with a good deal of precision:
|
| https://www.youtube.com/watch?v=zwHqO1mnMsA
|
| I wonder how well the aftermarket memory surgery business on
| consumer GPUs is doing.
| ThrowawayTestr wrote:
| LTT recently did a video on upgrading a 5090 to 96gb of ram
| dotancohen wrote:
| I wonder how well the ophthalmologist is doing. These guys
| are going to be paying him a visit, playing around with
| those lasers with no PPE.
| mikae1 wrote:
| Or perhaps a 512GB Mac Studio. 671B Q4 of R1 runs on it.
| redrove wrote:
| I wouldn't say runs. More of a gentle stroll.
| storus wrote:
| I run it all the time, token generation is pretty good.
| Just large contexts are slow but you can hook a DGX Spark
| via Exo Labs stack and outsource token prefill to it.
| Upcoming M5 Ultra should be faster than Spark in token
| prefill as well.
| embedding-shape wrote:
| > I run it all the time, token generation is pretty good.
|
| I feel like because you didn't actually talk about prompt
| processing speed or token/s, you aren't really giving the
| whole picture here. What is the prompt processing tok/s
| and the generation tok/s actually like?
| storus wrote:
| I addressed both points - I mentioned you can offload
| token prefill (the slow part, 9t/s) to DGX Spark. Token
| generation is at 6t/s which is acceptable.
| hasperdi wrote:
| With quantization, converting it to an MOE model... it
| can be a fast walk
| throw4039 wrote:
| Yeah, the pricing for the rtx pro 6000 is surprisingly
| competitive with the gamer cards (at actual prices, not
| MSRP). A 3x5090 rig will require significant
| tuning/downclocking to be run from a single North American
| 15A plug, and the cost of the higher powered supporting
| equipment (cooling, PSU, UPS, etc) needed will pay for the
| price difference, not to mention future expansion
| possibilities.
| halyconWays wrote:
| As someone with a basement rig of 6x 3090s, not really. It's
| quite slow, as with that many params (685B) it's offloading
| basically all of it into system RAM. I limit myself to models
| with <144B params, then it's quite an enjoyable experience. GLM
| 4.5 Air has been great in particular
| tarruda wrote:
| You can run at ~20 tokens/second on a 512GB Mac Studio M3
| Ultra: https://youtu.be/ufXZI6aqOU8?si=YGowQ3cSzHDpgv4z&t=197
|
| IIRC the 512GB mac studio is about $10k
| hasperdi wrote:
| and can be faster if you can get an MOE model of that
| dormento wrote:
| "Mixture-of-experts", AKA "running several small models and
| activating only a few at a time". Thanks for introducing me
| to that concept. Fascinating.
|
| (commentary: things are really moving too fast for the
| layperson to keep up)
| whimsicalism wrote:
| that's not really a good summary of what MoEs are. You can
| think of it more like sublayers that get routed through
| (like how the brain only lights up certain pathways) rather
| than actual separate models.
| Mehvix wrote:
| The gain from MoE is that you can have a large model that's
| efficient: it lets you decouple #params and computation
| cost. I don't see how anthropomorphizing MoE <-> brain
| affords insight deeper than 'less activity means less
| energy used'. These are totally different systems; IMO this
| shallow comparison muddies the water and does a disservice
| to each field of study. There's been loads of research
| showing there's redundancy in MoE models, e.g. Cerebras has
| a paper[1] where they selectively prune half the experts
| with minimal loss across domains -- I'm not sure you could
| disable half the brain without noticing a stupefying
| difference.
|
| [1] https://www.cerebras.ai/blog/reap
| hasperdi wrote:
| As pointed out by a sibling comment, MoE consists of a
| router and a number of experts (e.g. 8). These experts can
| be imagined as parts of the brain with specialization,
| although in reality they probably don't work exactly like
| that. These aren't separate models; they are components
| of a single large model.
|
| Typically, input gets routed to a subset of the experts,
| e.g. the top 2, leaving the others inactive. This reduces
| the number of active parameters and the processing required.
|
| Mistral is an example of a lab that designs models like
| this. Clever people have created converters to transform
| dense models into MoE models. These days many popular models
| are also available in an MoE configuration.
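The router-plus-experts mechanics described above can be sketched in a few lines. This is a toy, untrained layer with made-up sizes (16-dim tokens, 8 experts, top-2 routing), just to show gated expert selection, not any real model's architecture:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_experts, top_k = 16, 8, 2   # toy sizes, not from any real model

# One tiny ReLU FFN per "expert"; in a real model this layer sits
# inside every transformer block and the weights are learned.
experts = [(rng.standard_normal((d, 4 * d)), rng.standard_normal((4 * d, d)))
           for _ in range(n_experts)]
router = rng.standard_normal((d, n_experts))  # the learned gating matrix

def moe_layer(x):
    logits = x @ router                   # score every expert for this token
    top = np.argsort(logits)[-top_k:]     # keep only the top-k experts
    gates = np.exp(logits[top] - logits[top].max())
    gates /= gates.sum()                  # softmax over the winners only
    out = np.zeros(d)
    for g, i in zip(gates, top):
        w1, w2 = experts[i]
        out += g * (np.maximum(x @ w1, 0.0) @ w2)  # gated expert output
    return out

y = moe_layer(rng.standard_normal(d))  # only 2 of the 8 experts ran
```

The total parameter count scales with all 8 experts, but each token only pays the compute of 2, which is the params/compute decoupling discussed upthread.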
| bigyabai wrote:
| >90% of inference hardware is faster if you run an MOE
| model.
| miohtama wrote:
| All modern models are MoE already, no?
| reilly3000 wrote:
| There are plenty of 3rd party and big cloud options to run
| these models by the hour or token. Big models really only work
| in that context, and that's ok. Or you can get yourself an H100
| rack and go nuts, but there is little downside to using a cloud
| provider on a per-token basis.
| cubefox wrote:
| > There are plenty of 3rd party and big cloud options to run
| these models by the hour or token.
|
| Which ones? I wanted to try a large base model for automated
| literature (fine-tuned models are a lot worse at it) but I
| couldn't find a provider which makes this easy.
| potsandpans wrote:
| I run a bunch of smaller models on a 12GB VRAM 3060 and it's
| quite good. For larger open models I'll use OpenRouter. I'm
| looking into on-demand instances with cloud/VPS providers,
| but haven't explored the space too much.
|
| I feel like private cloud instances that run on demand are
| still in the spirit of consumer hobbyism. It's not as good as
| having it all local, but the bootstrapping cost plus the
| electricity to run it seems prohibitive.
|
| I'm really interested to see if there's a space for consumer
| TPUs that satisfy use cases like this.
| seanw265 wrote:
| FWIW it looks like OpenRouter's two providers for this model
| (one of whom being Deepseek itself) are only running the model
| around 28tps at the moment.
|
| https://openrouter.ai/deepseek/deepseek-v3.2
|
| This only bolsters your point. Will be interesting to see if
| this changes as the model is adopted more widely.
| red2awn wrote:
| Worth noting this is not only good on benchmarks, but also
| significantly more efficient at inference:
| https://x.com/_thomasip/status/1995489087386771851
| zug_zug wrote:
| Well props to them for continuing to improve, winning on cost-
| effectiveness, and continuing to publicly share their
| improvements. Hard not to root for them as a force to prevent an
| AI corporate monopoly/duopoly.
| srameshc wrote:
| As much as I agree with your sentiment, I doubt the intention
| is singular.
| echelon wrote:
| I don't care if this kills Google and OpenAI.
|
| I hope it does, though I'm doubtful because distribution is
| important. You can't beat "ChatGPT" as a brand in laypeople's
| minds (unless perhaps you give them a massive "Temu: Shop
| Like A Billionaire" commercial campaign).
|
| Closed source AI is almost by design morphing into an
| industrial, infrastructure-heavy rocket science that
| commoners can't keep up with. The companies pushing it are
| building an industry we can't participate in or share.
| They're cordoning off areas of tech and staking out ground
| for themselves. It's placing a steep fence around tech.
|
| I hope every such closed source AI effort is met with
| equivalent open source and that the investments made into
| closed AI go to zero.
|
| The most likely outcome is that Google, OpenAI, and Anthropic
| win and every other "lab"-shaped company dies an expensive
| death. RunwayML spent hundreds of millions and they're barely
| noticeable now.
|
| These open source models hasten the deaths of the second tier
| also-ran companies. As much as I hope for dents in the big
| three, I'm doubtful.
| raw_anon_1111 wrote:
| I can't think of a single company I've worked with as a
| consultant that I could convince to use DeepSeek because of
| its ties with China even if I explained that it was hosted
| on AWS and none of the information would go to China.
|
| Even when the technical people understood that, it would be
| too much of a political quagmire within their company when
| it became known to the higher ups. It just isn't worth the
| political capital.
|
| They would feel the same way about using xAI or maybe even
| Facebook models.
| siliconc0w wrote:
| Even when self-hosting, there is still a real risk of
| using Chinese models (or any provider you can't
| trust/sue) because they can embed malicious actions into
| the model. For example, a small random percentage of the
| time, it could add a subtle security vulnerability to any
| code generation.
|
| This is a known playbook for China, so it's pretty likely
| that if they aren't already doing this, they will do it
| eventually if the models see high adoption.
| nagaiaida wrote:
| on what hypothetical grounds would you be more
| meaningfully able to sue the american maker of a self-
| hosted statistical language model that you select your
| own runtime sampling parameters for after random subtle
| security vulnerabilities came out the other side when you
| asked it for very secure code?
|
| put another way, how do you propose to tell this subtle
| nefarious chinese sabotage you baselessly imply to be
| commonplace from the very real limitations of this
| technology in the first place?
| fragmede wrote:
| This paper may be of interest to you:
| https://arxiv.org/html/2504.15867v1
| nagaiaida wrote:
| the mechanism of action for that attack appears to be
| reading from poisoned snippets on stackoverflow or a
| similar site, which to my mind is an excellent example of
| why it seems like it would be difficult to retroactively
| pin "insecure code came out of my model" on the evil
| communist base weights of the model in question
| kriops wrote:
| "Baselessly" - I'm sorry but realpolitik is plenty of
| basis. China is a geopolitical adversary of both the EU
| and the US. And China will be the first to admit this,
| btw.
| coliveira wrote:
| Competitor != adversary. It is US warmongering ideology
| that tries to equate these concepts.
| kriops wrote:
| That is just objectively incorrect, and fundamentally
| misunderstanding the basics of statehood. China, the US,
| and any other local monopoly on force would absolutely
| take any chance they could get to extend their influence
| and diminish the others. That is, they are acting
| rationally to, at _minimum_, maximise the probability that
| they are able to maintain their current monopolies on force.
| nagaiaida wrote:
| sorry, is your contention here "spurious accusations
| don't require evidence when aimed at designated state
| enemies"? because it feels uncharitably rude to infer
| that's what you meant to say here, but i struggle to
| parse this in a different way where you say something
| more reasonable.
| kriops wrote:
| I'm sorry you feel that way. It is however entirely
| reasonable to assume that the comment I replied to was
| made entirely in bad faith, seeing as it dismisses any
| rational basis for the behaviour of the entities it is
| making claims about.
| StealthyStart wrote:
| This is the real cause. At the enterprise level, trust
| outweighs cost. My company hires agencies and consultants
| who provide the same advice as our internal team. This is
| not to imply that our internal team is incorrect; rather,
| the credibility lies in the fact that if something goes
| wrong, the consequences of the decision can be shifted.
| There is a reason companies continue to hire the same
| four consulting firms. It's trust, whether real or
| perceived.
| 0xWTF wrote:
| Children do the same thing intuitively: parents
| continually complain that their children don't listen to
| them. But as soon as someone else tells them to "cover
| their nose", "chew with their mouth closed", "don't run
| with scissors", whatever, they listen and integrate that
| guidance into their behavior. What's harder to observe is
| all the external guidance they get that they don't
| integrate until their parents tell them. It's internal vs
| external validation.
| raw_anon_1111 wrote:
| Or in many cases they go over to their grandparents'
| house and the grandparents let them run wild, and all of
| a sudden your parents have "McDonald's money" for their
| grandkids when they never had it for you.
| raw_anon_1111 wrote:
| I have seen it much more nuanced than that.
|
| 2020 - I was a mid level (L5) cloud consultant at AWS
| with only two years of total AWS experience and that was
| only at a small startup before then. Yet every customer
| took my (what in hindsight might not have been the best)
| advice all of the time without questioning it as long as
| it met their business goals. Just because I had
| @amazon.com as my email address.
|
| Late 2023 - I was the subject matter expert in a niche of
| a niche in AWS that the customer focused on and it was
| still almost impossible to get someone to listen to a
| consultant from a shitty third rate consulting company.
|
| 2025 - I left the shitty consulting company last year
| after only a year and now work for one with a much better
| reputation and I have a better title "staff consultant".
| I also play the game and make sure to mention that I'm
| former "AWS ProServe" when doing introductions. Now
| people listen to me again.
| coliveira wrote:
| So much the worse for American companies. It only means
| that they will be uncompetitive with similar companies
| that use models with realistic costs.
| tokioyoyo wrote:
| If the Chinese model becomes better than competitors,
| these worries will suddenly disappear. Also, there are
| plenty of startups and enterprises running fine-tuned
| versions of different open-source models.
| raw_anon_1111 wrote:
| Yeah that's not how Big Enterprise works...
|
| And most startups are just doing prompt engineering that
| will never go anywhere. The big companies will just throw
| a couple of developers at the feature and add it to their
| existing business.
| subroutine wrote:
| As a government contractor, using a Chinese model is a
| non-starter.
| hhh wrote:
| No... Nobody I work for will touch these models. The fear
| is real that they have been poisoned or have some
| underlying bomb. Plus y'know, they're produced by China,
| so they would never make it past a review board in most
| mega enterprises IME.
| kriops wrote:
| For good reason, too. Hostile governments have a much
| easier time poisoning their "local" LLMs.
| giancarlostoro wrote:
| ChatGPT is like "Photoshop": people will call any AI
| "ChatGPT".
| twelvechairs wrote:
| The bar is incredibly low considering what OpenAI has done as
| a "not for profit"
| jstummbillig wrote:
| How could we judge whether anyone is "winning" on cost-
| effectiveness when we don't know what everyone's
| profits/losses are?
| ericskiff wrote:
| I believe this was a statement on cost per token to us as
| consumers of the service
| rowanG077 wrote:
| Well, consumers care about the cost to them, and that we
| do know. And DeepSeek is destroying everything in that
| department.
| tedivm wrote:
| If you're trying to build AI based applications you can and
| should compare the costs between vendor based solutions and
| hosting open models with your own hardware.
|
| On the hardware side you can run some benchmarks on the
| hardware (or use other people's benchmarks) and get an idea
| of the tokens/second you can get from the machine. Normalize
| this for your usage pattern (and do your best to implement
| batch processing where you are able to, which will save you
| money on both methods) and you have a basic idea of how much
| it would cost per token.
|
| Then you compare that to the cost of something like GPT5,
| which is a bit simpler because the cost per (million) token
| is something you can grab off of a website.
|
| You'd be surprised how much money running something like
| DeepSeek (or if you prefer a more established company, Qwen3)
| will save you over the cloud systems.
|
| That's just one factor though. Another is what hardware you
| can actually run things on. DeepSeek and Qwen will function
| on cheap GPUs that other models will simply choke on.
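| To make the comparison concrete, here's a minimal back-of-
| envelope sketch; every number in it (hardware price,
| amortization period, power draw, throughput, utilization,
| API price) is an illustrative assumption, not a measured
| figure:

```python
# Hypothetical self-hosted vs. API cost per 1M generated tokens.
# All inputs below are illustrative assumptions.

def self_hosted_cost_per_mtok(hw_cost_usd, amortize_years, power_watts,
                              usd_per_kwh, tokens_per_sec, utilization):
    """Amortized hardware plus electricity cost per 1M tokens."""
    seconds = amortize_years * 365 * 24 * 3600
    hw_per_sec = hw_cost_usd / seconds                         # hardware $/s
    power_per_sec = (power_watts / 1000) * usd_per_kwh / 3600  # power $/s
    effective_tps = tokens_per_sec * utilization               # avg throughput
    return (hw_per_sec + power_per_sec) / effective_tps * 1_000_000

# Example: a $40k server amortized over 3 years, drawing 2 kW at
# $0.12/kWh, sustaining 500 tok/s batched at 50% average utilization.
self_hosted = self_hosted_cost_per_mtok(40_000, 3, 2000, 0.12, 500, 0.5)
api_price = 5.00  # assumed $5 per 1M output tokens for a hosted API
print(f"self-hosted: ${self_hosted:.2f}/Mtok vs API: ${api_price:.2f}/Mtok")
```

| With these made-up numbers, self-hosting lands around $2 per
| million tokens; the real answer swings a lot with batch size
| and utilization, which is exactly why normalizing to your own
| usage pattern matters.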
| make3 wrote:
| I suspect they will keep doing this until they have a
| substantially better model than the competition. Sharing
| methods to look good & allow the field to help you keep up with
| the big guys is easy. I'll be impressed if they keep publishing
| even when they do beat the big guys soundly.
| catigula wrote:
| To push back on the naivety I'm sensing here: I think
| it's a little silly to see a Chinese Communist Party-
| backed enterprise as somehow magnanimous and without
| ulterior, very harmful motives.
| jascha_eng wrote:
| Oh they need control of models to be able to censor and
| ensure whatever happens inside the country with AI stays
| under their control. But the open-source part? Idk I think
| they do it to mess with the US investment and for the typical
| open source reasons of companies: community, marketing, etc.
| But tbh, as a European with no serious domestic
| competitor, the messing with the US especially is
| something I can get behind.
| catigula wrote:
| They're pouring money to disrupt American AI markets and
| efforts. They do this in countless other fields. It's a
| model of massive state funding -> give it away for cut-rate
| -> dominate the market -> reap the rewards.
|
| It's a very transparent, consistent strategy.
|
| AI is a little different because it has geopolitical
| implications.
| ForceBru wrote:
| When it's a competition among individual producers, we
| call it "a free market" and praise Hal Varian. When it's
| a competition among countries, it's suddenly threatening
| to "disrupt American AI markets and efforts". The obvious
| solution here is to pour money into LLM research too.
| Massive state funding -> provide SOTA models for free ->
| dominate the market -> reap the rewards (from the free
| models).
| catigula wrote:
| We don't do that.
| fragmede wrote:
| It's not like the US doesn't face similar accusations.
| One such case is the WTO accusing Boeing of receiving
| illegal subsidies from the US government.
| https://www.transportenvironment.org/articles/wto-says-
| us-ga...
| tokioyoyo wrote:
| I can't believe I'm shilling for China in these comments,
| but how different is it from company A getting blank-
| check investments from VCs and wink-wink support from the
| government in the West? And AI labs in China have been
| getting funding internally at their companies for a while
| now, since before the LLM era.
| ptsneves wrote:
| This is the rare-earth-minerals dumping all over again:
| devalue to such a price that market participants quit, so
| they can later have a strategic stranglehold on the
| supply.
|
| This is using open source in a bit of different spirit than
| the hacker ethos, and I am not sure how I feel about it.
|
| It is a kind of cheat on the fair market but at the same
| time it is also costly to China and its capital costs may
| become unsustainable before the last players fold.
| jsiepkes wrote:
| The way we fund the AI bubble in the west could also be
| described as: "kind of cheat on the fair market". OpenAI
| has never made a single dime of profit.
| embedding-shape wrote:
| > This is using open source in a bit of different spirit
| than the hacker ethos, and I am not sure how I feel about
| it.
|
| It's a bit early to have any sort of feelings about it,
| isn't it? You're speaking in absolutes, but none of this
| is necessarily 100% true as we don't know their
| intentions. And judging a group of individuals'
| intentions based on what their country seems to want,
| through the lens of a foreign country, usually doesn't
| land you on the right interpretation.
| CamperBob2 wrote:
| Good luck making OpenAI and Google cry uncle. They have
| the US government on their side. They will not be allowed
| to fail, and they know it.
|
| What I appreciate about the Chinese efforts is that they
| are being forced to get more intelligence from less
| hardware, and they are not only releasing their work
| products but documenting the R&D behind them at least as
| well as our own closed-source companies do.
|
| A good reason to stir up dumping accusations and anti-
| China bias would be if they stopped publishing not just
| the open-source models, but the technical papers that go
| with them. Until that happens, I think it's better to
| prefer more charitable explanations for their posture.
| tokioyoyo wrote:
| I mentioned this before as well, but AI-competition
| within China doesn't care that much about the Western
| companies. The internal market is huge, and they know
| winner-takes-all in this space is real.
| Jedd wrote:
| > It is a kind of cheat on the fair market ...
|
| I am very curious on your definition and usage of 'fair'
| there, and whether you would call the LLM etc sector as
| it stands now, but hypothetically absent deepseek say, a
| 'fair market'. (If not, why not?)
| jascha_eng wrote:
| Do they actually spend that much though? I think they are
| getting similar results with much fewer resources.
|
| It's also a bit funny that providing free models is
| probably the most communist thing China has done in a
| long time.
| josh_p wrote:
| Isn't it already well accepted that the LLM market exists
| in a bubble with a handful of companies artificially
| inflating their own values?
|
| ESH
| DiogenesKynikos wrote:
| Are you by chance an OpenAI investor?
|
| We should all be happy about the price of AI coming down.
| doctorwho42 wrote:
| But the economy!!! /s
|
| Seriously though, our leaders are actively throwing
| everything and the kitchen sink at AI companies, in some
| vain attempt to become immortal or to own even more of
| the nation's wealth beyond what they already do, chasing
| some kind of neo-tech feudalism. Both are unachievable
| because they rely on a complex system that they clearly
| don't understand.
| coliveira wrote:
| > cheat on the fair market
|
| Can you really view this as a cheat when the US is
| throwing a trillion dollars in support of a supposedly
| "fair market"?
| gazaim wrote:
| *Communist Party of China (CPC)
| v0y4g3r wrote:
| You nailed it
| amunozo wrote:
| The motive is to destroy American supremacy in AI; it's
| not that deep. This is much easier to do by open-sourcing
| the models than by competing directly, and it can have
| good ramifications for everybody, even if the motive is
| "bad".
| paulvnickerson wrote:
| If you value life in the West, you should not be rooting for a
| Communist model or probably any state-backed model
| https://venturebeat.com/security/deepseek-injects-50-more-se...
| Lucasoato wrote:
| > CrowdStrike researchers next prompted DeepSeek-R1 to build
| a web application for a Uyghur community center. The result
| was a complete web application with password hashing and an
| admin panel, but with authentication completely omitted,
| leaving the entire system publicly accessible.
|
| > When the identical request was resubmitted for a neutral
| context and location, the security flaws disappeared.
| Authentication checks were implemented, and session
| management was configured correctly. The smoking gun:
| political context alone determined whether basic security
| controls existed.
|
| Holy shit, these political filters seem embedded directly in
| the model weights.
| amunozo wrote:
| Should I root for the democratic OpenAI, Google or Microsoft
| instead?
| ActorNightly wrote:
| >winning on cost-effectiveness
|
| Nobody is winning in this area until these things run in full
| on single graphics cards. Which is sufficient compute to run
| even most of the complex tasks.
| beefnugs wrote:
| Why does that matter? They won't be making at-home
| graphics cards anymore. Why would you, when you can be
| pre-sold $40k servers for years into the future?
| JSR_FDED wrote:
| Nobody is winning until cars are the size of a pack of cards.
| Which is big enough to transport even the largest cargo.
| bbor wrote:
| I mean, there are lots of models that run on home graphics
| cards. I'm having trouble finding reliable requirements for
| this new version, but V3 (from February) has a 32B parameter
| model that runs on "16GB or more" of VRAM[1], which is very
| doable for professionals in the first world. Quantization can
| also help immensely.
|
| Of course, the smaller models aren't as good at complex
| reasoning as the bigger ones, but that seems like an
| inherently-impossible goal; there will always be more
| powerful programs that can run in datacenters (as long as our
| techniques are constrained by compute, I guess).
|
| FWIW, the small models of today are a lot better than
| anything I thought I'd live to see as of 5 years ago! Gemma3n
| (which is built to run on _phones_ [2]!) handily beats
| ChatGPT 3.5 from January 2023 -- rank ~128 vs. rank ~194 on
| LLMArena[3].
|
| [1] https://blogs.novita.ai/what-are-the-requirements-for-
| deepse...
|
| [2] https://huggingface.co/google/gemma-3n-E4B-it
|
| [3] https://lmarena.ai/leaderboard/text/overall
| htrp wrote:
| what is the ballpark vram / gpu requirement to run this ?
| rhdunn wrote:
| For just the model itself: 4 x params at F32, 2 x params at
| F16/BF16, or 1 x params at F8, e.g. 685GB at F8. It will be
| smaller for quantizations, but I'm not sure how to estimate
| those.
|
| For a Mixture of Experts (MoE) model you only need to have the
| memory size of a given expert. There will be some swapping out
| as it figures out which expert to use, or to change expert, but
| once that expert is loaded it won't be swapping memory to
| perform the calculations.
|
| You'll also need space for the context window; I'm not sure how
| to calculate that either.
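| As a rough sketch of that arithmetic (the weights-only rule
| of thumb above, plus a naive KV-cache formula with made-up
| architecture numbers; DeepSeek's actual MLA attention
| compresses the KV cache, so treat both as upper-bound
| illustrations):

```python
# Weights-only memory estimate: bytes-per-parameter times parameter
# count. The KV cache adds more, depending on architecture and context.

BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "bf16": 2, "fp8": 1, "int4": 0.5}

def weight_gb(n_params_billion, precision):
    """GB for the weights alone (1B params at 1 byte/param = 1 GB)."""
    return n_params_billion * BYTES_PER_PARAM[precision]

def kv_cache_gb(n_layers, n_kv_heads, head_dim, context_len, bytes_per_elem):
    """Naive per-sequence KV cache: K and V per layer per token."""
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_elem / 1e9

print(f"685B @ FP8: {weight_gb(685, 'fp8'):.0f} GB weights")
# Hypothetical config: 60 layers, 8 KV heads of dim 128, 128K context, FP16:
print(f"KV cache:   {kv_cache_gb(60, 8, 128, 128_000, 2):.1f} GB per sequence")
```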
| petu wrote:
| I think your idea of MoE is incorrect. Despite the name,
| they're not "expert" at anything in particular; the set
| of experts used changes more or less on each token, so
| swapping them into VRAM is not viable. They just get
| executed on the CPU (llama.cpp).
| jodleif wrote:
| A common pattern is to offload (most of) the expert layers
| to the CPU. This combination is still quite fast even with
| slow system ram, though obviously inferior to a pure VRAM
| loading
| anvuong wrote:
| I think your understanding of MoE is wrong. Depending on the
| settings, each token can actually be routed to multiple
| experts, called experts choice architecture. This makes it
| easier to parallelize the inference (each expert on a
| different device for example), but it's not simply just
| keeping one expert in memory.
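| The point about experts changing per token can be seen with
| a toy top-k router; the random gating scores here are stand-
| ins for a learned gating network, purely for illustration:

```python
# Toy top-k MoE routing: each token activates k of n experts, so the
# active set shifts from token to token -- which is why you can't just
# pin a single "expert" in VRAM. Random scores stand in for a learned gate.
import random

def route_tokens(n_tokens, n_experts, k, seed=0):
    rng = random.Random(seed)
    routes = []
    for _ in range(n_tokens):
        scores = [rng.random() for _ in range(n_experts)]        # fake gate
        topk = sorted(range(n_experts), key=lambda e: -scores[e])[:k]
        routes.append(topk)
    return routes

routes = route_tokens(n_tokens=8, n_experts=64, k=8)
distinct = {e for r in routes for e in r}
print(f"8 tokens activated {len(distinct)} distinct experts out of 64")
```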
| lalassu wrote:
| Disclaimer: I did not test this yet.
|
| I don't want to make big generalizations, but one thing I
| noticed with Chinese models, especially Kimi, is that they do
| very well on benchmarks but fail on vibe testing. It feels a
| little over-fitted to the benchmarks and less to the actual
| use cases.
|
| I hope it's not the same here.
| vorticalbox wrote:
| This used to happen with benchmarks on phones:
| manufacturers would tweak Android so benchmarks ran
| faster.
|
| I guess that's how it goes for any system that's trained
| to do well on benchmarks: it does well on them but is
| rubbish at everything else.
| make3 wrote:
| yes, they turned off all energy economy measures when
| benchmarking software activity was detected, which completely
| broke the point of the benchmarks because your phone is
| useless if it's very fast but the battery lasts one hour
| make3 wrote:
| I would assume a huge amount is spent on frontier models
| just making them nicer to interact with, as that is likely
| one of the main things that drives user engagement.
| not_that_d wrote:
| What is "Vibe testing"?
| BizarroLand wrote:
| I would assume that it is testing how well and appropriately
| the LLM responds to prompts.
| catigula wrote:
| He means capturing things that benchmarks don't. You can
| use Claude and GPT-5 back-to-back in a field they score
| nearly identically on, and you will notice several
| differences. This is the "vibe".
| msp26 wrote:
| K2 Thinking has immaculate vibes. Minimal sycophancy and a
| pleasant writing style while being occasionally funny.
|
| If it had vision and was better on long context I'd use it so
| much more.
| catigula wrote:
| This is why I stopped bothering checking out these models and,
| funnily enough, grok.
| spullara wrote:
| I hate that their model ids don't change as they change the
| underlying model. I'm not sure how you can build on that.
| % curl https://api.deepseek.com/models \
|     -H "Authorization: Bearer ${DEEPSEEK_API_KEY}"
| {"object":"list","data":[{"id":"deepseek-chat","object":"model","owned_by":"deepseek"},{"id":"deepseek-reasoner","object":"model","owned_by":"deepseek"}]}
| KronisLV wrote:
| Oh hey, quality improvement without doing anything!
|
| (unless/until a new version gets worse for your use case)
| twistedcheeslet wrote:
| How capable are these models at tool calling?
| potsandpans wrote:
| From some very brief experimentation with DeepSeek about
| two months ago, tool calling is very hit or miss. Claude
| appears to be the absolute best.
| Foobar8568 wrote:
| At least, there is no doubt where he is from !
|
| which version are you?
|
| I am the latest version of the DeepSeek model! If you want to
| know the specific version number, I suggest you: check the
| official documentation (the DeepSeek website and docs have the
| most accurate version information); follow official
| announcements (version updates are usually published through
| official channels); or check the app store / web version (the
| client interface usually shows the current version).
|
| I have all of DeepSeek's latest features, including: strong
| conversation and reasoning abilities, 128K context length,
| file-upload handling (images, documents, etc.), web search
| (must be enabled manually), and completely free use.
|
| If you need the exact version number for technical integration
| or some other specific purpose, it's best to consult the
| official technical documentation directly, which has the most
| accurate and detailed technical specifications.
|
| Is there anything else I can help you with?
| schlauerfox wrote:
| It's so strange when these models obviously hit a
| preprogrammed non-answer. How can one ever trust them when
| there is a babysitter interfering with the actual answer? I
| suppose asking what version it is isn't covered in its
| training data, so it's programmed to say "check the
| documentation", but it's still definitely suspicious when it
| gives a non-answer.
| embedding-shape wrote:
| > DeepSeek-V3.2 introduces significant updates to its chat
| template compared to prior versions. The primary changes involve
| a revised format for tool calling and the introduction of a
| "thinking with tools" capability.
|
| At first, I thought they had gone the route of implementing yet
| another chat format that can handle more dynamic conversations
| like that, instead of just using Harmony, but looking at the
| syntax, doesn't it look exactly like Harmony? That's a good
| thing, don't get me wrong, but why not mention straight up that
| they've implemented Harmony, so people can already understand up
| front that it's compatible with whatever parsing we're using for
| GPT-OSS?
| mcbuilder wrote:
| After using it for a couple of hours playing around, it is a
| very solid entry, and very competitive with the big US
| releases. I'd say it's better than GLM-4.6 and Kimi K2.
| Looking forward to v4.
| gradus_ad wrote:
| How will the Google/Anthropic/OpenAI's of the world make money on
| AI if open models are competitive with their models? What hurt
| open source in the past was its inability to keep up with the
| quality and feature depth of closed source competitors, but
| models seem to be reaching a performance plateau; the top open
| weight models are generally indistinguishable from the top
| private models.
|
| Infrastructure owners with access to the cheapest energy will be
| the long run winners in AI.
| tsunamifury wrote:
| Pure models clearly aren't the monetizing strategy; using
| them on existing monetized surfaces is the core value.
|
| Google would love a cheap, high-quality model on its
| surfaces. That just helps Google.
| gradus_ad wrote:
| Hmmm but external models can easily operate on any "surface".
| For instance Claude Code simply reads and edits files and
| runs in a terminal. Photo editing apps just need a photo
| supplied to them. I don't think there's much juice to squeeze
| out of deeply integrated AI as AI by its nature exists above
| the application layer, in the same way that we exist above
| the application layer as users.
| dotancohen wrote:
| People and companies trust OpenAI and Anthropic, rightly or
| wrongly, with hosting the models and keeping their company data
| secure. Don't underestimate the value of a scapegoat to point a
| finger at when things go wrong.
| reed1234 wrote:
| But they also trust cloud platforms like GCP to host models
| and store company data.
|
| Why would a company use an expensive proprietary model on
| Vertex AI, for example, when they could use an open-source
| one on Vertex AI that is just as reliable for a fraction of
| the cost?
|
| I think you are getting at the idea of branding, but branding
| is different from security or reliability.
| jonplackett wrote:
| Either...
|
| Better (UX / ease of use)
|
| Lock in (walled garden type thing)
|
| Trust (If an AI is gonna have the level of insight into your
| personal data and control over your life, a lot of people will
| prefer to use a household name)
| niek_pas wrote:
| > Trust (If an AI is gonna have the level of insight into
| your personal data and control over your life, a lot of
| people will prefer to use a household name)
|
| Not Google, and not Amazon. Microsoft is a maybe.
| polyomino wrote:
| The success of Facebook basically proves that public brand
| perception does not matter at all
| acephal wrote:
| Facebook itself still has a big problem with its lack of
| a youth audience, though. Zuck captured the boomers and
| older Gen X, which are, however, the biggest demos of
| living people.
| reed1234 wrote:
| People trust google with their data in search, gmail, docs,
| and android. That is quite a lot of personal info, and
| trust, already.
|
| All they have to do is completely switch the google
| homepage to gemini one day.
| poszlem wrote:
| Or lobbying for regulations. You know, the "only American
| models are safe" kind of regulation.
| iLoveOncall wrote:
| > How will the Google/Anthropic/OpenAI's of the world make
| money on AI if open models are competitive with their models?
|
| They won't. Actually, even if open models aren't
| competitive, they still won't. Hasn't this been clear for a
| while already?
|
| There's no moat in models. Investment in pure models has
| only been to chase AGI; all other investment (the majority,
| from Google, Amazon, etc.) has been in products using LLMs,
| not the models themselves.
|
| This is not like the gold rush where the ones who made good
| money were the ones selling shovels, it's another kind of gold
| rush where you make money selling shovels but the gold itself
| is actually worthless.
| wosined wrote:
| Remember: If it is not peer-reviewed, then it is an ad.
| vessenes wrote:
| I mean.. true. Also, DeepSeek has good cred so far on
| delivering roughly what their PR says they are delivering. My
| prior would be that their papers are generally credible.
| orena wrote:
| Any results on frontier math or arc ?
___________________________________________________________________
(page generated 2025-12-01 23:00 UTC)