[HN Gopher] GPU-rich labs have won: What's left for the rest of ...
___________________________________________________________________
GPU-rich labs have won: What's left for the rest of us is
distillation
Author : npmipg
Score : 48 points
Date : 2025-08-08 19:29 UTC (3 hours ago)
(HTM) web link (inference.net)
(TXT) w3m dump (inference.net)
| madars wrote:
| The blog kept redirecting to the home page after a second, so
| here's an archive: https://archive.is/SE78v
| ilaksh wrote:
| There is huge pressure to prove and scale radical alternative
| paradigms like memory-centric compute (memristors, for example)
| or SNNs. That's why I am surprised we don't hear more about very
| large speculative investments in these directions to dramatically
| multiply AI compute efficiency.
|
| But one has to imagine that seeing so many huge datacenters go up
| and not being able to do training runs etc. is motivating a lot
| of researchers to try things that are really different. At least
| I hope so.
|
| It seems pretty short-sighted that funding for memristor startups
| (for example) has been so low so far.
|
| Anyway, assuming that within the next several years more
| radically different AI hardware and AI architecture paradigms pay
| off in efficiency gains, the current situation will change. Fully
| human level AI will be commoditized, and training will be well
| within the reach of small companies.
|
| I think we should anticipate this given the strong need to
| increase efficiency dramatically, the number of existing research
| programs, the amount of investment in AI overall, and a history
| of computation full of dramatic paradigm shifts.
|
| So anyway "the rest of us" I think should be banding together and
| making much larger bets on proving and scaling radical new AI
| hardware paradigms.
| sidewndr46 wrote:
| I think a pretty good chunk of HP's history explains why
| memristors don't get used in a commercial capacity.
| ofrzeta wrote:
| You remember The Machine? I had a vague memory but I had to
| look it up.
| michelpp wrote:
| Not sure why this is being downvoted, it's a thoughtful
| comment. I too see this crisis as an opportunity to push
| boundaries past current architectures. Sparse models for
| example show a lot of promise and more closely track real
| biological systems. The human brain has an estimated graph
| density of 0.0001 to 0.001. Advances in sparse computing
| libraries and new hardware architectures could be key to
| achieving this kind of efficiency.
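| A rough back-of-envelope of what that density range implies,
| applying the low-end estimate to a toy network (the sizes below
| are made up for illustration, not taken from any real system):
|
|     # Storage for a connectivity matrix at brain-like sparsity.
|     n = 1_000_000              # toy neuron count (brain: ~8.6e10)
|     density = 1e-4             # low end of the quoted range
|
|     nnz = int(n * n * density)           # nonzero connections
|     dense_gb = n * n * 4 / 1e9           # fp32 weight for every pair
|     sparse_gb = nnz * (4 + 4) / 1e9      # fp32 value + int32 index
|
|     print(f"connections: {nnz:,}")
|     print(f"dense:  {dense_gb:,.0f} GB")
|     print(f"sparse: {sparse_gb:,.1f} GB "
|           f"({dense_gb / sparse_gb:.0f}x smaller)")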
| lazide wrote:
| Memristors have been tried for literally decades.
|
| If the poster's other guesses pay out at the same rate, this will
| likely never play out.
| ilaksh wrote:
| Other technologies tried for decades before becoming huge:
| Neural-network AI; Electric cars; mRNA vaccines; Solar
| photovoltaics; LED lighting
| lazide wrote:
| Ho boy, should we start listing the 10x number of things
| that went in the wastebasket too?
| ToValueFunfetti wrote:
| If I only have to try 11 things for one of them to be LED
| lights or electric cars, I'd better get trying. Sure, I
| might have to empty a wastebasket at some point, but I'll
| just pay someone for that.
| kelipso wrote:
| There was a bit of noise regarding spiking neural networks
| a few years ago but now I am not seeing it so often
| anymore.
| thekoma wrote:
| Even in that scenario, what would stop the likes of OpenAI from
| throwing $50M+ a day at the new way of doing things and still
| outcompeting the smaller fry?
| hnuser123456 wrote:
| >memory-centric compute
|
| This already exists: https://www.cerebras.ai/chip
|
| They claim 44 GB of SRAM at 21 PB/s.
| cma wrote:
| They use separate memory servers: networked memory adjacent to
| compute that has only small amounts of fast local memory.
|
| Waferscale severely limits bandwidth once you go beyond SRAM,
| because with far less chip perimeter per unit area there is less
| room to hook up IO.
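| A quick sketch of the perimeter-vs-area point, with assumed,
| approximate die dimensions (off-chip IO generally has to sit at
| the edge of the part, so IO scales with perimeter while compute
| scales with area):
|
|     def edge_per_area(side_mm: float) -> float:
|         """Perimeter-to-area ratio of a square die, in 1/mm."""
|         return 4 * side_mm / (side_mm ** 2)
|
|     reticle_die = 22.0    # ~reticle-limited die edge (assumed)
|     wafer_scale = 215.0   # ~wafer-scale part edge (approximate)
|
|     ratio = edge_per_area(reticle_die) / edge_per_area(wafer_scale)
|     print(f"~{ratio:.0f}x more edge per unit area on the small die,"
|           f" i.e. more room for off-chip IO per unit of compute")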
| marcosdumay wrote:
| Memristors in particular just won't happen.
|
| But memory-centric compute didn't happen because of Moore's law.
| (SNNs have the problem that we don't actually know how to use
| them.) Now that Moore's law is gone, memory-centric compute may
| have a chance, but it still takes a large amount of money thrown
| at the idea, and the people with money are so risk-averse that
| they create entirely new risks for themselves.
|
| Forward neural networks were very lucky that there existed a
| mainstream use for the kind of hardware they needed.
| latchkey wrote:
| Not a fan of fear based marketing: "The whole world is too big
| and expensive for you to participate in, so use our service
| instead"
|
| I'd rather approach these things from the PoV of: "We use
| distillation to solve your problems today"
|
| The last sentence kind of says it all: "If you have 30k+/mo in
| model spend, we'd love to chat."
| 42lux wrote:
| We haven't seen a proper NPU yet, and we are only at the launch
| of the first consumer-grade unified architectures from Nvidia and
| AMD. The battle of homebrew AI hasn't even started yet.
| stego-tech wrote:
| Hell, we haven't even seen actual AI yet. This is all just
| brute-forcing likely patterns of tokens based on a corpus of
| existing material, not anything brand new or particularly
| novel. Who would've guessed that giving CompSci and Mathematics
| researchers billions of dollars in funding and millions of GPUs
| in parallel without the usual constraints of government
| research would produce the most expensive brute-force
| algorithms in human history?
|
| I still believe this is going to be an embarrassing chapter of
| the history of AI when we actually do create it. "Humans - with
| the sort of hubris only a neoliberal post-war boom period could
| produce - honestly thought their first serious development in
| computing (silicon-based microprocessors) would lead to
| Artificial General Intelligence and usher in a utopia of the
| masses. Instead they squandered their limited resources on a
| Fool's Errand, ignoring more important crises that would have
| far greater impacts on their immediate prosperity in the naive
| belief they could create a Digital God from Silicon and
| Electricity alone."
| braooo wrote:
| Yeh. We're still barely beyond the first few pixels that make up
| the bottom tail of the S-curve for the autonomous-type AI
| everyone imagines.
|
| Energy models and other substrates are going to be key, and they
| have nothing to do with text at all, as human intelligence
| existed before language. It's Newspeak to run a chatbot on what
| is obviously a computer and call it an intelligence like a human.
| 1984-like dystopia crap.
| YetAnotherNick wrote:
| DeepSeek's main run cost $6M. qwen3-30b-a3b, which is ranked
| 13th, would probably cost a few $100Ks.
|
| The GPU cost of the final training run isn't the biggest chunk of
| the cost, and you can probably replicate the results of models
| like Llama 3 very cheaply. It's the cost of experiments,
| researchers, and data collection that pushes the overall cost 1
| or 2 orders of magnitude higher.
| ilaksh wrote:
| What's your source for any of that? I think the $6 million
| thing was identified as a lie they felt was necessary because
| of GPU export laws.
| YetAnotherNick wrote:
| It wasn't a lie; it was a misrepresentation of the total cost.
| It's not hard to calculate the cost of the training run, though.
| It takes roughly 6 * active parameters * tokens FLOPs [1]. To get
| the number of GPU-seconds, divide by FLOPs/s * MFU, where MFU is
| around 45% on H100s for large enough models [2].
|
| [1]: https://arxiv.org/abs/2001.08361
|
| [2]: https://github.com/facebookresearch/lingua
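| A minimal sketch of that estimate, plugging in DeepSeek-V3's
| publicly reported parameter and token counts; the H100 peak
| throughput and rental price below are assumptions, so this is
| illustrative arithmetic rather than an exact reconstruction:
|
|     # FLOPs ~= 6 * active_params * tokens
|     # GPU-hours ~= FLOPs / (peak FLOPs/s * MFU) / 3600
|     active_params = 37e9     # ~37B activated parameters (reported)
|     tokens        = 14.8e12  # ~14.8T training tokens (reported)
|     peak_flops    = 989e12   # H100 dense BF16 peak, approximate
|     mfu           = 0.45     # MFU assumed in the comment above
|     usd_per_hour  = 2.0      # assumed H100 rental price
|
|     train_flops = 6 * active_params * tokens
|     gpu_hours = train_flops / (peak_flops * mfu) / 3600
|     print(f"{train_flops:.2e} FLOPs, {gpu_hours:,.0f} GPU-hours,"
|           f" ~${gpu_hours * usd_per_hour / 1e6:.1f}M")
|
| That lands in the low single-digit millions of dollars, the same
| order of magnitude as the $6M figure discussed above.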
| muratsu wrote:
| If I'm understanding this correctly, we should see some great
| coding LLMs. Idk, they could be as limited as a single stack,
| e.g. the laravel/nextjs ecosystem.
| thomassmith65 wrote:
| Perhaps one of these days a random compsci undergrad will come up
| with a DeepSeek-calibre optimization.
|
| Just imagine his or her 'ChatGPT with 10,000x fewer propagations'
| Reddit post appearing on a Monday...
|
| ...and $3 trillion of Nvidia stock going down the drain by
| Friday.
| therealpygon wrote:
| One can only hope. Maybe then they'll sell us GPUs with 2025
| quantities of memory instead of 2015's.
| ilaksh wrote:
| DeepSeek came up with several significant optimizations, not
| just one. And master's students do contribute to leading edge
| research all the time.
|
| There have really been many significant innovations in
| hardware, model architecture, and software, allowing companies
| to keep up with soaring demand and expectations.
|
| But that's always how it's been in high technology. You only
| really hear about the biggest shifts, but the optimizations are
| continuous.
| thomassmith65 wrote:
| True, but I chose the words 'ChatGPT' and 'optimization' for
| brevity. There are many more eyes on machine learning since
| ChatGPT came along, and there could be simpler techniques yet to
| be discovered. What boggles the mind is the $4 trillion parked in
| Nvidia stock, which would be wasted if more efficient code
| lessens the need for expensive GPUs.
| tudorw wrote:
| Tropical Distillation?
| ripped_britches wrote:
| $50M per day is insane! Any link supporting that?
___________________________________________________________________
(page generated 2025-08-08 23:00 UTC)