[HN Gopher] GPU-rich labs have won: What's left for the rest of ...
       ___________________________________________________________________
        
       GPU-rich labs have won: What's left for the rest of us is
       distillation
        
       Author : npmipg
       Score  : 48 points
       Date   : 2025-08-08 19:29 UTC (3 hours ago)
        
 (HTM) web link (inference.net)
 (TXT) w3m dump (inference.net)
        
       | madars wrote:
       | The blog kept redirecting to the home page after a second, so
       | here's an archive: https://archive.is/SE78v
        
       | ilaksh wrote:
        | There is huge pressure to prove and scale radical alternative
        | paradigms like memory-centric compute (memristors, for
        | example) or SNNs, which is why I am surprised we don't hear a
        | lot about very large speculative investments in these
        | directions to dramatically multiply AI compute efficiency.
       | 
       | But one has to imagine that seeing so many huge datacenters go up
       | and not being able to do training runs etc. is motivating a lot
       | of researchers to try things that are really different. At least
       | I hope so.
       | 
        | It seems pretty short-sighted that the funding numbers for
        | memristor startups (for example) are so low so far.
       | 
       | Anyway, assuming that within the next several years more
       | radically different AI hardware and AI architecture paradigms pay
       | off in efficiency gains, the current situation will change. Fully
       | human level AI will be commoditized, and training will be well
       | within the reach of small companies.
       | 
       | I think we should anticipate this given the strong level of need
       | to increase efficiency dramatically, the number of existing
       | research programs, the amount of investment in AI overall, and
       | the history of computation that shows numerous dramatic paradigm
       | shifts.
       | 
       | So anyway "the rest of us" I think should be banding together and
       | making much larger bets on proving and scaling radical new AI
       | hardware paradigms.
        
         | sidewndr46 wrote:
         | I think a pretty good chunk of HP's history explains why
         | memristors don't get used in a commercial capacity.
        
           | ofrzeta wrote:
           | You remember The Machine? I had a vague memory but I had to
           | look it up.
        
         | michelpp wrote:
          | Not sure why this is being downvoted; it's a thoughtful
          | comment. I too see this crisis as an opportunity to push
          | boundaries past current architectures. Sparse models, for
          | example, show a lot of promise and more closely track real
          | biological systems. The human brain has an estimated graph
          | density of 0.0001 to 0.001. Advances in sparse computing
          | libraries and new hardware architectures could be key to
          | achieving this kind of efficiency.
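          | 
          | A rough back-of-the-envelope sketch of what that density
          | figure buys you, in Python (the layer size and density below
          | are illustrative assumptions, not measurements):
          | 
          |     # Multiply-adds for a dense vs. sparse matrix-vector
          |     # product; all numbers are illustrative.
          |     n = 100_000                  # rows and columns
          |     density = 0.001              # upper end of the estimate
          | 
          |     dense_macs = n * n                   # every weight used
          |     sparse_macs = int(n * n * density)   # only nonzeros
          | 
          |     print(f"dense:  {dense_macs:,} MACs")
          |     print(f"sparse: {sparse_macs:,} MACs "
          |           f"({dense_macs // sparse_macs}x fewer)")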
        
           | lazide wrote:
           | Memristors have been tried for literally decades.
           | 
            | If the poster's other guesses pay off at the same rate,
            | this will likely never play out.
        
             | ilaksh wrote:
             | Other technologies tried for decades before becoming huge:
             | Neural-network AI; Electric cars; mRNA vaccines; Solar
             | photovoltaics; LED lighting
        
               | lazide wrote:
               | Ho boy, should we start listing the 10x number of things
               | that went in the wastebasket too?
        
               | ToValueFunfetti wrote:
               | If I only have to try 11 things for one of them to be LED
               | lights or electric cars, I'd better get trying. Sure, I
               | might have to empty a wastebasket at some point, but I'll
               | just pay someone for that.
        
             | kelipso wrote:
              | There was a bit of noise about spiking neural networks a
              | few years ago, but I don't see it mentioned so often
              | anymore.
        
         | thekoma wrote:
          | Even in that scenario, what would stop the likes of OpenAI
          | from throwing 50M+ a day at the new way of doing things and
          | still outcompeting the smaller fry?
        
         | hnuser123456 wrote:
         | >memory-centric compute
         | 
         | This already exists: https://www.cerebras.ai/chip
         | 
         | They claim 44 GB of SRAM at 21 PB/s.
        
           | cma wrote:
            | They use separate memory servers: networked memory
            | adjacent to the compute, with small amounts of fast local
            | memory.
            | 
            | Waferscale severely limits bandwidth once you go beyond
            | SRAM, because with far less chip perimeter per unit area
            | there is less room to hook up IO.
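            | 
            | A rough sketch of that perimeter-vs-area scaling in Python
            | (the die sizes are approximate assumptions, roughly a
            | reticle-limited GPU die vs. a wafer-scale part, not vendor
            | figures):
            | 
            |     import math
            | 
            |     # Approximate silicon areas in mm^2 (assumed).
            |     reticle_die = 814       # ~H100-class die
            |     wafer_scale = 46_000    # ~wafer-scale part
            | 
            |     # Treat both as squares: perimeter grows with
            |     # sqrt(area), so IO edge lags total silicon.
            |     area_ratio = wafer_scale / reticle_die
            |     perim_ratio = (math.sqrt(wafer_scale)
            |                    / math.sqrt(reticle_die))
            | 
            |     print(f"area: {area_ratio:.0f}x, "
            |           f"perimeter: {perim_ratio:.1f}x")
            |     print(f"IO edge per unit area: "
            |           f"{perim_ratio / area_ratio:.2f}x")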
        
         | marcosdumay wrote:
         | Memristors in particular just won't happen.
         | 
          | But memory-centric compute didn't happen because of Moore's
          | law. (SNNs have the problem that we don't actually know how
          | to use them.) Now that Moore's law is gone, memory-centric
          | compute may have a chance, but it still takes a large amount
          | of money thrown at the idea, and the people with money are
          | so risk-averse that they create entire new risks for
          | themselves.
         | 
          | Feedforward neural networks were very lucky that a
          | mainstream use already existed for the kind of hardware they
          | needed.
        
       | latchkey wrote:
        | Not a fan of fear-based marketing: "The whole world is too big
       | and expensive for you to participate in, so use our service
       | instead"
       | 
       | I'd rather approach these things from the PoV of: "We use
       | distillation to solve your problems today"
       | 
       | The last sentence kind of says it all: "If you have 30k+/mo in
       | model spend, we'd love to chat."
        
       | 42lux wrote:
        | We haven't seen a proper NPU yet, and we are only now seeing
        | the launch of the first consumer-grade unified architectures
        | from Nvidia and AMD. The battle of homebrew AI hasn't even
        | started yet.
        
         | stego-tech wrote:
         | Hell, we haven't even seen actual AI yet. This is all just
         | brute-forcing likely patterns of tokens based on a corpus of
         | existing material, not anything brand new or particularly
         | novel. Who would've guessed that giving CompSci and Mathematics
         | researchers billions of dollars in funding and millions of GPUs
         | in parallel without the usual constraints of government
         | research would produce the most expensive brute-force
         | algorithms in human history?
         | 
         | I still believe this is going to be an embarrassing chapter of
         | the history of AI when we actually do create it. "Humans - with
         | the sort of hubris only a neoliberal post-war boom period could
         | produce - honestly thought their first serious development in
          | computing (silicon-based microprocessors) would lead to
         | Artificial General Intelligence and usher in a utopia of the
         | masses. Instead they squandered their limited resources on a
         | Fool's Errand, ignoring more important crises that would have
         | far greater impacts on their immediate prosperity in the naive
         | belief they could create a Digital God from Silicon and
         | Electricity alone."
        
           | braooo wrote:
            | Yeah. We're still barely beyond the first few pixels that
            | make up the bottom tail of the S-curve for the
            | autonomous-type AI everyone imagines.
           | 
            | Energy models and other substrates are going to be key,
            | and it has nothing to do with text at all, as human
            | intelligence existed before language. It's Newspeak to run
            | a chatbot on what is obviously a computer and call it an
            | intelligence like a human. 1984-like dystopia crap.
        
       | YetAnotherNick wrote:
        | DeepSeek's main run cost $6M. qwen3-30b-a3b, which is ranked
        | 13th, would probably cost a few $100Ks.
       | 
        | The GPU cost of the final model training isn't the biggest
        | chunk of the cost, and you can probably replicate the results
        | of models like Llama 3 very cheaply. It's the cost of
        | experiments, researchers, and data collection that brings the
        | overall cost 1 or 2 orders of magnitude higher.
        
         | ilaksh wrote:
         | What's your source for any of that? I think the $6 million
         | thing was identified as a lie they felt was necessary because
         | of GPU export laws.
        
           | YetAnotherNick wrote:
            | It wasn't a lie; it was a misrepresentation of the total
            | cost. It's not hard to calculate the cost of the training
            | run, though. It takes roughly 6 * active parameters *
            | tokens FLOPs[1]. To get the number of seconds, you divide
            | by FLOP/s * MFU, where MFU is around 45% on an H100 for
            | large enough models[2].
           | 
           | [1]: https://arxiv.org/abs/2001.08361
           | 
           | [2]: https://github.com/facebookresearch/lingua
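            | 
            | A minimal back-of-the-envelope sketch of that calculation
            | in Python. The parameter count, token count, peak FLOP/s,
            | and GPU-hour price below are illustrative assumptions (not
            | figures from this thread), and peak FLOP/s is taken per
            | GPU, so the result comes out in GPU-hours:
            | 
            |     # Training cost from the 6 * N * D rule of thumb [1].
            |     def training_cost(active_params, tokens,
            |                       peak_flops=989e12,  # H100-ish BF16
            |                       mfu=0.45,           # per [2]
            |                       usd_per_gpu_hour=2.0):
            |         total_flops = 6 * active_params * tokens
            |         sustained = peak_flops * mfu  # FLOP/s per GPU
            |         gpu_hours = total_flops / sustained / 3600
            |         return gpu_hours, gpu_hours * usd_per_gpu_hour
            | 
            |     # Assumed ~37e9 active params and ~15e12 tokens gives
            |     # roughly 2M GPU-hours, i.e. a few $M at $2/GPU-hour.
            |     hours, cost = training_cost(37e9, 15e12)
            |     print(f"{hours:,.0f} GPU-hours, ~${cost:,.0f}")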
        
       | muratsu wrote:
        | If I'm understanding this correctly, we should see some great
        | coding LLMs. Idk, they could be as limited as a single stack,
        | e.g. the laravel/nextjs ecosystem.
        
       | thomassmith65 wrote:
        | Perhaps one of these days a random compsci undergrad will come
        | up with a DeepSeek-calibre optimization.
       | 
       | Just imagine his or her 'ChatGPT with 10,000x fewer propagations'
       | Reddit post appearing on a Monday...
       | 
       | ...and $3 trillion of Nvidia stock going down the drain by
       | Friday.
        
         | therealpygon wrote:
          | One can only hope. Maybe then they'll sell us GPUs with 2025
          | quantities of memory instead of 2015 quantities.
        
         | ilaksh wrote:
          | DeepSeek came up with several significant optimizations, not
          | just one. And master's students do contribute to
          | leading-edge research all the time.
         | 
         | There have really been many significant innovations in
         | hardware, model architecture, and software, allowing companies
         | to keep up with soaring demand and expectations.
         | 
         | But that's always how it's been in high technology. You only
         | really hear about the biggest shifts, but the optimizations are
         | continuous.
        
           | thomassmith65 wrote:
           | True, but I chose the words 'ChatGPT' and 'optimization' for
           | brevity. There are many more eyes on machine learning since
           | ChatGPT came along. There could be simpler techniques yet to
            | discover. What boggles the mind is the $4 trillion parked
            | in Nvidia stock, which would be wasted if more efficient
            | code lessens the need for expensive GPUs.
        
       | tudorw wrote:
       | Tropical Distillation?
        
       | ripped_britches wrote:
       | 50m per day is insane! Any link supporting that?
        
       ___________________________________________________________________
       (page generated 2025-08-08 23:00 UTC)