[HN Gopher] Nvidia pursues $30B custom chip opportunity with new unit
       ___________________________________________________________________
        
       Nvidia pursues $30B custom chip opportunity with new unit
        
       Author : dev_tty01
       Score  : 78 points
       Date   : 2024-02-10 17:46 UTC (5 hours ago)
        
 (HTM) web link (www.reuters.com)
 (TXT) w3m dump (www.reuters.com)
        
       | tmaly wrote:
       | I would love to see a consumer graphics card with 128GB VRAM
       | 
       | Would be nice to be able to work with some of the larger open
       | source LLM models.
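       | 
       | As a rough back-of-the-envelope sketch (parameter-count
       | arithmetic only, ignoring activations and the KV cache; the
       | helper below is made up for illustration), 128GB would hold a
       | 70B model at 8-bit with room to spare:
       | 
       |     # rough VRAM needed just to hold the weights, in GB
       |     def weight_vram_gb(params_billion, bytes_per_param):
       |         return params_billion * bytes_per_param
       | 
       |     print(weight_vram_gb(70, 2))    # fp16  -> ~140 GB
       |     print(weight_vram_gb(70, 1))    # int8  -> ~70 GB
       |     print(weight_vram_gb(70, 0.5))  # 4-bit -> ~35 GB
       | 
       | Today's 24GB consumer cards need aggressive quantization (or
       | offloading) even for a 33B model.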
        
         | datameta wrote:
         | I think that goal is fine and good, but I would rather see huge
         | investments toward in-memory compute like ReRAM and such. If we
         | bridge the efficiency advancements of TinyML with the leap of
         | LLM abilities, perhaps we can start on the road of not being
         | limited by the impact of training on climate.
        
           | asicsarecool wrote:
           | Why the fuck was this downvoted?
           | 
           | Very occasionally I get the feeling HN is entering the /.
           | phase
        
             | wmf wrote:
             | I didn't downvote it but in-memory compute is crackpot and
             | alternative memory tech is really crackpot. It's not going
             | to happen and it's ridiculous to propose it on the same
             | level as GPUs with more RAM.
        
           | epistasis wrote:
           | Agree with the comment, but riffing on the last few words:
           | 
           | Climate is pretty much my #1 concern about the world, but
           | LLM energy use is really far down on the list of important
           | climate actions.
           | 
           | First and foremost is removing roadblocks to deploying
           | existing clean energy technologies, and speeding up the
           | necessary supporting infrastructure such as transmission
           | and market policies that choose the cheapest possible
           | solutions (over the objections of dinosaur execs who pick
           | last century's solutions). Then come the big,
           | hard-to-decarbonize parts of industry like cement and
           | steel, as well as deploying electrolyzers so that ammonia
           | fertilizer production switches over to carbon-neutral
           | production rather than relying on fossil-generated
           | hydrogen.
           | 
           | Reducing energy consumption is important for advancing AI
           | in general, but ultimately all of its energy will come from
           | clean sources anyway, and the switch that needs to happen
           | is the switch in energy sources. Reducing energy use by 2x
           | or 10x is not good enough; we must change the sources
           | fundamentally.
        
         | amelius wrote:
         | That's overkill for any type of graphics application though.
         | 
         | What you want is a DL or parallel compute card, not a graphics
         | card.
         | 
         | They are far more expensive though because compute doesn't sell
         | to the average consumer like graphics does.
        
           | sigmoid10 wrote:
           | Meh, that's like saying 640kb of RAM is more than anyone
           | would ever need. Demand follows hardware development, which
           | in turn accelerates demand. I'm sure game developers would
           | easily find a way to use 128GB of VRAM if it was commonly
           | available in their target market.
        
             | late2part wrote:
             | To be fair, 640kb is more than anyone really needs. It's
             | just far, far less than we want.
        
             | Takennickname wrote:
             | Can you imagine the download sizes?
        
             | amelius wrote:
             | Turns out that demand for graphics memory got stuck at a
             | point where compute is still hungry for more.
             | 
             | That may certainly change, but it doesn't help compute
             | much today.
        
             | godelski wrote:
             | This isn't an argument against giving users this much VRAM,
             | but I'm pretty sure they'd just drop the whole game into
             | VRAM and call it a day instead of actually optimizing.
        
               | fbdab103 wrote:
               | Isn't that exactly how things improve? When you can
               | reduce the cognitive load on bookkeeping, you can
               | accomplish more productive work.
               | 
               | I can produce a featureful webapp in Python-Django solo
               | because I do not have to worry about optimally filling
               | registers and the CPU cache. As you cut closer to the
               | hardware, you have to be significantly more cognizant of
               | the hardware's constraints than of solving the business
               | problem.
               | 
               | Taken to the extreme, we have Electron applications
               | consuming multi-GB of RAM, but it does expand the
               | universe of possibilities.
        
           | dotnet00 wrote:
           | A modern graphics card already is a parallel compute card.
           | A modern graphics pipeline is mostly compute, with only
           | very specific stages using fixed-function hardware.
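           | 
           | A minimal sketch of that point (assuming a PyTorch build
           | with CUDA support; nothing here is vendor- or
           | card-specific): the same consumer card happily runs plain
           | general-purpose compute with no graphics pipeline involved.
           | 
           |     import torch
           | 
           |     # assumes a CUDA-capable consumer card and a CUDA build
           |     # of PyTorch; falls back to the CPU otherwise
           |     device = "cuda" if torch.cuda.is_available() else "cpu"
           | 
           |     # an ordinary matrix multiply -- no shaders, no
           |     # fixed-function stages
           |     a = torch.randn(4096, 4096, device=device)
           |     b = torch.randn(4096, 4096, device=device)
           |     c = a @ b
           |     print(c.shape, device)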
        
             | amelius wrote:
             | Yes, but that was not the point. The point is that they are
             | sold as graphics cards, not compute cards. Therefore,
             | expect these cards to be good at (and have the memory for)
             | typical graphics operations.
        
         | Aurornis wrote:
         | Unfortunately, as soon as you make a card with specifications
         | that make it great at enterprise-grade tasks, it will be bought
         | in mass quantities by people building out data centers. This
         | pushes the price up, as we've already seen.
         | 
         | So labeling it "consumer" doesn't really mean much. They've
         | tried to enforce the distinction with EULAs before, but that
         | doesn't work well.
        
           | sydd wrote:
           | > it will be bought in mass quantities by people building out
           | data centers.
           | 
           | Isn't this good? Economies of scale will kick in, lowering
           | prices, and supply will meet demand after some hiccups.
        
             | Aurornis wrote:
             | If demand drove prices down like that then we'd already
             | have cheap cards available.
             | 
             | Demand puts upward pressure on prices.
             | 
             | Supply is already maxed out and growing as fast as
             | possible.
        
               | piva00 wrote:
               | > Isn't this good? Economies of scale will kick in,
               | lowering prices, and supply will meet demand after some
               | hiccups.
               | 
               | You didn't account for the "hiccups", which can last
               | anywhere from 5 to 20 years until competition catches
               | up, longer than the lifetime of many companies. Only in
               | the spherical-cow world of economics would that be just
               | a hiccup.
        
             | wmf wrote:
             | Nvidia doesn't want the price to be lower; that's why they
             | won't make this card.
        
         | karolist wrote:
         | Not exactly what you asked for, but the Mac Studio exists,
         | with 192GB of unified memory at that.
        
         | brucethemoose2 wrote:
         | We are getting that (in early 2025?) with AMD Strix Halo.
         | 
         | 40 CUs, a 256-bit LPDDR5X bus, and 16 CPU cores. Or so the
         | rumors say.
        
       | jedberg wrote:
       | Just today I was reading the article about OpenAI wanting $7T to
       | develop their own AI chips. In the comments were a bunch of
       | people talking about all the startups in the last 18 months
       | trying to make bespoke AI chips.
       | 
       | This makes a lot of sense for NVIDIA. They have the expertise,
       | the money, the _scale_, and the experience already. They can
       | probably do it cheaper than any startup and then either pass on
       | those savings or make more profit.
        
         | jetbalsa wrote:
         | Don't forget the tooling; ROCm still hasn't taken off very
         | well.
        
         | buryat wrote:
           | Every custom chip sold is another Nvidia H100 not bought.
        
           | drozycki wrote:
           | That's a bit like "every Honda Accord sold is another
           | Maserati not bought". Providing a cheaper option can net more
           | profit on volume.
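           | 
           | A toy illustration with entirely made-up margins and
           | volumes (not real Nvidia figures):
           | 
           |     # hypothetical per-unit margins (USD) and unit volumes
           |     flagship_margin, flagship_units = 20_000, 500_000
           |     custom_margin, custom_units = 3_000, 10_000_000
           | 
           |     print(flagship_margin * flagship_units)  # 10 billion
           |     print(custom_margin * custom_units)      # 30 billion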
        
             | DrNosferatu wrote:
             | Or the other way around:
             | 
             | Every bespoke chip will be much more expensive - and
             | profitable - than the generic units that were not bought.
        
           | nabla9 wrote:
           | Not necessarily.
           | 
           | Custom chips are likely made on technology nodes a step
           | behind. They are cheaper to manufacture. Nvidia's H200 and
           | Apple's M2 are so profitable that they get the latest
           | technology nodes.
        
         | greggsy wrote:
         | Introducing that amount of capital would be extremely
         | disruptive for the rest of the market - was it really 7
         | _trillion_, or billion?
        
       | m3kw9 wrote:
       | Imagine if they had gotten ARM. It's sort of good they did not,
       | as the competition would have suffered.
        
       | jprd wrote:
       | Nvidia isn't in the fab biz, so maybe it will be easier for
       | them to generate customer interest in a way that Intel has not
       | been able to?
        
       | wmf wrote:
       | I wonder if customers really want custom chips or just cheaper
       | ones. Many of these custom AI chips are slower than flagship GPUs
       | so presumably a cut-down GPU at a lower price would be just as
       | good.
        
         | brucethemoose2 wrote:
         | There is a lot of silicon that consumer GPUs don't "need" for
         | AI.
         | 
         | But on the other hand, the software stack is very mature and
         | the cards are heavily amortized by the huge volume, so it's
         | kinda hard to argue with. In fact it's so good that Nvidia
         | can charge outrageous prices for the L40, A10 and such, and
         | then turn around and sell the exact same dies to consumers
         | (with less memory).
        
         | tomasGiden wrote:
         | For customers like Ericsson, it wouldn't surprise me if they
         | requested special instructions and special hardware function
         | blocks. In telecom there are certain operations that are
         | specified by the standard (and some that aren't but are used
         | as de facto standards) which are performed so often that you
         | want to do them in hardware instead of in software. Or the
         | opposite: Ericsson just wants to integrate NVIDIA's IP into
         | Ericsson's own ASICs instead of using their own cores and
         | other third-party cores.
        
       | wslh wrote:
       | I imagine there will be cheaper service providers for training
       | soon (2024/2025), like what companies such as Hetzner, Digital
       | Ocean and others are providing for cloud. They are not in the
       | same league as AWS, Google Cloud, or Azure, but they can add
       | more specific cloud services.
        
         | brucethemoose2 wrote:
         | AWS/Azure prices are really awful TBH. There are already much
         | better places to get GPUs.
        
           | wslh wrote:
           | I know, but Google Cloud, for example, currently has an
           | advantage with their own hardware (TPUs). What is the
           | approximate cost of training something like ChatGPT or
           | Gemini? OpenAI and Google have an advantage because they can
           | rely on Azure and Google Cloud respectively, without paying
           | anything or at subsidised prices. Could a new player compete
           | with them for training for other companies?
        
             | brucethemoose2 wrote:
             | Google prices the TPUs pretty exorbitantly, actually.
             | 
             | But they give a lot of TPU time away for research, which is
             | nice.
             | 
             | It seems Intel Gaudi 2 is priced in a sweeter spot, but
             | I've never heard of anyone but Intel using them.
        
               | wslh wrote:
               | Then I understand their advantage is in training their
               | own models and pricing it high for others, no matter
               | the cost.
        
       | zerreh50 wrote:
       | From Nvidia's history of working with AIBs, Sony, Apple, the
       | Linux community, and probably many more, they seem to be a very
       | hard company to work with. They have an idea of what the product
       | looks like and it's their way or the highway. I wonder if this
       | new department will change that. If it doesn't, it won't amount
       | to much.
        
       | bluerooibos wrote:
       | I read about a new approach for making AI chips sometime last
       | year - analogue chips by this company - https://mythic.ai/
       | 
       | Haven't heard anything about it since though.
        
       ___________________________________________________________________
       (page generated 2024-02-10 23:00 UTC)