[HN Gopher] Nvidia announces its most powerful AI chip as it see...
___________________________________________________________________
Nvidia announces its most powerful AI chip as it seeks to become a
platform co.
Author : tiahura
Score : 123 points
Date : 2024-03-18 20:32 UTC (2 hours ago)
(HTM) web link (www.cnbc.com)
(TXT) w3m dump (www.cnbc.com)
| paulpauper wrote:
| Stock unchanged in afterhours. A lot of people were hoping for a
| big pop on some big development.
| synergy20 wrote:
| not only that, it lost steam during the day. Maybe it was
| already overheated and no amount of news can pump it up any
| further.
| dagmx wrote:
| I imagine it'll pop in the morning
| dxbydt wrote:
| guy is messing it up big time, and in real time as well.
| sheesh. none of his jokes are landing. "we had 2 customers.
| we have more now". long pause. screen behind him covered with
| logos of all his customers. pause. pause. finally applause.
| ok, on to the next tidbit.
|
| whole conference has been proceeding like this. look, if you
| invite cramer and the wall street crowd, you should throw in
| some dollar figures. like: who is paying for all this, how
| much, and why. the talk is entirely about token generation
| bandwidth, exaflops and petaflops, data parallel vs tensor
| parallel vs pipeline parallel. do you honestly think cramer
| knows the difference between an ml pipeline and an oil
| pipeline?
|
| i am watching this conf with my kid - proper GenZ member - who
| got up after 5 mins and said man who is this comedian, his
| jokes are so bad, and left :(
| gmerc wrote:
| Nah, wallstreet doesn't understand what it's looking at.
|
| That's fine; it's a developer conference for a founder-led
| company that hasn't reached the "stock price is the product"
| stage. He's not trying to optimize the next 5 days of the
| stock.
|
| There's a full ecosystem grab with NIM there, and a new GPU
| that forces every major datacenter to adopt it (or watch
| their competitors massively increase their compute density).
| anon291 wrote:
| This is a developer's conference, not a financial one.
| Takennickname wrote:
| Cramer is an entertainer. Not a developer or an investor.
| smallmancontrov wrote:
| You might not like it, but this is what peak performance
| looks like.
| TheAlchemist wrote:
| Well, the stock price is not a good short-term indicator of
| Nvidia's progress, nor of any company's for that matter.
| Nvidia is doing a very good job.
|
| That being said, their stock is absolutely and hilariously
| overvalued.
| costcofries wrote:
| Tell me more about why you believe their stock is hilariously
| overvalued.
| Takennickname wrote:
| Because he missed the train. My guess.
| Workaccount2 wrote:
| They are priced as if they are the only ones who are
| capable of creating chips that can crunch LLM algos. But
| AMD, Google, Intel, and even Apple are also capable.
|
| Apple is in talks with Google to bring Gemini to the
| iPhone, and it will obviously also be on android phones. So
| almost every phone on earth is poised to be using Gemini in
| the near future, and Gemini runs entirely on Google's own
| custom hardware (which is at parity with or better than
| nVidia's offerings anyway).
| jerf wrote:
| This seems as good a place as any to be Corrected by the
| Internet, so... correct me if I'm wrong.
|
| Making a graphics chip that is as good as Nvidia's: Very
| difficult. Huge moat, huge effort, lots of barriers, lots
| of APIs, lots of experience, decades of experience to
| overcome.
|
| Making something that can run an NN: Much, much easier.
| I'd guess start-up-level feasible. The math is much
| simpler. There's a lot of it, but my biggest concern
| would be less about pulling it off and more around
| whether my custom hardware is still the correct custom
| hardware by the time it is released. You'd think you
| could even eke out a bit of a performance advantage in
| not having all the other graphics stuff around. LLMs in
| their current state are characterized by vast swathes of
| input data and unbelievably repetitive number crunching,
| not complicated silicon architectures and decades-refined
| algorithms. (I mean, the algorithms are decades refined,
| but they're still simple as programs go.)
|
| I understand nVidia's graphics moat. I do not understand
| the moat implied by their stock valuation, that as you
| say, they are the only people who will ever be able to
| build AI hardware. That doesn't seem remotely true.
|
| So... correct me, Internet. Explain why nVidia has
| persistent advantages in the specific field of neural
| nets that cannot be overcome. I'm seriously listening,
| because I'm curious; this is a deliberate Cunningham's
| Law invocation, not me speaking from authority.
| smallmancontrov wrote:
| I agree with you, but let me devil's advocate.
|
| After 10 years of pretending to care about compute, AMD
| has filled the industry with burned-once experts who,
| when weighing Nvidia against competitors, instinctively
| price "likely boondoggle" into every competitor's quote,
| because they've seen it happen, possibly several times.
| Combine this with Nvidia's deep experience and huge
| rich-get-richer R&D budget keeping them always one or two
| architecture and software steps ahead, like it did in
| graphics, and their rich-get-richer TSMC budget buying
| them a step ahead in hardware, and you have a scenario
| where it continues to make sense to pay the green tax for
| the next generation or three. Red/blue/other rebels get
| zinged and join team "just pay the green tax." NV
| continues to dominate. Competitors go green with envy, as
| was foretold.
| htrp wrote:
| > burned-once experts
|
| More like burned 2x/3x/4x "this time it's different"
| people.
|
| Looking at you, Intel.
| bgnn wrote:
| CUDA is/was their biggest advantage to be honest, not the
| HW. They saw the demand for super high-end GPUs driven by
| the Bitcoin mining craze thanks to CUDA, and it
| transitioned gracefully to AI/ML workloads. Google was
| much further ahead in seeing the need and developing TPUs,
| for example.
|
| I don't think they have a crazy advantage HW-wise. A
| couple of start-ups are able to achieve this. If the SW
| infrastructure end is standardized, we will have a more
| level playing field.
| elorant wrote:
| CUDA is a big reason for their moat. And that's not
| something you can build in a couple of years, no matter
| how much money you throw at it.
|
| Without CUDA you have a chip that runs on-premises without
| anyone having a clue how good it is, which is supposedly
| Google's situation. Your only offering is cloud services.
| As big as that is, corporations will want to build their
| own datacenters.
| sottol wrote:
| Sure, CUDA has a lot of highly optimized utilities baked
| in (cuDNN and the like) and, maybe more importantly,
| implementors have a lot of experience with it, but afaict
| everyone is working on their own HAL/compiler and not
| using CUDA directly to implement the actual models. It's
| part of the HAL/framework. You can probably port any of
| these frameworks to a new hardware platform with a few
| man-years' worth of work imo, if you can spare the
| manpower.
|
| I think nobody has had the time to port any of these
| architectures away from CUDA because:
|
| * the leaders want to maintain their lead and everyone
| needs to catch up asap, so no time to waste,
|
| * progress was _super_ fast, so doubly no time to waste,
|
| * there was/is plenty of money, which buys some perceived
| value in maintaining the lead or catching up.
|
| But imo: 1. progress has slowed a bit, so maybe there's
| time to explore alternatives, and 2. nvidia GPUs are
| pretty hard to come by, so switching vendors may actually
| be a competitive advantage (if performance/price pans out
| and you can actually buy the hardware now as opposed to
| later).
|
| In terms of ML "compilers"/frameworks, afaik there's:
|
| * Google JAX/TensorFlow XLA/MLIR,
| * OpenAI Triton,
| * Meta Glow,
| * Apple's PyTorch+Metal fork.
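|
| (To make the HAL/compiler point concrete: a minimal Triton
| kernel, the canonical vector-add from Triton's own
| tutorial. It is a sketch of the abstraction layer being
| discussed, not any vendor's actual porting path; code at
| this level targets whichever backends the Triton compiler
| supports, rather than CUDA directly.)
|
|     import torch
|     import triton
|     import triton.language as tl
|
|     @triton.jit
|     def add_kernel(x_ptr, y_ptr, out_ptr, n, BLOCK: tl.constexpr):
|         pid = tl.program_id(axis=0)    # this program's block id
|         offs = pid * BLOCK + tl.arange(0, BLOCK)
|         mask = offs < n                # guard the ragged tail
|         x = tl.load(x_ptr + offs, mask=mask)
|         y = tl.load(y_ptr + offs, mask=mask)
|         tl.store(out_ptr + offs, x + y, mask=mask)
|
|     x = torch.rand(4096, device="cuda")
|     y = torch.rand(4096, device="cuda")
|     out = torch.empty_like(x)
|     add_kernel[(triton.cdiv(4096, 1024),)](x, y, out, 4096, BLOCK=1024)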
| belter wrote:
| Good luck with that. Gemini Advanced is simply unusable
| right now... It's so bad it's hard to believe nobody has
| picked up on that yet.
| xyst wrote:
| Because their stock value is highly coupled with the
| crypto mining and AI crazes.
|
| With the move from PoW to PoS for most crypto networks, in
| combination with the bust of '22, NVDA slid in value.
|
| Then OpenAI debuts ChatGPT in late 2022 and suddenly the
| price is pumping again, as the hype drives companies of
| all types to rush for GPUs and buy up Nvidia's inventory.
| Demand is far outpacing supply; Nvidia can't keep up.
|
| Thus, the share price is brittle. The GPU market is
| dominated by Nvidia. That can change, but so far openai
| loves using nvidia for some reason.
| ryandrake wrote:
| If you are a true believer that AI is not a craze, then
| the stock can only go up from here. If you think there is
| a chance that everyone gets bored of AI and moves on to
| some other fad that is not in Nvidia's wheelhouse, then
| it's probably down from here. I'm staying out of this
| bet: don't have the stomach for it.
| AlexandrB wrote:
| There's another case for pessimism as well: cost. It's
| possible that many AI applications aren't worth the money
| required for the extra compute. AI-enhanced search comes
| to mind here: how is Microsoft going to monetize users of
| Copilot in Bing to justify the extra cost? Right now a
| lot of this stuff is heavily subsidized by VCs or the
| MSFTs of the world, but when it comes time to make a
| profit we'll see what actually sticks around.
| partiallypro wrote:
| AI is obviously the future, though current iterations
| will probably die at some point. But the dot-com bubble
| ended with the internet being more pervasive than may
| have even been thought of at the time, and regardless,
| even the likes of Amazon's stock went bust before it
| recollected itself. Not a perfect comparison given Nvidia
| has really good revenue growth, but the point still
| stands.
| swalsh wrote:
| 72 P/E ratio while they have a mere monopoly on one of the
| most valuable resources in the world.
|
| Competition WILL come. Maybe it's Groq, maybe AMD, maybe
| Cerebras. Maybe there's a stealth startup out there. Point
| is, they're going to be challenged soon.
| htrp wrote:
| You and what fab?
|
| It's almost impossible to manufacture at scale with good
| yields and leading edge fabs are almost all bought out.
| smallmancontrov wrote:
| No moat.
|
| Yes, CUDA, but CUDA is maaaaaybe a few tens of billions of
| USD deep and a few (more) years wide. When the rest of the
| industry saw compute as a vanity market, that was
| sufficient. Now, it's a matter of time before margins go
| to, uhhh, less than 90%.
|
| Does that make shorting a good idea? I wouldn't count on
| it. The market can always remain irrational longer than you
| can remain solvent.
| cma wrote:
| They also bought infiniband which has played a big role
| in being the best at clustering, though Google's TPU
| reconfigurable topology stuff seems really cool too.
|
| Tesla went after them with Dojo and has still ended up
| splurging on big H100 clusters.
| tiahura wrote:
| And MS and everyone else have plenty of interest in
| helping AMD commodify CUDA compatibility.
| TheAlchemist wrote:
| Their market cap is $2.2T.
|
| In the past year, they had revenue of $60B and net income
| of $30B. Absolutely amazing numbers, I agree. The year
| before, they had revenue of $30B and net income of $4.5B,
| and that was a rather good year. What happens next of
| course depends on how you judge the situation: was this
| peak hype demand? Will it stabilize now? Grow at the
| current extraordinary rate?
|
| Scenario 1: margins get back to normal due to the hype
| dying down, competition improving, etc. In this case the
| company is worth at best ~$200B, or 1/10 of what it is
| now.
|
| Scenario 2: they maintain current revenue and the
| exceptional margins. The company would be worth ~$1T, or
| 1/2 of what it is now.
|
| Scenario 3: their current growth rate (based on the past
| 12 months) continues for ~5 years. This is the case where
| the company is worth ~$2T.
|
| But they are in a business where most of the money comes
| from a handful of customers, all of which are working on
| similar chips, and given the sums in play now, the
| incentives are *very* strong.
|
| My opinion is that the company is already priced for
| perfection: the current price reflects the perfect
| scenario. I struggle to see any upside, unless we have AGI
| in the next 5 years and it decides it can only run on
| Nvidia chips.
|
| All of this is akin to Tesla in recent years. They grew
| from a small startup to a medium-sized car maker; the %
| growth rate was huge of course, an amazing achievement in
| itself. But people projected that the % growth rate would
| continue, and the stock was priced accordingly. Reality is
| catching up with Tesla, even if some projections are still
| absolutely crazy.
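|
| (A quick back-of-the-envelope check of the multiples
| implied by the figures above; the dollar amounts are from
| the comment, the arithmetic is only an illustration:)
|
|     market_cap = 2.2e12    # $2.2T market cap
|     income_ttm = 30e9      # trailing-year net income
|     income_prior = 4.5e9   # prior year's net income
|
|     # ~73x trailing earnings, ~489x the prior year's
|     print(f"{market_cap / income_ttm:.0f}x")
|     print(f"{market_cap / income_prior:.0f}x")
|
|     # Scenario 1's ~$200B is ~6.7x trailing income, i.e.
|     # it assumes income falls most of the way back
|     print(f"{200e9 / income_ttm:.1f}x")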
| rvz wrote:
| > A lot of people were hoping for a big pop on some big
| development
|
| They are waiting for earnings projections for such a pop,
| since right now it is extremely overbought and struggling
| to move past $1,000 per share.
|
| For now, Microsoft and OpenAI will use these chips, but in
| the long term they are just looking at this and plotting
| to build their own chips, reducing their dependence on
| Nvidia, and they will be ready to switch once their
| contracts have run out.
| swalsh wrote:
| At 2 trillion, it's all baked in already
| dagmx wrote:
| FP8 being 2.5x Hopper is kind of disappointing after such
| a long time. Since it's 2 fused chips, that's only a ~25%
| effective per-die delta (2.5x across two dies is ~1.25x
| each).
|
| Though it seems most of the progress has been on memory
| throughput and power use, which is still very impressive.
|
| I wonder how this will trickle down to the consumer segment.
| azeirah wrote:
| Jensen revealed later that LLM inference is 30x due to
| architectural improvements; it's massive. I don't know if
| that's latency, or just a 2-3x performance boost with 30x
| more customers served on the same chip. Either way, 30x is
| massive.
| ephemeral-life wrote:
| 30x is the type of number that when you see it in a
| generational improvement, you should ignore it as marketing
| fluff.
| azeirah wrote:
| From how I understood it, it means they optimised the
| entire stack from CUDA to the networking interconnects
| specifically for data centers, meaning you get 30x more
| inference per dollar for a datacenter. This is probably not
| fluff, but it's only relevant for a very, very specific
| use case, i.e. enterprises with the money to buy a stack
| to serve thousands of users with LLMs.
|
| It doesn't matter for anyone who's not microsoft, aws or
| openai or similar.
| acchow wrote:
| They showed 30x was for FP4. Who is using FP4 in
| practice?
| KaoruAoiShiho wrote:
| But maybe you should. Once the software stack is ready
| for it, there'll be more people using it, since the
| performance gains are so massive.
| my123 wrote:
| The 30x number is for a really narrow scenario tbh:
| running a 1.8T-parameter GPT (w/ MoE) on one GB200.
| huac wrote:
| 'narrow scenario,' perhaps, but one that also happens to
| closely match rumors for GPT4's size
| dagmx wrote:
| Yeah, and the 30x is largely due to increases in factors
| like packaging and throughput. It's not indicative of
| general-purpose performance, which is what I was talking
| about.
|
| Again, I do think the throughput and energy efficiency
| gains are impressive, but the raw performance gain is
| lower than I'd have expected for such a massive leap in
| node size etc.
| modeless wrote:
| He always does that. They stack up a bunch of special case
| features like sparsity that most people don't use in practice
| to get these unrealistic numbers. It'll be faster, certainly,
| but 30x will only be achievable in very special cases I'm
| sure.
| cma wrote:
| Isn't sparsity almost always a win at this point? Making
| everything fully connected is a major waste.
| modeless wrote:
| The kind of sparsity that the hardware supports is not
| fully general. I'm not aware of any large models trained
| using it. Maybe they are all leaving 2x perf on the table
| for no reason, but maybe not. I don't think sparsity is
| really proven to be "almost always a win" for training.
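|
| (For reference, the hardware sparsity under discussion is
| Nvidia's 2:4 semi-structured kind: 2 of every 4 weights
| must be zero for the sparse tensor cores to deliver their
| ~2x. A rough PyTorch sketch of the constraint itself, as
| an illustration only, not how the hardware is programmed:)
|
|     import torch
|
|     def prune_2_4(w: torch.Tensor) -> torch.Tensor:
|         # Zero the 2 smallest-magnitude weights in every
|         # contiguous group of 4 (the 2:4 pattern).
|         groups = w.reshape(-1, 4)
|         idx = groups.abs().argsort(dim=1)[:, :2]
|         return groups.scatter(1, idx, 0.0).reshape(w.shape)
|
|     w = torch.randn(8, 8)
|     print(prune_2_4(w))  # exactly 2 zeros per group of 4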
| qwertox wrote:
| But Blackwell in the graph is FP4 whereas Hopper is FP8.
| YetAnotherNick wrote:
| How is 2.5x disappointing in one generation?
| chimney wrote:
| Compare to the 10x that was Hopper uplift.
| YetAnotherNick wrote:
| Because it involved scaling the chip area dedicated to
| FP8. The AI community realized a few years back that FP8
| training is possible, so the transistors allocated to FP8
| were scaled up. Overall I think transistor counts grow
| only ~50% per generation, so most of the gains came from
| shrinking the FP32/FP64 share that was dominant 10 years
| back, but that can only go so far.
| jairuhme wrote:
| I haven't listened to Jensen speak before, but am I the only one
| who thought the presentation wasn't very polished? Not a knock on
| anything he has accomplished, just an observation that sorta
| surprised me
| acchow wrote:
| He has more important things to do than perfecting a
| presentation. He likes his employees to message him freely with
| things they think he can help with.
| jefozabuss wrote:
| When he had that slide up about generating everything, I
| was kind of expecting him to say this whole keynote was
| generated, including him. That'd have been crazy.
| azeirah wrote:
| He said he didn't rehearse well. I think it makes him come
| across as very genuine, not some dumb hyper-polished
| corporate blabla.
| ipsum2 wrote:
| Yeah he said he didn't rehearse and it really shows.
| angm128 wrote:
| The products, animations and slides are doing some heavy
| lifting. Most jokes don't land, and his presentation is
| somewhat confusing at times (e.g. the Star Trek intro
| token count).
| swalsh wrote:
| He's selling water in a desert, kind of doesn't matter how
| polished his presentation is.
| cableshaft wrote:
| The beginning in particular seemed pretty rough, but he
| seemed to mostly get into a groove about halfway in. At
| least he started talking a lot more smoothly around then.
| sct202 wrote:
| I think it's a good reminder that objectively great CEOs
| and leaders can be kind of cringe when presenting. A lot
| of times people like that get passed over for promotions
| in favor of smooth talkers.
| modeless wrote:
| It's typical. He isn't a great public speaker IMO. Not terrible
| but not great.
| erupt7893 wrote:
| I've been watching his keynotes for as long as I can remember,
| this is how it's always been
| caycep wrote:
| I remember his opening line at NEURIPS 2017, to an audience of
| grad students and postdocs, "only Nvidia would unveil their
| most expensive product to an audience who's completely broke"
|
| Then he went into a comedic monologue about GANs. But hey,
| at least that meant the CEO was reading the actual
| conference proceedings...
| bluedino wrote:
| They acquired Bright Cluster Manager a few years ago; who
| would be next on their list to acquire? It seems like they
| want to provide customers with the whole stack.
| shiftpgdn wrote:
| Canonical is a ripe target. Canonical has been trying to grow
| Ubuntu and other tools in the enterprise world for the last few
| years without significant success, and much of the Nvidia
| devkit stuff is built around Ubuntu.
| echelon wrote:
| Please do not give them this idea.
|
| Ubuntu is actually a pretty great daily driver desktop Linux,
| and I'd hate for that to lose priority and disappear.
|
| I'm not a fan of what happened to the Red Hat ecosystem for
| exactly the same reasons.
| xmprt wrote:
| As someone who used Ubuntu in the past and has since moved
| on to greener pastures, I appreciate everything Canonical
| and Ubuntu have done for the Linux community, but there
| are many better options today and Canonical is already far
| from the company it once was.
| dgfitz wrote:
| Whenever I see an open job req for canonical I run for
| the hills.
| greggsy wrote:
| Tbh, Ubuntu's only pull is the support and breadth of
| users. As a desktop, it's let down by Unity, which IMHO is
| basically a port of the Windows 8 tablet UI.
|
| If they defaulted back to a menu-and-taskbar-based WM, it
| might actually be more approachable to users who are more
| familiar with macOS and Windows.
| kflansburg wrote:
| Run:AI https://news.ycombinator.com/item?id=39738342
| stevethomas wrote:
| Time to sell. When they start becoming a platform, it means they
| have nothing more concrete in the near future. Sell now and buy
| again later once the price corrects.
| belter wrote:
| Don't bet against a CEO who knows what he is talking
| about, has 80% market share, and has an arm tattoo of his
| own company's logo... :-)
|
| So far, the short sellers have learned that bitter lesson.
| golergka wrote:
| Does a company like Nvidia have to have anything more
| concrete than newer, bigger and faster chips?
| pvg wrote:
| If you held and then sold Nvidia stock when they announced
| CUDA or GeForce Live, you'd now be a big pile of negative
| money richer.
| Keyframe wrote:
| They still have to announce you can send an email through their
| platform.
| lvl102 wrote:
| Seems Nvidia is going for maximum margin as they see competition
| ahead.
| theGnuMe wrote:
| And they can build a big moat with CUDA.
| qwertox wrote:
| What is FP4, 4-bit floating point? If so, the comparison
| graph [0] showing 30x above Hopper was a bit misleading.
|
| [0] https://youtu.be/Y2F8yisiS6E?t=4698
| sipjca wrote:
| yes
| fancyfredbot wrote:
| That's right. There was mention of a precision-aware
| transformer engine which might make it easier to use FP4,
| but it's not 30x faster in a like-for-like way. This
| shouldn't be surprising, since it's more or less two
| Hoppers next to one another on a slightly improved process
| node. 2.5x seems more likely in cases where you don't
| exploit a new feature like that or the increased memory.
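|
| (For anyone wondering what a 4-bit float even looks like:
| a sketch of the E2M1 layout, 1 sign / 2 exponent / 1
| mantissa bit, which is the FP4 format in the OCP
| microscaling spec. That Blackwell's FP4 matches it is an
| assumption here, not something the keynote spelled out:)
|
|     def decode_fp4_e2m1(bits: int) -> float:
|         sign = -1.0 if (bits >> 3) & 1 else 1.0
|         exp = (bits >> 1) & 0b11  # 2 exponent bits, bias 1
|         man = bits & 1            # 1 mantissa bit
|         if exp == 0:              # subnormal: 0 or 0.5
|             return sign * man * 0.5
|         return sign * (1.0 + man * 0.5) * 2.0 ** (exp - 1)
|
|     # The 8 non-negative values: 0, 0.5, 1, 1.5, 2, 3, 4, 6
|     print([decode_fp4_e2m1(b) for b in range(8)])
|
| (Eight magnitudes per sign is the entire dynamic range,
| which is why the 30x figure only applies where such coarse
| quantization is tolerable.)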
| tamimio wrote:
| I think at this point they should stop making video
| "cards" and instead make video "stations": a full tower
| station with a power supply and one giant "card" inside
| with proper cooling, etc. That might also justify the
| crazy prices anyway.
| ufocia wrote:
| Probably better to stick to the GPUs. Integration is a low
| margin game.
| caycep wrote:
| granted, at this point we plug the computer into the GPU, so
| it might not make a difference...
| georgyo wrote:
| I'd prefer they stick to GPUs, but I think you're
| oversimplifying.
|
| Dell proves that selling complete units is very profitable.
|
| Apple shows that owning the entire stack is immensely
| profitable.
|
| Nvidia already has significant hardware and software
| investment. They very well could fully integrate and grab
| larger slices of the pie.
|
| In fact, Nvidia already sells complete, appliance-like,
| fully integrated machines. But enterprises like to install
| their own OS and run their own software stack, and these
| appliances have not caught on, at least not yet.
| jedberg wrote:
| https://lambdalabs.com/gpu-workstations/vector
| ribosometronome wrote:
| Isn't that just a computer? Or an eGPU, if it doesn't contain
| the rest of the computer?
| herecomethefuzz wrote:
| "Platform company" means multi-chip in this case?
|
| Seems logical since it's becoming impractical to cram so many
| transistors on a single die.
| 1oooqooq wrote:
| no, it means rent-seeking.
|
| imagine aws if they were also the only ones selling
| computers in the world; now you can only rent from them
___________________________________________________________________
(page generated 2024-03-18 23:00 UTC)