[HN Gopher] Nvidia announces its most powerful AI chip as it see...
___________________________________________________________________
Nvidia announces its most powerful AI chip as it seeks to become a
platform co.
Author : tiahura
Score : 123 points
Date : 2024-03-18 20:32 UTC (2 hours ago)
(HTM) web link (www.cnbc.com)
(TXT) w3m dump (www.cnbc.com)
| paulpauper wrote:
| Stock unchanged in afterhours. A lot of people were hoping for a
| big pop on some big development.
| synergy20 wrote:
| not only that, it lost steam during the day. Maybe it was
| already overheated and no amount of news can pump it up any
| further.
| dagmx wrote:
| I imagine it'll pop in the morning
| dxbydt wrote:
| guy is messing it up big time, and in real time as well.
| sheesh. none of his jokes are landing. "we had 2 customers.
| we have more now". long pause. screen behind him covered with
| logos of all his customers. pause. pause. finally applause.
| ok, on to the next tidbit.
|
| whole conference has been proceeding like this. look, if you
| invite cramer and the wall street crowd, you should throw in
| some dollar figures. like: who is paying for all this, how
| much, and why. the talk is entirely about token generation
| bandwidth, exaflops and petaflops, data parallel vs tensor
| parallel vs pipeline parallel. do you honestly think cramer
| knows the difference between an ml pipeline and an oil
| pipeline?
|
| i am watching this conf with my kid - proper GenZ member - who
| got up after 5 mins and said man who is this comedian, his
| jokes are so bad, and left :(
| gmerc wrote:
| Nah, wallstreet doesn't understand what it's looking at.
|
| That's fine; it's a developer conference for a founder-led
| company that hasn't reached the "stock price is the product"
| stage. He's not trying to optimize the next 5 days of the
| stock.
|
| There's a full ecosystem grab with NIM there, and a new GPU
| that forces every major datacenter to adopt it (or watch
| their competitors massively increase their compute density).
| anon291 wrote:
| This is a developer's conference, not a financial one.
| Takennickname wrote:
| Cramer is an entertainer. Not a developer or an investor.
| smallmancontrov wrote:
| You might not like it, but this is what peak performance
| looks like.
| TheAlchemist wrote:
| Well, the stock price is not a good short-term indicator of
| Nvidia's progress, nor of any company's for that matter.
| Nvidia is doing a very good job.
|
| That being said, their stock is absolutely and hilariously
| overvalued.
| costcofries wrote:
| Tell me more about why you believe their stock is hilariously
| overvalued.
| Takennickname wrote:
| Because he missed the train. My guess.
| Workaccount2 wrote:
| They are priced as if they are the only ones who are
| capable of creating chips that can crunch LLM algos. But
| AMD, Google, Intel, and even Apple are also capable.
|
| Apple is in talks with Google to bring Gemini to the
| iPhone, and it will obviously also be on android phones. So
| almost every phone on earth is poised to be using Gemini in
| the near future, and Gemini runs entirely on Google's own
| custom hardware (which is at parity with or better than
| nVidia's offerings anyway).
| jerf wrote:
| This seems as good a place as any to be Corrected by the
| Internet, so... correct me if I'm wrong.
|
| Making a graphics chip that is as good as Nvidia's: Very
| difficult. Huge moat, huge effort, lots of barriers, lots
| of APIs, lots of experience, decades of experience to
| overcome.
|
| Making something that can run an NN: Much, much easier.
| I'd guess start-up-level feasible. The math is much
| simpler. There's a lot of it, but my biggest concern
| would be less about pulling it off and more around
| whether my custom hardware is still the correct custom
| hardware by the time it is released. You'd think you
| could even eke out a bit of a performance advantage in
| not having all the other graphics stuff around. LLMs in
| their current state are characterized by vast swathes of
| input data and unbelievably repetitive number crunching,
| not complicated silicon architectures and decades-refined
| algorithms. (I mean, the algorithms are decades refined,
| but they're still simple as programs go.)
|
| I understand nVidia's graphics moat. I do not understand
| the moat implied by their stock valuation, that as you
| say, they are the only people who will ever be able to
| build AI hardware. That doesn't seem remotely true.
|
| So... correct me, Internet. Explain why nVidia has
| persistent advantages in the specific field of neural
| nets that cannot be overcome. I'm seriously listening,
| because I'm curious; this is a deliberate Cunningham's
| Law invocation, not me speaking from authority.
| smallmancontrov wrote:
| I agree with you, but let me devil's advocate.
|
| After 10 years of pretending to care about compute, AMD
| has filled the industry with burned-once experts who,
| when weighing Nvidia against competitors, instinctively
| price "likely boondoggle" into every competitor's quote,
| because they've seen it happen, possibly several times.
| Combine this with Nvidia's deep experience and huge
| rich-get-richer R&D budget keeping them always one or two
| architecture and software steps ahead, like it did in
| graphics, and their rich-get-richer TSMC budget buying
| them a step ahead in hardware, and you have a scenario
| where it continues to make sense to pay the green tax for
| the next generation or three. Red/blue/other rebels get
| zinged and join team "just pay the green tax." NV
| continues to dominate. Competitors go green with envy, as
| was foretold.
| htrp wrote:
| > burned-once experts
|
| More like burned 2x/3x/4x "this time it's different"
| people.
|
| Looking at you, Intel.
| bgnn wrote:
| CUDA is/was their biggest advantage to be honest, not the
| HW. They saw the demand for super high-end GPUs driven by
| the Bitcoin mining craze thanks to CUDA, and it
| transitioned gracefully to AI/ML workloads. Google was
| much further ahead in seeing the need and developing TPUs,
| for example.
|
| I don't think they have a crazy advantage HW-wise. A
| couple of start-ups are able to achieve this. If the SW
| infrastructure end is standardized, we will have a more
| level playing field.
| elorant wrote:
| CUDA is a big reason for their moat. And that's not
| something you can build in a couple of years, no matter
| how much money you throw at it.
|
| Without CUDA you have a chip that runs on-premises without
| anyone having a clue how good it is, which is supposedly
| Google's situation. Your only offering is cloud services.
| As big as that is, corporations will want to build their
| own datacenters.
| sottol wrote:
| Sure, CUDA has a lot of highly optimized utilities baked
| in (cuDNN and the like) and, maybe more importantly,
| implementors have a lot of experience with it, but afaict
| everyone is working on their own HAL/compiler and not
| using CUDA directly to implement the actual models. It's
| part of the HAL/framework. You can probably port any of
| these frameworks to a new hardware platform with a few
| man-years' worth of work imo, if you can spare the
| manpower.
|
| I think nobody has had the time to port any of these
| architectures away from CUDA because:
|
| * the leaders want to maintain their lead and everyone
| needs to catch up asap, so no time to waste,
|
| * progress was _super_ fast, so doubly no time to waste,
|
| * there was/is plenty of money, which buys some perceived
| value in maintaining the lead or catching up.
|
| But imo: 1. progress has slowed a bit, so maybe there's
| time to explore alternatives, and 2. nvidia GPUs are
| pretty hard to come by, so switching vendors may actually
| be a competitive advantage (if performance/price pans out
| and you can actually buy the hardware now as opposed to
| later).
|
| In terms of ML "compilers"/frameworks, afaik there's:
|
| * Google JAX/TensorFlow XLA/MLIR,
| * OpenAI Triton,
| * Meta Glow,
| * Apple's PyTorch+Metal fork.
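|
| (To make the HAL/compiler point concrete: a minimal Triton
| kernel, the canonical vector-add from Triton's own
| tutorial. It is a sketch of the abstraction layer being
| discussed, not any vendor's actual porting path; code at
| this level targets whichever backends the Triton compiler
| supports, rather than CUDA directly.)
|
|     import torch
|     import triton
|     import triton.language as tl
|
|     @triton.jit
|     def add_kernel(x_ptr, y_ptr, out_ptr, n, BLOCK: tl.constexpr):
|         pid = tl.program_id(axis=0)    # this program's block id
|         offs = pid * BLOCK + tl.arange(0, BLOCK)
|         mask = offs < n                # guard the ragged tail
|         x = tl.load(x_ptr + offs, mask=mask)
|         y = tl.load(y_ptr + offs, mask=mask)
|         tl.store(out_ptr + offs, x + y, mask=mask)
|
|     x = torch.rand(4096, device="cuda")
|     y = torch.rand(4096, device="cuda")
|     out = torch.empty_like(x)
|     add_kernel[(triton.cdiv(4096, 1024),)](x, y, out, 4096, BLOCK=1024)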
| belter wrote:
| Good luck with that. Gemini Advanced is simply unusable
| right now... It's so bad it's hard to believe nobody has
| picked up on that yet.
| xyst wrote:
| Because their stock value is highly coupled with the
| crypto mining and AI crazes.
|
| With the move from PoW to PoS for most crypto networks, in
| combination with the bust of '22, NVDA slid in value.
|
| Then OpenAI debuts ChatGPT in late 2022 and suddenly the
| price is pumping again, as the hype drives companies of
| all types to rush for GPUs and buy up Nvidia's inventory.
| Demand is far outpacing supply; Nvidia can't keep up.
|
| Thus, the share price is brittle. The GPU market is
| dominated by Nvidia. That can change, but so far openai
| loves using nvidia for some reason.
| ryandrake wrote:
| If you are a true believer that AI is not a craze, then
| the stock can only go up from here. If you think there is
| a chance that everyone gets bored of AI and moves on to
| some other fad that is not in Nvidia's wheelhouse, then
| it's probably down from here. I'm staying out of this
| bet: don't have the stomach for it.
| AlexandrB wrote:
| There's another case for pessimism as well: cost. It's
| possible that many AI applications aren't worth the money
| required for the extra compute. AI-enhanced search comes
| to mind here: how is Microsoft going to monetize users of
| Copilot in Bing to justify the extra cost? Right now a
| lot of this stuff is heavily subsidized by VCs or the
| MSFTs of the world, but when it comes time to make a
| profit we'll see what actually sticks around.
| partiallypro wrote:
| AI is obviously the future, though current iterations
| will probably die at some point. But the dot-com bubble
| ended with the internet being more pervasive than may
| have even been thought of at the time, and regardless,
| even the likes of Amazon's stock went bust before it
| recollected itself. Not a perfect comparison given Nvidia
| has really good revenue growth, but the point still
| stands.
| swalsh wrote:
| 72 P/E ratio while they have a mere monopoly on one of the
| most valuable resources in the world.
|
| Competition WILL come. Maybe it's Groq, maybe AMD, maybe
| Cerebras. Maybe there's a stealth startup out there. Point
| is, they're going to be challenged soon.
| htrp wrote:
| You and what fab?
|
| It's almost impossible to manufacture at scale with good
| yields and leading edge fabs are almost all bought out.
| smallmancontrov wrote:
| No moat.
|
| Yes, CUDA, but CUDA is maaaaaybe a few tens of billions of
| USD deep and a few (more) years wide. When the rest of the
| industry saw compute as a vanity market, that was
| sufficient. Now, it's a matter of time before margins go
| to, uhhh, less than 90%.
|
| Does that make shorting a good idea? I wouldn't count on
| it. The market can always remain irrational longer than you
| can remain solvent.
| cma wrote:
| They also bought infiniband which has played a big role
| in being the best at clustering, though Google's TPU
| reconfigurable topology stuff seems really cool too.
|
| Tesla went after them with Dojo and has still ended up
| splurging on big H100 clusters.
| tiahura wrote:
| And MS and everyone else have plenty of interest in
| helping AMD commodify CUDA compatibility.
| TheAlchemist wrote:
| Their market cap is $2.2T.
|
| In the past year, they had revenue of $60B and net income
| of $30B. Absolutely amazing numbers, I agree. The year
| before, they had revenue of $30B and net income of $4.5B,
| and that was a rather good year. What happens next of
| course depends on how you judge the situation: was this
| peak hype demand? Will it stabilize now? Grow at the
| current extraordinary rate?
|
| Scenario 1: margins get back to normal due to the hype
| dying down, competition improving, etc. In this case the
| company is worth at best ~$200B, or 1/10 of what it is
| now.
|
| Scenario 2: they maintain current revenue and the
| exceptional margins. The company would be worth ~$1T, or
| 1/2 of what it is now.
|
| Scenario 3: their current growth rate (based on the past
| 12 months) continues for ~5 years. This is the case where
| the company is worth ~$2T.
|
| But they are in a business where most of the money comes
| from a handful of customers, all of which are working on
| similar chips, and given the sums in play now, the
| incentives are *very* strong.
|
| My opinion is that the company is already priced for
| perfection: the current price reflects the perfect
| scenario. I struggle to see any upside, unless we have AGI
| in the next 5 years and it decides it can only run on
| Nvidia chips.
|
| All of this is akin to Tesla in recent years. They grew
| from a small startup to a medium-sized car maker; the %
| growth rate was huge of course, an amazing achievement in
| itself. But people projected that the % growth rate would
| continue, and the stock was priced accordingly. Reality is
| catching up with Tesla, even if some projections are still
| absolutely crazy.
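|
| (A quick back-of-the-envelope check of the multiples
| implied by the figures above; the dollar amounts are from
| the comment, the arithmetic is only an illustration:)
|
|     market_cap = 2.2e12    # $2.2T market cap
|     income_ttm = 30e9      # trailing-year net income
|     income_prior = 4.5e9   # prior year's net income
|
|     # ~73x trailing earnings, ~489x the prior year's
|     print(f"{market_cap / income_ttm:.0f}x")
|     print(f"{market_cap / income_prior:.0f}x")
|
|     # Scenario 1's ~$200B is ~6.7x trailing income, i.e.
|     # it assumes income falls most of the way back
|     print(f"{200e9 / income_ttm:.1f}x")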
| rvz wrote:
| > A lot of people were hoping for a big pop on some big
| development
|
| They are waiting for earnings projections for such a pop,
| since right now it is extremely overbought and struggling
| to move past $1,000 per share.
|
| For now, Microsoft and OpenAI will use these chips, but in
| the long term they are just looking at this and plotting
| to build their own chips, reducing their dependence on
| Nvidia, and they will be ready to switch once their
| contracts have run out.
| swalsh wrote:
| At 2 trillion, it's all baked in already
| dagmx wrote:
| FP8 being 2.5x Hopper is kind of disappointing after such
| a long time. Since it's 2 fused chips, that's only a ~25%
| effective per-die delta (2.5x across two dies is ~1.25x
| each).
|
| Though it seems most of the progress has been on memory
| throughput and power use, which is still very impressive.
|
| I wonder how this will trickle down to the consumer segment.
| azeirah wrote:
| Jensen revealed later that LLM inference is 30x due to
| architectural improvements; it's massive. I don't know if
| that's latency, or just a 2-3x performance boost with 30x
| more customers served on the same chip. Either way, 30x is
| massive.
| ephemeral-life wrote:
| 30x is the type of number that when you see it in a
| generational improvement, you should ignore it as marketing
| fluff.
| azeirah wrote:
| From how I understood it, it means they optimised the
| entire stack from CUDA to the networking interconnects
| specifically for data centers, meaning you get 30x more
| inference per dollar for a datacenter. This is probably not
| fluff, but it's only relevant for a very, very specific
| use case, i.e. enterprises with the money to buy a stack
| to serve thousands of users with LLMs.
|
| It doesn't matter for anyone who's not microsoft, aws or
| openai or similar.
| acchow wrote:
| They showed 30x was for FP4. Who is using FP4 in
| practice?
| KaoruAoiShiho wrote:
| But maybe you should. Once the software stack is ready
| for it, there'll be more people using it, since the
| performance gains are so massive.
| my123 wrote:
| The 30x number is for a really narrow scenario tbh:
| running a 1.8T-parameter GPT (w/ MoE) on one GB200.
| huac wrote:
| 'narrow scenario,' perhaps, but one that also happens to
| closely match rumors for GPT4's size
| dagmx wrote:
| Yeah, and the 30x is largely due to increases in factors
| like packaging and throughput. It's not indicative of
| general-purpose performance, which is what I was talking
| about.
|
| Again, I do think the throughput and energy efficiency
| gains are impressive, but the raw performance gain is
| lower than I'd have expected for such a massive leap in
| node size etc.
| modeless wrote:
| He always does that. They stack up a bunch of special case
| features like sparsity that most people don't use in practice
| to get these unrealistic numbers. It'll be faster, certainly,
| but 30x will only be achievable in very special cases I'm
| sure.
| cma wrote:
| Isn't sparsity almost always a win at this point? Making
| everything fully connected is a major waste.
| modeless wrote:
| The kind of sparsity that the hardware supports is not
| fully general. I'm not aware of any large models trained
| using it. Maybe they are all leaving 2x perf on the table
| for no reason, but maybe not. I don't think sparsity is
| really proven to be "almost always a win" for training.
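|
| (For reference, the hardware sparsity under discussion is
| Nvidia's 2:4 semi-structured kind: 2 of every 4 weights
| must be zero for the sparse tensor cores to deliver their
| ~2x. A rough PyTorch sketch of the constraint itself, as
| an illustration only, not how the hardware is programmed:)
|
|     import torch
|
|     def prune_2_4(w: torch.Tensor) -> torch.Tensor:
|         # Zero the 2 smallest-magnitude weights in every
|         # contiguous group of 4 (the 2:4 pattern).
|         groups = w.reshape(-1, 4)
|         idx = groups.abs().argsort(dim=1)[:, :2]
|         return groups.scatter(1, idx, 0.0).reshape(w.shape)
|
|     w = torch.randn(8, 8)
|     print(prune_2_4(w))  # exactly 2 zeros per group of 4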
| qwertox wrote:
| But Blackwell in the graph is FP4 whereas Hopper is FP8.
| YetAnotherNick wrote:
| How is 2.5x disappointing in one generation?
| chimney wrote:
| Compare to the 10x that was Hopper uplift.
| YetAnotherNick wrote:
| Because it involved scaling the chip area dedicated to
| FP8. The AI community realized a few years back that FP8
| training is possible, so the transistors allocated to FP8
| were scaled up. Overall I think transistor counts grow
| only ~50% per generation, so most of the gains came from
| shrinking the FP32/FP64 share that was dominant 10 years
| back, but that can only go so far.
| jairuhme wrote:
| I haven't listened to Jensen speak before, but am I the only one
| who thought the presentation wasn't very polished? Not a knock on
| anything he has accomplished, just an observation that sorta
| surprised me
| acchow wrote:
| He has more important things to do than perfecting a
| presentation. He likes his employees to message him freely with
| things they think he can help with.
| jefozabuss wrote:
| When he had that slide up about generating everything, I
| was kind of expecting him to say this whole keynote was
| generated, including him. That'd have been crazy.
| azeirah wrote:
| He said he didn't rehearse well. I think it makes him come
| across as very genuine, not some dumb hyper-polished
| corporate blabla.
| ipsum2 wrote:
| Yeah he said he didn't rehearse and it really shows.
| angm128 wrote:
| The products, animations and slides are doing some heavy
| lifting. Most jokes don't land, and his presentation is
| somewhat confusing at times (e.g. the Star Trek intro
| token count).
| swalsh wrote:
| He's selling water in a desert, kind of doesn't matter how
| polished his presentation is.
| cableshaft wrote:
| The beginning in particular seemed pretty rough, but he
| seemed to mostly get into a groove about halfway in. At
| least he started talking a lot more smoothly around then.
| sct202 wrote:
| I think it's a good reminder that objectively great CEOs
| and leaders can be kind of cringe when presenting. A lot
| of times people like that get passed over for promotions
| in favor of smooth talkers.
| modeless wrote:
| It's typical. He isn't a great public speaker IMO. Not terrible
| but not great.
| erupt7893 wrote:
| I've been watching his keynotes for as long as I can remember,
| this is how it's always been
| caycep wrote:
| I remember his opening line at NEURIPS 2017, to an audience of
| grad students and postdocs, "only Nvidia would unveil their
| most expensive product to an audience who's completely broke"
|
| Then he went into a comedic monologue about GANs. But hey,
| at least that meant the CEO was reading the actual
| conference proceedings...
| bluedino wrote:
| They acquired Bright Cluster Manager a few years ago; who
| would be next on their list to acquire? It seems like they
| want to provide customers with the whole stack.
| shiftpgdn wrote:
| Canonical is a ripe target. Canonical has been trying to grow
| Ubuntu and other tools in the enterprise world for the last few
| years without significant success, and much of the Nvidia
| devkit stuff is built around Ubuntu.
| echelon wrote:
| Please do not give them this idea.
|
| Ubuntu is actually a pretty great daily driver desktop Linux,
| and I'd hate for that to lose priority and disappear.
|
| I'm not a fan of what happened to the Red Hat ecosystem for
| exactly the same reasons.
| xmprt wrote:
| As someone who used Ubuntu in the past and has since moved
| on to greener pastures, I appreciate everything Canonical
| and Ubuntu have done for the Linux community, but there
| are many better options today and Canonical is already far
| from the company it once was.
| dgfitz wrote:
| Whenever I see an open job req for canonical I run for
| the hills.
| greggsy wrote:
| Tbh, Ubuntu's only pull is the support and breadth of
| users. As a desktop, it's let down by Unity, which IMHO is
| basically a port of the Windows 8 tablet UI.
|
| If they defaulted back to a menu-and-taskbar-based WM, it
| might actually be more approachable to users who are more
| familiar with macOS and Windows.
| kflansburg wrote:
| Run:AI https://news.ycombinator.com/item?id=39738342
| stevethomas wrote:
| Time to sell. When they start becoming a platform, it means they
| have nothing more concrete in the near future. Sell now and buy
| again later once the price corrects.
| belter wrote:
| Don't bet against a CEO who knows what he is talking
| about, has 80% market share, and has an arm tattoo of his
| own company's logo... :-)
|
| So far, the short sellers have learned that bitter lesson.
| golergka wrote:
| Does a company like Nvidia have to have anything more
| concrete than newer, bigger and faster chips?
| pvg wrote:
| If you held and then sold Nvidia stock when they announced
| CUDA or GeForce Live, you'd now be a big pile of negative
| money richer.
| Keyframe wrote:
| They still have to announce you can send an email through their
| platform.
| lvl102 wrote:
| Seems Nvidia is going for maximum margin as they see competition
| ahead.
| theGnuMe wrote:
| And they can build a big moat with CUDA.
| qwertox wrote:
| What is FP4, 4-bit floating point? If so, the comparison
| graph [0] showing 30x above Hopper was a bit misleading.
|
| [0] https://youtu.be/Y2F8yisiS6E?t=4698
| sipjca wrote:
| yes
| fancyfredbot wrote:
| That's right. There was mention of a precision-aware
| transformer engine which might make it easier to use FP4,
| but it's not 30x faster in a like-for-like way. This
| shouldn't be surprising, since it's more or less two
| Hoppers next to one another on a slightly improved process
| node. 2.5x seems more likely in cases where you don't
| exploit a new feature like that or the increased memory.
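|
| (For anyone wondering what a 4-bit float even looks like:
| a sketch of the E2M1 layout, 1 sign / 2 exponent / 1
| mantissa bit, which is the FP4 format in the OCP
| microscaling spec. That Blackwell's FP4 matches it is an
| assumption here, not something the keynote spelled out:)
|
|     def decode_fp4_e2m1(bits: int) -> float:
|         sign = -1.0 if (bits >> 3) & 1 else 1.0
|         exp = (bits >> 1) & 0b11  # 2 exponent bits, bias 1
|         man = bits & 1            # 1 mantissa bit
|         if exp == 0:              # subnormal: 0 or 0.5
|             return sign * man * 0.5
|         return sign * (1.0 + man * 0.5) * 2.0 ** (exp - 1)
|
|     # The 8 non-negative values: 0, 0.5, 1, 1.5, 2, 3, 4, 6
|     print([decode_fp4_e2m1(b) for b in range(8)])
|
| (Eight magnitudes per sign is the entire dynamic range,
| which is why the 30x figure only applies where such coarse
| quantization is tolerable.)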
| tamimio wrote:
| I think at this point they should stop making video
| "cards" and instead make video "stations": a full tower
| station with a power supply and one giant "card" inside
| with proper cooling, etc. That might also justify the
| crazy prices anyway.
| ufocia wrote:
| Probably better to stick to the GPUs. Integration is a low
| margin game.
| caycep wrote:
| granted, at this point we plug the computer into the GPU, so
| it might not make a difference...
| georgyo wrote:
| I'd prefer they stick to GPUs, but I think you're
| oversimplifying.
|
| Dell proves that selling complete units is very profitable.
|
| Apple shows that owning the entire stack is immensely
| profitable.
|
| Nvidia already has significant hardware and software
| investment. They very well could fully integrate and grab
| larger slices of the pie.
|
| In fact, Nvidia already sells complete, appliance-like,
| fully integrated machines. But enterprises like to install
| their own OS and run their own software stack, and these
| appliances have not caught on, at least not yet.
| jedberg wrote:
| https://lambdalabs.com/gpu-workstations/vector
| ribosometronome wrote:
| Isn't that just a computer? Or an eGPU, if it doesn't contain
| the rest of the computer?
| herecomethefuzz wrote:
| "Platform company" means multi-chip in this case?
|
| Seems logical since it's becoming impractical to cram so many
| transistors on a single die.
| 1oooqooq wrote:
| no, it means rent-seeking.
|
| imagine aws if they were also the only ones selling
| computers in the world; now you can only rent from them
___________________________________________________________________
(page generated 2024-03-18 23:00 UTC)