[HN Gopher] Nvidia announces next-gen RTX 5090 and RTX 5080 GPUs
       ___________________________________________________________________
        
       Nvidia announces next-gen RTX 5090 and RTX 5080 GPUs
        
       Author : somebee
       Score  : 400 points
       Date   : 2025-01-07 03:12 UTC (19 hours ago)
        
 (HTM) web link (www.theverge.com)
 (TXT) w3m dump (www.theverge.com)
        
       | ChrisArchitect wrote:
       | Official release: https://nvidianews.nvidia.com/news/nvidia-
       | blackwell-geforce-...
       | 
       | (https://news.ycombinator.com/item?id=42618849)
        
         | ryao wrote:
         | This thread was posted first.
        
       | jsheard wrote:
       | 32GB of GDDR7 at 1.8TB/sec for $2000, best of luck to the gamers
       | trying to buy one of those while AI people are buying them by the
       | truckload.
       | 
       | Presumably the pro hardware based on the same silicon will have
       | 64GB, they usually double whatever the gaming cards have.
        
         | codespin wrote:
         | At what point do we stop calling them graphics cards?
        
           | avaer wrote:
           | At what point did we stop calling them phones?
        
             | Whatarethese wrote:
             | Compute cards, AI Cards, or Business Cards.
             | 
             | I like business cards, I'm going to stick with that one.
             | Dibs.
        
               | stackghost wrote:
               | Let's see Paul Allen's GPU.
        
               | benreesman wrote:
               | Oh my god.
               | 
               | It even has a low mantissa FMA.
        
               | blitzar wrote:
               | The tasteful thickness of it.
        
               | aaronmdjones wrote:
               | Nice.
        
               | Yizahi wrote:
               | Business Cards is an awesome naming :)
        
           | paxys wrote:
           | Nvidia literally markets H100 as a "GPU"
           | (https://www.nvidia.com/en-us/data-center/h100/) even though
           | it wasn't built for graphics and I doubt there's a single
           | person or company using one to render any kind of graphics.
           | GPU is just a recognizable term for the product category, and
           | will keep being used.
        
             | philistine wrote:
             | General Purpose Unit.
        
               | taskforcegemini wrote:
               | General Processing Unit?
        
               | Y-bar wrote:
               | General Contact Unit (Very Little Gravitas Indeed).
        
             | ryao wrote:
             | Someone looked into running graphics on the A100, which is
             | the H100's predecessor. He found that it supports OpenGL:
             | 
             | https://www.youtube.com/watch?v=zBAxiQi2nPc
             | 
             | I assume someone is doing rendering on them given the
             | OpenGL support. In theory, you could do rendering in CUDA,
             | although it would be missing access to some of the hardware
             | that those who work with graphics APIs claim is needed for
             | performance purposes.
        
             | robotnikman wrote:
             | The Amazon reviews for the H100 are amusing
             | https://www.amazon.com/NVIDIA-Hopper-Graphics-5120-Bit-
             | Learn...
        
           | nickpsecurity wrote:
           | It's a good question. I'll note that, even in the GPGPU days
           | (eg BrookGPU), they were architecturally designed for
           | graphics applications (eg shaders). The graphics hardware was
           | being re-purposed to do something else. It was quite a
           | stretch to do the other things compared to massively-
           | parallel, general-purpose designs. They started adding more
           | functionality to them, like physics. Now, tensors.
           | 
           | While they've come a long way, I'd imagine they're still
           | highly specialized compared to general-purpose hardware and
           | maybe still graphics-oriented in many ways. One could test
           | this by comparing them to SGI-style NUMA machines, Tilera's
           | tile-based systems, or Adapteva's 1024-core design. Maybe
            | Ambric, given it aimed for generality, though the Am2045s
            | were DSP-style. They might still be GPUs if they still looked
            | more like GPUs side by side with such architectures.
        
             | ryao wrote:
             | GPUs have been processing "tensors" for decades. What they
             | added that is new is explicit "tensor" instructions.
             | 
             | A tensor operation is a generalization of a matrix
             | operation to include higher order dimensions. Tensors as
             | used in transformers do not use any of those higher order
             | dimensions. They are just simple matrix operations (either
             | GEMV or GEMM, although GEMV can be done by GEMM).
             | Similarly, vectors are matrices, which are tensors. We can
             | take this a step further by saying scalars are vectors,
             | which are matrices, which are tensors. A scalar is just a
             | length 1 vector, which is a 1x1 matrix, which is a tensor
             | with all dimensions set to 1.
             | 
              | As for the "tensor" instructions, they compute tiles for
              | GEMM if I recall my read of them correctly. They are just
              | doing matrix multiplications, which GPUs have done for
              | decades. The main difference is that you no longer need to
              | write code to process the GEMM tile yourself, since that is
              | now a higher level operation. This applies only to certain
              | types introduced for AI; the hardware designers still expect
              | code using FP32 or FP64 to process the GEMM tile the old
              | way.
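              | 
              | A minimal NumPy sketch of that point (illustrative only, not
              | how tensor cores are actually programmed): a GEMV is just a
              | GEMM whose right-hand operand has a single column, and a
              | scalar is the degenerate 1x1 case.
              | 
              |     import numpy as np
              | 
              |     A = np.random.rand(4, 3)        # weight matrix
              |     x = np.random.rand(3)           # activation vector
              | 
              |     gemv = A @ x                    # matrix-vector product
              |     gemm = A @ x.reshape(3, 1)      # same numbers via matrix-matrix
              | 
              |     assert np.allclose(gemv, gemm.ravel())
              | 
              |     s = np.array([[2.0]])           # a scalar as a 1x1 matrix
              |     assert (s @ s).shape == (1, 1)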
        
               | nickpsecurity wrote:
               | Thanks for the correction and insights!
        
           | msteffen wrote:
           | I mean HPC people already call them accelerators
        
           | MuffinFlavored wrote:
           | How long until a "PC" isn't CPU + GPU but just a GPU? I know
           | CPUs are good for some things that GPUs aren't and vice versa
           | but... it really kind of makes you wonder.
           | 
           | Press the power button, boot the GPU?
           | 
           | Surely a terrible idea, and I know system-on-a-chip makes
           | this more confusing/complicated (like Apple Silicon, etc.)
        
             | robin_reala wrote:
             | "Press the power button, boot the GPU" describes the
             | Raspberry Pi.
        
             | jampekka wrote:
             | Probably never if the GPU architecture resembles anything
             | like they currently are.
        
             | jerf wrote:
             | Never. You can to a first approximation model a GPU as a
             | whole bunch of slow CPUs harnessed together and ordered to
             | run the same code at the same time, on different data. When
             | you can feed all the slow CPUs different data and do real
             | work, you get the big wins because the CPU count times the
             | compute rate will thrash what CPUs can put up for that same
             | number, due to sheer core count. However, if you are in an
             | environment where you can only have one of those CPUs
             | running at once, or even a small handful, you're
             | transported back to the late 1990s in performance. And you
             | can't speed them up without trashing their GPU performance
             | because the optimizations you'd need are at direct odds
             | with each other.
             | 
             | CPUs are not fast or slow. GPUs are not fast or slow. They
             | are fast and slow _for certain workloads_. Contra popular
             | belief, CPUs are actually _really good_ at what they do,
             | and the workloads they are fast at are more common than the
              | workloads that GPUs are fast at. There's a lot to be said
             | for being able to bring a lot of power to bear on a single
             | point, and being able to switch that single point
             | reasonably quickly (but not instantaneously). There's also
             | a lot to be said for having a very broad capacity to run
             | the same code on lots of things at once, but it definitely
              | imposes a significant restriction on the shape of the
              | problem that it works for.
             | 
             | I'd say that broadly speaking, CPUs can make better GPUs
             | than GPUs can make CPUs. But fortunately, we don't need to
             | choose.
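              | 
              | A toy model of the lockstep picture above (made-up numbers,
              | only the shape of the argument matters): useful throughput
              | scales with how many lanes of a warp are doing real work, so
              | divergent or serial code falls off a cliff.
              | 
              |     WARP_WIDTH = 32     # lanes forced to run the same instruction
              |     LANE_RATE = 0.05    # work units per cycle per "slow CPU" lane
              | 
              |     def warp_throughput(active_lanes):
              |         # masked-off lanes sit idle, contributing nothing
              |         return active_lanes * LANE_RATE
              | 
              |     print(warp_throughput(WARP_WIDTH))  # all lanes busy: 1.6/cycle
              |     print(warp_throughput(1))           # serial path:    0.05/cycle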
        
           | WillPostForFood wrote:
           | We've looped back to the "math coprocessor" days.
           | 
           | https://en.wikipedia.org/wiki/Coprocessor
        
         | ryao wrote:
         | Do they double it via dual rank or clamshell mode? It is not
         | clear which approach they use.
        
         | wruza wrote:
          | Why do you need one of those as a gamer? The 1080 Ti did 120+
          | fps in heavy, realistic-looking games. 20xx RT slashed that back to 15
         | fps, but is RT really necessary to play games? Who cares about
         | real-world reflections? And reviews showed that RT+DLSS
         | introduced so many artefacts sometimes that the realism
         | argument seemed absurd.
         | 
         | Any modern card under $1000 is more than enough for graphics in
         | virtually all games. The gaming crisis is not in a graphics
         | card market at all.
        
           | bowsamic wrote:
           | > is RT really necessary to play games? Who cares about real-
           | world reflections?
           | 
           | I barely play video games but I definitely do
        
             | Vampiero wrote:
             | Indeed you're not a gamer, but you're the target audience
             | for gaming advertisements and $2000 GPUs.
             | 
             | I still play traditional roguelikes from the 80s (and their
             | modern counterparts) and I'm a passionate gamer. I don't
             | need a fancy GPU to enjoy the masterpieces. Because at the
             | end of the day nowhere in the definition of "game" is there
              | a requirement for realistic graphics -- and what passes
              | for realistic changes from decade to decade anyway. A game
             | is about gameplay, and you can have great gameplay with
             | barely any graphics at all.
             | 
             | I'd leave raytracing to those who like messing with GLSL on
             | shadertoy; now people like me have 0 options if they want a
             | good budget card that just has good raster performance and
             | no AI/RTX bullshit.
             | 
             | And ON TOP OF THAT, every game engine has turned to utter
             | shit in the last 5-10 years. Awful performance, awful
             | graphics, forced sub-100% resolution... And in order to get
             | anything that doesn't look like shit and runs at a passable
             | framerate, you need to enable DLSS. Great
        
               | bowsamic wrote:
               | I play roguelikes too
        
             | williamDafoe wrote:
             | 1. Because you shoot at puddles? 2. Because you play at
             | night after a rainstorm?
             | 
             | Really, these are the only 2 situations where ray tracing
             | makes much of a difference. We already have simulated
             | shadowing in many games and it works pretty well, actually.
        
               | bowsamic wrote:
               | I just find screen space effects a bit jarring
        
               | t-writescode wrote:
               | Yes, actually. A lot of games use water, a lot, in their
               | scenes (70% of the planet is covered in it, after all),
               | and that does improve immersion and feels nice to look
               | at.
               | 
               | Silent Hill 2 Remake and Black Myth: Wukong both have a
               | meaningful amount of water in them and are improved
               | visually with raytracing for those exact reasons.
        
           | nullandvoid wrote:
           | Many people are running 4k resolution now, and a 4080
            | struggles to break 100 frames in many current games maxed
            | (never-mind future titles) - so there's a sizable market of
            | gamers (myself included) for the 5x series, looking for
            | closer-to-4090 performance at a non-obscene price.
        
             | williamDafoe wrote:
             | This is just absolutely false, Steam says that 4.21% of
             | users play at 4K. The number of users that play at higher
             | than 1440p is only 10.61%. So you are wrong, simply wrong.
        
               | nullandvoid wrote:
               | Did I say all the people, or did I say many people?..
               | 
               | Why are you so hostile? I'm not justifying the cost, I'm
               | simply in the 4k market and replying to OP's statement
               | "Any modern card under $1000 is more than enough for
               | graphics in virtually all games" which is objectively
               | false if you're a 4k user.
        
               | int_19h wrote:
               | This is a chicken and egg thing, though - people don't
               | play at 4K because it requires spending a lot of $$$ on
               | top-of-the-line GPU, not because they don't want to.
        
           | me551ah wrote:
           | > Any modern card under $1000 is more than enough for
           | graphics in virtually all games
           | 
           | I disagree. I run a 4070 Super, Ryzen 7700 with DDR5 and I
            | still can't run Assetto Corsa Competizione in VR at 90fps.
            | MSFS 2024 runs at 30-something fps at medium settings. VR
            | gaming is a different beast.
        
             | Vampiero wrote:
             | Spending $2 quadrillion on a GPU won't fix poor raster
             | performance which is what you need when you're rendering
             | two frames side by side. Transistors only get so small
             | before AI slop is sold as an improvement.
        
           | rane wrote:
           | 1080ti is most definitely not powerful enough to play modern
           | games at 4k 120hz.
        
           | agloe_dreams wrote:
           | A bunch of new games are RT-only. Nvidia has aggressively
           | marketed on the idea that RT, FG, and DLSS are "must haves"
           | in game engines and that 'raster is the past'. Resolution is
           | also a big jump. 4K 120Hz in HDR is rapidly becoming common
           | and the displays are almost affordable (esp. so for TV-based
            | gaming). In fact, as of today, even the very fastest RTX 4090
           | cannot run CP2077 at max non-RT settings and 4K at 120fps.
           | 
           | Now, I do agree that $1000 is plenty for 95% of gamers, but
           | for those who want the best, Nvidia is pretty clearly holding
           | out intentionally. The gap between a 4080TI and a 4090 is
            | GIANT. Check this great comparison from Tom's Hardware:
            | https://cdn.mos.cms.futurecdn.net/BAGV2GBMHHE4gkb7ZzTxwK-120...
           | 
            | The biggest jump between adjacent offerings on the chart is
            | to the 4090.
        
             | wruza wrote:
             | I'm an ex-gamer, pretty recent ex-, and I own 4070Ti
             | currently (just to show I'm not a grumpy GTX guy). Max
             | settings are nonsensical. You never want to spend 50% of
             | frame budget on ASDFAA x64. Lowering AA alone to barely
             | noticeable levels makes a game run 30-50% faster*. Anyone
             | who chooses a graphics card may watch benchmarks and
             | basically multiply FPS by 1.5-2 because that's what
             | playable settings will be. And 4K is a matter of taste
             | really, especially in "TV" segment where it's a snakeoil
             | resolution more than anything else.
             | 
             | * also you want to ensure your CPU doesn't C1E-power-cycle
              | every frame and your frametimes don't look like an EKG.
             | There's much more to performance tuning than just buying a
             | $$$$$ card. It's like installing a V12 engine into a rusted
              | Fiat. If you want performance, you want RTSS, AB, driver
             | settings, bios settings, _then_ 4090.
        
           | berbec wrote:
           | I get under 50fps in certain places in FF14. I run a 5900x
           | with 32GB of ram and a 3090.
        
             | williamDafoe wrote:
             | The 3090 + 5900x is a mistake. The 5900x is 2 x 5600x CPUs.
              | Therefore, when a game asks for 8 cores, it will get
             | 6 good cores and 2 very slow cores across the infinity
             | switching fabric. What's more, NVidia GPUs take MUCH MORE
             | CPU than AMD GPUs. You should either buy an AMD GPU or
             | upgrade/downgrade to ANYTHING OTHER THAN 5900x with 8+
             | cores (5800x, 5800, 5700, 5700x3d, 5950x, 5900xt, anything
             | really ...)
        
           | ErneX wrote:
           | These are perfect for games featuring path tracing. Not many
            | games support it yet, but those that do really flex the 4090.
        
           | some_random wrote:
           | It's a leisure activity, "necessary" isn't the metric to be
           | used here, people clearly care about RT/PT while DLSS seems
           | to be getting better and better.
        
           | pknomad wrote:
           | You need as much FPS as possible for certain games for
           | competitive play like Counter Strike.
           | 
           | I went from 80 FPS (highest settings) to 365 FPS (capped to
           | my alienware 360hz monitor) when I upgraded from my old rig
            | (i7-8700K and GTX 1070) to a new one (7800X3D and RTX 3090).
        
           | ryao wrote:
           | > Any modern card under $1000 is more than enough for
           | graphics in virtually all games. The gaming crisis is not in
           | a graphics card market at all.
           | 
           | You will love the RTX 5080 then. It is priced at $999.
        
           | t-writescode wrote:
           | > Who cares about real-world reflections?
           | 
           | Me. I do. I *love* raytracing; and, as has been said and seen
           | for several of the newest AAA games, raytracing is no longer
           | optional for the newest games. It's required, now. Those
           | 1080s, wonderful as long as they have been (and they have
           | been truly great cards) are definitely in need of an upgrade
           | now.
        
         | Hilift wrote:
         | 100% you will be able to buy them. And receive a rock in the
         | package from Amazon.
        
       | malnourish wrote:
       | I will be astonished if I'll be able to get a 5090 due to
       | availability. The 5080's comparative lack of memory is a buzzkill
       | -- 16 GB seems like it's going to be a limiting factor for 4k
       | gaming.
       | 
       | Does anyone know what these might cost in the US after the
       | rumored tariffs?
        
         | ericfrederich wrote:
          | 4k gaming is dumb. I watched an LTT video that came out today
         | where Linus said he primarily uses gaming monitors and doesn't
         | mess with 4k.
        
           | Our_Benefactors wrote:
           | There are good 4K gaming monitors, but they start at over
           | $1200 and if you don't also have a 4090 tier rig, you won't
           | be able to get full FPS out of AAA games at 4k.
        
             | archagon wrote:
             | I still have a 3080 and game at 4K/120Hz. Most AAA games
             | that I try can pull 60-90Hz at ~4K if DLSS is available.
        
             | out_of_protocol wrote:
             | Also, ultrawide monitors. They exist, provide more
             | immersion. And typical resolution is 3440x1440 which is
            | high and at the same time has low ppi (basically a regular
             | 27" 1440p monitor with extra width). Doubling that is way
             | outside modern GPU capabilities
        
               | FuriouslyAdrift wrote:
               | A coworker who is really into flight sims runs 6
               | ultrawide curved monitors to get over 180 degrees around
               | his head.
               | 
               | I have to admit with the display wrapping around into
               | peripheral vision, it is very immersive.
        
           | kcb wrote:
           | No it's not. 2560x1440 has terrible PPI on larger screens.
           | Either way with a 4k monitor you don't technically need to
           | game at 4k as most intensive games offer DLSS anyway.
        
             | snvzz wrote:
             | And FSR, which is cross gpu vendor.
        
               | SirMaster wrote:
               | Not anymore. FSR4 is AMD only, and only the new RDNA4
               | GPUs.
        
             | perching_aix wrote:
             | What matters is the PPD, not the PPI, otherwise it's an
             | unsound comparison.
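              | 
              | For reference, a rough PPD calculation (standard geometry;
              | the panel sizes and viewing distance below are just example
              | values):
              | 
              |     import math
              | 
              |     def ppd(h_pixels, screen_width_in, distance_in):
              |         fov = 2 * math.degrees(math.atan(screen_width_in / (2 * distance_in)))
              |         return h_pixels / fov
              | 
              |     # a 27" 16:9 panel is ~23.5" wide, a 32" one is ~27.9" wide
              |     print(round(ppd(2560, 23.5, 24)))   # 1440p 27" at 24": ~49 PPD
              |     print(round(ppd(3840, 27.9, 24)))   # 4K 32" at 24":    ~64 PPD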
        
               | kcb wrote:
               | Too much personal preference with PPD. When I upgraded to
                | a 32" monitor from a 27" one I didn't push my display
               | through my wall, it sat in the same position.
        
               | perching_aix wrote:
               | Not entirely clear on what you mean, but if you refuse to
               | reposition your display or yourself after hopping between
               | diagonal sizes and resolutions, I'd say it's a bit
               | disingenuous to blame or praise either afterwards.
               | Considering you seem to know what PPD is, I think you
               | should be able to appreciate the how and why.
        
           | zeroonetwothree wrote:
           | Yep. I have both 4k and 1440p monitors and I can't tell the
           | difference in quality so I always use the latter for better
           | frames. I use the 4k for reading text though, it's noticeably
           | better.
        
             | munchbunny wrote:
             | That's why I also finally went from 1920x1200 to 4k about
             | half a year ago. It was mostly for reading text and
             | programming, not gaming.
             | 
             | I can tell the difference in games if I go looking for it,
             | but in the middle of a tense shootout I honestly don't
             | notice that I have double the DPI.
        
           | ggregoire wrote:
            | I watched the same video you're talking about [1], where he's
           | trying the PG27UCDM (new 27" 4K 240Hz OLED "gaming monitor"
           | [2]) and his first impressions are "it's so clean and sharp",
           | then he starts Doom Eternal and after a few seconds he says
           | "It's insane [...] It looks perfect".
           | 
           | [1] https://www.youtube.com/watch?v=iQ404RCyqhk
           | 
           | [2] https://rog.asus.com/monitors/27-to-31-5-inches/rog-
           | swift-ol...
        
           | akimbostrawman wrote:
           | Taking anything Linus or LTT says seriously is even
           | dumber....
        
           | Yeul wrote:
            | Nonsense, 4k gaming was inevitable as soon as 4k TVs got
           | mainstream.
        
         | stego-tech wrote:
         | Honestly, with how fast memory is being consumed nowadays and
         | the increased focus on frame generation/interpolation vs "full
         | frames", I'll keep my 3090 a little longer instead of upgrading
         | to a 5080 or 5090. It's not the fastest, but it's a solid card
         | even in 2025 for 1440p RT gaming on a VRR display, and the
         | memory lets me tinker with LLMs without breaking a sweat.
         | 
         | If DLSS4 and "MOAR POWAH" are the only things on offer versus
         | my 3090, it's a hard pass. I need efficiency, not a bigger TDP.
        
           | DimmieMan wrote:
           | I use my 3090 on a 4K TV and still don't see a need, although
           | a lot of that is being bored with most big budget games so I
           | don't have many carrots to push me to upgrade.
           | 
           | Turn down a few showcase features and games still look great
           | and run well with none or light DLSS. UE5 Lumen/ray tracing
           | are the only things I feel limited on and until consoles can
           | run them they'll be optional.
           | 
           | It seems all the gains are brute forcing these features with
           | upscaling & frame generation which I'm not a fan of anyway.
           | 
           | Maybe a 7090 at this rate for me.
        
           | ziml77 wrote:
           | Efficiency is why I switched from a 3090 to a 4080. The
           | amount of heat generated by my PC was massively reduced with
           | that change. Even if the xx90 weren't jumping up in price
           | each generation, I wouldn't be tempted to buy one again (I
           | didn't even really want the 3090, but that was during the
           | supply shortages and it was all I could get my hands on).
        
           | ryao wrote:
           | Pricing for the next generation might be somewhat better if
           | Nvidia switches to Samsung for 2nm like the rumors suggest:
           | 
           | https://wccftech.com/nvidia-is-rumored-to-switch-towards-
           | sam...
           | 
           | Coincidentally, the 3090 was made using Samsung's 8nm
           | process. You would be going from one Samsung fabricated GPU
           | to another.
        
             | lordofgibbons wrote:
             | NVidia's pricing isn't based on how much it takes to
             | produce their cards, but since they have no competition,
             | it's purely based on how much consumers are grudgingly
                | willing to pay up. If AMD continues to sleep, they'll sell
             | these cards for the same price, even if they could produce
             | them for free.
        
               | ryao wrote:
               | Nvidia's Titan series cards always were outrageously
               | priced for the consumer market. The 5090 is a Titan
               | series card in all but name.
               | 
               | I suspect there is a correlation to the price that it
               | costs Nvidia to produce these. In particular, the price
               | is likely 3 times higher than the production and
               | distribution costs. The computer industry has always had
               | significant margins on processors.
        
               | Yeul wrote:
               | AMD is not sleeping. They publicly admitted that they
                | threw in the towel: they have exited the high-end market.
        
               | stego-tech wrote:
               | And if these 50-series specs are anything to go by, they
               | made a good call in doing so. All the big improvements
               | are coming in mid-range cards, where AMD, nVidia, and
               | Intel(!) are trading blows.
               | 
               | If the only way to get better raw frames in modern GPUs
               | is to basically keep shoveling power into them like an
               | old Pentium 4, then that's not exactly an enticing or
               | profitable space to be in. Best leave that to nVidia and
               | focus your efforts on a competitive segment where cost
               | and efficiency are more important.
        
           | lemoncookiechip wrote:
           | DLSS4 is coming to other RTX cards, eventually.
           | https://www.nvidia.com/en-us/geforce/news/dlss4-multi-
           | frame-...
        
       | glimshe wrote:
       | Let's see the new version of frame generation. I enabled DLSS
       | frame generation on Diablo 4 using my 4060 and I was very
       | disappointed with the results. Graphical glitches and partial
       | flickering made the game a lot less enjoyable than good old 60fps
       | with vsync.
        
         | ziml77 wrote:
         | The new DLSS 4 framegen really needs to be much better than
         | what's there in DLSS 3. Otherwise the 5070 = 4090 comparison
         | won't just be very misleading but flatly a lie.
        
           | sliken wrote:
           | Seems like pretty heavily stretched truth. Looks like the
           | actual performance uplift is more like 30%. The 5070=4090
           | comes from generating multiple fake frames per actual frame
           | and using different versions of DLSS on the cards. Multiple
           | frame generation (required for 5070=4090) increases latency
           | between user input and updated pixels and can also cause
           | artifacts when predictions don't match what the game engine
           | would display.
           | 
           | As always wait for fairer 3rd party reviews that will compare
           | new gen cards to old gen with the same settings.
        
             | jakemoshenko wrote:
             | > Multiple frame generation (required for 5070=4090)
             | increases latency between user input and updated pixels
             | 
             | Not necessarily. Look at the reprojection trick that lots
             | of VR uses to double framerates with the express purpose of
             | decreasing latency between user movements and updated
             | perspective. Caveat: this only works for movements and
             | wouldn't work for actions.
        
         | evantbyrne wrote:
         | The main edge Nvidia has in gaming is ray tracing performance.
         | I'm not playing any RT heavy titles and frame gen being a mixed
         | bag is why I saved my coin and got a 7900 XTX.
        
         | roskelld wrote:
         | There's some very early coverage on Digital Foundry where they
         | got to look at the 5080 and Cyberpunk.
         | 
         | https://youtu.be/xpzufsxtZpA
        
       | lostmsu wrote:
       | Did they discontinue Titan series for good?
        
         | greenknight wrote:
          | The last Titan was released in 2018... 7 years ago.
         | 
         | They may resurrect it at some stage, but at this stage yes.
        
         | coffeebeqn wrote:
         | Yes the xx90 is the new Titan
        
         | ryao wrote:
         | The 3090, 3090 Ti, 4090 and 5090 are Titan series cards. They
         | are just no longer labelled Titan.
        
       | smcleod wrote:
        | It's a shame to see they max out at just 32GB; for that price in
        | 2025 you'd be hoping for a lot more, especially with Apple
       | Silicon - while not nearly as fast - being very usable with
       | 128GB+ for LLMs for $6-7k USD (comes with a free laptop too ;))
        
         | ryao wrote:
         | Presumably the workstation version will have 64GB of VRAM.
         | 
         | By the way, this is even better as far as memory size is
         | concerned:
         | 
         | https://www.asrockrack.com/minisite/AmpereAltraFamily/
         | 
         | However, memory bandwidth is what matters for token generation.
         | The memory bandwidth of this is only 204.8GB/sec if I
         | understand correctly. Apple's top level hardware reportedly
         | does 800GB/sec.
        
           | lostmsu wrote:
           | All of this is true only while no software is utilizing
           | parallel inference of multiple LLM queries. The Macs will hit
           | the wall.
        
             | ryao wrote:
             | People interested in running multiple LLM queries in
             | parallel are not people who would consider buying Apple
             | Silicon.
        
               | int_19h wrote:
               | There are other ways to parallelize even a single query
               | for faster output, e.g. speculative decoding with small
               | draft models.
        
           | sliken wrote:
            | AMD Strix Halo is 256GB/sec or so, and AMD's EPYC Siena
            | family is similar. The EPYC Turin family (Zen 5) has
           | 576GB/sec or so per socket. Not sure how well any of them do
           | on LLMs. Bandwidth helps, but so does hardware support for
           | FP8 or FP4.
        
             | ryao wrote:
             | Memory bandwidth is the most important thing for token
             | generation. Hardware support for FP8 or FP4 probably does
             | not matter much for token generation. You should be able to
             | run the operations on the CPU in FP32 while reading/writing
             | them from/to memory as FP4/FP8 by doing conversions in the
             | CPU's registers (although to be honest, I have not looked
             | into how those conversions would work). That is how
             | llama.cpp supports BF16 on CPUs that have no BF16 support.
             | Prompt processing would benefit from hardware FP4/FP8
             | support, since prompt processing is compute bound, not
             | memory bandwidth bound.
             | 
             | As for how well those CPUs do with LLMs. The token
             | generation will be close to model size / memory bandwidth.
             | At least, that is what I have learned from local
             | experiments:
             | 
             | https://github.com/ryao/llama3.c
             | 
             | Note that prompt processing is the phase where the LLM is
             | reading the conversation history and token generation is
             | the phase where the LLM is writing a response.
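              | 
              | A back-of-the-envelope version of that rule of thumb
              | (tokens/sec ~ memory bandwidth / bytes of weights streamed
              | per token; numbers are illustrative, not measurements):
              | 
              |     def tokens_per_sec(params_billions, bytes_per_param, bandwidth_gb_s):
              |         model_gb = params_billions * bytes_per_param
              |         return bandwidth_gb_s / model_gb
              | 
              |     # 70B model quantized to ~1 byte/param (~70 GB of weights)
              |     print(tokens_per_sec(70, 1.0, 204.8))  # ~2.9 tok/s at 204.8 GB/s
              |     print(tokens_per_sec(70, 1.0, 800.0))  # ~11.4 tok/s at 800 GB/s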
             | 
             | By the way, you can get an ampere altra motherboard + CPU
             | for $1,434.99:
             | 
             | https://www.newegg.com/asrock-rack-
             | altrad8ud-1l2t-q64-22-amp...
             | 
              | I would be shocked if you can get any EPYC CPU with
             | similar/better memory bandwidth for anything close to that
             | price. As for Strix Halo, anyone doing local inference
             | would love it if it is priced like a gaming part. 4 of them
             | could run llama 3.1 405B on paper. I look forward to seeing
             | its pricing.
        
               | sliken wrote:
                | Hmm, seems pretty close. Not sure how the memory channels
                | relate to performance. But the Ampere board above has 8
                | 64-bit channels @ 3200 MHz, while the AMD Turins have 24
                | 32-bit channels @ 6400 MHz. So the AMD memory system is
                | 50% wider, 2x the clock, and 3x the channels.
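                | 
                | Peak-bandwidth arithmetic for those two configurations
                | (channels x channel width x transfer rate; real-world
                | numbers land below these theoretical peaks):
                | 
                |     def peak_gb_s(channels, width_bits, mt_s):
                |         return channels * (width_bits / 8) * mt_s / 1000
                | 
                |     print(peak_gb_s(8, 64, 3200))    # Altra board:   204.8 GB/s
                |     print(peak_gb_s(24, 32, 6400))   # Turin @ 6400:  614.4 GB/s
                |     print(peak_gb_s(24, 32, 6000))   # Turin @ 6000:  576.0 GB/s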
               | 
               | As for price the AMD Epyc Turin 9115 is $726 and a common
               | supermicro motherboard is $750. Both the Ampere and AMD
               | motherboards have 2x10G. No idea if the AMD's 16 cores
               | with Zen 5 will be able to saturate the memory bus
               | compared to 64 cores of the Amphere Altra.
               | 
               | I do hope the AMD Strix Halo is reasonably priced (256
                | bits wide @ 8533 MHz), but if not the Nvidia Digits (GB10)
                | looks promising. 128GB ram, likely a wider memory system,
                | and 1 Pflop of FP4 sparse. It's going to be $3k, but with
                | 128GB ram that is approaching reasonable. It likely has
                | around 500GB/sec of memory bandwidth, but that is
                | speculation.
               | 
               | Interesting Ampere board, thanks for the link.
        
         | jsheard wrote:
          | Apple Silicon's architecture is better for running huge AI
          | models but much worse for just about anything else that you'd
          | want to run on a GPU; bandwidth is _far_ more important in most
          | other applications.
         | 
          | That's not even close: the M4 Max 12C has less than a third of
          | the 5090's memory throughput and the 10C version has less than a
         | quarter. The M4 Ultra should trade blows with the 4090 but
         | it'll still fall well short of the 5090.
        
         | whywhywhywhy wrote:
          | It just isn't comparable speed-wise for anything apart from
          | LLMs, and in the long run you can double up and swap out
          | Nvidia cards, while with a Mac you need to rebuy the whole
          | machine.
        
         | FuriouslyAdrift wrote:
         | Guess you missed the Project Digits announcement... desktop
         | supercomputer for AI at $3k (128 GB ram)
         | 
         | https://www.nvidia.com/en-us/project-digits/
        
       | PaulKeeble wrote:
       | Looks like most of the improvement is only going to come when
        | DLSS4 is in use and it's generating most of the frame for Ray
        | Tracing and then also generating 3 predicted frames. When you use
        | all that AI hardware then it's maybe 2x, but I do wonder how much
       | fundamental rasterisation + shaders performance gain there is in
       | this generation in practice on the majority of actual games.
        
         | DimmieMan wrote:
         | Yeah I'm not holding my breath if they aren't advertising it.
         | 
         | I'm expecting a minor bump that will look less impressive if
         | you compare it to watts, these things are hungry.
         | 
         | It's hard to get excited when most of the gains will be limited
         | to a few new showcase AAA releases and maybe an update to a
          | couple of your favourites if you're lucky.
        
           | coffeebeqn wrote:
           | It feels like GPUs are now well beyond what game studios can
           | put out. Consoles are stuck at something like RTX 2070 levels
           | for some years still. I hope Nvidia puts out some budget
           | cards for 50 series
        
             | DimmieMan wrote:
             | At the same time they're still behind demand as most of the
             | pretty advertising screenshots and frame rate bragging have
             | been behind increasingly aggressive upscaling.
             | 
              | On PC you can turn down the fancy settings at least, but for
             | consoles I wonder if we're now in the smudgy upscale era
             | like overdone bloom or everything being brown.
        
         | jroesch wrote:
          | There was some solid commentary on the PS5 Pro tech talk stating
          | core rendering is so well optimized that much of the gains in
          | the future will come from hardware process technology
          | improvements, not from radical architecture changes. It seems
          | clear the future of rendering is likely to be a world where the
          | gains come from things like DLSS, and less from free-lunch
          | savings due to easy optimizations.
        
           | jayd16 wrote:
           | Nanite style rendering still seems fairly green. That could
           | take off and they decide to re-implement the software
           | rasterization in hardware.
        
             | jms55 wrote:
              | Raster is, believe it or not, not quite the bottleneck.
             | Raster speed definitely _matters_, but it's pretty fast
             | even in software, and the bigger bottleneck is just overall
             | complexity. Nanite is a big pipeline with a lot of
             | different passes, which means lots of dispatches and memory
             | accesses. Same with material shading/resolve after the
             | visbuffer is rendered.
             | 
             | EDIT: The _other_ huge issue with Nanite is overdraw with
             | thin/aggregate geo that 2pass occlusion culling fails to
             | handle well. That's why trees and such perform poorly in
             | Nanite (compared to how good Nanite is for solid opaque
             | geo). There's exciting recent research in this area though!
             | https://mangosister.github.io/scene_agn_site.
        
         | WeylandYutani wrote:
          | Like how you cannot distinguish reality from CGI in movies,
          | DLSS will also become perfected over the years.
        
         | yakaccount4 wrote:
         | 3 Generated frames sounds like a lot of lag, probably a
         | sickening amount for many games. The magic of "blackwell flip
         | metering" isn't quite described yet.
        
           | dagmx wrote:
           | It's 3 extrapolated frames not interpolated. So would be
           | reduced lag at the expense of greater pop-in.
           | 
           | There's also the new reflex 2 which uses reprojection based
           | on mouse motion to generate frames that should also help, but
           | likely has the same drawback.
        
             | perching_aix wrote:
             | > It's 3 extrapolated frames not interpolated.
             | 
             | Do you have a source for this? Doesn't sound like a very
             | good idea. Nor do I think there's additional latency mind
             | you, but not because it's not interpolation.
        
               | dagmx wrote:
               | https://www.nvidia.com/en-us/geforce/news/dlss4-multi-
               | frame-...
        
               | perching_aix wrote:
               | Could you please point out where on that page does it say
               | anything about "extrapolation"? Searched for the
               | (beginning of the) word directly and even gave all the
               | text a skim, didn't catch anything of the sort.
        
               | vel0city wrote:
               | Interpolation means you have frame 1 and frame 2, now
               | compute the interstitial steps between these two.
               | 
               | Extrapolation means you have frame 1, and sometime in the
               | future you'll get a frame 2. But until then, take the
               | training data and the current frame and "guess" what the
               | next few frames will be.
               | 
                | Interpolation requires you to already have the frame on
                | either side of the added frames; extrapolation means you
                | don't yet know what the final state will be but you'll
                | keep drawing until you get there.
               | 
               | You shouldn't get additional latency from generating,
               | assuming it's not slowing down the traditional render
               | generation pipeline.
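                | 
                | A toy timeline of where the latency difference comes from
                | (units are one rendered frame; purely illustrative, not how
                | DLSS is implemented):
                | 
                |     RENDER_TIME = 1.0  # time to render one real frame
                | 
                |     def interpolation_delay():
                |         # frame N can't be shown until frame N+1 exists,
                |         # since the inserted frames blend both endpoints
                |         return RENDER_TIME
                | 
                |     def extrapolation_delay():
                |         # predicted frames go out right after frame N;
                |         # mispredictions show up as artifacts, not lag
                |         return 0.0
                | 
                |     print(interpolation_delay(), extrapolation_delay())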
        
               | perching_aix wrote:
               | I understand this - doesn't address anything of what I
               | said.
        
               | gruez wrote:
               | https://www.nvidia.com/content/dam/en-
               | zz/Solutions/geforce/n...
        
               | ryao wrote:
               | Jensen Huang said during his keynote that you get 3 AI
               | generated frames when rendering a native frame.
        
             | kllrnohj wrote:
             | > It's 3 extrapolated frames not interpolated. So would be
             | reduced lag at the expense of greater pop-in.
             | 
             | it's certainly not reduced lag relative to native
             | rendering. It might be reduced relative to dlss3 frame gen
             | though.
        
               | ryao wrote:
               | https://news.ycombinator.com/item?id=42623153
        
               | kllrnohj wrote:
               | This isn't relevant to what I said?
        
             | kllrnohj wrote:
             | Digital Foundry just covered this. 3x and 4x both add
             | additional latency on top of 2x.
             | 
             | https://youtu.be/xpzufsxtZpA?si=hZZlX-g_nueAd7-Q
        
         | kllrnohj wrote:
         | > but I do wonder how much fundamental rasterisation + shaders
         | performance gain there is in this generation in practice on the
         | majority of actual games.
         | 
         | likely 10-30% going off of both the cuda core specs (nearly
         | unchanged gen/gen for everything but the 5090) as well as the 2
         | benchmarks Nvidia published that didn't use dlss4 multi frame
         | gen - Far Cry 6 & A Plague Tale
         | 
         | https://www.nvidia.com/en-us/geforce/graphics-cards/50-serie...
        
         | williamDafoe wrote:
         | Given that Jensen completely omitted ANY MENTION of
         | rasterization performance, I think we can safely assume it's
         | probably WORSE in the 5000 series than the 4000 series, given
          | the large price cuts applied to every card below the 5090
         | (NVidia was never happy charging $1000 for the 4080 super - AMD
         | forced them to do it with the 7900xtx).
        
       | paxys wrote:
       | Even though they are all marketed as gaming cards, Nvidia is now
       | very clearly differentiating between 5070/5070 Ti/5080 for mid-
       | high end gaming and 5090 for consumer/entry-level AI. The gap
       | between xx80 and xx90 is going to be too wide for regular gamers
       | to cross this generation.
        
         | kcb wrote:
         | Yup, the days of the value high end card are dead it seems
         | like. I thought we would see a cut down 4090 at some point last
         | generation but it never happened. Surely there's a market gap
         | somewhere between 5090 and 5080.
        
           | smallmancontrov wrote:
           | Yes, but Nvidia thinks enough of them get pushed up to the
           | 5090 to make the gap worthwhile.
           | 
           | Only way to fix this is for AMD to decide it likes money. I'm
           | not holding my breath.
        
             | kaibee wrote:
             | Don't necessarily count Intel out.
        
               | romon wrote:
               | Intel is halting its construction of new factories and
               | mulling over whether to break up the company...
        
               | User23 wrote:
               | Intel's Board is going full Kodak.
        
               | 63 wrote:
               | I wouldn't count Intel out in the long term, but it'll
               | take quite a few generations for them to catch up and who
               | knows what the market will be like by then
        
               | blitzar wrote:
               | Intel hate making money even more than AMD.
        
               | hylaride wrote:
               | Starting around 2000, Intel tried to make money via
               | attempts at everything but making a better product
               | (pushing RAMBUS RAM, itanium, cripling low-end chips more
               | than they needed to be, focusing more on keeping chip
               | manufacturing in-house thereby losing out on economy of
               | scale). The result was engineers were (not always, but
               | too often) nowhere near the forefront of technology. Now
               | AMD, NVIDIA, ARM are all chipping away (pun intended).
               | 
               | It's not dissimilar to what happened to Boeing. I'm a
               | capitalist, but the current accounting laws (in
               | particular corporate taxation rules) mean that all
                | companies are pushed to use money for stock buybacks
                | rather than R&D (Intel spent more on the former than the
                | latter over the past decade, and I'm watching Apple stagnate
               | before my eyes).
        
               | FuriouslyAdrift wrote:
               | Intel's Arc B580 budget card is selling like hotcakes...
               | https://www.pcworld.com/article/2553897/intel-
               | arc-b580-revie...
        
               | blitzar wrote:
               | They fired the CEO for daring to make a product such as
               | this. The 25mil they paid to get rid of him might even
               | wipe out their profits on this product.
        
             | FuriouslyAdrift wrote:
             | AMD announced they aren't making a top tier card for the
             | next generation and is focusing on mid-tier.
             | 
              | Next generation, they are finally reversing course and
             | unifying their AI and GPU architectures (just like nVidia).
             | 
             | 2026 is the big year for AMD.
        
               | officeplant wrote:
               | AMD's GPU marketing during CES has been such a shit show.
               | No numbers, just adjectives and vibes. They're either
               | hiding their hand, or they continue to have nothing to
               | bring to the table.
               | 
               | Meanwhile their CPU marketing has numbers and graphs
                | because they're at the top of their game and have nothing
               | to hide.
               | 
               | I'm glad they exist because we need the competition, but
               | the GPU market continues to look dreary. At least we have
               | a low/mid range battle going on between the three
               | companies to look forward to for people with sensible
               | gaming budgets.
        
           | ryao wrote:
           | The xx90 cards are really Titan cards. The 3090 was the
           | successor to the Titan RTX, while the 3080 Ti was the
           | successor to the 2080 Ti, which succeeded the 1080 Ti. This
           | succession continued into the 40 series and now the 50
           | series. If you consider the 2080 Ti to be the "value high end
           | card" of its day, then it would follow that the 5080 is the
           | value high end card today, not the 5090.
        
             | kcb wrote:
             | In all those historical cases the second tier card was a
             | cut down version of the top tier one. Now the 4080 and 5080
             | are a different chip and there's a gulf of a performance
             | gap between them and the top tier. That's the issue I am
             | highlighting, the 5080 is half a 5090, in the past a 3080
             | was only 10% off a 3090 performance wise.
        
         | ziml77 wrote:
         | The 4090 already seemed positioned as a card for consumer AI
         | enthusiast workloads. But this $1000 price gap between the 5080
         | and 5090 seems to finally cement that. Though we're probably
         | still going to see tons of tech YouTubers making videos
         | specifically about how the 5090 isn't a good value for gaming
         | as if it even matters. The people who want to spend $2000 on a
         | GPU for gaming don't care about the value and everyone else
         | already could see it wasn't worth it.
        
           | dijit wrote:
           | From all the communication I've had with Nvidia, the
           | prevailing sentiment was that the 4090 was an 8K card, that
           | _happened_ to be good for AI due to vram requirements from 8K
           | gaming.
           | 
            | However, I'm an AAA gamedev CTO and they might have been
           | telling me what the card means _to me_.
        
             | ziml77 wrote:
             | I do recall an 8K push but I thought that was on the 3090
             | (and was conditional on DLSS doing the heavy lifting). I
             | don't remember any general marketing about the 4090 being
             | an 8K card but I could very well have missed it or be
             | mixing things up! I mean it does make sense to market it
             | for 8K since anyone who is trying to drive that many pixels
             | when gaming probably has deep pockets.
        
               | ryao wrote:
               | I recall the 3090 8K marketing too. However, I also
               | recall Nvidia talking about 8K in reference to the 4090:
               | 
               | https://www.nvidia.com/en-us/geforce/technologies/8k/
               | 
               | That said, I recall that the media was more enthusiastic
               | about christening the 4090 as an 8K card than Nvidia was:
               | 
               | https://wccftech.com/rtx-4090-is-the-first-
               | true-8k-gaming-gp...
        
             | ryao wrote:
             | I recall them making the same claims about the 3090:
             | 
             | https://www.nvidia.com/en-us/geforce/news/geforce-
             | rtx-3090-8...
        
             | Refusing23 wrote:
             | Seems kinda silly to make an 8K video card when ... nobody
             | on the planet has an 8K screen
        
               | dijit wrote:
               | 2018 (6 years ago):
               | https://www.techradar.com/reviews/dell-ultrasharp-up3218k
               | 
               | It's uncommon, sure, but as mentioned it was sold to me
               | as being a development board for future resolutions.
        
               | gnabgib wrote:
               | Perhaps you don't, but several of us do. They've been
               | around a while, available in your local bestbuy/costco if
               | you're rocking a 4:4:4 TV they're not even particularly
               | pricey and great for computing (depending on the subpixel
               | layout).
               | 
               | On the planet? Many people. Maybe you're thinking 12K or
               | 16K.
        
               | jkolio wrote:
               | It's been a few years since I worked at [big tech
               | retailer], but 8K TVs basically didn't sell at the time.
               | There was basically no native content - even the demos
               | were upscaled 4K - and it was very hard to tell the
               | difference between the two unless you were so close to
               | the screen that you couldn't see the whole thing. For the
               | content that was available, either you were dealing with
               | heavy compression or setting up a high-capacity server,
               | since file sizes basically necessitated most of the space
               | on what people would consider a normal-sized hard drive
               | to store just a few movies.
               | 
               | The value just wasn't there and probably won't ever be
               | for most use cases. XR equipment might be an exception,
               | video editing another.
        
               | duffyjp wrote:
               | I got 4K TVs for both of my kids, they're dirt cheap--
               | sub $200. I'm surprised the Steam hardware survey doesn't
               | show more. A lot of my friends also set their kids up on
                | TVs, and you can hardly buy a 1080p TV anymore.
        
               | martiuk wrote:
               | > Seems kinda silly to make a 4K video card when ...
               | nobody on the planet has a 4K screen.
               | 
               | Someone else probably said that years ago when everyone
               | was rocking 1080/1440p screens.
        
               | close04 wrote:
               | First consumer 4K monitors came out more than a decade
               | ago. I think the Asus PQ321 in 2013. That's close to
               | where we are now with 8K.
               | 
               | How many of the cards of that time would you call "4K
               | cards"? Even the Titan X that came a couple of years
               | later doesn't really cut it.
               | 
               | There's such a thing as being _too_ early to the game.
        
               | Eloso wrote:
               | Gaming isn't the only use-case, but Steam hardware survey
               | says ~4% of users are using 4k screens. So the market is
               | still small.
        
               | ErneX wrote:
               | If you look at the Steam hardware survey you'll find the
               | majority of gamers are still rocking 1080p/1440p
               | displays.
               | 
               | What gamers look for is more framerate, not necessarily
               | resolution. Most new gaming monitors are focusing on high
               | refresh rates.
               | 
               | 8K feels like a waste of compute for a very diminished
               | return compared to 4K. I think 8K only makes sense for
               | huge displays, beyond 83 inches or so, and we are still
               | far from that.
        
               | int_19h wrote:
               | Gaming aside, 4K is desirable even on <30" displays, and
               | honestly I wouldn't mind a little bit more pixel density
               | there to get it to true "retina" resolution. 6K might be
               | a sweet spot?
               | 
               | Which would then imply that you don't need a display as
               | big as 83" to see the benefits from 8K. Still, we're
               | talking about very large panels here, of the kind that
               | wouldn't even fit many computer desks, so yeah...
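               | 
               | For anyone who wants to sanity-check the "retina" part,
               | here's a rough back-of-the-envelope in Python (the 60-80
               | pixels-per-degree thresholds and the 24" viewing distance
               | are just assumptions, and the small-angle formula ignores
               | off-axis viewing):
               | 
               |   import math
               | 
               |   def retina_ppi(d_in, ppd):
               |       # PPI where one pixel subtends 1/ppd degree
               |       rad = math.radians(0.5 / ppd)
               |       return 1 / (2 * d_in * math.tan(rad))
               | 
               |   for ppd in (60, 80):
               |       print(ppd, "ppd:", round(retina_ppi(24, ppd)), "PPI")
               |   # ~143 and ~191 PPI; a 27" 4K panel is ~163 PPI, while
               |   # a 32" 6K panel (e.g. the Pro Display XDR) is ~218 PPI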
        
             | out_of_protocol wrote:
             | Well, modern games + modern cards can't even do 4K at high
             | fps without DLSS. The 8K story is a total fairy tale. Maybe
             | a "render at 540p, display at 8K" kind of thing?
             | 
             | P.S. Also, VR. For VR you need 2x4K at a stable 90+ fps.
             | There are (almost) no VR games though
        
               | diggan wrote:
               | > modern games + modern cards can't even do 4k at high
               | fps
               | 
               | What "modern games" and "modern cards" are you
               | specifically talking about here? There are plenty of AAA
               | games released last years that you can do 4K at 60fps
               | with a RTX 3090 for example.
        
               | philjohn wrote:
               | This - the latest Call of Duty game on my (albeit water
               | cooled) 3080 Ti Founders Edition saw frame rates in the
               | 90-100fps range running natively at 4K (no DLSS).
        
               | bavell wrote:
               | Can't CoD do 60+ fps @1080p on a potato nowadays?... not
               | exactly a good reference point.
        
               | CobaltFire wrote:
               | 4k90 is about 6 times that, and he probably has the
               | options turned up.
               | 
               | I'd say the comparison is what's faulty, not the example.
        
               | sfmike wrote:
               | The new CoD is really unoptimized. Still getting 100 fps
               | at 4K on a few-years-old 3080 is pretty great. If he uses
               | some frame gen such as Lossless he can get 120-150. Say
               | what you will about Nvidia prices, but you do get years
               | of great gaming out of them.
        
               | CobaltFire wrote:
               | Honestly my water cooled 3080TI FE has been great. Wish
               | it had more VRAM for VR (DCS, MSFS) but otherwise it's
               | been great.
        
               | philjohn wrote:
               | Which block did you go with? I went with the EK Vector
               | special edition which has been great, but need to look
               | for something else if I upgrade to 5080 with their recent
               | woes.
        
               | CobaltFire wrote:
               | I just have the Alphacool AIO with a second 360 radiator.
               | 
               | I've done tons of custom stuff but was at a point where I
               | didn't have the time for a custom loop. Just wanted plug
               | and play.
               | 
               | Seen some people talking down the block, but honestly I
               | run 50c under saturated load at 400 watts, +225 core,
               | +600 memory with a hot spot of 60c and VRAM of 62c. Not
               | amazing but it's not holding the card back. That's with
               | the Phanteks T30's at about 1200RPM.
               | 
               | Stock cooler I could never get the card stable despite
               | new pads and paste. I was running 280 watts, barely able
               | to run -50 on the core and no offset on memory. That
               | would STILL hit 85c core, 95c hotspot and memory.
        
               | kllrnohj wrote:
               | > There are plenty of AAA games released last years that
               | you can do 4K at 60fps with a RTX 3090 for example.
               | 
               | Not when you turn on ray tracing.
               | 
               | Also 60fps is pretty low, certainly isn't "high fps"
               | anyway
        
               | robertfall wrote:
               | This.
               | 
               | You can't get high frame rates with path tracing and 4K.
               | It just doesn't happen. You need to enable DLSS and frame
               | gen to get 100fps with more complete ray and path tracing
               | implementations.
               | 
               | People might be getting upset because the 4090 is WAY
               | more power than games need, but there are games that try
               | and make use of that power and are actually limited by
               | the 4090.
               | 
               | Case in point Cyberpunk and Indiana Jones with path
               | tracing don't get anywhere near 100FPS with native
               | resolution.
               | 
               | Now many might say that's just a ridiculous ask, but
               | that's what GP was talking about here. There's no way
               | you'd get more than 10-15fps (if that) with path tracing
               | at 8K.
        
               | kllrnohj wrote:
               | > Case in point Cyberpunk and Indiana Jones with path
               | tracing don't get anywhere near 100FPS with native
               | resolution.
               | 
               | Cyberpunk native 4k + path tracing gets sub-20fps on a
               | 4090 for anyone unfamiliar with how demanding this is.
               | Nvidia's own 5090 announcement video showcased this as
               | getting a whopping... 28 fps: https://www.reddit.com/medi
               | a?url=https%3A%2F%2Fi.redd.it%2Ff...
        
               | mastax wrote:
               | > Also 60fps is pretty low, certainly isn't "high fps"
               | anyway
               | 
               | I'm sure some will disagree with this but most PC gamers
               | I talk to want to be at 90FPS minimum. I'd assume if
               | you're spending $1600+ on a GPU you're pretty particular
               | about your experience.
        
               | bee_rider wrote:
               | I'm so glad I grew up in the n64/xbox era. You save so
               | much money if you are happy at 30fps. And the games look
               | really nice.
        
               | marxisttemp wrote:
               | I wish more games had an option for N64/Xbox-level
               | graphics to maximize frame rate. No eye candy tastes as
               | good as 120Hz feels.
        
               | bee_rider wrote:
               | I'm sure you could do N64 style graphics at 120Hz on an
               | iGPU with modern hardware, hahaha. I wonder if that would
               | be a good option for competitive shooters.
               | 
               | I don't really mind low frame rates, but latency is often
               | noticeable and annoying. I often wonder if high frame
               | rates are papering over some latency problems in modern
               | engines. Buffering frames or something like that.
        
               | nfriedly wrote:
               | Doom 2016 at 1080p with a 50% resolution scale (so,
               | really, 540p) can hit 120 FPS on an AMD 8840U. That's
               | what I've been doing on my GPD Win Mini, except that I
               | usually cut the TDP down to 11-13W, where it's hitting
               | more like 90-100 FPS. It looks and feels great!
        
               | kllrnohj wrote:
               | You can also save tons of money by combining used GPUs
               | from two generations ago with a patientgamer lifestyle
               | without needing to resort to suffering 30fps
        
               | necheffa wrote:
               | > Also 60fps is pretty low, certainly isn't "high fps"
               | anyway
               | 
               | Uhhhhhmmmmmm....what are you smoking?
               | 
               | Almost no one is playing competitive shooters and such at
               | 4k. For those games you play at 1080p and turn off lots
               | of eye candy so you can get super high frame rates
               | because that does actually give you an edge.
               | 
               | People playing at 4k are doing immersive story driven
               | games and consistent 60fps is perfectly fine for that,
               | you don't really get a huge benefit going higher.
               | 
               | People that want to split the difference are going 1440p.
        
               | lifeformed wrote:
               | Anyone playing games would benefit from higher frame rate
               | no matter their case. Of course it's most critical for
               | competitive gamers, but someone playing a story driven
               | FPS at 4k would still benefit a lot from framerates
               | higher than 60.
               | 
               | For me, I'd rather play a story based shooter at 1440p @
               | 144Hz than 4k @ 60Hz.
        
               | kllrnohj wrote:
               | Games other than esports shooters and slow paced story
               | games exist, you know. In fact, most games are in this
               | category you completely ignored for some reason.
               | 
               | Also nobody is buying a 4090/5090 for a "fine"
               | experience. Yes 60fps is fine. But better than that is
               | expected/desired at this price point.
        
               | int_19h wrote:
               | You seem to be assuming that the only two buckets are
               | "story-driven single player" and "PvP multiplayer", but
               | online co-op is also pretty big these days. FWIW I play
               | online co-op shooters at 4K 60fps myself, but I can see
               | why people might prefer higher frame rates.
        
               | causi wrote:
               | Personally I've yet to see a ray tracing implementation
               | that I would sacrifice 10% of my framerate for, let alone
               | 30%+. Most of the time, to my tastes, it doesn't even
               | look _better_ , it just looks _different_.
        
               | marxisttemp wrote:
               | Yep. Few AAA games can run at 4K60 at max graphics on a
               | 4090 without upscaling or frame gen, at least without
               | occasionally dipping below 60. Also, most monitors sold
               | with VRR (which I would argue is table stakes now) are
               | >60Hz.
        
             | pier25 wrote:
             | The 4080 struggles to play high end games at 4k and there
             | aren't that many 8k tvs/monitors in the market... Doesn't
             | make much sense that anyone would think about the 4090 as
             | an 8k GPU to be honest.
        
             | Aardwolf wrote:
             | Why does 8K gaming require more VRAM?
             | 
             | I think the textures and geometry would have the same
             | resolution (or is that not the case? Even at 4K, if you
             | walk closer to a wall you'd want higher texture resolution
             | anyway, assuming the artists made the assets at that
             | resolution)
             | 
             | 8K screen resolution requires 132 megabytes of memory to
             | store the pixels (for 32-bit color), that doesn't explain
             | gigabytes of extra VRAM
             | 
             | I'd be curious to know what information I'm missing
        
               | Macha wrote:
               | My understanding is that between double buffering and
               | multiple sets of intermediate info for shaders, you
               | usually have a bunch of screen-sized buffers hanging
               | around in VRAM, though you are probably right that these
               | aren't the biggest contributor to VRAM usage in the end.
        
               | dijit wrote:
               | You're only thinking of the final raster framebuffer;
               | there are multiple raster and shader stages. Increasing
               | the native output resolution scales the memory required
               | for all of those intermediate buffers with the pixel
               | count.
        
               | atq2119 wrote:
               | When you render a higher resolution natively, you
               | typically also want higher resolution textures and more
               | detailed model geometry.
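               | 
               | To put rough numbers on just the render-target side
               | (purely illustrative Python; the per-pixel byte count
               | below assumes a made-up deferred buffer layout, and real
               | engines vary wildly):
               | 
               |   # full-resolution render targets only
               |   RES = {"4K": (3840, 2160), "8K": (7680, 4320)}
               |   BYTES_PER_PIXEL = 8 + 4 + 8 + 4 + 4 + 8  # color, depth,
               |   # normals, albedo, motion vectors, HDR accum (made up)
               | 
               |   for name, (w, h) in RES.items():
               |       mib = BYTES_PER_PIXEL * w * h / 2**20
               |       print(name, round(mib), "MiB of render targets")
               |   # ~285 MiB at 4K vs ~1139 MiB at 8K -- noticeable, but
               |   # the extra gigabytes mostly come from higher-res assets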
        
           | angled wrote:
           | I wonder if these will be region-locked (eg, not for HK SAR).
        
           | ryao wrote:
            | If I recall correctly, the 3090, 3090 Ti and 4090 were
            | supposed to replace the Titan cards, which had been Nvidia's
            | top cards but were never really meant for gaming.
        
             | KMnO4 wrote:
             | Someone very clever at Nvidia realized that if they rename
             | their professional card (Titan) to be part of their
             | "gaming" line, you can convince adults with too much
             | disposable income that they need it to play Elden Ring.
             | 
             | I didn't know of anyone who used the Titan cards (which
             | were actually priced cheaper than their respective xx90
             | cards at release) for gaming, but somehow people were happy
             | spending >$2000 when the 3090 came out.
        
               | Cumpiler69 wrote:
               | _> but somehow people were happy spending >$2000 when the
               | 3090 came out_
               | 
               | Of course they did; the 3090 came out at the height of
               | the pandemic and crypto boom in 2020, when people were
               | locked indoors with plenty of free time and money to
               | spare. What else were they gonna spend it on?
        
               | cptcobalt wrote:
               | As an adult with too much disposable income and a 3090,
               | it just becomes a local LLM server w/ agents when I'm not
               | playing games on it. Didn't even see the potential for it
               | back then, but now I'm convinced that the xx90 series
               | offers me value outside of just gaming uses.
        
         | simondotau wrote:
         | Nvidia is also clearly differentiating the 5090 as the gaming
         | card for people who want the best and an extra thousand dollars
         | is a rounding error. They could have sold it for $1500 and
         | still made big coin, but no doubt the extra $500 is pure wealth
         | tax.
         | 
         | It probably serves to make the 4070 look reasonably priced,
         | even though it isn't.
        
           | ryao wrote:
           | Leaks indicate that the PCB has 14 layers with a 512-bit
           | memory bus. It also has 32GB of GDDR7 memory and the die size
           | is expected to be huge. This is all expensive. Would you
           | prefer that they had not made the card and instead made a
           | lesser card that was cheaper to make to avoid the higher
           | price? That is the AMD strategy and they have lower prices.
        
             | simondotau wrote:
             | That PCB is probably a few dollars per unit. The die is
             | probably the same as the one in the 5070. I've no doubt
             | it's an expensive product to build, but that doesn't mean
             | the price is cost plus markup.
        
               | ryao wrote:
               | Currently, the 5070 is expected to use the GB205 die
               | while the 5090 is expected to use the GB202 die:
               | 
               | https://www.techpowerup.com/gpu-specs/geforce-
               | rtx-5070.c4218
               | 
               | https://www.techpowerup.com/gpu-specs/geforce-
               | rtx-5090.c4216
               | 
               | It is unlikely that the 5070 and 5090 share the same die
               | when the 4090 and 4080 did not share the same die.
               | 
               | Also, could an electrical engineer estimate how much this
               | costs to manufacture:
               | 
               | https://videocardz.com/newz/nvidia-geforce-rtx-5090-pcb-
               | leak...
        
               | positr0n wrote:
               | Is the last link wrong? It doesn't mention cost.
        
               | ryao wrote:
               | The PCB cost did not leak. We need an electrical engineer
               | to estimate the cost based on what did leak.
        
               | shadowpho wrote:
               | >That PCB is probably a few dollars per unit.
               | 
               | It's not. 14-layer PCBs are expensive. When I looked at
               | Apple's cost for their PCBs it was probably closer to
               | $50, and theirs have a smaller area
        
           | sliken wrote:
            | Double the bandwidth, double the RAM, double the pins, and
            | double the power isn't cheap. I wouldn't be surprised if the
            | profit on the 4090 was less than on the 4080, especially
            | since any R&D costs will be spread over significantly fewer
            | units.
        
             | formerly_proven wrote:
             | There have been numerous reports over the years that the
             | 4090 actually outsold the 4080.
        
               | BigJ1211 wrote:
               | The 4080 was also quite the bad value compared to the
               | much better 4090. That remains to be seen for the 5000
               | series.
        
               | williamDafoe wrote:
               | The 4080 was designed as a strawman card expressly to
               | drive sales towards the 4090. So this is by design.
        
           | epolanski wrote:
            | Gaming enthusiasts didn't bat an eye at the 4090's price and
            | won't bat one here either.
            | 
            | The 4090 was already priced for high-income people (in
            | first-world countries). Nvidia saw 4090s being sold on the
            | secondhand market for way beyond $2k. They're merely milking
            | the cow.
        
         | ryao wrote:
         | The 3090 and 3090 Ti both support software ECC. I assume that
         | the 4090 has it too. That alone positions the xx90 as a pseudo-
         | professional card.
        
           | gregoryl wrote:
           | The 4090 indeed does have ecc support
        
             | sliken wrote:
             | Yes, but ECC is inline, so it costs bandwidth and memory
             | capacity.
        
               | fulafel wrote:
               | Doesn't it always? (Except on some hardware you can't
               | turn it off.)
        
               | sliken wrote:
               | I believe the cards that are intended for compute instead
               | of GPU default to ECC being on and report memory
               | performance with the overheads included.
        
               | FuriouslyAdrift wrote:
               | Anything with DDR5 or above has built in limited ECC...
               | it's required by the spec.
               | https://www.corsair.com/us/en/explorer/diy-
               | builder/memory/is...
        
               | sliken wrote:
               | Sure, but it's very limited. It doesn't detect or fix
               | errors in the dimm (outside the chips), motherboard
               | traces, CPU socket, or CPU.
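               | 
               | Quick back-of-the-envelope on what inline ECC costs (the
               | 1/16 overhead is an assumption based on the commonly
               | quoted ~6.25% figure for soft ECC on GDDR; the real hit
               | depends on the implementation and workload):
               | 
               |   OVERHEAD = 1 / 16   # assumed: 1 ECC byte per 16 bytes
               | 
               |   vram_gb, bw_gbs = 24, 1008   # roughly 3090 Ti-class
               |   print("usable VRAM ~", vram_gb * (1 - OVERHEAD), "GB")
               |   print("usable BW   ~", bw_gbs * (1 - OVERHEAD), "GB/s")
               |   # ~22.5 GB and ~945 GB/s once ECC shares the same chips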
        
         | lz400 wrote:
          | How will a 5090 compare against Project Digits, now that
          | they're both on the front page? :)
        
           | ryao wrote:
           | We will not really know until memory bandwidth and compute
           | numbers are published. However, Project Digits seems like a
           | successor to the NVIDIA Jetson AGX Orin 64GB Developer Kit,
           | which was based on the Ampere architecture and has
           | 204.8GB/sec memory bandwidth:
           | 
           | https://www.okdo.com/wp-content/uploads/2023/03/jetson-
           | agx-o...
           | 
           | The 3090 Ti had about 5 times the memory bandwidth and 5
           | times the compute capability. If that ratio holds for
           | blackwell, the 5090 will run circles around it when it has
           | enough VRAM (or you have enough 5090 cards to fit everything
           | into VRAM).
        
             | lz400 wrote:
             | Very interesting, thanks!
             | 
             | 32gb for the 5090 vs 128gb for digits might put a nasty cap
             | on unleashing all that power for interesting models.
             | 
             | Several 5090s together would work but then we're talking
             | about multiple times the cost (4x$2000+PC VS $3000)
        
               | ryao wrote:
               | Inference presumably will run faster on a 5090. If the 5x
               | memory bandwidth figure holds, then token generation
               | would run 5 times faster. That said, people in the digits
               | discussion predict that the memory bandwidth will be
               | closer to 546GB/sec, which is closer to 1/3 the memory
               | bandwidth of the 5090, so a bunch of 5090 cards would
               | only run 3 times faster at token generation.
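               | 
               | The rule of thumb behind that, for anyone curious (a
               | sketch only: it ignores compute, KV-cache traffic and
               | batching, the model size is an assumed quantized example,
               | and the Digits bandwidth is the rumored figure from this
               | thread):
               | 
               |   def tok_per_s(bw_gb_s, model_gb):
               |       # generation is roughly bandwidth-bound: each
               |       # token streams the active weights once
               |       return bw_gb_s / model_gb
               | 
               |   model_gb = 20   # assumed: ~32B params, 4-5 bit quant
               |   for name, bw in [("5090", 1792), ("Digits?", 546)]:
               |       print(name, round(tok_per_s(bw, model_gb)), "tok/s")
               |   # ~90 vs ~27 tok/s upper bound -- about the 3x ratio
               |   # mentioned above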
        
         | whalesalad wrote:
         | It's the same pricing from last year. This already happened.
        
         | oliwarner wrote:
          | The only difference is scale. That isn't differentiation,
          | that's segmentation.
         | 
         | It won't stop crypto and LLM peeps from buying everything (one
         | assumes TDP is proportional too). Gamers not being able to find
         | an affordable option is still a problem.
        
           | officeplant wrote:
           | >Gamers not being able to find an affordable option is still
           | a problem.
           | 
           | Used to think about this often because I had a side hobby of
           | building and selling computers for friends and coworkers that
           | wanted to get into gaming, but otherwise had no use for a
           | powerful computer.
           | 
           | For the longest time I could still put together $800-$1000
           | PC's that could blow consoles away and provide great value
           | for the money.
           | 
            | Nowadays I almost want to recommend they go back to console
            | gaming. Seeing older PS5s on store shelves hit $349.99
            | during the holidays really cemented that idea. It's so
            | astronomically expensive to do a PC build at the moment
            | unless you can be convinced to buy a gaming laptop on a deep
            | sale.
        
             | dolni wrote:
             | One edge that PCs have is massive catalog.
             | 
             | Consoles have historically not done so well with backwards
             | compatibility (at most one generation). I don't do much
             | console gaming but _I think_ that is changing.
             | 
             | There is also something to be said about catalog
             | portability via something like a Steam Deck.
        
               | officeplant wrote:
               | Cheaper options like the Steam Deck are definitely a boon
               | to the industry. Especially the idea of "good enough"
               | gaming at lower resolutions on smaller screens.
               | 
               | Personally, I just don't like that it's attached to
               | Steam. Which is why I can be hesitant to suggest consoles
               | as well, now that they have soft-killed their physical
               | game options, unless you go out of your way to get the
               | add-on disc drive for the PS5, etc.
               | 
               | It's been nice to see backwards compatibility coming back
               | in modern consoles to some extent, with Xbox especially
               | if you have a Series X with the disc drive.
               | 
               | I killed my steam account with 300+ games just because I
               | didn't see a future where steam would actually let me own
               | the games. Repurchased everything I could on GoG and gave
               | up on games locked to Windows/Mac AppStores, Epic, and
               | Steam. So I'm not exactly fond of hardware attached to
               | that platform, but that doesn't stop someone from just
               | loading it up with games from a service like GoG and
               | running them through Steam or Heroic Launcher.
               | 
               | 2024 took some massive leaps forward with getting a
               | proton-like experience without steam and that gives me a
               | lot of hope for future progress on Linux gaming.
        
           | foobarian wrote:
           | Are crypto use cases still there? I thought that went away
           | after eth switched their proof model.
        
             | oliwarner wrote:
             | Bitcoin is still proof of work.
        
               | foobarian wrote:
               | Yeah but BTC is not profitable on GPU I thought (needs
               | ASIC farms)
        
         | ffsm8 wrote:
          | The price of a 4090 was already ~1800-2400 EUR where I live
          | (not scalper prices, the normal online shops).
         | 
         | We'll have to see how much they'll charge for these cards this
         | time, but I feel like the price bump has been massively
         | exaggerated by people on HN
        
           | BigJ1211 wrote:
           | MSRP went from 1959,- to 2369,-. That's quite the increase.
        
         | epolanski wrote:
         | You underestimate how many gamers got a 4090.
        
       | m3kw9 wrote:
       | You also need to upgrade your air conditioner
        
         | lingonland wrote:
         | Or just open a window, depending on where you live
        
         | polski-g wrote:
         | Yeah I'm not really sure what the solution is at this point.
         | Put it in my basement and run 50foot HDMI cables through my
         | house or something...
        
       | nullc wrote:
       | Way too little memory. :(
        
       | ksec wrote:
       | Does anyone have any info on the node? Can't find anything
       | online. Seems to be 4nm but the performance suggests otherwise.
       | Hopefully someone does a deep dive soon.
        
         | kcb wrote:
         | Good bet it's 4nm. The 5090 doesn't seem that much greater than
         | the 4090 in terms of raw performance. And it has a big TDP bump
         | to provide that performance.
        
         | wmf wrote:
         | I'm guessing it's N4 and the performance is coming from larger
         | dies and higher power.
        
       | biglost wrote:
       | Mmm, I think my wallet is safe since I only play SNES and old
       | DOS games.
        
       | jmyeet wrote:
       | The interesting part to me was that Nvidia claim the new 5070
       | will have 4090 level performance for a much lower price ($549).
       | Less memory however.
       | 
       | If that holds up in the benchmarks, this is a nice jump for a
       | generation. I agree with others that more memory would've been
       | nice, but it's clear Nvidia are trying to segment their SKUs into
       | AI and non-AI models and using RAM to do it.
       | 
       | That might not be such a bad outcome if it means gamers can
       | actually buy GPUs without them being instantly bought by robots
       | like the peak crypto mining era.
        
         | dagmx wrote:
         | That claim is with a heavy asterisk of using DLSS4. Without
         | DLSS4, it's looking to be a 1.2-1.3x jump over the 4070.
        
           | knallfrosch wrote:
           | Do games need to implement something on their side to get
           | DLSS4?
        
             | Vampiero wrote:
             | On the contrary, they need to be so badly optimized that
             | they run like shit on 2025 graphics cards despite looking
             | exactly the same as games from years ago
        
             | Macha wrote:
             | The asterisk is DLSS4 is using AI to generate extra frames,
             | rather than rendering extra frames, which hurts image
             | stability and leads to annoying fuzziness/flickering. So
             | it's not comparing like with like.
             | 
             | Also, since they're not coming from the game engine, they
             | don't actually react as the game would, so they don't have
             | the response-time advantages that actual frame rate does.
        
         | popcalc wrote:
         | Was surprised to relearn the GTX 980 premiered at $549 a decade
         | ago.
        
           | izacus wrote:
            | Which is $750 in 2024 adjusted for inflation, and you got a
            | card providing 1/3 of the performance of a 4070 Ti in the
            | same price range. Probably 1/4 of a 5070 Ti.
            | 
            | 3x the FPS at the same cost (ignoring AI cores, encoders,
            | resolutions, etc.) is a decent performance track record.
            | With DLSS enabled the difference is significantly bigger.
        
       | nottorp wrote:
       | Do they come with a mini nuclear reactor to power them?
        
         | romon wrote:
         | The future is SMRs next to everyone's home
        
         | wmf wrote:
         | No, you get that from Enron.
        
       | jms55 wrote:
       | * MegaGeometry (APIs to allow Nanite-like systems for raytracing)
       | - super awesome, I'm super super excited to add this to my
       | existing Nanite-like system, finally allows RT lighting with high
       | density geometry
       | 
       | * Neural texture stuff - also super exciting, big advancement in
       | rendering, I see this being used a lot (and helps to make up for
       | the meh vram blackwell has)
       | 
       | * Neural material stuff - might be neat, Unreal strata materials
       | will like this, but going to be a while until it gets a good
       | amount of adoption
       | 
       | * Neural shader stuff in general - who knows, we'll see how it
       | pans out
       | 
       | * DLSS upscaling/denoising improvements (all GPUs) - Great! More
       | stable upscaling and denoising is very much welcome
       | 
       | * DLSS framegen and reflex improvements - bleh, ok I guess,
       | reflex especially is going to be very niche
       | 
       | * Hardware itself - lower end a lot cheaper than I expected!
       | Memory bandwidth and VRAM is meh, but the perf itself seems good,
       | newer cores, better SER, good stuff for the most part!
       | 
       | Note that the material/texture/BVH/denoising stuff is all
       | research papers Nvidia and others have put out over the last few
       | years, just finally getting productionized. Neural textures and
       | Nanite-like RT are things I've been hyped about for the past ~2
       | years.
       | 
       | I'm very tempted to upgrade my 3080 (that I bought used for $600
       | ~2 years ago) to a 5070 ti.
        
         | magicalhippo wrote:
         | For gaming I'm also looking forward to the improved AI workload
         | sharing mentioned, where, IIUC, AI and graphics workloads could
         | operate at the same time.
         | 
         | I'm hoping generative AI models can be used to generate more
         | immersive NPCs.
        
       | friedtofu wrote:
       | As a lifelong Nvidia consumer, I think it's a safe bet to ride
       | out the first wave of 5xxx-series GPUs and wait for the
       | inevitable 5080/5070 (GT/Ti/Super/whatever) that should release
       | a few months later with similar specs and better performance,
       | addressing whatever the initial GPUs were criticized for
       | lacking.
       | 
       | I would expect something like the 5080 Super to have 20/24GB of
       | VRAM. 16GB just seems wrong for their "target" consumer GPU.
        
         | ryao wrote:
         | They could have used 32Gbps GDDR7 to push memory bandwidth on
         | the 5090 to 2.0TB/sec. Instead, they left some performance on
         | the table. I wonder if they have some compute cores disabled
         | too. They are likely leaving room for a 5090 Ti follow-up.
        
           | nsteel wrote:
           | Maybe they wanted some thermal/power headroom. It's already
           | pretty mad.
        
         | arvinsim wrote:
          | I made the mistake of not waiting before.
         | 
         | This time around, I will save for the 5090 or just wait for the
         | Ti/Super refreshes.
        
         | knallfrosch wrote:
         | Or you wait out the 5000 Super too and get the 6000 series that
         | fixes all the first-gen 5000-Super problems...
        
       | ryao wrote:
       | The most interesting news is that the 5090 Founders' Edition is a
       | 2-slot card according to Nvidia's website:
       | 
       | https://www.nvidia.com/en-us/geforce/graphics-cards/50-serie...
       | 
       | When was the last time Nvidia made a high end GeForce card use
       | only 2 slots?
        
         | archagon wrote:
         | Fantastic news for the SFF community.
         | 
         | (Looks like Nvidia even advertises an "SFF-Ready" label for
         | cards that are small enough: https://www.nvidia.com/en-
         | us/geforce/news/small-form-factor-...)
        
           | sliken wrote:
           | Not really, 575 watts for the GPU is going to make it tough
           | to cool or provide power for.
        
             | archagon wrote:
             | There are 1000W SFX-L (and probably SFX) PSUs out there,
             | and console-style cases provide basically perfect cooling
             | through the sides. The limiting factor really is slot
             | width.
             | 
             | (But I'm more eyeing the 5080, since 360W is pretty easy to
             | power and cool for most SFF setups.)
        
           | kllrnohj wrote:
           | It's a dual flow-through design, so some SFF cases will work
           | OK but the typical sandwich style ones probably won't even
           | though it'll physically fit
        
         | _boffin_ wrote:
         | Donno why I feel this, but probably going to end up being 2.5
         | slots
        
         | matja wrote:
         | The integrator decides the form factor, not NVIDIA, and there
         | were a few 2-slot 3080's with blower coolers. Technically
         | water-cooled 40xx's can be 2-slot also but that's cheating.
        
           | favorited wrote:
           | 40-series water blocks can even be single slot:
           | https://shop.alphacool.com/en/shop/gpu-water-
           | cooling/nvidia/...
        
       | knallfrosch wrote:
       | Smaller cards with higher power consumption - will GPU water-
       | cooling be cool again?
        
       | sub7 wrote:
       | Would have been nice to get double the memory on the 5090 to run
       | those giant models locally. Would've probably upgraded at 64gb
       | but the jump from 24 to 32gb isn't big enough
       | 
       | Gaming performance has been plateaued for some time now, maybe an
       | 8k monitor wave can revive things
        
       | lxdlam wrote:
       | I have a serious question about the term "AI TOPS". I find many
       | conflicting definitions, while other sources say nothing. A
       | meaningful metric should at least be well defined on its own
       | terms: "TOPS" expands to "Tera Operations Per Second", but which
       | operation is it measuring?
       | 
       | Seemingly NVIDIA is just playing number games: wow, 3352 is a
       | huge leap compared to 1321, right? But how does it really help
       | us with LLMs, diffusion models and so on?
        
         | diggan wrote:
         | It would be cool if something like vast.ai's "DLPerf" would
         | become popular enough for the hardware producers to start using
         | it too.
         | 
         | > DLPerf (Deep Learning Performance) - is our own scoring
         | function. It is an approximate estimate of performance for
         | typical deep learning tasks. Currently, DLPerf predicts
         | performance well in terms of iters/second for a few common
         | tasks such as training ResNet50 CNNs. For example, on these
         | tasks, a V100 instance with a DLPerf score of 21 is roughly ~2x
         | faster than a 1080Ti with a DLPerf of 10. [...] Although far
         | from perfect, DLPerf is more useful for predicting performance
         | than TFLops for most tasks.
         | 
         | https://vast.ai/faq#dlperf
        
       | thefz wrote:
       | > GeForce RTX 5070 Ti: 2X Faster Than The GeForce RTX 4070 Ti
       | 
       | 2x faster _in DLSS_. If we look at the 1:1 resolution
       | performance, the increase is likely 1.2x.
        
         | alkonaut wrote:
         | That's what I'm wondering. What's the actual raw render/compute
         | difference in performance, if we take a game that predates
         | DLSS?
        
           | thefz wrote:
           | We shall wait for real world benchmarks to address the raster
           | performance increase.
           | 
           | The bold claim "5070 is like a 4090 at 549$" is quite
           | different if we factor in that it's basically in DLSS only.
        
             | kllrnohj wrote:
             | It's actually a lot worse than it sounds, even. The 5070
             | is "like a 4090" only when the 5070 has multi frame
             | generation on and the 4090 doesn't. So it's not even
             | comparable levels of DLSS; the 5070 is hallucinating 2x+
             | more frames than the 4090 in that claim.
        
           | izacus wrote:
           | Based on non-DLSS tests, it seems like a respectable ~25%.
        
             | vizzier wrote:
             | Respectable outright, but the 450W -> 575W TDP bump takes
             | the edge off a bit. We'll have to see how that translates
             | to power draw at the wall. My room already gets far too hot
             | with a 320W 3080.
        
       | christkv wrote:
       | 575W TDP for the 5090. A buddy has 3x 4090s in a machine with a
       | 32-core AMD CPU; it would be putting out close to 2000W of heat
       | at peak if he switched to 5090s. Uff
        
         | aurbano wrote:
         | 2kW is literally the output of my patio heater haha
        
           | buildbot wrote:
           | They work as effective heaters! I haven't used my (electric)
           | heat all winter, I just use my training computer's waste heat
           | instead.
        
         | buildbot wrote:
         | I have a very similar setup, 3x4090s. Depending on the model
         | I'm training, the GPUs use anywhere from 100-400 watts, but
         | don't get much slower when power limited to say, 250w. So they
         | could power limit the 5090s if they want and get pretty decent
         | performance most likely.
         | 
          | The cat loves lying on/basking near it when it's putting out
          | 1400W with the GPUs in 400W mode, though, so I leave it
          | turned up most of the time! (200W for the CPU)
        
           | jiggawatts wrote:
           | May I ask what you're training? And why not just rent GPUs in
           | some cloud?
        
       | blixt wrote:
       | Pretty interesting watching their tech explainers on YouTube
       | about the changes in their AI solutions. Apparently they
       | switched from CNNs to transformers for upscaling (with ray
       | tracing support), if I understood correctly; for frame
       | generation that makes even more sense to me.
       | 
       | 32 GB VRAM on the highest end GPU seems almost small after
       | running LLMs with 128 GB RAM on the M3 Max, but the speed will
       | most likely more than make up for it. I do wonder when we'll see
       | bigger jumps in VRAM though, now that the need for running
       | multiple AI models at once seems like a realistic use case (their
       | tech explainers also mentions they already do this for games).
        
         | bick_nyers wrote:
         | Check out their project digits announcement, 128GB unified
         | memory with infiniband capabilities for $3k.
         | 
         | For more of the fast VRAM you would be in Quadro territory.
        
         | terhechte wrote:
          | If you have 128GB of RAM, try running MoE models; they're a
          | far better fit for Apple's hardware because they trade memory
          | for inference performance. Using something like WizardLM-2
          | 8x22b requires a huge amount of memory to host the ~176b
          | model, but only a fraction of the experts is active per
          | token, so you get the token speed of a much smaller model.
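          | 
          | If it helps to picture why the active compute is so much
          | smaller than the resident memory, here is a toy top-k routing
          | sketch in Python (not any particular implementation;
          | Mixtral-style models keep 8 experts resident and route each
          | token to 2 of them):
          | 
          |   import random
          | 
          |   NUM_EXPERTS, TOP_K = 8, 2   # 8 resident, 2 active per token
          | 
          |   # every expert's weights stay loaded the whole time...
          |   experts = [f"expert_{i}" for i in range(NUM_EXPERTS)]
          | 
          |   def moe_layer(token):
          |       # ...but only TOP_K of them run for any given token, so
          |       # per-token weight reads and FLOPs stay small
          |       scores = [(random.random(), i) for i in range(NUM_EXPERTS)]
          |       chosen = sorted(scores, reverse=True)[:TOP_K]
          |       return [f"{experts[i]}({token})" for _, i in chosen]
          | 
          |   print(moe_layer("hello"))   # touches 2 of 8 expert weights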
        
           | FuriouslyAdrift wrote:
           | Project Digits... https://www.nvidia.com/en-us/project-
           | digits/
        
             | throwaway48476 wrote:
             | I guess they're tired of people buying macs for AI.
        
           | cma wrote:
           | You can also run the experts on separate machines with low
           | bandwidth networking or even the internet (token rate limited
           | by RTT)
        
           | logankeenan wrote:
           | Do you have any recommendations on models to try?
        
             | stkdump wrote:
             | Mixtral and Deepseek use MOE. Most others don't.
        
             | Terretta wrote:
             | Mixtral 8x22b https://mistral.ai/news/mixtral-8x22b/
        
             | terhechte wrote:
             | In addition to the ones listed by others, WizardLM2 8x22b
             | (was never officially released by Microsoft but is
             | available).
        
             | memhole wrote:
             | I like the Llama models personally. Meta aside. Qwen is
             | fairly popular too. There's a number of flavors you can try
             | out. Ollama is a good starting point to try things quickly.
             | You're def going to have to tolerate things crashing or not
             | working imo before you understand what your hardware can
             | handle.
        
           | memhole wrote:
            | As a counterpoint, I haven't had great luck with Wizard.
            | The token generation is unbearably slow. I might have
           | using too large of a context window, though. It's an
           | interesting model for sure. I remember the output being
           | decent. I think it's already surpassed by other models like
           | Qwen.
        
             | terhechte wrote:
             | Long context windows are a problem. I gave Qwen 2.5 70b a
             | ~115k context and it took ~20min for the answer to finish.
             | The upside of MoE models vs 70b+ models is that they have
             | much more world knowledge.
        
         | ActionHank wrote:
         | They are intentionally keeping the VRAM small on these cards to
         | force people to buy their larger, more expensive offerings.
        
           | Havoc wrote:
           | Saw someone else point out that potentially the culprit here
           | isn't nvidia but memory makers. It's still 2gb per chip and
           | has been since forever
        
             | tharmas wrote:
             | GDDR7 apparently has the capability of 3GB per chip. As it
             | becomes more available there could be more VRAM
             | configurations. Some speculate maybe an RTX 5080 Super
             | 24GB release next year. Wishful thinking perhaps.
        
           | tbolt wrote:
           | Maybe, but if they strapped these with 64gb+ wouldn't that be
           | wasted on folks buying it for its intended purpose? Gaming.
           | Though the "intended use" is changing and has been for a bit
           | now.
        
             | knowitnone wrote:
             | Hmmm, maybe they could have different offerings like 16GB,
             | 32GB, 64GB, etc. Maybe we can even have 4 wheels on a car.
        
             | mox1 wrote:
             | Not really, the more textures you can put into memory the
             | faster they can do their thing.
             | 
             | PC gamers would say that a modern mid-range card (1440p
             | card) should really have 16GB of vram. So a 5060 or even a
             | 5070 with less than that amount is kind of silly.
        
             | whywhywhywhy wrote:
             | The xx90 is only half a gaming card; it's also the one the
             | entire creative professional 3D/CGI, AI, and game dev
             | industry runs on.
        
             | Aerroon wrote:
             | The only reason gaming doesn't use all the VRAM is because
             | typically GPUs don't have all the VRAM. If they did then
             | games would somehow find a way to use it.
        
               | jajko wrote:
               | Game engines are optimized for the lowest common
               | denominator, which in this case is consoles. PC games are
               | rarely exclusives, so the same engine has to run with the
               | least RAM available, and the differences between versions
               | are normally small.
               | 
               | On many games you need an ultra texture pack to fully
               | utilize a current-gen card's memory.
        
           | tharmas wrote:
           | Totally agree. I call this the "Apple Model". Just like the
           | Apple Mac base configurations with skimpy RAM and Drive
           | capacities to make the price look "reasonable". However, just
           | like Apple, NVIDIA does make really good hardware.
        
           | marginalia_nu wrote:
           | Makes sense. The games industry doesn't want another crypto
           | mining-style GPU shortage.
        
           | hibikir wrote:
           | If the VRAM wasn't small, the cards would all get routed to
           | non gaming uses. Remember the state of the market when the
           | 3000 series was new?
        
             | ginko wrote:
             | Then they should sell more of them.
        
               | ChoGGi wrote:
               | Why sell more when you can sell less for more
        
               | wkat4242 wrote:
               | They can only make so many, that's part of the problem
        
               | bornfreddy wrote:
               | They should contact Intel.
        
           | SideQuark wrote:
           | So you're saying more VRAM costs more money? What a novel
           | idea!
           | 
           | Conversely, this means you can pay less if you need less.
           | 
           | Seems like a win all around.
        
           | vonneumannstan wrote:
           | No gamers need such high VRAM, if you're buying Gaming cards
           | for ML work you're doing it wrong.
        
             | riskable wrote:
             | It's Nvidia that considers them, "gaming cards". The
             | _market_ decides their use in reality though.
             | 
             | Their strategy is to sell lower-VRAM cards to consumers
             | with the understanding that they can make more money on
             | their more expensive cards for professionals/business. By
             | doing this, though they're creating a gap in the market
             | that their competitors could fill (in theory).
             | 
             | Of course, this assumes their competitors have half a brain
             | cell (I'm looking at YOU, Intel! For fuck's sake give us a
             | 64GB ARC card already!).
        
             | epolanski wrote:
             | Games have already exceeded 16 GB at 4K for years.
        
               | throwaway48476 wrote:
               | I exceed 16GB in Chrome.
        
             | throwaway314155 wrote:
             | > Gaming cards for ML work you're doing it wrong
             | 
             | lol okay. "doing it wrong" for a tenth of the cost.
        
               | moogly wrote:
               | And screwing gamers over by raising the prices by 2x.
               | Fuck that.
        
             | muchosandwich wrote:
             | It seems like the 90-series cards are going to be targeting
             | prosumers again. People who play games but may use their
             | desktop for work as well. Some people are doing AI training
             | on some multiple of 3090/4090 today but historically the
             | Titan cards that preceded the 90s cards were used by game
             | developers, video editors and other content developers. I
             | think NVIDIA is going to try to move the AI folks onto
             | Digits and return the 90-series back to its roots but also
             | add in some GenAI workloads.
        
             | sfmike wrote:
             | I forget the post, but some dude had a startup piping his
             | 3090 via Cloudflare tunnels for his AI SaaS, making 5
             | figures a month off of the $1k GPU that handled the
             | workload. I'd say he was doing it more than right.
        
           | barbazoo wrote:
           | Is there actually less VRAM on the cards or is it just
           | disabled?
        
             | deaddodo wrote:
             | GPU manufacturers have no reason to include additional
             | memory chips of no use on a card.
             | 
             | This isn't like a cutdown die, which is a single piece with
             | disabled functionality...the memory chips are all
             | independent (expensive) pieces soldered on board (the black
             | squares surrounding the GPU core):
             | 
             | https://cdn.mos.cms.futurecdn.net/vLHed8sBw8dX2BKs5QsdJ5-12
             | 0...
        
           | wkat4242 wrote:
           | Well, they _are_ gaming cards. 32GB is plenty for that.
        
         | resource_waste wrote:
         | > after running LLMs with 128 GB RAM on the M3 Max,
         | 
          | These are monumentally different. You cannot really use your
          | computer as an LLM server; it's more of a novelty.
          | 
          | I'm not even sure why people mention these things. It's
          | possible, but no one actually does this outside of testing
          | purposes.
          | 
          | It falsely equates Nvidia GPUs with Apple CPUs. The winner is
          | Apple.
        
         | vonneumannstan wrote:
         | If you want to run LLMs buy their H100/GB100/etc grade cards.
         | There should be no expectation that consumer grade gaming cards
         | will be optimal for ML use.
        
           | nahnahno wrote:
           | Yes there should be. We don't want to pay literal 10x markup
           | because the card is suddenly "enterprise".
        
         | quadrature wrote:
         | Why are transformers a better fit for frame generation. Is it
         | because they can better utilize context from the previous
         | history of frames ?
        
       | lemoncookiechip wrote:
       | I have a feeling regular consumers will have trouble buying
       | 5090s.
       | 
       | RTX 5090: 32 GB GDDR7, ~1.8 TB/s bandwidth. H100 (SXM5): 80 GB
       | HBM3, ~3+ TB/s bandwidth.
       | 
       | RTX 5090: ~318 TFLOPS in ray tracing, ~3,352 AI TOPS. H100:
       | Optimized for matrix and tensor computations, with ~1,000 TFLOPS
       | for AI workloads (using Tensor Cores).
       | 
       | RTX 5090: 575W, higher for enthusiast-class performance. H100
       | (PCIe): 350W, efficient for data centers.
       | 
       | RTX 5090: Expected MSRP ~$2,000 (consumer pricing). H100: Pricing
       | starts at ~$15,000-$30,000+ per unit.
        
         | topherjaynes wrote:
         | That's my worry too, I'd like one or two, but 1) will either
         | never be in line for them 2) or can only find via secondary
         | market at 3 or 4x the price...
        
         | boroboro4 wrote:
         | H100 has 3958 TFLOPS sparse fp8 compute. I'm pretty sure listed
         | tflops for 5090 are sparse (and probably) fp4/int4.
        
           | rfoo wrote:
           | Yes, that's the case. Check the (partial) spec of 5090 D,
           | which is the nerfed version for export to China. It is
           | marketed as having 2375 "AI TOPS".
           | 
           | BIS demands it to be less than $4800 TOPS \times Bit-Width$,
           | and the most plausible explanation for the number is - 2375
           | sparse fp4/int4 TOPS, which means 1187.5 dense TOPS for 4
           | bit, or $4750 TOPS \times Bit-Width$.
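            | 
            | Spelling that arithmetic out (same numbers as above, just
            | making the sparse/dense and bit-width bookkeeping explicit;
            | the 4-bit sparse interpretation is speculation, not an
            | official statement):
            | 
            |   sparse_tops = 2375            # 5090 D marketed "AI TOPS"
            |   dense_tops = sparse_tops / 2  # 2:4 sparsity doubles it
            |   tpp = dense_tops * 4          # TOPS x bit-width (4-bit)
            |   print(dense_tops, tpp)        # 1187.5, 4750.0 < 4800 cap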
        
         | bee_rider wrote:
         | How well do these models do at parallelizing across multiple
         | GPUs? Is spending $4k on the 5090 a good idea for training,
         | slightly better performance for much cheaper? Or a bad idea, 0x
         | as good performance because you can't fit your 60GB model on
         | the thing?
        
         | Havoc wrote:
         | > regular consumers will have trouble buying 5090s.
         | 
         | They're not really supposed to either judging by how they
         | priced this. For non AI uses the 5080 is infinitely better
         | positioned
        
           | kllrnohj wrote:
           | > For non AI uses the 5080 is infinitely better positioned
           | 
           | ...and also slower than a 4090. Only the 5090 got a gen/gen
           | upgrade in shader counts. Will have to wait for benchmarks of
           | course, but the rest of the 5xxx lineup looks like a dud
        
       | geertj wrote:
       | Any advice on how to buy the founders edition when it launches,
       | possibly from folks who bought the 4090 FE last time around? I
       | have a feeling there will be a lot of demand.
        
         | logicalfails wrote:
         | Getting a 3080 FE (I also had the option to get the 3090 FE) at
         | the height of pandemic demand required me sleeping outside a
         | Best Buy with 50 other random souls on a wednesday night.
        
           | steelframe wrote:
           | At that time I ended up just buying a gaming PC packaged with
           | the card. I find it's generally worth it to upgrade all the
           | components of the system along with the GPU every 3 years or
           | so.
        
             | Wololooo wrote:
              | This comes at a significant premium for OEM parts that
              | are, on average, subpar. Buying individually yields much
              | better results, and these days it's less of a hassle than
              | it used to be.
        
               | rtkwe wrote:
               | It was likely from an integrator not a huge OEM that's
               | spinning their own proprietary motherboard designs like
               | Dell. In that case they only really paid the integrator's
               | margin and lost the choice of their own parts.
        
         | jmuguy wrote:
         | Do you live somewhat near a Microcenter? They'll likely have
         | these as in-store pick up only, no online reservations, 1 per
         | customer. Recently got a 9800X3D CPU from them, its nice
         | they're trying to prevent scalping.
        
           | geertj wrote:
           | I do! Great advice. Going off on a tangent, when I recently
           | visited my Microcenter after a few years of not going there,
           | it totally gave me 80s vibes and I loved it. Staff fit the
           | "computer nerd" stereotype accurately, including jeans shirts
           | and ponytails. And best of all they actually wanted to talk
           | to me and help me find stuff, and were knowledgeable.
        
             | jmuguy wrote:
             | Ours just opened in 2024 and I've tried to give them as
             | much business as possible. Ordering everything for a new PC
             | build, sans the AMD CPU, and then doing pick up was a
             | breeze. Feels great that the place is completely packed
              | every time I go in there. I feel like Best Buy made me
              | sort of hate electronics retail, and Microcenter is
              | reminding me of what it used to be like going to
              | RadioShack and CompUSA back in their heyday.
        
             | ryao wrote:
             | I had a similar feeling when going to microcenter for the
             | first time in years a few years ago, but in my case, it was
             | a 90s vibe since I had first visited a computer store in
             | the 90s.
        
           | mjevans wrote:
           | As someone living (near) Seattle, this is a major issue for
           | me every product launch and I don't have a solution.
           | 
            | The area's geography just isn't conducive to allowing a
            | single brick-and-mortar store to survive and compete with
            | online retail on cost vs. volume; but without a B&M store
            | there's no good way to do physical-presence anti-scalper
            | tactics.
            | 
            | I can't even get into a purchase-opportunity lottery, since
            | AMD / Nvidia don't do that sort of thing: allocating
            | restock quota tickets that could be used as tokens to buy
            | restocked product when the purchase ships to the matching
            | address.
        
       | ks2048 wrote:
       | This is maybe a dumb question, but why is it so hard to buy
       | Nvidia GPUs?
       | 
       | I can understand lack of supply, but why can't I go on nvidia.com
       | and buy something the same way I go on apple.com and buy
       | hardware?
       | 
       | I'm looking for GPUs and navigating all these different resellers
       | with wildly different prices and confusing names (on top of the
       | already confusing set of available cards).
        
         | datadrivenangel wrote:
         | Nvidia uses resellers as distributors. Helps build out a locked
         | in ecosystem.
        
           | ks2048 wrote:
           | How does that help "build out a locked in ecosystem"? Again,
           | comparing to Apple: they have a very locked-in ecosystem.
        
             | MoreMoore wrote:
             | I don't think lock-in is the reason. The reason is more
             | that companies like Asus and MSI have a global presence and
             | their products are available on store shelves everywhere.
              | NVIDIA avoids having to build up all the required
              | relationships and distribution, and they also save on
              | things like technical support staff and dealing with
              | warranty claims directly with customers across the globe
              | (the handful of people who get an FE card aside).
        
             | santoshalper wrote:
             | Nvidia probably could sell cards directly now, given the
             | strength of their reputation (and the reality backing it
             | up) for graphics, crypto, and AI. However, they grew up as
             | a company that sold through manufacturing and channel
              | partners, and that's pretty deeply ingrained in their
              | culture. Apple is unusually obsessed with integration;
              | most companies are more like Nvidia.
        
             | pragmar wrote:
             | Apple locks users in with software/services. nVidia locks
             | in add-in board manufacturers with exclusive arrangements
             | and partner programs that tie access to chips to contracts
             | that prioritize nVidia. It happens upstream of the
             | consumer. It's always a matter of degree with this stuff as
             | to where it becomes anti-trust, but in this case it's overt
             | enough for governments to take notice.
        
         | doix wrote:
         | Nvidia (and AMD) make the "core", but they don't make a "full"
         | graphics card. Or at least they don't mass produce them, I
         | think Nvidia tried it with their "founders edition".
         | 
          | It's just not their main business model; it's been that way
          | for many, many years at this point. I'm guessing business
          | people have decided that it's not worth it.
          | 
          | Saying that they are "resellers" isn't technically accurate.
          | The 5080 you buy from ASUS will be different from the one you
          | buy from MSI.
        
           | SirMaster wrote:
           | They still make reference founders editions. They sell them
           | at Best Buy though, not directly.
        
             | infecto wrote:
              | Reference cards make up only a small minority of cards
              | for a given generation though. I looked for numbers and
              | could not find them, but they tend to be the Goldilocks
              | of cards if you can grab one, because they sell at MSRP
              | IIRC.
        
               | devmor wrote:
               | Yep, I scored a 3070 Founder's at launch and was very
               | lucky, watching other people pay up to the MSRP of the
               | 3090 to get one from elsewhere.
        
           | sigmoid10 wrote:
           | Nvidia also doesn't make the "core" (i.e. the actual chip).
           | TSMC and Samsung make those. Nvidia _designs_ the chip and
           | (usually) creates a reference PCB to show how to make an
           | actual working GPU using that chip you got from e.g. TSMC.
           | Sometimes (especially in more recent years) they also sell
           | that design as  "founders" edition. But they don't sell most
           | of their hardware directly to average consumers. Of course
           | they also provide drivers to interface with their chips and
           | tons of libraries for parallel computing that makes the most
           | of their design.
           | 
           | Most people don't realize that Nvidia is much more of a
           | software company than a hardware company. CUDA in particular
           | is like 90% of the reason why they are where they are while
           | AMD and Intel struggle to keep up.
        
             | doix wrote:
              | Yeah, I should have said design; embarrassingly, I used
              | to work at a (fabless) semiconductor company.
              | 
              | Totally agree with the software part. AMD usually designs
              | something in the same ballpark as Nvidia, and usually has
              | a better price:performance ratio at many price points.
              | But the software is just too far behind.
        
               | automatic6131 wrote:
                | AMD's driver software is more featureful and better
                | than Nvidia's offerings. The GeForce Experience +
                | settings app combo was awful; the Nvidia App is just
                | copying some homework and integrating MSI Afterburner's
                | freeware.
               | 
               | But the business software stack was, yes, best in class.
               | But it's not so for the consumer!
        
               | knowitnone wrote:
               | I think they mean CUDA
        
             | nightski wrote:
             | I've bought multiple founders editions cards from the
             | nvidia store directly. Did they stop doing that recently?
        
             | themaninthedark wrote:
             | It seems that they have been tightening what they allow
             | their partners to do, which caused EVGA to break away as
             | they were not allowed to deviate too much from the
             | reference design.
        
               | sigmoid10 wrote:
               | That was mostly about Nvidia's pricing. It's basically
               | impossible to compete economically with the founders
               | editions because Nvidia doesn't charge themselves a hefty
               | markup on the chip. That's why their own cards always
               | sell out instantly and then the aftermarket GPU builders
               | can fight to pick up the scraps. The whole idea of the
               | founders edition seems to be to make a quick buck
               | immediately after release. Long term it's much more
               | profitable to sell the chip itself at a price that they
               | would usually sell their entire GPU for.
        
               | ThatMedicIsASpy wrote:
                | This year's founders edition is what I really want from a
               | GPU. Stop wasting my 2nd PCIe slot because you've made it
               | 3.5/4 slots BIG! It is insane that they are now cooling
               | 575W with two slots in height.
        
               | simoncion wrote:
               | I would suggest getting a case that has a set of inbuilt
               | (typically vertically-oriented) expansion card slots
               | positioned a distance away from the regular expansion
               | card slots, mount your graphics card there, and connect
               | it to the motherboard with a PCI-E riser cable. It's what
               | I did and I kicked myself for not doing it years prior.
               | 
               | I have no experience with PCI-E 5 cables, but I've a
               | PCI-E 4 riser cable from Athena Power that works just
               | fine (and that you can buy right now on Newegg). It
               | doesn't have any special locking mechanism, so I was
               | concerned that it would work its way off of the card or
               | out of the mobo slot... but it has been in place for
               | years now with no problem.
        
               | MisoRamen wrote:
                | It is an ever steeper uphill battle to compete with
                | Nvidia as an AIB partner.
                | 
                | Nvidia has internal access to the new card way ahead of
                | time, aerodynamic and thermodynamic simulators, custom-
                | engineered boards full of sensors, plus a team of very
                | talented and well-paid engineers working for months to
                | optimize the cooler design.
                | 
                | Meanwhile, AIB partners are pretty much kept in the
                | dark until a few months before launch. It is basically
                | impossible for a company like EVGA, which prides itself
                | on its customer support, to exist - the finances just
                | do not make sense.
        
               | mbreese wrote:
               | Which is why EVGA stopped working with Nvidia a few years
               | ago... (probably mentioned elsewhere too).
               | 
               | https://www.electronicdesign.com/technologies/embedded/ar
               | tic...
        
           | mrweasel wrote:
            | Didn't Nvidia piss off some of their board partners at some
            | point? I think EVGA stopped making Nvidia-based graphics
            | cards because of poor behavior on Nvidia's part.
            | 
            | Also, aren't most of the business cards made by Nvidia
            | directly... or at least Nvidia-branded?
        
           | grogenaut wrote:
            | The founders edition ones that I had were not great GPUs.
            | They were both under-cooled and over-cooled: they had one
            | squirrel-cage-style blower that was quite loud and powerful
            | and basically ran at either no speed or full blast. And
            | since there was only the one air path and one fan, it got
            | overwhelmed by dust, and if that blower fan had issues the
            | GPU overheated. The consumer / 3rd-party ones usually have
            | multiple larger-diameter fans at lower speeds, multiple
            | flow paths, and more control. TL;DR they were better
            | designed; Nvidia took the data-center "ram as much air as
            | you can in there" approach, which isn't great for your home
            | PC.
        
             | 6SixTy wrote:
             | Founders cards being worse than board partner models hasn't
             | been true in like 8 years. They switched to dual axial
             | rather than a single blower fan with the 20 series, which
             | made the value of board partner models hard to justify.
             | 
              | Since then, Nvidia has been locked in a very strange card
              | war with their board partners, because Nvidia has all the
              | juicy inside details on its own chips and can simply
              | withhold them from its partners, stacking the deck for
              | itself.
              | 
              | Also, the reason blowers are bad is that the design can't
              | really take advantage of the surface area offered by the
              | fins. There are often zero heat pipes spreading the heat
              | evenly in all directions, allowing a hot spot to form.
        
           | orphea wrote:
           | it's not worth it.
           | 
            | I wonder how much "it's not worth it". Surely it would have
            | been at least _somewhat_ profitable? (An honest question.)
        
         | CivBase wrote:
         | I've always assumed their add-in board (AIB) partners (like
         | MSI, ASUS, Gigabyte, etc) are able to produce PCBs and other
         | components at higher volumes and lower costs than NVIDIA.
        
           | xnyan wrote:
           | Not just the production of the finished boards, but also
           | marketing, distribution to vendors and support/RMA for
           | defective products.
           | 
           | There is profit in this, but it's also a whole set of skills
           | that doesn't really make sense for Nvidia.
        
         | zitterbewegung wrote:
          | This is supply and demand at work. NVIDIA has to choose
          | between selling consumer or high-end parts, and they can only
          | reserve so much capacity at TSMC. Also, Apple's hardware
          | mostly sees high demand when it first releases, while NVIDIA
          | sees nearly constant purchases throughout the year from
          | enterprise on top of consumer product launches.
        
         | ggregoire wrote:
         | I read your question and thought to myself "why is it so hard
         | to buy a Steamdeck"? Available only in like 10 countries. Seems
         | like the opposite problem, Valve doesn't use resellers but they
         | can't handle international manufacturing/shipping themselves?
         | At least I can get a Nvidia GPU anytime I want from Amazon,
         | BestBuy or whatever.
        
         | TrackerFF wrote:
         | GPUs are in demand.
         | 
         | So scalpers want to make a buck on that.
         | 
         | All there is to it. Whenever demand surpasses supply, someone
         | will try to make money off that difference. Unfortunately for
         | consumers, that means scalpers use bots to clean out retail
         | stores, and then flip them to consumers.
        
           | WXLCKNO wrote:
           | Without thinking about it too deeply I'm wondering if GPU
           | demand is that much higher than let's say iPhone demand. I
           | don't think I've ever heard of iPhones being scarce and rare
           | and out of stock.
        
             | pas wrote:
              | Apple _very_ tightly controls their whole value chain.
              | It's their whole thing. Nvidia "dgaf"; they are raking in
              | more cash than ever and are busy trying to figure out
              | what's at the end of the semi-rainbow. (Apparently it's a
              | B2C AI box gimmick.)
        
         | chis wrote:
          | One way to look at it is that the third-party GPU packagers
          | have a different set of expertise. They generally build
          | motherboards, GPU holder boards, RAM, and often monitors and
          | mice as well. All of these product PCBs are cheaply made and
          | don't depend on the performance of the latest TSMC node the
          | way the GPU chips do; it's more about ticking feature boxes
          | at the lowest cost.
         | 
         | So nvidia wouldn't have the connections or skillset to do
         | budget manufacturing of low-cost holder boards the way ASUS or
         | EVGA does. Plus with so many competitors angling to use the
         | same nvidia GPU chips, nvidia collects all the margin
         | regardless.
        
           | brigade wrote:
           | Yet the FE versions end up cheaper than third party cards (at
           | least by MSRP), and with fewer issues caused by the third
           | parties cheaping out on engineering...
        
         | blackoil wrote:
          | Maybe it is simply a legacy business model. Nvidia wasn't
          | always a behemoth. In the old days they must have been happy
          | for someone else to manage the global distribution,
          | marketing, service, etc. Also, this gives an illusion of
          | choice: you get graphics cards in different color, shape,
          | RGB, and water-cooling combinations.
        
         | diob wrote:
          | It is frustrating, speaking as someone who grew up poor and
          | couldn't afford anything: now I finally can, and nothing is
          | ever in stock. Such a funny twist of events, but it also
          | makes me sad.
        
         | michaelt wrote:
         | OK so there are a handful of effects at work at the same time.
         | 
         | 1. Many people knew the new series of nvidia cards was about to
         | be announced, and nobody wanted to get stuck with a big stock
         | of previous-generation cards. So most reputable retailers are
         | just sold out.
         | 
         | 2. With lots of places sold out, some scalpers have realised
         | they can charge big markups. Places like Amazon and Ebay don't
         | mind if marketplace sellers charge $3000 for a $1500-list-price
         | GPU.
         | 
          | 3. For various reasons, although nvidia makes and sells some
          | "founders edition" cards, the vast majority are made by other
          | companies. Sometimes they'll do 'added value' things like
          | adding RGB LEDs and factory overclocking, leading to a 10%
          | price spread for cards with the same chip.
         | 
         | 4. nvidia's product lineup is just very confusing. Several
         | product lines (consumer, workstation, data centre) times
         | several product generations (Turing, Ampere, Ada Lovelace)
         | times several vram/performance mixes (24GB, 16GB, 12GB, 8GB)
         | plus variants (Super, Ti) times desktop and laptop versions.
         | That's a lot of different models!
         | 
          | nvidia also don't particularly _want_ it to be easy for you
          | to compare performance across product classes or generations.
          | Workstation and server cards don't even have a list price;
          | you can only get them by buying a workstation or server from
          | an approved vendor.
         | 
         | Also nvidia don't tend to update their marketing material when
         | products are surpassed, so if you look up their flagship from
         | three generations ago it'll still say it offers unsurpassed
         | performance for the most demanding, cutting-edge applications.
        
           | ryao wrote:
           | The workstation cards have MSRPs. The RTX 6000 Ada's MSRP is
           | $6799:
           | 
           | https://www.techpowerup.com/gpu-specs/rtx-6000-ada-
           | generatio...
        
         | the__alchemist wrote:
         | It depends on the timing. I lucked out about a year ago on the
         | 4080; I happened to be shopping in what turned out to be the ~1
         | month long window where you could just go to the nvidia site,
         | and order one.
        
       | voidUpdate wrote:
        | Ooo, that means it's probably time for me to get a used 2080, or
       | maybe even a 3080 if I'm feeling special
        
         | Kelteseth wrote:
          | Why not go for AMD? I just got a 7900XTX for 850 euros; it
          | runs ollama and ComfyUI via WSL2 quite nicely.
        
           | whywhywhywhy wrote:
            | It's pointless putting yourself through the support
            | headaches, or having to wait for support to arrive, just to
            | save a few dollars when the rest of the community is
            | running Nvidia.
        
             | Kelteseth wrote:
              | Nah, it's quite easy these days. Ollama runs perfectly
              | fine on Windows; ComfyUI still has some requirements that
              | haven't been ported, so you have to do stuff through
              | WSL2.
        
           | viraj_shah wrote:
           | Do you have a good resource for learning what kinds of
           | hardware can run what kinds of models locally? Benchmarks,
           | etc?
           | 
           | I'm also trying to tie together different hardware specs to
           | model performance, whether that's training or inference. Like
           | how does memory, VRAM, memory bandwidth, GPU cores, etc. all
           | play into this. Know of any good resources? Oddly enough I
           | might be best off asking an LLM.
        
             | holoduke wrote:
              | To avoid custom implementations it is recommended to get
              | an Nvidia card, minimum a 3080 to get some results. But
              | if you want video you should go for either a 4090 or a
              | 5090. ComfyUI is a popular interface which you can use
              | for graphical stuff, images and videos. For local text
              | models I would recommend the Misty app, basically a
              | wrapper and downloader for various models. There are tons
              | of YouTube videos on how to achieve stuff.
        
             | Kelteseth wrote:
             | I tested ollama with 7600XT at work and the mentioned
              | 7900XTX. Both run fine within their VRAM limitations, so
              | you can just switch between different quantizations of
              | Llama 3.1 or the vast number of different models at
              | https://ollama.com/search
        
           | orphea wrote:
           | AMD driver quality is crap. I upgraded from GTX 1080 to RX
           | 6950 XT because I found a good deal and I didn't want to
           | support nvidia's scammy bullshit of launching inferior GPUs
           | under the same names. Decided to go with AMD this time, and I
           | had everything: black screens, resolution drops to 1024x768,
           | total freezes, severe lags in some games (BG3) unless I
           | downgrade the driver to a very specific version.
        
             | mldbk wrote:
             | It is an outdated claim.
             | 
             | I have both 4090 (workstation) and 7900XT (to play some
             | games) and I would say that 7900XT was rock solid for me
             | for the last year (I purchased it in Dec 2023).
        
           | williamDafoe wrote:
            | AMD is an excellent choice. Nvidia's UI has been horrible
            | and AMD Adrenalin has been better than Nvidia's for several
            | years now. With Nvidia, you are paying A LOT of extra money
            | for trickery and fake pixels, fake frames, fake (AI)
            | rendering.
           | All fakeness. All hype. When you get down to the raw
           | performance of these new cards, it must be a huge
           | disappointment, otherwise, why would Jensen completely forget
           | to mention anything REAL about the performance of these
           | cards? These are cut-down cards designed to sell at cut-down
           | prices with lots of fluff and whipped cream added on top ...
        
           | satvikpendem wrote:
           | DLSS is good and keeps improving, as with DLSS 4 where most
           | of the features are compatible with even the 2000 series
           | cards. AMD does not have the same software feature set to
           | justify a purchase.
        
         | Macha wrote:
         | The 2080 was a particularly poor value card, especially when
         | considering the small performance uplift and the absolute glut
         | of 1080 Tis that were available. A quick look on my local ebay
         | also indicates they're both around the EUR200-250 range for
         | used buy it now, so it seems to make way more sense to go to a
         | 3080.
        
           | qingcharles wrote:
            | The 2080 Ti though is a really good sweet spot for
            | price/performance.
        
         | vonneumannstan wrote:
          | A 4070 has much better performance and is much cheaper than
          | a 3080...
        
           | rtkwe wrote:
            | And the 4070 Super is relatively available too. I just
            | bought one with only a small amount of hunting. Bought it
            | right off of Best Buy; I originally tried going to the
            | Microcenter near my parents' house while I was down there,
            | but should have bought the card online for pickup. In the 2
            | days between my first check and arriving at the store, ~20
            | cards sold.
        
             | alkonaut wrote:
             | What was the drop in 3070 pricing when the 4070 was
             | released? We should expect a similar drop now I suppose?
        
               | rtkwe wrote:
               | It took a while according to the first price chart I
               | found. The initial release of the 4070 Ti/FE in Jan 2023
               | didn't move the price much but the later release did
               | start dropping the price. Nvidia cards are pretty scarce
               | early in the generation so the price effect takes a
               | minute to really kick into full force.
               | 
               | I just upgraded from a 2080 Ti I had gotten just a few
               | weeks into the earliest COVID lockdowns because I was
               | tired of waiting constantly for the next generation.
               | 
               | https://howmuch.one/product/average-nvidia-geforce-
               | rtx-3070-...
        
       | pier25 wrote:
       | AI is going to push the price closer to $3000. See what happened
       | with crypto a couple of years back.
        
         | theandrewbailey wrote:
         | The ~2017 crypto rush told Nvidia how much people were willing
         | to spend on GPUs, so they priced their next series (RTX 2000)
         | much higher. 2020 came around, wash, rinse, repeat.
        
           | Macha wrote:
           | Note the 20 series bombed, largely because of the price hikes
           | coupled with meager performance gains, so the initial plan
           | was for the 30 series to be much cheaper. But then the 30
           | series scalping happened and they got a second go at re-
           | anchoring what people thought of as reasonable GPU prices.
            | Also, they have diversified into other options if gamers
            | won't pay up, compared to just hoping that GPU-minable
            | coins won out over those that needed ASICs and that the
            | crypto market stayed hot. I can see Nvidia being more
            | willing to hurt their gaming market for AI than they ever
            | were for crypto.
           | 
           | Also also, AMD has pretty much thrown in the towel at
           | competing for high end gaming GPUs already.
        
       | nfriedly wrote:
       | Meh. Feels like astronomical prices for the smallest upgrades
       | they could get away with.
       | 
       | I miss when high-end GPUs were $300-400, and you could get
       | something reasonable for $100-200. I guess that's just integrated
       | graphics these days.
       | 
       | The most I've ever spent on a GPU is ~$300, and I don't really
       | see that changing anytime soon, so it'll be a long time before
       | I'll even consider one of these cards.
        
         | garbageman wrote:
         | Intel ARC B580 is $249 MSRP and right up your alley in that
         | case.
        
           | nfriedly wrote:
           | Yep. If I needed a new GPU, that's what I'd go for. I'm
           | pretty happy with what I have for the moment, though.
        
             | frognumber wrote:
             | I'd go for the A770 over the B580. 16GB > 12GB, and that
             | makes a difference for a lot of AI workloads.
             | 
             | An older 3060 12GB is also a better option than the B580.
             | It runs around $280, and has much better compatibility
             | (and, likely, better performance).
             | 
             | What I'd love to see on all of these are specs on idle
             | power. I don't mind the 5090 approaching a gigawatt peak,
             | but I want to know what it's doing the rest of the time
             | sitting under my desk when I just have a few windows open
             | and am typing a document.
        
               | dcuthbertson wrote:
               | A gigawatt?! Just a little more power and I won't need a
               | DeLorean for time travel!
        
         | yourusername wrote:
         | >I miss when high-end GPUs were $300-400, and you could get
         | something reasonable for $100-200.
         | 
          | That time is 25 years ago though; I think the GeForce DDR was
          | the last high-end card to fit that price bracket. While cards
          | have gotten a lot more expensive, adjusted for inflation
          | those $300 high-end cards would be around $600 now. And
          | $200-400 for low end still exists.
        
           | oynqr wrote:
           | 2008 is 25 years ago?
        
       | Insanity wrote:
       | Somewhat related, any recommendations for 'pc builders' where you
       | can configure a PC with the hardware you want, but have it
       | assembled and shipped to you instead of having to build it
       | yourself? With shipping to Canada ideally.
       | 
        | I'm planning to upgrade (prob to something mid-range) as my 5
        | year old computer is starting to show its age, and with the new
        | GPUs releasing this might be a good time.
        
         | 0xffff2 wrote:
         | I don't know of any such service, but I'm curious what the
         | value is for you? IMO picking the parts is a lot harder than
         | putting them together.
        
       | chmod775 wrote:
       | > will be two times faster [...] thanks to DLSS 4
       | 
       | Translation: No significant actual upgrade.
       | 
       | Sounds like we're continuing the trend of newer generations being
       | beaten on fps/$ by the previous generations while hardly pushing
       | the envelope at the top end.
       | 
       | A 3090 is $1000 right now.
        
         | intellix wrote:
         | Why is that a problem though? Newer and more GPU intensive
         | games get to benefit from DLSS 4 and older games already run
         | fine. What games without DLSS support could have done with a
         | boost?
         | 
          | I've heard this twice today, so I'm curious why it's being
          | mentioned so often.
        
           | epolanski wrote:
           | We all know DLSS4 could be compatible with previous gens.
           | 
           | Nvidia has done that in the past already (see PhysX).
        
           | Diti wrote:
           | > What games without DLSS support could have done with a
           | boost?
           | 
           | DCS World?
        
             | bni wrote:
             | Has DLSS now
        
           | chmod775 wrote:
           | I for one don't like the DLSS/TAA look at all. Between the
           | lack of sharpness, motion blur and ghosting, I don't
           | understand how people can look at that and consider it an
           | upgrade. Let's not even get into the horror that is frame
           | generation. They're a graphics downgrade that gives me a
           | headache and I turn the likes of TAA and DLSS off in every
           | game I can. I'm far from alone in this.
           | 
            | So why should we consider buying a GPU at twice the price
            | when it has barely improved rasterization performance? An
            | artificially generation-locked feature that anyone with
            | good vision/perception despises isn't going to win us over.
        
             | solardev wrote:
             | Do you find DLSS unacceptable even on "quality" mode
             | without frame generation?
             | 
              | I've found it an amazing balance between quality and
              | performance (ultra everything with quality DLSS looks and
              | runs way better than, say, medium without DLSS). But I
              | also don't have great vision, lol.
        
           | Yizahi wrote:
            | I also like DLSS, but the OP is correct that it is a
            | problem. Specifically, it's a problem for understanding
            | what these cards are capable of. Ideally we would like to
            | see performance with no upscaling at all, and then
            | separately with different levels of upscaling. Then it
            | would be easier to see the real performance boost of the
            | hardware and of the upscaler separately.
            | 
            | It's like BMW comparing the new M5 to the previous-gen M5
            | while the previous gen is on regular 95-octane fuel and the
            | new gen is on some nitromethane-boosted custom fuel, with
            | no information on how fast the new car is on regular fuel.
        
             | jajko wrote:
              | How the situation actually looks will be revealed soon
              | via independent tests. I'm betting it's a bit of both; no
              | way they made zero progress in raw performance in 2
              | years, other segments still manage to achieve this. Even
              | 10%, combined with say a 25% boost from DLSS, nets a nice
              | FPS increase. I wish it could be more but we don't have a
              | choice right now.
              | 
              | Do normal gamers actually notice any difference on a
              | normal 4k low-latency monitor/TV? I mean any form of
              | extra lag, screen tearing, etc.
        
         | williamDafoe wrote:
         | It looks like the new cards are NO FASTER than the old cards.
         | So they are hyping the fake frames, fake pixels, fake AI
         | rendering. Anything fake = good, anything real = bad.
         | 
         | Jensen thinks that "Moore's Law is Dead" and it's just time to
         | rest and vest with regards to GPUs. This is the same attitude
         | that Intel adopted 2013-2024.
        
           | piyh wrote:
            | Why are you upset about how a frame is generated? We're not
            | talking about free range versus factory farming. Here, a
            | frame is a frame, and if your eye can't tell the difference
            | then it's as good as any other.
        
             | throwaway48476 wrote:
             | Latency and visual artifacts.
        
             | adgjlsfhk1 wrote:
              | The main point of more fps is lower latency. If you're
              | getting 1000 fps but they are all AI-generated from a
              | single real frame per second, your latency will be 500ms
              | and the experience will suck.
        
         | edm0nd wrote:
         | >A 3090 is $1000 right now.
         | 
         | Not really worth it if you can get a 5090 for $1,999
        
           | alekratz wrote:
            | If you can get a 5090 for that price, I'll eat my hat.
            | Scalpers with their armies of bots will buy them all before
            | you get a chance.
        
             | ryao wrote:
             | Do you have a recipe in mind for preparing your hat for
             | human consumption or is your plan to eat it raw?
        
             | ewild wrote:
              | It is absurdly easy to get a 5090 on launch. I've gotten
              | their flagship FE from their website every single launch
              | without fail, from the 2080 to the 3090 to the 4090.
        
               | richwater wrote:
                | I absolutely do not believe you.
        
           | chmod775 wrote:
           | Saving $1000 for only a ~25-30% hit in rasterization perf is
           | going to be worth it for a lot of people.
        
         | m3kw9 wrote:
          | The 5090 has 2x the cores, higher frequencies, and 3x the
          | FLOPS. You've got to do some due diligence before talking.
        
       | jbarrow wrote:
       | The increasing TDP trend is going crazy for the top-tier consumer
       | cards:
       | 
       | 3090 - 350W
       | 
       | 3090 Ti - 450W
       | 
       | 4090 - 450W
       | 
       | 5090 - 575W
       | 
       | 3x3090 (1050W) is less than 2x5090 (1150W), plus you get 72GB of
       | VRAM instead of 64GB, if you can find a motherboard that supports
       | 3 massive cards or good enough risers (apparently near
       | impossible?).
        
         | holoduke wrote:
         | Can you actually use multiple videocards easily with existing
         | AI model tools?
        
           | jbarrow wrote:
           | Yes, though how you do it depends on what you're doing.
           | 
            | I do a lot of training of encoders, multimodal models, and
            | vision models, which are typically small enough to fit on a
            | single GPU; multiple GPUs enable data parallelism, where
            | the data is split across independent copies of the model.
            | 
            | Occasionally I fine-tune large models and need to use model
            | parallelism, where the model itself is split across GPUs.
            | This is also necessary for inference of the _really_ big
            | models.
           | 
           | But most tooling for training/inference of all kinds of
           | models supports using multiple cards pretty easily.
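            | 
            | For the data-parallel case, a minimal PyTorch sketch (using
            | nn.DataParallel for brevity; DistributedDataParallel is the
            | usual recommendation for real training runs):
            | 
            |     import torch
            |     import torch.nn as nn
            | 
            |     # toy model, small enough to fit on one GPU
            |     model = nn.Sequential(
            |         nn.Linear(512, 512),
            |         nn.ReLU(),
            |         nn.Linear(512, 10),
            |     )
            | 
            |     # replicate the model on every visible GPU; each
            |     # replica gets a slice of the batch and gradients
            |     # are reduced back onto the default device
            |     if torch.cuda.device_count() > 1:
            |         model = nn.DataParallel(model)
            |     model = model.cuda()
            | 
            |     x = torch.randn(256, 512).cuda()
            |     loss = model(x).sum()
            |     loss.backward()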
        
           | benob wrote:
           | Yes, multi-GPU on the same machine is pretty straightforward.
           | For example ollama uses all GPUs out of the box. If you are
           | into training, the huggingface ecosystem supports it and you
           | can always go the manual route to put tensors on their own
           | GPUs with toolkits like pytorch.
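            | 
            | The manual route is roughly this (just a sketch, assuming
            | two visible GPUs):
            | 
            |     import torch
            |     import torch.nn as nn
            | 
            |     class TwoGPUNet(nn.Module):
            |         # split the layers across two devices and move
            |         # the activation between them in forward()
            |         def __init__(self):
            |             super().__init__()
            |             self.part1 = nn.Linear(512, 512).to("cuda:0")
            |             self.part2 = nn.Linear(512, 10).to("cuda:1")
            | 
            |         def forward(self, x):
            |             x = torch.relu(self.part1(x.to("cuda:0")))
            |             return self.part2(x.to("cuda:1"))
            | 
            |     net = TwoGPUNet()
            |     out = net(torch.randn(8, 512))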
        
           | qingcharles wrote:
           | Yes. Depends what software you're using. Some will use more
           | than one (e.g. llama.cpp), some commercial software won't
           | bother.
        
         | iandanforth wrote:
         | Sounds like you might be more the target for the $3k 128GB
         | DIGITS machine.
        
           | jbarrow wrote:
           | I'm really curious what training is going to be like on it,
           | though. If it's good, then absolutely! :)
           | 
           | But it seems more aimed at inference from what I've read?
        
             | bmenrigh wrote:
             | I was wondering the same thing. Training is much more
             | memory-intensive so the usual low memory of consumer GPUs
             | is a big issue. But with 128GB of unified memory the Digits
             | machine seems promising. I bet there are some other
             | limitations that make training not viable on it.
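              | 
              | The rough rule of thumb (standard mixed-precision Adam
              | accounting, nothing Digits-specific): training needs
              | about $N \times (2 + 2 + 12) = 16N$ bytes for fp16
              | weights, fp16 gradients, and the fp32 master weights plus
              | two optimizer moments, before activations. So a 7B-
              | parameter model already wants ~112 GB to train, vs.
              | roughly 4-14 GB just to run it for inference.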
        
               | tpm wrote:
                | It will only have 1/40 of the performance of the BH200,
                | so really not enough for training.
        
               | jbarrow wrote:
               | Primarily concerned about the memory bandwidth for
               | training.
               | 
               | Though I think I've been able to max out my M2 when using
               | the MacBook's integrated memory with MLX, so maybe that
               | won't be an issue.
        
               | ryao wrote:
               | Training is compute bound, not memory bandwidth bound.
               | That is how Cerebras is able to do training with external
               | DRAM that only has 150GB/sec memory bandwidth.
        
               | jdietrich wrote:
               | The architectures really aren't comparable. The Cerebras
               | WSE has fairly low DRAM bandwidth, but it has a huge
               | amount of on-die SRAM.
               | 
               | https://www.hc34.hotchips.org/assets/program/conference/d
               | ay2...
        
           | gpm wrote:
           | Weirdly they're advertising "1 petaflop of AI performance at
           | FP4 precision" [1] when they're advertising the 5090 [2] as
           | having 3352 "AI TOPS" (presumably equivalent to "3 petaflops
           | at FP4 precision"). The closest graphics card they're selling
           | is the 5070 with a GPU performing at 988 "AI TOPS" [2]....
           | 
           | [1] https://nvidianews.nvidia.com/news/nvidia-puts-grace-
           | blackwe...
           | 
           | [2] https://www.nvidia.com/en-us/geforce/graphics-
           | cards/50-serie...
        
         | cogman10 wrote:
          | What I really don't like about it is that low-power GPUs
          | essentially appear to be a thing of the past. An APU is the
          | closest you'll come to that, which is unfortunate, as the
          | thermal budget for an APU is much tighter than it has to be
          | for a discrete GPU. There is no 75W modern GPU on the market.
        
           | justincormack wrote:
            | The closest is the L4 https://www.nvidia.com/en-us/data-
            | center/l4/ but it's a bit weird.
        
             | moondev wrote:
             | RTX A4000 has an actual display output
        
           | moondev wrote:
           | Innodisk EGPV-1101
        
         | Scene_Cast2 wrote:
         | I heavily power limited my 4090. Works great.
        
           | winwang wrote:
           | Yep. I use ~80% and barely see any perf degradation. I use
           | 270W for my 3090 (out of 350W+).
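            | 
            | If anyone wants to script it, a sketch with the pynvml
            | bindings (helper names from memory, so double-check against
            | the NVML docs; values are in milliwatts and setting the cap
            | usually needs admin rights - "nvidia-smi -pl 270" does the
            | same thing from a shell):
            | 
            |     import pynvml  # pip install nvidia-ml-py
            | 
            |     pynvml.nvmlInit()
            |     gpu = pynvml.nvmlDeviceGetHandleByIndex(0)
            | 
            |     # current cap, reported in milliwatts
            |     cur = pynvml.nvmlDeviceGetPowerManagementLimit(gpu)
            |     print("current cap:", cur / 1000, "W")
            | 
            |     # cap a 350 W card at 270 W (about 77%)
            |     pynvml.nvmlDeviceSetPowerManagementLimit(gpu, 270_000)
            | 
            |     pynvml.nvmlShutdown()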
        
         | mikae1 wrote:
         | Performance per watt[1] makes more sense than raw power for
         | most consumer computation tasks today. Would really like to see
         | more focus on energy efficiency going forward.
         | 
         | [1] https://en.wikipedia.org/wiki/Performance_per_watt
        
           | epolanski wrote:
            | That's a blind way to look at it imho. Doesn't work on me,
            | for sure.
           | 
           | More energy means more power consumption, more heat in my
           | room, you can't escape thermodynamics. I have a small home
           | office, it's 6 square meters, during summer energy draw in my
           | room makes a gigantic difference in temperature.
           | 
            | I have no intention of drawing more than 400W total while
            | gaming, and I prefer to compromise by lowering settings.
           | 
           | Energy consumption can't keep increasing over and over
           | forever.
           | 
            | I can even understand it on flagships, they're meant for
            | enthusiasts, but all the tiers have been ballooning in
            | energy consumption.
        
             | bb88 wrote:
             | Increasing performance per watt means that you can get more
             | performance using the same power. It also means you can
             | budget more power for even better performance if you need
             | it.
             | 
             | In the US the limiting factor is the 15A/20A circuits which
             | will give you at most 2000W. So if the performance is
             | double but it uses only 30% more power, that seems like a
             | worthwhile tradeoff.
             | 
             | But at some point, that ends when you hit a max power that
             | prevents people from running a 200W CPU and other
             | appliances on the same circuit without tripping a breaker.
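              | 
              | (For reference, the ~2000W figure assuming 120V circuits
              | and the usual 80% continuous-load derating:
              | $15\,A \times 120\,V \times 0.8 = 1440\,W$ and
              | $20\,A \times 120\,V \times 0.8 = 1920\,W$.)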
        
         | marricks wrote:
          | I got into desktop gaming at the 970, and the common wisdom
          | (to me at least, maybe I was silly) was that I could get away
          | with a lower-wattage power supply and reuse it in future
          | generations because everything would keep getting more
          | efficient. Hah...
        
           | epolanski wrote:
           | Yeah, do like me, I lower settings from "ultra hardcore" to
           | "high" and keep living fine on a 3060 at 1440p for another
           | few gens.
           | 
           | I'm not buying GPUs that expensive nor energy consuming, no
           | chance.
           | 
            | In any case I think Maxwell/Pascal efficiency won't be seen
            | anymore; with those RT cores you get more energy draw, and
            | you can't get around that.
        
             | mikepurvis wrote:
             | I feel similarly; I just picked up a second hand 6600 XT
             | (similar performance to 3060) and I feel like it would be a
             | while before I'd be tempted to upgrade, and certainly not
             | for $500+, much less thousands.
        
             | alyandon wrote:
             | I'm generally a 1080p@60hz gamer and my 3060 Ti is
             | overpowered for a lot of the games I play. However, there
             | are an increasing number of titles being released over the
             | past couple of years where even on medium settings the card
             | struggles to keep a consistent 60 fps frame rate.
             | 
             | I've wanted to upgrade but overall I'm more concerned about
             | power consumption than raw total performance and each
             | successive generation of GPUs from nVidia seems to be going
             | the wrong direction.
        
             | SkyMarshal wrote:
             | I've actually reversed my GPU buying logic from the old
             | days. I used to buy the most powerful bleeding edge GPU I
             | could afford. Now I buy the minimum viable one for the
             | games I play, and only bother to upgrade if a new game
             | requires a higher minimum viable GPU spec. Also I generally
             | favor gameplay over graphics, which makes this strategy
             | viable.
        
           | omikun wrote:
           | I went from 970 to 3070 and it now draws less power on
           | average. I can even lower the max power to 50% and not notice
           | a difference for most games that I play.
        
         | elorant wrote:
         | You don't need to run them in x16 mode though. For inference
         | even half that is good enough.
        
         | ashleyn wrote:
          | Most household circuits can only support 15-20 amps at the
          | plug. There will be an upper limit to this, and I suspect
          | this is Nvidia compromising on TDP in the short term to move
          | faster on compute.
        
           | SequoiaHope wrote:
           | I wonder if they will start putting lithium batteries in
           | desktops so they can draw higher peak power.
        
             | jbarrow wrote:
             | There's a company doing that for stovetops, which I found
             | really interesting (https://www.impulselabs.com)!
             | 
             | Unfortunately, when training on a desktop it's _relatively_
             | continuous power draw, and can go on for days. :/
        
           | Yizahi wrote:
           | So you are saying that Nvidia will finally force USA to the
           | 220V standard? :)
        
             | Reason077 wrote:
              | Many American homes already have 240V sockets (eg: NEMA
              | 14-30) for running clothes dryers, car chargers, etc.
              | These can provide up to 7200W!
              | 
              | I guess PC power supplies need to start adopting this
              | standard.
        
               | saltminer wrote:
                | You can't use a NEMA 14-30 to power a PC because 14-30
                | outlets are split-phase (that's why they have 4 prongs
                | - 2 hot legs, shared neutral, shared ground). To my
                | knowledge, the closest you'll get to multi-phase in
                | computing is connecting the redundant unit in a server
                | to a separate phase, or a DC distribution system
                | connected to a multi-phase rectifier, but those are
                | both relegated to the datacenter.
               | 
               | You could get an electrician to install a different
               | outlet like a NEMA 6-20 (I actually know someone who did
               | this) or a European outlet, but it's not as simple as
               | installing more appliance circuits, and you'll be paying
               | extra for power cables either way.
               | 
               | If you have a spare 14-30 and don't want to pay an
               | electrician, you could DIY a single-phase 240v circuit
               | with another center tap transformer, though I wouldn't be
               | brave enough to even attempt this, much less connect a
               | $2k GPU to it.
        
         | saomcomrad56 wrote:
          | It's good to know we can all heat our bedrooms while mining
          | shitcoins.
        
         | 6SixTy wrote:
         | Nvidia wants you to buy their datacenter or professional cards
         | for AI. Those often come with better perf/W targets, more VRAM,
         | and better form factors allowing for a higher compute density.
         | 
         | For consumers, they do not care.
         | 
         | PCIe Gen 4 dictates a tighter tolerance on signalling to
         | achieve a faster bus speed, and it took quite a good amount of
         | time for good quality Gen 4 risers to come to market. I have
         | zero doubt in my mind that Gen 5 steps that up even further
         | making the product design just that much harder.
        
           | throwaway48476 wrote:
           | In the server space there is gen 5 cabling but not gen 5
           | risers.
        
         | dabinat wrote:
         | This is the #1 reason why I haven't upgraded my 2080 Ti. Using
         | my laser printer while my computer is on (even if it's idle)
         | already makes my UPS freak out.
         | 
         | But NVIDIA is claiming that the 5070 is equivalent to the 4090,
         | so maybe they're expecting you to wait a generation and get the
         | lower card if you care about TDP? Although I suspect that
         | equivalence only applies to gaming; probably for ML you'd still
         | need the higher-tier card.
        
           | iwontberude wrote:
           | That's because you have a Brother laser printer which charges
           | its capacitors in the least graceful way possible.
        
             | throwaway81348 wrote:
             | Please expand, I am intrigued!
        
             | lukevp wrote:
             | This happens with my Samsung laser printer too, is it not
             | all laser printers?
        
               | bob1029 wrote:
               | It's mostly the fuser that is sucking down all the power.
               | In some models, it will flip on and off very quickly to
               | provide a fast warm up (low thermal mass). You can often
               | observe the impact of this in the lights flickering.
        
           | Reason077 wrote:
           | Does a laser printer need to be connected to a UPS?
        
             | iwontberude wrote:
              | It's not connected to the UPS directly; it's causing a
              | voltage dip on the circuit, which trips the UPS.
        
             | grujicd wrote:
              | A faulty iron in another room fried my LaserJet. A UPS
              | isn't just for loss of power; it should also protect from
              | power spikes. Btw, the printer was connected to a (cheap)
              | surge protector strip, which didn't help. On the positive
              | side, nothing else was fried and the laser was fixed for
              | 40 euros.
        
             | UltraSane wrote:
             | no
        
             | bob1029 wrote:
             | I would be careful connecting laser printers to consumer
             | UPS products. On paper all the numbers may line up, but I
             | don't know why you'd want to if you could otherwise avoid
             | it.
             | 
             | If the printer causes your UPS to trip when merely sharing
             | the circuit, imagine the impact to the semiconductors and
             | other active elements when connected as a protected load.
        
           | jandrese wrote:
           | The big grain of salt with that "the 5070 performs like a
           | 4090" is that it is talking about having the card fake in 3
           | extra frames for each one it properly generates. In terms of
           | actual performance boost a 5070 is about 10% faster than a
           | 4070.
        
             | p1esk wrote:
             | Source for your 10% number?
        
               | brokenmachine wrote:
               | I heard them say that in the Hardware Unboxed youtube
               | video yesterday.
               | 
               | I think it's this one https://youtu.be/olfgrLqtXEo
        
             | buran77 wrote:
             | According to Nvidia [0], DLSS4 with Multi Frame Generation
             | means "15 out of 16 pixels are generated by AI". Even that
             | "original" first out of four frames is rendered in 1080p
             | and AI upscaled. So it's not just 3 extra frames, it's also
             | 75% of the original one.
             | 
             | [0] https://www.nvidia.com/en-us/geforce/news/dlss4-multi-
             | frame-...
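              | 
              | The 15/16 figure is just the arithmetic, assuming 4x
              | upscaling (e.g. 1080p rendered, 4K output) plus 3
              | generated frames per rendered frame: only
              | $\frac{1}{4} \times \frac{1}{4} = \frac{1}{16}$ of the
              | output pixels are conventionally rendered, so the other
              | $\frac{15}{16}$ are generated.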
        
           | UltraSane wrote:
           | Why would you have your laser printer connected to your UPS?
        
         | zitterbewegung wrote:
          | Instead of risers just use PCIe extender cables and you can
          | get 4x 3090s working with a creator motherboard (google one
          | that you know can handle 4). You could also use a mining case
          | to do the same.
          | 
          | The advantage is that you can load a much larger model easily
          | (4x 24GB goes much further than a single 32GB card for
          | something like a 70B-parameter model).
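          | 
          | Rough sizing, assuming 4-bit quantization (my assumption, not
          | numbers from the announcement): a 70B-parameter model is
          | about $70 \times 10^9 \times 0.5$ bytes $\approx$ 35 GB of
          | weights plus KV cache, so a single 24GB or 32GB card falls
          | short while 4x 24GB = 96GB is comfortable.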
        
         | Geee wrote:
         | Yeah, that's bullshit. I have a 3090 and I never want to use it
         | at max power when gaming, because it becomes a loud space
         | heater. I don't know what to do with 575W of heat.
        
         | ryao wrote:
         | I wonder how many generations it will take until Nvidia
         | launches a graphics card that needs 1kW.
        
           | faebi wrote:
           | I wish mining was still a thing, it was awesome to have free
           | heating in the cold winter.
        
             | Arkhadia wrote:
             | Is it not? (Serious question)
        
               | abrookewood wrote:
               | Probably not on GPUs - think it all moved to ASICs years
               | ago.
        
         | wkat4242 wrote:
         | Yes but the memory bandwidth of the 5090 is insanely high
        
         | porphyra wrote:
         | soon you'll need to plug your PC into the 240 V dryer outlet
         | lmao
         | 
         | (with the suggested 1000 W PSU for the current gen, it's quite
         | conceivable that at this rate of increase soon we'll run into
         | the maximum of around 1600 W from a typical 110 V outlet on a
         | 15 A circuit)
        
         | jmward01 wrote:
          | Yeah. I've been looking at changing out my home lab GPU, but
          | I want low power and high RAM. NVIDIA hasn't been catering to
          | that at all. The new AMD APUs, if they can get their software
          | stack to work right, would be perfect: 55W TDP and access to
          | nearly 128GB, admittedly at 1/5 the memory bandwidth (which
          | likely means 1/5 the real performance for the tasks I am
          | looking at, but at 55W and being able to load 128GB...).
        
         | skocznymroczny wrote:
         | In theory yes, but it also depends on the workload. RTX 4090 is
         | ranking quite well on the power/performance scale. I'd rather
         | have my card take 400W for 10 minutes to finish the job than
         | take only 200W for 30 minutes.
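          | 
          | Put in terms of energy rather than power:
          | $400\,W \times \frac{1}{6}\,h \approx 67\,Wh$ vs.
          | $200\,W \times 0.5\,h = 100\,Wh$, so the faster,
          | higher-wattage run actually uses less total energy.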
        
         | abrookewood wrote:
         | Sooo much heat .... I'm running a 3080 and playing anything
         | demanding warms my room noticeably.
        
       | holoduke wrote:
       | Some of the better video generators with pretty good quality can
       | run on the 32GB version. Expect lots of AI-generated videos with
       | this generation of video cards. The price is steep, and we need
       | another ATI 9700-style success story for some serious Nvidia
       | competition. Not going to happen anytime soon, I'm afraid.
        
       | snarfy wrote:
       | I'm really disappointed in all the advancement in frame
       | generation. Game devs will end up relying on it for any decent
       | performance in lieu of actually optimizing anything, which means
       | games will look great and play terribly. It will be 300 fake fps
       | and 30 real fps. Throw latency out the window.
        
       | williamDafoe wrote:
       | It looks like the new cards are NO FASTER than the old cards. So
       | they are hyping the fake frames, fake pixels, fake AI rendering.
       | Anything fake = good, anything real = bad.
       | 
       | This is the same thing they did with the RTX 4000 series. More
       | fake frames, less GPU horsepower, "Moore's Law is Dead", Jensen
       | wrings his hands, "Nothing I can do! Moore's Law is Dead!" It's
       | the same excuse Intel has used for its slacking since 2013.
        
         | vinyl7 wrote:
         | Everything is fake these days. We have mass
         | psychosis...everyone is living in a collective schizophrenic
         | delusion
        
         | holoduke wrote:
         | It's more like the 20 series. Definitely faster, and for me
         | worth the upgrade. I just use the transistor counts as a
         | reference: 92 vs. 77 billion. So yeah, not that much.
        
       | numpy-thagoras wrote:
       | Similar CUDA core counts for most SKUs compared to last gen
       | (except in the 5090 vs. 4090 comparison). Similar clock speeds
       | compared to the 40-series.
       | 
       | The 5090 just has way more CUDA cores and uses proportionally
       | more power compared to the 4090, when going by CUDA core
       | comparisons and clock speed alone.
       | 
       | All of the "massive gains" were comparing DLSS and other
       | optimization strategies to standard hardware rendering.
       | 
       | Something tells me Nvidia made next to no gains for this
       | generation.
        
         | danudey wrote:
         | > All of the "massive gains" were comparing DLSS and other
         | optimization strategies to standard hardware rendering.
         | 
         | > Something tells me Nvidia made next to no gains for this
         | generation.
         | 
         | Sounds to me like they made "massive gains". In the end, what
         | matters to gamers is
         | 
         | 1. Do my games look good?
         | 
         | 2. Do my games run well?
         | 
         | If I can go from 45 FPS to 120 FPS and the quality is still
         | there, I don't care if it's because of frame generation and
         | neural upscaling and so on. I'm not going to be upset that it's
         | not lovingly rasterized pixel by pixel if I'm getting the same
         | results (or better, in some cases) from DLSS.
         | 
         | To say that Nvidia made no gains this generation makes no sense
         | when they've apparently figured out how to deliver better
         | results to users for less money.
        
           | throwaway48476 wrote:
           | Fake frames, fake gains
        
             | ThrowawayTestr wrote:
             | The human eye can't see more than 60 fps anyway
        
               | geerlingguy wrote:
               | Can definitely see more than 60, but it varies how much
               | more you can see. For me it seems like diminishing
               | returns beyond 144Hz.
               | 
               | Though some CRT emulation techniques require more than
               | that to scale realistic 'flickering' effects.
        
               | UltraSane wrote:
                | I can tell the difference up to about 144Hz but struggle
                | to really notice going from 144 to 240Hz. Even if you
                | don't consciously notice the higher refresh rate, it
                | could still help in really fast-paced games like
                | competitive FPS by reducing input latency, provided you
                | can actually generate that many frames per second and
                | respond fast enough.
        
               | Salgat wrote:
                | The human eye acts as an analog low-pass filter, so
                | beyond 60Hz things start to blur together, which is
                | still desirable since that's what we see in real life.
                | But there is a cutoff where even the blurring itself can
                | no longer increase fidelity. Also keep in mind that this
                | benefit helps visuals even when the frame rate is beyond
                | human response time.
        
             | UltraSane wrote:
             | Are DLSS frames any more fake than the computed P or B
             | frames?
        
               | mdre wrote:
               | Yes.
        
             | m3kw9 wrote:
              | The fps gains come directly from the AI compute cores. I'd
              | say that's a net gain, but not in the traditional, pre-AI
              | sense.
        
           | CSDude wrote:
            | I have a 2070 Super. The latest Call of Duty runs at 4K with
            | good quality using DLSS at 60 fps, and I can't notice the
            | upscaling at all (unless I look very closely, even with my
            | 6K Pro Display XDR). So yeah, I was thinking of building a
            | 5090-based computer, and with the latest AI developments it
            | will probably last many more years than my 2070 Super has.
        
           | dragontamer wrote:
            | Because if two frames are fake and only one frame is based
            | on real movements, then you've actually added a fair bit of
            | latency and will have noticeably laggier controls.
            | 
            | Making better-looking individual frames and benchmark
            | numbers at the cost of a worse gameplay experience is an old
            | tradition for these GPU makers.
        
             | lovich wrote:
             | If anyone thinks they are having laggier controls or losing
             | latency off of single frames I have a bridge to sell them.
             | 
              | A game running at 60 fps averages around 16 ms per frame,
              | and good human reaction times don't go much below 200 ms.
              | 
              | Users who "notice" individual frames are usually noticing
              | when a single frame lags for the length of several frames
              | at the average rate. They aren't noticing anything within
              | the span of an average frame's lifetime.
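
       For reference, the frame-time arithmetic both sides are arguing
       about, as a hedged sketch: it assumes frame generation leaves input
       sampling tied to the underlying "real" frame rate and ignores any
       engine-specific buffering the technique may add.

           # Frame times at common refresh rates, and what 4x frame
           # generation does (and does not) change.
           def frame_ms(fps: float) -> float:
               return 1000.0 / fps

           for fps in (30, 60, 144, 240):
               print(f"{fps:>3} fps -> {frame_ms(fps):5.1f} ms per frame")

           real_fps, gen_factor = 60, 4   # e.g. 4x Multi Frame Generation
           print("displayed:", real_fps * gen_factor, "fps; input still tied to",
                 round(frame_ms(real_fps), 1), "ms real-frame steps")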
        
               | foxhill wrote:
               | you're conflating reaction times and latency perception.
               | these are not the same. humans can tell the difference
               | down to 10ms, perhaps lower.
               | 
                | if you added 200ms latency to your mouse inputs, you'd
                | throw your computer out of the window pretty quickly.
        
               | heyda wrote:
                | You think normal people can't tell? Go turn your monitor
                | to 60Hz in your video options and move your mouse in
                | circles on your desktop, then turn it back to 144Hz or
                | higher and move it around on your screen. If an average
                | CS:GO or Valorant player were to play with framegen
                | while the real fps was about 60 and the rest of the
                | frames were fake, it would be so completely obvious it's
                | almost laughable. That said, the 5090 can obviously run
                | those games at 200+ fps, so they would just turn off any
                | frame gen stuff. But a new/next-gen twitch shooter will
                | for sure expose it.
        
               | swinglock wrote:
               | I'll take that bridge off your hands.
        
             | UltraSane wrote:
              | DLSS 4 can actually generate 3 frames for every 1
              | rasterized frame. At frame rates well above 200 per
              | second, a few extra frames of latency isn't that big of a
              | deal unless you are a professional competitive gamer.
        
           | bitmasher9 wrote:
           | Rasterizing results in better graphics quality than DLSS if
           | compute is not a limiting factor. They are trying to do an
           | apples to oranges comparison by comparing the FPS of standard
           | rendering to upscaled images.
           | 
           | I use DLSS type tech, but you lose a lot of fine details with
           | it. Far away text looks blurry, textures aren't as rich, and
           | lines between individual models lose their sharpness.
           | 
           | Also, if you're spending $2000 for a toy you are allowed to
           | have high standards.
        
             | UltraSane wrote:
             | DLSS 4 uses a completely new model with twice as many
             | parameters and seems to be a big improvement.
        
               | bitmasher9 wrote:
               | I hope so, because it looks like 8k traditional rendering
               | won't be an option for this decade.
        
               | brokenmachine wrote:
               | Why is that an issue? Do you have an 8k monitor?
        
             | maxglute wrote:
             | > if compute is not a limiting factor.
             | 
              | If we're moving towards real-time ray tracing, compute is
              | always going to be a limiting factor, as it was in the
              | days of pre-rendering. Granted, current raster techniques
              | can simulate ray tracing pretty well in many scenarios and
              | look much better in motion; IMO that's more a limitation
              | of real-time ray tracing. There's a bunch of image quality
              | improvement beyond raster to be gained if enough compute
              | is thrown at ray tracing. I think a lot of the DLSS /
              | frame generation goal is basically to free up compute to
              | generate higher-IQ hero frames while filling in the
              | blanks.
        
           | hervature wrote:
            | These are Nvidia's financial results from last quarter:
           | 
           | - Data Center: Third-quarter revenue was a record $30.8
           | billion
           | 
           | - Gaming and AI PC: Third-quarter Gaming revenue was $3.3
           | billion
           | 
            | If the gains are for only 10% of your customers, I would
            | put this closer to "next to no gains" than to "massive
            | gains".
        
         | laweijfmvo wrote:
         | I started thinking today, with Nvidia seemingly just magically
         | increasing performance every two years, that they eventually
         | have to "Intel" themselves: go ~10 years without any real
         | architectural improvements until suddenly power and thermals
         | don't scale anymore and you have six generations of turds that
         | all perform essentially the same, right?
        
           | ryao wrote:
           | Nvidia is a very innovative company. They reinvent solutions
           | to problems while others are trying to match their old
           | solutions. As long as they can keep doing that, they will
           | keep improving performance. They are not solely reliant on
           | process node shrinks for performance uplifts like Intel was.
        
         | Salgat wrote:
         | The 5090's core increase (30%) is actually underwhelming
         | compared to the 3090->4090 increase (60% more), but the real
         | game changer is the memory improvements, both in size and
         | bandwidth.
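
       Those percentages line up with the commonly reported CUDA core
       counts (treat the figures below as approximate spec-sheet numbers):

           # Generation-over-generation CUDA core growth.
           cores = {"3090": 10496, "4090": 16384, "5090": 21760}
           print((cores["4090"] / cores["3090"] - 1) * 100)  # ~56% (3090 -> 4090)
           print((cores["5090"] / cores["4090"] - 1) * 100)  # ~33% (4090 -> 5090)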
        
       | sfmike wrote:
       | One thing I always remember when people say a $2k GPU is
       | insanity: how many people get a $2k ebike, a $100k weekend car,
       | a $15k motorcycle they use once a month, or a timeshare home?
       | Comparatively, a gamer using a $3k 4090 build even a few hours a
       | day is getting an amazing return on that investment.
        
       ___________________________________________________________________
       (page generated 2025-01-07 23:00 UTC)