[HN Gopher] AMD Ryzen 7 9800X3D Linux Performance: Zen 5 With 3D...
       ___________________________________________________________________
        
       AMD Ryzen 7 9800X3D Linux Performance: Zen 5 With 3D V-Cache
        
       Author : mfiguiere
       Score  : 127 points
       Date   : 2024-11-06 14:38 UTC (8 hours ago)
        
 (HTM) web link (www.phoronix.com)
 (TXT) w3m dump (www.phoronix.com)
        
       | antisthenes wrote:
       | 9800X3D looks like an all-around winner, so if you don't mind
       | spending $500 on just the CPU, I don't see why anyone would get
       | anything else.
        
         | ThatMedicIsASpy wrote:
         | All-around winner in what? For $500 you can get a lot more
         | cores.
         | 
         | All-around winning, $500, 8 cores makes no sense.
         | 
         | This thing has a premium gaming price tag because there is
         | nothing close to it other than their own 7800X3D.
        
           | bhouston wrote:
           | What would you suggest instead?
           | 
           | It is pretty competitive on the Multi-Core rating:
           | https://browser.geekbench.com/v6/cpu/8633320 compared to
           | other CPUs: https://browser.geekbench.com/processor-
           | benchmarks
        
           | jandrese wrote:
           | The benchmarks in the article suggest that more cores are
           | largely wasted on real world applications.
        
             | ThatMedicIsASpy wrote:
             | Yes so buy according to your needs? 8 cores do not cost
             | $500.
        
               | behringer wrote:
               | They do when those cores are 2 to 4 times faster than the
               | rest.
        
           | sliken wrote:
           | In theory, yes. But in the real world the bottleneck is the
           | same 128-bit-wide memory interface that's been popular
           | since the time of dual-core chips.
           | 
           | Fewer cache misses (on popular workloads) help decrease
           | power and increase performance enough that few things
           | benefit from 12-16 cores.
           | 
           | Thus the M3 Max (with a 512-bit-wide memory system) has
           | class-leading single-core and multi-core scores.
        
             | 0xQSL wrote:
             | I'm not so sure about memory actually being the bottleneck
             | for these 8 core parts. If memory bandwidth is the
             | bottleneck this should show up in benchmarks with higher
             | dram clocks. I can't find any good application benchmarks,
             | but computerbase.de did it for gaming with 7800MHz vs
             | 6000MHz and didn't find much of a difference [1]
             | 
             | The apple chips are APUs and need a lot of their memory
             | bandwidth for the gpu. Are there any good resources on how
             | much of this bandwidth is actually used in common cpu
             | workloads? Can the CPU even max out half of the 512bit bus?
             | 
             | [1] https://www.computerbase.de/artikel/prozessoren/amd-
             | ryzen-7-...
        
               | wmf wrote:
               | For AMD I think Infinity Fabric is the bottleneck so
               | increasing memory clock without increasing IF clock does
               | nothing. And it's also possible that 8 cores with massive
               | cache simply don't need more bandwidth.
        
               | sliken wrote:
               | My understanding is the single CCD chips (like the
               | 9800x3d) have 2 IF links, while the dual CCD chips (like
               | the 9950x) have 1. Keep in mind these CCDs are shared
               | with turin (12 channel), threadripper pro (8 channel),
               | siena (6 channel), threadripper (4 channel).
               | 
               | The higher CCD configurations have 1 IF link per chip,
               | the lower have 2 IF links per chip. Presumably AMD
               | wouldn't bother with the 2 IF link chips unless it helped.
        
               | Dylan16807 wrote:
               | I can't find anything to back that up.
               | 
               | That said, each link gives a CCD 64GB/s of read speed and
               | 32GB/s of write speed. 8000MHz memory at 128 bits would
               | get up to 128GB/s. So being stuck with one link would
               | bottleneck badly enough to hide the effects of memory
               | speed.
        
               | sliken wrote:
               | I've been paying close attention, found various hints at
               | anandtech (RIP), chips and cheese, and STH.
               | 
               | It doesn't make much difference to most apps, but I
               | believe the single CCD (like the 9700x) has better
               | bandwidth to the IOD than their dual CCD chips, like the
               | 9900x and 9950x.
               | 
               | Similarly on the server chips you can get 2,4,8, or 16
               | CCDs. To get 16 cores you can use 2 CCDs or 16 CCDs! But
               | the sweet spot (max bandwidth per CCD) is at 8 CCDs where
               | you get a decent number of cores and twice the bandwidth
               | per CCD. Keep in mind the genoa/turin EPYC chips have 24
               | channels (32 bit x 24) for a 768 bit wide memory
               | interface. Not nearly as constrained as their desktops.
               | 
               | Wish I could paste in a diagram, but check out:
               | 
               | https://www.amd.com/content/dam/amd/en/documents/epyc-
               | techni...
               | 
               | Page 7 has a diagram of 96 core with one GMI (IF) port
               | per CCD and a 32 core chip two GMI ports per CCD.
               | 
               | That's a gen old I believe, the max CCDs is now 16, not
               | 12 with turin.
        
               | Dylan16807 wrote:
               | So "GMI3-wide" and similar terms are the important things
               | to search for.
               | 
               | some diagrams: https://www.servethehome.com/amd-epyc-
               | genoa-gaps-intel-xeon-...
               | 
               | From another page: _" The most noteworthy aspect is that
               | there is a new GMI3-Wide format. With Client Zen 4 and
               | previous generations of Zen chiplets, there was 1 GMI
               | link between the IOD and CCD. With Genoa, in the lower
               | core count, lower CCD SKUs, multiple GMI links can be
               | connected to the CCD."_
               | 
               | And it seems like all the chiplets have two links, but
               | everything I can find says they just don't hook up both
               | on consumer parts.
        
               | sliken wrote:
               | Didn't find anything clearly stating one way or another,
               | but the CCD is the same between ryzen and epyc, so
               | there's certainly the possibility.
               | 
               | I dug around a bit, and it seems Ryzen doesn't get it. I
               | guess that makes sense, if the IOD on ryzen gets 2 GMI
               | links. On the single CCD parts there's no other CCD to
               | talk to. On the dual CCD parts there's not enough GMI
               | links to have both with GMI-wide.
               | 
               | Maybe this will be different on the pending Zen 5 part
               | (Strix Halo) that will have 256 bits wide (16 x 32 bit) @
               | 8533 MHz = 266 GB/sec since there will be 2 CCDs and a
               | significant bump to memory bandwidth.
        
               | wmf wrote:
               | I'm pretty sure that memory bandwidth is only for the GPU
               | just like on Apple silicon.
        
               | Dylan16807 wrote:
               | Yeah, the most relevant diagram I can find shows 32 bytes
               | wide per core cluster and 128 bytes to the GPU.
        
               | sliken wrote:
               | Apple silicon manages around 50% (give or take) for the
               | CPUs.
        
               | sliken wrote:
               | Well there's much more to memory performance than
               | bandwidth. Generally applications are relatively cache
               | friendly, thus the X3D helps a fair bit, especially with
               | more intensive games (ones that barely hit 60 fps, not
               | the silly game benchmarks that hit 500 fps).
               | 
               | Generally CPUs have relatively small reorder windows, so
               | a cache miss hurts bad, 80ns latency @ 5 GHz is 400 clock
               | cycles, and something north of 1600 instructions that
               | could have been executed. If one in 20 operations is a
               | cache miss that's a serious impediment to getting any
               | decent fraction of peak performance. The pain of those
               | cache misses is part of why the X3D does so well: even a
               | few fewer cache misses can increase performance a fair
               | bit.
               | 
               | With 8c/16 threads, having only 2 (DDR4) or 4 (DDR5) cache
               | misses pending with a 128 bit wide system means that in
               | any given 80-100ns window only 2 or 4 cores can resume
               | after a cache miss. DDR-6000 vs DDR-7800 doesn't
               | change that much, you still wait the 80-100ns, you just
               | get the cache line in 8 (16 for DDR5) cycles @ 7800MT/sec
               | instead of 8 (16 for DDR5) cycles @ 6000MT/sec. So the
               | faster DDR5 means more bandwidth (good for GPUs), but not
               | more cache transactions in flight (good for CPUs).
               | 
               | With better memory systems (like the Apple M3 Max) you
               | could have 32 cache misses per 80-100ns. I believe about
               | half of those are reserved for the GPU, but even 16 would
               | mean that all of the 9800X3D's 16 threads could resolve a
               | cache miss per 80-100ns instead of just 2 or 4.
               | 
               | That's part of why a M4 max does so well on multithreaded
               | code. M4 max does better on geekbench 6 multithread than
               | not only the 9800x3d (with 16 threads) but also a 9950x
               | (with 16c/32 threads). Pretty impressive for a low TDP
               | chip that fits in thin/light laptop with great battery
               | life and competes well against Zen 5 chips with a 170
               | watt TDP that often use water cooling.
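               | 
               | The miss-cost numbers above can be sketched like this.
               | It's a deliberately crude model: the 4-wide issue rate is
               | an assumption (it's what turns 400 cycles into 1600
               | instruction slots), every miss is assumed to go all the
               | way to DRAM, and nothing overlaps with the stall. The
               | function names are made up for illustration:

```python
def stall_cycles(latency_ns: float, clock_ghz: float) -> float:
    """Clock cycles that elapse while one DRAM access is outstanding."""
    return latency_ns * clock_ghz


def lost_issue_slots(latency_ns: float, clock_ghz: float,
                     issue_width: int) -> float:
    """Instruction slots that go unused if the core fully stalls on the miss."""
    return stall_cycles(latency_ns, clock_ghz) * issue_width


def avg_cycles_per_op(latency_ns: float, clock_ghz: float,
                      issue_width: int, ops_per_miss: int) -> float:
    """Average cycles per operation when one in `ops_per_miss` ops misses.

    Assumes no overlap between the stall and other work, which real
    out-of-order cores partially hide, so treat this as a worst case.
    """
    base = 1.0 / issue_width
    return base + stall_cycles(latency_ns, clock_ghz) / ops_per_miss


if __name__ == "__main__":
    print(stall_cycles(80, 5.0))             # cycles per miss
    print(lost_issue_slots(80, 5.0, 4))      # slots lost per miss
    print(avg_cycles_per_op(80, 5.0, 4, 20)) # one miss in 20 ops
```

               | Under those assumptions a miss rate of one in 20 takes
               | the core from 0.25 cycles per op to about 20, which is
               | why trimming even a few misses shows up in benchmarks.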
        
               | Dylan16807 wrote:
               | > only 2 (DDR4) or 4 (DDR5) cache misses pending with a
               | 128 bit wide system
               | 
               | Isn't that the purpose of banks and bank groups, letting
               | a bunch of independent requests work in parallel on the
               | same channel?
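               | 
               | One way to sanity-check this is Little's law: requests in
               | flight = throughput x latency. A minimal sketch, assuming
               | 64-byte cache lines and the 80ns latency quoted above
               | (function names are hypothetical):

```python
def lines_in_flight(bandwidth_gb_s: float, latency_ns: float,
                    line_bytes: int = 64) -> float:
    """Cache lines that must be outstanding to sustain a given bandwidth.

    GB/s times ns gives bytes, because the 1e9 and 1e-9 factors cancel.
    """
    bytes_in_flight = bandwidth_gb_s * latency_ns
    return bytes_in_flight / line_bytes


def sustained_gb_s(outstanding_lines: int, latency_ns: float,
                   line_bytes: int = 64) -> float:
    """Bandwidth reachable with a fixed number of outstanding cache lines."""
    return outstanding_lines * line_bytes / latency_ns


if __name__ == "__main__":
    # DDR5-6000 at 128 bits peaks at 96 GB/s.
    print(lines_in_flight(96.0, 80.0))
    # Bandwidth if only 4 lines could ever be outstanding.
    print(sustained_gb_s(4, 80.0))
```

               | By this estimate, getting anywhere near 96GB/s at 80ns
               | needs on the order of 120 lines in flight, while 4
               | outstanding lines would cap out at a few GB/s.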
        
               | sliken wrote:
               | Dimms are dumb. Not sure, but maybe rambus helped improve
               | this. Dimms are synchronous and each memory channel can
               | have a single request pending. So upon a cache miss on
               | the last level cache (usually L3) you send a row, column,
               | wait 60ns or so, then get a cache line back. Each memory
               | channel can only have a single memory transaction (read
               | or write) in flight. The memory controller (usually
               | sitting between the L3 and ram) can have numerous cache
               | misses pending, each waiting for the right memory channel
               | to free.
               | 
               | There are minor tweaks, I believe you can send a row,
               | column, then on future accesses send only the column.
               | There are also slight differences in memory pages (a DIMM
               | page != kernel page) that decrease latency with locality.
               | But the differences are minor and don't really move the
               | needle on main memory latency of 60 ns (not including the
               | L1/L2/L3 lookups, which have to miss before getting to
               | the memory controller).
               | 
               | There are of course smarter connections, like AMD's
               | hypertransport or more recently infinity fabric (IF) that
               | are async and can have many memory transactions in
               | flight. But sadly the dimms are not connected to HT/IF.
               | IBM's OMI is similar, fast async serial interface, with
               | an OMI connection to each ram stick.
        
           | Hikikomori wrote:
           | Cores or "cores"?
        
         | LorenDB wrote:
         | As a C++ programmer, I just bought a 9900X for my first PC
         | build. Sure, it won't game as well, but I like fast compile
         | times, and the 9900X is on sale for $380 right now. That's $100
         | cheaper than the 9800X3D launch price.
        
           | jeffbee wrote:
           | Yeah, these Zen 5 are killer for that kind of workload. I
           | also replaced my workstation with a 9900-series CPU since my
           | Intel 14900K fried itself, and I am very pleased with every
           | aspect, except idle power consumption which is a minor
           | drawback.
           | 
           | It looks like the X3D is no better than the 9900X for non-
           | game single-threaded workloads like browsers, and it's much
           | worse than the 12 or 16 core parts in terms of overall
           | throughput, so for a non-gamer the plain X seems much better
           | than the X3D.
        
             | mdre wrote:
             | What's your idle power consumption for AMD vs Intel if you
             | don't mind me asking? I'm getting avg 125W for my 13900k
             | build, measured at the wall and it mildly bugs me when I
             | think of it, I thought it'd be closer to 80. And power is
             | very expensive where I live now.
        
               | jeffbee wrote:
               | If you are getting 125W at the wall on a PC at idle, your
               | machine or operating system is extremely broken, or you
               | are running atmosphere physics simulations all the time.
               | The SoC on my Intel box typically drew < 1W as measured
               | by RAPL. The 9950X draws about 18W measured the same way.
               | Because of platform overhead the difference in terms of
               | ratio is not that large but the Ryzen system is drawing
               | about 40W at the wall when it's just sitting there.
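               | 
               | For anyone wanting to reproduce the RAPL number, here's a
               | minimal sketch that samples the Linux powercap energy
               | counter twice and converts the difference to watts. The
               | sysfs path is an assumption (package domain 0); whether
               | it exists depends on kernel and CPU, and reading it may
               | need root:

```python
import time
from pathlib import Path

# Assumed location of the package-0 RAPL domain; may differ per system.
RAPL_DIR = Path("/sys/class/powercap/intel-rapl:0")


def average_watts(e0_uj: int, e1_uj: int, seconds: float,
                  max_range_uj: int) -> float:
    """Average power between two energy counter readings in microjoules."""
    delta = e1_uj - e0_uj
    if delta < 0:
        # The counter wrapped around between the two readings.
        delta += max_range_uj
    return delta / 1e6 / seconds


def sample_package_watts(interval_s: float = 1.0) -> float:
    """Read the energy counter twice and return average package power."""
    max_range = int((RAPL_DIR / "max_energy_range_uj").read_text())
    e0 = int((RAPL_DIR / "energy_uj").read_text())
    time.sleep(interval_s)
    e1 = int((RAPL_DIR / "energy_uj").read_text())
    return average_watts(e0, e1, interval_s, max_range)


if __name__ == "__main__":
    print(f"{sample_package_watts():.1f} W")
```

               | This only covers what the package reports, so it will
               | always read lower than a meter at the wall.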
        
               | zokier wrote:
               | Discrete gpu can easily add 20-40w of idle power draw, so
               | that's something to keep in mind. I believe that 60ish
               | watts is pretty typical idle consumption for desktop
               | system, Ryzens typically having 10w higher idle draw than
               | Intel. Some random reviews with whole system idle
               | measurements:
               | 
               | https://hothardware.com/reviews/amd-
               | ryzen-7-9800x3d-processo...
               | 
               | https://www.techpowerup.com/review/amd-
               | ryzen-7-9800x3d/23.ht...
        
               | jeffbee wrote:
               | Those comparisons are using a water cooling rig which
               | already blows out the idle power budget. 60W is in no way
               | typical of PC idle power. Your basic Intel PC draws no
               | more power than a laptop, low single digits of watts at
               | the load, low tens of watts at the wall. My NUC12, which
               | is no slouch, draws <5W at the wall when the display is
               | off and when using Wi-Fi instead of Ethernet.
        
               | mdre wrote:
               | Hmm. I'm using an AIO cooler, a 3090 and a 1600W platinum
               | psu - might be a bit inefficient. I remember unplugging
               | the PSU and 3090 and plugging in a 650W gold PSU -- the
               | system drew 70W IIRC. That's a wild difference still!
        
               | jeffbee wrote:
               | Yeah, oversized power supplies are also responsible for
               | high idle power. "Gold" etc ratings are for their
               | efficiency at 50%-100% rated power, not how well they
               | scale down to zero, unfortunately. I have never owned a
               | real GPU, I use the IGP or a fanless Quadro, so I don't
               | have firsthand experience with how that impacts idle
               | power.
        
               | zokier wrote:
               | Gold rating is down to 20%, Titanium is to 10% https://en
               | .wikipedia.org/wiki/80_Plus#Efficiency_level_certi...
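               | 
               | To put the figures from this subthread against those
               | load points, a quick sketch. It treats wall draw as if it
               | were PSU output load, which slightly overstates the
               | load, and the names are made up for illustration:

```python
# Lowest load point covered by each rating, as stated in the comment above.
LOWEST_TESTED_LOAD = {"gold": 0.20, "titanium": 0.10}


def load_fraction(draw_w: float, rated_w: float) -> float:
    """Fraction of the PSU's rated output that the system is drawing."""
    return draw_w / rated_w


def below_tested_range(draw_w: float, rated_w: float, tier: str) -> bool:
    """True if the load sits under the lowest point the rating covers."""
    return load_fraction(draw_w, rated_w) < LOWEST_TESTED_LOAD[tier]


if __name__ == "__main__":
    # Figures quoted earlier in the thread: 125W on a 1600W unit,
    # 70W on a 650W unit.
    for draw, rated in ((125, 1600), (70, 650)):
        print(f"{draw}W on {rated}W: {load_fraction(draw, rated):.1%}")
```

               | 125W on a 1600W unit is under 8% load, below even the
               | Titanium 10% point, so the rating says nothing about
               | efficiency there.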
        
           | IAmGraydon wrote:
           | I'm about to build a new system and am planning on using the
           | 9900X. It's primarily for coding, Adobe CC, and Ableton, with
           | maybe a rare gaming session here and there. It seems that the
           | 9900X is the best bang for the buck right now. It games just
           | fine, BTW.
        
         | Wytwwww wrote:
         | Intel can still be kind of faster for "productivity" stuff? At
         | least if you are willing to pay for the >8000 MHz CUDIMMs
         | (which i don't think AMD even supports at full speed?) which
         | can result in pretty impressive performance. Of course the
         | value/price is probably not great...
        
       | Night_Thastus wrote:
       | Nice to actually have a decent release this generation of CPUs.
       | 
       | The rest of Zen5 was maybe a 5% bump on average, and Intel's new
       | series actually _regressed_ in performance compared to 14th gen.
       | 
       | Seems like the Zen5X3D's will be the only good parts this time
       | around.
        
         | 13hunteo wrote:
         | To cut Intel some slack, this latest version overhauls their
         | old architecture, and they were fairly upfront about the lack
         | of development in performance in this generation.
         | 
         | The idea is the new platform will allow for better development
         | in future, while improving efficiency fairly significantly.
        
           | qzw wrote:
           | Also nice to be able to boast a bigger uplift in the
           | following gen due to regressing this one! But they definitely
           | did need to get their efficiency under control since their
           | parts were turning into fairly decent personal heating units.
        
           | Night_Thastus wrote:
           | From a consumer standpoint - this doesn't matter. You can't
           | buy that future product that may exist. You can only choose
           | whether to buy the current product or not. And right now,
           | that product is bad.
           | 
           | I certainly hope the next generation is a massive bump for
           | Intel, but we'll see if that's the case.
        
           | fweimer wrote:
           | I think the new T-equivalent CPU could be very interesting if
           | Intel releases one. Those variants are optimized for 35W TDP,
           | and they can be used for building high-performance fanless
           | systems that can sustain their performance for quite some
           | time. The lower power requirements for Arrow Lake might be a
           | really good match there.
        
           | duskwuff wrote:
           | > To cut Intel some slack, this latest version overhauls
           | their old architecture...
           | 
           | ... and their 13th/14th generation processors had serious
           | problems with overvoltage-induced failures - they clearly
           | needed to step back and focus on reliability over
           | performance.
        
           | 5kg wrote:
           | it's scrapped, the new design:
           | https://www.pcworld.com/article/2507953/lunar-lakes-
           | integrat...
        
             | zeusk wrote:
             | Parent is quite possibly talking about arrow lake and not
             | lunar lake which is a mobile only part.
        
           | heraldgeezer wrote:
           | So why buy this generation and not wait unless your computer
           | broke and you NEED Intel?
        
         | notanote wrote:
         | Hardware Unboxed has the interesting theory that the I/O die,
         | which is unchanged between Zen4 and Zen5, is a significant
         | bottleneck especially for the latter. The 3D v-cache would then
         | ease the pressure there, and so see the cpu get an extra boost
         | beyond that expected from increased cache.
        
       | globnomulous wrote:
       | Sharing links from websites with intrusive video advertisements
       | should be prohibited. The websites should be banned, and those
       | who share links to them should receive a paddling.
        
         | sliken wrote:
         | Or maybe you should follow the recommendations of various
         | government agencies (including the FBI) and install an ad
         | blocker.
        
           | rjsw wrote:
           | The last time I viewed this particular website it detected
           | the adblocker and complained that I was depriving the owner
           | of income.
        
             | beeboop wrote:
             | uBlock Origin and the annoyance filters work fine for me
        
               | rjsw wrote:
               | I was using uBlock origin.
               | 
               | Had also seen how he had editorialized some of my mailing
               | list posts and I felt that I would be guilty of Gell-Mann
               | amnesia if I carried on reading the site.
        
             | sliken wrote:
             | I do wish I could pay $25 a month for my web content to be
             | ad free. Portioned out to websites I actually spent time
             | reading.
        
               | pizza234 wrote:
               | This is precisely what Scroll (1) used to do. It seems it
               | didn't end up well, unfortunately.
               | 
               | (1) https://en.wikipedia.org/wiki/Scroll_(web_service)
        
       | nisten wrote:
       | I am surprised at how much this thing is just straight up
       | crushing it with just 8 cores.
       | 
       | I think it topping the machine learning benchmarks has to do with
       | having only 8 cores to share the 96MB of L3 cache, which works
       | out to each core having 1MB of L2 + 12MB of L3, which is huge.
       | That means EACH THREAD has more cache than e.g. the entire Nvidia
       | 3090 (6MB L2 total), and this ends up taking FULL advantage of
       | the extra silicon of the various AVX extensions.
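       | 
       | Back-of-the-envelope version of that ratio. Note the L3 is
       | shared, so this is an average share rather than cache reserved
       | per core, and the function names are made up for illustration:

```python
def cache_per_core_mb(l3_total_mb: float, cores: int,
                      l2_per_core_mb: float) -> float:
    """Private L2 plus an even share of the shared L3, per core."""
    return l3_total_mb / cores + l2_per_core_mb


def cache_per_thread_mb(l3_total_mb: float, cores: int,
                        l2_per_core_mb: float,
                        threads_per_core: int = 2) -> float:
    """Same figure split across the SMT threads on each core."""
    per_core = cache_per_core_mb(l3_total_mb, cores, l2_per_core_mb)
    return per_core / threads_per_core


if __name__ == "__main__":
    # Figures from the comment above: 96MB L3, 8 cores, 1MB L2 per core.
    print(cache_per_core_mb(96, 8, 1))
    print(cache_per_thread_mb(96, 8, 1))
```

       | That comes to 13MB per core, or 6.5MB per thread with SMT,
       | which is still above the 6MB L2 figure quoted for the 3090.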
        
         | BeefWellington wrote:
         | I'm curious to see if AMD will release a 9950X3D this time
         | around. I can foresee that kind of CPU dominating everything
         | else across most workloads given how well this 8-core is
         | holding up against CPUs with double the cores or more.
        
           | Tuna-Fish wrote:
           | Yes, it's supposedly coming early next year.
        
             | jsheard wrote:
             | I think the current rumor is that only one of the chiplets
             | will have the extra cache though, so you'll have 8 cores
             | with the big cache and 8 cores with the normal cache.
        
               | qzw wrote:
               | If they make one with extra cache on both CCDs, it would
               | probably get some kind of AI branding and be at a
               | significantly higher price point. Current games would
               | hardly benefit from 16 cores all with that much cache.
        
               | scheeseman486 wrote:
               | The main benefit is that it's a no-compromise product.
               | High single thread performance for games and there's more
               | of those cores for productivity, it'd be the best
               | workstation CPU and the best gaming CPU in one package.
        
               | qzw wrote:
               | But you'd get 95+% of the same benefit with the v-cache
               | on just one CCD, which is what they did with the 7950X3D.
        
           | didgetmaster wrote:
           | I have a 5950x that is now a few years old and I planned to
           | upgrade to a 9950x.
           | 
           | I have never had one of the 3D V-Cache processors and am
           | curious how it would improve the benchmarks for my multi-
           | threaded data management system that does many operations
           | against a set of 4K blocks of data.
           | 
           | I heard rumors that a 9950x3D version will be available in
           | January. I am trying to figure out if I should wait.
        
         | tiffanyh wrote:
         | While true, also keep in mind that the iPad Pro (M4), which has
         | _no_ active cooling and uses only 1/4th the power ... is
         | still faster (single & multicore) than this 9800X3D - and it's
         | also been on the market for half a year already.
        
           | osti wrote:
           | Yup I just looked at the clang score in geekbench, for single
           | threaded 9800x3d scored about 3200, whilst m4 had 4400... The
           | m4 is so far above the rest it's ridiculous. Wish Apple made
           | an x86 equivalent so that it can play Windows games lol.
        
             | nightski wrote:
             | Just supporting Linux would be adequate imho. Non-existent
             | Linux support straight up makes M4 a non-starter for myself
             | as much as I can admire the hardware.
        
               | osti wrote:
               | For developers yes, but gamers seem to have the loudest
               | voice in the desktop PC performance conversation, so I
               | think it's important to cater to that market.
        
               | nieve wrote:
               | Gamers in general are not looking at Apple's chips.
        
             | hulitu wrote:
             | > for single threaded 9800x3d scored about 3200, whilst m4
             | had 4400... The m4 is so far above the rest it's
             | ridiculous.
             | 
             | Except the fact that your computer runs more than one
             | thread. Pity that this "single core" performance cannot be
             | utilized at its maximum potential.
        
           | kuschku wrote:
           | For an apples to apples comparison, you'll need to compare
           | Zen 5 with M3, or whatever Zen 6 is going to be with M4.
           | 
           | Apple is paying for exclusive access to TSMC's next node.
           | That improves their final products, but doesn't make their
           | architecture inherently better.
        
             | ricketycricket wrote:
             | Do you though? M4 is what is on the market now and this
             | chip is just coming out. Maybe they are on different
             | processes, but you still have to compare things at a given
             | point in time.
        
             | rowanG077 wrote:
             | Why would a consumer care about what node something is on?
             | You should only care about a set of processors that is
             | available in the market at the same time. The M4 is
             | available now and Zen 6 is not. Once Zen 6 is here we'll
             | probably have an M5.
        
           | adrian_b wrote:
           | Single core yes, but multi core no.
           | 
           | The Geekbench scores cannot compare laptop CPUs with desktop
           | CPUs, because the tasks that are executed are too short and
           | they do not demonstrate the steady-state throughput of the
           | CPUs. The desktop CPUs are much faster for multithreaded
           | tasks in comparison with laptop/tablet CPUs than it appears
           | in the GB results.
           | 
           | The Apple CPUs have a much better instructions-per-clock-
           | cycle ratio than any other CPUs, and now in M4 they also have
           | a relatively high clock frequency, of at least 4.5 GHz. This
           | allows them to win most single-threaded benchmarks.
           | 
           | However the performance in multi-threaded benchmarks has a
           | very weak dependence on the CPU microarchitecture and it is
           | determined mostly by the manufacturing process used for the
           | CPU.
           | 
           | If we were able to compare Intel, AMD and Apple CPUs with the
           | same number of cores and made with the same TSMC process,
           | their multithreaded performance would be very close at a
           | given power consumption.
           | 
           | The reason is that executing a given benchmark requires a
           | number of logic transitions that is about the same for
           | different microarchitectures, unless some of the design teams
           | have been incompetent. An Apple CPU does more logic
           | transitions per clock cycle, so in single thread it finishes
           | the task faster.
           | 
           | However in multithreaded execution, where the power
           | consumption of the CPU reaches the power limit, the number of
           | logic transitions per second in the same manufacturing
           | process is determined by the power consumption. Therefore the
           | benchmark will be completed in approximately the same number
           | of seconds when the power limits are the same, regardless of
           | the differences in the single-threaded performance.
           | 
           | At equal power, an M4 will have a slightly better MT
           | performance than an Intel or AMD CPU, due to the better
           | manufacturing process, but the difference is too small to
           | make it competitive with a desktop CPU.
        
             | wtallis wrote:
             | > The Geekbench scores cannot compare laptop CPUs with
             | desktop CPUs, because the tasks that are executed are too
             | short and they do not demonstrate the steady-state
             | throughput of the CPUs. The desktop CPUs are much faster
             | for multithreaded tasks in comparison with laptop/tablet
             | CPUs than it appears in the GB results.
             | 
             | Bullshit. What you're talking about is the steady-state of
             | the _heatsink_ , not the steady state of the _chip_. Intel
             | learned the hard way that a fast CPU core in a phone
             | _really does_ become a fast CPU core in a laptop or desktop
             | when given a better cooling solution.
             | 
             | > However in multithreaded execution, where the power
             | consumption of the CPU reaches the power limit, the number
             | of logic transitions per second in the same manufacturing
             | process is determined by the power consumption. Therefore
             | the benchmark will be completed in approximately the same
             | number of seconds when the power limits are the same,
             | regardless of the differences in the single-threaded
             | performance.
             | 
             | No, microarchitecture really does matter. And so does the
             | macro architecture of AMD's desktop chips that burn a huge
             | amount of power on an inefficient chip to chip interface.
        
           | heraldgeezer wrote:
           | And the OS is terrible, so it's practically useless for me.
        
           | ploxiln wrote:
           | Hehe ... yeah, single threaded, in some benchmarks. Very
           | impressive chip, the M4. Multi-threaded loads that take more
           | than 30 seconds, no way, come on. But to see the X3D chips
           | really shine above their competitors, you need to slot in a
           | high-end graphics card, and load up a ... uh well you can't
           | compare to Apple Silicon at that point ...
        
           | JohnTHaller wrote:
           | In multi-threaded workloads, the M4 gets 13,380, the 9800X3D
           | gets ~19,000 (varies by build), and the 9950X gets
           | 22,000-24,000 depending on build.
           | 
           | The M4 Max you can pre-order gets around 26,000 multicore but
           | is significantly more expensive than the 9950X ($569) or
           | 9800X3D ($479). The M4 Max is a $1,200 premium over the M4 on
           | the 14 inch MacBook Pro and a $1,100 premium over the M4 Pro
           | on the 16 inch.
           | 
           | The M4 Max is only available in the MacBook Pro at present.
           | The Mac Mini and iMac will only get the base M4. The Mac
           | Studio is still based on the M2.
           | 
           | This is just a summary of performance and cost. Portability,
           | efficiency, and compatibility factors will weigh everyone's
           | choices.
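
As a rough illustration of the figures quoted above, here is a minimal Python sketch of multi-core score per dollar for the two desktop parts. It uses only the numbers in the comment. Taking the midpoint of the 9950X's 22,000-24,000 range is an assumption, and the M4 Max is left out because the comment gives only its upgrade premium, not a standalone chip price.

```python
# Figures quoted in the comment above: (multi-core score, CPU price in USD).
# The 9950X score is the midpoint of the quoted 22,000-24,000 range (an assumption).
chips = {
    "9800X3D": (19_000, 479),
    "9950X": ((22_000 + 24_000) / 2, 569),
}

def points_per_dollar(score, price_usd):
    """Multi-core benchmark points per dollar of CPU price."""
    return score / price_usd

for name, (score, price) in chips.items():
    print(f"{name}: {points_per_dollar(score, price):.1f} points per dollar")
```

On these numbers the two chips land within about one point per dollar of each other, so the ratio alone does not separate them.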
        
         | jsheard wrote:
         | > I am surprised at how much this thing is just straight up
         | crushing it with just 8 cores.
         | 
         |  _Cache rules everything around me_
        
       | drumhead wrote:
       | Just seen the figures, it's ridiculously good. The gap over its
       | competition is staggering. I hope the Intel hubris doesn't set in
       | at AMD, especially with the ARM pack snapping at their heels.
        
       | aurareturn wrote:
       | https://tpucdn.com/review/amd-ryzen-7-9800x3d/images/efficie...
       | 
       | Raw gaming performance increase is good but its gaming efficiency
       | seems to have taken a dip compared to 7800X3D.
       | 
       | So AMD chose to decrease efficiency to get more performance this
       | generation.
       | 
       | Source: https://www.techpowerup.com/review/amd-
       | ryzen-7-9800x3d/23.ht...
        
         | Numerlor wrote:
         | The efficiency is only worse because the CPU can use the power
         | without burning itself up unlike the last generation's X3D. And
         | efficiency is always better at lower clocks. You can get this
         | generation's efficiency uplift by limiting its power to the
         | levels where last generation's CPU started throttling to keep
         | its 89C Tjmax, but that will inevitably also limit the
         | frequency that's the main performance uplift for the CPU.
         | 
         | For comparison on how limited last gen's X3D was wrt power,
         | Tom's Hardware has it at 71W with all-core AVX, while my 7600X
         | with 2 fewer cores consumes up to 130W.
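
The point that efficiency is better at lower clocks follows from the standard first-order model of CMOS dynamic power, P = C * V^2 * f: higher frequencies need higher voltage, so energy per cycle grows with V^2. A minimal Python sketch follows. The capacitance constant and the two voltage/frequency operating points are hypothetical, chosen only for illustration, and are not measurements of any CPU.

```python
def dynamic_power(c_eff, voltage, freq_hz):
    """First-order CMOS dynamic power: P = C * V^2 * f (ignores leakage)."""
    return c_eff * voltage ** 2 * freq_hz

def energy_per_cycle(c_eff, voltage):
    """Energy per clock cycle is P / f = C * V^2, independent of frequency."""
    return c_eff * voltage ** 2

C_EFF = 1e-9  # hypothetical effective switched capacitance, in farads

# Hypothetical operating points: (label, volts, hertz).
points = [("low clock", 1.0, 4.0e9), ("high clock", 1.25, 5.0e9)]

for label, v, f in points:
    p = dynamic_power(C_EFF, v, f)
    e = energy_per_cycle(C_EFF, v)
    print(f"{label}: {p:.2f} W, {e * 1e9:.2f} nJ per cycle")
```

In this toy model a 25% clock increase costs roughly 95% more power, because the voltage has to rise along with the frequency.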
        
           | aurareturn wrote:
           | If I can summarize what you wrote: Same IPC gain as normal
           | Zen5 but more power can be drawn to increase performance due
           | to moving the cache chiplet to the bottom.
        
             | wtallis wrote:
             | The previous 3D cache solutions were not just limited
             | thermally, but also the cache chiplet could not tolerate
             | the high voltages that AMD's CPU cores use at high
             | frequencies. Even with excellent cooling, you weren't going
             | to get a 7800X3D or 5800X3D to match frequencies with the
             | non-3D parts. (This might have been less of a problem if
             | AMD could put the extra cache on a different power rail,
             | but that's hard to retrofit into an existing CPU socket.)
             | This new cache chiplet still has a lower voltage limit than
             | the CPU cores, but it's not as big a disparity.
        
         | Hikikomori wrote:
         | Man Intel is so far behind on that list.
        
           | Already__Taken wrote:
           | Bad arch decisions are punishing. AMD was absolutely dwarfed
           | in the early Core iX days and never really came back until
           | Ryzen. The whole Bulldozer lineage was DoA to the point
           | Opteron just never factored in.
           | 
           | Hopefully Intel pulls something out again, but they look
           | asleep at the wheel.
        
         | shantara wrote:
         | 9800X3D is supposed to have Eco mode with a lower TDP cap,
         | similarly to other AMD processors. I don't see it included in
         | the initial reviews, but it would be curious to see the
         | followup data. If the history is anything to go by, it would
         | significantly decrease the power consumption with only a
         | marginal performance impact.
        
           | SushiHippie wrote:
           | I have the 7950x, and if I set it to 65W eco mode, I still
           | have basically the same geekbench score.
           | 
           | 65W: https://browser.geekbench.com/v6/cpu/6126001
           | 
           | 105W: https://browser.geekbench.com/v6/cpu/5821065
           | 
           | I actually haven't tested it with 170W (which is the default
           | for the 7950x) for whatever reason, but the average 7950x
           | score on geekbench is basically the same as my geekbench
           | scores with lower than normal TDP.
           | 
           | https://browser.geekbench.com/processors/amd-ryzen-9-7950x
           | 
           | I wouldn't be surprised if the same is possible with the
           | newer CPUs.
           | 
           | A nice added bonus is that my PC fans barely spin (not at
           | audible speeds).
        
       | whalesalad wrote:
       | The last Intel machine I will ever build was my 13900K, primarily
       | because I liked the fact that I could use cheaper DDR4 memory.
       | 
       | Next rig and everything for the foreseeable future will be AMD.
       | I've been a fanboy since the Athlon XP days - took a detour for a
       | bit - but can't wait to get back.
        
         | TacticalCoder wrote:
         | > I've been a fanboy since the Athlon XP days - took a detour
         | for a bit - but can't wait to get back.
         | 
         | Same. But already built a 3700X and then a 7700X.
         | 
         | I've got a feeling my wife is gonna upgrade her 3700X to a
         | 7700X soon, meaning I'll get to build a 9000 series AMD!
        
         | moffkalast wrote:
         | Even if Intel wasn't chugging so badly right now, their recent
         | handling of the overvoltage and oxidation fiasco where they
         | only thought about covering their asses instead of working the
         | problem would leave me with a pretty disgusting taste in my
         | mouth if I bought anything Intel for the foreseeable future.
         | Customer relations should mean something, just look at Noctua.
        
       | TacticalCoder wrote:
       | The results for _decompression_, but not compression, are all
       | surprisingly bad compared to other benchmarks. How come? For
       | example, 7-zip decompression performs _worse_ than my 7700X (84K
       | MIPS vs 93K MIPS on my 7700X). Other decompression benchmarks are
       | equally depressing. But compression performs as expected (as much
       | as 30% faster than my 7700X).
       | 
       | What could explain disappointing results on decompression
       | only?
        
         | kevingadd wrote:
         | Modern decompression is compute-bound typically (AFAIK), not
         | memory-bound. It is in fact common to use compression as a
         | workaround for memory-bound workloads to turn them into
         | compute-bound ones.
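
The trade described here, spending CPU cycles to move fewer bytes, can be sketched with Python's standard zlib module. The payload below is made-up, highly repetitive data, so the sizes it prints are illustrative only and say nothing about the benchmarks in the article.

```python
import zlib

# Made-up, highly repetitive payload; real data will compress less well.
payload = b"timestamp=1730903880,sensor=a,value=71\n" * 10_000

compressed = zlib.compress(payload, level=6)
restored = zlib.decompress(compressed)

# Fewer bytes cross the memory or I/O bus, at the cost of CPU time to decompress.
print(f"raw: {len(payload)} bytes, compressed: {len(compressed)} bytes")
print(f"round trip intact: {restored == payload}")
```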
        
       | heraldgeezer wrote:
       | King CPU. Time to build a new desktop PC!
        
       | pawelduda wrote:
       | Such a gap in these gaming benchmarks... AMD is killing it.
        
       ___________________________________________________________________
       (page generated 2024-11-06 23:02 UTC)