[HN Gopher] Near-linear speedup for CPU compute on 20-core Mac S...
___________________________________________________________________
Near-linear speedup for CPU compute on 20-core Mac Studio
Author : selimnairb
Score : 130 points
Date : 2022-04-28 18:02 UTC (4 hours ago)
(HTM) web link (hrtapps.com)
(TXT) w3m dump (hrtapps.com)
| ChrisMarshallNY wrote:
| Theodolite is an awesome app (that I hardly ever need to use).
|
| This guy def knows his math (and working with Apple apps).
| ac29 wrote:
| Doesn't this just suggest they are leaving some performance on the
| table then? The reason the Intel processors scale non-linearly is
| because they run each core faster when there are fewer cores under
| load.
| sliken wrote:
| Dunno, looks like classic memory bottleneck to me. The m1 ultra
| has 800GB/sec, but I believe a bit more than half of that is
| available to the CPUs. The rest is for the GPU and various on
| chip accelerators.
|
| So with about half the cores (16 vs 28) and twice the bandwidth
| (say 420GB/sec vs 180 GB/sec) it manages twice the performance.
| Looks pretty impressive to me. Looks like the Apple chip is
| significantly less memory bottlenecked than the 6-channel Xeon
| W.
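|
| A rough per-core view of those figures, as a sketch only: the 420
| and 180 GB/sec numbers and the core counts are the estimates quoted
| above, not measurements, and the names are made up for illustration.

```python
# Back-of-the-envelope bandwidth per core, using the estimates quoted above.
# These are the comment's assumed figures, not measured values.
m1_ultra = {"cores": 16, "bandwidth_gb_s": 420.0}  # core count used in the comment
xeon_w = {"cores": 28, "bandwidth_gb_s": 180.0}    # 6-channel Xeon W

def bandwidth_per_core(chip):
    """GB/sec of memory bandwidth per core if it were shared evenly."""
    return chip["bandwidth_gb_s"] / chip["cores"]

m1_per_core = bandwidth_per_core(m1_ultra)
xeon_per_core = bandwidth_per_core(xeon_w)

print(f"M1 Ultra: {m1_per_core:.1f} GB/sec per core")
print(f"Xeon W:   {xeon_per_core:.1f} GB/sec per core")
print(f"Ratio:    {m1_per_core / xeon_per_core:.1f}x")
```

| Even sharing is a simplification; real cores contend unevenly for
| memory, so treat the ratio as an order-of-magnitude hint.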
| bebort45 wrote:
| I'm curious why Geekbench haven't put the Mac Studio on their Mac
| leaderboard yet - https://browser.geekbench.com/mac-benchmarks.
| There are plenty of benchmarks submitted
| https://browser.geekbench.com/search?page=7&q=Apple+M1+Ultra...
| 2OEH8eoCRo0 wrote:
| Apple paid them not to because Intel handily beats them in
| single core perf for a fraction of the price.
|
| https://browser.geekbench.com/processor-benchmarks
| jdlshore wrote:
| Do you have any evidence for this statement ("Apple paid them
| not to"), or are you just making shit up?
| 2OEH8eoCRo0 wrote:
| It was a lighthearted joke.
| phs318u wrote:
| Please don't turn HN into Slashdot.
|
| I regularly downvote comments that make no points, are
| solely there to make a gag, and add no substance to the
| discussion.
| 2OEH8eoCRo0 wrote:
| Intel wins on single core performance as well as price
| though.
| recuter wrote:
| Spec us out a full comparable system and show benchmarks.
| sudosysgen wrote:
| You can make a system with a 12900K, 32GB of RAM, and a
| 1TB NVMe SSD for $1150:
| https://pcpartpicker.com/list/m3QjZw
| recuter wrote:
| I'll never understand religious loyalty to a corporation.
|
| The computer in the OP is fully assembled and has 128GB
| of memory, a nice GPU, and an 8TB SSD. Why be so obtuse?
|
| If you're just trying to compare to the entry level Mac
| Studio at least have the decency to throw in a full parts
| list, like you know, with a graphics card..
| [deleted]
| 2OEH8eoCRo0 wrote:
| https://browser.geekbench.com/processors/intel-
| core-i9-12900...
|
| https://browser.geekbench.com/v5/cpu/14464705
|
| Intel Core i9-12900K - 1991
|
| Apple M1 Ultra - 1554
| sliken wrote:
| Sure, Intel focuses on single thread perf, high power
| (241 watt max tdp), and automatically overclocks to 5.1
| GHz, only if you have enough power, cooling, and a bunch
| of idle cores. Thus the 15% variation in submitted
| scores. It's also rather memory bandwidth constrained,
| and shows impressive numbers with a single core running.
|
| Apple on the other hand doesn't overclock, focuses on
| multi-core performance, has great memory bandwidth, and
| all the submitted scores are within 1%.
|
| The M1 ultra is also 1.32x faster in the multiprocessing
| benchmark. Looks pretty impressive to me, even ignoring
| the much less power the M1 ultra uses.
| joakleaf wrote:
| That's kind of disingenuous.
|
| Searching on Geekbench for Apple M1 ultra single core
| scores returns values mostly in the 1770-1780 range. E.g.
| https://browser.geekbench.com/v5/cpu/14597244
|
| Most 12900K scores are between 1900 and 2200 but then
| there is this outlier with single core score of 1252:
| https://browser.geekbench.com/v5/cpu/14572307
|
| Intel certainly wins on single core, but the m1 Ultra
| multicore scores are still impressive in comparison being
| generally 23-24000, while the 12900k are around 15-20000.
| hu3 wrote:
| So 1900-2200 for Intel and 1770-1780 for M1?
|
| Disingenuous would be to focus on the outlier.
| ModernMech wrote:
| It's amusing that you're concerned about someone making a
| joke turning this into Slashdot, because the actual
| discussion here comparing specs and cost of Intel vs. Mac
| could be lifted directly from the Slashdot archives circa
| the late 90s (adjusting for the specs).
| [deleted]
| ceeplusplus wrote:
| That seems like a poorly thought out theory considering the
| $500 12900K needs 50 watts to beat the 2 year old M1 in
| single core performance. Whether you're a laptop user or an
| enterprise server customer you care about efficiency.
| seabriez wrote:
| For a workload that is optimized on M1 for 10 more watts
| you get way faster and better functionality than a 2 yo M1
| that uses 40 watts and can't even export h264 format video
| faster than a 5 year old computer.
| mwint wrote:
| zamadatix wrote:
| Outside the rest of the claim not making sense, how exactly
| would one compare the price of the M1 Ultra to an Intel CPU
| in the first place? The M1 Ultra doesn't have a price tag and
| even if it did you'd still not have a number you'd compare to
| the cost of a CPU.
| 2OEH8eoCRo0 wrote:
| Apple M1 ultra: $4,000
|
| Intel i9-12900k: $589
| semigroupoid wrote:
| It is pretty useless to compare the price of a single CPU
| with the price of an entire PC...
| dekhn wrote:
| apple doesn't sell unbundled chips. Adding a motherboard
| and RAM would still be less than $4K for most configs.
| jrockway wrote:
| You're definitely right there. I put together a build
| with this CPU and chose the most expensive part available
| (except GPU because chip shortage, case because there are
| $5000 ATX cases for no good reason, PSU because I just
| got the best Seasonic one, and SSD because there are $12k
| enterprise ones): https://pcpartpicker.com/list/DYxhk9
|
| So that's $3500 without a GPU, buy a $500 used GPU on
| eBay and you're beating Apple. And, nobody buys $1000
| motherboards, so that takes $500 off. You don't need a
| $300 case. Etc. Basically the point of the exercise is
| that you can max everything out, and get a faster
| computer for less money, which is what the comment was
| trying to say.
|
| Someone will reply and say that your time sourcing and
| assembling the components isn't free, or that it doesn't
| run OS X, etc. I get it, you don't have to say that. Just
| adding an actual computer that's expensive as possible
| that you could have right now to compare to.
| hedgehog wrote:
| To save the click a 12900K machine from Dell is about
| $3k. If you need CUDA or say SolidWorks get the PC, for
| video and multithreaded workloads the Mac would probably
| be faster, but really only benchmarks of your use case
| can tell you.
| BolexNOLA wrote:
| So you made a build to essentially show how expensive an M1
| Mac is compared to an intel machine but left out a
| critical component because it's too expensive?
| seabriez wrote:
| Yeah it's only useless to compare when M1 is in a bad
| light, but when it's like the article then it's totally
| useful; even though the article is misleading AF trash
| (calling matrix multiply "General Compute"? Yea, ok).
| Comparing computers vs Apple that has special HW SOC for
| performing an operation is a lie. And then he says that
| somehow this extends to other workloads? LMAO. It's
| already been proven that M1 is pretty terrible in
| performance for many tasks, including H.264 video export
| and many others. If you wasted $6k on a MBP that's on you,
| stop trying to post stupid articles about some kind of
| "fake breakthrough performance that everyone is [missing]
| out on". It only shows how unknowledgeable Mac users are,
| and spreading Apple's fake benchmarks.
| sliken wrote:
| Umm, running a 1990s Fortran code that's a CFD simulation
| is a "real" workload. Seems relatively likely that any
| floating point heavy code would act similarly. Hard to
| say if it's the matrix multiply or the memory bandwidth
| that's giving apple such a large lead.
|
| Normally I'd discount using a 28 core Intel CPU from
| 2019, but from what I can tell Intel hasn't improved much
| since then. Keep in mind that Intel has a specialized
| vector unit (AVX2 or AVX-512 depending on the model),
| and the listed CPU is pretty high end (with 6 memory
| channels) where the normal i5/i7/i9 is only 2.
|
| So sure it's not a video compression, gaming, or web
| browsing benchmark, but some folks do run floating point
| heavy codes. Unlike CUDA, which requires a rewrite, this
| code wasn't specifically optimized for the M1.
| recuter wrote:
| created: 3 months ago karma: 64
|
| ---
|
| Are you a bot? I swear there's an influx of random trolls
| as of late. Like some entity using GPT-3 to sow as much
| disharmony as possible regardless of subject.
| smoldesu wrote:
| Ah yes, the bot accusation comment. Everyone's favorite
| subcategory of HN musing.
| recuter wrote:
| Hey man, maybe I'm a bot too. Who knows. Blip bloop. But
| these are just computer chips and fresh low karma
| accounts with vitriol dialed to the max is atypical of
| HN. By all means, M1 is the worst, why froth at the mouth
| about it in this manner? You explain it.
| ask_b123 wrote:
| Well, I have lower karma than seabriez and I'm neither a
| bot nor did I think that I had low karma points.
| jfpoole wrote:
| There's a bug in the Browser that we haven't been able to track
| down yet that's preventing the Mac Studio from appearing on the
| leaderboard.
| mrjin wrote:
| So a new discovery of Amdahl's Law?
|
| https://en.wikipedia.org/wiki/Amdahl%27s_law
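|
| For reference, a minimal sketch of what Amdahl's law predicts. The
| parallel fractions below are hypothetical, picked only to show the
| shape of the curve; nothing here is measured from the article.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Amdahl's law: speedup of a fixed-size problem on `cores` cores."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)

# Hypothetical parallel fractions, chosen only for illustration.
for p in (0.90, 0.99, 1.00):
    print(f"p={p:.2f}: {amdahl_speedup(p, 16):.1f}x on 16 cores")
```

| Near-linear scaling on 16 cores only falls out of this formula when
| the serial fraction is very small.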
| danieldk wrote:
| _Above, we're looking at parallel performance of the NASA USM3D
| CFD solver as it computes flow over a classic NACA 0012 airfoil
| section at low speed conditions._
|
| If this solver relies on matrix multiplication and uses the macOS
| Accelerate framework, you are seeing this speedup because M1 Macs
| have AMX matrix multiplication co-processors. In single precision
| GEMM, the M1 is faster than an 8 core Ryzen 3700X and a bit
| slower than a 12 core Ryzen 5900X. The M1 Pro doubles the GFLOPS
| of the M1 (due to having AMX co-processors for both performance
| core clusters). And the M1 Ultra again doubles the GFLOPs (4
| performance core clusters, each with an AMX unit).
|
| Single-precision matrix multiplication benchmark results for the
| Ryzen 3700X/3900X and Apple M1/M1 Pro/M1 Ultra are here:
|
| https://twitter.com/danieldekok/status/1511348597215961093?s...
| stephencanon wrote:
| If it were taking advantage of Accelerate, the performance
| would be much higher, but also the scaling would be quite
| different. Look at the scaling in the tweet you linked--it's
| anything but linear in the number of cores used.
| bee_rider wrote:
| Just for anyone as out of touch with the MacOS ecosystem as me:
| Accelerate includes a BLAS implementation, so it at least seems
| plausible (depending on how this library was compiled) that
| their special instructions might have been used.
| torginus wrote:
| But that makes me think, what prevents people from running
| these calculations on the GPU? Even the memory is shared - the
| few 100 gflops they get out of the M1 ultra is pocket change in
| GPU terms.
| the_svd_doctor wrote:
| Well that particular NASA code is from the 90's and in
| Fortran. Maybe with MPI. That's probably part of the reason
| why.
| CamperBob2 wrote:
| If a CPU doesn't run the OS and software I need to run, it
| might as well _be_ a GPU or a Cray 9 or something dug out of
| the wreckage at Area 51. All of which are good descriptions
| of a CPU available only on the increasingly-proprietary Mac
| platform.
|
| So this entire thread is kind of pointless with regard to
| many real-world use cases.
| Reason077 wrote:
| > _"Nano-texture glass gives up a little bit of the sharp
| vibrant look you get with a glossy screen, but it's worth the
| trade in usability, to be able to see the screen without
| distractions all day long."_
|
| "Nano-texture glass" is pretty much just what all screens were
| like back in the days of CRTs and pre-glossy flat screens. Now
| Apple are charging $300 for it!
| astrange wrote:
| "Nano-texture" is different from matte - matte LCDs don't have
| reflective glare, but they also have much lower contrast and
| you can see the grain if you look closely. Nanotexture doesn't
| have those issues, but it's expensive.
| numpad0 wrote:
| It is a marketing name, but refers to a special procedure used
| to create the matte surface, not the fact that it's matte. By
| the way, CRTs were glossy. We wiped them with wet towels.
| Reason077 wrote:
| Most CRTs weren't glossy in the same way that modern flat
| panels are. They diffused reflections: you couldn't see a
| crystal-clear mirror image of yourself on a dark screen like
| you can with today's screens.
| ChrisMarshallNY wrote:
| _> By the way, CRTs were glossy. We wiped them with wet
| towels._
|
| I remember the crackling, if you wiped them, within a few
| minutes of last being on.
|
| Nowadays, there's lots of folks that think "CRT" is a
| political hotbutton topic.
| toast0 wrote:
| > Nowadays, there's lots of folks that think "CRT" is a
| political hotbutton topic.
|
| Well, who wants big government shoving their noses into how
| efficient our screens are, and whether they have leaded
| glass or not. :P
| jjtheblunt wrote:
| the pixels are WAY smaller nowadays, though; perhaps that
| constrains the nanotexture fabrication process in an expensive
| way?
| olliej wrote:
| that was my thought as well ("yay marketing") but apparently
| it's actually structurally different so that it maintains
| contrast. I ordered one in early march and if it ever actually
| arrives I'll try to remember to reply on visible difference :D
| (not kidding, it was due late march, then slipped to April
| 22-27, and today moved to May 23-?? )
| ProllyInfamous wrote:
| https://en.wikipedia.org/wiki/Gustafson's_law
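|
| A minimal sketch of Gustafson's law for anyone skipping the link:
| it models scaled speedup, where the problem grows with the core
| count. The parallel fractions are hypothetical and only illustrate
| the formula.

```python
def gustafson_speedup(parallel_fraction, cores):
    """Gustafson's law: scaled speedup when work grows with core count."""
    serial_fraction = 1.0 - parallel_fraction
    return serial_fraction + parallel_fraction * cores

# Hypothetical parallel fractions, chosen only for illustration.
for p in (0.90, 0.99, 1.00):
    print(f"p={p:.2f}: {gustafson_speedup(p, 16):.2f}x on 16 cores")
```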
| cglong wrote:
| Editorialized title. Original was "2022 Mac Studio (20-core M1
| Ultra) Review".
| pvg wrote:
| email these in
| cglong wrote:
| Done! Thank you for the reminder :)
| selimnairb wrote:
| Could you explain what "these" are that need to be emailed
| in?
| electroly wrote:
| Title corrections. @dang can't read every thread to see
| these posts, but if you email him, he can take a look.
| selimnairb wrote:
| Makes sense. Thank you for clarifying.
| MBCook wrote:
| I didn't submit it but in this case the original title is so
| generic no one would have looked at it so I'm kind of happy
| they put the important part in the headline here.
| sydthrowaway wrote:
| I bet you can build an AMD system that beats this handily and
| costs half as much.
___________________________________________________________________
(page generated 2022-04-28 23:00 UTC)