[HN Gopher] M1 Icestorm cores can still perform well
___________________________________________________________________
M1 Icestorm cores can still perform well
Author : ingve
Score : 51 points
Date : 2021-09-01 08:02 UTC (1 hours ago)
(HTM) web link (eclecticlight.co)
(TXT) w3m dump (eclecticlight.co)
| simondotau wrote:
| TLDR: Based on a single simple synthetic benchmark, the low
| performance "Icestorm" cores were shown to be as much as 52%--or
| as little as 18%--of the performance of the primary "Firestorm"
| cores. Highly efficient assembly showed the least performance
| drop whereas complex "idiomatic" Swift code showed the greatest
| performance drop.
|
| However, the Icestorm cores also use substantially less energy so
| they are an efficiency win regardless. Plus they take up
| significantly less physical space, which is a large cost saving
| for the SoC part.
| Filligree wrote:
| How significantly less, I wonder?
|
| For my workloads it'd be an overall win to have more cores at
| that speed. The more the better; I'd cap out at maybe a
| hundred or so.
|
| Obviously Firestorm is better, but a hundred-core desktop CPU
| at present seems... unlikely.
| maccard wrote:
| AMD [0] would like a word. It's 64 cores, but 128 threads with
| hyperthreading.
|
| [0] https://www.amd.com/en/products/cpu/amd-ryzen-
| threadripper-3...
| Filligree wrote:
| So, not a hundred-core processor yet.
|
| I can't use hyperthreading. It does give a 60% speed boost,
| but it's also disabled in production so...
| OskarS wrote:
| > Highly efficient assembly showed the least performance drop
| whereas complex "idiomatic" Swift code showed the greatest
| performance drop.
|
| I wonder what this means. The efficient assembly probably has
| fewer instructions, which use vector instructions and floating
| point calculations more, while the "idiomatic" Swift probably
| just has a larger number of instructions that aren't doing
| heavy calculation. Does that imply then that the high
| performance cores do much deeper pipelining, but the
| number of floating point units or whatever is probably pretty
| similar across both types?
| simondotau wrote:
| My initial guess is that it's because Icestorm cores have less
| L1 and L2 cache, resulting in more frequent cache misses in
| complex loops. I'm by no means an expert in any of this, so I
| really have no place hypothesising.
|
| Firestorm has 128KB L1 data cache per core and 12MB shared L2.
|
| Icestorm has 64KB L1 data cache per core and 4MB shared L2.
| webmobdev wrote:
| big.LITTLE Processing: Defining the Future of SoC Architecture -
| https://www.samsung.com/semiconductor/minisite/exynos/newsro...
|
| With this CPU design some cores are optimised for performance (at
| the expense of using more power) while some cores are optimised
| for efficiency (using the least power at the expense of computing
| performance). This makes sense for laptops and smartphones, as it
| can save power and thus run longer when being powered by
| batteries. But (in my opinion) not for desktop PCs, where most
| people care more about computing performance than saving a few
| watts.
| Synaesthesia wrote:
| Most of the time your PC isn't working hard, and it makes sense
| to use lower power cores to perform basic tasks.
| simondotau wrote:
| I'm not sure that you could make a case for this not making
| sense in a desktop computer, as everything is ultimately a
| trade-off.
|
| It's fairly clear that the Icestorm cores represent a gain in
| performance per watt, but also in performance per unit of die
| area. The four Icestorm cores and their support infrastructure
| take up about the same physical space as one Firestorm core
| with its support infrastructure.
|
| I doubt that an M1 with five Firestorm cores would perform as
| well as the eight cores we did get.
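The comparison above can be put into rough numbers. This is a
back-of-the-envelope sketch, not a measurement: it takes the 18% and
52% Icestorm-to-Firestorm ratios quoted earlier in the thread, and
assumes (unrealistically) that throughput adds up perfectly across
cores. The function name and the "Firestorm-core equivalents" unit
are made up for illustration.

```python
# Rough throughput estimate in "Firestorm-core equivalents".
# Assumption: work scales perfectly across cores, which real
# workloads rarely achieve.

ICESTORM_RATIO_LOW = 0.18   # worst case quoted in the thread
ICESTORM_RATIO_HIGH = 0.52  # best case quoted in the thread


def firestorm_equivalents(firestorm_cores, icestorm_cores, ratio):
    """Total throughput, counting one Firestorm core as 1.0."""
    return firestorm_cores + icestorm_cores * ratio


# The M1 as shipped: four Firestorm plus four Icestorm cores.
m1_low = firestorm_equivalents(4, 4, ICESTORM_RATIO_LOW)
m1_high = firestorm_equivalents(4, 4, ICESTORM_RATIO_HIGH)

# Hypothetical alternative: spend the Icestorm die area on a fifth
# Firestorm core instead.
five_firestorm = firestorm_equivalents(5, 0, 0.0)

print(f"M1 (4+4):         {m1_low:.2f} to {m1_high:.2f}")
print(f"Five Firestorm:   {five_firestorm:.2f}")
```

Under these assumptions the break-even point is an Icestorm core
delivering 25% of a Firestorm core (4 + 4r = 5). The shipped design
comes out ahead for workloads above that ratio and slightly behind at
the 18% worst case, before any energy savings are counted.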
| m_eiman wrote:
| Saving watts means lowering fan RPM, meaning less noise. And
| that's a big priority for many.
| n1000 wrote:
| Also, aren't desktop CPUs constrained by thermal load at some
| point or can we use ever bigger coolers? Personally, I find
| it almost obscene that my desktop PC consumes roughly as much
| as a good old incandescent lightbulb (60+W) _while idling_.
| My laptop uses as much under full load.
| Synaesthesia wrote:
| Your PC uses 60W idling? Is that with screen? It's not too
| much in that case. CPUs and GPUs have gotten a lot better
| at idle power consumption, and PSUs are also quite
| efficient these days.
| Roritharr wrote:
| Not only that, I'd prefer to have a small amount of RAM and
| CPU running 24/7 for always-on features that I'd love
| to have my PC doing.
|
| I don't like having to run a Pi for some stuff just because I
| don't want my huge tower running all the time. It would be
| really neat if it could run at anything between 5W and 600W,
| though I'm not sure the PSUs would be able to offer that range.
| shantara wrote:
| I didn't realize map-reduce was so much slower than a regular
| looped multiplication, regardless of the hardware the code was
| running on.
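To make the two styles concrete, here is a minimal sketch of the same
calculation written both ways. It is in Python rather than the
article's Swift, and it assumes the workload is something like a dot
product; the thread does not show the benchmark code, so the function
names and the workload are illustrative only. Timings from this
sketch say nothing about Swift or about either M1 core type.

```python
import timeit
from functools import reduce


def dot_loop(a, b):
    """Dot product as a plain indexed loop."""
    total = 0.0
    for i in range(len(a)):
        total += a[i] * b[i]
    return total


def dot_map_reduce(a, b):
    """Same dot product: map to pairwise products, then reduce by sum."""
    products = map(lambda pair: pair[0] * pair[1], zip(a, b))
    return reduce(lambda acc, x: acc + x, products, 0.0)


def compare(n=1000, repeats=100):
    """Time both versions on the same input; returns seconds for each."""
    a = [float(i) for i in range(n)]
    b = [float(n - i) for i in range(n)]
    loop_s = timeit.timeit(lambda: dot_loop(a, b), number=repeats)
    mr_s = timeit.timeit(lambda: dot_map_reduce(a, b), number=repeats)
    return loop_s, mr_s


if __name__ == "__main__":
    loop_s, mr_s = compare()
    print(f"loop:       {loop_s:.4f} s")
    print(f"map/reduce: {mr_s:.4f} s")
```

Both functions add the products in the same order, so they return
identical results; only the amount of per-element overhead (closure
calls, intermediate values) differs, which is the kind of cost the
comment above is reacting to.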
| codetrotter wrote:
| If the author is here and able to do so I'd much appreciate if
| they would share the complete code for the benchmarking as a
| whole, so that others may use it for benchmarking other code in
| the same way :)
___________________________________________________________________
(page generated 2021-09-01 10:00 UTC)