[HN Gopher] Intel Previews Sierra Forest with 288 E-Cores, Annou...
___________________________________________________________________
Intel Previews Sierra Forest with 288 E-Cores, Announces Granite
Rapids-D
Author : PaulHoule
Score : 61 points
Date : 2024-03-03 09:46 UTC (1 days ago)
(HTM) web link (www.anandtech.com)
(TXT) w3m dump (www.anandtech.com)
| jeffbee wrote:
| Anybody know the details of how these large chips are organized?
| Are they still in quartets of cores that share an L2, like the
| E-cores in recent desktop parts? What kind of ring, grid, mesh or
| whatever connects them?
| wmf wrote:
| I don't think Intel has revealed that officially yet but I
| expect each die has 36 tiles and each tile has four E-cores
| sharing an L2. The mesh and L3 are probably the same as in
| Granite Rapids.
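|
| As a rough illustration of the layout guessed at above, here is a
| minimal sketch that maps a flat core index to a (die, tile, core)
| triple. The constants come straight from that guess (36 tiles per die,
| four E-cores per tile) and are not confirmed by Intel; the function
| names are hypothetical.

```python
# Hypothetical layout from the guess above: 36 tiles per die,
# 4 E-cores per tile sharing one L2. Not confirmed by Intel.
CORES_PER_TILE = 4
TILES_PER_DIE = 36
CORES_PER_DIE = CORES_PER_TILE * TILES_PER_DIE  # 144 cores per die


def locate_core(core_id):
    """Return (die, tile within die, core within tile) for a flat core index."""
    if core_id < 0:
        raise ValueError("core_id must be non-negative")
    die, within_die = divmod(core_id, CORES_PER_DIE)
    tile, core = divmod(within_die, CORES_PER_TILE)
    return die, tile, core


def dies_needed(total_cores):
    """Number of dies required to reach total_cores (rounded up)."""
    return -(-total_cores // CORES_PER_DIE)
```

| Under these assumptions a 288-core part would need two such dies.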
| hulitu wrote:
| > Intel Previews Sierra Forest with 288 E-Cores, Announces
| Granite Rapids-D
|
| Finally a processor which can run svchost.exe.
|
| How is the performance?
| treprinum wrote:
| If they are using N100/N305 cores then each is like a single
| non-hyperthreaded Skylake core.
| adrian_b wrote:
| They are using a successor of the N100/N305 cores, which is
| said to be significantly improved.
|
| It is likely that the cores of Sierra Forest have a
| microarchitecture very similar to the small cores of Meteor
| Lake (the big cores of Meteor Lake are almost identical to
| the big cores of Raptor Lake/Alder Lake, but its small cores
| are improved).
|
| Compared to the small cores of Meteor Lake, the cores of
| Sierra Forest will support some additional instructions. Most
| of them are some instructions previously available only in
| the server CPUs that support AVX-512, but in Sierra Forest
| (and also in the next desktop/laptop CPUs, i.e. Arrow
| Lake/Lunar Lake) they are re-encoded in the AVX instruction
| format (i.e. using a VEX prefix).
| atlas_hugged wrote:
| I know you said that as a joke, but my work machine with 32GB
| of RAM is constantly being eaten alive by svchost.exe and the
| only thing I can do is reboot once a day to keep it from
| ballooning out of control.
|
| I really don't get why the industry is still on Windows for the
| most part. I wish my company would just standardize on some
| supported variant of a Linux Desktop and be done with Windows
| once and for all.
| kogir wrote:
| svchost.exe is literally what the name implies. It's a
| generic service host. You pass it a dll and an entrypoint
| (via command line arguments and registry keys) and it runs
| it.
|
| You should look at which thing it's actually running to see
| what's using all your CPU.
|
| Some articles detailing what it does and how it works:
| [1] https://nasbench.medium.com/demystifying-the-svchost-exe-pro...
| [2] https://pusha.be/index.php/2020/05/07/exploration-of-svchost...
| [3] https://blog.didierstevens.com/2019/10/29/quickpost-running-...
| RamRodification wrote:
| > _my work machine with 32GB of RAM is constantly being eaten
| alive by svchost.exe and the only thing I can do is reboot
| once a day_
|
| Maybe you're self-employed and unsuccessful in
| troubleshooting this yourself, but that sounds like a five-
| minutes-with-a-first-line-support-technician problem.
|
| If you don't have a tech support department to turn to (or if
| they are incompetent), investigate the process with
| ProcessExplorer from Sysinternals to find out what that
| service host process is running and go from there.
|
| The industry is still on Windows because it's easier to
| manage in a corporate setting. And usually better for
| software compatibility.
| sidewndr46 wrote:
| This presupposes that identifying the problem is the 1st
| step to fixing it. In many professional settings,
| convincing someone that the problem needs to be fixed is
| the actual problem to solve.
|
| I had a near equivalent problem where my Windows desktop
| was brought to its knees by an instance of Glassfish
| running on it by some sort of policy. We did embedded
| development, low level stuff like data plane processing via
| FPGA. Internal IT spent 1/2 an hour trying to convince me
| that Glassfish was an essential part of my development
| stack.
|
| I never bothered to convince them to solve the morning 9 AM
| virus scan of every file on disk. I just hung out in the
| break room for the 45 minutes it took while the UI was
| almost totally useless
| vundercind wrote:
| At some point you just have to accept that if an employer
| really, really, _really_ insists on paying you to do
| nothing while they prevent you from using a vital tool
| they issued you... well, that must be what they want.
| They're paying the bills and are in control of that
| entire chain of decision-making, after all. Let 'em pay
| for it if that's what they want.
| ClumsyPilot wrote:
| That's an infantile attitude; you are supposed to take
| responsibility for your work environment.
|
| Speak to the manager of your manager, and if that doesn't
| work, to the manager of the manager of the manager. If
| that does not work, write a handwritten letter to the CEO
| and deliver it. If the CEO knew about this, they would
| definitely fix it! Don't forget a wax seal, they love
| those!
| vundercind wrote:
| Oh, you should try. I just don't think heroic efforts are
| in order. If you say "hey, this thing is wasting a bunch
| of my time, are you sure you want to do that?" and they
| say "yes"... well, alright.
| kikokikokiko wrote:
| In my case (non US, government owned IT company), they make
| every developer use Windows because someone is getting paid
| to make the purchase, that's it. Once the machines arrive,
| every single dev erases the OS and puts his preferred version
| of Linux on it. And that's why my country is the s*t hole it
| is.
| stephenr wrote:
| > I wish my company would just standardize on some
| _supported_ variant of a Linux Desktop
|
| I mean, it doesn't sound like your current windows desktop is
| particularly _supported_.
| Workaccount2 wrote:
| If linux (devs) could just swallow their pride and embrace
| some heavily windows influenced design choices I am confident
| linux could see widespread adoption. This would finally
| create the incentive for product developers to actually
| create and support linux based products.
|
| People _really_ want to get away from windows. But they
| _really really_ don't want to deal with an OS that feels
| like it's 1990.
| Sohcahtoa82 wrote:
| > But they really really don't want to deal with an OS that
| feels like it's 1990.
|
| I'd rather a 1990 feel than the current feel where
| everything is flat and it's not obvious what's interactable
| and my 27" 4K monitor is 90% whitespace.
| Sohcahtoa82 wrote:
| svchost.exe isn't a service itself, but rather, as the name
| implies, it's a host for other services. In other words, it's
| not necessarily Windows misbehaving, but a particular piece
| of software you're running.
|
| Find the PID of the svchost.exe process that's eating CPU in
| Task Manager. Then go to the Services tab of Task Manager and
| find the service with that PID. You'll have your actual
| culprit of what's eating CPU. It COULD be a Windows service
| that's acting up, but it's just as likely some 3rd party
| service.
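|
| The same PID-to-service lookup can be scripted. Below is a minimal
| sketch, assuming Windows' built-in `tasklist /svc` command with CSV
| output; the function names and the sample text are made up for
| illustration, and the sample only mimics the shape of real output.

```python
import csv
import io
import subprocess


def parse_tasklist_csv(text):
    """Map PID -> list of service names from `tasklist /svc /fo csv` output."""
    services_by_pid = {}
    reader = csv.reader(io.StringIO(text))
    next(reader, None)  # skip the header row (its wording is localized)
    for row in reader:
        if len(row) < 3:
            continue  # ignore blank or malformed lines
        pid, services = row[1], row[2]
        services_by_pid[int(pid)] = [
            name.strip() for name in services.split(",") if name.strip()
        ]
    return services_by_pid


def svchost_services():
    """Windows only: ask tasklist which services each svchost.exe hosts."""
    result = subprocess.run(
        ["tasklist", "/svc", "/fo", "csv", "/fi", "imagename eq svchost.exe"],
        capture_output=True, text=True, check=True,
    )
    return parse_tasklist_csv(result.stdout)


# Made-up sample in the same CSV shape, so the parser can be tried anywhere.
SAMPLE = '''"Image Name","PID","Services"
"svchost.exe","1044","BrokerInfrastructure,DcomLaunch,PlugPlay"
"svchost.exe","2312","wuauserv"
'''
```

| Look up the PID that Task Manager shows as busy in the returned
| dictionary to see which services that svchost.exe instance hosts.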
| bobim wrote:
| In the end Sun had it right with Niagara.
| buildbot wrote:
| Kinda, it turns out SMT has lots of security pitfalls and
| having many tiny single thread cores vs. some heavily threaded
| cores works better in practice. (I love the Niagara chips, I
| had a T1 and T2 box for a bit!)
| ajross wrote:
| So... not really? I mean, the T1/T2 devices are superficially
| similar, being a "big" collection of "small" cores in an SMP
| configuration and targeted at datacenter markets.
|
| But the idea behind Niagara wasn't scale per se; it was
| about using extremely wide multithreaded
| dispatch to get high instruction throughput out of a simple
| (and thus small) in-order CPU core. Normally you'd expect such
| a core to spend most of its time getting stalled on DRAM
| fetches, but with SMT you can usually find another instruction
| to run from another thread, so the pipeline keeps moving.
|
| The Intel E-cores in this device aren't like that at all.
| They're smaller than the P-cores, but are still comparatively
| complicated OOO designs intended to avoid stalls via
| parallel dispatch.
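|
| The latency-hiding argument for Niagara-style multithreading can be
| illustrated with a toy model. This is only a sketch under simplifying
| assumptions (single-issue core, fixed stall length, round-robin thread
| selection); the function name and parameters are hypothetical and do
| not describe any real pipeline.

```python
def issue_rate(num_threads, run_cycles, stall_cycles, total_cycles=10_000):
    """Fraction of cycles in which a single-issue in-order core issues.

    Toy model: each hardware thread issues `run_cycles` instructions (one
    per cycle when selected), then misses to DRAM and is blocked for
    `stall_cycles`. Each cycle the core issues from the next ready thread
    in round-robin order, or idles if every thread is stalled.
    """
    ready_at = [0] * num_threads       # first cycle each thread may issue again
    left = [run_cycles] * num_threads  # instructions left before the next miss
    issued = 0
    start = 0                          # where the round-robin search begins
    for cycle in range(total_cycles):
        for k in range(num_threads):
            t = (start + k) % num_threads
            if ready_at[t] <= cycle:
                issued += 1
                left[t] -= 1
                if left[t] == 0:       # thread misses and waits on DRAM
                    ready_at[t] = cycle + 1 + stall_cycles
                    left[t] = run_cycles
                start = (t + 1) % num_threads
                break
    return issued / total_cycles
```

| In this model the issue rate grows roughly as
| min(1, threads * run / (run + stall)): with one instruction per miss
| and a 9-cycle stall, one thread keeps the pipeline about 10% busy,
| while ten threads keep it full.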
| sillywalk wrote:
| IBM's POWER also has 4 or 8 SMT threads/core, but with big
| OoO cores. I'm not sure how they fit in.
| artemonster wrote:
| Exactly double Chuck Moore's 144-core Forth CPU :)
| sleepydog wrote:
| The GA144 consumes between .00014 and .65 watts. That's
| probably significantly less than a single one of these
| "E"-cores.
| sp332 wrote:
| I was just thinking it's time to write some Forth for these!
| bee_rider wrote:
| > Initially announced in February 2022 during Intel's Investor
| Meeting, Intel is splitting its server roadmap into solutions
| featuring only performance (P) and efficiency (E) cores. We
| already know that Sierra Forest's new chips feature a full E-core
| architecture designed for maximum efficiency in scale-out, cloud-
| native, and contained environments.
|
| When they say splitting like that, do they mean there won't be
| chips that feature both?
|
| Xeons with homogeneous big cores and Xeons with homogeneous
| little cores... why not call it Knights Forest?
| celrod wrote:
| Knights featured AVX512F and were best at heavily SIMD
| workloads. Sierra Forest is bad at these kinds of workloads,
| lacking AVX512 and having only 16 byte execution units, so
| their AVX(2) throughput is also poor.
|
| They're thus going after a very different market.
| Fox8 wrote:
| The Xeon Phi reference that I was looking for - this is
| basically Larrabee all over again, now CPU only.
| adrian_b wrote:
| For servers it makes no sense to have hybrid CPUs with
| heterogeneous cores.
|
| Where needed, you can put in the same rack several servers with
| big cores and several servers with small cores, in a proportion
| appropriate for the desired application. When the big cores and
| the small cores are in different sockets and they do not share
| coolers, the big cores can achieve maximum speed without being
| slowed down by the heat produced by the many threads that might
| be run simultaneously on the small cores.
|
| AMD already has both server CPUs with big cores (Genoa and
| Genoa-X) and server CPUs with small cores (Bergamo and Siena).
| AMD's strategy seems wiser, because their small cores are
| logically equivalent with the big cores, but they have a
| smaller size and a better energy efficiency due to a different
| physical design.
|
| Intel's strategy of implementing distinct instruction sets in
| the big cores and in the small cores is an annoying PITA for
| software developers.
| nickpsecurity wrote:
| The main uses I've seen for extra, light cores are redundancy
| against hardware failures, physical isolation, and I/O
| coprocessors. (Other than strictly using them for low-power
| operation that is.)
|
| For redundancy like NonStop pairs or secure decomposition,
| the IPC must be really fast so they can work in lockstep or
| pipelines.
|
| For I/O processors, the efficient one can handle interrupt
| processing while the performance core focuses on main
| application. Like in a mainframe with the hardware more
| condensed.
|
| A separate socket per logical domain with its IPC overhead
| might not be as cost-efficient as heterogeneous CPUs. That's
| also before I consider putting the new chips in existing,
| low-cost servers with all servers having the same chips. That
| might have cost and management benefits on top of it.
| JonChesterfield wrote:
| I'm enjoying their capitulation from one big chip to a pile of
| chiplets on a fabric. Also their challenges with hitting their
| deadlines.
|
| Also enjoying that core count is getting so high. Hopefully this
| will encourage a 256 core from AMD.
|
| Exciting times to be a parallel programming enthusiast.
| pixelpoet wrote:
| As the cores get individually weaker and more power efficient,
| eventually what you end up with is a middling GPU with an x86
| identity crisis.
| rbanffy wrote:
| OTOH, it'll be a GPU that can host a whole lot of cloud
| applications at the same time. Or compile lots of code in
| parallel. Or run a browser.
| JonChesterfield wrote:
| There is really significant convergent evolution between x64
| and amdgpu. An x64 core running two threads is very like a
| gpu core running four threads from a stack of a hundred or
| so.
|
| One speculates to hide memory latency, the other shuffles
| threads between cycles to hide memory latency. One has ~64
| byte wide vector ops, one has ~256 byte wide vector ops.
|
| I have a pet theory that the significant difference is the
| cache coherency model.
| fock wrote:
| previously called Xeon Phi.
| keyringlight wrote:
| I'm probably forgetting some vital details, but isn't that
| getting similar to Larrabee? As I recall that was where Intel
| seemed to be exploring other uses for their Atom CPUs and
| were trying to push as many as they could into one processor.
|
| One of the uses they prototyped was a GPU, or a large multi-
| threaded (x86) software renderer rather than going through a
| regular 3D acceleration API. I remember reading that part of
| the challenge was that Larrabee was a system itself, so a
| developer needed to boot something like BSD before providing
| it with code to get useful output. This was around the time
| AMD was experimenting with 'fusion' after their purchase of
| ATI, and exploring how to push different parts of
| an application to the relevant processor in their CPU+GPU
| products.
|
| That's in addition to the other Xeon Phi accelerators they
| did. Obviously Sierra Forest is a regular CPU, but it seems
| there's a bit of "history doesn't repeat, but it rhymes"
| whaleofatw2022 wrote:
| Xeon Phi came in a Socketed form where it could be a main
| CPU IIRC.
|
| Can't remember if it had a lower max core count across SKUs
| but at least one popular vtuber got hands on one.
| smallmancontrov wrote:
| These days CPU vs GPU isn't about the number of ALUs or
| cores, it's about the latency hiding strategy. A GPU assumes
| oodles of similar threads are running at the same time, so
| the moment one blocks on a memory access another can be
| rotated in. It's hyper-hyper-hyper-hyper-...-hyper threaded.
| Meanwhile, a CPU is just hyper-threaded, if that, and instead
| it tries to be clever about prefetching, speculative
| execution, and the like.
|
| So long as some important applications have tons of similar
| threads and some have very few threads, it will probably make
| sense to specialize.
| pixelpoet wrote:
| Yep, I see it as latency vs throughput optimisation,
| particularly wrt memory subsystem. What I was pointing out
| is, x86 is not well suited to GPU execution; Intel tried
| that with Larrabee. Moreover, in the latest generation
| Nvidia chips you have 72-96 MB of L2 cache and 2.5 GHz+ clock
| speed, so it's remarkably capable per-thread.
|
| At some point, and I think 256 cores might be the ballpark,
| you're committing to using so many threads that you're
| probably mostly interested in high throughput. (I'm writing
| commercial path tracers so my bias is obvious!)
| AnthonyMouse wrote:
| GPUs are for parallel instructions. You want to do a ton of
| matrix multiplication.
|
| Multi-core CPUs are for parallel processes. You want to host
| a ton of virtual machines and they all care about branch
| prediction and cache latency more than throughput.
| touisteur wrote:
| The dual 96c EPYC servers are just incredible, can't wait to
| get my hands on a Zen 4c 2x128c box.
| AnthonyMouse wrote:
| > Hopefully this will encourage a 256 core from AMD.
|
| The limit on this is clearly power. Right now you get 128 cores
| for 360 watts -- less than 3 watts per core. SP5 can provide up
| to 700W, so they could do it if the demand is there.
|
| edit: Damn it Wikipedia, it's only 700W for 1ms. So they might
| need a more efficient process or a new socket.
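|
| The power arithmetic above can be spelled out as a back-of-the-envelope
| sketch. The figures are the ones quoted in the comment; scaling package
| power linearly with core count is a simplifying assumption (it ignores
| I/O and fabric power, and clock/voltage tuning), and the function names
| are hypothetical.

```python
# Figures quoted in the comment above; linear scaling is an assumption.
CORES_TODAY = 128
PACKAGE_WATTS_TODAY = 360
SOCKET_PEAK_WATTS = 700  # per the comment, available only for about 1 ms


def watts_per_core(package_watts, cores):
    """Average package power divided evenly across cores."""
    return package_watts / cores


def projected_package_watts(target_cores):
    """Package power if per-core power stayed at today's average."""
    return watts_per_core(PACKAGE_WATTS_TODAY, CORES_TODAY) * target_cores
```

| With these numbers the per-core average is about 2.8 W, and a 256-core
| part at the same per-core power would land at 720 W, above even the
| brief 700 W peak, which is why a more efficient process or a new
| socket comes up.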
| bloopernova wrote:
| I wonder what btop would look like running on that hardware?
|
| Would you just display an average of 24 cores, so it would look
| like 12 aggregate cores?
| sp332 wrote:
| Check out the first screenshot on
| https://techcommunity.microsoft.com/t5/windows-server-news-a...
| and this new CPU has twice as many threads.
|
| Edit: playing Tetris on an even bigger CPU
| https://twitter.com/markrussinovich/status/13356511159588945...
| (https://news.ycombinator.com/item?id=25343369)
| bloopernova wrote:
| That's hilarious. Maybe we'll move towards something like
| "72/288 cores in use" or "25% cores used"
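|
| Both display ideas from this subthread (averaging groups of cores, and
| a single "cores in use" line) are easy to sketch. This is a minimal
| illustration, not how btop or Task Manager actually work; the function
| names and the 50% "busy" threshold are hypothetical choices.

```python
def aggregate(per_core_pct, group_size):
    """Average per-core utilisation (0-100) over consecutive groups of cores."""
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    groups = []
    for i in range(0, len(per_core_pct), group_size):
        chunk = per_core_pct[i:i + group_size]
        groups.append(sum(chunk) / len(chunk))
    return groups


def summary(per_core_pct, busy_threshold=50.0):
    """One-line summary counting cores at or above busy_threshold percent."""
    total = len(per_core_pct)
    busy = sum(1 for pct in per_core_pct if pct >= busy_threshold)
    share = 100 * busy / total if total else 0
    return f"{busy}/{total} cores in use ({share:.0f}% cores used)"
```

| For 288 cores, `aggregate(loads, 24)` yields 12 bars, matching the
| "12 aggregate cores" idea above.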
___________________________________________________________________
(page generated 2024-03-04 23:01 UTC)