[HN Gopher] First true exascale supercomputer?
___________________________________________________________________
First true exascale supercomputer?
Author : AliCollins
Score : 51 points
Date : 2022-07-06 15:57 UTC (7 hours ago)
(HTM) web link (www.top500.org)
(TXT) w3m dump (www.top500.org)
| rektide wrote:
| > _8,730,112 total cores_
|
| This must include the GPUs, otherwise it'd be 136,408 sockets.
| For a 42U rack with 4P 1U servers (not that that's what's in use,
| but to give an understandable napkin figure), that'd be 812
| racks.
|
| Frontier's own page says 74 "cabinets"/racks, and this is just
| for the compute (and perhaps switching and/or power? storage is
 | elsewhere). Made up of 9,408 nodes, with 4 MI250X GPU
 | accelerators each - those accelerators being dual-chip monsters
 | with 8 HBM2e stacks apiece. From AnandTech[1], we can see the
 | liquid-cooled half-width sleds are dual socket, and packed
 | packed packed.
|
| [1] https://www.anandtech.com/show/17074/amds-instinct-
| mi250x-re...
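 |
 | (For anyone checking the napkin math, here is a quick sketch.
 | The 64-cores-per-socket and 4-sockets-per-1U figures are
 | illustrative assumptions, as above, not Frontier's actual
 | layout.)
 |
 |     // Napkin math only: illustrative socket/rack figures, not
 |     // Frontier's real configuration.
 |     #include <cstdio>
 |
 |     int main() {
 |         const long total_cores      = 8730112;
 |         const long cores_per_socket = 64;      // assumed 64-core socket
 |         const long sockets = total_cores / cores_per_socket;   // 136,408
 |         const long sockets_per_rack = 4 * 42;  // 4P 1U servers, 42U rack
 |         const long racks = (sockets + sockets_per_rack - 1)
 |                            / sockets_per_rack;                 // ~812
 |         std::printf("sockets=%ld racks=%ld\n", sockets, racks);
 |         return 0;
 |     }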
| JonChesterfield wrote:
 | Bit hard to guess what a 'core' would be on a GPU. Compute unit
 | / streaming multiprocessor, perhaps.
| seiferteric wrote:
| > and relies on gigabit ethernet for data transfer.
|
 | This seems surprising to me; I would have expected 10Gb at
 | least, if not something like InfiniBand.
| xtreme wrote:
| That's a typo, Frontier uses the Slingshot network from
| Cray/HPE. The table below has the correct information.
| dixie_land wrote:
| seems to be a proprietary interconnect that is "Ethernet
| compatible"
|
| https://www.hpe.com/us/en/compute/hpc/slingshot-interconnect...
| pclmulqdq wrote:
 | As far as I know, Slingshot uses a layer 1 that is exactly the
 | same as Ethernet and allows layer 2 Ethernet packets to enter
 | its switches. However, it has several layer 1/2 extensions that
 | let it look more like InfiniBand for use cases that need it,
 | including flow control and congestion control.
| jagger27 wrote:
 | It's wrong, and quite a funny typo. The interconnect provides
 | 100 GB/s of network bandwidth per node.
|
| https://www.olcf.ornl.gov/frontier/#4
| jmpman wrote:
 | Back in the 2010 timeframe, there were articles about how an
 | exascale supercomputer might be impossible. It would be
 | interesting if someone could go back and assess where those
 | predictions were wrong and where they held, and how the
 | architecture changed to get around the limits that were real.
| porcoda wrote:
| Power efficiency mostly. The power requirements of an exascale
| machine with 2010-timeframe hardware would be crazy.
| peter303 wrote:
 | Oak Ridge's machine still consumes 20 megawatts. With the older
 | technology, though, it looked like an exascale machine would
 | require a gigawatt.
| peter303 wrote:
 | Onward to zettaflops around 2037, assuming an order of magnitude
 | every five years. That's been pretty much the case for 60 years.
| einpoklum wrote:
| > "This HPE Cray EX system is the first US system with a peak
| performance exceeding one ExaFlop/s."
|
| So, it's not actually the first one? And another one already
| exists outside the US?
| jagger27 wrote:
| Yes. It is assumed that China is downplaying how capable theirs
| are.
| ncmncm wrote:
| We may assume NSA has faster ones, devoted to speech
| transcription and codebreaking.
| SkyMarshal wrote:
| That was an odd qualification. The only thing they mention is
| that the #2 computer in Japan is theoretically capable of an
| Exaflop, but hasn't demonstrated it yet.
| kvetching wrote:
| If they truly wanted to solve world problems, they need to allow
| an AGI company like DeepMind or OpenAI to use it. The people now
| using it are likely wasting so much money using outdated
| technologies.
| ncmncm wrote:
| As in Vernor Vinge's "A Fire upon the Deep", where Powers in
| the Great Beyond transcend existence while regular people are
| condemned to live out their lives running a trivial program.
| sriram_malhar wrote:
| 21 MW power! Insane.
|
| Interestingly, the second one is 30 MW.
| krylon wrote:
| For reference, Roadrunner, which was the first petascale system
 | in 2008, used 2.35 MW (according to Wikipedia). So this one
 | gives us roughly 1,000 times the performance for about 10 times
 | the power. From a performance-per-watt perspective, this is an
 | impressive improvement.
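 |
 | (Rough numbers for that comparison, treating the Linpack and
 | power figures as approximate: Roadrunner was about 1.0 PFlop/s
 | at 2.35 MW, Frontier about 1.1 EFlop/s at 21 MW.)
 |
 |     // Approximate performance-per-watt comparison; all figures
 |     // are rounded.
 |     #include <cstdio>
 |
 |     int main() {
 |         const double roadrunner_flops = 1.0e15;  // ~1.0 PFlop/s
 |         const double roadrunner_watts = 2.35e6;  // ~2.35 MW
 |         const double frontier_flops   = 1.1e18;  // ~1.1 EFlop/s
 |         const double frontier_watts   = 21.0e6;  // ~21 MW
 |         std::printf("Roadrunner: %.0f MFlop/s per W\n",
 |                     roadrunner_flops / roadrunner_watts / 1e6); // ~426
 |         std::printf("Frontier:   %.0f GFlop/s per W\n",
 |                     frontier_flops / frontier_watts / 1e9);     // ~52
 |         return 0;
 |     }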
|
| EDIT: Wikipedia also says Roadrunner was not considered power-
| efficient in its day, which led to it being decommissioned
| after only five years of operation.
| porcoda wrote:
| That, and the apps teams hated the architecture given very
| poor tooling support - especially since the writing was on
| the wall that the future was GPGPU accelerators and the Cell
 | was a dead end. The Roadrunner processors were awesome on
| paper, but not so much when it came to working with them.
| Kind of a shame really: there were some really interesting
| ideas in that processor design.
| dekhn wrote:
| To me 50 megawatts is the baseline of what I would expect for a
| decent cluster.
| jeffbee wrote:
| Perhaps you in fact forgot how to count lower than 50.
| beckingz wrote:
| Reference link: https://www.youtube.com/watch?v=3t6L-FlfeaI
| ncmncm wrote:
| He is so fast that just counting to ten, he gets that far
| before he can stop.
| dekhn wrote:
| I can't even get out of bed for less than 3 peer bonuses!
| ncmncm wrote:
| This is the first I have noticed them reporting power draw. It
| seems immoral to run it for anything that doesn't help stop
| global climate catastrophe. (Presumably global thermonuclear
| war would suffice. But carbon capture afterward would be hard
| to arrange for.)
|
| Wondering if they measure while benchmarking, or add up max
| power ratings of the chips.
|
| Did any old mainframe ever burn like that? E.g. the first big
| USAF missile tracking system, the one that filled four floors
| of a custom building?
| jeffbee wrote:
| Are you kidding? You know a single widebody airliner uses
| more energy than that, right?
| ncmncm wrote:
| Widebody airliners aren't doing much for the climate,
| either.
| ben_w wrote:
| While true, I think the point is more about how minuscule
| 21 MW is when considered in isolation -- 0.001 percent of
| global electricity usage.
| sbierwagen wrote:
| Using the secret skill of "clicking on the links for the
| other lists" I discovered that the first TOP500 which had a
| machine report a power draw was the TOP10 in November 1996:
| https://top500.org/lists/top500/1996/11/
|
| (498 kW for 229 GFlops. 136,317 times more power draw per
| flops than the current leader on the Green500.)
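 |
 | (That ratio is just watts per GFlop/s then vs. now; a quick
 | check with approximate figures, assuming the June 2022 Green500
 | leader ran at roughly 62.7 GFlop/s per watt:)
 |
 |     // Approximate check of the ~136,000x figure.
 |     #include <cstdio>
 |
 |     int main() {
 |         const double w_per_gflops_1996 = 498000.0 / 229.0; // ~2175 W
 |         const double w_per_gflops_2022 = 1.0 / 62.7;       // ~0.016 W
 |         std::printf("ratio: %.0f\n",
 |                     w_per_gflops_1996 / w_per_gflops_2022); // ~136,000
 |         return 0;
 |     }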
| causi wrote:
| It feels like it's been a long time since supercomputers were
| interesting. They're just oodles of identical processors
| connected together like legos. "We can afford more bricks than
| the next guy" is not exciting. When was the last time we had a
| "fastest supercomputer" that could do something the second-
| fastest couldn't also do?
| zamadatix wrote:
 | Speed is just a measure of how fast it does something, not a
 | measure of what it's capable of doing. I wouldn't expect to
 | divine more information like "what new things can it do" from
 | that number alone, beyond "things we didn't have enough compute
 | time for before, we do now".
 |
 | Lego-style supercomputers are still very interesting in my
 | eyes, though. While scaling raw compute performance has
 | simplified to a "how many do you want" problem, the technical
 | complexity in the interconnects has remained interesting and
 | innovative, both intra-node and inter-node. You won't really
 | see that in the FLOPS number that makes the headlines, but the
 | interconnect can be the difference between a type of workload
 | being feasible or not. The main push here is how large you can
 | make certain levels of shared memory access, and at what
 | latencies, so you can run larger jobs instead of just more
 | jobs.
| convolvatron wrote:
| there is also a huge amount of work remaining to be done in
| programming models and consistency.
| rfoo wrote:
| > They're just oodles of identical processors connected
| together like legos.
|
| That's the Cloud, not supercomputing. Supercomputing is all
| about interconnect.
| agumonkey wrote:
 | I also wonder how the software side of things changes in these
 | settings: how do people design programs/algorithms around fast,
 | wide data paths like these?
| colatkinson wrote:
| I have a bit of experience programming for a highly-
| parallel supercomputer, specifically in my case an IBM
| BlueGene/Q. In that case, the answer is a lot of message
| passing (we used Open MPI [0]). Since the nodes are
| discrete and don't have any shared memory, you end up with
| something kinda reminiscent of the actor model as
| popularized by Erlang and co -- but in C for number-
| crunching performance.
|
| That said, each of the nodes is itself composed of multiple
| cores with shared memory. So in cases where you really want
| to grind out performance, you actually end up using message
| passing to divvy up chunks of work, and then use classic
| pthreads to parallelize things further, with lower latency.
|
| I forget the exact terminology used, but the parent is
| right that the interconnect is the "killer feature." To
| make that message passing fast, there's a lot of crazy
 | topology to keep the number of hops down. The Q had nodes
| connected in a "torus" configuration to that end [1].
|
| Debugging is a bit of a nightmare, though, since some bugs
| inevitably only come up once you have a large number of
| nodes running the algorithm in parallel. But you'll
| probably be in a mainframe-style time-sharing setup, so you
| may have to wait hours or more to rerun things.
|
| This applies less to some of the newer supercomputers,
| which are more or less clusters of GPUs instead of clusters
| of CPUs. I imagine there's some commonality, but I haven't
| worked with any of them so I can't really say.
|
| [0] https://www.open-mpi.org/
|
| [1] https://www.scorec.rpi.edu/~shephard/FEP19/notes-2019/I
| ntrod...
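 |
 | (To make the split concrete, here is a minimal hybrid
 | MPI-plus-pthreads sketch of the pattern described above. It is
 | illustrative only - not code from the BlueGene/Q - and the
 | array size and thread count are made up.)
 |
 |     // Hybrid parallelism sketch: MPI ranks own disjoint chunks
 |     // of data and communicate by message passing; pthreads
 |     // share memory within a rank. Build with mpicxx -pthread.
 |     #include <mpi.h>
 |     #include <pthread.h>
 |     #include <cstdio>
 |     #include <vector>
 |
 |     struct ThreadArg {
 |         double* data; long begin; long end; double partial;
 |     };
 |
 |     static void* sum_chunk(void* p) {
 |         ThreadArg* a = static_cast<ThreadArg*>(p);
 |         double s = 0.0;
 |         for (long i = a->begin; i < a->end; ++i) s += a->data[i];
 |         a->partial = s;
 |         return nullptr;
 |     }
 |
 |     int main(int argc, char** argv) {
 |         int provided, rank;
 |         MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
 |         MPI_Comm_rank(MPI_COMM_WORLD, &rank);
 |
 |         const long n_local = 1 << 20;  // this rank's share of the work
 |         const int n_threads = 4;       // threads sharing this rank's memory
 |         std::vector<double> local(n_local, 1.0);
 |
 |         // Threads split the rank-local chunk (shared memory, low latency).
 |         pthread_t tid[n_threads];
 |         ThreadArg args[n_threads];
 |         for (int t = 0; t < n_threads; ++t) {
 |             args[t] = ThreadArg{ local.data(), t * n_local / n_threads,
 |                                  (t + 1) * n_local / n_threads, 0.0 };
 |             pthread_create(&tid[t], nullptr, sum_chunk, &args[t]);
 |         }
 |         double rank_sum = 0.0;
 |         for (int t = 0; t < n_threads; ++t) {
 |             pthread_join(tid[t], nullptr);
 |             rank_sum += args[t].partial;
 |         }
 |
 |         // One collective over the interconnect combines every rank's result.
 |         double global_sum = 0.0;
 |         MPI_Reduce(&rank_sum, &global_sum, 1, MPI_DOUBLE, MPI_SUM, 0,
 |                    MPI_COMM_WORLD);
 |         if (rank == 0) std::printf("global sum = %.1f\n", global_sum);
 |
 |         MPI_Finalize();
 |         return 0;
 |     }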
| freemint wrote:
 | Well, fundamentally all supercomputers are Turing machines, so
 | "one can do X while the other cannot" doesn't really make sense
 | in that context.
 |
 | However, the second-fastest (the ARM-based Fugaku) absolutely
 | wipes the floor with the fastest in certain tasks due to a
 | difference in interconnect topology. Furthermore, Fugaku has no
 | GPUs, unlike many other supercomputers, and instead uses a CPU
 | with vector instructions, leading to a different programming
 | model.
 |
 | If you are more into specialized hardware, Anton 3 is amazing.
| bitwize wrote:
 | Building the communication fabric it takes to make those oodles
 | of identical processors exchange and share data quickly, so
 | they don't get bogged down in their own communication overhead,
 | is a profoundly interesting problem, and by "profoundly
 | interesting" I mean "call Richard Feynman in to help you solve
 | it":
|
| https://longnow.org/essays/richard-feynman-connection-machin...
|
| Besides which, at that level the goal is not to go "look at
| this cool thing we built", it's more like "how do we cheaply
| and effectively build something that can solve these massive
| weather/nuclear explosion/human brain/etc. simulation problems
| we have?" and if ganging together lots of off-the-shelf
| CPUs/GPUs achieves that goal with less time, effort, and cost
| than building super-custom, boutique-schmoutique hardware, so
| be it.
| guenthert wrote:
| Not sure about exciting, but I'd think the technical
| challenges, particularly regarding intra-cluster communication,
 | can be interesting to some. There's a lot of money in it; they
 | had better do something useful (more useful than running
 | Linpack or calculating digits of Pi), rather than being just
 | showcases.
 |
 | That said, #1 is about twice as fast as #2, which is about
 | three times as fast as #3. Those gaps are much wider than I
 | would have expected this late in the game.
| kjs3 wrote:
 | You can still get the NEC SX series, which is a non-x86, non-
 | ARM vector super. They're pretty nifty. "Fastest" has gone in a
| different direction tho.
| inasio wrote:
 | There's a bit of drama in that there are unofficial reports of
 | two systems in China with higher performance [0]. The arXiv
 | paper linked below talks about a 40-million-core system with
 | around double the theoretical performance of Frontier, and
 | there's apparently a second system online with similar
 | performance. I personally suspect that they didn't submit
 | benchmarks to the TOP500 simply because the benchmarks don't
 | run well enough on those systems.
|
| [0] https://arxiv.org/pdf/2204.07816.pdf
| dekhn wrote:
 | Let them build those machines, and if they are any good we can
 | steal all their ideas. Turnabout is fair play.
| oefrha wrote:
| I heard they won't submit anymore so as to not draw further
| scrutiny and possible sanctions onto their suppliers. Not sure
| if true, but keeping a low profile certainly makes sense given
| the blows dealt to the more visible vendors in the past few
| years.
| greggsy wrote:
| What are the concerns for vendors?
| perihelions wrote:
| US banned the sale of American HPC components to Chinese
| supercomputers.
|
| https://news.ycombinator.com/item?id=9349116 (2015, 93
| comments)
|
| https://news.ycombinator.com/item?id=26740371 (2021, 151
| comments) etc.
| throwaway4good wrote:
 | They also prevent Chinese supercomputing-related companies
 | from having their chips fabbed in Taiwan.
| marcodiego wrote:
 | I remember in the early 2000s trying to convince people to use
 | Linux and being mocked that it was a "toy" or "not professional
 | enough". While at the time I argued that it was more stable,
 | more secure, and performed better than the competition, and
 | even that it was improving continuously, some people still made
 | fun of me. It is a good thing that, for a long time now, I've
 | been able to see this: https://www.top500.org/statistics/list/ ,
 | choose Category: "Operating system family" and click "Submit".
| jacquesm wrote:
| The writing has been on the wall since the early Beowulf
| success stories hit.
|
 | I'm pretty bullish on the long-term survival of Linux in some
 | form or other; proprietary OSes, not so much.
| jpsamaroo wrote:
| This is exciting news! What's also exciting is that it's not just
| C++ that can run on this supercomputer; there is also good
| (currently unofficial) support for programming those GPUs from
| Julia, via the AMDGPU.jl library (note: I am the
| author/maintainer of this library). Some of our users have been
| able to run AMDGPU.jl's testsuite on the Crusher test system
| (which is an attached testing system with the same hardware
| configuration as Frontier), as well as their own domain-specific
| programs that use AMDGPU.jl.
|
| What's nice about programming GPUs in Julia is that you can write
| code once and execute it on multiple kinds of GPUs, with
| excellent performance. The KernelAbstractions.jl library makes
| this possible for compute kernels by acting as a frontend to
| AMDGPU.jl, CUDA.jl, and soon Metal.jl and oneAPI.jl, allowing a
| single piece of code to be portable to AMD, NVIDIA, Intel, and
| Apple GPUs, and also CPUs. Similarly, the GPUArrays.jl library
| allows the same behavior for idiomatic array operations, and will
| automatically dispatch calls to BLAS, FFT, RNG, linear solver,
| and DNN vendor-provided libraries when appropriate.
|
| I'm personally looking forward to helping researchers get their
| Julia code up and running on Frontier so that we can push
| scientific computing to the max!
|
| Library link: <https://github.com/JuliaGPU/AMDGPU.jl>
| linsomniac wrote:
 | TL;DR: Wow! ~9 million cores, 21 megawatts, >2x the performance
 | of #2 but pulling less power (compared to 30 MW). #3 is
 | 0.15 EFLOPS, but also 3 MW.
| jakear wrote:
 | The spec sheet mentions they're moving from CUDA, which powered
 | their prior supercomputer, to "HIP" for this one. This is the
 | first I've heard of HIP; does anyone have experience with it?
 | My impression was that GPU programming tended to mean CUDA,
 | which isn't cross-platform (as opposed to HIP).
|
| https://developer.amd.com/resources/rocm-learning-center/fun....
| eslaught wrote:
| HIP is basically CUDA with s/cuda/hip/g.
|
| My experience is that the stack is pretty rough around the
| edges. But when it works, you (almost) literally find-and-
| replace, and it pretty much works as advertised. However, just
| because you can get to a correct code doesn't necessarily mean
| that code will achieve optimal performance (without further
| tuning, of course).
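 |
 | For a concrete (toy) illustration of the find-and-replace point
 | - this is a made-up example, not code from any of these systems
 | - the HIP version of a trivial kernel looks like this, and the
 | CUDA version differs only in the header and the cuda*/hip*
 | prefixes:
 |
 |     // Toy HIP example; error checking omitted for brevity.
 |     #include <hip/hip_runtime.h>   // CUDA: <cuda_runtime.h>
 |     #include <cstdio>
 |
 |     __global__ void scale(float* x, float a, int n) {
 |         int i = blockIdx.x * blockDim.x + threadIdx.x;
 |         if (i < n) x[i] *= a;
 |     }
 |
 |     int main() {
 |         const int n = 1 << 20;
 |         float* host = new float[n];
 |         for (int i = 0; i < n; ++i) host[i] = 1.0f;
 |
 |         float* dev = nullptr;
 |         hipMalloc((void**)&dev, n * sizeof(float));     // cudaMalloc
 |         hipMemcpy(dev, host, n * sizeof(float),
 |                   hipMemcpyHostToDevice);               // cudaMemcpy
 |         scale<<<(n + 255) / 256, 256>>>(dev, 2.0f, n);  // same launch syntax
 |         hipMemcpy(host, dev, n * sizeof(float),
 |                   hipMemcpyDeviceToHost);
 |         hipFree(dev);                                   // cudaFree
 |
 |         std::printf("host[0] = %.1f\n", host[0]);       // prints 2.0
 |         delete[] host;
 |         return 0;
 |     }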
| latchkey wrote:
| Not only tuning the code, but also the bazillion knobs on the
| GPUs themselves.
| gh02t wrote:
 | That's the upside of a supercomputer, though: a fixed
 | architecture to target, with enough weight behind it that the
 | tuning is worthwhile.
| latchkey wrote:
 | If you have AMD GPUs, then you need to use HIP to run all those
 | CUDA applications.
| CoastalCoder wrote:
| I used to be really excited about supercomputers. It's part of
| why I pursued HPC-related work.
|
| But I think that having no interest in their actual applications
 | has curbed my enthusiasm. I wish I could make a good living in
 | something that interested me more.
| JonChesterfield wrote:
| Could work on the supercomputer hardware/toolchain/libraries
| instead of the applications
| linksnapzz wrote:
| I love the applications, but I'm dismayed at the stagnation in
| programming models used to get the best performance out of
| modern clusters. This sums up my feelings:
|
| https://www.usenix.org/conference/atc21/presentation/fri-key...
| nabla9 wrote:
 | For comparison, the 2000 SP Power3 375 MHz at Oak Ridge
 | National Laboratory did the same order of magnitude of GFlops
 | as an iPhone with the A14 chip can do.
___________________________________________________________________
(page generated 2022-07-06 23:01 UTC)