[HN Gopher] El Capitan: New Supercomputer Is the Fastest in the ...
___________________________________________________________________
El Capitan: New Supercomputer Is the Fastest in the World
Author : rbanffy
Score : 47 points
Date : 2024-11-19 20:52 UTC (2 hours ago)
(HTM) web link (spectrum.ieee.org)
(TXT) w3m dump (spectrum.ieee.org)
| Melatonic wrote:
| Anybody know what "Inertial Confinement Fusion" is in the
| referenced article?
| JumpCrisscross wrote:
| > _what "Inertial Confinement Fusion" is_
|
| The experimental fusion approach used by the NIF [1][2].
|
| It's conveniently simultaneously an approach to fusion power, a
| way to study fusion plasmas and a tiny nuclear explosion.
|
| [1] https://en.wikipedia.org/wiki/Inertial_confinement_fusion
|
| [2] https://en.wikipedia.org/wiki/National_Ignition_Facility
| MobiusHorizons wrote:
| > El Capitan, housed at Lawrence Livermore National Laboratory in
| Livermore, Calif., can perform over 2700 quadrillion operations
| per second at its peak. The previous record holder, Frontier,
| could do just over 2000 quadrillion peak operations per second.
|
| > El Capitan uses AMD's MI300a chip, dubbed an accelerated
| processing unit, which combines a CPU and GPU in one package. In
| total, the system boasts 44,544 MI300As, connected together by
| HPE's Slingshot interconnects.
|
| Seems like a nice win for AMD.
| alephnerd wrote:
| > Seems like a nice win for AMD
|
| Yep! They've been part of the Exascale project for a long time,
| and it's good to see their commitment to HPC actually succeed,
| unlike Intel's over the same period.
| einpoklum wrote:
| So, they built this supercomputer to test new and more deadly
| nuclear weapons. That makes me so "happy". I am absolutely not
| worried about two nuclear powers being close to the brink of
| direct war, even as we speak; nor about the abandonment of the
| course of nuclear disarmament treaties; nor about the repeated talk
| of a coming war against certain Asian powers. Everything is great
| and I'll just fawn over the colorful livery and the petaflops
| figure.
| comboy wrote:
| I'd guess it's unlikely to be the real use case. The real one
| is classified. Plus it's not like more deadly nuclear weapons
| would change anything; we can do bad enough with what we
| already have.
| JumpCrisscross wrote:
| > _it's unlikely to be the real use case. The real one is
| classified._
|
| What are you basing this on?
|
| > _it's not like more deadly nuclear weapons would change
| anything_
|
| We haven't been chasing yield in nuclear weapons since the
| 60s.
|
| Our oldest warheads date from the 60s [1]. For obvious
| reasons, the experimental track record on half-century old
| pits is scarce. We don't know if novel physics or chemistry
| is going on in there, and we don't want to be the second ones
| to find out.
|
| [1] https://en.wikipedia.org/wiki/B61_nuclear_bomb
| alephnerd wrote:
| > I'd guess it's unlikely to be the real use case
|
| I can safely say that nuclear simulations are one of the
| major drivers for HPC research globally.
|
| It is not the only one (genomics, simulations, fundamental
| research are also major drivers) but it is a fairly prominent
| one.
| realo wrote:
| Maybe there is research not on bigger bangs, but on smaller
| packages?
|
| Think about a baseball-size device able to take out a city
| block.
|
| Then think about a squadron of drones able to transport
| those baseballs to very precise city blocks...
| JumpCrisscross wrote:
| > _they built this supercomputer to test new and more deadly
| nuclear weapons_
|
| If you are afraid of nuclear war, the thing to fear is a
| nuclear state's capacity to retaliate being questioned. These
| supercomputers are the alternative to live tests. Taking them
| away doesn't poof nuclear weapons, it means you are left with a
| half-assed deterrent or must resume live tests.
|
| > _the abandonment of the course of nuclear disarmament treaties_
|
| North Korea, the American interventions in the Middle East and
| Ukraine set the precedent that nuclear sovereignty is in a
| separate category from the treaty-enforced kind. Non-
| proliferation won't be made or broken on the back of aging,
| degrading weapons.
|
| > _repeated talk of a coming war against certain Asian powers_
|
| One invites war by refusing to prepare for it.
| rbanffy wrote:
| The whole point of testing (and making) deadly nuclear weapons
| is to ensure they are never used again. The Mutually Assured
| Destruction doctrine has kept us alive through the darkest parts
| of the Cold War (also keeping the Cold War cold). The only way to
| credibly threaten anyone who tries to annihilate you with certain
| annihilation is with lots of such doomsday weapons. We have lived
| in this Mexican standoff for longer than most of us can remember.
| postalrat wrote:
| We are living in the darkest days of the Cold War right now.
| shagie wrote:
| I would reference an older article on super computers and the
| nuclear weapon arsenal.
|
| https://www.techtarget.com/searchdatacenter/news/252468294/C...
|
| > "The Russians are fielding brand new nuclear weapons and
| bombs," said Lisa Gordon-Hagerty, undersecretary for nuclear
| security at the DOE. She said "a very large portion of their
| military is focused on their nuclear weapons complex."
|
| > It's the same for China, which is building new nuclear
| weapons, Gordon-Hagerty said, "as opposed to the United States,
| where we are not fielding or designing new nuclear weapons. We
| are actually extending the life of our current nuclear weapons
| systems." She made the remarks yesterday in a webcast press
| conference.
|
| > ...
|
| > Businesses use 3D simulation to design and test new products
| in high performance computing. That is not a unique capability.
| But nuclear weapon development, particularly when it involves
| maintaining older weapons, is extraordinarily complex,
| Goldstein said.
|
| > The DOE is redesigning both the warhead and nuclear delivery
| system, which requires researchers to simulate the interaction
| between the physics of the nuclear system and the engineering
| features of the delivery system, Goldstein said. He
| characterized the interaction as a new kind of problem for
| researchers and said 2D development doesn't go far enough. "We
| simply can't rely on two-dimensional simulations -- 3D is
| required," he said.
|
| > Nuclear weapons require investigation of physics and
| chemistry problems in a multidimensional space, Goldstein said.
| The work is a very complex statistical problem, and Cray's El
| Capitan system, which can couple this computation with machine
| learning, is ideally suited for it, he said.
|
| ---
|
| This isn't designing new ones. Or blowing things up (
| https://www.reuters.com/article/us-usa-china-nuclear/china-m...
| ) to see if they still work. It is simulating them to have the
| confidence that they still work - and that the adversaries of
| the US know that the scientists are confident that they still
| work without having to blow things up.
| JumpCrisscross wrote:
| > _to see if they still work. It is simulating them to have
| the confidence that they still work_
|
| The Armageddon scenario is that some nuclear states conduct
| stockpile stewardship, some don't, and those who do discover
| that warheads come with a use-by date.
| freeone3000 wrote:
| Eh, we have all the nukes we need and we already know how to
| build them. This is going to help more with fusion _power_ than
| fusion _explosives_.
| theideaofcoffee wrote:
| I'd rather have a few supercomputers doing stockpile stewardship
| than have the weapons tested live. As much as I hate it
| personally, these weapons are a part of our society for better
| or for worse until we (as in the people) decide they won't be
| by electing those that will help dismantle the programs. They
| should be maintained and these tools help in that.
| olao99 wrote:
| I fail to understand how these nuclear bomb simulations require
| so much compute power.
|
| Are they trying to model every single atom?
|
| Is this a case where the physicists in charge get away with
| programming the most inefficient models possible and then the
| administration simply replies "oh I guess we'll need a bigger
| supercomputer"
| TeMPOraL wrote:
| Pot, meet kettle? It's usually industry that leads with the
| "write inefficient code, hardware is cheaper than dev time"
| approach. If anything, I'd expect a long-running physics
| research project to have well-optimized code. After all, that's
| where all the optimized math routines come from.
| bongodongobob wrote:
| My brother in Christ, it's a supercomputer. What an odd
| question.
| CapitalistCartr wrote:
| It's because of the way the weapons are designed, which
| requires a CNWDI clearance to know, so your curiosity is not
| likely to be sated.
| nordsieck wrote:
| > It's because of the way the weapons are designed, which
| requires a CNWDI clearance to know, so your curiosity is not
| likely to be sated.
|
| While that's true, the information that is online is
| surprisingly detailed.
|
| For example, this series "Nuclear 101: How Nuclear Bombs
| Work"
|
| https://www.youtube.com/watch?v=zVhQOhxb1Mc
|
| https://www.youtube.com/watch?v=MnW7DxsJth0
| CapitalistCartr wrote:
| Having once had said clearance limits my answers.
| p_l wrote:
| It literally requires simulating each subatomic particle,
| individually. Increases in compute power have been used for the
| twin goals of reducing simulation time (letting you run more
| simulations) and increasing the size and resolution.
|
| The alternative is to literally build and detonate a bomb to
| get empirical data on a given design, which might have problems
| with replicability (important when applying the results to the
| rest of the stockpile) or with how exact the data is.
|
| And remember that there is more than one user of every
| supercomputer deployed at such labs, whether it be multiple
| "paying" jobs like research simulations, smaller jobs run to
| educate, test, and optimize before running full scale work,
| etc.
|
| AFAIK, for a considerable amount of time now, supercomputers have
| run more than one job at a time, too.
| pkaye wrote:
| Are they always designing new nuclear bombs? Why the ongoing
| work to simulate?
| danhon wrote:
| It's also to check that the ones they have will still work,
| now that there are test bans.
| dekhn wrote:
| The euphemistic term used in the field is "stockpile
| stewardship", which is a catch-all term involving a wide
| range of activities, some of them forward-looking.
| p_l wrote:
| Because even normal explosives degenerate over time, and
| fissile material in nuclear devices is even worse about it
| - remember that unstable elements undergo constant fission
| events; critical mass is just the point where they trigger
| each other's fission fast enough for a runaway process.
|
| So in order to verify that the weapons are still useful and
| won't fail in random ways, you have to test them.
|
| Which either involves actually exploding them (banned by
| various treaties that have enough weight that even the USA
| doesn't break them), or numerical simulations.
| AlotOfReading wrote:
| Multiple birds with one stone.
|
| * It's a jobs program to avoid the knowledge loss created
| by the end of the cold war. The US government poured a lot
| of money into recreating the institutional knowledge needed
| to build weapons (e.g. materials like FOGBANK), and it prefers
| to maintain that knowledge by having people work
| on nuclear programs that aren't quite so objectionable as
| weapon design.
|
| * It helps you better understand the existing weapons
| stockpiles and how they're aging.
|
| * It's an obvious demonstration of your capabilities and
| funding for deterrence purposes.
|
| * It's political posturing to have a big supercomputer and
| the DoE is one of the few agencies with both the means and
| the motivation to do so publicly. This has supposedly been
| a major motivator for the Chinese supercomputers.
|
| There's all sorts of minor ancillary benefits that come out
| of these efforts too.
| colonCapitalDee wrote:
| Basically yes, we are always designing new nuclear bombs.
| This isn't done to increase yield, we've actually been
| moving towards lower yield nuclear bombs ever since the mid
| Cold War. In the 60s the US deployed the B41 bomb with a
| maximum yield of 25 megatons, making it the most powerful
| bomb ever deployed by the US. When the B41 was retired in
| the late 70s, the most powerful bomb in the US arsenal was
| the B53 with a yield of 9 megatons. The B53 was retired in
| 2011, leaving the B83 as the most powerful bomb in the US
| arsenal with a yield of only 1.2 megatons.
|
| There are two kinds of targeting that can be employed in a
| nuclear war: counterforce and countervalue. Counterforce is
| targeting enemy military installations, and especially
| enemy nuclear installations. Countervalue is targeting
| civilian targets like cities and infrastructure. In an all
| out nuclear war counterforce targets are saturated with
| nuclear weapons, with each target receiving multiple
| strikes to hedge against the risks of weapon failure,
| weapon interception, and general target survival due to
| being in fortified underground positions. Any weapons
| that are not needed for counterforce saturation strike
| countervalue targets. It turns out that having a yield
| greater than a megaton is basically just overkill for both
| counterforce and countervalue. If you're striking an
| underground military target (like a missile silo) protected
| by air defenses, your odds of destroying that target are
| higher if you use three one megaton yield weapons than if
| you use a single 20 megaton yield weapon. If you're
| striking a countervalue target, the devastation caused by a
| single nuclear detonation will be catastrophic enough to
| make optimizing for maximum damage pointless.
|
| Thus, weapons designers started to optimize for things
| other than yield. Safety is a big one: an American nuclear
| weapon going off on US soil would have far-reaching
| political effects and would likely cause the president to
| resign. Weapons must fail safely when the bomber carrying
| them bursts into flames on the tarmac, or when the rail
| carrying the bomb breaks unexpectedly. They must be
| resilient against both operator error and malicious
| sabotage. Oh, and none of these safety considerations are
| allowed to get in the way of the weapon detonating when it
| is supposed to. This is really hard to get right!
|
| Another consideration is cost. Nuclear weapons are
| expensive to make, so a design that can get a high yield
| out of a small amount of fissile material is preferred.
| Maintenance, and the cost of maintenance, is also relevant.
| Will the weapon still work in 30 years, and how much money
| is required to ensure that?
|
| The final consideration is flexibility and effectiveness.
| Using a megaton yield weapon on the battlefield to destroy
| enemy troop concentrations is not a viable tactic because
| your own troops would likely get caught in the strike. But
| lower yield weapons suitable for battlefield use (often
| referred to as tactical nuclear weapons) aren't useful for
| striking counterforce targets like missile silos. Thus,
| modern weapon designs are variable yield. The B83 mentioned
| above can be configured to detonate with a yield in the low
| kilotons, or up to 1.2 megatons. Thus a single B83 weapon
| in the US arsenal can cover multiple contingencies, making
| it cheaper and more effective than maintaining a larger
| arsenal of single yield weapons. This is in addition to
| special purpose weapons designed to penetrate underground
| bunkers or destroy satellites via EMP, which have their own
| design considerations.
| Jabbles wrote:
| > It literally requires simulating each subatomic particle,
| individually.
|
| Citation needed.
|
| 1 gram of Uranium 235 contains 2e21 atoms, which would take
| 15 minutes for this supercomputer to count.
|
| "nuclear bomb simulations" do not need to simulate every
| atom.
|
| I speculate that there will be _some_ simulations at the
| subatomic scale, and they will be used to inform other
| simulations of larger quantities at lower resolutions.
|
| https://www.wolframalpha.com/input?i=atoms+in+1+gram+of+uran...
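|
| For anyone who wants to check the arithmetic, a rough sketch in
| Python (my own rounding, using Avogadro's number and the ~2.7
| exaflop peak quoted in the article):
|
|     # back-of-envelope: atoms in 1 g of U-235 vs. "counting" them
|     AVOGADRO = 6.022e23                 # atoms per mole
|     atoms = (1.0 / 235.0) * AVOGADRO    # ~2.6e21 atoms in one gram
|     peak_ops = 2.7e18                   # El Capitan peak ops per second
|     print(atoms / peak_ops / 60)        # ~16 minutes just to count them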
| p_l wrote:
| Subatomic scale is the perfect option, but we tend to not
| have time for that, so we sample and average and do other
| things. At least that's the situation with aerospace's hunger
| for CFD; I figure nuclear has similar approaches.
| Jabbles wrote:
| I would like a citation for anyone in aerospace using (or
| even realistically proposing) subatomic fluid dynamics.
| JumpCrisscross wrote:
| > _Are they trying to model every single atom?_
|
| Given all nuclear physics happens _inside_ atoms, I'd hope
| they're being more precise.
|
| Note that a frontier of fusion physics is characterising plasma
| flows. So even at the atom-by-atom level, we're nowhere close
| to a solved problem.
| amelius wrote:
| Or maybe it suffices to model the whole thing as a gas. It
| all depends on what they're trying to compute.
| JumpCrisscross wrote:
| > _maybe it suffices to model the whole thing as a gas_
|
| What are you basing this on? Plasmas don't flow like gases
| even absent a magnetic field. They're self interacting,
| even in supersonic modes. This is like saying you can just
| model gases like liquids when trying to describe a plane--
| they're different states of matter.
| alephnerd wrote:
| > I fail to understand how these nuclear bomb simulations
| require so much compute power
|
| I wrote a previous HN comment explaining this:
|
| Tl;dr - Monte Carlo Simulations are hard and the NPT prevents
| live testing similar to Bikini Atoll or Semipalatinsk-21
|
| https://news.ycombinator.com/item?id=39515697
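|
| To make the "Monte Carlo is hard" point a bit more concrete, here
| is a deliberately toy sketch (entirely made up by me, with made-up
| material numbers -- nothing like a real transport code): a 1D
| neutron random walk through a slab. The statistical error only
| shrinks as 1/sqrt(N), which is one reason the real 3D,
| multi-physics versions eat exaflops.
|
|     import math, random
|
|     def escape_probability(thickness_mfp, n_particles, p_absorb=0.3):
|         """Fraction of neutrons that leak out the far side of a slab
|         measured in mean free paths; p_absorb is a made-up number."""
|         escapes = 0
|         for _ in range(n_particles):
|             x, direction = 0.0, 1.0
|             while True:
|                 x += direction * random.expovariate(1.0)  # free flight
|                 if x >= thickness_mfp:    # leaked out the far side
|                     escapes += 1
|                     break
|                 if x <= 0.0:              # leaked back out the near side
|                     break
|                 if random.random() < p_absorb:
|                     break                 # absorbed
|                 direction = random.choice((-1.0, 1.0))    # scatter
|         return escapes / n_particles
|
|     for n in (1_000, 100_000):
|         p = escape_probability(5.0, n)
|         print(n, p, "+/-", math.sqrt(p * (1 - p) / n))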
| cryptozeus wrote:
| This is great but I absolutely love that poster of El Capitan on
| the supercomputer racks! Also TIL there is a list of top500 at
| https://www.top500.org/lists/top500/2024/11/
| theideaofcoffee wrote:
| That's a pretty standard Cray feature for systems larger than a
| few cabinets. El Capitan has the landscape, Hopper at NERSC had
| a photo of Grace Hopper, Aurora at ANL has a creamy gradient
| reminiscent of the aurora borealis, and on and on. Gives them a
| bit of
| character beyond the bad-ass Cray label on the doors.
| pama wrote:
| Noting here that 2700 quadrillion operations per second is less
| than the estimated sustained throughput of productive bfloat16
| compute during the training of the large Llama 3 models, which
| IIRC was about 45% of 16,000 quadrillion operations per second,
| i.e. 16k H100s in parallel at about 0.45 MFU. The compute power of
| national labs has fallen far behind industry in recent years.
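|
| For the curious, the back-of-envelope version of that comparison
| (my assumptions: ~16k GPUs, NVIDIA's dense BF16 peak of ~989
| TFLOP/s per H100, and the ~2.7 exaflop peak quoted in the article;
| the precisions are not the same, as the replies below point out):
|
|     h100_bf16 = 989e12                  # dense BF16 peak per H100, FLOP/s
|     gpus = 16_000                       # rough Llama 3 training scale
|     mfu = 0.45                          # assumed model-FLOPs utilization
|     sustained = gpus * h100_bf16 * mfu  # ~7.1e18 FLOP/s
|     el_capitan_peak = 2.7e18            # peak figure from the article
|     print(sustained / el_capitan_peak)  # ~2.6x, at much lower precision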
| handfuloflight wrote:
| Any idea how that stacks up with GPT-4?
| alephnerd wrote:
| Training an LLM (basically Transformers) is a different workflow
| from nuclear simulations (basically Monte Carlo simulations).
|
| There are a lot of intricacies, but at a high level they require
| different compute approaches.
| handfuloflight wrote:
| Can you expand on why the operations per second is not an apt
| comparison?
| pertymcpert wrote:
| When you're doing scientific simulations, you're generally
| a lot more sensitive to FP precision than ML training which
| is very, very tolerant of reduced precision. So while FP8
| might be fine for transformer networks, it would likely be
| unacceptably inaccurate/unusable for simulations.
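|
| A contrived but illustrative example (numpy, nothing to do with
| any real simulation code) of why the precision matters:
|
|     import numpy as np
|
|     # a small correction is simply lost at half precision...
|     print(np.float16(512.0) + np.float16(0.1))    # 512.0
|     print(np.float64(512.0) + np.float64(0.1))    # 512.1
|
|     # ...and a long accumulation of tiny steps stalls far short of
|     # the true value, while double precision gets it right
|     dt16, total16 = np.float16(1e-3), np.float16(0.0)
|     for _ in range(100_000):
|         total16 = np.float16(total16 + dt16)
|     print(float(total16))                         # well below 100
|     print(np.float64(1e-3) * 100_000)             # 100.0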
| pama wrote:
| Absolutely. Though the performance of El Capitan is only
| measured by a LINPACK benchmark, not the actual application.
| pertymcpert wrote:
| I thought modern supercomputers use benchmarks like HPCG
| instead of LINPACK?
| fancyfredbot wrote:
| The top 500 includes both. There is no HPCG result for El
| Capitan yet:
|
| https://top500.org/lists/hpcg/2024/11/
| Koshkin wrote:
| This is about the raw compute, no matter the workflow.
| bryanlarsen wrote:
| A 64 bit float operation is >4X as expensive as a 16 bit float
| operation.
| Koshkin wrote:
| In terms of heat dissipation, maybe, yes. But not necessarily
| in time.
| pama wrote:
| Agreed. However also note that if it was only matrix
| multiplies and no full transformer training, the performance
| of that Meta cluster would be closer to 16k PFlops/s, still
| much faster than the El Capitan performance measured on
| LINPACK and multiplied by 4. Other companies have presumably
| cabled 100k H100s together, but they don't yet publish
| training data for their LLMs. It is good to have competition;
| I just didn't expect the tables to flip so dramatically over
| the last two decades from a time when governments still ruled
| the top spots in computer centers with ease to nowadays where
| the assumption is that there are at least ten companies with
| larger clusters than the most powerful governments.
| declan_roberts wrote:
| Do supercomputers need proximity to other compute nodes in order
| to perform these kinds of computations?
|
| I wonder what would happen if Apple offered people something like
| iCloud+ in exchange for using their idle M4 compute at night time
| for a distributed supercomputer.
| conception wrote:
| If you weren't aware -
| https://en.m.wikipedia.org/wiki/Folding@home
| declan_roberts wrote:
| More of a SETI@home man myself.
| theideaofcoffee wrote:
| The thing that sets these machines apart from something that
| you could set up in AWS (to some degree), or in a distributed
| sense like you're suggesting, is the interconnect: how the
| compute nodes communicate. For a large system like El Capitan,
| you're paying a large chunk of the cost in connecting the nodes
| together with low latency and interesting topologies that
| Ethernet, and even InfiniBand, can't get close to. Code that does
| a lot of DMA or message passing will readily take up all of the
| available bandwidth, and that becomes the primary bottleneck in
| these systems.
|
| The interconnect has been Cray's bread and butter for multiple
| decades: Slingshot, Dragonfly, Aries, Gemini, SeaStar, NUMAlink
| via SGI, etc., and earlier ones for the less massively parallel
| systems before those.
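|
| As a tiny illustration of the kind of pattern that hammers the
| interconnect (a toy mpi4py sketch I made up, not anything from a
| real El Capitan code): every rank owns a slice of a grid and has
| to swap one-cell "halos" with its neighbours on every single time
| step before it can advance, so wire latency sits on the critical
| path of every iteration. Run with something like
| "mpirun -n 4 python halo.py".
|
|     import numpy as np
|     from mpi4py import MPI
|
|     comm = MPI.COMM_WORLD
|     rank, size = comm.Get_rank(), comm.Get_size()
|
|     u = np.zeros(1002)                 # 1000 local cells + 2 ghost cells
|     u[1:-1] = rank                     # arbitrary initial data
|
|     left = rank - 1 if rank > 0 else MPI.PROC_NULL
|     right = rank + 1 if rank < size - 1 else MPI.PROC_NULL
|
|     for step in range(100):
|         # blocking ghost-cell exchange with both neighbours
|         comm.Sendrecv(u[1:2], dest=left, recvbuf=u[-1:], source=right)
|         comm.Sendrecv(u[-2:-1], dest=right, recvbuf=u[0:1], source=left)
|         u[1:-1] = 0.5 * (u[:-2] + u[2:])   # toy averaging update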
| philipkglass wrote:
| Yes, supercomputers need low-latency communication between
| nodes. If a problem is "embarrassingly parallel" (like
| folding@home, mentioned by sibling comment) then you can use
| loosely coordinated nodes. Those sorts of problems usually
| don't get run on supercomputers in the first place, since there
| are cheaper ways to solve them.
| balia wrote:
| Some may not want to hear this, but this "fastest supercomputer"
| list is now meaningless because all the Chinese labs have started
| obfuscating their progress.
|
| A while ago there were a few labs in China in the top 10 and they
| all attracted sanctions / bad attention. Now no Chinese lab
| reports any data.
| pknomad wrote:
| I wouldn't say meaningless... just incomplete.
| leptons wrote:
| I doubt the US Government is telling everyone about their
| fastest computer.
| sandworm101 wrote:
| This new upstart to the name may win in search results today, but
| in a few years the first and true El Cap will reclaim its place.
| It will outlast all of us.
|
| https://en.wikipedia.org/wiki/El_Capitan
___________________________________________________________________
(page generated 2024-11-19 23:01 UTC)