[HN Gopher] EU Grabs ARM for First ExaFLOP Supercomputer
       ___________________________________________________________________
        
       EU Grabs ARM for First ExaFLOP Supercomputer
        
       Author : timthorn
       Score  : 134 points
       Date   : 2023-10-06 14:18 UTC (8 hours ago)
        
 (HTM) web link (www.hpcwire.com)
 (TXT) w3m dump (www.hpcwire.com)
        
       | ameerhamza8796 wrote:
       | [dead]
        
       | panick21_ wrote:
        | I'd much rather have some investment into an advanced open-source
        | RISC-V CPU and phones and laptops based on that. An open-hardware
        | graphics accelerator would be great too.
        | 
        | But if we are going to do an HPC thing, at least make the
        | processor open-hardware and RISC-V.
        
         | brucethemoose2 wrote:
         | The libraries and compiler infrastructure are not ready. There
         | is even some HPC optimization difficulty with ARM, which is
         | much more mature.
         | 
         | Doubly so on the consumer side of things.
        
           | panick21_ wrote:
            | Good opportunity to get those things up to speed.
        
             | brucethemoose2 wrote:
             | And that is happening right now.
             | 
             | RISC-V is coming, it just takes a long time.
        
       | znpy wrote:
        | As a European...
       | 
        | On one hand, it's nice to see funding for European companies to
        | develop European technology, aiming at technological sovereignty.
       | 
        | On the other hand, SiPearl was virtually unknown up to this
        | point, and I can't seem to find anything resembling a CPU review
        | (their website claims they have already released at least one
        | generation of Rhea CPUs). So this amount of money might not be
        | wasted, but it's still not optimally spent. Which isn't 100% bad,
        | but it's at least bittersweet.
       | 
       | If anything, without reviews and performance benchmarks, we might
       | just get ExaFLOPS _on paper_.
       | 
        | Like, how does one of these Rhea CPUs compare to, say, a Graviton
        | 2/3 or to an Ampere Altra CPU?
        
         | brucethemoose2 wrote:
          | Actually, Rhea was "known" for a while, but reading between the
          | lines, it looks like it got delayed and updated:
         | 
         | https://www.anandtech.com/show/16072/sipearl-lets-rhea-desig...
         | 
          | (Neoverse V1 and HBM2e would make this chip kind of old when
          | it's finally operational.)
         | 
          | CPU design takes many years, and this was an HPC-only chip, so
          | it doesn't necessarily need to be marketed and paraded around,
          | and the workloads will be totally different from what Graviton
          | processors run.
        
         | llm_nerd wrote:
          | They are using the ARM Neoverse V1 platform, so they aren't
          | really greenfielding this. Comparatively, the Altra uses the
          | Neoverse N1 platform, which is an older HPC design. The
          | supercomputer is planned to be very GPU-heavy, so while the
          | CPUs offer SVE, they really are primarily orchestrators for the
          | GPUs and wouldn't be a major factor regardless. They're the
          | duct tape.
         | 
         | Like many vendors in the ARM space, most of the real innovation
         | and design comes from ARM.
        
         | cherryteastain wrote:
         | There's only one way to promote domestic industries in this
         | space when they are behind/nonexistent: tons of subsidies, even
         | if the domestic alternatives are worse. That's how China,
         | Taiwan, Korea and Japan did it.
        
           | znpy wrote:
            | Yeah, I understand that, hence the first point.
        
           | blitzar wrote:
            | I know it gets a bad rap in China, and you have to get
            | through the cries of socialism ... but I would like to see
            | the state take partial ownership when it throws tons of
            | subsidies at a domestic company like this.
        
       | zitterbewegung wrote:
        | I wonder if we (United States) could get RISC-V in a
        | supercomputer.
        
         | redundantly wrote:
         | It's not RISC-V:
         | 
         | > SiPearl chose ARM as it is well-established and ready for
         | high-performance applications. Experts say RISC-V is many years
         | away from mainstream server adoption.
        
           | zitterbewegung wrote:
            | I mean the United States should build a RISC-V supercomputer
            | and fund the research.
        
       | moffkalast wrote:
       | > The Jupiter supercomputer, which will cost EUR273 million to
       | build, will pair SiPearl's Rhea processor, which is based on ARM
       | architecture, with accelerator technology from Nvidia.
       | 
       | An ARM and a leg, for sure.
        
       | brucethemoose2 wrote:
       | SiPearl didn't come out of the blue. This has been in planning
       | for years:
       | 
       | https://www.anandtech.com/show/16072/sipearl-lets-rhea-desig...
       | 
       | https://semiengineering.com/tag/sipearl/
       | 
       | ...It may even be behind schedule?
        
         | riedel wrote:
          | The interesting thing is that they do not have a Wikipedia
          | entry. It seems to be purely a product of the JU / European
          | Processor Initiative. If this works, it would be one of the few
          | real successes of the European funding framework.
        
           | brucethemoose2 wrote:
           | Many interesting chips escape Wikipedia's gaze. For instance,
           | there was a very interesting x86, Zen-like CPU from Centaur
           | with an onboard AI accelerator that is basically undocumented
           | on Wikipedia:
           | 
            | https://fuse.wikichip.org/news/3256/centaur-new-x86-server-p...
            | 
            | https://fuse.wikichip.org/news/3099/centaur-unveils-its-new-...
        
             | my123 wrote:
             | At this point, Centaur pretty much no longer exists, with
             | engineers transferred to Intel. CNS never ended up becoming
             | a product.
        
               | brucethemoose2 wrote:
               | Yes, but it was still fascinating!
               | 
                | Imagine if it came out today. I feel like it's the near
                | perfect architecture for cheap GenAI.
        
       | owlbite wrote:
        | It seems odd to focus on the CPU here, when presumably the vast
        | majority of those FLOPS are coming from the NVIDIA parts?
       | 
       | Are the CPUs expected to contribute significant compute, as
       | opposed to marshaling data in/out of the real compute units?
        
         | cmdrk wrote:
          | It really depends on the workload. As other posters have said,
          | not everything can/should be ported to GPU. Some scientific
          | calculations are simply not parallelizable in that way.
          | 
          | Typically, at least in the US, there's a mix of GPU-focused
          | machines as well as traditional CPU-focused machines. The
          | leadership-class machines (i.e., the machines funded to push
          | the FLOPS records) tend to be highly focused on GPU. One reason
          | is fixed cooling/power availability. I assume these facilities
          | are looking at ARM as a way to save 10-20% on power and thus
          | cram that much more into the facility.
        
         | monocasa wrote:
          | You'd be surprised. A lot of supercomputers aren't so much
          | about individual CPU core perf as about having a lot of
          | low-power cores connected in a novel way. The BlueGene
          | supercomputers were composed of low-spec PowerPC cores (even
          | for the time).
          | 
          | High perf/watt matters more than just high perf/node, but even
          | that is balanced against 'how low-latency can the interconnect
          | be'.
         | 
         | You then hit the high FLOP count with tons of nodes.
         | 
         | To be fair Nvidia realized this paradigm years ago too, which
         | is why they bought Mellanox.
        
         | jefft255 wrote:
         | Yes, CPUs are still the main workhorse for many scientific
         | workloads. Sometimes just because the code hasn't been ported,
         | sometimes because it's just not something that a GPU can do
         | well.
        
           | londons_explore wrote:
           | > just because the code hasn't been ported,
           | 
            | Seems stupid to use millions of dollars of supercomputer time
            | just because you can't be bothered to get a few PhD students
            | to spend a few months rewriting in CUDA...
        
             | mlyle wrote:
             | A supercomputer might cost $200M and use $6M of electricity
             | per year.
             | 
              | Amortizing the supercomputer over 5 years, a 12-hour job on
              | that supercomputer may cost $63k.
             | 
             | If you want it cheaper, your choices are:
             | 
             | A) run on the supercomputer as-is, and get your answer in
             | 12 hours (+ scheduling time based on priority)
             | 
             | B) run on a cheaper computer for longer-- an already-
             | amortized supercomputer, or non-supercomputing resources
             | (pay calendar time to save cost)
             | 
             | C) try to optimize the code (pay human time and calendar
             | time to save cost) -- how much you benefit depends upon
             | labor cost, performance uplift, and how much calendar time
             | matters.
             | 
             | Not all kinds of problems get much uplift from CUDA,
             | anyways.
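              | 
              | A rough sketch of that arithmetic in Python (assuming a
              | straight 5-year amortization and ignoring scheduling
              | overhead; the "whole machine" interpretation is what makes
              | the figures line up):
              | 
              |   capex = 200e6            # machine cost, dollars
              |   power_per_year = 6e6     # electricity per year
              |   years = 5                # amortization period (assumed)
              | 
              |   hourly = (capex / years + power_per_year) / (365 * 24)
              |   job_cost = 12 * hourly   # 12 hours of the whole machine
              |   print(round(hourly), round(job_cost))  # ~5251, ~63014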
        
               | jonwachob91 wrote:
               | >> A supercomputer might cost $200M and use $6M of
               | electricity per year.
               | 
                | I'm curious, what university has a $200MM supercomputer?
               | 
               | I know governments have numerous Supercomputers that blow
               | past $200MM in build price, but what universities do?
        
               | sophacles wrote:
                | University of Illinois had Blue Waters ($200+MM, built in
                | ~2012, decommissioned in the last couple of years).
               | 
                | https://www.ncsa.illinois.edu/research/project-highlights/bl...
               | 
               | https://en.wikipedia.org/wiki/Blue_Waters
               | 
               | They have always had a lot of big compute around.
        
               | mlyle wrote:
               | > I know governments have numerous Supercomputers that
               | blow past $200MM in build price, but what universities
               | do?
               | 
                | Even when individual universities don't, governments have
                | supercomputing centers that universities are a primary
                | user of, and the value of computing time is often charged
                | back to the university, or it is a separate item that is
                | competitively granted.
               | 
               | Here we're talking about Jupiter, which is a ~$300M
               | supercomputer where research universities will be a
               | primary user.
        
             | otabdeveloper4 wrote:
             | CUDA is buggy proprietary shit that doesn't work half the
             | time or segfaults with compiler errors.
             | 
             | Basically, unless you have a very specific workload that
             | NVidia has specifically tested, I wouldn't bother with it.
        
             | brnt wrote:
             | The JSC employs a good number of people doing exactly this.
        
             | cmdrk wrote:
              | Sometimes the code is deeply complex stuff that has
              | accumulated for over 30 years. To _just_ rewrite it in CUDA
              | can be a massive undertaking that could easily produce
              | subtly incorrect results, which end up in papers and could
              | propagate far into the future by way of citations, etc.
        
               | throwaway10965 wrote:
               | Sounds like a great job for LLMs. Are there any public
               | repositories of this code? I want to try.
        
               | mlyle wrote:
                | Sounds like a -terrible- job for LLMs, because this is
                | all about attention to detail. The order of operations
                | and the specific constructs of how floating point works
                | in the codes in question are usually _critical_.
                | 
                | Have fun: https://www.qsl.net/m5aiq/nec-code/nec2-1.2.1.2.f
        
               | londons_explore wrote:
                | All the more reason to rewrite it... You don't want some
                | mistake in 30-year-old COBOL code giving your 2023
                | experiment wrong results.
        
               | mlyle wrote:
                | The whole point with these older numerical codes is that
                | they're proven and there's a long history of results to
                | compare against.
        
               | gmueckl wrote:
               | That's the complete opposite of what is actually the
               | case: some of that really old code in these programs is
               | battle-tested and verified. Any rewrite of such parts
               | would just destroy that work for no good reason.
        
               | dpe82 wrote:
               | *FORTRAN.
        
               | _a_a_a_ wrote:
                | Why don't YOU take some old code and rewrite it? I tried
                | it for some 30+ year old HPC code and it was a grim
                | experience and I failed hard. So why not keep your lazy,
                | fatuous suggestions to yourself.
        
             | bee_rider wrote:
             | >> just because the code hasn't been ported, sometimes
             | because it's just not something that a GPU can do well.
             | 
             | > Seems stupid to use millions of dollars of supercomputer
             | time just because you can't be bothered to get a few phd
             | students to spend a few months rewriting in CUDA...
             | 
             | Rewriting code in CUDA won't magically make workloads well
             | suited to GPGPU.
        
               | wang_li wrote:
               | It's highly likely that a workload that is suitable to
               | run on hundreds of disparate computers with thousands of
               | CPU cores is going to be equally well suited for running
               | on tens of thousands of GPU compute threads.
        
               | atq2119 wrote:
               | Not necessarily. GPUs simply aren't optimized around
               | branch-heavy or pointer-chasey code. If that describes
               | the inner loop of your workload, it just doesn't matter
               | how well you can parallelize it at a higher level, CPU
               | cores are going to be better than GPU cores at it.
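                | 
                | A toy example of that pointer-chasing pattern: each step
                | depends on the previous result through an irregular
                | memory access, so there is nothing to hand out to
                | thousands of GPU threads (a minimal Python sketch, purely
                | illustrative):
                | 
                |   import random
                | 
                |   n = 1_000_000
                |   nxt = list(range(n))   # random permutation hop table
                |   random.shuffle(nxt)
                | 
                |   i = 0
                |   for _ in range(n):
                |       i = nxt[i]         # serial, irregular access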
        
               | monocasa wrote:
               | They're not that disparate; the workloads are normally
               | very dependent on the low latency interconnect of most
               | supercomputers.
        
         | xadhominemx wrote:
         | Agreed, the CPUs are not performing the scientific calculations
         | in this system.
         | 
         | Also note -- this project is quite modest in scale. Dozens of
         | GenAI clusters larger than this computer will be installed at
         | cloud data centers in the next 18 months.
        
           | RetroTechie wrote:
           | _" The Jupiter will instead have SiPearl's ARM processor
           | based on ARM's Neoverse V1 CPU design. SiPearl has designed
           | the Rhea chip to be universally compliant with many
           | accelerators, and it supports high-bandwidth memory and DDR5
           | memory channels."_
           | 
           | And
           | 
           |  _" Julich is also building out its machine-learning and
           | quantum computing infrastructure, which the supercomputing
           | center hopes to plug in as accelerator modules hosted at its
           | facility."_
           | 
           | So a modular setup, where different aspects can be upgraded
           | as needed. Btw:
           | 
           | > Also note -- this project is quite modest in scale.
           | 
           | "Exascale" and EUR273M doesn't sound modest to me. No matter
           | what it's compared against.
        
             | xadhominemx wrote:
             | At EUR300m, the Jupiter will be ~10,000 H100s. Each major
             | CSP will have several clusters this size or larger within a
             | few quarters.
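              | 
              | The back-of-the-envelope behind that figure, assuming a
              | ballpark of roughly 30k per H100 of all-in system cost (an
              | assumption, not a quoted price):
              | 
              |   budget = 300e6    # ~EUR300m build cost
              |   per_gpu = 30e3    # assumed all-in cost per H100
              |   print(budget / per_gpu)   # ~10,000 accelerators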
        
             | pclmulqdq wrote:
             | Almost all of the AI computers being built now are
             | relatively modestly sized compared to a supercomputer. All
             | but the biggest ones are at or under the low hundreds of
             | nodes (low thousands of GPUs). The only real exceptions are
             | the few AI hyperscale companies that want to sell GPU
             | computing to others.
        
               | bee_rider wrote:
                | Do AI hyperscalers devote their whole system to one big
                | run anyway?
               | 
               | If they don't, then those are big clusters in the sense
               | that AWS is the world's biggest supercomputer, which is
               | to say, not.
        
               | pclmulqdq wrote:
               | AWS is not a supercomputer because it doesn't have high-
               | adjacency networking. If AWS turned its biggest region
               | loose on Linpack, I would be surprised if they cracked
               | the top 50 on the supercomputer list, despite probably
               | having more cores than #1.
               | 
               | The AI hyperscalers certainly claim to be able to devote
               | 100% of cluster capacity to one training run. Google is
               | training some huge models, OpenAI is also.
        
           | bee_rider wrote:
            | Will these GenAI clusters have similar interconnects and the
            | ability to run scientific computing/HPC codes? AI has moved
            | over to ASICs and GPUs nowadays, right? I also have no idea
            | what their interconnect requirements are, but the task seems
            | pretty low-communication; I wonder if they can get by with a
            | cheaper interconnect.
        
             | the_svd_doctor wrote:
              | AI training requires a lot of global reductions, which must
              | be very fast, otherwise everything slows down. So they also
              | require fast, low-latency interconnects.
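              | 
              | A minimal sketch of that pattern with mpi4py: every rank
              | blocks on the allreduce each step, so interconnect latency
              | and bandwidth sit directly on the step time (dummy gradient
              | data, illustrative only):
              | 
              |   from mpi4py import MPI
              |   import numpy as np
              | 
              |   comm = MPI.COMM_WORLD
              |   local_grad = np.random.rand(1024)  # this rank's gradient
              | 
              |   # Blocking global sum across all ranks, then average.
              |   total = np.empty_like(local_grad)
              |   comm.Allreduce(local_grad, total, op=MPI.SUM)
              |   mean_grad = total / comm.Get_size()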
        
           | hexane360 wrote:
           | I don't know how you can describe "equal to the world's
           | fastest supercomputer, which was built less than a year ago"
           | as "quite modest".
        
         | brucethemoose2 wrote:
          | SIMD-heavy CPUs can provide quite respectable HPC throughput.
         | 
         | The US Dept of Energy had very favorable things to say about
         | the Fujitsu A64FX, which is architecturally similar to the
         | SiPearl Rhea (HBM memory, ARM SVE happy, fast interconnect):
         | https://www.osti.gov/biblio/1965278
         | 
          | They seemed to like the easy porting and flexible programming
          | (since it's "just" CPU SIMD) and specifically described it as
          | competitive with Nvidia:
         | 
         | > To highlight, the pink line represents the energy efficiency
         | metric for A64FX in boost power mode (described in Section
         | IV-C) with an estimated TDP of 140 W and surpassed by the red
         | and yellow lines that represent data for the Volta V100 GPU
         | (highest) and KNL, respectively. The A64FX architecture scores
         | better with the energy efficiency metric relative to the
         | performance efficiency metric due to its low power consumption.
         | 
          | In fact, ARM A64FX supercomputers topped the Green500 (the
          | global supercomputer power-efficiency ranking) for some time,
          | outclassing Nvidia/Intel/AMD machines.
        
       | hashtag-til wrote:
        | This is really cool, and as it uses an established architecture,
        | it can benefit from the software ecosystem that exists around it.
        | 
        | The article focuses basically on the x86 vs. ARM competition.
        | 
        | Any idea where to read more about the applications this machine
        | is expected to run? I guess the usual, like weather forecasting
        | and such?
        
       | zzbn00 wrote:
        | Can report that Julich is not near Munich... Anyway, it will be
        | fascinating to see how the SiPearl chip works out.
        
         | eunos wrote:
          | Yeah, much closer to Aachen in NRW. Side note: it's a small
          | town without a campus; I wonder why they don't locate the
          | center in Aachen or Koln.
        
           | orbifold wrote:
            | They used to do nuclear physics research (and still do to
            | some extent) there and had an experimental fast breeder
           | reactor (capable of producing weapons grade plutonium) on
           | campus. They also prepared for rapid development of nuclear
           | weapons capabilities in the 60s. It was a site for a
           | potential German nuclear weapons program, I think they would
           | have been able to produce enough material in ~6 weeks. They
            | frame it as "nuclear disarmament" now, of course
            | (https://www.fz-juelich.de/en/news/archive/press-release/2022...),
            | but effectively, if there is any place where scientists in
            | Germany have active knowledge of how to develop nuclear
            | weapons, it would be there. There are several former
           | military installations close by, including barracks for a
           | guards and supply company.
           | 
           | You don't want something like that in a city centre.
        
           | zzbn00 wrote:
           | My vague memory is that there was/is a military or government
           | site nearby which is why the research centre is also there.
        
           | b3orn wrote:
           | But there's a big research center next to the small town.
           | 
           | https://en.wikipedia.org/wiki/Forschungszentrum_J%C3%BClich
        
         | solarist wrote:
         | They mixed up Garching with Julich.
        
       | dopa42365 wrote:
        | Isn't it mostly GPU/accelerator-based rather than CPU-based?
       | 
       | Does NVIDIA even sell those anymore without the whole package
       | deal since they came up with Grace?
       | 
        | The last supercomputer with NVIDIA GPUs and third-party CPUs I
        | remember reading about used Zen 2 cores, multiple years ago.
        
         | dotnet00 wrote:
          | Perlmutter (#8) was just commissioned into full service earlier
          | this year and uses Zen 3 cores with A100s. Leonardo (#4) is
          | also from this year and uses Xeon CPUs with A100s. Google's H3
          | also seems to pair H100s with Xeon CPUs.
          | 
          | But yes, the CPU is mostly just a footnote; most of the FLOPS
          | come from the GPUs. Although of course the CPUs still need to
          | be sufficiently fast that the GPUs can be kept fed.
        
         | leetharris wrote:
         | The article says they are using SiPearl's Rhea processor. So
         | I'm guessing it's not a "package deal."
         | 
         | And regarding your question about GPU/accelerators, CPUs still
         | do a LOT of work in HPC. I'm guessing they chose ARM for
         | performance per watt, very important when scaling to many
         | processors.
        
       | datameta wrote:
       | A bit sad that while POWER9 processors were used in then-SoTA
       | supercomputers, as far as I can tell POWER10 (which I worked on
       | more) is not being used for scientific/industrial HPC.
        
         | chasil wrote:
         | I understand that POWER9 was much more open than its successor.
         | Is that a factor?
        
           | wmf wrote:
           | No, I think IBM just gave up on HPC.
        
             | datameta wrote:
              | It would seem to some that the focus is on servers and
              | mainframes. But the thing is, the very same reasons the P10
              | chip excels in a high-end server apply to massively
              | parallel processing. So I don't see a technological or
              | implementation barrier.
        
             | stonogo wrote:
              | Can confirm; supercomputers don't slot neatly enough into
              | quarterly EPS goals.
        
               | datameta wrote:
               | I can see the humor, but the thing is server and
               | mainframe sales already fluctuate based on hardware
               | generation cycles (~3 yrs start to finish, sometimes
               | server overlapping with mainframe or memory controller).
        
               | stonogo wrote:
               | Those contracts are reliable in that the customer is
               | extremely unlikely to move to a different product line.
               | Especially when you've got a customer locked in, refresh
               | timescales are pretty predictable.
               | 
               | HPC contracts are generally borne of federal-agency RFPs,
               | and are _extremely_ competitive, and they only  'pay out'
               | upon a passed acceptance test, so it's not trivially
               | possible to predict which quarter your revenue will land
               | for a given sale. You wind up with sales teams putting
               | tons of work into a contract that didn't get selected,
               | which sucks, but even if you win you might wind up
               | missing sales goals, and then overshooting the mark the
               | following quarter.
               | 
               | In a company less hidebound this obviously wouldn't be a
               | problem, but IBM has been run by the beancounters for
               | long enough that the prestige isn't worth the murky
               | forecast.
        
           | datameta wrote:
            | In my own opinion, I believe the OpenPower project went
            | strong with P10. I was not around to hear the contrast in
            | decisions between the P9 and P10 strategies, so I can't quite
            | compare.
        
       | ThinkBeat wrote:
        | I get the impression this is more about subsidizing domestic
        | development than a decision made for the overall "best" CPU.
       | 
       | https://www.eenewseurope.com/en/sipearl-raises-e90m-for-rhea...
        
         | bigdave42 wrote:
         | ARM is a UK company - the UK is (stupidly) not part of the EU
        
           | pjmlp wrote:
            | It is still closer to us than depending on US technology; as
            | the times are proving, globalization has gone too far.
        
           | sproketboy wrote:
           | [dead]
        
           | bee_rider wrote:
           | I thought SoftBank bought them, making them a Japanese
           | company?
           | 
           | In any case, SiPearl seems to be the one designing the actual
           | chip, they are French.
        
             | devnullbrain wrote:
             | >I thought SoftBank bought them, making them a Japanese
             | company?
             | 
             | What a weird metric for determining the nationality of a
             | company. Intel are publicly traded: are they stateless?
        
               | shard wrote:
               | I think typically a company having shareholders all over
               | the world does not make people think that it is
               | stateless. However, ownership transfer does make a
               | difference, especially when considering how much the new
               | parent company alters the original company's
               | image/culture. For example, I think that Segway has
               | pretty much lost its US company image after being
               | purchased by Ninebot, with Ninebot products being
               | prominently displayed on the Segway website.
        
               | bee_rider wrote:
               | I should have been more direct.
               | 
               | The chip is being designed by a French company, they can
               | license the IP from outside the EU while still building
               | up the EU domestic chip building capabilities. They've
               | just outsourced one (big) piece of the puzzle.
               | 
               | Calling ARM a Japanese company was just to highlight the
               | international nature of these sorts of projects.
        
             | sgerenser wrote:
              | Although SoftBank still owns 90% of the equity, they're now
              | back to being public, and since their headquarters is still
              | in Cambridge, I'd still call them a UK company.
        
               | [deleted]
        
               | bee_rider wrote:
               | That's probably fair, I was just teasing.
               | 
               | In general, I think ARM just was famously started in the
               | UK and so they'll always be associated with the country
               | in some intangible way.
               | 
                | It is sort of funny that we label companies like this;
                | really, they are all multi-national entities. Especially
               | in the case of a company like ARM--they license out the
               | designs to be (sometimes quite significantly!) customized
               | by engineers in other countries, and I'm sure they
               | integrate lots of feedback from those partners. Then
               | those designs are often actually fabricated in a third
               | country!
               | 
               | Which is good, the world is best when we all need each
               | other.
        
             | [deleted]
        
           | b3orn wrote:
           | They chose an ARM CPU by SiPearl. The European Processor
           | Initiative (EPI) lists them as French/German. Interestingly
           | Forschungszentrum Julich (which this supercomputer is for) is
           | also listed as member of EPI.
        
         | chasil wrote:
         | However, the fact that Fujitsu previously claimed the top-
         | performing "Fugaku" supercomputer with their own custom 48-core
         | CPU (fabbed at TSMC) certainly justifies the choice of an
         | AArch64 design.
         | 
         | https://en.wikipedia.org/wiki/Fujitsu_A64FX
        
       | nologic01 wrote:
        | The economics (and politics) of the HPC ecosystem always feel a
        | bit murky, and its impact on the broader computing landscape is
        | not as large as you would expect. Expensive, one-of-a-kind
        | designs that push the envelope (and presumably deliver what they
        | are commissioned for) but float somewhere above what the rest of
        | the world is using.
        | 
        | What would a healthy EU HPC ecosystem look like? At some point
        | there was some excitement about Beowulf clusters [1]. When
        | building a new supercomputer, think, for example, about making
        | at least its main compute units more widely available
        | (universities, startups, SMEs, etc.). HPC computing is arcane,
        | and to tap its potential in the post-Moore's-law era it needs to
        | get much more democratized and popular.
       | 
       | [1] https://en.wikipedia.org/wiki/Beowulf_cluster
        
         | Someone wrote:
         | > making at least its main compute units more widely available
         | 
         | If they did, would anybody want them? Are those units
         | competitive for smaller setups and the kind of jobs they run?
        
           | nologic01 wrote:
            | It would depend heavily on costs and tangible benefits versus,
            | e.g., renting something from cloud providers. It's a new
            | vista, and it's anybody's guess how things will look in five
            | years, but when Intel's CEO is touting the era of the "AI PC"
            | [1], their projection must be that a certain market will form
            | around compute-intensive _local_ computing (largely prompted
            | by the popularity of LLMs/AI, but that's just one domain).
            | 
            | On the second branch of your question, indeed a local
            | "supercomputer piece" should have a sufficient number of
            | CPUs/GPUs to pack meaningful computational power. This way it
            | would also require and enable the right kind of tooling and
            | programming that scales to larger sizes.
            | 
            | Given that algorithms can enhance practically any existing
            | application (productivity, games, etc.), this might be a case
            | of "build it and they will come".
           | 
            | [1] https://www.pcmag.com/news/intel-ceo-get-ready-for-the-ai-pc
        
           | KeplerBoy wrote:
            | Sure. Plenty of supercomputers are just A100s, which are
            | also perfectly usable in a single DL workstation.
            | 
            | * at least if they used PCIe
        
         | blitzar wrote:
         | > What would a healthy EU HPC ecosystem look like?
         | 
         | It already exists, there are probably >100 HPC clusters spread
         | throughout the EU in universities + the CERN cluster etc.
         | 
         | > startups, SME's etc
         | 
            | Why would we want to provide resources for a startup to waste
            | compute on optimising advertising clicks? They can spend
            | their VC cash at AWS.
        
           | nologic01 wrote:
            | You (or rather the taxpayer that pays your salary) would not
            | "provide" anything. The concept is for entities not sponsored
            | by government money to get access (by buying) to
            | substantially similar (but scaled-down) architectures
            | _instead_ of spending their VC cash at AWS.
            | 
            | Ultimately this is also a better use of taxpayer money:
            | diffusing technology more widely and educating people to make
            | use of supercomputing technologies beyond the ivory towers.
        
             | blitzar wrote:
              | The demand for compute time among the existing users far
              | outstrips the supply, by orders of magnitude, hence more is
              | being installed.
        
         | semi-extrinsic wrote:
         | Direct quote from your link:
         | 
         | "Since 2017, every system on the Top500 list of the world's
         | fastest supercomputers has used Beowulf software methods and a
         | Linux operating system."
         | 
         | As for accessible by everyone: here is how you can apply for
         | computing time via PRACE, if you work at an academic
         | institution, a commercial company or a government entity
         | located in Europe:
         | 
         | https://prace-ri.eu/call/eurohpc-ju-call-for-proposals-for-r...
         | 
         | In addition to the very large machines that are covered by
         | PRACE, typically there are national calls for access to
         | "smaller" HPC resources, say up to a few million CPU-hours per
         | year. The allocations on PRACE average around 30-40 million
          | CPU-hours.
         | 
          | What is explicitly _NOT_ allowed on these machines is typically
          | running jobs that use just a handful of cores. They've paid a
          | lot of money for the fancy interconnect, and they want to see
          | it used.
        
           | nologic01 wrote:
           | Remote "cloud" style access is also interesting and important
           | for various use-cases. But I was thinking more in terms of
           | local compute capabilities. I.e. somebody actually packaging
           | these new compute units into workstations / servers to be
           | used by diverse entities.
        
         | AdamN wrote:
         | There is also the benefit of all the people that get that
         | experience and then bring it to trad-computing.
        
       | Havoc wrote:
        | If they only spent half the budget, then is the Nvidia part
        | still to come?
        
       | wslh wrote:
       | What is the approximate price of the SiPearl's Rhea processor?
        
         | swores wrote:
         | I believe it's not announced yet.
         | 
         | I'll chuck out an unqualified estimate of EUR10k each, will
         | find out next year (probably) if I'm anywhere close!
        
       | riedel wrote:
       | >The Julich Supercomputing Centre, which is near Munich, will
       | host the system.
       | 
        | Interesting take on geography.
        | 
        | The confusion probably has to do with the fact that the German
        | tier-0 Gauss supercomputing center is actually spread over 3
        | sites (Julich near Cologne/Aachen, Stuttgart, and Garching near
        | Munich).
        
         | tremon wrote:
         | _Stuttgart and Garching near Munich_
         | 
         | This reads weird. It took me way too many seconds of wondering
         | "wouldn't Stuttgart be nearer to... Stuttgart?" before I
         | understood what you wrote. Sometimes the Oxford Comma has
         | value, it seems.
        
           | suyjuris wrote:
            | The name of the city is "Garching bei Munchen", which
            | translates to "Garching near Munich". This disambiguates it
            | from "Garching an der Alz". (Although Julich is just called
            | Julich.)
        
       ___________________________________________________________________
       (page generated 2023-10-06 23:01 UTC)