[HN Gopher] Why Fugaku, Japan's fastest supercomputer, went virt...
       ___________________________________________________________________
        
       Why Fugaku, Japan's fastest supercomputer, went virtual on AWS
        
       Author : panrobo
       Score  : 77 points
       Date   : 2024-04-18 12:02 UTC (10 hours ago)
        
 (HTM) web link (aws.amazon.com)
 (TXT) w3m dump (aws.amazon.com)
        
       | echelon wrote:
       | I can't even imagine the hourly bill. This seems like a great way
       | for Amazon to gobble up institutional research budgets.
        
         | KaiserPro wrote:
         | > I can't even imagine the hourly bill
         | 
          | Yes, and no. As a replacement for a reasonably well-used
          | cluster, it's going to be super expensive. Something like 3-5x
          | the price.
         | 
          | For providing burst capacity, it's significantly
          | cheaper/faster, and can be billed directly.
         | 
          | But would I recommend running your stuff directly on AWS for
          | compute-heavy batch processing? No. If you are thinking about
          | hiring machines in a datacenter, then yes.
        
           | littlestymaar wrote:
            | > For providing burst capacity, it's significantly
            | cheaper/faster, and can be billed directly.
           | 
           | Cloud is cheaper for such a workload, yes. But you still
           | wouldn't want to pay the AWS premium for that.
           | 
           | But I guess nobody ever got fired for choosing AWS.
        
             | KaiserPro wrote:
             | Depends.
             | 
              | If you want high-speed storage that's reliable and zero
              | cost to set up (ie doesn't require staff to do it), then
              | AWS and spot instances are very much a cheap option.
             | 
              | If you have in-house expertise and can knock up a reliable
              | high-speed storage cluster (no, porting to S3 is not the
              | right option; that takes years and means you can't edit
              | files in place), then running on one of the other cloud
              | providers is an option.
             | 
             | But they often can't cope with the scale, or don't provide
             | the right support.
        
               | littlestymaar wrote:
                | > zero cost to set up (ie doesn't require staff to do it)
               | 
                | Ah, the fallacy that the cloud magically works without
                | requiring staff. Funnily enough, every time I've heard
                | this IRL, it was coming from dudes who were in fact paid
                | to manage AWS instances and services for their company as
                | their day job, but who somehow never counted their own
                | salaries or those of their colleagues.
               | 
               | > But they often can't cope with the scale
               | 
                | Most people vastly overestimate the scale they need, or
                | underestimate the scale of even second-tier cloud
                | providers; there aren't many workloads that would cause
                | storage trouble anywhere. For the record, a single rack
                | can host 5-10 PB; how many people need multiple exabytes
                | of storage again?
               | 
               | > or don't provide the right support.
               | 
               | I've never been a whole-data-centers-scale customer but
               | my experience with AWS support didn't leave me in awe.
        
           | moralestapia wrote:
            | I currently work in a research group that does all of its
            | bioinformatics analysis in AWS. The bill is expensive, sure,
            | 6 figures regularly, maybe 7 over its lifetime, but
            | provisioning our own cluster (let alone a supercomputer)
            | would've been way more costly and time-consuming. You'd also
            | need people to maintain it, etc... And at the end of the
            | day, it could well be the case that you still end up using
            | AWS for something (hosting, storage, w/e).
           | 
           | I think it's a W, so far.
        
             | fock wrote:
              | Because we essentially did spend exactly that to build a
              | cluster: what amount of resources are you using
              | (core-hours/instances/storage)?
        
               | jasonjmcghee wrote:
               | There are tradeoffs in both cases.
               | 
                | Some issues with building a cluster: you're locked into
                | the tech you bought, you need space and the expertise to
                | manage it, and it costs money to run/maintain.
                | 
                | You can potentially recoup some of that investment by
                | selling it later, though it's usually a quickly
                | depreciating asset.
               | 
               | But, no AWS or AWS premium.
        
               | littlestymaar wrote:
                | There's also a third way: going with intermediate-size
                | cloud providers. You avoid both the capex and having
                | actual hardware to deal with, without the AWS premium. I
                | don't understand why so many people act as if the only
                | alternatives were AWS or self-hosted.
        
               | moralestapia wrote:
                | I never tried to suggest that, but tbh, the value-added
                | services on both AWS and GCP are hard to emulate, and
                | they "just work" already.
               | 
               | Sure, I could spend a few weeks (months?) compiling and
               | trying some random GNU-certified "alternatives" for Cloud
               | Services but ... just nah ...
        
               | fock wrote:
                | The main value is that you can pay money to have decent
                | network and storage with the hyperscalers, which none of
                | the intermediate-size cloud providers offer, to my
                | knowledge.
               | 
               | > I could spend a few weeks (months?) compiling
               | 
               | why would you compile infra-stuff?! Usually this is
               | nicely packaged...
               | 
               | > random GNU-certified "alternatives" for Cloud Services
               | but
               | 
                | For application software you have to take care of this
                | in the cloud as well, or are you just using the one
                | precompiled environment they offer? Which cloud services
                | do you need anyway? In our area (MatSci), things are
                | very, very POSIX-centered and that is simple to set up.
        
               | moralestapia wrote:
                | > Usually this is nicely packaged...
               | 
               | That has almost never been my experience with
               | "alternatives" but if you can provide a few links I would
               | like to learn about them.
        
               | littlestymaar wrote:
               | What kind of "services" do they provide that provide
               | value in the kind of HPC setting you were talking about?
               | 
               | > trying some random GNU-certified "alternatives" for
               | Cloud Services but ... just nah ...
               | 
               | A significant fraction, if not the majority, of those
               | services are in fact hosted versions of open-source
                | projects[1], so there's no need to be snobbish.
                | 
                | [1]: and that's why things like Terraform and Redis have
                | gone source-available recently, to fight against cloud
                | vendors' parasitic behavior.
        
               | fock wrote:
               | I guess having built the physical system I am aware of
               | the tradeoffs... Though due to the funding structure and
               | 0-cost colocation (for our unit), there was not a lot to
               | be discussed and thus I'd be interested in actual numbers
               | for comparison!
        
             | KaiserPro wrote:
             | For a startup I ran, we too built an HPC cluster on AWS. It
             | was very much a big win, as sourcing and installing that
             | amount of kit would have been far too much upfront.
             | 
              | I really should have mentioned your (our) use case: a
              | smallish department without dedicated access to a cluster
              | or to people who can run a real-steel cluster.
             | 
              | AWS's one-click Lustre cluster service (FSx) combined with
              | Batch gives you a really quick and powerful basic HPC
              | cluster.
             | 
              | Before we got bought out, we had planned to create a "base
              | load" of real-steel machines to process the continuous
              | jobs, and to spin up the rest of the cluster as and when
              | demand required.
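              | 
              | Something like this boto3 sketch, as a rough idea (resource
              | names, subnets and sizes are placeholders, not our actual
              | setup):
              | 
              |   import boto3
              | 
              |   # Scratch FSx for Lustre filesystem (1200 GiB minimum)
              |   fsx = boto3.client("fsx")
              |   fs = fsx.create_file_system(
              |       FileSystemType="LUSTRE",
              |       StorageCapacity=1200,
              |       SubnetIds=["subnet-PLACEHOLDER"],
              |       LustreConfiguration={"DeploymentType": "SCRATCH_2"},
              |   )
              | 
              |   # Submit work to an existing Batch queue/job definition
              |   batch = boto3.client("batch")
              |   batch.submit_job(
              |       jobName="hpc-test",
              |       jobQueue="my-queue",
              |       jobDefinition="my-job-def",
              |       containerOverrides={"command": ["./run_sim.sh"]},
              |   )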
        
         | prmoustache wrote:
          | I guess it really depends on how many hours per year Fugaku is
          | used at its max capacity. Also, in this case, it could grow
          | progressively.
        
           | Something1234 wrote:
            | Someone needs to share utilization data on these
            | supercomputer clusters. Most have a long queue of jobs
            | waiting to be run, and you have to specify how long you think
            | your job is gonna take so it can be scheduled properly.
        
             | electroly wrote:
             | From the article:
             | 
             | > When Fugaku was running on premises, it was "at 95
             | percent capacity and constantly full," said Dr. Matsuoka.
        
             | ptheywood wrote:
             | Fugaku has a public dashboard (Grafana)
             | 
             | https://status.fugaku.r-ccs.riken.jp/
        
         | exe34 wrote:
         | At this scale, I would expect they have better terms than the
         | retail pricing.
        
       | edu wrote:
        | The title is a little bit misleading; Fugaku still exists
        | physically. This project is about being able to replicate the
        | software environment used on Fugaku on AWS.
        
         | chasil wrote:
          | Is this running on Graviton, or Fujitsu's custom ARM?
        
           | my123 wrote:
           | Graviton3
        
       | sheepscreek wrote:
       | They haven't said a word about costs. Does it cost as
       | much/less/more?
       | 
        | One challenge would be deciding on the unit for comparing costs.
        | They could pick cost per peta/exaFLOP.
        
       | throw0101b wrote:
       | If anyone wants to dabble with (HPC) compute clusters,
       | ElastiCluster is a handy tool to spin up nodes using various
       | cloud APIs:
       | 
       | * https://elasticluster.github.io/elasticluster/
        
       | aconz2 wrote:
       | The goal seems to be: "Virtual Fugaku provides a way to run
       | software developed for the supercomputer, unchanged, on the AWS
       | Cloud". Is AWS running A64FX's? Is there more info on all the
       | details of this somewhere? I think it is a compelling goal to
       | have portability across compute and am curious how they are going
       | about it and what tradeoffs they've made. Software portability is
       | one level of hard and performance portability is another. I wish
       | we could get an OCI spec for running a job that could fill this
        | purpose.
        
         | tame3902 wrote:
         | This article has some more details:
         | 
         | https://www.hpcwire.com/2023/01/26/riken-deploys-virtual-fug...
         | 
          | It looks like they want to make the transition between Fugaku
          | and AWS, in either direction, easier.
        
       | glial wrote:
       | The article doesn't mention this, but I imagine having an AWS
       | version of the supercomputer would be extremely helpful for
       | software development and performance optimization, especially if
       | the same code could be tested with fewer nodes.
        
       | adev_ wrote:
        | There is a big point of interest in what Fugaku and Dr. Matsuoka
        | are doing here, and it seems that this article misses it
        | entirely.
       | 
       | HPC development is not your standard dev workflow where your
       | software can be easily developed and tested locally on your
       | laptop.
       | 
        | Most software requires a proper MPI environment with a parallel
        | file system and (often) a beefy GPU.
       | 
        | Most development on a supercomputer is done on a debug partition:
        | a small subset of the supercomputer reserved for interactive
        | usage. That allows you to test the scalability of the program,
        | hunt Heisenbugs related to concurrency, access large datasets,
        | etc...
       | 
        | But debug partitions are problematic: make them too small and
        | your scientists & devs lose productivity; make them too large and
        | you are wasting your supercomputer resources on something other
        | than production jobs.
       | 
        | The Cloud solves this issue. You can spawn your cluster, debug
        | and test your jobs, then destroy your cluster. You do not need
        | very large scale or extreme performance; you need flexibility,
        | isolation and interactivity. The Cloud gives you that thanks to
        | virtualization.
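        | 
        | As a very rough sketch of that spawn/test/destroy loop with boto3
        | (AMI, instance type and count are placeholders; in practice you
        | would probably drive this through something like ParallelCluster
        | rather than raw EC2 calls):
        | 
        |   import boto3
        | 
        |   ec2 = boto3.client("ec2")
        | 
        |   # Spawn a small throwaway "debug partition"
        |   run = ec2.run_instances(
        |       ImageId="ami-PLACEHOLDER",      # image with your MPI stack
        |       InstanceType="hpc7g.16xlarge",  # e.g. a Graviton HPC type
        |       MinCount=4,
        |       MaxCount=4,
        |   )
        |   ids = [i["InstanceId"] for i in run["Instances"]]
        | 
        |   # ... run the interactive/debug jobs against these nodes ...
        | 
        |   # Destroy it as soon as the debug session is over
        |   ec2.terminate_instances(InstanceIds=ids)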
        
         | bch wrote:
          | > Most software requires a proper MPI environment with a
          | parallel file system and (often) a beefy GPU
         | 
         | I'm but a tourist in this domain, but can you dig into this a
         | bit more and compare/contrast w "traditional" development? I
         | presume the MPI you're talking about is OpenMPI or MPICH, which
         | need to be dealt with directly - but what are the
         | considerations/requirements for a parallel FS? Hardware is
         | hardware, and I guess what you're saying re: GPUs is that you
         | can't fake The Real Thing (fair enough), but what other
         | interesting "quirks" do you run into in this HPC env vs non-
         | HPC?
        
           | selimnairb wrote:
           | Lots of legacy HPC code assumes POSIX file I/O, which means a
           | parallel file system, which means getting the interconnect
           | topology right, which is not easy.
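            | 
            | As a tiny sketch of that pattern (the shared path is a
            | placeholder): each MPI rank does plain POSIX writes into the
            | same directory, so once ranks span nodes that directory has
            | to be a shared parallel filesystem such as Lustre.
            | 
            |   from mpi4py import MPI
            | 
            |   rank = MPI.COMM_WORLD.Get_rank()
            | 
            |   # Plain POSIX I/O: every rank writes into the same
            |   # directory, which must be visible from every node.
            |   with open(f"/shared/scratch/out.{rank:05d}", "w") as f:
            |       f.write(f"rank {rank} done\n")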
        
           | trueismywork wrote:
            | Arguably the single most important feature of supercomputers
            | is the guaranteed low-latency interconnect between any two
            | nodes, which ensures high performance even for very talkative
            | workloads. This is why supercomputers document their network
            | topology so thoroughly and allow people to reserve nodes in a
            | single rack, for example.
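            | 
            | The classic way to see this is a two-rank ping-pong; a
            | minimal mpi4py sketch (iteration count and message size are
            | arbitrary), run with e.g. "mpiexec -n 2 python pingpong.py",
            | where the number you get is dominated by where the two ranks
            | land in the topology:
            | 
            |   from mpi4py import MPI
            |   import numpy as np
            | 
            |   comm = MPI.COMM_WORLD
            |   rank = comm.Get_rank()
            |   buf = np.zeros(1, dtype="b")   # 1-byte message
            |   reps = 1000
            | 
            |   comm.Barrier()
            |   t0 = MPI.Wtime()
            |   for _ in range(reps):
            |       if rank == 0:
            |           comm.Send(buf, dest=1)
            |           comm.Recv(buf, source=1)
            |       elif rank == 1:
            |           comm.Recv(buf, source=0)
            |           comm.Send(buf, dest=0)
            |   t1 = MPI.Wtime()
            | 
            |   if rank == 0:
            |       print("avg round trip:", (t1 - t0) / reps, "s")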
        
           | gyrovagueGeist wrote:
           | The interconnect and network topology is also a big component
           | of the hardware where you can't "fake The Real Thing" in
            | practice. You can often get fairly confident in program
            | correctness for toy problem runs by scaling from 1 to ~40
            | ranks on your local machine, but you can't tell much about
            | the performance until you start running on a real distributed
            | system where you can see how much your communication pattern
            | stresses the cluster.
           | 
            | Or if you run into bugs / crashes that need 1000s of
            | processes or a full-scale problem instance to reproduce, god
            | help you and your SLURM queue times.
        
       | astrodust wrote:
       | Today "supercomputer" translates to "really high AWS spend".
        
       ___________________________________________________________________
       (page generated 2024-04-18 23:01 UTC)