[HN Gopher] Introduction to High-Performance Scientific Computing
       ___________________________________________________________________
        
       Introduction to High-Performance Scientific Computing
        
       Author : HaoZeke
       Score  : 125 points
       Date   : 2021-10-13 08:59 UTC (14 hours ago)
        
 (HTM) web link (pages.tacc.utexas.edu)
 (TXT) w3m dump (pages.tacc.utexas.edu)
        
       | idiot900 wrote:
       | Said it before and will say it again: I used TACC systems a good
       | deal in graduate school and the folks running their systems are
       | fantastic.
        
         | rch wrote:
         | Had access via a research hospital in Houston, and I couldn't
         | agree more.
        
       | huitzitziltzin wrote:
       | This is a great resource IMO!
        
       | [deleted]
        
       | OldHand2018 wrote:
       | A very timely submission!
       | 
       | I've been looking into performance optimizations on heterogeneous
       | multicore systems and much of what I've seen published recently
       | on Arxiv seems to point to tasks, their granularity and their
       | scheduling as increasingly important.
       | 
       | This book mentions but doesn't spend a lot of space on these
       | subjects. It will be very interesting to see how it all evolves.
        
         | jandrewrogers wrote:
         | A book could probably be written on the topic of latency-hiding
         | schedule design alone, and I am not aware of a canonical
         | resource that gives it a thorough treatment. The case of
         | regular hardware/software parallelism is trivial, but it
         | becomes interesting and very complex once you introduce
         | irregular hardware parallelism (e.g. heterogeneous compute
         | elements) or irregular software parallelism (e.g. variable and
         | unpredictable task concurrency -- graph analysis often has this
         | characteristic). The optimal number of tasks any scheduler
         | deals with always falls in a bounded range; trying to keep the
         | number of immediate tasks within that range when the tasks are
         | generated unpredictably and non-locally is non-trivial.
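          | 
          | (A minimal sketch of the irregular case, in C with OpenMP
          | tasks; the CSR graph type and the cutoff value are
          | illustrative, not from any particular codebase. Each vertex
          | spawns a data-dependent number of child tasks, and the depth
          | cutoff keeps the number of in-flight tasks in a bounded
          | range.)
          | 
          |   #include <omp.h>
          | 
          |   /* Hypothetical CSR adjacency structure. */
          |   typedef struct {
          |       int n;
          |       const int *rowptr;
          |       const int *col;
          |   } graph_t;
          | 
          |   #define TASK_DEPTH_CUTOFF 8   /* bounds in-flight tasks */
          | 
          |   static void visit(const graph_t *g, int v, char *seen,
          |                     int depth) {
          |       /* Atomic test-and-set: each vertex is claimed once. */
          |       if (__atomic_test_and_set(&seen[v], __ATOMIC_ACQ_REL))
          |           return;
          |       /* ... per-vertex work here ... */
          |       for (int e = g->rowptr[v]; e < g->rowptr[v + 1]; e++) {
          |           int w = g->col[e];
          |           if (depth < TASK_DEPTH_CUTOFF) {
          |               /* unpredictable, non-local task generation */
          |               #pragma omp task firstprivate(w, depth)
          |               visit(g, w, seen, depth + 1);
          |           } else {
          |               visit(g, w, seen, depth + 1); /* serial leaf */
          |           }
          |       }
          |   }
          | 
          |   void traverse(const graph_t *g, int root, char *seen) {
          |       #pragma omp parallel
          |       #pragma omp single nowait  /* one producer thread */
          |       visit(g, root, seen, 0);
          |       /* implicit barrier: all pending tasks drain here */
          |   }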
         | 
          | It isn't enough that the task scheduler is adaptive to
          | irregular hardware and software parallelism; in HPC you are
          | effectively running many task schedulers in parallel, each
          | managing its local compute environment and interacting with
          | the others. You sort of need a "meta-scheduler" to coordinate
          | the dynamic behavior across all the schedulers so they don't
          | adversely affect each other, but that is not scalable. An
          | alternative approach I've often seen is adding a game-theoretic
          | context to task schedulers, each tacitly modeling the expected
          | dynamic behavior of the other schedulers it interacts with.
          | This lets schedulers optimize their aggregate behavior without
          | explicitly coordinating state, which is a big win for
          | scalability. In the HPC context a robust and nearly optimal
          | equilibrium can sometimes be achieved. In the ideal case you
          | can prove resource requirements for an individual scheduler
          | that guarantee well-bounded worst-case behaviors.
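          | 
          | (A toy sketch in C of the tacit-modeling idea; the names and
          | the decision rule are illustrative, not from any real
          | scheduler. Each local scheduler acts on a decaying estimate
          | of peer load instead of querying peers synchronously.)
          | 
          |   /* Hypothetical node-local scheduler state. */
          |   typedef struct {
          |       double my_load;       /* observed local queue pressure */
          |       double peer_load_est; /* modeled, never queried live   */
          |       double offload_cost;  /* expected cost to ship a task  */
          |   } sched_model_t;
          | 
          |   /* Best response under the model: offload only when the
          |      modeled gain clearly beats the transfer cost. The margin
          |      adds hysteresis so every scheduler doesn't dogpile the
          |      same apparently-idle peer at once. */
          |   static int should_offload(const sched_model_t *m,
          |                             double margin) {
          |       return m->my_load - m->peer_load_est
          |              > m->offload_cost + margin;
          |   }
          | 
          |   /* Between infrequent, stale gossip updates the peer
          |      estimate decays toward a prior, so no scheduler ever
          |      blocks on another scheduler's state. */
          |   static void decay_estimate(sched_model_t *m, double prior,
          |                              double alpha) {
          |       m->peer_load_est = alpha * m->peer_load_est
          |                        + (1.0 - alpha) * prior;
          |   }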
         | 
         | In HPC the topology of the schedulers is essentially fixed
         | (i.e. you know what hardware you are working with) but there is
         | an even more difficult flavor of the same latency-hiding task
         | scheduling problem when the execution environment can have a
          | variable topology, i.e. nodes appear and disappear in random
         | places.
         | 
          | There is still a lot of opportunity for interesting research on
         | this topic.
        
           | OldHand2018 wrote:
           | You know, I have noticed some of this when going over
           | research papers, but I don't think I'm fully appreciating the
           | significance. I appreciate the response, although it is going
           | to force me to do more research and do a lot more thinking ;)
        
       | joe_the_user wrote:
       | So, is deep learning eating HPC's lunch?
       | 
       | And if it is, how much of this involves deep learning buying
        | compute at a cheaper rate than HPC?
       | 
       | Most serious HPC involves simulating something (weather, atomic
       | particles, car crashes). Lately, there has been a lot of work
       | using neural networks to approximate such simulations more
       | effectively. But would these make sense if the HPC program ran on
        | GPUs to start with?
        
         | anon946 wrote:
         | This material is aimed at those who will be
         | developing/implementing HPC software. HPC techniques are also
         | used for ML approaches, since deep learning also requires a
          | massive amount of computation, and benefits greatly from
          | massive parallelization. GPUs are common in HPC, for DL and
          | also other computations.
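          | 
          | (To make the overlap concrete, a minimal C/MPI sketch with an
          | illustrative parameter count: data-parallel training averages
          | per-worker gradients with the same MPI_Allreduce collective
          | that dominates many classic HPC codes.)
          | 
          |   #include <mpi.h>
          | 
          |   #define NPARAM 1024  /* illustrative model size */
          | 
          |   /* Sum each rank's local gradients in place, then scale
          |      down to the global mean. */
          |   static void average_gradients(float *grad) {
          |       int nranks;
          |       MPI_Comm_size(MPI_COMM_WORLD, &nranks);
          |       MPI_Allreduce(MPI_IN_PLACE, grad, NPARAM, MPI_FLOAT,
          |                     MPI_SUM, MPI_COMM_WORLD);
          |       for (int i = 0; i < NPARAM; i++)
          |           grad[i] /= (float)nranks;
          |   }
          | 
          |   int main(int argc, char **argv) {
          |       MPI_Init(&argc, &argv);
          |       float grad[NPARAM] = {0}; /* local gradients go here */
          |       average_gradients(grad);
          |       MPI_Finalize();
          |       return 0;
          |   }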
         | 
         | That said, there certainly is a great interest in the
         | scientific computing community in ML/DL.
        
       | 58x14 wrote:
       | I'd like to take a moment to appreciate how utterly monumental it
       | is to have free, instant access to such human ingenuity. This is
       | now first on my reading list.
       | 
        | I'm working on an HPC company where everyone is an order or two
       | magnitude smarter than me. It's fun and overwhelming. If anyone
       | has a recommended study plan for this subject (I have an
        | unstructured background in CS) or 'lighter' complementary
       | resources, I'd be grateful to read them.
        
       | fennecfoxen wrote:
       | > * First of all, as of this writing (late 2010), GPUs are
       | attached processors, for instance over a PCI-X bus, so any data
       | they operate on has to be transferred from the CPU.
       | 
       | I think we got GPUDirect RDMA circa 2013. How time flies!
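        | 
        | (A minimal sketch in C against the CUDA runtime API, payload
        | size illustrative and error checking omitted, that times the
        | explicit host-to-device copy the book describes. GPUDirect RDMA
        | later let NICs and peer devices write GPU memory directly,
        | skipping the host bounce buffer.)
        | 
        |   #include <cuda_runtime.h>
        |   #include <stdio.h>
        | 
        |   int main(void) {
        |       size_t bytes = (size_t)64 << 20;    /* 64 MB payload */
        |       float *h, *d;
        |       cudaMallocHost((void **)&h, bytes); /* pinned host mem */
        |       cudaMalloc((void **)&d, bytes);
        | 
        |       cudaEvent_t t0, t1;
        |       cudaEventCreate(&t0);
        |       cudaEventCreate(&t1);
        | 
        |       /* Every byte the device touches crosses the bus. */
        |       cudaEventRecord(t0, 0);
        |       cudaMemcpy(d, h, bytes, cudaMemcpyHostToDevice);
        |       cudaEventRecord(t1, 0);
        |       cudaEventSynchronize(t1);
        | 
        |       float ms;
        |       cudaEventElapsedTime(&ms, t0, t1);
        |       printf("H2D: %.2f ms (%.1f GB/s)\n", ms,
        |              bytes / ms / 1e6);
        | 
        |       cudaFree(d);
        |       cudaFreeHost(h);
        |       return 0;
        |   }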
        
         | jxy wrote:
         | try more, measure more, and think about throughput and latency
         | 
         | nothing really changes
        
       | waynesonfire wrote:
        | how has the enterprise big data ecosystem (hadoop, spark,
        | perhaps even more recently, cloud-based stuff such as snowflake)
       | influenced HPC?
        
       | m4r35n357 wrote:
       | Great stuff!
        
       | reagank wrote:
        | I've had a lot of dealings with the folks at TACC in my day job,
       | and their work is pretty amazing. Add in a Top10 supercomputer
       | and you have something pretty impressive.
        
       ___________________________________________________________________
       (page generated 2021-10-13 23:01 UTC)