Add more content - brcon2020_adc - my presentation for brcon2020
 (HTM) git clone git://src.adamsgaard.dk/.brcon2020_adc
       ---
 (DIR) commit 0a73d2e077f88526c454d3157a496bbdd627e4bc
 (DIR) parent 0440b2990984378674500cd0872456946751efef
 (HTM) Author: Anders Damsgaard <anders@adamsgaard.dk>
       Date:   Mon, 27 Apr 2020 22:16:48 +0200
       
       Add more content
       
       Diffstat:
         M brcon2020_adc.md                    |     181 +++++++++++++++++++------------
       
       1 file changed, 109 insertions(+), 72 deletions(-)
       ---
 (DIR) diff --git a/brcon2020_adc.md b/brcon2020_adc.md
       @@ -25,46 +25,37 @@ systems, and ensures reproducibility of results.
        
        ## About me
        
       +* 33 y/o Dane
       +* #bitreich-en since 2019-12-16
       +
        Present:
        
       -* 33 y/o Dane, linux/bsd user since 2001
       -* #bitreich-en since 2019-12-16
       -* EDITOR=vi
        * Postdoctoral scholar at Stanford University (US)
        * Lecturer at Aarhus University (DK)
        
       -#-
       -
       +#pause
        Previous:
        
       +* Danish Environmental Protection Agency (DK)
        * Scripps Institution of Oceanography (US)
        * National Oceanic and Atmospheric Administration (NOAA, US)
        * Princeton University (US)
        
       -#-
       -
       +#pause
        Academic interests:
        
        * ice sheets, glaciers, and climate
       -* earthquake physics and landslides
       +* earthquake and landslide physics
        * modeling of fluid flows and granular materials
        
       +
        ## Numerical modeling
        
       -* models used for complex physical systems (fluid flows, astronomical events, weather, climate)
       -* domains and physical processes split up into small, manageable chunks
       +* numerical models used for simulating complex physical systems
       +  * n-body simulations: granular materials, gravitational interaction
       +  * fluid flows: CFD, weather, climate
        
       -Numerical models are used extensively for simulating complex physical
       -systems including fluid flows, astronomical events, weather, and
       -climate.  Many researchers struggle to bring their model developments
       -from single-computer, interpreted languages to parallel high-performance
       -computing (HPC) systems.  There are initiatives to make interpreted
       -languages such as MATLAB, Python, and Julia feasible for HPC
       -programming.  In this talk I argue that the computational overhead
       -is far costlier than any potential development time saved.  Instead,
       -doing model development in C and unix tools from the start minimizes
       -porting headaches between platforms, reduces energy use on all
       -systems, and ensures reproducibility of results.
       +* domains and physical processes split up into small, manageable chunks
        
        
        ## Numerical modeling
       @@ -77,7 +68,8 @@ systems, and ensures reproducibility of results.
               ∂T
               -- = -k ∇² T
               ∂t
       -#-
       +#pause
       +
            domain:
        
               .---------------------------------------------------------------------.
       @@ -87,7 +79,7 @@ systems, and ensures reproducibility of results.
               '---------------------------------------------------------------------'
        
        
       -## Numerical solution
       +## Numerical modeling
        
              task: Solve partial differential equations (PDEs) by stepping through time
                    PDEs: conservation laws; mass, momentum, enthalpy
       @@ -105,8 +97,9 @@ systems, and ensures reproducibility of results.
               |    T₁   |    T₂   |    T₃   |    T₄   |    T₅   |    T₆   |    T₇   |
               |         |         |         |         |         |         |         |
               '---------+---------+---------+---------+---------+---------+---------'
       -#-
       -    MATLAB: sol = pdepe(0, @heat_pde, @heat_initial, @heat_bc, x, t);
       +
       +#pause
       +    MATLAB: sol = pdepe(0, @heat_pde, @heat_initial, @heat_bc, x, t)
        
            Python: fenics.solve(lhs==rhs, heat_pde, heat_bc)
        
       @@ -147,56 +140,43 @@ systems, and ensures reproducibility of results.
        
        ## Numerical solution (finite differences)
        
       -       .---------+---------+---------+---------+---------+---------+---------.
       -       |         |         |         |         |         |         |         |
       -  t    |    T₁   |    T₂   |    T₃   |    T₄   |    T₅   |    T₆   |    T₇   |
       -       |         |         |         |         |         |         |         |
       -       '----|--\-+----|--\-+-/--|--\-+-/--|--\-+-/--|--\-+-/--|----+-/--|----'
       -            |   \     |   \ /   |   \ /   |   \ /   |   \ /   |     /   |  
       -            |    \    |    /    |    /    |    /    |    /    |    /    |   
       -            |     \   |   / \   |   / \   |   / \   |   / \   |   /     |    
       -       .----|----+-\--|--/-+-\--|--/-+-\--|--/-+-\--|--/-+-\--|--/-+----|----.
       -       |         |         |         |         |         |         |         |
       -  t+dt |    T₁   |    T₂   |    T₃   |    T₄   |    T₅   |    T₆   |    T₇   |
       -       |         |         |         |         |         |         |         |
       -       '---------+---------+---------+---------+---------+---------+---------'
       -
            explicit solution with central finite differences:
        
                for (t=0.0; t<t_end; t+=dt) {
                    for (i=1; i<n-1; i++)
                        T_new[i] = T[i] - k*(T[i+1] - 2.0*T[i] + T[i-1])/(dx*dx) * dt;
       +            tmp = T;
       +            T = T_new;
       +            T_new = tmp;
                }
        
            iterative Jacobian solution with central finite differences:
        
                for (t=0.0; t<t_end; t+=dt) {
       -            for (i=1; i<n-1; i++)
       -                T_old[i] = T[i];
                    do {
                        for (i=1; i<n-1; i++) {
       -                    dT[i] = -k*(T[i+1] - 2.0*T[i] + T[i-1])/(dx*dx) * dt;
       -                    T_new[i] = T_old[i] + dT[i];
       +                    T_new[i] = -k*(T[i+1] - 2.0*T[i] + T[i-1])/(dx*dx) * dt;
       +                    r_norm_max = 0.0;
       +                    for (i=1; i<n-1; i++)
       +                        if (fabs((T_new[i] - T[i])/T[i]) > r_norm_max)
       +                            r_norm_max = fabs((T_new[i] - T[i])/T[i]);
       +                    tmp = T;
       +                    T = T_new;
       +                    T_new = tmp;
                        }
                    } while (r_norm_max < RTOL);
                }
        
       -## HPC platforms
       -
       -* Free lunch is over
       -* Parallelization is key
       -
       -
        
        ## From idea to application
        
        
       -    1. Conceptualization
       +    1. Construct system of equations
        
              |
              v
        
       -    2. Derivation of mathematical formulation
       +    2. Derivation of numerical algorithm
        
              |
              v
       @@ -212,12 +192,12 @@ systems, and ensures reproducibility of results.
        ## From idea to application
        
         ,-----------------------------------------------.
       - |  1. Conceptualization                         |
       + |  1. Construct system of equations             |
         |                                               |
         |    |                                          |
         |    v                                          |          _
         |                                               |     ___ | | __
       - |  2. Derivation of mathematical formulation    |    / _ \| |/ /
       + |  2. Derivation of numerical algorithm         |    / _ \| |/ /
         |                                               |   | (_) |   <
         |    |                                          |    \___/|_|\_\
         |    v                                          |
       @@ -231,25 +211,39 @@ systems, and ensures reproducibility of results.
                                                            (_)\___/|_|\_\
        
        
       -# Our scientific training includes learning how to make an solid idea,
       -# translate said idea into a set of equations, and how to implement it
       -# in high-level programming languages
       +% Our scientific training includes learning how to make a solid idea,
       +% translate said idea into a set of equations, and how to implement it
       +% in high-level programming languages
       +
       +% using high-level languages:
       +%  - quick development == quick results
       +%  - lose touch with numerical workings
       +%  - develop non-transferable skills
       +%  - code not transferable between platforms
       +%  - use of loop structures discouraged, library calls encouraged
       +
       +% using low-level languages:
       +%  - slower development == delayed results
       +%  - gain intimate familiarity with numerical workings
       +%  - develop transferable code and skills
       +%  - high computational performance when done right
        
       -# using high-level languages:
       -#  - quick development == quick results
       -#  - loose touch with numerical workings
       -#  - develop non-transferrable skills
       -#  - code not transferrable between platforms
       -#  - use of loop structures discouraged, library calls encouraged
       +% 4. apply the new algorithm to HPC
        
       -# using high-level languages:
       -#  - slower development == delayed results
       -#  - gain intimate familiarity with numerical workings
       -#  - develop transferrable code and skills
       -#  - high computational performance when done right
       +% requires basic C programming, usually no syscalls besides file IO
        
       -# requires basic C programming, usually no syscalls besides file IO
        
       +## HPC platforms
       +
       +* Stagnation of CPU clock frequency
       +
       +* Performance through massively parallel deployment (MPI, GPGPU)
       +
       +    * NOAA/NCRC Gaea cluster
       +        * 2x Cray XC40, "Cray Linux Environment"
       +        * 4160 nodes, each 32 to 36 cores, 64 GB memory
       +        * infiniband
       +        * total: 200 TB memory, 32 PB SSD, 5.25 petaflops (peak)
        
        ## Scaling problem
        
       @@ -258,19 +252,62 @@ New algorithms hard to implement in HPC codes
        
        ## A (non-)solution
        
       -Port/apply high-level languages to HPC platforms
       +* Suggested workaround: Apply interpreted high-level languages to HPC platforms
        
       -high overhead on many machines -> substantially lower performance and energy efficiency
       +#pause
       +
       +NO!
       +
       +* high computational overhead 
       +* many machines
       +* reduced performance and energy efficiency
        
        ## Measuring computational energy use
        
        
        
       +## Algorithm matters
       +
       +* example: granular dynamics and fluid flow simulation for glacier flow
       +
       +                sphere: git://src.adamsgaard.dk/sphere
       +                        C++, Nvidia C, cmake, Python, Paraview
       +                        massively parallel, GPGPU
       +                        detailed physics
       +#pause
       +                        3 months of computing time on Nvidia Tesla K40 (2880 cores)
       +
       +#pause
       +* gained understanding of the mechanics (what matters and what doesn't)
       +* simplify the physics, algorithm, and numerics
       +
       +#pause
       +    1d_fd_simple_shear: git://src.adamsgaard.dk/1d_fd_simple_shear
       +                        C99, makefiles, gnuplot
       +                        single threaded
       +                        simple physics
       +#pause
       +                        real: 0m00.07 s on potato laptop from 2012
       +
       +#pause
       +                        ...guess which one is portable?
       +
        ## Summary
        
       -* Programming in low-level languages during prototyping can save energy and frustration
       +for numerical simulation:
       +
       +* high-level languages
       +        * easy
       +        * produces results quickly
       +        * no insight into numerical algorithm
       +        * no direct way to HPC
       +
       +* low-level languages
       +        * requires low-level skills
       +        * saves electrical energy
       +        * directly to HPC
        
        
        ## Thanks
        
       -    20h & Freenode/#bitreich-en
       +    20h && /names #bitreich-en
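
Note on the explicit finite-difference loop touched by this commit: the added `tmp = T; T = T_new; T_new = tmp;` lines swap buffer pointers after each time step instead of copying the array. The following is a minimal, self-contained sketch of that idea, not the author's code. The function names `heat_step` and `heat_solve`, the `scratch` buffer, and the fixed-value boundary treatment are assumptions made for illustration. The sketch also uses the conventional sign, dT/dt = kappa * d2T/dx2 with kappa > 0, which corresponds to the slide's update with k = -kappa.

```c
#include <assert.h>
#include <stddef.h>

/* One forward-Euler step of dT/dt = kappa * d2T/dx2 on n cells.
 * The end cells T[0] and T[n-1] are held fixed (assumed Dirichlet
 * boundaries; the slides do not specify boundary conditions). */
void
heat_step(const double *T, double *T_new, size_t n,
          double kappa, double dx, double dt)
{
	size_t i;

	T_new[0] = T[0];
	T_new[n-1] = T[n-1];
	for (i = 1; i < n - 1; i++)
		T_new[i] = T[i]
		         + kappa*(T[i+1] - 2.0*T[i] + T[i-1])/(dx*dx)*dt;
}

/* Step T forward to t_end, swapping buffer pointers after every step
 * instead of copying. scratch must hold n doubles. The result is left
 * in T; the return value is the number of steps taken. */
size_t
heat_solve(double *T, double *scratch, size_t n,
           double kappa, double dx, double dt, double t_end)
{
	double *cur = T, *next = scratch, *tmp;
	size_t i, step, nsteps;

	assert(n >= 3);
	/* the explicit scheme is only stable for kappa*dt/dx^2 <= 1/2 */
	assert(kappa*dt/(dx*dx) <= 0.5);

	/* count steps as an integer to avoid drift from repeated t += dt */
	nsteps = (size_t)(t_end/dt + 0.5);
	for (step = 0; step < nsteps; step++) {
		heat_step(cur, next, n, kappa, dx, dt);
		tmp = cur;
		cur = next;
		next = tmp;
	}
	/* after an odd number of steps the result lives in scratch */
	if (cur != T)
		for (i = 0; i < n; i++)
			T[i] = cur[i];
	return nsteps;
}
```

Calling `heat_solve(T, scratch, 11, 1.0, 0.1, 0.004, 1.0)` on a rod that starts at zero with its right end held at 1 should relax towards the linear steady-state profile. The stability assertion is the main design choice here: an explicit scheme fails silently with a too-large `dt`, so the sketch refuses to run rather than return garbage.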