Abstract:
Numerical models are used extensively for simulating complex physical
systems including fluid flows, astronomical events, weather, and
climate. Many researchers struggle to bring their model developments
from single-computer, interpreted languages to parallel high-performance
computing (HPC) systems. There are initiatives to make interpreted
languages such as MATLAB, Python, and Julia feasible for HPC
programming. In this talk I argue that the computational overhead
is far costlier than any potential development time saved. Instead,
doing model development in C and unix tools from the start minimizes
porting headaches between platforms, reduces energy use on all
systems, and ensures reproducibility of results.


## brcon2020 - 2020-05-02

title: Energy efficient programming in science

author: Anders Damsgaard (adc)

contact: anders@adamsgaard.dk
         gopher://adamsgaard.dk
         https://adamsgaard.dk


## About me

* 33 y/o Dane
* #bitreich-en since 2019-12-16

present:

* postdoctoral scholar at Stanford University (US)
* lecturer at Aarhus University (DK)

previous:

* Danish Environmental Protection Agency (DK)
* Scripps Institution of Oceanography (US)
* National Oceanic and Atmospheric Administration (NOAA, US)
* Princeton University (US)

#pause
academic interests:

* ice sheets, glaciers, and climate
* earthquake and landslide physics
* modeling of fluid flows and granular materials


## Numerical modeling

* numerical models used for simulating complex physical systems

* n-body simulations: planetary formation, icebergs, soil/rock mechanics

* fluid flows (CFD): aerodynamics, weather, climate


* domains and physical processes split up into small, manageable chunks


## From idea to application


1. Construct system of equations

      |
      v

2. Derivation of numerical algorithm

      |
      v

3. Prototype in high-level language

      |
      v

4. Re-implementation in low-level language


## From idea to application

,-----------------------------------------------.
| 1. Construct system of equations              |
|                                               |
|                   |                           |
|                   v                           |        _
|                                               |   ___ | | __
| 2. Derivation of numerical algorithm          |  / _ \| |/ /
|                                               | | (_) |   <
|                   |                           |  \___/|_|\_\
|                   v                           |
|                                               |
| 3. Prototype in high-level language           |
`-----------------------------------------------'
                    |                                _       _
                    v                               | | ___ | | __
                                                    | |/ _ \| |/ /
  4. Re-implementation in low-level language       |_| (_) |   <
                                                    (_)\___/|_|\_\


## Numerical modeling

task: Solve partial differential equations (PDEs) by stepping through time
PDEs: conservation laws for mass, momentum, and enthalpy

example: Heat diffusion through a homogeneous medium

    ∂T
    -- = k ∇²T        (k: thermal diffusivity)
    ∂t

domain:

.---------------------------------------------------------------------.
|                                                                      |
|                                  T                                   |
|                                                                      |
'---------------------------------------------------------------------'

## Numerical modeling

domain: discretize into n=7 cells

.---------+---------+---------+---------+---------+---------+---------.
|         |         |         |         |         |         |         |
|   T₁    |   T₂    |   T₃    |   T₄    |   T₅    |   T₆    |   T₇    |
|         |         |         |         |         |         |         |
'---------+---------+---------+---------+---------+---------+---------'

#pause
* Numerical solution with high-level programming:

  MATLAB: sol = pdepe(0, @heat_pde, @heat_initial, @heat_bc, x, t)

  Python: fenics.solve(lhs==rhs, heat_pde, heat_bc)

  Julia:  sol = solve(heat_pde, CVODE_BDF(linear_solver=:Diagonal); rel_tol, abs_tol)

(the above are not entirely equivalent, but you get the point...)


## Numerical solution: Low-level programming

example BC: outer boundaries constant temperature (T₁ & T₇)

* computing ∇²(T)

        .---------+---------+---------+---------+---------+---------+---------.
        |         |         |         |         |         |         |         |
t       |   T₁    |   T₂    |   T₃    |   T₄    |   T₅    |   T₆    |   T₇    |
        |         |         |         |         |         |         |         |
        '----|--\-+----|--\-+-/--|--\-+-/--|--\-+-/--|--\-+-/--|----+-/--|----'
             |   \     |   \ /   |   \ /   |   \ /   |   \ /   |     /   |
             |    \    |    /    |    /    |    /    |    /    |    /    |
             |     \   |   / \   |   / \   |   / \   |   / \   |   /     |
        .----|----+-\--|--/-+-\--|--/-+-\--|--/-+-\--|--/-+-\--|--/-+----|----.
        |         |         |         |         |         |         |         |
t + dt  |   T₁    |   T₂    |   T₃    |   T₄    |   T₅    |   T₆    |   T₇    |
        |         |         |         |         |         |         |         |
        '---------+---------+---------+---------+---------+---------+---------'
        |<- dx ->|

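The stencil above is the standard central finite-difference approximation of
the Laplacian combined with a forward step in time; written out for an
interior cell i (T₁ and T₇ stay fixed), this is the update the loops on the
next slide implement:

    ∇²T ≈ (Tᵢ₋₁ - 2Tᵢ + Tᵢ₊₁) / dx²

    Tᵢ(t+dt) = Tᵢ(t) + k dt (Tᵢ₋₁(t) - 2Tᵢ(t) + Tᵢ₊₁(t)) / dx²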

## Numerical solution: Low-level programming

* explicit solution with central finite differences:

for (t=0.0; t<t_end; t+=dt) {
        for (i=1; i<n-1; i++)   /* T[0] and T[n-1] stay fixed (BC) */
                T_new[i] = T[i] + k*(T[i+1] - 2.0*T[i] + T[i-1])/(dx*dx) * dt;
        tmp = T;                /* swap fields for the next time step */
        T = T_new;
        T_new = tmp;
}

* note: only stable for dt <= dx²/(2k); an implicit scheme allows larger steps
#pause

* implicit, iterative solution with central finite differences:

for (t=0.0; t<t_end; t+=dt) {
        for (i=0; i<n; i++)
                T_new[i] = T[i];        /* initial guess: previous time step */
        do {
                r_norm_max = 0.0;
                for (i=1; i<n-1; i++) {
                        T_old = T_new[i];
                        T_new[i] = (T[i] + k*dt/(dx*dx)*(T_new[i+1] + T_new[i-1]))
                                   /(1.0 + 2.0*k*dt/(dx*dx));
                        if (fabs((T_new[i] - T_old)/T_old) > r_norm_max)
                                r_norm_max = fabs((T_new[i] - T_old)/T_old);
                }
        } while (r_norm_max > RTOL);    /* iterate until the sweep converges */
        tmp = T;
        T = T_new;
        T_new = tmp;
}

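Tying the pieces together: a minimal, self-contained C program for the
explicit variant, including the grid setup and the fixed-temperature
boundaries that the loops above assume. The cell count, diffusivity, boundary
values, and time step are illustrative choices, not taken from the original
codes:

#include <stdio.h>
#include <stdlib.h>

#define N 7     /* number of cells, as in the sketch above */

int
main(void)
{
        double *T, *T_new, *tmp;
        double k = 1.0;                 /* thermal diffusivity (illustrative) */
        double dx = 1.0/N;
        double t, dt, t_end = 1.0;
        int i;

        dt = 0.25*dx*dx/k;              /* well within the stability bound */

        T = malloc(N*sizeof(double));
        T_new = malloc(N*sizeof(double));

        for (i=0; i<N; i++)
                T[i] = T_new[i] = 0.0;  /* initial condition */
        T[0] = T_new[0] = 1.0;          /* fixed boundary T₁ */
        T[N-1] = T_new[N-1] = 0.0;      /* fixed boundary T₇ */

        for (t=0.0; t<t_end; t+=dt) {
                for (i=1; i<N-1; i++)
                        T_new[i] = T[i] + k*(T[i+1] - 2.0*T[i] + T[i-1])
                                   /(dx*dx)*dt;
                tmp = T; T = T_new; T_new = tmp;
        }

        for (i=0; i<N; i++)
                printf("%g\n", T[i]);

        free(T);
        free(T_new);
        return 0;
}

It builds with a single cc invocation, needs nothing beyond libc, and the
printed profile can be piped straight into gnuplot.
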

## HPC platforms

* Stagnation of CPU clock frequency

* Performance through massively parallel deployment (MPI, GPGPU)

* NOAA/DOE NCRC Gaea cluster
  * 2x Cray XC40, "Cray Linux Environment"
  * 4160 nodes, each 32 to 36 cores, 64 GB memory
  * InfiniBand
  * total: 200 TB memory, 32 PB SSD, 5.25 petaflops (peak)

## A (non-)solution

* high-level, interpreted code with extensive solver libraries -> low-level, compiled, parallel code

* suggested workaround: port interpreted high-level languages to HPC platforms

#pause

NO!

* high computational overhead
* multiplied across many machines
* reduced performance and energy efficiency


## A better way

1. Construct system of equations

      |
      v

2. Derivation of numerical algorithm

      |
      v

3. Prototype in low-level language

      |
      v

4. Add parallelization for HPC (see the MPI sketch below)

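In practice, step 4 is often little more than a domain decomposition and a
halo exchange wrapped around the same inner loop. A minimal MPI sketch for
the 1-D heat example from earlier; the problem size, diffusivity, boundary
values, and time step are illustrative and not taken from any of the codes
mentioned in this talk:

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

#define N_GLOBAL 64     /* total number of cells (illustrative) */

int
main(int argc, char *argv[])
{
        int rank, nranks, i, n, left, right;
        double *T, *T_new, *tmp;
        double k = 1.0e-2, dx, dt, t, t_end = 1.0;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nranks);

        n = N_GLOBAL/nranks;    /* local cells (assumes divisibility) */
        dx = 1.0/N_GLOBAL;
        dt = 0.25*dx*dx/k;      /* within the explicit stability bound */

        /* local fields with one ghost cell at each end */
        T = calloc(n + 2, sizeof(double));
        T_new = calloc(n + 2, sizeof(double));

        if (rank == 0)
                T[1] = T_new[1] = 1.0;  /* fixed warm boundary cell */

        left = (rank > 0) ? rank - 1 : MPI_PROC_NULL;
        right = (rank < nranks - 1) ? rank + 1 : MPI_PROC_NULL;

        for (t=0.0; t<t_end; t+=dt) {

                /* halo exchange: update ghost cells from the neighbors */
                MPI_Sendrecv(&T[1], 1, MPI_DOUBLE, left, 0,
                             &T[n+1], 1, MPI_DOUBLE, right, 0,
                             MPI_COMM_WORLD, MPI_STATUS_IGNORE);
                MPI_Sendrecv(&T[n], 1, MPI_DOUBLE, right, 1,
                             &T[0], 1, MPI_DOUBLE, left, 1,
                             MPI_COMM_WORLD, MPI_STATUS_IGNORE);

                /* same explicit update as in the serial version;
                 * the outermost physical cells stay fixed */
                for (i=1; i<=n; i++) {
                        if ((rank == 0 && i == 1) ||
                            (rank == nranks - 1 && i == n))
                                T_new[i] = T[i];
                        else
                                T_new[i] = T[i] + k*(T[i+1] - 2.0*T[i] + T[i-1])
                                           /(dx*dx)*dt;
                }

                tmp = T; T = T_new; T_new = tmp;
        }

        printf("rank %d: T[1] = %g\n", rank, T[1]);

        free(T);
        free(T_new);
        MPI_Finalize();
        return 0;
}

It compiles and runs with the usual MPI wrappers (e.g. mpicc and mpiexec)
and degrades gracefully to a single rank.
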

## Example: Ice-sheet flow with sediment/fluid modeling


--------------------------._____                         ATMOSPHERE
   ----->                       ```--..
ICE                                    `-._________________       __
   ----->      ------>                                     |vvvv| |vvv
                                              _________________| |__|
   ----->                           ,'
                                 ,'      <><     OCEAN
   ---->                      /    ><>
____________________________________/___________________________________
  SEDIMENT            -->
________________________________________________________________________

* example: granular dynamics and fluid flow simulation for glacier flow

* ~90% of the Antarctic ice sheet's mass is driven by ice flow over sediment

* need to understand basal sliding in order to project sea-level rise


## Algorithm matters

sphere: git://src.adamsgaard.dk/sphere
  C++, CUDA C, CMake, Python, ParaView
  massively parallel, GPGPU
  detailed physics
  20,191 LOC
#pause
  3 months of computing time on an Nvidia Tesla K40 (2880 CUDA cores)

#pause
* gained understanding of the mechanics (what matters and what doesn't)
* simplified the physics, algorithm, and numerics

#pause
1d_fd_simple_shear: git://src.adamsgaard.dk/1d_fd_simple_shear
  C99, makefiles, gnuplot
  single-threaded
  simple physics
  2,348 LOC
#pause
  real: 0m0.07s on a laptop from 2012

#pause
...guess which one is more portable?

## Summary

for numerical simulation:

* high-level languages
  * easy to get started
  * produce results quickly
  * do not develop low-level programming skills
  * no insight into the numerical algorithm
  * realistically speaking: no direct way to HPC

* low-level languages
  * require low-level skills
  * save electrical energy
  * go directly to HPC, just sprinkle some MPI on top


## Thanks

20h && /names #bitreich-en