Sustainability and Efficiency for Simulation Software in the Exascale Era


1 Sustainability and Efficiency for Simulation Software in the Exascale Era
Dominik Thönnes, Ulrich Rüde, Nils Kohl
Chair for System Simulation, University of Erlangen-Nürnberg
March 09, 2018, SIAM Conference on Parallel Processing for Scientific Computing, Tokyo, Japan
Joint work with Dominik Bartuschat (FAU); Daniel Drzisga, Markus Huber, Barbara Wohlmuth (TUM); Simon Bauer, Marcus Mohr, Hans-Peter Bunge (LMU)

2 TerraNeo Project
- Motivated by simulating Earth mantle convection
- Triangular/tetrahedral meshes allow modeling of complex geometries
- Structured refinement enables the use of matrix-free methods
- Fully distributed data structures allow optimal scalability
- Supports different discretizations, e.g. first-order and higher-order finite elements, finite volumes

3 TerraNeo Project

4 TerraNeo Project

5 Abstraction Data - Topology
Diagram: macro primitives separate data from topology.
- Data: intra-primitive building blocks
- Topology: inter-primitive building blocks
- Further labels in the diagram: Serialization (Buffer / File), Communication, (Simulation) Data, Calculations, Load Balancing, Neighborhood

6 From mesh to primitives (2D)
Flow diagram, labels in approximate order:
- Input Mesh (Generate Mesh, Mesh File, Library)
- Create Primitives (vertices, edges, faces)
- Load balancing
- Distribution to Rank 0, Rank 1 (local primitives, neighborhood)
- Setup Domain
- Fully Distributed Domain

7 Load Balancing 2D (Faces)
Panels: Round-Robin, ParMETIS, Greedy

8 Load Balancing 2D (Edges)
Panels: Round-Robin, ParMETIS

9 Load Balancing 3D (Tetrahedra)
Panels: Round-Robin, ParMETIS, Greedy
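The simplest of the strategies compared on these slides, round-robin, can be sketched in a few lines. This is a minimal illustration only: the function name `roundRobinAssignment` and its signature are assumptions made here, not part of the presented framework, and the sketch ignores neighborhood information entirely (which is what a graph partitioner such as ParMETIS takes into account).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: primitive i is assigned to rank (i mod numRanks).
// Returns one target rank per primitive, in input order.
std::vector<int> roundRobinAssignment(std::size_t numPrimitives, int numRanks)
{
    assert(numRanks > 0);
    std::vector<int> targetRank(numPrimitives);
    for (std::size_t i = 0; i < numPrimitives; ++i)
        targetRank[i] = static_cast<int>(i % static_cast<std::size_t>(numRanks));
    return targetRank;
}
```

Round-robin balances the number of primitives per rank but pays no attention to which primitives are neighbors, so adjacent primitives typically end up on different ranks.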

10 Data Handling
Macro primitive types: Vertex, Edge, Face
Lightweight metadata:
- globally unique ID
- direct neighborhood (IDs)
- geometric information (e.g. vertex coordinates, orientation, ...)
Registered / allocated data (arbitrary data structures, the actual simulation data):
- P1 FE
- P2 FE
- Flag Field
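The split between lightweight metadata and registered data could look roughly as follows. This is a sketch under stated assumptions: the class `Primitive`, its members, and the string-keyed `std::any` registry are invented here for illustration (C++17) and do not reproduce the framework's actual interface.

```cpp
#include <any>
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch: a macro primitive holds cheap metadata plus a
// registry of arbitrary, separately allocated data items.
enum class PrimitiveType { Vertex, Edge, Face };

class Primitive
{
public:
    Primitive(std::uint64_t id, PrimitiveType type, std::vector<std::uint64_t> neighborIDs)
        : id_(id), type_(type), neighborIDs_(std::move(neighborIDs)) {}

    std::uint64_t id() const { return id_; }
    PrimitiveType type() const { return type_; }
    const std::vector<std::uint64_t>& neighborIDs() const { return neighborIDs_; }

    // Register (allocate) a data item of any copyable type under a name.
    template <typename T>
    void registerData(const std::string& name, T initial)
    {
        data_[name] = std::move(initial);
    }

    // Access a registered item. Throws std::out_of_range for an unknown
    // name and std::bad_any_cast if T does not match the stored type.
    template <typename T>
    T& getData(const std::string& name)
    {
        return std::any_cast<T&>(data_.at(name));
    }

private:
    std::uint64_t id_;                        // globally unique ID
    PrimitiveType type_;
    std::vector<std::uint64_t> neighborIDs_;  // direct neighborhood
    std::map<std::string, std::any> data_;    // registered simulation data
};
```

With this layout the metadata stays small enough to move between processes cheaply, while the simulation data attached to a primitive can be of any type, for example a `std::vector<double>` holding P1 coefficients.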

11 Communication
3-layer abstraction:
- Control Layer: coordinates buffers and directions
- Packing Layer: interface for packing / unpacking data to / from buffers
- Buffer Layer: MPI abstraction (Send / Recv Buffer)
Example:
    int a = 42;
    buffersystem.sendbuffer( rank0 ) << a;
    buffersystem.sendall();
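The stream-style packing used in the slide's example can be illustrated without MPI. The two classes below are hypothetical stand-ins written for this sketch (they are not the buffer system shown on the slide) and handle trivially copyable types only; the actual transfer between ranks is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical send-side buffer: operator<< appends the raw bytes of a value.
class SendBuffer
{
public:
    template <typename T>
    SendBuffer& operator<<(const T& value)
    {
        static_assert(std::is_trivially_copyable<T>::value, "sketch handles plain data only");
        const std::size_t offset = bytes_.size();
        bytes_.resize(offset + sizeof(T));
        std::memcpy(bytes_.data() + offset, &value, sizeof(T));
        return *this;
    }
    const std::vector<unsigned char>& bytes() const { return bytes_; }

private:
    std::vector<unsigned char> bytes_;
};

// Hypothetical receive-side buffer: operator>> reads values back in the
// same order in which they were packed.
class RecvBuffer
{
public:
    explicit RecvBuffer(std::vector<unsigned char> bytes) : bytes_(std::move(bytes)) {}

    template <typename T>
    RecvBuffer& operator>>(T& value)
    {
        static_assert(std::is_trivially_copyable<T>::value, "sketch handles plain data only");
        assert(pos_ + sizeof(T) <= bytes_.size());
        std::memcpy(&value, bytes_.data() + pos_, sizeof(T));
        pos_ += sizeof(T);
        return *this;
    }
    bool empty() const { return pos_ == bytes_.size(); }

private:
    std::vector<unsigned char> bytes_;
    std::size_t pos_ = 0;
};
```

In a full buffer layer, the byte vector of each send buffer would be handed to a non-blocking MPI send, and the received bytes would be wrapped in a receive buffer on the target rank.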

12 Communication
Diagram: Face and Edge data on Rank 0, Rank 1, and Rank 2 are exchanged through a BufferSystem on each rank.
Labels in the diagram: pack (parallel), SendBuffer, non-blocking MPI, RecvBuffer, unpack, direct copy

13 Data access abstraction
Abstract index (x,y) versus actual memory index, shown for a face at level 2:
    0,4
    0,3 1,3
    0,2 1,2 2,2
    0,1 1,1 2,1 3,1
    0,0 1,0 2,0 3,0 4,0
The top point 0,4 carries memory index 14.
face_index(level,x,y) => linearized index
face_index(2,3,1) => 8
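One linearization that reproduces both values on the slide (index 8 for point 3,1 and index 14 for point 0,4 at level 2) stores the rows bottom to top, each row one entry shorter than the one below it. The sketch assumes a side of the face carries 2^level + 1 points; the helper `rowLength` is a name introduced here, and the slide does not confirm that the framework uses exactly this formula.

```cpp
#include <cassert>
#include <cstddef>

// Assumed number of points along one side of a macro face at a given level.
constexpr std::size_t rowLength(unsigned level)
{
    return (std::size_t(1) << level) + 1;
}

// Map the abstract index (x, y) to a position in a linear array.
// Rows 0..y-1 hold n, n-1, ..., n-y+1 entries, i.e. y*n - y*(y-1)/2 in
// total, written below as y*(n+1) - y*(y+1)/2 to stay in unsigned range.
constexpr std::size_t face_index(unsigned level, std::size_t x, std::size_t y)
{
    return y * (rowLength(level) + 1) - (y * (y + 1)) / 2 + x;
}
```

Hiding this arithmetic behind an index function lets kernels be written in terms of (x,y) while the memory layout remains free to change.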

14 Data access abstraction
For stencil codes: indexing that is capable of iterating over all neighbors.
Stencil layout around VERTEX_C:
    VERTEX_NW  VERTEX_N
    VERTEX_W   VERTEX_C   VERTEX_E
               VERTEX_S   VERTEX_SE
    allneighbors = {VERTEX_S, VERTEX_SE, VERTEX_W, VERTEX_E, VERTEX_NW, VERTEX_N}
    allneighborswithcenter = {VERTEX_C, VERTEX_S, VERTEX_SE, VERTEX_W, VERTEX_E, VERTEX_NW, VERTEX_N}
    for (stencildirection neighbor : allneighbors) {
        tmp += face_vertex_stencil[neighbor] * src[index(level,i,j,neighbor)];
    }
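The loop on this slide can be completed into a small, self-contained kernel. Several details are assumptions made for this sketch: the (dx, dy) offset attached to each direction, the row-by-row memory layout with 2^level + 1 points per side, and the function `applyInterior`, which does not appear on the slide.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

enum stencildirection { VERTEX_C, VERTEX_S, VERTEX_SE, VERTEX_W, VERTEX_E, VERTEX_NW, VERTEX_N };

constexpr std::array<stencildirection, 6> allneighbors =
    {{VERTEX_S, VERTEX_SE, VERTEX_W, VERTEX_E, VERTEX_NW, VERTEX_N}};

// Linear index of the point reached from (i, j) in direction dir.
// Offsets are an assumed convention: S = (0,-1), SE = (1,-1), W = (-1,0),
// E = (1,0), NW = (-1,1), N = (0,1).
inline std::size_t index(unsigned level, std::size_t i, std::size_t j, stencildirection dir)
{
    static const int dx[7] = {0, 0, 1, -1, 1, -1, 0};
    static const int dy[7] = {0, -1, -1, 0, 0, 1, 1};
    const std::ptrdiff_t n = (std::ptrdiff_t(1) << level) + 1;
    const std::ptrdiff_t x = static_cast<std::ptrdiff_t>(i) + dx[dir];
    const std::ptrdiff_t y = static_cast<std::ptrdiff_t>(j) + dy[dir];
    return static_cast<std::size_t>(y * (n + 1) - (y * (y + 1)) / 2 + x);
}

// Apply a constant 7-point stencil to all interior points of one face.
inline void applyInterior(unsigned level,
                          const std::array<double, 7>& face_vertex_stencil,
                          const std::vector<double>& src,
                          std::vector<double>& dst)
{
    const std::size_t n = (std::size_t(1) << level) + 1;
    // Interior points satisfy i >= 1, j >= 1 and i + j <= n - 2.
    for (std::size_t j = 1; j + 2 < n; ++j) {
        for (std::size_t i = 1; i + j + 1 < n; ++i) {
            double tmp = face_vertex_stencil[VERTEX_C] * src[index(level, i, j, VERTEX_C)];
            for (stencildirection neighbor : allneighbors) {
                tmp += face_vertex_stencil[neighbor] * src[index(level, i, j, neighbor)];
            }
            dst[index(level, i, j, VERTEX_C)] = tmp;
        }
    }
}
```

Because the kernel only names directions, the same loop body works unchanged if the offsets or the memory layout behind `index` are replaced.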

15 Data on Interfaces
Distribution of unknowns onto the macro primitives (figure: Face 1, Edge 1, Face 2)
- Black points mark the ownership
- White points are ghost points
- Orange points correspond to the same DoF

16 Splitting of unknowns
Figure: the unknowns of a macro face at level 2 are split into Vertex DoF, Edge DoF, and Cell DoF.
- Vertex DoFs are labeled x,y (0,0 to 4,0 in the bottom row, up to 0,4 at the top)
- Edge DoFs are labeled x,y plus one of the tags ve, di, ho (e.g. 0,0,ve; 0,0,di; 0,0,ho)
- Cell DoFs are labeled x,y plus one of the tags gr, bl (e.g. 0,0,gr; 0,0,bl)

17 Splitting of unknowns
Same figure with the additional annotations P2 FE, 2x, and P3 FE.
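The labels in the figure allow a quick count of the three DoF types on one macro face: at level 2 there are 15 vertex labels, 10 labels for each of the three edge tags, and 10 gr plus 6 bl cell labels. The sketch below generalizes these counts to other levels, assuming a side of the face is divided into 2^level micro-edges; the struct `DoFCounts` and the function `countFaceDoFs` are names made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical container for the number of DoFs of each type on one face.
struct DoFCounts
{
    std::size_t vertex;
    std::size_t edge;
    std::size_t cell;
};

inline DoFCounts countFaceDoFs(unsigned level)
{
    // m = number of micro-edges along one side of the macro face (assumed 2^level).
    const std::size_t m = std::size_t(1) << level;
    DoFCounts counts;
    counts.vertex = (m + 1) * (m + 2) / 2;  // triangular number of grid points
    counts.edge   = 3 * (m * (m + 1) / 2);  // three edge orientations
    counts.cell   = m * m;                  // both cell types together
    return counts;
}
```

For level 2 this yields 15 vertex, 30 edge, and 16 cell DoFs, matching the number of labels of each kind in the figure.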

18 Other interesting talks
MS102, MS113 Large-Scale Simulation in Geodynamics
- TerraNeo - A Finite Element Multigrid Framework for Extreme-Scale Earth Mantle Convection Simulations (Dominik Bartuschat)
- A Stencil Scaling Approach for Accelerating Matrix-Free Finite Element Implementations (Daniel Drzisga)
MS107 Highly Scalable Solvers for Computational PDEs
- Matrix-Free Parallel Multigrid for FE Systems with a Trillion Unknowns (Markus Huber)

19 Annulus convection
- P1-P1 PSPG elements for Stokes
- Finite volumes for heat transport
- 256 macro triangles
- 2145 DoFs on each macro (level 6)
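The figure of 2145 DoFs per macro triangle at level 6 is consistent with counting the grid points of a regularly refined triangle, assuming a side carries 2^ℓ + 1 points at level ℓ. The symbol N_v is introduced here for illustration only.

```latex
N_v(\ell) = \frac{(2^{\ell} + 1)(2^{\ell} + 2)}{2},
\qquad
N_v(6) = \frac{65 \cdot 66}{2} = 2145 .
```

This count is per macro triangle and includes the points on its boundary, which are shared with neighboring macro primitives.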
