Towards real-time prediction of Tsunami impact effects on nearshore infrastructure


1 Towards real-time prediction of Tsunami impact effects on nearshore infrastructure
Manfred Krafczyk & Jonas Tölke
Inst. for Computational Modeling in Civil Engineering, http://www.cab.bau.
DFG-Round Table Programme: Near and Onshore Tsunami Effects
Slide 1

2 Modeling (diagram): physics, numerics, geometry, engineering model, hardware and software
Slide 2

3 overview
- kinetic transport modeling
- overview of previous results
- description of the planned work
- algorithms and parallelization strategy
- anticipated results
- hardware and software resources
- acceleration by utilizing dedicated hardware
- summary
Slide 3

4 tsunamis, storm surges and dam-breaks
- involve the large-scale movement of solids and fluids
- are often irregular in timing and thus difficult to observe and measure
- involve multiple types of physical processes on a broad range of spatial and temporal scales
Computational modelling can play an important role both in helping to understand the nature of the fundamental processes involved and in predicting the detailed outcomes of various types of events in specific locations.
The state of the art is 2.5D modeling / simulation of free surface flows.
Slide 4

5 Alps flood, 8/2005, following massive precipitation: average fill level exceeds river bank level, a genuine 3D effect
Slide 5

6 project idea: support / optimization of evacuation / rescue / damage minimization measures by short-term 3D HPC simulation
Slide 8

7 ~3 CPU hours on a fast PC
Slide 9

8 Simulation of a surf wave
TU München, Oskar von Miller Institut, Versuchsanstalt für Wasserbau und Wasserwirtschaft
Slide 10

9 Simulation of a surf wave
Slide 11

10 warning time: the 2004 tsunami example
Slide 12

11 Development of a space-adaptive CFD prototype for short-term 3D flood / tsunami prediction, based on automatic GIS data input, to support evacuation / rescue / damage minimization measures.
Slide 13

12 pocket-tsunami (PC)
Slide 14

13 Input: flow BC, GIS database (surface mesh)
3D - SST CFD simulation: 10^10 DOF
Output (local): flow rates, forces, water levels
Simulation cycle: 2-3 hours
Pre/Postprocessing: Computational Steering
Slide 15

14 Simulation of flooding and tsunamis
- automatic acquisition of GIS data
- fast mesh generation
- adaptive mesh refinement / coarsening using hierarchic blocks
- coupling 3D free surface + 2D non-linear shallow water
For testing purposes: GIS data from GEBCO Digital Atlas, British Oceanographic Data Centre
Slide 16

15 CFD kernel: research prototype Lattice-Boltzmann CFD solver
- 3D
- transient
- adaptive
- multiphase / free surface
- LES / RANS
- parallel (MPI)
- second-order accurate in space and time
Slide 17

16 Kinetic transport modeling (diagram)
- Boltzmann equation -> Navier-Stokes equations and continuity equation: Chapman-Enskog expansion, small Knudsen number
- Boltzmann equation -> Lattice Boltzmann equation (LBGK): BGK approximation (Bhatnagar, Gross, Krook), discretization in space and time
- Lattice Boltzmann equation (LBGK) -> Navier-Stokes equations and continuity equation: Chapman-Enskog expansion, small Knudsen number, small Mach number
Slide 18

17 The LB equation: structural advantages
- linear and exact advection operator
- conservative scheme for mass and momentum
- no numerical viscosity
Slide 19
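The first two advantages listed above can be checked numerically. The following is a minimal sketch, not the authors' solver: it assumes a D2Q9 lattice with the standard textbook weights, a single-relaxation-time (BGK) collision and periodic boundaries. All function names (equilibrium, collide, step, totals, random_grid) are hypothetical and chosen only for this illustration.

```python
import random

# D2Q9 lattice velocities and weights (standard textbook values, assumed here).
C = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1), (1, 1), (-1, 1), (-1, -1), (1, -1)]
W = [4 / 9] + [1 / 9] * 4 + [1 / 36] * 4


def equilibrium(rho, ux, uy):
    """Second-order (low Mach number) equilibrium distributions of one node."""
    usq = ux * ux + uy * uy
    feq = []
    for (cx, cy), w in zip(C, W):
        cu = cx * ux + cy * uy
        feq.append(w * rho * (1 + 3 * cu + 4.5 * cu * cu - 1.5 * usq))
    return feq


def collide(f, omega):
    """BGK relaxation of one node towards its local equilibrium."""
    rho = sum(f)
    ux = sum(fi * cx for fi, (cx, _) in zip(f, C)) / rho
    uy = sum(fi * cy for fi, (_, cy) in zip(f, C)) / rho
    feq = equilibrium(rho, ux, uy)
    return [fi + omega * (fe - fi) for fi, fe in zip(f, feq)]


def step(grid, omega):
    """One collide-and-propagate step on a periodic grid[y][x][i]."""
    ny, nx = len(grid), len(grid[0])
    new = [[[0.0] * 9 for _ in range(nx)] for _ in range(ny)]
    for y in range(ny):
        for x in range(nx):
            post = collide(grid[y][x], omega)
            for i, (cx, cy) in enumerate(C):
                # Exact advection: each value is copied to exactly one neighbour.
                new[(y + cy) % ny][(x + cx) % nx][i] = post[i]
    return new


def totals(grid):
    """Total mass and x/y momentum summed over the whole grid."""
    mass = momx = momy = 0.0
    for row in grid:
        for f in row:
            mass += sum(f)
            momx += sum(fi * cx for fi, (cx, _) in zip(f, C))
            momy += sum(fi * cy for fi, (_, cy) in zip(f, C))
    return mass, momx, momy


def random_grid(nx, ny, seed=0):
    """Equilibrium initial state with random density and small velocity."""
    rng = random.Random(seed)
    return [[equilibrium(1 + 0.1 * rng.random(),
                         0.05 * rng.uniform(-1, 1),
                         0.05 * rng.uniform(-1, 1))
             for _ in range(nx)] for _ in range(ny)]
```

Running a few steps and comparing totals(grid) before and after should show mass and momentum unchanged up to floating-point round-off, because the collision preserves both per node and the propagation only moves values between nodes.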

18 code performance
On Hitachi SR-8000:
- ~30 % of theoretical peak performance per node
- parallel efficiency ~90 % on 32 nodes (x 8 processors)
On Opteron cluster (120 processors, 500 GB RAM, Myrinet):
- parallel efficiency ~95 % on 120 processors
- up to 1 billion grid points
Slide 20

19 algorithms and parallelization strategy: computational aspects
- no Poisson equation is solved for the pressure
- Cartesian grid (automatic 3D grid generation)
- convergence properties: the LBE can be tuned to second-order accuracy with respect to the corresponding solution of incompressible Navier-Stokes flow
- because of their explicit nature and local stencil, LB models are perfect candidates for efficient parallelization
- stress tensor locally available (turbulence modelling)
Slide 21

20 algorithms and parallelization strategy: optimized data structures for general geometries
- top: Peano-Hilbert "U"-ordering
- bottom: Morton "N"-ordering
- adaptive Morton "N"-ordering around a 2-D airfoil (M. J. Aftosmis et al., Applications of Space-Filling Curves to Cartesian Methods for CFD, AIAA)
Slide 22
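A Morton index can be computed by interleaving the bits of the cell coordinates. The sketch below is an illustration only, not the data structure used in the project: the function name morton2d and the choice to put the y bits in the even positions (so that a 2x2 block is visited in an "N" shape: up, then diagonally down-right, then up) are assumptions.

```python
def morton2d(x, y, bits=16):
    """Interleave the bits of integer cell coordinates (x, y) into one index.

    y occupies the even bit positions and x the odd ones, so within each
    2x2 block the visiting order is (0,0), (0,1), (1,0), (1,1).
    """
    code = 0
    for b in range(bits):
        code |= ((y >> b) & 1) << (2 * b)
        code |= ((x >> b) & 1) << (2 * b + 1)
    return code


def morton_order(cells):
    """Sort a list of (x, y) cells along the Morton curve."""
    return sorted(cells, key=lambda c: morton2d(c[0], c[1]))
```

Sorting cells by this index keeps spatially close cells close in memory, and cutting the sorted list into equal chunks gives a simple partition; swapping the roles of x and y yields the mirrored "Z" variant.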

21 algorithms and parallelization strategy: D2Q9 model and D3Q19 model (figure: lattice directions labelled N, NE, E, SE, S, SW, W, NW, with link labels q); second order accuracy in space
Slide 23
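For reference, the D3Q19 velocity set named above can be generated programmatically. This is a minimal sketch using the standard textbook weights (1/3 for the rest velocity, 1/18 for the six axis directions, 1/36 for the twelve face diagonals); the function names d3q19 and second_moment are hypothetical.

```python
from itertools import product


def d3q19():
    """Return the 19 lattice velocities and their weights.

    Velocities are all vectors in {-1, 0, 1}^3 with at most two
    non-zero components; the eight cube corners are excluded.
    """
    velocities, weights = [], []
    for c in product((-1, 0, 1), repeat=3):
        nonzero = sum(abs(component) for component in c)
        if nonzero == 0:
            w = 1 / 3       # rest velocity
        elif nonzero == 1:
            w = 1 / 18      # axis directions
        elif nonzero == 2:
            w = 1 / 36      # face diagonals
        else:
            continue        # cube corners are not part of D3Q19
        velocities.append(c)
        weights.append(w)
    return velocities, weights


def second_moment(velocities, weights, a, b):
    """Weighted second moment sum_i w_i * c_ia * c_ib."""
    return sum(w * c[a] * c[b] for c, w in zip(velocities, weights))
```

The weights sum to one and the second moment is isotropic (1/3 on the diagonal, 0 off the diagonal), which is one of the lattice properties that the Chapman-Enskog analysis relies on.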

22 Grid generation: automatic 3D grid generation
Slide 24

23 Domain decomposition: (PAR)METIS
Slide 25

24 time loop (non-adaptive run), flowchart
- start
- read subdomain(s)
- time loop:
  - collision boundary nodes
  - communication: nonblocking receive, nonblocking send of boundary nodes
  - collision inner nodes
  - propagation inner nodes
  - propagation boundary nodes
Features: Multi Relaxation Time, second order BC, local grid refinement, efficient data structures, asynchronous communication
Slide 26
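The point of this ordering is that boundary nodes are collided first so their outgoing values can travel while the inner nodes are processed. The sketch below emulates it serially on a periodic 1D lattice with only a right-moving and a left-moving population. It is an illustration under strong assumptions: the "messages" are plain Python tuples in a list rather than MPI calls, the relaxation rule is a toy model, and all names (relax, step_global, step_decomposed) are hypothetical.

```python
def relax(fr, fl, omega):
    """Toy collision: relax both populations towards their mean."""
    eq = 0.5 * (fr + fl)
    return fr + omega * (eq - fr), fl + omega * (eq - fl)


def step_global(fr, fl, omega):
    """Reference step on the undivided periodic domain."""
    n = len(fr)
    post = [relax(fr[i], fl[i], omega) for i in range(n)]
    new_r = [post[(i - 1) % n][0] for i in range(n)]  # right-movers shift right
    new_l = [post[(i + 1) % n][1] for i in range(n)]  # left-movers shift left
    return new_r, new_l


def step_decomposed(subs, omega):
    """Same step on a list of subdomains [(fr, fl), ...], in slide order."""
    k = len(subs)
    outbox, partial = [], []
    for fr, fl in subs:
        m = len(fr)
        cr, cl = list(fr), list(fl)
        # 1. collision on the boundary nodes of the subdomain
        for i in {0, m - 1}:
            cr[i], cl[i] = relax(fr[i], fl[i], omega)
        # 2. stand-in for the nonblocking send of boundary values
        outbox.append((cr[m - 1], cl[0]))
        # 3. collision on the inner nodes (would overlap with communication)
        for i in range(1, m - 1):
            cr[i], cl[i] = relax(fr[i], fl[i], omega)
        # 4. propagation along every link whose source node is local
        partial.append(([None] + cr[:-1], cl[1:] + [None]))
    # 5. stand-in for completing the receive, then propagation into the
    #    boundary nodes from the neighbouring subdomains
    for s, (new_r, new_l) in enumerate(partial):
        new_r[0] = outbox[(s - 1) % k][0]
        new_l[-1] = outbox[(s + 1) % k][1]
    return partial
```

Concatenating the subdomain results reproduces step_global exactly, which shows that reordering the work into boundary and inner parts does not change the result; only the opportunity to hide communication behind computation is gained.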

25 anticipated results: feasibility proof of short-term prediction / analysis of catastrophic flood events based on 3D CFD modeling and automatic input of GIS / satellite-based topography / bathymetry data
Slide 27

26 required computer and software resources
minimum requirements:
- sufficient (short-term) CPU share: ~10 TFlops
- 1 TByte RAM
- 10 TByte disk space
- F90 / C++ compiler
- parallel debugger
- MPI 2.x?
Slide 28

27 DRAM gap: contiguous memory access is mandatory!
Slide 29

28 New hardware developments: nvidia GTX 8800, Compute Unified Device Architecture (CUDA)
Slide 30

29 nvidia G80: the parallel stream processor
- The G80 has eight groups of 16 stream processors (SPs), for a total of 128 SPs.
- The SPs are generalized floating-point processors capable of operating on any kind of data.
- The G80's stream processors are scalar: each SP handles one component.
- The SPs are clocked at 1.35 GHz, giving the GeForce 8800 a large amount of floating-point processing power: 1.35 * 2 * 128 = 345 GFLOPs.
- The eight "clusters" of stream processors are connected to six Render Output Unit (ROP) partitions by a crossbar-style switch.
- Each ROP partition has a 64-bit wide interface to graphics memory, which is clocked at 900 MHz.
- Memory bandwidth: 6 * 64 / 8 * 0.9 * 2 (DDR) GB/s = 86 GB/s
Slide 31
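The two peak figures above are plain products of the quoted hardware parameters. The sketch below only reproduces that arithmetic; the function names are hypothetical, and reading the factor 2 in the GFLOPs formula as two floating-point operations per clock (one multiply-add) is an assumption not stated on the slide.

```python
def peak_gflops(clock_ghz, flops_per_clock, stream_processors):
    """Peak floating-point rate in GFLOPs."""
    return clock_ghz * flops_per_clock * stream_processors


def memory_bandwidth_gb_s(partitions, bus_width_bits, clock_ghz, transfers_per_clock):
    """Peak memory bandwidth in GB/s; bus width is converted from bits to bytes."""
    return partitions * bus_width_bits / 8 * clock_ghz * transfers_per_clock


# Values quoted on the slide for the G80 / GeForce 8800.
g80_gflops = peak_gflops(1.35, 2, 128)
g80_bandwidth = memory_bandwidth_gb_s(6, 64, 0.9, 2)
```

With the slide's inputs the products come out as 345.6 GFLOPs and 86.4 GB/s, which the slide quotes as 345 and 86.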

30 Application Programming Interface (API)
- Thread Block, Grid of Thread Blocks
- Function type qualifiers (__device__, __global__, __host__)
- Variable type qualifiers (__device__, __shared__)
- Memory management (cudaMalloc, cudaMemcpy)
- Synchronisation (__syncthreads())
Memory bandwidth
- The effective bandwidth of each memory space depends significantly on the memory access pattern.
- Simultaneous memory accesses of one thread block can be coalesced into a single contiguous, aligned memory access if:
  - thread number N accesses address BaseAddress + N
  - BaseAddress is aligned to 16*sizeof(type) bytes
  (otherwise memory bandwidth performance breaks down to about 8 GB/sec)
Slide 32
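The two coalescing conditions stated above can be written as a small host-side check. This is only an illustration of the rule as given on the slide, not a CUDA API: the function name is_coalesced is hypothetical, and the addresses are assumed to be byte addresses, so "BaseAddress + N" becomes base + N * type_size.

```python
def is_coalesced(addresses, type_size, align_elems=16):
    """Check the slide's coalescing rule for a list of per-thread byte addresses.

    addresses[n] is the byte address accessed by thread n. The access is
    treated as coalesced if the base address is aligned to
    align_elems * type_size bytes and thread n accesses element n.
    """
    if not addresses:
        return True
    base = addresses[0]
    if base % (align_elems * type_size) != 0:
        return False
    return all(addr == base + n * type_size for n, addr in enumerate(addresses))


# 16 threads reading consecutive 4-byte floats from an aligned base address.
aligned = [1024 + 4 * n for n in range(16)]
# Same pattern shifted by one element: the base is no longer 64-byte aligned.
shifted = [1028 + 4 * n for n in range(16)]
# Stride-2 access: thread n does not read element n.
strided = [1024 + 8 * n for n in range(16)]
```

Here is_coalesced(aligned, 4) holds while the shifted and strided patterns fail the check, which is the situation in which the slide reports the bandwidth dropping to about 8 GB/sec.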

31 Platform comparison
Platform                   Peak [GFlops]   MemBW [GB/s]   Price [Euro]
Intel Core 2 Duo (2 GHz)   16              6-10           -
NEC SX-8R A (8 CPUs)       -               -              ca. -
nvidia 8800 GTX            -               -              -
Slide 33

32 Nonlocal operations
- Each thread block has 16 KB of shared memory (2 cycles latency)
- Use shared memory for nonlocal operations (LB: propagation)
- Synchronize
- Write back to device memory
- Synchronize grid of thread blocks over borders
Results
Platform                   MLUPS   MemBW [GB/s]   GFlops
Intel Core Duo (2 GHz)     -       - (17 %)       1.6 (20 %)
Intel Core 2 Duo (2 GHz)   -       - (25 %)       3.2 (20 %)
nvidia 8800 GTX            -       - (30 %)       66.0 (19 %)
Slide 34
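MLUPS (million lattice updates per second) and memory bandwidth are closely linked for LB kernels, since each update mostly moves data. The sketch below gives a rough bandwidth-limited estimate and is not a measurement from the slides. It assumes a D3Q19 model in single precision where every update reads and writes all 19 distributions once (19 * 4 * 2 = 152 bytes); the function names and these traffic assumptions are hypothetical.

```python
def bytes_per_update(distributions=19, bytes_per_value=4):
    """Memory traffic of one lattice update: one read and one write per value.

    Assumption: D3Q19 in single precision, no extra traffic for
    geometry flags or macroscopic fields.
    """
    return distributions * bytes_per_value * 2


def bandwidth_limited_mlups(bandwidth_gb_s, utilisation=1.0):
    """Upper bound on MLUPS for a given bandwidth in GB/s and utilisation."""
    updates_per_second = bandwidth_gb_s * 1e9 * utilisation / bytes_per_update()
    return updates_per_second / 1e6
```

Under these assumptions, a device with 86.4 GB/s peak bandwidth could sustain at most about 568 MLUPS, and proportionally less at the bandwidth utilisation actually reached.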

33 CPU versus GPU: the war is on
(figure: performance plotted against algorithm complexity; GPU: explicit solvers, block type grids; CPU: unstructured mesh, implicit solvers)
Slide 35

34 Fluid-structure interaction: Ferrybridge, England, 1965
Slide 36

35 bidirectional fluid-structure interaction
Slide 37

36 outlook: fluid-structure interaction, debris flow, erosion, sedimentation, scouring
Slide 38

37 summary
Purpose of this research project: to develop a 3D simulation prototype for short-term prediction / analysis of catastrophic flood events based on HPC 3D-CFD modeling.
In terms of modeling / numerical methods / computer science, we want to study and optimize the performance of an adaptive kinetic CFD solution environment, including pre- and postprocessing issues, on massively parallel hardware.
Slide 39
