Simulating Shallow Water on GPUs Programming of Heterogeneous Systems in Physics


1 Simulating Shallow Water on GPUs
Programming of Heterogeneous Systems in Physics
Martin Pfeiffer, Friedrich Schiller University Jena
Simulating Shallow Water on GPUs / 17

2 Course Project
Origin: student project for the "Programming with CUDA" course
- Learned CUDA in the first part of the course
- Student project in the second part
Conditions:
- 5 weeks to complete (20 h per week)
- Two-person team
- Almost zero prior experience with C/OpenGL
Initial idea: "Hey, let's simulate tsunami waves!"

3 Applications for Shallow Water Equations

4 Shallow Water Equation

$$
\underbrace{\frac{\partial}{\partial t}\begin{pmatrix} h \\ uh \\ vh \end{pmatrix}}_{\text{state variable}}
+ \underbrace{\frac{\partial}{\partial x}\begin{pmatrix} uh \\ u^2 h + \tfrac{1}{2} g h^2 \\ uvh \end{pmatrix}
+ \frac{\partial}{\partial y}\begin{pmatrix} vh \\ uvh \\ v^2 h + \tfrac{1}{2} g h^2 \end{pmatrix}}_{\text{fluxes}}
= \underbrace{\begin{pmatrix} 0 \\ -g h \, \partial B / \partial x \\ -g h \, \partial B / \partial y \end{pmatrix}}_{\text{slope}}
$$

The system expresses conservation of mass and momentum.
- u: horizontal water velocity
- v: vertical water velocity
- h: water height
- B: sea bed height
- g: gravitational constant
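The flux and slope terms of the equation can be written out per grid point. The following is a minimal plain-Python sketch, not the project's CUDA code: the function names, the value of `G`, and the sign convention of the slope term (the standard form, with B as sea bed height) are assumptions for illustration. It requires h > 0.

```python
G = 9.81  # gravitational acceleration in m/s^2 (assumed value)


def flux_x(h, uh, vh, g=G):
    """Flux in x-direction for the state (h, uh, vh); requires h > 0."""
    u = uh / h
    return (uh, u * uh + 0.5 * g * h * h, u * vh)


def flux_y(h, uh, vh, g=G):
    """Flux in y-direction for the state (h, uh, vh); requires h > 0."""
    v = vh / h
    return (vh, v * uh, v * vh + 0.5 * g * h * h)


def slope_source(h, dB_dx, dB_dy, g=G):
    """Right-hand side caused by the slope of the sea bed B."""
    return (0.0, -g * h * dB_dx, -g * h * dB_dy)
```

The first component of each flux is the momentum itself, so the first row of the system is the mass balance; the other two rows carry the momentum balance including the pressure term g h^2 / 2.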

5 The Lax-Wendroff Method
Grid structure:
- A grid point stores water height (h), momentum (uh, vh) and sea bed height (B).
- A half-step point defines (h, uh, vh)^T at time n + 1/2 between two grid points.
[Figure: half step and full step on the grid]
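The half-step/full-step structure can be sketched in one dimension. This is a simplified illustration under stated assumptions, not the project's kernel: it uses the two-step (Richtmyer) form of Lax-Wendroff, a flat sea bed (no slope term), periodic boundaries, and a state of only (h, uh) per grid point. All names are hypothetical.

```python
G = 9.81  # gravitational acceleration in m/s^2 (assumed value)


def flux(q, g=G):
    """1D shallow water flux for the state q = (h, uh); requires h > 0."""
    h, uh = q
    return (uh, uh * uh / h + 0.5 * g * h * h)


def lax_wendroff_step(cells, dt, dx, g=G):
    """Advance all grid points by one time step (periodic boundaries)."""
    n = len(cells)
    # Half step: state at time n + 1/2 between grid points i and i + 1.
    half = []
    for i in range(n):
        left, right = cells[i], cells[(i + 1) % n]
        f_left, f_right = flux(left, g), flux(right, g)
        half.append(tuple(
            0.5 * (ql + qr) - 0.5 * dt / dx * (fr - fl)
            for ql, qr, fl, fr in zip(left, right, f_left, f_right)))
    # Full step: update grid point i from the half-step points on both sides.
    new_cells = []
    for i in range(n):
        f_minus, f_plus = flux(half[i - 1], g), flux(half[i], g)
        new_cells.append(tuple(
            q - dt / dx * (fp - fm)
            for q, fm, fp in zip(cells[i], f_minus, f_plus)))
    return new_cells
```

Each half-step point depends only on its two neighbouring grid points, and each full-step update only on its two neighbouring half-step points, which is what makes the scheme easy to map to one thread per point.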

6 Implementation Challenges I - Host-Device Bandwidth
The water waves have to be visualized.
Problems:
- Host-device memory copy is slow
- PCI-Express x allows 8 GB/s
- Visualization slows down the graphics device
Solution:
- Don't visualize every step
- Don't copy the state variables, only water height and color
- Use OpenGL-compatible data structures
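A rough back-of-the-envelope calculation shows why the amount of copied data matters. The sketch below assumes 4-byte floats, a 4-byte color per point, the 1654x1654 grid mentioned in the performance results, and the full 8 GB/s as sustained bandwidth; none of these per-point sizes are stated in the slides.

```python
def transfer_ms(points, bytes_per_point, bandwidth_bytes_per_s=8e9):
    """Time in milliseconds to copy one grid at the given bandwidth."""
    return points * bytes_per_point / bandwidth_bytes_per_s * 1e3


points = 1654 * 1654

# Full state (h, uh, vh) as three 4-byte floats per point (assumed layout).
full_state_ms = transfer_ms(points, 3 * 4)

# Only water height (4-byte float) and color (4 bytes, assumed) per point.
height_color_ms = transfer_ms(points, 4 + 4)
```

Under these assumptions a full-state copy takes about 4.1 ms and a height-plus-color copy about 2.7 ms per transfer; skipping the transfer on most wave steps reduces the total further.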

7 Computation Cycle

8 Decoupling Computation and Visualisation
Maximize wave steps per frame:
- More CUDA work
- Less memory transfer
- But fewer graphical updates
Other benefits:
- Slow/fast motion
- FPS cap: less visualization
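The decoupling amounts to a nested loop: several wave steps per rendered frame. A minimal sketch follows; `simulate`, `wave_step` and `render_frame` are hypothetical names, and the two callbacks stand in for the compute kernel launch and the OpenGL draw call.

```python
def simulate(num_frames, steps_per_frame, wave_step, render_frame):
    """Run `steps_per_frame` wave steps between two rendered frames."""
    for _ in range(num_frames):
        for _ in range(steps_per_frame):
            wave_step()    # compute only, nothing is handed to graphics
        render_frame()     # the only point where the result is displayed


# Usage: count the calls for one second at 24 FPS and 20 steps per frame.
counts = {"steps": 0, "frames": 0}


def count_step():
    counts["steps"] += 1


def count_frame():
    counts["frames"] += 1


simulate(24, 20, count_step, count_frame)
```

Changing `steps_per_frame` while keeping the frame rate fixed gives slow or fast motion without touching the solver.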

9 Implementation Challenges II - Memory Access
Every point on the grid accesses the state of its neighbours.
Problems:
- Slow global memory access on older devices
- Memory access dominates computation performance
- GPU is idle while waiting for operands
- A grid point must not be updated before all reads of it have finished
Solution:
- Texture memory
- Recalculation is faster than sharing

10 Texture Memory
- Optimized for 2D memory access
- Spatially aware cache
- Read-only for kernels
- Interpolation
- Fast!

11 Performance Improvement - Memory
Now instruction execution, rather than memory access, dominates the run time!

12 Implementation Challenges III - Divergence
Every thread should do the same work.
Problems:
- Divergent branches within a warp are serialized
- Boundary conditions need special handling
- The model requires non-negative water height
Solution:
- Compute non-boundary grid points first
- Fix boundary grid points with a separate kernel
- Minimize the workload inside divergent branches
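The two-pass structure can be illustrated on the CPU, where divergence itself does not occur but the split is the same. In this sketch the interior update is a placeholder (a four-neighbour average, not the actual wave scheme), and the boundary rule (copy the adjacent interior value) and the placement of the non-negativity clamp in the second pass are assumptions. All function names are hypothetical.

```python
def update_interior(h):
    """Pass 1: identical, branch-free work for every non-boundary point."""
    rows, cols = len(h), len(h[0])
    new = [row[:] for row in h]
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            # Placeholder stencil standing in for the real wave update.
            new[y][x] = 0.25 * (h[y - 1][x] + h[y + 1][x]
                                + h[y][x - 1] + h[y][x + 1])
    return new


def fix_boundary(h):
    """Pass 2: boundary points and the non-negativity constraint."""
    rows, cols = len(h), len(h[0])
    for x in range(cols):          # top and bottom rows
        h[0][x] = h[1][x]
        h[rows - 1][x] = h[rows - 2][x]
    for y in range(rows):          # left and right columns
        h[y][0] = h[y][1]
        h[y][cols - 1] = h[y][cols - 2]
    for y in range(rows):          # water height must not be negative
        for x in range(cols):
            h[y][x] = max(0.0, h[y][x])
    return h
```

On a GPU the first pass would run with all threads of a warp taking the same path, while the special cases are confined to the much smaller second pass.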

13 Computation performance

Device        GFlops max  Achieved  % of max  Realtime (1)
Tesla C2050   1030        410 *     39.8      1654x1654
GeForce 330M  182         53 *      29.1      512x512

-use_fast_math switch:
- Reduces register usage
- Higher occupancy
- About 15% more GFlops

* Nvidia's nbody demo achieves 540 (Tesla C2050) / 59 (GeForce 330M)
(1) 24 FPS & 20 wave steps per frame
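The percentages and the meaning of the real-time figures can be checked with a few lines of arithmetic. The sketch below only uses numbers from the slide; the function names are hypothetical, and "grid point updates per second" is a derived quantity, not a figure reported in the talk.

```python
def percent_of_peak(achieved_gflops, peak_gflops):
    """Achieved performance as a percentage of the theoretical maximum."""
    return 100.0 * achieved_gflops / peak_gflops


def point_updates_per_second(width, height, steps_per_frame, fps):
    """Grid point updates per second needed for real-time display."""
    return width * height * steps_per_frame * fps


tesla_percent = percent_of_peak(410, 1030)
geforce_percent = percent_of_peak(53, 182)

# Real-time condition from the slide: 24 FPS and 20 wave steps per frame.
tesla_updates = point_updates_per_second(1654, 1654, 20, 24)
geforce_updates = point_updates_per_second(512, 512, 20, 24)
```

Rounded to one decimal, the two percentages reproduce the 39.8 and 29.1 in the table.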

14 Landscape Data and Graphical Output
Data:
- Landscape data from the US National Geophysical Data Center
- Can also be read from image files (ppm)
- Initial waves are read from image files
Graphics:
- 3D graphics & movement
- OpenGL for visualization
- Multi-platform support
- Heavy use of vertex buffer objects
- Same data structures as in CUDA
[Figure: sample landscape image file]
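Reading a landscape from a PPM image can be sketched as follows. This handles only the plain-text PPM variant (magic number P3), and the mapping from pixel to height (red channel scaled to a maximum height) is an assumption for illustration; the slides do not say how the project encodes heights. Function names are hypothetical.

```python
def parse_ascii_ppm(text):
    """Parse a plain (P3) PPM and return (width, height, maxval, pixels)."""
    tokens = []
    for line in text.splitlines():
        # Everything from '#' to the end of a line is a comment.
        tokens.extend(line.split("#", 1)[0].split())
    if not tokens or tokens[0] != "P3":
        raise ValueError("only plain PPM (P3) is handled in this sketch")
    width, height, maxval = (int(t) for t in tokens[1:4])
    values = [int(t) for t in tokens[4:]]
    if len(values) != 3 * width * height:
        raise ValueError("unexpected number of color samples")
    pixels = [tuple(values[i:i + 3]) for i in range(0, len(values), 3)]
    return width, height, maxval, pixels


def heights_from_pixels(width, height, maxval, pixels, max_height=1.0):
    """Assumed mapping: red channel scaled to the range [0, max_height]."""
    return [[pixels[y * width + x][0] / maxval * max_height
             for x in range(width)]
            for y in range(height)]


# Usage with a tiny 2x1 image: one bright and one dark pixel.
sample = "P3\n# tiny example\n2 1\n255\n255 0 0  0 0 0\n"
sample_heights = heights_from_pixels(*parse_ascii_ppm(sample))
```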

15 Cross your fingers! - Demo Time

16 Questions?
Course URL: theinf2.informatik.uni-jena.de/lectures/programming+with+cuda.html
Code repository: github.com/frty2/cuda Shallow-water-equations
Special thanks to Daniel Kirbst, Jens Mueller, Thomas Baumbach, Prof. Joachim Giesen & Prof. Gerhard Zumbusch.

