Simulating Shallow Water on GPUs
Programming of Heterogeneous Systems in Physics
Martin Pfeiffer (m.pfeiffer@uni-jena.de)
Friedrich Schiller University Jena
06.10.2011
Course Project
Origin:
- Student project for the "Programming with CUDA" course
- Learned CUDA in the first part of the course
- Student project in the second part
Conditions:
- 5 weeks to complete (20 h per week)
- Two-person team
- Almost zero prior experience with C/OpenGL
Initial idea: "Hey, let's simulate tsunami waves!"
Applications for Shallow Water Equations
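Shallow Water Equation
\[
\frac{\partial}{\partial t}
\begin{pmatrix} h \\ uh \\ vh \end{pmatrix}
+ \frac{\partial}{\partial x}
\begin{pmatrix} uh \\ u^2 h + \tfrac{1}{2} g h^2 \\ uvh \end{pmatrix}
+ \frac{\partial}{\partial y}
\begin{pmatrix} vh \\ uvh \\ v^2 h + \tfrac{1}{2} g h^2 \end{pmatrix}
=
\begin{pmatrix} 0 \\ -g h \, \partial B / \partial x \\ -g h \, \partial B / \partial y \end{pmatrix}
\]
Terms, left to right: state variable, fluxes in x and y, slope (sea bed) term.
The system expresses conservation of mass and momentum.
- u: horizontal water velocity (x direction)
- v: vertical water velocity (y direction)
- h: water height
- B: sea bed height
- g: gravitational acceleration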
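The two flux vectors can be written down directly as functions of the state (h, uh, vh). The following is only a minimal sketch for illustration: the function names and the value used for g are assumptions, not taken from the project code.

```python
G_ACCEL = 9.81  # gravitational acceleration in m/s^2 (assumed value)

def flux_x(h, uh, vh, g=G_ACCEL):
    """Flux in x direction: (uh, u^2 h + g h^2 / 2, u v h)."""
    u = uh / h
    return (uh, u * uh + 0.5 * g * h * h, u * vh)

def flux_y(h, uh, vh, g=G_ACCEL):
    """Flux in y direction: (vh, u v h, v^2 h + g h^2 / 2)."""
    v = vh / h
    return (vh, v * uh, v * vh + 0.5 * g * h * h)

# Water at rest: only the pressure term g h^2 / 2 is left in the momentum flux.
print(flux_x(2.0, 0.0, 0.0))
print(flux_y(2.0, 0.0, 0.0))
```

Both functions divide by h, so they are only valid for strictly positive water height.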
The Lax-Wendroff Method
Grid structure:
- A grid point stores water height (h), momentum (uh, vh) and sea bed height (B)
- A half-step point defines (h, uh, vh)^T at time n + 1/2 between 2 grid points
[Figure: half step and full step on the grid]
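The half-step/full-step idea can be sketched in one dimension. This is a simplified illustration and not the project's kernel: it assumes a 1D grid, a flat sea bed (no slope term), periodic boundaries, and an arbitrary choice of g, dx and dt.

```python
G = 9.81  # gravitational acceleration (assumed value)

def flux(h, uh):
    """1D shallow water flux for the state (h, uh)."""
    return (uh, uh * uh / h + 0.5 * G * h * h)

def lax_wendroff_step(h, uh, dx, dt):
    """One full step on a periodic 1D grid with a flat sea bed."""
    n = len(h)
    f = [flux(h[i], uh[i]) for i in range(n)]
    # Half step: state at time n + 1/2 between grid points i and i + 1.
    h_half, uh_half = [], []
    for i in range(n):
        j = (i + 1) % n
        h_half.append(0.5 * (h[i] + h[j]) - 0.5 * dt / dx * (f[j][0] - f[i][0]))
        uh_half.append(0.5 * (uh[i] + uh[j]) - 0.5 * dt / dx * (f[j][1] - f[i][1]))
    f_half = [flux(h_half[i], uh_half[i]) for i in range(n)]
    # Full step: update grid point i from the half-step fluxes on both sides.
    # Index i - 1 wraps to the last element for i = 0 (periodic boundary).
    h_new = [h[i] - dt / dx * (f_half[i][0] - f_half[i - 1][0]) for i in range(n)]
    uh_new = [uh[i] - dt / dx * (f_half[i][1] - f_half[i - 1][1]) for i in range(n)]
    return h_new, uh_new

# Small bump on still water, advanced by 20 steps.
h = [1.0] * 32
h[15], h[16], h[17] = 1.05, 1.1, 1.05
uh = [0.0] * 32
for _ in range(20):
    h, uh = lax_wendroff_step(h, uh, dx=1.0, dt=0.1)
print(sum(h))  # total water volume, conserved up to rounding
```

Because every grid point subtracts exactly the flux its neighbour adds, the total water volume stays constant on the periodic grid, which is a quick sanity check for an implementation.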
Implementation Challenges I - Host-Device Bandwidth
The water waves have to be visualized.
Problems:
- Host-device memory copies are slow
- PCI-Express 2.0 x16 allows 8 Gigabyte/s
- Visualization slows down the graphics device
Solution:
- Don't visualize every step
- Don't copy the state variables, only water height and color
- Use OpenGL-compatible data structures
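A rough back-of-the-envelope calculation shows why both measures matter. The numbers below are assumptions for illustration: single-precision values, a 4-byte color per grid point, and the 1654x1654 grid at 24 FPS with 20 wave steps per frame quoted on the performance slide.

```python
BYTES_PER_FLOAT = 4       # single precision (assumption)
PCIE_BYTES_PER_S = 8e9    # 8 Gigabyte/s, as stated on the slide
GRID = 1654 * 1654        # example grid size (assumption)

def required_bandwidth(bytes_per_point, copies_per_second):
    """Bytes per second needed to copy the whole grid that often."""
    return GRID * bytes_per_point * copies_per_second

# Full state (h, uh, vh, B) copied after each of the 24 * 20 wave steps per second.
every_step = required_bandwidth(4 * BYTES_PER_FLOAT, 24 * 20)
# Only height plus a 4-byte color (assumed format), copied once per frame.
per_frame = required_bandwidth(BYTES_PER_FLOAT + 4, 24)

print(every_step / PCIE_BYTES_PER_S)  # fraction of the bus needed, above 1
print(per_frame / PCIE_BYTES_PER_S)   # fraction of the bus needed, well below 1
```

Under these assumptions, copying the full state after every step would need more than the bus can deliver, while copying height and color once per frame uses only a small fraction of it.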
Computation Cycle
Decoupling Computation and Visualisation
Maximize wave steps per frame:
- More CUDA work
- Less memory transfer
- But fewer graphical updates
Other benefits:
- Slow/fast motion
- FPS cap: less visualization work
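The decoupling amounts to a nested loop: several wave steps per rendered frame. The sketch below uses hypothetical callbacks (`step`, `render`) that stand in for the CUDA kernel launches and the OpenGL draw call; it is not the project's main loop.

```python
def run(frames, steps_per_frame, step, render):
    """Advance the simulation several wave steps per rendered frame."""
    for _frame in range(frames):
        for _step in range(steps_per_frame):
            step()    # stands in for the CUDA kernel launches
        render()      # stands in for the OpenGL draw call

counts = {"step": 0, "render": 0}

def count_step():
    counts["step"] += 1

def count_render():
    counts["render"] += 1

# One second of simulation at 24 FPS with 20 wave steps per frame.
run(24, 20, count_step, count_render)
print(counts)  # {'step': 480, 'render': 24}
```

Changing `steps_per_frame` while keeping the frame rate fixed gives slow or fast motion without touching the simulation code.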
Implementation Challenges II - Memory Access
Every point on the grid accesses the state of its neighbours.
Problems:
- Slow global memory access on older devices
- Memory access dominates computation performance
- GPU is idle while waiting for operands
- A grid point must not be updated before its neighbours have finished reading it
Solution:
- Texture memory
- Recalculation is faster than sharing
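One common way to guarantee that no grid point is overwritten while neighbours still read it is to read from one buffer and write to another, then swap them. The slides do not say that the project does exactly this; the sketch below only illustrates the idea with a placeholder averaging stencil on a periodic 1D grid.

```python
def smooth(src, dst):
    """Write the neighbour average of src into dst (periodic 1D grid)."""
    n = len(src)
    for i in range(n):
        dst[i] = (src[i - 1] + src[i] + src[(i + 1) % n]) / 3.0

current = [0.0, 0.0, 3.0, 0.0, 0.0]
scratch = [0.0] * len(current)
for _ in range(2):
    smooth(current, scratch)               # reads only from current
    current, scratch = scratch, current    # swap roles for the next step
print(current)
```

This pattern fits read-only input such as texture memory: the kernel reads the old state through the texture and writes the new state to a separate buffer.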
Texture Memory
- Optimized for 2D memory access
- Spatially aware cache
- Read-only for kernels
- Interpolation
- Fast!
Performance Improvement - Memory
Now instruction execution, not memory access, dominates the run time!
Implementation Challenges III - Divergence
Every thread should execute the same instructions.
Problems:
- Divergent branches within a warp are serialized
- Handling boundary conditions causes divergence
- The model requires non-negative water height
Solution:
- Compute non-boundary grid points first
- Fix boundary grid points with a separate kernel
- Minimize the workload inside divergent branches
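The split into an interior pass and a boundary pass can be sketched sequentially. Everything here is illustrative: the stencil is a placeholder rather than the Lax-Wendroff update, the reflective boundary rule is an assumption, and each function stands in for one kernel.

```python
def update_interior(h, new_h):
    """Same arithmetic for every interior point, no boundary checks."""
    for i in range(1, len(h) - 1):
        new_h[i] = 0.25 * h[i - 1] + 0.5 * h[i] + 0.25 * h[i + 1]

def fix_boundary(new_h):
    """Separate pass: copy the nearest interior value (assumed wall rule)."""
    new_h[0] = new_h[1]
    new_h[-1] = new_h[-2]

def clamp_non_negative(new_h):
    """max() keeps heights non-negative without an if/else branch."""
    for i in range(len(new_h)):
        new_h[i] = max(new_h[i], 0.0)

# The negative input value is artificial and only exercises the clamp.
h = [0.0, -2.0, 0.0, 4.0, 0.0]
new_h = [0.0] * len(h)
update_interior(h, new_h)
fix_boundary(new_h)
clamp_non_negative(new_h)
print(new_h)  # [0.0, 0.0, 0.5, 2.0, 2.0]
```

On the GPU, the interior pass then runs without any boundary branch, and only the few threads of the boundary pass take a different path.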
Computation Performance

Device          GFlops max   Achieved   % of max   Realtime (1)
Tesla C2050     1030         410 *      39.8       1654x1654
GeForce 330M    182          53 *       29.1       512x512

-use_fast_math switch:
- Reduces register usage
- Higher occupancy
- About 15% more GFlops

* Nvidia's nbody demo achieves 540 GFlops @ Tesla C2050 / 59 GFlops @ GeForce 330M
(1) Grid size @ 24 FPS & 20 wave steps per frame
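The percentages and the real-time load follow from simple arithmetic on the figures above. The helper names in this sketch are made up for illustration; the input numbers are the ones from the slide.

```python
def percent_of_peak(achieved_gflops, peak_gflops):
    """Achieved performance as a percentage of the theoretical peak."""
    return 100.0 * achieved_gflops / peak_gflops

def updates_per_second(grid_edge, fps, steps_per_frame):
    """Grid point updates per second for a square grid in real time."""
    return grid_edge * grid_edge * fps * steps_per_frame

for name, peak, achieved, edge in [
    ("Tesla C2050", 1030, 410, 1654),
    ("GeForce 330M", 182, 53, 512),
]:
    pct = percent_of_peak(achieved, peak)
    ups = updates_per_second(edge, 24, 20)
    print(f"{name}: {pct:.1f} % of peak, {ups} grid point updates per second")
```

At 24 FPS and 20 wave steps per frame, the larger grid corresponds to roughly 1.3 billion grid point updates per second.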
Landscape Data and Graphical Output
Data:
- Landscape data from the US National Geophysical Data Center
- Can also be read from image files (ppm)
- Initial waves are read from image files
Graphics:
- 3D graphics & movement
- OpenGL for visualization
- Multi-platform support
- Heavy use of vertex buffer objects
- Same data structures as in CUDA
[Figure: sample landscape image file]
Cross your fingers! - Demo Time
Questions?
Course URL: theinf2.informatik.uni-jena.de/lectures/programming+with+cuda.html
Code Repository: github.com/frty2/cuda Shallow-water-equations
Special thanks to Daniel Kirbst, Jens Mueller, Thomas Baumbach, Prof. Joachim Giesen & Prof. Gerhard Zumbusch.