CUDA. Fluid simulation Lattice Boltzmann Models Cellular Automata


Please excuse my layout of slides for the remaining part of the talk!

Fluid Simulation. Navier-Stokes equations for incompressible fluids. Well-known technique from computer graphics: Stable Fluids, Jos Stam, SIGGRAPH 99. Can take arbitrarily large time steps. Finite-difference approximations result in grid computations similar to LBM.

SDK fluids DEMO

SDK fluids. Not 100% identical to the following method.

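Navier-Stokes (again). $\frac{\partial \mathbf{u}}{\partial t} = -(\mathbf{u}\cdot\nabla)\mathbf{u} - \frac{1}{\rho}\nabla p + \nu\nabla^2\mathbf{u} + \mathbf{f}$, with $\nabla\cdot\mathbf{u} = 0$.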

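Projection operator. Define a projection operator $P$ that projects a vector field $\mathbf{w}$ onto its divergence-free component $\mathbf{u}$. Writing $\mathbf{w} = \mathbf{u} + \nabla p$ gives $P\mathbf{w} = P\mathbf{u} + P(\nabla p)$. Since by definition $P\mathbf{w} = P\mathbf{u} = \mathbf{u}$, then $P(\nabla p) = 0$. And then, applying $P$ to the momentum equation: $\frac{\partial \mathbf{u}}{\partial t} = P\left(-(\mathbf{u}\cdot\nabla)\mathbf{u} + \nu\nabla^2\mathbf{u} + \mathbf{f}\right)$.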

Algorithm. Break it down [Stam 2000]: Add forces. Advect. Diffuse. Solve for pressure. Subtract pressure gradient.

Algorithm From: Stam J. Stable Fluids. Siggraph 1999.

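Algorithm. Break it down [Stam 2000]: Add forces. $\frac{\partial \mathbf{u}}{\partial t} = -(\mathbf{u}\cdot\nabla)\mathbf{u} - \frac{1}{\rho}\nabla p + \nu\nabla^2\mathbf{u} + \mathbf{f}$, where $\mathbf{f}$ is the external force term. Source: Mark Harris / GPGPU tutorial @ SIGGRAPH 04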

Add Forces. Explicit Euler integration: $\mathbf{w}_1 = \mathbf{u}(\mathbf{x},t) + \mathbf{f}(\mathbf{x},t)\,\Delta t$. Source: Mark Harris / GPGPU tutorial @ SIGGRAPH 04

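Algorithm. Break it down [Stam 2000]: Add forces. Advect. $\frac{\partial \mathbf{u}}{\partial t} = -(\mathbf{u}\cdot\nabla)\mathbf{u} - \frac{1}{\rho}\nabla p + \nu\nabla^2\mathbf{u} + \mathbf{f}$, where $-(\mathbf{u}\cdot\nabla)\mathbf{u}$ is the advection term. Source: Mark Harris / GPGPU tutorial @ SIGGRAPH 04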

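Advection. Quantities in a fluid are carried along by its velocity: $\mathbf{w}_2(\mathbf{x}) = \mathbf{w}_1(\mathbf{x} - \mathbf{w}_1\,\Delta t)$. Want velocity at position $\mathbf{x}$ at new time $t + \Delta t$. Follow the velocity field back in time from $\mathbf{x}$: $(\mathbf{x} - \mathbf{w}_1\,\Delta t)$. Like tracing particles! Simple in a fragment program. [Figure: path of fluid from $\mathbf{u}(\mathbf{x}, t)$ to $\mathbf{u}(\mathbf{x}, t + \Delta t)$, traced back in time.] Source: Mark Harris / GPGPU tutorial @ SIGGRAPH 04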

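Algorithm. Break it down [Stam 2000]: Add forces. Advect. Diffuse. $\frac{\partial \mathbf{u}}{\partial t} = -(\mathbf{u}\cdot\nabla)\mathbf{u} - \frac{1}{\rho}\nabla p + \nu\nabla^2\mathbf{u} + \mathbf{f}$, where $\nu\nabla^2\mathbf{u}$ is the diffusion (viscosity) term. Source: Mark Harris / GPGPU tutorial @ SIGGRAPH 04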

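Implicit numerical integration. Stable for large timesteps. Starting from $\frac{d\mathbf{w}}{dt} = \nu\nabla^2\mathbf{w}$, an implicit step of size $h$ reads $\mathbf{w}(\mathbf{x}, t+h) = \mathbf{w}(\mathbf{x}, t) + h\,\nu\nabla^2\mathbf{w}(\mathbf{x}, t+h)$, so $(I - h\,\nu\nabla^2)\,\mathbf{w}(\mathbf{x}, t+h) = \mathbf{w}(\mathbf{x}, t)$, with $\mathbf{w}(\mathbf{x}, t+h) = \mathbf{w}_3$ and $\mathbf{w}(\mathbf{x}, t) = \mathbf{w}_2$.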

Viscous Diffusion. $(I - \nu\,\Delta t\,\nabla^2)\,\mathbf{w}_3 = \mathbf{w}_2$. Solution by Jacobi iteration. Source: Mark Harris / GPGPU tutorial @ SIGGRAPH 04

Algorithm. Break it down [Stam 2000]: Add forces. Advect. Diffuse. Solve for pressure: $\nabla^2 p = \nabla\cdot\mathbf{w}_3$. Source: Mark Harris / GPGPU tutorial @ SIGGRAPH 04

Poisson-Pressure Solution. Poisson equation: $\nabla^2 p = \nabla\cdot\mathbf{w}_3$. Jacobi, Gauss-Seidel, Multigrid, etc. Jacobi is easy on the GPU, the rest are trickier. Source: Mark Harris / GPGPU tutorial @ SIGGRAPH 04

Algorithm. Break it down [Stam 2000]: Add forces. Advect. Diffuse. Solve for pressure. Subtract pressure gradient: $\mathbf{u}(\mathbf{x}, t + \Delta t) = \mathbf{w}_3 - \nabla p$. Source: Mark Harris / GPGPU tutorial @ SIGGRAPH 04

Subtract Pressure Gradient. $\mathbf{u}(\mathbf{x}, t + \Delta t) = \mathbf{w}_3 - \nabla p$. Last computation of the time step. $\mathbf{u}$ is now a divergence-free velocity field. Source: Mark Harris / GPGPU tutorial @ SIGGRAPH 04

Implementation. Cg code, but we convert to CUDA on the fly.

Pseudocode of timestep

Textures

Advection. $\mathbf{w}_2(\mathbf{x}) = \mathbf{w}_1(\mathbf{x} - \mathbf{w}_1\,\Delta t)$
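The backward trace above can be sketched as a CUDA kernel. This is a minimal illustration, not the Cg program from the slides: the names `advect`, `sample_periodic` and `advect_on_device` are hypothetical, the domain is assumed periodic, velocities are assumed to be in grid cells per unit time, and error checking is omitted.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Bilinear sample of a periodic nx-by-ny field at fractional grid position (x, y).
__device__ float sample_periodic(const float* f, float x, float y, int nx, int ny)
{
    float fx = floorf(x), fy = floorf(y);
    float tx = x - fx, ty = y - fy;
    int x0 = (((int)fx % nx) + nx) % nx, x1 = (x0 + 1) % nx;
    int y0 = (((int)fy % ny) + ny) % ny, y1 = (y0 + 1) % ny;
    float bottom = (1.0f - tx) * f[y0 * nx + x0] + tx * f[y0 * nx + x1];
    float top    = (1.0f - tx) * f[y1 * nx + x0] + tx * f[y1 * nx + x1];
    return (1.0f - ty) * bottom + ty * top;
}

// q_out(x) = q_in(x - u(x) * dt): trace back along the velocity and sample there.
__global__ void advect(const float* q_in, const float* ux, const float* uy,
                       float* q_out, int nx, int ny, float dt)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    int j = blockIdx.y * blockDim.y + threadIdx.y;
    if (i >= nx || j >= ny) return;
    int idx = j * nx + i;
    q_out[idx] = sample_periodic(q_in, i - dt * ux[idx], j - dt * uy[idx], nx, ny);
}

// Copy a host vector into freshly allocated device memory.
static float* to_device(const std::vector<float>& v)
{
    float* d = nullptr;
    cudaMalloc(&d, v.size() * sizeof(float));
    cudaMemcpy(d, v.data(), v.size() * sizeof(float), cudaMemcpyHostToDevice);
    return d;
}

// Hypothetical host wrapper: advects q by (ux, uy) and returns the result.
std::vector<float> advect_on_device(const std::vector<float>& q,
                                    const std::vector<float>& ux,
                                    const std::vector<float>& uy,
                                    int nx, int ny, float dt)
{
    float *d_q = to_device(q), *d_ux = to_device(ux), *d_uy = to_device(uy);
    float* d_out = nullptr;
    cudaMalloc(&d_out, q.size() * sizeof(float));
    dim3 block(16, 16), grid((nx + 15) / 16, (ny + 15) / 16);
    advect<<<grid, block>>>(d_q, d_ux, d_uy, d_out, nx, ny, dt);
    std::vector<float> out(q.size());
    cudaMemcpy(out.data(), d_out, out.size() * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_q); cudaFree(d_ux); cudaFree(d_uy); cudaFree(d_out);
    return out;
}
```

For self-advection of the velocity, the wrapper would be called twice, once with `q = ux` and once with `q = uy`, both reading the same input velocity.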

Diffusion. $(I - \nu\,\Delta t\,\nabla^2)\,\mathbf{w}_3 = \mathbf{w}_2$
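One way to solve this system is the Jacobi iteration mentioned earlier. The sketch below assumes a periodic grid with spacing `dx` and uses the common two-parameter form x_new = (sum of four neighbours + alpha * b) * r_beta. For diffusion, alpha = dx² / (ν Δt) and r_beta = 1 / (4 + alpha). The same kernel shape serves the pressure Poisson equation with alpha = −dx² and r_beta = 1/4. The names `jacobi` and `jacobi_solve` are hypothetical, and error checking is omitted.

```cuda
#include <cuda_runtime.h>
#include <utility>
#include <vector>

// One Jacobi sweep on a periodic grid: x_new = (neighbour sum + alpha * b) * r_beta.
__global__ void jacobi(const float* x, const float* b, float* x_new,
                       int nx, int ny, float alpha, float r_beta)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    int j = blockIdx.y * blockDim.y + threadIdx.y;
    if (i >= nx || j >= ny) return;
    int l = (i + nx - 1) % nx, r = (i + 1) % nx;
    int d = (j + ny - 1) % ny, u = (j + 1) % ny;
    float sum = x[j * nx + l] + x[j * nx + r] + x[d * nx + i] + x[u * nx + i];
    x_new[j * nx + i] = (sum + alpha * b[j * nx + i]) * r_beta;
}

// Copy a host vector into freshly allocated device memory.
static float* to_device(const std::vector<float>& v)
{
    float* d = nullptr;
    cudaMalloc(&d, v.size() * sizeof(float));
    cudaMemcpy(d, v.data(), v.size() * sizeof(float), cudaMemcpyHostToDevice);
    return d;
}

// Hypothetical host wrapper: runs a fixed number of sweeps, starting from x = b.
std::vector<float> jacobi_solve(const std::vector<float>& b, int nx, int ny,
                                float alpha, float r_beta, int iterations)
{
    float *d_b = to_device(b), *d_x = to_device(b), *d_tmp = nullptr;
    cudaMalloc(&d_tmp, b.size() * sizeof(float));
    dim3 block(16, 16), grid((nx + 15) / 16, (ny + 15) / 16);
    for (int k = 0; k < iterations; ++k) {
        jacobi<<<grid, block>>>(d_x, d_b, d_tmp, nx, ny, alpha, r_beta);
        std::swap(d_x, d_tmp);  // ping-pong: the newest iterate is always d_x
    }
    std::vector<float> out(b.size());
    cudaMemcpy(out.data(), d_x, out.size() * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_b); cudaFree(d_x); cudaFree(d_tmp);
    return out;
}
```

Each velocity component is diffused separately with the same call. A fixed iteration count is used here instead of a convergence test, which keeps the loop free of device-to-host transfers.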

Divergence
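The divergence of the intermediate velocity is the right-hand side of the pressure equation. A minimal sketch with central differences follows, assuming a periodic grid with spacing `dx`. The names `divergence` and `divergence_on_device` are hypothetical and error checking is omitted.

```cuda
#include <cuda_runtime.h>
#include <vector>

// div(w) at each cell by central differences on a periodic grid.
__global__ void divergence(const float* wx, const float* wy, float* div,
                           int nx, int ny, float dx)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    int j = blockIdx.y * blockDim.y + threadIdx.y;
    if (i >= nx || j >= ny) return;
    int l = (i + nx - 1) % nx, r = (i + 1) % nx;
    int d = (j + ny - 1) % ny, u = (j + 1) % ny;
    div[j * nx + i] = (wx[j * nx + r] - wx[j * nx + l] +
                       wy[u * nx + i] - wy[d * nx + i]) / (2.0f * dx);
}

// Copy a host vector into freshly allocated device memory.
static float* to_device(const std::vector<float>& v)
{
    float* d = nullptr;
    cudaMalloc(&d, v.size() * sizeof(float));
    cudaMemcpy(d, v.data(), v.size() * sizeof(float), cudaMemcpyHostToDevice);
    return d;
}

// Hypothetical host wrapper returning the divergence field.
std::vector<float> divergence_on_device(const std::vector<float>& wx,
                                        const std::vector<float>& wy,
                                        int nx, int ny, float dx)
{
    float *d_wx = to_device(wx), *d_wy = to_device(wy), *d_div = nullptr;
    cudaMalloc(&d_div, wx.size() * sizeof(float));
    dim3 block(16, 16), grid((nx + 15) / 16, (ny + 15) / 16);
    divergence<<<grid, block>>>(d_wx, d_wy, d_div, nx, ny, dx);
    std::vector<float> out(wx.size());
    cudaMemcpy(out.data(), d_div, out.size() * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_wx); cudaFree(d_wy); cudaFree(d_div);
    return out;
}
```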

Gradient subtraction
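The final projection step, u = w₃ − ∇p, can be sketched in the same style. The grid is again assumed periodic with spacing `dx`. The names `subtract_gradient` and `subtract_gradient_on_device` are hypothetical and error checking is omitted.

```cuda
#include <cuda_runtime.h>
#include <vector>

// u = w - grad(p), with grad(p) from central differences on a periodic grid.
__global__ void subtract_gradient(const float* p, const float* wx, const float* wy,
                                  float* ux, float* uy, int nx, int ny, float dx)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    int j = blockIdx.y * blockDim.y + threadIdx.y;
    if (i >= nx || j >= ny) return;
    int l = (i + nx - 1) % nx, r = (i + 1) % nx;
    int d = (j + ny - 1) % ny, u = (j + 1) % ny;
    int idx = j * nx + i;
    ux[idx] = wx[idx] - (p[j * nx + r] - p[j * nx + l]) / (2.0f * dx);
    uy[idx] = wy[idx] - (p[u * nx + i] - p[d * nx + i]) / (2.0f * dx);
}

// Copy a host vector into freshly allocated device memory.
static float* to_device(const std::vector<float>& v)
{
    float* d = nullptr;
    cudaMalloc(&d, v.size() * sizeof(float));
    cudaMemcpy(d, v.data(), v.size() * sizeof(float), cudaMemcpyHostToDevice);
    return d;
}

// Hypothetical host wrapper: writes the projected velocity into ux and uy.
void subtract_gradient_on_device(const std::vector<float>& p,
                                 const std::vector<float>& wx,
                                 const std::vector<float>& wy,
                                 int nx, int ny, float dx,
                                 std::vector<float>& ux, std::vector<float>& uy)
{
    size_t bytes = p.size() * sizeof(float);
    float *d_p = to_device(p), *d_wx = to_device(wx), *d_wy = to_device(wy);
    float *d_ux = nullptr, *d_uy = nullptr;
    cudaMalloc(&d_ux, bytes);
    cudaMalloc(&d_uy, bytes);
    dim3 block(16, 16), grid((nx + 15) / 16, (ny + 15) / 16);
    subtract_gradient<<<grid, block>>>(d_p, d_wx, d_wy, d_ux, d_uy, nx, ny, dx);
    ux.resize(p.size());
    uy.resize(p.size());
    cudaMemcpy(ux.data(), d_ux, bytes, cudaMemcpyDeviceToHost);
    cudaMemcpy(uy.data(), d_uy, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_p); cudaFree(d_wx); cudaFree(d_wy); cudaFree(d_ux); cudaFree(d_uy);
}
```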

Partial differential equations. Finite-difference discretizations lead to lattice formulations, like the Jacobi iteration program. Similar in implementation to cellular automata.

Lattice Boltzmann Models Speculations

Implementing Lattice Boltzmann Computation on Graphics Hardware. Wei Li, Xiaoming Wei, and Arie Kaufman. The Visual Computer 19(7-8), 2003. Hardware-near paper. Extrapolate.

movie

LBM. The LBM consists of a regular grid and a set of packet distribution values. Each packet distribution $f_{qi}$ corresponds to a velocity direction vector $\mathbf{e}_{qi}$ shooting from a node to its neighbor.

LBM in CUDA. Every step is easy to program. Initially read $f_i$ from global memory but store in shared memory. Iterate and write out densities and velocities. Write out only for visualization or external boundary update.
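As an illustration of how simple one LBM step is, here is a sketch of the collision step together with the density and velocity output. Several choices are assumptions that the slides do not specify: a D2Q9 lattice, the single-relaxation-time (BGK) collision with relaxation rate `omega`, and a structure-of-arrays layout `f[q * n + idx]`. The distributions are held in a per-thread local array rather than shared memory, and streaming to neighbours is left out. The names `lbm_collide` and `lbm_collide_on_device` are hypothetical.

```cuda
#include <cuda_runtime.h>
#include <vector>

// D2Q9 lattice directions and weights (assumed lattice).
__constant__ int   c_ex[9] = {0, 1, 0, -1, 0, 1, -1, -1, 1};
__constant__ int   c_ey[9] = {0, 0, 1, 0, -1, 1, 1, -1, -1};
__constant__ float c_w[9]  = {4.f / 9,  1.f / 9,  1.f / 9,  1.f / 9, 1.f / 9,
                              1.f / 36, 1.f / 36, 1.f / 36, 1.f / 36};

// Per node: read f_i once, compute density and velocity, relax towards equilibrium.
__global__ void lbm_collide(float* f, float* rho_out, float* ux_out, float* uy_out,
                            int n, float omega)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx >= n) return;
    float fi[9];
    float rho = 0.0f, ux = 0.0f, uy = 0.0f;
    for (int q = 0; q < 9; ++q) {
        fi[q] = f[q * n + idx];
        rho += fi[q];
        ux  += fi[q] * c_ex[q];
        uy  += fi[q] * c_ey[q];
    }
    ux /= rho;
    uy /= rho;
    float usq = ux * ux + uy * uy;
    for (int q = 0; q < 9; ++q) {
        float eu  = c_ex[q] * ux + c_ey[q] * uy;
        float feq = c_w[q] * rho * (1.0f + 3.0f * eu + 4.5f * eu * eu - 1.5f * usq);
        f[q * n + idx] = fi[q] + omega * (feq - fi[q]);
    }
    rho_out[idx] = rho;
    ux_out[idx]  = ux;
    uy_out[idx]  = uy;
}

// Hypothetical host wrapper; f holds 9 * n values and is updated in place.
void lbm_collide_on_device(std::vector<float>& f, int n, float omega,
                           std::vector<float>& rho, std::vector<float>& ux,
                           std::vector<float>& uy)
{
    size_t fb = f.size() * sizeof(float), nb = n * sizeof(float);
    float *d_f, *d_rho, *d_ux, *d_uy;
    cudaMalloc(&d_f, fb);
    cudaMalloc(&d_rho, nb);
    cudaMalloc(&d_ux, nb);
    cudaMalloc(&d_uy, nb);
    cudaMemcpy(d_f, f.data(), fb, cudaMemcpyHostToDevice);
    lbm_collide<<<(n + 127) / 128, 128>>>(d_f, d_rho, d_ux, d_uy, n, omega);
    rho.resize(n); ux.resize(n); uy.resize(n);
    cudaMemcpy(f.data(), d_f, fb, cudaMemcpyDeviceToHost);
    cudaMemcpy(rho.data(), d_rho, nb, cudaMemcpyDeviceToHost);
    cudaMemcpy(ux.data(), d_ux, nb, cudaMemcpyDeviceToHost);
    cudaMemcpy(uy.data(), d_uy, nb, cudaMemcpyDeviceToHost);
    cudaFree(d_f); cudaFree(d_rho); cudaFree(d_ux); cudaFree(d_uy);
}
```

In a full solver the densities and velocities would only be written out when needed for visualization or boundary updates, as the slide suggests. They are always written here to keep the sketch easy to check.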

Two papers from the same authors!

Speedup

Cellular automata

Informal definition. Cellular automaton: a regular grid of cells, each in one of a finite number of states. The grid can be in any finite number of dimensions. Time is also discrete. The state of a cell at time $t$ is a function of the states of a finite number of cells (called its neighborhood) at time $t - 1$. Adapted from Wikipedia.

Rich behaviour from simple functions. Example: exclusive or. Courtesy Prof. Bastien Chopard

CA in CUDA: format. If the number of possible states per cell corresponds to $2^{32}$: integer. No bool type on GPUs (at present). Chars available. Represent multiple cells per primitive for best performance.

Exclusive OR Implementation considerations

CA in CUDA: global memory. Read from and write to global memory after each iteration. Simple and easy. Inefficient with respect to memory bandwidth. Kernel: read neighbours' states, compute four-way exclusive or, write result at node.
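The three kernel steps can be sketched as follows. The sketch assumes one `unsigned char` per cell holding 0 or 1 and a periodic grid. The names `xor_step_global` and `xor_ca_global` are hypothetical and error checking is omitted.

```cuda
#include <cuda_runtime.h>
#include <utility>
#include <vector>

// One CA step: each cell becomes the XOR of its four neighbours (periodic grid).
__global__ void xor_step_global(const unsigned char* in, unsigned char* out,
                                int nx, int ny)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    int j = blockIdx.y * blockDim.y + threadIdx.y;
    if (i >= nx || j >= ny) return;
    int l = (i + nx - 1) % nx, r = (i + 1) % nx;
    int u = (j + ny - 1) % ny, d = (j + 1) % ny;
    // Every neighbour state is fetched straight from global memory.
    out[j * nx + i] = in[j * nx + l] ^ in[j * nx + r] ^
                      in[u * nx + i] ^ in[d * nx + i];
}

// Hypothetical host wrapper: one kernel launch per iteration, ping-pong buffers.
std::vector<unsigned char> xor_ca_global(std::vector<unsigned char> cells,
                                         int nx, int ny, int steps)
{
    size_t bytes = cells.size();
    unsigned char *d_a, *d_b;
    cudaMalloc(&d_a, bytes);
    cudaMalloc(&d_b, bytes);
    cudaMemcpy(d_a, cells.data(), bytes, cudaMemcpyHostToDevice);
    dim3 block(16, 16), grid((nx + 15) / 16, (ny + 15) / 16);
    for (int s = 0; s < steps; ++s) {
        xor_step_global<<<grid, block>>>(d_a, d_b, nx, ny);
        std::swap(d_a, d_b);  // the newest generation is always d_a
    }
    cudaMemcpy(cells.data(), d_a, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_a);
    cudaFree(d_b);
    return cells;
}
```

Starting from a single live cell, two steps of this rule leave four live cells at distance two along the axes, which makes a convenient sanity check.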

CA in CUDA: textures. Read states from a texture and write to global memory after each iteration. Cache hits reduce memory bandwidth, BUT there is currently no write-to-texture support: write to global memory and copy to a CUDA array. For simple kernels the copy outweighs the cache gain.

CA in CUDA: shared memory. Read states from global memory once into shared memory and write to global memory after each iteration. Reduces bandwidth. Kernel: read current cell state into shared memory; cells on the block border also read the adjacent border cell state. Watch out for bank conflicts! Synchronize threads. Compute four-way exclusive or from shared memory. Write result at node (from register).
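A minimal sketch of that kernel follows, under several assumptions: 16x16 thread blocks, a periodic grid whose dimensions are multiples of the tile size, and one `unsigned char` per cell. The names `TILE`, `xor_step_shared` and `xor_ca_shared` are hypothetical. How byte-sized shared-memory accesses map onto banks depends on the hardware generation, so the layout here is not tuned for that.

```cuda
#include <cassert>
#include <cuda_runtime.h>
#include <utility>
#include <vector>

constexpr int TILE = 16;

// One CA step using a shared-memory tile with a one-cell halo (periodic grid).
__global__ void xor_step_shared(const unsigned char* in, unsigned char* out,
                                int nx, int ny)
{
    __shared__ unsigned char tile[TILE + 2][TILE + 2];
    int tx = threadIdx.x, ty = threadIdx.y;
    int i = blockIdx.x * TILE + tx, j = blockIdx.y * TILE + ty;
    int l = (i + nx - 1) % nx, r = (i + 1) % nx;
    int u = (j + ny - 1) % ny, d = (j + 1) % ny;

    // Every thread loads its own cell; threads on the block border also load the halo.
    tile[ty + 1][tx + 1] = in[j * nx + i];
    if (tx == 0)        tile[ty + 1][0]        = in[j * nx + l];
    if (tx == TILE - 1) tile[ty + 1][TILE + 1] = in[j * nx + r];
    if (ty == 0)        tile[0][tx + 1]        = in[u * nx + i];
    if (ty == TILE - 1) tile[TILE + 1][tx + 1] = in[d * nx + i];
    __syncthreads();

    // Four-way XOR from shared memory; the result lives in a register until written.
    unsigned char result = tile[ty + 1][tx] ^ tile[ty + 1][tx + 2] ^
                           tile[ty][tx + 1] ^ tile[ty + 2][tx + 1];
    out[j * nx + i] = result;
}

// Hypothetical host wrapper; nx and ny must be multiples of TILE in this sketch.
std::vector<unsigned char> xor_ca_shared(std::vector<unsigned char> cells,
                                         int nx, int ny, int steps)
{
    assert(nx % TILE == 0 && ny % TILE == 0);
    size_t bytes = cells.size();
    unsigned char *d_a, *d_b;
    cudaMalloc(&d_a, bytes);
    cudaMalloc(&d_b, bytes);
    cudaMemcpy(d_a, cells.data(), bytes, cudaMemcpyHostToDevice);
    dim3 block(TILE, TILE), grid(nx / TILE, ny / TILE);
    for (int s = 0; s < steps; ++s) {
        xor_step_shared<<<grid, block>>>(d_a, d_b, nx, ny);
        std::swap(d_a, d_b);
    }
    cudaMemcpy(cells.data(), d_a, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_a);
    cudaFree(d_b);
    return cells;
}
```

The corner halo cells are never loaded because the four-way rule does not read diagonals. Requiring grid dimensions that are multiples of the tile size keeps every thread active up to `__syncthreads()`.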

In my experience: for a 256x256 grid, the shared memory version was ~5 times faster than the global memory version in a specific 2D registration problem.

Cuda profiler

Iterate in kernel. If you are only interested in the result after n iterations, iterate in the kernel to remove CPU overhead. Only border cells need to read/write global memory in each iteration (communication between blocks). The rest of shared memory is already known.

That was it. Thank you for coming.