Sparse Fluid Simulation in DirectX. Alex Dunn Dev. Tech. NVIDIA

1 Sparse Fluid Simulation in DirectX Alex Dunn Dev. Tech. NVIDIA

2 Eulerian Simulation Grid based. Great for simulating gaseous fluids: smoke, flame, clouds. It just works.

3 Basic Algorithm Steps: Inject, Advect, Pressure, Vorticity, Evolve. Textures: 2x Velocity, 2x Pressure, 1x Vorticity.

6 Why Are Small Volumes Bad? Fluid isn't box shaped. Often fluid will clip bounds. Simulated separately. Lots of state changes. No interaction between volumes. Tricky to render. No sorting between volumes. Each volume rendered separately.

8 Problem! N-order problem: 64^3 = ~0.25M cells, 128^3 = ~2M cells, 256^3 = ~16M cells. Applies to: computational complexity, memory requirements. And that's just 1 texture. [Chart: Memory (MB) against Dimensions (X = Y = Z) for a Texture3D, 4x16F.]
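To put numbers on the growth, here is a minimal C++ sketch of the memory one cubic texture needs. It assumes the 4x16F format means four 16-bit channels (8 bytes per cell) and ignores any driver padding or alignment; the function names are illustrative only.

```cpp
#include <cstdint>

// Assumption: "4x16F" = 4 channels x 2 bytes = 8 bytes per cell.
constexpr std::uint64_t kBytesPerCell = 4 * 2;

// Number of cells in an n x n x n grid.
constexpr std::uint64_t cellCount(std::uint64_t n) { return n * n * n; }

// Raw bytes for one n^3 texture, with no padding or alignment overhead.
constexpr std::uint64_t textureBytes(std::uint64_t n) {
    return cellCount(n) * kBytesPerCell;
}

// Whole mebibytes (1 MiB = 1024 * 1024 bytes), rounded down.
constexpr std::uint64_t toMiB(std::uint64_t bytes) {
    return bytes / (1024 * 1024);
}
```

Under these assumptions, doubling the grid dimension multiplies both cell count and memory by eight, and the basic algorithm uses several such textures.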

9 What Do We Do? Typically, fluid doesn't occupy every cell! Why waste cycles/memory on empty cells? Options? Trees: tight fitting but poor lookup speed. Bricks: good fit and fast lookup.

10 Bricks Split simulation space into groups of cells (each known as a brick). Simulate each brick independently.

11 Storage Storage of bricks can take one of two forms: Compressed; allocate bricks as needed. Uncompressed; allocate all bricks.

12 Compressed Storage Kind of like, vector<brick>. Pros: Good memory consumption. Only store bricks we care about. Cons: Allocation strategies. Expensive neighbourhood lookup. Software translation

13 1 Brick = 4^3 = 64

14 1 Brick = (1+4+1)^3 = 216. New problem: a 6n^2 + 12n + 8 problem.
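The 6n^2 + 12n + 8 term is the number of border cells a brick of interior size n gains from a one-cell border on every side, since (n+2)^3 - n^3 = 6n^2 + 12n + 8. A small C++ sketch that checks this for the 4^3 brick; the helper names are illustrative.

```cpp
#include <cstdint>

constexpr std::uint64_t cube(std::uint64_t x) { return x * x * x; }

// Total cells in a brick with interior size n plus a one-cell border per side.
constexpr std::uint64_t paddedCells(std::uint64_t n) { return cube(n + 2); }

// Border cells only: (n+2)^3 - n^3 expands to 6n^2 + 12n + 8.
constexpr std::uint64_t borderCells(std::uint64_t n) {
    return 6 * n * n + 12 * n + 8;
}

// For n = 4: 216 cells in total, 64 interior, 152 border.
static_assert(paddedCells(4) == 216, "padded 4^3 brick has 216 cells");
static_assert(paddedCells(4) - cube(4) == borderCells(4), "identity holds for n = 4");
```

For n = 4 the border holds 152 of the 216 cells, so most of each padded brick is duplicated neighbour data.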

15 Uncompressed Storage Allocate everything; forget about unoccupied cells. Pros: Simulation is coherent in memory. Cons: No reduction in memory used.

16 Brick Map We need to track which bricks are occupied! New Texture3D<uint>, 1 voxel per brick: 0 = unoccupied, 1 = occupied. Could also use packed binary grids [Holger15], but this requires atomics.

17 Tracking Bricks Initialise with emitter. Perform on the CPU. Expansion (unoccupied -> occupied): read velocity; if the axial velocity is enough to traverse the boundary, expand in that axis. Reduction (occupied -> unoccupied): handled automatically; calculate occupied bricks every frame.
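The expansion rule can be sketched on the CPU as follows. This is an illustration under stated assumptions, not the talk's implementation: the BrickMap type, the expandFromCell function, and the choice of units (cell position local to the brick, velocity in cells per second) are all hypothetical.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical CPU-side brick map: 1 entry per brick, 0 = unoccupied, 1 = occupied.
struct BrickMap {
    int nx, ny, nz;
    std::vector<std::uint8_t> occupied;

    BrickMap(int x, int y, int z)
        : nx(x), ny(y), nz(z), occupied(std::size_t(x) * y * z, 0) {}

    bool inBounds(int x, int y, int z) const {
        return x >= 0 && y >= 0 && z >= 0 && x < nx && y < ny && z < nz;
    }

    std::uint8_t& at(int x, int y, int z) {
        return occupied[(std::size_t(z) * ny + y) * nx + x];
    }
};

// For one cell: if its velocity carries it across a brick face within dt,
// mark the neighbouring brick along that axis as occupied.
void expandFromCell(BrickMap& map, std::array<int, 3> brick,
                    std::array<float, 3> cellPos,   // local position, [0, brickSize)
                    std::array<float, 3> velocity,  // cells per second (assumed unit)
                    float dt, float brickSize) {
    for (int axis = 0; axis < 3; ++axis) {
        const float next = cellPos[axis] + velocity[axis] * dt;
        const int step = next < 0.0f ? -1 : (next >= brickSize ? 1 : 0);
        if (step == 0) continue;  // stays inside the brick on this axis

        std::array<int, 3> neighbour = brick;
        neighbour[axis] += step;
        if (map.inBounds(neighbour[0], neighbour[1], neighbour[2])) {
            map.at(neighbour[0], neighbour[1], neighbour[2]) = 1;
        }
    }
}
```

Each axis is tested independently, which matches the "expand in that axis" wording; diagonal neighbours are only reached over several frames in this sketch.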

18 Sparse Algorithm Steps: Clear Tiles, Inject, Advect, Pressure, Vorticity, Evolve*, Fill List.
Clear Tiles: reset all tiles to 0 (unmapped) in the tile map.
Fill List: read the value from the tile map; append the tile coordinate to the list if occupied.
    Texture3D<uint> g_occupiedmapro;         // occupancy map, read only
    AppendStructuredBuffer<uint3> g_listrw;  // list of occupied tile coordinates
    if (g_occupiedmapro[idx] != 0) { g_listrw.Append(idx); }
*Also handles expansion.
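For readers without a GPU to hand, the Fill List step can be mirrored on the CPU. The sketch below is an assumption-laden stand-in for the shader: fillList and TileCoord are hypothetical names, and the map is a flat array indexed x-fastest.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

using TileCoord = std::array<std::uint32_t, 3>;

// Scan the occupancy map and collect the coordinates of every non-zero tile.
std::vector<TileCoord> fillList(const std::vector<std::uint32_t>& occupiedMap,
                                std::uint32_t nx, std::uint32_t ny, std::uint32_t nz) {
    std::vector<TileCoord> list;
    for (std::uint32_t z = 0; z < nz; ++z) {
        for (std::uint32_t y = 0; y < ny; ++y) {
            for (std::uint32_t x = 0; x < nx; ++x) {
                const std::size_t i = (std::size_t(z) * ny + y) * nx + x;
                if (occupiedMap[i] != 0) {
                    list.push_back(TileCoord{x, y, z});
                }
            }
        }
    }
    return list;
}
```

Unlike this loop, a GPU append buffer gives no ordering guarantee, so later passes should not rely on the order of the list.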

19 Numbers Show performance of uncompressed grid. Explain how memory consumption stays the same when using uncompressed storage. Can we do better?

20 Enter; DirectX 11.3 Volume Tiled Resources (VTR)! Extends 2D functionality in DX. Tile = 64KB.
Bits/Pixel: Tile Dimensions
8: 64x32x32
16: 32x32x32
32: 32x32x16
64: 32x16x16
128: 16x16x16
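Each tile shape follows from the fixed 64KB tile size: width x height x depth x bytes per texel always comes to 65536. A hedged C++ sketch that encodes the standard volume tile shapes for 8 to 128 bits per texel and counts the tiles a volume needs; the struct and function names are illustrative and are not part of any DirectX API.

```cpp
#include <cstdint>

struct TileShape { std::uint32_t w, h, d; };

constexpr std::uint32_t kTileBytes = 64 * 1024;

// Volume tile shape in texels for a given texel size; {0,0,0} if unsupported.
inline TileShape tileShapeForBitsPerPixel(std::uint32_t bpp) {
    switch (bpp) {
        case 8:   return {64, 32, 32};
        case 16:  return {32, 32, 32};
        case 32:  return {32, 32, 16};
        case 64:  return {32, 16, 16};
        case 128: return {16, 16, 16};
        default:  return {0, 0, 0};
    }
}

// Bytes covered by one tile of this shape.
inline std::uint32_t tileShapeBytes(TileShape s, std::uint32_t bpp) {
    return s.w * s.h * s.d * (bpp / 8);
}

inline std::uint32_t ceilDiv(std::uint32_t a, std::uint32_t b) {
    return (a + b - 1) / b;
}

// Tiles needed to cover a w x h x d volume; 0 if the format is unsupported.
inline std::uint32_t tilesForVolume(std::uint32_t w, std::uint32_t h,
                                    std::uint32_t d, std::uint32_t bpp) {
    const TileShape s = tileShapeForBitsPerPixel(bpp);
    if (s.w == 0) return 0;
    return ceilDiv(w, s.w) * ceilDiv(h, s.h) * ceilDiv(d, s.d);
}
```

For example, a 256^3 texture at 64 bits per texel (such as 4x16F) is covered by 8 x 16 x 16 = 2048 tiles when every tile is mapped.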

21 Tiled Resources Only mapped memory is allocated in VRAM. Fast HW translation (same as regular paged memory). All samplers supported. Gotcha: tile mappings must be updated from CPU.

22 Idea 1. Tile == Brick 2. Create fluid textures as tiled resources. 3. Create tile pool per texture. 4. Map tiles to memory using list of occupied tiles. 5. Handle expansion/reduction of tiles in simulation.

23 Drawbacks Tile mappings updated by CPU. Possible using CPU read backs. Must use triple buffer for speedy Map/Unmap. Means 3 frame latency. Predict tile mappings ahead of time.
Frame: GPU/CPU Status
N: CPU issues render calls for current frame.
N+1: GPU executing calls sent from CPU during frame N. CPU issues render calls for current frame.
N+2: GPU finished executing calls sent from CPU during frame N. Results ready. GPU executing calls sent from CPU during frame N+1. CPU issues render calls for current frame.
N+3: GPU finished executing calls sent from CPU during frame N+1. Results ready. GPU executing calls sent from CPU during frame N+2. CPU issues render calls for current frame.
N+4: ...

24 GPU Simulation; CPU Prediction We always know the maximum velocity of the fluid. Add a skirt of probable tiles around definite tiles.
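One way to picture the skirt is as a dilation of the occupied tile set. The sketch below is an assumption, not the talk's code: skirtRadius and addSkirt are hypothetical helpers, speed is in cells per second, and a single tile size is used for all axes (the smallest tile dimension would be the conservative choice).

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// How many tiles the fluid could cross before fresh read-back data arrives.
int skirtRadius(float maxSpeedCellsPerSecond, float dt,
                int latencyFrames, float tileSizeCells) {
    const float cells = maxSpeedCellsPerSecond * dt * float(latencyFrames);
    return int(std::ceil(cells / tileSizeCells));
}

// Return a copy of the map with every tile within `radius` of an occupied
// tile (measured per axis) also marked, giving definite plus probable tiles.
std::vector<std::uint8_t> addSkirt(const std::vector<std::uint8_t>& occupied,
                                   int nx, int ny, int nz, int radius) {
    std::vector<std::uint8_t> result(occupied.size(), 0);
    auto index = [=](int x, int y, int z) {
        return (std::size_t(z) * ny + y) * nx + x;
    };
    for (int z = 0; z < nz; ++z)
    for (int y = 0; y < ny; ++y)
    for (int x = 0; x < nx; ++x) {
        if (occupied[index(x, y, z)] == 0) continue;
        for (int dz = -radius; dz <= radius; ++dz)
        for (int dy = -radius; dy <= radius; ++dy)
        for (int dx = -radius; dx <= radius; ++dx) {
            const int px = x + dx, py = y + dy, pz = z + dz;
            if (px < 0 || py < 0 || pz < 0 || px >= nx || py >= ny || pz >= nz) continue;
            result[index(px, py, pz)] = 1;
        }
    }
    return result;
}
```

A larger radius costs more mapped memory but makes it less likely that fluid reaches an unmapped tile before the CPU catches up.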

25 New Algorithm [Flowchart. Nodes: Is Data Available? (Yes/No), Readback Tiles, Emitter Tiles, Predict Tiles, Build List, Map Tiles, Clear Tiles, Inject, Advect, Pressure, Vorticity, Evolve, Fill List.]

26 Demo

27 Numbers

28 Questions? Alex Dunn - adunn@nvidia.com Thanks for attending.
