Real - Time Rendering Pipeline optimization Michal Červeňanský Juraj Starinský
Motivation Resolution 1600x1200, at 60 fps Hw power not enough Acceleration is still necessary 3.3.2010 2
Overview Application stage Geometry stage Rasterization stage Bottleneck finding Tips & tricks Programming techniques 3.3.2010 3
Application stage Application stage LOD Culling Model description Geometry stage Rasterization stage Bottleneck finding Tips & tricks Programming techniques 3.3.2010 4
LOD Levels of detail Simplification Refinement ( Tessellation ) Models Large - subdivide Small - combine Various techniques 3.3.2010 5
Culling Application stage or HW support 3.3.2010 6
Model description Triangle strips with: 10 vertices without: 24 vertices Orientation change Swaps degenerate triangle v0,v1,v2,v3,v2,v4,v5,v6 3.3.2010 7
Model description (2) Non-connected strips swaps 0,1,2,3,3,4,4,5,6,7 Creation Manually NVTriStripper Own code Vertex arrays Example 3.3.2010 8
Geometry stage Per-vertex operations Triangle strip Lightning Disable / enable # light sources Spot, point, directional Normals normalize (preprocessing) Static light & diffuse material Precomputed lightning 3.3.2010 9
Rasterization stage Per-pixel operations Backface culling turn on Draw in front to back order Disable unnecessary features Texture filtering Fog Blending Multisampling Render to texture smaller resolution 3.3.2010 10
Overall optimization Reduce #primitives Turn off features not in use Minimize state changes API calls Color (depth) buffer clear if necessary Reads to CPU expensive New HW features VBO,PBO,FBO Memory optimizations (order, align, ) Disable vertical sync Run on the target HW Good algorithm at first! 3.3.2010 11
Bottleneck finding Rendering - as fast as slowest unit Application limited (AI, collision detection,...) Rasterization limited (high resolutions) Geometry limited (scientific app.) 3.3.2010 12
Basic tests Eliminate all file accesses hdd usage Down-clock ( BIOS, Coolbits, PerfHUD ) CPU speed GPU speed (shaders, core) GPU memory speed 3.3.2010 13
CPU bottleneck Task manager, top CPU consumers (AI, physics,...) Algorithmic improvements API calls reduce state changes Locking resources synchronization Protect the GPU (culling,...) Increase batch size (shaders, gometry, ) Texture atlas, texture array, texture formats 3.3.2010 14
GPU bottleneck Geometry Changes also in rasterization # lights Rasterization Window size Turn off texturing, blending,... 3.3.2010 15
Tips & tricks API Vertex processing Shaders Texturing Rasterization 3.3.2010 16
T&T API Batching Texture atlases, texture arrays State changes Shader constants groups 3.3.2010 17
T&T Vertex processing Indexed primitives (cache) Recalculate data in VS (size of vertex cache) Reduce vertex size Dynamic vertex buffers if necessary 3.3.2010 18
T&T Shaders Use highest profile (compiler, bug fix) Lowest precision half vs. float Linearizable calculations vertex shader Early Z culling Vertex operations to pixel shader Geometry shader if necessary GS smallest maxvertexcount, smallest vertices Constants hard coded Save computation use algebra Complex functions texture lookup 3.3.2010 19 Branching tile size
T&T - Texturing Mipmaping always Trilinear, anisotropic filtering prudently Texture atlas Per-pixel lightning precomputed in texture Texture format 3.3.2010 20
T&T - Rasterization Depth, stencil rendering double speed Color writes disabled Texkill (discard, clip) Depth modification Alpha test Occlusion query Early Z ( Z-cull ) - Front to back rendering Deffered shading Memory Render targets, shaders, textures (size, priority) 3.3.2010 21
Programming examples VBO Vertex buffer object Mainly for geometry rendering Index buffer Geometry processing PBO Pixel buffer object Buffer for transfer pixel data FBO Frame buffer object Render into custom frame buffer 3.3.2010 22
Real-time rendering slides References GeForce 8 and 9 Series GPU Programming http://developer.nvidia.com/object/gpu_programming_guide.html http://en.wikipedia.org/wiki/ http://flickr.com/photos/buou/2126755136/ http://vr.kaist.ac.kr/_r011.htm http://www.vis.uni-stuttgart.de/eng/teaching/lecture/ws04/seminar_spiele/bsp/ PBO http://www.mathematik.uni-dortmund.de/~goeddeke/gpgpu/tutorial3.html http://www.songho.ca/opengl/gl_pbo.html 3.3.2010 23
Questions? 3.3.2010 24