Real - Time Rendering. Pipeline optimization. Michal Červeňanský Juraj Starinský

Similar documents
Real - Time Rendering. Graphics pipeline. Michal Červeňanský Juraj Starinský

Optimisation. CS7GV3 Real-time Rendering

Graphics Performance Optimisation. John Spitzer Director of European Developer Technology

Graphics Processing Unit Architecture (GPU Arch)

Optimizing and Profiling Unity Games for Mobile Platforms. Angelo Theodorou Senior Software Engineer, MPG Gamelab 2014, 25 th -27 th June

GeForce3 OpenGL Performance. John Spitzer

Rendering Objects. Need to transform all geometry then

Working with Metal Overview

Evolution of GPUs Chris Seitz

Programming Guide. Aaftab Munshi Dan Ginsburg Dave Shreiner. TT r^addison-wesley

GCN Performance Tweets AMD Developer Relations

Copyright Khronos Group, Page Graphic Remedy. All Rights Reserved

GPU Computation Strategies & Tricks. Ian Buck NVIDIA

Building scalable 3D applications. Ville Miettinen Hybrid Graphics

Programming Graphics Hardware

CS GPU and GPGPU Programming Lecture 2: Introduction; GPU Architecture 1. Markus Hadwiger, KAUST

The Application Stage. The Game Loop, Resource Management and Renderer Design

GeForce4. John Montrym Henry Moreton

HISTORY OF MAJOR REVISIONS

Performance OpenGL Programming (for whatever reason)

Grafica Computazionale: Lezione 30. Grafica Computazionale. Hiding complexity... ;) Introduction to OpenGL. lezione30 Introduction to OpenGL

Ultimate Graphics Performance for DirectX 10 Hardware

RSX Best Practices. Mark Cerny, Cerny Games David Simpson, Naughty Dog Jon Olick, Naughty Dog

Real-Time Rendering (Echtzeitgraphik) Michael Wimmer

Mattan Erez. The University of Texas at Austin

Optimizing DirectX Graphics. Richard Huddy European Developer Relations Manager

Lecture 2. Shaders, GLSL and GPGPU

Direct Rendering of Trimmed NURBS Surfaces

Next-Generation Graphics on Larrabee. Tim Foley Intel Corp

Whiz-Bang Graphics and Media Performance for Java Platform, Micro Edition (JavaME)

Graphics Hardware. Graphics Processing Unit (GPU) is a Subsidiary hardware. With massively multi-threaded many-core. Dedicated to 2D and 3D graphics

CS4620/5620: Lecture 14 Pipeline

CS427 Multicore Architecture and Parallel Computing

The Source for GPU Programming

The Rasterization Pipeline

ARM. Mali GPU. OpenGL ES Application Optimization Guide. Version: 2.0. Copyright 2011, 2013 ARM. All rights reserved. ARM DUI 0555B (ID051413)

Tutorial on GPU Programming #2. Joong-Youn Lee Supercomputing Center, KISTI

General Purpose Computation (CAD/CAM/CAE) on the GPU (a.k.a. Topics in Manufacturing)

Feeding the Beast: How to Satiate Your GoForce While Differentiating Your Game

Chapter 1 Introduction

Mobile Performance Tools and GPU Performance Tuning. Lars M. Bishop, NVIDIA Handheld DevTech Jason Allen, NVIDIA Handheld DevTools

Hardware-driven Visibility Culling Jeong Hyun Kim

PowerVR Hardware. Architecture Overview for Developers

The Graphics Pipeline

Spring 2009 Prof. Hyesoon Kim

Wed, October 12, 2011

DirectX 10 Performance. Per Vognsen

Spring 2011 Prof. Hyesoon Kim

Readings on graphics architecture for Advanced Computer Architecture class

Rendering. Converting a 3D scene to a 2D image. Camera. Light. Rendering. View Plane

Monday Morning. Graphics Hardware

Could you make the XNA functions yourself?

Optimizing Games for ATI s IMAGEON Aaftab Munshi. 3D Architect ATI Research

gems_ch28.qxp 2/26/ :49 AM Page 469 PART V PERFORMANCE AND PRACTICALITIES

3/1/2010. Acceleration Techniques V1.2. Goals. Overview. Based on slides from Celine Loscos (v1.0)

X. GPU Programming. Jacobs University Visualization and Computer Graphics Lab : Advanced Graphics - Chapter X 1

Abstract. 2 Description of the Effects Used. 1 Introduction Phong Illumination Bump Mapping

Optimizing for DirectX Graphics. Richard Huddy European Developer Relations Manager

ARM. Mali GPU. OpenGL ES Application Optimization Guide. Version: 3.0. Copyright 2011, 2013 ARM. All rights reserved. ARM DUI 0555C (ID102813)

PROFESSIONAL. WebGL Programming DEVELOPING 3D GRAPHICS FOR THE WEB. Andreas Anyuru WILEY. John Wiley & Sons, Ltd.

CS130 : Computer Graphics. Tamar Shinar Computer Science & Engineering UC Riverside

Windowing System on a 3D Pipeline. February 2005

Pipeline Operations. CS 4620 Lecture Steve Marschner. Cornell CS4620 Spring 2018 Lecture 11

Dave Shreiner, ARM March 2009

Comparing Reyes and OpenGL on a Stream Architecture

Performance Analysis and Culling Algorithms

DirectX10 Effects and Performance. Bryan Dudash

Today s Agenda. DirectX 9 Features Sim Dietrich, nvidia - Multisample antialising Jason Mitchell, ATI - Shader models and coding tips

Practical Performance Analysis Koji Ashida NVIDIA Developer Technology Group

Mattan Erez. The University of Texas at Austin

CSE 167: Lecture #5: Rasterization. Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2012

Real-World Applications of Computer Arithmetic

PowerVR Performance Recommendations The Golden Rules. October 2015

Pipeline Operations. CS 4620 Lecture 14

PowerVR Performance Recommendations. The Golden Rules

3D Rendering Pipeline

Deferred Splatting. Gaël GUENNEBAUD Loïc BARTHE Mathias PAULIN IRIT UPS CNRS TOULOUSE FRANCE.

Deferred Rendering Due: Wednesday November 15 at 10pm

How to Work on Next Gen Effects Now: Bridging DX10 and DX9. Guennadi Riguer ATI Technologies

Baback Elmieh, Software Lead James Ritts, Profiler Lead Qualcomm Incorporated Advanced Content Group

Graphics Hardware. Instructor Stephen J. Guy

GUERRILLA DEVELOP CONFERENCE JULY 07 BRIGHTON

1.2.3 The Graphics Hardware Pipeline

graphics pipeline computer graphics graphics pipeline 2009 fabio pellacini 1


Architectures. Michael Doggett Department of Computer Science Lund University 2009 Tomas Akenine-Möller and Michael Doggett 1

graphics pipeline computer graphics graphics pipeline 2009 fabio pellacini 1

Programming Tips For Scalable Graphics Performance

Project Gotham Racing 2 (Xbox) Real-Time Rendering. Microsoft Flighsimulator. Halflife 2

Motivation MGB Agenda. Compression. Scalability. Scalability. Motivation. Tessellation Basics. DX11 Tessellation Pipeline

Hardware Accelerated Volume Visualization. Leonid I. Dimitrov & Milos Sramek GMI Austrian Academy of Sciences

E.Order of Operations

DirectX10 Effects. Sarah Tariq

3D buzzwords. Adding programmability to the pipeline 6/7/16. Bandwidth Gravity of modern computer systems

Graphics Hardware. Computer Graphics COMP 770 (236) Spring Instructor: Brandon Lloyd 2/26/07 1

OpenGL SUPERBIBLE. Fifth Edition. Comprehensive Tutorial and Reference. Richard S. Wright, Jr. Nicholas Haemel Graham Sellers Benjamin Lipchak

GPU Memory Model Overview

Getting fancy with texture mapping (Part 2) CS559 Spring Apr 2017

CSE 167: Introduction to Computer Graphics Lecture #4: Vertex Transformation

PowerVR Series5. Architecture Guide for Developers

Transcription:

Real - Time Rendering Pipeline optimization Michal Červeňanský Juraj Starinský

Motivation Resolution 1600x1200, at 60 fps Hw power not enough Acceleration is still necessary 3.3.2010 2

Overview Application stage Geometry stage Rasterization stage Bottleneck finding Tips & tricks Programming techniques 3.3.2010 3

Application stage Application stage LOD Culling Model description Geometry stage Rasterization stage Bottleneck finding Tips & tricks Programming techniques 3.3.2010 4

LOD Levels of detail Simplification Refinement ( Tessellation ) Models Large - subdivide Small - combine Various techniques 3.3.2010 5

Culling Application stage or HW support 3.3.2010 6

Model description Triangle strips with: 10 vertices without: 24 vertices Orientation change Swaps degenerate triangle v0,v1,v2,v3,v2,v4,v5,v6 3.3.2010 7

Model description (2) Non-connected strips swaps 0,1,2,3,3,4,4,5,6,7 Creation Manually NVTriStripper Own code Vertex arrays Example 3.3.2010 8

Geometry stage Per-vertex operations Triangle strip Lightning Disable / enable # light sources Spot, point, directional Normals normalize (preprocessing) Static light & diffuse material Precomputed lightning 3.3.2010 9

Rasterization stage Per-pixel operations Backface culling turn on Draw in front to back order Disable unnecessary features Texture filtering Fog Blending Multisampling Render to texture smaller resolution 3.3.2010 10

Overall optimization Reduce #primitives Turn off features not in use Minimize state changes API calls Color (depth) buffer clear if necessary Reads to CPU expensive New HW features VBO,PBO,FBO Memory optimizations (order, align, ) Disable vertical sync Run on the target HW Good algorithm at first! 3.3.2010 11

Bottleneck finding Rendering - as fast as slowest unit Application limited (AI, collision detection,...) Rasterization limited (high resolutions) Geometry limited (scientific app.) 3.3.2010 12

Basic tests Eliminate all file accesses hdd usage Down-clock ( BIOS, Coolbits, PerfHUD ) CPU speed GPU speed (shaders, core) GPU memory speed 3.3.2010 13

CPU bottleneck Task manager, top CPU consumers (AI, physics,...) Algorithmic improvements API calls reduce state changes Locking resources synchronization Protect the GPU (culling,...) Increase batch size (shaders, gometry, ) Texture atlas, texture array, texture formats 3.3.2010 14

GPU bottleneck Geometry Changes also in rasterization # lights Rasterization Window size Turn off texturing, blending,... 3.3.2010 15

Tips & tricks API Vertex processing Shaders Texturing Rasterization 3.3.2010 16

T&T API Batching Texture atlases, texture arrays State changes Shader constants groups 3.3.2010 17

T&T Vertex processing Indexed primitives (cache) Recalculate data in VS (size of vertex cache) Reduce vertex size Dynamic vertex buffers if necessary 3.3.2010 18

T&T Shaders Use highest profile (compiler, bug fix) Lowest precision half vs. float Linearizable calculations vertex shader Early Z culling Vertex operations to pixel shader Geometry shader if necessary GS smallest maxvertexcount, smallest vertices Constants hard coded Save computation use algebra Complex functions texture lookup 3.3.2010 19 Branching tile size

T&T - Texturing Mipmaping always Trilinear, anisotropic filtering prudently Texture atlas Per-pixel lightning precomputed in texture Texture format 3.3.2010 20

T&T - Rasterization Depth, stencil rendering double speed Color writes disabled Texkill (discard, clip) Depth modification Alpha test Occlusion query Early Z ( Z-cull ) - Front to back rendering Deffered shading Memory Render targets, shaders, textures (size, priority) 3.3.2010 21

Programming examples VBO Vertex buffer object Mainly for geometry rendering Index buffer Geometry processing PBO Pixel buffer object Buffer for transfer pixel data FBO Frame buffer object Render into custom frame buffer 3.3.2010 22

Real-time rendering slides References GeForce 8 and 9 Series GPU Programming http://developer.nvidia.com/object/gpu_programming_guide.html http://en.wikipedia.org/wiki/ http://flickr.com/photos/buou/2126755136/ http://vr.kaist.ac.kr/_r011.htm http://www.vis.uni-stuttgart.de/eng/teaching/lecture/ws04/seminar_spiele/bsp/ PBO http://www.mathematik.uni-dortmund.de/~goeddeke/gpgpu/tutorial3.html http://www.songho.ca/opengl/gl_pbo.html 3.3.2010 23

Questions? 3.3.2010 24