Programming Graphics Hardware

Tutorial 5 Programming Graphics Hardware Randy Fernando, Mark Harris, Matthias Wloka, Cyril Zeller

Overview of the Tutorial: Morning
8:30 Introduction to the Hardware Graphics Pipeline (Cyril Zeller)
9:30 Controlling the GPU from the CPU: the 3D API (Cyril Zeller)
10:15 Break
10:45 Programming the GPU: High-level Shading Languages (Randy Fernando)
12:00 Lunch

Overview of the Tutorial: Afternoon
12:00 Lunch
14:00 Optimizing the Graphics Pipeline (Matthias Wloka)
14:45 Advanced Rendering Techniques (Matthias Wloka)
15:45 Break
16:15 General-Purpose Computation Using Graphics Hardware (Mark Harris)
17:30 End

Introduction to the Hardware Graphics Pipeline Cyril Zeller

Overview
Concepts:
- Real-time rendering
- Hardware graphics pipeline
Evolution of the PC hardware graphics pipeline:
- 1995-1998: Texture mapping and z-buffer
- 1998: Multitexturing
- 1999-2000: Transform and lighting
- 2001: Programmable vertex shader
- 2002-2003: Programmable pixel shader
- 2004: Shader model 3.0 and 64-bit color support
PC graphics software architecture
Performance numbers

Real-Time Rendering
- Graphics hardware enables real-time rendering
- Real-time means a display rate of more than 10 images per second
- 3D scene = collection of 3D primitives (triangles, lines, points)
- Image = array of pixels

Hardware Graphics Pipeline
Application Stage -> (3D triangles) -> Geometry Stage -> (2D triangles) -> Rasterization Stage -> (pixels)
Geometry stage, for each triangle vertex:
- Transform 3D position into screen position
- Compute attributes
Rasterization stage, for each triangle:
- Rasterize triangle
- Interpolate vertex attributes across triangle
- Shade pixels
- Resolve visibility

PC Architecture
- Motherboard: Central Processor Unit (CPU), System Memory, Bus Port (PCI, AGP, PCIe)
- Video Board: Graphics Processor Unit (GPU), Video Memory

1995-1998: Texture Mapping and Z-Buffer
- CPU: Application / Geometry Stage
- GPU: Rasterization Stage (Rasterizer, Texture Unit, Raster Operations Unit)
- 2D triangles and textures travel from system memory over the bus (PCI) to video memory, which also holds the frame buffer
- PCI: Peripheral Component Interconnect
- Example hardware: 3dfx's Voodoo

Texture Mapping
[Figure: triangle mesh + base texture = triangle mesh textured with the base texture]

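Texture Mapping: Texture Coordinates Interpolation
[Figure: a triangle with screen-space vertices (x0, y0), (x1, y1), (x2, y2) carries texture-space coordinates (u0, v0), (u1, v1), (u2, v2); a pixel at (x, y) inside the triangle maps to the texel at the interpolated coordinates (u, v)]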
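The interpolation in the figure can be sketched with barycentric weights. This is a minimal illustration, not the hardware's actual method; the function names are hypothetical, and the interpolation is done linearly in screen space, which is the perspective-incorrect variant.

```python
def barycentric_weights(p, a, b, c):
    """Weights (w0, w1, w2) such that p = w0*a + w1*b + w2*c in 2D."""
    x, y = p
    (x0, y0), (x1, y1), (x2, y2) = a, b, c
    # Twice the signed area of the triangle.
    det = (x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)
    if det == 0:
        raise ValueError("degenerate triangle")
    w1 = ((x - x0) * (y2 - y0) - (x2 - x0) * (y - y0)) / det
    w2 = ((x1 - x0) * (y - y0) - (x - x0) * (y1 - y0)) / det
    return 1.0 - w1 - w2, w1, w2


def interpolate_uv(p, screen_vertices, uv_vertices):
    """Interpolate per-vertex (u, v) at screen position p."""
    weights = barycentric_weights(p, *screen_vertices)
    u = sum(w * uv[0] for w, uv in zip(weights, uv_vertices))
    v = sum(w * uv[1] for w, uv in zip(weights, uv_vertices))
    return u, v
```

For example, `interpolate_uv((1, 1), [(0, 0), (4, 0), (0, 4)], [(0, 0), (1, 0), (0, 1)])` weights the three vertices by 0.5, 0.25 and 0.25.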

Texture Mapping: Perspective-Correct Interpolation
[Figure: the same textured surface rendered with perspective-incorrect and with perspective-correct interpolation]
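The difference between the two images can be sketched as follows, assuming the usual formulation: attribute/w and 1/w are interpolated linearly in screen space and then divided. The function names and the per-vertex `ws` (clip-space w values) are illustrative assumptions.

```python
def affine_interpolate(weights, attrs):
    """Perspective-incorrect: interpolate the attribute directly in screen space."""
    return sum(b * a for b, a in zip(weights, attrs))


def perspective_correct_interpolate(weights, attrs, ws):
    """Perspective-correct: interpolate attr/w and 1/w, then divide."""
    inv_w = sum(b / w for b, w in zip(weights, ws))
    attr_over_w = sum(b * a / w for b, a, w in zip(weights, attrs, ws))
    return attr_over_w / inv_w
```

Halfway along an edge whose endpoints have u = 0 at w = 1 and u = 1 at w = 3, the affine result is 0.5 while the perspective-correct result is 0.25: the half of the edge nearer the viewer covers more of the screen. When all vertices share the same w, both functions agree.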

Texture Mapping: Magnification
[Figure: a pixel at (x, y) in screen space and the pixel's texture footprint at (u, v) in texture space]
Filtering options: nearest-point sampling or bilinear filtering
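The two filters named on this slide can be sketched as below. The texture is a list of rows of scalar texels, (u, v) is assumed to lie in [0, 1], and texel centers are assumed to sit at half-texel offsets; these conventions and the function names are assumptions for illustration.

```python
import math


def sample_nearest(tex, u, v):
    """Nearest-point sampling: return the texel that contains (u, v)."""
    height, width = len(tex), len(tex[0])
    x = min(int(u * width), width - 1)
    y = min(int(v * height), height - 1)
    return tex[y][x]


def sample_bilinear(tex, u, v):
    """Bilinear filtering: blend the four texels surrounding (u, v)."""
    height, width = len(tex), len(tex[0])
    # Shift by half a texel so that texel centers fall on integer coordinates.
    x = u * width - 0.5
    y = v * height - 0.5
    x0, y0 = math.floor(x), math.floor(y)
    fx, fy = x - x0, y - y0

    def texel(ix, iy):
        # Clamp addressing at the texture borders.
        return tex[max(0, min(iy, height - 1))][max(0, min(ix, width - 1))]

    top = texel(x0, y0) * (1 - fx) + texel(x0 + 1, y0) * fx
    bottom = texel(x0, y0 + 1) * (1 - fx) + texel(x0 + 1, y0 + 1) * fx
    return top * (1 - fy) + bottom * fy
```

On a 2x2 texture whose left column is 0 and right column is 1, nearest-point sampling jumps from 0 to 1 at u = 0.5, whereas bilinear filtering returns 0.5 there.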

Texture Mapping: Minification
[Figure: a pixel at (x, y) in screen space and the pixel's texture footprint at (u, v) in texture space]

Texture Mapping: Mipmapping
Filtering options: bilinear filtering or trilinear filtering
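A minimal sketch of mipmapping, assuming a square power-of-two texture, a 2x2 box filter to build the chain, and a level of detail equal to log2 of the footprint size in level-0 texels. For brevity each level is fetched with nearest-point sampling; trilinear filtering would use a bilinear fetch in each of the two levels before this same blend. All names are hypothetical.

```python
import math


def build_mip_chain(level0):
    """Return [level0, level1, ...] down to 1x1, averaging 2x2 texel blocks."""
    levels = [level0]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        half = len(prev) // 2
        levels.append([
            [(prev[2 * y][2 * x] + prev[2 * y][2 * x + 1]
              + prev[2 * y + 1][2 * x] + prev[2 * y + 1][2 * x + 1]) / 4.0
             for x in range(half)]
            for y in range(half)
        ])
    return levels


def fetch_nearest(level, u, v):
    size = len(level)
    x = min(int(u * size), size - 1)
    y = min(int(v * size), size - 1)
    return level[y][x]


def sample_mipmapped(levels, u, v, footprint):
    """Blend the two mip levels that bracket the pixel's texture footprint."""
    lod = min(math.log2(max(footprint, 1.0)), len(levels) - 1)
    lo = int(math.floor(lod))
    hi = min(lo + 1, len(levels) - 1)
    frac = lod - lo
    return ((1.0 - frac) * fetch_nearest(levels[lo], u, v)
            + frac * fetch_nearest(levels[hi], u, v))
```

With a 4x4 checkerboard of 0s and 1s, a footprint of one texel returns the raw texel, while a footprint of two texels or more returns the pre-averaged 0.5 instead of an aliased 0 or 1.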

Texture Mapping: Anisotropic Filtering
Combined with bilinear filtering or trilinear filtering

Texture Mapping: Addressing Modes
Wrap, Mirror

Texture Mapping: Addressing Modes
Clamp, Border
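The four addressing modes decide what happens to a texture coordinate outside [0, 1]. A minimal sketch for one coordinate follows; the function name is hypothetical, and border mode is represented here by returning `None` to mean "use the border color".

```python
import math


def address(u, mode):
    """Map a texture coordinate u into [0, 1] according to the addressing mode."""
    if mode == "wrap":
        # Repeat the texture: keep only the fractional part.
        return u - math.floor(u)
    if mode == "mirror":
        # Repeat with every other copy flipped (period 2).
        t = u % 2.0
        return 2.0 - t if t > 1.0 else t
    if mode == "clamp":
        # Stretch the edge texels outward.
        return min(max(u, 0.0), 1.0)
    if mode == "border":
        # Outside the texture the caller substitutes a constant border color.
        return u if 0.0 <= u <= 1.0 else None
    raise ValueError("unknown addressing mode: " + mode)
```

For u = 1.25 this gives 0.25 (wrap), 0.75 (mirror), 1.0 (clamp) and `None` (border).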

Raster Operations Unit (ROP)
Rasterizer -> Texture Unit -> (fragments) -> Raster Operations Unit -> (pixels) -> Frame Buffer
Each fragment carries a screen position (x, y), an alpha value a, a depth z and a color (r, g, b). It goes through, in order:
- Scissor Test: (x, y) tested against the scissor rectangle
- Alpha Test: a tested against a reference value
- Stencil Test: stencil buffer value at (x, y) tested against a reference value
- Z Test (visibility test): z tested against the z-buffer value at (x, y)
- Alpha Blending: color (r, g, b) blended with the color buffer value at (x, y): K_src * Color_src + K_dst * Color_dst (src = fragment, dst = color buffer)
The frame buffer holds the stencil buffer, the z-buffer and the color buffer.
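The sequence of tests can be sketched as a single function. The comparison functions (alpha "greater", stencil "equal", depth "less") and the blend factors K_src = a, K_dst = 1 - a are assumptions chosen for illustration; on real hardware they are configurable state. The dictionary layout is hypothetical.

```python
def blend(src, dst, k_src, k_dst):
    """K_src * Color_src + K_dst * Color_dst, per channel."""
    return tuple(k_src * s + k_dst * d for s, d in zip(src, dst))


def raster_op(fragment, frame_buffer, scissor, alpha_ref, stencil_ref):
    """Run one fragment through the ROP tests; return True if it was written."""
    x, y = fragment["pos"]
    x0, y0, x1, y1 = scissor
    if not (x0 <= x < x1 and y0 <= y < y1):          # scissor test
        return False
    if fragment["alpha"] <= alpha_ref:               # alpha test (assumed: greater)
        return False
    if frame_buffer["stencil"][y][x] != stencil_ref: # stencil test (assumed: equal)
        return False
    if fragment["z"] >= frame_buffer["depth"][y][x]: # z test (assumed: less)
        return False
    a = fragment["alpha"]
    frame_buffer["color"][y][x] = blend(
        fragment["color"], frame_buffer["color"][y][x], a, 1.0 - a)
    frame_buffer["depth"][y][x] = fragment["z"]
    return True
```

A half-transparent white fragment in front of a black background therefore leaves mid-gray in the color buffer, and a later fragment behind it is rejected by the z test.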

1998: Multitexturing
- CPU: Application / Geometry Stage
- GPU: Rasterization Stage (Rasterizer, Multitexture Unit, Raster Operations Unit)
- 2D triangles and textures travel from system memory over the bus (AGP) to video memory, which also holds the frame buffer
- AGP: Accelerated Graphics Port
- Example hardware: NVIDIA's TNT, ATI's Rage

AGP
- PCI uses a parallel connection; AGP uses a serial connection: fewer pins, simpler protocol, cheaper, more scalable
- PCI uses a shared-bus protocol; AGP uses a point-to-point protocol: bandwidth is not shared among devices
- AGP uses a dedicated system memory called AGP memory or non-local video memory
  - The GPU can look up textures that reside in AGP memory
  - Its size is called the AGP aperture
- Bandwidth: AGP = 2 x PCI (AGP2x = 2 x AGP, etc.)

Multitexturing
[Figure: base texture modulated by (x) light map = lit result. Image from UT2004, (c) Epic Games Inc. Used with permission.]

1999-2000: Transform and Lighting
- CPU: Application Stage
- GPU (fixed-function pipeline):
  - Geometry Stage: Transform and Lighting Unit
  - Rasterization Stage: Rasterizer, Texture Unit, Register Combiner, Raster Operations Unit
- 3D triangles and textures travel from system memory over the bus (AGP) to video memory, which also holds the frame buffer
- Register Combiner: offers many more texture/color combinations
- Example hardware: NVIDIA's GeForce 256 and GeForce2, ATI's Radeon 7500, S3's Savage3D

Transform and Lighting Unit (TnL)
Transform:
- Model or Object Space -> (World Matrix) -> World Space
- World Space -> (View Matrix) -> Camera or Eye Space
- The Model-View Matrix combines the world and view matrices
- Camera or Eye Space -> (Projection Matrix) -> Projection or Clip Space
- Projection or Clip Space -> (Perspective Division and Viewport Matrix) -> Screen or Window Space
Lighting:
- Inputs: material properties, light properties, vertex color
- Outputs: vertex diffuse and specular color
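The transform chain can be sketched with plain 4x4 matrices. The column-vector convention, the window origin in the top-left corner and the function names are assumptions made for this illustration; the slide does not prescribe them.

```python
def mat_vec(m, v):
    """Multiply a 4x4 matrix (list of rows) by a 4-component column vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]


def to_screen(position, world, view, projection, width, height):
    """Model space -> world -> eye -> clip -> screen space."""
    clip = mat_vec(projection, mat_vec(view, mat_vec(world, position)))
    x, y, z, w = clip
    # Perspective division yields normalized coordinates in [-1, 1].
    nx, ny, nz = x / w, y / w, z / w
    # Viewport transform; y is flipped because the window origin is assumed top-left.
    sx = (nx * 0.5 + 0.5) * width
    sy = (1.0 - (ny * 0.5 + 0.5)) * height
    return sx, sy, nz


IDENTITY = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
# A deliberately simple projection (w = -z), enough to show the perspective division.
SIMPLE_PROJECTION = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, -1, 0]]
```

With these matrices, the model-space point (1, 1, -2, 1) on a 640x480 window lands at screen position (480, 120).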

Bump Mapping
Bump mapping is about fetching the normal from a texture (called a normal map) instead of using the interpolated normal to compute lighting at a given pixel.
[Figure: diffuse light without bumps + normal map = diffuse light with bumps]

Cube Texture Mapping
- Cubemap: a texture covering the six faces of a cube
- Cubemap lookup: performed with a direction (x, y, z)
- Environment mapping: the reflection vector is used to look up the cubemap
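A cubemap lookup can be sketched as picking the face along the direction's largest-magnitude axis and projecting the other two components onto it. The face orientation below follows the OpenGL convention, which is an assumption here since the slide does not name one; the function names are hypothetical.

```python
def reflect(incident, normal):
    """Reflection of the incident vector about a unit-length normal."""
    d = sum(i * n for i, n in zip(incident, normal))
    return tuple(i - 2.0 * d * n for i, n in zip(incident, normal))


def cubemap_lookup(direction):
    """Return (face, u, v) with u, v in [0, 1] for a direction (x, y, z)."""
    x, y, z = direction
    ax, ay, az = abs(x), abs(y), abs(z)
    if ax == 0 and ay == 0 and az == 0:
        raise ValueError("zero-length direction")
    if ax >= ay and ax >= az:
        face, sc, tc, ma = ("+x", -z, -y, ax) if x > 0 else ("-x", z, -y, ax)
    elif ay >= az:
        face, sc, tc, ma = ("+y", x, z, ay) if y > 0 else ("-y", x, -z, ay)
    else:
        face, sc, tc, ma = ("+z", x, -y, az) if z > 0 else ("-z", -x, -y, az)
    return face, (sc / ma + 1.0) / 2.0, (tc / ma + 1.0) / 2.0
```

For environment mapping, the view vector is reflected about the surface normal and the result is passed to `cubemap_lookup`; the direction does not need to be normalized because only the ratios of its components matter.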

Projective Texture Mapping
- Texture projection: a texture is projected onto the scene
- Projective texture lookup: a point (x, y, z, w) fetches the projected texture at (x/w, y/w)

2001: Programmable Vertex Shader
- CPU: Application Stage
- GPU:
  - Geometry Stage: Vertex Shader (no flow control)
  - Rasterization Stage: Rasterizer (with Z-Cull), Texture Shader, Register Combiner, Raster Operations Unit
- 3D triangles and textures travel from system memory over the bus (AGP) to video memory, which also holds the frame buffer
- Z-Cull: predicts which fragments will fail the Z test and discards them
- Texture Shader: offers more texture addressing and operations
- Example hardware: NVIDIA's GeForce3 and GeForce4 Ti, ATI's Radeon 8500

Vertex Shader
A programmable processor for any per-vertex computation

    void VertexShader(
        // Input per vertex
        in float4 positionInModelSpace,
        in float2 textureCoordinates,
        in float3 normal,
        // Input per batch of triangles
        uniform float4x4 modelToProjection,
        uniform float3 lightDirection,
        // Output per vertex
        out float4 positionInProjectionSpace,
        out float2 textureCoordinatesOutput,
        out float3 color
    )
    {
        // Vertex transformation
        positionInProjectionSpace = mul(modelToProjection, positionInModelSpace);
        // Texture coordinates copy
        textureCoordinatesOutput = textureCoordinates;
        // Vertex color computation
        color = dot(lightDirection, normal);
    }

Volume Texture Mapping
- Volume texture lookup: performed with a position (x, y, z)
[Figure: a volume texture holding 3D noise, used for noise perturbation]

Hardware Shadow Mapping
Shadow map computation:
- The shadow map contains the depth z/w of the 3D points visible from the light's point of view
- Each point (x, y, z, w) seen from the spot light stores z/w at (x/w, y/w)
Shadow rendering:
- A 3D point (x, y, z, w), expressed in the light's projection space, is in shadow if z/w > value of the shadow map at (x/w, y/w), that is, if it lies farther from the light than the closest surface recorded there
- A hardware shadow map lookup returns the value of this comparison, between 0 and 1
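A minimal sketch of the lookup follows. It treats a point as shadowed when its light-space depth exceeds the stored depth, and averages the comparison over a 2x2 texel block so that the result lies between 0 (fully shadowed) and 1 (fully lit). The 2x2 averaging, the small depth bias and the assumption that (x/w, y/w) already lies in [0, 1] are illustrative choices, not taken from the slide.

```python
def shadow_lookup(shadow_map, light_space_point, bias=0.001):
    """Return the fraction of nearby shadow-map texels that see the point as lit."""
    x, y, z, w = light_space_point
    u, v, depth = x / w, y / w, z / w
    height, width = len(shadow_map), len(shadow_map[0])
    ix = min(max(int(u * width), 0), width - 1)
    iy = min(max(int(v * height), 0), height - 1)
    lit = 0
    for dy in (0, 1):
        for dx in (0, 1):
            sx = min(ix + dx, width - 1)
            sy = min(iy + dy, height - 1)
            # Lit if the point is no farther from the light than the stored depth.
            if depth <= shadow_map[sy][sx] + bias:
                lit += 1
    return lit / 4.0
```

A result such as 0.5 occurs on shadow boundaries, where only part of the texel block passes the comparison.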

Antialiasing: Definition
Aliasing: undesirable visual artifacts due to insufficient sampling of:
- Primitives (triangles, lines, etc.): jagged edges
- Textures or shaders: pixelation, moiré patterns
Those artifacts are even more noticeable on animated images.
Antialiasing: method to reduce aliasing
- Texture antialiasing is largely handled by proper mipmapping and anisotropic filtering
- Shader antialiasing can be tricky (especially with conditionals)

Antialiasing: Supersampling and Multisampling
- Supersampling: compute color and Z at higher resolution and display the averaged color to smooth out the visual artifacts
- Multisampling: same thing, except only Z is computed at higher resolution; as a result, multisampling performs antialiasing on primitive edges only
[Figure: a pixel with its center and its sample positions]
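Supersampling can be sketched as evaluating the shading function at several positions inside each pixel and averaging. The regular n x n sample grid and the names below are assumptions; actual hardware sample patterns differ.

```python
def supersample_pixel(shade, px, py, n):
    """Average shade(x, y) over an n x n grid of samples inside pixel (px, py)."""
    total = 0.0
    for j in range(n):
        for i in range(n):
            x = px + (i + 0.5) / n
            y = py + (j + 0.5) / n
            total += shade(x, y)
    return total / (n * n)


def vertical_edge(x, y):
    """A primitive that covers everything left of x = 0.5."""
    return 1.0 if x < 0.5 else 0.0
```

With one sample per pixel, pixel (0, 0) is either fully covered or not at all; with 2x2 samples it receives the intermediate value 0.5, which is what smooths a jagged edge.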

2002-2003: Programmable Pixel Shader
- CPU: Application Stage
- GPU:
  - Geometry Stage: Vertex Shader (static and dynamic flow control)
  - Rasterization Stage: Rasterizer (with Z-Cull), Pixel Shader (static flow control only), Texture Unit, Raster Operations Unit
- 3D triangles and textures travel from system memory over the bus (AGP) to video memory, which also holds the frame buffer
- MRT: Multiple Render Target
- Example hardware: NVIDIA's GeForce FX, ATI's Radeon 9600 to 9800 and X600 to X800

Pixel Shader
A programmable processor for any per-pixel computation

    void PixelShader(
        // Input per pixel
        in float2 textureCoordinates,
        in float3 normal,
        // Input per batch of triangles
        uniform sampler2D baseTexture,
        uniform float3 lightDirection,
        // Output per pixel
        out float3 color
    )
    {
        // Texture lookup
        float3 baseColor = tex2D(baseTexture, textureCoordinates).rgb;
        // Light computation
        float light = dot(lightDirection, normal);
        // Pixel color computation
        color = baseColor * light;
    }

Shader: Static vs. Dynamic Flow Control

    void Shader(
        ...
        // Input per vertex or per pixel
        in float3 normal,
        // Input per batch of triangles
        uniform float3 lightDirection,
        uniform bool computeLight,
        ...
    )
    {
        ...
        // Static flow control: the condition varies per batch of triangles
        if (computeLight) {
            ...
            // Dynamic flow control: the condition varies per vertex or pixel
            if (dot(lightDirection, normal) > 0) {
                ...
            }
            ...
        }
        ...
    }

2004: Shader Model 3.0 and 64-Bit Color Support
- CPU: Application Stage
- GPU:
  - Geometry Stage: Vertex Shader (static and dynamic flow control)
  - Rasterization Stage: Rasterizer (with Z-Cull), Pixel Shader (static and dynamic flow control), Texture Unit, Raster Operations Unit, with 64-bit color
- 3D triangles and textures travel from system memory over the bus (PCIe) to video memory, which also holds the frame buffer
- PCIe: Peripheral Component Interconnect Express
- Example hardware: NVIDIA's GeForce 6 Series (6800 and 6600)

PCIe
Like AGP:
- Uses a serial connection: cheap, scalable
- Uses a point-to-point protocol: no shared bandwidth
Unlike AGP:
- General-purpose (not only for graphics)
- Dual channels: bandwidth is available in both directions
Bandwidth: PCIe = 2 x AGP8x

Multi-GPU Architecture
NVIDIA's Scalable Link Interface (SLI) multi-GPU technology takes advantage of the increased bandwidth of PCI Express to automatically accelerate applications through a combination of intelligent hardware and software solutions.
[Figure: two video boards linked by the SLI connector]

Shader Model 3.0
Shader Model 3.0 means:
- Longer shaders -> more complex shading
- Pixel shader:
  - Dynamic flow control -> better performance
  - Derivative instructions -> shader antialiasing
  - Support for 32-bit floating-point precision -> fewer artifacts
  - Face register -> faster two-sided lighting
- Vertex shader:
  - Texture access -> simulation on GPU, displacement mapping
  - Vertex buffer frequency -> efficient geometry instancing

Shader Model 3.0 Unleashed Image used with permission from Pacific Fighters. 2004 Developed by 1C:Maddox Games. All rights reserved. 2004 Ubi Soft Entertainment.

64-Bit Color Support
- 64-bit color means one 16-bit floating-point value per channel (R, G, B, A)
- Alpha blending works with a 64-bit color buffer (as opposed to a 32-bit fixed-point color buffer only)
- Texture filtering works with 64-bit textures (as opposed to 32-bit fixed-point textures only)
Applications:
- High-precision image compositing
- High dynamic range imagery
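The storage difference can be illustrated with Python's `struct` module, whose `e` format code is an IEEE 754 half-precision float. The helper names are hypothetical; the sketch only shows the bit layout (4 x 16 bits versus 4 x 8 bits), not any GPU API.

```python
import struct


def pack_rgba16f(r, g, b, a):
    """64-bit color: four 16-bit floating-point channels."""
    return struct.pack("<4e", r, g, b, a)


def unpack_rgba16f(data):
    return struct.unpack("<4e", data)


def pack_rgba8(r, g, b, a):
    """32-bit fixed-point color: four 8-bit channels, clamped to [0, 1]."""
    return bytes(int(min(max(c, 0.0), 1.0) * 255.0 + 0.5) for c in (r, g, b, a))
```

A red value of 4.0 (brighter than display white) survives a round trip through the 64-bit format, whereas the 32-bit fixed-point format clamps it to 255.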

High Dynamic Range Imagery
- The dynamic range of a scene is the ratio of the highest to the lowest luminance
- Real-life scenes can have high dynamic ranges of several millions
- Display and print devices have a low dynamic range of around 100
- Tone mapping is the process of displaying high dynamic range images on those low dynamic range devices
- High dynamic range images use floating-point colors
- OpenEXR is a high dynamic range image format that is compatible with NVIDIA's 64-bit color format

Real-Time Tone Mapping
The image is entirely computed in 64-bit color and tone-mapped for display.
[Figure: images of the same scene, from low to high exposure]
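A minimal sketch of tone mapping with an exposure control follows. The slides do not say which operator was used; the curve L / (1 + L) below is a common simple choice (often attributed to Reinhard) and is an assumption here, as are the function names.

```python
def dynamic_range(luminances):
    """Ratio of the highest to the lowest luminance in the scene."""
    return max(luminances) / min(luminances)


def tone_map(luminance, exposure=1.0):
    """Compress an unbounded luminance into [0, 1) after scaling by exposure."""
    scaled = luminance * exposure
    return scaled / (1.0 + scaled)


def to_8bit(value):
    """Quantize a [0, 1] value for a low dynamic range display."""
    return int(min(max(value, 0.0), 1.0) * 255.0 + 0.5)
```

Raising `exposure` brightens the whole image while very bright values still stay below display white, which matches the low-to-high exposure series described on the slide.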

PC Graphics Software Architecture
- CPU side: Application -> 3D API (OpenGL or DirectX) -> Driver
- Sent over the bus to the GPU: commands and programs (vertex program, pixel program)
- GPU side: the vertex shader runs the vertex program; the pixel shader runs the pixel program
- System memory and video memory hold geometry (triangles, vertices, normals, etc.) and textures
- The application, 3D API and driver are written in C or C++
- The vertex and pixel programs are written in a high-level shading language (Cg, DirectX HLSL, OpenGL Shading Language)

Evolution of Performance
[Chart, 1995 to 2004, vertical axis ticks 10, 100, 1000, 10 000: Mpixels/s, Mvertices/s and Mtransistors per year]
Timeline below the chart:
- Bus: PCI (133 MB/s), AGP (266 MB/s), AGP2x (533 MB/s), AGP4x (1.06 GB/s), AGP8x (2.1 GB/s), PCIe (4 GB/s)
- Video memory: 4 MB, 32 MB, 64 MB, 128 MB, 256 MB, 512 MB
- API: DirectX 1, 2, 3, 5, 6, 7, 8, 9; OpenGL 1.1, 1.2, 1.3, 1.4, 1.5

The Future
- Unified general programming model at primitive, vertex and pixel levels
- Scary amounts of:
  - Floating-point horsepower
  - Video memory
  - Bandwidth between system and video memory
- Lower chip costs and power requirements to make 3D graphics hardware ubiquitous:
  - Automotive (gaming, navigation, heads-up displays)
  - Home (remotes, media center, automation)
  - Mobile (PDAs, cell phones)

References
- Tons of resources at http://developer.nvidia.com: code samples, programming guides, recent conference presentations
- A good website and book on real-time rendering: http://www.realtimerendering.com

Questions
Support e-mail:
- devrelfeedback@nvidia.com [Technical Questions]
- sdkfeedback@nvidia.com [Tools Questions]