Evolution of GPUs Chris Seitz

Similar documents
Programming Graphics Hardware

Lecture 2. Shaders, GLSL and GPGPU

Real - Time Rendering. Graphics pipeline. Michal Červeňanský Juraj Starinský

Real-Time Rendering (Echtzeitgraphik) Michael Wimmer

Graphics Processing Unit Architecture (GPU Arch)

Graphics Hardware. Instructor Stephen J. Guy

Spring 2009 Prof. Hyesoon Kim

Mattan Erez. The University of Texas at Austin

Real - Time Rendering. Pipeline optimization. Michal Červeňanský Juraj Starinský

3D Rasterization II COS 426

Spring 2011 Prof. Hyesoon Kim

Tutorial on GPU Programming #2. Joong-Youn Lee Supercomputing Center, KISTI

Monday Morning. Graphics Hardware

GeForce4. John Montrym Henry Moreton

The Rasterization Pipeline

Graphics Performance Optimisation. John Spitzer Director of European Developer Technology

Shaders. Slide credit to Prof. Zwicker

CS GPU and GPGPU Programming Lecture 7: Shading and Compute APIs 1. Markus Hadwiger, KAUST

CS 354R: Computer Game Technology

Rendering. Converting a 3D scene to a 2D image. Camera. Light. Rendering. View Plane

Whiz-Bang Graphics and Media Performance for Java Platform, Micro Edition (JavaME)

C P S C 314 S H A D E R S, O P E N G L, & J S RENDERING PIPELINE. Mikhail Bessmeltsev

CS451Real-time Rendering Pipeline

Today. Texture mapping in OpenGL. Texture mapping. Basic shaders for texturing. Today. Computergrafik

Readings on graphics architecture for Advanced Computer Architecture class

CS GPU and GPGPU Programming Lecture 2: Introduction; GPU Architecture 1. Markus Hadwiger, KAUST

Real-World Applications of Computer Arithmetic

The Rasterization Pipeline

Mattan Erez. The University of Texas at Austin

Windowing System on a 3D Pipeline. February 2005

Computergrafik. Matthias Zwicker Universität Bern Herbst 2016

Programming Guide. Aaftab Munshi Dan Ginsburg Dave Shreiner. TT r^addison-wesley

Rendering Objects. Need to transform all geometry then

Advanced Computer Graphics (CS & SE ) Lecture 7

Rasterization Overview

Graphics Hardware. Ulf Assarsson. Graphics hardware why? Recall the following. Perspective-correct texturing

Shaders (some slides taken from David M. course)

3D buzzwords. Adding programmability to the pipeline 6/7/16. Bandwidth Gravity of modern computer systems

Pipeline Operations. CS 4620 Lecture Steve Marschner. Cornell CS4620 Spring 2018 Lecture 11

Graphics Hardware, Graphics APIs, and Computation on GPUs. Mark Segal

Chapter 2 A top-down approach - How to make shaded images?

Hardware-driven visibility culling

Scanline Rendering 2 1/42

Programmable GPUs. Real Time Graphics 11/13/2013. Nalu 2004 (NVIDIA Corporation) GeForce 6. Virtua Fighter 1995 (SEGA Corporation) NV1

1.2.3 The Graphics Hardware Pipeline

CHAPTER 1 Graphics Systems and Models 3

CS427 Multicore Architecture and Parallel Computing

Textures. Texture coordinates. Introduce one more component to geometry

Graphics Hardware. Graphics Processing Unit (GPU) is a Subsidiary hardware. With massively multi-threaded many-core. Dedicated to 2D and 3D graphics

Pipeline Operations. CS 4620 Lecture 14

Bringing AAA graphics to mobile platforms. Niklas Smedberg Senior Engine Programmer, Epic Games

CS 130 Final. Fall 2015

X. GPU Programming. Jacobs University Visualization and Computer Graphics Lab : Advanced Graphics - Chapter X 1

Zeyang Li Carnegie Mellon University

CS230 : Computer Graphics Lecture 4. Tamar Shinar Computer Science & Engineering UC Riverside

The NVIDIA GeForce 8800 GPU

Computer Graphics (CS 543) Lecture 10: Normal Maps, Parametrization, Tone Mapping

Architectures. Michael Doggett Department of Computer Science Lund University 2009 Tomas Akenine-Möller and Michael Doggett 1

Practical Performance Analysis Koji Ashida NVIDIA Developer Technology Group

Advanced Rendering Techniques

PROFESSIONAL. WebGL Programming DEVELOPING 3D GRAPHICS FOR THE WEB. Andreas Anyuru WILEY. John Wiley & Sons, Ltd.

Supplement to Lecture 22

TSBK03 Screen-Space Ambient Occlusion

Graphics Hardware. Computer Graphics COMP 770 (236) Spring Instructor: Brandon Lloyd 2/26/07 1

Hardware-Assisted Relief Texture Mapping

Optimizing and Profiling Unity Games for Mobile Platforms. Angelo Theodorou Senior Software Engineer, MPG Gamelab 2014, 25 th -27 th June

Introduction to Shaders.

Texture mapping. Computer Graphics CSE 167 Lecture 9

The Application Stage. The Game Loop, Resource Management and Renderer Design

Sung-Eui Yoon ( 윤성의 )

A Trip Down The (2011) Rasterization Pipeline

3D Computer Games Technology and History. Markus Hadwiger VRVis Research Center

Deferred Rendering Due: Wednesday November 15 at 10pm

Programmable Graphics Hardware

Levy: Constraint Texture Mapping, SIGGRAPH, CS 148, Summer 2012 Introduction to Computer Graphics and Imaging Justin Solomon

CSE 167: Introduction to Computer Graphics Lecture #5: Rasterization. Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2015

The Rasterizer Stage. Texturing, Lighting, Testing and Blending

2.11 Particle Systems

3D Rendering Pipeline

A Bandwidth Effective Rendering Scheme for 3D Texture-based Volume Visualization on GPU

The Source for GPU Programming

CS 4620 Program 3: Pipeline

Spring 2010 Prof. Hyesoon Kim. AMD presentations from Richard Huddy and Michael Doggett

Hardware Accelerated Volume Visualization. Leonid I. Dimitrov & Milos Sramek GMI Austrian Academy of Sciences

Module 13C: Using The 3D Graphics APIs OpenGL ES

Building scalable 3D applications. Ville Miettinen Hybrid Graphics

Lecture 6: Texture. Kayvon Fatahalian CMU : Graphics and Imaging Architectures (Fall 2011)

Models and Architectures

Real-Time Universal Capture Facial Animation with GPU Skin Rendering

Render all data necessary into textures Process textures to calculate final image

OpenGL SUPERBIBLE. Fifth Edition. Comprehensive Tutorial and Reference. Richard S. Wright, Jr. Nicholas Haemel Graham Sellers Benjamin Lipchak

How to Work on Next Gen Effects Now: Bridging DX10 and DX9. Guennadi Riguer ATI Technologies

HISTORY OF MAJOR REVISIONS

PowerVR Hardware. Architecture Overview for Developers

Multi-View Soft Shadows. Louis Bavoil

Optimizing Games for ATI s IMAGEON Aaftab Munshi. 3D Architect ATI Research


Shaders CSCI 4239/5239 Advanced Computer Graphics Spring 2014

Working with Metal Overview

Introduction to Computer Graphics with WebGL

Transcription:

Evolution of GPUs Chris Seitz

Overview Concepts: Real-time rendering Hardware graphics pipeline Evolution of the PC hardware graphics pipeline: 1995-1998: Texture mapping and z-buffer 1998: Multitexturing 1999-2000: Transform and lighting 2001: Programmable vertex shader 2002-2003: Programmable pixel shader 2004: Shader model 3.0 and 64-bit color support

Real-Time Rendering Graphics hardware enables real-time rendering Real-time means display rate at more than 10 images per second 3D Scene = Collection of 3D primitives (triangles, lines, points) Image = Array of pixels

PC Architecture Motherboard CPU System Memory Bus Port (PCI, AGP, PCIe) Graphics Board Video Memory GPU

Hardware Graphics Pipeline Application Stage 3D Triangles Geometry Stage 2D Triangles Rasterization Stage Pixels For each triangle vertex: For each triangle: Transform Rasterize position: triangle 3D -> screen Compute Interpolate vertex attributes attributes Shade pixels Resolve visibility

Real-time Graphics 1997: RIVA 128 3M Transistors

Real-time Graphics 2004: UnReal Engine 3.0 Images Courtesy of Epic Games

95-98: Texture Mapping & Z-Buffer CPU GPU Application / Geometry Stage Rasterizer Rasterization Stage Texture Unit Raster Operations Unit 2D Triangles Bus Textures (PCI) 2D Triangles Textures Frame Buffer System Memory Video Memory

Raster Operations Unit (ROP) Rasterizer Texture Unit Fragments Raster Operations Unit Fragment Screen Position (x, y) tested against scissor rectangle tested against reference value Scissor Test Alpha Test Frame Buffer Alpha Value a Depth z Color (r, g, b) stencil buffer value at (x, y) tested against reference value tested against z-buffer value at (x, y) (Visibility test) Stencil Test Z Test Alpha Blending Stencil Buffer Z-Buffer Color Buffer Pixels blended with color buffer value at (x, y) K src * Color src + K dst * Color src (src = fragment, dst = color buffer)

Texture Mapping Triangle Mesh (with UV coordinates) Base Texture + = Sampling Magnification, Minification Filtering Bilinear, Trilinear, Anisotropic Mipmapping Perspective Correct Interpolation

Texture Mapping: Mipmapping & Filtering or Bilinear Filtering Trilinear Filtering

Filtering Examples Point Bilinear Trilinear Anisotropic (4X) & Trilinear

Incoming

Riva TNT 7M Transistors - 1998 Multitexture

1998: Multitexturing CPU GPU Application / Geometry Stage Rasterizer Rasterization Stage Multi Texture Unit Raster Operations Unit 2D Triangles Textures Bus (AGP) 2D Triangles 2 Textures Frame Buffer System Memory Video Memory

Multitexturing Base Texture modulated by Light Map X = from UT2004 (c) Epic Games Inc. Used with permission

GeForce 256 23M Transistors - 1999 Hardware Transform & Lighting

1999-2000: Transform and Lighting CPU GPU Fixed Function Pipeline Application Stage Geometry Stage Rasterization Stage Transform & Lighting Unit Rasterizer Texture Unit Register Combiner Raster Operations Unit 3D Triangles Textures Bus (AGP) 3D Triangles Textures Frame Buffer System Memory Video Memory

Transform and Lighting Unit (TnL) Transform and Lighting Unit Transform Model or Object Space Model-View Matrix Camera or Eye Space Projection Matrix World Matrix World Space View Matrix Lighting Material Properties Light Properties Vertex Color Projection or Clip Space Perspective Division & Viewport Matrix Screen or Window Space Vertex Diffuse & Specular Color

Wanda and Bubble

GeForce 2 Pixel Shading Multitexture

GeForce 3 Programmable Vertex Shading

2001: Programmable Vertex Shader CPU GPU Application Stage Geometry Stage Vertex Shader (no flow control) Rasterizer w/ Z-Cull Rasterization Stage Texture Unit (4 Textures) Register Combiner Raster Operations Unit 3D Triangles Textures Bus (AGP) 3D Triangles Textures Frame Buffer System Memory Video Memory

Vertex Shader A programmable processor for any per-vertex computation void VertexShader( // Input per vertex in float4 positioninmodelspace, in float2 texturecoordinates, in float3 normal, ) { } // Input per batch of triangles uniform float4x4 modeltoprojection, uniform float3 lightdirection, // Output per vertex out float4 positioninprojectionspace, out float2 texturecoordinatesoutput, out float3 color // Vertex transformation positioninprojectionspace = mul(modeltoprojection, positioninmodelspace); // Texture coordinates copy texturecoordinatesoutput = texturecoordinates; // Vertex color computation color = dot(lightdirection, normal);

Bump Mapping Bump mapping involves fetching the per-pixel normal from a normal map texture (instead of using the interpolated vertex normal) in order to compute lighting at a given pixel + = Diffuse light Normal Map Diffuse light with bumps

Cubic Texture Mapping Cubemap (x,y,z) x z y Environment Mapping Reflection vector is used to lookup into the cubemap Cubemap lookup in direction (x, y, z)

Projective Texture Mapping Projected Texture (x, y, z, w) (x/w, y/w) Texture Projection Projective Texture lookup

Volume Texture Mapping z (x,y,z) y Volume Texture 3D Noise x Solid Textures Noise Perturbation Volume Texture lookup with position (x, y, z)

Hardware Shadow Mapping Shadow Map Computation z/w The shadow map contains the depth (z/w) of the 3D points visible from the light s point of view: Spot light (x, y, z, w) (x/w, y/w) Shadow Rendering shadow map A 3D point (x, y, z, w) is in shadow if: z/w < value of shadow map at (x/w, y/w) A hardware shadow map lookup value Spot light (x/w, y/w) (x, y, z, w) returns the value of this comparison between 0 and 1

Antialiasing: Examples

Antialiasing: Supersampling & Multisampling Supersampling: Compute color and Z at higher resolution and display averaged color to smooth out the visual artifacts Multisampling: Same thing except only Z is computed at higher resolution Multisampling performs antialiasing on primitive edges only

X-Isle: Dinosaur Isle Image courtesy of Crytek

GeForce 4 Multiple Vertex and Pixel Shaders

Wolfman

GeForce FX - >100M Transistors Programmable Pixel Shading (DirectX 9.0) Scalable Architecture

02-03: Programmable Pixel Shader CPU GPU Application Stage Geometry Stage Vertex Shader (static & dynamic flow control) Rasterizer w/ Z-Cull Rasterization Stage Texture Unit (16 textures) Pixel Shader (static flow control) Raster Operations Unit 3D Triangles Textures Bus (AGP) 3D Triangles Textures Frame Buffer System Memory Video Memory

PC Graphics Software Architecture CPU GPU Application 3D API (OpenGL or DirectX) BUS Commands Programs Vertex Shader Pixel Shader Frame Buffer Geometry Driver Textures Vertex Program Pixel Program The application, 3D API and driver are written in C or C++ The vertex and pixel programs are written in a high-level shading language (DirectX HLSL, OpenGL Shading Language, Cg) Pushbuffer: Contains the commands to be executed on the GPU

Pixel Shader A programmable processor for any per-pixel computation void PixelShader( // Input per pixel in float2 texturecoordinates, in float3 normal, // Input per batch of triangles uniform sampler2d basetexture, uniform float3 lightdirection, ) { // Output per pixel out float3 color // Texture lookup float3 basecolor = tex2d(basetexture, texturecoordinates); // Light computation float light = dot(lightdirection, normal); } // Pixel color computation color = basecolor * light;

Shader: Static vs. Dynamic Flow Control void Shader(... // Input per vertex or per pixel in float3 normal, Static Flow Control (condition varies per batch of triangles) Dynamic Flow Control (condition varies per vertex or pixel) ) { } // Input per batch of triangles uniform float3 lightdirection, uniform bool computelight,...... if (computelight) {... if (dot(lightdirection, normal)) {... }... }...

GeForce 6800 220M Transistors Shader Model 3.0 FP64 & High Dynamic Range

2004: Shader Model 3.0 & 64-Bit Color Support CPU GPU Application Stage Geometry Stage Vertex Shader (static & dynamic flow control) Rasterizer w/ Z-Cull Rasterization Stage Texture Unit Pixel Shader (static & dynamic flow control) Raster Operations Unit 64-Bit Color 3D Triangles Textures Bus (PCIe) 3D Triangles Textures Frame Buffer System Memory Video Memory

Shader Model 3.0 Longer shaders More complex shading Pixel shader: Dynamic flow control Better performance Derivative instructions Shader antialiasing Support for 32-bit floating-point precision Fewer artifacts Face register Faster two-sided lighting Vertex shader: Texture access Simulation on GPU, displacement mapping Geometry Instancing Better performance Lord of the Rings The Battle for Middle-earth Far Cry

The GeForce 6 Series Amazing Real-time Effects

Shader Model 3.0 / 64-bit Floating Point Processing Unreal Engine 3.0 Running On GeForce 6 Series 2 Million Triangle Detail Mesh Models High Dynamic Range Rendering Fully Customizable Shaders UnReal Engine 3.0 Images Courtesy of Epic Games

Shader Model 3.0 / 64-bit Floating Point Processing Unreal Engine 3.0 Running On GeForce 6 Series UnReal Engine 3.0 Images Courtesy of Epic Games 100 Million Triangle Source Content Scene High Dynamic Range Rendering

High Dynamic Range Imagery The dynamic range of a scene is the ratio of the highest to the lowest luminance Real-life scenes can have high dynamic ranges of several millions Display and print devices have a low dynamic range of around 100 Tone mapping is the process of displaying high dynamic range images on those low dynamic range devices High dynamic range images use floating-point colors OpenEXR is a high dynamic range image format that is compatible with NVIDIA s 64-bit color format HDR Rendering Engine - Compute surface reflectance, save in HDR buffer Contributions from multiple lights are additive (blended) Add image-space special effects & Post to HDR buffer AA, Glow, Depth of Field, Motion Blur

Real-Time Tone Mapping The image is entirely computed in 64-bit color and tone-mapped for display Renderings of the same scene, from low to high exposure

Evolution of Performance 10000 CPU Frequency (GHz) 1000 100 10 1 Pixel Fill Rate Vertex Rate Graphics Bandwidth Bus Bandwidth CPU Frequency Bus Bandwidth (GB/sec) Pixel Fill Rate (MPixels/sec) Vertex Rate (MVerts/sec) Graphics Bandwidth (GB/sec) 0.1 1994 2004 4 MB 32 MB 64 MB 128 MB 256 MB 512 MB DirectX 6 DirectX 7DirectX 8 DirectX 9 OpenGL 1.1 OpenGL 1.4 OpenGL 1.5

Looking Ahead: Now + 10 years 1000000 100000 10000 CPU Frequency (GHz) Bus Bandwidth (GB/sec) Pixel Fill Rate (MPixels/sec) Vertex Rate (MVerts/sec) Graphics flops (GFlops/sec) Graphics Bandwidth (GB/sec) 127 Gvertx 1000 100 Pixel Fill Rate Vertex Rate Graphics flops 10 1.5 Mverts BUS Bandwidth CPU Frequency 100 GHz 0.1 1994 2004 2014 100 MHz

NVIDIA SLI Multi-GPU

Progression of Graphics Virtual Fighter NV1 1Mtrans Wanda NV1x 22Mtrans Wolfman NV2x 63Mtrans Dawn NV3x 130Mtrans Nalu NV4x 222M

Thank You