Real-Time Graphics Architecture

Similar documents
Lecture 4: Geometry Processing. Kayvon Fatahalian CMU : Graphics and Imaging Architectures (Fall 2011)

Real-Time Graphics Architecture

Real-Time Graphics Architecture

Optimisation. CS7GV3 Real-time Rendering

Parallelizing Graphics Pipeline Execution (+ Basics of Characterizing a Rendering Workload)

Rasterization Overview

Rendering. Converting a 3D scene to a 2D image. Camera. Light. Rendering. View Plane

graphics pipeline computer graphics graphics pipeline 2009 fabio pellacini 1

graphics pipeline computer graphics graphics pipeline 2009 fabio pellacini 1

Parallelizing Graphics Pipeline Execution (+ Basics of Characterizing a Rendering Workload)

CSE 167: Lecture #5: Rasterization. Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2012

Advanced Shading and Texturing

CS230 : Computer Graphics Lecture 4. Tamar Shinar Computer Science & Engineering UC Riverside

2D rendering takes a photo of the 2D scene with a virtual camera that selects an axis aligned rectangle from the scene. The photograph is placed into

CS 4620 Program 3: Pipeline

The Traditional Graphics Pipeline

Lecture 2. Shaders, GLSL and GPGPU

Lecture 17: Shadows. Projects. Why Shadows? Shadows. Using the Shadow Map. Shadow Maps. Proposals due today. I will mail out comments

CSE 167: Introduction to Computer Graphics Lecture #4: Vertex Transformation

Next-Generation Graphics on Larrabee. Tim Foley Intel Corp

CS 130 Exam I. Fall 2015

Drawing Fast The Graphics Pipeline

The Traditional Graphics Pipeline

Scheduling the Graphics Pipeline on a GPU

Computergrafik. Matthias Zwicker. Herbst 2010

1.2.3 The Graphics Hardware Pipeline

Rendering Objects. Need to transform all geometry then

The Traditional Graphics Pipeline

CSE 167: Lecture #4: Vertex Transformation. Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2012

Comparing Reyes and OpenGL on a Stream Architecture

Models and Architectures

Drawing Fast The Graphics Pipeline

Drawing Fast The Graphics Pipeline

CHAPTER 1 Graphics Systems and Models 3

Orthogonal Projection Matrices. Angel and Shreiner: Interactive Computer Graphics 7E Addison-Wesley 2015

Real-Time Reyes: Programmable Pipelines and Research Challenges. Anjul Patney University of California, Davis

CS 130 Final. Fall 2015

Could you make the XNA functions yourself?

The Application Stage. The Game Loop, Resource Management and Renderer Design

PowerVR Hardware. Architecture Overview for Developers

CS130 : Computer Graphics. Tamar Shinar Computer Science & Engineering UC Riverside

CS770/870 Spring 2017 Open GL Shader Language GLSL

CS770/870 Spring 2017 Open GL Shader Language GLSL

Pipeline Operations. CS 4620 Lecture 10

Real-Time Graphics Architecture

Deferred Rendering Due: Wednesday November 15 at 10pm

Lecture 25: Board Notes: Threads and GPUs

Standard Graphics Pipeline

Clipping & Culling. Lecture 11 Spring Trivial Rejection Outcode Clipping Plane-at-a-time Clipping Backface Culling

From Vertices to Fragments: Rasterization. Reading Assignment: Chapter 7. Special memory where pixel colors are stored.

E.Order of Operations

3D Graphics for Game Programming (J. Han) Chapter II Vertex Processing

4: Polygons and pixels

Computer Graphics. Shadows

X. GPU Programming. Jacobs University Visualization and Computer Graphics Lab : Advanced Graphics - Chapter X 1

Shadow Algorithms. CSE 781 Winter Han-Wei Shen

Computing Visibility. Backface Culling for General Visibility. One More Trick with Planes. BSP Trees Ray Casting Depth Buffering Quiz

CSE 167: Introduction to Computer Graphics Lecture #9: Visibility. Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2018

Advanced Computer Graphics

Today. Rendering pipeline. Rendering pipeline. Object vs. Image order. Rendering engine Rendering engine (jtrt) Computergrafik. Rendering pipeline

Surface Graphics. 200 polys 1,000 polys 15,000 polys. an empty foot. - a mesh of spline patches:

Shadow Rendering EDA101 Advanced Shading and Rendering

COMS 4160: Problems on Transformations and OpenGL

Cornell University CS 569: Interactive Computer Graphics. Introduction. Lecture 1. [John C. Stone, UIUC] NASA. University of Calgary

Tutorial on GPU Programming #2. Joong-Youn Lee Supercomputing Center, KISTI

Lecture 13: Reyes Architecture and Implementation. Kayvon Fatahalian CMU : Graphics and Imaging Architectures (Fall 2011)

Introduction to Computer Graphics with WebGL

CS 130 Exam I. Fall 2015

CS4620/5620: Lecture 14 Pipeline

CSE 167: Introduction to Computer Graphics Lecture #10: View Frustum Culling

CSE 690: GPGPU. Lecture 2: Understanding the Fabric - Intro to Graphics. Klaus Mueller Stony Brook University Computer Science Department

EE 4702 GPU Programming

RSX Best Practices. Mark Cerny, Cerny Games David Simpson, Naughty Dog Jon Olick, Naughty Dog

LOD and Occlusion Christian Miller CS Fall 2011

POWERVR MBX. Technology Overview

Building scalable 3D applications. Ville Miettinen Hybrid Graphics

Real-Time Graphics Architecture

The graphics pipeline. Pipeline and Rasterization. Primitives. Pipeline

CSE 167: Introduction to Computer Graphics Lecture #11: Visibility Culling

The Rasterization Pipeline

Robust Stencil Shadow Volumes. CEDEC 2001 Tokyo, Japan

RASTERISED RENDERING

Visibility: Z Buffering

Graphics Hardware, Graphics APIs, and Computation on GPUs. Mark Segal

Interactive Computer Graphics A TOP-DOWN APPROACH WITH SHADER-BASED OPENGL

Optimizing DirectX Graphics. Richard Huddy European Developer Relations Manager

Real-Time Rendering (Echtzeitgraphik) Michael Wimmer

Intro to Ray-Tracing & Ray-Surface Acceleration

Scanline Rendering 2 1/42

CSE 167: Introduction to Computer Graphics Lecture #5: Projection. Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2017

Pipeline Operations. CS 4620 Lecture Steve Marschner. Cornell CS4620 Spring 2018 Lecture 11

3D Modeling: Surfaces

Blue colour text questions Black colour text sample answers Red colour text further explanation or references for the sample answers

Graphics and Imaging Architectures

CSE 167: Introduction to Computer Graphics Lecture #5: Rasterization. Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2015

Graphics Hardware. Graphics Processing Unit (GPU) is a Subsidiary hardware. With massively multi-threaded many-core. Dedicated to 2D and 3D graphics

The Rendering Pipeline (1)

PowerVR Series5. Architecture Guide for Developers

Performance OpenGL Programming (for whatever reason)

Pipeline and Rasterization. COMP770 Fall 2011

Transcription:

Real-Time Graphics Architecture Kurt Akeley Pat Hanrahan http://www.graphics.stanford.edu/courses/cs448a-01-fall Geometry Outline Vertex and primitive operations System examples emphasis on clipping Primitive generation OpenGL selection 1

Readings Required High-Performance Polygon Rendering, Akeley and Jermoluk, Proceedings of SIGGRAPH 88. RealityEngine Graphics, Akeley, Proceedings of SIGGRAPH 93. InfiniteReality: A Real-Time Graphics System, Montrym, Baum, Dignam, and Migdal, Proceedings of SIGGRAPH 97. Recommended Curved PN Triangles, PDF to be on web site soon OpenGL Specification Modern Graphics Pipeline Application Command Geometry Today s lecture Rasterization Texture Fragment Already covered Display 2

Geometry Processing Two types of operations Vertex operations Operate on individual vertexes Primitive operations Operate on all the vertexes of a primitive Vertex Operations Transform coordinates and normal Model world (not rigid) World eye (rigid) Normalize the length of the normal Compute vertex lighting Transform texture coordinates Generate if so specified Transform coordinates to clip coordinates (projection) Divide coordinates by w Apply affine viewport transform (x, y, and z) 3

Coordinate Transformation 4x4 matrix, 4-component coordinate Single-precision floating point is typical Matrices can be composed By the application By the implementation Be careful about invariance x m11 m12 m13 m14 x y z = m21 m22 m23 m24 m31 m32 m33 m34 y z w m41 m42 m42 m44 w Normal Transformation (nx ny nz ) = (nx ny nz) Mu -1 Mu = upper-left 3x3 of M Approaches to acquiring Mu -1 include Maintain separately, or Compute when coordinate matrix is changed, or Force specification by application Why transform normals? (Can lighting be computed in model coords?) Model matrix may not be rigid But, recall quadrilateral decomp. problem Why normalize normals to unit length Lighting equations require unit length Non-rigid model matrix distorts lengths Requires reciprocal square root operation 4

Normal Transformation (nx ny nz ) = (nx ny nz) Mu -1 Mu = upper-left 3x3 of M Approaches to acquiring Mu -1 include Maintain separately, or Compute when coordinate matrix is changed, or Force specification by application Why transform normals? (Can lighting be computed in model coords?) Model matrix may not be rigid But, recall quadrilateral decomp. problem Why normalize normals to unit length Lighting equations require unit length Non-rigid model matrix distorts lengths Requires reciprocal square root operation Lighting Simple n dot l evaluation Plus ambient and specular Multiple lights, Remember, n is not a facet normal Possibly two-sided Compute for n and n Both results may be needed! Much arithmetic is shared View-direction simplification Eye vector is [0,0,1] Saves arithmetic -n There is no such thing as view direction! n l 5

No Interdependencies Vertex operations apply to vertexes independently Transform coordinates and normal Model world World eye Normalize the length of the normal Compute vertex lighting Transform texture coordinates Transform to clip coordinates Divide by w Apply affine viewport transform Primitive Operations Primitive assembly Clipping Backface cull Transform coordinates and normal Model world World eye Normalize the length of the normal Compute vertex lighting Transform texture coordinates Transform to clip coordinates Assemble vertexes into primitives Clip primitives against frustum Divide by w Apply affine viewport transform Eliminate back-facing triangles 6

Primitive Assembly Assemble based on application commands Independent or strip or mesh or. Decompose to triangles Prior to clipping to maintain invariance Algorithm properties Fixed execution time (good) All vertex operations up to this point have this property Vertex interdependencies (bad) Clipping Two types Point, line: eliminates geometry Polygon: eliminates and introduces edges Line Segments Polygon 7

Clipping Two types Point, line: eliminates geometry Polygon: eliminates and introduces edges Invariance requirements Pre-decomposition to triangles Care with edge arithmetic Algorithm properties Vertex interdependencies (bad) Data-dependent execution (worse) Variable execution time (substantially different) Variable code paths Backface Cull Facet facing toward or away from viewpoint? No facet normal (other APIs?) Use sign of primitive s window coordinate area Remember, only triangles are planar Use facing direction to x2,y2 Select lighting result (for n or n) Potentially discard the primitive Advance in sequence to improve efficiency? x0,y0 x1,y1 Triangle area = (x0y1 - x1y0) + (x1y2 - x2y1) + (x2y0 - x0y2) 2 8

Some Examples Systems Clark Geometry Engine (1983) Silicon Graphics GTX (1988) Silicon Graphics RealityEngine (1992) Silicon Graphics InfiniteReality (1996) Modern GPU (2001) What we ll look at Organization of the geometry system Distribution of vertex and primitive operations How clipping affects the implementation Clark Geometry Engine (1983) 1 st generation capability Simple, fixed-function pipeline Twelve identical engines Soft-configured at start-up Clipping allocated ½ of total power Performance invariant (good) Typically idle (bad) Coordinate Transform 6-plane Frustum Clipping Divide by w Viewport 9

GTX Geometry Engine (1988) 2 nd generation capability Variable functionality pipeline 5 identical engines Modes alter some functions Load balancing is difficult Clip testing Clipping allocated 1/5 of power Clip testing is the constant load Actual clipping Slow pipeline execution (bad) out of balance Typically isn t invoked (good) Coord, normal Transform Lighting Clipping state Divide by w (clipping) Viewport Prim. Assy. Backface cull RealityEngine Geometry (1992) 3 rd generation capability Variable-functionality MIMD organization Eight identical engines Round-robin work assignment Good static load balancing Command processor Splits large strips of primitives Shadows per-vertex state Broadcasts other state Primitive assembly Complicates work distribution Reduces efficiency of strip processing Command Processor Round-robin Aggregation 10

RealityEngine Clipping Clipping introduces data-dependent (dynamic) load Cannot be predicted by CP Dynamic load balance accomplished by: Round-robin assignment Large input and output FIFOs For each geometry processor Sized greater than (n) long/typical Large work load per processor Minimize the long/typical ratio Unlike pipeline processing Command Processor Round-robin Aggregation InfiniteReality Geometry (1996) 3 rd generation capability Variable-functionality MIMD organization Four identical (SIMD) engines Least-busy work assignment Good static load balancing Command processor Splits large strips of primitives Shadows per-vertex state Broadcasts other state Primitive assembly Complicates work distribution Reduces efficiency of strip processing Command Processor Sequence Token Aggregation 11

InfiniteReality Clipping Dynamic load balance accomplished by: Least-busy assignment Even larger FIFOs Large work load per processor Likelihood of clipping reduced by Guard-band clipping algorithm Command Processor Sequence Token Aggregation Guard-Band Clipping Expand clipping frustum beyond desired viewport Near and far clipping are unchanged Frustum clip only when necessary Ideal triangle cases: Discard if outside viewport, else Render without clipping if inside frustum, else Rasterizer must scissor well for this to be efficient Clip triangles That cross both viewport and frustum boundaries That cross the near or far clip-planes Operation is imperfect, but conservative 12

Guard-Band Clipping Example Rendered Discarded Clipped Viewport Guard-band Modern Geometry Engine (2001) Utilizes homogeneous rasterization algorithm No clipping required Hence no primitive assembly prior to rasterization Push backface cull to rasterization setup This is where triangle area is computed anyway All geometry engine calculations are On independent vertexes easy work distribution Not data dependent minimal code branching Allows efficient, SIMD geometry engine implementation Programmability? 13

Primitive Generation Why do this? Some important reasons: Match application semantics Move computation from CPU to GPU Reduce storage requirements Perverse desire to complicate the GPU design ;-) Highest-priority reason: Reduce CPU to GPU data rate! Basic Idea Introduce new pipeline stage We ll treat as part of geometry Two-step application Specify generation function Modes Data values (potentially large) Execute generation function Block-mode commands, and/or Vertex-like commands Generated geometry Is treated as if app-specified Application Command Prim. Generate Geometry Rasterization Texture Fragment Display 14

Problems with Primitive Generation Specification may be complex E.g. 4 th order 2D OpenGL Evaluation mesh May be more data than generated triangles! Specification/execution may be sequential Require double buffer, separate execution engines Cracks and T-vertexes Different surfaces must abut But arithmetic is of finite precision Generally requires stitching OpenGL accommodates this with mixed evaluation and vertex specification Stitching Patches Together 4x4 Patch 5x5 Patch 15

Stitching Patches Together 4x4 Patch 5x5 Patch Stitching vertexes introduced Problems with Primitive Generation Implementation may be complex E.g. Trimmed NURBS Limit tesselation to curve-specified region of 2D domain Algorithm doesn t fit GPU architecture Don t redesign a general-purpose MP system Semantics may not match application s E.g. Trimmed NURBS Getting this right is extremely difficult Don t trespass on application s secret sauce Where should primitive generation be done?. 16

Where is Primitive Generation Done? On geometry engines Reduces maximum vertex rate Generation consumes GE cycles May serialize setup/execution GEs not designed to double buffer Complicates load balancing Command Processor distribution Complicates state handling Command Processor managed this OpenGL designed to accommodate this Awkward state semantics Command Processor Sequence Token Aggregation Where is Primitive Generation Done? On an additional processor Requires fixed-function resources Idle during normal operation May serialize setup/execution If not designed to double buffer Simplifies Geometry Engine load balancing State management Feels like a new pipeline stage Command processor functions split Command Processor Primitive Generation Command Processor2 Sequence Token Aggregation? 17

Other Data Rate Reductions Display lists Work well for client-server Allow bandwidth to be reassigned Geometric decompression Strips and meshes Indexed triangle sets Geometry Compression (Deering 95) Hypothetical backend for advanced compression format ATI PN Triangle technology ATI s TRUFORM Technology Requires minimal setup Single mode (triangle edge subdivision count) No geometric data Operates on triangles as normally submitted Edge Subdivison Count Normal-based surface extraction 18

ATI s TRUFORM Technology Improves silhouettes Improves lighting over per-vertex (per fragment?) ATI s TRUFORM Technology Limitations Constant subdivision per object Generally limited capability (e.g. continuity) Strengths Actually reduces CPU to GPU data rate Requires minimal application recoding Easily avoids cracks and T-vertexes Is simple to understand, to implement, and to use Leads toward a Reyes rendering approach Lots of small triangles Shading done in pre-projected coordinates Hardware trails software. 19

OpenGL Selection Light pen replacement mechanism Light pen is a calligraphic device Focuses on screen Signals when stroke is drawn in focus region Raster equivalent Point with a mouse Set selection mode Re-render entire scene Clip frustum reduced to small region around pointer Each object tagged (integer name) Return hit information Real-Time Graphics Architecture Kurt Akeley Pat Hanrahan http://www.graphics.stanford.edu/courses/cs448a-01-fall 20