Real-Time Graphics Architecture

Similar documents
Real-Time Graphics Architecture

Real-Time Graphics Architecture

CS4620/5620: Lecture 14 Pipeline

graphics pipeline computer graphics graphics pipeline 2009 fabio pellacini 1

graphics pipeline computer graphics graphics pipeline 2009 fabio pellacini 1

CSE 167: Lecture #5: Rasterization. Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2012

Triangle Rasterization

Realtime 3D Computer Graphics Virtual Reality

Drawing Fast The Graphics Pipeline

Lecture 4: Geometry Processing. Kayvon Fatahalian CMU : Graphics and Imaging Architectures (Fall 2011)

Surface shading: lights and rasterization. Computer Graphics CSE 167 Lecture 6

Computer Graphics. - Rasterization - Philipp Slusallek

Rasterization and Graphics Hardware. Not just about fancy 3D! Rendering/Rasterization. The simplest case: Points. When do we care?

CS452/552; EE465/505. Clipping & Scan Conversion

Parallelizing Graphics Pipeline Execution (+ Basics of Characterizing a Rendering Workload)

E.Order of Operations

The graphics pipeline. Pipeline and Rasterization. Primitives. Pipeline

Rasterization. CS 4620 Lecture Kavita Bala w/ prior instructor Steve Marschner. Cornell CS4620 Fall 2015 Lecture 16

CS 498 VR. Lecture 18-4/4/18. go.illinois.edu/vrlect18

FROM VERTICES TO FRAGMENTS. Lecture 5 Comp3080 Computer Graphics HKBU

Rendering. Converting a 3D scene to a 2D image. Camera. Light. Rendering. View Plane

4: Polygons and pixels

Topic #1: Rasterization (Scan Conversion)

- Rasterization. Geometry. Scan Conversion. Rasterization

The Traditional Graphics Pipeline

Einführung in Visual Computing

2D rendering takes a photo of the 2D scene with a virtual camera that selects an axis aligned rectangle from the scene. The photograph is placed into

CSE 167: Introduction to Computer Graphics Lecture #5: Rasterization. Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2015

Pipeline and Rasterization. COMP770 Fall 2011

Line Drawing Week 6, Lecture 9

Rasterization. CS4620 Lecture 13

Pipeline Operations. CS 4620 Lecture 10

COMP371 COMPUTER GRAPHICS

Computer Graphics 7 - Rasterisation

The Traditional Graphics Pipeline

Pipeline Operations. CS 4620 Lecture Steve Marschner. Cornell CS4620 Spring 2018 Lecture 11

Advanced Shading and Texturing

The Traditional Graphics Pipeline

Parallelizing Graphics Pipeline Execution (+ Basics of Characterizing a Rendering Workload)

Renderer Implementation: Basics and Clipping. Overview. Preliminaries. David Carr Virtual Environments, Fundamentals Spring 2005

Hidden surface removal. Computer Graphics

CS 543: Computer Graphics. Rasterization

Drawing Fast The Graphics Pipeline

Rasterization. COMP 575/770 Spring 2013

COMP30019 Graphics and Interaction Scan Converting Polygons and Lines

Rasterization: Geometric Primitives

Shadows in the graphics pipeline

3D Rasterization II COS 426

Pipeline Operations. CS 4620 Lecture 14

Projection and viewing. Computer Graphics CSE 167 Lecture 4

CSE 167: Introduction to Computer Graphics Lecture #5: Projection. Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2017

Lets assume each object has a defined colour. Hence our illumination model is looks unrealistic.

Page 1. Area-Subdivision Algorithms z-buffer Algorithm List Priority Algorithms BSP (Binary Space Partitioning Tree) Scan-line Algorithms

1.2.3 The Graphics Hardware Pipeline

CSE 167: Introduction to Computer Graphics Lecture #9: Visibility. Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2018

CS 130 Exam I. Fall 2015

Line Drawing. Introduction to Computer Graphics Torsten Möller / Mike Phillips. Machiraju/Zhang/Möller

Line Drawing. Foundations of Computer Graphics Torsten Möller

Rasterization. MIT EECS Frédo Durand and Barb Cutler. MIT EECS 6.837, Cutler and Durand 1

From Ver(ces to Fragments: Rasteriza(on

Announcements. Midterms graded back at the end of class Help session on Assignment 3 for last ~20 minutes of class. Computer Graphics

Scheduling the Graphics Pipeline on a GPU

Real-Time Graphics Architecture

Computer Graphics and GPGPU Programming

Computer Graphics. Lecture 9 Hidden Surface Removal. Taku Komura

Topics. From vertices to fragments

Rasterization. CS4620/5620: Lecture 12. Announcements. Turn in HW 1. PPA 1 out. Friday lecture. History of graphics PPA 1 in 4621.

CS 130 Final. Fall 2015

From Vertices to Fragments: Rasterization. Reading Assignment: Chapter 7. Special memory where pixel colors are stored.

Visibility: Z Buffering

Drawing Fast The Graphics Pipeline

Perspective Projection and Texture Mapping

Clipping and Scan Conversion

Computer Graphics (CS 543) Lecture 10: Rasterization and Antialiasing

OpenGL Graphics System. 2D Graphics Primitives. Drawing 2D Graphics Primitives. 2D Graphics Primitives. Mathematical 2D Primitives.

GeForce4. John Montrym Henry Moreton

Fall CSCI 420: Computer Graphics. 7.1 Rasterization. Hao Li.

CSCI 420 Computer Graphics Lecture 14. Rasterization. Scan Conversion Antialiasing [Angel Ch. 6] Jernej Barbic University of Southern California

Scan Conversion. Drawing Lines Drawing Circles

Fondamenti di Grafica 3D The Rasterization Pipeline.

CSE528 Computer Graphics: Theory, Algorithms, and Applications

Final projects. Rasterization. The Graphics Pipeline. Illumination (Shading) (Lighting) Viewing Transformation. Rest of semester. This week, with TAs

Graphics and Interaction Rendering pipeline & object modelling

Rasterization, or What is glbegin(gl_lines) really doing?

Rasterization Overview

CHAPTER 1 Graphics Systems and Models 3

Rasterization. Rasterization (scan conversion) Digital Differential Analyzer (DDA) Rasterizing a line. Digital Differential Analyzer (DDA)

Last time? Rasterization. Scan Conversion (Rasterization) High-level concepts for Visibility / Display. Today

CS451Real-time Rendering Pipeline

Graphics Hardware, Graphics APIs, and Computation on GPUs. Mark Segal

EF432. Introduction to spagetti and meatballs

CS 4731: Computer Graphics Lecture 21: Raster Graphics: Drawing Lines. Emmanuel Agu

Overview. Pipeline implementation I. Overview. Required Tasks. Preliminaries Clipping. Hidden Surface removal

Rendering. Basic Math Review. Rasterizing Lines and Polygons Hidden Surface Remove Multi-pass Rendering with Accumulation Buffers.

C P S C 314 S H A D E R S, O P E N G L, & J S RENDERING PIPELINE. Mikhail Bessmeltsev

Chapter 8: Implementation- Clipping and Rasterization

Lecture 2. Shaders, GLSL and GPGPU

Real-Time Graphics Architecture

Scanline Rendering 2 1/42

Class of Algorithms. Visible Surface Determination. Back Face Culling Test. Back Face Culling: Object Space v. Back Face Culling: Object Space.

Transcription:

Real-Time Graphics Architecture Lecture 5: Rasterization Kurt Akeley Pat Hanrahan http://graphics.stanford.edu/cs448-07-spring/ Rasterization Outline Fundamentals System examples Special topics (Depth-buffer, cracks and holes, ) Suggested reading Olano and Greer, Triangle Scan Conversion using 2D Homogeneous Coordinates, GH 1997. Lapidous and Jiao, Optimal Depth Buffer for Low-Cost Graphics Hardware, GH 1999. Akeley and Su, Minimum Triangle Separation for Correct Z-buffer Occlusion, GH 2006. 1

Recall that Straight lines project to straight lines When projection is to a plane (our assumption) Only vertexes need to be transformed That s why we re interested in lines and polygons Parameterizations (e.g., projected distance) are warped: Recall that Ideal screen coordinates are continuous Implementations always use discrete math, but with substantial sub-pixel precision A pixel is a big thing Addressable resolution equal to pixels on screen Lots of data (recall over-square RealityEngine buffer) Points and lines have no geometric area. 2

Terminology Rasterization: conversion of primitives to fragments Primitive: point, line, polygon, glyph, image,. Fragment: transient data structure, e.g., short x,y; long depth; short r,g,b,a; Pixels exist in an array (e.g., framebuffer) Have implicit <x,y> coordinates Fragments are routed to appropriate pixels This is the sort we saw in the previous lecture There will be more sorting Two fundamental operations Fragment selection Identify pixels for which fragments are to be generated Must be conservative, efficiency matters <x,y> parameters are special Attribute assignment Assign attribute values to each fragment Eg E.g., color, depth, Scheduling: the third fundamental operation Assign fragments to parallel fragment shaders We won t talk about this much (ask the experts) 3

Fragment selection Generate one fragment for each pixel that is intersected by the primitive Intersected could mean that the primitive s area intersects the pixel s: Center point, or Square region, or Filter-function Some examples: Point sampled fragment selection Aliased rasterization 4

Box sampled fragment selection Tiled rasterization (scaling the primitive doesn t do this) Filtered fragment selection Antialiased rasterization 5

Fragment selection (continued) What if the primitive doesn t have a geometric area? Points don t Lines don t Two choices: Rule-based approach (e.g., Bresenham line) Allows desired properties to be maintained, but May require additional hardware complexity Assign an screen-space area (e.g., circle for point, rectangle for line) Can utilize polygon rasterization algorithm, but May result in wavy lines, flashing points, etc. Attribute assignment Identify a parameterization for each attribute e.g., planar Sample this parameterization as required Which parameterization? Lots of possibilities (linear, cubic, ) Always defined implicitly by vertex values Properties of vertex-defined parameterization: Zero-order continuity If fully specified by the two vertexes that specify the edge Triangles allow planar parameterization Polygons (4+ edges) are (almost) never planar Parameterization may vary with screen orientation 6

Linear interpolation Compute intermediate attribute value Along a line: A = aa1 + ba2, a+b=1 On a plane: A = aa1 + ba2 + ca3, a+b+c=1 Only projected values interpolate linearly in screen space (straight lines project to straight lines) x and y are projected (divided by w) Attribute values are not naturally projected Choice for attribute interpolation in screen space Interpolate t unprojected values Cheap and easy to do, but gives wrong values Sometimes OK for color, but Never acceptable for texture coordinates Do it right Incorrect attribute interpolation A1 Linear interpolation A1 A A A A2 A2 A = A! 7

Perspective-correct linear interpolation Only projected values interpolate correctly, so project A Linearly interpolate A 1 /w 1 and A 2 /w 2 Also interpolate 1/w 1 and 1/w 2 These also interpolate linearly in screen space Divide interpolants at each sample point to recover A (A/w) / (1/w) = A Division is expensive (more than add or multiply), so Recover w for the sample point (reciprocate), and Multiply pyeach projected attribute by w Barycentric triangle parameterization: aa1/w1 + ba2/w2 + ca3/w3 A = a/w1 + b/w2 + c/w3 a + b + c = 1 Example: Gouraud shaded quadrilateral Fragment selection Walk edges Change edges at vertexes Attribute assignment Loop in a loop algorithm: Interpolate along edges Interpolate edge-to-edge Outer loop is complex E.g., either 2 or 3 regions Parameterization is a function of Screen orientation Choice of spans 8

Example: Gouraud shaded quadrilateral All projected quadrilaterals are non-planar Due to discrete coordinate precision What if quadrilateral is concave? Concave is complex (split spans -- see example) Non-planar concave for some view What if quadrilateral l intersects t itself? A real mess (no vertex to signal change - see example) Non-planar bowtie for some view All polygons are triangles (or should be) Three points define a plane All triangles are planar All parameterizations are planar Triangle is always convex Regardless of arithmetic precision Simple rasterization, no special cases Modern GPUs decompose n-gons to triangles SGI switched in 1990, VGX product OpenGL designed to support this Optimized quadrilateral decomposition developed 9

Normal-based quad decomposition Compute (A dot C) and (B dot D) A B C Connect vertex pair with the greater dot product D Avoid connecting the stirrups Avoid frame-to-frame jitter If dot products are equal E.g., symmetric curved surface Transforming will add noise Point sampled triangles Modern choice for aliased rendering Fragment selection Generate fragment if pixel center is inside triangle Handle special-case edge/vertex intersections Attribute assignment Sample function at pixel center Yields mean value for surrounded pixels Generates a consistent ray for depth buffering Never sample outside the triangle Avoid color wrap, depth punch-through But how is antialiased filtering handled? 10

Point sampled points and lines Geometric points and lines have no area, so Pixel sample locations almost never in primitive Semantics are confused at best Must assign attribute parameterizations for pseudo-area Point: attribute-constant disk Line: single-attribute-slope rectangle Problem: how to outline filled, depth-buffered triangles? Depth values are wrong, lines disappear VGX introduced hollow polygons OpenGL 1.1 introduced glpolygonoffset() Outline problem 11

Integer DDA arithmetic Goal: efficient interpolation Direct evaluation is expensive Requires multiplications for each evaluation Digital Differential Analyzer (DDA) Fixed point iiiiii.ffffff representation, accumulator and slope Add slope repeatedly to accumulator to evaluate adjacent sample locations Planar DDA uses separate X and Y slopes Can move around the plane arbitrarily Require log2(n) fraction bits for n accumulation steps! Triangle Rasterization Examples Gouraud shaded (GTX) Per-pixel evaluation (Pixel Planes 4) Edge walk, planar parameterization (VGX) Barycentric direct evaluation (InfiniteReality) Small tiles (Bali proposed) Homogeneous recursive descent (NVIDIA) 12

Algorithm properties Setup and execution costs Setup: constant per triangle Execution: relative to triangle s projected area Ability to parallelize Ability to cull to a rectangular screen region To support tiling To support scissoring Gouraud shaded (GTX) Two-stage algorithm DDA edge walk fragment selection parameter assignment DDA scan-line walk parameter assignment only Requires expensive scan-line setup Location of first sample is non- unit distance from edge Parallelizes in two stages (e.g., GTX) Cannot scissor efficiently Works on quadrilaterals 13

Engine per pixel (Pixel Planes 4) Engine-per-pixel (Pixel Planes 4) Individual engine at every pixel! Solves edge equations to determine inclusion Solves attribute equations to determine values Setup involves computation of plane and edge slopes Execution is in constant-time Clever evaluation tree makes this possible Extremely fast for large triangles, but Extremely inefficient for small triangles Pixel depth complexity = # triangles in scene Scissor culling is a non-issue 14

Pixel Planes 4 fragment selection Pixel Planes 4 fragment selection 15

Pixel Planes 4 fragment selection Pixel Planes 4 attribute evaluation 16

Edge walk, planar evaluation (VGX) Edge walk, planar evaluation (VGX) Hybrid algorithm Edge DDA walk for fragment selection Efficient generation of conservative fragment set Sample DDA walk for parameter assignment Never step off sample grid, so Never have to make sub-pixel adjustment Scissor cull possible Adds complexity to edge walk Sample walk simplifies parallelism 17

Interpolation outside the triangle DDA can operate out-of-range MSBs beyond desired range don t matter Carry chain flows up, not down Can handle arbitrarily large slopes Can iterate outside the triangle s area Don t clamp intermediate results! Doesn t work for floating point! Slope Accum 18

Wrapping Problem: overflow or underflow of iterated value Integer arithmetic wraps Maximum value overflows to zero Zero underflows to maximum value Minor iteration error huge value error Value Binary Interpreted Binary 3 11 3 11 2 10 2 10 1 01 1 01 0 00 0 00 Overflow Underflow Guard bits Solution: extend range to detect wrapped values Add one or two guard MSBs Non-zero guard bit(s) out-of-range value Interpret out-of-range values as the nearer of zero or max 19

Guard-bit example (2-bit value) Value Binary Interpreted Binary 7 111 0 00 6 110 0 00 5 101 3 11 4 100 3 11 3 011 3 11 2 010 2 10 1 001 1 01 0 000 0 00 Overflow Underflow Guard-bit implementation (n-bit) In guard Inn-1 Out n-1 In1 Out 1 In0 Out 0 20

DDA bit assignment examples Edge equation in 4k x 4k rendering space 1 guard bit 12 integer bits (4k space) 10 sub-pixel position bits 12 interpolation bits (log2(4096)) 35 bits total Depth value in 4k x 4k rendering space 2 guard bits (depth wrap is disaster) 24 integer bits (reasonable depth precision) 13 interpolation bits (longest path in 4k x 4k) 39 bits total Barycentric (InfiniteReality) Hybrid algorithm Approximate edge walk for fragment selection Pineda edge functions used to generate AA masks Direct barycentric evaluation for attribute assignment Barycentric coordinates are DDA walked on grid Minimizes setup cost Additional computational complexity accepted Handles small triangles well Scissor cull implemented Supports guard band clipping 21

Small tiles (Bali proposed) Frame buffer tiled into nxn (16x16) regions Each tile is owned by one of k separate engines Two-level rasterization: Tile selection (avoid broadcast, conservative) Fragment selection and attribute assignment Parallelizes well Handles small triangles well Scissors well At tile selection stage Homogeneous recursive descent Rasterizes unprojected, unclipped geometry Huge improvement for geometry processing! Interpolates clip-plane plane distances Used by NVIDIA GPUs (but not ATI?) Is not well documented Read Olano and Greer, Graphics Hardware 1997 paper Attribute assignment precision has many pitfalls Cannot generate perspective-incorrect attribute values Recursive descent drives 2x2 parallel fragment generation Scissors well But cannot outline-clip against non-frustum planes 22

Ideal depth buffer Topic is surprisingly rich What is good metric for accuracy? Ideal depth buffer (true object-z buffer) Interpolate Z/w and 1/w Divide at each fragment to recover object Z value Expensive division arithmetic (Z has many bits) Precision is distributed evenly in Z buffer Seems desirable, but actually is not More precision nearer the viewpoint is good SGI Odyssey product is only example I know 1/w depth buffer (aka W-buffer) Observe that w is just object Z to begin with Recall that 1/w interpolates linearly Store and compare 1/w values in depth buffer No expensive division is required No additional interpolation, 1/w was needed anyway W-buffer precision is packed toward the view point Precision between view point and near-clip is lost 1/w value is scaled to match far-clip, but not biased Derivative of 1/w is -1/w 2 Some warp is desired, but this is extreme Fails with application-specified homogeneous coords 23

Z/w depth buffer (aka Z-buffer) Z/w interpolates linearly too Store and compare Z/w values in depth value No expensive per-fragment division required Depth buffer precision is packed toward the near clipping plane (not view point) No wasted representations between viewpoint and near clipping plane Similar warp to W-buffer Loses farclip/nearclip bits in far field OpenGL (and SGI) standard approach See OpenGL spec for details Works with application-specified homogeneous coords Depth buffer warp compensation Use floating point representation for depth values May convert to float after interpolation if desired To compensate for warp in W-buffer or Z-buffer Map [near, far] to [0, 1] InfiniteReality product does this (Z-buffer) To compensate for warp in object-z buffer Map [near, far] to [1, 0] Odyssey product does this 24

Bad Zfar = 50 miles Znear = 25 feet Normalized to Zfar = 1 Good 16-bit integer z-buffer Bad 16-bit warped z-buffer Zfar = 50 miles Znear = 25 feet Publications: Akeley ypatent, 1998 Lapidous and Guofang, Graphics Hardware 1999 (Complementary Z-buffer, reversed floating point) Good 25

Screen x,y precision affects Z-buffer Depth parameterization is constructed from vertex values, which were perturbed to finite screencoordinate precisions Discretized x,y values change this parameterization substantially Can t correct by computing depth parameterization with high-precision x,y values, because this results in sampling outside the triangle So screen x,y precision affects Z-buffer resolution Slopes are 2 and 1 respectively FOV factor matters here Bad Smin is dominated by Xwin and Ywin representation here (explains comment in Lapidous01) Good 26

Bad Good Vertical bars show the difference between discrete Xwin,Ywin,Zbuf-only calculations and fully-discrete calculations. Holes and cracks Goals for adjacent point-sampled triangles: No missed pixels (holes) No pixels rasterized twice Problem: sample point intersects the edge Use canonical edge arithmetic (identical for both triangles) Swap only the sense of the decision Problem: sample point intersects shared vertex Construct elaborate rasterization rules, or Use separate grids for vertexes and samples Problem: screen x,y discritization flips orientation of some triangles in a mesh 27

Holes and cracks (continued) T-vertex A vertex intended to be on an edge Results in cracks (see example) Cannot eliminate problem Would require infinite precision Antialiasing helps, though See Pat Hanrahan s cs248 slides graphics.stanford.edu/courses/cs 248-98-fall/Lectures/lecture9/ Final observations To get consistent precision using floating point Bias all values into range [2 n, 2 n+1 ] Forces all exponents to be the same Can subtract bias, or lose in float-to-fixed conversion Round-to-zero is evil for signed rasterization algorithms Especially important for floor() and ceiling() Or avoid with bias technique Some horrible problems go away in the limit Huge parameter slope little triangle area 28

Final observations (continued) Highly-acute triangles Result from high-precision screen coordinates Can rasterize incorrectly due to minor slope errors Develop very large parameter slopes Sometimes excess precision is damaging Fine screen coordinates acute triangles Olano and Greer describe this But, fine screen coordinates depth buffer accuracy Line stipple is a mess For one segment, complicates raster parallelism For connected segments, complicates geometry parallelism Summary Many ways to go about rasterization Far more than considered in this talk Trends are toward Full attribute evaluation at each pixel location Floating-point representations Architects have a responsibility to understand and correctly specify numeric precisions 29

Real-Time Graphics Architecture Lecture 5: Rasterization Kurt Akeley Pat Hanrahan http://graphics.stanford.edu/cs448-07-spring/ 30