Lecture 4: Geometry Processing. Kayvon Fatahalian, CMU 15-869: Graphics and Imaging Architectures (Fall 2011)


Lecture 4: Geometry Processing. Kayvon Fatahalian, CMU 15-869: Graphics and Imaging Architectures (Fall 2011)

Today
- Key per-primitive operations (clipping, culling)
- Programmable primitive generation
  - Geometry shader
  - Modern GPU tessellation
Various slides credit John Owens, Kurt Akeley, and Pat Hanrahan

Recall: in a modern graphics pipeline, application-specified logic computes vertex positions
Vertex positions emitted by vertex processing (or the geometry shader, if enabled) are represented in homogeneous clip-space coordinates (x, y, z, w).
Vertex is within the view frustum if:
-w ≤ x ≤ w
-w ≤ y ≤ w
-w ≤ z ≤ w
Vertex's position in Euclidean space is (x/w, y/w, z/w).
[Figure: pipeline diagram; surviving stage labels: Vertex Processing, Rasterization (Fragment Generation)]
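The frustum test and the divide by w on this slide can be sketched in a few lines of Python. The names `inside_frustum` and `perspective_divide` are chosen for illustration only, and the symmetric z range follows the slide (some APIs use 0 ≤ z ≤ w instead).

```python
from typing import Tuple

Vec4 = Tuple[float, float, float, float]


def inside_frustum(v: Vec4) -> bool:
    """Return True if a clip-space vertex satisfies -w <= x, y, z <= w."""
    x, y, z, w = v
    return -w <= x <= w and -w <= y <= w and -w <= z <= w


def perspective_divide(v: Vec4) -> Tuple[float, float, float]:
    """Convert homogeneous clip-space coordinates to Euclidean (x/w, y/w, z/w).

    Assumes w != 0, which holds for vertices that passed clipping.
    """
    x, y, z, w = v
    return (x / w, y / w, z / w)
```

Note that a vertex with negative w fails the test automatically, because -w ≤ x ≤ w has no solution when w < 0.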

Per-primitive operations
- Assemble vertices into primitives
- Clip primitive against view frustum
- For each resulting primitive:
  - Divide by w
  - Apply viewport transform
  - Discard back-facing primitives [optional, depends on config]
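As a rough illustration of the viewport step in this list, the sketch below maps a post-divide position from the [-1, 1] normalized range to pixel coordinates. The function name `viewport_transform`, the `width` and `height` parameters, the origin at the lower-left corner, and the [0, 1] output depth range are all assumptions made for the example; real APIs let the application configure these.

```python
from typing import Tuple


def viewport_transform(ndc: Tuple[float, float, float],
                       width: int, height: int) -> Tuple[float, float, float]:
    """Map normalized device coordinates in [-1, 1] to window coordinates.

    x and y are scaled to [0, width] and [0, height]; z is remapped to [0, 1].
    """
    x, y, z = ndc
    sx = (x + 1.0) * 0.5 * width
    sy = (y + 1.0) * 0.5 * height
    sz = (z + 1.0) * 0.5
    return (sx, sy, sz)
```

Because clipping has already bounded x/w, y/w, and z/w, the outputs of this step are bounded as well, which is what lets later stages pick a fixed precision.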

Assembling vertices into primitives
- How to assemble is part of graphics state (specified by draw command)
- Notice: independent vertices get grouped into primitives (dependency!)
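To make the grouping concrete, here is a minimal sketch of assembly for two common topologies. The function `assemble_triangles` and its `mode` strings are hypothetical names for this example; the odd-triangle swap in strip mode follows the usual convention for keeping a consistent winding order.

```python
from typing import List, Sequence, Tuple

Triangle = Tuple[int, int, int]


def assemble_triangles(indices: Sequence[int], mode: str) -> List[Triangle]:
    """Group a stream of vertex indices into triangles.

    mode == "list":  every three consecutive indices form one triangle.
    mode == "strip": each index after the first two forms a triangle with
                     the previous two; odd triangles swap their first two
                     vertices so all triangles keep the same winding.
    """
    if mode == "list":
        return [tuple(indices[i:i + 3]) for i in range(0, len(indices) - 2, 3)]
    if mode == "strip":
        tris = []
        for i in range(len(indices) - 2):
            a, b, c = indices[i], indices[i + 1], indices[i + 2]
            tris.append((a, b, c) if i % 2 == 0 else (b, a, c))
        return tris
    raise ValueError(f"unknown mode: {mode}")
```

In strip mode each triangle depends on vertices shared with its neighbors, which is the dependency the slide points out: vertices can be processed independently, but primitives cannot be formed until their vertices are available.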

Clipping
- May generate new vertices/primitives, or eliminate vertices/primitives
- Data-dependent computation
  - Variable amount of work per primitive
  - Variable control flow per primitive
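One way to see the variable output is a Sutherland-Hodgman style clip of a polygon against a single frustum plane in homogeneous coordinates. This is a sketch of one common technique, not necessarily what any particular GPU implements; `clip_against_plane` and the `dist` callback are names invented for the example.

```python
from typing import Callable, List, Tuple

Vec4 = Tuple[float, float, float, float]


def clip_against_plane(poly: List[Vec4],
                       dist: Callable[[Vec4], float]) -> List[Vec4]:
    """Clip a convex polygon against one plane.

    dist(v) >= 0 means v is on the visible side. For the frustum plane
    x <= w, for example, dist is w - x.
    """
    out = []
    n = len(poly)
    for i in range(n):
        a, b = poly[i], poly[(i + 1) % n]
        da, db = dist(a), dist(b)
        if da >= 0:
            out.append(a)  # keep vertices on the visible side
        if da * db < 0:
            # The edge strictly crosses the plane: emit the intersection.
            t = da / (da - db)
            out.append(tuple(pa + t * (pb - pa) for pa, pb in zip(a, b)))
    return out


# Usage: a triangle that pokes through the x = w plane becomes a quad.
triangle = [(0.0, 0.0, 0.0, 1.0), (2.0, 0.0, 0.0, 1.0), (0.0, 1.0, 0.0, 1.0)]
clipped = clip_against_plane(triangle, lambda v: v[3] - v[0])
```

Depending on the input, the same call returns zero, three, or four vertices, and the branch taken per edge differs per primitive, which illustrates both kinds of variability listed on the slide.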

Why clipping?
- Avoid downstream processing that will not contribute to the image (rasterization, fragment processing)
- Establish invariants for emitted primitives
  - Can safely divide by w after clipping
  - Bounds on vertex positions (can now choose precision of subsequent operations accordingly)

Guard-band clipping [RealityEngine, Akeley 93]
- Reduces variance in per-primitive clipping work
- Cost (conservative: primitives are no longer guaranteed to be fully on screen)
  - Rasterizer must not generate off-screen fragments
  - Increased precision needed during rasterization
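The idea can be sketched as a three-way classification: reject triangles entirely outside one frustum plane, pass triangles that fit inside an enlarged guard band straight to the rasterizer, and run the expensive clipper only on the rest. The function name `classify_triangle`, the `guard` factor of 4.0, and the choice to keep the z range unenlarged are assumptions for illustration, not values from the RealityEngine paper.

```python
from typing import Sequence, Tuple

Vec4 = Tuple[float, float, float, float]


def classify_triangle(tri: Sequence[Vec4], guard: float = 4.0) -> str:
    """Return "reject", "rasterize", or "clip" for a clip-space triangle."""
    frustum_planes = [
        lambda v: v[3] - v[0], lambda v: v[3] + v[0],  # x <= w, x >= -w
        lambda v: v[3] - v[1], lambda v: v[3] + v[1],  # y <= w, y >= -w
        lambda v: v[3] - v[2], lambda v: v[3] + v[2],  # z <= w, z >= -w
    ]
    # Trivial reject: all vertices lie outside the same frustum plane.
    for d in frustum_planes:
        if all(d(v) < 0 for v in tri):
            return "reject"

    def in_guard_band(v: Vec4) -> bool:
        x, y, z, w = v
        return abs(x) <= guard * w and abs(y) <= guard * w and -w <= z <= w

    # Inside the guard band: skip clipping; the rasterizer must then avoid
    # generating fragments outside the viewport.
    if all(in_guard_band(v) for v in tri):
        return "rasterize"
    return "clip"
```

With a guard band, most triangles that straddle the screen edge take the cheap "rasterize" path, so the per-primitive cost varies less.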

Back-face culling
- Use sign of triangle area to determine if triangle is facing toward or away from camera
- May discard primitive as a result of this test
  - For closed meshes, eliminates ~1/2 of triangles (these triangles will be occluded anyway)
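A minimal sketch of the signed-area test on screen-space positions follows. It assumes a y-up coordinate system in which counter-clockwise triangles have positive area; the `front_ccw` flag stands in for the configurable winding convention and is a name chosen for this example.

```python
from typing import Tuple

Vec2 = Tuple[float, float]


def signed_area(a: Vec2, b: Vec2, c: Vec2) -> float:
    """Signed area of triangle abc; positive when abc is counter-clockwise."""
    return 0.5 * ((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1]))


def is_back_facing(a: Vec2, b: Vec2, c: Vec2, front_ccw: bool = True) -> bool:
    """Return True if the triangle faces away from the camera."""
    area = signed_area(a, b, c)
    return area < 0 if front_ccw else area > 0
```

Zero-area (degenerate) triangles are reported as neither front- nor back-facing here; a real pipeline would typically discard them separately.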

SGI RealityEngine 1992 [Akeley 93]
[Figure: Command Processor, Input FIFO, eight Geometry Engines, Output FIFO, Fragment Generator ...]
- Divide triangle strips from application into small strips, round robin to geometry engines
- Buffers absorb variance in amount of work per triangle
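The round-robin distribution can be sketched as follows. To stay short, the example hands out chunks of already-assembled triangles rather than splitting real strips (which would require repeating two vertices at each cut); `distribute_round_robin`, `num_engines`, and `chunk` are hypothetical names and values.

```python
from collections import deque
from typing import Deque, List, Sequence


def distribute_round_robin(triangles: Sequence, num_engines: int = 8,
                           chunk: int = 4) -> List[Deque]:
    """Split work into fixed-size chunks and deal them to per-engine FIFOs."""
    fifos: List[Deque] = [deque() for _ in range(num_engines)]
    for n, start in enumerate(range(0, len(triangles), chunk)):
        fifos[n % num_engines].append(list(triangles[start:start + chunk]))
    return fifos
```

Round robin balances the number of chunks per engine, not the amount of work; the FIFOs are what absorb the difference when some triangles need more processing (for example, clipping) than others.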

Programmable geometry amplification
- Amplification by geometry shader or tessellation functionality in a modern pipeline is far greater than that of clipping
- Geometry shader: output up to 1024 floats worth of vertices per input primitive
- Tessellation: thousands of vertices from a base primitive

Thought experiment
Assume maximum amplification factor is large (known statically)
[Figure: Command Processor, Input FIFO, eight Amplifiers, Output FIFO, Fragment Generator ...]
- Simple approach 1: make on-chip buffers as big as possible: run fast for low amplification
- Simple approach 2: make huge FIFOs (store off-chip in memory)

Modern GPU tessellation [Moreton 01]
Motivations:
- Reduce CPU-GPU bandwidth
- Animate/skin coarse resolution mesh, but render high resolution mesh
Requires parametric surfaces (must support direct evaluation)
[Figure: pipeline stages Vertex Shader, Hull Shader, Tessellator, Domain Shader, Geometry Shader. Note: D3D11 stage naming (not canonical stage names)]

Parametric surface
[Figure: unit-square parametric domain with corners (0,0), (1,0), (0,1), (1,1) and axes u, v; a point (u,v) in the domain maps to f(u,v) = <x,y,z>]

Parametric surfaces: common examples
- Bicubic patch, 16 control points (quad domain)
- PN Triangles, 3 vertices + 3 normals (defines Bézier patch on triangular domain)
See "Approximating Catmull-Clark Subdivision Surfaces with Bicubic Patches", Loop et al. 2008
See "Curved PN Triangles", Vlachos et al. 2001
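For the first example, direct evaluation of a bicubic Bézier patch from its 16 control points can be sketched as below. The slide does not say which bicubic basis is meant; the Bernstein (Bézier) basis is assumed here, and the names `bernstein3` and `eval_bicubic_patch` and the 4x4 row-major layout `cp[i][j]` (i along v, j along u) are choices made for the example.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def bernstein3(t: float) -> Tuple[float, float, float, float]:
    """Cubic Bernstein basis functions evaluated at t."""
    s = 1.0 - t
    return (s * s * s, 3.0 * s * s * t, 3.0 * s * t * t, t * t * t)


def eval_bicubic_patch(cp: Sequence[Sequence[Vec3]], u: float, v: float) -> Vec3:
    """Evaluate f(u, v) for a bicubic Bezier patch with a 4x4 control grid."""
    bu, bv = bernstein3(u), bernstein3(v)
    point = [0.0, 0.0, 0.0]
    for i in range(4):
        for j in range(4):
            weight = bv[i] * bu[j]
            for k in range(3):
                point[k] += weight * cp[i][j][k]
    return (point[0], point[1], point[2])


# Usage: a flat patch whose control points lie on a regular grid in z = 0.
flat = [[(j / 3.0, i / 3.0, 0.0) for j in range(4)] for i in range(4)]
center = eval_bicubic_patch(flat, 0.5, 0.5)
```

Each (u, v) sample is evaluated independently of every other sample, which is the "direct evaluation" property the tessellation pipeline relies on.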

Modern GPU tessellation
Hull shader
- Accepts primitives after traditional vertex processing
- Computes tessellation factor along each domain edge
- Computes control points for parametric surface (from primitive vertices)
[Figure: Primitive (with adjacency) enters Vertex Shader, then Hull Shader; Hull Shader sends Edge Factors to Tessellator and Control Points to Domain Shader; Geometry Shader follows. Note: D3D11 stage naming (not canonical stage names)]

Hull shader produces edge tessellation rates
- Based on estimate of parametric surface position
- Note: rates need not be integral
[Figure: quad domain with axes u, v and edge tessellation rates 2, 3, 5, 7]

Fixed-function tessellation stage [Moreton 01]
- Input: edge tessellation constraints for a patch
- Output: (almost) uniform mesh topology meeting constraints
[Figure: patch tessellated to meet edge rates 2, 3, 5, 7]

Domain shader stage
- Input: control points (from hull shader) and stream of parametric vertex locations (u,v) from tessellator
- Output: position of vertex at parametric coordinate: f(u,v)
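The data flow of this stage can be sketched with a deliberately simple surface. Here a bilinear patch with four control points stands in for f(u,v), and a uniform grid stands in for the tessellator's output; `uniform_domain_points` and `domain_shader` are hypothetical names, and a real tessellator produces an (almost) uniform mesh that also honors per-edge rates.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def uniform_domain_points(n: int) -> List[Tuple[float, float]]:
    """Stand-in for the tessellator: an (n+1) x (n+1) grid of (u, v) samples."""
    return [(i / n, j / n) for j in range(n + 1) for i in range(n + 1)]


def domain_shader(control_points: Sequence[Vec3],
                  uv: Tuple[float, float]) -> Vec3:
    """Evaluate a bilinear patch f(u, v) from corners p00, p10, p01, p11."""
    p00, p10, p01, p11 = control_points
    u, v = uv
    return tuple(
        (1 - u) * (1 - v) * a + u * (1 - v) * b + (1 - u) * v * c + u * v * d
        for a, b, c, d in zip(p00, p10, p01, p11)
    )


# Usage: run the "shader" once per tessellator-generated sample.
corners = ((0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 2.0, 0.0), (2.0, 2.0, 4.0))
vertices = [domain_shader(corners, uv) for uv in uniform_domain_points(2)]
```

The list comprehension in the usage example is a map over independent inputs, which mirrors why this stage fits the data-parallel shader programming model.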

Modern GPU tessellation
Heterogeneous implementation
Hull shader
- Original primitive granularity
- Data-parallel
- Large working set (typically a primitive + one-ring)
Tessellator
- Surface agnostic, fixed-function hardware implementation
- Irregular control flow
Domain shader
- Fine-mesh-vertex granularity
- Data-parallel (preserves shader programming model)
- Direct evaluation of surface (extra math, but data-parallel)
[Figure: Primitive (with adjacency) enters Vertex Shader, then Hull Shader; Hull Shader sends Edge Factors to Tessellator and Control Points to Domain Shader; Geometry Shader follows. Note: D3D11 stage naming (not canonical stage names)]

Challenge: avoid cracks!
[Figure: two adjacent patches, labeled 1 and 2, with a view of the parametric domain]

Modern GPU tessellation summary
- Heterogeneous, 3-stage implementation
  - Algorithms co-designed with pipeline abstractions and hardware
- Enables adaptive level-of-detail, high-resolution meshes in games
- Challenges
  - Application developer: avoiding cracks (requires consistent edge rate evaluation; this is tricky in floating-point math)
  - GPU implementor: managing large data amplification while maintaining parallelism, locality, and order
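One defensive pattern for consistent edge rates is to compute each factor only from data belonging to the edge itself, with the endpoints put into a canonical order first, so that both patches sharing the edge perform the same floating-point operations on the same operands. The sketch below assumes a simple length-based heuristic; `edge_tess_factor`, `target_len`, and `max_factor` are hypothetical names and values, not part of any API.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def edge_tess_factor(p: Vec3, q: Vec3, target_len: float = 0.5,
                     max_factor: float = 64.0) -> float:
    """Tessellation factor for the edge pq, independent of endpoint order."""
    # Canonical ordering: both neighbors evaluate exactly the same expression.
    a, b = (p, q) if p <= q else (q, p)
    length = math.dist(a, b)
    # Fractional factors are allowed; clamp to a valid range.
    return min(max(length / target_len, 1.0), max_factor)
```

For this particular formula the result happens to be symmetric even without sorting, but once the heuristic involves order-sensitive expressions (for example an interpolation such as a + t * (b - a)), the canonical ordering is what keeps the two sides bit-identical.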