Drawing Fast The Graphics Pipeline

Similar documents
Drawing Fast The Graphics Pipeline

Drawing Fast The Graphics Pipeline

3D Drawing Visibility & Rasterization. CS559 Spring 2016 Lecture 9 February 23, 2016

Drawing in 3D (viewing, projection, and the rest of the pipeline)

Drawing in 3D (viewing, projection, and the rest of the pipeline)

Rasterization and Graphics Hardware. Not just about fancy 3D! Rendering/Rasterization. The simplest case: Points. When do we care?

Drawing in 3D (viewing, projection, and the rest of the pipeline)

WebGL and GLSL Basics. CS559 Fall 2015 Lecture 10 October 6, 2015

CS4620/5620: Lecture 14 Pipeline

The Rasterization Pipeline

CSE 167: Lecture #5: Rasterization. Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2012

WebGL and GLSL Basics. CS559 Fall 2016 Lecture 14 October

graphics pipeline computer graphics graphics pipeline 2009 fabio pellacini 1

graphics pipeline computer graphics graphics pipeline 2009 fabio pellacini 1

Pipeline Operations. CS 4620 Lecture 10

CS230 : Computer Graphics Lecture 4. Tamar Shinar Computer Science & Engineering UC Riverside

Shadows in the graphics pipeline

CS452/552; EE465/505. Clipping & Scan Conversion

POWERVR MBX. Technology Overview

C P S C 314 S H A D E R S, O P E N G L, & J S RENDERING PIPELINE. Mikhail Bessmeltsev

CSE 167: Introduction to Computer Graphics Lecture #5: Rasterization. Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2015

Rendering approaches. 1.image-oriented. 2.object-oriented. foreach pixel... 3D rendering pipeline. foreach object...

Hardware-driven visibility culling

PowerVR Hardware. Architecture Overview for Developers

Pipeline Operations. CS 4620 Lecture 14

CS451Real-time Rendering Pipeline

Rendering. Converting a 3D scene to a 2D image. Camera. Light. Rendering. View Plane

Culling. Computer Graphics CSE 167 Lecture 12

Lecture 2. Shaders, GLSL and GPGPU

Spring 2009 Prof. Hyesoon Kim

CS130 : Computer Graphics. Tamar Shinar Computer Science & Engineering UC Riverside

Pipeline Operations. CS 4620 Lecture Steve Marschner. Cornell CS4620 Spring 2018 Lecture 11

Scanline Rendering 2 1/42

Could you make the XNA functions yourself?

Hidden surface removal. Computer Graphics

CS 4620 Program 3: Pipeline

Game Architecture. 2/19/16: Rasterization

Real-Time Rendering (Echtzeitgraphik) Michael Wimmer

Hardware-driven Visibility Culling Jeong Hyun Kim

OpenGL: Open Graphics Library. Introduction to OpenGL Part II. How do I render a geometric primitive? What is OpenGL

CSE 167: Introduction to Computer Graphics Lecture #4: Vertex Transformation

Triangle Rasterization

Practical Texturing (WebGL) CS559 Fall 2016 Lecture 20 November 7th 2016

CSE 167: Introduction to Computer Graphics Lecture #11: Visibility Culling

Rasterization Overview

CS 130 Final. Fall 2015

Getting fancy with texture mapping (Part 2) CS559 Spring Apr 2017

CSE 167: Lecture #4: Vertex Transformation. Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2012

Optimizing DirectX Graphics. Richard Huddy European Developer Relations Manager

Real-Time Graphics Architecture

TSBK03 Screen-Space Ambient Occlusion

EECE 478. Learning Objectives. Learning Objectives. Rasterization & Scenes. Rasterization. Compositing

Rasterization. COMP 575/770 Spring 2013

Parallel Triangle Rendering on a Modern GPU

CS427 Multicore Architecture and Parallel Computing

Spring 2011 Prof. Hyesoon Kim

The Graphics Pipeline and OpenGL I: Transformations!

Realtime 3D Computer Graphics Virtual Reality

Parallelizing Graphics Pipeline Execution (+ Basics of Characterizing a Rendering Workload)

Ulf Assarsson Department of Computer Engineering Chalmers University of Technology

LOD and Occlusion Christian Miller CS Fall 2011

CSE 167: Introduction to Computer Graphics Lecture #9: Visibility. Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2018

CSE 167: Introduction to Computer Graphics Lecture #10: View Frustum Culling

Grafica Computazionale: Lezione 30. Grafica Computazionale. Hiding complexity... ;) Introduction to OpenGL. lezione30 Introduction to OpenGL

Intro to OpenGL III. Don Fussell Computer Science Department The University of Texas at Austin

Wed, October 12, 2011

CS 130 Exam I. Fall 2015

GeForce4. John Montrym Henry Moreton

Lecture 17: Shadows. Projects. Why Shadows? Shadows. Using the Shadow Map. Shadow Maps. Proposals due today. I will mail out comments

Optimizing for DirectX Graphics. Richard Huddy European Developer Relations Manager

CS 130 Exam I. Fall 2015

The Graphics Pipeline

CS 464 Review. Review of Computer Graphics for Final Exam

- Rasterization. Geometry. Scan Conversion. Rasterization

E.Order of Operations

CS 498 VR. Lecture 18-4/4/18. go.illinois.edu/vrlect18

Computing Visibility. Backface Culling for General Visibility. One More Trick with Planes. BSP Trees Ray Casting Depth Buffering Quiz

Next-Generation Graphics on Larrabee. Tim Foley Intel Corp

PowerVR Performance Recommendations The Golden Rules. October 2015

Computer Graphics Shadow Algorithms

Last Time. Why are Shadows Important? Today. Graphics Pipeline. Clipping. Rasterization. Why are Shadows Important?

CS 354R: Computer Game Technology

CS130 : Computer Graphics Lecture 2: Graphics Pipeline. Tamar Shinar Computer Science & Engineering UC Riverside

Rasterization. CS 4620 Lecture Kavita Bala w/ prior instructor Steve Marschner. Cornell CS4620 Fall 2015 Lecture 16

Graphics Hardware. Instructor Stephen J. Guy

Performance OpenGL Programming (for whatever reason)

CS 498 VR. Lecture 19-4/9/18. go.illinois.edu/vrlect19

Parallelizing Graphics Pipeline Execution (+ Basics of Characterizing a Rendering Workload)

CS GPU and GPGPU Programming Lecture 2: Introduction; GPU Architecture 1. Markus Hadwiger, KAUST

1.2.3 The Graphics Hardware Pipeline

CS559: Computer Graphics. Lecture 12: Antialiasing & Visibility Li Zhang Spring 2008

EE 4702 GPU Programming

Module 13C: Using The 3D Graphics APIs OpenGL ES

Intro to OpenGL II. Don Fussell Computer Science Department The University of Texas at Austin

Today s Agenda. Basic design of a graphics system. Introduction to OpenGL

CS 381 Computer Graphics, Fall 2008 Midterm Exam Solutions. The Midterm Exam was given in class on Thursday, October 23, 2008.

Today. Rendering pipeline. Rendering pipeline. Object vs. Image order. Rendering engine Rendering engine (jtrt) Computergrafik. Rendering pipeline

Course Recap + 3D Graphics on Mobile GPUs

Project 1, 467. (Note: This is not a graphics class. It is ok if your rendering has some flaws, like those gaps in the teapot image above ;-)

Real-Time Shadows. Last Time? Textures can Alias. Schedule. Questions? Quiz 1: Tuesday October 26 th, in class (1 week from today!

Transcription:

Drawing Fast The Graphics Pipeline CS559 Fall 2015 Lecture 9 October 1, 2015

What I was going to say last time How are the ideas we ve learned about implemented in hardware so they are fast. Important: We want to use the hardware The APIs require us to understand it Need to know what building blocks we have

The first 4 Key Ideas 1. Work in convenient coordinate systems. Use transformations to get from where you want to be to where you need to be. Hierarchical modeling lets us build things out of pieces. 2. Use homogeneous coordinates and transformations to make common operations easy. Translation, projections, coordinate system shift all become simple matrix multiplies. 3. Create viewing transformations with projection. The geometry of imaging (pinhole camera model) leads to linear transformations in homogeneous coordinates. 4. Implement primitive-based rendering (interactive graphics) with a pipeline. The abstractions map nicely onto hardware, and let you do things like visibility computations easily. Be aware that there are other paradigms for drawing.

1. Put a 3D primitive in the World Modeling 2. Figure out what color it should be 3. Position relative to the Eye Viewing / Camera Transformation 4. Get rid of stuff behind you/offscreen Clipping 5. Figure out where it goes on screen Projection (sometimes called Viewing) 6. Figure out if something else blocks it Visibility / Occlusion 7. Draw the 2D primitive Rasterization (convert to s)

1. Put a 3D primitive in the World Modeling Get triangles 2. Figure out what color it should be Do lighting 3. Position relative to the Eye Viewing / Camera View Transformation transform 4. Get rid of stuff behind you/offscreen Clipping Don t worry 5. Figure out where it goes on screen Projection (sometimes Projection called Transform Viewing) 6. Figure out if something else blocks it Visibility / Occlusion Z- Buffer 7. Draw the 2D primitive Rasterization (convert Let library to s) do it

Favorite question from Office Hours Why Triangles?

Rasterization Figure out which pixels a primitive covers Turns primitives into pixels

Dealing with Aliasing? Simple: Aliased (jaggies, ) Crisp Anti-Aliased: Less aliased Blurry

Lines

Triangles

Hardware Rasterization For each point: Compute barycentric coords Decide if in or out

Linear Interpolation P = (1-t) A + t B (t is the coord) A B 0 1 1.5 1A+0B ½ A+ ½ B 0A+1B - ½ A+1½B Interpolative coordinate (t) P = (1-t) A + t B 0<= t <= 1 then in line segment

Barycentric Coordinates Any point in the plane is a convex combination of the vertices of the triangle P=αA+βB+ɣC α+ β + ɣ = 1 A Inside triangle 0 <= α, β, ɣ <= 1 C P B

Barycentric Coords are Useful! Every point in plane has a coordinate (α β ɣ) such that: α+ β + ɣ = 1 Easy test inside the triangle 0 <= α, β, ɣ <= 1 Interpolate values across triangles x p = α x 1 + β x 2 + ɣ x 3 c p = α c 1 + β c 2 + ɣ c 3

Hardware Rasterization For each point: Compute barycentric coords Decide if in or out.7,.7, -.4 1.1, 0, -.1.9,.05,.05.33,.33,.33

Wasteful? Can do all points in parallel We want the coordinates (coming soon) Does the right things for touching triangles Each point in 1 triangle

Hardware Rasterization Each center point in one triangle If we choose consistently for on-the-edge cases Over simplified version: 0 <= a,b,c < 1

Note Triangles are independent Even in rasterization (they are independent throughout process)

The steps of 3D graphics Model objects (make triangles) Transform (find point positions) Shade (lighting per tri / vertex) Transform (projection) Rasterize (figure out pixels) Shade (per-pixel coloring) Write pixels (with Z-Buffer test)

A Pipeline Triangles are independent Vertices are independent s (within triangles) are independent (caveats about sharing for efficiency) Don t need to finish 1 before start 2 (might want to preserve finishing order)

Pipelining in conventional processors Start step 2 before step 1 completes Unless step 2 depends on step 1 Pipe Stall C = A * B C = A * B F = D * E J = G * H C= A*B F= D*E J= G*H F = D * C J = G * H C= A*B F= D*C J= G*H

Triangles are independent! No stalls! (no complexity of handling stalls) Start step 2 before step 1 completes Unless step 2 depends on step 1 Pipe Stall C = A * B C = A * B F = D * E J = G * H C= A*B F= D*E H= G*H F = D * C J = G * H C= A*B C= D*C J= G*H

A Pipeline 1 transf light project raster shade write

A Pipeline 1 transf light project raster shade write 2 transf light project raster shade write 3 transf light project raster shade write

Vertices are independent Parallelize! 1 transf transf proj transf transf light project raster shade write transf transf proj

Parallelization operations split triangles / re-assemble compute per-vertex not per-triangle (fragment) operations lots of potential parallelism less predictable Use queues and caches

Why do we care? This is why the hardware can be fast It requires a specific model Hardware implements this model The programming interface is designed for this model. You need to understand it.

Some History Custom Hardware (pre-1980) rare, each different Workstation Hardware (early 80s-early 90s) increasing features, common methods Consumer Graphics Hardware (mid 90s-) cheap, eventually feature complete Programmable Graphics Hardware (2002-)

Graphics Workstations 1982-199X Implemented graphics in hardware Providing a common abstraction set Fixed function it was built into the hardware

Silicon Graphics (SGI) Stanford Research Project 1980 Spun-off to SGI (company) 1982 The Geometry Engine 4x4 matrix mutiply chip approximate division Raster engine (Z-buffer)

1988: The Personal Iris

The 4D-2X0 series 4 processors (240) Different graphics 1988 GT/GTX 1990 - VGX

Why do you care? It s the first time the abstractions were right later stuff adds to it It s where the programming model is from it was IrisGL before OpenGL It s the pipeline at it s essense we ll add to it, not take away

The Abstractions Points / Lines / Triangles Vertices in 4D Color in 4D (RGBA = transparency) Per- transform (4x4 + divide by w) Per- lighting Color interpolation Fill Triangle Z-test (and other tests) Double buffer (and other buffers)

What s left to add? All of this was in software in the 80s 1990 texture 1992 multi-texture (don t really need) 1998 (2000) programmable shading 2002 programmable pipelines 2005 more programmability

The pipeline (2006-current) Application Program Graphics Driver Command Buffer (Triangle ) (TCL) Cache Assembly Triangle Rasterize Tests shading Geometry Texture Memory Render to texture Frame Buffer

The pipeline (1988) Application Program Graphics Driver Command Buffer (Triangle ) (TCL) Cache Assembly Triangle Rasterize Tests shading Geometry Texture Memory Render to texture Frame Buffer

Application Program The full fixed-function pipeline (1992) Graphics Driver Command Buffer (Triangle ) (TCL) Cache Assembly Triangle Rasterize Tests shading Geometry Texture Memory Render to texture Frame Buffer

Application Program Graphics Driver The parts you have to program 1988-2014 Now (in addition to above) Command Buffer (Triangle ) (TCL) Cache Assembly Triangle Rasterize Tests shading Geometry Texture Memory Render to texture Frame Buffer

A Triangle s Journey

Things to observe as we travel through the pipeline What does each stage do? What are its inputs and output? important for programmability Why would it be a bottleneck? and what could we do to avoid it

Application Program The pipeline (1988) (no texturing) Graphics Driver Command Buffer (Triangle ) (TCL) Cache Assembly Triangle Rasterize Tests shading Geometry Texture Memory Render to texture Frame Buffer

Application Program Graphics Driver Start here Setup modes (window, ) Setup transform, lights Draw a triangle Position, color, normal Command Buffer (Triangle ) (TCL) Cache Assembly Triangle Rasterize Tests shading Geometry Texture Memory Render to texture Frame Buffer

Drawing a triangle Modes per triangle which window, how to fill, use z-buffer, Data per-vertex position normal color other things (texture coords)

Per? Modes per triangle which window, how to fill, use z-buffer, Data per-vertex position normal ß allow us to make non-flat color ß allows us to interpolate other things (texture coords)

Per- not Per-Triangle Allows sharing vertices between triangles Or make all the vertices the same (color, normal, ) to get truly flat

Application Program Graphics Driver Triangle V1: (x1,y1,z1), (r1,g1,b1),(nx1,ny1,nz1) V2: (x2,y2,z2), (r2,g2,b2),(nx2,ny2,nz2) V3: (x3,y3,z3), (r3,g3,b3),(nx3,ny3,nz3) Command Buffer (Triangle ) (TCL) Cache Assembly Triangle Rasterize Tests shading Geometry Texture Memory Render to texture Frame Buffer

Application Program Graphics Driver Triangle V1: (x1,y1,z1), (r1,g1,b1),(nx1,ny1,nz1) V2: (x2,y2,z2), (r2,g2,b2),(nx2,ny2,nz2) V3: (x3,y3,z3), (r3,g3,b3),(nx3,ny3,nz3) Command Buffer (Triangle ) (TCL) Cache Assembly Triangle Rasterize Tests shading Geometry Texture Memory Render to texture Frame Buffer

Is this a potential bottleneck? Application Program Graphics Driver Function calls to the driver 3 vertices + triangle + Command Buffer (Triangle ) (TCL) Cache Assembly Triangle Rasterize Tests shading Geometry Texture Memory Render to texture Frame Buffer

Old style OpenGL begin(triangle); c3f(r1,g1,b1); n3f(nx1,ny1,nz1); v3f(x1,y1,z1); c3f(r2,g2,b2); n3f(nx2,ny2,nz2); v3f(x2,y2,z2); c3f(r3,g3,b3); n3f(nx3,ny3,nz3); v3f(x3,y3,z3); end(triangle); 11 function calls 35 arguments pushed Old days: This is a lot less than the number of pixels! Nowadays: Just the memory access swamps the process

Coming Soon Block transfers of data Data for lots of triangles moved as a block Try to draw groups of triangles

Application Program Graphics Driver Triangle V1: (x1,y1,z1), (r1,g1,b1),(nx1,ny1,nz1) V2: (x2,y2,z2), (r2,g2,b2),(nx2,ny2,nz2) V3: (x3,y3,z3), (r3,g3,b3),(nx3,ny3,nz3) Command Buffer (Triangle ) (TCL) Cache Assembly Triangle Rasterize Tests shading Geometry Texture Memory Render to texture Frame Buffer

Application Program Graphics Driver Split up triangles into vertices V1: (x1,y1,z1), (r1,g1,b1),(nx1,ny1,nz1) V2: (x2,y2,z2), (r2,g2,b2),(nx2,ny2,nz2) V3: (x3,y3,z3), (r3,g3,b3),(nx3,ny3,nz3) Command Buffer (Triangle ) (TCL) Cache Assembly Triangle Rasterize Tests shading Geometry Texture Memory Render to texture Frame Buffer

Application Program Graphics Driver Buffer / the vertices V1: (x1,y1,z1), (r1,g1,b1),(nx1,ny1,nz1) V2: (x2,y2,z2), (r2,g2,b2),(nx2,ny2,nz2) V3: (x3,y3,z3), (r3,g3,b3),(nx3,ny3,nz3) Command Buffer (Triangle ) (TCL) Cache Assembly Triangle Rasterize Tests shading Geometry Texture Memory Render to texture Frame Buffer

Buffering Vertices Old Days: processing expensive Try to maximize re-use Process once an use for many triangles Nowadays Getting vertex to hardware is expensive Process vertices in parallel

Application Program Graphics Driver Buffer / the vertices V1: (x1,y1,z1), (r1,g1,b1),(nx1,ny1,nz1) V2: (x2,y2,z2), (r2,g2,b2),(nx2,ny2,nz2) V3: (x3,y3,z3), (r3,g3,b3),(nx3,ny3,nz3) Command Buffer (Triangle ) (TCL) Cache Assembly Triangle Rasterize Tests shading Geometry Texture Memory Render to texture Frame Buffer

Process each vertex independently Application Program Graphics Driver Transform compute x and n Clip Light compute c Command Buffer (Triangle ) (TCL) Cache Assembly Triangle Rasterize Tests shading Geometry Texture Memory Render to texture Frame Buffer

Application Program in à out In: x, n, c Out: x, n, c, x, n, c Graphics Driver Command Buffer (Triangle ) (TCL) Cache Assembly Triangle Rasterize Tests shading Geometry Texture Memory Render to texture Frame Buffer

Just adds information to vertices Computes transformation screen space positons, normals Computes lighting new colors (in the old days, clipping done here hence TCL)

: Each vertex is independent Inputs are: vertex information for this vertex any global information Outputs are: current transform, lighting, vertex information for this vertex

Application Program Looking ahead When we program this pipeline piece It will still be: in à out Graphics Driver Command Buffer (Triangle ) (TCL) Cache Assembly Triangle Rasterize Tests shading Geometry Texture Memory Render to texture Frame Buffer

Store processed vertices in a cache Application Program Graphics Driver Store several so that we can re- use if many triangles share a vertex Command Buffer (Triangle ) (TCL) Cache Assembly Triangle Rasterize Tests shading Geometry Texture Memory Render to texture Frame Buffer

Caching Old days: Big deal, important for performance Now: Not even sure that it s always done

Put triangles back together Application Program Graphics Driver This is one of the few places where triangles exist Command Buffer (Triangle ) (TCL) Cache Assembly Triangle Rasterize Tests shading Geometry Texture Memory Render to texture Frame Buffer

Application Program In the fixed- function pipeline, there is no triangle processing step (maybe clipping) Graphics Driver Command Buffer (Triangle ) (TCL) Cache Assembly Triangle Rasterize Tests shading Geometry Texture Memory Render to texture Frame Buffer

Application Program Graphics Driver Rasterizer: Convert triangles to a list of pixels Input: Triangle with values (x,c, ) Output: s with values (x, c, ) Command Buffer (Triangle ) (TCL) Cache Assembly Triangle Rasterize Tests shading Geometry Texture Memory Render to texture Frame Buffer

s or Fragments I am using the terms interchangably (actually, today I am using pixel) Technically = a dot on the screen Fragment = a dot on a triangle might not become a pixel (fails z-test) might only be part of a pixel

Where do pixel values come from? Each vertex has values Each pixel comes from 3 vertices s interpolate their vertices values Barycentric interpolation All values (in a pixel) are interpolated

Each triangle is separate Careful processing of edges so no cracks 1 triangle à many pixels

Application Program Graphics Driver Rasterizer: Convert triangles to a list of pixels Input: Triangle per- vertex info*3 Output: s interpolated info Command Buffer (Triangle ) (TCL) Cache Assembly Triangle Rasterize Tests shading Geometry Texture Memory Render to texture Frame Buffer

Each triangle can make lots of pixels Application Program Graphics Driver We need to process pixels in parallel (all of the remaining steps) Command Buffer (Triangle ) (TCL) Cache Assembly Triangle Rasterize Tests shading Geometry Texture Memory Render to texture Frame Buffer

Process each pixel to get its final values Application Program Usually color, but sometimes Z Graphics Driver This step is a no- op in the 1988 pipeline Command Buffer (Triangle ) (TCL) Cache Assembly Triangle Rasterize Tests shading Geometry Texture Memory Render to texture Frame Buffer

in à out, each independent Application Program Graphics Driver All we do is change its values (potentially re- compute) Command Buffer (Triangle ) (TCL) Cache Assembly Triangle Rasterize Tests shading Geometry Texture Memory Render to texture Frame Buffer

Coming attractions Application Program This step will get to be pretty exciting Graphics Driver Command Buffer (Triangle ) (TCL) Cache Assembly Triangle Rasterize Tests shading Geometry Texture Memory Render to texture Frame Buffer

Ground Rules s are independent in à out Changing its position (x,y) makes it a different pixel (so you can t) Can change other values Or reject

Coming attractions Application Program This step will get to be pretty exciting Graphics Driver Command Buffer (Triangle ) (TCL) Cache Assembly Triangle Rasterize Tests shading Geometry Texture Memory Render to texture Frame Buffer

Application Program Graphics Driver Consider the pixel and it s destination Z- buffer (is it in front?) Alpha / color / stencil - tests Color blending Then (and only then) can we write Command Buffer (Triangle ) (TCL) Cache Assembly Triangle Rasterize Tests shading Geometry Texture Memory Render to texture Frame Buffer

Each pixel requires a read/write cycle Application Program Graphics Driver Potential memory bandwidth bottleneck (cache writes) There are worse memory bottlenecks Command Buffer (Triangle ) (TCL) Cache Assembly Triangle Rasterize Tests shading Geometry Texture Memory Render to texture Frame Buffer

We ve made it! Application Program Graphics Driver Now the frame buffer gets sent to the screen. (at the appropriate time) Command Buffer (Triangle ) (TCL) Cache Assembly Triangle Rasterize Tests shading Geometry Texture Memory Render to texture Frame Buffer

What if we didn t make it Suppose the triangle s pixels are occluded Removed by the z-buffer

Normal Z- Test Application Program Graphics Driver Happens at the end We wasted all that work! Command Buffer (Triangle ) (TCL) Cache Assembly Triangle Rasterize Tests shading Geometry Texture Memory Render to texture Frame Buffer