Standard Graphics Pipeline

Similar documents
Cornell University CS 569: Interactive Computer Graphics. Introduction. Lecture 1. [John C. Stone, UIUC] NASA. University of Calgary

CS4620/5620: Lecture 14 Pipeline

RealityEngine Graphics

Pipeline Operations. CS 4620 Lecture Steve Marschner. Cornell CS4620 Spring 2018 Lecture 11

CS452/552; EE465/505. Clipping & Scan Conversion

CS 130 Exam I. Fall 2015

Rendering. Converting a 3D scene to a 2D image. Camera. Light. Rendering. View Plane

Spring 2009 Prof. Hyesoon Kim

Rasterization Overview

The Pixel-Planes Family of Graphics Architectures

CS427 Multicore Architecture and Parallel Computing

CS451Real-time Rendering Pipeline

Pipeline Operations. CS 4620 Lecture 14

Line Drawing. Introduction to Computer Graphics Torsten Möller / Mike Phillips. Machiraju/Zhang/Möller

Optimization for Real-Time Graphics Applications

Spring 2011 Prof. Hyesoon Kim

FROM VERTICES TO FRAGMENTS. Lecture 5 Comp3080 Computer Graphics HKBU

Lecture 2. Shaders, GLSL and GPGPU

Computer Graphics. Hardware Pipeline. Visual Imaging in the Electronic Age Prof. Donald P. Greenberg October 23, 2014 Lecture 16

Computer Graphics. Rendering. by Brian Wyvill University of Calgary. cpsc/enel P 1

EECE 478. Learning Objectives. Learning Objectives. Rasterization & Scenes. Rasterization. Compositing

Module Contact: Dr Stephen Laycock, CMP Copyright of the University of East Anglia Version 1

E.Order of Operations

The Traditional Graphics Pipeline

CS 112 The Rendering Pipeline. Slide 1

Graphics Hardware and Display Devices

CHAPTER 1 Graphics Systems and Models 3

Models and Architectures

The Traditional Graphics Pipeline

Overview. Pipeline implementation I. Overview. Required Tasks. Preliminaries Clipping. Hidden Surface removal

Real-Time Graphics Architecture

CS230 : Computer Graphics Lecture 4. Tamar Shinar Computer Science & Engineering UC Riverside

Introduction to Computer Graphics with WebGL

graphics pipeline computer graphics graphics pipeline 2009 fabio pellacini 1

Triangle Rasterization

graphics pipeline computer graphics graphics pipeline 2009 fabio pellacini 1

Line Drawing. Foundations of Computer Graphics Torsten Möller

CS130 : Computer Graphics Lecture 2: Graphics Pipeline. Tamar Shinar Computer Science & Engineering UC Riverside

A SIMD-efficient 14 Instruction Shader Program for High-Throughput Microtriangle Rasterization

Graphics Hardware. Instructor Stephen J. Guy

Models and Architectures. Ed Angel Professor of Computer Science, Electrical and Computer Engineering, and Media Arts University of New Mexico

Rasterization. COMP 575/770 Spring 2013

The Traditional Graphics Pipeline

Parallelizing Graphics Pipeline Execution (+ Basics of Characterizing a Rendering Workload)

CSE 167: Introduction to Computer Graphics Lecture #4: Vertex Transformation

From Vertices to Fragments: Rasterization. Reading Assignment: Chapter 7. Special memory where pixel colors are stored.

Real-Time Graphics Architecture

Surface shading: lights and rasterization. Computer Graphics CSE 167 Lecture 6

Drawing Fast The Graphics Pipeline

- Rasterization. Geometry. Scan Conversion. Rasterization

Rasterization. CS 4620 Lecture Kavita Bala w/ prior instructor Steve Marschner. Cornell CS4620 Fall 2015 Lecture 16

What Next? Kevin Walsh CS 3410, Spring 2010 Computer Science Cornell University. * slides thanks to Kavita Bala & many others

Rendering Objects. Need to transform all geometry then

CS 130 Exam I. Fall 2015

Z 3 : An Economical Hardware Technique for High-Quality Antialiasing and Order-Independent Transparency

Visibility: Z Buffering

CS 4731: Computer Graphics Lecture 21: Raster Graphics: Drawing Lines. Emmanuel Agu

CS 316: Multicore/GPUs

Fall CSCI 420: Computer Graphics. 7.1 Rasterization. Hao Li.

Pipeline and Rasterization. COMP770 Fall 2011

Drawing Fast The Graphics Pipeline

As dierent shading methods and visibility calculations have diversied the. image generation, many dierent alternatives have come into existence for

OpenGL: Open Graphics Library. Introduction to OpenGL Part II. How do I render a geometric primitive? What is OpenGL

CS559: Computer Graphics. Lecture 12: Antialiasing & Visibility Li Zhang Spring 2008

Pipeline Operations. CS 4620 Lecture 10

3D Rendering Pipeline

Portland State University ECE 588/688. Graphics Processors

Real-Time Rendering (Echtzeitgraphik) Michael Wimmer

Computer Graphics and Visualization. Graphics Systems and Models

CSCI 420 Computer Graphics Lecture 14. Rasterization. Scan Conversion Antialiasing [Angel Ch. 6] Jernej Barbic University of Southern California

Programming Graphics Hardware

Current Trends in Computer Graphics Hardware

The graphics pipeline. Pipeline and Rasterization. Primitives. Pipeline

Rasterization. Rasterization (scan conversion) Digital Differential Analyzer (DDA) Rasterizing a line. Digital Differential Analyzer (DDA)

1.2.3 The Graphics Hardware Pipeline

Realtime 3D Computer Graphics Virtual Reality

The NVIDIA GeForce 8800 GPU

C P S C 314 S H A D E R S, O P E N G L, & J S RENDERING PIPELINE. Mikhail Bessmeltsev

Parallelizing Graphics Pipeline Execution (+ Basics of Characterizing a Rendering Workload)

Hardware-driven visibility culling

Rasterization. CS4620 Lecture 13

CSE 167: Introduction to Computer Graphics Lecture #5: Rasterization. Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2015

Parallel Rendering. Johns Hopkins Department of Computer Science Course : Rendering Techniques, Professor: Jonathan Cohen

CS 130 Final. Fall 2015

Clipping. Angel and Shreiner: Interactive Computer Graphics 7E Addison-Wesley 2015

Rasterization. MIT EECS Frédo Durand and Barb Cutler. MIT EECS 6.837, Cutler and Durand 1

Graphics and Interaction Rendering pipeline & object modelling

Course Recap + 3D Graphics on Mobile GPUs

Computer Graphics (CS 543) Lecture 9 (Part 2): Clipping. Prof Emmanuel Agu. Computer Science Dept. Worcester Polytechnic Institute (WPI)

GeForce4. John Montrym Henry Moreton

Real-Time Graphics Architecture

Reading. 18. Projections and Z-buffers. Required: Watt, Section , 6.3, 6.6 (esp. intro and subsections 1, 4, and 8 10), Further reading:

Last class. A vertex (w x, w y, w z, w) - clipping is in the - windowing and viewport normalized view volume if: - scan conversion/ rasterization

CSE 167: Lecture #4: Vertex Transformation. Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2012

The Graphics Pipeline

Deferred Rendering Due: Wednesday November 15 at 10pm

Game Architecture. 2/19/16: Rasterization

CS130 : Computer Graphics. Tamar Shinar Computer Science & Engineering UC Riverside

Drawing Fast The Graphics Pipeline

Evolution of GPUs Chris Seitz

Transcription:

Graphics Architecture Software implementations of rendering are slow. OpenGL on Sparc workstations. Performance can be improved using sophisticated algorithms and faster machines. Real-time large-scale 3D graphics is not possible without hardware support. At least some of the rendering steps can be replaced by hardware. Rendering is ideal for pipeline and multiprocessor architectures. CPS124, 296: Computer Graphics Graphics Hardware Page 1

A Brief History Generation: Capabilities for which the architecture was primarily designed. First Generation First machines came out in early 1980s. Transformation capabilities. Limited frame-buer processing. Flat shading. Smooth shading, z-buer not supported. Examples: SGI Iris 3000 (1985); Apollo DN570(1985). Limited smooth shading & depth buering. Examples: SGI 4DG (1986). Eective for wire-frame images. Later machines: CPS124, 296: Computer Graphics Graphics Hardware Page 2

The Second Generation Reduced memory costs & Application-specic ICs (ASICs): Allowed large frame-buer with multiple rendering processors. Interpolation of colors and depths. Memory capacity & bandwidth allowed depth buering. Examples: SGI GT (1988); Apollo DN590(1988). Later machines: limited texture mapping. Antialiasing of points and lines. Examples: SGI VGX; HP VRX; Apollo DN1000. CPS124, 296: Computer Graphics Graphics Hardware Page 3

The Third Generation SGI Reality Engine: Lighting, smooth shading, depth-buer, texture mapping, and antialiasing. 0:5 millions triangles per second, under assumptions: triangles in short strip. 10% triangles intersect the view frustum. Filtering for textures; large textures. Antialiasing for polygons. Pixel ll rate: 30Hz rendering of 1280 1024 full-screen images. SGI Innite Reality: Pixel ll rate 60Hz. Virtual texture memory. Display-list memory on graphics processor. Onyx and Onyx2 platforms. CPS124, 296: Computer Graphics Graphics Hardware Page 4

CPS124, 296: Computer Graphics Graphics Hardware Page 5 Display traversal View Transform Standard Graphics Pipeline Modeling transform Clipping Trivial accept/ reject Viewport Mapping Lighting Rasterization Monitor Graphics Pipeline

Graphics Pipeline Assume polygons are being rendered. Application processing between frames. Geometry processing: 3D Polygons to 2D polygons (in screen coordinates). Transformation from local to world coordinates. Lighting at vertices. Transforming to a canonical view volume. Clipping. Perspective projection. Transformation to screen coordinates. Rasterization: 2D polygons! pixels. Scan conversion. Shading. Hidden surface removal. Display processing: Converting pixels to analog display. Front end vs back end. CPS124, 296: Computer Graphics Graphics Hardware Page 6

Pipelines Subsystems Geometry processing is ideal for pipelined processing. Stage 1 Stage 2 Stage 3 p3 p2 p1 p4 p3 p2 Earlier stages process the next polygon while later stages are processing the current polygon. Latency and throughput. SGI used pipelined processing in early architectures. CPS124, 296: Computer Graphics Graphics Hardware Page 7

Parallel Subsystems Geometry processing faster on parallel machines. Polygons can be processed in parallel. Each processor performs all steps of geometry processing on a polygon. Multiple polygons are processed simultaneously. Recent SGI systems use this approach. Rasterization is ideal for parallel processing. 1280 1024 1:3Mpixels need to be processed per frame. Supersampling: # subpixels 10{20 million. Most pixels are processed multiple times in z- buer algorithms. Use multiple processors. CPS124, 296: Computer Graphics Graphics Hardware Page 8

Partitioning of Memory 1 1 2 2 1 2 1 2 1 1 2 2 3 4 3 4 3 3 4 4 1 2 1 2 3 3 4 4 3 4 3 4 Contiguous vs Interleaved Contiguous partitioning performs well in the best case. Polygons are uniformly distributed. Each processor handles only a fraction of the polygons. Load on each processor is balanced. Performs poorly in the worst case. All polygons are in a local region. A few processors do all the work. Interleaved is best in the worst case. Each processor handles all the polygons. Load is balanced. CPS124, 296: Computer Graphics Graphics Hardware Page 9

SGI Reality Engine CPS124, 296: Computer Graphics Graphics Hardware Page 10

SGI Reality Engine Three, four, or six graphics boards. Geometry board: Input FIFO Command processor Geometry engines: 6; 8; or 12. Raster memory board: 1; 2, or 4. 5 fragment generators. Each with its texture memory. 80 image engines. Each image engine with frame buer memory: 256 bits per pixel. Display board: Video functions. Video timing Color mapping D/A conversion FIFO memories at Input and output of each geometry engine. Input of each fragment generator. Input of each image engine. CPS124, 296: Computer Graphics Graphics Hardware Page 11

Command Processor Receives OpenGL command from applications and other processors. Directs each triangle to one of the geometry engines. Round-robin distribution. No load balancing. Infrequent command: e.g., matrix multiplication, lighting model. Broadcasted to all geometry engines. Synchronization is required. Frequent command: e.g., vertex color, coordinate, normal. Bundled with each rendering command. Sent to individual geometry engines. Breaks long connected sequences of segments and triangles into short groups. Each piece sent to a single geometry engine. CPS124, 296: Computer Graphics Graphics Hardware Page 12

Geometry Engines From Command Processor 48 ASIC 48 64 256K X 64 DRAM i860xp Transformations Lighting Clipping Projection Texture To Triangle Bus Output is sent to the triangle bus. Connecting all geometry engines to all fragment generators. 1M smooth shaded, depth buered, texture mapped, triangles per second. Depth, texture calculations in double precision. Peak performance 100MFLOPS CPS124, 296: Computer Graphics Graphics Hardware Page 13

Fragment Generator From Triangle Bus 48 ASIC ASIC 1M X 16 DRAM 16 ASIC ASIC To 16 Image Engines Output of geometry engines is sent to 5, 10, or 20 fragment generators; say 20. Each fragment generator responsible for 1/20 of the screen's pixels; 64K pixels. Interleaved partitioning of the screen. Computes the intersection of the set of pixels fully or partially covered by the triangle. For each fragment it computes Depth, color (including texture) A subsample mask for each fragment. Output is sent to 16 image engines. CPS124, 296: Computer Graphics Graphics Hardware Page 14

Fragment Generator Subsample mask: 4, 8, or 16 samples from a 8 8 grid. Same subsamples for each pixel. Depth is computed at the center sample. Ensures accurate depth calculation at each subpixel location. Color sample values: If triangle covers the pixel, compute color at the center. If partially covers, compute near the centroid of intersection of triangle and pixel. Incremental algorithm for rasterization. CPS124, 296: Computer Graphics Graphics Hardware Page 15

Image Engine From Fragment Generator 4 Image Engine 16 256K X 16 DRAM 1 To Display Generator 16 Image engines connected to each fragment generator. Each image engine responsible for 4K pixels. (Or 8K, 16K if fewer raster memory cards.) Each image engine assigned to a xed subset of pixels. Each image engine controls a 256K16 DRAM. DRAM forms frame and depth buers. CPS124, 296: Computer Graphics Graphics Hardware Page 16

Image Engines Each pixel is assigned 1024 bits (Or 512, 256 if fewer raster memory cards.) These bits store: Color (R, G,B, A values) for each subpixel 12bits each if 8 subsamples; 8 bits each if 16 subsamples. Depth: 32 bits if 8 subsamples 24 bits if 16 subsamples. 1; 2; 4 1280 1024 displayable color buers Displayable buers have the same resolution as subsamples. CPS124, 296: Computer Graphics Graphics Hardware Page 17

Image Engines When a triangle is passed to a fragment generator, it's slope information is passed to image engines. Using slope information, image engines compute depth at each subpixel of 8 8 grid. For each 1 in the mask, depth value is compared with the value stored in the depth buer. If comparison succeeds Color and depth values in the framebuer are updated. Aggregate color value is recomputed. New color value is rewritten on the displayable buer. CPS124, 296: Computer Graphics Graphics Hardware Page 18

Innite Reality CPS124, 296: Computer Graphics Graphics Hardware Page 19

Innite Reality Pixel ll rate 60 frames per second. Each pixel is assigned 0.5{2K bits. Speed up command processinng. Display lists trasferred from the host processor using DMA transfer. 15MB memory to store display lists at the graphics processor. Customized geometry processors. Geometry distribution: Round robin: Simple assignment. Least busy: Better performance. Vertex bus instead of triangle bus. Triangle slope information is not passed. Only vertex information is passed. Reduces bandwidth by 60%. Load on vertex and input buses are similar. Additional hardware for texture mapping. CPS124, 296: Computer Graphics Graphics Hardware Page 20

Pixel Plane 5 CPS124, 296: Computer Graphics Graphics Hardware Page 21

Pixel Plane 5 (Fuchs et al., 1989) Performance: 2:3M triangles per second. Parallel geometry processing. 50 graphics processors Parallel rasterization. Contiguous partition. 20 renderes. separate shaders. CPS124, 296: Computer Graphics Graphics Hardware Page 22

Graphics Processor Receives polygons from application or other processors. Performs geometric processing. Assigns processed polygons to contiguous partitions. Each partition is 128 128 square. Each graphics processor has a bin for every partition. After all polygons are processed, all bins are passed to renderers. Communication is through a high bandwidth ring network. CPS124, 296: Computer Graphics Graphics Hardware Page 23

Renderer All bins are processed in parallel A renderer rasterizes all polygons in one partition's bin from each graphics processor. After processing these bins, renderee processes bins of another partition. Rasterization is performed using logic-enhanced memory. A small processor for each of the 128 128 = 16K pixels. 2K memory for each pixel. Each processor maintains a few states, e.g., its x- & y-coordinates, and evaluates Ax + By + C + Dx 2 + Exy + Fy 2 CPS124, 296: Computer Graphics Graphics Hardware Page 24

Pixel Flow (Molnar et al., 1992) Being developed by Hewlett-Packard. Unbounded parallelism in theory; MIMD machine. Performance: in theory: unlimited polygons per second. Rendering is performed by n complete rendering systems. Each system includes a graphic processor, renderer, and a shader. Each system processes 1=n polygons. It outputs the frame buer and also the z- buer. Composites n dierent images to produce the overall image. Unlike other machines, one pixel may be processed by many processors. CPS124, 296: Computer Graphics Graphics Hardware Page 25