Real-Time Graphics Architecture

Transcription

Real-Time Graphics Architecture
Lecture 4: Parallelism and Communication
Kurt Akeley, Pat Hanrahan

Topics
1. Frame buffers
2. Types of parallelism
3. Communication patterns and requirements
4. Sorting classification for parallel rendering (with examples)

Frame Buffers

Raster vs. calligraphic
- Raster (image order)
  - Dominant choice
- Calligraphic (object order)
  - Earliest choice (Sketchpad)
  - E&S terminals in the 70s and 80s
  - Works with light pens
  - Scene complexity affects frame rate
  - Monitors are expensive
  - Still required for FAA simulation
    - Increases absolute brightness of light points

Frame buffer definitions
- What is a frame buffer?
- What can we learn by considering different definitions?

Frame buffer definition #1
- Storage for commands that are executed to refresh the display
- Allows for raster or calligraphic display (e.g. Megatech)
- Frame buffer for calligraphic display is a display list
  - OpenGL render list?
- Key point: frame buffer contents are interpreted
  - Color mapping
  - Image scaling, warping
  - Window system (overlay, separate windows, ...)
  - Address Recalculation Pipeline

Frame buffer definition #2
- Image memory used to decouple the render frame rate from the display frame rate
- Meets common understanding of frame buffer as image
- Leads naturally to double buffering
  - One render buffer, one display buffer, swap
  - n-buffering also possible, can control latency
- Key idea: decoupling enables general-purpose GPU
  - Visual simulation has high render frame rate
  - MCAD has low render frame rate
  - Window manager has no frame rate

Frame buffer definition #3
- All pixel-assigned memory used to assemble and display the images being rendered
- Key point: frame buffer is active participant in rendering
- Leads to non-color buffers: depth, stencil, window control
  - OpenGL treats these buffers as part of frame buffer
  - Some reserve "frame buffer" for color images
- Should be n-buffered in some cases (sort last)
- RealityEngine frame buffer can be deeper than wide or high
- History cycles through this definition
  - 2-D manipulation
  - 3-D painter's algorithm
  - 3-D depth, stencil, accumulation, multi-pass
  - Programmable shading
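
The double-buffering idea in definition #2 can be sketched in a few lines. This is a minimal illustration rather than any particular system's design: the class name `DoubleBuffer` and its methods are hypothetical, and pixels are plain integers. The point is that the display only ever reads the front buffer, so rendering can proceed at its own rate in the back buffer until a swap.

```python
class DoubleBuffer:
    """Two image buffers: one being rendered into, one being displayed."""

    def __init__(self, width, height, clear=0):
        self.width, self.height = width, height
        self.buffers = [[clear] * (width * height) for _ in range(2)]
        self.front = 0  # index of the buffer the display reads

    @property
    def back(self):
        # The renderer always writes to the buffer that is not displayed.
        return 1 - self.front

    def write_pixel(self, x, y, value):
        self.buffers[self.back][y * self.width + x] = value

    def scanout(self):
        # What the display would read during refresh (a copy of the front buffer).
        return list(self.buffers[self.front])

    def swap(self):
        # Exchange the roles of the two buffers once a frame is complete.
        self.front = self.back


fb = DoubleBuffer(2, 2)
fb.write_pixel(0, 0, 255)
before = fb.scanout()  # display still sees the previous (cleared) frame
fb.swap()
after = fb.scanout()   # display now sees the newly rendered frame
```

Extending the list of buffers beyond two would give the n-buffering mentioned on the slide, at the cost of extra latency between rendering and display.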

Frame buffer is optional
- Calligraphic display
  - If we don't define display list as frame buffer
- Follow-the-beam rendering
  - Minimizes latency
  - Saves cost if frames are never dropped
- Talisman-like image assembly (3-D sprites)
  - Old idea (visual simulation, window systems)
- GigaPixel render tile
  - Frame buffer stores color images only
  - Depth, stencil, etc. in small tile

Dominant architecture is consistent
- SGI architectures look like ATI architectures, which look like NVIDIA architectures
- Details are evolving, but big picture remains the same
- Why is this?
  - Simplicity of design
  - Simplicity of algorithms
  - Simplicity of immediate-mode approach

Simplicity of design
- Frame buffer operations
  - Blending: merge fragment and pixel color
  - Depth buffering: save nearest fragment
  - Stencil buffering: simple pixel state machine
  - Accumulation buffering: high-resolution color arithmetic
  - Antialiasing: (to be covered later)
- All frame buffer operations:
  - Combine fragment and pixel data (not just a replace)
    - But replace operation is optimized, e.g., no parity/ECC
  - Are local (no intra-pixel dependencies)
- Why aren't fragment operations programmable?

Simplicity of algorithms
- Frame buffer employs brute-force simplicity
  - Hidden surface elimination: depth-buffer vs. sort/painter
  - Capping: stencil-based vs. object calculations
- Image-space algorithm is efficient
  - Just samples, never object information, locality
  - Just-in-time calculation, steady cost function
- Accumulation buffer (high-resolution color arithmetic)
  - "The Accumulation Buffer", Haeberli and Akeley, Proceedings of SIGGRAPH 90
- Volume rendering using 3D textures
- Multi-pass rendering
  - "Interactive Multi-pass Programmable Shading", Peercy, Olano, Airey, and Ungar, Proceedings of SIGGRAPH 00
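
A rough sketch of how a frame buffer operation combines fragment and pixel data, assuming a single-channel color, depth values in [0, 1] with smaller meaning nearer, and a "less than" depth test. The names `Pixel` and `merge_fragment` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Pixel:
    color: float = 0.0
    depth: float = 1.0  # assumption: 1.0 is the far plane, smaller is nearer


def merge_fragment(pixel, frag_color, frag_depth, alpha=1.0):
    """Depth-test one fragment against one pixel, then blend if it passes."""
    if frag_depth >= pixel.depth:
        return False  # fragment is hidden; the pixel is left unchanged
    # Blending: merge fragment and pixel color (alpha = 1.0 is a plain replace).
    pixel.color = alpha * frag_color + (1.0 - alpha) * pixel.color
    # Depth buffering: remember the nearest fragment seen so far.
    pixel.depth = frag_depth
    return True


def resolve(fragments):
    """Apply opaque fragments (color, depth) to a fresh pixel in the given order."""
    pixel = Pixel()
    for color, depth in fragments:
        merge_fragment(pixel, color, depth)
    return pixel


frags = [(0.8, 0.5), (0.2, 0.7), (0.6, 0.3)]
forward = resolve(frags)
backward = resolve(list(reversed(frags)))
```

For opaque fragments the result does not depend on submission order, which is the brute-force alternative to sorting for the painter's algorithm. The operation touches only one pixel's state, which is the locality the slide refers to.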

Simplicity of immediate-mode
- Frame buffer contents are context
- Matches 2D/window-rendering model
- [Diagram: Rendering System (little graphics state here), Frame buffer (most graphics state here)]

Decreasing display bandwidth burden
- Historically display bandwidth was a limiting factor
  - Hence Sproull's Rule: fill rate >= display rate
- Now display bandwidth is almost inconsequential
- [Table columns: Year, System, FB (GB), Disp (GB), Disp / FB]
  - 1984: SGI 2000-series
  - SGI GTX: 1.8 *
  - SGI InfiniteReality
  - NVIDIA 7900 GTX: /70
  * VRAM provided separate video bandwidth
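
The arithmetic behind Sproull's Rule can be worked through with a small sketch. The display mode used here (1280 x 1024, 60 Hz, 4 bytes per pixel) and the fill rates are hypothetical figures chosen for illustration; they are not the values from the table above, and the function names are likewise assumptions.

```python
def display_rate_pixels(width, height, refresh_hz):
    """Pixels per second the display must be fed during refresh."""
    return width * height * refresh_hz


def display_bandwidth_bytes(width, height, refresh_hz, bytes_per_pixel):
    """Bytes per second read from the frame buffer for scanout."""
    return display_rate_pixels(width, height, refresh_hz) * bytes_per_pixel


def satisfies_sproull(fill_rate_pixels, width, height, refresh_hz):
    """Sproull's Rule as stated on the slide: fill rate >= display rate."""
    return fill_rate_pixels >= display_rate_pixels(width, height, refresh_hz)


# Hypothetical display mode, for illustration only.
rate = display_rate_pixels(1280, 1024, 60)           # 78,643,200 pixels/s
bandwidth = display_bandwidth_bytes(1280, 1024, 60, 4)  # 314,572,800 bytes/s
slow_ok = satisfies_sproull(50_000_000, 1280, 1024, 60)
fast_ok = satisfies_sproull(1_000_000_000, 1280, 1024, 60)
```

Because the display rate is fixed by the monitor while fill rate keeps growing, the ratio of display bandwidth to frame buffer bandwidth shrinks over time, which is the trend the slide describes.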

Parallelism and Communication

Parallelism and communication
- Parallelism: using multiple computational units to process work in parallel
- Communication: connecting the computational units to allow work to be distributed and aggregated
- Issues
  - Dependencies
    - Ordering
    - Sorting
  - Scalability
    - Computation
    - Bandwidth
    - Load balancing

Parallelism taxonomy
- Hardware parallelism (simultaneous execution on multiple processors)
  - Data parallelism [aka parallelism] (same task on similar data sets)
    - Frame-parallelism (batch, SGI N-clops)
    - Object-parallelism (geometry)
    - Image-parallelism (fragment/pixel)
  - Task parallelism (different tasks on similar OR differing data sets)
    - Multi-processing (on multiple CPUs)
    - Pipelining (the graphics pipeline)
- Virtual parallelism (time sharing a single processor, usually with hardware support)
  - Data parallelism
    - Multi-processing (graphics context switching)
    - Multi-threading (almost defines a GPU-like processor)
  - Task parallelism
    - Multi-processing (time sharing a single CPU)
    - Multi-threading (Direct3D-10 "common core")

Graphics is embarrassingly parallel
- Ample self-similar data sets
  - Frames, vertexes, fragments, texels, pixels
- With minimal dependencies
  - Few intra-set dependencies
    - Pixels (in the frame buffer) are the significant exception
  - Inter-set dependencies are purely sequential
- Graphics pipeline is designed to minimize dependencies
  - Other graphics architectures have more dependencies
    - E.g., for global lighting effects
  - But graphics pipeline has huge redundancies
    - Hence many opportunities for optimization
    - How hard should we work to do things wrong?

Geometry parallelism trend (SGI)
- [Chart: Transform Length and Transform Width by Model: G, GTX, VGX, RE, IR]

Image parallelism trend (SGI)
- [Chart: Rasterization by Model: G, GTX, VGX, RE, IR]

The clear trend
- Shorter and wider
- Why?

Communication taxonomy
- Sorting (fundamental)
- Distribution (introduced by parallelism)
  - Object
  - Image
- Routing (introduced by parallelism)
  - Texturing

Sorting is fundamental
- [Diagram labels: Sorting, Distribution, Object, Image, Routing, Texturing]
- I. E. Sutherland, R. F. Sproull, and R. A. Schumacker, "A characterization of ten hidden surface algorithms"
  - Classified by order of x, y, z radix sorts

Pipelining vs. parallelism

Issue                        Task parallelism (pipelining)   Data parallelism
Ordering dependencies        Easy                            Challenging
Sorting dependencies         Easy                            Challenging
Computation scalability      Poor                            Challenging
Bandwidth scalability        Poor                            Challenging
Load balancing scalability   (Nearly) impossible             Challenging

Ordering challenges
- Fundamental:
  - Frame buffer operations
    - Painter's algorithm
  - Memory hazards
    - Texture writes
    - Render, copy to texture, render
    - Readback
- From pipelining:
  - Changes to graphics state
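
The scalability rows of the pipelining vs. parallelism table can be illustrated with a toy throughput model. This is a simplification under stated assumptions: each work item passes through every stage, stage times are fixed, and the data-parallel case ignores all communication and load-balancing costs (which is exactly where the "Challenging" entries come from). The function names and stage times are hypothetical.

```python
def pipeline_throughput(stage_times):
    """Items per unit time for one pipeline with one unit per stage.

    The slowest stage sets the rate, so adding stages does not help
    unless the work can be split evenly between them.
    """
    return 1.0 / max(stage_times)


def data_parallel_throughput(stage_times, n_units):
    """Items per unit time for n identical units that each run every stage.

    Idealized: assumes perfect load balance and free communication.
    """
    return n_units / sum(stage_times)


stage_times = [1.0, 4.0, 1.0]  # hypothetical per-item cost of three stages
pipelined = pipeline_throughput(stage_times)          # limited by the 4.0 stage
replicated = data_parallel_throughput(stage_times, 3)  # same unit count, replicated
```

With three units in both cases, the unbalanced pipeline delivers half the throughput of the idealized data-parallel arrangement, which is one way to read the "poor" computation scalability and "(nearly) impossible" load balancing of pure pipelining.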

Sorting taxonomy
- Pipeline stages: Application, Command, Geometry, Rasterization, Texture, Fragment, Display
- Sort points: Sort-First, Sort-Middle, Sort-Last Fragment, Sort-Last Image Composition

Sort-First

Sort-first
- [Diagram: parallel App, Cmd, Tex, Frag, Disp pipelines with SORT after Cmd]
- Pre-transformation
- Point-to-point communication scales
- Coarse tiling incurs load imbalance
- Princeton Display Wall, Stanford WireGL

Sort-first
Order                      Automatic (conceptually)
Sort                       Pre-stage ("cheat")
Compute scalability        Good
Bandwidth scalability      Good
Load balance scalability   Poor
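
A minimal sketch of the sort-first step, assuming each primitive's screen-space bounding box is already known before transformation (the "cheat" noted in the table) and that the screen is divided into a coarse grid of tiles, one per pipeline. The function names, tile sizes, and bounding boxes are hypothetical.

```python
from collections import defaultdict


def overlapped_tiles(bbox, tile_w, tile_h, tiles_x, tiles_y):
    """Yield (tx, ty) for every tile a screen-space bounding box touches."""
    x0, y0, x1, y1 = bbox  # inclusive pixel bounds
    tx_lo = max(0, x0 // tile_w)
    tx_hi = min(tiles_x - 1, x1 // tile_w)
    ty_lo = max(0, y0 // tile_h)
    ty_hi = min(tiles_y - 1, y1 // tile_h)
    for ty in range(ty_lo, ty_hi + 1):
        for tx in range(tx_lo, tx_hi + 1):
            yield tx, ty


def sort_first(bboxes, tile_w, tile_h, tiles_x, tiles_y):
    """Bucket primitive indices by the tiles (pipelines) that must render them."""
    buckets = defaultdict(list)
    for prim_id, bbox in enumerate(bboxes):
        for tile in overlapped_tiles(bbox, tile_w, tile_h, tiles_x, tiles_y):
            buckets[tile].append(prim_id)
    return dict(buckets)


# Hypothetical scene: a 1024x1024 screen split into 2x2 tiles of 512x512,
# with most primitives clustered in the top-left corner.
bboxes = [(10, 10, 20, 20), (30, 5, 40, 15), (60, 60, 70, 70), (500, 10, 530, 20)]
buckets = sort_first(bboxes, 512, 512, 2, 2)
```

Each primitive goes only to the pipelines that need it, so communication is point-to-point, but the clustered scene leaves two of the four tiles with no work at all, which is the load imbalance of coarse tiling. The primitive that straddles a tile boundary is sent to both of its tiles.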

Sort-first
- [Diagram: parallel App, Cmd, Tex, Frag pipelines with SORT and ROUTE feeding a single Disp]

Ring parallelism
- [Diagram: App, Cmd, Tex, Frag pipelines linked by DIST stages, with ROUTE feeding a single Disp]
- 3Dlabs

Sort-Middle

Image-space work distribution
- Parke: tiled
- Fuchs: interleaved

Sort-middle interleaved
- [Diagram: App, DISTRIBUTE, parallel Cmd stages, SORT, parallel Tex and Frag stages, ROUTE, Disp]
- Geometry work load-balanced, except clipping and tessellation
- Broadcast communication does not scale, but supports ordering
- Finely interleaved screen tiling ensures excellent load balance
- SGI Graphics Workstations: RealityEngine, InfiniteReality

Sort-middle interleaved
Order                      Force sequence at triangle sort
Sort                       Broadcast
Compute scalability        Good
Bandwidth scalability      Limited by sort broadcast
Load balance scalability   Good

SGI RealityEngine
- [Diagram with bandwidth labels: 240 MB/s, 1600 MB/s, 3200 MB/s]

Sort-middle tiled
- [Diagram: parallel App, Cmd, Tex, Frag, Disp pipelines with SORT and ROUTE]
- UNC PixelPlanes, Stanford Argus
- Point-to-point communication scales
- Coarse tiling incurs load imbalance

Sort-middle tiled (immediate mode)
Order                      Force sequence at triangle sort
Sort                       Can approach point-to-point
Compute scalability        Good
Bandwidth scalability      Good
Load balance scalability   Poor for rasterization (due to large triangles)

Sort-middle tiled (chunked)
Order                      Force sequence at triangle sort
                           Full-frame delay, render to texture difficulties
Sort                       Can approach point-to-point
Compute scalability        Good
Bandwidth scalability      Good
Load balance scalability   Good
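
The load-balance difference between tiled and interleaved screen partitioning can be seen by counting which rasterizer owns each pixel of a primitive. The sketch below assumes a rectangular footprint in place of a real triangle, a 64 x 64 screen, and four rasterizers in a 2 x 2 arrangement; all names and sizes are hypothetical.

```python
from collections import Counter


def tiled_owner(x, y, screen_w, screen_h, nx, ny):
    """Coarse tiling: the screen is cut into nx * ny large contiguous tiles."""
    tile_w, tile_h = screen_w // nx, screen_h // ny
    return (y // tile_h) * nx + (x // tile_w)


def interleaved_owner(x, y, nx, ny):
    """Fine interleaving: ownership repeats every nx pixels in x and ny in y."""
    return (y % ny) * nx + (x % nx)


def pixel_load(owner, rect):
    """Count pixels per rasterizer for a half-open rectangle (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = rect
    return Counter(owner(x, y) for y in range(y0, y1) for x in range(x0, x1))


footprint = (0, 0, 16, 16)  # a 16x16 primitive in the top-left corner
tiled = pixel_load(lambda x, y: tiled_owner(x, y, 64, 64, 2, 2), footprint)
interleaved = pixel_load(lambda x, y: interleaved_owner(x, y, 2, 2), footprint)
```

With coarse tiles, one rasterizer receives all 256 pixels while the others idle. With interleaving, every rasterizer receives 64 pixels, but every rasterizer must also see the primitive, which is why the interleaved scheme relies on broadcast.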

UNC Pixel-Planes 5 (1990)

Sort-Last

Sort-last fragment
- [Diagram: parallel App, Cmd, Tex pipelines, SORT, ROUTE, parallel Frag and Disp stages]
- Improved texture locality
- No redundant work in FG
- Exposes rasterization load imbalance to application
- Point-to-point communication scales, but requires more bandwidth
- Finely interleaved screen tiling ensures excellent load balance
- Possible, but difficult, to maintain ordering
- Kubota Denali, E&S Freedom 3000

Sort-last fragment
Order                      Force sequence at fragment sort
Sort                       Point-to-point, high bandwidth
Compute scalability        Good
Bandwidth scalability      OK (sorting is the bottleneck)
Load balance scalability   OK (exposed to application)

Kubota Denali (1993)
- [Diagram labels: TEM 48 5 X6 24 X10 FBM]
- Denali Technical Overview 1.0, Kubota Pacific Computer, 1993

Image composition
- Z comp
- Other combiners possible

Sort-last image composition
- [Diagram: parallel App, Cmd, Tex, Frag, Disp pipelines with SORT at the display stage]
- Exposes rasterization load imbalance to application
- Point-to-point ring interconnect scales
- Two-stage image composition loses ordering
- UNC/HP PixelFlow, Aizu VC-1, Stanford Lightning-2

Sort-last image composition
Order                      Not fully supported!
Sort                       One to many for each pipeline
Compute scalability        Excellent
Bandwidth scalability      Excellent
Load balance scalability   OK (exposed to application)
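
Depth-based image composition can be sketched as a per-pixel merge of full images. This is an illustration under simple assumptions: each pipeline produces a complete image of (color, depth) pairs, smaller depth means nearer, and ties go to the earlier image. The function names and sample images are hypothetical.

```python
from functools import reduce


def z_composite(image_a, image_b):
    """Merge two images of (color, depth) pixels; the nearer sample wins."""
    return [a if a[1] <= b[1] else b for a, b in zip(image_a, image_b)]


def composite_all(images):
    """Fold any number of per-pipeline images into one final image."""
    return reduce(z_composite, images)


# Hypothetical two-pixel images from three independent pipelines.
image_a = [("red", 0.2), ("red", 0.9)]
image_b = [("blue", 0.5), ("blue", 0.4)]
image_c = [("green", 0.8), ("green", 0.1)]
final = composite_all([image_a, image_b, image_c])
```

Only the nearest sample per pixel survives, so the order in which primitives were submitted across pipelines is not preserved. That is harmless for opaque depth-tested rendering but is why ordering is listed as not fully supported.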

UNC PixelFlow
- [Figure from J. Poulton, J. Eyles, S. Molnar, H. Fuchs, "PixelFlow: The Realization"]

Sort-Everywhere

Sort-everywhere: Pomegranate
- [Diagram: parallel App, Cmd, Tex, Frag, Disp pipelines, with Mem attached to the Tex and Frag stages, connected by DISTRIBUTE, SORT, and ROUTE networks]

Architecture comparison (X indicates an issue)

                            Ordered   Compute       Bandwidth     Load balance
                                      scalability   scalability   scalability
Sort-first                                                        X
Sort-middle interleaved                             X
Sort-middle tiled (immd)                                          X
Sort-middle tiled (chunk)
Sort-last fragment          (X)                     X             X
Sort-last image comp.       X                                     X
Sort-everywhere

Summary
- GPU architecture trend: Pipeline -> hardware-parallel -> virtual-parallel

Readings

Required
1. S. Molnar, M. Cox, D. Ellsworth, H. Fuchs, "A sorting classification of parallel rendering"
2. Fuchs et al., "A heterogeneous multiprocessor graphics system using processor-enhanced memories" (PP5)
3. Eyles et al., "PixelFlow: The Realization"

Recommended
1. F. I. Parke, "Simulation and expected performance analysis of multiple processor z-buffer systems"
2. H. Fuchs, "Distributing a visible surface algorithm over multiple processors"


CS 316: Multicore/GPUs CS 316: Multicore/GPUs Kavita Bala Fall 2007 Computer Science Cornell University Announcements Core Wars will be out in the next couple of days Aim at having fun! Number of points allocated to it is small

More information

Lecture 6: Texture. Kayvon Fatahalian CMU : Graphics and Imaging Architectures (Fall 2011)

Lecture 6: Texture. Kayvon Fatahalian CMU : Graphics and Imaging Architectures (Fall 2011) Lecture 6: Texture Kayvon Fatahalian CMU 15-869: Graphics and Imaging Architectures (Fall 2011) Today: texturing! Texture filtering - Texture access is not just a 2D array lookup ;-) Memory-system implications

More information

Lecture 9: Deferred Shading. Visual Computing Systems CMU , Fall 2013

Lecture 9: Deferred Shading. Visual Computing Systems CMU , Fall 2013 Lecture 9: Deferred Shading Visual Computing Systems The course so far The real-time graphics pipeline abstraction Principle graphics abstractions Algorithms and modern high performance implementations

More information

Rendering. Converting a 3D scene to a 2D image. Camera. Light. Rendering. View Plane

Rendering. Converting a 3D scene to a 2D image. Camera. Light. Rendering. View Plane Rendering Pipeline Rendering Converting a 3D scene to a 2D image Rendering Light Camera 3D Model View Plane Rendering Converting a 3D scene to a 2D image Basic rendering tasks: Modeling: creating the world

More information

PowerVR Series5. Architecture Guide for Developers

PowerVR Series5. Architecture Guide for Developers Public Imagination Technologies PowerVR Series5 Public. This publication contains proprietary information which is subject to change without notice and is supplied 'as is' without warranty of any kind.

More information

Working with Metal Overview

Working with Metal Overview Graphics and Games #WWDC14 Working with Metal Overview Session 603 Jeremy Sandmel GPU Software 2014 Apple Inc. All rights reserved. Redistribution or public display not permitted without written permission

More information

Windowing System on a 3D Pipeline. February 2005

Windowing System on a 3D Pipeline. February 2005 Windowing System on a 3D Pipeline February 2005 Agenda 1.Overview of the 3D pipeline 2.NVIDIA software overview 3.Strengths and challenges with using the 3D pipeline GeForce 6800 220M Transistors April

More information

Texture. Real-Time Graphics Architecture. Kurt Akeley Pat Hanrahan.

Texture. Real-Time Graphics Architecture. Kurt Akeley Pat Hanrahan. Texture Real-Time Graphics Architecture Kurt Akeley Pat Hanrahan http://graphics.stanford.edu/courses/cs448-07-spring/ Topics 1. Projective texture mapping 2. Texture filtering and mip-mapping 3. Early

More information

frame buffer depth buffer stencil buffer

frame buffer depth buffer stencil buffer Final Project Proposals Programmable GPUS You should all have received an email with feedback Just about everyone was told: Test cases weren t detailed enough Project was possibly too big Motivation could

More information

CSCI 402: Computer Architectures. Parallel Processors (2) Fengguang Song Department of Computer & Information Science IUPUI.

CSCI 402: Computer Architectures. Parallel Processors (2) Fengguang Song Department of Computer & Information Science IUPUI. CSCI 402: Computer Architectures Parallel Processors (2) Fengguang Song Department of Computer & Information Science IUPUI 6.6 - End Today s Contents GPU Cluster and its network topology The Roofline performance

More information

Graphics Processing Unit Architecture (GPU Arch)

Graphics Processing Unit Architecture (GPU Arch) Graphics Processing Unit Architecture (GPU Arch) With a focus on NVIDIA GeForce 6800 GPU 1 What is a GPU From Wikipedia : A specialized processor efficient at manipulating and displaying computer graphics

More information

A Sorting Classification of Parallel Rendering

A Sorting Classification of Parallel Rendering A Sorting Classification of Parallel Rendering Steven Molnar *, Michael Cox, David Ellsworth *, Henry Fuchs * * University of North Carolina at Chapel Hill and Princeton University Front-page photo: Simulation

More information

CSE 591: GPU Programming. Introduction. Entertainment Graphics: Virtual Realism for the Masses. Computer games need to have: Klaus Mueller

CSE 591: GPU Programming. Introduction. Entertainment Graphics: Virtual Realism for the Masses. Computer games need to have: Klaus Mueller Entertainment Graphics: Virtual Realism for the Masses CSE 591: GPU Programming Introduction Computer games need to have: realistic appearance of characters and objects believable and creative shading,

More information

GPGPU. Peter Laurens 1st-year PhD Student, NSC

GPGPU. Peter Laurens 1st-year PhD Student, NSC GPGPU Peter Laurens 1st-year PhD Student, NSC Presentation Overview 1. What is it? 2. What can it do for me? 3. How can I get it to do that? 4. What s the catch? 5. What s the future? What is it? Introducing

More information

Hardware-driven visibility culling

Hardware-driven visibility culling Hardware-driven visibility culling I. Introduction 20073114 김정현 The goal of the 3D graphics is to generate a realistic and accurate 3D image. To achieve this, it needs to process not only large amount

More information

Course Recap + 3D Graphics on Mobile GPUs

Course Recap + 3D Graphics on Mobile GPUs Lecture 18: Course Recap + 3D Graphics on Mobile GPUs Interactive Computer Graphics Q. What is a big concern in mobile computing? A. Power Two reasons to save power Run at higher performance for a fixed

More information

Scanline Rendering 2 1/42

Scanline Rendering 2 1/42 Scanline Rendering 2 1/42 Review 1. Set up a Camera the viewing frustum has near and far clipping planes 2. Create some Geometry made out of triangles 3. Place the geometry in the scene using Transforms

More information

Drawing Fast The Graphics Pipeline

Drawing Fast The Graphics Pipeline Drawing Fast The Graphics Pipeline CS559 Fall 2015 Lecture 9 October 1, 2015 What I was going to say last time How are the ideas we ve learned about implemented in hardware so they are fast. Important:

More information

CS452/552; EE465/505. Clipping & Scan Conversion

CS452/552; EE465/505. Clipping & Scan Conversion CS452/552; EE465/505 Clipping & Scan Conversion 3-31 15 Outline! From Geometry to Pixels: Overview Clipping (continued) Scan conversion Read: Angel, Chapter 8, 8.1-8.9 Project#1 due: this week Lab4 due:

More information

Drawing Fast The Graphics Pipeline

Drawing Fast The Graphics Pipeline Drawing Fast The Graphics Pipeline CS559 Spring 2016 Lecture 10 February 25, 2016 1. Put a 3D primitive in the World Modeling Get triangles 2. Figure out what color it should be Do ligh/ng 3. Position

More information

Further Developing GRAMPS. Jeremy Sugerman FLASHG January 27, 2009

Further Developing GRAMPS. Jeremy Sugerman FLASHG January 27, 2009 Further Developing GRAMPS Jeremy Sugerman FLASHG January 27, 2009 Introduction Evaluation of what/where GRAMPS is today Planned next steps New graphs: MapReduce and Cloth Sim Speculative potpourri, outside

More information

Graphics and Imaging Architectures

Graphics and Imaging Architectures Graphics and Imaging Architectures Kayvon Fatahalian http://www.cs.cmu.edu/afs/cs/academic/class/15869-f11/www/ About Kayvon New faculty, just arrived from Stanford Dissertation: Evolving real-time graphics

More information

Performance Analysis of Cluster based Interactive 3D Visualisation

Performance Analysis of Cluster based Interactive 3D Visualisation Performance Analysis of Cluster based Interactive 3D Visualisation P.D. MASELINO N. KALANTERY S.WINTER Centre for Parallel Computing University of Westminster 115 New Cavendish Street, London UNITED KINGDOM

More information

The Graphics Pipeline

The Graphics Pipeline The Graphics Pipeline Ray Tracing: Why Slow? Basic ray tracing: 1 ray/pixel Ray Tracing: Why Slow? Basic ray tracing: 1 ray/pixel But you really want shadows, reflections, global illumination, antialiasing

More information

What s New with GPGPU?

What s New with GPGPU? What s New with GPGPU? John Owens Assistant Professor, Electrical and Computer Engineering Institute for Data Analysis and Visualization University of California, Davis Microprocessor Scaling is Slowing

More information

Real - Time Rendering. Pipeline optimization. Michal Červeňanský Juraj Starinský

Real - Time Rendering. Pipeline optimization. Michal Červeňanský Juraj Starinský Real - Time Rendering Pipeline optimization Michal Červeňanský Juraj Starinský Motivation Resolution 1600x1200, at 60 fps Hw power not enough Acceleration is still necessary 3.3.2010 2 Overview Application

More information

CS GPU and GPGPU Programming Lecture 2: Introduction; GPU Architecture 1. Markus Hadwiger, KAUST

CS GPU and GPGPU Programming Lecture 2: Introduction; GPU Architecture 1. Markus Hadwiger, KAUST CS 380 - GPU and GPGPU Programming Lecture 2: Introduction; GPU Architecture 1 Markus Hadwiger, KAUST Reading Assignment #2 (until Feb. 17) Read (required): GLSL book, chapter 4 (The OpenGL Programmable

More information

GPU Architecture and Function. Michael Foster and Ian Frasch

GPU Architecture and Function. Michael Foster and Ian Frasch GPU Architecture and Function Michael Foster and Ian Frasch Overview What is a GPU? How is a GPU different from a CPU? The graphics pipeline History of the GPU GPU architecture Optimizations GPU performance

More information

Graphics and Interaction Rendering pipeline & object modelling

Graphics and Interaction Rendering pipeline & object modelling 433-324 Graphics and Interaction Rendering pipeline & object modelling Department of Computer Science and Software Engineering The Lecture outline Introduction to Modelling Polygonal geometry The rendering

More information

Threading Hardware in G80

Threading Hardware in G80 ing Hardware in G80 1 Sources Slides by ECE 498 AL : Programming Massively Parallel Processors : Wen-Mei Hwu John Nickolls, NVIDIA 2 3D 3D API: API: OpenGL OpenGL or or Direct3D Direct3D GPU Command &

More information

Graphics Performance Optimisation. John Spitzer Director of European Developer Technology

Graphics Performance Optimisation. John Spitzer Director of European Developer Technology Graphics Performance Optimisation John Spitzer Director of European Developer Technology Overview Understand the stages of the graphics pipeline Cherchez la bottleneck Once found, either eliminate or balance

More information

Optimizing for DirectX Graphics. Richard Huddy European Developer Relations Manager

Optimizing for DirectX Graphics. Richard Huddy European Developer Relations Manager Optimizing for DirectX Graphics Richard Huddy European Developer Relations Manager Also on today from ATI... Start & End Time: 12:00pm 1:00pm Title: Precomputed Radiance Transfer and Spherical Harmonic

More information

Rasterization. MIT EECS Frédo Durand and Barb Cutler. MIT EECS 6.837, Cutler and Durand 1

Rasterization. MIT EECS Frédo Durand and Barb Cutler. MIT EECS 6.837, Cutler and Durand 1 Rasterization MIT EECS 6.837 Frédo Durand and Barb Cutler MIT EECS 6.837, Cutler and Durand 1 Final projects Rest of semester Weekly meetings with TAs Office hours on appointment This week, with TAs Refine

More information

Grafica Computazionale: Lezione 30. Grafica Computazionale. Hiding complexity... ;) Introduction to OpenGL. lezione30 Introduction to OpenGL

Grafica Computazionale: Lezione 30. Grafica Computazionale. Hiding complexity... ;) Introduction to OpenGL. lezione30 Introduction to OpenGL Grafica Computazionale: Lezione 30 Grafica Computazionale lezione30 Introduction to OpenGL Informatica e Automazione, "Roma Tre" May 20, 2010 OpenGL Shading Language Introduction to OpenGL OpenGL (Open

More information

Real-Time Rendering (Echtzeitgraphik) Michael Wimmer

Real-Time Rendering (Echtzeitgraphik) Michael Wimmer Real-Time Rendering (Echtzeitgraphik) Michael Wimmer wimmer@cg.tuwien.ac.at Walking down the graphics pipeline Application Geometry Rasterizer What for? Understanding the rendering pipeline is the key

More information

CSE 167: Lecture #5: Rasterization. Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2012

CSE 167: Lecture #5: Rasterization. Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2012 CSE 167: Introduction to Computer Graphics Lecture #5: Rasterization Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2012 Announcements Homework project #2 due this Friday, October

More information

CSE 591/392: GPU Programming. Introduction. Klaus Mueller. Computer Science Department Stony Brook University

CSE 591/392: GPU Programming. Introduction. Klaus Mueller. Computer Science Department Stony Brook University CSE 591/392: GPU Programming Introduction Klaus Mueller Computer Science Department Stony Brook University First: A Big Word of Thanks! to the millions of computer game enthusiasts worldwide Who demand

More information

Rasterization Overview

Rasterization Overview Rendering Overview The process of generating an image given a virtual camera objects light sources Various techniques rasterization (topic of this course) raytracing (topic of the course Advanced Computer

More information

Copyright Khronos Group, Page Graphic Remedy. All Rights Reserved

Copyright Khronos Group, Page Graphic Remedy. All Rights Reserved Avi Shapira Graphic Remedy Copyright Khronos Group, 2009 - Page 1 2004 2009 Graphic Remedy. All Rights Reserved Debugging and profiling 3D applications are both hard and time consuming tasks Companies

More information

Pipeline Operations. CS 4620 Lecture Steve Marschner. Cornell CS4620 Spring 2018 Lecture 11

Pipeline Operations. CS 4620 Lecture Steve Marschner. Cornell CS4620 Spring 2018 Lecture 11 Pipeline Operations CS 4620 Lecture 11 1 Pipeline you are here APPLICATION COMMAND STREAM 3D transformations; shading VERTEX PROCESSING TRANSFORMED GEOMETRY conversion of primitives to pixels RASTERIZATION

More information

ARM Multimedia IP: working together to drive down system power and bandwidth

ARM Multimedia IP: working together to drive down system power and bandwidth ARM Multimedia IP: working together to drive down system power and bandwidth Speaker: Robert Kong ARM China FAE Author: Sean Ellis ARM Architect 1 Agenda System power overview Bandwidth, bandwidth, bandwidth!

More information

Drawing Fast The Graphics Pipeline

Drawing Fast The Graphics Pipeline Drawing Fast The Graphics Pipeline CS559 Fall 2016 Lectures 10 & 11 October 10th & 12th, 2016 1. Put a 3D primitive in the World Modeling 2. Figure out what color it should be 3. Position relative to the

More information