Real-Time Graphics Architecture
Lecture 4: Parallelism and Communication
Kurt Akeley, Pat Hanrahan

Topics
1. Frame buffers
2. Types of parallelism
3. Communication patterns and requirements
4. Sorting classification for parallel rendering (with examples)

Frame Buffers

Raster vs. calligraphic
- Raster (image order)
  - Dominant choice
- Calligraphic (object order)
  - Earliest choice (Sketchpad)
  - E&S terminals in the 70s and 80s
  - Works with light pens
  - Scene complexity affects frame rate
  - Monitors are expensive
  - Still required for FAA simulation
    - Increases absolute brightness of light points

Frame buffer definitions
- What is a frame buffer?
- What can we learn by considering different definitions?

Frame buffer definition #1
Storage for commands that are executed to refresh the display
- Allows for raster or calligraphic display (e.g. Megatech)
- Frame buffer for calligraphic display is a display list
  - OpenGL render list?
- Key point: frame buffer contents are interpreted
  - Color mapping
  - Image scaling, warping
  - Window system (overlay, separate windows, ...)
  - Address Recalculation Pipeline

Frame buffer definition #2
Image memory used to decouple the render frame rate from the display frame rate
- Meets common understanding of frame buffer as image
- Leads naturally to double buffering
  - One render buffer, one display buffer, swap
  - n-buffering also possible, can control latency
- Key idea: decoupling enables general-purpose GPU
  - Visual simulation has high render frame rate
  - MCAD has low render frame rate
  - Window manager has no frame rate

Frame buffer definition #3
All pixel-assigned memory used to assemble and display the images being rendered
- Key point: frame buffer is active participant in rendering
- Leads to non-color buffers: depth, stencil, window control
  - OpenGL treats these buffers as part of frame buffer
  - Some reserve "frame buffer" for color images
  - Should be n-buffered in some cases (sort last)
  - RealityEngine frame buffer can be deeper than wide or high
- History cycles through this definition
  - 2-D manipulation
  - 3-D painter's algorithm
  - 3-D depth, stencil, accumulation, multi-pass
  - Programmable shading
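
The double buffering named under definition #2 (one render buffer, one display buffer, swap) can be sketched roughly as follows. This is a minimal illustration, not how any particular GPU implements it; the class name `DoubleBufferedFrameBuffer` and its methods are hypothetical, and pixels are plain integers.

```python
class DoubleBufferedFrameBuffer:
    """Two image buffers: one is rendered into, the other is scanned out.

    Hypothetical sketch; names and layout are assumptions for illustration.
    """

    def __init__(self, width, height, clear_color=0):
        self.width = width
        self.height = height
        self.clear_color = clear_color
        # Index 0 starts as the display (front) buffer, index 1 as the render (back) buffer.
        self.buffers = [self._blank(), self._blank()]
        self.front = 0

    def _blank(self):
        return [[self.clear_color] * self.width for _ in range(self.height)]

    @property
    def back(self):
        return 1 - self.front

    def write_pixel(self, x, y, color):
        # Rendering only ever touches the back buffer.
        self.buffers[self.back][y][x] = color

    def scan_out(self):
        # The display reads only the front buffer, at its own rate.
        return [row[:] for row in self.buffers[self.front]]

    def swap(self):
        # The finished image becomes visible; the old front buffer is cleared
        # and becomes the new render target.
        self.front = self.back
        self.buffers[self.back] = self._blank()


fb = DoubleBufferedFrameBuffer(2, 2)
fb.write_pixel(0, 0, 7)          # rendered, but not yet visible
before_swap = fb.scan_out()
fb.swap()
after_swap = fb.scan_out()
```

Because the display only reads the front buffer, the render rate (how often `swap` is called) is independent of how often `scan_out` runs, which is the decoupling the definition describes.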

Frame buffer is optional
- Calligraphic display
  - If we don't define display list as frame buffer
- Follow-the-beam rendering
  - Minimizes latency
  - Saves cost if frames are never dropped
- Talisman-like image assembly (3-D sprites)
  - Old idea (visual simulation, window systems)
- GigaPixel render tile
  - Frame buffer stores color images only
  - Depth, stencil, etc. in small tile

Dominant architecture is consistent
- SGI architectures look like ATI architectures, which look like NVIDIA architectures
- Details are evolving, but big picture remains the same
- Why is this?
  - Simplicity of design
  - Simplicity of algorithms
  - Simplicity of immediate-mode approach

Simplicity of design
- Frame buffer operations
  - Blending: merge fragment and pixel color
  - Depth buffering: save nearest fragment
  - Stencil buffering: simple pixel state machine
  - Accumulation buffering: high-resolution color arithmetic
  - Antialiasing (to be covered later)
- All frame buffer operations:
  - Combine fragment and pixel data (not just a replace)
    - But replace operation is optimized, e.g., no parity/ECC
  - Are local (no intra-pixel dependencies)
- Why aren't fragment operations programmable?

Simplicity of algorithms
- Frame buffer employs brute-force simplicity
  - Hidden surface elimination: depth buffer vs. sort/painter
  - Capping: stencil-based vs. object calculations
- Image-space algorithm is efficient
  - Just samples, never object information, locality
  - Just-in-time calculation, steady cost function
- Accumulation buffer (high-resolution color arithmetic)
  - "The Accumulation Buffer", Haeberli and Akeley, Proceedings of SIGGRAPH 90
  - Volume rendering using 3D textures
- Multi-pass rendering
  - "Interactive Multi-pass Programmable Shading", Peercy, Olano, Airey, and Ungar, Proceedings of SIGGRAPH 00
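
Two of the frame buffer operations listed above, depth buffering and blending, can be sketched as functions that combine one fragment with one stored pixel. This is an illustrative sketch only: the dictionary layout, the function names, and the convention that smaller z is nearer are all assumptions, not something the slides specify.

```python
def depth_test_write(pixel, fragment):
    """Depth buffering: keep the nearest fragment (smaller z is nearer, by assumption)."""
    if fragment["z"] < pixel["z"]:
        return {"color": fragment["color"], "z": fragment["z"]}
    return pixel


def blend(pixel, fragment):
    """Blending: merge fragment and pixel color using the fragment's alpha."""
    a = fragment["alpha"]
    merged = tuple(a * f + (1.0 - a) * p
                   for f, p in zip(fragment["color"], pixel["color"]))
    return {"color": merged, "z": pixel["z"]}


def resolve(fragments, op, clear):
    """Apply one frame buffer operation to a stream of fragments for a single pixel."""
    pixel = dict(clear)
    for frag in fragments:
        pixel = op(pixel, frag)
    return pixel


CLEAR = {"color": (0.0, 0.0, 0.0), "z": float("inf")}
near = {"color": (1.0, 0.0, 0.0), "z": 1.0, "alpha": 0.5}
far = {"color": (0.0, 0.0, 1.0), "z": 5.0, "alpha": 0.5}

# Depth buffering gives the same pixel whatever order the fragments arrive in.
depth_a = resolve([near, far], depth_test_write, CLEAR)
depth_b = resolve([far, near], depth_test_write, CLEAR)

# Blending does not: the result depends on arrival order.
blend_a = resolve([near, far], blend, CLEAR)
blend_b = resolve([far, near], blend, CLEAR)
```

Each operation reads only the fragment and the one pixel it lands on. The depth-buffer result is independent of fragment order, which is why it can replace a sort/painter approach for opaque surfaces, while blending remains order-sensitive.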

Simplicity of immediate-mode
- Frame buffer contents are context
- Matches 2D/window-rendering model
- Rendering system: little graphics state here
- Frame buffer: most graphics state here

Decreasing display bandwidth burden
- Historically display bandwidth was a limiting factor
  - Hence "Sproull's Rule": fill rate >= display rate
- Now display bandwidth is almost inconsequential

Year  System  FB (GB)  Disp (GB)  Disp / FB
1984 SGI 2000-series / SGI GTX 1.8 * / SGI InfiniteReality / NVIDIA 7900 GTX /70
* VRAM provided separate video bandwidth
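
The display bandwidth behind "Sproull's Rule" can be estimated with simple arithmetic. The sketch below is an assumption-laden illustration: the function names are hypothetical, and the resolution, pixel size, refresh rate, and fill rates are made-up example values, not figures from the table above.

```python
def display_bandwidth_bytes_per_s(width, height, bytes_per_pixel, refresh_hz):
    """Bytes per second the display must read to refresh the whole screen."""
    return width * height * bytes_per_pixel * refresh_hz


def satisfies_sproull_rule(fill_rate_pixels_per_s, width, height, refresh_hz):
    """Check fill rate >= display rate, both measured in pixels per second."""
    display_rate = width * height * refresh_hz
    return fill_rate_pixels_per_s >= display_rate


# Assumed example: 1280 x 1024 display, 4 bytes per pixel, 60 Hz refresh.
bandwidth = display_bandwidth_bytes_per_s(1280, 1024, 4, 60)
fast_enough = satisfies_sproull_rule(100_000_000, 1280, 1024, 60)
too_slow = satisfies_sproull_rule(50_000_000, 1280, 1024, 60)
```

For these assumed values the display needs about 315 MB/s, and a renderer would have to fill at least about 78.6 million pixels per second to meet the rule.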

Parallelism and Communication

Parallelism and communication
- Parallelism: using multiple computational units to process work in parallel
- Communication: connecting the computational units to allow work to be distributed and aggregated
- Issues
  - Dependencies
    - Ordering
    - Sorting
  - Scalability
    - Computation
    - Bandwidth
    - Load balancing

Parallelism taxonomy
- Data parallelism [aka "parallelism"] (same task on similar data sets)
- Task parallelism (different tasks on similar or differing data sets)
- Hardware parallelism (simultaneous execution on multiple processors)
  - Data parallelism
    - Frame-parallelism (batch, SGI N-clops)
    - Object-parallelism (geometry)
    - Image-parallelism (fragment/pixel)
  - Task parallelism
    - Multi-processing (on multiple CPUs)
    - Pipelining (the graphics pipeline)
- Virtual parallelism (time sharing a single processor, usually with hardware support)
  - Data parallelism
    - Multi-processing (graphics context switching)
    - Multi-threading (almost defines a GPU-like processor)
  - Task parallelism
    - Multi-processing (time sharing a single CPU)
    - Multi-threading (Direct3D-10 "common core")

Graphics is embarrassingly parallel
- Ample self-similar data sets
  - Frames, vertexes, fragments, texels, pixels
- With minimal dependencies
  - Few intra-set dependencies
    - Pixels (in the frame buffer) are the significant exception
  - Inter-set dependencies are purely sequential
- Graphics pipeline is designed to minimize dependencies
  - Other graphics architectures have more dependencies
    - E.g., for global lighting effects
- But graphics pipeline has huge redundancies
  - Hence many opportunities for optimization
  - How hard should we work to do things wrong?

Geometry parallelism trend (SGI)
[Table: columns Model, Transform Length, Transform Width; rows G, GTX, VGX, RE, IR]

Image parallelism trend (SGI)
[Figure: Rasterization; G, GTX, VGX, RE, IR]

The clear trend
- Shorter and wider
- Why?

Communication taxonomy
- Sorting (fundamental)
- Distribution (introduced by parallelism)
  - Object
  - Image
- Routing (introduced by parallelism)
- Texturing

Sorting is fundamental
- I. E. Sutherland, R. F. Sproull, and R. A. Schumacher, A characterization of ten hidden surface algorithms
  - Classified by order of x, y, z radix sorts

Pipelining vs. parallelism

Issue                        Task parallelism (pipelining)   Data parallelism
Ordering dependencies        Easy                            Challenging
Sorting dependencies         Easy                            Challenging
Computation scalability      Poor                            Challenging
Bandwidth scalability        Poor                            Challenging
Load balancing scalability   (Nearly) impossible             Challenging

Ordering challenges
- Fundamental:
  - Frame buffer operations
    - Painter's algorithm
  - Memory hazards
    - Texture writes
    - Render, copy to texture, render
    - Readback
- From pipelining:
  - Changes to graphics state

Sorting taxonomy
- Pipeline stages: Application, Command, Geometry, Rasterization, Texture, Fragment, Display
- Sort points: Sort-First, Sort-Middle, Sort-Last Fragment, Sort-Last Image Composition

Sort-First

Sort-first
[Figure: two parallel pipelines (App, Cmd, Tex, Frag, Disp) with a pre-transformation SORT]
- Point-to-point communication scales
- Coarse tiling incurs load imbalance
- Princeton Display Wall, Stanford WireGL

Sort-first
Order                      Automatic (conceptually)
Sort                       Pre-stage (cheat)
Compute scalability        Good
Bandwidth scalability      Good
Load balance scalability   Poor

Sort-first
[Figure: pipeline diagram (App, Cmd, Tex, Frag, Disp) with SORT and ROUTE stages]

Ring parallelism
[Figure: pipeline diagram with DIST stages linking the Cmd units and a ROUTE stage before Disp]
- 3Dlabs

Sort-Middle

Image-space work distribution
- Parke: Tiled
- Fuchs: Interleaved

Sort-middle interleaved
[Figure: pipeline diagram (App, DISTRIBUTE, Cmd, SORT, Tex, Frag, ROUTE, Disp)]
- Geometry work load-balanced, except clipping and tessellation
- Broadcast communication does not scale, but supports ordering
- Finely interleaved screen tiling ensures excellent load balance
- SGI graphics workstations: RealityEngine, InfiniteReality

Sort-middle interleaved
Order                      Force sequence at triangle sort
Sort                       Broadcast
Compute scalability        Good
Bandwidth scalability      Limited by sort broadcast
Load balance scalability   Good

SGI RealityEngine
[Figure: block diagram with labeled bandwidths 240 MB/s, 1600 MB/s, 3200 MB/s]

Sort-middle tiled
[Figure: two parallel pipelines (App, Cmd, Tex, Frag, Disp) with SORT and ROUTE stages]
- UNC PixelPlanes, Stanford Argus
- Point-to-point communication scales
- Coarse tiling incurs load imbalance

Sort-middle tiled (immediate mode)
Order                      Force sequence at triangle sort
Sort                       Can approach point-to-point
Compute scalability        Good
Bandwidth scalability      Good
Load balance scalability   Poor for rasterization (due to large triangles)

Sort-middle tiled (chunked)
Order                      Force sequence at triangle sort; full-frame delay, render-to-texture difficulties
Sort                       Can approach point-to-point
Compute scalability        Good
Bandwidth scalability      Good
Load balance scalability   Good
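
The load-balance difference between coarse screen tiles and fine interleaving can be illustrated by counting how many pixels of one screen region land on each rasterizer. This is a toy sketch under stated assumptions: four rasterizers, a 64 x 64 screen split into 32 x 32 tiles for the tiled case, a 2 x 2 pixel interleave for the other, and hypothetical function names throughout.

```python
from collections import Counter


def tiled_owner(x, y, tile_size, tiles_per_row):
    """Coarse tiling: each rasterizer owns one contiguous block of the screen."""
    return (y // tile_size) * tiles_per_row + (x // tile_size)


def interleaved_owner(x, y, procs_x, procs_y):
    """Fine interleaving: neighbouring pixels belong to different rasterizers."""
    return (y % procs_y) * procs_x + (x % procs_x)


def pixel_load(rect, owner):
    """Count pixels per rasterizer for a half-open rectangle [x0, x1) x [y0, y1)."""
    x0, y0, x1, y1 = rect
    load = Counter()
    for y in range(y0, y1):
        for x in range(x0, x1):
            load[owner(x, y)] += 1
    return load


# A 16 x 16 pixel region in the top-left corner of a 64 x 64 screen.
region = (0, 0, 16, 16)
tiled = pixel_load(region, lambda x, y: tiled_owner(x, y, 32, 2))
interleaved = pixel_load(region, lambda x, y: interleaved_owner(x, y, 2, 2))
```

With tiling, all 256 pixels fall to rasterizer 0 while the other three sit idle; with interleaving, each of the four rasterizers receives 64. The price, as the tables note, is that an interleaved scheme has to send each primitive to essentially every rasterizer.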

UNC Pixel-Planes 5 (1990)

Sort-Last

Sort-last fragment
[Figure: two parallel pipelines (App, Cmd, Tex, Frag, Disp) with SORT and ROUTE stages]
- Improved texture locality
- No redundant work in FG
- Exposes rasterization load imbalance to application
- Point-to-point communication scales, but requires more bandwidth
- Finely interleaved screen tiling ensures excellent load balance
- Possible, but difficult, to maintain ordering
- Kubota Denali, E&S Freedom 3000

Sort-last fragment
Order                      Force sequence at fragment sort
Sort                       Point-to-point, high bandwidth
Compute scalability        Good
Bandwidth scalability      OK (sorting is the bottleneck)
Load balance scalability   OK (exposed to application)

Kubota Denali (1993)
[Figure: block diagram with TEM and FBM units]
Source: Denali Technical Overview 1.0, Kubota Pacific Computer, 1993

Image composition
- Z comp
- Other combiners possible

Sort-last image composition
[Figure: two parallel pipelines (App, Cmd, Tex, Frag, Disp) with SORT at the display stage]
- Exposes rasterization load imbalance to application
- Point-to-point ring interconnect scales
- Two-stage image composition loses ordering
- UNC/HP PixelFlow, Aizu VC-1, Stanford Lightning-2

Sort-last image composition
Order                      Not fully supported!
Sort                       One to many for each pipeline
Compute scalability        Excellent
Bandwidth scalability      Excellent
Load balance scalability   OK (exposed to application)
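
Image composition with a Z compare can be sketched as a per-pixel merge of the full-screen images produced by separate pipelines. The sketch below is illustrative only: images are lists of rows of (color, depth) tuples, smaller depth is assumed to be nearer, and the function names are hypothetical rather than taken from any of the systems listed.

```python
def z_compose(image_a, image_b):
    """Merge two full-screen (color, depth) images, keeping the nearer sample per pixel."""
    out = []
    for row_a, row_b in zip(image_a, image_b):
        out.append([a if a[1] <= b[1] else b for a, b in zip(row_a, row_b)])
    return out


def compose_all(images):
    """Fold any number of pipeline outputs into one image."""
    result = images[0]
    for image in images[1:]:
        result = z_compose(result, image)
    return result


FAR = float("inf")
# Each pipeline rendered a different subset of the scene into its own 1 x 2 image.
pipeline_a = [[("red", 1.0), ("background", FAR)]]
pipeline_b = [[("blue", 2.0), ("blue", 2.0)]]

final = compose_all([pipeline_a, pipeline_b])
final_reversed = compose_all([pipeline_b, pipeline_a])
```

For opaque samples with distinct depths the result does not depend on which pipeline is merged first, but the merge only sees the final depth per pixel, so the order in which primitives were submitted is not available at this stage. That is consistent with ordering being listed as not fully supported.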

UNC PixelFlow
[Figure from J. Poulton, J. Eyles, S. Molnar, H. Fuchs, PixelFlow: The Realization]

Sort-Everywhere

Sort-everywhere: Pomegranate
[Figure: two parallel pipelines (App, Cmd, Tex, Frag, Disp, each with Mem units) with DISTRIBUTE, SORT, and ROUTE stages]

Architecture comparison
X indicates an issue
Architectures compared: Sort-first, Sort-middle interleaved, Sort-middle tiled (immd), Sort-middle tiled (chunk), Sort-last fragment, Sort-last image comp., Sort-everywhere
Ordered                    (X) X
Compute scalability
Bandwidth scalability      X X
Load balance scalability   X X X X

Summary
- GPU architecture trend
  - Pipeline -> hardware-parallel -> virtual-parallel

Readings
Required
1. S. Molnar, M. Cox, D. Ellsworth, H. Fuchs, A sorting classification of parallel rendering
2. Fuchs et al., A heterogeneous multiprocessor graphics system using processor-enhanced memories (PP5)
3. Eyles et al., PixelFlow: The Realization
Recommended
1. F. I. Parke, Simulation and expected performance analysis of multiple processor z-buffer systems
2. H. Fuchs, Distributing a visible surface algorithm over multiple processors
Parallelizing Graphics Pipeline Execution (+ Basics of Characterizing a Rendering Workload)
Lecture 2: Parallelizing Graphics Pipeline Execution (+ Basics of Characterizing a Rendering Workload) Visual Computing Systems Analyzing a 3D Graphics Workload Where is most of the work done? Memory Vertex
More informationParallelizing Graphics Pipeline Execution (+ Basics of Characterizing a Rendering Workload)
Lecture 2: Parallelizing Graphics Pipeline Execution (+ Basics of Characterizing a Rendering Workload) Visual Computing Systems Today Finishing up from last time Brief discussion of graphics workload metrics
More informationScheduling the Graphics Pipeline on a GPU
Lecture 20: Scheduling the Graphics Pipeline on a GPU Visual Computing Systems Today Real-time 3D graphics workload metrics Scheduling the graphics pipeline on a modern GPU Quick aside: tessellation Triangle
More informationCS427 Multicore Architecture and Parallel Computing
CS427 Multicore Architecture and Parallel Computing Lecture 6 GPU Architecture Li Jiang 2014/10/9 1 GPU Scaling A quiet revolution and potential build-up Calculation: 936 GFLOPS vs. 102 GFLOPS Memory Bandwidth:
More informationReal-Time Graphics Architecture
Real-Time Graphics Architecture Kurt Akeley Pat Hanrahan http://www.graphics.stanford.edu/courses/cs448a-01-fall Geometry Outline Vertex and primitive operations System examples emphasis on clipping Primitive
More informationParallel Rendering. Johns Hopkins Department of Computer Science Course : Rendering Techniques, Professor: Jonathan Cohen
Parallel Rendering Molnar, Cox, Ellsworth, and Fuchs. A Sorting Classification of Parallel Rendering. IEEE Computer Graphics and Applications. July, 1994. Why Parallelism Applications need: High frame
More informationGraphics Hardware. Graphics Processing Unit (GPU) is a Subsidiary hardware. With massively multi-threaded many-core. Dedicated to 2D and 3D graphics
Why GPU? Chapter 1 Graphics Hardware Graphics Processing Unit (GPU) is a Subsidiary hardware With massively multi-threaded many-core Dedicated to 2D and 3D graphics Special purpose low functionality, high
More informationReal-Time Graphics Architecture
Real-Time Graphics Architecture Kurt Akeley Pat Hanrahan http://www.graphics.stanford.edu/courses/cs448a-01-fall Rasterization Outline Fundamentals Examples Special topics (Depth-buffer, cracks and holes,
More informationIntroduction to Computer Graphics. Overview. What is Computer Graphics?
INSTITUTIONEN FÖR SYSTEMTEKNIK LULEÅ TEKNISKA UNIVERSITET Introduction to Computer Graphics David Carr Fundamentals of Computer Graphics Spring 2004 Based on Slides by E. Angel Graphics 1 L Overview What
More informationProf. Hakim Weatherspoon CS 3410, Spring 2015 Computer Science Cornell University
Prof. Hakim Weatherspoon CS 3410, Spring 2015 Computer Science Cornell University Project3 Cache Race Games night Monday, May 4 th, 5pm Come, eat, drink, have fun and be merry! Location: B17 Upson Hall
More informationRendering Objects. Need to transform all geometry then
Intro to OpenGL Rendering Objects Object has internal geometry (Model) Object relative to other objects (World) Object relative to camera (View) Object relative to screen (Projection) Need to transform
More informationMMGD0206 Computer Graphics. Chapter 1 Development of Computer Graphics : History
MMGD0206 Computer Graphics Chapter 1 Development of Computer Graphics : History What is Computer Graphics? Computer graphics generally means creation, storage and manipulation of models and images Such
More informationNext-Generation Graphics on Larrabee. Tim Foley Intel Corp
Next-Generation Graphics on Larrabee Tim Foley Intel Corp Motivation The killer app for GPGPU is graphics We ve seen Abstract models for parallel programming How those models map efficiently to Larrabee
More information3D Computer Games Technology and History. Markus Hadwiger VRVis Research Center
3D Computer Games Technology and History VRVis Research Center Lecture Outline Overview of the last ten years A look at seminal 3D computer games Most important techniques employed Graphics research and
More informationCurrent Trends in Computer Graphics Hardware
Current Trends in Computer Graphics Hardware Dirk Reiners University of Louisiana Lafayette, LA Quick Introduction Assistant Professor in Computer Science at University of Louisiana, Lafayette (since 2006)
More informationReal-Time Graphics Architecture
Real-Time Graphics Architecture Lecture 8: Antialiasing Kurt Akeley Pat Hanrahan http://graphics.stanford.edu/cs448-07-spring/ Antialiasing Outline Aliasing and antialiasing Taxonomy of antialiasing approaches
More informationModels of the Impact of Overlap in Bucket Rendering
Models of the Impact of Overlap in Bucket Rendering Milton Chen, Gordon Stoll, Homan Igehy, Kekoa Proudfoot and Pat Hanrahan Computer Systems Laboratory Stanford University Abstract Bucket rendering is
More informationGraphics Hardware, Graphics APIs, and Computation on GPUs. Mark Segal
Graphics Hardware, Graphics APIs, and Computation on GPUs Mark Segal Overview Graphics Pipeline Graphics Hardware Graphics APIs ATI s low-level interface for computation on GPUs 2 Graphics Hardware High
More informationStandard Graphics Pipeline
Graphics Architecture Software implementations of rendering are slow. OpenGL on Sparc workstations. Performance can be improved using sophisticated algorithms and faster machines. Real-time large-scale
More informationGeForce4. John Montrym Henry Moreton
GeForce4 John Montrym Henry Moreton 1 Architectural Drivers Programmability Parallelism Memory bandwidth 2 Recent History: GeForce 1&2 First integrated geometry engine & 4 pixels/clk Fixed-function transform,
More informationComputer Graphics. Hardware Pipeline. Visual Imaging in the Electronic Age Prof. Donald P. Greenberg October 23, 2014 Lecture 16
Computer Graphics Hardware Pipeline Visual Imaging in the Electronic Age Prof. Donald P. Greenberg October 23, 2014 Lecture 16 Moore s Law Chip density doubles every 18 months. Processing Power (P) in
More informationReal-Time Graphics Architecture
RealTime Graphics Architecture Kurt Akeley Pat Hanrahan http://www.graphics.stanford.edu/courses/cs448a01fall About Kurt Personal history B.E.E. Univeristy of Delaware, 1980 M.S.E.E. Stanford, 1982 SGI
More informationArchitectures. Michael Doggett Department of Computer Science Lund University 2009 Tomas Akenine-Möller and Michael Doggett 1
Architectures Michael Doggett Department of Computer Science Lund University 2009 Tomas Akenine-Möller and Michael Doggett 1 Overview of today s lecture The idea is to cover some of the existing graphics
More informationReal-Time Graphics Architecture
Real-Time Graphics Architecture Lecture 5: Rasterization Kurt Akeley Pat Hanrahan http://graphics.stanford.edu/cs448-07-spring/ Rasterization Outline Fundamentals System examples Special topics (Depth-buffer,
More informationA Data-Parallel Genealogy: The GPU Family Tree. John Owens University of California, Davis
A Data-Parallel Genealogy: The GPU Family Tree John Owens University of California, Davis Outline Moore s Law brings opportunity Gains in performance and capabilities. What has 20+ years of development
More informationCMSC 611: Advanced. Parallel Systems
CMSC 611: Advanced Computer Architecture Parallel Systems Parallel Computers Definition: A parallel computer is a collection of processing elements that cooperate and communicate to solve large problems
More informationDESIGNING GRAPHICS ARCHITECTURES AROUND SCALABILITY AND COMMUNICATION
DESIGNING GRAPHICS ARCHITECTURES AROUND SCALABILITY AND COMMUNICATION A DISSERTATION SUBMITTED TO THE DEPARTMENT OF ELECTRICAL ENGINEERING AND THE COMMITTEE ON GRADUATE STUDIES OF STANFORD UNIVERSITY IN
More informationAdvanced Shading and Texturing
Real-Time Graphics Architecture Kurt Akeley Pat Hanrahan http://www.graphics.stanford.edu/courses/cs448a-01-fall Advanced Shading and Texturing 1 Topics Features Bump mapping Environment mapping Shadow
More informationReal-Time Graphics Architecture
Real-Time Graphics Architecture Lecture 9: Programming GPUs Kurt Akeley Pat Hanrahan http://graphics.stanford.edu/cs448-07-spring/ Programming GPUs Outline Caveat History Contemporary GPUs Programming
More informationComparing Reyes and OpenGL on a Stream Architecture
Comparing Reyes and OpenGL on a Stream Architecture John D. Owens Brucek Khailany Brian Towles William J. Dally Computer Systems Laboratory Stanford University Motivation Frame from Quake III Arena id
More informationPowerVR Hardware. Architecture Overview for Developers
Public Imagination Technologies PowerVR Hardware Public. This publication contains proprietary information which is subject to change without notice and is supplied 'as is' without warranty of any kind.
More informationCornell University CS 569: Interactive Computer Graphics. Introduction. Lecture 1. [John C. Stone, UIUC] NASA. University of Calgary
Cornell University CS 569: Interactive Computer Graphics Introduction Lecture 1 [John C. Stone, UIUC] 2008 Steve Marschner 1 2008 Steve Marschner 2 NASA University of Calgary 2008 Steve Marschner 3 2008
More informationSpring 2009 Prof. Hyesoon Kim
Spring 2009 Prof. Hyesoon Kim Application Geometry Rasterizer CPU Each stage cane be also pipelined The slowest of the pipeline stage determines the rendering speed. Frames per second (fps) Executes on
More informationSpring 2011 Prof. Hyesoon Kim
Spring 2011 Prof. Hyesoon Kim Application Geometry Rasterizer CPU Each stage cane be also pipelined The slowest of the pipeline stage determines the rendering speed. Frames per second (fps) Executes on
More informationA Data-Parallel Genealogy: The GPU Family Tree
A Data-Parallel Genealogy: The GPU Family Tree Department of Electrical and Computer Engineering Institute for Data Analysis and Visualization University of California, Davis Outline Moore s Law brings
More informationThe NVIDIA GeForce 8800 GPU
The NVIDIA GeForce 8800 GPU August 2007 Erik Lindholm / Stuart Oberman Outline GeForce 8800 Architecture Overview Streaming Processor Array Streaming Multiprocessor Texture ROP: Raster Operation Pipeline
More informationgraphics pipeline computer graphics graphics pipeline 2009 fabio pellacini 1
graphics pipeline computer graphics graphics pipeline 2009 fabio pellacini 1 graphics pipeline sequence of operations to generate an image using object-order processing primitives processed one-at-a-time
More informationOut-Of-Core Sort-First Parallel Rendering for Cluster-Based Tiled Displays
Out-Of-Core Sort-First Parallel Rendering for Cluster-Based Tiled Displays Wagner T. Corrêa James T. Klosowski Cláudio T. Silva Princeton/AT&T IBM OHSU/AT&T EG PGV, Germany September 10, 2002 Goals Render
More informationgraphics pipeline computer graphics graphics pipeline 2009 fabio pellacini 1
graphics pipeline computer graphics graphics pipeline 2009 fabio pellacini 1 graphics pipeline sequence of operations to generate an image using object-order processing primitives processed one-at-a-time
More informationRealityEngine Graphics
RealityEngine Graphics Kurt Akeley Silicon Graphics Computer Systems Abstract The RealityEngine TM graphics system is the first of a new generation of systems designed primarily to render texture mapped,
More informationLecture 2. Shaders, GLSL and GPGPU
Lecture 2 Shaders, GLSL and GPGPU Is it interesting to do GPU computing with graphics APIs today? Lecture overview Why care about shaders for computing? Shaders for graphics GLSL Computing with shaders
More informationBeyond Programmable Shading Keeping Many Cores Busy: Scheduling the Graphics Pipeline
Keeping Many s Busy: Scheduling the Graphics Pipeline Jonathan Ragan-Kelley, MIT CSAIL 29 July 2010 This talk How to think about scheduling GPU-style pipelines Four constraints which drive scheduling decisions
More informationTutorial on GPU Programming #2. Joong-Youn Lee Supercomputing Center, KISTI
Tutorial on GPU Programming #2 Joong-Youn Lee Supercomputing Center, KISTI Contents Graphics Pipeline Vertex Programming Fragment Programming Introduction to Cg Language Graphics Pipeline The process to
More informationNational Chiao Tung Univ, Taiwan By: I-Chen Lin, Assistant Professor
Computer Graphics 1. Graphics Systems National Chiao Tung Univ, Taiwan By: I-Chen Lin, Assistant Professor Textbook: Hearn and Baker, Computer Graphics, 3rd Ed., Prentice Hall Ref: E.Angel, Interactive
More informationGPU Architecture. Michael Doggett Department of Computer Science Lund university
GPU Architecture Michael Doggett Department of Computer Science Lund university GPUs from my time at ATI R200 Xbox360 GPU R630 R610 R770 Let s start at the beginning... Graphics Hardware before GPUs 1970s
More informationMulti-Graphics. Multi-Graphics Project. Scalable Graphics using Commodity Graphics Systems
Multi-Graphics Scalable Graphics using Commodity Graphics Systems VIEWS PI Meeting May 17, 2000 Pat Hanrahan Stanford University http://www.graphics.stanford.edu/ Multi-Graphics Project Scalable and Distributed
More information1.2.3 The Graphics Hardware Pipeline
Figure 1-3. The Graphics Hardware Pipeline 1.2.3 The Graphics Hardware Pipeline A pipeline is a sequence of stages operating in parallel and in a fixed order. Each stage receives its input from the prior
More informationA Reconfigurable Architecture for Load-Balanced Rendering
A Reconfigurable Architecture for Load-Balanced Rendering Jiawen Chen Michael I. Gordon William Thies Matthias Zwicker Kari Pulli Frédo Durand Graphics Hardware July 31, 2005, Los Angeles, CA The Load
More informationCS4620/5620: Lecture 14 Pipeline
CS4620/5620: Lecture 14 Pipeline 1 Rasterizing triangles Summary 1! evaluation of linear functions on pixel grid 2! functions defined by parameter values at vertices 3! using extra parameters to determine
More informationComputer Graphics. Bing-Yu Chen National Taiwan University
Computer Graphics Bing-Yu Chen National Taiwan University Introduction The Graphics Process Color Models Triangle Meshes The Rendering Pipeline 1 INPUT What is Computer Graphics? Definition the pictorial
More informationMattan Erez. The University of Texas at Austin
EE382V (17325): Principles in Computer Architecture Parallelism and Locality Fall 2007 Lecture 12 GPU Architecture (NVIDIA G80) Mattan Erez The University of Texas at Austin Outline 3D graphics recap and
More informationCS8803SC Software and Hardware Cooperative Computing GPGPU. Prof. Hyesoon Kim School of Computer Science Georgia Institute of Technology
CS8803SC Software and Hardware Cooperative Computing GPGPU Prof. Hyesoon Kim School of Computer Science Georgia Institute of Technology Why GPU? A quiet revolution and potential build-up Calculation: 367
More informationLecture 25: Board Notes: Threads and GPUs
Lecture 25: Board Notes: Threads and GPUs Announcements: - Reminder: HW 7 due today - Reminder: Submit project idea via (plain text) email by 11/24 Recap: - Slide 4: Lecture 23: Introduction to Parallel
More informationHardware-driven Visibility Culling Jeong Hyun Kim
Hardware-driven Visibility Culling Jeong Hyun Kim KAIST (Korea Advanced Institute of Science and Technology) Contents Introduction Background Clipping Culling Z-max (Z-min) Filter Programmable culling
More informationReal - Time Rendering. Graphics pipeline. Michal Červeňanský Juraj Starinský
Real - Time Rendering Graphics pipeline Michal Červeňanský Juraj Starinský Overview History of Graphics HW Rendering pipeline Shaders Debugging 2 History of Graphics HW First generation Second generation
More informationLecture 4: Geometry Processing. Kayvon Fatahalian CMU : Graphics and Imaging Architectures (Fall 2011)
Lecture 4: Processing Kayvon Fatahalian CMU 15-869: Graphics and Imaging Architectures (Fall 2011) Today Key per-primitive operations (clipping, culling) Various slides credit John Owens, Kurt Akeley,
More informationProgrammable GPUS. Last Time? Reading for Today. Homework 4. Planar Shadows Projective Texture Shadows Shadow Maps Shadow Volumes
Last Time? Programmable GPUS Planar Shadows Projective Texture Shadows Shadow Maps Shadow Volumes frame buffer depth buffer stencil buffer Stencil Buffer Homework 4 Reading for Create some geometry "Rendering
More informationReal-Time Graphics Architecture. Kurt Akeley Pat Hanrahan. Texture
Real-Time Graphics Architecture Kurt Akeley Pat Hanrahan http://www.graphics.stanford.edu/courses/cs448a-01-fall Texture 1 Topics 1. Review of texture mapping 2. RealityEngine and InfiniteReality 3. Texture
More informationCS 316: Multicore/GPUs
CS 316: Multicore/GPUs Kavita Bala Fall 2007 Computer Science Cornell University Announcements Core Wars will be out in the next couple of days Aim at having fun! Number of points allocated to it is small
More informationLecture 6: Texture. Kayvon Fatahalian CMU : Graphics and Imaging Architectures (Fall 2011)
Lecture 6: Texture Kayvon Fatahalian CMU 15-869: Graphics and Imaging Architectures (Fall 2011) Today: texturing! Texture filtering - Texture access is not just a 2D array lookup ;-) Memory-system implications
More informationLecture 9: Deferred Shading. Visual Computing Systems CMU , Fall 2013
Lecture 9: Deferred Shading Visual Computing Systems The course so far The real-time graphics pipeline abstraction Principle graphics abstractions Algorithms and modern high performance implementations
More informationRendering. Converting a 3D scene to a 2D image. Camera. Light. Rendering. View Plane
Rendering Pipeline Rendering Converting a 3D scene to a 2D image Rendering Light Camera 3D Model View Plane Rendering Converting a 3D scene to a 2D image Basic rendering tasks: Modeling: creating the world
More informationPowerVR Series5. Architecture Guide for Developers
Public Imagination Technologies PowerVR Series5 Public. This publication contains proprietary information which is subject to change without notice and is supplied 'as is' without warranty of any kind.
More informationWorking with Metal Overview
Graphics and Games #WWDC14 Working with Metal Overview Session 603 Jeremy Sandmel GPU Software 2014 Apple Inc. All rights reserved. Redistribution or public display not permitted without written permission
More informationWindowing System on a 3D Pipeline. February 2005
Windowing System on a 3D Pipeline February 2005 Agenda 1.Overview of the 3D pipeline 2.NVIDIA software overview 3.Strengths and challenges with using the 3D pipeline GeForce 6800 220M Transistors April
More informationTexture. Real-Time Graphics Architecture. Kurt Akeley Pat Hanrahan.
Texture Real-Time Graphics Architecture Kurt Akeley Pat Hanrahan http://graphics.stanford.edu/courses/cs448-07-spring/ Topics 1. Projective texture mapping 2. Texture filtering and mip-mapping 3. Early
frame buffer depth buffer stencil buffer
Final Project Proposals Programmable GPUs You should all have received an email with feedback Just about everyone was told: Test cases weren't detailed enough Project was possibly too big Motivation could
CSCI 402: Computer Architectures. Parallel Processors (2) Fengguang Song Department of Computer & Information Science IUPUI.
CSCI 402: Computer Architectures Parallel Processors (2) Fengguang Song Department of Computer & Information Science IUPUI 6.6 - End Today's Contents GPU Cluster and its network topology The Roofline performance
Graphics Processing Unit Architecture (GPU Arch)
Graphics Processing Unit Architecture (GPU Arch) With a focus on NVIDIA GeForce 6800 GPU 1 What is a GPU From Wikipedia : A specialized processor efficient at manipulating and displaying computer graphics
A Sorting Classification of Parallel Rendering
A Sorting Classification of Parallel Rendering Steven Molnar *, Michael Cox, David Ellsworth *, Henry Fuchs * * University of North Carolina at Chapel Hill and Princeton University Front-page photo: Simulation
CSE 591: GPU Programming. Introduction. Entertainment Graphics: Virtual Realism for the Masses. Computer games need to have: Klaus Mueller
Entertainment Graphics: Virtual Realism for the Masses CSE 591: GPU Programming Introduction Computer games need to have: realistic appearance of characters and objects believable and creative shading,
GPGPU. Peter Laurens 1st-year PhD Student, NSC
GPGPU Peter Laurens 1st-year PhD Student, NSC Presentation Overview 1. What is it? 2. What can it do for me? 3. How can I get it to do that? 4. What's the catch? 5. What's the future? What is it? Introducing
Hardware-driven visibility culling
Hardware-driven visibility culling I. Introduction 20073114 김정현 The goal of the 3D graphics is to generate a realistic and accurate 3D image. To achieve this, it needs to process not only large amount
Course Recap + 3D Graphics on Mobile GPUs
Lecture 18: Course Recap + 3D Graphics on Mobile GPUs Interactive Computer Graphics Q. What is a big concern in mobile computing? A. Power Two reasons to save power Run at higher performance for a fixed
Scanline Rendering 2 1/42
Scanline Rendering 2 1/42 Review 1. Set up a Camera the viewing frustum has near and far clipping planes 2. Create some Geometry made out of triangles 3. Place the geometry in the scene using Transforms
Drawing Fast The Graphics Pipeline
Drawing Fast The Graphics Pipeline CS559 Fall 2015 Lecture 9 October 1, 2015 What I was going to say last time How are the ideas we've learned about implemented in hardware so they are fast. Important:
CS452/552; EE465/505. Clipping & Scan Conversion
CS452/552; EE465/505 Clipping & Scan Conversion 3-31 15 Outline! From Geometry to Pixels: Overview Clipping (continued) Scan conversion Read: Angel, Chapter 8, 8.1-8.9 Project#1 due: this week Lab4 due:
Drawing Fast The Graphics Pipeline
Drawing Fast The Graphics Pipeline CS559 Spring 2016 Lecture 10 February 25, 2016 1. Put a 3D primitive in the World Modeling Get triangles 2. Figure out what color it should be Do lighting 3. Position
Further Developing GRAMPS. Jeremy Sugerman FLASHG January 27, 2009
Further Developing GRAMPS Jeremy Sugerman FLASHG January 27, 2009 Introduction Evaluation of what/where GRAMPS is today Planned next steps New graphs: MapReduce and Cloth Sim Speculative potpourri, outside
Graphics and Imaging Architectures
Graphics and Imaging Architectures Kayvon Fatahalian http://www.cs.cmu.edu/afs/cs/academic/class/15869-f11/www/ About Kayvon New faculty, just arrived from Stanford Dissertation: Evolving real-time graphics
Performance Analysis of Cluster based Interactive 3D Visualisation
Performance Analysis of Cluster based Interactive 3D Visualisation P.D. MASELINO N. KALANTERY S. WINTER Centre for Parallel Computing University of Westminster 115 New Cavendish Street, London UNITED KINGDOM
The Graphics Pipeline
The Graphics Pipeline Ray Tracing: Why Slow? Basic ray tracing: 1 ray/pixel But you really want shadows, reflections, global illumination, antialiasing
What's New with GPGPU?
What's New with GPGPU? John Owens Assistant Professor, Electrical and Computer Engineering Institute for Data Analysis and Visualization University of California, Davis Microprocessor Scaling is Slowing
Real - Time Rendering. Pipeline optimization. Michal Červeňanský Juraj Starinský
Real - Time Rendering Pipeline optimization Michal Červeňanský Juraj Starinský Motivation Resolution 1600x1200, at 60 fps Hw power not enough Acceleration is still necessary 3.3.2010 2 Overview Application
CS 380 - GPU and GPGPU Programming Lecture 2: Introduction; GPU Architecture 1. Markus Hadwiger, KAUST
CS 380 - GPU and GPGPU Programming Lecture 2: Introduction; GPU Architecture 1 Markus Hadwiger, KAUST Reading Assignment #2 (until Feb. 17) Read (required): GLSL book, chapter 4 (The OpenGL Programmable
GPU Architecture and Function. Michael Foster and Ian Frasch
GPU Architecture and Function Michael Foster and Ian Frasch Overview What is a GPU? How is a GPU different from a CPU? The graphics pipeline History of the GPU GPU architecture Optimizations GPU performance
Graphics and Interaction Rendering pipeline & object modelling
433-324 Graphics and Interaction Rendering pipeline & object modelling Department of Computer Science and Software Engineering The Lecture outline Introduction to Modelling Polygonal geometry The rendering
Threading Hardware in G80
Threading Hardware in G80 Sources: Slides by ECE 498 AL: Programming Massively Parallel Processors: Wen-Mei Hwu; John Nickolls, NVIDIA 3D API: OpenGL or Direct3D GPU Command &
Graphics Performance Optimisation. John Spitzer Director of European Developer Technology
Graphics Performance Optimisation John Spitzer Director of European Developer Technology Overview Understand the stages of the graphics pipeline Cherchez la bottleneck Once found, either eliminate or balance
Optimizing for DirectX Graphics. Richard Huddy European Developer Relations Manager
Optimizing for DirectX Graphics Richard Huddy European Developer Relations Manager Also on today from ATI... Start & End Time: 12:00pm 1:00pm Title: Precomputed Radiance Transfer and Spherical Harmonic
Rasterization. MIT EECS Frédo Durand and Barb Cutler. MIT EECS 6.837, Cutler and Durand 1
Rasterization MIT EECS 6.837 Frédo Durand and Barb Cutler MIT EECS 6.837, Cutler and Durand 1 Final projects Rest of semester Weekly meetings with TAs Office hours on appointment This week, with TAs Refine
Grafica Computazionale: Lezione 30. Grafica Computazionale. Hiding complexity... ;) Introduction to OpenGL. lezione30 Introduction to OpenGL
Grafica Computazionale: Lezione 30 Grafica Computazionale lezione30 Introduction to OpenGL Informatica e Automazione, "Roma Tre" May 20, 2010 OpenGL Shading Language Introduction to OpenGL OpenGL (Open
Real-Time Rendering (Echtzeitgraphik) Michael Wimmer
Real-Time Rendering (Echtzeitgraphik) Michael Wimmer wimmer@cg.tuwien.ac.at Walking down the graphics pipeline Application Geometry Rasterizer What for? Understanding the rendering pipeline is the key
CSE 167: Lecture #5: Rasterization. Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2012
CSE 167: Introduction to Computer Graphics Lecture #5: Rasterization Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2012 Announcements Homework project #2 due this Friday, October
CSE 591/392: GPU Programming. Introduction. Klaus Mueller. Computer Science Department Stony Brook University
CSE 591/392: GPU Programming Introduction Klaus Mueller Computer Science Department Stony Brook University First: A Big Word of Thanks! to the millions of computer game enthusiasts worldwide Who demand
Rasterization Overview
Rendering Overview The process of generating an image given a virtual camera objects light sources Various techniques rasterization (topic of this course) raytracing (topic of the course Advanced Computer
Copyright Khronos Group, Page Graphic Remedy. All Rights Reserved
Avi Shapira Graphic Remedy Copyright Khronos Group, 2009 - Page 1 2004 2009 Graphic Remedy. All Rights Reserved Debugging and profiling 3D applications are both hard and time consuming tasks Companies
Pipeline Operations. CS 4620 Lecture Steve Marschner. Cornell CS4620 Spring 2018 Lecture 11
Pipeline Operations CS 4620 Lecture 11 1 Pipeline you are here APPLICATION COMMAND STREAM 3D transformations; shading VERTEX PROCESSING TRANSFORMED GEOMETRY conversion of primitives to pixels RASTERIZATION
ARM Multimedia IP: working together to drive down system power and bandwidth
ARM Multimedia IP: working together to drive down system power and bandwidth Speaker: Robert Kong ARM China FAE Author: Sean Ellis ARM Architect 1 Agenda System power overview Bandwidth, bandwidth, bandwidth!
Drawing Fast The Graphics Pipeline
Drawing Fast The Graphics Pipeline CS559 Fall 2016 Lectures 10 & 11 October 10th & 12th, 2016 1. Put a 3D primitive in the World Modeling 2. Figure out what color it should be 3. Position relative to the