Graphics and Imaging Architectures Kayvon Fatahalian http://www.cs.cmu.edu/afs/cs/academic/class/15869-f11/www/
About Kayvon: New faculty, just arrived from Stanford. Dissertation: Evolving real-time graphics pipeline for micropolygon rendering (micropolygon meshes = high-resolution meshes used in off-line rendering for film). In addition to rendering: involved in many research projects related to computing on GPUs - Brook (precursor to CUDA/OpenCL) - Sequoia (locality-aware parallel programming language) - GRAMPS (stream computing for heterogeneous HW)
COMPUTING IS BECOMING HIGHLY VISUAL! Visually rich user interfaces (that require 3D graphics!) Innovative new input modalities (touch, gyro, cameras) Ubiquitous, high-resolution cameras (emergence of computational photography) Games, entertainment (continuous push towards higher visual realism)
COMPUTING EFFICIENTLY IS INCREASINGLY CRITICAL! Ubiquitous parallelism (e.g., Core i7 (Nehalem)) Heterogeneous parallelism Power constraints Mobile, mobile, mobile Real-time 3D graphics has always found a way to consume all available compute: GPUs are efficient, parallel, heterogeneous systems (that are relatively easy to program)
My MacBook Pro 2011 (two GPUs) AMD Radeon HD GPU Quad-core Intel Core i7 CPU (contains integrated Intel GPU) From ifixit.com teardown
iPad 2 main board Apple A5 SoC: Dual-core ARM CPU, Image processing DSP, GPU, Video Encode/Decode. Flash memory. From ifixit.com teardown
Touchscreen controller (integrated DSP) ~ $1 From US Patent Application 2006/0097991
My Nikon D7000 camera 16.2 MP sensor 14 bits per pixel 6 fps burst
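A rough sketch of what those camera numbers imply for the hardware behind the sensor. It only multiplies the three figures on the slide; the function names are illustrative, and it assumes raw pixels are tightly packed with no compression or metadata, which a real camera pipeline may not do.

```python
# Back-of-the-envelope raw data rate for the camera figures quoted above.
MEGAPIXELS = 16.2        # sensor resolution (from the slide)
BITS_PER_PIXEL = 14      # raw bit depth (from the slide)
FRAMES_PER_SECOND = 6    # burst rate (from the slide)


def raw_frame_bytes(megapixels: float, bits_per_pixel: int) -> float:
    """Bytes in one raw frame, assuming tightly packed, uncompressed pixels."""
    return megapixels * 1e6 * bits_per_pixel / 8


def burst_bytes_per_second(megapixels: float, bits_per_pixel: int, fps: float) -> float:
    """Bytes per second the sensor produces while shooting a burst."""
    return raw_frame_bytes(megapixels, bits_per_pixel) * fps


frame_mb = raw_frame_bytes(MEGAPIXELS, BITS_PER_PIXEL) / 1e6
burst_mb_s = burst_bytes_per_second(MEGAPIXELS, BITS_PER_PIXEL, FRAMES_PER_SECOND) / 1e6
print(f"{frame_mb:.2f} MB per frame, {burst_mb_s:.1f} MB/s during a burst")
```

Under these assumptions a burst produces about 28 MB per frame and roughly 170 MB/s of raw data, all of which has to be moved and processed on the camera.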
What this course is This is a course about how real-time graphics systems work. GRAPHICS ALGORITHMS (the workload): geometry processing, rasterization, texture mapping, anti-aliasing. Mapping/scheduling: parallelism, locality, communication. MACHINE ORGANIZATION: the design of throughput processing cores, the role of fixed-function HW
What this course is This is a course about how real-time graphics systems work. GRAPHICS ALGORITHMS (the workload), mapping/scheduling, MACHINE ORGANIZATION. ABSTRACTIONS (the real-time graphics pipeline): choice of primitives, level of abstraction
What this course is not This is not an [OpenGL, CUDA, OpenCL] programming course - But we will be discussing the design of these abstractions (and their future) in great detail!
Logistics 1: schedule Expect it to be fluid - it's my first time doing this - students in this class have wide variation in background - http://www.cs.cmu.edu/afs/cs/academic/class/15869-f11/www/ First half of class: focused on real-time 3D graphics - 3D graphics abstractions and the implementation of the real-time 3D graphics pipeline (e.g., OpenGL/Direct3D) on modern GPUs Second half of class: broader sampling of applications and systems - future 3D graphics pipelines, GPU programming model issues (OpenCL/CUDA), hybrid CPU/GPUs, image processing architectures, cameras Several external speakers (AMD, Intel, NVIDIA)
Logistics 2: expectations This is a reading/project course. (40%) Paper readings - Weekly: you will be asked to submit brief reviews of selected papers (or answer specific questions) - Participate in class discussions (10%) Two quick (2-3 day) C++ programming assignments - Meant to concretize concepts, not stress you out (50%) Semester-long, self-selected research project (teams of 1-2) - Aim for publishable results
3D rendering Image credit: Henrik Wann Jensen Model of a scene: 3D surface geometry (e.g., triangle mesh) surface materials lights camera Image How does each triangle contribute to each pixel in the image?
What is an architecture? (white board)
Real-time graphics pipeline (entities) Vertices; Primitives (triangles, points, lines); Fragments; Pixels
Real-time graphics pipeline (operations) Stages, in order: Vertex Generation -> Vertex Processing -> Primitive Generation -> Primitive Processing -> Fragment Generation (Rasterization) -> Fragment Processing -> Pixel Operations. Streams between stages, in order: vertex stream, vertex stream, primitive stream, primitive stream, fragment stream, fragment stream. Data as it moves through the pipeline: vertices in 3D space; vertices positioned on screen; triangles positioned on screen; fragments (one per covered pixel); shaded fragments; output image (pixels)
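A toy software sketch of the stage sequence on this slide, written as a chain of Python generators so that each stage consumes one stream and produces the next. Everything here is an illustrative assumption rather than how a GPU or any graphics API implements the pipeline: vertices are 2D, vertex processing is a fixed viewport mapping, fragment processing assigns one constant color, and pixels whose centers lie exactly on a triangle edge count as covered (real rasterizers use stricter fill rules to break such ties).

```python
from typing import Iterable, Iterator, List, Tuple

Vertex = Tuple[float, float]                 # 2D position (assumed for brevity)
Triangle = Tuple[Vertex, Vertex, Vertex]


def vertex_generation(vertex_buffer: List[Vertex]) -> Iterator[Vertex]:
    """Read vertices out of a vertex data buffer as a stream."""
    yield from vertex_buffer


def vertex_processing(vertices: Iterable[Vertex], width: int, height: int) -> Iterator[Vertex]:
    """Position vertices on screen: map [-1, 1] coordinates to pixel coordinates."""
    for x, y in vertices:
        yield ((x + 1.0) * 0.5 * width, (y + 1.0) * 0.5 * height)


def primitive_generation(vertices: Iterable[Vertex]) -> Iterator[Triangle]:
    """Group every three consecutive vertices into one triangle."""
    batch: List[Vertex] = []
    for v in vertices:
        batch.append(v)
        if len(batch) == 3:
            yield (batch[0], batch[1], batch[2])
            batch = []


def edge(a: Vertex, b: Vertex, p: Vertex) -> float:
    """Signed area test: which side of the line a->b the point p lies on."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def primitive_processing(triangles: Iterable[Triangle]) -> Iterator[Triangle]:
    """Per-primitive work; here it only discards zero-area triangles."""
    for tri in triangles:
        if edge(*tri) != 0:
            yield tri


def fragment_generation(triangles: Iterable[Triangle], width: int, height: int) -> Iterator[Tuple[int, int]]:
    """Rasterization: emit one fragment per pixel whose center the triangle covers."""
    for a, b, c in triangles:
        area = edge(a, b, c)
        for py in range(height):
            for px in range(width):
                p = (px + 0.5, py + 0.5)  # sample at the pixel center
                weights = (edge(b, c, p), edge(c, a, p), edge(a, b, p))
                # Matching signs means the sample is inside (either winding order).
                if all(w * area >= 0 for w in weights):
                    yield (px, py)


def fragment_processing(fragments: Iterable[Tuple[int, int]], color: int) -> Iterator[Tuple[int, int, int]]:
    """Shade each fragment; a constant color stands in for a real shader."""
    for px, py in fragments:
        yield (px, py, color)


def pixel_operations(shaded: Iterable[Tuple[int, int, int]], width: int, height: int) -> List[List[int]]:
    """Write shaded fragments into the output image buffer."""
    image = [[0] * width for _ in range(height)]
    for px, py, color in shaded:
        image[py][px] = color
    return image


def render(vertex_buffer: List[Vertex], width: int, height: int, color: int = 1) -> List[List[int]]:
    """Chain the seven stages in the order listed on the slide."""
    verts = vertex_processing(vertex_generation(vertex_buffer), width, height)
    tris = primitive_processing(primitive_generation(verts))
    frags = fragment_processing(fragment_generation(tris, width, height), color)
    return pixel_operations(frags, width, height)


image = render([(-1.0, -1.0), (1.0, -1.0), (-1.0, 1.0)], width=4, height=4)
for row in image:
    print(row)
```

Because each stage is a generator, data flows through as a stream and no stage needs the whole scene at once, which loosely mirrors the stream-based structure named on the slide.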
Real-time graphics pipeline (state) Stages, in order: Vertex Generation -> Vertex Processing -> Primitive Generation -> Primitive Processing -> Fragment Generation (Rasterization) -> Fragment Processing -> Pixel Operations. Memory buffers (system state): vertex data buffers, texture buffers (shown for several stages), vertex transform matrices, output image buffer
History: Evolution of interactive 3D graphics Following slides borrowed from Kurt Akeley and Pat Hanrahan http://graphics.stanford.edu/wikis/cs448-07-spring
Slide credit Kurt Akeley and Pat Hanrahan
Early years Evans and Sutherland built machines for flight simulators throughout the 70s Early 80s: Custom ASIC for geometry processing - The Geometry Engine: a VLSI Geometry System for Graphics (Jim Clark, SIGGRAPH 82) - Work started at Stanford (led to Silicon Graphics, 1982)
Off-line rendering for film Off-line graphics advancing as well - Pixar created in 1986 - Initially thought they were going to build hardware (Pixar Image Computer) Reyes graphics system at Pixar (RenderMan) - Feature rich: smooth/complex surfaces, programmable shading, anti-aliasing, motion blur - Implemented as a software application on CPUs - Hours/frame Image credit: Pixar, Toy Story 1, 1995
Early PC graphics Wolfenstein 3D, 1992 Doom, 1993 Software applications running on CPU Extreme amount of software optimization, algorithm development (and hacks)
PC 3D graphics 3D graphics acceleration for PC motivated by games; add-in boards made by 3DFX, NVIDIA, ATI, PowerVR, and more - 3DFX founders were from SGI - 3DFX Voodoo (1997) standardized around accelerating OpenGL/Direct3D - offloaded rasterization, texture mapping, frame-buffer processing - initially implemented only the subset of GL used by Quake (image: Quake 1)
Fourth generation Programmable shading
Current generation Tessellation (smooth/complex surfaces) Image credit: NVIDIA/Unigine
Current generation Some GPU computations now driven by alternative programming interfaces (not the 3D graphics pipeline) - to augment 3D rendering (game physics, global lighting) - for non-graphics applications (scientific computing) Image credits: NVIDIA, Folding@home
Today: high versatility, high peak compute Intel Core i7 (quad-core CPU): ~100 GFLOPS, 730 million transistors. AMD Radeon HD 5870 GPU: ~2.7 TFLOPS, 2.2 billion transistors. More compute power than Pixar's entire render farm for Toy Story 1!
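A quick calculation from the approximate peak figures on this slide. The per-transistor number is an illustrative metric added here, not something the slide claims, and peak FLOPS say nothing about what real workloads achieve.

```python
# Approximate peak figures copied from the slide; names are illustrative.
CPU = {"name": "Intel Core i7 (quad-core)", "gflops": 100.0, "transistors": 730e6}
GPU = {"name": "AMD Radeon HD 5870", "gflops": 2700.0, "transistors": 2.2e9}


def peak_ratio(a: dict, b: dict) -> float:
    """How many times more peak FLOPS chip a has than chip b."""
    return a["gflops"] / b["gflops"]


def flops_per_transistor(chip: dict) -> float:
    """Peak FLOPS divided by transistor count (a crude density measure)."""
    return chip["gflops"] * 1e9 / chip["transistors"]


throughput_gap = peak_ratio(GPU, CPU)
density_gap = flops_per_transistor(GPU) / flops_per_transistor(CPU)
print(f"GPU peak throughput: {throughput_gap:.0f}x the CPU")
print(f"GPU peak FLOPS per transistor: {density_gap:.1f}x the CPU")
```

With these numbers the GPU has 27x the peak throughput while using only about 3x the transistors, so its peak FLOPS per transistor come out roughly 9x higher.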
Readings: D. Blythe, "The Rise of the Graphics Processor," Proceedings of the IEEE, 2008. Real-Time Rendering, Ch. 2: The Graphics Rendering Pipeline (handout)