Current Trends in Computer Graphics Hardware

Similar documents
CSE 591: GPU Programming. Introduction. Entertainment Graphics: Virtual Realism for the Masses. Computer games need to have: Klaus Mueller

CSE 591/392: GPU Programming. Introduction. Klaus Mueller. Computer Science Department Stony Brook University

CS8803SC Software and Hardware Cooperative Computing GPGPU. Prof. Hyesoon Kim School of Computer Science Georgia Institute of Technology

CS427 Multicore Architecture and Parallel Computing

Graphics Hardware. Graphics Processing Unit (GPU) is a Subsidiary hardware. With massively multi-threaded many-core. Dedicated to 2D and 3D graphics

GPGPU. Peter Laurens 1st-year PhD Student, NSC

CSCI 402: Computer Architectures. Parallel Processors (2) Fengguang Song Department of Computer & Information Science IUPUI.

GPGPU, 1st Meeting Mordechai Butrashvily, CEO GASS

CUDA PROGRAMMING MODEL Chaithanya Gadiyam Swapnil S Jadhav

Spring 2011 Prof. Hyesoon Kim

Spring 2009 Prof. Hyesoon Kim

Graphics Hardware, Graphics APIs, and Computation on GPUs. Mark Segal

Lecture 2. Shaders, GLSL and GPGPU

Threading Hardware in G80

Introduction to CUDA Algoritmi e Calcolo Parallelo. Daniele Loiacono

CMPE 665:Multiple Processor Systems CUDA-AWARE MPI VIGNESH GOVINDARAJULU KOTHANDAPANI RANJITH MURUGESAN

Graphics Hardware. Instructor Stephen J. Guy

Real - Time Rendering. Graphics pipeline. Michal Červeňanský Juraj Starinský

Graphics and Imaging Architectures

Portland State University ECE 588/688. Graphics Processors

Lecture 15: Introduction to GPU programming. Lecture 15: Introduction to GPU programming p. 1

Lecture 1: Gentle Introduction to GPUs

GPU Architecture and Function. Michael Foster and Ian Frasch

From Brook to CUDA. GPU Technology Conference

ECE 571 Advanced Microprocessor-Based Design Lecture 20

CS GPU and GPGPU Programming Lecture 8+9: GPU Architecture 7+8. Markus Hadwiger, KAUST

GPU Architecture. Michael Doggett Department of Computer Science Lund university

GPU Programming Using NVIDIA CUDA

3D Computer Games Technology and History. Markus Hadwiger VRVis Research Center

GPGPUs in HPC. VILLE TIMONEN Åbo Akademi University CSC

X. GPU Programming. Jacobs University Visualization and Computer Graphics Lab : Advanced Graphics - Chapter X 1

GPUs and GPGPUs. Greg Blanton John T. Lubia

What Next? Kevin Walsh CS 3410, Spring 2010 Computer Science Cornell University. * slides thanks to Kavita Bala & many others

GPGPU Applications. for Hydrological and Atmospheric Simulations. and Visualizations on the Web. Ibrahim Demir

ECE 571 Advanced Microprocessor-Based Design Lecture 18

Introduction to CUDA Algoritmi e Calcolo Parallelo. Daniele Loiacono

Efficient and Scalable Shading for Many Lights

Prof. Hakim Weatherspoon CS 3410, Spring 2015 Computer Science Cornell University

ECE 8823: GPU Architectures. Objectives

Drawing Fast The Graphics Pipeline

DIFFERENTIAL. Tomáš Oberhuber, Atsushi Suzuki, Jan Vacata, Vítězslav Žabka

What s New with GPGPU?

Drawing Fast The Graphics Pipeline

! Readings! ! Room-level, on-chip! vs.!

Drawing Fast The Graphics Pipeline

GPU Computation Strategies & Tricks. Ian Buck NVIDIA

CS179 GPU Programming Introduction to CUDA. Lecture originally by Luke Durant and Tamas Szalay

Graphics Architectures and OpenCL. Michael Doggett Department of Computer Science Lund university

Tutorial on GPU Programming #2. Joong-Youn Lee Supercomputing Center, KISTI

GPGPU introduction and network applications. PacketShaders, SSLShader

Mattan Erez. The University of Texas at Austin

Real-time Graphics 9. GPGPU

GPU for HPC. October 2010

ECE 574 Cluster Computing Lecture 16

GPU > CPU. FOR HIGH PERFORMANCE COMPUTING PRESENTATION BY - SADIQ PASHA CHETHANA DILIP

Programmable GPUs. Real Time Graphics 11/13/2013. Nalu 2004 (NVIDIA Corporation) GeForce 6. Virtua Fighter 1995 (SEGA Corporation) NV1

Introduction to Modern GPU Hardware


Introduction to GPU computing

Introduction to Multicore architecture. Tao Zhang Oct. 21, 2010

Graphics Processor Acceleration and YOU

Real-Time Reyes: Programmable Pipelines and Research Challenges. Anjul Patney University of California, Davis

Real-Time Graphics Architecture

Using Graphics Chips for General Purpose Computation

EECS 487: Interactive Computer Graphics

Antonio R. Miele Marco D. Santambrogio

Grafica Computazionale: Lezione 30. Grafica Computazionale. Hiding complexity... ;) Introduction to OpenGL. lezione30 Introduction to OpenGL

Shaders. Slide credit to Prof. Zwicker

High Performance Computing on GPUs using NVIDIA CUDA

GRAPHICS HARDWARE. Niels Joubert, 4th August 2010, CS147

Fast Interactive Sand Simulation for Gesture Tracking systems Shrenik Lad

Accelerating CFD with Graphics Hardware

Lecture 6: Texture. Kayvon Fatahalian CMU : Graphics and Imaging Architectures (Fall 2011)

GPGPU LAB. Case study: Finite-Difference Time- Domain Method on CUDA

Computing architectures Part 2 TMA4280 Introduction to Supercomputing

Technology for a better society. hetcomp.com

NVIDIA Fermi Architecture

Computing on GPUs. Prof. Dr. Uli Göhner. DYNAmore GmbH. Stuttgart, Germany

Parallelizing Graphics Pipeline Execution (+ Basics of Characterizing a Rendering Workload)

Introduction to CUDA (1 of n*)

Standard Graphics Pipeline

Jeremy W. Sheaffer 1 David P. Luebke 2 Kevin Skadron 1. University of Virginia Computer Science 2. NVIDIA Research

CONSOLE ARCHITECTURE

Computer Graphics. Hardware Pipeline. Visual Imaging in the Electronic Age Prof. Donald P. Greenberg October 23, 2014 Lecture 16

CSCI-GA Graphics Processing Units (GPUs): Architecture and Programming Lecture 2: Hardware Perspective of GPUs

Pat Hanrahan. Modern Graphics Pipeline. How Powerful are GPUs? Application. Command. Geometry. Rasterization. Fragment. Display.

CS GPU and GPGPU Programming Lecture 2: Introduction; GPU Architecture 1. Markus Hadwiger, KAUST

GPU Basics. Introduction to GPU. S. Sundar and M. Panchatcharam. GPU Basics. S. Sundar & M. Panchatcharam. Super Computing GPU.

EE 7722 GPU Microarchitecture. Offered by: Prerequisites By Topic: Text EE 7722 GPU Microarchitecture. URL:

Technical Report on IEIIT-CNR

Real-Time Rendering (Echtzeitgraphik) Michael Wimmer

Real-time Graphics 9. GPGPU

Distributed Virtual Reality Computation

Motivation Hardware Overview Programming model. GPU computing. Part 1: General introduction. Ch. Hoelbling. Wuppertal University

CENG 477 Introduction to Computer Graphics. Graphics Hardware and OpenGL

Neural Network Implementation using CUDA and OpenMP

Parallelizing Graphics Pipeline Execution (+ Basics of Characterizing a Rendering Workload)

Mattan Erez. The University of Texas at Austin

General Purpose Computing on Graphical Processing Units (GPGPU(

CS195V Week 9. GPU Architecture and Other Shading Languages

Transcription:

Current Trends in Computer Graphics Hardware Dirk Reiners University of Louisiana Lafayette, LA Quick Introduction Assistant Professor in Computer Science at University of Louisiana, Lafayette (since 2006) Iowa State University/VRAC (2003-2006) Fraunhofer IGD (Germany) (1994-2003) Area: Interactive 3D Graphics, Software Systems for Virtual Reality Project lead for OpenSG Open Source scenegraph (www.opensg.org) 1

LITE Louisiana Immersive Technologies Enterprise State-initiated, independent R&D lab Good relations to University & Industry Provide Visualization, Supercomputing and Networking Resources For University researchers For smaller companies LITE Venues Flex Executive Room Theatre TIS 2

LITE Resources Compute 352 processor (32*11) SGI Altix 350 cluster 160 core, 4 TB main memory single-systemimage Altix 4700 Network 20 Gb Connection to LONI Bridges into National Lambda Rail and out 20 Gb Connection for commercial projects COMPUTER GRAPHICS HARDWARE 3

Motivation Why bother with special hardware? Graphics is very computation-intensive Graphics work is different from CPU work Special hardware can be much more efficient Interactive (3D) Graphics can never be fast enough People are willing to pay good money for it First Graphics Hardware CRT Displays Need refresh per frame Line Drawing Processor Simple Commands Move, Draw Limited amount of memory Evolved into simple programs Loops, Subroutines 4

2D Processors Fairly soon after raster-based displays Raster operations need a lot of memory bandwidth Simple colored pixel fill Points, Lines, Parallelograms, Triangles The world is not flat, though 3D is a lot more challenging 3D models consist of thousands/millions of vertices and triangles 3D Graphics Pipeline Vertices Transform Light Clip Perspect Viewport Setup Span Pixelfill Display End: Integer data (pixel coords, colors) Easy and efficient to do in hardware Beginning: Floating-point data Fundamental operation: 4D dot product 5

The Geometry Engine Vector-Vector Multiply/Add Unit 5 MFlops Front-end to 2D drawing hardware Lines Z-sorted polygons SGI Iris Workstation First general-purpose graphics workstation J. Clark The Geometry Engine, ACM SIGGRAPH 82 Making it faster Vertices Transform Clip Viewport Pipeline really is a pipeline Pipelining allows the use of multiple chips together Don t have to be different Different Microcode or configuration Iris: 12 (see above) A balanced pipeline scales performance linearly Does not reduce latency, though 6

Parallelism Vertices Transf Transf Light Clip Persp Viewp Setup Span Span Pixel Display Span Graphics: very parallel Thousands of vertices, millions of pixels Very similar/the same operations on each Parallelism improves performance and latency Pushing the Envelope SGI Reality Engine (1992)/ SGI Infinite Reality (1996) RE: 12 Geometry Engines Intel i860xp IR: 4 Geometry Engines Custom ASICs 64K Vertex Geo/Raster Buffer 4 Raster Memory Boards 80 Image Engines each 1 MB Framebuffer/IE 64 MB Texmem (IR) K. Akeley Reality Engine Graphics SIGGRAPH 93 J. Montrym Infinite Realty A Real-Time Graphics System, SIGGRAPH 97 7

Making it cheap All thanks to John Carmack Invented First Person Shooter Doom, 1993 Fully software-rendered Showed 3D graphics for games Jump-started the PC 3D chip industry PCs Coming Up Followed the same evolution as the workstations 2D Pixel Filling (Image Operations, Textures, Depth Buffer) 2D Triangle Drawing 3D Transformations (nvidia GeForce 256, 1999) And surpass the workstations 8

Workstations vs. PCs Reaching Limits By 2000 the full pipeline was in hardware What to do next? Point features: Vertex Blending, Sprites, Just minor advances Problem: Differentiating Features Games need to look better than competition Give flexibility to developers Fixed operations, configure connections 9

Programming the GPU First approach: low-level, assembly languages Standardized between manufacturers Not directly mapped to hardware Better: High-level, C/C++-style languages Microsoft Direct3D HLSL OpenGL GLSL nvidia CG How to program? Need to keep graphics hardware advantages Pipelining & Parallelism Simplicity: want a lot of simple processors Need to keep ordering Final image depends on order Solution: Stream Processing Programs (kernels) work on individual units (vertices and pixels) No communication between them 10

What to program? 2 Big Parts: Vertex & Pixel Handling Separate cooperating programs Vertices Transform Light Clip Perspect Viewport Setup Span Pixelfill Display The Quest for More Power Hunger for performance insatiable More realism, more complexity Need more memory bandwidth, more compute GPUs develop much faster than CPUs CPU: doubles every ~18 months GPU: quadruples every ~18 months 11

CPUs vs. GPUs: Bandwidth From http://gpu4vision.icg.tugraz.at CPUs vs. GPUs: FLOPS From http://gpu4vision.icg.tugraz.at 12

Processing vs. Cache Intel Tukwila 2 B Transistors 30 MB Cache 4 Cores / 16 ALUs nvidia GT200 1.4 B Transistors ~2.5 MB Cache > 500 SIMD ALUs 933 GFlops peak No Cache? Caches main tool against memory latency Memory Cycle ~ 200 Compute Cycles Same for graphics cards Alternative: Multithreading While waiting, compute on other thread Intel s HyperThreading GT200: 10240 threads 10 groups with 1024 each 13

Massive Multithreading Allows Very High Performance For the right problems Lots of compute, not a lot of memory access Write worse than read Imposes new limitations Register pressure: need to keep state for all threads Max performance for < 10 variables used Ok for graphics: calculate lighting for pixels GT200 Architecture From www.beyond3d.com 14

GPGPU General Programming on GPUs Use GPU for non-graphics usage First: using graphics APIs Render result, read back as image Hard to learn, painful to use Still inherently a graphics program Output to pixels only, no scattered write No communication between threads at all New API: CUDA / OpenCL Compute Unified Device Architecture Variant of C/C++ Vector types Special global shared memory No recursion CPU/GPU specific functions and variables Explicit thread blocking Limited thread communication None between blocks, some within block 15

nvidia Tesla Graphics card without monitor outputs Purely as a compute device Also rack-mounted Possible, but not necessary CUDA works on most current nvidia graphics cards OpenCL CUDA is proprietary, only on nvidia systems AMD s variant is CTM OpenCL is an open standard for GPGPU Initiated June 08 by Khronos No results yet, but an important step 16

Example Applications And Speedups Some extreme examples: K-Nearest Neighbor Search: 470x Sum Product Solving: 270x Image Processing: Sliding-Windows for Rapid Object Class Localization: 30x Optical Flow using CUDA and OpenCV: 90x More examples at http://www.nvidia.com/object/cuda_home.html See also http://gpu4vision.icg.tugraz.at Summary Graphics operations can be accelerated massively by graphics hardware GPUs today are more powerful than CPUs For specific tasks: different programming model Focused on stream processing operations Thanks to CUDA/OpenCL can be used nearly as easily as CPUs 17