GRAPHICS HARDWARE. Niels Joubert, 4th August 2010, CS147
1 GRAPHICS HARDWARE Niels Joubert, 4th August 2010, CS147
2 Today: Enabling Real-Time Graphics. Rendering Pipeline. History. Latest Architecture. GPGPU Programming.
3 RENDERING PIPELINE Real-Time Graphics
4 Rendering Pipeline: Vertex Processing. Vertices are transformed into screen space.
5 Rendering Pipeline: Vertex Processing. Vertices are transformed into screen space. Each vertex is transformed independently! [Diagram: six Vertex Processors working in parallel]
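Because each vertex is transformed independently, the work maps naturally onto one GPU thread per vertex. The following is a minimal CUDA sketch of that idea, not the deck's code: the Vec4 struct, the row-major 4x4 matrix layout, and the names transformVertices and transformOnGpu are all assumptions made for illustration.

```cuda
#include <cuda_runtime.h>
#include <vector>

struct Vec4 { float x, y, z, w; };  // hypothetical vertex type for this sketch

// One thread per vertex. Each thread reads only its own input vertex and
// writes only its own output vertex, so no synchronization is needed.
__global__ void transformVertices(const Vec4* in, Vec4* out,
                                  const float* m, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;  // guard the last, partially filled block

    Vec4 v = in[i];
    Vec4 r;
    // m is assumed to be a row-major 4x4 matrix.
    r.x = m[0]  * v.x + m[1]  * v.y + m[2]  * v.z + m[3]  * v.w;
    r.y = m[4]  * v.x + m[5]  * v.y + m[6]  * v.z + m[7]  * v.w;
    r.z = m[8]  * v.x + m[9]  * v.y + m[10] * v.z + m[11] * v.w;
    r.w = m[12] * v.x + m[13] * v.y + m[14] * v.z + m[15] * v.w;
    out[i] = r;
}

// Hypothetical host helper: copy to the GPU, run the kernel, copy back.
// Error checking is omitted for brevity.
std::vector<Vec4> transformOnGpu(const std::vector<Vec4>& verts,
                                 const float* matrix16)
{
    int n = static_cast<int>(verts.size());
    std::vector<Vec4> result(n);
    if (n == 0) return result;

    Vec4* dIn = nullptr;
    Vec4* dOut = nullptr;
    float* dM = nullptr;
    cudaMalloc(&dIn, n * sizeof(Vec4));
    cudaMalloc(&dOut, n * sizeof(Vec4));
    cudaMalloc(&dM, 16 * sizeof(float));
    cudaMemcpy(dIn, verts.data(), n * sizeof(Vec4), cudaMemcpyHostToDevice);
    cudaMemcpy(dM, matrix16, 16 * sizeof(float), cudaMemcpyHostToDevice);

    int threads = 256;
    int blocks = (n + threads - 1) / threads;  // round up to cover all vertices
    transformVertices<<<blocks, threads>>>(dIn, dOut, dM, n);

    // cudaMemcpy waits for the kernel to finish before copying.
    cudaMemcpy(result.data(), dOut, n * sizeof(Vec4), cudaMemcpyDeviceToHost);
    cudaFree(dIn);
    cudaFree(dOut);
    cudaFree(dM);
    return result;
}
```

Calling transformOnGpu with a translation matrix moves every vertex by the same offset; the threads never communicate, which is the property the slide emphasizes.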
6 Rendering Pipeline: Primitive Processing. Vertices are organized into primitives. Primitives are clipped and culled.
7 Rendering Pipeline: Rasterization. Primitives are rasterized into pixel fragments.
8 Rendering Pipeline: Rasterization. Primitives are rasterized into pixel fragments. Each primitive is rasterized independently!
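To make "rasterized into pixel fragments" concrete, here is a small CUDA sketch that finds which pixel centers one triangle covers, using edge functions. It is an illustration under assumptions: real GPUs rasterize in fixed-function hardware, the sketch parallelizes over pixels for a single primitive, and the names Point2, edgeFunction, rasterizeTriangle, and rasterizeOnGpu are hypothetical.

```cuda
#include <cuda_runtime.h>
#include <vector>

struct Point2 { float x, y; };  // hypothetical 2D screen-space point

// Signed-area test: positive on one side of the edge a->b, negative on the
// other side, and zero exactly on the edge.
__device__ float edgeFunction(Point2 a, Point2 b, float px, float py)
{
    return (b.x - a.x) * (py - a.y) - (b.y - a.y) * (px - a.x);
}

// One thread per pixel. A pixel produces a fragment (here just a coverage
// flag) when its center lies inside the triangle or on its boundary.
__global__ void rasterizeTriangle(Point2 v0, Point2 v1, Point2 v2,
                                  unsigned char* covered,
                                  int width, int height)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= width || y >= height) return;

    float px = x + 0.5f;  // sample at the pixel center
    float py = y + 0.5f;
    float e0 = edgeFunction(v0, v1, px, py);
    float e1 = edgeFunction(v1, v2, px, py);
    float e2 = edgeFunction(v2, v0, px, py);

    // Accept either winding order: all three tests must agree in sign.
    bool allNonNeg = e0 >= 0.0f && e1 >= 0.0f && e2 >= 0.0f;
    bool allNonPos = e0 <= 0.0f && e1 <= 0.0f && e2 <= 0.0f;
    covered[y * width + x] = (allNonNeg || allNonPos) ? 1 : 0;
}

// Hypothetical host helper returning a width*height coverage mask.
// Error checking is omitted for brevity.
std::vector<unsigned char> rasterizeOnGpu(Point2 v0, Point2 v1, Point2 v2,
                                          int width, int height)
{
    std::vector<unsigned char> mask(static_cast<size_t>(width) * height, 0);
    if (mask.empty()) return mask;

    unsigned char* dMask = nullptr;
    cudaMalloc(&dMask, mask.size());

    dim3 block(16, 16);
    dim3 grid((width + block.x - 1) / block.x,
              (height + block.y - 1) / block.y);
    rasterizeTriangle<<<grid, block>>>(v0, v1, v2, dMask, width, height);

    cudaMemcpy(mask.data(), dMask, mask.size(), cudaMemcpyDeviceToHost);
    cudaFree(dMask);
    return mask;
}
```

Each set flag corresponds to one fragment that the next pipeline stage would shade.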
9 Rendering Pipeline: Fragment Processing. Fragments are shaded to compute a color at a pixel.
10 Rendering Pipeline: Fragment Processing. Fragments are shaded to compute a color at a pixel. Each fragment is shaded independently!
11 Rendering Pipeline: Pixel Operations. Fragments are blended into the framebuffer. Z-Buffer determines visibility.
12 Rendering Pipeline: Frame Buffer. Memory location with aggregation ability. Many fragments end up on the same pixel. All fragments are handled independently. Conflicts when writing to the framebuffer? Revisit this later! [Synchronization / Atomics]
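The framebuffer conflict can be sketched in CUDA with an atomic operation: many threads, each holding one fragment, race to update the same depth-buffer entry, and atomicMin keeps the nearest depth regardless of arrival order. This is a software illustration only; real GPUs resolve this in dedicated pixel-operations hardware, and the Fragment struct, the integer depth encoding, and the names depthTest and resolveDepth are assumptions.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical fragment record: target pixel index and an integer depth
// (smaller means closer to the camera).
struct Fragment { int pixel; unsigned int depth; };

// One thread per fragment. Several fragments may target the same pixel.
__global__ void depthTest(const Fragment* frags, int n, unsigned int* zbuffer)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    // atomicMin makes the read-compare-write on one pixel indivisible, so
    // the smallest depth wins no matter which thread gets there first.
    atomicMin(&zbuffer[frags[i].pixel], frags[i].depth);
}

// Hypothetical host helper: returns the resolved depth buffer.
// Error checking is omitted for brevity.
std::vector<unsigned int> resolveDepth(const std::vector<Fragment>& frags,
                                       int numPixels)
{
    // Start every pixel at the farthest representable depth.
    std::vector<unsigned int> z(numPixels, 0xFFFFFFFFu);
    int n = static_cast<int>(frags.size());
    if (n == 0 || numPixels == 0) return z;

    Fragment* dFrags = nullptr;
    unsigned int* dZ = nullptr;
    cudaMalloc(&dFrags, n * sizeof(Fragment));
    cudaMalloc(&dZ, numPixels * sizeof(unsigned int));
    cudaMemcpy(dFrags, frags.data(), n * sizeof(Fragment),
               cudaMemcpyHostToDevice);
    cudaMemcpy(dZ, z.data(), numPixels * sizeof(unsigned int),
               cudaMemcpyHostToDevice);

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    depthTest<<<blocks, threads>>>(dFrags, n, dZ);

    cudaMemcpy(z.data(), dZ, numPixels * sizeof(unsigned int),
               cudaMemcpyDeviceToHost);
    cudaFree(dFrags);
    cudaFree(dZ);
    return z;
}
```

Without the atomic, two threads could both read the old depth and the later write would silently overwrite the nearer fragment.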
13 Rendering Pipeline: Pipeline Entities
14 Rendering Pipeline: Graphics Pipeline (Producer-Consumer)
    Vertices
        Vertex Generation: Vertex Buffers
        Vertex Processing: Vertex Shader; Transformations, 3D to Screen Space; Position, Color, Texture Coordinate manipulations
    Primitives
        Primitive Generation: Polygon Data
        Primitive Processing: Geometry Shader; Interpolate Variables; Add or Remove Vertices
    Fragments
        Fragment Generation: Interpolate & Rast
        Fragment Processing: Fragment shader; Lighting Effects
    Pixels
        Pixel Operations: Z-Buffer, Blend
15 HISTORY How we got to where we are
16 History: 1963 Sketchpad. The first GUI.
17 History: 1972 Xerox Alto. Personal Computer. 1972: Mouse, Keyboard, Files, Folders, Draggable windows.
18 History: 1977 Apple II. Graphics for the Masses. Launched 1977. 1 MHz 6502 processor, 4KB RAM (expandable to 48KB for $1340). Did it have a graphics card?
19 History: 1977 Apple II. CPU: writes pixel data to RAM. Video hardware: reads RAM in scanlines, generates NTSC. No, there is no graphics card.
20 History: 1982 The Geometry Engine. "A VLSI Geometry System for Graphics", James H. Clark, Computer Systems Lab, Stanford University & Silicon Graphics, Inc. "The Geometry System is a... computing system for computer graphics constructed from a basic building block, the Geometry Engine."
21 History: 1982 The Geometry Engine. "A VLSI Geometry System for Graphics", James H. Clark, Computer Systems Lab, Stanford University & Silicon Graphics, Inc. 5 Million FLOPS.
22 History: 1982 The Geometry Engine. "A VLSI Geometry System for Graphics", James H. Clark, Computer Systems Lab, Stanford University & Silicon Graphics, Inc.
23 History: 1982 The Geometry Engine Instruction Set:
24 History: 1982 IRIS - Integrated Raster Imaging System. Silicon Graphics Inc's first real-time 3D rendering engine. [Diagram components: CPU, Multibus, Geometry System, Raster System, EtherNet]
25
26 History: 1985 BLIT & Commodore Amiga. BLock Image Transfer. Co-processor / logic block. Rapid movement and modification of memory blocks. Commodore Amiga had a complete blitter in hardware, in a separate graphics processor.
27 History: 1993 Silicon Graphics RealityEngine. "Its target capability is the rendering of lighted, smooth shaded, depth buffered, texture mapped, antialiased triangles." - RealityEngine Graphics, K. Akeley, 1993. [Pipeline: Vertex Generation, Vertex Processing, Primitive Generation, Primitive Processing, Fragment Generation, Fragment Processing, Pixel Operations]
28 History: 1992 OpenGL 1.0 & 1995 Direct3D. Silicon Graphics: proprietary IRIS GL API (state of the art). OpenGL 1.0: OpenGL as an open standard derived from IRIS GL. Standardised HW access; device drivers become important. HUGE success: allows HW to evolve, SW to decouple.
29 History: 1998 NVidia RIVA TNT. [Diagram: the pipeline stages (Vertex Generation through Pixel Operations) shown beside the hardware; labels: CPU; GPU: Clip/Cull/Rast, Texture Map, Pixel Ops / Framebuffer]
30 History: 2002 Direct3D 9, OpenGL 2.0. ATI Radeon 9700. [Diagram: the pipeline stages shown beside the GPU; GPU blocks: several Vertex units, Clip / Cull / Rasterize, several Tex units, Pixel Ops / Framebuffer]
31 History: 2006 Unified Shading GPUs. GeForce G80. [Diagram: the pipeline stages shown beside the GPU; GPU blocks: Scheduler, 16 Cores, 8 Tex units, Clip / Cull / Rasterize, 6 Pixel Ops units]
32 GRAPHICS ARCHITECTURE GPUs as Throughput-Oriented Hardware
33 Architecture: CPU Evolution. Single stream of instructions, REALLY FAST. Long, deep pipelines. Branch Prediction & Speculative Execution. Hierarchy of Caches. Instruction Level Parallelism (ILP).
34 Architecture G5 (2003)
35
36 Architecture: G5 (2003). 2 GHz; 1 GHz FSB; 4GB RAM; 2 FPUs; 50 million transistors; 215 inst. pipeline; Branch Prediction.
37 Architecture: G5 (2003) vs. Fermi (2010)
    G5 (2003): 2 GHz; 1 GHz FSB; 4GB RAM; 2 FPUs; 50 million transistors; 215 inst. pipeline; Branch Prediction
    Fermi (2010): 1.4 GHz; 1.8 GHz FSB; 4GB RAM (1.5 in GTX)
38 Architecture: G5 (2003) vs. Fermi (2010)
    G5 (2003): 2 GHz; 1 GHz FSB; 4GB RAM; 2 FPUs; 50 million transistors; 215 inst. pipeline; Branch Prediction
    Fermi (2010): 1.4 GHz; 1.8 GHz FSB; 4GB RAM (1.5 in GTX); 960 FPUs; 3 billion transistors; No Branch Prediction
39 Architecture: Moore's Law. "The number of transistors on an integrated circuit doubles every two years." - Gordon E. Moore
40 Architecture: Moore's Law. "The number of transistors on an integrated circuit doubles every two years." - Gordon E. Moore. What Matters: How we use these transistors.
41 Architecture Buy Performance with Power
42 Architecture: Serial Performance Scaling. Cannot continue to scale MHz: there is no 10 GHz chip. Cannot increase power consumption per area: we're melting chips. Can continue to increase transistor count.
43 Architecture: Using Transistors. Instruction-level parallelism: out-of-order execution, speculation, branch prediction. Data-level parallelism: vector units, SIMD execution (SSE, AVX, Cell SPE, Clearspeed). Thread-level parallelism: multithreading, multicore, manycore.
44 Architecture Why Massively Parallel Processing? Computation Power of Graphic Processing Units
45 Architecture Why Massively Parallel Processing? Memory Throughput of Graphic Processing Units
46 Architecture: Why Massively Parallel Processing? How can this be? Remove transistors dedicated to the speed of a single stream of instructions: out-of-order execution, speculation, caches, branch prediction. CPU: minimize latency of an individual thread. GPU: maximize throughput of all threads. More memory bandwidth, more compute. Nothing else on the card! Simple design.
47 From Shader Code to a Teraflop: How Shader Cores Work. Kayvon Fatahalian, Stanford University. SIGGRAPH 2009: Beyond Programmable Shading
48 What's in a GPU? Heterogeneous chip multi-processor (highly tuned for graphics). [Diagram: eight Shader Cores, four Tex units, Input Assembly, Rasterizer, Output Blend, Video Decode, Work Distributor; HW or SW?]
49 A diffuse reflectance shader. Independent, but no explicit parallelism.
    sampler mySamp;
    Texture2D<float3> myTex;
    float3 lightDir;

    float4 diffuseShader(float3 norm, float2 uv)
    {
      float3 kd;
      kd = myTex.Sample(mySamp, uv);
      kd *= clamp( dot(lightDir, norm), 0.0, 1.0);
      return float4(kd, 1.0);
    }
50 Compile shader. 1 unshaded fragment input record goes in; 1 shaded fragment output record comes out. The diffuseShader source from slide 49 compiles to:
    <diffuseShader>:
    sample r0, v4, t0, s0
    mul  r3, v0, cb0[0]
    madd r3, v1, cb0[1], r3
    madd r3, v2, cb0[2], r3
    clmp r3, r3, l(0.0), l(1.0)
    mul  o0, r0, r3
    mul  o1, r1, r3
    mul  o2, r2, r3
    mov  o3, l(1.0)
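For readers more familiar with CUDA than with shader languages, the diffuse reflectance shader from slides 49 and 50 can be approximated as a kernel in which one thread shades one fragment. This is a sketch under assumptions, not the deck's code: the texture lookup is replaced by a plain array of per-fragment kd values, and the host helper shadeOnGpu is hypothetical.

```cuda
#include <cuda_runtime.h>
#include <vector>

// One thread shades one fragment, mirroring the slide's diffuseShader.
// Assumption: kd is read from an array instead of myTex.Sample(mySamp, uv).
__global__ void diffuseShader(const float3* norm, const float3* kdIn,
                              float3 lightDir, float4* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    float3 nrm = norm[i];
    // dot(lightDir, norm), then clamp to [0, 1]
    float d = lightDir.x * nrm.x + lightDir.y * nrm.y + lightDir.z * nrm.z;
    d = fminf(fmaxf(d, 0.0f), 1.0f);

    float3 kd = kdIn[i];
    out[i] = make_float4(kd.x * d, kd.y * d, kd.z * d, 1.0f);
}

// Hypothetical host helper. Both input vectors must have the same length.
// Error checking is omitted for brevity.
std::vector<float4> shadeOnGpu(const std::vector<float3>& normals,
                               const std::vector<float3>& kd,
                               float3 lightDir)
{
    int n = static_cast<int>(normals.size());
    std::vector<float4> colors(n);
    if (n == 0 || kd.size() != normals.size()) return colors;

    float3* dNorm = nullptr;
    float3* dKd = nullptr;
    float4* dOut = nullptr;
    cudaMalloc(&dNorm, n * sizeof(float3));
    cudaMalloc(&dKd, n * sizeof(float3));
    cudaMalloc(&dOut, n * sizeof(float4));
    cudaMemcpy(dNorm, normals.data(), n * sizeof(float3),
               cudaMemcpyHostToDevice);
    cudaMemcpy(dKd, kd.data(), n * sizeof(float3), cudaMemcpyHostToDevice);

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    diffuseShader<<<blocks, threads>>>(dNorm, dKd, lightDir, dOut, n);

    cudaMemcpy(colors.data(), dOut, n * sizeof(float4),
               cudaMemcpyDeviceToHost);
    cudaFree(dNorm);
    cudaFree(dKd);
    cudaFree(dOut);
    return colors;
}
```

As on the slide, the per-fragment code contains no explicit parallelism; the parallelism comes entirely from launching it once per fragment.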
51 Execute shader. [Diagram: one core with Fetch/Decode, ALU (Execute), and Execution Context, stepping through the compiled <diffuseShader> instructions from slide 50]
52 CPU-style cores. [Diagram: Fetch/Decode, ALU (Execute), Execution Context, plus out-of-order control logic, fancy branch predictor, memory pre-fetcher, and data cache (a big one)]
53 Slimming down. Idea #1: Remove components that help a single instruction stream run fast. [Diagram: Fetch/Decode, ALU (Execute), Execution Context]
54 Two cores (two fragments in parallel). [Diagram: two cores, each with Fetch/Decode, ALU (Execute), and Execution Context, running the compiled <diffuseShader> on fragment 1 and fragment 2]
55 Four cores (four fragments in parallel). [Diagram: four cores, each with Fetch/Decode, ALU (Execute), and Execution Context]
56 Sixteen cores (sixteen fragments in parallel). 16 cores = 16 simultaneous instruction streams. [Diagram: sixteen ALUs]
57 Instruction stream sharing. But many fragments should be able to share an instruction stream! [Listing: the compiled <diffuseShader> from slide 50]
58 Recall: simple processing core. [Diagram: Fetch/Decode, ALU (Execute), Execution Context]
59 Add ALUs. Idea #2: Amortize cost/complexity of managing an instruction stream across many ALUs. SIMD processing. [Diagram: Fetch/Decode, ALU 1 to ALU 8, eight Ctx blocks, Shared Ctx Data]
60 Modifying the shader. Original compiled shader: processes one fragment using scalar ops on scalar registers. [Diagram as on slide 59, with the compiled <diffuseShader> listing from slide 50]
61 Modifying the shader. New compiled shader: processes 8 fragments using vector ops on vector registers. [Diagram as on slide 59]
    <VEC8_diffuseShader>:
    VEC8_sample vec_r0, vec_v4, t0, vec_s0
    VEC8_mul  vec_r3, vec_v0, cb0[0]
    VEC8_madd vec_r3, vec_v1, cb0[1], vec_r3
    VEC8_madd vec_r3, vec_v2, cb0[2], vec_r3
    VEC8_clmp vec_r3, vec_r3, l(0.0), l(1.0)
    VEC8_mul  vec_o0, vec_r0, vec_r3
    VEC8_mul  vec_o1, vec_r1, vec_r3
    VEC8_mul  vec_o2, vec_r2, vec_r3
    VEC8_mov  vec_o3, l(1.0)
62 Modifying the shader. [Diagram and <VEC8_diffuseShader> listing as on slide 61]
63 128 fragments in parallel. 16 cores = 128 ALUs = 16 simultaneous instruction streams.
64 128 [vertices / fragments / primitives / CUDA threads / OpenCL work items / compute shader threads] in parallel. [Diagram: cores processing a mix of primitives, vertices, and fragments]
65 What is the problem?
66 But what about branches? [Diagram: time (clocks) against ALU 1 ... ALU 8, with per-ALU branch outcomes T T F T F F F F]
    <unconditional shader code>
    if (x > 0) {
        y = pow(x, exp);
        y *= Ks;
        refl = y + Ka;
    } else {
        x = 0;
        refl = Ka;
    }
    <resume unconditional shader code>
67 But what about branches? Not all ALUs do useful work! Worst case: 1/8 performance. [Diagram and code as on slide 66]
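The effect of a divergent branch can be observed directly in CUDA. The sketch below runs the branch from the slide on a single warp and uses the warp vote intrinsic __ballot_sync to count how many lanes take the "if" side; the remaining lanes idle while that side executes, and vice versa. Assumptions: NVIDIA warps are 32 lanes wide rather than the 8 ALUs drawn on the slide, the constants for exp, Ks, and Ka are arbitrary, and the names branchyShader and runBranchyWarp are hypothetical.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Must be launched with exactly one block of 32 threads (one full warp).
__global__ void branchyShader(const float* xIn, float* refl,
                              int* lanesTakingIf,
                              float e, float Ks, float Ka)
{
    int lane = threadIdx.x;
    float x = xIn[lane];

    // Every lane votes; bit k of the mask is set when lane k has x > 0.
    unsigned int mask = __ballot_sync(0xFFFFFFFFu, x > 0.0f);
    if (lane == 0) *lanesTakingIf = __popc(mask);

    float r;
    if (x > 0.0f) {   // lanes with x <= 0 do no useful work on this side
        float y = powf(x, e);
        y *= Ks;
        r = y + Ka;
    } else {          // lanes with x > 0 do no useful work on this side
        r = Ka;
    }
    refl[lane] = r;
}

// Hypothetical host helper. Returns the number of lanes that took the "if"
// side, or -1 if the input does not hold exactly 32 values.
// Error checking is omitted for brevity.
int runBranchyWarp(const std::vector<float>& x, std::vector<float>& reflOut)
{
    const int warp = 32;
    if (x.size() != 32u) return -1;
    reflOut.assign(warp, 0.0f);

    float* dX = nullptr;
    float* dRefl = nullptr;
    int* dCount = nullptr;
    cudaMalloc(&dX, warp * sizeof(float));
    cudaMalloc(&dRefl, warp * sizeof(float));
    cudaMalloc(&dCount, sizeof(int));
    cudaMemcpy(dX, x.data(), warp * sizeof(float), cudaMemcpyHostToDevice);

    // Arbitrary illustrative constants: exponent 2, Ks = 0.5, Ka = 0.25.
    branchyShader<<<1, warp>>>(dX, dRefl, dCount, 2.0f, 0.5f, 0.25f);

    int count = 0;
    cudaMemcpy(&count, dCount, sizeof(int), cudaMemcpyDeviceToHost);
    cudaMemcpy(reflOut.data(), dRefl, warp * sizeof(float),
               cudaMemcpyDeviceToHost);
    cudaFree(dX);
    cudaFree(dRefl);
    cudaFree(dCount);
    return count;
}
```

If the returned count is c, then roughly c of 32 lanes are useful during the "if" side and 32 - c during the "else" side. The slide's worst case corresponds to a single lane taking an expensive path while all the others wait.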
68 Plenty more intricacies! No time now; see the Beyond Programmable Shading SIGGRAPH talk. Think of a GPU as a multi-core processor optimized for maximum throughput: many SIMD cores working together.
69 An efficient GPU workload... Thousands of independent pieces of work: uses many ALUs on many cores. Amenable to instruction stream sharing: uses SIMD instructions. Compute-heavy: lots of math for each memory access.
70 GF100 Architecture. 30 SMs on GTX480. Resources: 2 x 16 Cores; 16 Load/Store units; 4 Special Functions (Sin/Cos/Sqrt); 16 Double Precision units. 1.44 TeraFLOPS.
71 GF100 Architecture. Mental model: on every clock cycle, you can assign an instruction to two of these resources. Resources: 2 x 16 Cores; 16 Load/Store units; 4 Special Functions; 16 Double Precision units.
72 GPGPU General Purpose Computing on GPUs
73 GPGPU: CUDA Stream Programming. C/C++ extended with: Kernels (functions executed N times in parallel); CPU/GPU Synchronization; GPU Memory Management.
74 GPGPU CUDA Stream Programming
75 Example: Vector Addition Kernel (Device Code highlighted)
    // Compute vector sum C = A+B
    // Each thread performs one pair-wise addition
    __global__ void vecAdd(float* A, float* B, float* C)
    {
        int i = threadIdx.x + blockDim.x * blockIdx.x;
        C[i] = A[i] + B[i];
    }

    int main()
    {
        // Run grid of N/256 blocks of 256 threads each
        vecAdd<<< N/256, 256>>>(d_A, d_B, d_C);
    }
76 Example: Vector Addition Kernel (Host Code highlighted). Same listing as slide 75; main() is the host code that launches the kernel.
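The slide's listing leaves out the memory management and synchronization that CUDA host code needs around the launch. A fuller sketch of the same vector addition follows. It differs from the slide in two labeled ways: the kernel takes an extra length parameter n with a bounds check, and the grid size is rounded up, so N does not have to be a multiple of 256. The host helper vecAddOnGpu is a hypothetical name.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Compute vector sum C = A+B. Each thread performs one pair-wise addition.
// The bounds check on n is an addition to the slide's kernel.
__global__ void vecAdd(const float* A, const float* B, float* C, int n)
{
    int i = threadIdx.x + blockDim.x * blockIdx.x;
    if (i < n) C[i] = A[i] + B[i];
}

// Hypothetical host helper showing the three CUDA extensions in order:
// GPU memory management, a kernel launch, and CPU/GPU synchronization.
// Error checking is omitted for brevity.
std::vector<float> vecAddOnGpu(const std::vector<float>& a,
                               const std::vector<float>& b)
{
    int n = static_cast<int>(a.size());
    std::vector<float> c(n);
    if (n == 0 || b.size() != a.size()) return c;
    size_t bytes = n * sizeof(float);

    // GPU memory management: allocate device buffers and upload the inputs.
    float* d_A = nullptr;
    float* d_B = nullptr;
    float* d_C = nullptr;
    cudaMalloc(&d_A, bytes);
    cudaMalloc(&d_B, bytes);
    cudaMalloc(&d_C, bytes);
    cudaMemcpy(d_A, a.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_B, b.data(), bytes, cudaMemcpyHostToDevice);

    // Kernel launch: enough blocks of 256 threads to cover all n elements.
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    vecAdd<<<blocks, threads>>>(d_A, d_B, d_C, n);

    // CPU/GPU synchronization: the launch returns immediately, so wait for
    // the kernel before reading results. (The cudaMemcpy below would also
    // wait; the explicit call makes the step visible.)
    cudaDeviceSynchronize();

    cudaMemcpy(c.data(), d_C, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_A);
    cudaFree(d_B);
    cudaFree(d_C);
    return c;
}
```

With the slide's N/256 grid size and no bounds check, an N that is not a multiple of 256 would leave the tail of the arrays unprocessed; rounding up and guarding with i < n avoids that.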
77
78 Kernel Variations and Output
    __global__ void kernel( int *a ) { int idx = blockIdx.x*blockDim.x + threadIdx.x; a[idx] = 7; } Output:
    __global__ void kernel( int *a ) { int idx = blockIdx.x*blockDim.x + threadIdx.x; a[idx] = blockIdx.x; } Output:
    __global__ void kernel( int *a ) { int idx = blockIdx.x*blockDim.x + threadIdx.x; a[idx] = threadIdx.x; } Output:
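What the three kernel variations write depends on the launch configuration, which the transcription does not record. The sketch below assumes a launch of 4 blocks of 4 threads over a 16-element array; that configuration, the distinct kernel names, and the helper runVariant are assumptions made so the three kernels can live in one file.

```cuda
#include <cuda_runtime.h>
#include <vector>

// The three variations, renamed so they can coexist in one file.
__global__ void kernelConstant(int* a)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    a[idx] = 7;
}

__global__ void kernelBlockIdx(int* a)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    a[idx] = blockIdx.x;
}

__global__ void kernelThreadIdx(int* a)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    a[idx] = threadIdx.x;
}

// Hypothetical host helper: variant 0, 1, or 2 selects the kernel.
// Assumed launch configuration: 4 blocks of 4 threads, 16 elements.
// Error checking is omitted for brevity.
std::vector<int> runVariant(int variant)
{
    const int blocks = 4;
    const int threads = 4;
    const int n = blocks * threads;
    std::vector<int> host(n, -1);

    int* d = nullptr;
    cudaMalloc(&d, n * sizeof(int));
    switch (variant) {
        case 0:  kernelConstant<<<blocks, threads>>>(d);  break;
        case 1:  kernelBlockIdx<<<blocks, threads>>>(d);  break;
        default: kernelThreadIdx<<<blocks, threads>>>(d); break;
    }
    cudaMemcpy(host.data(), d, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d);
    return host;
}
```

Under this assumed configuration the arrays come back as:
variant 0: 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7
variant 1: 0 0 0 0 1 1 1 1 2 2 2 2 3 3 3 3
variant 2: 0 1 2 3 0 1 2 3 0 1 2 3 0 1 2 3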
79 Example: Shuffling Data
    // Reorder values based on keys
    // Each thread moves one element
    __global__ [kernel signature lost in extraction]
    {
        int i = threadIdx.x + blockDim.x * blockIdx.x;
        new_array[i] = prev_array[indices[i]];
    }

    Host Code:
    int main()
    {
        [kernel launch lost in extraction]
    }
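The surviving body of the shuffle kernel is a gather: thread i reads prev_array at position indices[i] and writes the value to new_array[i]. A complete sketch follows. The array names come from the slide; the kernel name shuffleKernel, the int element type, the length parameter n, and the host helper shuffleOnGpu are assumptions, since the signature and launch did not survive in the transcription.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Reorder values based on keys. Each thread moves one element.
// Assumed signature; only the body is taken from the slide.
__global__ void shuffleKernel(const int* prev_array, int* new_array,
                              const int* indices, int n)
{
    int i = threadIdx.x + blockDim.x * blockIdx.x;
    if (i < n) new_array[i] = prev_array[indices[i]];
}

// Hypothetical host helper. Every entry of indices must be a valid
// position in values. Error checking is omitted for brevity.
std::vector<int> shuffleOnGpu(const std::vector<int>& values,
                              const std::vector<int>& indices)
{
    int n = static_cast<int>(indices.size());
    std::vector<int> result(n);
    if (n == 0 || values.empty()) return result;

    int* d_old = nullptr;
    int* d_new = nullptr;
    int* d_ind = nullptr;
    cudaMalloc(&d_old, values.size() * sizeof(int));
    cudaMalloc(&d_new, n * sizeof(int));
    cudaMalloc(&d_ind, n * sizeof(int));
    cudaMemcpy(d_old, values.data(), values.size() * sizeof(int),
               cudaMemcpyHostToDevice);
    cudaMemcpy(d_ind, indices.data(), n * sizeof(int),
               cudaMemcpyHostToDevice);

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    shuffleKernel<<<blocks, threads>>>(d_old, d_new, d_ind, n);

    cudaMemcpy(result.data(), d_new, n * sizeof(int),
               cudaMemcpyDeviceToHost);
    cudaFree(d_old);
    cudaFree(d_new);
    cudaFree(d_ind);
    return result;
}
```

Every thread writes a distinct output element, so no two threads conflict even when several of them read the same input element.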
80 GPGPU Alternatives. OpenCL: attempts to be the OpenGL of GPGPU; almost identical to CUDA. Need for higher level languages: Jacket for MATLAB, PyCUDA.
81 The Future Massively Multi-Core Processors
82 The Future: MultiCore is Dead. You have an array of six-core CPUs in your rack? You're going to feel pretty stupid when all the cool admins start popping eight-core chips.
83 The Future: Many-Core? Number of cores so large that: traditional caching models don't work; cannot keep coherent cache. Network on a chip?
84 The Future: Memory Models & More. Haven't even touched it. Coherent and non-coherent caches. Uniform vs Non-Uniform Memory Access? TM? Special Purpose Hardware? Schedulers? Programming Languages.
85 The Future. Intel Nehalem: up to 12 cores; 30% lower power usage; similar programming abstractions: 12 cores, each with 128-bit wide SIMD units (SSE). Intel SandyBridge: 256-bit wide SIMD units (AVX).
86 The Future: Parallelism Everywhere. Putting all those transistors to use: Many ALUs, Many Cores, Intricate Cache Hierarchies. Very difficult to program. Graphics is way ahead of the game.
87 The Future. Learn More: The Landscape of Parallel Computing Research
88 Thank you! Acknowledgements. Kayvon Fatahalian: many of these slides are inspired by or copied from him. Mike Houston: CS448s Beyond Programmable Shading. Jared Hoberock & David Tarjan: CS193g Programming massively parallel processors (I TA'd this last quarter).
More informationGPU Programming. Lecture 1: Introduction. Miaoqing Huang University of Arkansas 1 / 27
1 / 27 GPU Programming Lecture 1: Introduction Miaoqing Huang University of Arkansas 2 / 27 Outline Course Introduction GPUs as Parallel Computers Trend and Design Philosophies Programming and Execution
More informationGraphics Hardware. Instructor Stephen J. Guy
Instructor Stephen J. Guy Overview What is a GPU Evolution of GPU GPU Design Modern Features Programmability! Programming Examples Overview What is a GPU Evolution of GPU GPU Design Modern Features Programmability!
More informationOptimizing DirectX Graphics. Richard Huddy European Developer Relations Manager
Optimizing DirectX Graphics Richard Huddy European Developer Relations Manager Some early observations Bear in mind that graphics performance problems are both commoner and rarer than you d think The most
More informationGPU for HPC. October 2010
GPU for HPC Simone Melchionna Jonas Latt Francis Lapique October 2010 EPFL/ EDMX EPFL/EDMX EPFL/DIT simone.melchionna@epfl.ch jonas.latt@epfl.ch francis.lapique@epfl.ch 1 Moore s law: in the old days,
More informationGPU A rchitectures Architectures Patrick Neill May
GPU Architectures Patrick Neill May 30, 2014 Outline CPU versus GPU CUDA GPU Why are they different? Terminology Kepler/Maxwell Graphics Tiled deferred rendering Opportunities What skills you should know
More informationGPGPU introduction and network applications. PacketShaders, SSLShader
GPGPU introduction and network applications PacketShaders, SSLShader Agenda GPGPU Introduction Computer graphics background GPGPUs past, present and future PacketShader A GPU-Accelerated Software Router
More informationMultimedia in Mobile Phones. Architectures and Trends Lund
Multimedia in Mobile Phones Architectures and Trends Lund 091124 Presentation Henrik Ohlsson Contact: henrik.h.ohlsson@stericsson.com Working with multimedia hardware (graphics and displays) at ST- Ericsson
More informationParallel Programming on Larrabee. Tim Foley Intel Corp
Parallel Programming on Larrabee Tim Foley Intel Corp Motivation This morning we talked about abstractions A mental model for GPU architectures Parallel programming models Particular tools and APIs This
More informationIntroduction to Parallel Programming For Real-Time Graphics (CPU + GPU)
Introduction to Parallel Programming For Real-Time Graphics (CPU + GPU) Aaron Lefohn, Intel / University of Washington Mike Houston, AMD / Stanford 1 What s In This Talk? Overview of parallel programming
More informationComputer Architecture
Computer Architecture Slide Sets WS 2013/2014 Prof. Dr. Uwe Brinkschulte M.Sc. Benjamin Betting Part 10 Thread and Task Level Parallelism Computer Architecture Part 10 page 1 of 36 Prof. Dr. Uwe Brinkschulte,
More informationGRAPHICS PROCESSING UNITS
GRAPHICS PROCESSING UNITS Slides by: Pedro Tomás Additional reading: Computer Architecture: A Quantitative Approach, 5th edition, Chapter 4, John L. Hennessy and David A. Patterson, Morgan Kaufmann, 2011
More informationReal-time Graphics 9. GPGPU
9. GPGPU GPGPU GPU (Graphics Processing Unit) Flexible and powerful processor Programmability, precision, power Parallel processing CPU Increasing number of cores Parallel processing GPGPU general-purpose
More informationGPU Computation Strategies & Tricks. Ian Buck NVIDIA
GPU Computation Strategies & Tricks Ian Buck NVIDIA Recent Trends 2 Compute is Cheap parallelism to keep 100s of ALUs per chip busy shading is highly parallel millions of fragments per frame 0.5mm 64-bit
More informationBeyond Programmable Shading Keeping Many Cores Busy: Scheduling the Graphics Pipeline
Keeping Many s Busy: Scheduling the Graphics Pipeline Jonathan Ragan-Kelley, MIT CSAIL 29 July 2010 This talk How to think about scheduling GPU-style pipelines Four constraints which drive scheduling decisions
More informationAnatomy of AMD s TeraScale Graphics Engine
Anatomy of AMD s TeraScale Graphics Engine Mike Houston Design Goals Focus on Efficiency f(perf/watt, Perf/$) Scale up processing power and AA performance Target >2x previous generation Enhance stream
More informationReal-Time Reyes: Programmable Pipelines and Research Challenges. Anjul Patney University of California, Davis
Real-Time Reyes: Programmable Pipelines and Research Challenges Anjul Patney University of California, Davis Real-Time Reyes-Style Adaptive Surface Subdivision Anjul Patney and John D. Owens SIGGRAPH Asia
More informationPat Hanrahan. Modern Graphics Pipeline. How Powerful are GPUs? Application. Command. Geometry. Rasterization. Fragment. Display.
How Powerful are GPUs? Pat Hanrahan Computer Science Department Stanford University Computer Forum 2007 Modern Graphics Pipeline Application Command Geometry Rasterization Texture Fragment Display Page
More informationECE 571 Advanced Microprocessor-Based Design Lecture 20
ECE 571 Advanced Microprocessor-Based Design Lecture 20 Vince Weaver http://www.eece.maine.edu/~vweaver vincent.weaver@maine.edu 12 April 2016 Project/HW Reminder Homework #9 was posted 1 Raspberry Pi
More informationGPGPUs in HPC. VILLE TIMONEN Åbo Akademi University CSC
GPGPUs in HPC VILLE TIMONEN Åbo Akademi University 2.11.2010 @ CSC Content Background How do GPUs pull off higher throughput Typical architecture Current situation & the future GPGPU languages A tale of
More informationShaders. Slide credit to Prof. Zwicker
Shaders Slide credit to Prof. Zwicker 2 Today Shader programming 3 Complete model Blinn model with several light sources i diffuse specular ambient How is this implemented on the graphics processor (GPU)?
More informationFrom Brook to CUDA. GPU Technology Conference
From Brook to CUDA GPU Technology Conference A 50 Second Tutorial on GPU Programming by Ian Buck Adding two vectors in C is pretty easy for (i=0; i
More informationA Data-Parallel Genealogy: The GPU Family Tree
A Data-Parallel Genealogy: The GPU Family Tree Department of Electrical and Computer Engineering Institute for Data Analysis and Visualization University of California, Davis Outline Moore s Law brings
More informationGPU Architecture. Samuli Laine NVIDIA Research
GPU Architecture Samuli Laine NVIDIA Research Today The graphics pipeline: Evolution of the GPU Throughput-optimized parallel processor design I.e., the GPU Contrast with latency-optimized (CPU-like) design
More informationCE 431 Parallel Computer Architecture Spring Graphics Processor Units (GPU) Architecture
CE 431 Parallel Computer Architecture Spring 2017 Graphics Processor Units (GPU) Architecture Nikos Bellas Computer and Communications Engineering Department University of Thessaly Some slides borrowed
More informationReal-time Graphics 9. GPGPU
Real-time Graphics 9. GPGPU GPGPU GPU (Graphics Processing Unit) Flexible and powerful processor Programmability, precision, power Parallel processing CPU Increasing number of cores Parallel processing
More informationMotivation Hardware Overview Programming model. GPU computing. Part 1: General introduction. Ch. Hoelbling. Wuppertal University
Part 1: General introduction Ch. Hoelbling Wuppertal University Lattice Practices 2011 Outline 1 Motivation 2 Hardware Overview History Present Capabilities 3 Programming model Past: OpenGL Present: CUDA
More informationNumerical Simulation on the GPU
Numerical Simulation on the GPU Roadmap Part 1: GPU architecture and programming concepts Part 2: An introduction to GPU programming using CUDA Part 3: Numerical simulation techniques (grid and particle
More informationScanline Rendering 2 1/42
Scanline Rendering 2 1/42 Review 1. Set up a Camera the viewing frustum has near and far clipping planes 2. Create some Geometry made out of triangles 3. Place the geometry in the scene using Transforms
More informationIntroduction to GPU hardware and to CUDA
Introduction to GPU hardware and to CUDA Philip Blakely Laboratory for Scientific Computing, University of Cambridge Philip Blakely (LSC) GPU introduction 1 / 35 Course outline Introduction to GPU hardware
More informationLecture 6: Texture. Kayvon Fatahalian CMU : Graphics and Imaging Architectures (Fall 2011)
Lecture 6: Texture Kayvon Fatahalian CMU 15-869: Graphics and Imaging Architectures (Fall 2011) Today: texturing! Texture filtering - Texture access is not just a 2D array lookup ;-) Memory-system implications
More informationThe NVIDIA GeForce 8800 GPU
The NVIDIA GeForce 8800 GPU August 2007 Erik Lindholm / Stuart Oberman Outline GeForce 8800 Architecture Overview Streaming Processor Array Streaming Multiprocessor Texture ROP: Raster Operation Pipeline
More informationCompute-mode GPU Programming Interfaces
Lecture 8: Compute-mode GPU Programming Interfaces Visual Computing Systems What is a programming model? Programming models impose structure A programming model provides a set of primitives/abstractions
More informationJosef Pelikán, Jan Horáček CGG MFF UK Praha
GPGPU and CUDA 2012-2018 Josef Pelikán, Jan Horáček CGG MFF UK Praha pepca@cgg.mff.cuni.cz http://cgg.mff.cuni.cz/~pepca/ 1 / 41 Content advances in hardware multi-core vs. many-core general computing
More informationCS GPU and GPGPU Programming Lecture 2: Introduction; GPU Architecture 1. Markus Hadwiger, KAUST
CS 380 - GPU and GPGPU Programming Lecture 2: Introduction; GPU Architecture 1 Markus Hadwiger, KAUST Reading Assignment #2 (until Feb. 17) Read (required): GLSL book, chapter 4 (The OpenGL Programmable
More informationNext-Generation Graphics on Larrabee. Tim Foley Intel Corp
Next-Generation Graphics on Larrabee Tim Foley Intel Corp Motivation The killer app for GPGPU is graphics We ve seen Abstract models for parallel programming How those models map efficiently to Larrabee
More informationScheduling the Graphics Pipeline on a GPU
Lecture 20: Scheduling the Graphics Pipeline on a GPU Visual Computing Systems Today Real-time 3D graphics workload metrics Scheduling the graphics pipeline on a modern GPU Quick aside: tessellation Triangle
More informationEvolution of GPUs Chris Seitz
Evolution of GPUs Chris Seitz Overview Concepts: Real-time rendering Hardware graphics pipeline Evolution of the PC hardware graphics pipeline: 1995-1998: Texture mapping and z-buffer 1998: Multitexturing
More informationThis Unit: Putting It All Together. CIS 371 Computer Organization and Design. What is Computer Architecture? Sources
This Unit: Putting It All Together CIS 371 Computer Organization and Design Unit 15: Putting It All Together: Anatomy of the XBox 360 Game Console Application OS Compiler Firmware CPU I/O Memory Digital
More informationSpring 2010 Prof. Hyesoon Kim. AMD presentations from Richard Huddy and Michael Doggett
Spring 2010 Prof. Hyesoon Kim AMD presentations from Richard Huddy and Michael Doggett Radeon 2900 2600 2400 Stream Processors 320 120 40 SIMDs 4 3 2 Pipelines 16 8 4 Texture Units 16 8 4 Render Backens
More informationCS130 : Computer Graphics. Tamar Shinar Computer Science & Engineering UC Riverside
CS130 : Computer Graphics Tamar Shinar Computer Science & Engineering UC Riverside Raster Devices and Images Raster Devices Hearn, Baker, Carithers Raster Display Transmissive vs. Emissive Display anode
More information2.11 Particle Systems
2.11 Particle Systems 320491: Advanced Graphics - Chapter 2 152 Particle Systems Lagrangian method not mesh-based set of particles to model time-dependent phenomena such as snow fire smoke 320491: Advanced
More informationPreparing seismic codes for GPUs and other
Preparing seismic codes for GPUs and other many-core architectures Paulius Micikevicius Developer Technology Engineer, NVIDIA 2010 SEG Post-convention Workshop (W-3) High Performance Implementations of
More information