CS GPU and GPGPU Programming Lecture 3: GPU Architecture 2. Markus Hadwiger, KAUST

Similar documents
CS GPU and GPGPU Programming Lecture 3: GPU Architecture 2. Markus Hadwiger, KAUST

CS GPU and GPGPU Programming Lecture 7: Shading and Compute APIs 1. Markus Hadwiger, KAUST

GPU Architecture. Robert Strzodka (MPII), Dominik Göddeke G. TUDo), Dominik Behr (AMD)

From Shader Code to a Teraflop: How Shader Cores Work

CS GPU and GPGPU Programming Lecture 2: Introduction; GPU Architecture 1. Markus Hadwiger, KAUST

From Shader Code to a Teraflop: How GPU Shader Cores Work. Jonathan Ragan- Kelley (Slides by Kayvon Fatahalian)

Real-Time Rendering Architectures

Graphics Hardware. Graphics Processing Unit (GPU) is a Subsidiary hardware. With massively multi-threaded many-core. Dedicated to 2D and 3D graphics

Lecture 6: Texture. Kayvon Fatahalian CMU : Graphics and Imaging Architectures (Fall 2011)

CS GPU and GPGPU Programming Lecture 8+9: GPU Architecture 7+8. Markus Hadwiger, KAUST

Intro to Shading Languages

Tutorial on GPU Programming #2. Joong-Youn Lee Supercomputing Center, KISTI

The Rasterization Pipeline

Basics of GPU-Based Programming

3D buzzwords. Adding programmability to the pipeline 6/7/16. Bandwidth Gravity of modern computer systems

Real - Time Rendering. Graphics pipeline. Michal Červeňanský Juraj Starinský

CS GPU and GPGPU Programming Lecture 11: GPU Texturing 1. Markus Hadwiger, KAUST

X. GPU Programming. Jacobs University Visualization and Computer Graphics Lab : Advanced Graphics - Chapter X 1

CS427 Multicore Architecture and Parallel Computing

Pipeline Operations. CS 4620 Lecture Steve Marschner. Cornell CS4620 Spring 2018 Lecture 11

GPU Programmable Graphics Pipeline

GpuPy: Accelerating NumPy With a GPU

GLSL Introduction. Fu-Chung Huang. Thanks for materials from many other people

Programmable GPUs. Real Time Graphics 11/13/2013. Nalu 2004 (NVIDIA Corporation) GeForce 6. Virtua Fighter 1995 (SEGA Corporation) NV1

Graphics Processing Unit Architecture (GPU Arch)

Pipeline Operations. CS 4620 Lecture 14

Shaders (some slides taken from David M. course)

Mattan Erez. The University of Texas at Austin

Lecture 10: Shading Languages. Kayvon Fatahalian CMU : Graphics and Imaging Architectures (Fall 2011)

Cg 2.0. Mark Kilgard

1.2.3 The Graphics Hardware Pipeline

CS GPU and GPGPU Programming Lecture 12: GPU Texturing 1. Markus Hadwiger, KAUST

Supplement to Lecture 22

Introduction to Programmable GPUs CPSC 314. Introduction to GPU Programming CS314 Gordon Wetzstein, 09/03/09

AMCS / CS 247 Scientific Visualization Lecture 10: (GPU) Texture Mapping. Markus Hadwiger, KAUST

TSBK 07! Computer Graphics! Ingemar Ragnemalm, ISY

Introduction to Programmable GPUs CPSC 314. Real Time Graphics

Lecture 2. Shaders, GLSL and GPGPU

Lecture 17: Shading in OpenGL. CITS3003 Graphics & Animation

Efficient and Scalable Shading for Many Lights

OpenGL Programmable Shaders

GLSL Introduction. Fu-Chung Huang. Thanks for materials from many other people

Real-Time Graphics Architecture

OpenGL Status - November 2013 G-Truc Creation

Lecture 10: Part 1: Discussion of SIMT Abstraction Part 2: Introduction to Shading

CS195V Week 9. GPU Architecture and Other Shading Languages

CS GPU and GPGPU Programming Lecture 16+17: GPU Texturing 1+2. Markus Hadwiger, KAUST

Illumination and Shading

Shaders. Slide credit to Prof. Zwicker

Intro to GPU Programming (OpenGL Shading Language) Cliff Lindsay Ph.D. Student CS WPI

Lecture 23: Domain-Specific Parallel Programming

Lecture 7: The Programmable GPU Core. Kayvon Fatahalian CMU : Graphics and Imaging Architectures (Fall 2011)

The Graphics Pipeline and OpenGL III: OpenGL Shading Language (GLSL 1.10)!

CGT520 Lighting. Lighting. T-vertices. Normal vector. Color of an object can be specified 1) Explicitly as a color buffer

GPU Architecture. Michael Doggett Department of Computer Science Lund university

GPU Target Applications

1 of 5 9/10/ :11 PM

Rendering Subdivision Surfaces Efficiently on the GPU

Mattan Erez. The University of Texas at Austin

EDAF80 Introduction to Computer Graphics. Seminar 3. Shaders. Michael Doggett. Slides by Carl Johan Gribel,

Sung-Eui Yoon ( 윤성의 )

Objectives Shading in OpenGL. Front and Back Faces. OpenGL shading. Introduce the OpenGL shading methods. Discuss polygonal shading

Computer Graphics with OpenGL ES (J. Han) Chapter 6 Fragment shader

Computer Graphics Coursework 1

Programmable Graphics Hardware

Threading Hardware in G80

GPU Architecture & CUDA Programming

Computer Graphics (CS 543) Lecture 8a: Per-Vertex lighting, Shading and Per-Fragment lighting

OpenGl Pipeline. triangles, lines, points, images. Per-vertex ops. Primitive assembly. Texturing. Rasterization. Per-fragment ops.

Lecture 9: Deferred Shading. Visual Computing Systems CMU , Fall 2013

The Graphics Pipeline and OpenGL III: OpenGL Shading Language (GLSL 1.10)!

Introduction to Shaders.

Graphics Hardware. Computer Graphics COMP 770 (236) Spring Instructor: Brandon Lloyd 2/26/07 1

Programmable Graphics Hardware

12.2 Programmable Graphics Hardware

Module Contact: Dr Stephen Laycock, CMP Copyright of the University of East Anglia Version 1

CS4621/5621 Fall Basics of OpenGL/GLSL Textures Basics

Programmable Graphics Hardware

CSE 167: Introduction to Computer Graphics Lecture #6: Lights. Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2014

CIS 536/636 Introduction to Computer Graphics. Kansas State University. CIS 536/636 Introduction to Computer Graphics

General Purpose Computation (CAD/CAM/CAE) on the GPU (a.k.a. Topics in Manufacturing)

Advanced Rendering & Shading

The Rasterization Pipeline

Shading Languages Tim Foley

Fast Third-Order Texture Filtering

Evolution of GPUs Chris Seitz

Programming shaders & GPUs Christian Miller CS Fall 2011

Dave Shreiner, ARM March 2009

ECS 175 COMPUTER GRAPHICS. Ken Joy.! Winter 2014

Feeding the Beast: How to Satiate Your GoForce While Differentiating Your Game

Windowing System on a 3D Pipeline. February 2005

Lecture 11of 41. Surface Detail 2 of 5: Textures OpenGL Shading

The Application Stage. The Game Loop, Resource Management and Renderer Design

Shading Languages for Graphics Hardware

Tutorial on GPU Programming. Joong-Youn Lee Supercomputing Center, KISTI

Programming with OpenGL Part 3: Shaders. Ed Angel Professor of Emeritus of Computer Science University of New Mexico

GRAPHICS HARDWARE. Niels Joubert, 4th August 2010, CS147

Programming Graphics Hardware

CSE 167: Lecture #8: GLSL. Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2012

CS 418: Interactive Computer Graphics. Basic Shading in WebGL. Eric Shaffer

Transcription:

CS 380 - GPU and GPGPU Programming Lecture 3: GPU Architecture 2 Markus Hadwiger, KAUST

Reading Assignment #2 (until Feb. 9) Read (required): GLSL book, chapter 4 (The OpenGL Programmable Pipeline) GPU Gems 2 book, chapter 30 (The GeForce 6 Series GPU Architecture) available online: http://download.nvidia.com/developer/gpu_gems_2/gpu_gems2_ch30.pdf 2

Courtesy Markus Hadwiger, Kayvon KAUSTFatahalian, CMU 3

Courtesy Markus Hadwiger, Kayvon KAUSTFatahalian, CMU 4

Courtesy Markus Hadwiger, Kayvon KAUSTFatahalian, CMU 5

Direct3D 10 Pipeline (~OpenGL 3.2) New geometry shader stage: Vertex -> geometry -> pixel shaders Stream output after geometry shader Courtesy David Blythe, Microsoft 6

Direct3D 11 Pipeline (~OpenGL 4.x) New tessellation stages Hull shader (OpenGL: tessellation control) Tessellator (OpenGL: tessellation primitive generator) Domain shader (OpenGL: tessellation evaluation) Trend of adding new stages likely to continue... (plus shaders outside the graphics pipeline: compute shader)... or full flexibility such as in Intel MIC (Larrabee) architecture? 7

GPU Structure Before Unified Shaders Host Vertex Processors Cull/Clip/Setup Example NVIDIA Geforce 6/7, 2004, 2005 Z-Cull Rasterization Fragment Processors Texture Cache Fragment Crossbar Memory Access Z-Compare and Blending Memory Partition Memory Partition Memory Partition Memory Partition

Legacy Vertex Shading Unit (1) Geforce 3 (NV20), 2001 floating point 4-vector vertex engine still very instructive for understanding GPUs in general Lindholm et al., A User-Programmable Vertex Engine, SIGGRAPH 2001 9

Legacy Vertex Shading Unit (2) Input attributes Code examples swizzling! 10

Legacy Vertex Shading Unit (3) Vector instruction set, very few instructions; no branching yet! 11

Vertex Processor Begin Vertex Vertex Program Instructions Input- Registers copy vertex attributes to input registers Fetch next instruction Read inputor temporary registers Mapping: Negation Swizzling Temporary Registers Output- Registers Execute command Write to output or temp. registers no Finished? yes Emit Vertex

Fast Forward to Programm. Fragment Shading < 1999 Courtesy Mark Kilgard 13

Fast Forward to Programm. Fragment Shading GeForce 256, 1999 Courtesy Mark Kilgard 14

Fast Forward to Programm. Fragment Shading GeForce 3, 2001 Courtesy Mark Kilgard 15

Fast Forward to Programm. Fragment Shading GeForce FX (5), 2003 Courtesy Mark Kilgard 16

Legacy Fragment Shading Unit (1) GeForce 6 (NV40), 2004 dynamic branching 17

Legacy Fragment Shading Unit (2) Example code!!arbfp1.0 ATTRIB unit_tc = fragment.texcoord[ 0 ]; PARAM mvp_inv[] = { state.matrix.mvp.inverse }; PARAM constants = {0, 0.999, 1, 2}; TEMP pos_win, temp; TEX pos_win.z, unit_tc, texture[ 1 ], 2D; ADD pos_win.w, constants.y, -pos_win.z; KIL pos_win.w; MOV result.color.w, pos_win.z; MOV pos_win.xyw, unit_tc; MAD pos_win.xyz, pos_win, constants.a, -constants.b; DP4 temp.w, mvp_inv[ 3 ], pos_win; RCP temp.w, temp.w; MUL pos_win, pos_win, temp.w; DP4 result.color.x, mvp_inv[ 0 ], pos_win; DP4 result.color.y, mvp_inv[ 1 ], pos_win; DP4 result.color.z, mvp_inv[ 2 ], pos_win; END 18

Fragment Processor Begin Fragment Fragment Program Instructions copy fragment attributes to Input register Fetch next instruction Read input of temporary registers Input- Registers Temporary Registers Texture Memory Output Registers Calculate texture address and sample texture interpolate texel color yes Mapping: Negation Swizzling Texture Instruction? no execute instruction Write to output or temporary registers no Finished? yes Emit Fragment

A diffuse reflectance shader sampler mysamp; Texture2D<float3> mytex; float3 lightdir; float4 diffuseshader(float3 norm, float2 uv) { float3 kd; kd = mytex.sample(mysamp, uv); kd *= clamp( dot(lightdir, norm), 0.0, 1.0); return float4(kd, 1.0); } Independent, but no explicit parallelism SIGGRAPH 2009: Beyond Programmable Shading: http://s09.idav.ucdavis.edu/ 20

Compile shader 1 unshaded fragment input record sampler mysamp; Texture2D<float3> mytex; float3 lightdir; float4 diffuseshader(float3 norm, float2 uv) { float3 kd; kd = mytex.sample(mysamp, uv); kd *= clamp ( dot(lightdir, norm), 0.0, 1.0); return float4(kd, 1.0); } <diffuseshader>: sample r0, v4, t0, s0 mul r3, v0, cb0[0] madd r3, v1, cb0[1], r3 madd r3, v2, cb0[2], r3 clmp r3, r3, l(0.0), l(1.0) mul o0, r0, r3 mul o1, r1, r3 mul o2, r2, r3 mov o3, l(1.0) 1 shaded fragment output record SIGGRAPH 2009: Beyond Programmable Shading: http://s09.idav.ucdavis.edu/ 21

Per-Pixel(Fragment) Lighting Simulating smooth surfaces by calculating illumination for each fragment Example: specular highlights (Phong illumination/shading) Phong shading: per-fragment evaluation Gouraud shading: linear interpolation from vertices 22

Per-Pixel Phong Lighting (Cg) void main(float4 position : TEXCOORD0, float3 normal : TEXCOORD1, out float4 ocolor : COLOR, { uniform float3 ambientcol, uniform float3 lightcol, uniform float3 lightpos, uniform float3 eyepos, uniform float3 Ka, uniform float3 Kd, uniform float3 Ks, uniform float shiny)

Per-Pixel Phong Lighting (Cg) float3 P = position.xyz; float3 N = normal; float3 V = normalize(eyeposition - P); float3 H = normalize(l + V); float3 ambient = Ka * ambientcol; float3 L = normalize(lightpos - P); float difflight = max(dot(l, N), 0); float3 diffuse = Kd * lightcol * difflight; float speclight = pow(max(dot(h, N), 0), shiny); float3 specular = Ks * lightcol * speclight; } ocolor.xyz = ambient + diffuse + specular; ocolor.w = 1;

Thank you.