Graphics Hardware. Computer Graphics COMP 770 (236) Spring Instructor: Brandon Lloyd 2/26/07 1

Similar documents
Sung-Eui Yoon ( 윤성의 )

Graphics Hardware. Instructor Stephen J. Guy

Graphics Processing Unit Architecture (GPU Arch)

Graphics Hardware, Graphics APIs, and Computation on GPUs. Mark Segal

Lecture 2. Shaders, GLSL and GPGPU

Real - Time Rendering. Graphics pipeline. Michal Červeňanský Juraj Starinský

Introduction to the OpenGL Shading Language

X. GPU Programming. Jacobs University Visualization and Computer Graphics Lab : Advanced Graphics - Chapter X 1

Introduction to Shaders.

E.Order of Operations

Real-Time Rendering (Echtzeitgraphik) Michael Wimmer

Programmable Graphics Hardware

Today. Rendering - III. Outline. Texturing: The 10,000m View. Texture Coordinates. Specifying Texture Coordinates in GL

Tutorial on GPU Programming #2. Joong-Youn Lee Supercomputing Center, KISTI

Programmable GPUs. Real Time Graphics 11/13/2013. Nalu 2004 (NVIDIA Corporation) GeForce 6. Virtua Fighter 1995 (SEGA Corporation) NV1

CS GPU and GPGPU Programming Lecture 2: Introduction; GPU Architecture 1. Markus Hadwiger, KAUST

Introduction to Shaders for Visualization. The Basic Computer Graphics Pipeline

Graphics Hardware. Graphics Processing Unit (GPU) is a Subsidiary hardware. With massively multi-threaded many-core. Dedicated to 2D and 3D graphics

CSE 591: GPU Programming. Introduction. Entertainment Graphics: Virtual Realism for the Masses. Computer games need to have: Klaus Mueller

Shaders. Slide credit to Prof. Zwicker

General Purpose Computation (CAD/CAM/CAE) on the GPU (a.k.a. Topics in Manufacturing)

1.2.3 The Graphics Hardware Pipeline

Programming Graphics Hardware

Supplement to Lecture 22

2.11 Particle Systems

GPU Computation Strategies & Tricks. Ian Buck NVIDIA

Evolution of GPUs Chris Seitz

CS4620/5620: Lecture 14 Pipeline

CS770/870 Spring 2017 Open GL Shader Language GLSL

CS770/870 Spring 2017 Open GL Shader Language GLSL

GpuPy: Accelerating NumPy With a GPU

3D buzzwords. Adding programmability to the pipeline 6/7/16. Bandwidth Gravity of modern computer systems

Basics of GPU-Based Programming

GeForce4. John Montrym Henry Moreton

GLSL Introduction. Fu-Chung Huang. Thanks for materials from many other people

CS427 Multicore Architecture and Parallel Computing

Pipeline Operations. CS 4620 Lecture 14

Programming shaders & GPUs Christian Miller CS Fall 2011

C P S C 314 S H A D E R S, O P E N G L, & J S RENDERING PIPELINE. Mikhail Bessmeltsev

Pipeline Operations. CS 4620 Lecture Steve Marschner. Cornell CS4620 Spring 2018 Lecture 11

CSE 591/392: GPU Programming. Introduction. Klaus Mueller. Computer Science Department Stony Brook University

General Purpose Computation (CAD/CAM/CAE) on the GPU (a.k.a. Topics in Manufacturing)

Threading Hardware in G80

Rendering Objects. Need to transform all geometry then

Scanline Rendering 2 1/42

Spring 2009 Prof. Hyesoon Kim

Hardware Accelerated Volume Visualization. Leonid I. Dimitrov & Milos Sramek GMI Austrian Academy of Sciences

Shaders (some slides taken from David M. course)

The Rasterization Pipeline

General Purpose Computation (CAD/CAM/CAE) on the GPU (a.k.a. Topics in Manufacturing)

Spring 2011 Prof. Hyesoon Kim

Real - Time Rendering. Pipeline optimization. Michal Červeňanský Juraj Starinský

CS452/552; EE465/505. Clipping & Scan Conversion

Could you make the XNA functions yourself?

Drawing Fast The Graphics Pipeline

CMSC427 Advanced shading getting global illumination by local methods. Credit: slides Prof. Zwicker

Programmable Graphics Hardware

The Graphics Pipeline and OpenGL III: OpenGL Shading Language (GLSL 1.10)!

Programming Guide. Aaftab Munshi Dan Ginsburg Dave Shreiner. TT r^addison-wesley

Optimizing DirectX Graphics. Richard Huddy European Developer Relations Manager

12.2 Programmable Graphics Hardware

Tutorial on GPU Programming. Joong-Youn Lee Supercomputing Center, KISTI

The Graphics Pipeline

Windowing System on a 3D Pipeline. February 2005

INF3320 Computer Graphics and Discrete Geometry

Mattan Erez. The University of Texas at Austin

Drawing Fast The Graphics Pipeline

Shaders CSCI 4239/5239 Advanced Computer Graphics Spring 2014

Graphics Pipeline & APIs

Programmable GPUS. Last Time? Reading for Today. Homework 4. Planar Shadows Projective Texture Shadows Shadow Maps Shadow Volumes

The Transition from RenderMan to the OpenGL Shading Language (GLSL)

OpenGL on Android. Lecture 7. Android and Low-level Optimizations Summer School. 27 July 2015

Current Trends in Computer Graphics Hardware

! Readings! ! Room-level, on-chip! vs.!

The Need for Programmability

Portland State University ECE 588/688. Graphics Processors

Cg 2.0. Mark Kilgard

The NVIDIA GeForce 8800 GPU

Spring 2010 Prof. Hyesoon Kim. AMD presentations from Richard Huddy and Michael Doggett

Module 13C: Using The 3D Graphics APIs OpenGL ES

Levy: Constraint Texture Mapping, SIGGRAPH, CS 148, Summer 2012 Introduction to Computer Graphics and Imaging Justin Solomon

GPU Memory Model. Adapted from:

Introduction to OpenGL

CHAPTER 1 Graphics Systems and Models 3

Introduction to Multicore architecture. Tao Zhang Oct. 21, 2010

From Brook to CUDA. GPU Technology Conference

What s New with GPGPU?

Working with Metal Overview

Triangle Rasterization

Drawing Fast The Graphics Pipeline

Graphics Pipeline & APIs

frame buffer depth buffer stencil buffer

Case 1:17-cv SLR Document 1-3 Filed 01/23/17 Page 1 of 33 PageID #: 60 EXHIBIT C

Graphics Performance Optimisation. John Spitzer Director of European Developer Technology

Programmable Graphics Hardware

Programmable GPUs Outline

printf Debugging Examples

Volume Graphics Introduction

graphics pipeline computer graphics graphics pipeline 2009 fabio pellacini 1

Rendering. Converting a 3D scene to a 2D image. Camera. Light. Rendering. View Plane

Transcription:

Graphics Hardware Computer Graphics COMP 770 (236) Spring 2007 Instructor: Brandon Lloyd 2/26/07 1

From last time Texture coordinates Uses of texture maps reflectance and other surface parameters lighting geometry Solid textures 2/26/07 2

Graphics from a system s perspective Graphics operations most frequently executed on a coprocessor called a Graphics Processing Unit (GPU) Dedicated buses between the host CPU and the GPU AGP, PCI Express Separate GPU memory Framebuffer, textures, etc. Shared memory with CPU 2/26/07 3

OpenGL 1.4 - Graphics Past Fixed-function graphics pipeline every step neatly planned PHILOSOPHY: Performance > Flexibility Extended by committee Why process anything other than polygons or the occasional pixel? A fragment is a potential Host Commands Vertex Transforms Cull, Clip & Project Process And Rasterize Primitive Fragment Processing Texture Memory Per- Fragment Operations Frame Buffer Operations Frame Buffer Display Pixel Pack & Unpack Read Back Control 2/26/07 4

OpenGL 2.0 - Graphics Today Programmable processing units Programmable vertex and fragment processors (Exposes what was always there beneath the covers) Texture memory general-purpose data storage Texture Memory Host Commands Vertex Vertex Processor Processor Cull, Clip & Project Process and Rasterize Primitive Fragment Fragment Processor Processor Per- Fragment Operation Frame Buffer Operation Frame Buffer Display Pixel Pack & Unpack Read Back Control 2/26/07 5

A GPU block diagram GeForce 6 Series Massive parallelism pipelining multiple data paths Mix of programmable and hardwired function blocks Simple inter-processor connectivity High-bandwidth memory interfaces 2/26/07 6

Vertex processor capabilities Lighting, Material and Geometry flexibility Vertex programs replace the following parts of the pipeline: Vertex & Normal transformation Normalization and rescaling Per-Vertex Lighting Calculations Color application Texture coordinate generation & transformation The vertex program does NOT replace: Perspective divide and viewport (NDC) mapping Clipping Backface culling Primitive assembly (Triangle setup, edge equations, etc.) 2/26/07 7

Vertex processor Inputs & Outputs Vertex shader is supplied with a number of parameters Vertex parameters, OpenGL state, user supplied parameters Results written into prearranged locations (registers) that are understood by later processing steps Standard OpenGL attributes glcolor, glnormal glvertex, glmultitexcoord User-Defined Attributes User-Defined Uniform Variables eyeposition, lightposition, modelscalefactor, etc. Vertex Processor Standard OpenGL State ModelViewMatrix, gllightsource[0,..n], glfogcolor, glfrontmaterial, etc. Standard OpenGL variables Vertex & texture coords, Vertex color User-Defined variables Model coordinates, Normals, hvector, toeyevector, etc 2/26/07 8

Fragment processor capabilities Flexibility for texturing and per-pixel operations Fragment programs replace the following parts of the OpenGL pipeline: Operations on interpolated values Pixel zoom Texture access Scale and bias Texture application (modulate, add) Color table lookup Fog (color, depth) Convolution Color sums (blends, mattes) Color matrix The Fragment shader does NOT replace: Scan Conversion Histogram Coverage Pixel packing and unpacking Scissor Stipple Alpha test Depth test Stencil test Alpha blending Logical ops Dithering Plane masking Z-buffer replacement test 2/26/07 9

Fragment processor Inputs & Outputs Fragment shader is supplied with a number of parameters fragment parameters, OpenGL state, user supplied parameters Results written into prearranged locations (registers) that are understood by later processing steps User-Defined Uniform Variables eyeposition, li:ghtposition, modelscalefactor, epsilon, etc. Standard Rasterizer attributes color (r, g, b, a), depth (z), texturecoordinates User-Defined Attributes Normals, modelcoord, density, etc Fragment Processor Standard OpenGL variables FragmentColor, FragmentDepth TextureMemory Textures, Tables, TempStorage 2/26/07 10

GPU Programmability The first Vertex and Fragment programs were written in low-level, H/Wspecific assembly languages specific capabilities (eg. floating point only in Vertex shaders, fixed-point only in Fragment shaders) Trend is toward Higher-Level languages GeForce 8800 has unified shaders (same capabilities for both Vertex and Fragment shaders) Application Vertex Processor Process and Rasterize Primitive Fragment Processor Per Fragment & Frame Buffer Ops Frame Buffer Application Program Vertex Shader Program Fragment Shader Program 2/26/07 11

Historical: GeForce 3 Vertex Processor In the beginning, resources were limited Difficult to do anything, even at the assembly level Useful macros for Vector-scalar mult Vector-vector add Dot-product Normalize became the programming method of choice Vector Floating Pt Datapath Vertex Attributes 16x4 registers Vertex Program 128 instrs Vertex Results 15x4 registers Uniform Parameters (don t change on each vertex) 96x4 registers Temp Registers 12x4 registers 2/26/07 12

GPU/CPU Differences Early GPUs offered no branching support Conditional operations instead If (rega < 0) regb = regc No general indirect access to memory (i.e. lookup tables, textures, etc.) Limited Arrays (uniform parameters) Fixed vector sizes (2, 3 & 4) Vector Floating Pt Datapath Vertex Attributes 16x4 registers Vertex Program 128 instrs Vertex Results 15x4 registers Uniform Parameters (don t change on each vertex) 96x4 registers Temp Registers 12x4 registers 2/26/07 13

Recent GPUs (GeForce 6) 512 total instructions 64K executed per primitive Independent execution (MIMD) Branching Subroutine calls Flexible per-vertex processing General purpose vector registers 4-element (x,y,z,w) Vertex Cache Texture accesses 2/26/07 14

Recent fragment processors 512 total instructions 64K executed per primitive Coupled execution (SIMD) Branching Subroutine calls Flexible per-fragment processing General purpose vector registers 4-element (r,g,b,a) Streamed Datapaths Texture accesses 2/26/07 15

Fragment unit performance 4 ops & 2 Units Same operation on all 4-args on 2 units Or 2-different ops on 2-args on 2 units per clock RGBA or RGBA 16-units per chip 128 = 16 x 8 ops per clock 2/26/07 16

Programming the Beast GPU Assembly Example: FRC R2.y, C11.w; ADD R3.x, C11.w, -R2.y; MOV H4.y, R2.y; ADD H4.x, -H4.y, C4.w; MUL R3.xy, R3.xyww, C11.xyww; ADD R3.xy, R3.xyww, C11.z; TEX H5, R3, TEX2, 2D; ADD R3.x, R3.x, C11.x; TEX H6, R3, TEX2, 2D; GPU programs were quirky and non-portable 2/26/07 17

High-level languages to the rescue Cg High-level language example: (Cg = C for graphics ) float4 main( { float2 detailcoords : TEXCOORD0, float2 bumpcoords: TEXCOORD1, float3 lightvector : COLOR0, uniform float3 ambientcolor, uniform sampler2d detailtexture : TEXUNIT0, uniform sampler2d bumptexture : TEXUNIT1): COLOR float3 detailcolor = tex2d(detailtexture, detailcoords).rgb; float3 lightvectorfinal = 2.0 * (lightvector.rgb - 0.5); float3 bumpnormalvectorfinal = 2.0 * (tex2d(bumptexture, bumpcoords).rgb - 0.5); float diffuse = dot(bumpnormalvectorfinal, lightvectorfinal); return float4(diffuse * detailcolor + ambientcolor, 1.0); } Easier to read and maintain (old-lesson relearned), but still quirky Symbolic named variables (allocated by compiler) Portability between implementations Reuse pieces 2/26/07 18

Explosion of GPU HLLs Stanford Shading Language Cg (Predecessor to Cg) C/Renderman-like Separate programming environment Compiled and linked into application GLSL (OpenGL Shading Language) Embedded into application and compiled on the fly HLSL - DirectX pixel-shading language Integrated into application programming environment 2/26/07 19

Shading language uses Lots of versatility Subsurface Scattering NPR Renders Fire Effects Refraction Ray Tracing Solid Textures Ambient Occlusion Cloth Simulation 2/26/07 20

HLLs are still hard to use All of the old problems associated with managing projects with multiple parts resurface Separate host program Separate Vertex program Separate Fragment program Multiple shaders per application Want incremental tweaks of shaders to propagate to successors Due to small program size must manage loading and running multipass shaders What s the solution? Smart syntax aware editors Project Managements scripts (Makefiles) Wrappers, Linking loaders, (OOP??) 2/26/07 21

Enter the GPU IDE 2/26/07 22

FX Composer & RenderMonkey Integrated Shader Development Environment Interactive Preview window-- lets you see the impact of your shader changes immediately Supports HLSL (RM also supports GLSL) Separate editor windows for vertex and fragment shading code Support generation of artwork (textures, color palettes, MIPmaps) Built-in host application that allows loading geometry Built-in disassembler Provides error checking and limited debugging Free download at http://developer.nvidia.com/object/fx_composer_home.html http://ati.amd.com/developer/rendermonkey/index.html 2/26/07 23

GPGPU General Processing on Graphics Processing Units (GPGPU) map more general computation to the GPU to take advantage of the massive amounts of raw processing power 2/26/07 24

Graphics hardware future GeForce 8800 unified shader architecture general scalar processors (integer and bitwise ops) 2/26/07 25

Graphics hardware future Convergence between CPUs and GPUs CPUs going multi-core many simpler, lower power cores GPUs becoming more and more programmable CUDA, PeakStream, etc. Massive multi-threading to hide latency Heterogeneous collection of processors Cell ATI Fusion More complex memory models less hardware caching more reliance on the programmer 2/26/07 26

Next time Programmable shaders 2/26/07 27