General Purpose Computation (CAD/CAM/CAE) on the GPU (a.k.a. Topics in Manufacturing)
1 ME 290-R: General Purpose Computation (CAD/CAM/CAE) on the GPU (a.k.a. Topics in Manufacturing) Sara McMains Spring 2009 Lecture 7
2 Outline. Last time: visibility, shading, texturing. Today: texturing continued, frame buffer operations, hardware trends, GPU programming intro.
3 Procedural Texture Mapping: Instead of looking up an image, pass the texture coordinates to a function that computes the texture value on the fly. RenderMan, the Pixar rendering language, does this. Available with fragment shaders on current-generation hardware. Advantages: near-infinite resolution with small storage cost; the idea works for many other things. Disadvantage: slower than an image lookup.
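A minimal CPU-side sketch of the idea on this slide, assuming a checkerboard pattern: the texture value is computed from the coordinates (s, t) rather than fetched from a stored image. The names `Color`, `checker_texture`, and the `checks` parameter are hypothetical and not from the lecture.

```c
#include <assert.h>

typedef struct { float r, g, b; } Color;

/* Procedural checkerboard: compute the texel from (s, t) on the fly.
 * `checks` (hypothetical parameter) is the number of squares per unit
 * of texture space. Assumes s, t >= 0. */
Color checker_texture(float s, float t, int checks)
{
    assert(checks > 0);
    assert(s >= 0.0f && t >= 0.0f);
    int cs = (int)(s * (float)checks);  /* column index of the square */
    int ct = (int)(t * (float)checks);  /* row index of the square */
    Color light = {0.9f, 0.9f, 0.9f};
    Color dark  = {0.1f, 0.1f, 0.1f};
    return ((cs + ct) % 2 == 0) ? light : dark;
}
```

No image is stored, so the pattern stays sharp at any magnification; the cost is the arithmetic done per lookup.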
4 Other Types of Mapping: Environment mapping looks up incoming illumination in a map; it simulates reflections from shiny surfaces. Bump mapping computes an offset to the normal vector at each rendered pixel; there is no need to put bumps in the geometry, but the silhouette looks wrong. Displacement mapping adds an offset to the surface at each point; it is like putting bumps on the geometry, but simpler to model. All are available in software renderers such as RenderMan-compliant renderers, and all are becoming available in hardware.
5 Bump Mapping Look at smooth silhouettes Credit: Rich Riesenfeld 5
6 Displacement Mapping Look at silhouette Credit: Rich Riesenfeld 6
7 Deforming Images 3D Animated Flags--By 3DFlags.com intuitionbase.com/waveguide/tut6.html Credit: Rich Riesenfeld 7
8 Outline. Today: texturing continued, frame buffer operations, hardware trends, GPU programming intro.
9 Rasterization -> Frame buffer (color buffer, depth buffer, stencil buffer). Credit: Naga Govindaraju
10 Frame Buffer Ops: Fragment -> Alpha Test -> Stencil Test -> Depth Test. Credit: Naga Govindaraju, Mark Harris
11 Pipeline: Alpha Test. Fragment -> Alpha Test -> Stencil Test -> Depth Test. For fragment P, with user-specified op and reference alpha: if (P.alpha op alpha) pass fragment, else reject fragment. Case shown: no, so the fragment is rejected. Credit: Naga Govindaraju, Mark Harris
12 Frame Buffer Ops: Alpha Test. Fragment -> Alpha Test -> Stencil Test -> Depth Test. For fragment P, with user-specified op and reference alpha: if (P.alpha op alpha) pass fragment, else reject fragment. Case shown: yes, so the fragment moves on to the stencil test. Credit: Naga Govindaraju, Mark Harris
13 Frame Buffer Ops: Stencil Test. Fragment -> Alpha Test -> Stencil Test -> Depth Test. For fragment P, with user-specified op and reference value S: if (P.FB.Stencil op S) pass fragment, else reject fragment. Case shown: no, so the fragment is rejected. Credit: Naga Govindaraju, Mark Harris
14 Frame Buffer Ops: Stencil Test. Fragment -> Alpha Test -> Stencil Test -> Depth Test. For fragment P, with user-specified op and reference value S: if (P.FB.Stencil op S) pass fragment, else reject fragment. Case shown: yes, so the fragment moves on to the depth test. Credit: Naga Govindaraju, Mark Harris
15 Frame Buffer Ops: Depth Test. Fragment -> Alpha Test -> Stencil Test -> Depth Test. For fragment P: if (P.FB.depth op P.depth) pass fragment, else reject fragment. Case shown: no, so the fragment is rejected. Credit: Naga Govindaraju, Mark Harris
16 Frame Buffer Ops: Depth Test. Fragment -> Alpha Test -> Stencil Test -> Depth Test -> Frame Buffer. Case shown: yes, so the fragment is written to the frame buffer. Credit: Naga Govindaraju, Mark Harris
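The three tests on slides 11 to 16 can be sketched on the CPU as a chain of comparisons. This is an illustration under stated assumptions, not the hardware's implementation: the types `Fragment`, `Pixel`, `TestState` and the function `process_fragment` are hypothetical, stencil masks and stencil update ops are omitted, and the operand order follows OpenGL's documented convention (incoming value compared against the stored or reference value), which the slides' shorthand writes the other way round for the stencil and depth tests.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { CMP_NEVER, CMP_LESS, CMP_EQUAL, CMP_LEQUAL,
               CMP_GREATER, CMP_NOTEQUAL, CMP_GEQUAL, CMP_ALWAYS } CompareOp;

/* Evaluate "lhs op rhs" for a user-selected comparison op. */
static bool compare(CompareOp op, double lhs, double rhs)
{
    switch (op) {
    case CMP_NEVER:    return false;
    case CMP_LESS:     return lhs <  rhs;
    case CMP_EQUAL:    return lhs == rhs;
    case CMP_LEQUAL:   return lhs <= rhs;
    case CMP_GREATER:  return lhs >  rhs;
    case CMP_NOTEQUAL: return lhs != rhs;
    case CMP_GEQUAL:   return lhs >= rhs;
    case CMP_ALWAYS:   return true;
    }
    return false;
}

typedef struct { float alpha; float depth; } Fragment;      /* incoming fragment P */
typedef struct { float depth; unsigned stencil; } Pixel;    /* values stored in the frame buffer */

typedef struct {
    CompareOp alpha_op;   float    alpha_ref;    /* user-specified */
    CompareOp stencil_op; unsigned stencil_ref;  /* user-specified */
    CompareOp depth_op;
} TestState;

/* Returns true (and updates the stored depth) only if the fragment
 * survives the alpha, stencil, and depth tests, in that order. */
bool process_fragment(const TestState *st, const Fragment *f, Pixel *px)
{
    if (!compare(st->alpha_op, f->alpha, st->alpha_ref))
        return false;                                   /* alpha test: reject */
    if (!compare(st->stencil_op, st->stencil_ref, px->stencil))
        return false;                                   /* stencil test: reject */
    if (!compare(st->depth_op, f->depth, px->depth))
        return false;                                   /* depth test: reject */
    px->depth = f->depth;                               /* passed: write to frame buffer */
    return true;
}
```

A rejected fragment leaves the pixel untouched, which is what makes these tests usable as a per-pixel mask for later GPGPU passes.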
17 Outline. Today: frame buffer operations, hardware trends, GPU programming intro.
18 In the beginning... (1965): Gordon Moore observes that the number of transistors per die is doubling annually. Moore's Law: transistor density increasing, transistor size decreasing.
19 Today: Processor performance doubles every ~18 months (more transistors, faster clock). DRAM capacity doubles every ~3 years; bandwidth increases 25%/yr; latency improves 5%/yr.
20 Compute vs. Communicate: Faster clocks mean faster computation. Chips are big: sending a signal all the way across takes multiple clock cycles. Communication is becoming more expensive relative to computation; the ratio of computation to bandwidth is growing.
21 Computation:Bandwidth Case Study (NVIDIA): GeForce FX 5800 (12/02): 2 fp ops per word of off-chip bandwidth. GeForce FX 5950 (6/03): 2.66 fp ops per word of off-chip bandwidth. GeForce 6800 (1/04): almost 6 fp ops per word of off-chip bandwidth.
22 Computation:Bandwidth Case Study John Owens 22
23 Predicted Trends John Owens 23
24 Implications of growing computation:latency and computation:bandwidth ratios: need to do useful work while waiting for data requests to be fulfilled; it may be faster to compute a value than to use a lookup table. Need efficient communication as well as efficient computation.
25 Outline. Today: texturing, frame buffer operations, hardware trends, GPU programming intro.
26 GPU Programming Intro Outline: Data Parallelism and Stream Processing; Computational Resources Inventory; CPU-GPU Analogies; Example: N-body gravitational simulation; Parallel reductions; Linear Algebra Representations; Overview of Branching Techniques. Credit: Mark Harris
27 The Importance of Data Parallelism: GPUs are designed for graphics: highly parallel tasks. GPUs process independent vertices & fragments: temporary registers are zeroed, no shared or static data, no read-modify-write buffers. Data-parallel processing: multiple vertex & fragment pipelines; hide memory latency (with more computation). Credit: Mark Harris
28 Arithmetic Intensity: Arithmetic intensity = ops per word transferred = computation / bandwidth. Best to have high arithmetic intensity. Ideal GPGPU apps have large data sets, high parallelism, and high independence between data elements. Credit: Mark Harris
29 Data Streams & Kernels: Streams: collections of records requiring similar computation (vertex positions, voxels, FEM cells, etc.); they provide data parallelism. Kernels: functions applied to each element in a stream (transforms, PDE, ...), with few dependencies between stream elements; they encourage high arithmetic intensity. Credit: Mark Harris
30 Example: Simulation Grid. Common GPGPU computation style: textures represent computational grids = streams. Many computations map to grids: matrix algebra, image & volume processing, physically-based simulation, ray tracing. Non-grid streams can be mapped to grids. Credit: Mark Harris
31 Stream Computation: A grid simulation algorithm is made up of steps. Each step updates the entire grid and must complete before the next step can begin. The grid is a stream, the steps are kernels; a kernel is applied to each stream element. Figure: cloud simulation algorithm. Credit: Mark Harris
32 Vertex Programming: A vertex program is the interface to the Transform & Light unit: a GPU instruction set to perform all vertex math. Input: arbitrary vertex attributes. Output: transformed vertex attributes: homogeneous clip space position (required), color, texture coordinates, ...
33 Scatter vs. Gather Grid communication Grid cells share information Credit: Mark Harris 33
34 Computational Resources Inventory: Programmable parallel processors: vertex, geometry, & fragment pipelines. Rasterizer: mostly useful for interpolating addresses (texture coordinates) and per-vertex constants. Texture unit: read-only memory interface. Render to texture: write-only memory interface. Credit: Mark Harris
35 Vertex Processor: Fully programmable; processes 4-vectors (RGBA / XYZW). Capable of scatter but not gather: can change the location of the current vertex, but cannot read info from other vertices; on older GPUs it can only read a small constant memory. Vertex Texture Fetch: random access memory for vertices; not available with older vertex processors. Credit: Mark Harris
36 Fragment Processor: Fully programmable; processes 4-component vectors (RGBA / XYZW). Random access memory read (textures). Capable of gather but not scatter: RAM read (texture fetch), but no RAM write; the output address is fixed to a specific pixel. Typically more useful than the vertex processor: more fragment pipelines than vertex pipelines, and direct output (the fragment processor is at the end of the pipeline). Credit: Mark Harris
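The gather/scatter distinction on slides 33 to 36 can be sketched with plain arrays. The function names below are hypothetical. The last function illustrates one common workaround, assumed here rather than taken from the slides: when the scatter addresses form a permutation known ahead of time, the inverse permutation can be precomputed (for example on the CPU) so that the fragment processor can express the same data movement as a gather.

```c
#include <assert.h>
#include <stddef.h>

/* Gather: output element i has a fixed address (like a fragment's pixel)
 * and reads from a computed address (like a texture fetch). */
void gather(const float *src, const size_t *index, float *out, size_t n)
{
    assert(src && index && out);
    for (size_t i = 0; i < n; ++i)
        out[i] = src[index[i]];
}

/* Scatter: element i writes to a computed address. A fragment program
 * cannot do this because its output address is fixed. */
void scatter(const float *src, const size_t *index, float *out, size_t n)
{
    assert(src && index && out);
    for (size_t i = 0; i < n; ++i)
        out[index[i]] = src[i];
}

/* If `index` is a permutation of 0..n-1, gathering with its inverse
 * produces the same result as scattering with `index`. */
void invert_permutation(const size_t *index, size_t *inverse, size_t n)
{
    assert(index && inverse);
    for (size_t i = 0; i < n; ++i)
        inverse[index[i]] = i;
}
```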
37 Vertex Programming: A vertex program does not generate or destroy vertices (the geometry processor on the latest cards can). No topological information is provided: no edge, face, or neighboring-vertex info, but this can be packed into vertex attributes. Dynamically loadable. Credit: Naga Govindaraju
38 What gets bypassed? Modelview vertex transformations Projection transformations Vertex weighting/blending Normal transformation, rescaling, normalization Per-vertex lighting Texture coordinate generation and texture matrix transformations User-clip planes Credit: Naga Govindaraju 38
39 What does NOT get bypassed? Clipping to the view frustum; perspective divide; viewport transformation; depth range transformation; clamping of colors to [0,1] ([0,255]); primitive rasterization. Credit: Naga Govindaraju
40 CPU-GPU Analogies CPU programming is familiar GPU programming is graphics-centric Analogies can aid understanding Credit: Mark Harris 40
41 CPU-GPU Analogies (CPU = GPU): Stream / Data Array = Texture; Memory Read = Texture Sample. Credit: Mark Harris
42 Kernels (CPU = GPU): Kernel / loop body / algorithm step = Fragment Program. Credit: Mark Harris
43 Feedback Each algorithm step depends on the results of previous steps Each time step depends on the results of the previous time step Credit: Mark Harris 43
44 Feedback (CPU = GPU): CPU: ... Grid[i][j] = x; ... GPU: Array Write = Render to Texture. Credit: Mark Harris
45 GPU Simulation Overview: Analogies lead to implementation. Algorithm steps are fragment programs (computational kernels). Current state is stored in textures. Feedback via render to texture. Credit: Mark Harris
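The analogies above can be sketched on the CPU with two buffers that swap roles every step ("ping-pong"): one plays the read-only input texture, the other the render target. This is an illustration under assumptions, not code from the lecture: the 4x4 grid size, the neighbor-averaging kernel, and all names are hypothetical, and the edge clamping merely imitates clamp-to-edge texture addressing.

```c
#include <assert.h>

enum { GRID_W = 4, GRID_H = 4 };

/* "Fragment program": computes one output cell from the read-only
 * input grid (a gather), clamping neighbor lookups at the edges. */
static float average_kernel(const float *in, int x, int y)
{
    int xl = x > 0 ? x - 1 : x;
    int xr = x < GRID_W - 1 ? x + 1 : x;
    int yd = y > 0 ? y - 1 : y;
    int yu = y < GRID_H - 1 ? y + 1 : y;
    return 0.25f * (in[y * GRID_W + xl] + in[y * GRID_W + xr] +
                    in[yd * GRID_W + x] + in[yu * GRID_W + x]);
}

/* One algorithm step = one pass over every cell of the output,
 * like drawing a full-screen quad into a texture. */
static void run_pass(const float *in, float *out)
{
    for (int y = 0; y < GRID_H; ++y)
        for (int x = 0; x < GRID_W; ++x)
            out[y * GRID_W + x] = average_kernel(in, x, y);
}

/* Feedback via "render to texture": the buffer written in one step
 * becomes the input of the next. Returns the buffer holding the
 * final state. */
float *simulate(float *buf_a, float *buf_b, int steps)
{
    assert(buf_a && buf_b && buf_a != buf_b && steps >= 0);
    float *src = buf_a, *dst = buf_b;
    for (int s = 0; s < steps; ++s) {
        run_pass(src, dst);
        float *tmp = src; src = dst; dst = tmp;  /* swap roles */
    }
    return src;
}
```

Each step reads only the previous state and writes only the next one, so every cell of a step could be computed in parallel.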
46 Invoking Computation: Must invoke computation at each pixel: just draw geometry! The most common GPGPU invocation is a full-screen quad. Other useful analogies: Rasterization = Kernel Invocation; Texture Coordinates = Computational Domain; Vertex Coordinates = Computational Range. Credit: Mark Harris
47 Typical Grid Computation: Initialize the view so that pixels:texels::1:1:
    glMatrixMode(GL_MODELVIEW); glLoadIdentity();
    glMatrixMode(GL_PROJECTION); glLoadIdentity();
    glOrtho(0, 1, 0, 1, 0, 1);
    glViewport(0, 0, outTexResX, outTexResY);
For each algorithm step: activate render-to-texture; set up input textures and the fragment program; draw a full-screen quad. Credit: Mark Harris
48 Example: N-Body Simulation. Brute force: N = 8192 bodies, N^2 gravity computations; 64M force comps. / frame; ~25 flops per force; GFLOPs sustained; GeForce 6800 Ultra. Nyland, Harris, Prins, GP poster. Credit: Mark Harris
49 Computing Gravitational Forces: Each body attracts all other bodies: N bodies, so N^2 forces. Draw into an NxN buffer; pixel (i, j) computes the force between bodies i and j. Very simple fragment program. More than 2048 bodies makes it trickier. Credit: Mark Harris
50 Computing Gravitational Forces: $F(i,j) = g\,M_i M_j / d(i,j)^2$, where $d(i,j) = \mathrm{pos}(i) - \mathrm{pos}(j)$. Force is proportional to the inverse square of the distance between bodies. Credit: Mark Harris
51 Computing Gravitational Forces: Figure: the N x N N-body force texture, whose texel (i, j) holds force(i, j), next to the body position texture. $F(i,j) = g\,M_i M_j / d(i,j)^2$, where $d(i,j) = \mathrm{pos}(i) - \mathrm{pos}(j)$. Coordinates (i, j) in the force texture are used to find bodies i and j in the body position texture. Credit: Mark Harris
52 Computing Gravitational Forces (fragment program):
    float4 force(float2 ij : WPOS,
                 uniform sampler2D pos) : COLOR0
    {
        // Pos texture is 2D, not 1D, so we need to
        // convert body index into 2D coords for pos tex
        float4 iCoords  = getBodyCoords(ij);
        float4 iPosMass = texture2D(pos, iCoords.xy);  // xyz = position, w = mass
        float4 jPosMass = texture2D(pos, iCoords.zw);
        float3 dir = iPosMass.xyz - jPosMass.xyz;
        float d2 = dot(dir, dir);                      // squared distance
        dir = normalize(dir);
        // g is the gravitational constant, assumed to be declared elsewhere
        return float4(dir * g * iPosMass.w * jPosMass.w / d2, 0);
    }
Credit: Mark Harris
53 Computing Total Force: Have: array of (i, j) forces. Need: total force on each particle i. Figure: N x N N-body force texture holding force(i, j). Credit: Mark Harris
54 Computing Total Force: Have: array of (i, j) forces. Need: total force on each particle i, which is the sum of each column of the force array. Figure: N x N N-body force texture holding force(i, j). Credit: Mark Harris
55 Computing Total Force: Have: array of (i, j) forces. Need: total force on each particle i, which is the sum of each column of the force array. Can do all N columns in parallel. This is called a parallel reduction. Figure: N x N N-body force texture holding force(i, j). Credit: Mark Harris
56 Parallel Reductions: 1D parallel reduction: sum N columns or rows in parallel by adding two halves of the texture together. Figure: N x N texture. Credit: Mark Harris
57 Parallel Reductions: 1D parallel reduction: sum N columns or rows in parallel by adding two halves of the texture together, repeatedly... Figure: N x (N/2) texture. Credit: Mark Harris
58 Parallel Reductions: 1D parallel reduction: sum N columns or rows in parallel by adding two halves of the texture together, repeatedly... Figure: N x (N/4) texture. Credit: Mark Harris
59 Parallel Reductions: 1D parallel reduction: sum N columns or rows in parallel by adding two halves of the texture together, repeatedly, until we're left with a single row of texels (N x 1). Requires log2 N steps. Credit: Mark Harris
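A CPU-side sketch of the halving scheme described on slides 56 to 59, assuming a row-major array whose height is a power of two. The function name `reduce_rows` is hypothetical. The sketch updates the array in place for brevity; a GPU implementation cannot read and write the same texture in one pass and would instead render each halved result into a separate target.

```c
#include <assert.h>
#include <stddef.h>

/* Sum all rows of a width x height row-major array into row 0 by
 * repeatedly adding the bottom half onto the top half.
 * Returns the number of passes, which is log2(height). */
size_t reduce_rows(float *tex, size_t width, size_t height)
{
    assert(tex != NULL);
    assert(height > 0 && (height & (height - 1)) == 0);  /* power of two */
    size_t passes = 0;
    while (height > 1) {
        size_t half = height / 2;
        /* One pass: every surviving texel adds its partner from the
         * other half. All of these additions are independent. */
        for (size_t y = 0; y < half; ++y)
            for (size_t x = 0; x < width; ++x)
                tex[y * width + x] += tex[(y + half) * width + x];
        height = half;
        ++passes;
    }
    return passes;
}
```

For the N-body example, each of the `width` columns ends up holding the total force on one body after log2 N passes.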
60 Update Positions and Velocities: Now we have a 1-D array of total forces, one per body. Update velocity: u(i, t+dt) = u(i, t) + Ftotal(i) * dt; a simple fragment program reads the previous velocity and force textures and creates a new velocity texture. Update position: x(i, t+dt) = x(i, t) + u(i, t) * dt; a simple fragment program reads the previous position and velocity textures and creates a new position texture. Credit: Mark Harris
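The two update equations on this slide can be sketched as one CPU loop. This follows the slide's formulas literally: the position update uses the previous velocity u(i, t), and the force is not divided by the body's mass, so a simulation with non-unit masses would need that extra division. The `Vec3` type and function name are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { float x, y, z; } Vec3;

/* One time step for n bodies, following the slide's equations:
 *   u(i, t+dt) = u(i, t) + Ftotal(i) * dt
 *   x(i, t+dt) = x(i, t) + u(i, t)   * dt   (previous velocity) */
void update_bodies(Vec3 *pos, Vec3 *vel, const Vec3 *force_total,
                   size_t n, float dt)
{
    assert(pos && vel && force_total);
    for (size_t i = 0; i < n; ++i) {
        Vec3 u_old = vel[i];  /* keep u(i, t) for the position update */

        vel[i].x = u_old.x + force_total[i].x * dt;
        vel[i].y = u_old.y + force_total[i].y * dt;
        vel[i].z = u_old.z + force_total[i].z * dt;

        pos[i].x += u_old.x * dt;
        pos[i].y += u_old.y * dt;
        pos[i].z += u_old.z * dt;
    }
}
```

On the GPU these are two separate passes, because each pass can write only one output texture in the scheme the slide describes.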
61 Linear Algebra Representations: Vector representation: 2D textures are the best we can do: high texture memory bandwidth, read-write access, dependent fetches. Figure: a vector of N elements laid out as a 2D texture. Credit: Jens Krüger
62 Representation (cont.): Dense matrix representation: treat a dense matrix as a set of column vectors; again, store these vectors as 2D textures. Figure: N x N matrix -> N vectors -> N 2D textures. Credit: Jens Krüger
63 Representation (cont.): Banded sparse matrix representation: treat a banded matrix as a set of diagonal vectors. Figure: N x N matrix with 2 bands -> 2 vectors -> 2 2D textures. Credit: Jens Krüger
64 Representation (cont.): Banded sparse matrix representation: combine opposing vectors to save space. Figure: N x N matrix with 2 bands -> 2 vectors -> 2 2D textures, with the diagonal at offset i sharing storage with the one at offset N-i. Credit: Jens Krüger
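A sketch of a matrix-vector product using the diagonal-vector storage from slide 63, written as a CPU loop. The layout (`diag[k * n + i]` holds A[i][i + offset[k]]) and the function name are assumptions for illustration; the space-saving trick of slide 64 (packing opposing diagonals into one vector) is not implemented here.

```c
#include <assert.h>
#include <stddef.h>

/* y = A * x for a banded matrix stored as n_diags diagonal vectors.
 * Diagonal k has column offset offset[k] (0 = main diagonal, +1 =
 * superdiagonal, -1 = subdiagonal); diag[k * n + i] is A[i][i + offset[k]].
 * Entries that would fall outside the matrix are ignored. */
void banded_matvec(const float *diag, const int *offset, size_t n_diags,
                   const float *x, float *y, size_t n)
{
    assert(diag && offset && x && y);
    for (size_t i = 0; i < n; ++i) {          /* one "fragment" per output element */
        float sum = 0.0f;
        for (size_t k = 0; k < n_diags; ++k) {
            long j = (long)i + offset[k];     /* column touched by diagonal k */
            if (j >= 0 && j < (long)n)
                sum += diag[k * n + i] * x[j];
        }
        y[i] = sum;
    }
}
```

Each output element is a sum of element-wise products of a diagonal vector with a shifted copy of x, which is why the representation maps onto vector-vector texture operations.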
65 Operations: Vector-vector operations are reduced to 2D texture operations, coded in vertex/fragment programs. Example: Vector1 + Vector2 -> Vector3. Vector 1 is bound to TexUnit 0 and Vector 2 to TexUnit 1; a static quad is drawn with render to texture into Vector 3. Vertex program: pass through. Fragment program: return tex0 + tex1. Credit: Jens Krüger
66 The single float on GPUs: Some operations generate single float values, e.g. reduce. Read-back to main memory is slow, so keep single floats on the GPU as 1x1 textures. Credit: Jens Krüger
67 GPGPU Flow Control Strategies Branching and Looping
68 Branching Techniques: Fragment program branches can be expensive: no true fragment branching on older cards; SIMD branching on GeForce 6+ series; incoherent branching hurts performance. Sometimes it is better to move decisions up the pipeline: replace with math, occlusion query, Z-cull, pre-computation, static branch resolution. Credit: Mark Harris
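"Replace with math" can be illustrated by turning an if/else into a blend weighted by a 0-or-1 mask. This is a sketch with hypothetical function names; `step_f` imitates the `step` intrinsic that shading languages such as Cg provide, and is written with a C conditional expression only because C has no such built-in.

```c
#include <assert.h>

/* 1 if x >= edge, else 0 (imitates a shading-language step()). */
static float step_f(float edge, float x)
{
    return x >= edge ? 1.0f : 0.0f;
}

/* Branching version: the two sides of the if/else diverge. */
float select_branchy(float x, float threshold, float a, float b)
{
    if (x >= threshold)
        return b;
    return a;
}

/* Math version: both values are blended by a 0/1 mask, so every
 * fragment executes the same instructions. */
float select_math(float x, float threshold, float a, float b)
{
    float m = step_f(threshold, x);
    return a * (1.0f - m) + b * m;
}
```

The two versions agree for finite inputs; if `a` or `b` can be infinite or NaN the blend propagates them, so the substitution is not always safe. It also pays off only when both sides are cheap, since both are always evaluated.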
69 Branching with Occlusion Query: Use it for iteration termination:
    Do { // outer loop on CPU
        BeginOcclusionQuery {
            // Render with fragment program that
            // discards fragments that satisfy
            // termination criteria
        } EndQuery
    } While query returns > 0
Can be used for subdivision techniques. Credit: Mark Harris
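The loop on this slide can be emulated on the CPU by counting how many cells a pass actually updates; that count stands in for the occlusion query result. The update rule (halve each value until it drops below `eps`) and all names are hypothetical stand-ins for a real fragment program, and the sketch assumes finite inputs and eps > 0 so that the loop terminates.

```c
#include <assert.h>
#include <stddef.h>

/* One render pass. Cells that already satisfy the termination
 * criterion are "discarded"; the return value is the number of
 * cells that were drawn, i.e. the emulated occlusion query count. */
static size_t render_pass(float *cells, size_t n, float eps)
{
    size_t drawn = 0;
    for (size_t i = 0; i < n; ++i) {
        if (cells[i] < eps)
            continue;          /* discard: converged, not counted */
        cells[i] *= 0.5f;      /* stand-in for the real kernel */
        ++drawn;
    }
    return drawn;
}

/* Outer loop on the CPU: keep issuing passes while the query
 * reports that at least one fragment was drawn.
 * Returns the number of passes issued. */
size_t iterate_until_converged(float *cells, size_t n, float eps)
{
    assert(cells != NULL && eps > 0.0f);
    size_t passes = 0;
    size_t drawn;
    do {
        drawn = render_pass(cells, n, eps);
        ++passes;
    } while (drawn > 0);
    return passes;
}
```

The final pass draws nothing; it exists only so the query can report zero, which is the cost of making the termination decision on the CPU.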
70 Z-Cull: In an early pass, modify the depth buffer: clear Z to 1; draw a quad at Z=0; discard the fragments that should be modified in later passes. In subsequent passes: enable the depth test (GL_LESS); draw a full-screen quad at z=0.5; only fragments with previous depth=1 will be processed. Can also use stencil cull on the GeForce 6 series; not available on GeForce FX (NV3X). Discard and shader depth output disable Z-cull. Credit: Mark Harris
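The two phases of the Z-cull technique can be emulated with a per-cell depth array. This is an illustrative sketch with hypothetical names; the increment is a stand-in for an expensive fragment program, and the later pass does not write depth, which assumes depth writes are disabled on the real pipeline so the mask survives across passes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Early pass: clear Z to 1, then draw a quad at z = 0 whose fragments
 * are discarded wherever later passes should still do work. Active
 * cells keep depth 1; inactive cells end up with depth 0. */
void build_depth_mask(const bool *active, float *depth, size_t n)
{
    assert(active && depth);
    for (size_t i = 0; i < n; ++i)
        depth[i] = 1.0f;                 /* clear Z to 1 */
    for (size_t i = 0; i < n; ++i)
        if (!active[i])
            depth[i] = 0.0f;             /* fragment kept: writes z = 0 */
}

/* Later pass: full-screen quad at z = 0.5 with a GL_LESS-style test.
 * Only cells whose stored depth is still 1 are processed.
 * Returns the number of cells processed. */
size_t masked_pass(float *cells, const float *depth, size_t n)
{
    assert(cells && depth);
    const float quad_z = 0.5f;
    size_t processed = 0;
    for (size_t i = 0; i < n; ++i) {
        if (!(quad_z < depth[i]))
            continue;                    /* depth test fails: culled before shading */
        cells[i] += 1.0f;                /* stand-in for an expensive kernel */
        ++processed;
    }
    return processed;
}
```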
71 Pre-computation: Pre-compute anything that will not change every iteration! Example: static obstacles in a fluid sim: a texture contains boundary info for cells inside obstacles; reuse that texture until the obstacles are modified. Combine with Z-cull for higher performance! Credit: Mark Harris
72 Static Branch Resolution: Avoid branches where the outcome is fixed: one region is always true, another false. Use separate fragment programs (FPs) for each region, with no branches. Example: boundaries. Credit: Mark Harris
73 Acknowledgements: Jens Krüger, Mark Harris, Naga K. Govindaraju, John Owens. GPU Gems 2, chapter 29: Streaming Architectures and Technology Trends.
74 OpenGL Depth Buffer: OpenGL defines a depth buffer as its visibility algorithm. To enable depth testing: glEnable(GL_DEPTH_TEST). To clear the depth buffer: glClear(GL_DEPTH_BUFFER_BIT). To clear color and depth: glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT). The number of bits used for the depth values can be specified (windowing system dependent, and hardware may impose limits based on available memory). The comparison function can be specified with glDepthFunc(...): sometimes you want to draw the furthest thing, or something equal to the depth in the buffer.
75 Basic OpenGL Texturing: Specify texture coordinates for the polygon: use glTexCoord2f(s,t) before each vertex, e.g. glTexCoord2f(0,0); glVertex3f(x,y,z);. Create a texture object and fill it with texture data: glGenTextures(num, &indices) to get identifiers for the objects; glBindTexture(GL_TEXTURE_2D, identifier) to bind the texture (following texture commands refer to the bound texture); glTexParameteri(GL_TEXTURE_2D, ..., ...) to specify the parameters for use when applying the texture; glTexImage2D(GL_TEXTURE_2D, ...) to specify the texture data (the image itself).
76 Basic OpenGL Texturing (cont): Enable texturing: glEnable(GL_TEXTURE_2D). State how the texture will be used: glTexEnvf(...). Texturing is done after lighting. You're ready to go.
Drawing Fast The Graphics Pipeline CS559 Spring 2016 Lecture 10 February 25, 2016 1. Put a 3D primitive in the World Modeling Get triangles 2. Figure out what color it should be Do ligh/ng 3. Position
More informationAccelerating CFD with Graphics Hardware
Accelerating CFD with Graphics Hardware Graham Pullan (Whittle Laboratory, Cambridge University) 16 March 2009 Today Motivation CPUs and GPUs Programming NVIDIA GPUs with CUDA Application to turbomachinery
More informationHardware Accelerated Volume Visualization. Leonid I. Dimitrov & Milos Sramek GMI Austrian Academy of Sciences
Hardware Accelerated Volume Visualization Leonid I. Dimitrov & Milos Sramek GMI Austrian Academy of Sciences A Real-Time VR System Real-Time: 25-30 frames per second 4D visualization: real time input of
More informationData-Parallel Algorithms on GPUs. Mark Harris NVIDIA Developer Technology
Data-Parallel Algorithms on GPUs Mark Harris NVIDIA Developer Technology Outline Introduction Algorithmic complexity on GPUs Algorithmic Building Blocks Gather & Scatter Reductions Scan (parallel prefix)
More informationReal-Time Graphics Architecture
Real-Time Graphics Architecture Kurt Akeley Pat Hanrahan http://www.graphics.stanford.edu/courses/cs448a-01-fall Geometry Outline Vertex and primitive operations System examples emphasis on clipping Primitive
More informationCS451Real-time Rendering Pipeline
1 CS451Real-time Rendering Pipeline JYH-MING LIEN DEPARTMENT OF COMPUTER SCIENCE GEORGE MASON UNIVERSITY Based on Tomas Akenine-Möller s lecture note You say that you render a 3D 2 scene, but what does
More informationEnhancing Traditional Rasterization Graphics with Ray Tracing. March 2015
Enhancing Traditional Rasterization Graphics with Ray Tracing March 2015 Introductions James Rumble Developer Technology Engineer Ray Tracing Support Justin DeCell Software Design Engineer Ray Tracing
More informationPipeline Operations. CS 4620 Lecture 10
Pipeline Operations CS 4620 Lecture 10 2008 Steve Marschner 1 Hidden surface elimination Goal is to figure out which color to make the pixels based on what s in front of what. Hidden surface elimination
More informationOptimizing for DirectX Graphics. Richard Huddy European Developer Relations Manager
Optimizing for DirectX Graphics Richard Huddy European Developer Relations Manager Also on today from ATI... Start & End Time: 12:00pm 1:00pm Title: Precomputed Radiance Transfer and Spherical Harmonic
More informationRay Tracing. Computer Graphics CMU /15-662, Fall 2016
Ray Tracing Computer Graphics CMU 15-462/15-662, Fall 2016 Primitive-partitioning vs. space-partitioning acceleration structures Primitive partitioning (bounding volume hierarchy): partitions node s primitives
More informationComputergrafik. Matthias Zwicker. Herbst 2010
Computergrafik Matthias Zwicker Universität Bern Herbst 2010 Today Bump mapping Shadows Shadow mapping Shadow mapping in OpenGL Bump mapping Surface detail is often the result of small perturbations in
More informationComparing Reyes and OpenGL on a Stream Architecture
Comparing Reyes and OpenGL on a Stream Architecture John D. Owens Brucek Khailany Brian Towles William J. Dally Computer Systems Laboratory Stanford University Motivation Frame from Quake III Arena id
More informationIntroduction to Multicore architecture. Tao Zhang Oct. 21, 2010
Introduction to Multicore architecture Tao Zhang Oct. 21, 2010 Overview Part1: General multicore architecture Part2: GPU architecture Part1: General Multicore architecture Uniprocessor Performance (ECint)
More informationGraphics Hardware. Graphics Processing Unit (GPU) is a Subsidiary hardware. With massively multi-threaded many-core. Dedicated to 2D and 3D graphics
Why GPU? Chapter 1 Graphics Hardware Graphics Processing Unit (GPU) is a Subsidiary hardware With massively multi-threaded many-core Dedicated to 2D and 3D graphics Special purpose low functionality, high
More informationTriangle Rasterization
Triangle Rasterization Computer Graphics COMP 770 (236) Spring 2007 Instructor: Brandon Lloyd 2/07/07 1 From last time Lines and planes Culling View frustum culling Back-face culling Occlusion culling
More informationShadow Rendering EDA101 Advanced Shading and Rendering
Shadow Rendering EDA101 Advanced Shading and Rendering 2006 Tomas Akenine-Möller 1 Why, oh why? (1) Shadows provide cues about spatial relationships among objects 2006 Tomas Akenine-Möller 2 Why, oh why?
More informationGPU Memory Model Overview
GPU Memory Model Overview John Owens University of California, Davis Department of Electrical and Computer Engineering Institute for Data Analysis and Visualization SciDAC Institute for Ultrascale Visualization
More informationCS452/552; EE465/505. Clipping & Scan Conversion
CS452/552; EE465/505 Clipping & Scan Conversion 3-31 15 Outline! From Geometry to Pixels: Overview Clipping (continued) Scan conversion Read: Angel, Chapter 8, 8.1-8.9 Project#1 due: this week Lab4 due:
More informationDrawing Fast The Graphics Pipeline
Drawing Fast The Graphics Pipeline CS559 Fall 2016 Lectures 10 & 11 October 10th & 12th, 2016 1. Put a 3D primitive in the World Modeling 2. Figure out what color it should be 3. Position relative to the
More informationDrawing Fast The Graphics Pipeline
Drawing Fast The Graphics Pipeline CS559 Fall 2015 Lecture 9 October 1, 2015 What I was going to say last time How are the ideas we ve learned about implemented in hardware so they are fast. Important:
More informationThe Application Stage. The Game Loop, Resource Management and Renderer Design
1 The Application Stage The Game Loop, Resource Management and Renderer Design Application Stage Responsibilities 2 Set up the rendering pipeline Resource Management 3D meshes Textures etc. Prepare data
More informationComputer Graphics Shadow Algorithms
Computer Graphics Shadow Algorithms Computer Graphics Computer Science Department University of Freiburg WS 11 Outline introduction projection shadows shadow maps shadow volumes conclusion Motivation shadows
More informationThe NVIDIA GeForce 8800 GPU
The NVIDIA GeForce 8800 GPU August 2007 Erik Lindholm / Stuart Oberman Outline GeForce 8800 Architecture Overview Streaming Processor Array Streaming Multiprocessor Texture ROP: Raster Operation Pipeline
More informationGraphics Pipeline & APIs
3 2 4 Graphics Pipeline & APIs CPU Vertex Processing Rasterization Processing glclear (GL_COLOR_BUFFER_BIT GL_DEPTH_BUFFER_BIT); glpushmatrix (); gltranslatef (-0.15, -0.15, solidz); glmaterialfv(gl_front,
More informationCSE 167: Lecture #5: Rasterization. Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2012
CSE 167: Introduction to Computer Graphics Lecture #5: Rasterization Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2012 Announcements Homework project #2 due this Friday, October
More informationPOWERVR MBX. Technology Overview
POWERVR MBX Technology Overview Copyright 2009, Imagination Technologies Ltd. All Rights Reserved. This publication contains proprietary information which is subject to change without notice and is supplied
More informationCSE 167: Lecture #4: Vertex Transformation. Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2012
CSE 167: Introduction to Computer Graphics Lecture #4: Vertex Transformation Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2012 Announcements Project 2 due Friday, October 12
More informationCSE 167: Introduction to Computer Graphics Lecture #5: Projection. Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2017
CSE 167: Introduction to Computer Graphics Lecture #5: Projection Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2017 Announcements Friday: homework 1 due at 2pm Upload to TritonEd
More informationDiFi: Distance Fields - Fast Computation Using Graphics Hardware
DiFi: Distance Fields - Fast Computation Using Graphics Hardware Avneesh Sud Dinesh Manocha UNC-Chapel Hill http://gamma.cs.unc.edu/difi Distance Fields Distance Function For a site a scalar function f:r
More informationCSE 167: Introduction to Computer Graphics Lecture #4: Vertex Transformation
CSE 167: Introduction to Computer Graphics Lecture #4: Vertex Transformation Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2013 Announcements Project 2 due Friday, October 11
More informationCMSC427 Advanced shading getting global illumination by local methods. Credit: slides Prof. Zwicker
CMSC427 Advanced shading getting global illumination by local methods Credit: slides Prof. Zwicker Topics Shadows Environment maps Reflection mapping Irradiance environment maps Ambient occlusion Reflection
More informationThe Rasterization Pipeline
Lecture 5: The Rasterization Pipeline Computer Graphics and Imaging UC Berkeley CS184/284A, Spring 2016 What We ve Covered So Far z x y z x y (0, 0) (w, h) Position objects and the camera in the world
More informationIntro to OpenGL III. Don Fussell Computer Science Department The University of Texas at Austin
Intro to OpenGL III Don Fussell Computer Science Department The University of Texas at Austin University of Texas at Austin CS354 - Computer Graphics Don Fussell Where are we? Continuing the OpenGL basic
More informationThe Traditional Graphics Pipeline
Last Time? The Traditional Graphics Pipeline Reading for Today A Practical Model for Subsurface Light Transport, Jensen, Marschner, Levoy, & Hanrahan, SIGGRAPH 2001 Participating Media Measuring BRDFs
More informationGraphics Hardware, Graphics APIs, and Computation on GPUs. Mark Segal
Graphics Hardware, Graphics APIs, and Computation on GPUs Mark Segal Overview Graphics Pipeline Graphics Hardware Graphics APIs ATI s low-level interface for computation on GPUs 2 Graphics Hardware High
More informationCS 130 Final. Fall 2015
CS 130 Final Fall 2015 Name Student ID Signature You may not ask any questions during the test. If you believe that there is something wrong with a question, write down what you think the question is trying
More informationWhy Use the GPU? How to Exploit? New Hardware Features. Sparse Matrix Solvers on the GPU: Conjugate Gradients and Multigrid. Semiconductor trends
Imagine stream processor; Bill Dally, Stanford Connection Machine CM; Thinking Machines Sparse Matrix Solvers on the GPU: Conjugate Gradients and Multigrid Jeffrey Bolz Eitan Grinspun Caltech Ian Farmer
More informationThreading Hardware in G80
ing Hardware in G80 1 Sources Slides by ECE 498 AL : Programming Massively Parallel Processors : Wen-Mei Hwu John Nickolls, NVIDIA 2 3D 3D API: API: OpenGL OpenGL or or Direct3D Direct3D GPU Command &
More informationChapter IV Fragment Processing and Output Merging. 3D Graphics for Game Programming
Chapter IV Fragment Processing and Output Merging Fragment Processing The per-fragment attributes may include a normal vector, a set of texture coordinates, a set of color values, a depth, etc. Using these
More information