Fragment-Parallel Composite and Filter. Anjul Patney, Stanley Tzeng, and John D. Owens University of California, Davis


1 Fragment-Parallel Composite and Filter Anjul Patney, Stanley Tzeng, and John D. Owens University of California, Davis

2 Parallelism in Interactive Graphics
- Well-expressed in hardware as well as APIs
- Consistently growing in degree and expression
  - More and more cores on upcoming GPUs
  - From programmable shaders to pipelines
- We should rethink algorithms to exploit this
- This paper provides one example
  - Parallelization of composite/filter stages

3 A Feed-Forward Rendering Pipeline
Primitives -> Geometry Processing -> Rasterization -> Composite -> Filter -> Pixels

4 Composite & Filter
- Input: unordered list of fragments
- Output: pixel colors
- Assumption: no fragments are discarded
[Figure: sample locations within a pixel]

5 Basic Idea
[Figure: assignment of work to processors. Label: Pixel-Parallel.]

6 Basic Idea
[Figure: assignment of work to processors. Labels: Irregularity, Insufficient parallelism, Fragment-Parallel.]

7 Motivation
- Most applications have low depth complexity
  - Pixel-level parallelism is sufficient
- We are interested in applications with
  - Very high depth complexity
  - High variation in depth complexity
- Further
  - Future platforms will demand more parallelism
  - High depth complexity can limit pixel-parallelism

8 Motivation
[Figure: Distribution of Depth Complexity. Axes: number of depth layers, number of subpixels.]

9 Related Work: Order-Independent Transparency (OIT)
- Depth-Peeling [Everitt 01]: one pass per transparent layer
- Stencil-Routed A-buffer [Myers & Bavoil 07]: one pass per 8 depth layers (limit: maximum MSAA samples per pixel)
- Bucket Depth-Peeling [Liu et al. 09]: one pass per up to 32 layers (limit: maximum render targets)

10 Related Work: Order-Independent Transparency (OIT)
- OIT using Direct3D 11 [Gruen et al. 10]
  - Use fragment linked lists
  - Per-pixel sort and composite
- Hair Self-Shadowing [Sintorn et al. 09]
  - Each fragment computes its contribution
  - Assumes constant opacity

11 Related Work: Programmable Rendering Pipelines
- RenderAnts [Zhou et al. 09]
  - Sort fragments globally
  - Per-pixel composite/filter
- FreePipe [Liu et al. 10]
  - Sort fragments globally
  - Per-pixel composite/filter

12 Pixel-Parallel Formulation
[Figure: pixels P_i, P_(i+1), P_(i+2) and their subsamples S_j to S_(j+6), with thread IDs j to j+6 (one per subsample). P: pixel, S: subsample.]

13 Fragment-Parallel Formulation
[Figure: the same pixels P_i to P_(i+2) and subsamples S_j to S_(j+6), with thread IDs j to j+23 (one per fragment). P: pixel, S: subsample.]

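14 Fragment-Parallel Formulation
How can this behavior be achieved? Revisit the composite equation:

$$C_s = \alpha_1 C_1 + (1-\alpha_1)\{\alpha_2 C_2 + (1-\alpha_2)(\cdots(\alpha_N C_N + (1-\alpha_N) C_B)\cdots)\}$$

Expanding the nesting gives one term per fragment (fragment 1, fragment 2, ...) plus a background term:

$$C_s = 1 \cdot \alpha_1 C_1 + (1-\alpha_1) \cdot \alpha_2 C_2 + (1-\alpha_1)(1-\alpha_2) \cdot \alpha_3 C_3 + \cdots + (1-\alpha_1)(1-\alpha_2)\cdots(1-\alpha_{k-1}) \cdot \alpha_k C_k + \cdots + (1-\alpha_1)(1-\alpha_2)\cdots(1-\alpha_N) \cdot C_B$$

In the $k$-th term, $\alpha_k C_k$ is the local contribution $L_k$ and the product $(1-\alpha_1)\cdots(1-\alpha_{k-1})$ is the global contribution $G_k$.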
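The equivalence of the nested and expanded forms can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes single-channel colors and fragments already sorted front to back, and the function names and sample values are hypothetical.

```python
def composite_nested(fragments, background):
    """Evaluate the nested form back to front: C = a*c + (1 - a)*C_behind."""
    color = background
    for alpha, c in reversed(fragments):
        color = alpha * c + (1.0 - alpha) * color
    return color


def composite_expanded(fragments, background):
    """Evaluate the expanded form: sum of G_k * L_k plus the background term."""
    total = 0.0
    transmittance = 1.0  # G_k: product of (1 - alpha_i) over all i < k
    for alpha, c in fragments:
        total += transmittance * (alpha * c)  # G_k * L_k
        transmittance *= 1.0 - alpha
    return total + transmittance * background


# Hypothetical fragments as (alpha, color) pairs, nearest fragment first.
fragments = [(0.5, 1.0), (0.25, 0.0), (0.5, 0.5)]
background = 1.0
nested = composite_nested(fragments, background)
expanded = composite_expanded(fragments, background)
```

Both functions still walk the fragments one after another; what the expanded form buys is that each term depends on the other fragments only through the running product.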

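15 Fragment-Parallel Formulation

$$C_s = G_1 L_1 + G_2 L_2 + G_3 L_3 + \cdots + G_N L_N$$

$$G_k = (1-\alpha_1)(1-\alpha_2)\cdots(1-\alpha_{k-1}), \qquad L_k = \alpha_k C_k$$

(The background term of slide 14, $G_{N+1} C_B$, is not written out here; it can be treated as one more fragment with $\alpha_{N+1} = 1$ and $C_{N+1} = C_B$.)

- $L_k$ is trivially parallel (local computation)
- $G_k$ is the result of a scan operation (product)
- For the list of input fragments
  - Compute G[ ] and L[ ], multiply
  - Perform reduction to add subpixel contributions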
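A minimal sketch of this slide for a single subpixel follows, using the standard-library `itertools.accumulate` as a stand-in for a parallel scan. The helper names and sample values are hypothetical, and appending the background as a final opaque fragment is an assumption of the sketch rather than something the slides state.

```python
from itertools import accumulate
from operator import mul


def global_contributions(alphas):
    """Exclusive product scan: G_k = (1 - a_1) * ... * (1 - a_{k-1}), G_1 = 1."""
    transmittances = [1.0 - a for a in alphas]
    # Prepending 1.0 and dropping the last element turns the inclusive
    # scan computed by accumulate() into an exclusive one.
    return list(accumulate([1.0] + transmittances[:-1], mul))


def local_contributions(alphas, colors):
    """L_k = a_k * C_k, computed independently for every fragment."""
    return [a * c for a, c in zip(alphas, colors)]


# Hypothetical fragments for one subpixel, nearest first. The last entry is
# the background, added as a fragment with alpha = 1 (sketch assumption).
alphas = [0.5, 0.25, 0.5, 1.0]
colors = [1.0, 0.0, 0.5, 1.0]

G = global_contributions(alphas)
L = local_contributions(alphas, colors)
subpixel_color = sum(g * l for g, l in zip(G, L))  # reduction (sum)
```

Once `G` is available, the multiply is independent per fragment and the final sum is an ordinary reduction, which is what makes the formulation fragment-parallel.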

16 Fragment-Parallel Formulation
Filter, for every pixel:

$$C_p = C_{s_1} \kappa_1 + C_{s_2} \kappa_2 + \cdots + C_{s_M} \kappa_M$$

- This can be expressed as another reduction
  - After multiplying with subpixel weights $\kappa_m$
- Can be merged with previous reduction
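Why the two reductions can be merged is easy to see in a small example: scaling every fragment term by its subpixel's weight before summing gives the same pixel color as summing per subpixel first. This is a sketch with hypothetical function names and values; the box-filter weights are an assumption.

```python
def filter_two_reductions(subpixel_terms, weights):
    """Reduce each subpixel to C_s, then reduce the weighted C_s to C_p."""
    subpixel_colors = [sum(terms) for terms in subpixel_terms]
    return sum(c * k for c, k in zip(subpixel_colors, weights))


def filter_merged(subpixel_terms, weights):
    """Premultiply every fragment term by its subpixel weight, then sum once."""
    weighted = [t * k for terms, k in zip(subpixel_terms, weights) for t in terms]
    return sum(weighted)


# Hypothetical G_k * L_k products for one pixel with M = 2 subpixels.
subpixel_terms = [[0.5, 0.0, 0.25], [0.125, 0.5]]
weights = [0.5, 0.5]  # box filter (assumption; any weights kappa_m work)

two_pass = filter_two_reductions(subpixel_terms, weights)
merged = filter_merged(subpixel_terms, weights)
```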

17 Fragment-Parallel Composite & Filter: Final Algorithm
1. Two-key sort (subpixel ID, depth)
2. Segmented scan (obtain $G_k$)
3. Premultiply with weights ($L_k$, $\kappa_m$)
4. Segmented reduction
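The four steps can be sketched end to end on a flat fragment list. This is a sequential CPU illustration under stated assumptions, not the CUDA implementation the slides describe: the `Fragment` layout is hypothetical, smaller depth is assumed to mean nearer, colors are single-channel, and background handling is left out.

```python
from collections import namedtuple

# Hypothetical fragment record for this sketch.
Fragment = namedtuple("Fragment", "pixel subpixel depth alpha color")


def composite_and_filter(fragments, weights):
    """Return {pixel: color}; weights[subpixel] is the filter weight kappa_m."""
    # 1. Two-key sort: subpixel ID (here the pair pixel, subpixel), then depth.
    frags = sorted(fragments, key=lambda f: (f.pixel, f.subpixel, f.depth))

    # 2. Segmented exclusive product scan over (1 - alpha). A segment is one
    #    subpixel, so the running product resets at each subpixel boundary.
    G = []
    running, previous = 1.0, None
    for f in frags:
        key = (f.pixel, f.subpixel)
        if key != previous:
            running, previous = 1.0, key
        G.append(running)
        running *= 1.0 - f.alpha

    # 3. Premultiply: G_k * L_k * kappa_m, independent for every fragment.
    terms = [g * f.alpha * f.color * weights[f.subpixel] for g, f in zip(G, frags)]

    # 4. Segmented sum reduction. A segment is now one pixel.
    pixels = {}
    for f, t in zip(frags, terms):
        pixels[f.pixel] = pixels.get(f.pixel, 0.0) + t
    return pixels


fragments = [
    Fragment(pixel=0, subpixel=1, depth=2.0, alpha=0.5, color=1.0),
    Fragment(pixel=0, subpixel=0, depth=5.0, alpha=1.0, color=0.5),
    Fragment(pixel=0, subpixel=0, depth=1.0, alpha=0.5, color=1.0),
    Fragment(pixel=1, subpixel=0, depth=3.0, alpha=0.25, color=1.0),
]
result = composite_and_filter(fragments, weights=[0.5, 0.5])
```

Steps 2 and 4 are written as loops only for readability; on a GPU they would be replaced by segmented scan and segmented reduce primitives.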

18 Fragment-Parallel Formulation
[Figure: fragments of pixels P_i, P_(i+1), P_(i+2) pass through a segmented scan (product) and then a segmented reduction (sum). P: pixel, S: subsample.]
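To illustrate how a segmented product scan can run in data-parallel steps, here is a sketch of the textbook head-flag formulation in the Hillis-Steele style. It is not CUDPP's implementation, which the slides do not detail; the function name and inputs are hypothetical, and the first element is assumed to start a segment.

```python
def segmented_product_scan(values, head_flags):
    """Inclusive segmented product scan in about log2(n) data-parallel steps."""
    vals, flags = list(values), list(head_flags)
    n, offset = len(vals), 1
    while offset < n:
        new_vals, new_flags = vals[:], flags[:]
        # Every i is independent within a step: one thread per element.
        for i in range(offset, n):
            if not flags[i]:  # no segment start seen yet to the left of i
                new_vals[i] = vals[i - offset] * vals[i]
                new_flags[i] = flags[i - offset]
        vals, flags = new_vals, new_flags
        offset *= 2
    return vals


# Two subpixels with three and two fragments; True marks a segment start.
alphas = [0.5, 0.5, 0.5, 0.25, 0.5]
heads = [True, False, False, True, False]

# Shift (1 - alpha) right by one inside each segment so that the inclusive
# scan yields the exclusive products G_k.
shifted = [1.0 if h else 1.0 - alphas[i - 1] for i, h in enumerate(heads)]
G = segmented_product_scan(shifted, heads)
```

The flags stop products from leaking across subpixel boundaries, so a single pass over the whole fragment array handles every subpixel at once regardless of how uneven the depth complexity is.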

19 Implementation
- Hardware used: NVIDIA GeForce GTX 280
- We require fast segmented scan and reduce
  - The CUDPP library provides that
  - Restricts implementation to NVIDIA CUDA
- No direct access to hardware rasterizer
  - We wrote our own

20 Example System: Polygons
- Applications: games
- Depth complexity
  - 1 to a few tens of layers
  - Suited to pixel-parallel
- Fragment-parallel software rasterizer

21 Example System: Particles
- Applications: simulations, games
- Depth complexity
  - Hundreds of layers
  - High depth variance
- Particle-parallel sprite rasterizer

22 Example System: Volumes
- Applications: scientific visualization
- Depth complexity
  - Tens to hundreds of layers
  - Low depth variance
- Major-axis-slice rasterizer

23 Example System: Reyes
- Applications: offline rendering
- Depth complexity
  - Tens of layers
  - Moderate depth variance
- Data-parallel micropolygon rasterizer

24 Performance Results
[Figure: rendering time (ms) for the Particles, Volume, Reyes (grass), and Polygon scenes, split into fragment generation, pixel-parallel composite/filter, and fragment-parallel composite/filter.]

25 Performance Variation
[Figure: fragments per second versus depth complexity, for fragment-parallel and pixel-parallel.]

26 Limitations
- Increased memory traffic
  - Several passes through CUDPP primitives
- Unclear how to optimize for special cases
  - Threshold opacity
  - Threshold depth complexity

27 Summary and Conclusion
- Parallel formulation of composite equation
  - Maps well to known primitives
  - Can be integrated with filter
  - Consistent performance across varying workloads
- FPC is applicable to future rendering pipelines
  - Exploits higher degree of parallelism
  - Better related to size of rendering workload
  - A tool for building programmable pipelines

28 Future Work
- Performance
  - Reduction in memory traffic
  - Extension to special-case scenes
  - Hybrid PPC-FPC formulations
- Applications
  - Integration with hardware rasterizer
  - Cinematic rendering, Photoshop

29 Acknowledgments
- NSF Award
- SciDAC Institute for Ultrascale Visualization
- NVIDIA Research Fellowship
- Equipment donated by NVIDIA
- Discussions and feedback: Shubho Sengupta (UC Davis), Matt Pharr (Intel), Aaron Lefohn (Intel), Mike Houston (AMD), anonymous reviewers
- Implementation assistance: Jeff Stuart, Shubho Sengupta

30 Thanks!

Real-Time Reyes: Programmable Pipelines and Research Challenges. Anjul Patney University of California, Davis

Real-Time Reyes: Programmable Pipelines and Research Challenges. Anjul Patney University of California, Davis Real-Time Reyes: Programmable Pipelines and Research Challenges Anjul Patney University of California, Davis Real-Time Reyes-Style Adaptive Surface Subdivision Anjul Patney and John D. Owens SIGGRAPH Asia

More information

Real-Time Reyes Programmable Pipelines and Research Challenges

Real-Time Reyes Programmable Pipelines and Research Challenges Real-Time Reyes Programmable Pipelines and Research Challenges Anjul Patney University of California, Davis This talk Parallel Computing for Graphics: In Action What does it take to write a programmable

More information

Constant-Memory Order-Independent Transparency Techniques

Constant-Memory Order-Independent Transparency Techniques Constant-Memory Order-Independent Transparency Techniques Louis Bavoil lbavoil@nvidia.com Eric Enderton eenderton@nvidia.com Document Change History Version Date Responsible Reason for Change 1 March 14,

More information

Beyond Programmable Shading Course, ACM SIGGRAPH 2011

Beyond Programmable Shading Course, ACM SIGGRAPH 2011 1/66 Road to Real-Time Order-Independent Transparency Marco Salvi 2/66 Talk Outline Motivation Compositing Equation Recursive Solvers Visibility Based Solvers State of the Art and Future Work Q&A 3/66

More information

Stochastic Transparency. Eric Enderton Erik Sintorn Pete Shirley David Luebke

Stochastic Transparency. Eric Enderton Erik Sintorn Pete Shirley David Luebke Stochastic Transparency Eric Enderton Erik Sintorn Pete Shirley David Luebke I3D 2010 Order Independent Transparency hair foliage particles windows shadows thereof Standard OIT algorithms Sort primitives

More information

CS GPU and GPGPU Programming Lecture 2: Introduction; GPU Architecture 1. Markus Hadwiger, KAUST

CS GPU and GPGPU Programming Lecture 2: Introduction; GPU Architecture 1. Markus Hadwiger, KAUST CS 380 - GPU and GPGPU Programming Lecture 2: Introduction; GPU Architecture 1 Markus Hadwiger, KAUST Reading Assignment #2 (until Feb. 17) Read (required): GLSL book, chapter 4 (The OpenGL Programmable

More information

A Trip Down The (2011) Rasterization Pipeline

A Trip Down The (2011) Rasterization Pipeline A Trip Down The (2011) Rasterization Pipeline Aaron Lefohn - Intel / University of Washington Mike Houston AMD / Stanford 1 This talk Overview of the real-time rendering pipeline available in ~2011 corresponding

More information

GPU Task-Parallelism: Primitives and Applications. Stanley Tzeng, Anjul Patney, John D. Owens University of California at Davis

GPU Task-Parallelism: Primitives and Applications. Stanley Tzeng, Anjul Patney, John D. Owens University of California at Davis GPU Task-Parallelism: Primitives and Applications Stanley Tzeng, Anjul Patney, John D. Owens University of California at Davis This talk Will introduce task-parallelism on GPUs What is it? Why is it important?

More information

Data-Parallel Algorithms on GPUs. Mark Harris NVIDIA Developer Technology

Data-Parallel Algorithms on GPUs. Mark Harris NVIDIA Developer Technology Data-Parallel Algorithms on GPUs Mark Harris NVIDIA Developer Technology Outline Introduction Algorithmic complexity on GPUs Algorithmic Building Blocks Gather & Scatter Reductions Scan (parallel prefix)

More information

Parallelizing Graphics Pipeline Execution (+ Basics of Characterizing a Rendering Workload)

Parallelizing Graphics Pipeline Execution (+ Basics of Characterizing a Rendering Workload) Lecture 2: Parallelizing Graphics Pipeline Execution (+ Basics of Characterizing a Rendering Workload) Visual Computing Systems Today Finishing up from last time Brief discussion of graphics workload metrics

More information

7/29/2010 Beyond Programmable Shading Course, ACM SIGGRAPH 2010

7/29/2010 Beyond Programmable Shading Course, ACM SIGGRAPH 2010 7/29/2010 Beyond Programmable Shading Course, ACM SIGGRAPH 2010 1 Looking Back, Looking Forward, Why and How is Interactive Rendering Changing Mike Houston AMD Welcome to Beyond Programmable Shading! Beyond

More information

Rendering Grass with Instancing in DirectX* 10

Rendering Grass with Instancing in DirectX* 10 Rendering Grass with Instancing in DirectX* 10 By Anu Kalra Because of the geometric complexity, rendering realistic grass in real-time is difficult, especially on consumer graphics hardware. This article

More information

The Rasterization Pipeline

The Rasterization Pipeline Lecture 5: The Rasterization Pipeline (and its implementation on GPUs) Computer Graphics CMU 15-462/15-662, Fall 2015 What you know how to do (at this point in the course) y y z x (w, h) z x Position objects

More information

Parallelizing Graphics Pipeline Execution (+ Basics of Characterizing a Rendering Workload)

Parallelizing Graphics Pipeline Execution (+ Basics of Characterizing a Rendering Workload) Lecture 2: Parallelizing Graphics Pipeline Execution (+ Basics of Characterizing a Rendering Workload) Visual Computing Systems Analyzing a 3D Graphics Workload Where is most of the work done? Memory Vertex

More information

Next-Generation Graphics on Larrabee. Tim Foley Intel Corp

Next-Generation Graphics on Larrabee. Tim Foley Intel Corp Next-Generation Graphics on Larrabee Tim Foley Intel Corp Motivation The killer app for GPGPU is graphics We ve seen Abstract models for parallel programming How those models map efficiently to Larrabee

More information

GPU Memory Model. Adapted from:

GPU Memory Model. Adapted from: GPU Memory Model Adapted from: Aaron Lefohn University of California, Davis With updates from slides by Suresh Venkatasubramanian, University of Pennsylvania Updates performed by Gary J. Katz, University

More information

8/5/2012. Introduction. Transparency. Anti-Aliasing. Applications. Conclusions. Introduction

8/5/2012. Introduction. Transparency. Anti-Aliasing. Applications. Conclusions. Introduction Introduction Transparency effects and applications Anti-Aliasing impact in the final image Why combine Transparency with Anti-Aliasing? Marilena Maule João Comba Rafael Torchelsen Rui Bastos UFRGS UFRGS

More information

Order Independent Transparency with Dual Depth Peeling. Louis Bavoil, Kevin Myers

Order Independent Transparency with Dual Depth Peeling. Louis Bavoil, Kevin Myers Order Independent Transparency with Dual Depth Peeling Louis Bavoil, Kevin Myers Document Change History Version Date Responsible Reason for Change 1.0 February 9 2008 Louis Bavoil Initial release Abstract

More information

Parallel Programming for Graphics

Parallel Programming for Graphics Beyond Programmable Shading Course ACM SIGGRAPH 2010 Parallel Programming for Graphics Aaron Lefohn Advanced Rendering Technology (ART) Intel What s In This Talk? Overview of parallel programming models

More information

On-the-fly Vertex Reuse for Massively-Parallel Software Geometry Processing

On-the-fly Vertex Reuse for Massively-Parallel Software Geometry Processing 2018 On-the-fly for Massively-Parallel Software Geometry Processing Bernhard Kerbl Wolfgang Tatzgern Elena Ivanchenko Dieter Schmalstieg Markus Steinberger 5 4 3 4 2 5 6 7 6 3 1 2 0 1 0, 0,1,7, 7,1,2,

More information

GPU Architecture and Function. Michael Foster and Ian Frasch

GPU Architecture and Function. Michael Foster and Ian Frasch GPU Architecture and Function Michael Foster and Ian Frasch Overview What is a GPU? How is a GPU different from a CPU? The graphics pipeline History of the GPU GPU architecture Optimizations GPU performance

More information

Real-Time Hair Rendering on the GPU NVIDIA

Real-Time Hair Rendering on the GPU NVIDIA Real-Time Hair Rendering on the GPU Sarah Tariq NVIDIA Motivation Academia and the movie industry have been simulating and rendering impressive and realistic hair for a long time We have demonstrated realistic

More information

What s New with GPGPU?

What s New with GPGPU? What s New with GPGPU? John Owens Assistant Professor, Electrical and Computer Engineering Institute for Data Analysis and Visualization University of California, Davis Microprocessor Scaling is Slowing

More information

NVIDIA Parallel Nsight. Jeff Kiel

NVIDIA Parallel Nsight. Jeff Kiel NVIDIA Parallel Nsight Jeff Kiel Agenda: NVIDIA Parallel Nsight Programmable GPU Development Presenting Parallel Nsight Demo Questions/Feedback Programmable GPU Development More programmability = more

More information

Soft Particles. Tristan Lorach

Soft Particles. Tristan Lorach Soft Particles Tristan Lorach tlorach@nvidia.com January 2007 Document Change History Version Date Responsible Reason for Change 1 01/17/07 Tristan Lorach Initial release January 2007 ii Abstract Before:

More information

Scan Primitives for GPU Computing

Scan Primitives for GPU Computing Scan Primitives for GPU Computing Shubho Sengupta, Mark Harris *, Yao Zhang, John Owens University of California Davis, *NVIDIA Corporation Motivation Raw compute power and bandwidth of GPUs increasing

More information

Direct3D 11 Performance Tips & Tricks

Direct3D 11 Performance Tips & Tricks Direct3D 11 Performance Tips & Tricks Holger Gruen Cem Cebenoyan AMD ISV Relations NVIDIA ISV Relations Agenda Introduction Shader Model 5 Resources and Resource Views Multithreading Miscellaneous Q&A

More information

COMPUTER GRAPHICS COURSE. Rendering Pipelines

COMPUTER GRAPHICS COURSE. Rendering Pipelines COMPUTER GRAPHICS COURSE Rendering Pipelines Georgios Papaioannou - 2014 A Rendering Pipeline Rendering or Graphics Pipeline is the sequence of steps that we use to create the final image Many graphics/rendering

More information

Spring 2009 Prof. Hyesoon Kim

Spring 2009 Prof. Hyesoon Kim Spring 2009 Prof. Hyesoon Kim Application Geometry Rasterizer CPU Each stage cane be also pipelined The slowest of the pipeline stage determines the rendering speed. Frames per second (fps) Executes on

More information

CS427 Multicore Architecture and Parallel Computing

CS427 Multicore Architecture and Parallel Computing CS427 Multicore Architecture and Parallel Computing Lecture 6 GPU Architecture Li Jiang 2014/10/9 1 GPU Scaling A quiet revolution and potential build-up Calculation: 936 GFLOPS vs. 102 GFLOPS Memory Bandwidth:

More information

Spring 2011 Prof. Hyesoon Kim

Spring 2011 Prof. Hyesoon Kim Spring 2011 Prof. Hyesoon Kim Application Geometry Rasterizer CPU Each stage cane be also pipelined The slowest of the pipeline stage determines the rendering speed. Frames per second (fps) Executes on

More information

Real - Time Rendering. Graphics pipeline. Michal Červeňanský Juraj Starinský

Real - Time Rendering. Graphics pipeline. Michal Červeňanský Juraj Starinský Real - Time Rendering Graphics pipeline Michal Červeňanský Juraj Starinský Overview History of Graphics HW Rendering pipeline Shaders Debugging 2 History of Graphics HW First generation Second generation

More information

Volume Graphics Introduction

Volume Graphics Introduction High-Quality Volume Graphics on Consumer PC Hardware Volume Graphics Introduction Joe Kniss Gordon Kindlmann Markus Hadwiger Christof Rezk-Salama Rüdiger Westermann Motivation (1) Motivation (2) Scientific

More information

Volumetric Particle Shadows. Simon Green

Volumetric Particle Shadows. Simon Green Volumetric Particle Shadows Simon Green Abstract This paper describes an easy to implement, high performance method for adding volumetric shadowing to particle systems. It only requires a single 2D shadow

More information

Fast BVH Construction on GPUs

Fast BVH Construction on GPUs Fast BVH Construction on GPUs Published in EUROGRAGHICS, (2009) C. Lauterbach, M. Garland, S. Sengupta, D. Luebke, D. Manocha University of North Carolina at Chapel Hill NVIDIA University of California

More information

Rasterization Overview

Rasterization Overview Rendering Overview The process of generating an image given a virtual camera objects light sources Various techniques rasterization (topic of this course) raytracing (topic of the course Advanced Computer

More information

Multi-View Soft Shadows. Louis Bavoil

Multi-View Soft Shadows. Louis Bavoil Multi-View Soft Shadows Louis Bavoil lbavoil@nvidia.com Document Change History Version Date Responsible Reason for Change 1.0 March 16, 2011 Louis Bavoil Initial release Overview The Multi-View Soft Shadows

More information

2.11 Particle Systems

2.11 Particle Systems 2.11 Particle Systems 320491: Advanced Graphics - Chapter 2 152 Particle Systems Lagrangian method not mesh-based set of particles to model time-dependent phenomena such as snow fire smoke 320491: Advanced

More information

Chapter 1 Introduction

Chapter 1 Introduction Graphics & Visualization Chapter 1 Introduction Graphics & Visualization: Principles & Algorithms Brief History Milestones in the history of computer graphics: 2 Brief History (2) CPU Vs GPU 3 Applications

More information

Comparing Reyes and OpenGL on a Stream Architecture

Comparing Reyes and OpenGL on a Stream Architecture Comparing Reyes and OpenGL on a Stream Architecture John D. Owens Brucek Khailany Brian Towles William J. Dally Computer Systems Laboratory Stanford University Motivation Frame from Quake III Arena id

More information

GPU Memory Model Overview

GPU Memory Model Overview GPU Memory Model Overview John Owens University of California, Davis Department of Electrical and Computer Engineering Institute for Data Analysis and Visualization SciDAC Institute for Ultrascale Visualization

More information

AGGREGATE G-BUFFER ANTI-ALIASING

AGGREGATE G-BUFFER ANTI-ALIASING AGGREGATE G-BUFFER ANTI-ALIASING Cyril Crassin 1, Morgan McGuire 1,2, Kayvon Fatahalian 3, Aaron Lefohn 1 1 NVIDIA 2 Williams College 3 Carnegie Mellon University Motivation Pixel The Mummy [ Universal

More information

Accelerating CFD with Graphics Hardware

Accelerating CFD with Graphics Hardware Accelerating CFD with Graphics Hardware Graham Pullan (Whittle Laboratory, Cambridge University) 16 March 2009 Today Motivation CPUs and GPUs Programming NVIDIA GPUs with CUDA Application to turbomachinery

More information

CSE 167: Introduction to Computer Graphics Lecture #5: Rasterization. Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2015

CSE 167: Introduction to Computer Graphics Lecture #5: Rasterization. Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2015 CSE 167: Introduction to Computer Graphics Lecture #5: Rasterization Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2015 Announcements Project 2 due tomorrow at 2pm Grading window

More information

A Data-Parallel Genealogy: The GPU Family Tree

A Data-Parallel Genealogy: The GPU Family Tree A Data-Parallel Genealogy: The GPU Family Tree Department of Electrical and Computer Engineering Institute for Data Analysis and Visualization University of California, Davis Outline Moore s Law brings

More information

A SIMD-efficient 14 Instruction Shader Program for High-Throughput Microtriangle Rasterization

A SIMD-efficient 14 Instruction Shader Program for High-Throughput Microtriangle Rasterization A SIMD-efficient 14 Instruction Shader Program for High-Throughput Microtriangle Rasterization Jordi Roca Victor Moya Carlos Gonzalez Vicente Escandell Albert Murciego Agustin Fernandez, Computer Architecture

More information

The Application Stage. The Game Loop, Resource Management and Renderer Design

The Application Stage. The Game Loop, Resource Management and Renderer Design 1 The Application Stage The Game Loop, Resource Management and Renderer Design Application Stage Responsibilities 2 Set up the rendering pipeline Resource Management 3D meshes Textures etc. Prepare data

More information

General-Purpose Computation on Graphics Hardware

General-Purpose Computation on Graphics Hardware General-Purpose Computation on Graphics Hardware Welcome & Overview David Luebke NVIDIA Introduction The GPU on commodity video cards has evolved into an extremely flexible and powerful processor Programmability

More information

Scheduling the Graphics Pipeline on a GPU

Scheduling the Graphics Pipeline on a GPU Lecture 20: Scheduling the Graphics Pipeline on a GPU Visual Computing Systems Today Real-time 3D graphics workload metrics Scheduling the graphics pipeline on a modern GPU Quick aside: tessellation Triangle

More information

REYES REYES REYES. Goals of REYES. REYES Design Principles

REYES REYES REYES. Goals of REYES. REYES Design Principles You might be surprised to know that most frames of all Pixar s films and shorts do not use a global illumination model for rendering! Instead, they use Renders Everything You Ever Saw Developed by Pixar

More information

MSAA- Based Coarse Shading

MSAA- Based Coarse Shading MSAA- Based Coarse Shading for Power- Efficient Rendering on High Pixel- Density Displays Pavlos Mavridis Georgios Papaioannou Department of Informatics, Athens University of Economics & Business Motivation

More information

FRUSTUM-TRACED RASTER SHADOWS: REVISITING IRREGULAR Z-BUFFERS

FRUSTUM-TRACED RASTER SHADOWS: REVISITING IRREGULAR Z-BUFFERS FRUSTUM-TRACED RASTER SHADOWS: REVISITING IRREGULAR Z-BUFFERS Chris Wyman, Rama Hoetzlein, Aaron Lefohn 2015 Symposium on Interactive 3D Graphics & Games CONTRIBUTIONS Full scene, fully dynamic alias-free

More information

3D Rendering Pipeline

3D Rendering Pipeline 3D Rendering Pipeline Reference: Real-Time Rendering 3 rd Edition Chapters 2 4 OpenGL SuperBible 6 th Edition Overview Rendering Pipeline Modern CG Inside a Desktop Architecture Shaders Tool Stage Asset

More information

General Algorithm Primitives

General Algorithm Primitives General Algorithm Primitives Department of Electrical and Computer Engineering Institute for Data Analysis and Visualization University of California, Davis Topics Two fundamental algorithms! Sorting Sorting

More information

Shaders. Slide credit to Prof. Zwicker

Shaders. Slide credit to Prof. Zwicker Shaders Slide credit to Prof. Zwicker 2 Today Shader programming 3 Complete model Blinn model with several light sources i diffuse specular ambient How is this implemented on the graphics processor (GPU)?

More information

Graphics Performance Optimisation. John Spitzer Director of European Developer Technology

Graphics Performance Optimisation. John Spitzer Director of European Developer Technology Graphics Performance Optimisation John Spitzer Director of European Developer Technology Overview Understand the stages of the graphics pipeline Cherchez la bottleneck Once found, either eliminate or balance

More information

CS 354R: Computer Game Technology

CS 354R: Computer Game Technology CS 354R: Computer Game Technology Texture and Environment Maps Fall 2018 Texture Mapping Problem: colors, normals, etc. are only specified at vertices How do we add detail between vertices without incurring

More information

CSE 167: Lecture #5: Rasterization. Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2012

CSE 167: Lecture #5: Rasterization. Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2012 CSE 167: Introduction to Computer Graphics Lecture #5: Rasterization Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2012 Announcements Homework project #2 due this Friday, October

More information

Graphics Processing Unit Architecture (GPU Arch)

Graphics Processing Unit Architecture (GPU Arch) Graphics Processing Unit Architecture (GPU Arch) With a focus on NVIDIA GeForce 6800 GPU 1 What is a GPU From Wikipedia : A specialized processor efficient at manipulating and displaying computer graphics

More information

Transparency with Deferred Shading

Transparency with Deferred Shading Transparency with Deferred Shading A protoype for rendering transparency with a Deferred Shader using Alpha-Blending CHRISTIAN MAGNERFELT Bachelor s Thesis at NADA Supervisor: Mårten Björkman Stockholm,

More information

A Sampling of CUDA Libraries Michael Garland

A Sampling of CUDA Libraries Michael Garland A Sampling of CUDA Libraries Michael Garland NVIDIA Research CUBLAS Implementation of BLAS (Basic Linear Algebra Subprograms) on top of CUDA driver Self-contained at the API level, no direct interaction

More information

Chapter IV Fragment Processing and Output Merging. 3D Graphics for Game Programming

Chapter IV Fragment Processing and Output Merging. 3D Graphics for Game Programming Chapter IV Fragment Processing and Output Merging Fragment Processing The per-fragment attributes may include a normal vector, a set of texture coordinates, a set of color values, a depth, etc. Using these

More information

Lecture 12: Advanced Rendering

Lecture 12: Advanced Rendering Lecture 12: Advanced Rendering CSE 40166 Computer Graphics Peter Bui University of Notre Dame, IN, USA November 30, 2010 Limitations of OpenGL Pipeline Rendering Good Fast, real-time graphics rendering.

More information

Efficient Stream Reduction on the GPU

Efficient Stream Reduction on the GPU Efficient Stream Reduction on the GPU David Roger Grenoble University Email: droger@inrialpes.fr Ulf Assarsson Chalmers University of Technology Email: uffe@chalmers.se Nicolas Holzschuch Cornell University

More information

A Data-Parallel Genealogy: The GPU Family Tree. John Owens University of California, Davis

A Data-Parallel Genealogy: The GPU Family Tree. John Owens University of California, Davis A Data-Parallel Genealogy: The GPU Family Tree John Owens University of California, Davis Outline Moore s Law brings opportunity Gains in performance and capabilities. What has 20+ years of development

More information

CSE 167: Introduction to Computer Graphics Lecture #18: More Effects. Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2016

CSE 167: Introduction to Computer Graphics Lecture #18: More Effects. Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2016 CSE 167: Introduction to Computer Graphics Lecture #18: More Effects Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2016 Announcements TA evaluations CAPE Final project blog

More information

Real-Time Reyes-Style Adaptive Surface Subdivision

Real-Time Reyes-Style Adaptive Surface Subdivision Real-Time Reyes-Style Adaptive Surface Subdivision Anjul Patney University of California, Davis John D. Owens University of California, Davis Figure 1: Flat-shaded OpenGL renderings of Reyes-subdivided

More information

Scanline Rendering 2 1/42

Scanline Rendering 2 1/42 Scanline Rendering 2 1/42 Review 1. Set up a Camera the viewing frustum has near and far clipping planes 2. Create some Geometry made out of triangles 3. Place the geometry in the scene using Transforms

More information

GPUs and GPGPUs. Greg Blanton John T. Lubia

GPUs and GPGPUs. Greg Blanton John T. Lubia GPUs and GPGPUs Greg Blanton John T. Lubia PROCESSOR ARCHITECTURAL ROADMAP Design CPU Optimized for sequential performance ILP increasingly difficult to extract from instruction stream Control hardware

More information

Efficient and Scalable Shading for Many Lights

Efficient and Scalable Shading for Many Lights Efficient and Scalable Shading for Many Lights 1. GPU Overview 2. Shading recap 3. Forward Shading 4. Deferred Shading 5. Tiled Deferred Shading 6. And more! First GPU Shaders Unified Shaders CUDA OpenCL

More information

Reyes Rendering on the GPU

Reyes Rendering on the GPU Reyes Rendering on the GPU Martin Sattlecker Graz University of Technology Markus Steinberger Graz University of Technology Abstract In this paper we investigate the possibility of real-time Reyes rendering

More information

Rendering. Converting a 3D scene to a 2D image. Camera. Light. Rendering. View Plane

Rendering. Converting a 3D scene to a 2D image. Camera. Light. Rendering. View Plane Rendering Pipeline Rendering Converting a 3D scene to a 2D image Rendering Light Camera 3D Model View Plane Rendering Converting a 3D scene to a 2D image Basic rendering tasks: Modeling: creating the world

More information

Memory-efficient Adaptive Subdivision for Software Rendering on the GPU

Memory-efficient Adaptive Subdivision for Software Rendering on the GPU Memory-efficient Adaptive Subdivision for Software Rendering on the GPU Marshall Plan Scholarship End Report Thomas Weber Vienna University of Technology April 30, 2014 Abstract The adaptive subdivision

More information

Filtering theory: Battling Aliasing with Antialiasing. Tomas Akenine-Möller Department of Computer Engineering Chalmers University of Technology

Filtering theory: Battling Aliasing with Antialiasing. Tomas Akenine-Möller Department of Computer Engineering Chalmers University of Technology Filtering theory: Battling Aliasing with Antialiasing Tomas Akenine-Möller Department of Computer Engineering Chalmers University of Technology 1 What is aliasing? 2 Why care at all? l Quality!! l Example:

More information

Spring 2010 Prof. Hyesoon Kim. AMD presentations from Richard Huddy and Michael Doggett

Spring 2010 Prof. Hyesoon Kim. AMD presentations from Richard Huddy and Michael Doggett Spring 2010 Prof. Hyesoon Kim AMD presentations from Richard Huddy and Michael Doggett Radeon 2900 2600 2400 Stream Processors 320 120 40 SIMDs 4 3 2 Pipelines 16 8 4 Texture Units 16 8 4 Render Backens

More information
