Tracking Graphics State for Network Rendering

Similar documents
Multi-Graphics. Multi-Graphics Project. Scalable Graphics using Commodity Graphics Systems

A New Bandwidth Reduction Method for Distributed Rendering Systems

Parallelizing Graphics Pipeline Execution (+ Basics of Characterizing a Rendering Workload)

Parallelizing Graphics Pipeline Execution (+ Basics of Characterizing a Rendering Workload)

A New Bandwidth Reduction Method for Distributed Rendering System

Chromium. Petteri Kontio 51074C

Scheduling the Graphics Pipeline on a GPU

The F-Buffer: A Rasterization-Order FIFO Buffer for Multi-Pass Rendering. Bill Mark and Kekoa Proudfoot. Stanford University

Visualization and VR for the Grid

Appendix B - Unpublished papers

DESIGNING GRAPHICS ARCHITECTURES AROUND SCALABILITY AND COMMUNICATION

CS GPU and GPGPU Programming Lecture 7: Shading and Compute APIs 1. Markus Hadwiger, KAUST

Simplifying Applications Use of Wall-Sized Tiled Displays. Espen Skjelnes Johnsen, Otto J. Anshus, John Markus Bjørndalen, Tore Larsen

A Data-Parallel Genealogy: The GPU Family Tree. John Owens University of California, Davis

Real-Time Graphics Architecture. Kurt Akeley Pat Hanrahan. Ray Tracing.

Real-Time Graphics Architecture

Real-Time Graphics Architecture

A Data-Parallel Genealogy: The GPU Family Tree

Using a Camera to Capture and Correct Spatial Photometric Variation in Multi-Projector Displays

Methodology for Lecture. Importance of Lighting. Outline. Shading Models. Brief primer on Color. Foundations of Computer Graphics (Spring 2010)

GPU Architecture and Function. Michael Foster and Ian Frasch

Achieving Real Time Ray Tracing Effects Through GPGPU Acceleration

GPU Computation Strategies & Tricks. Ian Buck NVIDIA

A Survey and Performance Analysis of Software Platforms for Interactive Cluster-Based Multi-Screen Rendering

Graphics and Imaging Architectures

CS427 Multicore Architecture and Parallel Computing

Mobile HW and Bandwidth

Graphics Hardware. Graphics Processing Unit (GPU) is a Subsidiary hardware. With massively multi-threaded many-core. Dedicated to 2D and 3D graphics

Real-Time Graphics Architecture. Kurt Akeley Pat Hanrahan. Texture

Course Recap + 3D Graphics on Mobile GPUs

Retained Mode Parallel Rendering for Scalable Tiled Displays

GeForce4. John Montrym Henry Moreton

3D Computer Games Technology and History. Markus Hadwiger VRVis Research Center

Programmable GPUS. Last Time? Reading for Today. Homework 4. Planar Shadows Projective Texture Shadows Shadow Maps Shadow Volumes

Mattan Erez. The University of Texas at Austin

Optimized Visualization for Tiled Displays

GPGPU on Mobile Devices

CSE 167: Introduction to Computer Graphics Lecture #5: Rasterization. Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2015

Automatic Tuning Matrix Multiplication Performance on Graphics Hardware

CSE 591/392: GPU Programming. Introduction. Klaus Mueller. Computer Science Department Stony Brook University

Spring 2009 Prof. Hyesoon Kim

frame buffer depth buffer stencil buffer

Spring 2011 Prof. Hyesoon Kim

Threading Hardware in G80

Reducing State Changes with a Pipeline Buffer

Graphics Architectures and OpenCL. Michael Doggett Department of Computer Science Lund university

Windowing System on a 3D Pipeline. February 2005

Silicon Graphics Prism TM : A Platform for Scalable Graphics

Lecture 10: Part 1: Discussion of SIMT Abstraction Part 2: Introduction to Shading

Lecture 4: Geometry Processing. Kayvon Fatahalian CMU : Graphics and Imaging Architectures (Fall 2011)

Exposing Application Graphics to a Dynamic Heterogeneous Network

A Real-Time Procedural Shading System for Programmable Graphics Hardware

CS GPU and GPGPU Programming Lecture 8+9: GPU Architecture 7+8. Markus Hadwiger, KAUST

GPU-Based Volume Rendering of. Unstructured Grids. João L. D. Comba. Fábio F. Bernardon UFRGS

Comparing Reyes and OpenGL on a Stream Architecture

Models of the Impact of Overlap in Bucket Rendering

CSM Scrolling. An acceleration technique for the rendering of cascaded shadow maps

Effects needed for Realism. Ray Tracing. Ray Tracing: History. Outline. Foundations of Computer Graphics (Spring 2012)

The parallelized perspective shear-warp algorithm for volume rendering

Portland State University ECE 588/688. Graphics Processors

A GPU Accelerated Spring Mass System for Surgical Simulation

CSE 591: GPU Programming. Introduction. Entertainment Graphics: Virtual Realism for the Masses. Computer games need to have: Klaus Mueller

Ray Casting on Programmable Graphics Hardware. Martin Kraus PURPL group, Purdue University

Standard Graphics Pipeline

CSE 167: Introduction to Computer Graphics Lecture #11: Visibility Culling

Today. Rendering algorithms. Rendering algorithms. Images. Images. Rendering Algorithms. Course overview Organization Introduction to ray tracing

CSE 167: Introduction to Computer Graphics Lecture #18: More Effects. Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2016

Graphics Processing Unit Architecture (GPU Arch)

Hardware Shading: State-of-the-Art and Future Challenges

Real Time Ray Tracing

CS8803SC Software and Hardware Cooperative Computing GPGPU. Prof. Hyesoon Kim School of Computer Science Georgia Institute of Technology

Intro to Ray-Tracing & Ray-Surface Acceleration

Next-Generation Graphics on Larrabee. Tim Foley Intel Corp

NVIDIA nforce 790i SLI Chipsets

A Reconfigurable Architecture for Load-Balanced Rendering

Parallel Volume Rendering Using PC Graphics Hardware

NVIDIA nfinitefx Engine: Programmable Pixel Shaders

The GPGPU Programming Model

Cornell University CS 569: Interactive Computer Graphics. Introduction. Lecture 1. [John C. Stone, UIUC] NASA. University of Calgary

Current Trends in Computer Graphics Hardware

An Approach on Hardware Design for Computationally Intensive Image Processing Applications based on Light Field Refocusing Algorithm

FROM VERTICES TO FRAGMENTS. Lecture 5 Comp3080 Computer Graphics HKBU

CMPE 665:Multiple Processor Systems CUDA-AWARE MPI VIGNESH GOVINDARAJULU KOTHANDAPANI RANJITH MURUGESAN

Parallel Architecture. Sathish Vadhiyar

CS 428: Fall Introduction to. OpenGL primer. Andrew Nealen, Rutgers, /13/2010 1

Hardware-driven visibility culling

The Source for GPU Programming

Shader Programming 1. Examples. Vertex displacement mapping. Daniel Wesslén 1. Post-processing, animated procedural textures

PowerVR Hardware. Architecture Overview for Developers

GPU Memory Model Overview

Abstract. Introduction. Kevin Todisco

Network Aware Parallel Rendering with PCs

Shading Languages for Graphics Hardware

CSE 167: Introduction to Computer Graphics Lecture #10: View Frustum Culling

Hardware Accelerated Volume Visualization. Leonid I. Dimitrov & Milos Sramek GMI Austrian Academy of Sciences

Mali Demos: Behind the Pixels. Stacy Smith

SLICING THE WORKLOAD MULTI-GPU OPENGL RENDERING APPROACHES

Accelerating Molecular Modeling Applications with Graphics Processors

CS450/550. Pipeline Architecture. Adapted From: Angel and Shreiner: Interactive Computer Graphics6E Addison-Wesley 2012

Managing Deformable Objects in Cluster Rendering

Transcription:

Tracking Graphics State for Network Rendering Ian Buck Greg Humphreys Pat Hanrahan Computer Science Department Stanford University

Distributed Graphics How to manage distributed graphics applications, renderers, and displays? Network

Virtual Graphics Virtualize the graphics output Serial input to parallel graphics assumes single large resource Virtual

Virtual Graphics Virtualize the graphics destination. Driver manages shared resource. assumes owns graphics. Virtual Virtual

Virtual Graphics Tiled Rendering Single application rendering to many outputs Parallel Rendering Many applications rendering to a single output Previous Work Window Systems X11 SunRay Visualization Servers GLR GLX

Tiled Rendering Network Minimize network traffic Sort first geometry commands Broadcast state commands?

Tiled Rendering Network Lazy State Update Issue minimal state commands to sync render

Parallel Rendering Hardware context switching too slow.17 ms / switch NVIDIA GeForce 32 streams, 60 fps = 30% Frame Time

Parallel Rendering Software context switch Generate state commands for switch Single hardware context

Cluster Rendering Network

Overview Data structure for generating context comparisons. Tiled Rendering Lazy State Updates Parallel Rendering Soft Context Switching WireGL OpenGL driver for cluster rendering.

Context Data Structure Challenge: Generate state commands of context differences. Direct comparison too slow. Acceleration data structure: Track difference information during execution Quick search for comparison Context A Context B

Context Data Structure Hierarchical dirty bits Indicate which elements need comparison Texture Lighting Fragment ShadeModel Light0 Position Enable

Context Data Structure Hierarchical dirty bits Context A: Context B: glenable(gl_light0) Texture Lighting Fragment ShadeModel Light0 Position Enable

Context Diff Context Data Structure Context A Context B Texture Lighting Fragment ShadeModel Light0 gldisable(gl_light0) Position Enable

Context Data Structure State command invalidates all other contexts Wide dirty bit vector Lighting Single write invalidates all contexts

Tiled Rendering Geometry Bucketing Track object space bounding box Transform object box to screen space Send geometry commands to outputs which overlap screen space extent

Tiled Rendering Lazy State Update Defer sending Custom state commands for each render

Lazy State Update Virtual Context Matrix: I M 1 Render Contexts Matrix: I M 1 glloadmatrix(m 1 ) Matrix: I Load transform state M 1 Render Geometry

Lazy State Update Virtual Context Matrix: M 12 Render Contexts Matrix: M 1 Matrix: I M 2 glloadmatrix(m 2 ) Load transform state M 1 Render Geometry Load transform state M 2 Render Geometry

Tiled Rendering Results Marching Cubes.25 Framerate.125 Volume Rendering 1.5 Mtri Surface 1024x768 Outputs 8x4 = 25 Mpixel display 1x1 2x1 2x2 4x2 4x4 8x4 WireGL Broadcast

Tiled Rendering Results Quake III 48 Framerate 24 Quake III OpenGL State Intensive Fine Granularity 8x4 dominated by overlap 1x1 2x1 2x2 4x2 4x4 8x4 WireGL Broadcast

Parallel Rendering Requires fast context switching between streams

Soft Context Switching Generate State Commands Context compare operation to generate state commands Benefits Prevent hardware pipeline flushes Switch time dependent on context differences Matrix: M 1 glloadmatrix(m 2 ) Matrix: M 2

Soft Context Switching Results: Varying current color and transformation state. Context switches per second: SGI Infinite Reality SGI Cobalt NVIDIA GeForce WireGL 697 2,101 5,968 191,699

Conclusions State tracking heiarchical dirty bit Allows for fast context comparison operations Enables Virtual Graphics Tiled Rendering Parallel Rendering WireGL http://graphics.stanford.edu/software/wiregl

Acknowledgements Matthew Eldridge Chris Niederauer Department of Energy contract B504665

Tracking Graphics State for Network Rendering Ian Buck Greg Humphreys Pat Hanrahan Computer Science Department Stanford University