Part IV. Review of hardware-trends for real-time ray tracing

Similar documents
Massive Model Visualization using Real-time Ray Tracing

B-KD Trees for Hardware Accelerated Ray Tracing of Dynamic Scenes

Real-Time Ray Tracing Using Nvidia Optix Holger Ludvigsen & Anne C. Elster 2010

Ray Tracing. Computer Graphics CMU /15-662, Fall 2016

Ray Tracing on the Cell Processor

Visual Computing Meets AI:

Estimating Performance of a Ray-Tracing ASIC Design

Blue-Steel Ray Tracer

Interactive Ray Tracing: Higher Memory Coherence

Fast Stereoscopic Rendering on Mobile Ray Tracing GPU for Virtual Reality Applications

The OpenRT Real-Time Ray-Tracing Project

RPU: A Programmable Ray Processing Unit for Realtime Ray Tracing

Realtime Ray Tracing and its use for Interactive Global Illumination

INFOGR Computer Graphics. J. Bikker - April-July Lecture 11: Acceleration. Welcome!

Xbox 360 Architecture. Lennard Streat Samuel Echefu

INFOMAGR Advanced Graphics. Jacco Bikker - February April Welcome!

and Parallel Algorithms Programming with CUDA, WS09 Waqar Saleem, Jens Müller

COMP 4801 Final Year Project. Ray Tracing for Computer Graphics. Final Project Report FYP Runjing Liu. Advised by. Dr. L.Y.

Sung-Eui Yoon ( 윤성의 )

PantaRay: Fast Ray-traced Occlusion Caching of Massive Scenes J. Pantaleoni, L. Fascione, M. Hill, T. Aila

CS427 Multicore Architecture and Parallel Computing

Parallel Computing: Parallel Architectures Jin, Hai

CONSOLE ARCHITECTURE

Improving Memory Space Efficiency of Kd-tree for Real-time Ray Tracing Byeongjun Choi, Byungjoon Chang, Insung Ihm

Fast BVH Construction on GPUs

Production. Visual Effects. Fluids, RBD, Cloth. 2. Dynamics Simulation. 4. Compositing

V-Ray RT: A New Paradigm in Photorealistic Raytraced Rendering on NVIDIA GPUs. Vladimir Koylazov Chaos Software.

Hardware Accelerated Volume Visualization. Leonid I. Dimitrov & Milos Sramek GMI Austrian Academy of Sciences

Recursion and Data Structures in Computer Graphics. Ray Tracing

Parallel Physically Based Path-tracing and Shading Part 3 of 2. CIS565 Fall 2012 University of Pennsylvania by Yining Karl Li

Review for Ray-tracing Algorithm and Hardware

Embree Ray Tracing Kernels: Overview and New Features

Lecture 11 - GPU Ray Tracing (1)

This Unit: Putting It All Together. CIS 501 Computer Architecture. What is Computer Architecture? Sources

Hardware Implementation of TRaX Architecture

INFOMAGR Advanced Graphics. Jacco Bikker - February April Welcome!

Lecture 6: Texture. Kayvon Fatahalian CMU : Graphics and Imaging Architectures (Fall 2011)

This Unit: Putting It All Together. CIS 371 Computer Organization and Design. What is Computer Architecture? Sources

Real Time Ray Tracing

Anti-aliased and accelerated ray tracing. University of Texas at Austin CS384G - Computer Graphics

Lecture 4 - Real-time Ray Tracing

This Unit: Putting It All Together. CIS 371 Computer Organization and Design. Sources. What is Computer Architecture?

Unit 11: Putting it All Together: Anatomy of the XBox 360 Game Console

Ray Tracing: Whence and Whither?

Acceleration Data Structures

SAH guided spatial split partitioning for fast BVH construction. Per Ganestam and Michael Doggett Lund University

CSE 591: GPU Programming. Introduction. Entertainment Graphics: Virtual Realism for the Masses. Computer games need to have: Klaus Mueller

Windowing System on a 3D Pipeline. February 2005

Cross-Hardware GPGPU implementation of Acceleration Structures for Ray Tracing using OpenCL

Next-Generation Graphics on Larrabee. Tim Foley Intel Corp

Computer Graphics. Lecture 13. Global Illumination 1: Ray Tracing and Radiosity. Taku Komura

Graphics Processing Unit Architecture (GPU Arch)

Architectures. Michael Doggett Department of Computer Science Lund University 2009 Tomas Akenine-Möller and Michael Doggett 1

Ray Tracing with Multi-Core/Shared Memory Systems. Abe Stephens

GPU Computation Strategies & Tricks. Ian Buck NVIDIA

CS8803SC Software and Hardware Cooperative Computing GPGPU. Prof. Hyesoon Kim School of Computer Science Georgia Institute of Technology

Scanline Rendering 2 1/42

ECE 571 Advanced Microprocessor-Based Design Lecture 20

521493S Computer Graphics. Exercise 3

AN ACCELERATION OF FPGA-BASED RAY TRACER

Xbox 360 high-level architecture

Computer Graphics (CS 543) Lecture 13b Ray Tracing (Part 1) Prof Emmanuel Agu. Computer Science Dept. Worcester Polytechnic Institute (WPI)

A Reconfigurable Architecture for Load-Balanced Rendering

Advanced Graphics. Path Tracing and Photon Mapping Part 2. Path Tracing and Photon Mapping

Real-Time Graphics Architecture. Kurt Akeley Pat Hanrahan. Ray Tracing.

From Brook to CUDA. GPU Technology Conference

A Parallel Algorithm for Construction of Uniform Grids

Computer Graphics. - Ray Tracing I - Marcus Magnor Philipp Slusallek. Computer Graphics WS05/06 Ray Tracing I

CSCI 402: Computer Architectures. Parallel Processors (2) Fengguang Song Department of Computer & Information Science IUPUI.

HW Trends and Architectures

Ray Tracing. Kjetil Babington

Interactive Rendering with Coherent Ray Tracing

NVIDIA GTX200: TeraFLOPS Visual Computing. August 26, 2008 John Tynefield

Computer Graphics. Lecture 10. Global Illumination 1: Ray Tracing and Radiosity. Taku Komura 12/03/15

GPU > CPU. FOR HIGH PERFORMANCE COMPUTING PRESENTATION BY - SADIQ PASHA CHETHANA DILIP

Radeon ProRender and Radeon Rays in a Gaming Rendering Workflow. Takahiro Harada, AMD 2017/3

Multimedia in Mobile Phones. Architectures and Trends Lund

INSTITUTO SUPERIOR TÉCNICO. Architectures for Embedded Computing

Ray Tracing. Cornell CS4620/5620 Fall 2012 Lecture Kavita Bala 1 (with previous instructors James/Marschner)

Experts in Application Acceleration Synective Labs AB

Interactive High Resolution Isosurface Ray Tracing on Multi-Core Processors

Effects needed for Realism. Computer Graphics (Fall 2008) Ray Tracing. Ray Tracing: History. Outline

Building a Fast Ray Tracer

Administrivia. HW0 scores, HW1 peer-review assignments out. If you re having Cython trouble with HW2, let us know.

CSE 591/392: GPU Programming. Introduction. Klaus Mueller. Computer Science Department Stony Brook University

Effects needed for Realism. Ray Tracing. Ray Tracing: History. Outline. Foundations of Computer Graphics (Spring 2012)

Photorealistic 3D Rendering for VW in Mobile Devices

Lighting. To do. Course Outline. This Lecture. Continue to work on ray programming assignment Start thinking about final project

RTfact. Concepts for Generic Ray Tracing. Iliyan Georgiev. Computer Graphics Group Saarland University Saarbrücken, Germany

Computer Graphics. - Spatial Index Structures - Philipp Slusallek

Ray Tracing Performance

Cell Broadband Engine. Spencer Dennis Nicholas Barlow

Graphics and Imaging Architectures

Anti-aliased and accelerated ray tracing. University of Texas at Austin CS384G - Computer Graphics Fall 2010 Don Fussell

Spring 2009 Prof. Hyesoon Kim

REAL-TIME GPU PHOTON MAPPING. 1. Introduction

Topic 12: Texture Mapping. Motivation Sources of texture Texture coordinates Bump mapping, mip-mapping & env mapping

Chapter 11 Global Illumination. Part 1 Ray Tracing. Reading: Angel s Interactive Computer Graphics (6 th ed.) Sections 11.1, 11.2, 11.

Enabling immersive gaming experiences Intro to Ray Tracing

Ray tracing. Computer Graphics COMP 770 (236) Spring Instructor: Brandon Lloyd 3/19/07 1

Transcription:

Part IV Review of hardware-trends for real-time ray tracing

Hardware Trends For Real-time Ray Tracing Philipp Slusallek Saarland University, Germany

Large Model Visualization at Boeing CATIA Model of Boeing 777: 350 million triangles, 30 GB on disk, 2-3 fps on Dual-Opteron

VW Visualization Center by intrace GmbH

VW Visualization Center by intrace GmbH

VW Visualization Center by intrace GmbH

Product Visualization at EADS

Ray Traced Games

Real-time Requirements Minimum Number of Rays But 1 megapixel screen 30 frames per second 10 rays per pixel (anti-aliasing, lighting,...) 300 million rays per second Larger screens (2x), higher frame rate (2x) Complex lighting (10x) Need several giga-rays per second

Ray Tracing on Multi-Core Advantages High-performance implementations are available Highly flexible environment Scales nicely with # of cores (~10 Mrays/s per core) Disadvantage Need 30 cores for minimum requirements Not for the mass market any time soon

Ray Tracing on Multi-Core Advantages High-performance implementations are available Highly flexible environment Scales nicely with # of cores (~10 Mrays/s per core) Disadvantage Need 30 cores for minimum requirements Not for the mass market any time soon But high-end systems are becoming available Opteron-System (8 CPUs x Quad-Core) 32 cores

Ray Tracing on Cell Advantages Already 8 compact but powerful cores (SPUs) Highly efficient SIMD instruction set DMA and full control over caches in Local Store C/C++ compiler Disadvantages Still hard to program, non-optimal compilers Needs another programming approach No good, high-level data parallel languages available Complex and costly memory handling

Ray Tracing on 8-core Cell: Performance @ 1024 x1024 Scene Triangles Single-Cell Dual ERW6 800 58.1 110.9 Conference Beetle 280k 20.0 37.3 680k 16.2 30.6

Ray Tracing on GPUs Increasingly Implemented as an Add-On Volume rendering by ray casting [Krüger '03] Displacement mapping [Wang '04] Approximate refractions on GPU [Weyman '05] Screen space caustics [Krüger '06] Ray racing not well supported by GPUs So far, less efficient than CPUs Even though they have higher raw performance Up to 300,000 rays per second

Ray Tracing on GPUs Increasingly Implemented as an Add-On Volume rendering by ray casting [Krüger '03] Displacement mapping [Wang '04] Approximate refractions on GPU [Weyman '05] Screen space caustics [Krüger '06] Ray racing not well supported by GPUs So far, less efficient than CPUs Even though they have higher raw performance Up to 300,000 rays per second

G80: A Many-Core (G)PU

G80: A Many-Core (G)PU

Ray Tracing on GPUs Highly parallel processing 128 cores, 1000s HW-threads Local communication between threads Optimal for packet tracing Enough local memory for a stack (BVH or KD tree) Direct implementation of BVH ray tracer Faster than CPU (~16 million rays per second) Large models: Power Plant (13 Mtris @ 6.4 fps) But: Data management issues

Example: Ray Tracing on GPU Conference: 300,000 tris 6.1 fps Soda Hall: 2 million tris 5.7 fps Power Plant: ~13 million tris 2.9 fps Power Plant: ~13 million tris 1.9 fps All with shadows, 1 light source, and Phong shading 1024 x 1024 resolution on Nvidia G80

How to Trace Billions of Rays per Second Wait for faster GPUs Wait for faster CPUs (Intel s Larrabee?) Design new Hardware architecture Optimized for ray tracing

D-RPU Approach Shading processor Design similar to fragment processors on GPUs Support for full recursion even with SIMD Highly parallel, highly efficient Improved programming model Add highly efficient recursion, conditional branching Add flexible memory access (beyond textures) Custom traversal and intersection hardware High-performance kd-tree traversal & triangle intersection

D-RPU: Dynamic Scenes [GH'06] Bounding KD-Trees (B-KD Trees) Combining the best of two worlds Traversal efficiency of kd-trees Update efficiency of bounding volume hierarchies (BVH) Efficient for coherent motion with fixed topologies Supports general rays Good for empty space Implemented in HW Traversal & update

D-RPU: High-Level Architecture Fully Programmable

D-RPU: Hardware Architecture

Hardware Implementation

D-RPU Implementation Xilinx Virtex-4 LX160 128 MB RAM,.5 GB/s @ 66 MHz 7.5 GFLOP/s @ 24 bit Usage: 99% logic, 60% memory 32 threads per SPU 60% usage Chunk size of 4 95% efficiency 12 kb caches in total 90% hit rate Performance 40-70% faster than OpenRT OpenRT on CPU with 40x clock rate 60x more efficient

D-RPU Implementation D-RPU ASIC Synthesized from HWML With HW evaluation for clock rate Larger caches (3x 16 KB) 4-way associative 130 nm process from UMC: 49 mm2, 266 MHz 30 GFLOP/s @ 32 bit (post-layout timing) 2.1 GB/s required to external memory

Projections ATI R-520: 288 mm² in 90 nm process D-RPU-4: 196 mm², 130 nm 120 GFLOP/s @ 266 MHz (constant field scaling) 8.5 GB/s (DDR2 memory?) D-RPU-8: 186 mm², 90 nm 361 GFLOP/s @ 400 MHz (constant field scaling) 25.6 GB/s (multi-channel DDR-2 or XDR memory) Joint CAS project with IBM

Performance @ 1024 x 768 (shadows, full Phong shading, textures)

Outlook: Hardware for Ray Tracing Symmetric & Asymmetric Multi-Core CPUs Current: ~10 Mrays/s (per core) Future: many cores per chip, SHM High Performance Parallel GPUs Similar performance than CPUs Still some limitations Custom Ray Tracing Hardware Current: 5-9 Mrays/s (FPGA, 66 MHz) Future: >1 Grays/s (ASIC, 400 MHz)

Interested? Questions? Informatik Saarland Computergraphik Ray Tracing Direct Email intrace GmbH http://www.informatik-saarland.de http://graphics.cs.uni-sb.de http://www.openrt.de slusallek@cs.uni-sb.de http://www.intrace.com info@intrace.com