A Bandwidth Effective Rendering Scheme for 3D Texture-based Volume Visualization on GPU

Similar documents
A Bandwidth Reduction Scheme for 3D Texture-Based Volume Rendering on Commodity Graphics Hardware

Hardware Accelerated Volume Visualization. Leonid I. Dimitrov & Milos Sramek GMI Austrian Academy of Sciences

Structure. Woo-Chan Park, Kil-Whan Lee, Seung-Gi Lee, Moon-Hee Choi, Won-Jong Lee, Cheol-Ho Jeong, Byung-Uck Kim, Woo-Nam Jung,

CS427 Multicore Architecture and Parallel Computing

Hardware-driven Visibility Culling Jeong Hyun Kim

X. GPU Programming. Jacobs University Visualization and Computer Graphics Lab : Advanced Graphics - Chapter X 1

Graphics Hardware. Instructor Stephen J. Guy

Enhancing Traditional Rasterization Graphics with Ray Tracing. October 2015

CS GPU and GPGPU Programming Lecture 2: Introduction; GPU Architecture 1. Markus Hadwiger, KAUST

Real - Time Rendering. Graphics pipeline. Michal Červeňanský Juraj Starinský

Evolution of GPUs Chris Seitz

COMP 4801 Final Year Project. Ray Tracing for Computer Graphics. Final Project Report FYP Runjing Liu. Advised by. Dr. L.Y.

GPU-based Volume Rendering. Michal Červeňanský

Graphics Hardware. Graphics Processing Unit (GPU) is a Subsidiary hardware. With massively multi-threaded many-core. Dedicated to 2D and 3D graphics

Graphics Processing Unit Architecture (GPU Arch)

Programming Graphics Hardware

Volume Graphics Introduction

CSE 591: GPU Programming. Introduction. Entertainment Graphics: Virtual Realism for the Masses. Computer games need to have: Klaus Mueller

Direct Rendering of Trimmed NURBS Surfaces

Architectures. Michael Doggett Department of Computer Science Lund University 2009 Tomas Akenine-Möller and Michael Doggett 1

Ray Casting of Trimmed NURBS Surfaces on the GPU

Optimizing DirectX Graphics. Richard Huddy European Developer Relations Manager

Lecture 2. Shaders, GLSL and GPGPU

Rendering. Converting a 3D scene to a 2D image. Camera. Light. Rendering. View Plane

Optimizing for DirectX Graphics. Richard Huddy European Developer Relations Manager

Next-Generation Graphics on Larrabee. Tim Foley Intel Corp

A New Bandwidth Reduction Method for Distributed Rendering Systems

Real-Time Ray Tracing Using Nvidia Optix Holger Ludvigsen & Anne C. Elster 2010

The Application Stage. The Game Loop, Resource Management and Renderer Design

Enhancing Traditional Rasterization Graphics with Ray Tracing. March 2015

Real - Time Rendering. Pipeline optimization. Michal Červeňanský Juraj Starinský

Deferred Rendering Due: Wednesday November 15 at 10pm

CS230 : Computer Graphics Lecture 4. Tamar Shinar Computer Science & Engineering UC Riverside

RSX Best Practices. Mark Cerny, Cerny Games David Simpson, Naughty Dog Jon Olick, Naughty Dog

Parallelizing Graphics Pipeline Execution (+ Basics of Characterizing a Rendering Workload)

Spring 2009 Prof. Hyesoon Kim

lecture 18 - ray tracing - environment mapping - refraction

CSE 167: Introduction to Computer Graphics Lecture #5: Rasterization. Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2015

An Efficient Approach for Emphasizing Regions of Interest in Ray-Casting based Volume Rendering

DiFi: Distance Fields - Fast Computation Using Graphics Hardware

2.11 Particle Systems

Parallelizing Graphics Pipeline Execution (+ Basics of Characterizing a Rendering Workload)

The Rasterization Pipeline

Hardware-driven visibility culling

Real-Time Rendering (Echtzeitgraphik) Michael Wimmer

CSE 591/392: GPU Programming. Introduction. Klaus Mueller. Computer Science Department Stony Brook University

Overview. Pipeline implementation I. Overview. Required Tasks. Preliminaries Clipping. Hidden Surface removal

Enabling immersive gaming experiences Intro to Ray Tracing

3/1/2010. Acceleration Techniques V1.2. Goals. Overview. Based on slides from Celine Loscos (v1.0)

Graphics Hardware, Graphics APIs, and Computation on GPUs. Mark Segal

Tutorial on GPU Programming #2. Joong-Youn Lee Supercomputing Center, KISTI

The Ultimate Developers Toolkit. Jonathan Zarge Dan Ginsburg

Performance Analysis and Culling Algorithms

CS451Real-time Rendering Pipeline

Shadows. COMP 575/770 Spring 2013

L10 Layered Depth Normal Images. Introduction Related Work Structured Point Representation Boolean Operations Conclusion

Spring 2011 Prof. Hyesoon Kim

CHAPTER 1 Graphics Systems and Models 3

Programmable GPUs. Real Time Graphics 11/13/2013. Nalu 2004 (NVIDIA Corporation) GeForce 6. Virtua Fighter 1995 (SEGA Corporation) NV1

Row Tracing with Hierarchical Occlusion Maps

Scanline Rendering 2 1/42

CSCI 402: Computer Architectures. Parallel Processors (2) Fengguang Song Department of Computer & Information Science IUPUI.

First Steps in Hardware Two-Level Volume Rendering

GeForce4. John Montrym Henry Moreton

Advanced GPU Raycasting

Clipping. Angel and Shreiner: Interactive Computer Graphics 7E Addison-Wesley 2015

Mali-400 MP: A Scalable GPU for Mobile Devices Tom Olson

Introduction to Shaders.

Optimisation. CS7GV3 Real-time Rendering

Performance OpenGL Programming (for whatever reason)

Ray Casting on Programmable Graphics Hardware. Martin Kraus PURPL group, Purdue University

Optimizing Games for ATI s IMAGEON Aaftab Munshi. 3D Architect ATI Research

Dynamic Ambient Occlusion and Indirect Lighting. Michael Bunnell NVIDIA Corporation

A fixed-point 3D graphics library with energy-efficient efficient cache architecture for mobile multimedia system

Bringing it all together: The challenge in delivering a complete graphics system architecture. Chris Porthouse

CS 354R: Computer Game Technology

Spring 2009 Prof. Hyesoon Kim

Hidden surface removal. Computer Graphics

Working with Metal Overview

CS 130 Exam I. Fall 2015

Could you make the XNA functions yourself?

FROM VERTICES TO FRAGMENTS. Lecture 5 Comp3080 Computer Graphics HKBU

Lecture 6: Texture. Kayvon Fatahalian CMU : Graphics and Imaging Architectures (Fall 2011)

Multiresolution model generation of. texture-geometry for the real-time rendering 1

TSBK03 Screen-Space Ambient Occlusion

Programmable Shaders for Deformation Rendering

Windowing System on a 3D Pipeline. February 2005

Course Recap + 3D Graphics on Mobile GPUs

Fast Stereoscopic Rendering on Mobile Ray Tracing GPU for Virtual Reality Applications

Getting fancy with texture mapping (Part 2) CS559 Spring Apr 2017

Models and Architectures

graphics pipeline computer graphics graphics pipeline 2009 fabio pellacini 1

GPU Computation Strategies & Tricks. Ian Buck NVIDIA

Wed, October 12, 2011

graphics pipeline computer graphics graphics pipeline 2009 fabio pellacini 1

Hardware-Assisted Relief Texture Mapping

Real-Time Buffer Compression. Michael Doggett Department of Computer Science Lund university

Programmable GPUS. Last Time? Reading for Today. Homework 4. Planar Shadows Projective Texture Shadows Shadow Maps Shadow Volumes

POWERVR MBX. Technology Overview

Comparison of High-Speed Ray Casting on GPU

Transcription:

for 3D Texture-based Volume Visualization on GPU Won-Jong Lee, Tack-Don Han Media System Laboratory (http://msl.yonsei.ac.k) Dept. of Computer Science, Yonsei University, Seoul, Korea

Contents Background & Proposed Octree-based Bandwidth Effective Volume Rendering al Results

Why Volume Rendering? Overcome limitation of polygon-based rendering Volume rendering generate photorealistic and nonphotorealistic scene 3D Texture VR Limitation Polygon Dataset Polygonal Rendering Rendering Volume Dataset Volume Rendering Real Object

Application#1 Visualization 3D Texture VR Limitation material sciences geology medical visualization

Application#2 - Entertainment atmospheric effects 3D Texture VR Limitation explosions, FX game

3D Texture-base Volume Rendering Texture Based Volume-Rendering Low cost & high quality method 3D Texture VR Limitation X-ray like shading(1996) Shading with OpenGL extensions(1998) Shading with Register Combiner & Multi-Texture Units(1999) Classification with Multi-dimensional T/F(2001) GPU ray-casting (2003) Real-time decompression & rendering (2003) NPR for VR (2003) Point-based VR (2004)

3D Texture-base Volume Rendering Typical GPU-based volume rendering flow 3D Texture VR Limitation Gradient Computation Shaders and T/F Creation Main Original Volume Data CPU Disk Vertex & Texture Coordinates Volume, Gradients, T/F, Graphic Pre-TnL Texture Pixel GPU Vertex Shader Post-TnL Triangle Setup And Rasterization Pixel Shader Raster Operation Coordinates Transform Classification, 3D Texture Mapping, Lighting Blending

3D Texture-base Volume Rendering Typical GPU-based volume rendering flow 3D Texture VR Limitation Gradient Computation Shaders and T/F Creation Main Original Volume Data CPU Disk Vertex & Texture Coordinates Volume, Gradients, T/F, Graphic Pre-TnL Texture Pixel GPU Vertex Shader Post-TnL Triangle Setup And Rasterization Pixel Shader Raster Operation Coordinates Transform Classification, 3D Texture Mapping, Lighting Blending

3D Texture-base Volume Rendering Typical GPU-based volume rendering flow 3D Texture VR Limitation Gradient Computation Shaders and T/F Creation Main Original Volume Data CPU Disk Vertex & Texture Coordinates Volume, Gradients, T/F, Graphic Pre-TnL Texture Pixel GPU Vertex Shader Post-TnL Triangle Setup And Rasterization Pixel Shader Raster Operation Coordinates Transform Classification, 3D Texture Mapping, Lighting Blending

3D Texture-base Volume Rendering Typical GPU-based volume rendering flow 3D Texture VR Limitation Gradient Computation Shaders and T/F Creation Main Original Volume Data CPU Disk Vertex & Texture Coordinates Volume, Gradients, T/F, Graphic Pre-TnL Texture Pixel GPU Vertex Shader Post-TnL Triangle Setup And Rasterization Pixel Shader Raster Operation Coordinates Transform Classification, 3D Texture Mapping, Lighting Blending

3D Texture-base Volume Rendering Typical GPU-based volume rendering flow 3D Texture VR Limitation Gradient Computation Shaders and T/F Creation Main Original Volume Data CPU Disk Vertex & Texture Coordinates Volume, Gradients, T/F, Graphic Pre-TnL Texture Pixel GPU Vertex Shader Post-TnL Triangle Setup And Rasterization Pixel Shader Raster Operation Coordinates Transform Classification, 3D Texture Mapping, Lighting Blending

3D Texture-base Volume Rendering Typical GPU-based volume rendering flow 3D Texture VR Limitation Gradient Computation Shaders and T/F Creation Main Original Volume Data CPU Disk Vertex & Texture Coordinates Volume, Gradients, T/F, Graphic Pre-TnL Texture Pixel GPU Vertex Shader Post-TnL Triangle Setup And Rasterization Pixel Shader Raster Operation Coordinates Transform Classification, 3D Texture Mapping, Lighting Blending

Limitation of 3D Texture Mapping The size of volume data sets can be processed is still limited usually, 64-256MB graphic memory, middle sized(512 3 ) 8 16bit volume data, 256MB 512MB Gradient Computation Shaders and T/F Creation Main Original Volume Data CPU Disk Vertex & Texture Coordinates Volume, Gradients, T/F, Graphic Pre-TnL Texture Pixel GPU Vertex Shader Post-TnL Triangle Setup And Rasterization Pixel Shader Raster Operation 3D Texture VR Limitation Coordinates Transform Classification, 3D Texture Mapping, Lighting Blending

Limitation of 3D Texture Mapping Bottleneck in graphic memory bus lots of texture & pixel traffic (interpolation, α-blending) Lower localities in texture mapping & blending operation 3D Texture VR Limitation 1, request Pixel Gradient Computation Shader2,3,4 Hit! Shaders and T/F Creation 5,6,7,8, Miss!! Texture Threshing! 1,2,3,4 load u Main Graphic Original Volume Data CPU Disk t 2 3 6 1 7 4 5 8 Vertex & Texture Coordinates Volume, Gradients, T/F, s Graphic Pre-TnL Texture Pixel GPU Vertex Shader Post-TnL Triangle Setup And Rasterization Pixel Shader Raster Operation Coordinates Transform Classification, 3D Texture Mapping, Lighting Blending

Limitation of 3D Texture Mapping Bottleneck in graphic memory bus lots of texture & pixel traffic (interpolation, α- blending) Lower localities in texture mapping & blending operation GPU Blending Unit Gradient Computation Pixel Shaders and T/F Creation Hit! Main Miss!! Original Volume Data CPU Disk Graphic Vertex & Texture Coordinates Volume, Gradients, T/F, Graphic Pre-TnL Texture Pixel Vertex Shader Post-TnL Triangle Setup And Rasterization Pixel Shader Raster Operation 3D Texture VR Limitation Coordinates Transform Classification, 3D Texture Mapping, Lighting Blending

Limitation of 3D Texture Mapping Brute-forced approach classical optimization technique cannot be utilized (empty-space skipping, early-ray termination) 3D Texture VR Limitation unable to skip space Gradient Computation Shaders and T/F Creation Main Original Volume Data CPU Disk Vertex & Texture Coordinates Volume, Gradients, T/F, Graphic Pre-TnL Texture Pixel GPU Vertex Shader Post-TnL Triangle Setup And Rasterization Pixel Shader Raster Operation Coordinates Transform Classification, 3D Texture Mapping, Lighting Blending

Contributions of This Research Propose an sub-division based rendering algorithm for GPU bandwidth-effectiveness Enabling empty space skipping for optimized rendering Utilizing modern GPU features for low-cost texture coordinate computation Build-up for effective H/W simulation environments for GPU based rendering

I.Boada, I.Navazo, and R.Scopigno. Multiresolution Volume Visualization with a Texture-Based Octree. The Visual Computer, 17(3), 185-197, 2001. VR on GPU Multi-resolution Empty Space Skipping AMR tree level (a) level (b) level (c)

W. Li, K. Mueller, and A. Kaufman, Empty Space Skipping and Occlusion Clipping for Texture-Based Volume Rendering, In Proc. of IEEE Visualization 2003, pages 317-324, 2003. VR on GPU Multi-resolution Empty Space Skipping AMR tree partitioned by the growing boxes converted into a BSP tree Rendering with mixed boxes and texture

R. Kauhler, M. Simon, and H. C. Hege, Interactive Volume Rendering of Large Sparse Data Sets Using Adaptive Mesh Refinement Hierarchies, IEEE Transactions on Visualization and Computer Graphics, VOL. 9, NO. 3, JULY-SEPTEMBER 2003, pages 341-351. VR on GPU Multi-resolution Empty Space Skipping AMR tree Standard w/ Octree w/ AMR tree

Proposed Method

Proposed Octree-based V/R Octree based Empty Space Skipping Octree Hierarchy Rendering unable to skip empty space Sub-division enabling empty empty space space checking skipping standard method proposed method

Proposed Octree-based V/R Efficiency Octree Hierarchy Rendering standard method texture cache pixel cache proposed method texture cache pixel cache HIT HIT for each sub-volume MISS only cold MISS on switching sub-volume

Proposed Octree-based V/R Rendering Octree in GPU Utilizing the property of regular sized sub-volumes Easy to detect visibility order of each sub-volume Octree Hierarchy Rendering Incremental texture coordinates computations in GPU

al Results

Benchmarks Environment Analysis Texture Pixel Total Bandwidth Frame Rates Lobster (255x255x56) Engine (255x255x110) Foot (255x255x255) CTdata (255x255x255)

GPU Simulation Environment #1 CPU-based software simulation For evaluating cache efficiency, memory bandwidth compile with Cg compiler (cgc) Cg vertex & fragment shader NV_vertex_program NV_fragment_program GL function & shader assembly call Mesa 6.0 (fully compatible with OpenGL 1.5) Environment Analysis Texture Pixel Total Bandwidth Frame Rates C++ OpenGL application Dinero III scene cache simulation results Make comparison with standard and proposed method Under perfectly same cache configuration fully associative cache, 32Bytes block, 1~128KBytes cache 512x512 screen size, 10 frame rendered

GPU Simulation Environment #2 GPU-based hardware experiment For evaluating rendering speedup Cg vertex & fragment shader compile with Cg compiler (cgc) GL function call Environment Analysis Texture Pixel Total Bandwidth Frame Rates NV_vertex_program NV_fragment_program Standard OpenGL library (opengl32.dll) C++ OpenGL application GPU scene & frame rates result load Shader assembly to GPU Make comparison with standard and proposed method Under perfectly same condition Pentium IV 2.80GHz PC with 512MB RDRAM & AGP8x bus Nvidia GeForce FX 5900 with 256MB graphic memory

Trade-Off Analysis Average number of vertices per frame (K). 160 140 120 100 80 60 40 20 0 lobster engine foot ctdata 256^3 128^3 64^3 32^3 16^3 8^3 The size of a sub-volume Non-empty region ratio. 1.2 1 0.8 0.6 0.4 0.2 0 lobster engine foot ctdata 256^3 128^3 64^3 32^3 16^3 8^3 The size of a sub-volume Environment Analysis Texture Pixel Total Bandwidth Frame Rates Average # of Vertices per Frame Non-Empty Region Ratio

Miss Rates Texture 20% NSD OSD(64^3) OSD(32^3) OSD(16^3) 20% NSD OSD(64^3) OSD(32^3) OSD(16^3) Miss Rate(%) 18% 16% 14% 12% 10% 8% 6% 4% 2% Miss Rate(%) 18% 16% 14% 12% 10% 8% 6% 4% 2% Environment Analysis Texture Pixel Total Bandwidth Frame Rates 0% 1 2 4 8 16 32 64 128 Texture Size(KBytes) Lobster (255x255x56) 0% 1 2 4 8 16 32 64 128 Texture Size(KBytes) Engine (255x255x110) NSD OSD(64^3) OSD(32^3) OSD(16^3) NSD OSD(64^3) OSD(32^3) OSD(16^3) 20% 20% 18% 18% 16% 16% 14% 14% Miss Rates(%) 12% 10% 8% Miss Rates(%) 12% 10% 8% 6% 6% 4% 4% 2% 2% 0% 1 2 4 8 16 32 64 128 Texture Size(KBytes) Foot (255x255x255) 0% 1 2 4 8 16 32 64 128 Texture Size(KBytes) CTdata (255x255x255) NSD : Non-Subdivided OSD : Octree Subdivided (number) : size of sub-volume

Miss Rates Pixel (Color) Miss Rate (%) NSD OSD(64^3) OSD(32^3) OSD(16^3) 5.5% 5.0% 4.5% 4.0% 3.5% 3.0% 2.5% 2.0% 1.5% 1.0% 0.5% Miss Rate (%) NSD OSD(64^3) OSD(32^3) OSD(16^3) 5.0% 4.5% 4.0% 3.5% 3.0% 2.5% 2.0% 1.5% 1.0% 0.5% Environment Analysis Texture Pixel Total Bandwidth Frame Rates 0.0% 1 2 4 8 16 32 64 128 0.0% 1 2 4 8 16 32 64 128 Color Size (KB) Lobster (255x255x56) Color Size (KB) Engine (255x255x110) 5.5% 5.0% 4.5% NSD OSD(64^3) OSD(32^3) OSD(16^3) 5.0% 4.5% 4.0% NSD OSD(64^3) OSD(32^3) OSD(16^3) 4.0% 3.5% Miss Rate (%) 3.5% 3.0% 2.5% 2.0% 1.5% 1.0% 0.5% 0.0% 1 2 4 8 16 32 64 128 Color Size (KB) Foot (255x255x255) Miss Rate (%) 3.0% 2.5% 2.0% 1.5% 1.0% 0.5% 0.0% 1 2 4 8 16 32 64 128 Color Size (KB) CTdata (255x255x255) NSD : Non-Subdivided OSD : Octree Subdivided (number) : size of sub-volume

Miss Rates Pixel (Depth) 10% NSD OSD(64^3) OSD(32^3) OSD(16^3) 10% NSD OSD(64^3) OSD(32^3) OSD(16^3) Miss Rate (%) 9% 8% 7% 6% 5% 4% 3% 2% 1% Miss Rate (%) 9% 8% 7% 6% 5% 4% 3% 2% 1% Environment Analysis Texture Pixel Total Bandwidth Frame Rates 0% 1 2 4 8 16 32 64 128 0% 1 2 4 8 16 32 64 128 Depth Size (KB) Lobster (255x255x56) Depth Size (KB) Engine (255x255x110) NSD OSD(64^3) OSD(32^3) OSD(16^3) NSD OSD(64^3) OSD(32^3) OSD(16^3) 10% 10% 9% 9% 8% 8% 7% 7% Miss Rate (%) 6% 5% 4% Miss Rate (%) 6% 5% 4% 3% 3% 2% 2% 1% 1% 0% 1 2 4 8 16 32 64 128 Depth Size (KB) Foot (255x255x255) 0% 1 2 4 8 16 32 64 128 Depth Size (KB) CTdata (255x255x255) NSD : Non-Subdivided OSD : Octree Subdivided (number) : size of sub-volume

Total Bandwidth Comparison #1 for 10 frames NSD : Non Sub-Divided standard method, OSD : Octree-based Sub-Divided method Environment Analysis Texture Pixel Total Bandwidth Frame Rates At best case, 29x reduced! Even worst case, 8Mbytes increased!

Total Bandwidth Comparison #2 for 10 frames Environment Analysis Texture Pixel Total Bandwidth Frame Rates

Rendering Performance Environment Analysis Texture Pixel Total Bandwidth Frame Rates NSD : Non Sub-Divided standard method, OSD : Octree-based Sub-Divided method

& Future work Octree based Bandwidth-Effective Volume Rendering Maximize the temporal & spacial locality of memory access Enable the empty space skipping 2~29x bandwidth reduction 2~6x faster rendering Application Volumetric Effects