Visualization and VR for the Grid Chris Johnson Scientific Computing and Imaging Institute University of Utah
Computational Science Pipeline
- Construct a model of the physical domain (mesh generation, CAD)
- Apply boundary conditions
- Numerically approximate the governing equations (FE, FD, BE)
- Compute (preconditioners, solvers)
- Visualize (isosurfaces, vector fields, volume rendering)
Think about the BIG PICTURE!
Computational Science Pipeline: think about pipeline complexity! Modeling, simulation, visualization: integration!
Interactive Large-Scale Visualization: Medical, Scientific Computing, GeoScience
An HPV Success Story
Why Parallel Visualization? Terabyte P4
Why Parallel Visualization? Large amounts of data; compute-intensive rendering; need interactivity!
Visualization Algorithm Classification. The visualization rendering pipeline: geometry database, geometry transformation (per vertex), rasterization (per pixel), image.
Parallel Visualization Taxonomy [Molnar et al. '94]
- Sort-First: redistribute raw 3D primitives across nodes before geometry processing (G); each node then rasterizes (R) and displays its region
- Sort-Middle: redistribute transformed 2D primitives between geometry processing and rasterization
- Sort-Last: each node renders its share of the database, then redistributes rendered pixels for compositing and display
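To make the sort-last case concrete, here is a minimal sketch of depth compositing the full-size partial images that independent renderers produce. The function name and image representation are illustrative, not from the talk:

```python
# Sort-last compositing sketch: each renderer produces a full-size partial
# image with a depth buffer; the compositor keeps, per pixel, the fragment
# closest to the viewer. Uncovered pixels stay None (background).

def z_composite(partials):
    """partials: list of (color, depth) image pairs, each a 2D list of equal size."""
    h = len(partials[0][0])
    w = len(partials[0][0][0])
    out_color = [[None] * w for _ in range(h)]
    out_depth = [[float("inf")] * w for _ in range(h)]
    for color, depth in partials:
        for y in range(h):
            for x in range(w):
                if depth[y][x] < out_depth[y][x]:
                    out_depth[y][x] = depth[y][x]
                    out_color[y][x] = color[y][x]
    return out_color
```

For example, compositing a 1x2 image where one renderer wins the left pixel and the other wins the right yields `z_composite([a, b]) == [["red", "blue"]]`.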
Parallel Visualization 1995 CM-5 Sort Last Molecular Dynamics Visualization
Parallel Visualization - 1996 T3D NOISE Sort-Middle 10 FPS HIPPI-FB
Parallel Visualization 1998-2002: Origin 2000/3000. New paradigm: NUMA DSM, multiple hardware graphics pipes.
Real-Time Ray Tracer (RTRT): implemented on the SGI Origin 3000 ccNUMA architecture, up to 512 processors (now working on a distributed version). Approximately linear speedup; load balancing and memory coherence are key to performance.
Real-Time Ray Tracer
Real-Time Ray Tracer - Scalability
Molecular Simulation and Visualization: protein molecules. Klaus Schulten, Theoretical Biophysics Group, UIUC.
Real-time Surface Rendering
Real-time Volume Rendering
Hardware Multipipe: partition the image among the InfiniteReality pipes (IR1-IR4) along a Hilbert curve.
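The Hilbert-curve partition can be sketched with the classic distance-to-coordinate mapping; the tile-to-pipe assignment policy below is my own illustration of how contiguous curve segments give each pipe a compact image region:

```python
def hilbert_d2xy(order, d):
    """Map distance d along a Hilbert curve covering a 2^order x 2^order grid
    of image tiles to tile coordinates (x, y). Standard iterative algorithm."""
    x = y = 0
    t = d
    s = 1
    n = 1 << order
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:                       # rotate the quadrant
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y

def assign_tiles(order, n_pipes):
    """Give each pipe a contiguous run of the curve: neighbors on the curve
    are neighbors in the image, so each pipe's tiles form a compact region."""
    n_tiles = (1 << order) ** 2
    per_pipe = n_tiles // n_pipes
    return {p: [hilbert_d2xy(order, d)
                for d in range(p * per_pipe, (p + 1) * per_pipe)]
            for p in range(n_pipes)}
```

Consecutive curve positions are always adjacent tiles, which is exactly the locality property the partition exploits for balanced, coherent work per pipe.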
Multi-pipe Volume Rendering: requires 5 GB/sec of disk I/O bandwidth for a 3-5 Hz interaction rate, impedance-matched to the pipes' texture download speed; 96 fibre channel interfaces, 6 per pipe, stream from disk through RAM to the IR pipes. (TRex: LANL/Utah)
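The bandwidth requirement is straightforward arithmetic; in this sketch the per-time-step size and the per-link fibre channel rate are assumptions for illustration, not figures from the slide:

```python
def required_io_gbps(step_size_gb, rate_hz):
    """Disk bandwidth needed to stream one time step per rendered frame."""
    return step_size_gb * rate_hz

def aggregate_fc_gbps(n_interfaces, mb_per_s_each=100.0):
    """Aggregate fibre channel bandwidth, assuming ~1 Gb/s links (~100 MB/s each)."""
    return n_interfaces * mb_per_s_each / 1000.0

# An assumed ~1 GB time step at 5 Hz needs 5 GB/s of sustained disk I/O;
# 96 fibre channel interfaces would cover that with headroom.
print(required_io_gbps(1.0, 5.0))   # 5.0
print(aggregate_fc_gbps(96))        # 9.6
```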
TRex: Multipipe Volume Rendering
Hardware Multipipe Results: average frames per second for a 3 GB volume rendered to a 512x512 image, plotted against the number of tiles (1, 2, 4, 8) for 1, 2, 4, and 8 IR pipes (Avg FPS axis: 0-25).
Overview of Parallel Visualization Techniques: plotted on two axes, rendering features (up to global illumination) versus scalability/interactivity, the techniques so far are multipipe rendering, polygon rendering, and RTRT.
What's Next? Cluster and Grid Visualization. Why? Current high-end visualization hardware is expensive; current high-end hardware is not flexible. But the real reason is: clusters and the Grid are cool, hot, fashionable, the in thing.
Cluster and Grid Visualization. So, why are there ZERO papers on visualization in CCGrid? Maybe visualization isn't important? 66% of your brain is for visualization. Visualization on clusters and grids is HARD!
Current Grid Based Visualization Non-interactive Visualization
Grid Visualization?
SGI RealityMonster "Mural Server" (8-processor SGI Origin): client applications send the OpenGL protocol over the network to per-client threads; a master thread coordinates pipe threads, each driving InfiniteReality graphics and a bank of projectors. G. Humphreys, P. Hanrahan, "A distributed graphics system for large tiled displays," Vis99
PC RealityMonster "Mural Server" (PC cluster version): client applications send the OpenGL protocol over the network to per-client threads and a master thread, which forward work to pipe servers, each with its own graphics hardware and projector. G. Humphreys, I. Buck, P. Hanrahan, "Distributed rendering for large tiled displays," Supercomputing 2000
Visualization on Clusters: in the sort-first pipeline, parallelize the geometry stage (G) across nodes after the 3D primitives are distributed, parallelize rasterization (R), and composite for display with custom hardware.
Stanford Cluster: multi-projector-based sort-first parallel rendering; custom DVI compositing network, similar in spirit to Sepia (Compaq); WireGL software.
Chromium Cr (Cluster Renderer): http://sourceforge.net/projects/chromium/
Cluster: 32 graphics nodes, 4 server nodes
Computer: Compaq SP750, 2 processors (800 MHz, 133 MHz FSB), i840 core logic (the biggest issue for vis clusters), 256 MB memory, 18 GB disk (+3x36 GB on servers), 64-bit 66 MHz PCI, AGP-4X Pro
Graphics: NVIDIA GeForce3
Network: Myrinet, 64-bit, 66 MHz
Princeton - Cluster Commodity Clustering
UT Austin Cluster Compaq Cluster Multi-Projector Custom Compositing Network
Utah Visualization Cluster: 64 CPUs (1.7 GHz P4, 1 GB memory each), NVIDIA GeForce 3, switch-connected GigE network, heavy-duty cooling racks.
Cluster Visualization. Visualization clusters (basically multipipe rendering): Compaq, Princeton, Stanford, UT Austin, Cal Tech, etc. Both sort-first and sort-last.
Adding Clusters to the Picture: on the same rendering-features versus scalability/interactivity axes, clusters join multipipe rendering, polygon rendering, and RTRT.
Grid Visualization Service Approaches: do everything on the remote server. Pros: simple to grid-enable existing packages. Cons: high latency; eliminates the possibility of proxying and caching at the local site. (Very high-end remote server, moderate-bandwidth link, local machine.)
Grid Visualization Service Approaches: do everything but the rendering on the remote server. Pros: fairly simple to grid-enable existing packages; removes some load from the remote site. Cons: requires every local site to have good rendering power. (High-end remote server, moderate-bandwidth link, computer with good rendering power.)
Grid Visualization Service Approaches: use a local proxy for the rendering. Pros: offloads work from the remote site; allows local sites to contribute additional resources. Cons: local sites may not have sufficiently powerful proxy resources; the application is more complex; requires high bandwidth between local and remote sites. (High-end remote server, high-bandwidth link to a powerful local proxy server, moderate-bandwidth link to a low-end local PC.)
Grid Visualization Service Approaches: do everything at the local site. Pros: low latency; easy to grid-enable existing packages. Cons: requires an ultrahigh-bandwidth link between sites and powerful compute and graphics resources at the local site. (Low-end remote server, ultrahigh-bandwidth link, powerful local server.)
Grid Visualization Needs Assertion: static application partitions are not appropriate for heavyweight grid services Need a uniform framework for grid-enabling and partitioning heavyweight services such as remote visualization.
Challenges to Providing Heavyweight Grid Services
- Grid visualization resources are limited: we must find an easy way to grid-enable existing packages.
- Grid resources are wildly heterogeneous with respect to visualization (graphics, displays, I/O, plus the usual): programs should be performance-portable (sense resources at load time) and configure themselves in some optimal way.
- Grid resources are dynamic: programs need to be adaptive, dynamically adjusting to changing resources, which might require a user decision (e.g., accuracy versus frame rate).
Applications that provide heavyweight grid services must be resource-aware.
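The resource-awareness requirement can be sketched as a load-time decision among the service partitions discussed above; the probes and thresholds here are hypothetical, purely to illustrate the shape of such a policy:

```python
def choose_partition(link_mbps, local_gfx_score, proxy_available):
    """Pick a service partition from sensed resources (thresholds are made up).

    link_mbps        -- measured bandwidth to the remote server
    local_gfx_score  -- 0..1 estimate of local rendering power
    proxy_available  -- whether a powerful local proxy can be reserved
    """
    if link_mbps > 1000 and local_gfx_score > 0.8:
        return "everything-local"          # ultrahigh-bandwidth link, strong site
    if proxy_available and link_mbps > 500:
        return "local-proxy-renders"       # offload rendering to a nearby proxy
    if local_gfx_score > 0.5:
        return "remote-compute-local-render"
    return "everything-remote"             # fall back to the simple case
```

A real resource-aware application would re-evaluate this choice as conditions change, rather than deciding once at load time.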
Active Frames - CMU: Dv is a flexible framework for scheduling and partitioning heavyweight services, based on the notion of an active frame. An active frame server receives an input active frame (frame data plus a frame program), runs it through the active frame interpreter against application libraries (e.g., vtk), and sends the output active frame on to the next host. David O'Hallaron, CMU
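The active-frame idea can be sketched as data traveling together with the program for its next processing stage; the class and stage names below are hypothetical illustrations, not the Dv API:

```python
# Sketch of an "active frame": the frame carries both data and the program
# that processes it, so any server along the path can run the next stage.

class ActiveFrame:
    def __init__(self, data, program):
        self.data = data          # e.g., a slab of volume data
        self.program = program    # callable: data -> (new_data, next_program)

def frame_server(frame):
    """Interpret an incoming frame and emit the output frame."""
    new_data, next_program = frame.program(frame.data)
    return ActiveFrame(new_data, next_program)

# A toy two-stage pipeline: extract an isosurface stub, then "render" it.
def render(data):
    return (f"image({data})", None)

def isosurface(data):
    return (f"iso({data})", render)

out = frame_server(ActiveFrame("volume", isosurface))  # could run on host A
out = frame_server(out)                                # could run on host B
```

Because each hop carries its own program, the scheduler is free to place the stages on whichever hosts currently have the resources, which is the partitioning flexibility the slide describes.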
Grid-Enabling vtk with Dv: the local Dv client sends a request frame [request server, scheduler, flowgraph, data reader] to the remote machine; the request server's reader and scheduler send response frames [application data, scheduler, flowgraph, control] back to the local Dv client and on to other Dv servers. David O'Hallaron, CMU
Cactus Ed Seidel
Grid Visualization Algorithms Compression (wavelets, multiresolution, mpeg, etc.) Importance Based Rendering View Dependent Image Based Rendering Adaptive (Resource Aware) Parallel/Distributed Techniques
Point-Cloud Rendering - ISI Thiebaux/Czajkowski/Kesselman - ISI
Volume Browsing Path - ISI: parallel I/O, reduction, sorting, transport (TCP/IP receive buffer), rendering. A volume time series (3D + time scalar) passes through a parallel search/sort that converts it to OpenGL geometry, with double-buffered frame reassembly. Thiebaux/Czajkowski/Kesselman - ISI
The Visualization Pipeline: reduce the amount of data during the search. Offline, preprocess the volume and construct a search structure; online, given the view point and isovalue, search and render at a cost near the output size O(k) rather than O(V(k)) over the whole volume.
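A minimal sketch of the offline/online split: offline, record each cell's scalar range; online, return only the cells whose range spans the isovalue. A real system would use a span-space or similar structure to avoid the linear scan; this filter just shows the output-sensitive idea, and all names are illustrative:

```python
def preprocess(cells):
    """Offline: store each cell with its scalar range (min, max, cell id)."""
    return [(min(vals), max(vals), cid) for cid, vals in cells]

def search(index, isovalue):
    """Online: return only the cells whose range spans the isovalue."""
    return [cid for lo, hi, cid in index if lo <= isovalue <= hi]
```

For example, `search(preprocess([("a", [0, 1]), ("b", [2, 5]), ("c", [4, 9])]), 4.5)` returns `["b", "c"]`: only the two cells crossed by the isosurface reach the renderer.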
A View-dependent Approach Attractive for: Grid visualization Large data sets Sub-pixel polygons
View Dependent - Visible Woman:
          Full Isosurface   View-Dependent
Polys     2,246,000         246,000
Create    177 sec           72 sec
Render    2.32 sec          0.25 sec
Parallel Image-Based Rendering: cost scales with the size of the reconstructed image (a per-pixel cost) rather than with modeling complexity (polygon count, lighting complexity).
Parallel IBR Part of Distributed Visualization
Immersive Interactive Visualization
Multiple Fields - Haptics
Immersive CFD Visualization
Future for Grid Visualization - New Algorithms: on the rendering-features versus scalability/interactivity axes, grid visualization draws together global illumination, multipipe rendering, adaptive/aware techniques, importance-based rendering, IBR methods, clusters, polygon rendering, view-dependent methods, multiresolution, and RTRT.
Future of Grid Visualization? GAMES!!!!! 16-way parallel, 2 GB RAMBUS, 1G polys per sec, 50 GB/sec memory bus, 1920x1080 resolution.
Future of Grid Visualization?
Acknowledgements DOE ASCI and SciDAC NSF NIH NCRR and BISTI SGI Visual Supercomputing Center Visual Influence
Scientific Computing and Imaging
More Information www.sci.utah.edu crj@sci.utah.edu