High End Computing Is Bringing Atomistic Simulation To Macroscopic
|
|
- Brendan Barrett
- 5 years ago
- Views:
Transcription
1 Virtualization-aware Application Framework for High-end Classical-quantum Atomistic Simulations of Nanosystems Aiichiro Nakano Collaboratory for Advanced Computing & Simulations Department of Computer Science, Department of Physics & Astronomy, Department of Materials Science & Engineering University of Southern California Collaborators: Rajiv K. Kalia, Ashish Sharma, Priya Vashishta (CACS), Hiroshi Iyetomi (Niigata), Hideaki Kikuchi (Louisiana), Shuji Ogata (Nagoya Inst. Tech.), Fuyuki Shimojo (Kumamoto), Kenji Tsuruta (Okayama) High End Computing Is Bringing Atomistic Simulation To Macroscopic 1.5 billion atom molecular dynamics simulation on 1,024 IBM SP3 processors at NAVO-MSRC 0.3 µm Color code: Si 3 N 4 ; SiC; SiO 2 Computing beyond Teraflop: Grid of distributed supercomputers?
2 Parallel Benchmark Platforms Earth Simulator DoD-NAVO IBM SP4 LSU/Atipa Linux cluster SuperMike Computing Beyond Teraflop: Grid of Globally Distributed PC Clusters? NASA Information Power Grid Grid of globally distributed supercomputers LSU/Atipa Linux cluster SuperMike Commodity Xeon-based multi-teraflop Linux cluster
3 Virtualization-aware Application Framework Atomistic materials simulation methods Molecular Dynamics (MD) Quantum Mechanics (QM) r ij i j r ik k Atom Electron density Virtualized applications Grid Scalability Portable performance Adaptation Data-locality principles Molecular Dynamics: N-Body Problem Newton s equation of motion d 2 r m i i dt 2 = V r N ( ) (i =1,..., N) r i N-body problem O(N 2 ) Long-range electrostatic interaction N q i V es (x) = x = x j (j = 1,...,N) i=1 x x i Application: drug design, robotics, entertainment, etc. A scene from the movie Twister
4 Spatial Locality: Fast Multipole Method 1. Clustering: Encapsulate far-field information using multipoles l N V ( x)= q i r l i Y *m Y l ( θ i,φ i ) m l ( θ,φ) r l+1 (r i,θ i,φ i ) l=0m= l i=1 2. Hierarchical abstraction: Octree data structure 3. O(N) algorithm: Constant number of interactive cells per octree node l = 0 l = 1 (r,θ,φ) delegate compute inherit l = 2 l = 3 Temporal Locality: Multiple Time Stepping Different force-update schedules for different force components i) Reduced computation ii) Enhanced data locality & parallel efficiency Far field: every steps Near field Tertiary: every steps Secondary: every 5-15 steps Reversible symplectic integrator Simulation-loop invariant: phase-space volume Long-time stability T (p N n t,r n t (p N 0,r N 0 ) N ) Primary: every step 0 I ( p N n t,r N n t ) I 0 (p N 0,r N 0 ) = 0 I I 0
5 Oxide Growth in an Al Nanoparticle Unique metal/ceramic nanocomposite Al AlO x 70 Å 110 Å Oxide thickness saturates at 40 Å after 0.5 ns Excellent agreement with experiments Quantum N-Body Problem Challenge: Exponential complexity Density functional theory (DFT) (Kohn, Nobel Chemistry Prize, 98) { ψ n (r) n =1,K,N el } O(C N ) O(N 3 ) ψ(r 1,r 2,K,r N el ) > Pseudopotential (Troullier & Martins, 91) > Generalized gradient approximation (Perdew, 96) Minimize: N E[ { ψ n }]= d 3 * h 2 2 el rψ n (r) 2m e r 2 +V ion (r) ψ n (r) + e2 2 n=1 Constrained minimization problem: with orthonormal constraints: d 3 * rψ m (r)ψ n (r) =δ mn N el Charge density: ρ(r) = ψ n (r) 2 n=1 d 3 rd 3 r ρ(r)ρ( r ) r r + E XC [ ρ(r) ]
6 Real-Space DFT on Hierarchical Grids Efficient parallelization of DFT: real-space approaches High-order finite difference (Chelikowsky, Troullier, Saad, 94) Multigrid acceleration (Bernholc et al., 96; Beck, 00) Double-grid method (Ono, Hirose, 99) Spatial decomposition/divide-&-conquer Multigrid acceleration of preconditioned conjugate gradient Adaptive high-resolution grid for atomic pseudopotentials Quantum-Nearsightedness Locality Principle O(N) DFT algorithm (Mauri & Galli, 94) Asymptotic decay of density matrix: Localized functions: N el ρ(r, r * ) ψ n (r)ψ n ( r ) n=1 ( ) exp C r r ψ n (r) φ m (r) φ m (r) = n ψ n (r)u nm R cut Unconstrained minimization: N wf N wf m=1 n=1 ρ(r, r ) ( ) E [{ φ n }]= d 3 rφ * m (r)(h ηi)φ n (r) 2δ nm d 3 rφ * n (r)φ m (r) + ηn el
7 Data Locality in Parallelization Challenge: Load balancing for irregular data structures Irregular data-structures/ processor-speed Map Parallel computer Optimization problem: Minimize the load-imbalance cost Minimize the communication cost Topology-preserving spatial decomposition structured 6-step message passing minimizes latency E = t comp ( max p {i r i p} )+ t comm max p {i r i p < r c } +t latency max p N message ( p) ( [ ]) ( ) Computational-Space Decomposition Topology-preserving computational-space decomposition in curved space Curvilinear coordinate transformation ξ = x + u(x) Particle-processor mapping: regular 3D mesh topology ( ) p( ξ i )= p x ( ξ ix )P y P z + p y ( ξ iy )P z + p z ξ iz p α ( ξ iα )= ξ iα P α /L α ( α = x, y,z) Regular mesh topology in computational space, ξ Curved partition in physical space, x
8 Wavelet-based Adaptive Load Balancing Simulated annealing to minimize the load-imbalance & communication costs, E[ξ(x)] Wavelet representation speeds up the optimization ξ(x) = x + d lm ψ lm (x) l,m 3000 Total cost Plane wave Wavelet CPU time (min) Locality in Data Compression Massive data transfer via wide area network: 75 GB/step of data for 1.5 billion-atom MD! Compressed software pipeline Scalable encoding: Store relative positions on spacefilling curve: O(NlogN) O(N) Result: Data size, 50 Bytes/atom 6 Bytes/atom
9 Spacefilling-curve Data Compression Algorithm: 1. Sort particles along the spacefilling curve 2. Store relative positions: O(NlogN) O(N) Adaptive variable-length encoding to handle outliers User-controlled error bound Result: An order-of-magnitude reduction of I/O size: 50 6 Bytes/atom Scalable Scientific Algorithm Suite Design-space diagram on 1,024 IBM SP3 & Cray T3E processors On 1,024 IBM SP3 processors: 6.44-billion-atom MD of SiO 2 444,000-electron DFT of GaAs Supercomputing 2001 Best Technical Paper Award
10 Grid Enabling: Multiple QM Clustering system E = E MD + cluster cluster [EQM ({rqm },{r HS }) E MD ({rqm },{r HS })] cluster Divide-&-conquer QM2 MD QM1 QM3 Global Collaborative Simulation Hybrid MD/QM simulation on a Grid of distributed PC clusters in the US & Japan Japan:Yamaguchi 65 P4 2.0GHz Hiroshima, Okayama, Niigata 3 24 P4 1.8GHz US: Louisiana 17 Athlon XP 1900+
11 Preliminary Benchmark Results MD 91,256 atoms QM (DFT) 76n atoms on n clusters Scaled speedup, P = 1 (for MD) + 8n (for QM) Efficiency = 94.0% on 25 processors over 3 PC clusters in the US & Japan Environmental Effect on Fracture Reaction of H 2 O molecules at a Si crack tip MD QM Blue: Oxygen White: Hydrogen Green: Silicon
12 Data Locality in Visualization Octree-based fast view-frustum culling Probabilistic occlusion culling Parallel/distributed processing Reduced data Graphics server W A N D User position PC cluster ImmersaDesk Interactive visualization of a billion-atom dataset in immersive environment Octree-based View-Frustum Culling Use the octree data structure to efficiently select only visible atoms Complexity Insertion into octree: O(N) Data extraction: O(logN)
13 Probabilstic Occlusion Culling Remove atoms that are occluded by other atoms closer to the viewer Draw fewer atoms per region as the distance of a region from the viewer increases Without occlusion With occlusion 68% fewer atoms & 3 times higher frame rate Distributed Architecture RENDERING & VISUALIZATION MODULE RENDERING SYSTEM PER-ATOM OCCLUDER USER POSITION NEAR COMPLETE LIST OF VIEWABLE ATOMS OCTREE BASED DATA EXTRACTION MODULE PROBABALISTIC OCCLUSION CULLING MODULE REGIONS OF INTEREST TCP/IP SOCKET
14 Parallel & Distributed Atomsviewer Real-time walkthrough for a billion atoms on an SGI Onyx2 (2 MIPS R10K, 4GB RAM) connected to a PC cluster (4 800MHz P3) IEEE Virtual Reality 2002 Best Paper 209 Million Atom MD of Hypervelocity Impact AlN plate with impact velocity 15 km/s
15 Application of Multiscale Simulations Oxidation on Si Surface MD FE QM cluster QM O MD Si QM Si Handshake H Interfacing quantum-dot devices & biological cells (e.g. neural implant to restore vision) Computational Science Education Dual-Degree Program Ph.D. in physical/biological sciences & MS from computer science Computational Science Workshop for Underrepresented Groups Intercontinental Courses The Netherlands Delft Univ. T3E Origin SP Alpha cluster USA Louisiana State Univ. VR workbench Video Conferencing Virtual Classroom ImmersaDesk Chat Tool Whiteboard Tool
16 Conclusion Multiscale simulation approach: Can be virtualized on a Grid of distributed PC clusters through data-locality principles Will allow global collaboration of scientists to increase the scope & size of simulation study
Massive Dataset Visualization
Massive Dataset Visualization Aiichiro Nakano Collaboratory for Advanced Computing & Simulations Dept. of Computer Science, Dept. of Physics & Astronomy, Dept. of Chemical Engineering & Materials Science,
More informationGrid Computing: Application to Science
Grid Computing: Application to Science Aiichiro Nakano Collaboratory for Advanced Computing & Simulations Dept. of Computer Science, Dept. of Physics & Astronomy, Dept. of Chemical Engineering & Materials
More informationHow does a crack propagate in a composite
H IGH-D IMENSIONAL D ATA LARGE MULTIDIMENSIONAL DATA VISUALIZATION FOR MATERIALS SCIENCE Materials scientists use scientific visualization to explore very large multidimensional data sets. The Atomsviewer
More informationOptimizing Molecular Dynamics
Optimizing Molecular Dynamics This chapter discusses performance tuning of parallel and distributed molecular dynamics (MD) simulations, which involves both: (1) intranode optimization within each node
More informationNAMD Serial and Parallel Performance
NAMD Serial and Parallel Performance Jim Phillips Theoretical Biophysics Group Serial performance basics Main factors affecting serial performance: Molecular system size and composition. Cutoff distance
More informationTechniques and algorithms for immersive and interactive visualization of large datasets
Louisiana State University LSU Digital Commons LSU Master's Theses Graduate School 2002 Techniques and algorithms for immersive and interactive visualization of large datasets Ashish Sharma Louisiana State
More informationScalable and portable visualization of large atomistic datasets
Computer Physics Communications 163 (2004) 53 64 www.elsevier.com/locate/cpc Scalable and portable visualization of large atomistic datasets Ashish Sharma, Rajiv K. Kalia, Aiichiro Nakano, Priya Vashishta
More informationInternational Journal of Computational Science (Print) (Online) Global Information Publisher 2007, Vol. 1, No.
International Journal of Computational Science 1992-6669 (Print) 1992-6677 (Online) Global Information Publisher 2007, Vol. 1, No. 4, 407-421 ParaViz: A Spatially Decomposed Parallel Visualization Algorithm
More informationImmersive and Interactive Exploration of Billion-Atom Systems
Ashish Sharma sharmaa@usc.edu Collaboratory for Advanced Computing and Simulations Dept. of Computer Science Immersive and Interactive Exploration of Billion-Atom Systems Aiichiro Nakano anakano@usc.edu
More informationCSE 591: GPU Programming. Introduction. Entertainment Graphics: Virtual Realism for the Masses. Computer games need to have: Klaus Mueller
Entertainment Graphics: Virtual Realism for the Masses CSE 591: GPU Programming Introduction Computer games need to have: realistic appearance of characters and objects believable and creative shading,
More informationPerformance Evaluation of NWChem Ab-Initio Molecular Dynamics (AIMD) Simulations on the Intel Xeon Phi Processor
* Some names and brands may be claimed as the property of others. Performance Evaluation of NWChem Ab-Initio Molecular Dynamics (AIMD) Simulations on the Intel Xeon Phi Processor E.J. Bylaska 1, M. Jacquelin
More informationCSE 591/392: GPU Programming. Introduction. Klaus Mueller. Computer Science Department Stony Brook University
CSE 591/392: GPU Programming Introduction Klaus Mueller Computer Science Department Stony Brook University First: A Big Word of Thanks! to the millions of computer game enthusiasts worldwide Who demand
More informationIntermediate Parallel Programming & Cluster Computing
High Performance Computing Modernization Program (HPCMP) Summer 2011 Puerto Rico Workshop on Intermediate Parallel Programming & Cluster Computing in conjunction with the National Computational Science
More informationPerformance Analysis of Cluster based Interactive 3D Visualisation
Performance Analysis of Cluster based Interactive 3D Visualisation P.D. MASELINO N. KALANTERY S.WINTER Centre for Parallel Computing University of Westminster 115 New Cavendish Street, London UNITED KINGDOM
More informationParallel Molecular Dynamics
Parallel Molecular Dynamics Aiichiro Nakano Collaboratory for Advanced Computing & Simulations Department of Computer Science Department of Physics & Astronomy Department of Chemical Engineering & Materials
More informationLecture 7: Introduction to HFSS-IE
Lecture 7: Introduction to HFSS-IE 2015.0 Release ANSYS HFSS for Antenna Design 1 2015 ANSYS, Inc. HFSS-IE: Integral Equation Solver Introduction HFSS-IE: Technology An Integral Equation solver technology
More informationScientific Visualization. CSC 7443: Scientific Information Visualization
Scientific Visualization Scientific Datasets Gaining insight into scientific data by representing the data by computer graphics Scientific data sources Computation Real material simulation/modeling (e.g.,
More informationPerformance Metrics of a Parallel Three Dimensional Two-Phase DSMC Method for Particle-Laden Flows
Performance Metrics of a Parallel Three Dimensional Two-Phase DSMC Method for Particle-Laden Flows Benzi John* and M. Damodaran** Division of Thermal and Fluids Engineering, School of Mechanical and Aerospace
More informationA Kernel-independent Adaptive Fast Multipole Method
A Kernel-independent Adaptive Fast Multipole Method Lexing Ying Caltech Joint work with George Biros and Denis Zorin Problem Statement Given G an elliptic PDE kernel, e.g. {x i } points in {φ i } charges
More informationScalable I/O of large-scale molecular dynamics simulations: A data-compression algorithm
Computer Physics Communications 131 (2000) 78 85 www.elsevier.nl/locate/cpc Scalable I/O of large-scale molecular dynamics simulations: A data-compression algorithm Andrey Omeltchenko, Timothy J. Campbell,
More informationMulti-Domain Pattern. I. Problem. II. Driving Forces. III. Solution
Multi-Domain Pattern I. Problem The problem represents computations characterized by an underlying system of mathematical equations, often simulating behaviors of physical objects through discrete time
More informationImplementation of an integrated efficient parallel multiblock Flow solver
Implementation of an integrated efficient parallel multiblock Flow solver Thomas Bönisch, Panagiotis Adamidis and Roland Rühle adamidis@hlrs.de Outline Introduction to URANUS Why using Multiblock meshes
More informationSimulation Advances. Antenna Applications
Simulation Advances for RF, Microwave and Antenna Applications Presented by Martin Vogel, PhD Application Engineer 1 Overview Advanced Integrated Solver Technologies Finite Arrays with Domain Decomposition
More informationParallel Rendering. Jongwook Jin Kwangyun Wohn VR Lab. CS Dept. KAIST MAR
Optimal Partition for Parallel Rendering Jongwook Jin Kwangyun Wohn VR Lab. CS Dept. KAIST Culture Technology Research Center MAR.06.2008 Optimal Partition for Parallel Rendering Load balanced partition
More informationHPC and IT Issues Session Agenda. Deployment of Simulation (Trends and Issues Impacting IT) Mapping HPC to Performance (Scaling, Technology Advances)
HPC and IT Issues Session Agenda Deployment of Simulation (Trends and Issues Impacting IT) Discussion Mapping HPC to Performance (Scaling, Technology Advances) Discussion Optimizing IT for Remote Access
More informationA TALENTED CPU-TO-GPU MEMORY MAPPING TECHNIQUE
A TALENTED CPU-TO-GPU MEMORY MAPPING TECHNIQUE Abu Asaduzzaman, Deepthi Gummadi, and Chok M. Yip Department of Electrical Engineering and Computer Science Wichita State University Wichita, Kansas, USA
More informationPetascale Multiscale Simulations of Biomolecular Systems. John Grime Voth Group Argonne National Laboratory / University of Chicago
Petascale Multiscale Simulations of Biomolecular Systems John Grime Voth Group Argonne National Laboratory / University of Chicago About me Background: experimental guy in grad school (LSCM, drug delivery)
More informationHigh-Order Finite-Element Earthquake Modeling on very Large Clusters of CPUs or GPUs
High-Order Finite-Element Earthquake Modeling on very Large Clusters of CPUs or GPUs Gordon Erlebacher Department of Scientific Computing Sept. 28, 2012 with Dimitri Komatitsch (Pau,France) David Michea
More informationGoal. Interactive Walkthroughs using Multiple GPUs. Boeing 777. DoubleEagle Tanker Model
Goal Interactive Walkthroughs using Multiple GPUs Dinesh Manocha University of North Carolina- Chapel Hill http://www.cs.unc.edu/~walk SIGGRAPH COURSE #11, 2003 Interactive Walkthrough of complex 3D environments
More informationHeterogeneous Multi-Computer System A New Platform for Multi-Paradigm Scientific Simulation
Heterogeneous Multi-Computer System A New Platform for Multi-Paradigm Scientific Simulation Taisuke Boku, Hajime Susa, Masayuki Umemura, Akira Ukawa Center for Computational Physics, University of Tsukuba
More informationMolecular Dynamics and Quantum Mechanics Applications
Understanding the Performance of Molecular Dynamics and Quantum Mechanics Applications on Dell HPC Clusters High-performance computing (HPC) clusters are proving to be suitable environments for running
More informationFast Multipole Method on the GPU
Fast Multipole Method on the GPU with application to the Adaptive Vortex Method University of Bristol, Bristol, United Kingdom. 1 Introduction Particle methods Highly parallel Computational intensive Numerical
More informationIntroducing a Cache-Oblivious Blocking Approach for the Lattice Boltzmann Method
Introducing a Cache-Oblivious Blocking Approach for the Lattice Boltzmann Method G. Wellein, T. Zeiser, G. Hager HPC Services Regional Computing Center A. Nitsure, K. Iglberger, U. Rüde Chair for System
More informationDynamic Spatial Partitioning for Real-Time Visibility Determination. Joshua Shagam Computer Science
Dynamic Spatial Partitioning for Real-Time Visibility Determination Joshua Shagam Computer Science Master s Defense May 2, 2003 Problem Complex 3D environments have large numbers of objects Computer hardware
More informationBenchmark 1.a Investigate and Understand Designated Lab Techniques The student will investigate and understand designated lab techniques.
I. Course Title Parallel Computing 2 II. Course Description Students study parallel programming and visualization in a variety of contexts with an emphasis on underlying and experimental technologies.
More informationDiFi: Distance Fields - Fast Computation Using Graphics Hardware
DiFi: Distance Fields - Fast Computation Using Graphics Hardware Avneesh Sud Dinesh Manocha UNC-Chapel Hill http://gamma.cs.unc.edu/difi Distance Fields Distance Function For a site a scalar function f:r
More informationAccelerating Parallel Analysis of Scientific Simulation Data via Zazen
Accelerating Parallel Analysis of Scientific Simulation Data via Zazen Tiankai Tu, Charles A. Rendleman, Patrick J. Miller, Federico Sacerdoti, Ron O. Dror, and David E. Shaw D. E. Shaw Research Motivation
More informationHarnessing GPU speed to accelerate LAMMPS particle simulations
Harnessing GPU speed to accelerate LAMMPS particle simulations Paul S. Crozier, W. Michael Brown, Peng Wang pscrozi@sandia.gov, wmbrown@sandia.gov, penwang@nvidia.com SC09, Portland, Oregon November 18,
More informationVisualization and VR for the Grid
Visualization and VR for the Grid Chris Johnson Scientific Computing and Imaging Institute University of Utah Computational Science Pipeline Construct a model of the physical domain (Mesh Generation, CAD)
More informationParallel and Distributed Systems Lab.
Parallel and Distributed Systems Lab. Department of Computer Sciences Purdue University. Jie Chi, Ronaldo Ferreira, Ananth Grama, Tzvetan Horozov, Ioannis Ioannidis, Mehmet Koyuturk, Shan Lei, Robert Light,
More informationIndirect Volume Rendering
Indirect Volume Rendering Visualization Torsten Möller Weiskopf/Machiraju/Möller Overview Contour tracing Marching cubes Marching tetrahedra Optimization octree-based range query Weiskopf/Machiraju/Möller
More informationAccelerating Finite Element Analysis in MATLAB with Parallel Computing
MATLAB Digest Accelerating Finite Element Analysis in MATLAB with Parallel Computing By Vaishali Hosagrahara, Krishna Tamminana, and Gaurav Sharma The Finite Element Method is a powerful numerical technique
More informationImproving the Scalability of Comparative Debugging with MRNet
Improving the Scalability of Comparative Debugging with MRNet Jin Chao MeSsAGE Lab (Monash Uni.) Cray Inc. David Abramson Minh Ngoc Dinh Jin Chao Luiz DeRose Robert Moench Andrew Gontarek Outline Assertion-based
More informationAlgorithms and Architecture. William D. Gropp Mathematics and Computer Science
Algorithms and Architecture William D. Gropp Mathematics and Computer Science www.mcs.anl.gov/~gropp Algorithms What is an algorithm? A set of instructions to perform a task How do we evaluate an algorithm?
More informationCenter Extreme Scale CS Research
Center Extreme Scale CS Research Center for Compressible Multiphase Turbulence University of Florida Sanjay Ranka Herman Lam Outline 10 6 10 7 10 8 10 9 cores Parallelization and UQ of Rocfun and CMT-Nek
More informationANSYS HPC. Technology Leadership. Barbara Hutchings ANSYS, Inc. September 20, 2011
ANSYS HPC Technology Leadership Barbara Hutchings barbara.hutchings@ansys.com 1 ANSYS, Inc. September 20, Why ANSYS Users Need HPC Insight you can t get any other way HPC enables high-fidelity Include
More informationVMD Key Features of Recent Release
VMD 1.8.7 Key Features of Recent Release John Stone Research/vmd/ Quick Summary of Features Broader platform support Updated user interfaces Accelerated analysis, rendering, display: multi-core CPUs, GPUs
More informationby Virtual Reality System
Particle Simulation Analysis by Virtual Reality System Hiroaki Ohtani 12) 1,2), Nobuaki Ohno 3), Ritoku Horiuchi 12) 1,2) 1 National Institute for Fusion Science (NIFS), Japan 2 The Graduate University
More informationTESLA P100 PERFORMANCE GUIDE. HPC and Deep Learning Applications
TESLA P PERFORMANCE GUIDE HPC and Deep Learning Applications MAY 217 TESLA P PERFORMANCE GUIDE Modern high performance computing (HPC) data centers are key to solving some of the world s most important
More informationArchitectural constraints to attain 1 Exaflop/s for three scientific application classes
Architectural constraints to attain Exaflop/s for three scientific application classes Abhinav Bhatele, Pritish Jetley, Hormozd Gahvari, Lukasz Wesolowski, William D. Gropp, Laxmikant V. Kalé Department
More informationDouble Patterning Layout Decomposition for Simultaneous Conflict and Stitch Minimization
Double Patterning Layout Decomposition for Simultaneous Conflict and Stitch Minimization Kun Yuan, Jae-Seo Yang, David Z. Pan Dept. of Electrical and Computer Engineering The University of Texas at Austin
More informationLoad Balancing and Data Migration in a Hybrid Computational Fluid Dynamics Application
Load Balancing and Data Migration in a Hybrid Computational Fluid Dynamics Application Esteban Meneses Patrick Pisciuneri Center for Simulation and Modeling (SaM) University of Pittsburgh University of
More informationICON for HD(CP) 2. High Definition Clouds and Precipitation for Advancing Climate Prediction
ICON for HD(CP) 2 High Definition Clouds and Precipitation for Advancing Climate Prediction High Definition Clouds and Precipitation for Advancing Climate Prediction ICON 2 years ago Parameterize shallow
More informationLecture 15: More Iterative Ideas
Lecture 15: More Iterative Ideas David Bindel 15 Mar 2010 Logistics HW 2 due! Some notes on HW 2. Where we are / where we re going More iterative ideas. Intro to HW 3. More HW 2 notes See solution code!
More informationCLUSTER-BASED MOLECULAR DYNAMICS PARALLEL SIMULATION IN THERMOPHYSICS
CLUSTER-BASED MOLECULAR DYNAMICS PARALLEL SIMULATION IN THERMOPHYSICS JIWU SHU * BING WANG JINZHAO WANG 2 MIN CHEN 2 WEIMIN ZHENG ( Dept. of Computer Science and Technology, Tsinghua Univ., Beijing, 00084)
More informationLarge Scale Data Visualization. CSC 7443: Scientific Information Visualization
Large Scale Data Visualization Large Datasets Large datasets: D >> 10 M D D: Hundreds of gigabytes to terabytes and even petabytes M D : 1 to 4 GB of RAM Examples: Single large data set Time-varying data
More informationCommodity Cluster Computing
Commodity Cluster Computing Ralf Gruber, EPFL-SIC/CAPA/Swiss-Tx, Lausanne http://capawww.epfl.ch Commodity Cluster Computing 1. Introduction 2. Characterisation of nodes, parallel machines,applications
More informationENABLING NEW SCIENCE GPU SOLUTIONS
ENABLING NEW SCIENCE TESLA BIO Workbench The NVIDIA Tesla Bio Workbench enables biophysicists and computational chemists to push the boundaries of life sciences research. It turns a standard PC into a
More informationExploiting hierarchical parallelisms for molecular dynamics simulation on multicore clusters
J Supercomput (2011) 57: 20 33 DOI 10.1007/s11227-011-0560-1 Exploiting hierarchical parallelisms for molecular dynamics simulation on multicore clusters Liu Peng Manaschai Kunaseth Hikmet Dursun Ken-ichi
More informationSplotch: High Performance Visualization using MPI, OpenMP and CUDA
Splotch: High Performance Visualization using MPI, OpenMP and CUDA Klaus Dolag (Munich University Observatory) Martin Reinecke (MPA, Garching) Claudio Gheller (CSCS, Switzerland), Marzia Rivi (CINECA,
More informationMultiscale Techniques: Wavelet Applications in Volume Rendering
Multiscale Techniques: Wavelet Applications in Volume Rendering Michael H. F. Wilkinson, Michel A. Westenberg and Jos B.T.M. Roerdink Institute for Mathematics and University of Groningen The Netherlands
More informationParallel Direct Simulation Monte Carlo Computation Using CUDA on GPUs
Parallel Direct Simulation Monte Carlo Computation Using CUDA on GPUs C.-C. Su a, C.-W. Hsieh b, M. R. Smith b, M. C. Jermy c and J.-S. Wu a a Department of Mechanical Engineering, National Chiao Tung
More informationAnalyzing the Performance of IWAVE on a Cluster using HPCToolkit
Analyzing the Performance of IWAVE on a Cluster using HPCToolkit John Mellor-Crummey and Laksono Adhianto Department of Computer Science Rice University {johnmc,laksono}@rice.edu TRIP Meeting March 30,
More informationImage-Space-Parallel Direct Volume Rendering on a Cluster of PCs
Image-Space-Parallel Direct Volume Rendering on a Cluster of PCs B. Barla Cambazoglu and Cevdet Aykanat Bilkent University, Department of Computer Engineering, 06800, Ankara, Turkey {berkant,aykanat}@cs.bilkent.edu.tr
More informationENERGY-EFFICIENT VISUALIZATION PIPELINES A CASE STUDY IN CLIMATE SIMULATION
ENERGY-EFFICIENT VISUALIZATION PIPELINES A CASE STUDY IN CLIMATE SIMULATION Vignesh Adhinarayanan Ph.D. (CS) Student Synergy Lab, Virginia Tech INTRODUCTION Supercomputers are constrained by power Power
More informationStorage Hierarchy Management for Scientific Computing
Storage Hierarchy Management for Scientific Computing by Ethan Leo Miller Sc. B. (Brown University) 1987 M.S. (University of California at Berkeley) 1990 A dissertation submitted in partial satisfaction
More informationANSYS HPC Technology Leadership
ANSYS HPC Technology Leadership 1 ANSYS, Inc. November 14, Why ANSYS Users Need HPC Insight you can t get any other way It s all about getting better insight into product behavior quicker! HPC enables
More informationArchitecture Conscious Data Mining. Srinivasan Parthasarathy Data Mining Research Lab Ohio State University
Architecture Conscious Data Mining Srinivasan Parthasarathy Data Mining Research Lab Ohio State University KDD & Next Generation Challenges KDD is an iterative and interactive process the goal of which
More informationOverview. Pipeline implementation I. Overview. Required Tasks. Preliminaries Clipping. Hidden Surface removal
Overview Pipeline implementation I Preliminaries Clipping Line clipping Hidden Surface removal Overview At end of the geometric pipeline, vertices have been assembled into primitives Must clip out primitives
More informationDevelopment of an Integrated Computational Simulation Method for Fluid Driven Structure Movement and Acoustics
Development of an Integrated Computational Simulation Method for Fluid Driven Structure Movement and Acoustics I. Pantle Fachgebiet Strömungsmaschinen Karlsruher Institut für Technologie KIT Motivation
More informationUsing GPUs to compute the multilevel summation of electrostatic forces
Using GPUs to compute the multilevel summation of electrostatic forces David J. Hardy Theoretical and Computational Biophysics Group Beckman Institute for Advanced Science and Technology University of
More informationCenter for Computational Science
Center for Computational Science Toward GPU-accelerated meshfree fluids simulation using the fast multipole method Lorena A Barba Boston University Department of Mechanical Engineering with: Felipe Cruz,
More informationMODEL-FREE, STATISTICAL DETECTION AND TRACKING OF MOVING OBJECTS
MODEL-FREE, STATISTICAL DETECTION AND TRACKING OF MOVING OBJECTS University of Koblenz Germany 13th Int. Conference on Image Processing, Atlanta, USA, 2006 Introduction Features of the new approach: Automatic
More informationEfficient use of hybrid computing clusters for nanosciences
International Conference on Parallel Computing ÉCOLE NORMALE SUPÉRIEURE LYON Efficient use of hybrid computing clusters for nanosciences Luigi Genovese CEA, ESRF, BULL, LIG 16 Octobre 2008 with Matthieu
More informationGPU-based Distributed Behavior Models with CUDA
GPU-based Distributed Behavior Models with CUDA Courtesy: YouTube, ISIS Lab, Universita degli Studi di Salerno Bradly Alicea Introduction Flocking: Reynolds boids algorithm. * models simple local behaviors
More informationNVIDIA Update and Directions on GPU Acceleration for Earth System Models
NVIDIA Update and Directions on GPU Acceleration for Earth System Models Stan Posey, HPC Program Manager, ESM and CFD, NVIDIA, Santa Clara, CA, USA Carl Ponder, PhD, Applications Software Engineer, NVIDIA,
More informationBuilding scalable 3D applications. Ville Miettinen Hybrid Graphics
Building scalable 3D applications Ville Miettinen Hybrid Graphics What s going to happen... (1/2) Mass market: 3D apps will become a huge success on low-end and mid-tier cell phones Retro-gaming New game
More informationReal-Time Rendering (Echtzeitgraphik) Dr. Michael Wimmer
Real-Time Rendering (Echtzeitgraphik) Dr. Michael Wimmer wimmer@cg.tuwien.ac.at Visibility Overview Basics about visibility Basics about occlusion culling View-frustum culling / backface culling Occlusion
More informationLarge Data Visualization
Large Data Visualization Seven Lectures 1. Overview (this one) 2. Scalable parallel rendering algorithms 3. Particle data visualization 4. Vector field visualization 5. Visual analytics techniques for
More informationSpatial Data Structures
CSCI 420 Computer Graphics Lecture 17 Spatial Data Structures Jernej Barbic University of Southern California Hierarchical Bounding Volumes Regular Grids Octrees BSP Trees [Angel Ch. 8] 1 Ray Tracing Acceleration
More informationDefense Technical Information Center Compilation Part Notice
UNCLASSIFIED Defense Technical Information Center Compilation Part Notice ADP023800 TITLE: A Comparative Study of ARL Linux Cluster Performance DISTRIBUTION: Approved for public release, distribution unlimited
More informationPhD Student. Associate Professor, Co-Director, Center for Computational Earth and Environmental Science. Abdulrahman Manea.
Abdulrahman Manea PhD Student Hamdi Tchelepi Associate Professor, Co-Director, Center for Computational Earth and Environmental Science Energy Resources Engineering Department School of Earth Sciences
More informationVictory Advanced Structure Editor. 3D Process Simulator for Large Structures
Victory Advanced Structure Editor 3D Process Simulator for Large Structures Applications Victory Advanced Structure Editor is designed for engineers who need to create layout driven 3D process based structures
More informationRow Tracing with Hierarchical Occlusion Maps
Row Tracing with Hierarchical Occlusion Maps Ravi P. Kammaje, Benjamin Mora August 9, 2008 Page 2 Row Tracing with Hierarchical Occlusion Maps Outline August 9, 2008 Introduction Related Work Row Tracing
More informationOpenCL Implementation Of A Heterogeneous Computing System For Real-time Rendering And Dynamic Updating Of Dense 3-d Volumetric Data
OpenCL Implementation Of A Heterogeneous Computing System For Real-time Rendering And Dynamic Updating Of Dense 3-d Volumetric Data Andrew Miller Computer Vision Group Research Developer 3-D TERRAIN RECONSTRUCTION
More informationFast Multipole and Related Algorithms
Fast Multipole and Related Algorithms Ramani Duraiswami University of Maryland, College Park http://www.umiacs.umd.edu/~ramani Joint work with Nail A. Gumerov Efficiency by exploiting symmetry and A general
More informationCS-235 Computational Geometry
CS-235 Computational Geometry Computer Science Department Fall Quarter 2002. Computational Geometry Study of algorithms for geometric problems. Deals with discrete shapes: points, lines, polyhedra, polygonal
More informationsimulation framework for piecewise regular grids
WALBERLA, an ultra-scalable multiphysics simulation framework for piecewise regular grids ParCo 2015, Edinburgh September 3rd, 2015 Christian Godenschwager, Florian Schornbaum, Martin Bauer, Harald Köstler
More informationEE795: Computer Vision and Intelligent Systems
EE795: Computer Vision and Intelligent Systems Spring 2012 TTh 17:30-18:45 WRI C225 Lecture 02 130124 http://www.ee.unlv.edu/~b1morris/ecg795/ 2 Outline Basics Image Formation Image Processing 3 Intelligent
More informationHPCS HPCchallenge Benchmark Suite
HPCS HPCchallenge Benchmark Suite David Koester, Ph.D. () Jack Dongarra (UTK) Piotr Luszczek () 28 September 2004 Slide-1 Outline Brief DARPA HPCS Overview Architecture/Application Characterization Preliminary
More informationParallel Geospatial Data Management for Multi-Scale Environmental Data Analysis on GPUs DOE Visiting Faculty Program Project Report
Parallel Geospatial Data Management for Multi-Scale Environmental Data Analysis on GPUs 2013 DOE Visiting Faculty Program Project Report By Jianting Zhang (Visiting Faculty) (Department of Computer Science,
More informationMulti-Level Cache Hierarchy Evaluation for Programmable Media Processors. Overview
Multi-Level Cache Hierarchy Evaluation for Programmable Media Processors Jason Fritts Assistant Professor Department of Computer Science Co-Author: Prof. Wayne Wolf Overview Why Programmable Media Processors?
More informationParallel Computing: From Inexpensive Servers to Supercomputers
Parallel Computing: From Inexpensive Servers to Supercomputers Lyle N. Long The Pennsylvania State University & The California Institute of Technology Seminar to the Koch Lab http://www.personal.psu.edu/lnl
More informationParticle-Based Volume Rendering of Unstructured Volume Data
Particle-Based Volume Rendering of Unstructured Volume Data Takuma KAWAMURA 1)*) Jorji NONAKA 3) Naohisa SAKAMOTO 2),3) Koji KOYAMADA 2) 1) Graduate School of Engineering, Kyoto University 2) Center for
More informationAlgorithm Engineering with PRAM Algorithms
Algorithm Engineering with PRAM Algorithms Bernard M.E. Moret moret@cs.unm.edu Department of Computer Science University of New Mexico Albuquerque, NM 87131 Rome School on Alg. Eng. p.1/29 Measuring and
More informationMany rendering scenarios, such as battle scenes or urban environments, require rendering of large numbers of autonomous characters.
1 2 Many rendering scenarios, such as battle scenes or urban environments, require rendering of large numbers of autonomous characters. Crowd rendering in large environments presents a number of challenges,
More informationIntroduction. Summary. Why computer architecture? Technology trends Cost issues
Introduction 1 Summary Why computer architecture? Technology trends Cost issues 2 1 Computer architecture? Computer Architecture refers to the attributes of a system visible to a programmer (that have
More informationRay Tracing with Multi-Core/Shared Memory Systems. Abe Stephens
Ray Tracing with Multi-Core/Shared Memory Systems Abe Stephens Real-time Interactive Massive Model Visualization Tutorial EuroGraphics 2006. Vienna Austria. Monday September 4, 2006 http://www.sci.utah.edu/~abe/massive06/
More informationThe Center for Computational Research
The Center for Computational Research Russ Miller Director, Center for Computational Research UB Distinguished Professor, Computer Science & Engineering Senior Research Scientist, Hauptman-Woodward Medical
More informationPartitioning and Partitioning Tools. Tim Barth NASA Ames Research Center Moffett Field, California USA
Partitioning and Partitioning Tools Tim Barth NASA Ames Research Center Moffett Field, California 94035-00 USA 1 Graph/Mesh Partitioning Why do it? The graph bisection problem What are the standard heuristic
More information