Philip C. Roth, Computer Science and Mathematics Division, Oak Ridge National Laboratory
2 A Tree-Based Overlay Network (TBON) like MRNet provides scalable infrastructure for tools and applications. MRNet's process topology and placement support is extremely flexible (on most platforms): any tree topology is allowed, and internal processes can run on the same nodes as application processes or on distinct nodes.
Managed by UT-Battelle
4 Flexibility leads to questions about identifying the best process topology and placement. The interaction of several factors determines what is best: performance (tool and application); system hardware and software; purpose; even economics (e.g., can I afford to request extra nodes for MRNet processes given my allocation budget?). The decision process is often not rigorous and relies on rules of thumb.
5 Goal: given a node allocation on a leadership-class system, identify the best MRNet process placement and topology. Several constraints apply: tool multicast and reduction requirements; behavior of the application under study; other activity on the system; system software and hardware.
6 Cray XT is the target platform: Jaguar XT4 and XT5 systems at Oak Ridge National Laboratory (ORNL); Hopper XT5 at NERSC; Kraken XT5 at ORNL. Opteron-based nodes are arranged in a 3D mesh with the possibility of torus links. The systems run the Cray Linux Environment.
7 Goal: understand Cray XT allocation characteristics and their impact on MRNet-based tool process placement. A simple MPI/Portals program was used to collect each node's number and position within the XT mesh, on an earlier-generation ORNL Jaguar with dual-core Opterons. A batch job launched two independent instances of the program: one on 512 application nodes (1024 processes) and one on 72 tool nodes (enough for a balanced 8-way TBON topology, assuming the front-end is on the batch script service node).
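The figure of 72 tool nodes can be checked with a little arithmetic. The sketch below assumes one TBON leaf per application node and one internal process per tool node, and it excludes the root (front-end), which the slide places on the batch script service node. The function name `internal_process_counts` is illustrative and is not part of MRNet.

```python
def internal_process_counts(num_leaves: int, fanout: int) -> list[int]:
    """Count internal processes per level of a balanced tree.

    Levels are listed from the one just above the leaves up to, but not
    including, the root. Assumes every internal process has at most
    `fanout` children.
    """
    if num_leaves < 1 or fanout < 2:
        raise ValueError("need at least one leaf and a fan-out of 2 or more")
    levels = []
    width = num_leaves
    while True:
        width = -(-width // fanout)  # ceiling division: parents needed for this level
        if width <= 1:
            break  # a single parent is the root, which is not counted here
        levels.append(width)
    return levels


if __name__ == "__main__":
    levels = internal_process_counts(512, 8)
    print(levels, sum(levels))
```

With 512 leaves and a fan-out of 8, the levels above the leaves hold 64 and 8 processes, and 64 + 8 = 72 matches the tool-node count on the slide.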
9 Discrete event simulation of XT system nodes running application and MRNet processes. Component of the MAST framework: Modeling Assertions, Simulation, and Tuning.
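For readers unfamiliar with discrete event simulation, the core idea is a time-ordered event queue: the simulator repeatedly pops the earliest event, advances the clock to its timestamp, and lets the event schedule further events. The following is a minimal generic sketch in Python. It is not the MAST or OMNeT++ implementation, and the `Simulator` class, the event labels, and the latency values are all made up for illustration.

```python
import heapq
from itertools import count


class Simulator:
    """Minimal discrete event simulator built on a priority queue."""

    def __init__(self):
        self.now = 0.0
        self._queue = []
        self._seq = count()  # tie-breaker so equal-time events keep insertion order
        self.log = []

    def schedule(self, delay, label, action=None):
        """Schedule an event `delay` time units after the current time."""
        heapq.heappush(self._queue, (self.now + delay, next(self._seq), label, action))

    def run(self):
        """Process events in timestamp order until the queue is empty."""
        while self._queue:
            time, _, label, action = heapq.heappop(self._queue)
            self.now = time
            self.log.append((time, label))
            if action is not None:
                action(self)  # an event may schedule follow-up events


def demo():
    sim = Simulator()
    hop_latency = 1.0  # arbitrary time units, purely illustrative
    # A message crossing two hops arrives 2 * hop_latency after it is sent.
    sim.schedule(0.0, "send A->B", lambda s: s.schedule(2 * hop_latency, "recv at B"))
    # A message crossing one hop, sent one time unit later.
    sim.schedule(1.0, "send C->D", lambda s: s.schedule(hop_latency, "recv at D"))
    sim.run()
    return sim.log


if __name__ == "__main__":
    for time, label in demo():
        print(time, label)
```

A node model in a real simulator would replace the lambdas with handlers that model routing, queuing, and process behavior.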
10 Node modules connected in a 3D torus. Implemented using OMNeT++.
11 [Figure: XTNode diagram; labels: Application Process, MRNet Process, +/- X, +/- Y, +/- Z]
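The +/- X, +/- Y, and +/- Z connections of each node can be illustrated with a short sketch. It assumes wraparound (torus) links in all three dimensions, which the slides note is only a possibility on real XT systems. The function names and the dimensions in the example are illustrative and are not taken from the simulator.

```python
def torus_neighbors(coord: tuple[int, int, int], dims: tuple[int, int, int]) -> dict:
    """Return the six neighbors of `coord` in a 3D torus of size `dims`."""
    x, y, z = coord
    nx, ny, nz = dims
    return {
        "+X": ((x + 1) % nx, y, z),
        "-X": ((x - 1) % nx, y, z),  # Python's % keeps the result non-negative
        "+Y": (x, (y + 1) % ny, z),
        "-Y": (x, (y - 1) % ny, z),
        "+Z": (x, y, (z + 1) % nz),
        "-Z": (x, y, (z - 1) % nz),
    }


def torus_hops(a: tuple[int, int, int], b: tuple[int, int, int],
               dims: tuple[int, int, int]) -> int:
    """Minimum hop count between two nodes, taking the shorter way around each ring."""
    return sum(min(abs(p - q), n - abs(p - q)) for p, q, n in zip(a, b, dims))


if __name__ == "__main__":
    print(torus_neighbors((0, 0, 0), (4, 4, 4)))
    print(torus_hops((0, 0, 0), (3, 1, 2), (4, 4, 4)))
```

Hop counts of this kind are one input a placement study might use when comparing where TBON internal processes sit relative to application nodes.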
14 XML file: multiple parallel programs per file, including type and associated attributes such as input; mapping of processes to system nodes.
15 Measuring process-to-process latency and bandwidth: MPI and sockets; fully populated nodes and one process per node; pairs of processes, with even ranks pairing first with the left neighbor, then with the right.
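One plausible reading of the pairing scheme is sketched below: in the "left" phase each even rank exchanges with rank - 1, and in the "right" phase with rank + 1, so odd ranks pair the opposite way. This is an interpretation of the slide and not the author's benchmark code. The function name `partner` is hypothetical, and ranks at the ends are assumed to sit out a phase when they have no neighbor.

```python
from typing import Optional


def partner(rank: int, nranks: int, phase: str) -> Optional[int]:
    """Return the rank paired with `rank` in the given phase, or None if unpaired.

    phase "left":  even ranks pair with rank - 1, odd ranks with rank + 1.
    phase "right": even ranks pair with rank + 1, odd ranks with rank - 1.
    """
    if phase not in ("left", "right"):
        raise ValueError("phase must be 'left' or 'right'")
    is_even = rank % 2 == 0
    # Even ranks look left in the left phase; odd ranks mirror them.
    step = -1 if (phase == "left") == is_even else 1
    other = rank + step
    return other if 0 <= other < nranks else None


if __name__ == "__main__":
    for phase in ("left", "right"):
        print(phase, [partner(r, 8, phase) for r in range(8)])
```

Each pair would then run a ping-pong exchange to obtain the latency and bandwidth plotted against rank on the following slides.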
16 [Figure: latency (s) versus rank; two panels, one with series "mpi left" and "mpi right", one with series "sock left" and "sock right"]
17 [Figure: bandwidth (bytes/s) versus rank; two panels, one with series "mpi left" and "mpi right", one with series "sock left" and "sock right"]
18 [Figure: latency (s) versus rank; two panels, one with series "mpi left" and "mpi right", one with series "sock left" and "sock right"]
19 [Figure: bandwidth (bytes/s) versus rank; two panels, one with series "mpi left" and "mpi right", one with series "sock left" and "sock right"]
20 [Figure: framework diagram; labels: Process/Processor Mapping, MA-Instrumented MPI Program, Run on Parallel System, MA Model, MA Control Flow Graph Simulator, System Simulator, Behavior/Performance Prediction(s), Automated Code Tuning Framework, ScalaTrace Trace File Replayer, Open Trace Format Trace File Replayer, Sequoia Trace File Replayer, MRNet Workload Driver + Trace File Replayer + Stochastic Workload Generator]
21 Basic XTNode with SeaStar router is implemented; parameterization is still in progress, as described earlier. Support for simple MPI-based workloads with hardcoded behaviors (hot potato, 1D exchange). OTF and Sequoia trace readers were implemented for a previous version and must be resurrected. Support for TBON processes is designed and partially implemented. Recently adapted the model from an earlier OMNeT++ release to 4.1 (changes in simulation time).
22 This research is sponsored by the Office of Advanced Scientific Computing Research, U.S. Department of Energy. The work was performed at the Oak Ridge National Laboratory, which is managed by UT-Battelle, LLC under Contract No. DE-AC05-00OR. This research used resources of the Center for Computational Sciences at Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR.
23 Predicting TBON performance on Cray XT is highly desirable: it supports matching TBON process topology and placement to tool needs, subject to application and system constraints, and may support online reconfiguration of TBON topology. Developing a simulation-based TBON prediction capability: predictions of realistic scenarios are expected soon; easily adaptable to expected future architectures (e.g., GPU-enabled nodes, InfiniBand clusters); embeddable (in theory).