
Oak Ridge National Laboratory, Computing and Computational Sciences

OFA Update by ORNL
Presented by: Pavel Shamis (Pasha)
OFA Workshop, March 17, 2015

Acknowledgments
- Bernholdt, David E.
- Hill, Jason J.
- Leverman, Dustin B.
- Curtis, Philip B.

OLCF
The Oak Ridge Leadership Computing Facility (OLCF) was established at Oak Ridge National Laboratory in 2004 with the mission of accelerating scientific discovery and engineering progress by providing outstanding computing and data management resources to high-priority research and development projects.
- Jaguar: 2.3 PF
- Titan: 27 PF
ORNL's supercomputing program has grown from humble beginnings to deliver some of the most powerful systems in the world. Along the way, it has helped researchers deliver practical breakthroughs and new scientific knowledge in climate, materials, nuclear science, and a wide range of other disciplines.

CORAL
- CORAL: collaboration of ORNL, ANL, and LLNL.
- Objective: procure three leadership computers to be sited at Argonne, Oak Ridge, and Lawrence Livermore in 2017. Two of the contracts have been awarded, with the Argonne contract in process.
- Leadership computers: the RFP requests >100 PF, 2 GB/core main memory, local NVRAM, and science performance 5x-10x that of Titan or Sequoia.

The Road to Exascale
Since clock-rate scaling ended in 2003, HPC performance has been achieved through increased parallelism. Jaguar scaled to 300,000 cores. Titan and beyond deliver hierarchical parallelism with very powerful nodes: MPI across nodes, plus thread-level parallelism through OpenACC or OpenMP, plus vectors (a sketch of this model follows the timeline below).
- Jaguar (2010): 2.3 PF, multi-core CPU, 7 MW
- Titan (2012): 27 PF, hybrid GPU/CPU, 9 MW
- Summit, the CORAL system (2017): 5-10x Titan, hybrid GPU/CPU, 10 MW
- OLCF5 (2022): 5-10x Summit, ~20 MW
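To make the hierarchical model above concrete, here is a minimal sketch that combines MPI ranks with an OpenMP parallel-for-simd loop and a final reduction across ranks. It is purely illustrative and is not taken from any OLCF application; names such as N_LOCAL are placeholders. Build with something like mpicc -fopenmp.

    /* Illustrative sketch of the MPI + OpenMP (+ vector) hierarchy described
     * above; not an OLCF code.  Build with e.g.: mpicc -fopenmp saxpy_mpi.c */
    #include <mpi.h>
    #include <omp.h>
    #include <stdio.h>
    #include <stdlib.h>

    #define N_LOCAL (1 << 20)   /* elements owned by each MPI rank (placeholder) */

    int main(int argc, char **argv)
    {
        int provided, rank, nranks;

        /* MPI provides the inter-node (distributed-memory) level. */
        MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nranks);

        float *x = malloc(N_LOCAL * sizeof(float));
        float *y = malloc(N_LOCAL * sizeof(float));
        for (int i = 0; i < N_LOCAL; i++) { x[i] = 1.0f; y[i] = 2.0f; }

        /* OpenMP threads provide the intra-node level; the simd clause asks
         * the compiler to vectorize the work done on each core. */
        #pragma omp parallel for simd
        for (int i = 0; i < N_LOCAL; i++)
            y[i] = 2.0f * x[i] + y[i];

        /* A reduction across ranks ties the two levels together. */
        double local_sum = 0.0, global_sum = 0.0;
        for (int i = 0; i < N_LOCAL; i++)
            local_sum += y[i];
        MPI_Reduce(&local_sum, &global_sum, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

        if (rank == 0)
            printf("ranks=%d threads/rank=%d sum=%.1f\n",
                   nranks, omp_get_max_threads(), global_sum);

        free(x); free(y);
        MPI_Finalize();
        return 0;
    }

A typical launch places one MPI rank per node (or per GPU) and lets OMP_NUM_THREADS cover the cores within the node.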

System Summary
Compute node:
- POWER architecture processor (IBM POWER, connected to the GPUs via NVLink)
- NVIDIA Volta GPUs with HBM
- NVMe-compatible PCIe 800 GB SSD
- >512 GB coherent shared memory (HBM + DDR4)
Compute rack:
- Standard 19"
- Warm-water cooling
Compute system:
- Summit: 5x-10x Titan, 10 MW
- Mellanox interconnect: dual-rail EDR InfiniBand
- OFA software stack

How does Summit compare to Titan?

Feature                                          Summit                        Titan
Application Performance                          5-10x Titan                   Baseline
Number of Nodes                                  ~3,400                        18,688
Node performance                                 >40 TF                        1.4 TF
Memory per Node                                  >512 GB (HBM + DDR4)          38 GB (GDDR5 + DDR3)
NVRAM per Node                                   800 GB                        0
Node Interconnect                                NVLink (5-12x PCIe 3)         PCIe 2
System Interconnect (node injection bandwidth)   Dual-rail EDR-IB (23 GB/s)    Gemini (6.4 GB/s)
Interconnect Topology                            Non-blocking fat tree         3D torus
Processors                                       IBM POWER9 + NVIDIA Volta     AMD Opteron + NVIDIA Kepler
File System                                      120 PB, 1 TB/s, GPFS          32 PB, 1 TB/s, Lustre
Peak power consumption                           10 MW                         9 MW

Source: Present and Future Leadership Computers at OLCF, Buddy Bland, https://www.olcf.ornl.gov/wp-content/uploads/2014/12/olcf-user-group-summit-12-3-2014.pdf

ORNL
- ORNL joined the OFA organization in December 2014.
- InfiniBand technology: multiple InfiniBand installations; substantial experience in management of InfiniBand networks.
- OFA software stack users: ADIOS, Burst Buffers, CCI, Cheetah, Lustre / Spider, Open MPI, OpenSHMEM, STCI, UCCS/UCX, and more.
- Substantial experience in research and development of HPC and RDMA software.
- We participate in the OFVWG and OFIWG efforts.

Our Experience with the OFA Software Stack
There are multiple ways to get OFA software, and it is not an easy choice:
- Mellanox OFED
- OFA OFED
- OFED packages within Linux distributions

OFED
Mellanox OFED
- Up-to-date software stack.
- Network tools provide a relatively good level of information (device details, speeds, etc.); see the sketch after this list for the kind of data involved.
- Consistent CLI interface for most of the tools.
- All of the above is true for Mellanox software/hardware only.
OFED
- Community software stack (supports multiple technologies and vendors).
- Very often it is behind Mellanox OFED in terms of software features.
- Inconsistent CLI interfaces.
- Tools are way behind the Mellanox OFED tools.
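For a sense of the "device details, speeds, etc." mentioned above: whichever OFED flavor is installed, this kind of information can also be pulled programmatically through the verbs API that the OFA stack exposes. The sketch below is only an illustration of that, not one of the vendor tools being compared; it assumes the libibverbs development headers are available and, for simplicity, queries only port 1 of each HCA (link with -libverbs).

    /* Minimal sketch: enumerate HCAs and print basic device/port attributes
     * via libibverbs.  Illustrative only; assumes port 1. */
    #include <infiniband/verbs.h>
    #include <stdio.h>

    int main(void)
    {
        int num_devices = 0;
        struct ibv_device **devs = ibv_get_device_list(&num_devices);
        if (!devs) { perror("ibv_get_device_list"); return 1; }

        for (int i = 0; i < num_devices; i++) {
            struct ibv_context *ctx = ibv_open_device(devs[i]);
            if (!ctx)
                continue;

            struct ibv_device_attr dev_attr;
            struct ibv_port_attr   port_attr;
            if (ibv_query_device(ctx, &dev_attr) == 0 &&
                ibv_query_port(ctx, 1, &port_attr) == 0) {
                printf("%-12s fw=%s ports=%u lid=%u state=%d width=%u speed=%u\n",
                       ibv_get_device_name(devs[i]),
                       dev_attr.fw_ver,
                       (unsigned) dev_attr.phys_port_cnt,
                       (unsigned) port_attr.lid,
                       (int) port_attr.state,
                       (unsigned) port_attr.active_width,
                       (unsigned) port_attr.active_speed);
            }
            ibv_close_device(ctx);
        }
        ibv_free_device_list(devs);
        return 0;
    }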

OFED - Continued
OFED packages within Linux distributions
- Very easy to maintain, upgrade, and install (e.g., yum upgrade).
- Security updates!
- Based on OFED + Mellanox OFED (patches)?
- Packages are behind Mellanox OFED in terms of software features.

Wish list: Tools and Documentation
We need better tools for managing and monitoring IB networks:
- Currently we use ibdiagnet + scripting (a sketch of this kind of scripting follows below).
- It is relatively easy to fetch an HCA's information/statistics.
- It is difficult to extract information about switches, and even more difficult to connect the dots.
- How do we identify hot spots and congestion in the network?
- Identification of bad cables/connectors.
- We want it free and open source.
Documentation of best practices and user experience can be very helpful:
- Advanced level: QoS tuning, load balancing with LMC bits, routing algorithms, etc.
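As one concrete example of the "ibdiagnet + scripting" approach referenced above, host-side error counters can be read from the standard Linux InfiniBand sysfs tree (/sys/class/infiniband/<hca>/ports/<port>/counters/). The sketch below is a hypothetical stand-in for such a script, not the scripts we actually run; it flags ports whose symbol_error, link_downed, or port_rcv_errors counters are non-zero, which is often the first hint of a bad cable or connector. Switch-side counters are not visible this way and still require ibdiagnet or perfquery, which is exactly the gap described above.

    /* Sketch of a host-side check in the spirit of "ibdiagnet + scripting":
     * read per-port error counters from the Linux IB sysfs tree and flag
     * anything non-zero.  Illustrative only. */
    #include <dirent.h>
    #include <stdio.h>

    static long read_counter(const char *hca, const char *port, const char *name)
    {
        char path[512];
        snprintf(path, sizeof(path),
                 "/sys/class/infiniband/%s/ports/%s/counters/%s", hca, port, name);
        FILE *f = fopen(path, "r");
        if (!f)
            return -1;                      /* counter not exposed */
        long val = -1;
        if (fscanf(f, "%ld", &val) != 1)
            val = -1;
        fclose(f);
        return val;
    }

    int main(void)
    {
        const char *counters[] = { "symbol_error", "link_downed", "port_rcv_errors" };
        DIR *hcas = opendir("/sys/class/infiniband");
        if (!hcas) { perror("/sys/class/infiniband"); return 1; }

        struct dirent *hca;
        while ((hca = readdir(hcas)) != NULL) {
            if (hca->d_name[0] == '.')
                continue;

            char ports_path[512];
            snprintf(ports_path, sizeof(ports_path),
                     "/sys/class/infiniband/%s/ports", hca->d_name);
            DIR *ports = opendir(ports_path);
            if (!ports)
                continue;

            struct dirent *port;
            while ((port = readdir(ports)) != NULL) {
                if (port->d_name[0] == '.')
                    continue;
                for (size_t i = 0; i < sizeof(counters) / sizeof(counters[0]); i++) {
                    long v = read_counter(hca->d_name, port->d_name, counters[i]);
                    if (v > 0)
                        printf("WARNING %s port %s: %s = %ld\n",
                               hca->d_name, port->d_name, counters[i], v);
                }
            }
            closedir(ports);
        }
        closedir(hcas);
        return 0;
    }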

Questions?
This research used resources of the Oak Ridge Leadership Computing Facility, which is a DOE Office of Science User Facility supported under Contract DE-AC05-00OR22725.