Interconnect Your Future: Enabling the Best Datacenter Return on Investment. TOP500 Supercomputers, November 2017


InfiniBand Accelerates the Majority of New Systems on the TOP500
- InfiniBand connects 77% of new HPC systems (June '17 - Nov '17)
- InfiniBand connects 6 times more new systems than proprietary interconnects (June '17 - Nov '17)
- InfiniBand connects 15 times more new systems than Ethernet (June '17 - Nov '17)
- Mellanox accelerates the fastest supercomputer in the world
- InfiniBand connects 2 of the top 5 supercomputers (#1, #4)
- Mellanox connects 39% of all TOP500 systems (192 systems, InfiniBand and Ethernet)
- InfiniBand connects 33% of the total TOP500 systems (164 systems)
- InfiniBand connects 60% of the HPC TOP500 systems
- 25G Ethernet makes its first appearance on the Nov '17 TOP500 list (China hyperscale company): 19 systems, all Mellanox connected
- Mellanox connects all of the 25G, 40G and 100G Ethernet systems on the list
- InfiniBand is the interconnect of choice for HPC infrastructures, enabling machine learning, high-performance, Web 2.0, cloud, storage and big data applications

Mellanox Connects the World's Fastest Supercomputer
- National Supercomputing Center in Wuxi, China: #1 on the TOP500 list
- 93 Petaflops performance, 3X higher than #2 on the TOP500
- 41K nodes, 10 million cores, 256 cores per CPU
- Mellanox adapter and switch solutions
- Source: "Report on the Sunway TaihuLight System", Jack Dongarra (University of Tennessee), June 20, 2016 (Tech Report UT-EECS-16-742)

Mellanox InfiniBand Accelerates Two of the TOP5 Supercomputers
- National Supercomputing Center in Wuxi, China: #1 on the TOP500 list
- Japan Agency for Marine-Earth Science and Technology: #4 on the TOP500 list, 19 Petaflops

InfiniBand Accelerates Artificial Intelligence (AI) and Deep Learning
- Facebook AI supercomputer: #35 on the TOP500 list
- NVIDIA AI supercomputer: #36 on the TOP500 list
- EDR InfiniBand In-Network Computing technology is key for scalable deep learning systems
- RDMA accelerates deep learning performance by 2X and has become the de-facto solution for AI

Mellanox Accelerates Next-Generation HPC and AI Systems
- Summit CORAL system
- Sierra CORAL system
- Fastest supercomputer in Japan
- Fastest supercomputer in Canada
- InfiniBand Dragonfly+ topology
- Highest performance with InfiniBand

Mellanox in the TOP500
- Mellanox accelerates the fastest supercomputer on the list
- InfiniBand connects 2 of the top 5 systems: #1 (China) and #4 (Japan)
- InfiniBand connects 6 times more new HPC systems than OmniPath (June '17 - Nov '17)
- InfiniBand connects 15 times more new HPC systems than Ethernet (June '17 - Nov '17)
- InfiniBand connects 77% of new HPC systems (June '17 - Nov '17)
- Mellanox connects 39 percent of all TOP500 systems (192 systems, InfiniBand and Ethernet)
- InfiniBand connects 33 percent of the total TOP500 systems (164 systems)
- InfiniBand connects 60 percent of the HPC TOP500 systems
- 25G Ethernet makes its first appearance on the Nov '17 TOP500 list (China hyperscale company): 19 systems, all Mellanox connected
- Mellanox connects all of the 25G, 40G and 100G Ethernet systems
- InfiniBand is the most used high-speed interconnect on the TOP500
- InfiniBand is the preferred interconnect for artificial intelligence and deep learning systems
- Mellanox enables the highest ROI for machine learning, high-performance, cloud, storage, big data and other applications
- Paving the road to exascale performance

TOP500 Interconnect Trends
- The TOP500 list has evolved to include both HPC and cloud / Web 2.0 hyperscale platforms
- For the HPC platforms, InfiniBand continues its leadership as the most used interconnect solution
- Mellanox connects all 25, 40 and 100G Ethernet systems

TOP500 Interconnect Trends (Ethernet Speed Details)
- The TOP500 list has evolved to include both HPC and cloud / Web 2.0 hyperscale platforms
- For the HPC platforms, InfiniBand continues its leadership as the most used interconnect solution
- Mellanox connects all 25, 40 and 100G Ethernet systems

New TOP500 HPC Systems, June '17 - Nov '17 (6-Month Period)
- New HPC systems added to the TOP500 list between June '17 and Nov '17
- InfiniBand accelerates 6X more new systems compared to OmniPath
- InfiniBand accelerates 15X more new systems compared to Ethernet

TOP500 HPC Systems - TOP500 Interconnect Trends
- The TOP500 list includes HPC and non-HPC systems
- InfiniBand is the interconnect of choice for HPC systems

InfiniBand Solutions - TOP100, 200, 300, 400, 500
- InfiniBand is the most used high-speed interconnect of the TOP500
- Superior performance, scalability, efficiency and return on investment

InfiniBand Solutions - TOP100, 200, 300, 400, 500 - HPC Systems Only (Excluding Cloud, Web 2.0, etc.)
- InfiniBand is the most used interconnect for HPC systems
- Superior performance, scalability, efficiency and return on investment

Maximum Efficiency and Return on Investment
- Mellanox smart interconnect solutions enable In-Network Computing and CPU offloading
- Critical with CPU accelerators and higher-scale deployments
- Ensures the highest system efficiency and overall return on investment
- InfiniBand systems: NASA (84% efficiency), NCAR (90%), US Army (94%)
- Omni-Path systems: CINECA (57% efficiency), Barcelona (62%), TACC (53%)
- 1.7X higher system efficiency; 43% of system resources not utilized
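For reference, TOP500 system efficiency is the ratio of delivered LINPACK performance to theoretical peak. A minimal sketch of the arithmetic behind the highlighted figures, assuming the 1.7X ratio pairs the 90%-efficient NCAR system with the 53%-efficient TACC system, and the 43% figure is the unused share of the 57%-efficient CINECA system:

\[
\text{Efficiency} = \frac{R_{\text{max}}}{R_{\text{peak}}}, \qquad
\frac{0.90}{0.53} \approx 1.7, \qquad
1 - 0.57 = 0.43 .
\]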

Mellanox Interconnect Technology Enables a Variety of Applications
- Interconnect technologies: RDMA, GPUDirect, SHARP, NCCL, MPI, NVMe and NVMe over Fabrics, software-defined storage
- Application areas: HPC, big data storage, Hadoop / Spark, SQL and NoSQL databases, deep learning, image and video recognition, voice recognition and search, sentiment analysis, fraud and flaw detection

Mellanox In-Network Computing Delivers Highest Performance
- In-Network Computing: 10X performance acceleration, critical for HPC and machine learning applications
- In-Network Computing: 35X performance acceleration, delivers highest application performance
- Self-healing technology: 5000X faster network recovery, unbreakable data centers
- GPUDirect RDMA: GPU acceleration technology, 10X performance acceleration, critical for HPC and machine learning applications
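The collective operations that in-network computing accelerates are the standard MPI ones. Below is a minimal C sketch of a non-blocking allreduce whose reduction can be performed in the fabric when a SHARP-capable InfiniBand network and a SHARP-enabled MPI library are available; that environment is an assumption, not something the code verifies, and without offload the same code still runs with the reduction done on the CPUs.

```c
/* Minimal sketch: non-blocking allreduce that a SHARP-capable fabric
 * can offload (environment assumption); otherwise the MPI library
 * performs the reduction on the host CPUs. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double local = (double)rank;   /* each rank contributes one value */
    double sum = 0.0;
    MPI_Request req;

    /* Post the collective, then overlap it with local work. */
    MPI_Iallreduce(&local, &sum, 1, MPI_DOUBLE, MPI_SUM,
                   MPI_COMM_WORLD, &req);

    /* ... application work can proceed here while the network
     *     (or the MPI progress engine) reduces the data ... */

    MPI_Wait(&req, MPI_STATUS_IGNORE);

    if (rank == 0)
        printf("global sum = %f\n", sum);

    MPI_Finalize();
    return 0;
}
```

The non-blocking form is what makes offload worthwhile: between MPI_Iallreduce and MPI_Wait the CPU stays free for computation while the data moves and is aggregated in the network.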

Mellanox Intelligent Network Unlocks the Power of Artificial Intelligence
- RDMA supercharges leading AI frameworks (e.g. Cognitive Toolkit): 60% higher ROI
- Mellanox supercharges leading AI companies: 50% lower CapEx and OpEx

InfiniBand Delivers the Best Return on Investment
- 30-100% higher return on investment
- Up to 50% savings on capital and operating expenses
- Highest application performance, scalability and productivity
- Weather: 1.3X better; Automotive: 2X better; Chemistry: 1.4X better; Molecular dynamics: 1.7X better; Genomics: 1.3X better

InfiniBand for Automotive Design - LS-DYNA Application Example
- InfiniBand always wins: best performance, best ROI
- System cost: OmniPath (even with a free network) $491,280 vs. InfiniBand $278,238
- $213,042 savings, 43% lower TCO
- Based on 32-node scalable unit performance
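The savings figure follows directly from the two quoted system costs; the same arithmetic reproduces the 45% and 40% figures on the Fluent and VASP slides that follow:

\[
\$491{,}280 - \$278{,}238 = \$213{,}042, \qquad
\frac{213{,}042}{491{,}280} \approx 43\% .
\]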

InfiniBand for Computational Fluid Dynamics - Fluent Application Example
- InfiniBand always wins: best performance, best ROI
- System cost: OmniPath (even with a free network) $982,560 vs. InfiniBand $540,870
- $441,690 savings, 45% lower TCO
- Based on 64-node scalable unit performance

InfiniBand for Chemistry Simulation - VASP Application Example
- InfiniBand always wins: best performance, best ROI
- System cost: OmniPath (even with a free network) $245,640 vs. InfiniBand $146,922
- $98,718 savings, 40% lower TCO
- Based on 16-node scalable unit performance

Interconnect Technology: The Need for Speed and Intelligence
- [Chart: interconnect speed (40G QDR, 56G FDR, 100G EDR, 200G HDR, 400G NDR) versus system size (100 to 1,000,000 nodes), with example applications: LS-DYNA (FEA), OpenFOAM (CFD), homeland security, weather, cosmological simulations, human genome, the Large Hadron Collider (CERN), brain mapping]

The Intelligent Interconnect Overcomes Performance Bottlenecks
- CPU-centric (onload): must wait for the data, creates performance bottlenecks
- Data-centric (offload): analyze data as it moves!
- Faster data speeds and In-Network Computing enable higher performance and scale

In-Network Computing Enables the Data-Centric Data Center
- Faster data speeds and In-Network Computing enable higher performance and scale

Highest-Performance 100Gb/s Interconnect Solutions
- Adapters: 100Gb/s, 0.6us latency, 175-200 million messages per second (10 / 25 / 40 / 50 / 56 / 100Gb/s)
- InfiniBand switch: 36 EDR (100Gb/s) ports, <90ns latency, 7.2Tb/s throughput, 7.02 billion msg/sec (195M msg/sec per port)
- Ethernet switch: 32 100GbE ports or 64 25/50GbE ports (10 / 25 / 40 / 50 / 100GbE), 3.2Tb/s throughput
- Transceivers, active optical and copper cables (10 / 25 / 40 / 50 / 56 / 100Gb/s); VCSELs, silicon photonics and copper
- Software: MPI, SHMEM/PGAS, UPC for commercial and open-source applications; leverages hardware accelerations
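The switch figures are consistent with simple per-port arithmetic, assuming throughput is counted over both directions of every port:

\[
36 \times 100\ \text{Gb/s} \times 2 = 7.2\ \text{Tb/s}, \qquad
36 \times 195\ \text{M msg/s} \approx 7.02\ \text{B msg/s}.
\]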

Highest-Performance 200Gb/s Interconnect Solutions
- Adapters: 200Gb/s, 0.6us latency, 200 million messages per second (10 / 25 / 40 / 50 / 56 / 100 / 200Gb/s)
- InfiniBand switch: 40 HDR (200Gb/s) ports or 80 HDR100 ports, 16Tb/s throughput, <90ns latency
- Ethernet switch: 16 400GbE, 32 200GbE or 128 25/50GbE ports (10 / 25 / 40 / 50 / 100 / 200GbE), 6.4Tb/s throughput
- Transceivers, active optical and copper cables (10 / 25 / 40 / 50 / 56 / 100 / 200Gb/s); VCSELs, silicon photonics and copper
- Software: MPI, SHMEM/PGAS, UPC for commercial and open-source applications; leverages hardware accelerations
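The same bidirectional-counting assumption reproduces the HDR switch figures, with each 200Gb/s HDR port splittable into two 100Gb/s HDR100 ports:

\[
40 \times 200\ \text{Gb/s} \times 2 = 16\ \text{Tb/s}, \qquad
40 \times 2 = 80\ \text{HDR100 ports}.
\]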

Upcoming HDR InfiniBand Increases Mellanox Leadership
- HDR InfiniBand: the most scalable switch
- 80-port top-of-rack switch (1.7X better)
- 3,200 nodes with only a 2-tier switch network (2.8X better)
- 128K nodes with only a 3-tier switch network (4.6X better)
- 400 nodes: 1.6X switch savings, 2X cable savings (InfiniBand: 15 ToR switches; OmniPath: 24 ToR switches)
- 1,600 nodes: 4X cable savings, 2X power savings (InfiniBand: 1 modular switch; OmniPath: 64 ToR + 2 modular switches)
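The node counts match the standard fat-tree capacity of an 80-port switch radix, assuming a non-blocking configuration in which half of each switch's ports in the lower tiers face downward:

\[
\frac{80}{2} \times 80 = 3{,}200 \ \text{nodes (2 tiers)}, \qquad
\frac{80}{2} \times \frac{80}{2} \times 80 = 128{,}000 \ \text{nodes (3 tiers)}.
\]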

End-to-End Interconnect Solutions for All Platforms
- Highest performance and scalability for x86, POWER, GPU, ARM and FPGA-based compute and storage platforms
- 10, 20, 25, 40, 50, 56, 100 and 200Gb/s speeds
- Smart interconnect to unleash the power of all compute architectures

InfiniBand - The Smart Choice for HPC Platforms and Applications
- "We chose a co-design approach supporting in the best possible manner our key applications. The only interconnect that really could deliver that was Mellanox's InfiniBand."
- "In HPC, the processor should be going 100% of the time on a science question, not on a communications question. This is why the offload capability of Mellanox's network is critical."
- "One of the big reasons we use InfiniBand and not an alternative is that we've got backwards compatibility with our existing solutions."
- "InfiniBand is the most advanced interconnect technology in the world, with dramatic communication overhead reduction that fully unleashes cluster performance."
- "We have users that move tens of terabytes of data and this needs to happen very, very rapidly. InfiniBand is the way to do it."
- "InfiniBand is the best that is required for our applications. It enhances and unlocks the potential of the system."

TOP500 Mellanox-Accelerated Supercomputers (Examples)
- Wuxi Supercomputing Center: 93 Petaflops, world's fastest supercomputer, 41K nodes
- Japan Agency for Marine-Earth Science and Technology: 19 Petaflops, Japan's fastest supercomputer
- NASA Ames Research Center: 6 Petaflops, enabling space exploration, 20K nodes

TOP500 Mellanox-Accelerated Supercomputers (Examples)
- Total Exploration Production: 5.3 Petaflops, enabling energy exploration
- National Center for Atmospheric Research: 4.8 Petaflops, enabling weather research
- Exploration & Production, ENI S.p.A.: 3.2 Petaflops, enabling energy exploration

Proven Advantages
- Delivers the best return on investment
- Scalable, intelligent, flexible, high-performance, end-to-end connectivity
- Standards-based (InfiniBand, Ethernet), supported by a large ecosystem
- Supports all compute architectures: x86, POWER, ARM, GPU, FPGA, etc.
- Offloading architecture: RDMA, application acceleration engines, etc.
- Flexible topologies: Fat Tree, Mesh, 3D Torus, Dragonfly+, etc.
- Converged I/O: compute, storage and management on a single fabric
- Backward and future compatible
- The future depends on smart interconnect

Thank You