Paving the Road to Exascale Computing
Yossi Avni
HPC@mellanox.com
Connectivity Solutions for Efficient Computing
- Market segments: Enterprise HPC, high-end HPC, HPC clouds
- Mellanox interconnect networking solutions: ICs, adapter cards, host/fabric software, switches/gateways, cables
- Leading connectivity solution provider for servers and storage
2011 MELLANOX TECHNOLOGIES - MELLANOX CONFIDENTIAL - 2
Complete End-to-End Connectivity
- Host/fabric software management: UFM, FabricIT; integration with job schedulers; inbox drivers
- Application accelerations: collectives acceleration (FCA/CORE-Direct), GPU acceleration (GPUDirect), MPI/SHMEM, RDMA, Quality of Service
- Networking efficiency/scalability: adaptive routing, congestion management, Traffic-Aware Routing (TARA)
- Server and storage high-speed connectivity: latency, bandwidth, CPU utilization, message rate
Mellanox's Interconnect Leadership
- Highest performance: highest throughput, lowest latency, CPU availability, message rate, RDMA
- End-to-end quality, from silicon to system: auto-negotiation, power management, signal integrity, cable reach
- Advanced HPC: GPU acceleration, adaptive routing, congestion control, MPI/SHMEM offloads, topologies/routing
- Complete eco-system
InfiniBand Link Speed Roadmap
Per-lane signaling rates: 5G-IB (DDR), 10G-IB (QDR), 14G-IB (FDR, 14.0625 Gb/s), 26G-IB (EDR, 25.78125 Gb/s)

Per-link bandwidth, rounded, in Gb/s per direction (shown as send+receive):

  Lanes | DDR (5G) | QDR (10G) | FDR (14G) | EDR (26G)
  1x    | 5+5      | 10+10     | 14+14     | 25+25
  4x    | 20+20    | 40+40     | 56+56     | 100+100
  8x    | 40+40    | 80+80     | 112+112   | 200+200
  12x   | 60+60    | 120+120   | 168+168   | 300+300

Timeline (market demand): DDR 2005-2006, QDR 2007-2008, FDR 2009-2011, EDR 2014. Future generations HDR and NDR are planned in the same 1x/4x/8x/12x lane configurations.
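The roadmap numbers above can be sanity-checked from first principles: effective data bandwidth per direction is lanes × per-lane signaling rate × encoding efficiency (8b/10b for DDR/QDR, 64b/66b for FDR/EDR). A minimal sketch, with hypothetical names of my own choosing; rates are taken from the roadmap table:

```python
# Per-link InfiniBand data bandwidth from per-lane signaling rate and encoding.
# Illustrative only; function and table names are not from the slides.

ENCODING = {
    "DDR": 8 / 10,   # 8b/10b encoding
    "QDR": 8 / 10,   # 8b/10b encoding
    "FDR": 64 / 66,  # 64b/66b encoding
    "EDR": 64 / 66,  # 64b/66b encoding
}

SIGNALING_GBPS = {"DDR": 5.0, "QDR": 10.0, "FDR": 14.0625, "EDR": 25.78125}

def link_data_rate_gbps(generation: str, lanes: int) -> float:
    """Effective data rate per direction, in Gb/s."""
    return lanes * SIGNALING_GBPS[generation] * ENCODING[generation]

print(link_data_rate_gbps("EDR", 4))            # 4x EDR: exactly 100.0 Gb/s of data
print(round(link_data_rate_gbps("FDR", 4), 1))  # 4x FDR: ~54.5 Gb/s data (marketed as 56 Gb/s signaling)
print(link_data_rate_gbps("QDR", 4))            # 4x QDR: 32.0 Gb/s data (marketed as 40 Gb/s signaling)
```

Note that EDR's 25.78125 Gb/s lane rate was chosen so that 64b/66b encoding yields exactly 25 Gb/s of data per lane, i.e. 100 Gb/s for a 4x link.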
Next Generation InfiniBand Technology
- Available 2011, end-to-end: adapters, switches, cables
- Highest-throughput connectivity for server and storage
Scalable MPI Collectives Acceleration with FCA
- Offloading/acceleration management at the node (FCA)
- Offloading at the HCA (CORE-Direct)
- Offloading at the network/switches (iCPU)
- ~20% performance increase at 16 nodes
Most scalable offloading for MPI applications
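The collectives that FCA and CORE-Direct offload, such as allreduce, scale as O(log2 P) communication rounds rather than O(P). A toy simulation of the recursive-doubling pattern, purely illustrative and not Mellanox's actual algorithm:

```python
# Toy model of a recursive-doubling allreduce across P ranks (P a power of two).
# Each round, rank r exchanges partial sums with rank r XOR dist; after
# log2(P) rounds every rank holds the global sum. Illustration only.

def allreduce_sum(values):
    """Simulate allreduce(sum): return (per-rank results, number of rounds)."""
    p = len(values)
    assert p & (p - 1) == 0, "P must be a power of two for recursive doubling"
    vals = list(values)
    steps = 0
    dist = 1
    while dist < p:
        vals = [vals[r] + vals[r ^ dist] for r in range(p)]
        dist *= 2
        steps += 1
    return vals, steps  # steps == log2(p) communication rounds

vals, steps = allreduce_sum([1, 2, 3, 4, 5, 6, 7, 8])
print(vals)   # every rank ends with 36
print(steps)  # 3 rounds for 8 ranks
```

The point of HCA/switch offload is that these rounds progress without host CPU involvement, so the log2(P) critical path overlaps with computation instead of stalling it.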
Mellanox Message Rate Performance Results
- Highest MPI message rate: 90 million messages per second
- Highest InfiniBand message rate: 23 million messages per second
- PPN = processes per node (i.e., cores per node)
Network Utilization via Traffic-Aware Routing
- Job submitted in scheduler; jobs matched automatically
- Application-level monitoring and optimization measurements
- Fabric-wide policy pushed to match application requirements
Maximizing network utilization
Hardware-Based Congestion Control

  Network latency        | % improvement
  Ping-pong latency      | 88%
  Natural ring latency   | 81.6%
  Random ring latency    | 81.3%
  Ping-pong bandwidth    | 85.5%

  Applications (HPCC)    | % improvement
  PTRANS                 | 76%
  FFT                    | 40%

For more performance examples: "First Experiences with Congestion Control in InfiniBand Hardware"; Ernst Gunnar Gran, Magne Eimot, Sven-Arne Reinemo, Tor Skeie, Olav Lysne, Lars Paul Huse, Gilad Shainer; IPDPS 2010.
Congestion-free network for highest efficiency
Highest Performance GPU Clusters with GPUDirect
- GPU computing mandates Mellanox solutions
- GPUDirect: 35% application performance increase at 3 nodes
Mellanox InfiniBand accelerates GPU communications
Superior InfiniBand Solutions
Target markets: university and academic labs, research, cloud and Web 2.0, financial, computer-aided engineering, clustered databases, bioscience, oil and gas, weather, digital media

Mellanox connectivity solutions:
- Performance: 45% lower latency, highest throughput, and 3x the message rate
- Scalability: proven for Petascale computing; highest scalability through accelerations
- Reliability: from silicon to system; highest signal integrity; two orders of magnitude lower BER
- Efficiency: highest CPU/GPU availability through complete offloading; low power consumption
- Certification: complete ISV support and qualification, MPI vendors, job schedulers
- Return on investment: most cost-effective, simple to manage, 40Gb/s end-to-end connectivity
Bottom Line: Mellanox Benefits for HPC
Segments: high-end HPC, enterprise HPC, HPC clouds, entry-level HPC
- Performance: 100+% increase
- TCO: 50+% reduction
- Energy costs: 65+% reduction
- Infrastructure: 60+% saving
Complete high-performance, scalable interconnect solutions for server and storage
Performance Leadership Across Industries
- 30%+ of Fortune-100 and top global high-performance computers
- 6 of top 10 global banks
- 9 of top 10 automotive manufacturers
- 4 of top 10 pharmaceutical companies
- 7 of top 10 oil and gas companies
Thinking, Designing and Building Scalable HPC
HPC@mellanox.com
Thank You