HPC and AI Solution Overview. Garima Kochhar HPC and AI Innovation Lab
- Timothy Stokes
- 5 years ago
Transcription
1 HPC and AI Solution Overview. Garima Kochhar, HPC and AI Innovation Lab
2 Dell EMC HPC and DL team charter
- Design, develop and integrate HPC and DL systems: flexible reference architectures; systems tuned for research computing, manufacturing, life sciences, oil and gas, etc.
- Act as the focal point for joint R&D activities: technology collaboration with partners for joint innovation; research coordination with DSC, COEs and customers
- New investment: more SMEs, huge innovation ecosystem
- HPC Innovation Lab: technical briefings, tours, remote access
- Conduct application performance studies and develop best practices: white papers, blogs, presentations
- Prototype and evaluate advanced technologies: HPC+Cloud, HPC+Big Data, NVMe, FPGAs, containers, DL/ML workloads, etc.
3 World-class infrastructure in the Innovation Lab
13K ft² lab, 1,300+ servers, ~10PB storage dedicated to HPC in collaboration with the community.
Zenith: TOP500-class system based on Intel Scalable Systems Framework (OPA, KNL, Xeon)
- 324 nodes with Intel Xeon Gold 6148-F processors, Omni-Path fabric and 655TF sustained performance
- Intel Xeon Phi (KNL) servers
- 805TF combined performance; #292 on Top500, 1.4PFlop/s theoretical peak
- Isilon H600, F800
Rattler: research/development system with Mellanox, NVIDIA and Bright Computing
- 88 nodes with Intel Xeon Gold 6148 processors and EDR InfiniBand
- 16 nodes with Intel Xeon Gold 6148 processors and 4 V100 GPUs each
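The sustained and peak figures quoted for Zenith can be related through a simple efficiency ratio. The sketch below assumes that the 805TF combined figure and the 1.4PFlop/s theoretical peak describe the same system and the same benchmark run, which the slide suggests but does not state outright; the function name is hypothetical.

```python
def sustained_to_peak_ratio(sustained_tflops: float, peak_tflops: float) -> float:
    """Return sustained performance as a fraction of theoretical peak."""
    if peak_tflops <= 0:
        raise ValueError("peak_tflops must be positive")
    return sustained_tflops / peak_tflops


# Figures taken from the slide; pairing them is an assumption.
zenith_sustained_tf = 805.0   # "805TF combined performance"
zenith_peak_tf = 1400.0       # "1.4PFlop/s theoretical peak"

efficiency = sustained_to_peak_ratio(zenith_sustained_tf, zenith_peak_tf)
print(f"Sustained/peak ratio: {efficiency:.1%}")
```

A ratio well below 100% is normal for large mixed Xeon and Xeon Phi systems, so the number is best read as a rough sanity check rather than a tuning target.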
4 Focus areas
- HPC software stack: Bright Cluster Manager, OpenHPC; integration of all software components
- Compute performance and tuning: application focus + BIOS, memory, interconnect
- Accelerators and co-processors: different workloads
- Interconnect performance and tuning
- Storage solutions: Isilon, NSS, Lustre
- Vertical solutions: genomics research, CFD/manufacturing
- Proof of Concept studies: containers, FPGAs, NVMeoF, etc.
5 Compute - performance - tuning
6 Careful memory configuration - Skylake
[Figure: Stream Triad GB/s (higher is better), and relative memory bandwidth of 16 DIMMs compared to 12 DIMMs (higher is better): 384 GB (12x32GB) vs 512 GB (16x32GB) on processor SKUs 6138 and 6142 at 2666 MT/s]
Impact of unbalanced 512 GB memory configuration. Balanced and near-balanced configurations:
- Balanced: 12 x 16GB = 192 GB
- Balanced: 12 x 32GB = 384 GB
- Balanced: 24 x 32GB = 768 GB
- Near balanced: 12 x 16GB + 12 x 32GB = 576 GB
- Near balanced: 12 x 8GB + 12 x 16GB = 288 GB
Unbalanced configurations are very bad for performance. Balanced and near-balanced configurations are ideal for HPC.
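The balanced, near-balanced and unbalanced labels above can be reproduced with a small check. This is a simplified sketch, not Dell's own rule: it assumes a two-socket Skylake server with six memory channels per socket (12 channels in total), and it treats a configuration as unbalanced whenever a group of identical DIMMs cannot be spread evenly across all channels. The function and constant names are hypothetical.

```python
from typing import List, Tuple

CHANNELS_PER_SOCKET = 6  # Skylake-SP has six memory channels per socket
SOCKETS = 2              # assumption: two-socket server


def classify_memory_config(
    dimm_groups: List[Tuple[int, int]],
    channels: int = CHANNELS_PER_SOCKET * SOCKETS,
) -> str:
    """Classify a list of (dimm_count, dimm_size_gb) groups.

    - "unbalanced": some group cannot be spread evenly over all channels
    - "balanced": one DIMM type, evenly spread
    - "near balanced": several DIMM types, each evenly spread
    """
    if not dimm_groups:
        raise ValueError("at least one DIMM group is required")
    for count, size_gb in dimm_groups:
        if count <= 0 or size_gb <= 0:
            raise ValueError("DIMM counts and sizes must be positive")
        if count % channels != 0:
            return "unbalanced"
    return "balanced" if len(dimm_groups) == 1 else "near balanced"


def total_capacity_gb(dimm_groups: List[Tuple[int, int]]) -> int:
    """Sum the capacity of all DIMM groups in GB."""
    return sum(count * size_gb for count, size_gb in dimm_groups)


# Configurations listed on the slide, plus the unbalanced 512 GB case.
configs = [
    [(12, 16)],
    [(12, 32)],
    [(24, 32)],
    [(12, 16), (12, 32)],
    [(12, 8), (12, 16)],
    [(16, 32)],
]
for cfg in configs:
    print(cfg, total_capacity_gb(cfg), "GB ->", classify_memory_config(cfg))
```

With 16 DIMMs on 12 channels, four channels carry two DIMMs and eight carry one, which is the uneven population the slide warns about.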
7 ANSYS Fluent Two Socket System Performance
[Figure: ANSYS Fluent v17.2, aircraft_wing_14m; performance relative to E5-2697A v4 and relative performance per core, grouped by IVB, HSW, BDW and SKX]
Systems compared:
- E v2 12c, 2.7 GHz, 130W, 1866 MT/s
- E v3 10c, 2.6/2.2 GHz, 105W, 2133 MT/s
- E v3 14c, 2.6/2.2 GHz, 145W, 2133 MT/s
- E5-2697A v4 16c, 2.6/2.2 GHz, 145W, 2400 MT/s
- E v4 18c, 2.3/2.0 GHz, 145W, 2400 MT/s
- c, 2.1GHz, 125W, 12x16GB 2666 MT/s
- c, 3.0GHz, 150W, 24x16GB 2666 MT/s
- c, 2.6GHz, 150W, 12x16GB 2666 MT/s
- c, 2.4GHz, 150W, 12x16GB 2666 MT/s
- c, 2.7GHz, 165W, 12x16GB 2666 MT/s
- c, 2.7GHz, 205W, 12x16GB 2666 MT/s
Skylake provides significantly better performance for ANSYS Fluent relative to Broadwell. Relative performance depends on the specific benchmark dataset: 1.2 to 1.4 for
8 AMD EPYC - WRF multi-node tests
[Figure: two panels, WRF conus 12km and WRF conus 2.5km; relative performance over 16 nodes (higher is better) for 1, 2, 4, 8 and 16 EPYC nodes, two EPYC series plus a linear scaling reference]
EPYC 7601 is ~3% better than Base frequency 10% faster, Turbo 6%.
9 Interconnects
10 EDR Latency w/ c-states
[Figure: EDR latency (us) versus message size (bytes), C-states enabled vs disabled; series: SKL C-states enabled with switch, SKL C-states enabled B2B, SKL C-states disabled with switch, SKL C-states disabled B2B]
SKL: Intel Xeon Gold, 2.6GHz, 16C.
C-states enabled and disabled latency results are about the same.
11 ANSYS Fluent Small Model Scaling
[Figure: ANSYS Fluent v17.2, ice_2m; solver rating (higher is better) and performance relative to E v4, both systems with EDR, versus number of cores (nodes): 1 node, then 72 (2), 144 (4), 216 (6), 288 (8), 432 (12), 576 (16)]
With a small data set, the performance advantage of Skylake decreases at scale but remains positive. A model of this size is unlikely to be run on more than a few nodes.
12 GPGPU and accelerators
13 AMBER: V100 vs P100
[Figure: AMBER STMV, ns/day (higher is better); PCIe and SXM2 cards; 13G C4130 with P100 vs 14G C4140 with V100; Configuration B and Configuration K]
- V100 is significantly faster: 1x V100 is faster than 4x P100.
- SXM2 is better than PCIe: 9% on P100, up to 30% on V100.
- SXM2 has a slightly higher frequency than the PCIe card.
14 HPC storage solutions - NSS - IEEL - Isilon - Proof Of Concepts
15 Dell Storage for HPC with Intel EE for Lustre Solution
Turn-key solution designed for high speed fast scratch storage.
Solution benefits & Dell differentiation:
- Parallel scalable file system based on Intel EE for Lustre software
- Single file system namespace scalable to high capacities and performance
- Best practices developed by Dell HPC Engineering provide optimal performance on Dell hardware
- Tests yield peaks of roughly 15GB/s write and 17GB/s read per building block
- Lustre Distributed Namespace (DNE) allows distribution of Lustre sub-directories across multiple MDTs to increase metadata capacity and performance
- Share data with other file systems utilizing optional NFS/CIFS gateway
[Diagram labels: Dell Networking 10/40GbE, InfiniBand or Omni-Path; Dell PowerVault MD3460; Dell PowerVault MD3420; Intel Manager for Lustre (PowerEdge R630); MDS pair (Dell PowerEdge R730, active/passive); OSS pair (Dell PowerEdge R730, active/active); 12 Gbps SAS failover connections; Dell PowerVault MD3420 (optional for DNE)]
16 IEEL3.0+OPA
17 Support for Isilon for Scalable NFS - Sequential performance (N-N) - Write
18 Deep Learning
19 TensorFlow+Horovod on multiple V100 nodes
[Figure: TensorFlow+Horovod ResNet-50 on multi-node V100; images/sec and speedup versus number of V100 GPUs]
- TensorFlow+Horovod scales well on multiple nodes: 12.5x speedup with 16 V100 (4 nodes).
- ibverbs and MPI are used for inter-node communication.
- 4 nodes with V100-PCIe GPUs are used.
- FP32 mode; batch size is 128 per GPU.
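The quoted speedup can be turned into a scaling efficiency, and the per-GPU batch size into a global batch size. The sketch below assumes the 12.5x speedup is measured against a single V100, which the slide does not state explicitly; the function names are hypothetical.

```python
def scaling_efficiency(speedup: float, n_gpus: int) -> float:
    """Speedup divided by GPU count; 1.0 would be perfect linear scaling."""
    if n_gpus <= 0:
        raise ValueError("n_gpus must be positive")
    return speedup / n_gpus


def global_batch_size(per_gpu_batch: int, n_gpus: int) -> int:
    """Effective batch size for synchronous data-parallel training."""
    return per_gpu_batch * n_gpus


# Values from the slide: 12.5x speedup on 16 V100 GPUs, batch size 128 per GPU.
eff = scaling_efficiency(12.5, 16)
batch = global_batch_size(128, 16)
print(f"Scaling efficiency: {eff:.1%}")
print(f"Global batch size: {batch}")
```

Under that baseline assumption, 16 GPUs deliver roughly 78% of ideal linear scaling, and each synchronous training step processes 2048 images.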
20 Ready Bundle for Deep Learning - NVIDIA
- Open source frameworks: TensorFlow, MXNet, CNTK, Theano, Torch, Caffe/Caffe2
- Neural network libraries: MLPython, CaffeOnSpark, cuDNN, cuBLAS, NCCL, Keras, GIE
Platform: C4140
- Processor: 2 x Intel Xeon CPU 6148
- Memory: 384GB 2400MHz
- Drives: 2x200GB 1.8 SSD
- Network: Mellanox ConnectX-5 VPI (EDR 100Gb/s)
- GPU: 4x V100-16GB SXM2
Software & Firmware [Reference]
- Operating System: RHEL 7.4 x86_64
- Provisioning and Management: Bright Cluster Manager
21 Tying it together - Access to the lab - White papers and Blogs
22 Recent Publications
- Design Principles for HPC
- 14G with Skylake - how much better for HPC?
- BIOS characterization for HPC with Intel Skylake processor
- Performance study of four Socket PowerEdge R940 Server with Intel Skylake processors
- Dell EMC HPC Systems - SKY is the limit
- Entering a new arena in Computing - KNL
- System benchmark results on KNL - STREAM and HPL
- NAMD Performance Analysis on Skylake Architecture
- LAMMPS Four Node Comparative Performance Analysis on Skylake Processors
- De novo assembly with PowerEdge R940
- HPCG Performance study with Intel Skylake processors
- Dell EMC HPC System for Life Science v1.1
- Skylake memory study
- HPC Applications Performance on V100
- Application Performance on P100-PCIe GPUs
- Containerizing HPC Applications with Singularity
- HPCG Performance study with Intel KNL
- Performance of LS-DYNA on Singularity Containers
- Scaling Deep Learning on Multiple V100 Nodes
- Deep Learning on V100
- Deep Learning Inference on P40 vs P4 with Skylake
- Deep Learning Inference on P40 GPUs
- Deep Learning Performance with Intel Caffe - Training, CPU model choice and Scalability
- Deep Learning Performance on R740 with V100 PCIe GPUs
- Getting Started With OpenHPC
- Dell EMC Isilon F800 and F600 I/O Performance
- Dell EMC Isilon F800 and H600 Whole Genome Analysis Performance
- Digital Manufacturing with 14G
TESLA P PERFORMANCE GUIDE HPC and Deep Learning Applications MAY 217 TESLA P PERFORMANCE GUIDE Modern high performance computing (HPC) data centers are key to solving some of the world s most important
More informationMemory Selection Guidelines for High Performance Computing with Dell PowerEdge 11G Servers
Memory Selection Guidelines for High Performance Computing with Dell PowerEdge 11G Servers A Dell Technical White Paper By Garima Kochhar and Jacob Liberman High Performance Computing Engineering Dell
More informationSOLUTIONS BRIEF: Transformation of Modern Healthcare
SOLUTIONS BRIEF: Transformation of Modern Healthcare Healthcare & The Intel Xeon Scalable Processor Intel is committed to bringing the best of our manufacturing, design and partner networks to enable our
More informationICON Performance Benchmark and Profiling. March 2012
ICON Performance Benchmark and Profiling March 2012 Note The following research was performed under the HPC Advisory Council activities Participating vendors: Intel, Dell, Mellanox Compute resource - HPC
More informationLow-Overhead Flash Disaggregation via NVMe-over-Fabrics Vijay Balakrishnan Memory Solutions Lab. Samsung Semiconductor, Inc.
Low-Overhead Flash Disaggregation via NVMe-over-Fabrics Vijay Balakrishnan Memory Solutions Lab. Samsung Semiconductor, Inc. 1 DISCLAIMER This presentation and/or accompanying oral statements by Samsung
More informationIn partnership with. VelocityAI REFERENCE ARCHITECTURE WHITE PAPER
In partnership with VelocityAI REFERENCE JULY // 2018 Contents Introduction 01 Challenges with Existing AI/ML/DL Solutions 01 Accelerate AI/ML/DL Workloads with Vexata VelocityAI 02 VelocityAI Reference
More informationEfficient Communication Library for Large-Scale Deep Learning
IBM Research AI Efficient Communication Library for Large-Scale Deep Learning Mar 26, 2018 Minsik Cho (minsikcho@us.ibm.com) Deep Learning changing Our Life Automotive/transportation Security/public safety
More informationInfiniBand Networked Flash Storage
InfiniBand Networked Flash Storage Superior Performance, Efficiency and Scalability Motti Beck Director Enterprise Market Development, Mellanox Technologies Flash Memory Summit 2016 Santa Clara, CA 1 17PB
More informationOpenPOWER Performance
OpenPOWER Performance Alex Mericas Chief Engineer, OpenPOWER Performance IBM Delivering the Linux ecosystem for Power SOLUTIONS OpenPOWER IBM SOFTWARE LINUX ECOSYSTEM OPEN SOURCE Solutions with full stack
More informationMELLANOX EDR UPDATE & GPUDIRECT MELLANOX SR. SE 정연구
MELLANOX EDR UPDATE & GPUDIRECT MELLANOX SR. SE 정연구 Leading Supplier of End-to-End Interconnect Solutions Analyze Enabling the Use of Data Store ICs Comprehensive End-to-End InfiniBand and Ethernet Portfolio
More informationPOWEREDGE RACK SERVERS
QUICK REFERENCE GUIDE POWEREDGE RACK SERVERS Dell EMC PowerEdge rack servers help you build a modern infrastructure that minimizes IT challenges and business success. Choose from a complete portfolio of
More information