The Effect of HPC Cluster Architecture on the Scalability Performance of CAE Simulations
1 The Effect of HPC Cluster Architecture on the Scalability Performance of CAE Simulations Pak Lui HPC Advisory Council June 7,
2 Agenda Introduction to HPC Advisory Council Benchmark Configuration Performance Benchmark Testing/Results Summary Q&A / For More Information
3 The HPC Advisory Council Mission Statement World-wide HPC non-profit organization (429+ members) Bridges the gap between HPC usage and its potential Provides best practices and a support/development center Explores future technologies and future developments Leading edge solutions and technology demonstrations
4 HPC Advisory Council Members
5 HPC Advisory Council Cluster Center Dell PowerEdge R node cluster Dell PowerVault MD3420 Dell PowerVault MD3460 HP Proliant XL230a Gen9 10-node cluster HP Cluster Platform 3000SL 16-node cluster InfiniBand Storage (Lustre) Dell PowerEdge C node cluster Dell PowerEdge R node cluster Dell PowerEdge R720xd/R node cluster Dell PowerEdge M node cluster Dell PowerEdge C node cluster White-box InfiniBand-based Storage (Lustre)
6 HPC Training HPC Training Center CPUs GPUs Interconnects Clustering Storage Cables Programming Applications Network of Experts Ask the experts
7 Special Interest Subgroups HPC Scale Subgroup Explore usage of commodity HPC as a replacement for multi-million dollar mainframes and proprietary based supercomputers HPC Storage Subgroup Demonstrate how to build high-performance storage solutions and their effect on application performance and productivity HPC Cloud Subgroup Explore usage of HPC components as part of the creation of external/public/internal/private cloud computing environments. HPC GPU Subgroup Explore usage models of GPU components as part of next generation compute environments and potential optimizations for GPU based computing HPC Works Subgroup Provide best practices for building balanced and scalable HPC systems, performance tuning and application guidelines. HPC Music To enable HPC in music production and to develop HPC cluster solutions that further enable the future of music production
8 University Award Program University award program Universities / individuals are encouraged to submit proposals for advanced research Selected proposal will be provided with: Exclusive computation time on the HPC Advisory Council's Compute Center Invitation to present in one of the HPC Advisory Council's worldwide workshops Publication of the research results on the HPC Advisory Council website 2010 award winner is Dr. Xiangqian Hu, Duke University Topic: Massively Parallel Quantum Mechanical Simulations for Liquid Water 2011 award winner is Dr. Marco Aldinucci, University of Torino Topic: Effective Streaming on Multi-core by Way of the FastFlow Framework 2012 award winner is Jacob Nelson, University of Washington Topic: Runtime Support for Sparse Graph Applications 2013 award winner is Antonis Karalis Topic: Music Production using HPC 2014 award winner is Antonis Karalis Topic: Music Production using HPC 2015 award winner is Christian Kniep Topic: Dockers To submit a proposal please check the HPC Advisory Council web site
9 Exploring All Platforms X86, Power, GPU, FPGA and ARM based Platforms x86 Power GPU FPGA ARM
10 158+ Applications Best Practices Published Abaqus CPMD LS-DYNA MILC AcuSolve Dacapo minife OpenMX Amber Desmond MILC PARATEC AMG DL-POLY MSC Nastran PFA AMR Eclipse MR Bayes PFLOTRAN ABySS FLOW-3D MM5 Quantum ANSYS CFX GADGET-2 MPQC ESPRESSO ANSYS FLUENT GROMACS NAMD RADIOSS ANSYS Mechanics Himeno Nekbone SPECFEM3D BQCD HOOMD-blue NEMO WRF CCSM HYCOM NWChem CESM ICON Octopus COSMO Lattice QCD OpenAtom CP2K LAMMPS OpenFOAM For more information, visit:
11 HPCAC - ISC 16 Student Cluster Competition University-based teams to compete and demonstrate the incredible capabilities of state-of-the-art HPC systems and applications on the International Super Computing HPC (ISC HPC) show-floor The Student Cluster Challenge is designed to introduce the next generation of students to the high performance computing world and community
12 ISC'15 Student Cluster Competition Award Ceremony
13 ISC'16 Student Cluster Competition Teams
14 Getting Ready for the 2016 Student Cluster Competition
15 HPCAC Conferences 2015 Conferences
16 2016 HPC Advisory Council Conferences Introduction HPC Advisory Council (HPCAC) 429+ members, Application best practices, case studies Benchmarking center with remote access for users World-wide workshops Value add for your customers to stay up to date and in tune with the HPC market 2016 Conferences USA (Stanford University) February Switzerland (CSCS) March Mexico TBD Spain (BSC) September 21 China (HPC China) October 26 For more information
17 RADIOSS by Altair Altair RADIOSS Structural analysis solver for highly non-linear problems under dynamic loadings Consists of features for: multiphysics simulation and advanced materials such as composites Highly differentiated for Scalability, Quality and Robustness RADIOSS is used across all industries worldwide Improves crashworthiness, safety, and manufacturability of structural designs RADIOSS has established itself as an industry standard for automotive crash and impact analysis for over 20 years
18 Test Cluster Configuration Dell PowerEdge R node (896-core) Thor cluster Dual-Socket 14-core Intel 2.60 GHz CPUs (Turbo on, Max Perf set in BIOS) OS: RHEL 6.5, OFED MLNX_OFED_LINUX InfiniBand SW stack Memory: 64GB memory, DDR MHz Hard Drives: 1TB 7.2 RPM SATA 2.5 Mellanox ConnectX-4 EDR 100Gb/s InfiniBand VPI adapters Mellanox Switch-IB SB Gb/s InfiniBand VPI switch Mellanox ConnectX-3 40/56Gb/s QDR/FDR InfiniBand VPI adapters Mellanox SwitchX SX Gb/s FDR InfiniBand VPI switch MPI: Intel MPI 5.0.2, Mellanox HPC-X v1.2.0 Application: Altair RADIOSS 13.0 Benchmark datasets: Neon benchmarks: 1 million elements (8ms, Double Precision), unless otherwise stated
19 PowerEdge R730 Massive flexibility for data intensive operations Performance and efficiency Intelligent hardware-driven systems management with extensive power management features Innovative tools including automation for parts replacement and lifecycle manageability Broad choice of networking technologies from GbE to IB Built in redundancy with hot plug and swappable PSU, HDDs and fans Benefits Designed for performance workloads from big data analytics, distributed storage or distributed computing where local storage is key to classic HPC and large scale hosting environments High performance scale-out compute and low cost dense storage in one package Hardware Capabilities Flexible compute platform with dense storage capacity 2S/2U server, 7 PCIe slots Large memory footprint (Up to 1.5TB / 24 DIMMs) High I/O performance and optional storage configurations HDD: SAS, SATA, nearline SAS; SSD: SAS, SATA 16 x 2.5 up to 29TB via 1.8TB hot-plug SAS hard drives 8 x 3.5 up to 64TB via 8TB hot-plug nearline SAS hard drives
20 RADIOSS Performance Interconnect (MPP) EDR InfiniBand provides higher scalability than Ethernet 70 times better performance than 1GbE at 16 nodes / 448 cores 4.8x better performance than 10GbE at 16 nodes / cores Ethernet solutions do not scale beyond 4 nodes with pure MPI 70x 4.8x Intel MPI Higher is better 28 Processes/Node
21 RADIOSS Profiling % Time Spent on MPI RADIOSS utilizes point-to-point communications in most data transfers The most time-consuming MPI calls are MPI_Waitany() and MPI_Wait() MPI_Recv(55%), MPI_Waitany(23%), MPI_Allreduce(13%) MPP Mode 28 Processes/Node
22 RADIOSS Performance Interconnect (MPP) EDR InfiniBand provides better scalability performance EDR IB improves over QDR IB by 28% at 16 nodes / 448 cores EDR InfiniBand outperforms FDR InfiniBand by 25% at 16 nodes 28% 25% Higher is better 28 Processes/Node
23 RADIOSS Performance CPU Cores Running more cores per node generally improves overall performance An improvement of 18% was seen from 20 to 28 cores per node at 8 nodes Improvement is not as consistent at higher node counts Guideline: The most optimal workload distribution is 4000 elements/process For the test case of 1 million elements, the most optimal core count is ~256 cores 4000 elements per process should provide sufficient workload for each process Hybrid MPP (HMPP) provides a way to achieve additional scalability on more CPUs 18% 6% Higher is better Intel MPI
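The sizing guideline above reduces to simple arithmetic; a sketch using the slide's numbers (1 million elements, ~4000 elements per process, which lands near the ~256-core sweet spot):

```shell
# Rule of thumb from the benchmark: ~4000 elements per MPI process.
ELEMENTS=1000000        # Neon test case: 1 million elements
ELEMS_PER_PROC=4000     # guideline workload per process
NPROCS=$((ELEMENTS / ELEMS_PER_PROC))
echo "Suggested MPI process count: $NPROCS"   # 250, i.e. close to ~256 cores
```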
24 RADIOSS Performance IMPI Tuning (MPP) Tuning Intel MPI collective algorithm can improve performance MPI profile shows about 20% of runtime spent on MPI_Allreduce communications Default algorithm in Intel MPI is Recursive Doubling The default algorithm is the best among all tested for MPP Intel MPI Higher is better 28 Processes/Node
25 RADIOSS Performance Hybrid MPP version Enabling Hybrid MPP mode unlocks the RADIOSS scalability At larger scale, productivity improves as more threads are involved As more threads are involved, the amount of communication between processes is reduced At 32 nodes/896 cores, the best configuration is 1 process per socket spawning 14 threads each 28 threads/1 PPN is not advised because it breaks data locality across the CPU sockets The following environment settings and tuned flags are used for Intel MPI: I_MPI_PIN_DOMAIN auto I_MPI_ADJUST_ALLREDUCE 5 I_MPI_ADJUST_BCAST 1 KMP_AFFINITY compact KMP_STACKSIZE 400m ulimit -s unlimited 3.7x 32% 70% Intel MPI EDR InfiniBand
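The environment settings listed on this slide could be applied as shell exports before launching the solver; a sketch (variable names and values come from the slide, while the launch line, rank count, and solver binary name are hypothetical):

```shell
# Intel MPI / OpenMP environment for the Hybrid MPP runs (values from the slide)
export I_MPI_PIN_DOMAIN=auto        # pin each MPI rank to its own domain
export I_MPI_ADJUST_ALLREDUCE=5     # select tuned MPI_Allreduce algorithm
export I_MPI_ADJUST_BCAST=1         # select tuned MPI_Bcast algorithm
export KMP_AFFINITY=compact         # pack OpenMP threads onto adjacent cores
export KMP_STACKSIZE=400m           # enlarge per-thread stack
ulimit -s unlimited 2>/dev/null || true   # may be refused in restricted shells

# Hypothetical launch: 1 rank per socket, 14 threads each (2 PPN x 14 threads)
LAUNCH="mpirun -np 64 -ppn 2 -genv OMP_NUM_THREADS 14 ./radioss_hmpp input"
echo "$LAUNCH"   # assembled only; not executed here
```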
26 RADIOSS Profiling Number of MPI Calls MPP mode utilizes mostly non-blocking calls for communication MPI_Recv, MPI_Waitany, MPI_Allreduce are used most of the time For HMPP, the communication behavior has changed Higher time percentage in MPI_Waitany, MPI_Allreduce, and MPI_Recv MPI communication behavior changed from the previous RADIOSS version Most likely due to more CPU cores available on the current cluster MPP, 28PPN HMPP, 2PPN / 14 Threads
27 RADIOSS Profiling MPI Message Sizes The most time consuming MPI communications are: MPI_Recv: Messages concentrated at 640B, 1KB, 320B, 1280B MPI_Waitany: Messages are: 48B, 8B, 384B MPI_Allreduce: Most message sizes appear at 80B MPP, 28PPN HMPP, 2PPN / 14 Threads Pure MPP 28 Processes/Node
28 RADIOSS Performance Intel MPI Tuning (DP) For Hybrid MPP DP, tuning MPI_Allreduce shows more gain than MPP For the DAPL provider, Binomial gather+scatter #5 improved performance by 27% over the default For the OFA provider, the tuned MPI_Allreduce algorithm improves by 44% over the default Both OFA and DAPL improved by tuning I_MPI_ADJUST_ALLREDUCE=5 Flags for OFA: I_MPI_OFA_USE_XRC 1. For DAPL: ofa-v2-mlx5_0-1u provider 27% 44% Intel MPI Higher is better 2 PPN / 14 OpenMP
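The provider-specific flags above could look like the following in practice; a sketch (I_MPI_ADJUST_ALLREDUCE, I_MPI_OFA_USE_XRC, and the DAPL provider string come from the slide, while I_MPI_FABRICS and I_MPI_DAPL_PROVIDER are my assumed Intel MPI variable names for selecting the provider):

```shell
# Tuned MPI_Allreduce (Binomial gather+scatter, algorithm #5) helped both providers
export I_MPI_ADJUST_ALLREDUCE=5

# OFA provider path: enable the XRC transport (flag from the slide)
export I_MPI_FABRICS=shm:ofa               # assumed fabric-selection variable
export I_MPI_OFA_USE_XRC=1

# DAPL provider path, as an alternative (provider string from the slide):
# export I_MPI_FABRICS=shm:dapl
# export I_MPI_DAPL_PROVIDER=ofa-v2-mlx5_0-1u   # assumed variable name

echo "Allreduce algorithm: $I_MPI_ADJUST_ALLREDUCE (XRC=$I_MPI_OFA_USE_XRC)"
```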
29 RADIOSS Performance Interconnect (HMPP) EDR InfiniBand provides better scalability performance than Ethernet 214% better performance than 1GbE at 16 nodes 104% better performance than 10GbE at 16 nodes InfiniBand typically outperforms other interconnect in collective operations 214% 104% Intel MPI Higher is better 2 PPN / 14 OpenMP
30 RADIOSS Performance Interconnect (HMPP) EDR InfiniBand provides better scalability performance than FDR IB EDR IB outperforms FDR IB by 27% at 32 nodes Improvement for EDR InfiniBand occurs at high node count 27% Intel MPI Higher is better 2 PPN / 14 OpenMP
31 RADIOSS Performance System Generations Intel E5-2680v3 (Haswell) cluster outperforms prior generations Performs faster by 100% vs Jupiter, by 238% vs Janus at 16 nodes System components used: Thor: 2-socket Intel 2133MHz DIMMs, EDR IB, v13.0 Jupiter: 2-socket Intel 1600MHz DIMMs, FDR IB, v12.0 Janus: 2-socket Intel 1333MHz DIMMs, QDR IB, v 238% 100% Single Precision
32 RADIOSS Summary RADIOSS is designed to perform at scale in HPC environments Shows excellent scalability over 896 cores/32 nodes and beyond with Hybrid MPP Hybrid MPP version enhanced RADIOSS scalability 2 MPI processes per node (or 1 MPI process per socket), 14 threads each Additional CPU cores generally accelerate time-to-solution performance Network and MPI Tuning EDR IB outperforms Ethernet in scalability EDR IB delivers higher scalability performance than FDR/QDR IB Tuning environment/parameters to maximize performance Tuning MPI collective ops helps RADIOSS to achieve even better scalability
33 STAR-CCM+ STAR-CCM+ An engineering process-oriented CFD tool Client-server architecture, object-oriented programming Delivers the entire CFD process in a single integrated software environment Developed by CD-adapco
34 Test Cluster Configuration Dell PowerEdge R node (896-core) Thor cluster Dual-Socket 14-Core Intel 2.60 GHz CPUs BIOS: Maximum Performance, Home Snoop Memory: 64GB memory, DDR MHz (Snoop Mode: Home Snoop) OS: RHEL 6.5, MLNX_OFED_LINUX InfiniBand SW stack Hard Drives: 2x 1TB 7.2 RPM SATA 2.5 on RAID 1 Mellanox ConnectX-4 EDR 100Gb/s InfiniBand Adapters Mellanox Switch-IB SB port EDR 100Gb/s InfiniBand Switch Mellanox ConnectX-3 FDR VPI InfiniBand and 40Gb/s Ethernet Adapters Mellanox SwitchX-2 SX port 56Gb/s FDR InfiniBand / VPI Ethernet Switch Dell InfiniBand-Based Lustre Storage based on Dell PowerVault MD3460 and Dell PowerVault MD3420 MPI: Platform MPI Application: STAR-CCM+ Benchmarks: lemans_poly_17m civil_trim_20m reactor_9m, LeMans_100M.amg
35 STAR-CCM+ Performance Network Interconnects EDR InfiniBand delivers superior scalability in application performance IB delivers 66% higher performance than 40GbE, 88% higher than 10GbE at 32 nodes Scalability stops beyond 4 nodes for 1GbE; scalability is limited for 10/40GbE Input data: Lemans_poly_17m: A race car model with 17 million cells 88% 66% 748% Higher is better 28 MPI Processes / Node
36 STAR-CCM+ Performance Network Interconnects EDR InfiniBand delivers superior scalability in application performance EDR IB provides 177% higher performance than 40GbE and 194% higher than 10GbE at 32 nodes InfiniBand demonstrates continuous performance gain at scale Input data: reactor_9m: A reactor model with 9 million cells 194% 177% Higher is better 28 MPI Processes / Node
37 STAR-CCM+ Profiling % of MPI Calls For the most time consuming MPI calls: Lemans_17m: 55% MPI_Allreduce, 23% MPI_Waitany, 7% MPI_Bcast, 7% MPI_Recv Reactor_9m: 59% MPI_Allreduce, 21% MPI_Waitany, 7% MPI_Recv, 4% MPI_Bcast MPI as a percentage in wall clock times: Lemans_17m: 12% MPI_Allreduce, 5% MPI_Waitany, 2% MPI_Bcast, 2% MPI_Recv Reactor_9m: 15% MPI_Allreduce, 5% MPI_Waitany, 2% MPI_Recv, 1% MPI_Bcast lemans_17m 32 nodes / 896 Processes reactor_9m 32 Nodes / 896 Processes
38 STAR-CCM+ Profiling MPI Message Size Distribution For the most time consuming MPI calls Lemans_17m: MPI_Allreduce 4B (30%), 16B (19%), 8B (6%), MPI_Bcast 4B (4%) Reactor_9m: MPI_Allreduce 16B (35%), 4B (15%), 8B (8%), MPI_Bcast 1B (4%) lemans_17m 32 nodes / 896 Processes reactor_9m 32 Nodes / 896 Processes
39 STAR-CCM+ Profiling Time Spent in MPI The majority of the MPI time is spent on MPI collective operations and non-blocking communications Heavy use of MPI collective operations (MPI_Allreduce, MPI_Bcast) and MPI_Waitany Some node-imbalance characteristics are shown on both input datasets Some processes appeared to take more time in communications, in MPI_Allreduce lemans_17m 32 nodes / 896 Processes reactor_9m 32 Nodes / 896 Processes
40 STAR-CCM+ Performance Scalability Speedup EDR InfiniBand demonstrates linear scaling for STAR-CCM+ STAR-CCM+ is able to achieve linear scaling with EDR InfiniBand Other interconnects only provided limited scalability As demonstrated in previous slides Higher is better 28 MPI Processes / Node
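Linear scaling in these charts means the speedup tracks the node count; a sketch of the arithmetic with illustrative (not measured) timings:

```shell
# Parallel speedup and efficiency, as plotted in the scalability charts.
# Timings here are hypothetical placeholders, not measured values.
T1=3600    # elapsed seconds on 1 node (illustrative)
TN=120     # elapsed seconds on N nodes (illustrative)
N=32
SPEEDUP=$(LC_ALL=C awk "BEGIN { printf \"%.1f\", $T1 / $TN }")
EFFICIENCY=$(LC_ALL=C awk "BEGIN { printf \"%.2f\", ($T1 / $TN) / $N }")
echo "speedup=${SPEEDUP}x efficiency=${EFFICIENCY}"
```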
41 STAR-CCM+ Performance System Generations Current system generations of HW & SW configuration outperform prior generations Current Haswell systems outperformed Ivy Bridge by 38%, Sandy Bridge by 149%, Westmere by 409% Dramatic performance benefit due to better system architecture in compute and network scalability System components used: Haswell: 2-socket 14-core DDR4 2133MHz DIMMs, ConnectX-4 EDR InfiniBand, v Ivy Bridge: 2-socket 10-core DDR3 1600MHz DIMMs, Connect-IB FDR InfiniBand, v Sandy Bridge: 2-socket 8-core DDR3 1600MHz DIMMs, ConnectX-3 FDR InfiniBand, v Westmere: 2-socket 6-core DDR3 1333MHz DIMMs, ConnectX-2 QDR InfiniBand, v 409% 149% 38% Higher is better
42 STAR-CCM+ Summary Compute: cluster of the current generation outperforms system architecture of previous generations Outperformed Ivy Bridge by 38%, Sandy Bridge by 149%, Westmere by 409% Dramatic performance benefit due to better system architecture in compute and network scalability Network: EDR InfiniBand demonstrates superior scalability in STAR-CCM+ performance EDR IB provides higher performance by over 4-5 times vs 1GbE, 10GbE and 40GbE, 15% vs FDR IB at 32 nodes Lemans_17m: Scalability stops beyond 4 nodes for 1GbE; scalability is limited for 10/40 GbE Reactor_9m: EDR IB provides 177% higher performance than 40GbE and 194% higher than 10GbE at 32 nodes EDR InfiniBand demonstrates linear scalability in STAR-CCM+ performance on the test cases
43 ANSYS Fluent Computational Fluid Dynamics (CFD) is a computational technology Enables the study of the dynamics of things that flow Enables better understanding of qualitative and quantitative physical phenomena in the flow, which is used to improve engineering design CFD brings together a number of different disciplines Fluid dynamics, mathematical theory of partial differential systems, computational geometry, numerical analysis, computer science ANSYS FLUENT is a leading CFD application from ANSYS Widely used in almost every industry sector and manufactured product
44 Test Cluster Configuration Dell PowerEdge R node (896-core) Thor cluster Dual-Socket 14-Core Intel 2.60 GHz CPUs Turbo enabled (Power Management: Maximum Performance) Memory: 64GB memory, DDR MHz (Memory Snoop: Home Snoop) OS: RHEL 6.5, MLNX_OFED_LINUX InfiniBand SW stack Hard Drives: 2x 1TB 7.2 RPM SATA 2.5 on RAID 1 Mellanox Switch-IB SB port 100Gb/s EDR InfiniBand Switch Mellanox ConnectX-4 EDR 100Gbps EDR InfiniBand Adapters Mellanox SwitchX-2 SX port 56Gb/s FDR InfiniBand / VPI Ethernet Switch Mellanox ConnectX-3 FDR InfiniBand, 10/40GbE Ethernet VPI Adapters MPI: Mellanox HPC-X v , Platform MPI 9.1 Application: ANSYS Fluent 16.1 Benchmark datasets: eddy_417k truck_111m
45 Fluent Performance Network Interconnects InfiniBand delivers superior scalability performance EDR InfiniBand provides higher performance than Ethernet InfiniBand delivers ~20 to 44 times higher performance and continuous scalability Ethernet performance stays flat (or stops scaling) beyond 2 nodes 20x 44x 20x Higher is better 28 MPI Processes / Node
46 Fluent Performance EDR vs FDR InfiniBand EDR InfiniBand delivers superior scalability in application performance As the number of nodes scales, the performance gap of EDR IB widens Performance advantage of EDR InfiniBand increases for larger core counts EDR InfiniBand provides 111% versus FDR InfiniBand at 16 nodes (448 cores) 111% Higher is better 28 MPI Processes / Node
47 Fluent Performance MPI Libraries HPC-X delivers higher scalability performance than Platform MPI by 16% Support for HPC-X in Fluent is based on Fluent's support of Open MPI The new yalla PML reduces the overhead Tuning parameters used for HPC-X: -mca coll_fca_enable 1 -mca pml yalla -map-by node -x MXM_TLS=self,shm,ud --bind-to core 16% 12% Higher is better 28 MPI Processes / Node
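The HPC-X flags quoted above assemble into a launch command along these lines; a sketch (the flags come from the slide, while the rank count and the solver invocation are hypothetical):

```shell
# HPC-X (Open MPI based) tuning flags quoted on the slide
FLAGS="-mca coll_fca_enable 1 -mca pml yalla -map-by node -x MXM_TLS=self,shm,ud --bind-to core"

# Assemble only; launching the solver this way is a hypothetical illustration
CMD="mpirun -np 448 $FLAGS <fluent-solver-command>"
echo "$CMD"
```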
48 Fluent Profiling Time Spent by MPI Calls Different communication patterns seen depending on data Eddy_417k: Most time spent in MPI_Recv, MPI_Allreduce, MPI_Waitall Truck_111m: Most time spent in MPI_Bcast, MPI_Recv, MPI_Allreduce eddy_417k truck_111m
50 Fluent Profiling MPI Message Sizes The most time consuming transfers are from small messages: Eddy_417k: MPI_Recv@16B (28% wall), MPI_Allreduce@4B (14% wall), (6% wall) Truck_111m: (22% wall), (19% wall), (13% wall) eddy_417k truck_111m 32 Nodes
51 Fluent Summary Performance Compute: Intel Haswell cluster outperforms system architecture of previous generations Haswell cluster outperforms Ivy Bridge cluster by 26%-49% at 32 nodes (896 cores) depending on workload Network: EDR InfiniBand delivers superior scalability in application performance EDR InfiniBand provides 20 to 44 times higher performance and is more scalable compared to 1GbE/10GbE/40GbE Performance for Ethernet (1GbE/10GbE/40GbE) stays flat (or stops scaling) beyond 2 nodes EDR InfiniBand provides 111% versus FDR InfiniBand at 16 nodes / 448 cores MPI: HPC-X delivers higher scalability performance than Platform MPI by 16%
52 Thank You! All trademarks are property of their respective owners. All information is provided As-Is without any kind of warranty. The HPC Advisory Council makes no representation to the accuracy and completeness of the information contained herein. HPC Advisory Council undertakes no duty and assumes no obligation to update or correct any information presented herein
Himeno Performance Benchmark and Profiling December 2010 Note The following research was performed under the HPC Advisory Council activities Participating vendors: AMD, Dell, Mellanox Compute resource
More informationNAMD Performance Benchmark and Profiling. November 2010
NAMD Performance Benchmark and Profiling November 2010 Note The following research was performed under the HPC Advisory Council activities Participating vendors: HP, Mellanox Compute resource - HPC Advisory
More informationGRID Testing and Profiling. November 2017
GRID Testing and Profiling November 2017 2 GRID C++ library for Lattice Quantum Chromodynamics (Lattice QCD) calculations Developed by Peter Boyle (U. of Edinburgh) et al. Hybrid MPI+OpenMP plus NUMA aware
More informationTechnologies and application performance. Marc Mendez-Bermond HPC Solutions Expert - Dell Technologies September 2017
Technologies and application performance Marc Mendez-Bermond HPC Solutions Expert - Dell Technologies September 2017 The landscape is changing We are no longer in the general purpose era the argument of
More informationMM5 Modeling System Performance Research and Profiling. March 2009
MM5 Modeling System Performance Research and Profiling March 2009 Note The following research was performed under the HPC Advisory Council activities AMD, Dell, Mellanox HPC Advisory Council Cluster Center
More informationLAMMPSCUDA GPU Performance. April 2011
LAMMPSCUDA GPU Performance April 2011 Note The following research was performed under the HPC Advisory Council activities Participating vendors: Dell, Intel, Mellanox Compute resource - HPC Advisory Council
More informationOPEN MPI WITH RDMA SUPPORT AND CUDA. Rolf vandevaart, NVIDIA
OPEN MPI WITH RDMA SUPPORT AND CUDA Rolf vandevaart, NVIDIA OVERVIEW What is CUDA-aware History of CUDA-aware support in Open MPI GPU Direct RDMA support Tuning parameters Application example Future work
More informationMPI Optimizations via MXM and FCA for Maximum Performance on LS-DYNA
MPI Optimizations via MXM and FCA for Maximum Performance on LS-DYNA Gilad Shainer 1, Tong Liu 1, Pak Lui 1, Todd Wilde 1 1 Mellanox Technologies Abstract From concept to engineering, and from design to
More informationPerformance Analysis of LS-DYNA in Huawei HPC Environment
Performance Analysis of LS-DYNA in Huawei HPC Environment Pak Lui, Zhanxian Chen, Xiangxu Fu, Yaoguo Hu, Jingsong Huang Huawei Technologies Abstract LS-DYNA is a general-purpose finite element analysis
More informationRECENT TRENDS IN GPU ARCHITECTURES. Perspectives of GPU computing in Science, 26 th Sept 2016
RECENT TRENDS IN GPU ARCHITECTURES Perspectives of GPU computing in Science, 26 th Sept 2016 NVIDIA THE AI COMPUTING COMPANY GPU Computing Computer Graphics Artificial Intelligence 2 NVIDIA POWERS WORLD
More informationInterconnect Your Future
Interconnect Your Future Gilad Shainer 2nd Annual MVAPICH User Group (MUG) Meeting, August 2014 Complete High-Performance Scalable Interconnect Infrastructure Comprehensive End-to-End Software Accelerators
More informationFUSION1200 Scalable x86 SMP System
FUSION1200 Scalable x86 SMP System Introduction Life Sciences Departmental System Manufacturing (CAE) Departmental System Competitive Analysis: IBM x3950 Competitive Analysis: SUN x4600 / SUN x4600 M2
More informationHPC and AI Solution Overview. Garima Kochhar HPC and AI Innovation Lab
HPC and AI Solution Overview Garima Kochhar HPC and AI Innovation Lab 1 Dell EMC HPC and DL team charter Design, develop and integrate HPC and DL Heading systems Lorem ipsum dolor sit amet, consectetur
More informationGPU ACCELERATED COMPUTING. 1 st AlsaCalcul GPU Challenge, 14-Jun-2016, Strasbourg Frédéric Parienté, Tesla Accelerated Computing, NVIDIA Corporation
GPU ACCELERATED COMPUTING 1 st AlsaCalcul GPU Challenge, 14-Jun-2016, Strasbourg Frédéric Parienté, Tesla Accelerated Computing, NVIDIA Corporation GAMING PRO ENTERPRISE VISUALIZATION DATA CENTER AUTO
More informationPerformance Analysis of HPC Applications on Several Dell PowerEdge 12 th Generation Servers
Performance Analysis of HPC Applications on Several Dell PowerEdge 12 th Generation Servers This Dell technical white paper evaluates and provides recommendations for the performance of several HPC applications
More informationMaximizing Cluster Scalability for LS-DYNA
Maximizing Cluster Scalability for LS-DYNA Pak Lui 1, David Cho 1, Gerald Lotto 1, Gilad Shainer 1 1 Mellanox Technologies, Inc. Sunnyvale, CA, USA 1 Abstract High performance network interconnect is an
More informationOptimizing LS-DYNA Productivity in Cluster Environments
10 th International LS-DYNA Users Conference Computing Technology Optimizing LS-DYNA Productivity in Cluster Environments Gilad Shainer and Swati Kher Mellanox Technologies Abstract Increasing demand for
More informationTECHNICAL OVERVIEW ACCELERATED COMPUTING AND THE DEMOCRATIZATION OF SUPERCOMPUTING
TECHNICAL OVERVIEW ACCELERATED COMPUTING AND THE DEMOCRATIZATION OF SUPERCOMPUTING Accelerated computing is revolutionizing the economics of the data center. HPC enterprise and hyperscale customers deploy
More informationLS-DYNA Productivity and Power-aware Simulations in Cluster Environments
LS-DYNA Productivity and Power-aware Simulations in Cluster Environments Gilad Shainer 1, Tong Liu 1, Jacob Liberman 2, Jeff Layton 2 Onur Celebioglu 2, Scot A. Schultz 3, Joshua Mora 3, David Cownie 3,
More informationTECHNICAL OVERVIEW ACCELERATED COMPUTING AND THE DEMOCRATIZATION OF SUPERCOMPUTING
TECHNICAL OVERVIEW ACCELERATED COMPUTING AND THE DEMOCRATIZATION OF SUPERCOMPUTING Accelerated computing is revolutionizing the economics of the data center. HPC and hyperscale customers deploy accelerated
More informationInterconnect Your Future
#OpenPOWERSummit Interconnect Your Future Scot Schultz, Director HPC / Technical Computing Mellanox Technologies OpenPOWER Summit, San Jose CA March 2015 One-Generation Lead over the Competition Mellanox
More informationHP GTC Presentation May 2012
HP GTC Presentation May 2012 Today s Agenda: HP s Purpose-Built SL Server Line Desktop GPU Computing Revolution with HP s Z Workstations Hyperscale the new frontier for HPC New HPC customer requirements
More information2008 International ANSYS Conference
2008 International ANSYS Conference Maximizing Productivity With InfiniBand-Based Clusters Gilad Shainer Director of Technical Marketing Mellanox Technologies 2008 ANSYS, Inc. All rights reserved. 1 ANSYS,
More informationIntroduction to High-Performance Computing
Introduction to High-Performance Computing 2 What is High Performance Computing? There is no clear definition Computing on high performance computers Solving problems / doing research using computer modeling,
More informationAccelerating HPC. (Nash) Dr. Avinash Palaniswamy High Performance Computing Data Center Group Marketing
Accelerating HPC (Nash) Dr. Avinash Palaniswamy High Performance Computing Data Center Group Marketing SAAHPC, Knoxville, July 13, 2010 Legal Disclaimer Intel may make changes to specifications and product
More informationLS-DYNA Scalability Analysis on Cray Supercomputers
13 th International LS-DYNA Users Conference Session: Computing Technology LS-DYNA Scalability Analysis on Cray Supercomputers Ting-Ting Zhu Cray Inc. Jason Wang LSTC Abstract For the automotive industry,
More informationBirds of a Feather Presentation
Mellanox InfiniBand QDR 4Gb/s The Fabric of Choice for High Performance Computing Gilad Shainer, shainer@mellanox.com June 28 Birds of a Feather Presentation InfiniBand Technology Leadership Industry Standard
More informationScalable x86 SMP Server FUSION1200
Scalable x86 SMP Server FUSION1200 Challenges Scaling compute-power is either Complex (scale-out / clusters) or Expensive (scale-up / SMP) Scale-out - Clusters Requires advanced IT skills / know-how (high
More informationMSC Nastran Explicit Nonlinear (SOL 700) on Advanced SGI Architectures
MSC Nastran Explicit Nonlinear (SOL 700) on Advanced SGI Architectures Presented By: Dr. Olivier Schreiber, Application Engineering, SGI Walter Schrauwen, Senior Engineer, Finite Element Development, MSC
More informationMELLANOX EDR UPDATE & GPUDIRECT MELLANOX SR. SE 정연구
MELLANOX EDR UPDATE & GPUDIRECT MELLANOX SR. SE 정연구 Leading Supplier of End-to-End Interconnect Solutions Analyze Enabling the Use of Data Store ICs Comprehensive End-to-End InfiniBand and Ethernet Portfolio
More informationHPC and IT Issues Session Agenda. Deployment of Simulation (Trends and Issues Impacting IT) Mapping HPC to Performance (Scaling, Technology Advances)
HPC and IT Issues Session Agenda Deployment of Simulation (Trends and Issues Impacting IT) Discussion Mapping HPC to Performance (Scaling, Technology Advances) Discussion Optimizing IT for Remote Access
More informationImplementing SQL Server 2016 with Microsoft Storage Spaces Direct on Dell EMC PowerEdge R730xd
Implementing SQL Server 2016 with Microsoft Storage Spaces Direct on Dell EMC PowerEdge R730xd Performance Study Dell EMC Engineering October 2017 A Dell EMC Performance Study Revisions Date October 2017
More informationANSYS HPC Technology Leadership
ANSYS HPC Technology Leadership 1 ANSYS, Inc. November 14, Why ANSYS Users Need HPC Insight you can t get any other way It s all about getting better insight into product behavior quicker! HPC enables
More informationThe State of Accelerated Applications. Michael Feldman
The State of Accelerated Applications Michael Feldman Accelerator Market in HPC Nearly half of all new HPC systems deployed incorporate accelerators Accelerator hardware performance has been advancing
More informationMellanox Technologies Maximize Cluster Performance and Productivity. Gilad Shainer, October, 2007
Mellanox Technologies Maximize Cluster Performance and Productivity Gilad Shainer, shainer@mellanox.com October, 27 Mellanox Technologies Hardware OEMs Servers And Blades Applications End-Users Enterprise
More informationHETEROGENEOUS HPC, ARCHITECTURAL OPTIMIZATION, AND NVLINK STEVE OBERLIN CTO, TESLA ACCELERATED COMPUTING NVIDIA
HETEROGENEOUS HPC, ARCHITECTURAL OPTIMIZATION, AND NVLINK STEVE OBERLIN CTO, TESLA ACCELERATED COMPUTING NVIDIA STATE OF THE ART 2012 18,688 Tesla K20X GPUs 27 PetaFLOPS FLAGSHIP SCIENTIFIC APPLICATIONS
More informationMemory Selection Guidelines for High Performance Computing with Dell PowerEdge 11G Servers
Memory Selection Guidelines for High Performance Computing with Dell PowerEdge 11G Servers A Dell Technical White Paper By Garima Kochhar and Jacob Liberman High Performance Computing Engineering Dell
More informationAMD EPYC and NAMD Powering the Future of HPC February, 2019
AMD EPYC and NAMD Powering the Future of HPC February, 19 Exceptional Core Performance NAMD is a compute-intensive workload that benefits from AMD EPYC s high core IPC (Instructions Per Clock) and high
More informationOptimal BIOS settings for HPC with Dell PowerEdge 12 th generation servers
Optimal BIOS settings for HPC with Dell PowerEdge 12 th generation servers This Dell technical white paper analyses the various BIOS options available in Dell PowerEdge 12 th generation servers and provides
More informationNew Features in LS-DYNA HYBRID Version
11 th International LS-DYNA Users Conference Computing Technology New Features in LS-DYNA HYBRID Version Nick Meng 1, Jason Wang 2, Satish Pathy 2 1 Intel Corporation, Software and Services Group 2 Livermore
More informationAnalyzing Performance and Power of Applications on GPUs with Dell 12G Platforms. Dr. Jeffrey Layton Enterprise Technologist HPC
Analyzing Performance and Power of Applications on GPUs with Dell 12G Platforms Dr. Jeffrey Layton Enterprise Technologist HPC Why GPUs? GPUs have very high peak compute capability! 6-9X CPU Challenges
More informationBuilding NVLink for Developers
Building NVLink for Developers Unleashing programmatic, architectural and performance capabilities for accelerated computing Why NVLink TM? Simpler, Better and Faster Simplified Programming No specialized
More informationSpeedup Altair RADIOSS Solvers Using NVIDIA GPU
Innovation Intelligence Speedup Altair RADIOSS Solvers Using NVIDIA GPU Eric LEQUINIOU, HPC Director Hongwei Zhou, Senior Software Developer May 16, 2012 Innovation Intelligence ALTAIR OVERVIEW Altair
More informationGateways to Discovery: Cyberinfrastructure for the Long Tail of Science
Gateways to Discovery: Cyberinfrastructure for the Long Tail of Science ECSS Symposium, 12/16/14 M. L. Norman, R. L. Moore, D. Baxter, G. Fox (Indiana U), A Majumdar, P Papadopoulos, W Pfeiffer, R. S.
More informationThe Cray CX1 puts massive power and flexibility right where you need it in your workgroup
The Cray CX1 puts massive power and flexibility right where you need it in your workgroup Up to 96 cores of Intel 5600 compute power 3D visualization Up to 32TB of storage GPU acceleration Small footprint
More informationANSYS HPC. Technology Leadership. Barbara Hutchings ANSYS, Inc. September 20, 2011
ANSYS HPC Technology Leadership Barbara Hutchings barbara.hutchings@ansys.com 1 ANSYS, Inc. September 20, Why ANSYS Users Need HPC Insight you can t get any other way HPC enables high-fidelity Include
More informationTECHNICAL OVERVIEW ACCELERATED COMPUTING AND THE DEMOCRATIZATION OF SUPERCOMPUTING
TECHNICAL OVERVIEW ACCELERATED COMPUTING AND THE DEMOCRATIZATION OF SUPERCOMPUTING Table of Contents: The Accelerated Data Center Optimizing Data Center Productivity Same Throughput with Fewer Server Nodes
More informationQLogic in HPC Vendor Update IDC HPC User Forum April 16, 2008 Jeff Broughton Sr. Director Engineering Host Solutions Group
QLogic in HPC Vendor Update IDC HPC User Forum April 16, 2008 Jeff Broughton Sr. Director Engineering Host Solutions Group 1 Networking for Storage and HPC Leading supplier of Fibre Channel Leading supplier
More informationManufacturing Bringing New Levels of Performance to CAE Applications
Solution Brief: Manufacturing Bringing New Levels of Performance to CAE Applications Abstract Computer Aided Engineering (CAE) is used to help manufacturers bring products to market faster while maintaining
More informationHP solutions for mission critical SQL Server Data Management environments
HP solutions for mission critical SQL Server Data Management environments SQL Server User Group Sweden Michael Kohs, Technical Consultant HP/MS EMEA Competence Center michael.kohs@hp.com 1 Agenda HP ProLiant
More informationPower Systems AC922 Overview. Chris Mann IBM Distinguished Engineer Chief System Architect, Power HPC Systems December 11, 2017
Power Systems AC922 Overview Chris Mann IBM Distinguished Engineer Chief System Architect, Power HPC Systems December 11, 2017 IBM POWER HPC Platform Strategy High-performance computer and high-performance
More informationHIGH-PERFORMANCE STORAGE FOR DISCOVERY THAT SOARS
HIGH-PERFORMANCE STORAGE FOR DISCOVERY THAT SOARS OVERVIEW When storage demands and budget constraints collide, discovery suffers. And it s a growing problem. Driven by ever-increasing performance and
More informationTrends in systems and how to get efficient performance
Trends in systems and how to get efficient performance Martin Hilgeman HPC Consultant martin.hilgeman@dell.com The landscape is changing We are no longer in the general purpose era the argument of tuning
More informationAssessment of LS-DYNA Scalability Performance on Cray XD1
5 th European LS-DYNA Users Conference Computing Technology (2) Assessment of LS-DYNA Scalability Performance on Cray Author: Ting-Ting Zhu, Cray Inc. Correspondence: Telephone: 651-65-987 Fax: 651-65-9123
More informationSystem Design of Kepler Based HPC Solutions. Saeed Iqbal, Shawn Gao and Kevin Tubbs HPC Global Solutions Engineering.
System Design of Kepler Based HPC Solutions Saeed Iqbal, Shawn Gao and Kevin Tubbs HPC Global Solutions Engineering. Introduction The System Level View K20 GPU is a powerful parallel processor! K20 has
More information