ANSYS HPC. Technology Leadership. Barbara Hutchings ANSYS, Inc. September 20, 2011
Transcription
1 ANSYS HPC Technology Leadership Barbara Hutchings 1 ANSYS, Inc. September 20,
2 Why ANSYS Users Need HPC
Insight you can't get any other way
HPC enables high-fidelity
- Include details, for reliable results
- Be sure your design is right
- Innovate with confidence
HPC delivers throughput
- Consider multiple design ideas
- Optimize the design
- Ensure performance across range of conditions
3 ANSYS HPC Leadership: A History of HPC Performance
- 15% spent on R&D
- 570 software developers
- Partner relationships
Timeline milestones (most year labels did not survive extraction):
- Vector processing on mainframes
- 1990: Shared memory multiprocessing for structural simulations
- Iterative PCG solver introduced for large structural analysis
- Shared memory multiprocessing (HFSS 7)
- 64-bit large memory addressing
- 1st general-purpose parallel CFD with interactive client-server user environment
- 10M cell fluids simulations, 128 processors
- Support for Linux clusters, low latency interconnects
- Integration with load management systems
- Distributed memory particle tracking
- Parallel dynamic moving/deforming mesh
- Dynamic load balancing
- Parallel dynamic mesh refinement and coarsening
- Distributed Solve (DSO), HFSS
- DANSYS released
- Variational Technology
- Distributed PCG solver
- Distributed sparse solver
- 1st company to solve 100M structural DOF
- 1st one billion cell fluids simulation
- Optimized performance on multicore processors
- Support for clusters using Windows HPC
- Parallel meshing (fluids)
- Domain Decomposition introduced (HFSS 12)
- Parallel I/O (fluids)
- Teraflop performance at 512 cores (structures)
- Ideal scaling to 2048 cores (fluids)
- GPU acceleration (structures)
- Hybrid parallel for sustained multicore performance (fluids)
Today's multi-core / many-core hardware evolution makes HPC a software development imperative. ANSYS is committed to maintaining performance leadership.
2008 ANSYS, Inc. All rights reserved. ANSYS, Inc. Proprietary
4 HPC: A Software Development Imperative
- Clock speed: leveling off
- Core counts: growing; exploding (GPUs)
Future performance depends on highly scalable parallel software
5 ANSYS FLUENT Scaling Achievement
[Two plots of rating vs. number of cores, each with an ideal-scaling line: one for 2008 hardware (Intel Harpertown, DDR IB), one for 2010 hardware (Intel Westmere, QDR IB).]
- Systems keep improving: faster processors, more cores
- Ideal rating (speed) doubled in two years!
- Memory bandwidth per core and network latency/bandwidth stress scalability
- 2008 release (12.0) re-architected MPI: huge scaling improvement, for a while
- 2010 release (13.0) introduces hybrid parallelism, and scaling continues!
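The rating and ideal-line comparison on this slide can be sketched in a few lines. This is a minimal illustration, not ANSYS code: it assumes "rating" means benchmark runs per day (86,400 seconds divided by wall-clock seconds per run), which the slide itself does not define, and all function names and sample numbers are hypothetical.

```python
SECONDS_PER_DAY = 86_400


def rating(wall_seconds: float) -> float:
    """Benchmark runs per day, assuming rating = 86400 / wall-clock seconds."""
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    return SECONDS_PER_DAY / wall_seconds


def ideal_rating(base_rating: float, base_cores: int, cores: int) -> float:
    """Rating on `cores` if the baseline scaled perfectly linearly."""
    return base_rating * cores / base_cores


def scaling_efficiency(measured: float, base_rating: float,
                       base_cores: int, cores: int) -> float:
    """Measured rating as a fraction of the ideal (linear) rating."""
    return measured / ideal_rating(base_rating, base_cores, cores)


if __name__ == "__main__":
    # Hypothetical numbers, not taken from the presentation.
    base = rating(864.0)  # one run takes 864 s on 64 cores
    print("baseline rating:", base)
    print("ideal rating on 256 cores:", ideal_rating(base, 64, 256))
    print("efficiency:", scaling_efficiency(300.0, base, 64, 256))
```

A measured curve that stays close to the ideal line corresponds to an efficiency near 1.0; the gap the slide attributes to memory bandwidth and network latency shows up as efficiency falling with core count.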
6 Extreme CFD Scaling: Scaling to Thousands of Cores
[Plot: core solver rating vs. number of cores for the 111M cell truck benchmark; series: ANSYS Fluent 13.0 and ANSYS Fluent 14.0 (pre-release).]
- Enabled by ongoing software innovation
- Hybrid parallel: fast shared memory communication (OpenMP) within a machine to speed up overall solver performance; distributed memory (MPI) between machines
7 Parallel Scaling: ANSYS Mechanical
Focus on bottlenecks in the distributed memory solvers (DANSYS)
Sparse Solver
- Parallelized equation ordering
- 40% faster w/ updated Intel MKL
- [Plot: Sparse Solver (Parallel Re-Ordering), solution rating vs. number of cores, R12.1 and R13.0.]
Preconditioned Conjugate Gradient (PCG) Solver
- Parallelized preconditioning step
- [Plot: PCG Solver (Pre-Conditioner Scaling), solution rating vs. number of cores, R12.1 and R13.0.]
8 Architecture-Aware Partitioning
- Original partitions are remapped to the cluster considering the network topology and latencies
- Minimizes inter-machine traffic, reducing load on network switches
- Improves performance, particularly on slow interconnects and/or large clusters
[Figure: partition graph for 3 machines with 8 cores each; colors indicate machines; original mapping vs. new mapping.]
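The idea of remapping partitions to reduce inter-machine traffic can be illustrated with a toy example. This is a sketch under stated assumptions, not the algorithm used in ANSYS software: it ignores network topology and latency, treats every inter-machine edge as equally costly, and uses a simple pairwise-swap hill climb. The function names and the sample graph are hypothetical.

```python
from itertools import combinations


def inter_machine_traffic(edges, mapping):
    """Sum the weights of partition-graph edges whose endpoints sit on different machines."""
    return sum(w for (a, b), w in edges.items() if mapping[a] != mapping[b])


def remap_by_swapping(edges, mapping):
    """Swap pairs of partitions between machines while total traffic decreases.

    Swapping keeps the number of partitions per machine unchanged, so each
    machine still hosts as many partitions as it has cores.
    """
    current = dict(mapping)
    best = inter_machine_traffic(edges, current)
    improved = True
    while improved:
        improved = False
        for a, b in combinations(sorted(current), 2):
            if current[a] == current[b]:
                continue  # same machine: swapping changes nothing
            current[a], current[b] = current[b], current[a]
            cost = inter_machine_traffic(edges, current)
            if cost < best:
                best = cost
                improved = True
            else:
                current[a], current[b] = current[b], current[a]  # undo the swap
    return current, best


if __name__ == "__main__":
    # Hypothetical partition graph: (partition, partition) -> communication volume.
    edges = {(0, 1): 10, (2, 3): 10, (0, 2): 1, (1, 3): 1}
    original = {0: 0, 1: 1, 2: 0, 3: 1}  # partition -> machine
    print("original traffic:", inter_machine_traffic(edges, original))
    new_mapping, traffic = remap_by_swapping(edges, original)
    print("new mapping:", new_mapping, "traffic:", traffic)
```

In this toy graph the heavily communicating pairs end up on the same machine, so only the light edges cross the network. A hill climb like this can stop at a local minimum; it is meant only to show what "minimizes inter-machine traffic" measures.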
9 File I/O Performance
Case file IO
- Both read and write significantly faster in R13
- A combination of serial-IO optimizations as well as parallel-IO techniques, where available
Parallel-IO (.pdat)
- Significant speedup of parallel IO, particularly for cases with large number of zones
- Support for Lustre, EMC/MPFS, AIX/GPFS file systems added
Data file IO (.dat)
- Performance in R12 was highly optimized. Further incremental improvements done in R13
[Chart: parallel data write and case read for truck_14m; surviving values 91.2 and 79.5.]
R12 vs. R13: BMW -68%, FL5L2 4M -63%, Circuit -97%, Truck 14M -64%
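The R12 vs. R13 percentages can be read as speedup factors. The sketch below assumes each percentage is a reduction in elapsed I/O time from R12 to R13, which the surviving slide text suggests but does not state; the helper name is hypothetical.

```python
def speedup_from_reduction(reduction_percent: float) -> float:
    """Convert a percentage reduction in elapsed time into a speedup factor."""
    remaining = 1.0 - reduction_percent / 100.0
    if remaining <= 0.0:
        raise ValueError("reduction must be below 100 percent")
    return 1.0 / remaining


if __name__ == "__main__":
    # Percentages as listed on the slide (case name -> time reduction in percent).
    reductions = {"BMW": 68, "FL5L2 4M": 63, "Circuit": 97, "Truck 14M": 64}
    for case, percent in reductions.items():
        print(f"{case}: -{percent}% -> {speedup_from_reduction(percent):.1f}x faster")
```

Under that assumption, a 68% reduction means the operation takes 32% of its former time, or roughly a threefold speedup, while the 97% figure corresponds to more than thirtyfold.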
10 What about GPU Computing?
CPUs and GPUs work in a collaborative fashion
CPU: multi-core processors
- Typically 4-6 cores
- Powerful, general purpose
GPU: many-core processors
- Typically hundreds of cores
- Great for highly parallel code, within memory constraints
[Diagram: CPU and GPU connected by a PCI Express channel.]
11 ANSYS Mechanical SMP GPU Speedup
[Charts: solver kernel speedups and overall speedups; Tesla C2050 and Intel Xeon 5560.]
From NAFEMS World Congress, May, Boston, MA, USA: Accelerate FEA Simulations with a GPU, by Jeff Beisheim, ANSYS
12 R14: GPU Acceleration for DANSYS
[Chart: R14 Distributed ANSYS total simulation speedups for the R13 benchmark set; series: 4 CPU cores, and CPU cores + 1 GPU.]
Benchmarks:
- V13cg-1 (JCG, 1100k)
- V13sp-1 (sparse, 430k)
- V13sp-2 (sparse, 500k)
- V13sp-3 (sparse, 2400k)
- V13sp-4 (sparse, 1000k)
- V13sp-5 (sparse, 2100k)
Windows workstation: Two Intel Xeon 5560 processors (2.8 GHz, 8 cores total), 32 GB RAM, NVIDIA Tesla C2070, Windows 7, TCC driver mode
13 ANSYS Mechanical Multi-Node GPU
Solder Joint Benchmark (4 MDOF, Creep Strain Analysis)
- Linux cluster: Each node contains 12 Intel Xeon 5600-series cores, 96 GB RAM, NVIDIA Tesla M2070, InfiniBand
[Chart: total speedup, R14 Distributed ANSYS without GPU and with GPU, at several core counts including 32 and 64 cores; surviving bar labels: 1.0, 1.9x, 3.2x, 3.4x, 4.4x. Model image labels: solder balls, mold, PCB.]
Results courtesy of MicroConsult Engineering, GmbH
14 GPU Acceleration for CFD
- Radiation viewfactor calculation (ANSYS FLUENT 14 - beta)
- First capability for specialty physics: view factors, ray tracing, reaction rates, etc.
- R&D focus on linear solvers, smoothers, but potential limited by Amdahl's Law
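The Amdahl's Law limit mentioned on this slide can be made concrete. The sketch below is illustrative only: the fractions and the acceleration factor are hypothetical, not measurements from the presentation.

```python
def amdahl_speedup(accelerated_fraction: float, factor: float) -> float:
    """Overall speedup when a fraction of the runtime is accelerated by `factor`."""
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if factor <= 0.0:
        raise ValueError("factor must be positive")
    return 1.0 / ((1.0 - accelerated_fraction) + accelerated_fraction / factor)


def amdahl_limit(accelerated_fraction: float) -> float:
    """Upper bound on overall speedup as the acceleration factor grows without limit."""
    if accelerated_fraction >= 1.0:
        return float("inf")
    return 1.0 / (1.0 - accelerated_fraction)


if __name__ == "__main__":
    # Hypothetical shares of runtime spent in a GPU-accelerated linear solver.
    for fraction in (0.5, 0.8, 0.95):
        print(
            f"solver share {fraction:.0%}: "
            f"10x kernel -> {amdahl_speedup(fraction, 10.0):.2f}x overall, "
            f"limit {amdahl_limit(fraction):.1f}x"
        )
```

If the linear solver accounts for only half of the run time, even an infinitely fast GPU kernel cannot deliver more than a twofold overall speedup, which is why the share of time spent in the accelerated part matters as much as the kernel speedup.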
15 Case Study: HPC for High Fidelity CFD
8M to 12M element turbocharger models (ANSYS CFX)
Previous practice (8 nodes HPC)
- Full stage compressor runs: [value missing] hours
- Turbine simulations up to 72 hours
Current practice (160 nodes)
- 32 nodes per simulation
- Full stage compressor 4 hours
- Turbine simulations 5-6 hours
- Simultaneous consideration of 5 ideas
- Ability to address design uncertainty (clearance tolerance)
ANSYS HPC technology is enabling Cummins to use larger models with greater geometric details and more-realistic treatment of physical phenomena.
16 Case Study: HPC for High Fidelity CFD
EURO/CFD
- Model sizes up to 200M cells (ANSYS FLUENT)
- Cluster of 700 cores
- [count missing] cores per simulation
[Chart: model size and run time against complexity of physical phenomenon. Data labels: 3 millions of cells (6 days), 10 millions (5 days), 25 millions (4 days), 50 millions (2 days). Other labels: compressibility, conduction/convection, supersonic, multiphase, radiation, transient, optimisation / DOE, dynamic mesh, LES, combustion, aeroacoustic, fluid structure interaction, increase of spatial-temporal accuracy.]
17 Case Study: HPC for High Fidelity Mechanical
Microconsult GmbH
- Solder joint failure analysis
- Thermal stress: 7.8 MDOF
- Creep strain: 5.5 MDOF
- Simulation time reduced from 2 weeks to 1 day
- From 8-26 cores (past) to 128 cores (present)
HPC is an important competitive advantage for companies looking to optimize the performance of their products and reduce time to market.
18 Case Study: HPC for Desktop Productivity
Cognity Limited: steerable conductors for oil recovery
- ANSYS Mechanical simulations to determine load carrying capacity
- 750K elements, many contacts
- 12 core workstations / 24 GB RAM
- 6X speedup / results in 1 hour or less
- 5-10 design iterations per day
Parallel processing makes it possible to evaluate five to 10 design iterations per day, enabling Cognity to rapidly improve their design.
19 Case Study: Skewed Waveguide Array (HFSS)
16X16 (256 elements and excitations) Skewed Rectangular Waveguide (WR90) Array
- 1.3M matrix size
- Using 8 cores: 3 hrs. solution time, 0.4GB memory total
- Using 16 cores: 2 hrs. solution time, 0.8GB memory total
- Additional cores: faster solution time, more memory
[Figure: unit cell shown with wireframe view of virtual array.]
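The 8-core and 16-core figures on this slide imply a relative speedup and a parallel efficiency, worked out in the sketch below. The times and core counts come from the slide; the function names are hypothetical, and efficiency is measured relative to the 8-core run rather than to a single core.

```python
def relative_speedup(base_hours: float, new_hours: float) -> float:
    """How many times faster the new run is than the baseline run."""
    return base_hours / new_hours


def relative_efficiency(base_cores: int, base_hours: float,
                        new_cores: int, new_hours: float) -> float:
    """Speedup divided by the growth in core count (1.0 means perfect scaling)."""
    return relative_speedup(base_hours, new_hours) / (new_cores / base_cores)


if __name__ == "__main__":
    speedup = relative_speedup(3.0, 2.0)               # 8 cores: 3 h, 16 cores: 2 h
    efficiency = relative_efficiency(8, 3.0, 16, 2.0)
    print(f"speedup from 8 to 16 cores: {speedup:.2f}x")
    print(f"parallel efficiency relative to 8 cores: {efficiency:.0%}")
```

Doubling the cores cuts the solution time by a third, so the run is 1.5 times faster at 75% efficiency, while total memory doubles from 0.4GB to 0.8GB.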
20 Case Study: Desktop Productivity Cautionary Tale
NVIDIA: case study on the value of HW refresh and SW best-practice
- Deflection and bending of 3D glasses
- ANSYS Mechanical, 1M DOF models
Optimization of:
- Solver selection (direct vs iterative)
- Machine memory (in core execution)
- Multicore (8-way) parallel with GPU acceleration
Before/After: 77x speedup, from 60 hours per simulation to 47 minutes.
Most importantly: HPC tuning added scope for design exploration and optimization.
21 Take Home Points / Discussion
- ANSYS HPC performance enables scaling for high-fidelity
- What could you learn from a 10M (or 100M) cell / DOF model?
- What could you learn if you had time to consider 10x more design ideas?
- Scaling applies to all physics, all hardware (desktop and cluster)
- ANSYS continually invests in software development for HPC
- Maximized value from your HPC investment
- This creates differentiated competitive advantage for ANSYS users
Comments / Questions / Discussion
HPC and IT Issues Session Agenda. Deployment of Simulation (Trends and Issues Impacting IT) Mapping HPC to Performance (Scaling, Technology Advances)
HPC and IT Issues Session Agenda Deployment of Simulation (Trends and Issues Impacting IT) Discussion Mapping HPC to Performance (Scaling, Technology Advances) Discussion Optimizing IT for Remote Access
More informationANSYS HPC Technology Leadership
ANSYS HPC Technology Leadership 1 ANSYS, Inc. November 14, Why ANSYS Users Need HPC Insight you can t get any other way It s all about getting better insight into product behavior quicker! HPC enables
More informationSolving Large Complex Problems. Efficient and Smart Solutions for Large Models
Solving Large Complex Problems Efficient and Smart Solutions for Large Models 1 ANSYS Structural Mechanics Solutions offers several techniques 2 Current trends in simulation show an increased need for
More informationStan Posey, CAE Industry Development NVIDIA, Santa Clara, CA, USA
Stan Posey, CAE Industry Development NVIDIA, Santa Clara, CA, USA NVIDIA and HPC Evolution of GPUs Public, based in Santa Clara, CA ~$4B revenue ~5,500 employees Founded in 1999 with primary business in
More informationWhy HPC for. ANSYS Mechanical and ANSYS CFD?
Why HPC for ANSYS Mechanical and ANSYS CFD? 1 HPC Defined High Performance Computing (HPC) at ANSYS: An ongoing effort designed to remove computing limitations from engineers who use computer aided engineering
More informationANSYS Improvements to Engineering Productivity with HPC and GPU-Accelerated Simulation
ANSYS Improvements to Engineering Productivity with HPC and GPU-Accelerated Simulation Ray Browell nvidia Technology Theater SC12 1 2012 ANSYS, Inc. nvidia Technology Theater SC12 HPC Revolution Recent
More informationSpeedup Altair RADIOSS Solvers Using NVIDIA GPU
Innovation Intelligence Speedup Altair RADIOSS Solvers Using NVIDIA GPU Eric LEQUINIOU, HPC Director Hongwei Zhou, Senior Software Developer May 16, 2012 Innovation Intelligence ALTAIR OVERVIEW Altair
More informationMaximize automotive simulation productivity with ANSYS HPC and NVIDIA GPUs
Presented at the 2014 ANSYS Regional Conference- Detroit, June 5, 2014 Maximize automotive simulation productivity with ANSYS HPC and NVIDIA GPUs Bhushan Desam, Ph.D. NVIDIA Corporation 1 NVIDIA Enterprise
More informationMaking Supercomputing More Available and Accessible Windows HPC Server 2008 R2 Beta 2 Microsoft High Performance Computing April, 2010
Making Supercomputing More Available and Accessible Windows HPC Server 2008 R2 Beta 2 Microsoft High Performance Computing April, 2010 Windows HPC Server 2008 R2 Windows HPC Server 2008 R2 makes supercomputing
More informationComputer Aided Engineering with Today's Multicore, InfiniBand-Based Clusters ANSYS, Inc. All rights reserved. 1 ANSYS, Inc.
Computer Aided Engineering with Today's Multicore, InfiniBand-Based Clusters 2006 ANSYS, Inc. All rights reserved. 1 ANSYS, Inc. Proprietary Our Business Simulation Driven Product Development Deliver superior
More informationTFLOP Performance for ANSYS Mechanical
TFLOP Performance for ANSYS Mechanical Dr. Herbert Güttler Engineering GmbH Holunderweg 8 89182 Bernstadt www.microconsult-engineering.de Engineering H. Güttler 19.06.2013 Seite 1 May 2009, Ansys12, 512
More information2008 International ANSYS Conference
28 International ANSYS Conference Maximizing Performance for Large Scale Analysis on Multi-core Processor Systems Don Mize Technical Consultant Hewlett Packard 28 ANSYS, Inc. All rights reserved. 1 ANSYS,
More informationAdvances of parallel computing. Kirill Bogachev May 2016
Advances of parallel computing Kirill Bogachev May 2016 Demands in Simulations Field development relies more and more on static and dynamic modeling of the reservoirs that has come a long way from being
More informationRecent Advances in ANSYS Toward RDO Practices Using optislang. Wim Slagter, ANSYS Inc. Herbert Güttler, MicroConsult GmbH
Recent Advances in ANSYS Toward RDO Practices Using optislang Wim Slagter, ANSYS Inc. Herbert Güttler, MicroConsult GmbH 1 Product Development Pressures Source: Engineering Simulation & HPC Usage Survey
More informationSimulation Advances. Antenna Applications
Simulation Advances for RF, Microwave and Antenna Applications Presented by Martin Vogel, PhD Application Engineer 1 Overview Advanced Integrated Solver Technologies Finite Arrays with Domain Decomposition
More informationHPC and IT Issues Session Agenda. Deployment of Simulation (Trends and Issues Impacting IT) Mapping HPC to Performance (Scaling, Technology Advances)
HPC and IT Issues Session Agenda Deployment of Simulation (Trends and Issues Impacting IT) Discussion Mapping HPC to Performance (Scaling, Technology Advances) Discussion Optimizing IT for Remote Access
More informationQLogic TrueScale InfiniBand and Teraflop Simulations
WHITE Paper QLogic TrueScale InfiniBand and Teraflop Simulations For ANSYS Mechanical v12 High Performance Interconnect for ANSYS Computer Aided Engineering Solutions Executive Summary Today s challenging
More informationSIMPLIFYING HPC SIMPLIFYING HPC FOR ENGINEERING SIMULATION WITH ANSYS
SIMPLIFYING HPC SIMPLIFYING HPC FOR ENGINEERING SIMULATION WITH ANSYS THE DELL WAY We are an acknowledged leader in academic supercomputing including major HPC systems installed at the Cambridge University
More informationUnderstanding Hardware Selection to Speedup Your CFD and FEA Simulations
Understanding Hardware Selection to Speedup Your CFD and FEA Simulations 1 Agenda Why Talking About Hardware HPC Terminology ANSYS Work-flow Hardware Considerations Additional resources 2 Agenda Why Talking
More informationANSYS High. Computing. User Group CAE Associates
ANSYS High Performance Computing User Group 010 010 CAE Associates Parallel Processing in ANSYS ANSYS offers two parallel processing methods: Shared-memory ANSYS: Shared-memory ANSYS uses the sharedmemory
More informationMaximize Performance and Scalability of RADIOSS* Structural Analysis Software on Intel Xeon Processor E7 v2 Family-Based Platforms
Maximize Performance and Scalability of RADIOSS* Structural Analysis Software on Family-Based Platforms Executive Summary Complex simulations of structural and systems performance, such as car crash simulations,
More informationIBM Information Technology Guide For ANSYS Fluent Customers
IBM ISV & Developer Relations Manufacturing IBM Information Technology Guide For ANSYS Fluent Customers A collaborative effort between ANSYS and IBM 2 IBM Information Technology Guide For ANSYS Fluent
More informationACCELERATING CFD AND RESERVOIR SIMULATIONS WITH ALGEBRAIC MULTI GRID Chris Gottbrath, Nov 2016
ACCELERATING CFD AND RESERVOIR SIMULATIONS WITH ALGEBRAIC MULTI GRID Chris Gottbrath, Nov 2016 Challenges What is Algebraic Multi-Grid (AMG)? AGENDA Why use AMG? When to use AMG? NVIDIA AmgX Results 2
More informationPerformance Optimizations via Connect-IB and Dynamically Connected Transport Service for Maximum Performance on LS-DYNA
Performance Optimizations via Connect-IB and Dynamically Connected Transport Service for Maximum Performance on LS-DYNA Pak Lui, Gilad Shainer, Brian Klaff Mellanox Technologies Abstract From concept to
More informationReal Application Performance and Beyond
Real Application Performance and Beyond Mellanox Technologies Inc. 2900 Stender Way, Santa Clara, CA 95054 Tel: 408-970-3400 Fax: 408-970-3403 http://www.mellanox.com Scientists, engineers and analysts
More informationA Comprehensive Study on the Performance of Implicit LS-DYNA
12 th International LS-DYNA Users Conference Computing Technologies(4) A Comprehensive Study on the Performance of Implicit LS-DYNA Yih-Yih Lin Hewlett-Packard Company Abstract This work addresses four
More informationHigh performance Computing and O&G Challenges
High performance Computing and O&G Challenges 2 Seismic exploration challenges High Performance Computing and O&G challenges Worldwide Context Seismic,sub-surface imaging Computing Power needs Accelerating
More informationHFSS 14 Update for SI and RF Applications Markus Kopp Product Manager, Electronics ANSYS, Inc.
HFSS 14 Update for SI and RF Applications Markus Kopp Product Manager, Electronics ANSYS, Inc. 1 ANSYS, Inc. September 21, Advanced Solvers: Finite Arrays with DDM 2 ANSYS, Inc. September 21, Finite Arrays
More informationLS-DYNA Best-Practices: Networking, MPI and Parallel File System Effect on LS-DYNA Performance
11 th International LS-DYNA Users Conference Computing Technology LS-DYNA Best-Practices: Networking, MPI and Parallel File System Effect on LS-DYNA Performance Gilad Shainer 1, Tong Liu 2, Jeff Layton
More informationIndustrial finite element analysis: Evolution and current challenges. Keynote presentation at NAFEMS World Congress Crete, Greece June 16-19, 2009
Industrial finite element analysis: Evolution and current challenges Keynote presentation at NAFEMS World Congress Crete, Greece June 16-19, 2009 Dr. Chief Numerical Analyst Office of Architecture and
More informationTwo-Phase flows on massively parallel multi-gpu clusters
Two-Phase flows on massively parallel multi-gpu clusters Peter Zaspel Michael Griebel Institute for Numerical Simulation Rheinische Friedrich-Wilhelms-Universität Bonn Workshop Programming of Heterogeneous
More informationSimulation Advances for RF, Microwave and Antenna Applications
Simulation Advances for RF, Microwave and Antenna Applications Bill McGinn Application Engineer 1 Overview Advanced Integrated Solver Technologies Finite Arrays with Domain Decomposition Hybrid solving:
More informationThe Cray CX1 puts massive power and flexibility right where you need it in your workgroup
The Cray CX1 puts massive power and flexibility right where you need it in your workgroup Up to 96 cores of Intel 5600 compute power 3D visualization Up to 32TB of storage GPU acceleration Small footprint
More informationPerformance Benefits of NVIDIA GPUs for LS-DYNA
Performance Benefits of NVIDIA GPUs for LS-DYNA Mr. Stan Posey and Dr. Srinivas Kodiyalam NVIDIA Corporation, Santa Clara, CA, USA Summary: This work examines the performance characteristics of LS-DYNA
More informationMPI Optimizations via MXM and FCA for Maximum Performance on LS-DYNA
MPI Optimizations via MXM and FCA for Maximum Performance on LS-DYNA Gilad Shainer 1, Tong Liu 1, Pak Lui 1, Todd Wilde 1 1 Mellanox Technologies Abstract From concept to engineering, and from design to
More informationESPRESO ExaScale PaRallel FETI Solver. Hybrid FETI Solver Report
ESPRESO ExaScale PaRallel FETI Solver Hybrid FETI Solver Report Lubomir Riha, Tomas Brzobohaty IT4Innovations Outline HFETI theory from FETI to HFETI communication hiding and avoiding techniques our new
More informationSession S0069: GPU Computing Advances in 3D Electromagnetic Simulation
Session S0069: GPU Computing Advances in 3D Electromagnetic Simulation Andreas Buhr, Alexander Langwost, Fabrizio Zanella CST (Computer Simulation Technology) Abstract Computer Simulation Technology (CST)
More informationLarge scale Imaging on Current Many- Core Platforms
Large scale Imaging on Current Many- Core Platforms SIAM Conf. on Imaging Science 2012 May 20, 2012 Dr. Harald Köstler Chair for System Simulation Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen,
More informationFaster Innovation - Accelerating SIMULIA Abaqus Simulations with NVIDIA GPUs. Baskar Rajagopalan Accelerated Computing, NVIDIA
Faster Innovation - Accelerating SIMULIA Abaqus Simulations with NVIDIA GPUs Baskar Rajagopalan Accelerated Computing, NVIDIA 1 Engineering & IT Challenges/Trends NVIDIA GPU Solutions AGENDA Abaqus GPU
More informationBuilding NVLink for Developers
Building NVLink for Developers Unleashing programmatic, architectural and performance capabilities for accelerated computing Why NVLink TM? Simpler, Better and Faster Simplified Programming No specialized
More informationIntroduction to parallel Computing
Introduction to parallel Computing VI-SEEM Training Paschalis Paschalis Korosoglou Korosoglou (pkoro@.gr) (pkoro@.gr) Outline Serial vs Parallel programming Hardware trends Why HPC matters HPC Concepts
More informationOverview of Parallel Computing. Timothy H. Kaiser, PH.D.
Overview of Parallel Computing Timothy H. Kaiser, PH.D. tkaiser@mines.edu Introduction What is parallel computing? Why go parallel? The best example of parallel computing Some Terminology Slides and examples
More informationMaximizing Memory Performance for ANSYS Simulations
Maximizing Memory Performance for ANSYS Simulations By Alex Pickard, 2018-11-19 Memory or RAM is an important aspect of configuring computers for high performance computing (HPC) simulation work. The performance
More informationHPC Considerations for Scalable Multidiscipline CAE Applications on Conventional Linux Platforms. Author: Correspondence: ABSTRACT:
HPC Considerations for Scalable Multidiscipline CAE Applications on Conventional Linux Platforms Author: Stan Posey Panasas, Inc. Correspondence: Stan Posey Panasas, Inc. Phone +510 608 4383 Email sposey@panasas.com
More informationAccelerated ANSYS Fluent: Algebraic Multigrid on a GPU. Robert Strzodka NVAMG Project Lead
Accelerated ANSYS Fluent: Algebraic Multigrid on a GPU Robert Strzodka NVAMG Project Lead A Parallel Success Story in Five Steps 2 Step 1: Understand Application ANSYS Fluent Computational Fluid Dynamics
More informationEngineers can be significantly more productive when ANSYS Mechanical runs on CPUs with a high core count. Executive Summary
white paper Computer-Aided Engineering ANSYS Mechanical on Intel Xeon Processors Engineer Productivity Boosted by Higher-Core CPUs Engineers can be significantly more productive when ANSYS Mechanical runs
More informationAcuSolve Performance Benchmark and Profiling. October 2011
AcuSolve Performance Benchmark and Profiling October 2011 Note The following research was performed under the HPC Advisory Council activities Participating vendors: Intel, Dell, Mellanox, Altair Compute
More informationEnhancing Analysis-Based Design with Quad-Core Intel Xeon Processor-Based Workstations
Performance Brief Quad-Core Workstation Enhancing Analysis-Based Design with Quad-Core Intel Xeon Processor-Based Workstations With eight cores and up to 80 GFLOPS of peak performance at your fingertips,
More informationBig Data Analytics Performance for Large Out-Of- Core Matrix Solvers on Advanced Hybrid Architectures
Procedia Computer Science Volume 51, 2015, Pages 2774 2778 ICCS 2015 International Conference On Computational Science Big Data Analytics Performance for Large Out-Of- Core Matrix Solvers on Advanced Hybrid
More informationHPC Architectures. Types of resource currently in use
HPC Architectures Types of resource currently in use Reusing this material This work is licensed under a Creative Commons Attribution- NonCommercial-ShareAlike 4.0 International License. http://creativecommons.org/licenses/by-nc-sa/4.0/deed.en_us
More informationThe Stampede is Coming Welcome to Stampede Introductory Training. Dan Stanzione Texas Advanced Computing Center
The Stampede is Coming Welcome to Stampede Introductory Training Dan Stanzione Texas Advanced Computing Center dan@tacc.utexas.edu Thanks for Coming! Stampede is an exciting new system of incredible power.
More informationA Scalable GPU-Based Compressible Fluid Flow Solver for Unstructured Grids
A Scalable GPU-Based Compressible Fluid Flow Solver for Unstructured Grids Patrice Castonguay and Antony Jameson Aerospace Computing Lab, Stanford University GTC Asia, Beijing, China December 15 th, 2011
More informationDell EMC Ready Bundle for HPC Digital Manufacturing Dassault Systѐmes Simulia Abaqus Performance
Dell EMC Ready Bundle for HPC Digital Manufacturing Dassault Systѐmes Simulia Abaqus Performance This Dell EMC technical white paper discusses performance benchmarking results and analysis for Simulia
More informationSun Lustre Storage System Simplifying and Accelerating Lustre Deployments
Sun Lustre Storage System Simplifying and Accelerating Lustre Deployments Torben Kling-Petersen, PhD Presenter s Name Principle Field Title andengineer Division HPC &Cloud LoB SunComputing Microsystems
More informationGPU-Acceleration of CAE Simulations. Bhushan Desam NVIDIA Corporation
GPU-Acceleration of CAE Simulations Bhushan Desam NVIDIA Corporation bdesam@nvidia.com 1 AGENDA GPUs in Enterprise Computing Business Challenges in Product Development NVIDIA GPUs for CAE Applications
More informationCenter Extreme Scale CS Research
Center Extreme Scale CS Research Center for Compressible Multiphase Turbulence University of Florida Sanjay Ranka Herman Lam Outline 10 6 10 7 10 8 10 9 cores Parallelization and UQ of Rocfun and CMT-Nek
More informationAccelerating Implicit LS-DYNA with GPU
Accelerating Implicit LS-DYNA with GPU Yih-Yih Lin Hewlett-Packard Company Abstract A major hindrance to the widespread use of Implicit LS-DYNA is its high compute cost. This paper will show modern GPU,
More informationOptimizing LS-DYNA Productivity in Cluster Environments
10 th International LS-DYNA Users Conference Computing Technology Optimizing LS-DYNA Productivity in Cluster Environments Gilad Shainer and Swati Kher Mellanox Technologies Abstract Increasing demand for
More informationArchitectures for Scalable Media Object Search
Architectures for Scalable Media Object Search Dennis Sng Deputy Director & Principal Scientist NVIDIA GPU Technology Workshop 10 July 2014 ROSE LAB OVERVIEW 2 Large Database of Media Objects Next- Generation
More informationAcuSolve Performance Benchmark and Profiling. October 2011
AcuSolve Performance Benchmark and Profiling October 2011 Note The following research was performed under the HPC Advisory Council activities Participating vendors: AMD, Dell, Mellanox, Altair Compute
More information2008 International ANSYS Conference
2008 International ANSYS Conference Maximizing Productivity With InfiniBand-Based Clusters Gilad Shainer Director of Technical Marketing Mellanox Technologies 2008 ANSYS, Inc. All rights reserved. 1 ANSYS,
More informationIntel Cluster Toolkit Compiler Edition 3.2 for Linux* or Windows HPC Server 2008*
Intel Cluster Toolkit Compiler Edition. for Linux* or Windows HPC Server 8* Product Overview High-performance scaling to thousands of processors. Performance leadership Intel software development products
More informationAlternative GPU friendly assignment algorithms. Paul Richmond and Peter Heywood Department of Computer Science The University of Sheffield
Alternative GPU friendly assignment algorithms Paul Richmond and Peter Heywood Department of Computer Science The University of Sheffield Graphics Processing Units (GPUs) Context: GPU Performance Accelerated
More informationAmgX 2.0: Scaling toward CORAL Joe Eaton, November 19, 2015
AmgX 2.0: Scaling toward CORAL Joe Eaton, November 19, 2015 Agenda Introduction to AmgX Current Capabilities Scaling V2.0 Roadmap for the future 2 AmgX Fast, scalable linear solvers, emphasis on iterative
More informationAlgorithms, System and Data Centre Optimisation for Energy Efficient HPC
2015-09-14 Algorithms, System and Data Centre Optimisation for Energy Efficient HPC Vincent Heuveline URZ Computing Centre of Heidelberg University EMCL Engineering Mathematics and Computing Lab 1 Energy
More informationSplotch: High Performance Visualization using MPI, OpenMP and CUDA
Splotch: High Performance Visualization using MPI, OpenMP and CUDA Klaus Dolag (Munich University Observatory) Martin Reinecke (MPA, Garching) Claudio Gheller (CSCS, Switzerland), Marzia Rivi (CINECA,
More informationLecture 7: Introduction to HFSS-IE
Lecture 7: Introduction to HFSS-IE 2015.0 Release ANSYS HFSS for Antenna Design 1 2015 ANSYS, Inc. HFSS-IE: Integral Equation Solver Introduction HFSS-IE: Technology An Integral Equation solver technology
More informationInfiniBand-based HPC Clusters
Boosting Scalability of InfiniBand-based HPC Clusters Asaf Wachtel, Senior Product Manager 2010 Voltaire Inc. InfiniBand-based HPC Clusters Scalability Challenges Cluster TCO Scalability Hardware costs
More informationTowards a complete FEM-based simulation toolkit on GPUs: Geometric Multigrid solvers
Towards a complete FEM-based simulation toolkit on GPUs: Geometric Multigrid solvers Markus Geveler, Dirk Ribbrock, Dominik Göddeke, Peter Zajac, Stefan Turek Institut für Angewandte Mathematik TU Dortmund,
More informationHybrid Implementation of 3D Kirchhoff Migration
Hybrid Implementation of 3D Kirchhoff Migration Max Grossman, Mauricio Araya-Polo, Gladys Gonzalez GTC, San Jose March 19, 2013 Agenda 1. Motivation 2. The Problem at Hand 3. Solution Strategy 4. GPU Implementation
More informationThe Uintah Framework: A Unified Heterogeneous Task Scheduling and Runtime System
The Uintah Framework: A Unified Heterogeneous Task Scheduling and Runtime System Alan Humphrey, Qingyu Meng, Martin Berzins Scientific Computing and Imaging Institute & University of Utah I. Uintah Overview
More informationParticleworks: Particle-based CAE Software fully ported to GPU
Particleworks: Particle-based CAE Software fully ported to GPU Introduction PrometechVideo_v3.2.3.wmv 3.5 min. Particleworks Why the particle method? Existing methods FEM, FVM, FLIP, Fluid calculation
More informationHYPERDRIVE IMPLEMENTATION AND ANALYSIS OF A PARALLEL, CONJUGATE GRADIENT LINEAR SOLVER PROF. BRYANT PROF. KAYVON 15618: PARALLEL COMPUTER ARCHITECTURE
HYPERDRIVE IMPLEMENTATION AND ANALYSIS OF A PARALLEL, CONJUGATE GRADIENT LINEAR SOLVER AVISHA DHISLE PRERIT RODNEY ADHISLE PRODNEY 15618: PARALLEL COMPUTER ARCHITECTURE PROF. BRYANT PROF. KAYVON LET S
More informationMSC Nastran Explicit Nonlinear (SOL 700) on Advanced SGI Architectures
MSC Nastran Explicit Nonlinear (SOL 700) on Advanced SGI Architectures Presented By: Dr. Olivier Schreiber, Application Engineering, SGI Walter Schrauwen, Senior Engineer, Finite Element Development, MSC
More informationANSYS Fluent 14 Performance Benchmark and Profiling. October 2012
ANSYS Fluent 14 Performance Benchmark and Profiling October 2012 Note The following research was performed under the HPC Advisory Council activities Special thanks for: HP, Mellanox For more information
More informationHigh-Order Finite-Element Earthquake Modeling on very Large Clusters of CPUs or GPUs
High-Order Finite-Element Earthquake Modeling on very Large Clusters of CPUs or GPUs Gordon Erlebacher Department of Scientific Computing Sept. 28, 2012 with Dimitri Komatitsch (Pau,France) David Michea
J. Blair Perot. Ali Khajeh-Saeed. Software Engineer CD-adapco. Mechanical Engineering UMASS, Amherst
Ali Khajeh-Saeed Software Engineer CD-adapco J. Blair Perot Mechanical Engineering UMASS, Amherst Supercomputers Optimization Stream Benchmark Stag++ (3D Incompressible Flow Code) Matrix Multiply Function
Birds of a Feather Presentation
Mellanox InfiniBand QDR 40Gb/s The Fabric of Choice for High Performance Computing Gilad Shainer, shainer@mellanox.com June 28 Birds of a Feather Presentation InfiniBand Technology Leadership Industry Standard
New Technologies in CST STUDIO SUITE CST COMPUTER SIMULATION TECHNOLOGY
New Technologies in CST STUDIO SUITE 2016 Outline Design Tools & Modeling Antenna Magus Filter Designer 2D/3D Modeling 3D EM Solver Technology Cable / Circuit / PCB Systems Multiphysics CST Design Tools
Multi-GPU Scaling of Direct Sparse Linear System Solver for Finite-Difference Frequency-Domain Photonic Simulation
Multi-GPU Scaling of Direct Sparse Linear System Solver for Finite-Difference Frequency-Domain Photonic Simulation 1 Cheng-Han Du* I-Hsin Chung** Weichung Wang* * Institute of Applied M
Investigation and Feasibility Study of Linux and Windows in the Computational Processing Power of ANSYS Software
Science Arena Publications Specialty Journal of Electronic and Computer Sciences Available online at www.sciarena.com 2017, Vol, 3 (1): 40-46 Investigation and Feasibility Study of Linux and Windows in
Analyzing the Performance of IWAVE on a Cluster using HPCToolkit
Analyzing the Performance of IWAVE on a Cluster using HPCToolkit John Mellor-Crummey and Laksono Adhianto Department of Computer Science Rice University {johnmc,laksono}@rice.edu TRIP Meeting March 30,
Dell EMC Ready Bundle for HPC Digital Manufacturing ANSYS Performance
Dell EMC Ready Bundle for HPC Digital Manufacturing ANSYS Performance This Dell EMC technical white paper discusses performance benchmarking results and analysis for ANSYS Mechanical, ANSYS Fluent, and
Aerodynamics of a hi-performance vehicle: a parallel computing application inside the Hi-ZEV project
Workshop HPC enabling of OpenFOAM for CFD applications Aerodynamics of a hi-performance vehicle: a parallel computing application inside the Hi-ZEV project A. De Maio (1), V. Krastev (2), P. Lanucara (3),
S8901 Quadro for AI, VR and Simulation
S8901 Quadro for AI, VR and Simulation Carl Flygare, PNY Quadro Product Marketing Manager Allen Bourgoyne, NVIDIA Senior Product Marketing Manager The question of whether a computer can think is no more
NVIDIA Think about Computing as Heterogeneous One Leo Liao, 1/29/2106, NTU
NVIDIA Think about Computing as Heterogeneous One Leo Liao, 1/29/2106, NTU GPGPU opens the door for co-design HPC, moreover middleware-support embedded system designs to harness the power of GPU-accelerated
D036 Accelerating Reservoir Simulation with GPUs
D036 Accelerating Reservoir Simulation with GPUs K.P. Esler* (Stone Ridge Technology), S. Atan (Marathon Oil Corp.), B. Ramirez (Marathon Oil Corp.) & V. Natoli (Stone Ridge Technology) SUMMARY Over the
HETEROGENEOUS HPC, ARCHITECTURAL OPTIMIZATION, AND NVLINK STEVE OBERLIN CTO, TESLA ACCELERATED COMPUTING NVIDIA
HETEROGENEOUS HPC, ARCHITECTURAL OPTIMIZATION, AND NVLINK STEVE OBERLIN CTO, TESLA ACCELERATED COMPUTING NVIDIA STATE OF THE ART 2012 18,688 Tesla K20X GPUs 27 PetaFLOPS FLAGSHIP SCIENTIFIC APPLICATIONS
HPC 2 Informed by Industry
HPC 2 Informed by Industry HPC User Forum October 2011 Merle Giles Private Sector Program & Economic Development mgiles@ncsa.illinois.edu National Center for Supercomputing Applications University of Illinois
System Design of Kepler Based HPC Solutions. Saeed Iqbal, Shawn Gao and Kevin Tubbs HPC Global Solutions Engineering.
System Design of Kepler Based HPC Solutions Saeed Iqbal, Shawn Gao and Kevin Tubbs HPC Global Solutions Engineering. Introduction The System Level View K20 GPU is a powerful parallel processor! K20 has
IBM Power AC922 Server
IBM Power AC922 Server The Best Server for Enterprise AI Highlights More accuracy - GPUs access system RAM for larger models Faster insights - significant deep learning speedups Rapid deployment - integrated
GROMACS Performance Benchmark and Profiling. August 2011
GROMACS Performance Benchmark and Profiling August 2011 Note The following research was performed under the HPC Advisory Council activities Participating vendors: Intel, Dell, Mellanox Compute resource
Parallel Processors. The dream of computer architects since 1950s: replicate processors to add performance vs. design a faster processor
Multiprocessing Parallel Computers Definition: A parallel computer is a collection of processing elements that cooperate and communicate to solve large problems fast. Almasi and Gottlieb, Highly Parallel
CMSC 714 Lecture 6 MPI vs. OpenMP and OpenACC. Guest Lecturer: Sukhyun Song (original slides by Alan Sussman)
CMSC 714 Lecture 6 MPI vs. OpenMP and OpenACC Guest Lecturer: Sukhyun Song (original slides by Alan Sussman) Parallel Programming with Message Passing and Directives 2 MPI + OpenMP Some applications can
Industrial achievements on Blue Waters using CPUs and GPUs
Industrial achievements on Blue Waters using CPUs and GPUs HPC User Forum, September 17, 2014 Seattle Seid Korić PhD Technical Program Manager Associate Adjunct Professor koric@illinois.edu Think Big!
Using an HPC Cloud for Weather Science
Using an HPC Cloud for Weather Science Provided By: Transforming Operational Environmental Predictions Around the Globe Moving EarthCast Technologies from Idea to Production EarthCast Technologies produces
The Future of Interconnect Technology
The Future of Interconnect Technology Michael Kagan, CTO HPC Advisory Council Stanford, 2014 Exponential Data Growth Best Interconnect Required 44X 0.8 Zetabyte 2009 35 Zetabyte 2020 2014 Mellanox Technologies
Working Differently Accelerating Virtual Product Design with Intel Quad-Core Technology and ESI Group Software
White Paper Quad-Core Intel/ESI Group Workstation White paper Working Differently Accelerating Virtual Product Design with Intel Quad-Core Technology and ESI Group Software Workstation supercomputers powered
The determination of the correct
SPECIAL SECTION: High-performance computing MARK NOBLE, Mines ParisTech PHILIPPE THIERRY, Intel CEDRIC TAILLANDIER, CGGVeritas (formerly Mines ParisTech) HENRI CALANDRA, Total
Mellanox Technologies Maximize Cluster Performance and Productivity. Gilad Shainer, October, 2007
Mellanox Technologies Maximize Cluster Performance and Productivity Gilad Shainer, shainer@mellanox.com October, 2007 Mellanox Technologies Hardware OEMs Servers And Blades Applications End-Users Enterprise