ANSYS Improvements to Engineering Productivity with HPC and GPU-Accelerated Simulation

Similar documents
Accelerated ANSYS Fluent: Algebraic Multigrid on a GPU. Robert Strzodka NVAMG Project Lead

Maximize automotive simulation productivity with ANSYS HPC and NVIDIA GPUs

Stan Posey, CAE Industry Development NVIDIA, Santa Clara, CA, USA

ANSYS HPC Technology Leadership

Why HPC for. ANSYS Mechanical and ANSYS CFD?

HPC and IT Issues Session Agenda. Deployment of Simulation (Trends and Issues Impacting IT) Mapping HPC to Performance (Scaling, Technology Advances)

ANSYS HPC. Technology Leadership. Barbara Hutchings ANSYS, Inc. September 20, 2011

Solving Large Complex Problems. Efficient and Smart Solutions for Large Models

Understanding Hardware Selection to Speedup Your CFD and FEA Simulations

ACCELERATING CFD AND RESERVOIR SIMULATIONS WITH ALGEBRAIC MULTI GRID Chris Gottbrath, Nov 2016

GPU-Acceleration of CAE Simulations. Bhushan Desam NVIDIA Corporation

Faster Innovation - Accelerating SIMULIA Abaqus Simulations with NVIDIA GPUs. Baskar Rajagopalan Accelerated Computing, NVIDIA

GPU-Accelerated Algebraic Multigrid for Commercial Applications. Joe Eaton, Ph.D. Manager, NVAMG CUDA Library NVIDIA

Speedup Altair RADIOSS Solvers Using NVIDIA GPU

Making Supercomputing More Available and Accessible Windows HPC Server 2008 R2 Beta 2 Microsoft High Performance Computing April, 2010

Recent Advances in ANSYS Toward RDO Practices Using optislang. Wim Slagter, ANSYS Inc. Herbert Güttler, MicroConsult GmbH

ANSYS High. Computing. User Group CAE Associates

The Cray CX1 puts massive power and flexibility right where you need it in your workgroup

The Visual Computing Company

CUDA Accelerated Compute Libraries. M. Naumov

GPU Computing fuer rechenintensive Anwendungen. Axel Koehler NVIDIA

HPC and IT Issues Session Agenda. Deployment of Simulation (Trends and Issues Impacting IT) Mapping HPC to Performance (Scaling, Technology Advances)

High Performance Computing with Accelerators

Building NVLink for Developers

Turbostream: A CFD solver for manycore

TFLOP Performance for ANSYS Mechanical

Stan Posey NVIDIA, Santa Clara, CA, USA;

Large scale Imaging on Current Many- Core Platforms

On Level Scheduling for Incomplete LU Factorization Preconditioners on Accelerators

A Comprehensive Study on the Performance of Implicit LS-DYNA

GPU COMPUTING WITH MSC NASTRAN 2013

Accelerating Implicit LS-DYNA with GPU

Multi-GPU simulations in OpenFOAM with SpeedIT technology.

AmgX 2.0: Scaling toward CORAL Joe Eaton, November 19, 2015

FEMAP/NX NASTRAN PERFORMANCE TUNING

MAGMA a New Generation of Linear Algebra Libraries for GPU and Multicore Architectures

Engineers can be significantly more productive when ANSYS Mechanical runs on CPUs with a high core count. Executive Summary

Computer Aided Engineering with Today's Multicore, InfiniBand-Based Clusters ANSYS, Inc. All rights reserved. 1 ANSYS, Inc.

Performance Benefits of NVIDIA GPUs for LS-DYNA

HPC Architectures. Types of resource currently in use

Pedraforca: a First ARM + GPU Cluster for HPC

ACCELERATED COMPUTING: THE PATH FORWARD. Jen-Hsun Huang, Co-Founder and CEO, NVIDIA SC15 Nov. 16, 2015

WHAT S NEW IN GRID 7.0. Mason Wu, GRID & ProViz Solutions Architect Nov. 2018

Dell EMC Ready Bundle for HPC Digital Manufacturing Dassault Systѐmes Simulia Abaqus Performance

Application of GPU technology to OpenFOAM simulations

Particleworks: Particle-based CAE Software fully ported to GPU

GPU Accelerated Solvers for ODEs Describing Cardiac Membrane Equations

DU _v01. September User Guide

HP and CATIA HP Workstations for running Dassault Systèmes CATIA

HETEROGENEOUS HPC, ARCHITECTURAL OPTIMIZATION, AND NVLINK STEVE OBERLIN CTO, TESLA ACCELERATED COMPUTING NVIDIA

HFSS Hybrid Finite Element and Integral Equation Solver for Large Scale Electromagnetic Design and Simulation

Maximize Performance and Scalability of RADIOSS* Structural Analysis Software on Intel Xeon Processor E7 v2 Family-Based Platforms

Enhancing Analysis-Based Design with Quad-Core Intel Xeon Processor-Based Workstations

New Features in LS-DYNA HYBRID Version

Advances of parallel computing. Kirill Bogachev May 2016

Optimising the Mantevo benchmark suite for multi- and many-core architectures

HP GTC Presentation May 2012

Finite Element Integration and Assembly on Modern Multi and Many-core Processors

RECENT TRENDS IN GPU ARCHITECTURES. Perspectives of GPU computing in Science, 26 th Sept 2016

Hybrid KAUST Many Cores and OpenACC. Alain Clo - KAUST Research Computing Saber Feki KAUST Supercomputing Lab Florent Lebeau - CAPS

Efficient AMG on Hybrid GPU Clusters. ScicomP Jiri Kraus, Malte Förster, Thomas Brandes, Thomas Soddemann. Fraunhofer SCAI

OpenMPSuperscalar: Task-Parallel Simulation and Visualization of Crowds with Several CPUs and GPUs

System Design of Kepler Based HPC Solutions. Saeed Iqbal, Shawn Gao and Kevin Tubbs HPC Global Solutions Engineering.

IBM Information Technology Guide For ANSYS Fluent Customers

2008 International ANSYS Conference

GPU GPU CPU. Raymond Namyst 3 Samuel Thibault 3 Olivier Aumage 3

The Stampede is Coming Welcome to Stampede Introductory Training. Dan Stanzione Texas Advanced Computing Center

PARALUTION - a Library for Iterative Sparse Methods on CPU and GPU

Dell HPC System for Manufacturing System Architecture and Application Performance

LAMMPSCUDA GPU Performance. April 2011

Computing on GPUs. Prof. Dr. Uli Göhner. DYNAmore GmbH. Stuttgart, Germany

Structural Mechanics With ANSYS Pierre THIEFFRY Lead Product Manager ANSYS, Inc.

The Stampede is Coming: A New Petascale Resource for the Open Science Community

S8901 Quadro for AI, VR and Simulation

Performance of the 3D-Combustion Simulation Code RECOM-AIOLOS on IBM POWER8 Architecture. Alexander Berreth. Markus Bühler, Benedikt Anlauf

Hybrid (MPP+OpenMP) version of LS-DYNA

Session S0069: GPU Computing Advances in 3D Electromagnetic Simulation

GPU Clusters for High- Performance Computing Jeremy Enos Innovative Systems Laboratory

Automated Finite Element Computations in the FEniCS Framework using GPUs

HPC with Multicore and GPUs

Krishnan Suresh Associate Professor Mechanical Engineering

QUADRO ADVANCED VISUALIZATION. PREVAIL & PREVAIL ELITE SSDs PROFESSIONAL STORAGE -

Implicit Low-Order Unstructured Finite-Element Multiple Simulation Enhanced by Dense Computation using OpenACC

Simulation Advances. Antenna Applications

TESLA P100 PERFORMANCE GUIDE. Deep Learning and HPC Applications

Piz Daint: Application driven co-design of a supercomputer based on Cray s adaptive system design

GPU PROGRESS AND DIRECTIONS IN APPLIED CFD

n N c CIni.o ewsrg.au

GPUs and Emerging Architectures

OP2 FOR MANY-CORE ARCHITECTURES

Carlos Reaño, Javier Prades and Federico Silla Technical University of Valencia (Spain)

The GPU-Cluster. Sandra Wienke Rechen- und Kommunikationszentrum (RZ) Fotos: Christian Iwainsky

Parallel Systems. Project topics

A Scalable GPU-Based Compressible Fluid Flow Solver for Unstructured Grids

An Extension of the StarSs Programming Model for Platforms with Multiple GPUs

Analyzing Performance and Power of Applications on GPUs with Dell 12G Platforms. Dr. Jeffrey Layton Enterprise Technologist HPC

Headline in Arial Bold 30pt. SGI Altix XE Server ANSYS Microsoft Windows Compute Cluster Server 2003

Dell EMC Ready Bundle for HPC Digital Manufacturing ANSYS Performance

Optimizing Data Locality for Iterative Matrix Solvers on CUDA

Mathematical computations with GPUs

Transcription:

ANSYS Improvements to Engineering Productivity with HPC and GPU-Accelerated Simulation Ray Browell nvidia Technology Theater SC12 1 2012 ANSYS, Inc. nvidia Technology Theater SC12

HPC Revolution Recent advancements have revolutionized the computational speed available on the desktop Multi-core processors Every core is really an independent processor Large amounts of RAM and SSDs GPUs 2 2012 ANSYS, Inc. nvidia Technology Theater SC12

Mechanical GPU Accelerator Capability 3 2012 ANSYS, Inc. nvidia Technology Theater SC12

Mechanical GPU Accelerator Capability 4 2012 ANSYS, Inc. nvidia Technology Theater SC12

Mechanical GPU Accelerator Capability Targeted hardware NVIDIA Tesla C2075 NVIDIA Tesla M2090 NVIDIA Quadro 6000 NVIDIA Quadro K5000 NVIDIA Tesla K10 NVIDIA Tesla K20 Power (W) 225 250 225 122 250 250 Memory 6 GB 6 GB 6 GB 4 GB 8 GB 6 to 24 GB Memory Bandwidth (GB/s) 144 177.4 144 173 320 288 Peak Speed SP/DP (GFlops) 1030/515 1331/665 1030/515 2290/95 4577/190 5184/1728 These NVIDIA Kepler based products are not released yet, so specifications may be incorrect 5 2012 ANSYS, Inc. nvidia Technology Theater SC12

Mechanical GPU Accelerator Capability 6 2012 ANSYS, Inc. nvidia Technology Theater SC12

ANSYS Mechanical Number of Jobs Per Day 25 ANSYS Mechanical 14.5 Preview Results for Distributed ANSYS 14.5 Preview and Xeon 8-Core CPUs V14sp-5 Model 20 15 10 5 0 Higher is Better 12 8 8 Xeon E5-2687W 8 Cores + Tesla C2075 1.6x 19 Xeon E5-2687W 8 Cores + 2 x Tesla C2075 Turbine geometry 2,100 K DOF SOLID187 FEs Static, nonlinear One iteration ANSYS Mechanical14.5 Direct sparse solver Results from HP Z820; 2 x Xeons (16 Cores, use of only 8) 128GB memory, Win7; 2 x Tesla C2075 7 2012 ANSYS, Inc. nvidia Technology Theater SC12

Relative Speedup Structural GPU Accelerator Capability GPUs can offer significantly faster time to solution 6.5 million DOF Linear static analysis Sparse solver (DMP) 2 Intel Xeon E5-2670 (2.6 GHz, 16 cores total), 128 GB RAM, SSD, 4 Tesla C2075, Win7 4.0 3.5 3.0 2.5 2.0 1.5 1.0 0.5 0.0 GPU Performance 8 2012 ANSYS, Inc. nvidia Technology Theater SC12 2.6x 3.8x 2 cores 8 cores 8 cores (no GPU) (no GPU) (1 GPU)

Relative Speedup Structural GPU Accelerator Capability GPUs can offer significantly faster time to solution 11.8 million DOF Linear static analysis PCG solver (DMP) 2 Intel Xeon E5-2670 (2.6 GHz, 16 cores total), 128 GB RAM, SSD, 4 Tesla C2075, Win7 6.0 5.0 4.0 3.0 2.0 1.0 0.0 GPU Performance 9 2012 ANSYS, Inc. nvidia Technology Theater SC12 2.7x 5.2x 2 cores 8 cores 16 cores (no GPU) (1 GPU) (4 GPUs)

ANSYS and NVIDIA Collaboration Release ANSYS Mechanical ANSYS Fluent 13.0 Dec 2010 14.0 Dec 2011 14.5 Nov 2012 Shared Memory Solvers; Single Node/ Single GPU + Distributed ANSYS; Multi-node / 1 GPU/node + Multi-GPU / node; + Hybrid PCG; Radiation Heat Transfer (beta) + GPU AMG Solver (beta), Single GPU 10 2012 ANSYS, Inc. nvidia Technology Theater SC12

Fluent Radiation Modeling on GPUs VIEWFAC Utility to compute view factors Hybrid MPI-OpenMP-OpenCL parallel implementation Works on CPUs, GPUs or both RAY TRACING Utility to compute view factors Uses Optix on NVIDIA C2070 Available as full features in 14.5 11 2012 ANSYS, Inc. nvidia Technology Theater SC12

ANSYS Fluent AMG Solver Time (Sec) 3000 Fluent AMG Solver on GPUs Work-in-Progress NVAMG Project Preview of ANSYS Fluent 14.5 Performance Dual Socket CPU 2832 Dual Socket CPU + Tesla C2075 Helix Model 2000 1000 0 5.5x 2 x Xeon X5650, Only 1 Core Used 12 2012 ANSYS, Inc. nvidia Technology Theater SC12 933 Lower is Better 1.8x 517 517 2 x Xeon X5650, All 12 Cores Used Helix geometry 1.2M Hex cells Unsteady, laminar Coupled PBNS, DP AMG F-cycle on CPU AMG V-cycle on GPU NOTE: All jobs solver time only, ~65% of total time

Design Optimization How will you use all of this computing power? Higher fidelity Full assemblies More nonlinear Design Optimization Studies 13 2012 ANSYS, Inc. nvidia Technology Theater SC12

HPC Revolution HDD vs. SSDs SMP vs. DMP The right combination of algorithms and hardware leads to maximum efficiency Clusters Interconnects 14 2012 ANSYS, Inc. nvidia Technology Theater SC12 GPUs

Thank You! Improving Engineering Productivity with HPC and GPU-Accelerated Simulation Raymond Browell 724.514.3070 Ray.Browell@ansys.com 15 2012 ANSYS, Inc. nvidia Technology Theater SC12