Automated Design Exploration and Optimization + HPC Best Practices


Outline
The Path to Robust Design
ANSYS DesignXplorer
Mesh Morphing and Optimizer
RBF Morph
Adjoint Solver
HPC Best Practices

The Path to Robust Design
Robust Design is an ANSYS Advantage
Single Physics Solution: Accuracy, robustness, speed
Multiphysics Solution: Integration Platform
What If Study: Parametric Platform
Design Exploration: DOE, Response Surfaces, Correlation, Sensitivity, Unified reporting, etc.
Optimization: Algorithms, Published API
Robust Design: Six Sigma Analysis, Probabilistic Algorithms, Adjoint solver methods

What If?
Interactively adjust the parameter values and Update.
Needed for "What If?": Parametric CAD Connections; Pervasive Parameters; Persistent Updates; Managed State, Update Mechanisms; Remote Solve Manager (RSM).
Parametric Persistence is an ANSYS Advantage!

Design Exploration
Design Exploration is an ANSYS Advantage

Optimization
Optimal Candidates

Six Sigma Analysis
Exhaust manifold design: thermal stress, pressure and flow velocity.
Input Parameters (parametric geometry, with uncertainty): outlet diameter of the manifold, thickness at inlet, external temperature, engine RPM.
Response Parameters: max flow temperature, max deformation, max von Mises stress.
Requirement: maximum displacement should not exceed 1.5 mm.
Result: all samples report max deformation below 1.5 mm.
Figure: response surface showing the effect of engine speed and thickness at outlet on the maximum deformation.

ANSYS DesignXplorer

ANSYS DesignXplorer
DesignXplorer is everything under this Parameter bar.
Low cost & easy to use!
It drives Workbench.
Improves the ROI!
Figure labels: ANSYS Workbench, Solvers, DX.

Design of Experiments
With little more effort than for a single run, you can use DesignXplorer to create a DOE and run many variations.
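
At its simplest, a DOE is a table of parameter combinations to evaluate. The sketch below builds a full-factorial table in plain Python. The parameter names and levels are hypothetical, and this is only an illustration of the idea, not how DesignXplorer generates its designs.

```python
from itertools import product


def full_factorial(levels):
    """Return every combination of the given parameter levels as a list of dicts."""
    names = list(levels)
    return [dict(zip(names, combo)) for combo in product(*(levels[n] for n in names))]


# Hypothetical input parameters and levels, for illustration only.
levels = {
    "inlet_diameter_mm": [20.0, 25.0, 30.0],
    "wall_thickness_mm": [2.0, 3.0],
}

design_points = full_factorial(levels)
for dp in design_points:
    print(dp)
```

Each dictionary corresponds to one design point that would be solved; the number of points grows as the product of the level counts, which is why sparser DOE schemes are often preferred for many parameters.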

Correlation Matrix
Understand how your parameters are correlated with and influenced by other parameters!

Sensitivity
Understand which parameters your design is most sensitive to!

Response Surface
Understand the sensitivities of the output parameters (results) with respect to the input parameters.
Figures: 3D response; 2D slices of the response.
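
One common way to build a response surface from DOE results is a least-squares fit of a full quadratic polynomial. The sketch below does this with NumPy for two inputs and synthetic data. The data, the function names and the choice of a quadratic model are assumptions made for illustration; DesignXplorer offers its own response surface types.

```python
import numpy as np


def quadratic_features(X):
    """Columns: 1, x1, x2, x1^2, x1*x2, x2^2 for a two-input quadratic surface."""
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([np.ones_like(x1), x1, x2, x1**2, x1 * x2, x2**2])


def fit_response_surface(X, y):
    """Least-squares fit of the quadratic coefficients to sampled results."""
    coeffs, *_ = np.linalg.lstsq(quadratic_features(X), y, rcond=None)
    return coeffs


def predict(coeffs, X):
    """Evaluate the fitted surface at new input points."""
    return quadratic_features(X) @ coeffs


# Synthetic "DOE results" on a 5 x 5 grid, generated from a known quadratic.
grid = np.linspace(-1.0, 1.0, 5)
X = np.array([[a, b] for a in grid for b in grid])
y = 2.0 + 0.5 * X[:, 0] - 1.5 * X[:, 1] + 0.8 * X[:, 0] ** 2 + 0.3 * X[:, 0] * X[:, 1]

coeffs = fit_response_surface(X, y)
print(coeffs)
```

Once fitted, the surface can be evaluated cheaply at any input combination, which is what makes sensitivity plots and optimization on the surface inexpensive compared with new solver runs.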

Goal-Driven Optimization
Use an optimization algorithm or screening to understand tradeoffs or discover optimal design candidates!

Robustness Evaluation
Input parameters have variation! Output parameters vary also! Make sure your design is robust (Six Sigma, TQM).
Understand how your performance will vary with your design tolerances.
Predict how many parts will likely fail.
Understand which inputs require the greatest control.
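
The question of how many parts are likely to fail can be illustrated with a simple Monte Carlo estimate. In the sketch below the response function, the input distributions and their parameters are all hypothetical stand-ins (in practice the response would come from a fitted response surface or the solver); only the 1.5 mm deformation limit is taken from the Six Sigma example earlier in this deck.

```python
import random


def max_deformation_mm(thickness_mm, rpm):
    """Hypothetical surrogate for max deformation; not a real response surface."""
    return 1.2 + 0.0003 * (rpm - 3000.0) - 0.5 * (thickness_mm - 3.0)


def failure_fraction(n_samples, limit_mm, seed=0):
    """Fraction of sampled designs whose deformation exceeds the limit."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(n_samples):
        thickness = rng.gauss(3.0, 0.1)   # assumed manufacturing tolerance
        rpm = rng.gauss(3000.0, 300.0)    # assumed operating variation
        if max_deformation_mm(thickness, rpm) > limit_mm:
            failures += 1
    return failures / n_samples


fraction = failure_fraction(20000, limit_mm=1.5)
print(f"Estimated failure fraction: {fraction:.4f}")
```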

Example 1: Slit Die
Need uniform outflow. Minimize pressure drop.
Parameters: P1, P2, P3.
Figure: flow uniformity versus pressure drop, with bad and good regions marked.

Example 2: Combustor
3 parameters: Diffuser Length, Exit Height, Dump Gap.
Objectives: minimize pressure loss; minimize Mach number.
Figure labels: Inlet, Outlet, Sensitivity.

Mesh Morphing and Optimizer

Fluent Morpher-Optimization Feature
Allows users to optimize product design based on shape deformation to achieve a design objective.
Based on a free-form deformation tool coupled with various optimization methods.

Mesh Morphing
Applies a geometric design change directly to the mesh in the solver.
Uses a Bernstein polynomial-based morphing scheme.
Free-form mesh deformation defined on a matrix of control points leads to a smooth deformation.
Works on all mesh types (Tet/Prism, CutCell, HexaCore, Polyhedral).
User prescribes the scale and direction of deformations to control points distributed evenly through the rectilinear region.
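
The idea of a Bernstein polynomial-based free-form deformation can be sketched in two dimensions as follows. The function names, the 3 x 3 control lattice and the displacement values are assumptions for illustration; this is a textbook-style sketch, not Fluent's implementation.

```python
from math import comb


def bernstein(n, i, u):
    """Bernstein basis polynomial B_i^n(u) for 0 <= u <= 1."""
    return comb(n, i) * u**i * (1.0 - u) ** (n - i)


def morph_point(point, box_min, box_max, ctrl_disp):
    """Displace a 2D point using control-point displacements ctrl_disp[i][j] = (dx, dy).

    Control points are spread evenly over the rectangle box_min..box_max.
    Points outside the rectangle are returned unchanged.
    """
    n = len(ctrl_disp) - 1
    m = len(ctrl_disp[0]) - 1
    s = (point[0] - box_min[0]) / (box_max[0] - box_min[0])
    t = (point[1] - box_min[1]) / (box_max[1] - box_min[1])
    if not (0.0 <= s <= 1.0 and 0.0 <= t <= 1.0):
        return point
    dx = dy = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            w = bernstein(n, i, s) * bernstein(m, j, t)
            dx += w * ctrl_disp[i][j][0]
            dy += w * ctrl_disp[i][j][1]
    return (point[0] + dx, point[1] + dy)


# 3 x 3 lattice on the unit square; only the centre control point moves (hypothetical values).
ctrl = [[(0.0, 0.0) for _ in range(3)] for _ in range(3)]
ctrl[1][1] = (0.0, 0.2)

centre = morph_point((0.5, 0.5), (0.0, 0.0), (1.0, 1.0), ctrl)
corner = morph_point((0.0, 0.0), (0.0, 0.0), (1.0, 1.0), ctrl)
print(centre, corner)
```

Because the basis functions are smooth and sum to one, moving a single control point produces a smooth bump in the mesh that fades towards the edges of the region, which is the behaviour described above.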

Process
What if?: Setup Case, Run, Setup Morph (Regions, Parameters, Deformation), Morph, Evaluate, Choose best design.
OR
Optimizer: Setup Case, Run, Setup Optimizer, Optimize (Auto), Optimal Solution.

Deformation Definition
Define constraint(s) (if any).
Select control points and prescribe the relative ranges of motion.

Objective Function
Objective Function: Equal flow rate.
Figures: Baseline Design; Optimized Design.

Optimizer
Algorithms: Compass, Powell, Rosenbrock, Simplex, Torczon.
Auto Optimize!
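
The algorithms listed are derivative-free methods. As an illustration of the simplest of them, the sketch below is a generic compass (coordinate pattern) search in plain Python. It is a textbook-style version written for this note, with a hypothetical two-parameter objective; Fluent's own implementation and stopping rules may differ.

```python
def compass_search(f, x0, step=0.5, tol=1e-6, max_iter=10000):
    """Minimise f by polling +/- step along each coordinate; halve step when stuck."""
    x = list(x0)
    fx = f(x)
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for i in range(len(x)):
            for sign in (1.0, -1.0):
                trial = list(x)
                trial[i] += sign * step
                f_trial = f(trial)
                if f_trial < fx:
                    # Accept the better point immediately and keep polling from it.
                    x, fx, improved = trial, f_trial, True
        if not improved:
            step *= 0.5
    return x, fx


def objective(p):
    """Hypothetical smooth objective standing in for a CFD-derived quantity."""
    return (p[0] - 1.3) ** 2 + (p[1] + 2.1) ** 2


best_x, best_f = compass_search(objective, [0.0, 0.0])
print(best_x, best_f)
```

In a morphing workflow each call to the objective would correspond to morphing the mesh with the trial control-point displacements and re-running the flow solution, so the number of evaluations matters far more than it does in this toy example.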

Example: L-Shaped Duct
Application: L-shaped duct.
Objective Function: Uniform flow at the outlet.
Significant improvement in flow uniformity.

RBF Morph

RBF Morph Features
The user-friendly RBF Morph add-on module is fully integrated within Fluent (GUI, TUI & solving stage) and Workbench.
Mesh-independent RBF fit used for surface mesh morphing and volume mesh smoothing.
Parallel calculation allows large models (many millions of cells) to be morphed in a short time.
Management of every kind of mesh element type (tetrahedral, hexahedral, polyhedral, etc.).
Ability to convert morphed mesh surfaces back into CAD.
Multi fit makes the Fluent case truly parametric (only 1 mesh is stored).
Precision: exact nodal movement and exact feature preservation.

How Does RBF Morph Work?
A system of radial functions is used to fit a solution for the mesh movement/morphing from a list of source points and their prescribed displacements.
Radial Basis Function interpolation is used to derive the displacement at any location in space.
The RBF problem definition is mesh independent.
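
The fit-then-interpolate idea can be sketched with NumPy as below. The Gaussian kernel, the shape parameter `eps`, the source points and the function names are assumptions chosen for a compact example; the RBF Morph product supports its own set of radial functions and typically adds a polynomial term that is omitted here.

```python
import numpy as np


def gaussian(r, eps=1.0):
    """Gaussian radial function; an assumed kernel choice for this sketch."""
    return np.exp(-((eps * r) ** 2))


def fit_rbf(sources, displacements, eps=1.0):
    """Solve for RBF weights so the field matches the prescribed displacements."""
    r = np.linalg.norm(sources[:, None, :] - sources[None, :, :], axis=-1)
    return np.linalg.solve(gaussian(r, eps), displacements)


def evaluate_rbf(points, sources, weights, eps=1.0):
    """Interpolated displacement at arbitrary points (e.g. mesh nodes)."""
    r = np.linalg.norm(points[:, None, :] - sources[None, :, :], axis=-1)
    return gaussian(r, eps) @ weights


# Hypothetical source points: four fixed corners and one moved interior point.
sources = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]])
displacements = np.array([[0.0, 0.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.1]])

weights = fit_rbf(sources, displacements)
mesh_nodes = np.array([[0.25, 0.25], [0.5, 0.75]])
print(evaluate_rbf(mesh_nodes, sources, weights))
```

Note that the fit uses only the source points and their displacements, never the mesh connectivity, which is the sense in which the problem definition is mesh independent. Evaluating the field at the source points reproduces the prescribed displacements.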

RBF Morph is Integrated with Fluent

Example 1: Internal Flow
Here, a pipe is projected onto a previously defined STL shape.

Example 2: External Flow
Ship sail rotation (courtesy of Ignazio Maria Viola).

Example 3: External Flow

Example 4: 50:50:50 Optimisation
Courtesy of Volvo Cars.
Conceptual study recently conducted by ANSYS in conjunction with Volvo Cars.
50 million cell hybrid mesh of Volvo XC60.
50 design variants investigated using the RBF Morph add-on for ANSYS Fluent and Workbench DesignXplorer.
50 hours total clock time to complete full optimisation on HPC cluster.
~4% reduction in total drag force.

Example 5: Ship Hull Optimisation
Courtesy of Leeds University.
Ship hull hydrodynamics optimisation study.
Block hexahedral mesh from ICEM CFD.
50 design variants investigated using the RBF Morph add-on for ANSYS Fluent and Workbench DesignXplorer.
7.9% reduction in total drag force.

Adjoint Solver

Adjoint Solution?
An adjoint solver allows specific information about a fluid system to be computed that is very difficult to gather otherwise.
The adjoint solution itself is a set of derivatives. They are not particularly useful in their raw form and must be post-processed appropriately.
The derivative of an engineering quantity with respect to all of the inputs for the system can be computed in a single calculation. Example: sensitivity of the drag on an airfoil to its shape.
There are 4 main ways in which these derivatives can be used:
1. Qualitative guidance on what can strongly influence the performance of a system.
2. Quantitative guidance on the anticipated effect of specific design changes.
3. Guidance on important factors in solver numerics.
4. Gradient-based design optimization.
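
The claim that one adjoint calculation yields the derivative with respect to all inputs can be illustrated on a small linear system. In the sketch below the "flow solution" is u solving A u = b, the quantity of interest is J = c . u, and the inputs are the entries of b. The matrix and vectors are made-up numbers; a real flow adjoint is far larger and nonlinear, but the structure is the same.

```python
import numpy as np

# Hypothetical "flow problem": solve A u = b, then observe J = c . u
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
b = np.array([1.0, 2.0, 3.0])
c = np.array([1.0, 0.0, 1.0])


def observable(b_vec):
    """Quantity of interest J as a function of the inputs b."""
    return c @ np.linalg.solve(A, b_vec)


# One adjoint solve, A^T lam = c, gives dJ/db for every entry of b at once.
adjoint_sensitivity = np.linalg.solve(A.T, c)

# Check against finite differences, which need one extra solve per input.
h = 1e-6
fd_sensitivity = np.array(
    [(observable(b + h * e) - observable(b)) / h for e in np.eye(len(b))]
)

print(adjoint_sensitivity)
print(fd_sensitivity)
```

The finite-difference check costs one solve per input, whereas the adjoint cost is independent of the number of inputs. That is what makes sensitivities with respect to every surface node practical.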

How to Use the Results - Qualitative
GOAL: Identify features of a system design that are most influential in the performance of the system.
EXAMPLE: Sensitivity of the drag on a NACA 0012 airfoil to changes in the shape of the airfoil.
The shape sensitivity field is extracted from the adjoint solution in a post-processing step.
High sensitivity: changes to shape have a big effect on drag.
Low sensitivity: changes to shape have a small effect on drag.

How to Use the Results - Quantitative
GOAL: Identify specific system design changes that benefit the performance and quantify the improvement in performance that is anticipated.
EXAMPLE: Design modifications to turning vanes in a 90 degree elbow to reduce the total pressure drop.
The optimal adjustment that is made to the shape is defined by the shape sensitivity field (steepest descent algorithm).
The effect of each change can be computed in advance based on linear extrapolation.
Baseline (original): P = -232.8 Pa.
Expected change computed using the adjoint and linear extrapolation = 10.0 Pa.
Make the change and recompute the solution (modified). Actual change = 9.0 Pa.

How to Use the Results - Solver Numerics
GOAL: Identify aspects of the solver numerics and computational mesh that have a strong influence on computed quantities of engineering interest.
EXAMPLE: Use the adjoint solution to identify parts of the mesh where mesh adaption will benefit the computed drag by reducing the influence of discretization errors.
Figures: Baseline Mesh; Adapted Mesh; Adapted Mesh Detail.

How to Use the Results - Optimization
GOAL: Perform a sequence of automated design modifications to improve a specific performance measure for a system.
EXAMPLE: Gradient-based optimization of the total pressure drop in a pipe.
The flow solution and the adjoint are recomputed at each design iteration.
30% reduction in total pressure drop after 30 design iterations.
Figure: total pressure drop p tot [Pa] versus design iteration (0 to 30), from initial design to final design.

Mesh Morphing
Once a desired change to the geometry of the system has been selected, how is that change to be made?
Mesh morphing provides a convenient and powerful means of changing the geometry and the computational mesh.
Use the Bernstein polynomial-based morphing scheme discussed earlier.

Mesh Morphing & Adjoint Data
Example: Sensitivity of lift to surface shape.
Select portions of the geometry to be modified.
Adjoint to deformation operation: surface shape sensitivity becomes control point sensitivity (chain rule for differentiation).
Benefit of this approach is two-fold:
Smooths the surface sensitivity field.
Provides a smooth interior and boundary mesh deformation.
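
The chain-rule step can be sketched in one dimension. If each surface node position depends on the control points through Bernstein weights, x_k = sum_i B_i(u_k) c_i, then dJ/dc_i = sum_k B_i(u_k) dJ/dx_k. The node parameters, the noisy sensitivity values and the function names below are hypothetical.

```python
from math import comb


def bernstein(n, i, u):
    """Bernstein basis polynomial B_i^n(u) for 0 <= u <= 1."""
    return comb(n, i) * u**i * (1.0 - u) ** (n - i)


def control_point_sensitivity(node_params, node_sensitivity, n_ctrl):
    """Chain rule: dJ/dc_i = sum over nodes k of B_i(u_k) * dJ/dx_k."""
    n = n_ctrl - 1
    return [
        sum(bernstein(n, i, u) * g for u, g in zip(node_params, node_sensitivity))
        for i in range(n_ctrl)
    ]


# Hypothetical noisy surface sensitivities at 11 nodes along a wall.
node_params = [k / 10 for k in range(11)]
node_sensitivity = [0.5 * u + 0.1 * (-1.0) ** k for k, u in enumerate(node_params)]

ctrl_sensitivity = control_point_sensitivity(node_params, node_sensitivity, n_ctrl=4)
print(ctrl_sensitivity)
```

The node-to-node oscillation in the surface data is averaged out by the smooth basis functions, which is the smoothing benefit noted above. Since the weights sum to one at every node, the total sensitivity is preserved.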

Mesh Morphing, Adjoint Data & Constraints
The adjoint solution is determined based on the specific flow physics of the problem in hand.
The effect of other practical engineering constraints must be reconciled with the adjoint data to decide on an allowable design change.
Example: Some walls within the control volume may be constrained not to move. A minimal adjustment is made to the control-point sensitivity field so that deformation of the fixed walls is eliminated.
Figure labels: Fixed wall; Fixed wall; Moveable walls.

Current Functionality
The adjoint solver is released with all Fluent 14 packages.
Documentation is available: Theory, Usage, Tutorial, Case study.
Training is available.
Functionality is activated by loading the adjoint solver add-on module. A new menu item is added at the top level.

Current Functionality: Application Drivers
Key initial application areas are:
Low-speed external aerodynamics: F1 (increase downforce); production automobiles (decrease drag).
Low-speed internal flows: total pressure drop (reduce losses).
In Fluent 14.5 a mechanism for users to define a wide range of observables of interest will be provided: forces, moments, pressure drop, swirl, ratios, products, variances, linear combinations, unary operations.

Current Scope
The ANSYS Fluent flow solver has very broad scope.
The adjoint is configured to compute solutions based on some assumptions:
Steady, incompressible, laminar flow.
Steady, incompressible, turbulent flow with standard wall functions.
First-order discretization in space.
Frozen turbulence.
The primary flow solution does NOT need to be run with these restrictions.
Strong evidence that these assumptions do not undermine the utility of the adjoint solution data for engineering purposes.
Fully parallelized.
Gradient algorithm for shape modification: mesh morphing using control points.
Adjoint-based solution adaption.

Example 1: Automotive Aerodynamics
Figures: surface maps of the drag sensitivity to shape changes.

Example 2: Pressure Drop in a Duct
Total Pressure Drop (Pa)
Geometry   Predicted   Result
Original   ---         -22.0
Modified   -14.8       -18.3
Aggressive adjustment results in a 17% reduction in loss in just one design iteration.
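
The 17% figure follows directly from the two computed results in the table. A short check, using only the values quoted on the slide:

```python
# Total pressure drop results from the slide (Pa).
original_result = -22.0
modified_result = -18.3
modified_predicted = -14.8

# Reduction in loss, relative to the magnitude of the original loss.
reduction = (abs(original_result) - abs(modified_result)) / abs(original_result)
print(f"Reduction in loss: {reduction:.1%}")

# The linear (adjoint-based) prediction anticipated a larger improvement than was realised.
predicted_change = modified_predicted - original_result
actual_change = modified_result - original_result
print(f"Predicted change: {predicted_change:.1f} Pa, actual change: {actual_change:.1f} Pa")
```

The gap between the predicted and actual change is consistent with the slide's description of the adjustment as aggressive: the linear extrapolation is less accurate for large design steps.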

HPC Best Practices

Guidelines:
Know your hardware lifecycle.
Have a goal in mind for what you want to achieve.
Use licensing productively.
Use ANSYS-provided processes effectively.

Hardware Considerations
This section is meant to provide an overview of the different hardware components and how they can affect solution time. Hopefully this will give you some of the tools to understand the benchmark numbers in better detail.
ANSYS would always recommend that the best thing to do before buying a system is to look at the latest benchmarks. If you are not sure, please ask.

Effect of Clock Speed
Figure: Impact of CPU clock on application performance (higher is better). Processor: Xeon X5600 series; Hyper-Threading: OFF; Turbo: ON; active cores: 12/node; memory speed: 1333 MHz. Performance measure is improvement relative to a CPU clock of 2.66 GHz. Series: 2.66 GHz, 2.93 GHz, 3.47 GHz. ANSYS/FLUENT models: eddy_417k, aircraft_2m, turbo_500k, sedan_4m, truck_14m.

Effect of Memory Speed
We can see here the effect of memory speed. This has implications for how you build your hardware. Some processor types have slower memory speeds by default. On other processors, filling the memory channels non-optimally can slow the memory speed.
Figure: Impact of DIMM speed on ANSYS/FLUENT application performance (Intel Xeon X5670, 2.93 GHz). Hyper-Threading: OFF; Turbo: ON; active threads per node: 12. Performance measure is improvement relative to a memory speed of 1066 MHz. Series: 1066 MHz, 1333 MHz. ANSYS/FLUENT models: eddy_417k, turbo_500k, aircraft_2m, sedan_4m, truck_14m.

Turbo Boost (Intel) / Turbo Core (AMD)
Turbo Boost (Intel) / Turbo Core (AMD) is a form of over-clocking that raises the clock speed of individual cores when others are idle.
With Intel processors we have seen variable performance with this, ranging from 0 to 8% improvement depending on the number of cores in use.
The graph for CFX on an Intel X5550 shows a maximum improvement of only 2.5%.

Hyper-Threading: ANSYS Fluent
Hyper-Threading Technology makes a single physical processor appear as two logical processors.
This is not the same as having two physical processors and does not give double the speedup.
In our tests we have seen as high as a 20% increase in performance, although the graph shows that the actual performance can be quite variable.
It is worth noting that this has licensing implications, as you would need to oversubscribe the physical cores and hence would need double the HPC licenses.
Figure: Evaluation of Hyper-Threading on ANSYS/FLUENT performance (higher is better). iDataPlex M3 (Intel Xeon X5670, 2.93 GHz); Turbo: ON. Measurement is improvement relative to Hyper-Threading OFF. Series: HT OFF (12 threads on 12 physical cores), HT ON (24 threads on 12 physical cores). ANSYS/FLUENT models: eddy_417k, turbo_500k, aircraft_2m, sedan_4m, truck_14m.

AMD vs. Intel
Traditionally Intel take the power approach in their 2 socket systems (faster cores but fewer of them per processor/socket).
Traditionally AMD take the economies of scale approach (more cores per processor but individually slower clock speeds).
Remember that this landscape changes because they are constantly in competition with each other.
Please note that whilst we do have some numbers for the new Intel Sandy Bridge chips, we do not have scaling numbers for the equivalent AMD 6200 series at the time of writing this presentation.

2 Socket vs. 4 Socket Systems
Current 4 socket systems come up slower than their 2 socket counterparts (based on Intel Westmere vs. Xeon E7-8837): clock speed slower; memory speed slower; no additional memory bandwidth.
Performance of ANSYS Fluent on two-socket and four-socket based systems. Performance measure is Fluent Rating (higher values are better).
2-socket based systems: IBM HS22/HS22V Blade, 3550/3650 M3, Dx360 M3 (Xeon 5600 Series)
Nodes   Sockets   Cores   Fluent Rating
1       2         12      88
2       4         24      173
4-socket based systems: IBM HX5 Blade, X3850 (Xeon E7-8837 series)
Nodes   Sockets   Cores   Fluent Rating
1       2         16      96
1       4         32      188
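
Because the 4-socket configurations use more cores, the raw ratings alone do not show the slowdown. Dividing the ratings in the table by the core counts makes the per-core difference explicit. The short labels below are made up for this sketch; the numbers are those from the slide.

```python
# (label, cores, Fluent Rating) taken from the table above.
systems = [
    ("2-socket, 1 node", 12, 88),
    ("2-socket, 2 nodes", 24, 173),
    ("4-socket, 2 sockets used", 16, 96),
    ("4-socket, 4 sockets used", 32, 188),
]

rating_per_core = {name: rating / cores for name, cores, rating in systems}

for name, value in rating_per_core.items():
    print(f"{name}: {value:.2f} rating per core")
```

Since HPC licensing is tied to the number of cores used, rating per core is a reasonable first measure of how much work each licensed core delivers.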

Effect of the Interconnect
When going for multiple systems linked together, the interconnect becomes an important factor. The interconnect is the fabric that connects the nodes.
The graph for FLUENT shows how quickly the performance of Gigabit Ethernet drops off.
Figure: ANSYS/FLUENT performance (FLUENT Rating, higher is better) versus number of cores used by a single job (12 to 768). iDataPlex M3 (Intel Xeon X5670, 12C, 2.93 GHz). Network: Gigabit, 10-Gigabit, 4X QDR InfiniBand (QLogic, Voltaire). Hyper-Threading: OFF; Turbo: ON. Model: truck_14m.

ANSYS Fluent Auto-Partitioning
Auto partitioning is now very quick: less than 10 s to process 800M cells!
Serial pre-partitioning step no longer required.
Time to partition, cavity case over 768 cores:
Cells      200M    400M    600M    800M
Time (s)   2.914   4.706   6.617   9.86
Time to partition, truck_111m (111M cell truck case):
Cores      192     384     768     1536
Time (s)   5.307   4.542   6.177   8.109

ANSYS CFX Partitioning
Optimize parallel partitioning in multi-core clusters (CFX) β: the partitioner determines the number of connections between partitions and optimizes partition-to-host assignments.
Re-use previous results to initialize calculations on large problems (CFX) β: large case interpolation for cases with >~100M nodes.
Clean up of coupled partitioning option for multi-domain cases (CFX): eliminates isolated partition spots; dramatically reduced partitioning times for cases with fluid-solid interfaces and very large numbers of regions.
Figure: partitions P1 to P8 assigned to Compute Node 1 and Compute Node 2. The partitioning step finds adjacency amongst partitions; partitions with maximum adjacency are grouped on the same compute nodes.

ANSYS Fluent Parallel Scalability
Consistently improved scalability across releases.
Sedan, 4M cells.
Figures: performance versus number of cores. Hardware labels: Intel Harpertown; Xeon X5560 @ 2.80GHz (Nehalem EP); Intel Westmere. Release comparisons: 6.3.0 vs. 12.0.0; 12.0.0 vs. 13.0.0; 13.0.0 vs. 14.0.0.
2011 ANSYS, Inc. May 9, 2012

ANSYS Fluent Parallel Scalability
Consistently improved scalability across releases.
Truck, 111M cells.
Figures: performance versus number of cores. Hardware labels: Intel Harpertown; SGI ICE 8400EX, Intel 6-core; Intel Westmere hex-core 2.93 GHz. Release comparisons: 6.3.0 vs. 12.0.0; 12.0.0 vs. 13.0.0; 13.0.0 vs. 14.0.0.

ANSYS Fluent Parallel Scalability on Intel
Leading performance for fluid flow simulation.
The memory bandwidth of the Intel Xeon processor E5-2600 product family allows excellent scalability and per core performance.
Support for higher speed memory DIMMs, added on-core capacity for memory loads, as well as a larger cache size are key to extending performance and scalability.
Higher memory bandwidth has a pronounced impact with fully coupled solver applications, which are the most memory intensive.
Sedan_4m is shown as an example of fully coupled solver performance. Truck_14m is representative of segregated solver performance.
The horizontal line at 1.63 represents the geomean speedup over 6 standard benchmarks.
Figure: ANSYS Fluent 14 relative performance (higher is better), 6 core Xeon X5675 (baseline, 1) vs. 8 core Xeon E5-2680, for Sedan_4m and Truck_14m; chart values 1.86 and 1.53.
Data source: Approved/published results as of February 1, 2012. See backup for details.
*Other names and brands may be claimed as the property of others

ANSYS CFX Parallel Scalability on Intel
Good scalability and more operations per clock make obtaining results on Intel Xeon E5 1.68x faster than on Intel Xeon 5600 platforms.
For the end user it is about faster turnaround or solving larger tasks with the same resources, along with lower TCO.
Figure: Intel Xeon 5650 vs. Intel Xeon E5-2680 across the benchmarks AirliftReactor, BigPipe, CombBVM, CombEDM, Cylinder, IndyCar, Internal, LeMansCar, LES_001, Pump, RadCity, RadFurnace, StageCompressor, StaticMixer100MM, StaticMixer100, StaticMixer200, StaticMixer 400k, Turbine, Wigley100.
Source: Published/submitted/approved results as of March 6, 2012. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. Configuration Details: Please reference speaker note. For more information go to http://www.intel.com/performance

Scalability with Monitors
Scalability to higher core counts for simulations with monitors, including plotting and printing.
Monitor support optimizations maintain scalability expectations.
Figure: example data for scaling with R14 monitors; hex-core mesh, F1 car, 130 million cells, monitor-enabled. Figure labels: Including Monitors; 3072 cores.

Fluids I/O
FLUENT, CFX and AUTODYN use a singular file structure. This means there is one global set of files and every process writes to them.
This methodology falls down at a large number of cores, where the file I/O becomes a bottleneck.
CFX deals with this by using inline compression (cdat).
FLUENT has both inline compression (cdat) and, introduced at v12.x, support for a Parallel File (pdat).
Parallel file system support in ANSYS FLUENT: ~10x - 20x speedup for data write. Eliminates scaling bottleneck for data intensive simulations on large clusters (e.g., transient flows).
Figure: Serial I/O vs. ANSYS FLUENT Parallel I/O.

HPC Fluids Demonstration Case
To demonstrate the 50:50:50 Method:
Volvo XC60 vehicle model. Four shape parameters.
RBF Morph (integrated within FLUENT): to define shape parameters; grid morphing in parallel.
ANSYS WorkBench (framework to automate process): to drive shape parameters; to create DOE; to perform Goal Driven Optimization.
The 50:50:50 Method:
EXTENT: 50 design points in the design space.
ACCURACY: 50 million cells used in CFD simulation of each design point.
SPEED: 50 hours total elapsed time to simulate all the design points.
One Click: the entire design space is simulated and postprocessed completely automatically after the initial baseline case setup.

HPC Fluids Demonstration Case
STEP 1: Prepare Meshed Model for Baseline Vehicle Shape.
STEP 2: CFD Solver Setup, Define Shape Parameters.
STEP 3: Generate DOE using Input Shape Parameters.
STEP 4: Morph Vehicle Shape, Run CFD Simulation.
STEP 5: Collate Data, Perform Optimization.
Mesh Morpher integrated within FLUENT. Solver (FLUENT), Optimizer (DX) & Post Processor (CFD Post) integrated within ANSYS WorkBench.

HPC Fluids Demonstration Case
Task times in seconds, by number of cores:
Task                        768 Cores   384 Cores   288 Cores   240 Cores   144 Cores
Baseline Case (i.e. Design Point 1)
  Read mesh, apply settings   225         340         365         481         228
  CFD Solution                6979        11153       14409       17256       27246
  Writing CFD data file       681         538         558         600         532
Each Subsequent Design Point
  Morph vehicle shape         84          59          65          69          100
  CFD Solution                1284        1754        2208        2630        4100
  Writing CFD data file       734         559         572         621         532
Total Run Time (Wall Clock) Needed for All 50 Design Points (Hours)
                              30.80       35.63       42.98       50.28       72.19
"Read mesh, apply settings" stands for: read volume mesh of baseline case into the CFD solver and apply solver settings.
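
The total run times in the last row can be reproduced from the per-task times: one baseline case plus 49 subsequent design points. The sketch below does that arithmetic with the values from the table; the variable names are chosen for this note.

```python
# Per-task times in seconds from the slide, keyed by core count.
# Baseline: (read mesh + settings, CFD solution, write data)
# Subsequent: (morph, CFD solution, write data)
baseline = {
    768: (225, 6979, 681),
    384: (340, 11153, 538),
    288: (365, 14409, 558),
    240: (481, 17256, 600),
    144: (228, 27246, 532),
}
subsequent = {
    768: (84, 1284, 734),
    384: (59, 1754, 559),
    288: (65, 2208, 572),
    240: (69, 2630, 621),
    144: (100, 4100, 532),
}

N_DESIGN_POINTS = 50


def total_hours(cores):
    """Wall-clock hours for the baseline plus the remaining design points."""
    seconds = sum(baseline[cores]) + (N_DESIGN_POINTS - 1) * sum(subsequent[cores])
    return seconds / 3600.0


totals = {cores: total_hours(cores) for cores in baseline}
for cores, hours in totals.items():
    print(f"{cores} cores: {hours:.2f} h")
```

The breakdown also shows that at 768 cores the data write (734 s) is more than half as long as the solve (1284 s) for each design point, which ties in with the earlier point about file I/O becoming a bottleneck at high core counts.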

HPC Fluids Demonstration Case
Compute Cluster Details
1. Intel's Endeavor Cluster
2. Intel Xeon X5670 (dual socket)
3. Clock speed 2.93 GHz
4. Six cores per socket (12 cores per node)
5. 24 GB RAM @ 1333 MHz, SMT ON, Turbo ON
6. QDR InfiniBand
7. RHEL Server Release 6.1

GPU Acceleration for CFD
Radiation viewfactor calculation (ANSYS FLUENT 14 - beta).
First capability for specialty physics: view factors, ray tracing, reaction rates, etc.
R&D focus on linear solvers and smoothers, but potential limited by Amdahl's Law.
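
Amdahl's Law bounds the overall speedup when only a fraction of the run time is accelerated. A minimal sketch follows; the 60% fraction and 10x factor are made-up illustrative numbers, not measurements from this deck.

```python
def amdahl_speedup(accelerated_fraction, acceleration_factor):
    """Overall speedup when a fraction of run time is sped up by a given factor."""
    serial_part = 1.0 - accelerated_fraction
    return 1.0 / (serial_part + accelerated_fraction / acceleration_factor)


# Hypothetical case: the linear solver is 60% of run time and runs 10x faster on a GPU.
example = amdahl_speedup(0.6, 10.0)
print(f"Overall speedup: {example:.2f}x")

# Even with an infinitely fast accelerator the speedup is capped at 1 / (1 - fraction).
upper_bound = 1.0 / (1.0 - 0.6)
print(f"Upper bound: {upper_bound:.2f}x")
```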

Getting the right setup is a balancing act.

Factors to Consider
HPC Licensing Cost
Cost of Hardware
Complexity of Deployment and Maintenance

HPC Licensing Cost
ANSYS HPC is licensed either as HPC Workgroup/Enterprise (or individually) or as HPC Packs.
Given that it is licensed per partition (which in most cases translates to a core), the best value for money comes from getting the best possible scalability per core.
When running on multiple cores, make sure you are using them as effectively as the memory bandwidth allows.

Cost of Hardware
ANSYS will, in general, recommend the best hardware for performance that gets you the best out of your licensing investment. However, you may need to make trade-offs for your budget.
2 socket systems provide the best performance but inherently more complexity (and hence cost) because of the need for high speed interconnects when in a cluster.
Current 4 socket systems have less performance than their 2 socket counterparts but are also cheaper at the low end, because they do not require high speed interconnects to reach higher core counts.

Complexity of Deployment and Maintenance
A large cluster can have significant overheads in ease of deployment and on-going maintenance costs.
A 4 socket system, whilst having less performance, may provide an easier deployment and maintenance route at the lower end and will be a better fit to what the average IT department is used to.
Often users get too caught up in per core performance and end up not getting any extra speedup at all.
It is important to purchase something you feel you can internally support.
Purchase 3rd party support for high performance clusters if you do not feel you have the skills to support them internally.

Remember the Following...
If you opt for unsupported infrastructure, this does not mean that it will not work, but you use it at your own risk. We may ask you to replicate the problem on a supported system before providing further support if you run into problems!
We recommend:
Buying supported operating systems and hardware.
Using ANSYS supported practices.
Talking to us before buying!
It is in all our interests that you get this right!

Information Available
ANSYS Partner Solutions: http://www.ansys.com/corporate/partners/partners-hpc.asp
Reference configurations, performance data, white papers, sales contact points.
Performance Data: http://www.ansys.com/benchmarks

Information Available
ANSYS Platform Support: http://www.ansys.com/services/ss-platform-support.asp
Platform support policies, supported platforms, supported hardware, tested systems.
ANSYS Virtual Demo Room: http://www.ansys.com/demoroom/ (click on HPC!)

Information Available
The Manual: sections on best practices and parallel processing for various solvers; installation walkthroughs for installing the products, parallel processing, licensing and RSM (Remote Solve Manager).
ANSYS Advantage Online Magazine

Information Available
Customer Portal: http://www1.ansys.com/customer/
Knowledge resources; installation and systems FAQs.
Customer Support: http://www1.ansys.com/customer/
Portal, email or phone.
Global ANSYS network providing comprehensive support.