Understanding Hardware Selection to Speed Up Your CFD and FEA Simulations


1 Understanding Hardware Selection to Speed Up Your CFD and FEA Simulations

2 Agenda: Why Talk About Hardware; HPC Terminology; ANSYS Workflow; Hardware Considerations; Additional Resources

3 Agenda: Why Talk About Hardware; HPC Terminology; ANSYS Workflow; Hardware Considerations; Additional Resources

4 Most Users Constrained by Hardware (Source: HPC usage survey with over 1,800 ANSYS respondents)

5 Problem Statement: "I am not achieving the performance and throughput I was expecting from my hardware & software." (Image courtesy of Intel Corporation)

6 Building a Balanced System Is the Key to Improving Your Experience. If your system is slow, so are your engineers & analysts. The components to balance: networks, storage, memory, processors.

7 What Hardware Configuration to Select? CPUs? GPUs? Clusters? Interconnects? HDD vs. SSD? SMP vs. DMP? The right combination of hardware and software leads to maximum efficiency.

8 Agenda Why Talking About Hardware HPC Terminology ANSYS Work-flow Hardware Considerations Additional resources 8

9 HPC Hardware Terminology Machine 1 (or Node 1) Machine N (or Node N) Processor 1 (or Socket 1) Processor 1 (or Socket 1) Processor 2 (or Socket 2) Processor 2 (or Socket 2) GPU GPU Interconnect (GigE or InfiniBand) 9

10 Shared Memory Parallel. Machine 1 (or Node 1), Processor 1 (or Socket 1). Shared memory parallel (SMP) systems share a single global memory image that may be distributed physically across multiple cores, but is globally addressable. OpenMP is the industry standard.
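
For illustration, a minimal OpenMP sketch (generic C, not ANSYS code) of the shared-memory model described above: all threads on one machine read and write a single globally addressable array, with no message passing.

```c
#include <omp.h>
#include <stdio.h>

#define N 1000000

int main(void) {
    static double a[N];   /* one global memory image, visible to all threads */
    double sum = 0.0;

    /* Threads on the same machine update disjoint parts of the shared array. */
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        a[i] = 2.0 * i;

    /* A reduction combines per-thread partial sums; no explicit messaging. */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < N; i++)
        sum += a[i];

    printf("threads=%d sum=%.0f\n", omp_get_max_threads(), sum);
    return 0;
}
```

Compile with an OpenMP-capable compiler, e.g. gcc -fopenmp.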

11 Distributed Memory Parallel. Machine 1 (or Node 1), Processor 1 (or Socket 1). Distributed memory parallel (DMP) processing assumes that the physical memory of each process is separate from that of all other processes. Parallel processing on such a system requires some form of message-passing software to exchange data between the cores; MPI (Message Passing Interface) is the industry standard for this.
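
By contrast, a minimal MPI sketch (illustrative only): each process owns private memory, and data crosses process (and possibly node) boundaries only through explicit messages.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, size, local, remote;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    local = rank * 100;   /* private to this process; no other rank can see it */

    /* Exchange with neighbors: data moves only via explicit messages. */
    MPI_Sendrecv(&local,  1, MPI_INT, (rank + 1) % size,        0,
                 &remote, 1, MPI_INT, (rank - 1 + size) % size, 0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    printf("rank %d of %d received %d\n", rank, size, remote);
    MPI_Finalize();
    return 0;
}
```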

12 Agenda Why Talking About Hardware HPC Terminology ANSYS Work-flow Hardware Considerations Additional resources 12

13 Typical HPC Growth Path: Desktop User → Workstation and/or Server Users → Cluster Users → Cloud Solution

14 Remote Visualization. Ideal for: remote users submitting jobs from a Windows machine to a Linux cluster; local users submitting jobs to a Linux cluster; users that do not have enough power (memory or graphics) on their local workstation to build large meshes or view graphics. ANSYS 16.0 supports the following remote visualization applications: NICE Desktop Cloud Visualization (DCV) 2013 (Linux server + Linux/Windows client); OpenText Exceed onDemand 8 SP2/SP3 (Linux server + Linux/Windows client); RealVNC Enterprise Edition with VirtualGL (Linux server + Linux/Windows client); on a Windows cluster, Microsoft Remote Desktop. Hardware requirements for remote visualization servers: GPU-capable video cards; large amounts of RAM accessible for multiple users; availability when running ANSYS applications and pre/post-processing.

15 Virtual Desktop (VDI) Support. Key focus area at ANSYS (internal use & software QA). Focus on GPU pass-through: one GPU per VM, up to 8 VMs per machine (K1, K2 cards); memory constraints will limit this in any case. vGPU (NVIDIA GRID) as it matures; testing internally. Not software rendering, not shared GPU (too slow). Supported at R16.0.

16 ANSYS Remote Solve Manager (RSM): Desktop, Server, or Cluster (with 3rd-party scheduler). The Remote Solve Manager (RSM) is a GUI-based job queuing system that distributes simulation tasks to (shared) computing resources. RSM enables tasks to be: run in background mode on the local machine; sent to a remote compute machine; or broken into a series of jobs for parallel processing across a variety of computers. RSM as a scheduler: submits to RSM itself; unit of recognition: jobs (e.g. a run of a solver such as CFX, Fluent, Mechanical). RSM as a transport mechanism: submits through RSM to a high-level scheduler such as LSF, PBS Pro, Windows HPC Server 2008 R2 / 2012, or Univa Grid Engine (at R15.0); unit of recognition: cores.

17 RSM Usage Scenarios. Submission from a client to a centralized (shared) compute resource, allowing: background queuing on a centralized machine; multiple users to share a common, usually large-memory/fast machine (compared to the client machine).

18 RSM Usage Scenarios. Submission from a client to a centralized (shared) compute resource, allowing: background queuing on a centralized machine; multiple users to share a common, usually large-memory/fast machine (compared to the client machine). Submission from a client to multiple (shared) compute resources, allowing: background queuing on a centralized machine that submits to other machines (compute servers); multiple users to share user workstations (often at night) using the RSM Limit Times for Job Submission feature.

19 RSM Usage Scenarios. Submission from a client to a centralized (shared) compute resource, allowing: background queuing on a centralized machine; multiple users to share a common, usually large-memory/fast machine (compared to the client machine). Submission from a client to multiple (shared) compute resources, allowing: background queuing on a centralized machine that submits to other machines (compute servers); multiple users to share user workstations (often at night) using the RSM Limit Times for Job Submission feature. Submission from a client to a centralized (shared) compute resource with a job scheduler, allowing: background queuing on a centralized machine that submits to a job scheduler (e.g. LSF); multiple users to run multi-node jobs on shared compute resources.

20 Recent Enhancements in RSM: improved robustness and scalability; added support for Univa Grid Engine; added support for Mechanical/MAPDL restart; non-root users on Linux can now use the RSM wizard; enriched support for RSM customization; added component override for design point update; improved efficiency of design point updates. Example: parametric optimization of an intake manifold. Design objectives: equal fresh and exhaust gas mass flow distribution to each cylinder; minimize the overall pressure drop. Input parameters: radii of 3 fillets near the inlet (8 design points). Result: ~5.0x speed-up over sequential execution.

21 Guidelines: know your hardware lifecycle; have a goal in mind for what you want to achieve; use licensing productively; use ANSYS-provided processes effectively.

22 Agenda: Why Talk About Hardware; HPC Terminology; ANSYS Workflow; Hardware Considerations; Additional Resources

23 What Hardware Configuration to Select? CPUs? GPU/Phi? Clusters? Interconnects? HDD vs. SSD? SMP vs. DMP?

24 Understanding the effect of clock speed. Generally, ANSYS applications scale with clock frequency. Cost/performance argues for a high clock (but maybe not the top bin). Using a higher clock speed is always helpful to realize productivity gains. ANSYS DMP benchmarks (8 cores) show the clock effect is highest for the sparse solver.

25 Understanding the effect of memory bandwidth - Is 24 Cores Equal to 24 Cores? 3 machines x (2 sockets x 4 cores) = 24 cores (Xeon X5570) vs. 2 machines x (2 sockets x 6 cores) = 24 cores (Xeon X5670).

26 Understanding the effect of memory bandwidth - Is 24 Cores Equal to 24 Cores? 3 machines x (2 sockets x 4 cores) = 24 cores (Xeon X5570) vs. 2 machines x (2 sockets x 6 cores) = 24 cores (Xeon X5670). Consider memory per core!

27 Understanding the effect of memory bandwidth - Is 16 Cores Equal to 16 Cores? 2 machines x (2 sockets x 4 cores) = 16 cores (Xeon X5570) vs. 2 machines x (2 sockets x 4 cores) = 16 cores (Xeon X5670). Using fewer cores per node can be helpful to realize productivity gains.

28 Understanding the effect of memory bandwidth - ANSYS Mechanical Consider memory per core! 28

29 Understanding the effect of memory speed. We can see here the effect of memory speed; this has implications for how you build your hardware. Some processor types have slower memory speeds by default; on other processors, non-optimally filling the memory channels can slow the memory speed. This has an effect on memory bandwidth. Using a higher memory speed can be helpful to realize productivity gains.

30 Turbo Boost (Intel) / Turbo Core (AMD) - ANSYS CFD. Turbo Boost (Intel) / Turbo Core (AMD) is a form of over-clocking that gives more GHz to individual cores when others are idle. With Intel CPUs we have seen variable performance with this, ranging between 0-8% improvement depending on the number of cores in use. The graph below is for CFX on an Intel X5550, which only sees a maximum of 2.5% improvement.

31 Turbo Boost (Intel) / Turbo Core (AMD) - ANSYS Mechanical. Relative to 1 core, we see good performance gains in many cases by using Turbo Boost on the E5 processor family. Using Turbo Boost / Turbo Core can be helpful to realize productivity gains, particularly at lower core counts.

32 Hyper-threading. Evaluation of hyper-threading on ANSYS Fluent performance: iDataPlex M3 (Intel Xeon X5670, 2.93 GHz), Turbo ON; improvement measured relative to hyper-threading OFF. HT OFF = 12 threads on 12 physical cores; HT ON = 24 threads on 12 physical cores. (Chart: improvement due to hyper-threading, higher is better, axis 0.90-1.10, for the models eddy_417k, turbo_500k, aircraft_2m, sedan_4m, truck_14m.) Hyper-threading is NOT recommended.

33 Generation to Generation - ANSYS Mechanical. Optimized for Intel Xeon E5 v3 processors: ANSYS Mechanical 16.0 performs well on the latest Intel processor architecture. A Haswell processor-based system is 20% to 40% faster than a Sandy Bridge processor-based system for a variety of benchmarks.

34 ANSYS Fluent on Intel Ivy Bridge - Ivy Bridge vs. Sandy Bridge, Single Node. Ivy Bridge is the "tick" release of Sandy Bridge: similar micro-architecture, more cores, reduced power. Expect similar core-to-core performance on Ivy Bridge and Sandy Bridge, with improved node-to-node performance. Single-node performance of ANSYS Fluent 14.5 over six benchmark cases (turbo_500k, eddy_417k, aircraft_2m, sedan_4m, truck_14m, truck_poly_14m), 2x8-core Sandy Bridge vs. 2x12-core Ivy Bridge: a 50% performance boost matches the core count increase, and scaling is maintained at the higher core density, achieved via efficient memory use (and higher RAM speed).

35 ANSYS Fluent Ivy Bridge vs. Sandy Bridge Scaling. Multi-node performance of ANSYS Fluent 14.5 up to 192 cores: nearly identical core-to-core scaling confirms system balance for Fluent. (Chart: truck_14m Fluent solver rating vs. number of cores, Sandy Bridge vs. Ivy Bridge.)

36 Per Node vs. Per Core Comparisons. This is a 4-socket vs. 2-socket node comparison. From the per-node comparison you would assume it was better to go with the 4-socket; per core, however, the 2-socket is the better choice. Neither shows linear scalability, as they are running on all cores per node (bandwidth constrained).

37 Generation to Generation - ANSYS Fluent. ANSYS application example, case details: flow through a combustor; 12 million cells; polyhedral cells; models: realizable k-ε turbulence, pressure-based coupled, species transport, least-squares cell-based, pseudo-transient.

38 Generation to Generation - ANSYS Fluent. ANSYS application example, case details: external flow over a passenger sedan; 4 million cells; mixed cell types; models: standard k-ε turbulence; solver: pressure-based coupled, steady, Green-Gauss cell-based.

39 Recap. Faster cores mean a faster solution. Faster memory means a faster solution. Memory bandwidth is an important factor for (linear) scalability. Turbo Boost / Turbo Core modes do give some benefit, especially at low core counts per node. In general, hyper-threading should not be used because of licensing implications. Be careful when looking at comparisons: make sure you are comparing like with like!

40 What Hardware Configuration to Select? CPUs? GPU/Phi? Clusters? Interconnects? HDD vs. SSD? SMP vs. DMP?

41 Understanding the effect of the interconnect. You need fast interconnects to feed fast processors. Two main characteristics for each interconnect: latency and bandwidth. Distributed ANSYS is highly bandwidth bound. Example DISTRIBUTED ANSYS STATISTICS output (Release 14.5, Linux x64, Intel Xeon CPUs, 4 cores requested, Distributed Memory Parallel, MPI type INTELMPI; cores 0-1 on machine hpclnxsmc00 and cores 2-3 on hpclnxsmc01, working directory /data1/ansyswork): the report lists the latency time from the master to each core (microseconds) and the communication speed from the master to each core (MB/sec), contrasting the same-machine link with the QDR InfiniBand links between machines.
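
The latency and bandwidth figures in such a report can be approximated with a classic MPI ping-pong test. A minimal sketch (illustrative, not the ANSYS-internal measurement): a small message exposes latency, a large one exposes bandwidth.

```c
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define REPS 1000

int main(int argc, char **argv) {
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int sizes[] = { 8, 1 << 20 };            /* 8 B (latency), 1 MB (bandwidth) */
    char *buf = malloc(1 << 20);

    for (int s = 0; s < 2; s++) {
        MPI_Barrier(MPI_COMM_WORLD);
        double t0 = MPI_Wtime();
        for (int i = 0; i < REPS; i++) {
            if (rank == 0) {        /* rank 0 sends, then waits for the echo */
                MPI_Send(buf, sizes[s], MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(buf, sizes[s], MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            } else if (rank == 1) { /* rank 1 echoes everything back */
                MPI_Recv(buf, sizes[s], MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
                MPI_Send(buf, sizes[s], MPI_CHAR, 0, 0, MPI_COMM_WORLD);
            }
        }
        double dt = (MPI_Wtime() - t0) / (2.0 * REPS);   /* one-way time */
        if (rank == 0)
            printf("%8d bytes: %.2f us, %.1f MB/s\n",
                   sizes[s], dt * 1e6, sizes[s] / dt / 1e6);
    }
    free(buf);
    MPI_Finalize();
    return 0;
}
```

Run with one rank per machine (e.g. mpirun -np 2 with a two-node hostfile) to measure the interconnect rather than shared memory.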

42 Understanding the effect of the interconnect - ANSYS Fluent. ANSYS Fluent performance on iDataPlex M3 (Intel Xeon X5670, 12C, 2.93 GHz); networks: Gigabit, 10-Gigabit, 4X QDR InfiniBand (QLogic, Voltaire); hyper-threading OFF, Turbo ON; model: truck_14m. (Chart: Fluent rating, higher is better, vs. number of cores used by a single job, for each interconnect.)

43 Understanding the effect of the interconnect - ANSYS Fluent. Exhaust model; transient simulation with explicit time stepping for an engine startup cycle. Fujitsu PRIMERGY CX250 HPC systems (E5-2690v2 with 20 and E5-2697v2 with 24 cores per node, respectively). For CFD we can see the performance of InfiniBand vs. GigE: GigE starts to drop off after 2 nodes.

44 Understanding the effect of the interconnect - ANSYS Fluent. For CFD, 10 GigE starts to taper off after 8 nodes.

45 Understanding the effect of the interconnect - ANSYS Mechanical. V13sp-5 model: turbine geometry, 2,100K DOF, SOLID187 FEs, static nonlinear, one iteration, direct sparse solver; Linux cluster (8 cores per node). (Chart: rating in runs/day at 8, 16, 32, 64, and 128 cores, Gigabit Ethernet vs. DDR InfiniBand.)

46 Understanding the effect of the interconnect - ANSYS Mechanical. For ANSYS Mechanical, GigE does not scale to more than 1 node!

47 Understanding the effect of the interconnect - ANSYS Mechanical. GigE (Gigabit Ethernet), 1 Gbit/s (~100 MB/s): not recommended!! 10 GigE, 10 Gbit/s (~1000 MB/s): bare minimum!! Myrinet (Myricom, Inc), 2 Gbit/s (250 MB/s); Myri-10G, 10 Gbit/s (4th-generation Myrinet). InfiniBand (many vendors/speeds): SDR/DDR/QDR at 1x, 4x, 12x. RECOMMENDATION: over 1000 MB/s, especially when running on more than 4 nodes.

48 Recap. 10 GigE and InfiniBand are recommended for HPC clusters; currently, only InfiniBand is recommended for large clusters. QDR should be more than adequate for small to medium clusters; FDR for large clusters. For more than 1 node you will see performance decrease using GigE. Mechanical users should not use GigE at all if their jobs span more than one node.

49 What Hardware Configuration to Select? CPUs? GPU/Phi? Clusters? Interconnects? HDD vs. SSD? SMP vs. DMP?

50 Parallel file systems. With NFS, the server and/or master node causes an I/O bottleneck; with master-node local disk, the master node causes an I/O bottleneck; with a parallel file system, I/O scales with the cluster.

51 Parallel file systems - ANSYS Mechanical. This example uses GPFS for Mechanical; notice how it is very similar in speed to a local RAID 0 configuration (4 x 15k SAS).

52 Understanding the effect of I/O - ANSYS Fluent. Parallel I/O is based on MPI-IO and is implemented for data file read and write: a single file is written collectively by the nodes. It is suited for parallel file systems and does not work on NFS; there is support for Panasas, PVFS2, HP/SFS, IBM/GPFS, EMC/MPFS2, and Lustre. Files cannot be written directly compressed, but can be compressed asynchronously.
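
As a sketch of the collective single-file pattern described above (illustrative MPI-IO usage, not Fluent's actual implementation): each rank writes its partition of the data to disjoint offsets of one shared file, which a parallel file system can service in parallel.

```c
#include <mpi.h>
#include <stdlib.h>

#define CHUNK 1048576          /* doubles per process (example size) */

int main(int argc, char **argv) {
    int rank;
    MPI_File fh;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double *part = malloc(CHUNK * sizeof(double));   /* this rank's partition */
    for (int i = 0; i < CHUNK; i++) part[i] = rank;

    /* All ranks open the same file and write collectively at rank-dependent
       offsets; no single "master" node funnels the data. */
    MPI_File_open(MPI_COMM_WORLD, "data.bin",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
    MPI_Offset offset = (MPI_Offset)rank * CHUNK * sizeof(double);
    MPI_File_write_at_all(fh, offset, part, CHUNK, MPI_DOUBLE, MPI_STATUS_IGNORE);
    MPI_File_close(&fh);

    free(part);
    MPI_Finalize();
    return 0;
}
```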

53 Understanding the effect of I/O - ANSYS Fluent. Truck 111-million-cell case (DES model with the segregated implicit solver), write data file throughput (MB/s) on 176 cores: parallel I/O is 7x legacy NAS and 4x serial I/O. Configurations compared: legacy NAS, serial I/O, parallel I/O, and parallel I/O (RAID-10, CW). A Panasas layout is available with MPI-IO hints in Fluent.

54 Understanding the effect of I/O - ANSYS Fluent Landing Gear Noise Predictions using Scale-Resolving Simulations (180M cell model using pressure based segregated solver) 55

55 Understanding the effect of I/O - ANSYS Fluent. Asynchronous I/O for Linux Fluent: total write time is 3-5x quicker over NFS, with even larger speed-ups on bigger cases and local disk (up to 10x).
Mesh | File | Location | Async I/O | Time
15M  | Cas  | NFS      | OFF       | 217s
15M  | Cas  | NFS      | ON        | 62s
15M  | Dat  | NFS      | OFF       | 113s
15M  | Dat  | NFS      | ON        | 8s
30M  | Cas  | NFS      | OFF       | 207s
30M  | Cas  | NFS      | ON        | 75s
30M  | Dat  | NFS      | OFF       | 144s
30M  | Dat  | NFS      | ON        | 10s
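
The general mechanism behind such asynchronous writes can be sketched with POSIX AIO (a generic illustration, not Fluent's internal code): the write is queued and the solver keeps computing while the kernel drains it to disk.

```c
#include <aio.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void) {
    static char data[1 << 20];                 /* stand-in for a solver buffer */
    int fd = open("case.dat", O_CREAT | O_WRONLY, 0644);

    struct aiocb cb;
    memset(&cb, 0, sizeof(cb));
    cb.aio_fildes = fd;
    cb.aio_buf    = data;
    cb.aio_nbytes = sizeof(data);
    cb.aio_offset = 0;

    aio_write(&cb);                            /* queue the write and return */

    /* ... solver iterations would continue here instead of blocking on I/O ... */

    while (aio_error(&cb) == EINPROGRESS)      /* later: wait for completion */
        usleep(1000);
    printf("wrote %zd bytes asynchronously\n", aio_return(&cb));
    close(fd);
    return 0;
}
```

Link with -lrt on older glibc versions.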

56 Understanding the effect of I/O - ANSYS Mechanical. SP-5 (in-core) R14.5 benchmark results: rating (jobs/day) vs. #machines x #cores (1x1, 1x2, 1x4, 1x8, 1x16; memory 29 GB, 33 GB, 35.6 GB, 40.8 GB, 47.8 GB), comparing 4x SSD RAID 0 (SATA 3 Gb/s), 2x SSD RAID 0 (SATA 3 Gb/s), a single SSD (SATA 6 Gb/s), and a 7.2K RPM HDD (SATA 6 Gb/s).


59 Recap. I/O is very important for the Mechanical solver: RAID 0 is mandatory for multiple disks; SSDs are recommended for speed, then 15k SAS drives. FLUENT and CFX, for most customers, won't require fast local disk access (for most types of job). Parallel file systems can meet the requirements of both types of solver.

60 Is Your Hardware Ready for HPC? - ANSYS Mechanical. (Chart: recommended I/O bandwidth [MB/s] vs. RAM [GB] by model size, from 2 MDOF up to >6 MDOF, for 1x SAS, 2x SAS, 1x SSD, and 2x SSD configurations.)

61 What Hardware Configuration to Select? CPUs? GPU/Phi? Clusters? Interconnects? HDD vs. SSD? SMP vs. DMP?

62 DMP Outperforming SMP. Benchmark: 6 million degrees of freedom; plasticity, contact, bolt pretension; 4 load steps.

63 DMP: Good Performance at High Core Counts. Benchmark 1: 10.7 million DOF, static linear structural, 1 load step. Benchmark 2: 1 million DOF, harmonic linear structural, 4 frequencies. Hardware: Intel Xeon processors (2.9 GHz, 16 cores total), 128 GB of RAM.

64 ANSYS Mechanical 14.5 DMP - Enabling Scalability at High Core Counts. Minimum time to solution is more important than scaling. V14sp-5 model solution scalability: turbine geometry, 2.1 million DOF, static nonlinear analysis, 1 load step, 7 substeps, 25 equilibrium iterations; 8-node Linux cluster (with 8 cores per node).

65 ANSYS Mechanical 15.0 - Faster Performance at Higher Core Counts via an enhanced domain decomposition method. Improved scaling at 8 cores on an 8-node Linux cluster (8 cores and 48 GB of RAM per node, InfiniBand DDR). Speedup over R14.5, labels shown: 1.7x, 2.7x, 2.4x across the Engine (9 MDOF), Stent (520 KDOF), Clutch (160 KDOF), and Bracket (45 KDOF) models.

66 ANSYS Mechanical 15.0 - Faster Performance at Higher Core Counts via an enhanced domain decomposition method. Improved scaling at 16 cores on an 8-node Linux cluster (8 cores and 48 GB of RAM per node, InfiniBand DDR). Speedup over R14.5, labels shown: 1.8x, 3.8x, 4.0x across the Engine (9 MDOF), Stent (520 KDOF), Clutch (160 KDOF), and Bracket (45 KDOF) models.

67 ANSYS Mechanical 15.0 - Faster Performance at Higher Core Counts via an enhanced domain decomposition method. Improved scaling at 32 cores on an 8-node Linux cluster (8 cores and 48 GB of RAM per node, InfiniBand DDR). Speedup over R14.5, labels shown: 2.2x, 3.9x, 5.0x across the Engine (9 MDOF), Stent (520 KDOF), Clutch (160 KDOF), and Bracket (45 KDOF) models.

68 ANSYS Mechanical 16.0 - Faster Performance at Higher Core Counts. Continually improving core solver rating, up to 128 cores. (Chart courtesy of HP.)

69 ANSYS Mechanical 15.0 HPC & Solver Technology Improvements. Improved scalability of the distributed solver at higher core counts (coupled acoustic, 1.2 MDOF, full harmonic response). NEW: subspace eigensolver supports shared- and distributed-memory parallel technology. NEW: MSUP harmonic method for unsymmetric systems, e.g. vibro-acoustics (2.09 MDOF, first 20 modes).

70 What Hardware Configuration to Select? CPUs? GPU/Phi? Clusters? Interconnects? HDD vs. SSD? SMP vs. DMP?

71 Some Basics - ANSYS Software on NVIDIA GPUs. GPUs are accelerators and can significantly speed up your simulations; GPUs work hand in hand with CPUs. Most ANSYS GPU acceleration is user-transparent: the only requirement is to inform ANSYS of how many GPUs to use. (Schematic: a CPU with an attached GPU accelerator; the CPU begins/ends the job, the GPU manages the heavy computations.)

72 GPU Accelerator Capability - ANSYS Fluent. GPU-based model: radiation heat transfer using OptiX. GPU-based solver: coupled algebraic multigrid (AMG) PBNS linear solver. Operating systems: both Linux and Win64, for workstations and servers. Parallel methods: shared and distributed memory. Supported GPUs: Tesla K40, Tesla K80, and Quadro 6000. Multi-GPU support: full multi-GPU and multi-node support. Model suitability: unlimited (hardware dependent).

73 ANSYS Fluent on GPU - Performance of the Pressure-Based Solver. Sedan model: 3.6M mixed cells, steady turbulent external aerodynamics, coupled PBNS, double precision. CPU: Intel Xeon E5-2680 (8 cores); GPU: 2x Tesla K40. Results (jobs/day, higher is better): segregated solver on CPU only, 12; coupled solver on CPU only, 15; coupled solver on CPU + GPU, 27 (1.9x). Convergence criteria: 10e-03 for all variables. Iterations until convergence: segregated CPU, 2798 iterations (7070 s); coupled CPU, 967 iterations (5900 s); coupled CPU + GPU, 985 iterations (3150 s). NOTE: times are for the total solution until convergence.

74 ANSYS Fluent on GPU - Performance of the Pressure-Based Solver. Truck model: external aerodynamics, 14 million cells, steady k-ε turbulence, coupled PBNS, double precision. Hardware: 2 nodes, each with dual Intel Xeon E5 v3 (16 CPU cores) and dual Tesla K80 GPUs. Results (higher is better): 11 jobs/day on 64 CPU cores vs. 33 jobs/day on 56 CPU cores + 4 Tesla K80s, i.e. 200% additional productivity from GPUs, for an additional cost of about 40% over the CPU-only solution cost (simulation productivity with an HPC Workgroup 64 license).

75 ANSYS Fluent on GPU - Better Speedup on Larger Models. Truck model: external aerodynamics, steady k-ε turbulence, double-precision solver. CPU: Intel Xeon E5-2667, 12 cores per node; GPU: Tesla K40, 4 per node. Configurations: 14 million cells on 36 CPU cores vs. 36 CPU cores + 12 GPUs; 111 million cells on 144 CPU cores vs. 144 CPU cores + 48 GPUs. (Chart: ANSYS Fluent time per iteration in seconds, lower is better; labels shown include 36 s, 18 s, and 9.5 s, roughly 2x speedups from GPUs.) NOTE: reported times are per iteration.

76 NVIDIA-GPU Solution Fit for ANSYS Fluent - CFD analysis decision path. Is the analysis single-phase & flow-dominant? No → not ideal for GPUs. Yes → is it using the pressure-based coupled solver? Yes → best fit for GPUs. No (segregated solver) → is it a steady-state analysis? If so, consider switching to the pressure-based coupled solver for better performance (faster convergence) and further speedups with GPUs. Please see the next slide.

77 NVIDIA-GPU Solution Fit for ANSYS Fluent - Supported Hardware Configurations. Requirements: homogeneous process distribution; homogeneous GPU selection; the number of processes must be an exact multiple of the number of GPUs. Unsupported examples: some nodes with 16 processes and some with 12; some nodes with 2 GPUs and some with 1; 15 processes, not divisible by 2 GPUs. A minimal divisibility check is sketched below.
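
These rules reduce to a simple divisibility check. A hypothetical helper (function and parameter names are illustrative, not part of any ANSYS API):

```c
#include <stdio.h>

/* Mirrors the slide's rule: a layout is valid only if the solver process
   count is an exact multiple of the (homogeneous) GPU count. */
int gpu_layout_valid(int nprocs, int ngpus) {
    if (ngpus <= 0) return 0;
    return nprocs % ngpus == 0;
}

int main(void) {
    printf("16 procs / 2 GPUs: %s\n", gpu_layout_valid(16, 2) ? "OK" : "invalid");
    printf("15 procs / 2 GPUs: %s\n", gpu_layout_valid(15, 2) ? "OK" : "invalid");
    return 0;
}
```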

78 ANSYS Fluent - Power Consumption Study Adding GPUs to a CPU-only node resulted in 2.1x speed up while reducing energy consumption by 38% 80

79 NVIDIA-GPU Solution Fit for ANSYS Fluent. GPUs accelerate the AMG solver portion of the CFD analysis and thus benefit problems with a relatively high %AMG: coupled solvers have a high %AMG, in the range of 60-70%, and fine meshes and low-dissipation problems also have a high %AMG. In some cases, pressure-based coupled solvers offer faster convergence than segregated solvers (problem-dependent). The whole problem must fit on the GPUs for the calculations to proceed: in the pressure-based coupled solver, each million cells needs approx. 4 GB of GPU memory, so high-memory cards such as the Tesla K40 or Quadro K6000 are ideal (a sizing sketch follows). Moving scalar equations such as turbulence to the GPU may not benefit much because of low workloads (the "scalar yes" option in "amg-options"). Performance is better at lower CPU core counts: a ratio of 3 or 4 CPU cores to 1 GPU is recommended.
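
The ~4 GB per million cells rule translates into a quick sizing calculation. A sketch under that assumption (names hypothetical; real requirements vary by model and solver settings):

```c
#include <math.h>
#include <stdio.h>

/* Rough sizing from the slide's rule of thumb: the pressure-based coupled
   solver needs ~4 GB of GPU memory per million cells, and the whole problem
   must fit on the GPUs. */
int gpus_needed(double cells_millions, double gb_per_gpu) {
    double total_gb = 4.0 * cells_millions;
    return (int)ceil(total_gb / gb_per_gpu);
}

int main(void) {
    /* e.g. a 14M-cell case on 12 GB Tesla K40-class cards */
    printf("14M cells on 12 GB cards: %d GPUs\n", gpus_needed(14.0, 12.0));
    return 0;
}
```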

80 GPU Accelerator Capability - ANSYS Mechanical Supports majority of ANSYS structural mechanics solvers: Covers both sparse direct and PCG iterative solvers Only a few minor limitations Ease of use: Requires at least one supported GPU card to be installed No rebuild, no additional installation steps Performance: Offer significantly faster time to solution Should never slow down your simulation V14sp-5 Model 82

81 Influence of GPU Accelerator on Speedup - ANSYS Mechanical. Impeller model: ~2M DOF, solid FEs, normal modes analysis using cyclic symmetry, ANSYS Mechanical SMP with the Block-Lanczos solver; 4 cores + GPU = 2.4x speedup vs. 4 cores. Speaker model: ~0.7M DOF, solid FEs, vibroacoustic harmonic analysis for one frequency, ANSYS Mechanical distributed sparse solver; 4 cores + GPU = 2.7x speedup vs. 4 cores.

82 NVIDIA-GPU Solution Fit for ANSYS Mechanical. GPUs accelerate the solver part of the analysis; consequently, problems with high solver workloads benefit the most from GPUs, characterized by both high DOF counts and high factorization requirements. Models with solid elements (such as castings) and more than 500K DOF see good speedups. Performance is better in DMP mode than SMP mode. GPU and system memory both play important roles in performance. Sparse solver: bulkier and/or higher-order FE models are good candidates and will be accelerated; if the model exceeds 5M DOF, either add another GPU with 5-6 GB of memory (Tesla K20 or K20X) or use a single GPU with 12 GB of memory (Tesla K40 or Quadro K6000). PCG/JCG solver: the memory-saving (MSAVE) option must be turned off to enable GPUs; models with a lower Level of Difficulty value (Lev_Diff) are better suited for GPUs.

83 GPU Achievements - ANSYS Mechanical 16.0 Supporting the Newest GPUs. V15sp-4 model (turbine geometry, 3.2 million DOF, SOLID187 elements, static nonlinear analysis, sparse direct solver): 159 jobs/day on 8 CPU cores vs. 371 jobs/day on 6 CPU cores + K80 GPU (2.3x, higher is better). V15sp-5 model (ball grid array geometry, 6.0 million DOF, static nonlinear analysis, sparse direct solver): 135 jobs/day on 8 CPU cores vs. 247 jobs/day on 6 CPU cores + K80 GPU (1.8x). Distributed ANSYS Mechanical 16.0 with Intel Xeon E5-2697v2 2.7 GHz 8-core CPU; Tesla K80 GPU with boost clocks.

84 GPU Achievements - ANSYS Mechanical 15.0 Supporting the Newest GPUs. GPUs can offer significantly faster time to solution: higher core counts favor multiple GPUs, lower core counts favor a single GPU. (Chart courtesy of HP.)

85 GPU Achievements - ANSYS Mechanical 16.0 Supporting Xeon Phi. Background: ANSYS Mechanical 15.0 was the first commercial FEA program to support the Intel Xeon Phi coprocessor, but support was limited to shared-memory parallelism (SMP) on Linux only. R16 now supports distributed-memory parallelism (DMP) and Windows. (Chart: speedup at 1, 2, 4, 8, and 16 cores, with and without Xeon Phi.)

86 GPU Achievements - ANSYS License Scheme for GPU and Phi. Licensing examples: 1x ANSYS HPC Pack = 8 HPC tasks total (4 GPU/Phi max); valid configurations include 6 CPU cores + 2 GPU/Phi, or 4 CPU cores + 4 GPU/Phi. 2x ANSYS HPC Pack = 32 HPC tasks total (16 GPU/Phi max); e.g. 24 CPU cores + 8 GPU/Phi (total use of 2 compute nodes). (Applies to all schemes: ANSYS HPC, ANSYS HPC Pack, ANSYS HPC Workgroup.) The sketch below illustrates the pack arithmetic.
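
A small sketch of the pack-to-tasks relationship, fitted to the two examples on this slide (1 pack → 8 tasks, 2 packs → 32 tasks, GPU/Phi capped at half the tasks); the closed form 2^(2n+1) is an extrapolation from those examples, not an official formula:

```c
#include <stdio.h>

/* HPC tasks enabled by n HPC Packs, matching the slide's examples:
   tasks = 2^(2n+1) gives 8, 32, 128, 512, ...
   The slide caps GPU/Phi use at half of the enabled tasks. */
int hpc_tasks(int packs)   { return 1 << (2 * packs + 1); }
int max_gpu_phi(int packs) { return hpc_tasks(packs) / 2; }

int main(void) {
    for (int n = 1; n <= 4; n++)
        printf("%d pack(s): %4d tasks, %4d GPU/Phi max\n",
               n, hpc_tasks(n), max_gpu_phi(n));
    return 0;
}
```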

87 Maximizing Performance - Putting It Together. CPUs? GPU/Phi? Clusters? Interconnects? HDD vs. SSD? SMP vs. DMP? The right combination of hardware and software leads to maximum efficiency.

88 Maximizing Performance - ANSYS Mechanical. #1 rule: avoid waiting for I/O to complete. Always check whether a job is I/O bound or compute bound by checking the output file for CPU and elapsed times (the output reports "Total CPU time for main thread" in seconds alongside "Elapsed Time (sec)"). When elapsed time >> main-thread CPU time, the job is I/O bound: consider adding more RAM or a faster hard drive configuration. When elapsed time is close to the main-thread CPU time, the job is compute bound: consider moving the simulation to a machine with newer, faster processors; consider using Distributed ANSYS (DMP) instead of SMP; consider running on more CPU cores or possibly using GPU(s). A generic timing sketch follows.
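
The same elapsed-vs-CPU-time test can be mimicked for any code with standard timers. A generic C sketch (not ANSYS output parsing; the 2x threshold is an illustrative assumption):

```c
#include <stdio.h>
#include <time.h>

int main(void) {
    struct timespec w0, w1;
    clock_t c0 = clock();                       /* CPU time of this process */
    clock_gettime(CLOCK_MONOTONIC, &w0);        /* wall-clock (elapsed) time */

    /* ... run the workload here ... */

    clock_gettime(CLOCK_MONOTONIC, &w1);
    double cpu     = (double)(clock() - c0) / CLOCKS_PER_SEC;
    double elapsed = (w1.tv_sec - w0.tv_sec) + (w1.tv_nsec - w0.tv_nsec) / 1e9;

    /* Same heuristic as the slide: elapsed >> CPU time suggests the run is
       waiting on I/O; elapsed close to CPU time suggests it is compute bound. */
    if (elapsed > 2.0 * cpu)
        printf("likely I/O bound (elapsed %.1fs vs CPU %.1fs)\n", elapsed, cpu);
    else
        printf("likely compute bound (elapsed %.1fs vs CPU %.1fs)\n", elapsed, cpu);
    return 0;
}
```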

89 Maximizing Performance - ANSYS Mechanical. How to improve an I/O-bound simulation: first consider adding more RAM; it is always the best option for optimal performance and allows the operating system to cache file data in memory. Next consider improving the I/O configuration: you need fast hard drives to feed fast processors. Consider SSDs (higher bandwidths and extremely low seek times) and RAID configurations: RAID 0 for speed; RAID 1 or 5 for redundancy; RAID 10 for speed and redundancy.

90 Maximizing Performance - ANSYS Mechanical. Example of an I/O-bound simulation: 2.1 million DOF, nonlinear static analysis, direct sparse solver (DSPARSE); 2 Intel Xeon processors (2.6 GHz, 16 cores total); one 10k RPM HDD, one SSD; Windows 7. (Chart: relative speedup for 2 cores + HDD, 8 cores + HDD, and 8 cores + SSD at 16 GB and 128 GB of RAM; labels shown include 0.8x, 2.7x, 2.9x, and 5.9x.) Adding RAM gives the biggest gains and allows good scaling. A single SSD helps allow some scaling; not as helpful as RAM, but cheaper. Lack of RAM and a slow HDD ruin scaling.

91 Maximizing Performance - ANSYS Mechanical. How to improve a compute-bound simulation: first consider using newer, faster processors; a new CPU architecture and faster clock speeds always help. Next consider parallel processing: DMP is virtually always recommended over SMP; more computations are performed in parallel with DMP and significantly faster speedups are achieved; DMP can take advantage of all resources on a cluster, so a whole new class of problems can be solved. Finally, consider GPU acceleration, which can help accelerate critical, time-consuming computations.

92 Maximizing Performance - ANSYS Mechanical. Example of a compute-bound simulation: 2.1 million DOF, nonlinear static analysis, direct sparse solver (DSPARSE); 2 Intel Xeon processors (2.6 GHz, 16 cores total); 128 GB RAM; 1 Tesla K20c; Windows 7. (Chart: relative speedup on Xeon X5675 vs. newer Xeon E5 for 2 cores, 8 cores, and 8 cores + GPU; labels shown include 1.8x and 11.0x.) Using newer Xeons gives a big gain; using 8 cores gives faster performance; maximum performance is found by adding a GPU.

93 Maximizing Performance - ANSYS Mechanical - Balanced System for Overall Optimum Performance. 2.1 million DOF, nonlinear static analysis, direct sparse solver (DSPARSE); 2 Intel Xeon processors (2.6 GHz, 16 cores total); 16 GB RAM; SSD and SATA disks; 1 Tesla K20c; Windows 7. (Chart: relative speedup vs. 2 cores in the I/O-bound 16 GB configuration: 8 cores 2.7x, 8 cores + GPU 5.2x, 8 cores + GPU + SSD 12.5x.)

94 Maximizing Performance - ANSYS Mechanical - Balanced System for Overall Optimum Performance. Same model with 128 GB RAM: 2.1 million DOF, nonlinear static analysis, direct sparse solver (DSPARSE); 2 Intel Xeon processors (2.6 GHz, 16 cores total); SSD and SATA disks; 1 Tesla K20c; Windows 7. (Chart: relative speedup vs. 2 cores for the I/O-bound 16 GB configuration, 2.7x / 5.2x / 12.5x, against the balanced and compute-bound 128 GB configuration, with labels 5.7x, 12.0x, 24.8x, and 27.3x, for 8 cores, 8 cores + GPU, and 8 cores + GPU + SSD.)

95 Agenda: Why Talk About Hardware; HPC Terminology; ANSYS Workflow; Hardware Considerations; Additional Resources

96 Wrap-up - Hardware. An important part of specifying an HPC system is purchasing a balanced system: there is no point in spending all your money on the processor if I/O is your biggest bottleneck. You are only as good as your slowest component!

97 Scalable HPC Licensing. ANSYS HPC (per-process). ANSYS HPC Pack: an HPC product rewarding volume parallel processing for high-fidelity simulations; each simulation consumes one or more Packs, and the parallel capacity enabled increases quickly with added Packs. ANSYS HPC Workgroup: an HPC product rewarding volume parallel processing for increased simulation throughput within a single co-located workgroup; 16 or more parallel cores shared across any number of simulations on a single server; Enterprise options available to deploy and use anywhere in the world. A single HPC solution for FEA/CFD/FSI and any level of fidelity. (Chart: parallel enabled cores vs. HPC Packs per simulation.)

98 Which type of licensing is right for me? ANSYS HPC and ANSYS HPC Workgroup give flexible use of a pool of licenses. ANSYS HPC Pack gives quick scale-up but is more restrictive in how users can use it. The added flexibility is why HPC Workgroup options cost more than HPC Packs.

99 ANSYS HPC Parametric Pack License. An HPC license for running parametric FEA or CFD simulations on multiple CPU cores simultaneously, and more cost-effectively. Key benefits: the ability to automatically and simultaneously execute design points while consuming just one set of application licenses; scalable, because the number of simultaneous design points enabled increases quickly with added packs; amplifies the complete workflow, because design points can include execution of multiple applications (pre, meshing, solve, HPC, post). (Chart: number of simultaneous design points enabled vs. number of HPC Parametric Pack licenses.)

100 Additional Resources - IT Webinars Watch recorded webinars by clicking below: Understanding Hardware Selection for ANSYS 15.0 How to Speed Up ANSYS 15.0 with GPUs Intel Technologies Enabling Faster, More Effective Simulation Optimizing Remote Access to Simulation Click on webinars related to HPC/IT for more and upcoming ones! 108

101 Additional Resources - IT White Papers & Technical Briefs White Papers by clicking below: Optimizing Business Value in High-Performance Engineering Computing IBM Application Ready Solutions Reference Architecture for ANSYS Intel Solid-State Drives Increase Productivity of Product Design and Simulation Value of HPC for Ensuring Product Integrity Technical Briefs by clicking below: Parallel Scalability of ANSYS 15.0 on Hewlett-Packard Systems SGI Technology Guide for ANSYS Mechanical Analysts SGI Technology Guide for ANSYS Fluent Analysts Accelerating ANSYS Fluent 15.0 Using NVIDIA GPUs 109

102 Additional Resources - ANSYS IT Webcast Series On-demand webinars: Understanding Hardware Selection for ANSYS 15.0 How to Speed Up ANSYS 15.0 with GPUs Cloud Hosting of ANSYS: Gompute On-Demand Solutions Simplified HPC Clusters for ANSYS Users Intel Technologies Enabling Faster, More Effective Simulation Accelerating Time-to-Results with Parallel I/O Extreme Scalability for High-Fidelity CFD Simulations Methodology and Tools for Compute Performance at Any Scale Understanding Hardware Selection for Structural Mechanics Optimizing Remote Access to Simulation Scalable Storage and Data Management for Engineering Simulation 110

103 Additional Resources ANSYS Platform Support Platform Support Policies Supported Platforms Supported Hardware Tested Systems ANSYS Benchmarks 111

104 Additional Resources ANSYS Partner Solutions Reference configurations Performance data White papers Sales contact points Performance Data 112

105 Additional Resources - The Manual. Sections on best practices and parallel processing for various solvers; the Performance Guide for Mechanical; installation walkthroughs for installing the products, parallel processing, licensing, and RSM (Remote Solve Manager). Also: ANSYS Advantage online magazine.

106 Thank You! Connect with Me Connect with ANSYS, Inc. LinkedIn ANSYSInc Facebook ANSYSInc Follow our Blog ansys-blog.com 114


More information

Big Data Analytics Performance for Large Out-Of- Core Matrix Solvers on Advanced Hybrid Architectures

Big Data Analytics Performance for Large Out-Of- Core Matrix Solvers on Advanced Hybrid Architectures Procedia Computer Science Volume 51, 2015, Pages 2774 2778 ICCS 2015 International Conference On Computational Science Big Data Analytics Performance for Large Out-Of- Core Matrix Solvers on Advanced Hybrid

More information

IBM InfoSphere Streams v4.0 Performance Best Practices

IBM InfoSphere Streams v4.0 Performance Best Practices Henry May IBM InfoSphere Streams v4.0 Performance Best Practices Abstract Streams v4.0 introduces powerful high availability features. Leveraging these requires careful consideration of performance related

More information

Introduction to parallel Computing

Introduction to parallel Computing Introduction to parallel Computing VI-SEEM Training Paschalis Paschalis Korosoglou Korosoglou (pkoro@.gr) (pkoro@.gr) Outline Serial vs Parallel programming Hardware trends Why HPC matters HPC Concepts

More information

Memory-Based Cloud Architectures

Memory-Based Cloud Architectures Memory-Based Cloud Architectures ( Or: Technical Challenges for OnDemand Business Software) Jan Schaffner Enterprise Platform and Integration Concepts Group Example: Enterprise Benchmarking -) *%'+,#$)

More information

Lustre2.5 Performance Evaluation: Performance Improvements with Large I/O Patches, Metadata Improvements, and Metadata Scaling with DNE

Lustre2.5 Performance Evaluation: Performance Improvements with Large I/O Patches, Metadata Improvements, and Metadata Scaling with DNE Lustre2.5 Performance Evaluation: Performance Improvements with Large I/O Patches, Metadata Improvements, and Metadata Scaling with DNE Hitoshi Sato *1, Shuichi Ihara *2, Satoshi Matsuoka *1 *1 Tokyo Institute

More information

Headline in Arial Bold 30pt. SGI Altix XE Server ANSYS Microsoft Windows Compute Cluster Server 2003

Headline in Arial Bold 30pt. SGI Altix XE Server ANSYS Microsoft Windows Compute Cluster Server 2003 Headline in Arial Bold 30pt SGI Altix XE Server ANSYS Microsoft Windows Compute Cluster Server 2003 SGI Altix XE Building Blocks XE Cluster Head Node Two dual core Xeon processors 16GB Memory SATA/SAS

More information

Advances of parallel computing. Kirill Bogachev May 2016

Advances of parallel computing. Kirill Bogachev May 2016 Advances of parallel computing Kirill Bogachev May 2016 Demands in Simulations Field development relies more and more on static and dynamic modeling of the reservoirs that has come a long way from being

More information

Storage Update and Storage Best Practices for Microsoft Server Applications. Dennis Martin President, Demartek January 2009 Copyright 2009 Demartek

Storage Update and Storage Best Practices for Microsoft Server Applications. Dennis Martin President, Demartek January 2009 Copyright 2009 Demartek Storage Update and Storage Best Practices for Microsoft Server Applications Dennis Martin President, Demartek January 2009 Copyright 2009 Demartek Agenda Introduction Storage Technologies Storage Devices

More information

SGI Overview. HPC User Forum Dearborn, Michigan September 17 th, 2012

SGI Overview. HPC User Forum Dearborn, Michigan September 17 th, 2012 SGI Overview HPC User Forum Dearborn, Michigan September 17 th, 2012 SGI Market Strategy HPC Commercial Scientific Modeling & Simulation Big Data Hadoop In-memory Analytics Archive Cloud Public Private

More information

Computer Aided Engineering with Today's Multicore, InfiniBand-Based Clusters ANSYS, Inc. All rights reserved. 1 ANSYS, Inc.

Computer Aided Engineering with Today's Multicore, InfiniBand-Based Clusters ANSYS, Inc. All rights reserved. 1 ANSYS, Inc. Computer Aided Engineering with Today's Multicore, InfiniBand-Based Clusters 2006 ANSYS, Inc. All rights reserved. 1 ANSYS, Inc. Proprietary Our Business Simulation Driven Product Development Deliver superior

More information

Mellanox Technologies Maximize Cluster Performance and Productivity. Gilad Shainer, October, 2007

Mellanox Technologies Maximize Cluster Performance and Productivity. Gilad Shainer, October, 2007 Mellanox Technologies Maximize Cluster Performance and Productivity Gilad Shainer, shainer@mellanox.com October, 27 Mellanox Technologies Hardware OEMs Servers And Blades Applications End-Users Enterprise

More information

Accelerating HPC. (Nash) Dr. Avinash Palaniswamy High Performance Computing Data Center Group Marketing

Accelerating HPC. (Nash) Dr. Avinash Palaniswamy High Performance Computing Data Center Group Marketing Accelerating HPC (Nash) Dr. Avinash Palaniswamy High Performance Computing Data Center Group Marketing SAAHPC, Knoxville, July 13, 2010 Legal Disclaimer Intel may make changes to specifications and product

More information

FUSION1200 Scalable x86 SMP System

FUSION1200 Scalable x86 SMP System FUSION1200 Scalable x86 SMP System Introduction Life Sciences Departmental System Manufacturing (CAE) Departmental System Competitive Analysis: IBM x3950 Competitive Analysis: SUN x4600 / SUN x4600 M2

More information

Consulting Solutions WHITE PAPER Citrix XenDesktop XenApp 6.x Planning Guide: Virtualization Best Practices

Consulting Solutions WHITE PAPER Citrix XenDesktop XenApp 6.x Planning Guide: Virtualization Best Practices Consulting Solutions WHITE PAPER Citrix XenDesktop XenApp 6.x Planning Guide: Virtualization Best Practices www.citrix.com Table of Contents Overview... 3 Scalability... 3 Guidelines... 4 Operations...

More information

FROM HPC TO THE CLOUD WITH AMQP AND OPEN SOURCE SOFTWARE

FROM HPC TO THE CLOUD WITH AMQP AND OPEN SOURCE SOFTWARE FROM HPC TO THE CLOUD WITH AMQP AND OPEN SOURCE SOFTWARE Carl Trieloff cctrieloff@redhat.com Red Hat Lee Fisher lee.fisher@hp.com Hewlett-Packard High Performance Computing on Wall Street conference 14

More information

Dell PowerEdge R730xd Servers with Samsung SM1715 NVMe Drives Powers the Aerospike Fraud Prevention Benchmark

Dell PowerEdge R730xd Servers with Samsung SM1715 NVMe Drives Powers the Aerospike Fraud Prevention Benchmark Dell PowerEdge R730xd Servers with Samsung SM1715 NVMe Drives Powers the Aerospike Fraud Prevention Benchmark Testing validation report prepared under contract with Dell Introduction As innovation drives

More information

Free SolidWorks from Performance Constraints

Free SolidWorks from Performance Constraints Free SolidWorks from Performance Constraints Adrian Fanjoy Technical Services Director, CATI Josh Altergott Technical Support Manager, CATI Objective Build a better understanding of what factors involved

More information

HPC 2 Informed by Industry

HPC 2 Informed by Industry HPC 2 Informed by Industry HPC User Forum October 2011 Merle Giles Private Sector Program & Economic Development mgiles@ncsa.illinois.edu National Center for Supercomputing Applications University of Illinois

More information

The Oracle Database Appliance I/O and Performance Architecture

The Oracle Database Appliance I/O and Performance Architecture Simple Reliable Affordable The Oracle Database Appliance I/O and Performance Architecture Tammy Bednar, Sr. Principal Product Manager, ODA 1 Copyright 2012, Oracle and/or its affiliates. All rights reserved.

More information

Cluster Scalability of Implicit and Implicit-Explicit LS-DYNA Simulations Using a Parallel File System

Cluster Scalability of Implicit and Implicit-Explicit LS-DYNA Simulations Using a Parallel File System Cluster Scalability of Implicit and Implicit-Explicit LS-DYNA Simulations Using a Parallel File System Mr. Stan Posey, Dr. Bill Loewe Panasas Inc., Fremont CA, USA Dr. Paul Calleja University of Cambridge,

More information

Performance Benefits of NVIDIA GPUs for LS-DYNA

Performance Benefits of NVIDIA GPUs for LS-DYNA Performance Benefits of NVIDIA GPUs for LS-DYNA Mr. Stan Posey and Dr. Srinivas Kodiyalam NVIDIA Corporation, Santa Clara, CA, USA Summary: This work examines the performance characteristics of LS-DYNA

More information

HP and CATIA HP Workstations for running Dassault Systèmes CATIA

HP and CATIA HP Workstations for running Dassault Systèmes CATIA Whitepaper HP and NX HP and CATIA HP Workstations for running Dassault Systèmes CATIA 4AA3-xxxxENW, Created Month 20XX This is an HP Indigo digital print (optional) Table of contents 3 Introduction 3 What

More information

FUJITSU PHI Turnkey Solution

FUJITSU PHI Turnkey Solution FUJITSU PHI Turnkey Solution Integrated ready to use XEON-PHI based platform Dr. Pierre Lagier ISC2014 - Leipzig PHI Turnkey Solution challenges System performance challenges Parallel IO best architecture

More information

AcuSolve Performance Benchmark and Profiling. October 2011

AcuSolve Performance Benchmark and Profiling. October 2011 AcuSolve Performance Benchmark and Profiling October 2011 Note The following research was performed under the HPC Advisory Council activities Participating vendors: AMD, Dell, Mellanox, Altair Compute

More information

Evolving HPC Solutions Using Open Source Software & Industry-Standard Hardware

Evolving HPC Solutions Using Open Source Software & Industry-Standard Hardware CLUSTER TO CLOUD Evolving HPC Solutions Using Open Source Software & Industry-Standard Hardware Carl Trieloff cctrieloff@redhat.com Red Hat, Technical Director Lee Fisher lee.fisher@hp.com Hewlett-Packard,

More information

Recent Advances in Modelling Wind Parks in STAR CCM+ Steve Evans

Recent Advances in Modelling Wind Parks in STAR CCM+ Steve Evans Recent Advances in Modelling Wind Parks in STAR CCM+ Steve Evans Introduction Company STAR-CCM+ Agenda Wind engineering at CD-adapco STAR-CCM+ & EnviroWizard Developments for Offshore Simulation CD-adapco:

More information

Hybrid (MPP+OpenMP) version of LS-DYNA

Hybrid (MPP+OpenMP) version of LS-DYNA Hybrid (MPP+OpenMP) version of LS-DYNA LS-DYNA Forum 2011 Jason Wang Oct. 12, 2011 Outline 1) Why MPP HYBRID 2) What is HYBRID 3) Benefits 4) How to use HYBRID Why HYBRID LS-DYNA LS-DYNA/MPP Speedup, 10M

More information

Turbostream: A CFD solver for manycore

Turbostream: A CFD solver for manycore Turbostream: A CFD solver for manycore processors Tobias Brandvik Whittle Laboratory University of Cambridge Aim To produce an order of magnitude reduction in the run-time of CFD solvers for the same hardware

More information

WHITE PAPER AGILOFT SCALABILITY AND REDUNDANCY

WHITE PAPER AGILOFT SCALABILITY AND REDUNDANCY WHITE PAPER AGILOFT SCALABILITY AND REDUNDANCY Table of Contents Introduction 3 Performance on Hosted Server 3 Figure 1: Real World Performance 3 Benchmarks 3 System configuration used for benchmarks 3

More information

IBM Emulex 16Gb Fibre Channel HBA Evaluation

IBM Emulex 16Gb Fibre Channel HBA Evaluation IBM Emulex 16Gb Fibre Channel HBA Evaluation Evaluation report prepared under contract with Emulex Executive Summary The computing industry is experiencing an increasing demand for storage performance

More information

GPU TECHNOLOGY WORKSHOP SOUTH EAST ASIA 2014

GPU TECHNOLOGY WORKSHOP SOUTH EAST ASIA 2014 GPU TECHNOLOGY WORKSHOP SOUTH EAST ASIA 2014 Delivering virtualized 3D graphics apps with Citrix XenDesktop & NVIDIA Grid GPUs Garry Soriano Solution Engineer, ASEAN Citrix Systems garry.soriano@citrix.com

More information

RIGHTNOW A C E

RIGHTNOW A C E RIGHTNOW A C E 2 0 1 4 2014 Aras 1 A C E 2 0 1 4 Scalability Test Projects Understanding the results 2014 Aras Overview Original Use Case Scalability vs Performance Scale to? Scaling the Database Server

More information

InfoBrief. Dell 2-Node Cluster Achieves Unprecedented Result with Three-tier SAP SD Parallel Standard Application Benchmark on Linux

InfoBrief. Dell 2-Node Cluster Achieves Unprecedented Result with Three-tier SAP SD Parallel Standard Application Benchmark on Linux InfoBrief Dell 2-Node Cluster Achieves Unprecedented Result with Three-tier SAP SD Parallel Standard Application Benchmark on Linux Leveraging Oracle 9i Real Application Clusters (RAC) Technology and Red

More information