A Massively Parallel Two-Phase Solver for Incompressible Fluids on Multi-GPU Clusters

Size: px
Start display at page:

Download "A Massively Parallel Two-Phase Solver for Incompressible Fluids on Multi-GPU Clusters"

Transcription

1 A Massively Parallel Two-Phase Solver for Incompressible Fluids on Multi-GPU Clusters Peter Zaspel Michael Griebel Institute for Numerical Simulation Rheinische Friedrich-Wilhelms-Universität Bonn GPU Technology Conference 2012

2 Outline 1 Introduction 2 Multi-GPU Fluid Solver 3 Uncertainty Quantification in CFD

3 Two-phase flows major topic in computational fluid dynamics simulating interaction of two fluids like air & water, water & oil interesting small-scale phenomena: surface tension effects, droplet deformation, bubble dynamics large-scale studies: ship construction, river simulation

4 NaSt3DGPF - A 3D two-phase Navier-Stokes solver 3D grid-based fluid solver for the non-stationary two-phase incompressible Navier-Stokes equations simulation of two interacting fluids (e.g. air, water) with high-order methods running on HPC clusters

5 Application: Flows through hydraulic constructions Real-world sluice system was successfully modeled and simulated Joint project with German Federal Waterways Engineering and Research Institute

6 Bionics-motivated textile design development of textiles and materials which can store oil model in nature: specialized bees that transport oil

7 Application: Fluid animation Coupling NaSt3DGPF and Autodesk Maya integration of full fluid simulation pipeline in Maya graphical setup of simulation scenes automatic parallel solver run integration of computed fluid simulations photo-realistic rendering of fluid surfaces Z., Griebel: Photorealistic Visualization and Fluid Animation [...], Comp. Vis. Sci., 2012, accepted.

8 Outline 1 Introduction 2 Multi-GPU Fluid Solver 3 Uncertainty Quantification in CFD

9 NaSt3DGPF - A 3D two-phase Navier-Stokes solver We have ported our in-house fluid solver to the GPU 3D grid-based fluid solver for the non-stationary two-phase incompressible Navier-Stokes equations simulation of two interacting fluids (e.g. air, water) FD solver based on Chorin s pressure projection approach high-order space discretizations: e.g. WENO 5 th time discretizations: Runge-Kutta 3 rd, Adams-Bashforth Jacobi-preconditioned conjugate gradient solver for Poisson equation complex geometries with different boundary conditions multi-gpu MPI parallelization by domain decomposition Griebel, Z.: A multi-gpu accelerated solver for the three-dimensional two-phase incompressible Navier-Stokes equations. Comp. Sci. Res. Dev., 25(1-2):65-73, May Z., Griebel: Solving incompressible two-phase flows on multi-gpu clusters. Comp. Fluids, accept.

10 Core technique for two-phase flows: Level-set method Level-set method Representation of a free surface Γ t by a signed distance function φ R 3 R R: Γ t = {x φ(x, t) = 0} φ = 1 Fluid phase distinction by sign of level set function: Ω 2 : φ < 0 φ(x, t) > 0 for x Ω 1 φ(x, t) 0 for x Ω 2 Ω 1 : φ > 0

11 Two-phase Navier-Stokes equations PDE system u p φ ρ fluid velocity pressure level set function density ρ(φ)( t u + (u )u) = (µ(φ)s) p u = 0 t φ + u φ = 0 µ dynamic viscosity S stress tensor σ surface tension g volume forces S := u + { u} T ρ(φ) := ρ 2 + (ρ 1 ρ 2 ) H(φ) µ(φ) := µ 2 + (µ 1 µ 2 ) H(φ) σκ(φ)δ(φ) φ + ρ(φ)g 0 if φ < 0 1 H(φ) := 2 if φ = 0 1 if φ > 0 κ δ local curvature of fluid surface Dirac deltafunctional

12 Solver algorithm based on pressure projection For t = 1, 2,... do: 1 set boundary conditions for u n 2 compute intermediate velocity field u : u u n δt = (u n )u n + g + 1 ρ(φ n ) (µ(φn )S n ) 1 ρ(φ n ) σκ(φn )δ(φ n ) φ n 3 apply boundary conditions and transport level-set function: 4 reinitialize level-set function by solving φ = φ n + δt (u n φ n ) τ d + sign(φ )( d 1) = 0, 5 solve the pressure Poisson equation with φ n+1 = d: ( ) δt ρ(φ n+1 ) pn+1 = u 6 apply velocity correction: u n+1 = u δt ρ(φ n+1 ) pn+1 d 0 = φ

13 Solver algorithm based on pressure projection For t = 1, 2,... do: 1 set boundary conditions for u n 2 compute intermediate velocity field u : u u n δt = (u n )u n + g + 1 ρ(φ n ) (µ(φn )S n ) 1 ρ(φ n ) σκ(φn )δ(φ n ) φ n 3 apply boundary conditions and transport level-set function: 4 reinitialize level-set function by solving This φ = φis n + now δt (udone n φ n ) on multiple GPUs. τ d + sign(φ )( d 1) = 0, 5 solve the pressure Poisson equation with φ n+1 = d: ( ) δt ρ(φ n+1 ) pn+1 = u 6 apply velocity correction: u n+1 = u δt ρ(φ n+1 ) pn+1 d 0 = φ

14 CPU GPU porting process Our approach 1 identification of most time consuming parts of CPU code good starting point 2 stepwise porting with full CPU GPU data copy before and after GPU computation and per-method memory allocation 3 continuously: GPU code validation for each porting step 4 step-wise unification of data fields and reduction of CPU GPU data transfers 5 overall optimization Advantage first results within short period of time easy code validation during porting process

15 Design principles of the GPU code (1) General CUDA as GPU programming framework full double precision implementation linearization of 3D data fields Memory hierarchies use global memory wherever acceptable low algorithmic complexity L1 / L2 caches more and more popular and faster optimization with shared memory for most time-critical parts shmem-based parallel reduction used from SDK Compute configuration for maximized GPU occupancy use of maximum number of threads supported by symmetric multiprocessor (SM)

16 Design principles of the GPU code (2) Compute-intensive kernels high instruction count per kernel high register usage = slow kernels (example: WENO stencil) solution: precompute some parts in other kernel What remains on CPU? configuration file parser binary/visualization data file input/output parallel communication

17 Multi-GPU parallelization by domain decomposition Multi-GPU parallelization fully integrated with distributed memory MPI parallelization of CPU code: 1 GPU 1 CPU core

18 Optimizing multi-gpu data exchanges buffer on GPU buffer CPU RAM buffer on GPU on GPU buffer on GPU buffer CPU RAM Prepacking of boundary data buffer on GPU on GPU Matrix-vector product on inner cells Exchange boundary data Ax Results Matrix-vector product on boundary cells Ax Overlapping communication and computation (PCG solver)

19 Results

20 Benchmarking problem: air bubble rising in water Properties domain size: liquid phase: gas phase: surface tension: volume forces: initial air bubble radius: initial center position of bubble: 20 cm 20 cm 20 cm water at 20 o C air at 20 o C standard standard gravity 3 cm (10 cm, 6 cm, 10 cm)

21 Performance measurements for GPUs Perfectly fair CPU-GPU benchmarks are very hard! 1 GPU vs. 1 CPU core + good GPU results CPU speed unclear not realistic wrt. price Performance per dollar ++ best price realism price per node / CPU? prices subject to changes 1 GPU vs. 1 CPU socket + better price realism # of cores per socket? speed per CPU core? Performance per Watt ++ Green IT + power costs high influence of config.

22 Benchmarking platforms CPU Hardware dual-6-core Intel Xeon E5650 CPU 2.67 GHz 24 GB DDR3-RAM GPU Cluster (8 GT200 GPUs) 2 workstations with 4-core Intel Core i7-920 CPU 2.66 GHz 12 GB DDR3-RAM NVIDIA Tesla S1070 (4 GPUs) InfiniBand 40G QDR ConnectX Ubuntu Linux bit GCC compiler, CUDA 3.2 SDK, OpenMPI GPU Hardware (GF100 Fermi) 4-core Intel Xeon E5620 CPU 2.40 GHz 6 GB DDR3-RAM NVIDIA Tesla C2050 GPU GPU Cluster (48 GF100 RWTH Aachen University, Germany 24 double-gpu systems with 2 Intel Xeon X GHz 2 NVIDIA Quadro 6000 GPUs QDR InfiniBand Scientific Linux 6.1, GCC 4.4.5, OpenMPI 1.5.3, CUDA

23 Performance per dollar Speed-up on one GPU GT200 GPU vs. 6-core Xeon CPU GF100 GPU w. ECC vs. dual 6-core Xeon CPU GF100 GPU w/o ECC vs. dual 6-core Xeon CPU Simulation Grid Resolution Costs NVIDIA Tesla C1060 GPU similar to 6-core Xeon CPU new NVIDIA Tesla C2050 Fermi GPU similar to two 6-core Xeon CPUs 1 core vs. 1 GPU (w. ECC) > 36x speedup 4-core socket vs. 1 GPU (w. ECC) > 9x speedup

24 Performance per Watt Power consumption in kwh dual 6-core Xeon CPU 8 GT200 GPUs GF100 GPU with ECC GF100 GPU w/o ECC Grid resolution Fermi-type GPU more than two times more power-efficient

25 Multi-GPU performance (GT200 GPUs) Strong scaling speed-up relative to one GT200 GPU grid resolution (w/o overlap) grid resolution (w. overlap) Number of GPUs Weak scaling efficiency (in %) relative to one GPU Weak scaling efficiency (in %) relative to one GPU grid resolution per GPU (w/o overlap) grid resolution per GPU (w. overlap) grid resolution per GPU (w/o overlap) grid resolution per GPU (w. overlap) Number of GT200 GPUs strong scaling / speedup weak scaling / scale-up

26 Multi-GPU performance (Fermi GPUs) Strong scaling / speed-up relative to one Fermi GPU grid resolution Number of Fermi GPUs Weak scaling efficiency (in percent) relative to one GPU grid resolution per GPU Number of Fermi GPUs strong scaling / speed-up weak scaling / scale-up

27 Outline 1 Introduction 2 Multi-GPU Fluid Solver 3 Uncertainty Quantification in CFD

28 Uncertainty Quantification (UQ) in CFD Current standard CFD simulations for fixed and known input parameters Problem nature phenomena: input data not known exactly engineering: constructions / measurements always subject to perturbations Solution numerical modeling and analysis of uncertainties Uncertainty Quantification joint project with M. Griebel and C. Rieger

29 High-level idea of standard UQ methods Algorithmic idea: 1 sampling of stochastic input parameters according to some distribution 2 computation of hundreds or thousands of stochastic realizations (high-resolution simulations) 3 extraction of averaged data (expectation value), variance information,... as post-processing step non-intrusive method all available solvers usable without reprogramming Challenge: computation of realizations highly parallel but extremely time-intensive handling of enormous data quantities in post-processing step optimal setting for multi-gpu clusters in Exascale

30 Stochastic formulation of the Naver-Stokes equations Find stochastic functions u : D R + Ω R 3, p : D R + Ω R such that P-almost surely: 1 u(x, t, ω)+u(x, t, ω) u(x, t, ω)+ ( p(x, t, ω) µ(ω) u(x, t, ω)) = f(x, ω) t ρ(ω) on D u(x, t, ω) = 0 on D u(x, t, ω) = g(x, ω) on D x D R 3 physical domain (Ω, F, P) complete probability space Ω set of outcomes, F 2 Ω σ-algebra of events, P : F [0, 1] probability measure ρ, µ : Ω R, f, g : Ω R 3 stochastic functions

31 General setting Formulation for general setting Find stochastic functions u : D R + Ω R l such that P-almost surely: L(a(x, t, ω))u(x, t, ω) = f (x, t, ω) for all x D R d, t R + +BC L general operator depending on parameters modeled by random field a a, f : D R + Ω R s stochastic function x D R 3 physical domain (Ω, F, P) complete probability space Ω set of outcomes, F 2 Ω σ-algebra of events, P : F [0, 1] probability measure

32 Finite noise assumption Introduction of finite number of random variables: Γ M : Ω R M, ω (Y 1 (ω),..., Y M (ω)) T = (y (1),..., y (M) ) T =: y, M N Ω M := Γ M (Ω), (Ω, F, P) (Ω M, B(Ω M ), ρ M (y)), ρ M joint probability density E[Y i ] = 0, E[Y i Y j ] = δ i,j for all i, j = 1,..., M Finite dimensional representation of stochastic space: a(x, t, ω) a M (x, t, Γ M (ω)) = a M (x, t, y) f (x, t, ω) f M (x, t, Γ M (ω)) = f M (x, t, y) With further assumptions, solution can be given as: u(x, t, ω) u M (x, t, Γ M (ω)) = u M (x, t, y)

33 Stochastic RBF collocation 1. generate N(Ω) stochastic samples (collocation points) wrt. some distribution X N(Ω) := {y 1,..., y N(Ω) } Ω M 2. solve N(Ω) discretized deterministic problems L (h) (a (h) M (x, t, y i))u (h) M (x, t, y i) = f (h) M (x, t, y i) for all x D R d, t R + 3. compute Lagrange basis {l i } in trial space V XN(Ω) := span{k(, y i ) y i X N(Ω) } {l i (y)} N(Ω) i=1, l i(y) V XN(Ω), l i (y j ) = δ i,j K(, ) RBF kernel, i.e. K(y i, y j ) = k( y i y j ), 4. interpolate solution N(Ω) M (x, t, y) = u (h) (x, t, y i )l i (y) u (h) i=1 e.g. k(x) = e αx2

34 Computation of Lagrange basis {l i (y)} N(Ω) i=1 Given: samples in stochastic space: X N(Ω) := {y 1,..., y N(Ω) } Ω M Target: Lagrange basis {l i (y)} N(Ω) i=1 in trial space V XN(Ω) := span{k(, y i ) y i X N(Ω) } l i (y) V XN(Ω), l i (y j ) = δ i,j, K(y i, y j ) = k( y i y j ), k(x) = e αx2 Evaluation of Lagrange basis at point q Ω M l(q) K(y 1, y 1 ) K(y 1, y N(Ω) ) A = K(y N(Ω), y 1 ) K(y N(Ω), y N(Ω) ) Solution of a linear system for each q, l(q) = A l(q) = R(q) l 1(q). l N(Ω) (q), R(q) = K(q, y 1 ). K(q, y N(Ω) )

35 For now: expectation value Remember Expectation value N(Ω) M (x, t, y) = u (h) (x, t, y i )l i (y) u (h) i=1 E(u (h) E(u (h) M (x, t, y)) = Ω M u (h) (x, t, y i )ρ M (y)dy N(Ω) M (x, t, y)) = u (h) (x, t, y i )E(l i (y)), E(l i (y)) = l i (y)ρ M (y)dy Ω M i=1 Monte Carlo using quadrature points Q = {q 1,..., q Q } Ω M gives E(l i (y)) 1 l i (q)ρ M (y) Q q Q

36 Implementation Framework configuration files for independent fluid solver runs generated by Python scripts simulations performed with full multi-gpu code on 36 Nvidia Tesla M2090 cluster at our institute stochastics as post-processing on GPU using VTK data files produced by fluid solver GPU programming Nvidia CUDA 4.1 Thrust (transforms, reductions,... ) CULA Dense R14 (double precision LU factorization) CUBLAS (dot product with stride, not available in Thrust) custom GPU kernels (matrix setup, integration,... )

37 First results: Uncertainty in reattachment length

38 First results: Mixing in bubble reactors

39 Summary NaSt3DGPF solves the two-phase incompressible Navier-Stokes equations Code scales well on multi-gpu clusters Uncertainty Quantification profits from GPUs Thank you!

Two-Phase flows on massively parallel multi-gpu clusters

Two-Phase flows on massively parallel multi-gpu clusters Two-Phase flows on massively parallel multi-gpu clusters Peter Zaspel Michael Griebel Institute for Numerical Simulation Rheinische Friedrich-Wilhelms-Universität Bonn Workshop Programming of Heterogeneous

More information

A multi-gpu accelerated solver for the three-dimensional two-phase incompressible Navier-Stokes equations

A multi-gpu accelerated solver for the three-dimensional two-phase incompressible Navier-Stokes equations Computer Science - Research and Development manuscript No. (will be inserted by the editor) A multi-gpu accelerated solver for the three-dimensional two-phase incompressible Navier-Stokes equations Michael

More information

Solving Incompressible Two-Phase Flows on Multi-GPU Clusters

Solving Incompressible Two-Phase Flows on Multi-GPU Clusters Wegelerstraße 6 53115 Bonn Germany phone +49 228 73-3427 fax +49 228 73-7527 www.ins.uni-bonn.de P. Zaspel, M. Griebel Solving Incompressible Two-Phase Flows on Multi-GPU Clusters INS Preprint No. 1113

More information

Gradient Free Design of Microfluidic Structures on a GPU Cluster

Gradient Free Design of Microfluidic Structures on a GPU Cluster Gradient Free Design of Microfluidic Structures on a GPU Cluster Austen Duffy - Florida State University SIAM Conference on Computational Science and Engineering March 2, 2011 Acknowledgements This work

More information

High-Order Finite-Element Earthquake Modeling on very Large Clusters of CPUs or GPUs

High-Order Finite-Element Earthquake Modeling on very Large Clusters of CPUs or GPUs High-Order Finite-Element Earthquake Modeling on very Large Clusters of CPUs or GPUs Gordon Erlebacher Department of Scientific Computing Sept. 28, 2012 with Dimitri Komatitsch (Pau,France) David Michea

More information

FINITE POINTSET METHOD FOR 2D DAM-BREAK PROBLEM WITH GPU-ACCELERATION. M. Panchatcharam 1, S. Sundar 2

FINITE POINTSET METHOD FOR 2D DAM-BREAK PROBLEM WITH GPU-ACCELERATION. M. Panchatcharam 1, S. Sundar 2 International Journal of Applied Mathematics Volume 25 No. 4 2012, 547-557 FINITE POINTSET METHOD FOR 2D DAM-BREAK PROBLEM WITH GPU-ACCELERATION M. Panchatcharam 1, S. Sundar 2 1,2 Department of Mathematics

More information

Study and implementation of computational methods for Differential Equations in heterogeneous systems. Asimina Vouronikoy - Eleni Zisiou

Study and implementation of computational methods for Differential Equations in heterogeneous systems. Asimina Vouronikoy - Eleni Zisiou Study and implementation of computational methods for Differential Equations in heterogeneous systems Asimina Vouronikoy - Eleni Zisiou Outline Introduction Review of related work Cyclic Reduction Algorithm

More information

Speedup Altair RADIOSS Solvers Using NVIDIA GPU

Speedup Altair RADIOSS Solvers Using NVIDIA GPU Innovation Intelligence Speedup Altair RADIOSS Solvers Using NVIDIA GPU Eric LEQUINIOU, HPC Director Hongwei Zhou, Senior Software Developer May 16, 2012 Innovation Intelligence ALTAIR OVERVIEW Altair

More information

Large scale Imaging on Current Many- Core Platforms

Large scale Imaging on Current Many- Core Platforms Large scale Imaging on Current Many- Core Platforms SIAM Conf. on Imaging Science 2012 May 20, 2012 Dr. Harald Köstler Chair for System Simulation Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen,

More information

2.7 Cloth Animation. Jacobs University Visualization and Computer Graphics Lab : Advanced Graphics - Chapter 2 123

2.7 Cloth Animation. Jacobs University Visualization and Computer Graphics Lab : Advanced Graphics - Chapter 2 123 2.7 Cloth Animation 320491: Advanced Graphics - Chapter 2 123 Example: Cloth draping Image Michael Kass 320491: Advanced Graphics - Chapter 2 124 Cloth using mass-spring model Network of masses and springs

More information

cuibm A GPU Accelerated Immersed Boundary Method

cuibm A GPU Accelerated Immersed Boundary Method cuibm A GPU Accelerated Immersed Boundary Method S. K. Layton, A. Krishnan and L. A. Barba Corresponding author: labarba@bu.edu Department of Mechanical Engineering, Boston University, Boston, MA, 225,

More information

Center for Computational Science

Center for Computational Science Center for Computational Science Toward GPU-accelerated meshfree fluids simulation using the fast multipole method Lorena A Barba Boston University Department of Mechanical Engineering with: Felipe Cruz,

More information

Parallel Direct Simulation Monte Carlo Computation Using CUDA on GPUs

Parallel Direct Simulation Monte Carlo Computation Using CUDA on GPUs Parallel Direct Simulation Monte Carlo Computation Using CUDA on GPUs C.-C. Su a, C.-W. Hsieh b, M. R. Smith b, M. C. Jermy c and J.-S. Wu a a Department of Mechanical Engineering, National Chiao Tung

More information

Performance of Implicit Solver Strategies on GPUs

Performance of Implicit Solver Strategies on GPUs 9. LS-DYNA Forum, Bamberg 2010 IT / Performance Performance of Implicit Solver Strategies on GPUs Prof. Dr. Uli Göhner DYNAmore GmbH Stuttgart, Germany Abstract: The increasing power of GPUs can be used

More information

Flux Vector Splitting Methods for the Euler Equations on 3D Unstructured Meshes for CPU/GPU Clusters

Flux Vector Splitting Methods for the Euler Equations on 3D Unstructured Meshes for CPU/GPU Clusters Flux Vector Splitting Methods for the Euler Equations on 3D Unstructured Meshes for CPU/GPU Clusters Manfred Liebmann Technische Universität München Chair of Optimal Control Center for Mathematical Sciences,

More information

Particleworks: Particle-based CAE Software fully ported to GPU

Particleworks: Particle-based CAE Software fully ported to GPU Particleworks: Particle-based CAE Software fully ported to GPU Introduction PrometechVideo_v3.2.3.wmv 3.5 min. Particleworks Why the particle method? Existing methods FEM, FVM, FLIP, Fluid calculation

More information

Efficient Finite Element Geometric Multigrid Solvers for Unstructured Grids on GPUs

Efficient Finite Element Geometric Multigrid Solvers for Unstructured Grids on GPUs Efficient Finite Element Geometric Multigrid Solvers for Unstructured Grids on GPUs Markus Geveler, Dirk Ribbrock, Dominik Göddeke, Peter Zajac, Stefan Turek Institut für Angewandte Mathematik TU Dortmund,

More information

Flux Vector Splitting Methods for the Euler Equations on 3D Unstructured Meshes for CPU/GPU Clusters

Flux Vector Splitting Methods for the Euler Equations on 3D Unstructured Meshes for CPU/GPU Clusters Flux Vector Splitting Methods for the Euler Equations on 3D Unstructured Meshes for CPU/GPU Clusters Manfred Liebmann Technische Universität München Chair of Optimal Control Center for Mathematical Sciences,

More information

Turbostream: A CFD solver for manycore

Turbostream: A CFD solver for manycore Turbostream: A CFD solver for manycore processors Tobias Brandvik Whittle Laboratory University of Cambridge Aim To produce an order of magnitude reduction in the run-time of CFD solvers for the same hardware

More information

Towards a complete FEM-based simulation toolkit on GPUs: Geometric Multigrid solvers

Towards a complete FEM-based simulation toolkit on GPUs: Geometric Multigrid solvers Towards a complete FEM-based simulation toolkit on GPUs: Geometric Multigrid solvers Markus Geveler, Dirk Ribbrock, Dominik Göddeke, Peter Zajac, Stefan Turek Institut für Angewandte Mathematik TU Dortmund,

More information

ACCELERATING CFD AND RESERVOIR SIMULATIONS WITH ALGEBRAIC MULTI GRID Chris Gottbrath, Nov 2016

ACCELERATING CFD AND RESERVOIR SIMULATIONS WITH ALGEBRAIC MULTI GRID Chris Gottbrath, Nov 2016 ACCELERATING CFD AND RESERVOIR SIMULATIONS WITH ALGEBRAIC MULTI GRID Chris Gottbrath, Nov 2016 Challenges What is Algebraic Multi-Grid (AMG)? AGENDA Why use AMG? When to use AMG? NVIDIA AmgX Results 2

More information

DIFFERENTIAL. Tomáš Oberhuber, Atsushi Suzuki, Jan Vacata, Vítězslav Žabka

DIFFERENTIAL. Tomáš Oberhuber, Atsushi Suzuki, Jan Vacata, Vítězslav Žabka USE OF FOR Tomáš Oberhuber, Atsushi Suzuki, Jan Vacata, Vítězslav Žabka Faculty of Nuclear Sciences and Physical Engineering Czech Technical University in Prague Mini workshop on advanced numerical methods

More information

Large-scale Gas Turbine Simulations on GPU clusters

Large-scale Gas Turbine Simulations on GPU clusters Large-scale Gas Turbine Simulations on GPU clusters Tobias Brandvik and Graham Pullan Whittle Laboratory University of Cambridge A large-scale simulation Overview PART I: Turbomachinery PART II: Stencil-based

More information

International Supercomputing Conference 2009

International Supercomputing Conference 2009 International Supercomputing Conference 2009 Implementation of a Lattice-Boltzmann-Method for Numerical Fluid Mechanics Using the nvidia CUDA Technology E. Riegel, T. Indinger, N.A. Adams Technische Universität

More information

Software and Performance Engineering for numerical codes on GPU clusters

Software and Performance Engineering for numerical codes on GPU clusters Software and Performance Engineering for numerical codes on GPU clusters H. Köstler International Workshop of GPU Solutions to Multiscale Problems in Science and Engineering Harbin, China 28.7.2010 2 3

More information

Advances of parallel computing. Kirill Bogachev May 2016

Advances of parallel computing. Kirill Bogachev May 2016 Advances of parallel computing Kirill Bogachev May 2016 Demands in Simulations Field development relies more and more on static and dynamic modeling of the reservoirs that has come a long way from being

More information

MESHLESS SOLUTION OF INCOMPRESSIBLE FLOW OVER BACKWARD-FACING STEP

MESHLESS SOLUTION OF INCOMPRESSIBLE FLOW OVER BACKWARD-FACING STEP Vol. 12, Issue 1/2016, 63-68 DOI: 10.1515/cee-2016-0009 MESHLESS SOLUTION OF INCOMPRESSIBLE FLOW OVER BACKWARD-FACING STEP Juraj MUŽÍK 1,* 1 Department of Geotechnics, Faculty of Civil Engineering, University

More information

J. Blair Perot. Ali Khajeh-Saeed. Software Engineer CD-adapco. Mechanical Engineering UMASS, Amherst

J. Blair Perot. Ali Khajeh-Saeed. Software Engineer CD-adapco. Mechanical Engineering UMASS, Amherst Ali Khajeh-Saeed Software Engineer CD-adapco J. Blair Perot Mechanical Engineering UMASS, Amherst Supercomputers Optimization Stream Benchmark Stag++ (3D Incompressible Flow Code) Matrix Multiply Function

More information

3D ADI Method for Fluid Simulation on Multiple GPUs. Nikolai Sakharnykh, NVIDIA Nikolay Markovskiy, NVIDIA

3D ADI Method for Fluid Simulation on Multiple GPUs. Nikolai Sakharnykh, NVIDIA Nikolay Markovskiy, NVIDIA 3D ADI Method for Fluid Simulation on Multiple GPUs Nikolai Sakharnykh, NVIDIA Nikolay Markovskiy, NVIDIA Introduction Fluid simulation using direct numerical methods Gives the most accurate result Requires

More information

Very fast simulation of nonlinear water waves in very large numerical wave tanks on affordable graphics cards

Very fast simulation of nonlinear water waves in very large numerical wave tanks on affordable graphics cards Very fast simulation of nonlinear water waves in very large numerical wave tanks on affordable graphics cards By Allan P. Engsig-Karup, Morten Gorm Madsen and Stefan L. Glimberg DTU Informatics Workshop

More information

Efficient Tridiagonal Solvers for ADI methods and Fluid Simulation

Efficient Tridiagonal Solvers for ADI methods and Fluid Simulation Efficient Tridiagonal Solvers for ADI methods and Fluid Simulation Nikolai Sakharnykh - NVIDIA San Jose Convention Center, San Jose, CA September 21, 2010 Introduction Tridiagonal solvers very popular

More information

Shape of Things to Come: Next-Gen Physics Deep Dive

Shape of Things to Come: Next-Gen Physics Deep Dive Shape of Things to Come: Next-Gen Physics Deep Dive Jean Pierre Bordes NVIDIA Corporation Free PhysX on CUDA PhysX by NVIDIA since March 2008 PhysX on CUDA available: August 2008 GPU PhysX in Games Physical

More information

General Purpose GPU Computing in Partial Wave Analysis

General Purpose GPU Computing in Partial Wave Analysis JLAB at 12 GeV - INT General Purpose GPU Computing in Partial Wave Analysis Hrayr Matevosyan - NTC, Indiana University November 18/2009 COmputationAL Challenges IN PWA Rapid Increase in Available Data

More information

ANSYS HPC Technology Leadership

ANSYS HPC Technology Leadership ANSYS HPC Technology Leadership 1 ANSYS, Inc. November 14, Why ANSYS Users Need HPC Insight you can t get any other way It s all about getting better insight into product behavior quicker! HPC enables

More information

ANSYS HPC. Technology Leadership. Barbara Hutchings ANSYS, Inc. September 20, 2011

ANSYS HPC. Technology Leadership. Barbara Hutchings ANSYS, Inc. September 20, 2011 ANSYS HPC Technology Leadership Barbara Hutchings barbara.hutchings@ansys.com 1 ANSYS, Inc. September 20, Why ANSYS Users Need HPC Insight you can t get any other way HPC enables high-fidelity Include

More information

The 3D DSC in Fluid Simulation

The 3D DSC in Fluid Simulation The 3D DSC in Fluid Simulation Marek K. Misztal Informatics and Mathematical Modelling, Technical University of Denmark mkm@imm.dtu.dk DSC 2011 Workshop Kgs. Lyngby, 26th August 2011 Governing Equations

More information

Automated Finite Element Computations in the FEniCS Framework using GPUs

Automated Finite Element Computations in the FEniCS Framework using GPUs Automated Finite Element Computations in the FEniCS Framework using GPUs Florian Rathgeber (f.rathgeber10@imperial.ac.uk) Advanced Modelling and Computation Group (AMCG) Department of Earth Science & Engineering

More information

GPU Implementation of Elliptic Solvers in NWP. Numerical Weather- and Climate- Prediction

GPU Implementation of Elliptic Solvers in NWP. Numerical Weather- and Climate- Prediction 1/8 GPU Implementation of Elliptic Solvers in Numerical Weather- and Climate- Prediction Eike Hermann Müller, Robert Scheichl Department of Mathematical Sciences EHM, Xu Guo, Sinan Shi and RS: http://arxiv.org/abs/1302.7193

More information

HPC and IT Issues Session Agenda. Deployment of Simulation (Trends and Issues Impacting IT) Mapping HPC to Performance (Scaling, Technology Advances)

HPC and IT Issues Session Agenda. Deployment of Simulation (Trends and Issues Impacting IT) Mapping HPC to Performance (Scaling, Technology Advances) HPC and IT Issues Session Agenda Deployment of Simulation (Trends and Issues Impacting IT) Discussion Mapping HPC to Performance (Scaling, Technology Advances) Discussion Optimizing IT for Remote Access

More information

Matrix-free multi-gpu Implementation of Elliptic Solvers for strongly anisotropic PDEs

Matrix-free multi-gpu Implementation of Elliptic Solvers for strongly anisotropic PDEs Iterative Solvers Numerical Results Conclusion and outlook 1/18 Matrix-free multi-gpu Implementation of Elliptic Solvers for strongly anisotropic PDEs Eike Hermann Müller, Robert Scheichl, Eero Vainikko

More information

Adaptive-Mesh-Refinement Hydrodynamic GPU Computation in Astrophysics

Adaptive-Mesh-Refinement Hydrodynamic GPU Computation in Astrophysics Adaptive-Mesh-Refinement Hydrodynamic GPU Computation in Astrophysics H. Y. Schive ( 薛熙于 ) Graduate Institute of Physics, National Taiwan University Leung Center for Cosmology and Particle Astrophysics

More information

Technology for a better society. hetcomp.com

Technology for a better society. hetcomp.com Technology for a better society hetcomp.com 1 J. Seland, C. Dyken, T. R. Hagen, A. R. Brodtkorb, J. Hjelmervik,E Bjønnes GPU Computing USIT Course Week 16th November 2011 hetcomp.com 2 9:30 10:15 Introduction

More information

Application of GPU-Based Computing to Large Scale Finite Element Analysis of Three-Dimensional Structures

Application of GPU-Based Computing to Large Scale Finite Element Analysis of Three-Dimensional Structures Paper 6 Civil-Comp Press, 2012 Proceedings of the Eighth International Conference on Engineering Computational Technology, B.H.V. Topping, (Editor), Civil-Comp Press, Stirlingshire, Scotland Application

More information

Sampling Using GPU Accelerated Sparse Hierarchical Models

Sampling Using GPU Accelerated Sparse Hierarchical Models Sampling Using GPU Accelerated Sparse Hierarchical Models Miroslav Stoyanov Oak Ridge National Laboratory supported by Exascale Computing Project (ECP) exascaleproject.org April 9, 28 Miroslav Stoyanov

More information

ME964 High Performance Computing for Engineering Applications

ME964 High Performance Computing for Engineering Applications ME964 High Performance Computing for Engineering Applications Outlining Midterm Projects Topic 3: GPU-based FEA Topic 4: GPU Direct Solver for Sparse Linear Algebra March 01, 2011 Dan Negrut, 2011 ME964

More information

A TALENTED CPU-TO-GPU MEMORY MAPPING TECHNIQUE

A TALENTED CPU-TO-GPU MEMORY MAPPING TECHNIQUE A TALENTED CPU-TO-GPU MEMORY MAPPING TECHNIQUE Abu Asaduzzaman, Deepthi Gummadi, and Chok M. Yip Department of Electrical Engineering and Computer Science Wichita State University Wichita, Kansas, USA

More information

AmgX 2.0: Scaling toward CORAL Joe Eaton, November 19, 2015

AmgX 2.0: Scaling toward CORAL Joe Eaton, November 19, 2015 AmgX 2.0: Scaling toward CORAL Joe Eaton, November 19, 2015 Agenda Introduction to AmgX Current Capabilities Scaling V2.0 Roadmap for the future 2 AmgX Fast, scalable linear solvers, emphasis on iterative

More information

EFFICIENT SOLVER FOR LINEAR ALGEBRAIC EQUATIONS ON PARALLEL ARCHITECTURE USING MPI

EFFICIENT SOLVER FOR LINEAR ALGEBRAIC EQUATIONS ON PARALLEL ARCHITECTURE USING MPI EFFICIENT SOLVER FOR LINEAR ALGEBRAIC EQUATIONS ON PARALLEL ARCHITECTURE USING MPI 1 Akshay N. Panajwar, 2 Prof.M.A.Shah Department of Computer Science and Engineering, Walchand College of Engineering,

More information

Droplet collisions using a Level Set method: comparisons between simulation and experiments

Droplet collisions using a Level Set method: comparisons between simulation and experiments Computational Methods in Multiphase Flow III 63 Droplet collisions using a Level Set method: comparisons between simulation and experiments S. Tanguy, T. Ménard & A. Berlemont CNRS-UMR6614-CORIA, Rouen

More information

FEM techniques for interfacial flows

FEM techniques for interfacial flows FEM techniques for interfacial flows How to avoid the explicit reconstruction of interfaces Stefan Turek, Shu-Ren Hysing (ture@featflow.de) Institute for Applied Mathematics University of Dortmund Int.

More information

Realtime Water Simulation on GPU. Nuttapong Chentanez NVIDIA Research

Realtime Water Simulation on GPU. Nuttapong Chentanez NVIDIA Research 1 Realtime Water Simulation on GPU Nuttapong Chentanez NVIDIA Research 2 3 Overview Approaches to realtime water simulation Hybrid shallow water solver + particles Hybrid 3D tall cell water solver + particles

More information

Multigrid Solvers in CFD. David Emerson. Scientific Computing Department STFC Daresbury Laboratory Daresbury, Warrington, WA4 4AD, UK

Multigrid Solvers in CFD. David Emerson. Scientific Computing Department STFC Daresbury Laboratory Daresbury, Warrington, WA4 4AD, UK Multigrid Solvers in CFD David Emerson Scientific Computing Department STFC Daresbury Laboratory Daresbury, Warrington, WA4 4AD, UK david.emerson@stfc.ac.uk 1 Outline Multigrid: general comments Incompressible

More information

CUDA Accelerated Linpack on Clusters. E. Phillips, NVIDIA Corporation

CUDA Accelerated Linpack on Clusters. E. Phillips, NVIDIA Corporation CUDA Accelerated Linpack on Clusters E. Phillips, NVIDIA Corporation Outline Linpack benchmark CUDA Acceleration Strategy Fermi DGEMM Optimization / Performance Linpack Results Conclusions LINPACK Benchmark

More information

Application of GPU technology to OpenFOAM simulations

Application of GPU technology to OpenFOAM simulations Application of GPU technology to OpenFOAM simulations Jakub Poła, Andrzej Kosior, Łukasz Miroslaw jakub.pola@vratis.com, www.vratis.com Wroclaw, Poland Agenda Motivation Partial acceleration SpeedIT OpenFOAM

More information

CUDA. Fluid simulation Lattice Boltzmann Models Cellular Automata

CUDA. Fluid simulation Lattice Boltzmann Models Cellular Automata CUDA Fluid simulation Lattice Boltzmann Models Cellular Automata Please excuse my layout of slides for the remaining part of the talk! Fluid Simulation Navier Stokes equations for incompressible fluids

More information

Adaptive Mesh Astrophysical Fluid Simulations on GPU. San Jose 10/2/2009 Peng Wang, NVIDIA

Adaptive Mesh Astrophysical Fluid Simulations on GPU. San Jose 10/2/2009 Peng Wang, NVIDIA Adaptive Mesh Astrophysical Fluid Simulations on GPU San Jose 10/2/2009 Peng Wang, NVIDIA Overview Astrophysical motivation & the Enzo code Finite volume method and adaptive mesh refinement (AMR) CUDA

More information

Overview of Traditional Surface Tracking Methods

Overview of Traditional Surface Tracking Methods Liquid Simulation With Mesh-Based Surface Tracking Overview of Traditional Surface Tracking Methods Matthias Müller Introduction Research lead of NVIDIA PhysX team PhysX GPU acc. Game physics engine www.nvidia.com\physx

More information

Multi-GPU simulations in OpenFOAM with SpeedIT technology.

Multi-GPU simulations in OpenFOAM with SpeedIT technology. Multi-GPU simulations in OpenFOAM with SpeedIT technology. Attempt I: SpeedIT GPU-based library of iterative solvers for Sparse Linear Algebra and CFD. Current version: 2.2. Version 1.0 in 2008. CMRS format

More information

Lecture 15: More Iterative Ideas

Lecture 15: More Iterative Ideas Lecture 15: More Iterative Ideas David Bindel 15 Mar 2010 Logistics HW 2 due! Some notes on HW 2. Where we are / where we re going More iterative ideas. Intro to HW 3. More HW 2 notes See solution code!

More information

Computing on GPU Clusters

Computing on GPU Clusters Computing on GPU Clusters Robert Strzodka (MPII), Dominik Göddeke G (TUDo( TUDo), Dominik Behr (AMD) Conference on Parallel Processing and Applied Mathematics Wroclaw, Poland, September 13-16, 16, 2009

More information

Asynchronous OpenCL/MPI numerical simulations of conservation laws

Asynchronous OpenCL/MPI numerical simulations of conservation laws Asynchronous OpenCL/MPI numerical simulations of conservation laws Philippe HELLUY 1,3, Thomas STRUB 2. 1 IRMA, Université de Strasbourg, 2 AxesSim, 3 Inria Tonus, France IWOCL 2015, Stanford Conservation

More information

First Steps of YALES2 Code Towards GPU Acceleration on Standard and Prototype Cluster

First Steps of YALES2 Code Towards GPU Acceleration on Standard and Prototype Cluster First Steps of YALES2 Code Towards GPU Acceleration on Standard and Prototype Cluster YALES2: Semi-industrial code for turbulent combustion and flows Jean-Matthieu Etancelin, ROMEO, NVIDIA GPU Application

More information

Performance Optimization of a Massively Parallel Phase-Field Method Using the HPC Framework walberla

Performance Optimization of a Massively Parallel Phase-Field Method Using the HPC Framework walberla Performance Optimization of a Massively Parallel Phase-Field Method Using the HPC Framework walberla SIAM PP 2016, April 13 th 2016 Martin Bauer, Florian Schornbaum, Christian Godenschwager, Johannes Hötzer,

More information

Numerical Algorithms on Multi-GPU Architectures

Numerical Algorithms on Multi-GPU Architectures Numerical Algorithms on Multi-GPU Architectures Dr.-Ing. Harald Köstler 2 nd International Workshops on Advances in Computational Mechanics Yokohama, Japan 30.3.2010 2 3 Contents Motivation: Applications

More information

Introduction to GPU hardware and to CUDA

Introduction to GPU hardware and to CUDA Introduction to GPU hardware and to CUDA Philip Blakely Laboratory for Scientific Computing, University of Cambridge Philip Blakely (LSC) GPU introduction 1 / 35 Course outline Introduction to GPU hardware

More information

Cloth Simulation. COMP 768 Presentation Zhen Wei

Cloth Simulation. COMP 768 Presentation Zhen Wei Cloth Simulation COMP 768 Presentation Zhen Wei Outline Motivation and Application Cloth Simulation Methods Physically-based Cloth Simulation Overview Development References 2 Motivation Movies Games VR

More information

GPU Cluster Computing for FEM

GPU Cluster Computing for FEM GPU Cluster Computing for FEM Dominik Göddeke Sven H.M. Buijssen, Hilmar Wobker and Stefan Turek Angewandte Mathematik und Numerik TU Dortmund, Germany dominik.goeddeke@math.tu-dortmund.de GPU Computing

More information

B. Tech. Project Second Stage Report on

B. Tech. Project Second Stage Report on B. Tech. Project Second Stage Report on GPU Based Active Contours Submitted by Sumit Shekhar (05007028) Under the guidance of Prof Subhasis Chaudhuri Table of Contents 1. Introduction... 1 1.1 Graphic

More information

On Massively Parallel Algorithms to Track One Path of a Polynomial Homotopy

On Massively Parallel Algorithms to Track One Path of a Polynomial Homotopy On Massively Parallel Algorithms to Track One Path of a Polynomial Homotopy Jan Verschelde joint with Genady Yoffe and Xiangcheng Yu University of Illinois at Chicago Department of Mathematics, Statistics,

More information

OpenFOAM + GPGPU. İbrahim Özküçük

OpenFOAM + GPGPU. İbrahim Özküçük OpenFOAM + GPGPU İbrahim Özküçük Outline GPGPU vs CPU GPGPU plugins for OpenFOAM Overview of Discretization CUDA for FOAM Link (cufflink) Cusp & Thrust Libraries How Cufflink Works Performance data of

More information

PORTABLE AND SCALABLE SOLUTIONS FOR CFD ON MODERN SUPERCOMPUTERS

PORTABLE AND SCALABLE SOLUTIONS FOR CFD ON MODERN SUPERCOMPUTERS PORTABLE AND SCALABLE SOLUTIONS FOR CFD ON MODERN SUPERCOMPUTERS Ricard Borrell Pol Head and Mass Transfer Technological Center cttc.upc.edu Termo Fluids S.L termofluids.co Barcelona Supercomputing Center

More information

Computational Fluid Dynamics (CFD) using Graphics Processing Units

Computational Fluid Dynamics (CFD) using Graphics Processing Units Computational Fluid Dynamics (CFD) using Graphics Processing Units Aaron F. Shinn Mechanical Science and Engineering Dept., UIUC Accelerators for Science and Engineering Applications: GPUs and Multicores

More information

Fast Multipole Method on the GPU

Fast Multipole Method on the GPU Fast Multipole Method on the GPU with application to the Adaptive Vortex Method University of Bristol, Bristol, United Kingdom. 1 Introduction Particle methods Highly parallel Computational intensive Numerical

More information

GPU ACCELERATION OF WSMP (WATSON SPARSE MATRIX PACKAGE)

GPU ACCELERATION OF WSMP (WATSON SPARSE MATRIX PACKAGE) GPU ACCELERATION OF WSMP (WATSON SPARSE MATRIX PACKAGE) NATALIA GIMELSHEIN ANSHUL GUPTA STEVE RENNICH SEID KORIC NVIDIA IBM NVIDIA NCSA WATSON SPARSE MATRIX PACKAGE (WSMP) Cholesky, LDL T, LU factorization

More information

FLUID SIMULATION. Kristofer Schlachter

FLUID SIMULATION. Kristofer Schlachter FLUID SIMULATION Kristofer Schlachter The Equations Incompressible Navier-Stokes: @u @t = (r u)u 1 rp + vr2 u + F Incompressibility condition r u =0 Breakdown @u @t The derivative of velocity with respect

More information

Generic Refinement and Block Partitioning enabling efficient GPU CFD on Unstructured Grids

Generic Refinement and Block Partitioning enabling efficient GPU CFD on Unstructured Grids Generic Refinement and Block Partitioning enabling efficient GPU CFD on Unstructured Grids Matthieu Lefebvre 1, Jean-Marie Le Gouez 2 1 PhD at Onera, now post-doc at Princeton, department of Geosciences,

More information

CS-184: Computer Graphics Lecture #21: Fluid Simulation II

CS-184: Computer Graphics Lecture #21: Fluid Simulation II CS-184: Computer Graphics Lecture #21: Fluid Simulation II Rahul Narain University of California, Berkeley Nov. 18 19, 2013 Grid-based fluid simulation Recap: Eulerian viewpoint Grid is fixed, fluid moves

More information

Accelerating GPU computation through mixed-precision methods. Michael Clark Harvard-Smithsonian Center for Astrophysics Harvard University

Accelerating GPU computation through mixed-precision methods. Michael Clark Harvard-Smithsonian Center for Astrophysics Harvard University Accelerating GPU computation through mixed-precision methods Michael Clark Harvard-Smithsonian Center for Astrophysics Harvard University Outline Motivation Truncated Precision using CUDA Solving Linear

More information

Why Use the GPU? How to Exploit? New Hardware Features. Sparse Matrix Solvers on the GPU: Conjugate Gradients and Multigrid. Semiconductor trends

Why Use the GPU? How to Exploit? New Hardware Features. Sparse Matrix Solvers on the GPU: Conjugate Gradients and Multigrid. Semiconductor trends Imagine stream processor; Bill Dally, Stanford Connection Machine CM; Thinking Machines Sparse Matrix Solvers on the GPU: Conjugate Gradients and Multigrid Jeffrey Bolz Eitan Grinspun Caltech Ian Farmer

More information

CSE 591/392: GPU Programming. Introduction. Klaus Mueller. Computer Science Department Stony Brook University

CSE 591/392: GPU Programming. Introduction. Klaus Mueller. Computer Science Department Stony Brook University CSE 591/392: GPU Programming Introduction Klaus Mueller Computer Science Department Stony Brook University First: A Big Word of Thanks! to the millions of computer game enthusiasts worldwide Who demand

More information

A Scalable GPU-Based Compressible Fluid Flow Solver for Unstructured Grids

A Scalable GPU-Based Compressible Fluid Flow Solver for Unstructured Grids A Scalable GPU-Based Compressible Fluid Flow Solver for Unstructured Grids Patrice Castonguay and Antony Jameson Aerospace Computing Lab, Stanford University GTC Asia, Beijing, China December 15 th, 2011

More information

High Scalability of Lattice Boltzmann Simulations with Turbulence Models using Heterogeneous Clusters

High Scalability of Lattice Boltzmann Simulations with Turbulence Models using Heterogeneous Clusters SIAM PP 2014 High Scalability of Lattice Boltzmann Simulations with Turbulence Models using Heterogeneous Clusters C. Riesinger, A. Bakhtiari, M. Schreiber Technische Universität München February 20, 2014

More information

Development of an Integrated Computational Simulation Method for Fluid Driven Structure Movement and Acoustics

Development of an Integrated Computational Simulation Method for Fluid Driven Structure Movement and Acoustics Development of an Integrated Computational Simulation Method for Fluid Driven Structure Movement and Acoustics I. Pantle Fachgebiet Strömungsmaschinen Karlsruher Institut für Technologie KIT Motivation

More information

Efficient multigrid solvers for strongly anisotropic PDEs in atmospheric modelling

Efficient multigrid solvers for strongly anisotropic PDEs in atmospheric modelling Iterative Solvers Numerical Results Conclusion and outlook 1/22 Efficient multigrid solvers for strongly anisotropic PDEs in atmospheric modelling Part II: GPU Implementation and Scaling on Titan Eike

More information

GTC 2013: DEVELOPMENTS IN GPU-ACCELERATED SPARSE LINEAR ALGEBRA ALGORITHMS. Kyle Spagnoli. Research EM Photonics 3/20/2013

GTC 2013: DEVELOPMENTS IN GPU-ACCELERATED SPARSE LINEAR ALGEBRA ALGORITHMS. Kyle Spagnoli. Research EM Photonics 3/20/2013 GTC 2013: DEVELOPMENTS IN GPU-ACCELERATED SPARSE LINEAR ALGEBRA ALGORITHMS Kyle Spagnoli Research Engineer @ EM Photonics 3/20/2013 INTRODUCTION» Sparse systems» Iterative solvers» High level benchmarks»

More information

CMSC 714 Lecture 6 MPI vs. OpenMP and OpenACC. Guest Lecturer: Sukhyun Song (original slides by Alan Sussman)

CMSC 714 Lecture 6 MPI vs. OpenMP and OpenACC. Guest Lecturer: Sukhyun Song (original slides by Alan Sussman) CMSC 714 Lecture 6 MPI vs. OpenMP and OpenACC Guest Lecturer: Sukhyun Song (original slides by Alan Sussman) Parallel Programming with Message Passing and Directives 2 MPI + OpenMP Some applications can

More information

Massively Parallel Phase Field Simulations using HPC Framework walberla

Massively Parallel Phase Field Simulations using HPC Framework walberla Massively Parallel Phase Field Simulations using HPC Framework walberla SIAM CSE 2015, March 15 th 2015 Martin Bauer, Florian Schornbaum, Christian Godenschwager, Johannes Hötzer, Harald Köstler and Ulrich

More information

Directed Optimization On Stencil-based Computational Fluid Dynamics Application(s)

Directed Optimization On Stencil-based Computational Fluid Dynamics Application(s) Directed Optimization On Stencil-based Computational Fluid Dynamics Application(s) Islam Harb 08/21/2015 Agenda Motivation Research Challenges Contributions & Approach Results Conclusion Future Work 2

More information

Prim ary break-up: DNS of liquid jet to improve atomization m odelling

Prim ary break-up: DNS of liquid jet to improve atomization m odelling Computational Methods in Multiphase Flow III 343 Prim ary break-up: DNS of liquid jet to improve atomization m odelling T. Ménard, P. A. Beau, S. Tanguy, F. X. Demoulin & A. Berlemont CNRS-UMR6614-CORIA,

More information

Interdisciplinary practical course on parallel finite element method using HiFlow 3

Interdisciplinary practical course on parallel finite element method using HiFlow 3 Interdisciplinary practical course on parallel finite element method using HiFlow 3 E. Treiber, S. Gawlok, M. Hoffmann, V. Heuveline, W. Karl EuroEDUPAR, 2015/08/24 KARLSRUHE INSTITUTE OF TECHNOLOGY -

More information

Ab initio NMR Chemical Shift Calculations for Biomolecular Systems Using Fragment Molecular Orbital Method

Ab initio NMR Chemical Shift Calculations for Biomolecular Systems Using Fragment Molecular Orbital Method 4 Ab initio NMR Chemical Shift Calculations for Biomolecular Systems Using Fragment Molecular Orbital Method A Large-scale Two-phase Flow Simulation Evolutive Image/ Video Coding with Massively Parallel

More information

NVIDIA. Interacting with Particle Simulation in Maya using CUDA & Maximus. Wil Braithwaite NVIDIA Applied Engineering Digital Film

NVIDIA. Interacting with Particle Simulation in Maya using CUDA & Maximus. Wil Braithwaite NVIDIA Applied Engineering Digital Film NVIDIA Interacting with Particle Simulation in Maya using CUDA & Maximus Wil Braithwaite NVIDIA Applied Engineering Digital Film Some particle milestones FX Rendering Physics 1982 - First CG particle FX

More information

Aerodynamics of a hi-performance vehicle: a parallel computing application inside the Hi-ZEV project

Aerodynamics of a hi-performance vehicle: a parallel computing application inside the Hi-ZEV project Workshop HPC enabling of OpenFOAM for CFD applications Aerodynamics of a hi-performance vehicle: a parallel computing application inside the Hi-ZEV project A. De Maio (1), V. Krastev (2), P. Lanucara (3),

More information

Multiphysics simulations of nuclear reactors and more

Multiphysics simulations of nuclear reactors and more Multiphysics simulations of nuclear reactors and more Gothenburg Region OpenFOAM User Group Meeting Klas Jareteg klasjareteg@chalmersse Division of Nuclear Engineering Department of Applied Physics Chalmers

More information

HPC Usage for Aerodynamic Flow Computation with Different Levels of Detail

HPC Usage for Aerodynamic Flow Computation with Different Levels of Detail DLR.de Folie 1 HPCN-Workshop 14./15. Mai 2018 HPC Usage for Aerodynamic Flow Computation with Different Levels of Detail Cornelia Grabe, Marco Burnazzi, Axel Probst, Silvia Probst DLR, Institute of Aerodynamics

More information

How to perform HPL on CPU&GPU clusters. Dr.sc. Draško Tomić

How to perform HPL on CPU&GPU clusters. Dr.sc. Draško Tomić How to perform HPL on CPU&GPU clusters Dr.sc. Draško Tomić email: drasko.tomic@hp.com Forecasting is not so easy, HPL benchmarking could be even more difficult Agenda TOP500 GPU trends Some basics about

More information

Investigation of the critical submergence at pump intakes based on multiphase CFD calculations

Investigation of the critical submergence at pump intakes based on multiphase CFD calculations Advances in Fluid Mechanics X 143 Investigation of the critical submergence at pump intakes based on multiphase CFD calculations P. Pandazis & F. Blömeling TÜV NORD SysTec GmbH and Co. KG, Germany Abstract

More information

Interaction of Fluid Simulation Based on PhysX Physics Engine. Huibai Wang, Jianfei Wan, Fengquan Zhang

Interaction of Fluid Simulation Based on PhysX Physics Engine. Huibai Wang, Jianfei Wan, Fengquan Zhang 4th International Conference on Sensors, Measurement and Intelligent Materials (ICSMIM 2015) Interaction of Fluid Simulation Based on PhysX Physics Engine Huibai Wang, Jianfei Wan, Fengquan Zhang College

More information

GPUs and Einstein s Equations

GPUs and Einstein s Equations GPUs and Einstein s Equations Tim Dewey Advisor: Dr. Manuel Tiglio AMSC Scientific Computing University of Maryland May 5, 2011 Outline 1 Project Summary 2 Evolving Einstein s Equations 3 Implementation

More information

HYPERDRIVE IMPLEMENTATION AND ANALYSIS OF A PARALLEL, CONJUGATE GRADIENT LINEAR SOLVER PROF. BRYANT PROF. KAYVON 15618: PARALLEL COMPUTER ARCHITECTURE

HYPERDRIVE IMPLEMENTATION AND ANALYSIS OF A PARALLEL, CONJUGATE GRADIENT LINEAR SOLVER PROF. BRYANT PROF. KAYVON 15618: PARALLEL COMPUTER ARCHITECTURE HYPERDRIVE IMPLEMENTATION AND ANALYSIS OF A PARALLEL, CONJUGATE GRADIENT LINEAR SOLVER AVISHA DHISLE PRERIT RODNEY ADHISLE PRODNEY 15618: PARALLEL COMPUTER ARCHITECTURE PROF. BRYANT PROF. KAYVON LET S

More information