Massively Parallel Finite Element Simulations with deal.II


1 Massively Parallel Finite Element Simulations with deal.II
  Timo Heister, Texas A&M University
  SIAM PP2012
  joint work with: Wolfgang Bangerth, Carsten Burstedde, Thomas Geenen, Martin Kronbichler

2 Content
  1 Introduction
  2 Data structures and algorithms
    Overview
    Triangulation
    Distributing the Degrees of Freedom
    Linear Algebra, Postprocessing
  3 Numerical Results
    scalability tests
    Mantle Convection
  4 Conclusions

3 Motivation: Mantle Convection
  - material (rock) is mostly solid
  - slow velocities (cm/year)
  - driven by temperature differences
  (figure source: Wikipedia)

4 Motivation
  (temperature snapshot, degrees of freedom, 2d simulation)
  Requirements:
  - complex coupled equations and material laws
  - 3d simulations, adaptive meshes
  - large number of unknowns (100 million or more)

5 Goals and Background
  Goals:
  1. Finite Elements with adaptive mesh refinement
  2. scalable: cores, 1 billion+ unknowns
  3. flexible: higher order, multiphysics, reuse existing software
  done in deal.II and available today, but described in a generic way
  Bangerth and Kanschat. deal.II Differential Equations Analysis Library, Technical Reference.
  Bangerth, Burstedde, Heister, and Kronbichler. Algorithms and Data Structures for Massively Parallel Generic Finite Element Codes. ACM Trans. Math. Softw., 38(2).

6 Requirements for Scalability
  - distributed data storage everywhere: need special data structures
  - efficient algorithms: not depending on total problem size
  - localize and hide communication: point-to-point MPI, nonblocking sends and receives

7 Status of Parallelization
  Often (in deal.II before and many other libraries):
  - only using parallel linear algebra (matrix, vector, solvers)
  - duplicated data: mesh, degrees of freedom, ...
  - algorithms not scaling: mesh handling, DoFs, ...
  - not efficient for > 100 cores or larger problems

8 Overview of data structures and algorithms
  Needs to be parallelized:
  1. Triangulation (mesh with all associated data)
     hard: distributed storage, new algorithms
  2. DoFHandler (manages degrees of freedom, global numbering, ...)
     hard: find global numbering of DoFs
  3. Linear Algebra (matrices, vectors, solvers, preconditioners)
     easy: use existing library
  4. Postprocessing (error estimation, solution transfer, output, ...)
     medium: do work on local mesh, communicate
  (diagram labels: Triangulation, unit cell, DoFHandler, linear algebra, post processing; Finite Element, Quadratures, Mapping, ...)

9 Triangulation
  p4est library:
  - parallel quad-/octrees
  - store refinement flags from a base mesh
  - based on space-filling curves
  - very good scalability
  Burstedde, Wilcox, and Ghattas. p4est: Scalable algorithms for parallel adaptive mesh refinement on forests of octrees. SIAM J. Sci. Comput., 33(3), 2011.

10 Triangulation
  partitioning is cheap and simple (figure: partitions #1 and #2)
  then:
  - take p4est refinement information
  - recreate rich deal.II Triangulation only for local cells (stores coordinates, connectivity, faces, materials, ...)
  - how? recursive queries to p4est
  - also create ghost layer (one layer of cells around own ones)
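The partitioning and ghost-layer ideas on this slide can be illustrated with a small serial sketch. It is not p4est or deal.II code: the slides only say "space-filling curves", so the Z-order (Morton) curve, the uniform 2d mesh, the face-neighbor definition of the ghost layer, and the helper names `morton_index`, `partition`, and `ghost_cells` are all assumptions made for illustration.

```python
def morton_index(x, y, level):
    """Interleave the bits of (x, y) to get the position along a Z-order curve."""
    index = 0
    for bit in range(level):
        index |= ((x >> bit) & 1) << (2 * bit)
        index |= ((y >> bit) & 1) << (2 * bit + 1)
    return index


def partition(cells, n_ranks):
    """Sort cells along the curve and cut the sequence into contiguous chunks."""
    ordered = sorted(cells, key=lambda c: morton_index(c[0], c[1], level=16))
    base, extra = divmod(len(ordered), n_ranks)
    owner = {}
    start = 0
    for rank in range(n_ranks):
        count = base + (1 if rank < extra else 0)
        for cell in ordered[start:start + count]:
            owner[cell] = rank
        start += count
    return owner


def ghost_cells(owner, rank):
    """Cells owned by other ranks that share a face with a cell owned by `rank`."""
    ghosts = set()
    for (x, y), r in owner.items():
        if r != rank:
            continue
        for neighbor in ((x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)):
            if neighbor in owner and owner[neighbor] != rank:
                ghosts.add(neighbor)
    return ghosts


# A uniform 4x4 mesh split across 2 ranks (hypothetical example data).
cells = [(x, y) for x in range(4) for y in range(4)]
owner = partition(cells, n_ranks=2)
ghosts_of_rank_0 = ghost_cells(owner, 0)
```

Because each rank owns a contiguous piece of the curve, repartitioning only requires knowing how many cells precede each cut, which is one way to read "cheap and simple" above.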

11 Example: Distributed Mesh Storage
  (figure: the mesh as stored on each CPU; color: owned by CPU id)

12 Distributing the Degrees of Freedom (DoFs)
  sketch: create global numbering for all DoFs
  reason: identify shared ones
  problem: no knowledge about the whole mesh
  1. decide on ownership of DoFs on interface (no communication!)
  2. enumerate locally (only own DoFs)
  3. shift indices to make them globally unique (only communicate local quantities)
  4. exchange indices to ghost neighbors
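The four steps of the sketch can be walked through in a serial simulation. The following is a minimal illustration under stated assumptions, not the deal.II implementation: the mesh is a 1d chain of linear elements, every rank owns at least one cell, an interface DoF is assigned to the lower of the two ranks, and the function name `distribute_dofs` is hypothetical. In an MPI code the offsets of step 3 could plausibly come from an exclusive prefix sum over the per-rank counts.

```python
def distribute_dofs(cells_per_rank):
    """Simulate the four numbering steps for a 1D chain of linear elements.

    cells_per_rank[r] is the number of cells owned by rank r (at least 1).
    Cell c carries one DoF at vertex c and one at vertex c + 1.
    Returns (numbering, offsets), where numbering[r] maps every vertex of
    rank r's own cells to its global DoF index.
    """
    n_ranks = len(cells_per_rank)
    first_cell = [sum(cells_per_rank[:r]) for r in range(n_ranks)]

    # Step 1: ownership of interface DoFs, decided without communication:
    # the vertex shared by ranks r - 1 and r goes to the lower rank.
    owned = []
    for r in range(n_ranks):
        first_vertex = first_cell[r] if r == 0 else first_cell[r] + 1
        last_vertex = first_cell[r] + cells_per_rank[r]
        owned.append(list(range(first_vertex, last_vertex + 1)))

    # Step 2: enumerate locally, starting from 0 on every rank.
    local = [{v: i for i, v in enumerate(vs)} for vs in owned]

    # Step 3: shift by the number of DoFs owned by all lower ranks.
    offsets, running = [], 0
    for vs in owned:
        offsets.append(running)
        running += len(vs)
    numbering = [
        {v: i + offsets[r] for v, i in local[r].items()}
        for r in range(n_ranks)
    ]

    # Step 4: exchange: each rank learns the index of the interface DoF
    # that its lower neighbor owns.
    for r in range(1, n_ranks):
        interface = first_cell[r]
        numbering[r][interface] = numbering[r - 1][interface]
    return numbering, offsets


# Three ranks owning 2, 3, and 1 cells (hypothetical example data).
numbering, offsets = distribute_dofs([2, 3, 1])
```

In this 1d setting a single exchange is enough; the later slide on a second transfer concerns cells in higher dimensions, which this sketch does not model.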

13 Distributing the Degrees of Freedom
  1 local numbering (figure)

14 Distributing the Degrees of Freedom
  2 shift indices (figure)

15 Distributing the Degrees of Freedom
  3 transfer to neighbor (figure; view: green)

16 Distributing the Degrees of Freedom
  4 second transfer needed for some cells (figure)

17 Linear Algebra, Postprocessing
  Linear Algebra:
  - use distributed matrices and vectors (PETSc or Trilinos)
  - assemble local parts (some communication on interfaces)
  - solve (preconditioners!)
  not covered today:
  - error estimation
  - deciding on refinement and coarsening (communication!)
  - handling hanging nodes
  - solution transfer (after refinement and repartitioning)
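To make "assemble local parts (some communication on interfaces)" concrete, here is a serial sketch in which each simulated rank assembles the 1d Laplace stiffness matrix over its own cells and hands contributions to rows it does not own over to the owning rank. It is an illustration only: the row-wise ownership, the unit cell length, and the name `assemble_distributed` are assumptions, and in practice the linear algebra library (PETSc or Trilinos) takes care of this exchange.

```python
from collections import defaultdict


def assemble_distributed(cells_per_rank):
    """Assemble per-rank matrix rows; return (rows, entries_sent)."""
    n_ranks = len(cells_per_rank)
    first_cell = [sum(cells_per_rank[:r]) for r in range(n_ranks)]

    def row_owner(vertex):
        # Interface vertices belong to the lower of the two adjacent ranks.
        for r in range(n_ranks):
            if vertex <= first_cell[r] + cells_per_rank[r]:
                return r
        raise ValueError("vertex outside the mesh")

    # Element matrix of -u'' for linear elements on a cell of length 1.
    cell_matrix = ((1.0, -1.0), (-1.0, 1.0))
    rows = [defaultdict(float) for _ in range(n_ranks)]
    entries_sent = 0
    for r in range(n_ranks):
        for cell in range(first_cell[r], first_cell[r] + cells_per_rank[r]):
            dofs = (cell, cell + 1)
            for i in range(2):
                target = row_owner(dofs[i])
                for j in range(2):
                    rows[target][(dofs[i], dofs[j])] += cell_matrix[i][j]
                    if target != r:
                        # This entry would travel over the network.
                        entries_sent += 1
    return rows, entries_sent


# Two ranks with two cells each (hypothetical example data).
rows, entries_sent = assemble_distributed([2, 2])
```

Only the cell touching the interface produces off-rank entries, so the amount of communication grows with the size of the interface rather than with the total problem size.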

18 Strong Scaling: 2d adaptive Poisson problem
  (plot: wall clock times for problem of fixed size 335M; x-axis: number of processors; y-axis: wall time [seconds]; curves: linear solver, copy to deal.II, error estimation, assembly, init matrix, sparsity pattern, coarsen and refine)

19 Weak Scaling
  (plot: weak scaling, 512 processors; x-axis: number of degrees of freedom; y-axis: wall time [seconds]; curves: Linear Solver, Preconditioner, Setup Matrices, Assembly, Assembly (T), Refinement, SolutionTransfer, Setup DoFs, Distribute DoFs)
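Strong and weak scaling plots like these are commonly summarized by a parallel efficiency. The sketch below uses the usual textbook definitions; the function names and all timings are hypothetical and are not values read from the plots.

```python
def strong_scaling_efficiency(p_ref, t_ref, p, t):
    """Fixed problem size: ideal time on p processors is t_ref * p_ref / p."""
    return (t_ref * p_ref) / (t * p)


def weak_scaling_efficiency(t_ref, t):
    """Fixed work per processor: ideal time stays at t_ref."""
    return t_ref / t


# Hypothetical timings in seconds, for illustration only.
ideal = strong_scaling_efficiency(p_ref=16, t_ref=100.0, p=64, t=25.0)
halved = strong_scaling_efficiency(p_ref=16, t_ref=100.0, p=64, t=50.0)
weak = weak_scaling_efficiency(t_ref=10.0, t=12.5)
```

A value of 1 corresponds to the ideal straight line in a log-log strong scaling plot and to a flat line in a weak scaling plot.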

20 Test: memory consumption
  (plot: average and maximum memory consumption (VmPeak); x-axis: #CPUs; y-axis: mem / MB; curves: avg, max)
  - 3D, weak scalability from 8 to 1000 processors with about DoFs per processor (4 million up to 500 million total)
  - constant memory usage with increasing #CPUs & problem size

21 Test: memory consumption
  (plot: memory in MB over # CPUs; curves: Triangulation, p4est, DofHandler, Constraints, Matrix, Vector)
  - 3D, weak scalability; better for more complicated problems

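22 Mantle Convection
  flow driven by temperature differences
  Boussinesq Model:
  $$
  \begin{aligned}
  -\nabla \cdot (2\eta\,\varepsilon(u)) + \nabla p &= -\rho\,\beta\,T\,g, \\
  \nabla \cdot u &= 0, \\
  \frac{\partial T}{\partial t} + u \cdot \nabla T - \nabla \cdot \kappa \nabla T &= \gamma.
  \end{aligned}
  $$
  (velocity u, pressure p, temperature T)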
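The slide does not define ε(u) or the coefficients. Assuming the usual form of the Boussinesq approximation, ε(u) is the symmetric gradient of the velocity, and the remaining symbols would read as viscosity η, density ρ, thermal expansion coefficient β, gravity vector g, thermal diffusivity κ, and heat sources γ. These readings are assumptions, not statements from the talk.

```latex
\varepsilon(u) = \tfrac{1}{2}\left( \nabla u + (\nabla u)^{T} \right)
```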

23 Mantle Convection
  Solution and partition of a 3d simulation with roughly 54 million unknowns on 1.4 million cells running on 512 cores.

24 Strong Scaling: coarse 3d case
  (plot: strong scaling, 4 million DoFs; x-axis: #CPUs; y-axis: time/s; curves: Setup DoFs, Assemble T RHS, Assemble Stokes, Refine mesh, Build preconditioner, Solve T, Solve Stokes; reference line: linear)

25 Weak Scaling
  (plot; x-axis: # DoFs; y-axis: time/s; curves: Error Est & SolutionTransfer, setup DoFs, Assemble Stokes, Assemble Temp., recreate local mesh, p4est balance, p4est partition, refine and coarsen; reference line: linear)
  - 2d, 512 cores, adaptive refinement for a fixed time step, 2 million to 160 million unknowns

26 Thanks for your attention!


More information

High Performance Calculation with Code_Saturne at EDF. Toolchain evoution and roadmap

High Performance Calculation with Code_Saturne at EDF. Toolchain evoution and roadmap High Performance Calculation with Code_Saturne at EDF Toolchain evoution and roadmap Code_Saturne Features of note to HPC Segregated solver All variables are solved or independently, coupling terms are

More information

Challenges of Scaling Algebraic Multigrid Across Modern Multicore Architectures. Allison H. Baker, Todd Gamblin, Martin Schulz, and Ulrike Meier Yang

Challenges of Scaling Algebraic Multigrid Across Modern Multicore Architectures. Allison H. Baker, Todd Gamblin, Martin Schulz, and Ulrike Meier Yang Challenges of Scaling Algebraic Multigrid Across Modern Multicore Architectures. Allison H. Baker, Todd Gamblin, Martin Schulz, and Ulrike Meier Yang Multigrid Solvers Method of solving linear equation

More information

Computational Fluid Dynamics and Interactive Visualisation

Computational Fluid Dynamics and Interactive Visualisation Computational Fluid Dynamics and Interactive Visualisation Ralf-Peter Mundani 1, Jérôme Frisch 2 1 Computation in Engineering, TUM 2 E3D, RWTH Aachen University Interdisciplinary Cluster Workshop on Visualization

More information

3D Helmholtz Krylov Solver Preconditioned by a Shifted Laplace Multigrid Method on Multi-GPUs

3D Helmholtz Krylov Solver Preconditioned by a Shifted Laplace Multigrid Method on Multi-GPUs 3D Helmholtz Krylov Solver Preconditioned by a Shifted Laplace Multigrid Method on Multi-GPUs H. Knibbe, C. W. Oosterlee, C. Vuik Abstract We are focusing on an iterative solver for the three-dimensional

More information

Multigrid. James Demmel. Poisson s equation in 1D: T = Graph and stencil. 2 u/ x 2 = f(x)

Multigrid. James Demmel. Poisson s equation in 1D: T =   Graph and stencil. 2 u/ x 2 = f(x) Multigrid James Demmel www.cs.berkeley.edu/~demmel/ma221_spr16 Poisson s equation in 1D: 2 u/ x 2 = f(x) T = 2-1 -1 2-1 -1 2-1 -1 2-1 -1 2 Graph and stencil -1 2-1 1 2D Poisson s equation Similar to the

More information

arxiv: v2 [cs.ms] 2 Oct 2017

arxiv: v2 [cs.ms] 2 Oct 2017 Enhancing speed and scalability of the ParFlow simulation code Carsten Burstedde, Jose A. Fonseca, Stefan Kollet arxiv:1702.06898v2 [cs.ms] 2 Oct 2017 Abstract Regional hydrology studies are often supported

More information

Reconstruction of Trees from Laser Scan Data and further Simulation Topics

Reconstruction of Trees from Laser Scan Data and further Simulation Topics Reconstruction of Trees from Laser Scan Data and further Simulation Topics Helmholtz-Research Center, Munich Daniel Ritter http://www10.informatik.uni-erlangen.de Overview 1. Introduction of the Chair

More information

Predictive Engineering and Computational Sciences. Full System Simulations with Fully Implicit Navier-Stokes

Predictive Engineering and Computational Sciences. Full System Simulations with Fully Implicit Navier-Stokes PECOS Predictive Engineering and Computational Sciences Full System Simulations with Fully Implicit Navier-Stokes Roy Stogner, Benjamin Kirk, Paul Bauman, Todd Oliver, Kemelli Estacio, Marco Panesi, Juan

More information

AllScale Pilots Applications AmDaDos Adaptive Meshing and Data Assimilation for the Deepwater Horizon Oil Spill

AllScale Pilots Applications AmDaDos Adaptive Meshing and Data Assimilation for the Deepwater Horizon Oil Spill This project has received funding from the European Union s Horizon 2020 research and innovation programme under grant agreement No. 671603 An Exascale Programming, Multi-objective Optimisation and Resilience

More information

Introduction to Multigrid and its Parallelization

Introduction to Multigrid and its Parallelization Introduction to Multigrid and its Parallelization! Thomas D. Economon Lecture 14a May 28, 2014 Announcements 2 HW 1 & 2 have been returned. Any questions? Final projects are due June 11, 5 pm. If you are

More information

Low-Cost Parallel Algorithms for 2:1 Octree Balance

Low-Cost Parallel Algorithms for 2:1 Octree Balance Low-Cost Parallel Algorithms for 2: Octree Balance Tobin Isaac, Carsten Burstedde, Omar Ghattas Institute for Computational Engineering and Sciences (ICES) The University of Texas at Austin, USA Email:

More information

Timo Lähivaara, Tomi Huttunen, Simo-Pekka Simonaho University of Kuopio, Department of Physics P.O.Box 1627, FI-70211, Finland

Timo Lähivaara, Tomi Huttunen, Simo-Pekka Simonaho University of Kuopio, Department of Physics P.O.Box 1627, FI-70211, Finland Timo Lähivaara, Tomi Huttunen, Simo-Pekka Simonaho University of Kuopio, Department of Physics P.O.Box 627, FI-72, Finland timo.lahivaara@uku.fi INTRODUCTION The modeling of the acoustic wave fields often

More information

SELECTIVE ALGEBRAIC MULTIGRID IN FOAM-EXTEND

SELECTIVE ALGEBRAIC MULTIGRID IN FOAM-EXTEND Student Submission for the 5 th OpenFOAM User Conference 2017, Wiesbaden - Germany: SELECTIVE ALGEBRAIC MULTIGRID IN FOAM-EXTEND TESSA UROIĆ Faculty of Mechanical Engineering and Naval Architecture, Ivana

More information

Directions: 1) Delete this text box 2) Insert desired picture here

Directions: 1) Delete this text box 2) Insert desired picture here Directions: 1) Delete this text box 2) Insert desired picture here Multi-Disciplinary Applications using Overset Grid Technology in STAR-CCM+ CD-adapco Dmitry Pinaev, Frank Schäfer, Eberhard Schreck Outline

More information

A Kernel-independent Adaptive Fast Multipole Method

A Kernel-independent Adaptive Fast Multipole Method A Kernel-independent Adaptive Fast Multipole Method Lexing Ying Caltech Joint work with George Biros and Denis Zorin Problem Statement Given G an elliptic PDE kernel, e.g. {x i } points in {φ i } charges

More information

Enhancing Analysis-Based Design with Quad-Core Intel Xeon Processor-Based Workstations

Enhancing Analysis-Based Design with Quad-Core Intel Xeon Processor-Based Workstations Performance Brief Quad-Core Workstation Enhancing Analysis-Based Design with Quad-Core Intel Xeon Processor-Based Workstations With eight cores and up to 80 GFLOPS of peak performance at your fingertips,

More information

Enzo-P / Cello. Formation of the First Galaxies. San Diego Supercomputer Center. Department of Physics and Astronomy

Enzo-P / Cello. Formation of the First Galaxies. San Diego Supercomputer Center. Department of Physics and Astronomy Enzo-P / Cello Formation of the First Galaxies James Bordner 1 Michael L. Norman 1 Brian O Shea 2 1 University of California, San Diego San Diego Supercomputer Center 2 Michigan State University Department

More information

Multigrid James Demmel

Multigrid James Demmel Multigrid James Demmel www.cs.berkeley.edu/~demmel/ma221 Review of Previous Lectures and Outline Review Poisson equation Overview of Methods for Poisson Equation Jacobi s method Gauss-Seidel method Red-Black

More information

The purpose of this tutorial is to illustrate how to set up and solve a problem using the. Moving Deforming Mesh (MDM) using the layering algorithm.

The purpose of this tutorial is to illustrate how to set up and solve a problem using the. Moving Deforming Mesh (MDM) using the layering algorithm. Tutorial: Introduction The purpose of this tutorial is to illustrate how to set up and solve a problem using the following two features in FLUENT. Moving Deforming Mesh (MDM) using the layering algorithm.

More information

cuibm A GPU Accelerated Immersed Boundary Method

cuibm A GPU Accelerated Immersed Boundary Method cuibm A GPU Accelerated Immersed Boundary Method S. K. Layton, A. Krishnan and L. A. Barba Corresponding author: labarba@bu.edu Department of Mechanical Engineering, Boston University, Boston, MA, 225,

More information

Teko: A Package for Multiphysics Preconditioners

Teko: A Package for Multiphysics Preconditioners SAND 2009-7242P Teko: A Package for Multiphysics Preconditioners Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energyʼs

More information

FEMLAB Exercise 1 for ChE366

FEMLAB Exercise 1 for ChE366 FEMLAB Exercise 1 for ChE366 Problem statement Consider a spherical particle of radius r s moving with constant velocity U in an infinitely long cylinder of radius R that contains a Newtonian fluid. Let

More information

Development of a Consistent Discrete Adjoint Solver for the SU 2 Framework

Development of a Consistent Discrete Adjoint Solver for the SU 2 Framework Development of a Consistent Discrete Adjoint Solver for the SU 2 Framework Tim Albring, Max Sagebaum, Nicolas Gauger Chair for Scientific Computing TU Kaiserslautern 16th Euro-AD Workshop, Jena December

More information

Speedup Altair RADIOSS Solvers Using NVIDIA GPU

Speedup Altair RADIOSS Solvers Using NVIDIA GPU Innovation Intelligence Speedup Altair RADIOSS Solvers Using NVIDIA GPU Eric LEQUINIOU, HPC Director Hongwei Zhou, Senior Software Developer May 16, 2012 Innovation Intelligence ALTAIR OVERVIEW Altair

More information