Approaches to Parallel Implementation of the BDDC Method

Size: px
Start display at page:

Download "Approaches to Parallel Implementation of the BDDC Method"

Transcription

1 Approaches to Parallel Implementation of the BDDC Method Jakub Šístek Includes joint work with P. Burda, M. Čertíková, J. Mandel, J. Novotný, B. Sousedík. Institute of Mathematics of the AS CR, Prague Czech Technical University, Prague University of Colorado Denver September 25, 2009, Tampere University of Technology INSTITUTE of MATHEMATICS Academy of Sciences Czech Republic

2 Outline Some motivation for domain decomposition BDDC method Implementation by global matrices and generalized change of variables Implementation subdomain by subdomain with explicit coarse problem Numerical results Conclusion

3 Some motivation for domain decomposition (DD) methods very large systems of algebraic equations arising from FEM difficulties with solution by conventional numerical methods direct methods slow due to the problem size iterative methods slow due to large condition number suitable preconditioner needed combination of these approaches synergy in domain decomposition (DD) methods natural way to parallelize FEM two recent methods for elliptic PDEs BDDC and FETI-DP

4 Outline Some motivation for domain decomposition BDDC method Implementation by global matrices and generalized change of variables Implementation subdomain by subdomain with explicit coarse problem Numerical results Conclusion

5 Brief overview of BDDC method Balancing Domain Decomposition by Constraints 2003 C. Dohrmann (Sandia), theory with J. Mandel (UCD) nonoverlapping primary domain decomposition method equivalent to FETI-DP [Mandel, Dohrmann, Tezaur 2005]

6 The abstract problem Variational setting u U : a(u, v) = f, v v U a (, ) symmetric positive definite form on U, is inner product on U U is finite dimensional space Matrix form u U : Au = f A symmetric positive definite matrix on U Linked together Au, v = a (u, v) u, v U

7 BDDC set-up division into subdomains selection of coarse problem nodes (also called corners) interface subdomain Ωi coarse problem nodes h H finite elements

8 Function spaces in BDDC U W c W continuous continuous at coarse no continuity problem nodes enough coarse nodes to fix floating subdomains rigid body modes captured a (, ) symmetric positive definite form on W c corresponding matrix Ãc symmetric positive definite, almost block diagonal structure, larger dimension than A operator of projection E : W c U, Range(E) = U, e.g. averaging across interfaces (arithmetic, weighted)

9 Fictitious mesh à c can be constructed using auxiliary mesh

10 The second intermediate space in BDDC Only coarse nodes do not suffice for optimal preconditioning in 3D additional constraints on functions from W c necessary U W W c continuous add constraints only corners Examples: equivalent averages on subsets of interface (edges, faces) across interface, additional pointwise continuity constraints

11 The BDDC preconditioner Define M BDDC : r U û U M BDDC : r û = Ew, w W : a (w, z) = r, Ez, z W r - residual in an iteration of PCG û - correction to solution (preconditioned residual)

12 Outline Some motivation for domain decomposition BDDC method Implementation by global matrices and generalized change of variables Implementation subdomain by subdomain with explicit coarse problem Numerical results Conclusion

13 Enforcing additional constraints Introduce matrix G with constraints each row of G corresponds to a continuity constraint between two subdomains introduces new coupling between subdomains Example: for arithmetic averages on an edge between subdomains i and j, a row of G is g k = [ } 1 1{{ 1 1} } 1 1 {{ 1 1 } ] edge dof on Ω i edge dof on Ω j define intermediate space as { W = w W } c : Gw = 0

14 The BDDC preconditioner with averages variational form M BDDC : r û = Ew, w W : a (w, z) = r, Ez, z W matrix form à c w + G T λ = E T r Gw = 0 û = M BDDC r = Ew r - residual in an iteration of PCG û - correction to solution (preconditioned residual)

15 Change of variables Change of variables on each subdomain, such that averages appear as single node constraints. w = Tw, w = Bw, B = T 1 Matrix T invertible, contains weights of averages. Compute M BDDC r = EBw, where w is the solution to B T Ã c Bw + B T G T λ = B T E T r GBw = 0. Transformed averages may be handled as corners and further assembled [Li, Widlund 2006]. Drawback: The distinction between W c and W lost.

16 Projected change of variables Combination of projected BDDC and change of variables. Define matrix G = GB reduces to one 1 and one 1 in each row. Projection onto null(g) Construct matrix BDDC preconditioner as P = I G T (GG T ) 1 G Ã = PB T Ã c BP + t(i P) Ãw = PB T E T r û = M BDDC r = EBw

17 Outline Some motivation for domain decomposition BDDC method Implementation by global matrices and generalized change of variables Implementation subdomain by subdomain with explicit coarse problem Numerical results Conclusion

18 The coarse space in BDDC In implementation, space W may be split into independent subdomain spaces and energy-orthogonal coarse space. On each subdomain coarse degrees of freedom basis functions Ψ i prescribed values of coarse degrees of freedom, minimal energy elsewhere, [ Ai C T i C i 0 ] [ Ψi Λ i ] = A i... local subdomain stiffness matrix [ 0 I C i... matrix of constraints selects unknowns into coarse degrees of freedom Matrix of coarse problem A C assembled from local matrices A Ci = Ψ T i A i Ψ i. ].

19 The coarse space in BDDC a function from coarse space coarse basis function

20 An iteration of BDDC 1. Residual on interface r (k) = g Sû (k) S Schur complement w.r.t. interface, g condensed r.h.s. 2. Distribution of residual local problems coarse problem for i = 1,..., N N r i = Ei T r (k) r C = RCi T ΨT i Ei T r (k) 3. Correction of solution i=1 [ Ai Ci T ] [ ] [ ] ui ri = C i 0 µ i 0 A C u C = r C 4. New approximation û = N E i (Ψ i R Ci u C + u i ), û (k+1) = û (k) + û i=1

21 Outline Some motivation for domain decomposition BDDC method Implementation by global matrices and generalized change of variables Implementation subdomain by subdomain with explicit coarse problem Numerical results Conclusion

22 Parallel implementation global matrices and generalized change of variables built on top of multifrontal solver MUMPS Fortran 90 programming language, MPI library subdomain by subdomain with explicit coarse problem built on top of frontal solver (subdomains) Fortran programming language, MPI library coarse problem by multifrontal solver MUMPS Tested on SGI Altix 4700, CTU, Prague, CR 72 processors Intel Itanium 2, OS Linux.

23 Steam turbine nozzle quadratic elements, unknowns 16 subdomains by METIS, 77 corners, 14 edges, and 30 faces

24 Comparison of implementations steam turbine nozzle 16 subdomains, 16 processors of SGI Altix 4700 implementation MUMPS frontal solver constraints C C+E+F C C+E+F iterations cond. number est transforming matrix (sec) analysis (sec) factorization (sec) PCG iter (sec) total (sec) C... Corners, E... averages on Edges, F... averages on Faces

25 Hip joint replacement quadratic elements, unknowns 16 subdomains by METIS, 101 corners, 12 edges, and 35 faces

26 Comparison of implementations hip joint replacement 16 processors of SGI Altix 4700 implementation MUMPS frontal solver constraints C C+E+F C C+E+F iterations cond. number est transforming matrix (sec) analysis (sec) factorization (sec) PCG iter (sec) total (sec) C... Corners, E... averages on Edges, F... averages on Faces

27 Outline Some motivation for domain decomposition BDDC method Implementation by global matrices and generalized change of variables Implementation subdomain by subdomain with explicit coarse problem Numerical results Conclusion

28 Conclusion global matrices and generalized change of variables allows simple implementation parallelism by parallel direct solver bounded by problem size subdomain by subdomain with explicit coarse problem less straightforward (explicit coarse problem, mapping,... ) allows different handling of global coarse problem (MUMPS on coarse, multilevel extensions (ongoing research),... ) common distinguish between point constraints and averages for implementation constraits on edges and/or faces can considerably shorten the solution time averages on faces might be too expensive more sophisticated way for selection of weighted averages adaptive BDDC (ongoing research)

A Parallel Implementation of the BDDC Method for Linear Elasticity

A Parallel Implementation of the BDDC Method for Linear Elasticity A Parallel Implementation of the BDDC Method for Linear Elasticity Jakub Šístek joint work with P. Burda, M. Čertíková, J. Mandel, J. Novotný, B. Sousedík Institute of Mathematics of the AS CR, Prague

More information

Primal methods of iterative substructuring

Primal methods of iterative substructuring Primal methods of iterative substructuring Jakub Šístek Institute of Mathematics of the AS CR, Prague Programs and Algorithms of Numerical Mathematics 16 June 5th, 2012 Co-workers Presentation based on

More information

arxiv: v2 [math.na] 19 Mar 2011

arxiv: v2 [math.na] 19 Mar 2011 Face-based Selection of Corners in 3D Substructuring Jakub Šísteka,, Marta Čertíkováb, Pavel Burda b, Jaroslav Novotný c arxiv:0911.1245v2 [math.na] 19 Mar 2011 Abstract a Institute of Mathematics, Academy

More information

The effect of irregular interfaces on the BDDC method for the Navier-Stokes equations

The effect of irregular interfaces on the BDDC method for the Navier-Stokes equations 153 The effect of irregular interfaces on the BDDC method for the Navier-Stokes equations Martin Hanek 1, Jakub Šístek 2,3 and Pavel Burda 1 1 Introduction The Balancing Domain Decomposition based on Constraints

More information

arxiv: v2 [math.na] 15 May 2010

arxiv: v2 [math.na] 15 May 2010 Coarse spaces over the ages arxiv:0911.5725v2 [math.na] 15 May 2010 Jan Mandel 1 and Bedřich Sousedík 1,2 1 Department of Mathematical and Statistical Sciences, University of Colorado Denver, Campus Box

More information

BDDCML. solver library based on Multi-Level Balancing Domain Decomposition by Constraints copyright (C) Jakub Šístek version 1.

BDDCML. solver library based on Multi-Level Balancing Domain Decomposition by Constraints copyright (C) Jakub Šístek version 1. BDDCML solver library based on Multi-Level Balancing Domain Decomposition by Constraints copyright (C) 2010-2012 Jakub Šístek version 1.3 Jakub Šístek i Table of Contents 1 Introduction.....................................

More information

Contents. I The Basic Framework for Stationary Problems 1

Contents. I The Basic Framework for Stationary Problems 1 page v Preface xiii I The Basic Framework for Stationary Problems 1 1 Some model PDEs 3 1.1 Laplace s equation; elliptic BVPs... 3 1.1.1 Physical experiments modeled by Laplace s equation... 5 1.2 Other

More information

Some Computational Results for Dual-Primal FETI Methods for Elliptic Problems in 3D

Some Computational Results for Dual-Primal FETI Methods for Elliptic Problems in 3D Some Computational Results for Dual-Primal FETI Methods for Elliptic Problems in 3D Axel Klawonn 1, Oliver Rheinbach 1, and Olof B. Widlund 2 1 Universität Duisburg-Essen, Campus Essen, Fachbereich Mathematik

More information

Parallel High-Order Geometric Multigrid Methods on Adaptive Meshes for Highly Heterogeneous Nonlinear Stokes Flow Simulations of Earth s Mantle

Parallel High-Order Geometric Multigrid Methods on Adaptive Meshes for Highly Heterogeneous Nonlinear Stokes Flow Simulations of Earth s Mantle ICES Student Forum The University of Texas at Austin, USA November 4, 204 Parallel High-Order Geometric Multigrid Methods on Adaptive Meshes for Highly Heterogeneous Nonlinear Stokes Flow Simulations of

More information

Null space computation of sparse singular matrices with MUMPS

Null space computation of sparse singular matrices with MUMPS Null space computation of sparse singular matrices with MUMPS Xavier Vasseur (CERFACS) In collaboration with Patrick Amestoy (INPT-IRIT, University of Toulouse and ENSEEIHT), Serge Gratton (INPT-IRIT,

More information

Elliptic model problem, finite elements, and geometry

Elliptic model problem, finite elements, and geometry Thirteenth International Conference on Domain Decomposition Methods Editors: N. Debit, M.Garbey, R. Hoppe, J. Périaux, D. Keyes, Y. Kuznetsov c 200 DDM.org 4 FETI-DP Methods for Elliptic Problems with

More information

SCALABLE ALGORITHMS for solving large sparse linear systems of equations

SCALABLE ALGORITHMS for solving large sparse linear systems of equations SCALABLE ALGORITHMS for solving large sparse linear systems of equations CONTENTS Sparse direct solvers (multifrontal) Substructuring methods (hybrid solvers) Jacko Koster, Bergen Center for Computational

More information

Scheduling Strategies for Parallel Sparse Backward/Forward Substitution

Scheduling Strategies for Parallel Sparse Backward/Forward Substitution Scheduling Strategies for Parallel Sparse Backward/Forward Substitution J.I. Aliaga M. Bollhöfer A.F. Martín E.S. Quintana-Ortí Deparment of Computer Science and Engineering, Univ. Jaume I (Spain) {aliaga,martina,quintana}@icc.uji.es

More information

ESPRESO ExaScale PaRallel FETI Solver. Hybrid FETI Solver Report

ESPRESO ExaScale PaRallel FETI Solver. Hybrid FETI Solver Report ESPRESO ExaScale PaRallel FETI Solver Hybrid FETI Solver Report Lubomir Riha, Tomas Brzobohaty IT4Innovations Outline HFETI theory from FETI to HFETI communication hiding and avoiding techniques our new

More information

PROGRAMMING OF MULTIGRID METHODS

PROGRAMMING OF MULTIGRID METHODS PROGRAMMING OF MULTIGRID METHODS LONG CHEN In this note, we explain the implementation detail of multigrid methods. We will use the approach by space decomposition and subspace correction method; see Chapter:

More information

THE MORTAR FINITE ELEMENT METHOD IN 2D: IMPLEMENTATION IN MATLAB

THE MORTAR FINITE ELEMENT METHOD IN 2D: IMPLEMENTATION IN MATLAB THE MORTAR FINITE ELEMENT METHOD IN D: IMPLEMENTATION IN MATLAB J. Daněk, H. Kutáková Department of Mathematics, University of West Bohemia, Pilsen MECAS ESI s.r.o., Pilsen Abstract The paper is focused

More information

GRAPH CENTERS USED FOR STABILIZATION OF MATRIX FACTORIZATIONS

GRAPH CENTERS USED FOR STABILIZATION OF MATRIX FACTORIZATIONS Discussiones Mathematicae Graph Theory 30 (2010 ) 349 359 GRAPH CENTERS USED FOR STABILIZATION OF MATRIX FACTORIZATIONS Pavla Kabelíková Department of Applied Mathematics FEI, VSB Technical University

More information

ME964 High Performance Computing for Engineering Applications

ME964 High Performance Computing for Engineering Applications ME964 High Performance Computing for Engineering Applications Outlining Midterm Projects Topic 3: GPU-based FEA Topic 4: GPU Direct Solver for Sparse Linear Algebra March 01, 2011 Dan Negrut, 2011 ME964

More information

Advanced Numerical Techniques for Cluster Computing

Advanced Numerical Techniques for Cluster Computing Advanced Numerical Techniques for Cluster Computing Presented by Piotr Luszczek http://icl.cs.utk.edu/iter-ref/ Presentation Outline Motivation hardware Dense matrix calculations Sparse direct solvers

More information

FETI Coarse Problem Parallelization Strategies and Their Comparison

FETI Coarse Problem Parallelization Strategies and Their Comparison Available on-line at www.prace-ri.eu Partnership for Advanced Computing in Europe FETI Coarse Problem Parallelization Strategies and Their Comparison T. Kozubek a,, D. Horak a, V. Hapla a a CE IT4Innovations,

More information

GPU Cluster Computing for FEM

GPU Cluster Computing for FEM GPU Cluster Computing for FEM Dominik Göddeke Sven H.M. Buijssen, Hilmar Wobker and Stefan Turek Angewandte Mathematik und Numerik TU Dortmund, Germany dominik.goeddeke@math.tu-dortmund.de GPU Computing

More information

The 3D DSC in Fluid Simulation

The 3D DSC in Fluid Simulation The 3D DSC in Fluid Simulation Marek K. Misztal Informatics and Mathematical Modelling, Technical University of Denmark mkm@imm.dtu.dk DSC 2011 Workshop Kgs. Lyngby, 26th August 2011 Governing Equations

More information

Overlapping Domain Decomposition Methods

Overlapping Domain Decomposition Methods Overlapping Domain Decomposition Methods X. Cai 1,2 1 Simula Research Laboratory 2 Department of Informatics, University of Oslo Abstract. Overlapping domain decomposition methods are efficient and flexible.

More information

Preparation of Codes for Trinity

Preparation of Codes for Trinity Preparation of Codes for Trinity Courtenay T. Vaughan, Mahesh Rajan, Dennis C. Dinge, Clark R. Dohrmann, Micheal W. Glass, Kenneth J. Franko, Kendall H. Pierson, and Michael R. Tupek Sandia National Laboratories

More information

Parallel resolution of sparse linear systems by mixing direct and iterative methods

Parallel resolution of sparse linear systems by mixing direct and iterative methods Parallel resolution of sparse linear systems by mixing direct and iterative methods Phyleas Meeting, Bordeaux J. Gaidamour, P. Hénon, J. Roman, Y. Saad LaBRI and INRIA Bordeaux - Sud-Ouest (ScAlApplix

More information

Speedup Altair RADIOSS Solvers Using NVIDIA GPU

Speedup Altair RADIOSS Solvers Using NVIDIA GPU Innovation Intelligence Speedup Altair RADIOSS Solvers Using NVIDIA GPU Eric LEQUINIOU, HPC Director Hongwei Zhou, Senior Software Developer May 16, 2012 Innovation Intelligence ALTAIR OVERVIEW Altair

More information

Simulating tsunami propagation on parallel computers using a hybrid software framework

Simulating tsunami propagation on parallel computers using a hybrid software framework Simulating tsunami propagation on parallel computers using a hybrid software framework Xing Simula Research Laboratory, Norway Department of Informatics, University of Oslo March 12, 2007 Outline Intro

More information

A parallel direct/iterative solver based on a Schur complement approach

A parallel direct/iterative solver based on a Schur complement approach A parallel direct/iterative solver based on a Schur complement approach Gene around the world at CERFACS Jérémie Gaidamour LaBRI and INRIA Bordeaux - Sud-Ouest (ScAlApplix project) February 29th, 2008

More information

GPU Acceleration of Unmodified CSM and CFD Solvers

GPU Acceleration of Unmodified CSM and CFD Solvers GPU Acceleration of Unmodified CSM and CFD Solvers Dominik Göddeke Sven H.M. Buijssen, Hilmar Wobker and Stefan Turek Angewandte Mathematik und Numerik TU Dortmund, Germany dominik.goeddeke@math.tu-dortmund.de

More information

smooth coefficients H. Köstler, U. Rüde

smooth coefficients H. Köstler, U. Rüde A robust multigrid solver for the optical flow problem with non- smooth coefficients H. Köstler, U. Rüde Overview Optical Flow Problem Data term and various regularizers A Robust Multigrid Solver Galerkin

More information

Modelling and Simulation of Complex GeoTechnical Problems

Modelling and Simulation of Complex GeoTechnical Problems Modelling and Simulation of Complex GeoTechnical Problems R. Blaheta Institute of Geonics AS CR, v.v.i., Ostrava, Czech Republic 1 Introduction This paper provides three examples of topical geotechnical

More information

Numerical Implementation of Overlapping Balancing Domain Decomposition Methods on Unstructured Meshes

Numerical Implementation of Overlapping Balancing Domain Decomposition Methods on Unstructured Meshes Numerical Implementation of Overlapping Balancing Domain Decomposition Methods on Unstructured Meshes Jung-Han Kimn 1 and Blaise Bourdin 2 1 Department of Mathematics and The Center for Computation and

More information

HIPS : a parallel hybrid direct/iterative solver based on a Schur complement approach

HIPS : a parallel hybrid direct/iterative solver based on a Schur complement approach HIPS : a parallel hybrid direct/iterative solver based on a Schur complement approach Mini-workshop PHyLeaS associated team J. Gaidamour, P. Hénon July 9, 28 HIPS : an hybrid direct/iterative solver /

More information

Research Article A PETSc-Based Parallel Implementation of Finite Element Method for Elasticity Problems

Research Article A PETSc-Based Parallel Implementation of Finite Element Method for Elasticity Problems Mathematical Problems in Engineering Volume 2015, Article ID 147286, 7 pages http://dx.doi.org/10.1155/2015/147286 Research Article A PETSc-Based Parallel Implementation of Finite Element Method for Elasticity

More information

FOR P3: A monolithic multigrid FEM solver for fluid structure interaction

FOR P3: A monolithic multigrid FEM solver for fluid structure interaction FOR 493 - P3: A monolithic multigrid FEM solver for fluid structure interaction Stefan Turek 1 Jaroslav Hron 1,2 Hilmar Wobker 1 Mudassar Razzaq 1 1 Institute of Applied Mathematics, TU Dortmund, Germany

More information

Efficient Finite Element Geometric Multigrid Solvers for Unstructured Grids on GPUs

Efficient Finite Element Geometric Multigrid Solvers for Unstructured Grids on GPUs Efficient Finite Element Geometric Multigrid Solvers for Unstructured Grids on GPUs Markus Geveler, Dirk Ribbrock, Dominik Göddeke, Peter Zajac, Stefan Turek Institut für Angewandte Mathematik TU Dortmund,

More information

Performance Comparison between Blocking and Non-Blocking Communications for a Three-Dimensional Poisson Problem

Performance Comparison between Blocking and Non-Blocking Communications for a Three-Dimensional Poisson Problem Performance Comparison between Blocking and Non-Blocking Communications for a Three-Dimensional Poisson Problem Guan Wang and Matthias K. Gobbert Department of Mathematics and Statistics, University of

More information

problems on. Of course, additional interface conditions must be

problems on. Of course, additional interface conditions must be Thirteenth International Conference on Domain Decomposition Methods Editors: N. Debit, M.Garbey, R. Hoppe, J. Périaux, D. Keyes, Y. Kuznetsov c 2001 DDM.org 53 Schur Complement Based Preconditioners for

More information

On Level Scheduling for Incomplete LU Factorization Preconditioners on Accelerators

On Level Scheduling for Incomplete LU Factorization Preconditioners on Accelerators On Level Scheduling for Incomplete LU Factorization Preconditioners on Accelerators Karl Rupp, Barry Smith rupp@mcs.anl.gov Mathematics and Computer Science Division Argonne National Laboratory FEMTEC

More information

Parallel Mesh Partitioning in Alya

Parallel Mesh Partitioning in Alya Available online at www.prace-ri.eu Partnership for Advanced Computing in Europe Parallel Mesh Partitioning in Alya A. Artigues a *** and G. Houzeaux a* a Barcelona Supercomputing Center ***antoni.artigues@bsc.es

More information

2 Fundamentals of Serial Linear Algebra

2 Fundamentals of Serial Linear Algebra . Direct Solution of Linear Systems.. Gaussian Elimination.. LU Decomposition and FBS..3 Cholesky Decomposition..4 Multifrontal Methods. Iterative Solution of Linear Systems.. Jacobi Method Fundamentals

More information

SUMMARY. solve the matrix system using iterative solvers. We use the MUMPS codes and distribute the computation over many different

SUMMARY. solve the matrix system using iterative solvers. We use the MUMPS codes and distribute the computation over many different Forward Modelling and Inversion of Multi-Source TEM Data D. W. Oldenburg 1, E. Haber 2, and R. Shekhtman 1 1 University of British Columbia, Department of Earth & Ocean Sciences 2 Emory University, Atlanta,

More information

Transaction of JSCES, Paper No Parallel Finite Element Analysis in Large Scale Shell Structures using CGCG Solver

Transaction of JSCES, Paper No Parallel Finite Element Analysis in Large Scale Shell Structures using CGCG Solver Transaction of JSCES, Paper No. 257 * Parallel Finite Element Analysis in Large Scale Shell Structures using Solver 1 2 3 Shohei HORIUCHI, Hirohisa NOGUCHI and Hiroshi KAWAI 1 24-62 22-4 2 22 3-14-1 3

More information

Numerical methods for volume preserving image registration

Numerical methods for volume preserving image registration Numerical methods for volume preserving image registration SIAM meeting on imaging science 2004 Eldad Haber with J. Modersitzki haber@mathcs.emory.edu Emory University Volume preserving IR p.1/?? Outline

More information

Numerical schemes for Hamilton-Jacobi equations, control problems and games

Numerical schemes for Hamilton-Jacobi equations, control problems and games Numerical schemes for Hamilton-Jacobi equations, control problems and games M. Falcone H. Zidani SADCO Spring School Applied and Numerical Optimal Control April 23-27, 2012, Paris Lecture 2/3 M. Falcone

More information

The numerical simulation of complex PDE problems. A numerical simulation project The finite element method for solving a boundary-value problem in R 2

The numerical simulation of complex PDE problems. A numerical simulation project The finite element method for solving a boundary-value problem in R 2 Universidad de Chile The numerical simulation of complex PDE problems Facultad de Ciencias Físicas y Matemáticas P. Frey, M. De Buhan Year 2008 MA691 & CC60X A numerical simulation project The finite element

More information

Massively Parallel Finite Element Simulations with deal.ii

Massively Parallel Finite Element Simulations with deal.ii Massively Parallel Finite Element Simulations with deal.ii Timo Heister, Texas A&M University 2012-02-16 SIAM PP2012 joint work with: Wolfgang Bangerth, Carsten Burstedde, Thomas Geenen, Martin Kronbichler

More information

Combinatorial problems in a Parallel Hybrid Linear Solver

Combinatorial problems in a Parallel Hybrid Linear Solver Combinatorial problems in a Parallel Hybrid Linear Solver Ichitaro Yamazaki and Xiaoye Li Lawrence Berkeley National Laboratory François-Henry Rouet and Bora Uçar ENSEEIHT-IRIT and LIP, ENS-Lyon SIAM workshop

More information

Nodal Basis Functions for Serendipity Finite Elements

Nodal Basis Functions for Serendipity Finite Elements Nodal Basis Functions for Serendipity Finite Elements Andrew Gillette Department of Mathematics University of Arizona joint work with Michael Floater (University of Oslo) Andrew Gillette - U. Arizona Nodal

More information

Parallel implementation of Total-FETI DDM with application to medical image registration

Parallel implementation of Total-FETI DDM with application to medical image registration Parallel implementation of Total-FETI DDM with application to medical image registration Michal Merta 1, Alena Vašatová 1, Václav Hapla 1, and David Horák 1 Key words: Total-FETI, image registration, parallel

More information

Sparse Multifrontal Performance Gains via NVIDIA GPU January 16, 2009

Sparse Multifrontal Performance Gains via NVIDIA GPU January 16, 2009 Sparse Multifrontal Performance Gains via NVIDIA GPU January 16, 2009 Dan l Pierce, PhD, MBA, CEO & President AAI Joint with: Yukai Hung, Chia-Chi Liu, Yao-Hung Tsai, Weichung Wang, and David Yu Access

More information

Memory Efficient Adaptive Mesh Generation and Implementation of Multigrid Algorithms Using Sierpinski Curves

Memory Efficient Adaptive Mesh Generation and Implementation of Multigrid Algorithms Using Sierpinski Curves Memory Efficient Adaptive Mesh Generation and Implementation of Multigrid Algorithms Using Sierpinski Curves Michael Bader TU München Stefanie Schraufstetter TU München Jörn Behrens AWI Bremerhaven Abstract

More information

A direct solver with reutilization of previously-computed LU factorizations for h-adaptive finite element grids with point singularities

A direct solver with reutilization of previously-computed LU factorizations for h-adaptive finite element grids with point singularities Maciej Paszynski AGH University of Science and Technology Faculty of Computer Science, Electronics and Telecommunication Department of Computer Science al. A. Mickiewicza 30 30-059 Krakow, Poland tel.

More information

Multi-GPU Scaling of Direct Sparse Linear System Solver for Finite-Difference Frequency-Domain Photonic Simulation

Multi-GPU Scaling of Direct Sparse Linear System Solver for Finite-Difference Frequency-Domain Photonic Simulation Multi-GPU Scaling of Direct Sparse Linear System Solver for Finite-Difference Frequency-Domain Photonic Simulation 1 Cheng-Han Du* I-Hsin Chung** Weichung Wang* * I n s t i t u t e o f A p p l i e d M

More information

An introduction to mesh generation Part IV : elliptic meshing

An introduction to mesh generation Part IV : elliptic meshing Elliptic An introduction to mesh generation Part IV : elliptic meshing Department of Civil Engineering, Université catholique de Louvain, Belgium Elliptic Curvilinear Meshes Basic concept A curvilinear

More information

Towards a complete FEM-based simulation toolkit on GPUs: Geometric Multigrid solvers

Towards a complete FEM-based simulation toolkit on GPUs: Geometric Multigrid solvers Towards a complete FEM-based simulation toolkit on GPUs: Geometric Multigrid solvers Markus Geveler, Dirk Ribbrock, Dominik Göddeke, Peter Zajac, Stefan Turek Institut für Angewandte Mathematik TU Dortmund,

More information

Recent developments in the solution of indefinite systems Location: De Zwarte Doos (TU/e campus)

Recent developments in the solution of indefinite systems Location: De Zwarte Doos (TU/e campus) 1-day workshop, TU Eindhoven, April 17, 2012 Recent developments in the solution of indefinite systems Location: De Zwarte Doos (TU/e campus) :10.25-10.30: Opening and word of welcome 10.30-11.15: Michele

More information

Application of GPU-Based Computing to Large Scale Finite Element Analysis of Three-Dimensional Structures

Application of GPU-Based Computing to Large Scale Finite Element Analysis of Three-Dimensional Structures Paper 6 Civil-Comp Press, 2012 Proceedings of the Eighth International Conference on Engineering Computational Technology, B.H.V. Topping, (Editor), Civil-Comp Press, Stirlingshire, Scotland Application

More information

The Immersed Interface Method

The Immersed Interface Method The Immersed Interface Method Numerical Solutions of PDEs Involving Interfaces and Irregular Domains Zhiiin Li Kazufumi Ito North Carolina State University Raleigh, North Carolina Society for Industrial

More information

What is Multigrid? They have been extended to solve a wide variety of other problems, linear and nonlinear.

What is Multigrid? They have been extended to solve a wide variety of other problems, linear and nonlinear. AMSC 600/CMSC 760 Fall 2007 Solution of Sparse Linear Systems Multigrid, Part 1 Dianne P. O Leary c 2006, 2007 What is Multigrid? Originally, multigrid algorithms were proposed as an iterative method to

More information

Automated Finite Element Computations in the FEniCS Framework using GPUs

Automated Finite Element Computations in the FEniCS Framework using GPUs Automated Finite Element Computations in the FEniCS Framework using GPUs Florian Rathgeber (f.rathgeber10@imperial.ac.uk) Advanced Modelling and Computation Group (AMCG) Department of Earth Science & Engineering

More information

AMS527: Numerical Analysis II

AMS527: Numerical Analysis II AMS527: Numerical Analysis II A Brief Overview of Finite Element Methods Xiangmin Jiao SUNY Stony Brook Xiangmin Jiao SUNY Stony Brook AMS527: Numerical Analysis II 1 / 25 Overview Basic concepts Mathematical

More information

A NEW MIXED PRECONDITIONING METHOD BASED ON THE CLUSTERED ELEMENT -BY -ELEMENT PRECONDITIONERS

A NEW MIXED PRECONDITIONING METHOD BASED ON THE CLUSTERED ELEMENT -BY -ELEMENT PRECONDITIONERS Contemporary Mathematics Volume 157, 1994 A NEW MIXED PRECONDITIONING METHOD BASED ON THE CLUSTERED ELEMENT -BY -ELEMENT PRECONDITIONERS T.E. Tezduyar, M. Behr, S.K. Aliabadi, S. Mittal and S.E. Ray ABSTRACT.

More information

Parallel Sum Primal Spaces for Isogeometric Deluxe BDDC Preconditioners

Parallel Sum Primal Spaces for Isogeometric Deluxe BDDC Preconditioners 13 Parallel Sum Primal Spaces for Isogeometric Deluxe BDDC Preconditioners L. Beirão da Veiga 1, L. F. Pavarino 2, S. Scacchi 2, O. B. Widlund 3, and S. Zampini 4 1 Introduction In this paper, we study

More information

6 Implementation of Parallel FE Systems

6 Implementation of Parallel FE Systems 6 Implementation of Parallel FE Systems 6.1 Implementation of Domain Decomposition in MSC.NASTRAN V70.7 6.2 Further Parallel Features of MSC.NASTRAN V70.7 6.2.1 Parallel Normal Modes Analysis 6.2.2 Parallel

More information

Preconditioning Linear Systems Arising from Graph Laplacians of Complex Networks

Preconditioning Linear Systems Arising from Graph Laplacians of Complex Networks Preconditioning Linear Systems Arising from Graph Laplacians of Complex Networks Kevin Deweese 1 Erik Boman 2 1 Department of Computer Science University of California, Santa Barbara 2 Scalable Algorithms

More information

MaPHyS, a sparse hybrid linear solver and preliminary complexity analysis

MaPHyS, a sparse hybrid linear solver and preliminary complexity analysis MaPHyS, a sparse hybrid linear solver and preliminary complexity analysis Emmanuel AGULLO, Luc GIRAUD, Abdou GUERMOUCHE, Azzam HAIDAR, Jean ROMAN INRIA Bordeaux Sud Ouest CERFACS Université de Bordeaux

More information

CHAPTER 1. Introduction

CHAPTER 1. Introduction ME 475: Computer-Aided Design of Structures 1-1 CHAPTER 1 Introduction 1.1 Analysis versus Design 1.2 Basic Steps in Analysis 1.3 What is the Finite Element Method? 1.4 Geometrical Representation, Discretization

More information

Computational methods for inverse problems

Computational methods for inverse problems Computational methods for inverse problems Eldad Haber contributors: Stefan Heldmann, Lauren Hanson, Lior Horesh, Jan Modersitzki, Uri Ascher, Doug Oldenburg Sponsored by: DOE and NSF Motivation and examples

More information

HYBRID DIRECT AND ITERATIVE SOLVER FOR H ADAPTIVE MESHES WITH POINT SINGULARITIES

HYBRID DIRECT AND ITERATIVE SOLVER FOR H ADAPTIVE MESHES WITH POINT SINGULARITIES HYBRID DIRECT AND ITERATIVE SOLVER FOR H ADAPTIVE MESHES WITH POINT SINGULARITIES Maciej Paszyński Department of Computer Science, AGH University of Science and Technology, Kraków, Poland email: paszynsk@agh.edu.pl

More information

Study and implementation of computational methods for Differential Equations in heterogeneous systems. Asimina Vouronikoy - Eleni Zisiou

Study and implementation of computational methods for Differential Equations in heterogeneous systems. Asimina Vouronikoy - Eleni Zisiou Study and implementation of computational methods for Differential Equations in heterogeneous systems Asimina Vouronikoy - Eleni Zisiou Outline Introduction Review of related work Cyclic Reduction Algorithm

More information

A substructure based parallel dynamic solution of large systems on homogeneous PC clusters

A substructure based parallel dynamic solution of large systems on homogeneous PC clusters CHALLENGE JOURNAL OF STRUCTURAL MECHANICS 1 (4) (2015) 156 160 A substructure based parallel dynamic solution of large systems on homogeneous PC clusters Semih Özmen, Tunç Bahçecioğlu, Özgür Kurç * Department

More information

Adaptive boundary element methods in industrial applications

Adaptive boundary element methods in industrial applications Adaptive boundary element methods in industrial applications Günther Of, Olaf Steinbach, Wolfgang Wendland Adaptive Fast Boundary Element Methods in Industrial Applications, Hirschegg, September 30th,

More information

Numerical Study of the Almost Nested Case in a 2 Multilevel Method Based on Non-nested Meshes 3 UNCORRECTED PROOF

Numerical Study of the Almost Nested Case in a 2 Multilevel Method Based on Non-nested Meshes 3 UNCORRECTED PROOF 1 Numerical Study of the Almost Nested Case in a 2 Multilevel Method Based on Non-nested Meshes 3 Thomas Dickopf and Rolf Krause 4 University of Lugano, Institute of Computational Science, Via G. Buffi

More information

A Study of Prolongation Operators Between Non-nested Meshes

A Study of Prolongation Operators Between Non-nested Meshes A Study of Prolongation Operators Between Non-nested Meshes Thomas Dickopf 1 and Rolf Krause 2 1 Institute for Numerical Simulation, University of Bonn, 53115 Bonn, Germany, dickopf@ins.uni-bonn.de 2 Institute

More information

Partitioning and Partitioning Tools. Tim Barth NASA Ames Research Center Moffett Field, California USA

Partitioning and Partitioning Tools. Tim Barth NASA Ames Research Center Moffett Field, California USA Partitioning and Partitioning Tools Tim Barth NASA Ames Research Center Moffett Field, California 94035-00 USA 1 Graph/Mesh Partitioning Why do it? The graph bisection problem What are the standard heuristic

More information

Introduction to Multigrid and its Parallelization

Introduction to Multigrid and its Parallelization Introduction to Multigrid and its Parallelization! Thomas D. Economon Lecture 14a May 28, 2014 Announcements 2 HW 1 & 2 have been returned. Any questions? Final projects are due June 11, 5 pm. If you are

More information

Solving large systems on HECToR using the 2-lagrange multiplier methods Karangelis, Anastasios; Loisel, Sebastien; Maynard, Chris

Solving large systems on HECToR using the 2-lagrange multiplier methods Karangelis, Anastasios; Loisel, Sebastien; Maynard, Chris Heriot-Watt University Heriot-Watt University Research Gateway Solving large systems on HECToR using the 2-lagrange multiplier methods Karangelis, Anastasios; Loisel, Sebastien; Maynard, Chris Published

More information

Domain Decomposition and hp-adaptive Finite Elements

Domain Decomposition and hp-adaptive Finite Elements Domain Decomposition and hp-adaptive Finite Elements Randolph E. Bank 1 and Hieu Nguyen 1 Department of Mathematics, University of California, San Diego, La Jolla, CA 9093-011, USA, rbank@ucsd.edu. Department

More information

Overview of Trilinos and PT-Scotch

Overview of Trilinos and PT-Scotch 29.03.2012 Outline PT-Scotch 1 PT-Scotch The Dual Recursive Bipartitioning Algorithm Parallel Graph Bipartitioning Methods 2 Overview of the Trilinos Packages Examples on using Trilinos PT-Scotch The Scotch

More information

Using multifrontal hierarchically solver and HPC systems for 3D Helmholtz problem

Using multifrontal hierarchically solver and HPC systems for 3D Helmholtz problem Using multifrontal hierarchically solver and HPC systems for 3D Helmholtz problem Sergey Solovyev 1, Dmitry Vishnevsky 1, Hongwei Liu 2 Institute of Petroleum Geology and Geophysics SB RAS 1 EXPEC ARC,

More information

Data mining with sparse grids using simplicial basis functions

Data mining with sparse grids using simplicial basis functions Data mining with sparse grids using simplicial basis functions Jochen Garcke and Michael Griebel Institut für Angewandte Mathematik Universität Bonn Part of the work was supported within the project 03GRM6BN

More information

High Order Nédélec Elements with local complete sequence properties

High Order Nédélec Elements with local complete sequence properties High Order Nédélec Elements with local complete sequence properties Joachim Schöberl and Sabine Zaglmayr Institute for Computational Mathematics, Johannes Kepler University Linz, Austria E-mail: {js,sz}@jku.at

More information

Preprint ISSN

Preprint ISSN Fakultät für Mathematik und Informatik Preprint 216-4 Alexander Heinlein, Axel Klawonn, and Oliver Rheinbach A Parallel Implementation of a Two- Level Overlapping Schwarz Method with Energy-Minimizing

More information

Finite Element Implementation

Finite Element Implementation Chapter 8 Finite Element Implementation 8.1 Elements Elements andconditions are the main extension points of Kratos. New formulations can be introduced into Kratos by implementing a new Element and its

More information

Matrix-free IPM with GPU acceleration

Matrix-free IPM with GPU acceleration Matrix-free IPM with GPU acceleration Julian Hall, Edmund Smith and Jacek Gondzio School of Mathematics University of Edinburgh jajhall@ed.ac.uk 29th June 2011 Linear programming theory Primal-dual pair

More information

SELECTIVE ALGEBRAIC MULTIGRID IN FOAM-EXTEND

SELECTIVE ALGEBRAIC MULTIGRID IN FOAM-EXTEND Student Submission for the 5 th OpenFOAM User Conference 2017, Wiesbaden - Germany: SELECTIVE ALGEBRAIC MULTIGRID IN FOAM-EXTEND TESSA UROIĆ Faculty of Mechanical Engineering and Naval Architecture, Ivana

More information

Outline. Parallel Algorithms for Linear Algebra. Number of Processors and Problem Size. Speedup and Efficiency

Outline. Parallel Algorithms for Linear Algebra. Number of Processors and Problem Size. Speedup and Efficiency 1 2 Parallel Algorithms for Linear Algebra Richard P. Brent Computer Sciences Laboratory Australian National University Outline Basic concepts Parallel architectures Practical design issues Programming

More information

AMS526: Numerical Analysis I (Numerical Linear Algebra)

AMS526: Numerical Analysis I (Numerical Linear Algebra) AMS526: Numerical Analysis I (Numerical Linear Algebra) Lecture 20: Sparse Linear Systems; Direct Methods vs. Iterative Methods Xiangmin Jiao SUNY Stony Brook Xiangmin Jiao Numerical Analysis I 1 / 26

More information

High Performance Computing for PDE Towards Petascale Computing

High Performance Computing for PDE Towards Petascale Computing High Performance Computing for PDE Towards Petascale Computing S. Turek, D. Göddeke with support by: Chr. Becker, S. Buijssen, M. Grajewski, H. Wobker Institut für Angewandte Mathematik, Univ. Dortmund

More information

Solution of 2D Euler Equations and Application to Airfoil Design

Solution of 2D Euler Equations and Application to Airfoil Design WDS'6 Proceedings of Contributed Papers, Part I, 47 52, 26. ISBN 8-86732-84-3 MATFYZPRESS Solution of 2D Euler Equations and Application to Airfoil Design J. Šimák Charles University, Faculty of Mathematics

More information

A Finite Element Method for Deformable Models

A Finite Element Method for Deformable Models A Finite Element Method for Deformable Models Persephoni Karaolani, G.D. Sullivan, K.D. Baker & M.J. Baines Intelligent Systems Group, Department of Computer Science University of Reading, RG6 2AX, UK,

More information

Iterative Algorithms I: Elementary Iterative Methods and the Conjugate Gradient Algorithms

Iterative Algorithms I: Elementary Iterative Methods and the Conjugate Gradient Algorithms Iterative Algorithms I: Elementary Iterative Methods and the Conjugate Gradient Algorithms By:- Nitin Kamra Indian Institute of Technology, Delhi Advisor:- Prof. Ulrich Reude 1. Introduction to Linear

More information

Application of A Priori Error Estimates for Navier-Stokes Equations to Accurate Finite Element Solution

Application of A Priori Error Estimates for Navier-Stokes Equations to Accurate Finite Element Solution Application of A Priori Error Estimates for Navier-Stokes Equations to Accurate Finite Element Solution P. BURDA a,, J. NOVOTNÝ b,, J. ŠÍSTE a, a Department of Mathematics Czech University of Technology

More information

Space Filling Curves and Hierarchical Basis. Klaus Speer

Space Filling Curves and Hierarchical Basis. Klaus Speer Space Filling Curves and Hierarchical Basis Klaus Speer Abstract Real world phenomena can be best described using differential equations. After linearisation we have to deal with huge linear systems of

More information

Lecture 15: More Iterative Ideas

Lecture 15: More Iterative Ideas Lecture 15: More Iterative Ideas David Bindel 15 Mar 2010 Logistics HW 2 due! Some notes on HW 2. Where we are / where we re going More iterative ideas. Intro to HW 3. More HW 2 notes See solution code!

More information

An Hybrid Approach for the Parallelization of a Block Iterative Algorithm

An Hybrid Approach for the Parallelization of a Block Iterative Algorithm An Hybrid Approach for the Parallelization of a Block Iterative Algorithm Carlos Balsa 1, Ronan Guivarch 2, Daniel Ruiz 2, and Mohamed Zenadi 2 1 CEsA FEUP, Porto, Portugal balsa@ipb.pt 2 Université de

More information

Performance of Implicit Solver Strategies on GPUs

Performance of Implicit Solver Strategies on GPUs 9. LS-DYNA Forum, Bamberg 2010 IT / Performance Performance of Implicit Solver Strategies on GPUs Prof. Dr. Uli Göhner DYNAmore GmbH Stuttgart, Germany Abstract: The increasing power of GPUs can be used

More information

Distributed Schur Complement Solvers for Real and Complex Block-Structured CFD Problems

Distributed Schur Complement Solvers for Real and Complex Block-Structured CFD Problems Distributed Schur Complement Solvers for Real and Complex Block-Structured CFD Problems Dr.-Ing. Achim Basermann, Dr. Hans-Peter Kersken German Aerospace Center (DLR) Simulation- and Software Technology

More information

Parallel solution for finite element linear systems of. equations on workstation cluster *

Parallel solution for finite element linear systems of. equations on workstation cluster * Aug. 2009, Volume 6, No.8 (Serial No.57) Journal of Communication and Computer, ISSN 1548-7709, USA Parallel solution for finite element linear systems of equations on workstation cluster * FU Chao-jiang

More information