Simulating tsunami propagation on parallel computers using a hybrid software framework
1 Simulating tsunami propagation on parallel computers using a hybrid software framework
Xing Cai
Simula Research Laboratory, Norway
Department of Informatics, University of Oslo
March 12, 2007
2 Outline
1. Introduction
2. A hybrid software framework for parallelization
3. Desirable simulation setup for future
4. Performance analysis done at HLRS
4 The origin of the word tsunami
5 Different types of tsunamis
Tsunamis: large waves formed by rapid mass movements
- Induced by submarine earthquake (such as Dec Indian Ocean Tsunami)
- Induced by asteroid impact (such as the Mjølnir Impact)
- Induced by landslide (of great importance to the Norwegian fjords)
6 Motivation
- Wave propagation simulation is very important for studying tsunamis
- A computational challenge
  - huge computational domain
  - different physics required in different areas
- Parallel computing
  - should reuse existing serial wave codes
  - should allow different math models/resolutions in different areas
- Objective: a framework for parallel hybrid tsunami simulations
7 Huge computations (example: Indian Ocean)
- 1 km × 1 km resolution overall: about [value missing in source] mesh points
- 200 m × 200 m resolution overall: $10^9$ mesh points
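The jump in mesh size between the two resolutions can be checked with a short calculation. This is only a sketch, assuming both meshes are uniform and cover the same ocean area; the symbols $N_{1\,\mathrm{km}}$ and $N_{200\,\mathrm{m}}$ are introduced here for illustration and do not appear on the slide.

```latex
\frac{N_{200\,\mathrm{m}}}{N_{1\,\mathrm{km}}}
  = \left(\frac{1000\,\mathrm{m}}{200\,\mathrm{m}}\right)^{2}
  = 25,
\qquad
N_{1\,\mathrm{km}} \approx \frac{10^{9}}{25} = 4 \times 10^{7}.
```

Under these assumptions, refining from 1 km to 200 m multiplies the number of mesh points by 25, before counting the extra time steps a finer mesh usually needs.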
8 Computational challenge
Example: Indian Ocean
- 1 km × 1 km resolution is not sufficient everywhere
- 200 m × 200 m resolution overall is too much
We need "smart computing":
- High resolution only in areas where necessary
- Simple mathematical model in vast areas
- Advanced mathematical model (due to complicated physics) in small areas
Result: parallel hybrid tsunami simulator
Desirable resolution requires
- number of mesh points: [value missing in source]
- number of time steps: many thousands
10 Parallelization objectives
Requirement 1: easy parallelization
- Reuse of serial wave codes during parallelization
- Different serial codes collaborate inside a hybrid framework
Requirement 2: efficient use of computational resources
- FEM only in areas where unstructured meshes and advanced numerics are needed
- FDM elsewhere
11 Basic idea: divide and conquer
- Domain decomposition: one global solution domain is divided into many subdomains
- Each subdomain: (relatively) independent working unit
- Collaboration between the subdomains: communication
12 Overall parallelization strategy
$\Omega = \bigcup_{s=1}^{P} \Omega_s$
- Divide a vast ocean domain into many subdomains
- Uniform local meshes and FDM on most of the subdomains
- Unstructured local meshes and FEM on selected subdomains
- A global iteration among all subdomains
  - During each iteration a subdomain independently updates its local solution
  - Exchange of local solutions between neighboring subdomains at the end of each iteration
Solution of $L_\Omega(u) = f_\Omega$ is found as a sequence of iterates $u^0, u^1, \ldots, u^i$:
$L_{\Omega_s}(u_s^i) = f_{\Omega_s}^i, \quad 1 \le s \le P$
$u^i = \bigcup_{s=1}^{P} u_s^i$
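The global iteration described on this slide can be illustrated on a much smaller problem. The following is a minimal sketch, not the tsunami solver: it assumes the 1D model problem $-u'' = 2$ on $(0,1)$ with $u(0)=u(1)=0$ (exact solution $u = x(1-x)$), two overlapping subdomains, and central finite differences on both. The function names `solveLocal` and `additiveSchwarz1D` and all parameters are hypothetical. In each iteration both subdomains solve independently using boundary values from the previous iterate, and the local solutions are then composed into the next global iterate.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Solve -u'' = f at grid points lo+1 .. hi-1 by central differences,
// taking u[lo] and u[hi] as fixed Dirichlet values (Thomas algorithm).
std::vector<double> solveLocal(const std::vector<double>& u,
                               const std::vector<double>& f,
                               std::size_t lo, std::size_t hi, double h) {
    assert(hi > lo + 1);
    const std::size_t m = hi - lo - 1;  // number of local unknowns
    std::vector<double> c(m), d(m), x(m);
    for (std::size_t k = 0; k < m; ++k) d[k] = h * h * f[lo + 1 + k];
    d[0] += u[lo];       // left artificial/physical boundary value
    d[m - 1] += u[hi];   // right artificial/physical boundary value
    // Forward elimination for the matrix tridiag(-1, 2, -1).
    c[0] = -0.5;
    d[0] *= 0.5;
    for (std::size_t k = 1; k < m; ++k) {
        const double denom = 2.0 + c[k - 1];
        c[k] = -1.0 / denom;
        d[k] = (d[k] + d[k - 1]) / denom;
    }
    // Back substitution.
    x[m - 1] = d[m - 1];
    for (std::size_t k = m - 1; k-- > 0;) x[k] = d[k] - c[k] * x[k + 1];
    return x;
}

// Two-subdomain Schwarz iteration on n+1 grid points of [0,1].
// Subdomain 1 covers points 0..r, subdomain 2 covers points l..n (l < r).
std::vector<double> additiveSchwarz1D(std::size_t n, std::size_t l,
                                      std::size_t r, double tol, int maxIter) {
    assert(0 < l && l < r && r < n);
    const double h = 1.0 / static_cast<double>(n);
    std::vector<double> u(n + 1, 0.0), f(n + 1, 2.0);
    const std::size_t mid = (l + r) / 2;  // ownership split inside the overlap
    for (int it = 0; it < maxIter; ++it) {
        // Independent local solves, both using the previous iterate u.
        const std::vector<double> left = solveLocal(u, f, 0, r, h);
        const std::vector<double> right = solveLocal(u, f, l, n, h);
        // "Exchange": compose the next global iterate from the local ones.
        std::vector<double> next(u);
        for (std::size_t j = 1; j <= mid; ++j) next[j] = left[j - 1];
        for (std::size_t j = mid + 1; j < n; ++j) next[j] = right[j - l - 1];
        double change = 0.0;
        for (std::size_t j = 0; j <= n; ++j)
            change = std::max(change, std::fabs(next[j] - u[j]));
        u = next;
        if (change < tol) break;  // global convergence check
    }
    return u;
}
```

Calling `additiveSchwarz1D(40, 16, 24, 1e-12, 500)` uses an overlap of eight cells; a wider overlap reduces the number of iterations at the cost of more duplicated work per subdomain.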
13 Convergence among subdomains
- Schwarz methods work as the numerical foundation
- Small amount of overlap between neighboring subdomains (overlapping domain decomposition)
- Originally well known as a parallel numerical strategy for solving large linear systems
- We apply DD at software level (not at linear-algebra level)
  - No global matrices/vectors exist; all are represented by the collection of subdomain matrices/vectors
  - Neighboring subdomain meshes may be non-matching and/or of different types
14 A generic library of Schwarz methods
- Schwarz methods are a general approach to solving PDEs in parallel, so a generic library can be programmed
- Object-oriented programming is well suited
- Generic components: subdomain solvers and a global administrator
  - class SubdomainSolver: generic interface of a subdomain solver, only declaration of standard functions, no implementation
  - class Administrator: implementation of generic functions for invoking communication and checking global convergence
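The split between a pure interface and a generic administrator might look roughly like the sketch below. Only the class names `SubdomainSolver` and `Administrator` come from the slide; every member function (`setNeighborValues`, `solveLocal`, `value`, `lastChange`, `add`, `iterate`) and the toy `PointSolver` are assumptions made for illustration, and the "communication" is an in-process exchange rather than message passing. Each toy subdomain holds a single unknown of the 1D Laplace equation.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical generic interface: declarations only, no implementation.
class SubdomainSolver {
public:
    virtual ~SubdomainSolver() = default;
    virtual void setNeighborValues(double left, double right) = 0;
    virtual void solveLocal() = 0;
    virtual double value() const = 0;
    virtual double lastChange() const = 0;
};

// Toy subdomain with one unknown of u'' = 0: the local "solve" is the
// average of the two neighboring values.
class PointSolver : public SubdomainSolver {
public:
    void setNeighborValues(double left, double right) override {
        left_ = left;
        right_ = right;
    }
    void solveLocal() override {
        const double updated = 0.5 * (left_ + right_);
        change_ = std::fabs(updated - value_);
        value_ = updated;
    }
    double value() const override { return value_; }
    double lastChange() const override { return change_; }

private:
    double left_ = 0.0, right_ = 0.0, value_ = 0.0, change_ = 0.0;
};

// Generic driver: knows nothing about how a subdomain solves its problem.
class Administrator {
public:
    void add(std::unique_ptr<SubdomainSolver> s) { solvers_.push_back(std::move(s)); }
    const SubdomainSolver& solver(std::size_t i) const { return *solvers_.at(i); }

    // Returns the number of iterations used (maxIter if not converged).
    int iterate(double leftBC, double rightBC, double tol, int maxIter) {
        assert(!solvers_.empty());
        const std::size_t n = solvers_.size();
        for (int it = 1; it <= maxIter; ++it) {
            // Exchange: hand each subdomain its neighbors' latest values.
            std::vector<double> v(n);
            for (std::size_t i = 0; i < n; ++i) v[i] = solvers_[i]->value();
            for (std::size_t i = 0; i < n; ++i)
                solvers_[i]->setNeighborValues(i == 0 ? leftBC : v[i - 1],
                                               i + 1 == n ? rightBC : v[i + 1]);
            // Independent local updates, then a global convergence check.
            double change = 0.0;
            for (auto& s : solvers_) {
                s->solveLocal();
                change = std::max(change, s->lastChange());
            }
            if (change < tol) return it;
        }
        return maxIter;
    }

private:
    std::vector<std::unique_ptr<SubdomainSolver>> solvers_;
};
```

With three `PointSolver` objects and boundary values 0 and 1, `iterate` drives the subdomain values toward 0.25, 0.5 and 0.75. Because the administrator only sees the abstract interface, a different subdomain solver can be substituted without touching the iteration logic.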
15 A framework of hybrid tsunami simulators
- Objective: a generic framework for creating hybrid parallel tsunami simulators, based on existing serial codes
- Starting point
  - C++ Boussinesq solver using FEM: class Boussinesq
  - Legacy F77 code using FDM: a set of subroutines
  - Direct parallelization of either code requires too much work
- A hybrid software framework
  - class SubdomainBQFEMSolver : public Boussinesq, public SubdomainSolver
  - class SubdomainBQFDMSolver : public SubdomainSolver (calling F77 subroutines internally)
  - class HybridBQSolver : public Administrator
- Implementation using Diffpack
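The inheritance pattern named on this slide can be sketched with placeholder bodies. The class names are taken from the slide, but everything inside them is invented for illustration: the real `Boussinesq` class and the F77 subroutines are not shown in the source, so `advance`, `elevation`, `solveLocal`, `localValue` and the stand-in routine `fdm_step` are assumptions. The point is only how an existing class and a procedural legacy routine can both be exposed through one interface.

```cpp
#include <cassert>

// Stand-in for the existing serial FEM solver class (members are invented).
class Boussinesq {
public:
    virtual ~Boussinesq() = default;
    void advance(double dt) { eta_ += dt; }  // placeholder for a real time step
    double elevation() const { return eta_; }

private:
    double eta_ = 0.0;
};

// Stand-in for a legacy F77 subroutine. Fortran passes arguments by
// reference, which is mimicked here with pointers.
void fdm_step(double* eta, const double* dt) { *eta += *dt; }

// Hypothetical generic subdomain interface.
class SubdomainSolver {
public:
    virtual ~SubdomainSolver() = default;
    virtual void solveLocal(double dt) = 0;
    virtual double localValue() const = 0;
};

// FEM subdomain: reuses the serial class through multiple inheritance.
class SubdomainBQFEMSolver : public Boussinesq, public SubdomainSolver {
public:
    void solveLocal(double dt) override {
        assert(dt > 0.0);
        advance(dt);
    }
    double localValue() const override { return elevation(); }
};

// FDM subdomain: wraps the procedural legacy routine.
class SubdomainBQFDMSolver : public SubdomainSolver {
public:
    void solveLocal(double dt) override {
        assert(dt > 0.0);
        fdm_step(&eta_, &dt);
    }
    double localValue() const override { return eta_; }

private:
    double eta_ = 0.0;
};
```

Code that holds `SubdomainSolver*` pointers can then step an FEM subdomain and an FDM subdomain in the same loop without knowing which serial code sits behind each one.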
16 Flexibility
- Free choice between SubdomainBQFEMSolver and SubdomainBQFDMSolver for each subdomain
- Adaptive mesh refinement allowed for FEM subdomains
- Neighboring subdomains may use non-matching local meshes
- Possible to incorporate other serial codes as subdomain solvers
18 Subdomain preparation
[Figure: subdomains p1, p2, p3, p4; labels "New finite element code" and "Finite difference legacy code"; numeric labels 768 and 700]
19 Coarse-mesh simulation of Indian Ocean Tsunami Initial wave elevation after the earthquake
20 Coarse-mesh simulation snapshot 1 After 1.4 hours
21 Coarse-mesh simulation snapshot 2 After 2.8 hours
23 Motivation for my HPC Europa visit
- Vector-CPU based system at HLRS
- Extensive experience with performance analysis at HLRS
- Purpose: a fine-grained diagnosis of the tsunami simulator and our parallel PDE library
24 Observations so far (1)
- When the computational domain has no points on land, the parallel computation is well balanced
- On SX-8, the main work at each time step goes to the discretization, not to solving the resulting distributed linear system
25 Observations so far (2)
- When the computational domain has points on land, the parallel computation is not balanced
- Causes of imbalance:
  - Imbalance in the distributed discretization (some subdomains have many points on land)
  - Imbalance in the parallel DD solver (some subdomain problems are easier to solve)
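One simple way to put a number on such an imbalance is the ratio of the largest subdomain load to the average load. The sketch below is an illustration only: it assumes that the work per subdomain is proportional to its number of "wet" (non-land) mesh points, which the slides do not state, and the function name `imbalanceFactor` is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// Ratio of the maximum to the average load; 1.0 means perfectly balanced.
// wetPoints[s] is the assumed work measure for subdomain s.
double imbalanceFactor(const std::vector<long>& wetPoints) {
    assert(!wetPoints.empty());
    const long total = std::accumulate(wetPoints.begin(), wetPoints.end(), 0L);
    assert(total > 0);
    const long maxLoad = *std::max_element(wetPoints.begin(), wetPoints.end());
    const double average =
        static_cast<double>(total) / static_cast<double>(wetPoints.size());
    return static_cast<double>(maxLoad) / average;
}
```

A factor of 2.0 would mean that the busiest subdomain does twice the average work, so the other processes spend roughly half of each iteration waiting.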
26 Observations so far (3)
- The SX compiler does not optimize the discretization phase very well
  - C++ code
  - Many levels of nested for-loops
  - Extensive use of virtual functions
27 Observations so far (4)
- Vectorization is enabled for some parts of the code
- Example: vector addition x = y + z
    #pragma cdir nodep
    for (int i=0; i<length; i++)
      tmp_x[i] = tmp_y[i] + tmp_z[i];
- Percentage of vectorized code is increased from 6-7% to 13-14% in the solution phase
28 Observations so far (5)
- Vectorization does not work for some parts of the code
- Example: sparse matrix-vector multiplication y = Ax
  - Compressed row storage
  - Indirect (and random) access of data entries
    #pragma cdir nodep
    for (i = 1; i <= nrows; i++) {
      rstart = ad.irow(i);
      rstop = ad.irow(i+1);
    #pragma cdir novector
      tmp = 0.0;
      for (r = rstart; r < rstop; r++)
        tmp += entries(r) * x(ad.jcol(r));
      y(i) += tmp;
    }
- Vectorization of the inner for-loop has to be turned off!
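For readers unfamiliar with compressed row storage, the following self-contained sketch shows the same kind of loop in plain C++. It is not the Diffpack code from the slide: it assumes 0-based indexing, a hypothetical `CrsMatrix` struct whose field names merely echo the slide's `irow`, `jcol` and `entries`, and it overwrites `y` rather than accumulating into it. The indirect access `x[A.jcol[r]]` in the inner loop is the pattern the slide identifies as hard to vectorize.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Compressed row storage (0-based). Row i owns the entries with positions
// irow[i] .. irow[i+1]-1; jcol gives the column index of each entry.
struct CrsMatrix {
    std::vector<std::size_t> irow;
    std::vector<std::size_t> jcol;
    std::vector<double> entries;
};

// Compute y = A * x.
std::vector<double> multiply(const CrsMatrix& A, const std::vector<double>& x) {
    assert(!A.irow.empty());
    assert(A.jcol.size() == A.entries.size());
    const std::size_t nrows = A.irow.size() - 1;
    std::vector<double> y(nrows, 0.0);
    for (std::size_t i = 0; i < nrows; ++i) {
        double tmp = 0.0;
        // Short inner loop with indirect reads of x through jcol.
        for (std::size_t r = A.irow[i]; r < A.irow[i + 1]; ++r) {
            assert(A.jcol[r] < x.size());
            tmp += A.entries[r] * x[A.jcol[r]];
        }
        y[i] = tmp;
    }
    return y;
}
```

For a mesh-based discretization each row typically holds only a handful of nonzeros, so the inner loop is short as well as indirect, which is one plausible reason vectorizing it does not pay off.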
29 Conclusions
- Schwarz methods: numerical foundation for the parallelization
- Object-oriented programming enables a hybrid framework of tsunami simulators
- Full flexibility in choosing subdomain solvers
  - different mathematical models
  - different discretizations
  - different local meshes
  - different codes
- Some parts of the tsunami simulator are improved due to analysis done at HLRS
- Challenge: performance and load balancing
More informationEfficiency of adaptive mesh algorithms
Efficiency of adaptive mesh algorithms 23.11.2012 Jörn Behrens KlimaCampus, Universität Hamburg http://www.katrina.noaa.gov/satellite/images/katrina-08-28-2005-1545z.jpg Model for adaptive efficiency 10
More informationParallel Adaptive Tsunami Modelling with Triangular Discontinuous Galerkin Schemes
Parallel Adaptive Tsunami Modelling with Triangular Discontinuous Galerkin Schemes Stefan Vater 1 Kaveh Rahnema 2 Jörn Behrens 1 Michael Bader 2 1 Universität Hamburg 2014 PDES Workshop 2 TU München Partial
More informationHigh Performance Computing
High Performance Computing ADVANCED SCIENTIFIC COMPUTING Dr. Ing. Morris Riedel Adjunct Associated Professor School of Engineering and Natural Sciences, University of Iceland Research Group Leader, Juelich
More informationTools and Primitives for High Performance Graph Computation
Tools and Primitives for High Performance Graph Computation John R. Gilbert University of California, Santa Barbara Aydin Buluç (LBNL) Adam Lugowski (UCSB) SIAM Minisymposium on Analyzing Massive Real-World
More informationScalable Algorithmic Techniques Decompositions & Mapping. Alexandre David
Scalable Algorithmic Techniques Decompositions & Mapping Alexandre David 1.2.05 adavid@cs.aau.dk Introduction Focus on data parallelism, scale with size. Task parallelism limited. Notion of scalability
More information3D Helmholtz Krylov Solver Preconditioned by a Shifted Laplace Multigrid Method on Multi-GPUs
3D Helmholtz Krylov Solver Preconditioned by a Shifted Laplace Multigrid Method on Multi-GPUs H. Knibbe, C. W. Oosterlee, C. Vuik Abstract We are focusing on an iterative solver for the three-dimensional
More informationThe Icosahedral Nonhydrostatic (ICON) Model
The Icosahedral Nonhydrostatic (ICON) Model Scalability on Massively Parallel Computer Architectures Florian Prill, DWD + the ICON team 15th ECMWF Workshop on HPC in Meteorology October 2, 2012 ICON =
More informationAchieving Efficient Strong Scaling with PETSc Using Hybrid MPI/OpenMP Optimisation
Achieving Efficient Strong Scaling with PETSc Using Hybrid MPI/OpenMP Optimisation Michael Lange 1 Gerard Gorman 1 Michele Weiland 2 Lawrence Mitchell 2 Xiaohu Guo 3 James Southern 4 1 AMCG, Imperial College
More information