Prof. Dr. Stefan Funken, Prof. Dr. Alexander Keller, Prof. Dr. Karsten Urban
11. Januar 2007
Scientific Computing: Parallele Algorithmen

Parallel Numerical Algorithms

Iterative Methods

Flowchart with suggestions for the selection of iterative methods:

Is the matrix symmetric?
  yes: Is the matrix definite?
    yes: Are the outer eigenvalues known?
      yes: try Chebyshev or CG.
      no:  try CG.
    no: try MinRES or CG.
  no: Is the transpose available?
    yes: try QMR.
    no: Is storage at a premium?
      yes: try CGS or Bi-CGSTAB.
      no:  try GMRES with a long restart.

This book (Templates for the Solution of Linear Systems) is also available in PostScript from ftp.netlib.org/templates/templates.ps.

Iterative Methods for Algebraic Eigenvalue Problems

A similar book exists for algebraic eigenvalue problems. It is also available as an online document at dongarra/etemplates/book.html.

Preconditioner

The convergence rate of iterative methods depends on the spectral properties of the coefficient matrix.

Example (CG method):

\[
\|x - x^{(k)}\|_A \le 2\rho^k \,\|x - x^{(0)}\|_A
\qquad\text{with}\qquad
\rho := \frac{\sqrt{\kappa_2(A)} - 1}{\sqrt{\kappa_2(A)} + 1}
\]

and \(\|x - x^{(k)}\|_A^2 := \langle x - x^{(k)},\, A(x - x^{(k)})\rangle\).

Note: the number of iterations needed to reach a relative reduction of \(\epsilon\) in the error is proportional to \(\sqrt{\kappa_2(A)}\).

The convergence rate of iterative methods depends on the spectral properties of the coefficient matrix. Hence, one may attempt to transform the linear system into one that is equivalent, in the sense that it has the same solution, but that has more favorable spectral properties. A preconditioner is a matrix that effects such a transformation. For instance, if a matrix W approximates the coefficient matrix A in some way, the transformed system

\[ W^{-1} A x = W^{-1} b \]

has the same solution as the original system Ax = b, but the spectral properties of its coefficient matrix \(W^{-1}A\) may be more favorable.

In devising a preconditioner, we are faced with a choice between

  finding a matrix W that approximates A, and for which solving a system is easier than solving one with A, or
  finding a matrix W that approximates A^{-1}, so that only multiplication by W is needed.

The majority of preconditioners fall into the first category.

On parallel machines there is a further trade-off between the efficacy of a preconditioner in the classical sense and its parallel efficiency. Many of the traditional preconditioners have a large sequential component.

We consider the following parallel preconditioners:

  the Richardson method,
  the Jacobi method,
  non-overlapping domain decomposition, and
  the parallelization of the Gauß-Seidel and SOR methods with
    wavefront numbering or
    red-black numbering.

Wavefront Numbering Algorithm

1. On each diagonal, each component can be computed separately.
2. The work load is unbalanced.
3. The maximal possible speed-up in a P x P mesh is P/2.
4. What about more general meshes (non-square meshes)?
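The wavefront idea can be sketched as follows (a small illustrative example, not lecture code): for a Gauß-Seidel sweep with the 5-point stencil on an n x n grid, all points on an anti-diagonal i + j = const only read values from previous diagonals, so they are independent and could be updated in parallel. The wavefront ordering reproduces the lexicographic sweep exactly.

```python
import numpy as np

# Gauss-Seidel sweep on an n x n interior grid (5-point stencil, zero
# Dirichlet boundary), once in lexicographic order and once by wavefronts.
n = 8
u_lex = np.zeros((n + 2, n + 2))
u_wave = np.zeros((n + 2, n + 2))
f = np.ones((n + 2, n + 2))
h2 = 1.0 / (n + 1) ** 2

def update(u, i, j):
    # one Gauss-Seidel point update for -Laplace(u) = f
    u[i, j] = 0.25 * (u[i - 1, j] + u[i + 1, j]
                      + u[i, j - 1] + u[i, j + 1] + h2 * f[i, j])

# classical sequential sweep (lexicographic order)
for i in range(1, n + 1):
    for j in range(1, n + 1):
        update(u_lex, i, j)

# same sweep organised by wavefronts: anti-diagonal d = i + j
for d in range(2, 2 * n + 1):
    for i in range(max(1, d - n), min(n, d - 1) + 1):
        update(u_wave, i, d - i)   # all (i, d - i) on diagonal d are independent

assert np.allclose(u_lex, u_wave)  # identical result, but diagonals parallelise
```

The inner loop over a diagonal has no data dependencies, which is exactly what the processors P1, P2, ... exploit; the varying diagonal length is the source of the load imbalance noted above.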

Wavefront Numbering Algorithm (general meshes)

1. Start at a node such that the number of layers is minimal.
2. Mark the next layer and update as many nodes as possible.
3. Update the remaining nodes before marking the next layer.
4. Continue with step 2.

Block-Strips Algorithm

1. The block strips are computed one after another.
2. The work load is balanced (optimal for kP x kP meshes).
3. The maximal possible speed-up is kP/(k + 1).

Red-Black Numbering

What happens if we number all red nodes first?


Properties:

1. The reordered system is the FEM matrix with rows and columns permuted accordingly.
2. The result is a 2 x 2 block matrix.
3. The diagonal blocks are diagonal matrices.
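These properties can be verified on a small example (illustrative only, the lecture's matrix did not survive transcription): permuting a tridiagonal 1D Laplacian into red-black order yields a 2 x 2 block matrix whose diagonal blocks are themselves diagonal.

```python
import numpy as np

# Red-black permutation of a tridiagonal FEM-type matrix (1D Laplacian).
n = 9
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

red = np.arange(0, n, 2)      # "red" nodes: even indices
black = np.arange(1, n, 2)    # "black" nodes: odd indices
perm = np.concatenate([red, black])
P = np.eye(n)[perm]
Apb = P @ A @ P.T             # rows and columns permuted to red-black order

nr = len(red)
D_red = Apb[:nr, :nr]         # red-red block
D_black = Apb[nr:, nr:]       # black-black block
# nodes of one colour are never neighbours, so both blocks are diagonal
assert np.allclose(D_red, np.diag(np.diag(D_red)))
assert np.allclose(D_black, np.diag(np.diag(D_black)))
```

Because same-coloured nodes never couple, each block solve reduces to a componentwise division, which is what makes the ordering attractive in parallel.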

Jacobi / Gauß-Seidel Iteration

Consider the system Ax = b and the decomposition A = L + D + U (strict lower part, diagonal, strict upper part).

Sequential version of the Jacobi iteration:

\[ x^{(k+1)} := D^{-1}\big(b - Lx^{(k)} - Ux^{(k)}\big) \]

If \(D^{-1}\) is available on each processor, communication is only necessary to exchange parts of \(x^{(k+1)}\) after updating.

Sequential version of the Gauß-Seidel iteration:

\[ x^{(k+1)} := D^{-1}\big(b - Lx^{(k+1)} - Ux^{(k)}\big)
\qquad\text{or}\qquad
x^{(k+1)} := D^{-1}\big(b - Lx^{(k)} - Ux^{(k+1)}\big) \]
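A minimal sketch of both splittings (the test matrix is an illustrative 1D Laplacian, not from the lecture):

```python
import numpy as np

# Jacobi and Gauss-Seidel iterations from the splitting A = L + D + U.
n = 20
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
x_exact = np.linalg.solve(A, b)

D = np.diag(np.diag(A))
L = np.tril(A, -1)            # strict lower part
U = np.triu(A, 1)             # strict upper part
Dinv = np.diag(1.0 / np.diag(A))

def jacobi(x, steps):
    for _ in range(steps):
        # x^{k+1} = D^{-1} (b - L x^k - U x^k)
        x = Dinv @ (b - L @ x - U @ x)
    return x

def gauss_seidel(x, steps):
    for _ in range(steps):
        # x^{k+1} = D^{-1} (b - L x^{k+1} - U x^k),
        # i.e. solve the triangular system (D + L) x^{k+1} = b - U x^k
        x = np.linalg.solve(D + L, b - U @ x)
    return x

ej = np.linalg.norm(jacobi(np.zeros(n), 2000) - x_exact)
eg = np.linalg.norm(gauss_seidel(np.zeros(n), 2000) - x_exact)
assert ej < 1e-6 and eg < 1e-8
assert eg < ej                # Gauss-Seidel converges faster here
```

Note that the Jacobi update uses only old values, so all components can be updated simultaneously; the Gauß-Seidel update is a triangular solve and is sequential as written.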

Parallel Gauß-Seidel Iteration (Red-Black Numbering)

Let \(A = (a_{ij}) \in \mathbb{R}^{n \times n}\). Assume we have at least two disjoint index sets \(I_{red}\) and \(I_{black}\) such that \(a_{ij} = 0\) for all \(i \ne j\) with \(i, j \in I_{red}\) (resp. \(I_{black}\)).

Parallel version of the Gauß-Seidel iteration:

\[ x^{(k+1)}_{red} := D_{red}^{-1}\big(b_{red} - (L_{rb} + U_{rb})\, x^{(k)}_{black}\big) \]
\[ x^{(k+1)}_{black} := D_{black}^{-1}\big(b_{black} - (L_{br} + U_{br})\, x^{(k+1)}_{red}\big) \]

If \(P \ge 2\), it is recommended to use a block version, such that blocks of the same color need no communication.
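The two half-steps can be sketched on a 1D Laplacian (an illustrative example, not lecture code): each half-step updates one colour at once, so all nodes of that colour can be computed in parallel.

```python
import numpy as np

# Red-black Gauss-Seidel for a tridiagonal system: red = even nodes,
# black = odd nodes, so same-coloured nodes never couple.
n = 21
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
x_exact = np.linalg.solve(A, b)

red = np.arange(0, n, 2)
black = np.arange(1, n, 2)
diag = np.diag(A)

x = np.zeros(n)
for _ in range(1500):
    # x_red^{k+1} = D_red^{-1} (b_red - A_{red,black} x_black^k)
    x[red] = (b[red] - A[np.ix_(red, black)] @ x[black]) / diag[red]
    # x_black^{k+1} = D_black^{-1} (b_black - A_{black,red} x_red^{k+1})
    x[black] = (b[black] - A[np.ix_(black, red)] @ x[red]) / diag[black]

assert np.linalg.norm(x - x_exact) < 1e-8
```

Each half-step is a componentwise division, since the diagonal blocks \(D_{red}\) and \(D_{black}\) are diagonal matrices.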

Non-overlapping Subdomains

Different index sets:

1. I: nodes in the interior of the subdomains, \(N_I = \sum_{j=1}^{p} N_{I,j}\).
2. E: nodes in the interior of the subdomain edges, \(N_E = \sum_{j=1}^{n_e} N_{E,j}\) (\(n_e\) = number of subdomain edges).
3. V: cross points, i.e. endpoints of subdomain edges, \(N_V\).

Types of Vectors

Two types of vectors, depending on the storage scheme:

type I: u is stored on \(P_k\) as the restriction \(u_k = C_k u\). The complete value is accessible on \(P_k\).
type II: r is stored on \(P_k\) as \(r_k\), such that \(r = \sum_{k=1}^{p} C_k^T r_k\). Nodes on the interface hold only a part of the full value.

How should we parallelize the Gauß-Seidel iteration

\[ x^{(k+1)} := D^{-1}\big(b - Lx^{(k+1)} - Ux^{(k)}\big) \]

resp., locally,

\[ x^{(k+1)}_i := C_i \sum_{k=1}^{p} C_k^T \,\mathrm{diag}(D)^{-1} \sum_{l=1}^{p} C_l^T \big(b - Lx^{(k+1)} - Ux^{(k)}\big) \]

if we have non-overlapping subdomains?

Parallel Gauß-Seidel (Non-Overlapping Domains)

Consider the following ordering of the global index set: (V, E, I).

\[
\begin{pmatrix} A_{VV} & A_{VE} & A_{VI} \\ A_{EV} & A_{EE} & A_{EI} \\ A_{IV} & A_{IE} & A_{II} \end{pmatrix}
\begin{pmatrix} x_V \\ x_E \\ x_I \end{pmatrix}
=
\begin{pmatrix} b_V \\ b_E \\ b_I \end{pmatrix}
\]

Parallel Gauß-Seidel (Non-Overlapping Domains, Draft)

Let \(d := \{1/d_{ii}\}_{i=1,\dots,n}\), with "\(\cdot\)" denoting componentwise multiplication.

\[ r_V := b_V - A_{VV} x_V^k - A_{VE} x_E^k - A_{VI} x_I^k \]
\[ w_V := \sum_{l=1}^{p} C_{V,l}^T r_{V,l} \quad\text{(communication)} \]
\[ x_V^{k+1} := x_V^k + d_V \cdot w_V \]

\[ r_E := b_E - A_{EV} x_V^{k+1} - A_{EE} x_E^k - A_{EI} x_I^k \]
\[ w_E := \sum_{l=1}^{p} C_{E,l}^T r_{E,l} \quad\text{(communication; real Gauß-Seidel?)} \]
\[ x_E^{k+1} := x_E^k + d_E \cdot w_E \]

\[ r_I := b_I - A_{IV} x_V^{k+1} - A_{IE} x_E^{k+1} - A_{II} x_I^k \]
\[ w_I := \sum_{l=1}^{p} C_{I,l}^T r_{I,l} \quad\text{(no communication!)} \]
\[ x_I^{k+1} := x_I^k + d_I \cdot w_I \]

Parallel Gauß-Seidel (Non-Overlapping Domains, Modified)

Assume at least one node on each coupling edge and no connection between different edges.

\[ r_V := b_V - A_{VV} x_V^k - A_{VE} x_E^k - A_{VI} x_I^k \]
\[ w_V := \sum_{l=1}^{p} C_{V,l}^T r_{V,l} \quad\text{(communication)} \]
\[ x_V^{k+1} := x_V^k + d_V \cdot w_V \]

\[ r_E := b_E - A_{EV} x_V^{k+1} - A_{EE} x_E^k - A_{EI} x_I^k \]
\[ w_E := \sum_{l=1}^{p} C_{E,l}^T r_{E,l} \]
\[ x_E^{k+1} := x_E^k + A_{EE}^{-1} w_E \quad\text{(\(A_{EE}\) is block diagonal, each block tridiagonal)} \]

\[ r_I := b_I - A_{IV} x_V^{k+1} - A_{IE} x_E^{k+1} - A_{II} x_I^k \]
\[ w_I := \sum_{l=1}^{p} C_{I,l}^T r_{I,l} \quad\text{(no communication!)} \]
\[ x_I^{k+1} := x_I^k + A_{II}^{-1} w_I \]

Gauß-Seidel via Jacobi

Definition: A matrix \(A \in \mathbb{R}^{m \times n}\) is called nonnegative if all coefficients \(a_{ij}\) of A are nonnegative.

Theorem (Stein and Rosenberg): Let the iteration matrix \(C_J \in \mathbb{R}^{n \times n}\) of the Jacobi iteration be nonnegative. Then exactly one of the following holds:

i) \(\varrho(C_J) = \varrho(C_G) = 0\),
ii) \(\varrho(C_J) = \varrho(C_G) = 1\),
iii) \(0 < \varrho(C_G) < \varrho(C_J) < 1\),
iv) \(1 < \varrho(C_J) < \varrho(C_G)\).

In case iii), Gauß-Seidel is faster than Jacobi (this applies to FEM matrices, also in 2D/3D)!
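The original example matrix did not survive transcription; the following illustrative check (a 1D Laplacian, chosen here as a stand-in FEM-type matrix) exhibits case iii) numerically, including the well-known extra relation \(\varrho(C_G) = \varrho(C_J)^2\) for such consistently ordered matrices.

```python
import numpy as np

# Spectral radii of the Jacobi and Gauss-Seidel iteration matrices for a
# 1D Laplacian, whose Jacobi iteration matrix is nonnegative.
n = 15
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
D = np.diag(np.diag(A))
L = np.tril(A, -1)
U = np.triu(A, 1)

C_J = np.linalg.solve(D, -(L + U))     # Jacobi iteration matrix
C_G = np.linalg.solve(D + L, -U)       # Gauss-Seidel iteration matrix

rho = lambda C: max(abs(np.linalg.eigvals(C)))

assert np.all(C_J >= 0)                       # hypothesis of the theorem
assert 0 < rho(C_G) < rho(C_J) < 1            # case iii): Gauss-Seidel faster
assert np.isclose(rho(C_G), rho(C_J) ** 2)    # rho(C_G) = rho(C_J)^2 here
```

Since \(\varrho(C_G) = \varrho(C_J)^2\), one Gauß-Seidel step reduces the error asymptotically as much as two Jacobi steps.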


More information

Figure 6.1: Truss topology optimization diagram.

Figure 6.1: Truss topology optimization diagram. 6 Implementation 6.1 Outline This chapter shows the implementation details to optimize the truss, obtained in the ground structure approach, according to the formulation presented in previous chapters.

More information

SLEPc: Scalable Library for Eigenvalue Problem Computations

SLEPc: Scalable Library for Eigenvalue Problem Computations SLEPc: Scalable Library for Eigenvalue Problem Computations Jose E. Roman Joint work with A. Tomas and E. Romero Universidad Politécnica de Valencia, Spain 10th ACTS Workshop - August, 2009 Outline 1 Introduction

More information

The 3D DSC in Fluid Simulation

The 3D DSC in Fluid Simulation The 3D DSC in Fluid Simulation Marek K. Misztal Informatics and Mathematical Modelling, Technical University of Denmark mkm@imm.dtu.dk DSC 2011 Workshop Kgs. Lyngby, 26th August 2011 Governing Equations

More information

nag sparse nsym sol (f11dec)

nag sparse nsym sol (f11dec) f11 Sparse Linear Algebra f11dec nag sparse nsym sol (f11dec) 1. Purpose nag sparse nsym sol (f11dec) solves a real sparse nonsymmetric system of linear equations, represented in coordinate storage format,

More information

Parallel resolution of sparse linear systems by mixing direct and iterative methods

Parallel resolution of sparse linear systems by mixing direct and iterative methods Parallel resolution of sparse linear systems by mixing direct and iterative methods Phyleas Meeting, Bordeaux J. Gaidamour, P. Hénon, J. Roman, Y. Saad LaBRI and INRIA Bordeaux - Sud-Ouest (ScAlApplix

More information

ECE 204 Numerical Methods for Computer Engineers MIDTERM EXAMINATION /4:30-6:00

ECE 204 Numerical Methods for Computer Engineers MIDTERM EXAMINATION /4:30-6:00 ECE 4 Numerical Methods for Computer Engineers ECE 4 Numerical Methods for Computer Engineers MIDTERM EXAMINATION --7/4:-6: The eamination is out of marks. Instructions: No aides. Write your name and student

More information

NAG Fortran Library Routine Document F11DSF.1

NAG Fortran Library Routine Document F11DSF.1 NAG Fortran Library Routine Document Note: before using this routine, please read the Users Note for your implementation to check the interpretation of bold italicised terms and other implementation-dependent

More information

Lecture 27: Fast Laplacian Solvers

Lecture 27: Fast Laplacian Solvers Lecture 27: Fast Laplacian Solvers Scribed by Eric Lee, Eston Schweickart, Chengrun Yang November 21, 2017 1 How Fast Laplacian Solvers Work We want to solve Lx = b with L being a Laplacian matrix. Recall

More information

Preconditioning Linear Systems Arising from Graph Laplacians of Complex Networks

Preconditioning Linear Systems Arising from Graph Laplacians of Complex Networks Preconditioning Linear Systems Arising from Graph Laplacians of Complex Networks Kevin Deweese 1 Erik Boman 2 1 Department of Computer Science University of California, Santa Barbara 2 Scalable Algorithms

More information

Introduction to PETSc KSP, PC. CS595, Fall 2010

Introduction to PETSc KSP, PC. CS595, Fall 2010 Introduction to PETSc KSP, PC CS595, Fall 2010 1 Linear Solution Main Routine PETSc Solve Ax = b Linear Solvers (KSP) PC Application Initialization Evaluation of A and b Post- Processing User code PETSc

More information

Lecture 17: More Fun With Sparse Matrices

Lecture 17: More Fun With Sparse Matrices Lecture 17: More Fun With Sparse Matrices David Bindel 26 Oct 2011 Logistics Thanks for info on final project ideas. HW 2 due Monday! Life lessons from HW 2? Where an error occurs may not be where you

More information

Approaches to Parallel Implementation of the BDDC Method

Approaches to Parallel Implementation of the BDDC Method Approaches to Parallel Implementation of the BDDC Method Jakub Šístek Includes joint work with P. Burda, M. Čertíková, J. Mandel, J. Novotný, B. Sousedík. Institute of Mathematics of the AS CR, Prague

More information

AN IMPROVED ITERATIVE METHOD FOR SOLVING GENERAL SYSTEM OF EQUATIONS VIA GENETIC ALGORITHMS

AN IMPROVED ITERATIVE METHOD FOR SOLVING GENERAL SYSTEM OF EQUATIONS VIA GENETIC ALGORITHMS AN IMPROVED ITERATIVE METHOD FOR SOLVING GENERAL SYSTEM OF EQUATIONS VIA GENETIC ALGORITHMS Seyed Abolfazl Shahzadehfazeli 1, Zainab Haji Abootorabi,3 1 Parallel Processing Laboratory, Yazd University,

More information

Improvements of the Discrete Dipole Approximation method

Improvements of the Discrete Dipole Approximation method arxiv:physics/0006064v1 [physics.ao-ph] 26 Jun 2000 Improvements of the Discrete Dipole Approximation method Piotr J. Flatau Scripps Institution of Oceanography, University of California, San Diego, La

More information

Overview of Trilinos and PT-Scotch

Overview of Trilinos and PT-Scotch 29.03.2012 Outline PT-Scotch 1 PT-Scotch The Dual Recursive Bipartitioning Algorithm Parallel Graph Bipartitioning Methods 2 Overview of the Trilinos Packages Examples on using Trilinos PT-Scotch The Scotch

More information

Outline. Parallel Algorithms for Linear Algebra. Number of Processors and Problem Size. Speedup and Efficiency

Outline. Parallel Algorithms for Linear Algebra. Number of Processors and Problem Size. Speedup and Efficiency 1 2 Parallel Algorithms for Linear Algebra Richard P. Brent Computer Sciences Laboratory Australian National University Outline Basic concepts Parallel architectures Practical design issues Programming

More information

Chapter Introduction

Chapter Introduction Chapter 4.1 Introduction After reading this chapter, you should be able to 1. define what a matrix is. 2. identify special types of matrices, and 3. identify when two matrices are equal. What does a matrix

More information

A parallel direct/iterative solver based on a Schur complement approach

A parallel direct/iterative solver based on a Schur complement approach A parallel direct/iterative solver based on a Schur complement approach Gene around the world at CERFACS Jérémie Gaidamour LaBRI and INRIA Bordeaux - Sud-Ouest (ScAlApplix project) February 29th, 2008

More information

Social-Network Graphs

Social-Network Graphs Social-Network Graphs Mining Social Networks Facebook, Google+, Twitter Email Networks, Collaboration Networks Identify communities Similar to clustering Communities usually overlap Identify similarities

More information

Spectral Graph Sparsification: overview of theory and practical methods. Yiannis Koutis. University of Puerto Rico - Rio Piedras

Spectral Graph Sparsification: overview of theory and practical methods. Yiannis Koutis. University of Puerto Rico - Rio Piedras Spectral Graph Sparsification: overview of theory and practical methods Yiannis Koutis University of Puerto Rico - Rio Piedras Graph Sparsification or Sketching Compute a smaller graph that preserves some

More information

Study and implementation of computational methods for Differential Equations in heterogeneous systems. Asimina Vouronikoy - Eleni Zisiou

Study and implementation of computational methods for Differential Equations in heterogeneous systems. Asimina Vouronikoy - Eleni Zisiou Study and implementation of computational methods for Differential Equations in heterogeneous systems Asimina Vouronikoy - Eleni Zisiou Outline Introduction Review of related work Cyclic Reduction Algorithm

More information

Lecture 11: Randomized Least-squares Approximation in Practice. 11 Randomized Least-squares Approximation in Practice

Lecture 11: Randomized Least-squares Approximation in Practice. 11 Randomized Least-squares Approximation in Practice Stat60/CS94: Randomized Algorithms for Matrices and Data Lecture 11-10/09/013 Lecture 11: Randomized Least-squares Approximation in Practice Lecturer: Michael Mahoney Scribe: Michael Mahoney Warning: these

More information

Overlapping Domain Decomposition Methods

Overlapping Domain Decomposition Methods Overlapping Domain Decomposition Methods X. Cai 1,2 1 Simula Research Laboratory 2 Department of Informatics, University of Oslo Abstract. Overlapping domain decomposition methods are efficient and flexible.

More information

Graph Partitioning for High-Performance Scientific Simulations. Advanced Topics Spring 2008 Prof. Robert van Engelen

Graph Partitioning for High-Performance Scientific Simulations. Advanced Topics Spring 2008 Prof. Robert van Engelen Graph Partitioning for High-Performance Scientific Simulations Advanced Topics Spring 2008 Prof. Robert van Engelen Overview Challenges for irregular meshes Modeling mesh-based computations as graphs Static

More information

Chapter 1 A New Parallel Algorithm for Computing the Singular Value Decomposition

Chapter 1 A New Parallel Algorithm for Computing the Singular Value Decomposition Chapter 1 A New Parallel Algorithm for Computing the Singular Value Decomposition Nicholas J. Higham Pythagoras Papadimitriou Abstract A new method is described for computing the singular value decomposition

More information

Big Data Analytics. Special Topics for Computer Science CSE CSE Feb 11

Big Data Analytics. Special Topics for Computer Science CSE CSE Feb 11 Big Data Analytics Special Topics for Computer Science CSE 4095-001 CSE 5095-005 Feb 11 Fei Wang Associate Professor Department of Computer Science and Engineering fei_wang@uconn.edu Clustering II Spectral

More information

Multigrid solvers M. M. Sussman sussmanm@math.pitt.edu Office Hours: 11:10AM-12:10PM, Thack 622 May 12 June 19, 2014 1 / 43 Multigrid Geometrical multigrid Introduction Details of GMG Summary Algebraic

More information

MATH 423 Linear Algebra II Lecture 17: Reduced row echelon form (continued). Determinant of a matrix.

MATH 423 Linear Algebra II Lecture 17: Reduced row echelon form (continued). Determinant of a matrix. MATH 423 Linear Algebra II Lecture 17: Reduced row echelon form (continued). Determinant of a matrix. Row echelon form A matrix is said to be in the row echelon form if the leading entries shift to the

More information

Numerical Implementation of Overlapping Balancing Domain Decomposition Methods on Unstructured Meshes

Numerical Implementation of Overlapping Balancing Domain Decomposition Methods on Unstructured Meshes Numerical Implementation of Overlapping Balancing Domain Decomposition Methods on Unstructured Meshes Jung-Han Kimn 1 and Blaise Bourdin 2 1 Department of Mathematics and The Center for Computation and

More information

Solving Sparse Linear Systems. Forward and backward substitution for solving lower or upper triangular systems

Solving Sparse Linear Systems. Forward and backward substitution for solving lower or upper triangular systems AMSC 6 /CMSC 76 Advanced Linear Numerical Analysis Fall 7 Direct Solution of Sparse Linear Systems and Eigenproblems Dianne P. O Leary c 7 Solving Sparse Linear Systems Assumed background: Gauss elimination

More information

Generalized trace ratio optimization and applications

Generalized trace ratio optimization and applications Generalized trace ratio optimization and applications Mohammed Bellalij, Saïd Hanafi, Rita Macedo and Raca Todosijevic University of Valenciennes, France PGMO Days, 2-4 October 2013 ENSTA ParisTech PGMO

More information

Parallel Numerical Algorithms

Parallel Numerical Algorithms Parallel Numerical Algorithms Chapter 3 Dense Linear Systems Section 3.3 Triangular Linear Systems Michael T. Heath and Edgar Solomonik Department of Computer Science University of Illinois at Urbana-Champaign

More information

Introduction to Parallel Computing

Introduction to Parallel Computing Introduction to Parallel Computing W. P. Petersen Seminar for Applied Mathematics Department of Mathematics, ETHZ, Zurich wpp@math. ethz.ch P. Arbenz Institute for Scientific Computing Department Informatik,

More information

Iterative Solver Benchmark Jack Dongarra, Victor Eijkhout, Henk van der Vorst 2001/01/14 1 Introduction The traditional performance measurement for co

Iterative Solver Benchmark Jack Dongarra, Victor Eijkhout, Henk van der Vorst 2001/01/14 1 Introduction The traditional performance measurement for co Iterative Solver Benchmark Jack Dongarra, Victor Eijkhout, Henk van der Vorst 2001/01/14 1 Introduction The traditional performance measurement for computers on scientic application has been the Linpack

More information

CG solver assignment

CG solver assignment CG solver assignment David Bindel Nikos Karampatziakis 3/16/2010 Contents 1 Introduction 1 2 Solver parameters 2 3 Preconditioned CG 3 4 3D Laplace operator 4 5 Preconditioners for the Laplacian 5 5.1

More information

Exam Design and Analysis of Algorithms for Parallel Computer Systems 9 15 at ÖP3

Exam Design and Analysis of Algorithms for Parallel Computer Systems 9 15 at ÖP3 UMEÅ UNIVERSITET Institutionen för datavetenskap Lars Karlsson, Bo Kågström och Mikael Rännar Design and Analysis of Algorithms for Parallel Computer Systems VT2009 June 2, 2009 Exam Design and Analysis

More information

Parallel Threshold-based ILU Factorization

Parallel Threshold-based ILU Factorization A short version of this paper appears in Supercomputing 997 Parallel Threshold-based ILU Factorization George Karypis and Vipin Kumar University of Minnesota, Department of Computer Science / Army HPC

More information

Chapter 13. Boundary Value Problems for Partial Differential Equations* Linz 2002/ page

Chapter 13. Boundary Value Problems for Partial Differential Equations* Linz 2002/ page Chapter 13 Boundary Value Problems for Partial Differential Equations* E lliptic equations constitute the third category of partial differential equations. As a prototype, we take the Poisson equation

More information

Q. Wang National Key Laboratory of Antenna and Microwave Technology Xidian University No. 2 South Taiba Road, Xi an, Shaanxi , P. R.

Q. Wang National Key Laboratory of Antenna and Microwave Technology Xidian University No. 2 South Taiba Road, Xi an, Shaanxi , P. R. Progress In Electromagnetics Research Letters, Vol. 9, 29 38, 2009 AN IMPROVED ALGORITHM FOR MATRIX BANDWIDTH AND PROFILE REDUCTION IN FINITE ELEMENT ANALYSIS Q. Wang National Key Laboratory of Antenna

More information

MLR Institute of Technology

MLR Institute of Technology Course Name : Engineering Optimization Course Code : 56021 Class : III Year Branch : Aeronautical Engineering Year : 2014-15 Course Faculty : Mr Vamsi Krishna Chowduru, Assistant Professor Course Objective

More information

Chapter 1 A New Parallel Algorithm for Computing the Singular Value Decomposition

Chapter 1 A New Parallel Algorithm for Computing the Singular Value Decomposition Chapter 1 A New Parallel Algorithm for Computing the Singular Value Decomposition Nicholas J. Higham Pythagoras Papadimitriou Abstract A new method is described for computing the singular value decomposition

More information

AMath 483/583 Lecture 24. Notes: Notes: Steady state diffusion. Notes: Finite difference method. Outline:

AMath 483/583 Lecture 24. Notes: Notes: Steady state diffusion. Notes: Finite difference method. Outline: AMath 483/583 Lecture 24 Outline: Heat equation and discretization OpenMP and MPI for iterative methods Jacobi, Gauss-Seidel, SOR Notes and Sample codes: Class notes: Linear algebra software $UWHPSC/codes/openmp/jacobi1d_omp1.f90

More information

Numerical Algorithms

Numerical Algorithms Chapter 10 Slide 464 Numerical Algorithms Slide 465 Numerical Algorithms In textbook do: Matrix multiplication Solving a system of linear equations Slide 466 Matrices A Review An n m matrix Column a 0,0

More information

An Approximate Singular Value Decomposition of Large Matrices in Julia

An Approximate Singular Value Decomposition of Large Matrices in Julia An Approximate Singular Value Decomposition of Large Matrices in Julia Alexander J. Turner 1, 1 Harvard University, School of Engineering and Applied Sciences, Cambridge, MA, USA. In this project, I implement

More information

AMath 483/583 Lecture 24

AMath 483/583 Lecture 24 AMath 483/583 Lecture 24 Outline: Heat equation and discretization OpenMP and MPI for iterative methods Jacobi, Gauss-Seidel, SOR Notes and Sample codes: Class notes: Linear algebra software $UWHPSC/codes/openmp/jacobi1d_omp1.f90

More information

Matrices 4: use of MATLAB

Matrices 4: use of MATLAB Matrices 4: use of MATLAB Anthony Rossiter http://controleducation.group.shef.ac.uk/indexwebbook.html http://www.shef.ac.uk/acse Department of Automatic Control and Systems Engineering Introduction The

More information

AmgX 2.0: Scaling toward CORAL Joe Eaton, November 19, 2015

AmgX 2.0: Scaling toward CORAL Joe Eaton, November 19, 2015 AmgX 2.0: Scaling toward CORAL Joe Eaton, November 19, 2015 Agenda Introduction to AmgX Current Capabilities Scaling V2.0 Roadmap for the future 2 AmgX Fast, scalable linear solvers, emphasis on iterative

More information

Spline Curves. Spline Curves. Prof. Dr. Hans Hagen Algorithmic Geometry WS 2013/2014 1

Spline Curves. Spline Curves. Prof. Dr. Hans Hagen Algorithmic Geometry WS 2013/2014 1 Spline Curves Prof. Dr. Hans Hagen Algorithmic Geometry WS 2013/2014 1 Problem: In the previous chapter, we have seen that interpolating polynomials, especially those of high degree, tend to produce strong

More information

An iterative solver benchmark 1

An iterative solver benchmark 1 223 An iterative solver benchmark 1 Jack Dongarra, Victor Eijkhout and Henk van der Vorst Revised 31 August 2001 We present a benchmark of iterative solvers for sparse matrices. The benchmark contains

More information

Simulating tsunami propagation on parallel computers using a hybrid software framework

Simulating tsunami propagation on parallel computers using a hybrid software framework Simulating tsunami propagation on parallel computers using a hybrid software framework Xing Simula Research Laboratory, Norway Department of Informatics, University of Oslo March 12, 2007 Outline Intro

More information

Lab # 2 - ACS I Part I - DATA COMPRESSION in IMAGE PROCESSING using SVD

Lab # 2 - ACS I Part I - DATA COMPRESSION in IMAGE PROCESSING using SVD Lab # 2 - ACS I Part I - DATA COMPRESSION in IMAGE PROCESSING using SVD Goals. The goal of the first part of this lab is to demonstrate how the SVD can be used to remove redundancies in data; in this example

More information

Sparse Linear Algebra

Sparse Linear Algebra Lecture 5 Sparse Linear Algebra The solution of a linear system Ax = b is one of the most important computational problems in scientific computing. As we shown in the previous section, these linear systems

More information

Handling Parallelisation in OpenFOAM

Handling Parallelisation in OpenFOAM Handling Parallelisation in OpenFOAM Hrvoje Jasak hrvoje.jasak@fsb.hr Faculty of Mechanical Engineering and Naval Architecture University of Zagreb, Croatia Handling Parallelisation in OpenFOAM p. 1 Parallelisation

More information

Coupled Finite Element Method Based Vibroacoustic Analysis of Orion Spacecraft

Coupled Finite Element Method Based Vibroacoustic Analysis of Orion Spacecraft Coupled Finite Element Method Based Vibroacoustic Analysis of Orion Spacecraft Lockheed Martin Space Systems Company (LMSSC) Spacecraft and Launch Vehicle Dynamic Environments Workshop June 21 23, 2016

More information

The clustering in general is the task of grouping a set of objects in such a way that objects

The clustering in general is the task of grouping a set of objects in such a way that objects Spectral Clustering: A Graph Partitioning Point of View Yangzihao Wang Computer Science Department, University of California, Davis yzhwang@ucdavis.edu Abstract This course project provide the basic theory

More information

All use is subject to licence. See For any commercial application, a separate license must be signed.

All use is subject to licence. See  For any commercial application, a separate license must be signed. HSL HSL MI20 PACKAGE SPECIFICATION HSL 2007 1 SUMMARY Given an n n sparse matrix A and an n vector z, HSL MI20 computes the vector x = Mz, where M is an algebraic multigrid (AMG) v-cycle preconditioner

More information

GPU-Accelerated Algebraic Multigrid for Commercial Applications. Joe Eaton, Ph.D. Manager, NVAMG CUDA Library NVIDIA

GPU-Accelerated Algebraic Multigrid for Commercial Applications. Joe Eaton, Ph.D. Manager, NVAMG CUDA Library NVIDIA GPU-Accelerated Algebraic Multigrid for Commercial Applications Joe Eaton, Ph.D. Manager, NVAMG CUDA Library NVIDIA ANSYS Fluent 2 Fluent control flow Accelerate this first Non-linear iterations Assemble

More information

10/24/ Rotations. 2. // s left subtree s right subtree 3. if // link s parent to elseif == else 11. // put x on s left

10/24/ Rotations. 2. // s left subtree s right subtree 3. if // link s parent to elseif == else 11. // put x on s left 13.2 Rotations MAT-72006 AA+DS, Fall 2013 24-Oct-13 368 LEFT-ROTATE(, ) 1. // set 2. // s left subtree s right subtree 3. if 4. 5. // link s parent to 6. if == 7. 8. elseif == 9. 10. else 11. // put x

More information

Optimizing Data Locality for Iterative Matrix Solvers on CUDA

Optimizing Data Locality for Iterative Matrix Solvers on CUDA Optimizing Data Locality for Iterative Matrix Solvers on CUDA Raymond Flagg, Jason Monk, Yifeng Zhu PhD., Bruce Segee PhD. Department of Electrical and Computer Engineering, University of Maine, Orono,

More information