Prof. Dr. Stefan Funken, Prof. Dr. Alexander Keller, Prof. Dr. Karsten Urban, 11. Januar 2007. Scientific Computing: Parallele Algorithmen
Parallel Numerical Algorithms

Iterative Methods

Flowchart with suggestions for the selection of iterative methods:

Is the matrix symmetric?
  yes: Is the matrix definite?
    yes: Are the outer eigenvalues known?
      yes: try Chebyshev or CG
      no:  try CG
    no:  try MinRES or CG
  no: Is the transpose available?
    yes: try QMR
    no:  Is storage at a premium?
      yes: try CGS or Bi-CGSTAB
      no:  try GMRES with long restart
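The flowchart above can be sketched as a small chooser function. This is an illustration only: the function name and boolean flags are made up here, and the branch order follows one plausible reading of the flowchart.

```python
def suggest_solver(symmetric, definite=False, outer_eigenvalues_known=False,
                   transpose_available=False, storage_at_premium=False):
    """Return the iterative method suggested by the selection flowchart."""
    if symmetric:
        if definite:
            # symmetric positive definite branch
            return "Chebyshev or CG" if outer_eigenvalues_known else "CG"
        return "MinRES or CG"            # symmetric indefinite
    if transpose_available:
        return "QMR"
    if storage_at_premium:
        return "CGS or Bi-CGSTAB"
    return "GMRES with long restart"
```

For example, a symmetric positive definite matrix with unknown eigenvalues yields plain CG, while a general nonsymmetric matrix with plenty of storage yields GMRES with a long restart.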
Iterative Methods

This book (the Templates book) is also available in PostScript from ftp.netlib.org/templates/templates.ps.
Iterative Methods for Algebraic Eigenvalue Problems

There is also a similar book for algebraic eigenvalue problems. It is likewise available as an online document at dongarra/etemplates/book.html.
Preconditioner

The convergence rate of iterative methods depends on spectral properties of the coefficient matrix. Example: for the CG method,

    \| x - x^{(k)} \|_A \le 2 \rho^k \| x - x^{(0)} \|_A
    with \rho := \frac{\sqrt{\kappa_2(A)} - 1}{\sqrt{\kappa_2(A)} + 1}
    and \| x - x^{(k)} \|_A^2 := \langle x - x^{(k)}, A(x - x^{(k)}) \rangle.

Note: the number of iterations needed to reach a relative reduction of \epsilon in the error is proportional to \sqrt{\kappa_2(A)}.
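The bound above can be evaluated numerically. This is a sketch, not code from the slides; the helper names are made up.

```python
import math

def cg_error_bound(kappa, k):
    """Upper bound 2 * rho**k on ||x - x^(k)||_A / ||x - x^(0)||_A."""
    rho = (math.sqrt(kappa) - 1.0) / (math.sqrt(kappa) + 1.0)
    return 2.0 * rho ** k

def iterations_needed(kappa, eps):
    """Smallest k with 2 * rho**k <= eps; grows like sqrt(kappa)."""
    rho = (math.sqrt(kappa) - 1.0) / (math.sqrt(kappa) + 1.0)
    return math.ceil(math.log(eps / 2.0) / math.log(rho))
```

Quadrupling the condition number roughly doubles the iteration count, which is the \sqrt{\kappa_2(A)} dependence stated above.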
Hence one may attempt to transform the linear system into one that is equivalent, in the sense that it has the same solution, but that has more favorable spectral properties. A preconditioner is a matrix that effects such a transformation. For instance, if a matrix W approximates the coefficient matrix A in some way, the transformed system W^{-1} A x = W^{-1} b has the same solution as the original system A x = b, but the spectral properties of its coefficient matrix W^{-1} A may be more favorable.
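A tiny check of the "same solution" claim, using W = diag(A) (the Jacobi preconditioner). The 2x2 example matrix is chosen here for illustration and is not from the slides.

```python
# Solve a small 2x2 system directly, then verify that the Jacobi-
# preconditioned system W^{-1} A x = W^{-1} b has the same solution.
A = [[4.0, 1.0],
     [1.0, 3.0]]
b = [1.0, 2.0]

# direct solve via Cramer's rule
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
x = [(b[0] * A[1][1] - A[0][1] * b[1]) / det,
     (A[0][0] * b[1] - b[0] * A[1][0]) / det]

# preconditioned system: each row scaled by 1 / a_ii
WA = [[A[i][j] / A[i][i] for j in range(2)] for i in range(2)]
Wb = [b[i] / A[i][i] for i in range(2)]

# x also satisfies the preconditioned system: the residual vanishes
residual = [sum(WA[i][j] * x[j] for j in range(2)) - Wb[i] for i in range(2)]
```

The transformed matrix W^{-1} A has unit diagonal, which for many problems already improves the spectrum.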
In devising a preconditioner, we are faced with a choice between finding a matrix W that approximates A, and for which solving a system is easier than solving one with A, or finding a matrix W that approximates A^{-1}, so that only multiplication by W is needed. The majority of preconditioners fall into the first category.

On parallel machines there is a further trade-off between the efficacy of a preconditioner in the classical sense and its parallel efficiency. Many of the traditional preconditioners have a large sequential component.
We consider the following parallel preconditioners:
1. Richardson method,
2. Jacobi method,
3. non-overlapping domain decomposition,
4. and the parallelization of the Gauß-Seidel and SOR methods with
   - wavefront numbering,
   - red-black numbering.
Wavefront Numbering Algorithm

1. On each diagonal, each component can be computed separately.
2. The work load is unbalanced.
3. The maximal possible speed-up in a P x P mesh is P/2.
4. What about more general (non-square) meshes?
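Point 1 can be checked for the 5-point stencil with a short script (the grid size n is arbitrary; this is an illustration, not code from the lecture). Under lexicographic Gauß-Seidel, updating node (i, j) needs the already-updated values at (i-1, j) and (i, j-1), and both of these lie on the previous anti-diagonal, so all nodes of one anti-diagonal are independent.

```python
# Wavefront (anti-diagonal) ordering of a Gauss-Seidel sweep for the
# 5-point stencil on an n x n grid.
n = 4
diagonals = [[(i, d - i) for i in range(n) if 0 <= d - i < n]
             for d in range(2 * n - 1)]

def new_deps(i, j):
    """Already-updated neighbours of (i, j) in a lexicographic sweep."""
    return [(a, b) for (a, b) in [(i - 1, j), (i, j - 1)]
            if 0 <= a < n and 0 <= b < n]

# every such dependency lies on the previous anti-diagonal d - 1
ok = all(a + b == d - 1
         for d, diag in enumerate(diagonals)
         for (i, j) in diag
         for (a, b) in new_deps(i, j))
```

The longest diagonal has n nodes while the first and last have one, which is the load imbalance noted in point 2.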
Wavefront Numbering Algorithm (general meshes)

1. Start at a node such that the number of layers is minimal.
2. Mark the next layer and update as many nodes as possible.
3. Update the remaining nodes before marking the next layer.
4. Continue with step 2.
Block-Strips Algorithm

1. The block strips are computed one after another.
2. The work load is balanced (optimal for kp x kp meshes).
3. The maximal possible speed-up is kp/(k + 1).
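A quick numeric check of the speed-up formula in point 3 (the helper name is made up): for a fixed number of processors p, the speed-up approaches p as the strip factor k grows.

```python
def block_strip_speedup(k, p):
    """Maximal possible speed-up for a kp x kp mesh with p processors."""
    return k * p / (k + 1)
```

With p = 4 processors, k = 1 gives only a speed-up of 2, while k = 9 already gives 3.6.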
Red-Black Numbering

What happens if we number all red nodes first?
[The slide shows the system matrix after red-black reordering; the matrix entries did not survive the transcription.]
What happens if we number all red nodes first? Properties:
1. We obtain the FEM matrix with swapped rows and columns.
2. It is a block matrix.
3. The diagonal blocks are diagonal matrices.
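Property 3 can be verified directly for the 5-point stencil: neighbouring grid nodes always receive different colours, so there is no coupling within one colour class and the diagonal blocks of the reordered matrix are diagonal. The grid size n below is arbitrary; this is an illustration, not code from the lecture.

```python
# Red-black colouring of an n x n grid under the 5-point stencil.
n = 4

def colour(i, j):
    return (i + j) % 2          # 0 = red, 1 = black

def neighbours(i, j):
    return [(a, b) for (a, b) in
            [(i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)]
            if 0 <= a < n and 0 <= b < n]

# no stencil entry couples two nodes of the same colour
no_same_colour_coupling = all(colour(i, j) != colour(a, b)
                              for i in range(n) for j in range(n)
                              for (a, b) in neighbours(i, j))
```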
Jacobi / Gauß-Seidel Iteration

Consider the system A x = b and the decomposition A = L + D + U.

Sequential version of the Jacobi iteration:

    x^{(k+1)} := D^{-1} (b - L x^{(k)} - U x^{(k)})

If D^{-1} is available on each processor, communication is only necessary to exchange parts of x^{(k+1)} after updating.

Sequential version of the Gauß-Seidel iteration:

    x^{(k+1)} := D^{-1} (b - L x^{(k+1)} - U x^{(k)})

or

    x^{(k+1)} := D^{-1} (b - L x^{(k)} - U x^{(k+1)})
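The two iterations written component-wise, as a sketch for a small dense system (the example matrix is chosen here for illustration; this is not an optimised implementation):

```python
def jacobi_step(A, b, x):
    """One Jacobi step: every component uses only old values of x."""
    n = len(b)
    return [(b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i))
            / A[i][i] for i in range(n)]

def gauss_seidel_step(A, b, x):
    """One Gauss-Seidel step: components are overwritten in place,
    so later components already see the new values."""
    n = len(b)
    x = list(x)
    for i in range(n):
        x[i] = (b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) \
               / A[i][i]
    return x

A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x = [0.0, 0.0]
for _ in range(50):
    x = gauss_seidel_step(A, b, x)
# x converges to the exact solution (1/11, 7/11) of this system
```

The only structural difference is whether the inner sum reads old or freshly updated components, which is exactly the sequential dependency that the following slides try to break.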
Parallel Gauß-Seidel Iteration (Red-Black Numbering)

Let A = (a_{ij}) \in R^{n x n}. Assume we have two disjoint index sets I_red and I_black such that a_{ij} = 0 for all i \ne j with i, j \in I_red resp. i, j \in I_black.

Parallel version of the Gauß-Seidel iteration:

    x^{(k+1)}_red   := D^{-1}_red   (b_red   - (L_rb + U_rb) x^{(k)}_black)
    x^{(k+1)}_black := D^{-1}_black (b_black - (L_br + U_br) x^{(k+1)}_red)

If P >= 2 it is recommended to use a block version, such that blocks of the same color need no communication.
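A sketch of the two half-steps for the 1D model problem (tridiagonal A with diagonal 2 and off-diagonals -1; the example, size, and names are chosen here, not taken from the slides). Red (even-indexed) unknowns depend only on black neighbours and vice versa, so each half-step is fully parallel.

```python
# Red-black Gauss-Seidel for the 1D model problem with zero boundary
# values: 2 x_i - x_{i-1} - x_{i+1} = b_i.
n = 7
b = [1.0] * n

def half_step(x, indices):
    """Update the unknowns in `indices`; they are mutually independent,
    so this half-step could run in parallel."""
    new = list(x)
    for i in indices:
        left = x[i - 1] if i > 0 else 0.0
        right = x[i + 1] if i < n - 1 else 0.0
        new[i] = (b[i] + left + right) / 2.0
    return new

red = range(0, n, 2)
black = range(1, n, 2)
x = [0.0] * n
for _ in range(200):
    x = half_step(x, red)      # reads only black values
    x = half_step(x, black)    # reads the freshly updated red values
```

For b = 1 the exact solution is x_i = (i + 1)(n - i) / 2 in 0-based indexing, which the iteration approaches.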
Non-overlapping Subdomains

Different index sets:
1. I: nodes in the interior of the subdomains [N_I = \sum_{j=1}^{p} N_{I,j}].
2. E: nodes in the interior of the subdomain edges [N_E = \sum_{j=1}^{n_e} N_{E,j}] (n_e = number of subdomain edges).
3. V: cross points, i.e. endpoints of subdomain edges [N_V].
Types of Vectors

Two types of vectors, depending on the storage scheme:
- Type I: u is stored on P_k as the restriction u_k = C_k u. The complete value is accessible on P_k.
- Type II: r is stored on P_k as r_k such that r = \sum_{k=1}^{p} C_k^T r_k. Nodes on the interface hold only a part of the full value.

How should we parallelize the Gauß-Seidel iteration if we have non-overlapping subdomains?

    x^{(k+1)} := D^{-1} (b - L x^{(k+1)} - U x^{(k)})

resp., written with the restriction operators,

    x^{(k+1)}_i := \Big( C_i \Big( \sum_{k=1}^{p} C_k^T \operatorname{diag}(d_k) C_k \Big)^{-1} \sum_{l=1}^{p} C_l^T \big( b - L x^{(k+1)} - U x^{(k)} \big)_l \Big)_i
Parallel Gauß-Seidel (Non-Overlapping Domains)

Consider the following ordering of the global index set: (V, E, I).

    [ A_VV  A_VE  A_VI ] [ x_V ]   [ b_V ]
    [ A_EV  A_EE  A_EI ] [ x_E ] = [ b_E ]
    [ A_IV  A_IE  A_II ] [ x_I ]   [ b_I ]
Parallel Gauß-Seidel (Non-Overlapping Domains, Draft)

Let d := {1/d_ii}_{i=1,...,n} and let \circ denote componentwise multiplication.

    r_V := b_V - A_VV x^k_V - A_VE x^k_E - A_VI x^k_I
    w_V := \sum_{l=1}^{p} C_{V,l}^T r_{V,l}        (communication)
    x^{k+1}_V := x^k_V + d_V \circ w_V

    r_E := b_E - A_EV x^{k+1}_V - A_EE x^k_E - A_EI x^k_I
    w_E := \sum_{l=1}^{p} C_{E,l}^T r_{E,l}        (communication; real Gauß-Seidel?)
    x^{k+1}_E := x^k_E + d_E \circ w_E

    r_I := b_I - A_IV x^{k+1}_V - A_IE x^{k+1}_E - A_II x^k_I
    w_I := \sum_{l=1}^{p} C_{I,l}^T r_{I,l}
    x^{k+1}_I := x^k_I + d_I \circ w_I             (no communication!)
Parallel Gauß-Seidel (Non-Overlapping Domains, Modified)

Assume at least one node on each coupling edge and no connection between different edges.

    r_V := b_V - A_VV x^k_V - A_VE x^k_E - A_VI x^k_I
    w_V := \sum_{l=1}^{p} C_{V,l}^T r_{V,l}        (communication)
    x^{k+1}_V := x^k_V + d_V \circ w_V

    r_E := b_E - A_EV x^{k+1}_V - A_EE x^k_E - A_EI x^k_I
    w_E := \sum_{l=1}^{p} C_{E,l}^T r_{E,l}
    x^{k+1}_E := x^k_E + A_EE^{-1} w_E             (A_EE block diagonal, each block tridiagonal)

    r_I := b_I - A_IV x^{k+1}_V - A_IE x^{k+1}_E - A_II x^k_I
    w_I := \sum_{l=1}^{p} C_{I,l}^T r_{I,l}
    x^{k+1}_I := x^k_I + A_II^{-1} w_I             (no communication!)
Gauß-Seidel via Jacobi

Definition: A matrix A \in R^{m x n} is called non-negative if all coefficients a_ij of A are non-negative.

Theorem [Stein and Rosenberg]: Let the iteration matrix C_J \in R^{n x n} of the Jacobi iteration be non-negative. Then exactly one of the following statements holds:
i)   \rho(C_J) = \rho(C_G) = 0,
ii)  \rho(C_J) = \rho(C_G) = 1,
iii) 0 < \rho(C_G) < \rho(C_J) < 1,
iv)  1 < \rho(C_J) < \rho(C_G).

Hence Gauß-Seidel is faster than Jacobi (for FEM matrices, also in 2D/3D)!
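The alternative iii) can be checked numerically. Since the slide's own example matrix did not survive the transcription, the matrix A = [[2, -1], [-1, 2]] is used here instead; its Jacobi iteration matrix is non-negative. The matrices C_J = -D^{-1}(L + U) and C_G = -(D + L)^{-1} U below were computed by hand for this A.

```python
import math

# Iteration matrices for A = [[2, -1], [-1, 2]], derived by hand:
C_J = [[0.0, 0.5],
       [0.5, 0.0]]
C_G = [[0.0, 0.5],
       [0.0, 0.25]]

def spectral_radius_2x2(M):
    """Spectral radius of a 2x2 matrix via its characteristic polynomial."""
    tr = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    disc = tr * tr - 4.0 * det
    if disc >= 0.0:
        r = math.sqrt(disc)
        return max(abs((tr + r) / 2.0), abs((tr - r) / 2.0))
    return math.sqrt(det)        # complex conjugate pair: |lambda|^2 = det

rho_J = spectral_radius_2x2(C_J)
rho_G = spectral_radius_2x2(C_G)
# case iii) holds: 0 < rho(C_G) < rho(C_J) < 1; here rho_G equals rho_J**2,
# i.e. one Gauss-Seidel step gains as much as two Jacobi steps
```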
Lecture 17: More Fun With Sparse Matrices David Bindel 26 Oct 2011 Logistics Thanks for info on final project ideas. HW 2 due Monday! Life lessons from HW 2? Where an error occurs may not be where you
More informationApproaches to Parallel Implementation of the BDDC Method
Approaches to Parallel Implementation of the BDDC Method Jakub Šístek Includes joint work with P. Burda, M. Čertíková, J. Mandel, J. Novotný, B. Sousedík. Institute of Mathematics of the AS CR, Prague
More informationAN IMPROVED ITERATIVE METHOD FOR SOLVING GENERAL SYSTEM OF EQUATIONS VIA GENETIC ALGORITHMS
AN IMPROVED ITERATIVE METHOD FOR SOLVING GENERAL SYSTEM OF EQUATIONS VIA GENETIC ALGORITHMS Seyed Abolfazl Shahzadehfazeli 1, Zainab Haji Abootorabi,3 1 Parallel Processing Laboratory, Yazd University,
More informationImprovements of the Discrete Dipole Approximation method
arxiv:physics/0006064v1 [physics.ao-ph] 26 Jun 2000 Improvements of the Discrete Dipole Approximation method Piotr J. Flatau Scripps Institution of Oceanography, University of California, San Diego, La
More informationOverview of Trilinos and PT-Scotch
29.03.2012 Outline PT-Scotch 1 PT-Scotch The Dual Recursive Bipartitioning Algorithm Parallel Graph Bipartitioning Methods 2 Overview of the Trilinos Packages Examples on using Trilinos PT-Scotch The Scotch
More informationOutline. Parallel Algorithms for Linear Algebra. Number of Processors and Problem Size. Speedup and Efficiency
1 2 Parallel Algorithms for Linear Algebra Richard P. Brent Computer Sciences Laboratory Australian National University Outline Basic concepts Parallel architectures Practical design issues Programming
More informationChapter Introduction
Chapter 4.1 Introduction After reading this chapter, you should be able to 1. define what a matrix is. 2. identify special types of matrices, and 3. identify when two matrices are equal. What does a matrix
More informationA parallel direct/iterative solver based on a Schur complement approach
A parallel direct/iterative solver based on a Schur complement approach Gene around the world at CERFACS Jérémie Gaidamour LaBRI and INRIA Bordeaux - Sud-Ouest (ScAlApplix project) February 29th, 2008
More informationSocial-Network Graphs
Social-Network Graphs Mining Social Networks Facebook, Google+, Twitter Email Networks, Collaboration Networks Identify communities Similar to clustering Communities usually overlap Identify similarities
More informationSpectral Graph Sparsification: overview of theory and practical methods. Yiannis Koutis. University of Puerto Rico - Rio Piedras
Spectral Graph Sparsification: overview of theory and practical methods Yiannis Koutis University of Puerto Rico - Rio Piedras Graph Sparsification or Sketching Compute a smaller graph that preserves some
More informationStudy and implementation of computational methods for Differential Equations in heterogeneous systems. Asimina Vouronikoy - Eleni Zisiou
Study and implementation of computational methods for Differential Equations in heterogeneous systems Asimina Vouronikoy - Eleni Zisiou Outline Introduction Review of related work Cyclic Reduction Algorithm
More informationLecture 11: Randomized Least-squares Approximation in Practice. 11 Randomized Least-squares Approximation in Practice
Stat60/CS94: Randomized Algorithms for Matrices and Data Lecture 11-10/09/013 Lecture 11: Randomized Least-squares Approximation in Practice Lecturer: Michael Mahoney Scribe: Michael Mahoney Warning: these
More informationOverlapping Domain Decomposition Methods
Overlapping Domain Decomposition Methods X. Cai 1,2 1 Simula Research Laboratory 2 Department of Informatics, University of Oslo Abstract. Overlapping domain decomposition methods are efficient and flexible.
More informationGraph Partitioning for High-Performance Scientific Simulations. Advanced Topics Spring 2008 Prof. Robert van Engelen
Graph Partitioning for High-Performance Scientific Simulations Advanced Topics Spring 2008 Prof. Robert van Engelen Overview Challenges for irregular meshes Modeling mesh-based computations as graphs Static
More informationChapter 1 A New Parallel Algorithm for Computing the Singular Value Decomposition
Chapter 1 A New Parallel Algorithm for Computing the Singular Value Decomposition Nicholas J. Higham Pythagoras Papadimitriou Abstract A new method is described for computing the singular value decomposition
More informationBig Data Analytics. Special Topics for Computer Science CSE CSE Feb 11
Big Data Analytics Special Topics for Computer Science CSE 4095-001 CSE 5095-005 Feb 11 Fei Wang Associate Professor Department of Computer Science and Engineering fei_wang@uconn.edu Clustering II Spectral
More informationMultigrid solvers M. M. Sussman sussmanm@math.pitt.edu Office Hours: 11:10AM-12:10PM, Thack 622 May 12 June 19, 2014 1 / 43 Multigrid Geometrical multigrid Introduction Details of GMG Summary Algebraic
More informationMATH 423 Linear Algebra II Lecture 17: Reduced row echelon form (continued). Determinant of a matrix.
MATH 423 Linear Algebra II Lecture 17: Reduced row echelon form (continued). Determinant of a matrix. Row echelon form A matrix is said to be in the row echelon form if the leading entries shift to the
More informationNumerical Implementation of Overlapping Balancing Domain Decomposition Methods on Unstructured Meshes
Numerical Implementation of Overlapping Balancing Domain Decomposition Methods on Unstructured Meshes Jung-Han Kimn 1 and Blaise Bourdin 2 1 Department of Mathematics and The Center for Computation and
More informationSolving Sparse Linear Systems. Forward and backward substitution for solving lower or upper triangular systems
AMSC 6 /CMSC 76 Advanced Linear Numerical Analysis Fall 7 Direct Solution of Sparse Linear Systems and Eigenproblems Dianne P. O Leary c 7 Solving Sparse Linear Systems Assumed background: Gauss elimination
More informationGeneralized trace ratio optimization and applications
Generalized trace ratio optimization and applications Mohammed Bellalij, Saïd Hanafi, Rita Macedo and Raca Todosijevic University of Valenciennes, France PGMO Days, 2-4 October 2013 ENSTA ParisTech PGMO
More informationParallel Numerical Algorithms
Parallel Numerical Algorithms Chapter 3 Dense Linear Systems Section 3.3 Triangular Linear Systems Michael T. Heath and Edgar Solomonik Department of Computer Science University of Illinois at Urbana-Champaign
More informationIntroduction to Parallel Computing
Introduction to Parallel Computing W. P. Petersen Seminar for Applied Mathematics Department of Mathematics, ETHZ, Zurich wpp@math. ethz.ch P. Arbenz Institute for Scientific Computing Department Informatik,
More informationIterative Solver Benchmark Jack Dongarra, Victor Eijkhout, Henk van der Vorst 2001/01/14 1 Introduction The traditional performance measurement for co
Iterative Solver Benchmark Jack Dongarra, Victor Eijkhout, Henk van der Vorst 2001/01/14 1 Introduction The traditional performance measurement for computers on scientic application has been the Linpack
More informationCG solver assignment
CG solver assignment David Bindel Nikos Karampatziakis 3/16/2010 Contents 1 Introduction 1 2 Solver parameters 2 3 Preconditioned CG 3 4 3D Laplace operator 4 5 Preconditioners for the Laplacian 5 5.1
More informationExam Design and Analysis of Algorithms for Parallel Computer Systems 9 15 at ÖP3
UMEÅ UNIVERSITET Institutionen för datavetenskap Lars Karlsson, Bo Kågström och Mikael Rännar Design and Analysis of Algorithms for Parallel Computer Systems VT2009 June 2, 2009 Exam Design and Analysis
More informationParallel Threshold-based ILU Factorization
A short version of this paper appears in Supercomputing 997 Parallel Threshold-based ILU Factorization George Karypis and Vipin Kumar University of Minnesota, Department of Computer Science / Army HPC
More informationChapter 13. Boundary Value Problems for Partial Differential Equations* Linz 2002/ page
Chapter 13 Boundary Value Problems for Partial Differential Equations* E lliptic equations constitute the third category of partial differential equations. As a prototype, we take the Poisson equation
More informationQ. Wang National Key Laboratory of Antenna and Microwave Technology Xidian University No. 2 South Taiba Road, Xi an, Shaanxi , P. R.
Progress In Electromagnetics Research Letters, Vol. 9, 29 38, 2009 AN IMPROVED ALGORITHM FOR MATRIX BANDWIDTH AND PROFILE REDUCTION IN FINITE ELEMENT ANALYSIS Q. Wang National Key Laboratory of Antenna
More informationMLR Institute of Technology
Course Name : Engineering Optimization Course Code : 56021 Class : III Year Branch : Aeronautical Engineering Year : 2014-15 Course Faculty : Mr Vamsi Krishna Chowduru, Assistant Professor Course Objective
More informationChapter 1 A New Parallel Algorithm for Computing the Singular Value Decomposition
Chapter 1 A New Parallel Algorithm for Computing the Singular Value Decomposition Nicholas J. Higham Pythagoras Papadimitriou Abstract A new method is described for computing the singular value decomposition
More informationAMath 483/583 Lecture 24. Notes: Notes: Steady state diffusion. Notes: Finite difference method. Outline:
AMath 483/583 Lecture 24 Outline: Heat equation and discretization OpenMP and MPI for iterative methods Jacobi, Gauss-Seidel, SOR Notes and Sample codes: Class notes: Linear algebra software $UWHPSC/codes/openmp/jacobi1d_omp1.f90
More informationNumerical Algorithms
Chapter 10 Slide 464 Numerical Algorithms Slide 465 Numerical Algorithms In textbook do: Matrix multiplication Solving a system of linear equations Slide 466 Matrices A Review An n m matrix Column a 0,0
More informationAn Approximate Singular Value Decomposition of Large Matrices in Julia
An Approximate Singular Value Decomposition of Large Matrices in Julia Alexander J. Turner 1, 1 Harvard University, School of Engineering and Applied Sciences, Cambridge, MA, USA. In this project, I implement
More informationAMath 483/583 Lecture 24
AMath 483/583 Lecture 24 Outline: Heat equation and discretization OpenMP and MPI for iterative methods Jacobi, Gauss-Seidel, SOR Notes and Sample codes: Class notes: Linear algebra software $UWHPSC/codes/openmp/jacobi1d_omp1.f90
More informationMatrices 4: use of MATLAB
Matrices 4: use of MATLAB Anthony Rossiter http://controleducation.group.shef.ac.uk/indexwebbook.html http://www.shef.ac.uk/acse Department of Automatic Control and Systems Engineering Introduction The
More informationAmgX 2.0: Scaling toward CORAL Joe Eaton, November 19, 2015
AmgX 2.0: Scaling toward CORAL Joe Eaton, November 19, 2015 Agenda Introduction to AmgX Current Capabilities Scaling V2.0 Roadmap for the future 2 AmgX Fast, scalable linear solvers, emphasis on iterative
More informationSpline Curves. Spline Curves. Prof. Dr. Hans Hagen Algorithmic Geometry WS 2013/2014 1
Spline Curves Prof. Dr. Hans Hagen Algorithmic Geometry WS 2013/2014 1 Problem: In the previous chapter, we have seen that interpolating polynomials, especially those of high degree, tend to produce strong
More informationAn iterative solver benchmark 1
223 An iterative solver benchmark 1 Jack Dongarra, Victor Eijkhout and Henk van der Vorst Revised 31 August 2001 We present a benchmark of iterative solvers for sparse matrices. The benchmark contains
More informationSimulating tsunami propagation on parallel computers using a hybrid software framework
Simulating tsunami propagation on parallel computers using a hybrid software framework Xing Simula Research Laboratory, Norway Department of Informatics, University of Oslo March 12, 2007 Outline Intro
More informationLab # 2 - ACS I Part I - DATA COMPRESSION in IMAGE PROCESSING using SVD
Lab # 2 - ACS I Part I - DATA COMPRESSION in IMAGE PROCESSING using SVD Goals. The goal of the first part of this lab is to demonstrate how the SVD can be used to remove redundancies in data; in this example
More informationSparse Linear Algebra
Lecture 5 Sparse Linear Algebra The solution of a linear system Ax = b is one of the most important computational problems in scientific computing. As we shown in the previous section, these linear systems
More informationHandling Parallelisation in OpenFOAM
Handling Parallelisation in OpenFOAM Hrvoje Jasak hrvoje.jasak@fsb.hr Faculty of Mechanical Engineering and Naval Architecture University of Zagreb, Croatia Handling Parallelisation in OpenFOAM p. 1 Parallelisation
More informationCoupled Finite Element Method Based Vibroacoustic Analysis of Orion Spacecraft
Coupled Finite Element Method Based Vibroacoustic Analysis of Orion Spacecraft Lockheed Martin Space Systems Company (LMSSC) Spacecraft and Launch Vehicle Dynamic Environments Workshop June 21 23, 2016
More informationThe clustering in general is the task of grouping a set of objects in such a way that objects
Spectral Clustering: A Graph Partitioning Point of View Yangzihao Wang Computer Science Department, University of California, Davis yzhwang@ucdavis.edu Abstract This course project provide the basic theory
More informationAll use is subject to licence. See For any commercial application, a separate license must be signed.
HSL HSL MI20 PACKAGE SPECIFICATION HSL 2007 1 SUMMARY Given an n n sparse matrix A and an n vector z, HSL MI20 computes the vector x = Mz, where M is an algebraic multigrid (AMG) v-cycle preconditioner
More informationGPU-Accelerated Algebraic Multigrid for Commercial Applications. Joe Eaton, Ph.D. Manager, NVAMG CUDA Library NVIDIA
GPU-Accelerated Algebraic Multigrid for Commercial Applications Joe Eaton, Ph.D. Manager, NVAMG CUDA Library NVIDIA ANSYS Fluent 2 Fluent control flow Accelerate this first Non-linear iterations Assemble
More information10/24/ Rotations. 2. // s left subtree s right subtree 3. if // link s parent to elseif == else 11. // put x on s left
13.2 Rotations MAT-72006 AA+DS, Fall 2013 24-Oct-13 368 LEFT-ROTATE(, ) 1. // set 2. // s left subtree s right subtree 3. if 4. 5. // link s parent to 6. if == 7. 8. elseif == 9. 10. else 11. // put x
More informationOptimizing Data Locality for Iterative Matrix Solvers on CUDA
Optimizing Data Locality for Iterative Matrix Solvers on CUDA Raymond Flagg, Jason Monk, Yifeng Zhu PhD., Bruce Segee PhD. Department of Electrical and Computer Engineering, University of Maine, Orono,
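The preconditioning idea from the slides — transforming Ax = b into W⁻¹Ax = W⁻¹b, which has the same solution but a coefficient matrix with more favorable spectral properties — can be illustrated with a small NumPy sketch. The matrix and the Jacobi (diagonal) choice of W below are illustrative assumptions, not taken from the lecture:

```python
import numpy as np

# Illustrative example: a matrix with a strongly varying diagonal,
# where a Jacobi (diagonal) preconditioner W = diag(A) helps a lot.
n = 4
D = np.diag([1.0, 10.0, 100.0, 1000.0])
E = 0.1 * (np.ones((n, n)) - np.eye(n))   # weak off-diagonal coupling
A = D + E
b = np.ones(n)

W = np.diag(np.diag(A))                   # Jacobi preconditioner, W ~ A
M = np.linalg.solve(W, A)                 # transformed matrix W^{-1} A

# Same solution, much better conditioned coefficient matrix:
x_orig = np.linalg.solve(A, b)
x_prec = np.linalg.solve(M, np.linalg.solve(W, b))

print(np.linalg.cond(A))                  # large: dominated by diagonal spread
print(np.linalg.cond(M))                  # close to 1
print(np.allclose(x_orig, x_prec))        # both systems give the same x
```

In practice W⁻¹A is never formed explicitly; preconditioned iterations such as CG or GMRES only require one solve with W per step (this is, for instance, how the preconditioner argument of sparse iterative solvers is typically used).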
More information