Parallel resolution of sparse linear systems by mixing direct and iterative methods

1 Parallel resolution of sparse linear systems by mixing direct and iterative methods
Phyleas Meeting, Bordeaux
J. Gaidamour, P. Hénon, J. Roman, Y. Saad
LaBRI and INRIA Bordeaux - Sud-Ouest (ScAlApplix project), France
University of Minnesota, USA
May, Bordeaux, France
An hybrid direct/iterative solver 1 / 27

2 Outline
1 Introduction
2 Hybrid Solver
  - Schur complement techniques
  - Ordering and partitioning of the Schur complement
3 Parallelization
4 Experimental results
5 Conclusion

4 Motivation of this work
The most popular algebraic methods for solving a large sparse linear system A.x = r are:
- Direct methods (exact factorization)
  - Build a dense block structure of the factor (BLAS 3)
  - Solutions are highly accurate
  - High memory consumption (unable to solve very large 3D problems)
- Preconditioned iterative methods
  - Robustness depends on how much memory is allowed in the preconditioner
  - Based on scalar implementations (e.g. ILU(k) or ILUT)
  - Convergence is difficult on very ill-conditioned systems
We want a trade-off: a solver that can solve difficult problems and that requires less memory than a direct solver.

5 Our approach
HIPS: Hierarchical Iterative Parallel Solver
Goals:
- Generic algebraic approach: no information about the problem is required (black box).
- Reuse direct solver technologies (BLAS, symbolic factorization).
- Take advantage of the parallelism of domain-decomposition-like methods.

6 Our approach
HIPS: Hierarchical Iterative Parallel Solver
Characteristics:
- Build a decomposition of the adjacency graph of the system into a set of subdomains with overlap.
- Direct methods inside the subdomains, iterative resolution on the interfaces.
- Robust preconditioner on the Schur complement: the number of iterations depends only weakly on the number of domains, so small subdomains can be used to reduce memory consumption.

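8 Schur complement (1/2)
The linear system A.x = r can be written in block form, where B indexes the unknowns interior to the subdomains and C the interface unknowns:
\[
\begin{pmatrix} A_B & F \\ E & A_C \end{pmatrix}
=
\begin{pmatrix} L_B & 0 \\ E U_B^{-1} & S \end{pmatrix}
\begin{pmatrix} U_B & L_B^{-1} F \\ 0 & I \end{pmatrix}
\tag{1}
\]
The system A.x = r can be solved in three steps:
\[
\begin{cases}
A_B z_B = r_B \\
S x_C = r_C - E z_B \\
A_B x_B = r_B - F x_C
\end{cases}
\tag{2}
\]
with
\[
S = A_C - E A_B^{-1} F = A_C - E U_B^{-1} L_B^{-1} F
\]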

9 Schur complement (2/2)
Use of the Schur complement:
- A_B = L_B.U_B: exact factorization
  - direct resolution of subsystems (1) and (3)
  - the interior of each subdomain can be computed independently
- S ≈ L_S.U_S: incomplete factorization
  - (2) is solved by a preconditioned Krylov subspace method: the Schur complement system is solved by a preconditioned GMRES.
\[
\left\{
\begin{array}{ll}
A_B z_B = r_B & (1) \\
S x_C = r_C - E z_B & (2) \\
A_B x_B = r_B - F x_C & (3)
\end{array}
\right.
\]
Iterative resolution: iterating on S is numerically equivalent to iterating on the whole system A.
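
The three steps can be sketched on a small dense example. This is a minimal illustration under stated assumptions, not HIPS code: `np.linalg.solve` stands in for the exact factorization A_B = L_B.U_B, the GMRES is a bare, unpreconditioned textbook version written for this sketch, and the names `gmres`, `schur_solve` and the random test matrix are hypothetical.

```python
import numpy as np

def gmres(matvec, b, tol=1e-10):
    """Minimal full (no-restart) GMRES; returns (solution, iteration count)."""
    n = b.size
    beta = np.linalg.norm(b)
    Q = np.zeros((n, n + 1))              # Arnoldi basis vectors
    H = np.zeros((n + 1, n))              # upper Hessenberg matrix
    Q[:, 0] = b / beta
    for k in range(n):
        w = matvec(Q[:, k])
        for j in range(k + 1):            # modified Gram-Schmidt
            H[j, k] = Q[:, j] @ w
            w = w - H[j, k] * Q[:, j]
        H[k + 1, k] = np.linalg.norm(w)
        e1 = np.zeros(k + 2)
        e1[0] = beta
        y = np.linalg.lstsq(H[:k + 2, :k + 1], e1, rcond=None)[0]
        res = np.linalg.norm(H[:k + 2, :k + 1] @ y - e1)
        if res <= tol * beta or H[k + 1, k] <= 1e-14:
            break
        Q[:, k + 1] = w / H[k + 1, k]
    return Q[:, :k + 1] @ y, k + 1

def schur_solve(A, r, n_b):
    """Solve A.x = r with the first n_b unknowns treated as interiors (B)."""
    A_B, F = A[:n_b, :n_b], A[:n_b, n_b:]
    E, A_C = A[n_b:, :n_b], A[n_b:, n_b:]
    r_B, r_C = r[:n_b], r[n_b:]

    def solve_B(v):
        # Assumption: a dense solve replaces the L_B, U_B triangular solves.
        return np.linalg.solve(A_B, v)

    def S_times(v):
        # Implicit Schur product: S is never assembled.
        return A_C @ v - E @ solve_B(F @ v)

    z_B = solve_B(r_B)                              # step (1)
    x_C, iters = gmres(S_times, r_C - E @ z_B)      # step (2)
    x_B = solve_B(r_B - F @ x_C)                    # step (3)
    return np.concatenate([x_B, x_C]), iters

rng = np.random.default_rng(0)
n, n_b = 12, 8
A = rng.standard_normal((n, n)) + n * np.eye(n)     # hypothetical test matrix
r = rng.standard_normal(n)
x, iters = schur_solve(A, r, n_b)
print(iters, np.linalg.norm(r - A @ x) / np.linalg.norm(r))
```

The Krylov iteration only ever sees vectors of the interface size (here 4), which is why the iteration count is bounded by the dimension of S rather than by the dimension of A.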

10 Ordering and partitioning of the Schur complement
We need a special ordering of the Schur complement to compute a block incomplete factorization. The unknowns in the interface are ordered according to a Hierarchical Interface Decomposition (Hénon, Saad, SIAM SISC).
[Figure: 8 x 8 grid showing interior points, domain edges and the cross-point; the reordered matrix.]
We use the quotient graph induced by this partition to define block incomplete factorizations.
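
The grouping of unknowns on the 8 x 8 grid can be illustrated with a toy classification. This is only a sketch of the idea, not the HID algorithm of Hénon and Saad: it assumes a single separator row and column at index 4 (a hypothetical choice) and groups each node by the set of subdomains it touches, so interior points touch one subdomain, domain edges two, and the cross-point four.

```python
from collections import defaultdict

def domains_of(i, j, sep):
    """Set of subdomains (quadrants) that grid node (i, j) belongs to."""
    def sides(k):
        if k < sep:
            return (0,)
        if k > sep:
            return (1,)
        return (0, 1)            # on the separator: shared by both sides
    return frozenset((a, b) for a in sides(i) for b in sides(j))

def hid_groups(size=8, sep=4):
    """Group the nodes of a size x size grid by the subdomains they touch."""
    groups = defaultdict(list)
    for i in range(size):
        for j in range(size):
            groups[domains_of(i, j, sep)].append((i, j))
    return groups

groups = hid_groups()
level_sizes = defaultdict(int)   # number of shared subdomains -> node count
for key, nodes in groups.items():
    level_sizes[len(key)] += len(nodes)
for level in sorted(level_sizes):
    print(level, level_sizes[level])
```

Each key of `groups` plays the role of one block in the quotient graph: four interiors, four edges and one cross-point in this configuration.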

11 Preconditioning the Schur complement
Non-zero pattern of the global factors obtained on a small matrix (fill-in allowed only in the local Schur complement):
\[
\begin{pmatrix} L_B & \\ E U_B^{-1} & S \end{pmatrix}
\]
How to avoid the memory cost of E.U_B^{-1} and S in 3D problems:
- ILUT: E.U_B^{-1} is numerically sparsified, and the factors of S are sparsified during their computation (left-looking approach).
- S does not need to be stored to compute the Schur product, thanks to its implicit formulation:
\[
(A_C - E U_B^{-1} L_B^{-1} F)\, x
\]
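
A small dense sketch of the two ideas follows: the implicit Schur product, and numerical sparsification of E.U_B^{-1}. It rests on several assumptions: the LU routine has no pivoting and is only suitable for the well-conditioned random matrix used here, `np.linalg.solve` replaces true triangular substitutions, and the dropping rule in `drop_small` (threshold relative to the row norm) is an illustrative choice, not necessarily the rule used in HIPS.

```python
import numpy as np

def lu_nopivot(M):
    """Doolittle LU without pivoting (assumes nonzero pivots)."""
    n = M.shape[0]
    L, U = np.eye(n), M.astype(float).copy()
    for k in range(n - 1):
        L[k + 1:, k] = U[k + 1:, k] / U[k, k]
        U[k + 1:, :] -= np.outer(L[k + 1:, k], U[k, :])
    return L, np.triu(U)

def drop_small(M, tau):
    """Zero the entries smaller than tau times the 2-norm of their row."""
    row_norms = np.linalg.norm(M, axis=1, keepdims=True)
    return np.where(np.abs(M) >= tau * row_norms, M, 0.0)

rng = np.random.default_rng(1)
n, n_b = 12, 8
A = rng.standard_normal((n, n)) + n * np.eye(n)   # hypothetical test matrix
A_B, F = A[:n_b, :n_b], A[:n_b, n_b:]
E, A_C = A[n_b:, :n_b], A[n_b:, n_b:]
L_B, U_B = lu_nopivot(A_B)

def schur_product(v):
    """(A_C - E.U_B^-1.L_B^-1.F).v without ever forming S."""
    t = np.linalg.solve(L_B, F @ v)   # stands in for a forward substitution
    t = np.linalg.solve(U_B, t)       # stands in for a backward substitution
    return A_C @ v - E @ t

# Explicit S, assembled here only to check the implicit product.
W = np.linalg.solve(U_B.T, E.T).T     # W = E.U_B^-1
G = np.linalg.solve(L_B, F)           # G = L_B^-1.F
S = A_C - W @ G

W_sparse = drop_small(W, tau=0.1)     # numerically sparsified E.U_B^-1
v = rng.standard_normal(n - n_b)
print(np.count_nonzero(W), np.count_nonzero(W_sparse))
```

The implicit product costs two triangular solves and three sparse products per application, but it avoids storing S, which is the dominant memory term in 3D problems according to the slide.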

12 Preconditioning the Schur complement
To reduce memory consumption and enhance parallelism, we also defined another block fill-in pattern for the factors.
[Figure: block fill-in patterns under locally consistent rules and strictly consistent rules.]
Strictly consistent rules: no fill-in is allowed between connectors of the same level (same block pattern as A), in order to keep the block-diagonal pattern induced by the HID ordering.

14 Unknown elimination in parallel
We build a decomposition of the adjacency graph of the system into a set of small subdomains ( nodes).
Communications between processors can be overlapped with the elimination of local subdomains.

15 Construction of the domain partition
Why small subdomains:
- Low memory requirements (the direct part stays limited),
- Convergence independent of the number of processors,
- The number of subdomains becomes a parameter that controls the memory/convergence trade-off according to the problem difficulty,
- High potential parallelism (multiple domains per processor).

16 Load balancing
Distribution of the subdomains over the available processors:
- Load balancing using a graph partitioner (SCOTCH).
- Balancing of the S.x computation (solving step), using the symbolic factorization to compute the number of non-zeros (NNZ) in the interiors of the subdomains.
- Election of the processor responsible for computing each piece of interface (connector).
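
To make the weighting idea concrete, the sketch below distributes subdomains over processors using their NNZ counts as weights. It deliberately swaps in a simple greedy heuristic (largest weight first, assigned to the least loaded processor) in place of the SCOTCH graph partitioner named on the slide, and it ignores the graph structure entirely. The function name `balance` and the NNZ values are hypothetical.

```python
import heapq

def balance(weights, n_proc):
    """Greedy assignment: heaviest subdomain first, to the least loaded processor."""
    heap = [(0, p) for p in range(n_proc)]      # (current load, processor)
    owner = {}
    for dom in sorted(range(len(weights)), key=lambda d: -weights[d]):
        load, p = heapq.heappop(heap)
        owner[dom] = p
        heapq.heappush(heap, (load + weights[dom], p))
    loads = [0] * n_proc
    for dom, p in owner.items():
        loads[p] += weights[dom]
    return owner, loads

nnz_per_domain = [9, 7, 6, 5, 4, 3]             # hypothetical NNZ counts
owner, loads = balance(nnz_per_domain, n_proc=2)
print(loads)
```

Having many more subdomains than processors is what gives such a scheme room to even out the loads.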

18 Test cases
Experimental conditions:
- 10 nodes of 2.6 GHz quad dual-core Opteron (Myrinet)
- Partitioner: Scotch
- Stopping criterion: ||b - A.x|| / ||b|| < 10^-7, no restart in GMRES
Test cases:
- Haltere, Amande (CEA/CESTA)
- Symmetric complex matrices
- 3D electromagnetism problems (Helmholtz operator)
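
The stopping criterion is a relative residual test, which can be written in a few lines. This is a generic sketch with a hypothetical function name and a tiny made-up system, not code from the solver.

```python
import numpy as np

def has_converged(A, x, b, tol=1e-7):
    """True when the relative residual ||b - A.x|| / ||b|| is below tol."""
    return np.linalg.norm(b - A @ x) / np.linalg.norm(b) < tol

A = np.array([[2.0, 0.0], [0.0, 2.0]])
b = np.array([2.0, 4.0])
print(has_converged(A, np.array([1.0, 2.0]), b))   # exact solution
print(has_converged(A, np.array([1.0, 2.1]), b))   # perturbed solution
```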

19 Test case: Haltere (sequential study)
Haltere (CEA/CESTA): n = 1,288,825; nnz(A) = 10,476,775; fill ratio: x
HIPS: ILUT (locally consistent, τ = 0.01, 10^-7)
[Table columns: # domains, Precond. (sec.), Solve (sec.), Total (sec.), Iter., Fill ratio]

20 Test case: Haltere (sequential study)
Convergence/time for several parameters with two different domain size parameters:
[Figure: two panels, "Domain size set to 1000 (1021 domains)" and "Domain size set to (119 domains)"; relative residual norm versus time (sec.); curves: "Strictly consistent, t = 0.01", "Strictly consistent, t = ", "Locally consistent, t = 0.01", "Locally consistent, t = "; preconditioning time = curve offset.]

21 Test case: Haltere (parallel study)
HIPS: ILUT (τ = 0.01, 10^-7)
- 1021 domains of 1481 nodes
- fill ratio in precond: 5.70 (peak)
- dim(S) = 14.26% of dim(A)
Strictly consistent: 21 iterations, fill ratio in solve: 5.52
[Table columns: # proc, Precond. (sec.), Solve (sec.), Total (sec.)]
Locally consistent: 13 iterations, fill ratio in solve: 5.69
[Table columns: # proc, Precond. (sec.), Solve (sec.), Total (sec.)]

22 Test case: Amande
Amande (CEA/CESTA): n = 6,994,683; nnz(A) = 58,477,383; fill ratio: x
HIPS: ILUT (locally consistent, τ = 0.001, 10^-7)
- 2053 domains of 3770 nodes
- 77 iterations
- fill ratio in precond / solve: (peak)
- dim(S) = 9.59% of dim(A)
[Table columns: # proc, Precond. (sec.), Solve (sec.), Total (sec.), nnz(P max) (x 10^6)]

23 Test case: Amande
HIPS: ILUT (locally consistent, τ = 0.001, 10^-7)
[Figure: time (s) versus number of processors; curves: Precond., Solve, Total, Optimal total.]
Time decomposition for one iteration of GMRES:
[Table columns: # proc, Total 1 iter. (sec.), Triangular solve (sec.), S.x (sec.), Other (sec.)]

25 Conclusion
Conclusion:
- Generic algebraic approach that mixes direct and iterative methods through a Schur complement approach,
- The share of direct factorization is controlled by the size of the domains,
- Many different strategies are implemented (dense block ILU).
Perspectives (preprocessing):
- PT-Scotch integration,
- Parallel interface renumbering,
- Providing guidance on good domain size parameters.
HIPS public release: March 2008 (CeCILL-C license)
Features: real (symmetric, unsymmetric), complex (symmetric)

27 The domain partition is constructed from the reordering based on nested-dissection-like algorithms (e.g. METIS, SCOTCH).
[Figure: nested-dissection ordering with blocks labeled C1 to C7 and D.]
Goals: minimize the overlap between subdomains; quality of the interface.

29 We choose a level of the elimination tree of the direct method:
- Subtrees rooted at this level are the interiors of the subdomains.
- The upper part of the elimination tree corresponds to the interfaces.
- The ratio of direct to iterative work can be chosen according to the problem difficulty or the accuracy needed.
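
The cut of the elimination tree can be sketched as follows. The tree is given as a parent array (a hypothetical 7-node complete binary tree here), and the function name `split_at_level` is an assumption of this sketch. Nodes above the chosen level form the interface, and every subtree rooted at that level becomes the interior of one subdomain.

```python
from collections import defaultdict

def split_at_level(parent, level):
    """Split tree nodes into interface (depth < level) and interior subtrees."""
    def depth(v):
        d = 0
        while parent[v] != -1:
            v, d = parent[v], d + 1
        return d

    interface, interiors = [], defaultdict(list)
    for v in range(len(parent)):
        d = depth(v)
        if d < level:
            interface.append(v)
        else:
            root = v
            for _ in range(d - level):     # climb to the ancestor at `level`
                root = parent[root]
            interiors[root].append(v)
    return interface, dict(interiors)

parent = [-1, 0, 0, 1, 1, 2, 2]            # complete binary tree, root = 0
interface, interiors = split_at_level(parent, level=1)
print(interface, interiors)
```

Raising `level` moves more nodes into the interface (more iterative work, less memory), while lowering it enlarges the subdomain interiors handled by the direct method.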


Lecture 17: More Fun With Sparse Matrices Lecture 17: More Fun With Sparse Matrices David Bindel 26 Oct 2011 Logistics Thanks for info on final project ideas. HW 2 due Monday! Life lessons from HW 2? Where an error occurs may not be where you

More information

A NEW MIXED PRECONDITIONING METHOD BASED ON THE CLUSTERED ELEMENT -BY -ELEMENT PRECONDITIONERS

A NEW MIXED PRECONDITIONING METHOD BASED ON THE CLUSTERED ELEMENT -BY -ELEMENT PRECONDITIONERS Contemporary Mathematics Volume 157, 1994 A NEW MIXED PRECONDITIONING METHOD BASED ON THE CLUSTERED ELEMENT -BY -ELEMENT PRECONDITIONERS T.E. Tezduyar, M. Behr, S.K. Aliabadi, S. Mittal and S.E. Ray ABSTRACT.

More information

problems on. Of course, additional interface conditions must be

problems on. Of course, additional interface conditions must be Thirteenth International Conference on Domain Decomposition Methods Editors: N. Debit, M.Garbey, R. Hoppe, J. Périaux, D. Keyes, Y. Kuznetsov c 2001 DDM.org 53 Schur Complement Based Preconditioners for

More information

Contents. I The Basic Framework for Stationary Problems 1

Contents. I The Basic Framework for Stationary Problems 1 page v Preface xiii I The Basic Framework for Stationary Problems 1 1 Some model PDEs 3 1.1 Laplace s equation; elliptic BVPs... 3 1.1.1 Physical experiments modeled by Laplace s equation... 5 1.2 Other

More information

Exploiting Thread-Level Parallelism in the Iterative Solution of Sparse Linear Systems

Exploiting Thread-Level Parallelism in the Iterative Solution of Sparse Linear Systems Exploiting Thread-Level Parallelism in the Iterative Solution of Sparse Linear Systems José I. Aliaga a,1, Matthias Bollhöfer b,2, Alberto F. Martín a,1, Enrique S. Quintana-Ortí a,1 a Dpto. de Ingeniería

More information

Stopping Criteria for Iterative Solution to Linear Systems of Equations

Stopping Criteria for Iterative Solution to Linear Systems of Equations Stopping Criteria for Iterative Solution to Linear Systems of Equations Gerald Recktenwald Portland State University Mechanical Engineering Department gerry@me.pdx.edu Iterative Methods: High-level view

More information

Domain decomposition for 3D electromagnetic modeling

Domain decomposition for 3D electromagnetic modeling Earth Planets Space, 51, 1013 1018, 1999 Domain decomposition for 3D electromagnetic modeling Zonghou Xiong CRC AMET, Earth Science, Macquarie University, Sydney, NSW 2109, Australia (Received November

More information

BDDCML. solver library based on Multi-Level Balancing Domain Decomposition by Constraints copyright (C) Jakub Šístek version 1.

BDDCML. solver library based on Multi-Level Balancing Domain Decomposition by Constraints copyright (C) Jakub Šístek version 1. BDDCML solver library based on Multi-Level Balancing Domain Decomposition by Constraints copyright (C) 2010-2012 Jakub Šístek version 1.3 Jakub Šístek i Table of Contents 1 Introduction.....................................

More information

COMMUNICATION AVOIDING ILU0 PRECONDITIONER

COMMUNICATION AVOIDING ILU0 PRECONDITIONER SIAM J. SCI. COMPUT. Vol. 37, No. 2, pp. C217 C246 c 2015 Society for Industrial and Applied Mathematics COMMUNICATION AVOIDING ILU0 PRECONDITIONER LAURA GRIGORI AND SOPHIE MOUFAWAD Abstract. In this paper

More information

2 Fundamentals of Serial Linear Algebra

2 Fundamentals of Serial Linear Algebra . Direct Solution of Linear Systems.. Gaussian Elimination.. LU Decomposition and FBS..3 Cholesky Decomposition..4 Multifrontal Methods. Iterative Solution of Linear Systems.. Jacobi Method Fundamentals

More information

VBARMS: A variable block algebraic recursive multilevel solver for sparse linear systems Liao, Jia

VBARMS: A variable block algebraic recursive multilevel solver for sparse linear systems Liao, Jia University of Groningen VBARMS: A variable block algebraic recursive multilevel solver for sparse linear systems Liao, Jia IMPORTANT NOTE: You are advised to consult the publisher's version (publisher's

More information

Lecture 15: More Iterative Ideas

Lecture 15: More Iterative Ideas Lecture 15: More Iterative Ideas David Bindel 15 Mar 2010 Logistics HW 2 due! Some notes on HW 2. Where we are / where we re going More iterative ideas. Intro to HW 3. More HW 2 notes See solution code!

More information

A comparison of Algorithms for Sparse Matrix. Real-time Multibody Dynamic Simulation

A comparison of Algorithms for Sparse Matrix. Real-time Multibody Dynamic Simulation A comparison of Algorithms for Sparse Matrix Factoring and Variable Reordering aimed at Real-time Multibody Dynamic Simulation Jose-Luis Torres-Moreno, Jose-Luis Blanco, Javier López-Martínez, Antonio

More information

Accelerating the Conjugate Gradient Algorithm with GPUs in CFD Simulations

Accelerating the Conjugate Gradient Algorithm with GPUs in CFD Simulations Accelerating the Conjugate Gradient Algorithm with GPUs in CFD Simulations Hartwig Anzt 1, Marc Baboulin 2, Jack Dongarra 1, Yvan Fournier 3, Frank Hulsemann 3, Amal Khabou 2, and Yushan Wang 2 1 University

More information

o-diagonal blocks. One can eciently perform such a block symbolic factorization in quasi-linear space and time complexities [5]. From the block struct

o-diagonal blocks. One can eciently perform such a block symbolic factorization in quasi-linear space and time complexities [5]. From the block struct PaStiX : A Parallel Sparse Direct Solver Based on a Static Scheduling for Mixed 1D/2D Block Distributions? Pascal Hénon, Pierre Ramet, and Jean Roman LaBRI, UMR CNRS 5800, Université Bordeaux I & ENSERB

More information

Lecture 11: Randomized Least-squares Approximation in Practice. 11 Randomized Least-squares Approximation in Practice

Lecture 11: Randomized Least-squares Approximation in Practice. 11 Randomized Least-squares Approximation in Practice Stat60/CS94: Randomized Algorithms for Matrices and Data Lecture 11-10/09/013 Lecture 11: Randomized Least-squares Approximation in Practice Lecturer: Michael Mahoney Scribe: Michael Mahoney Warning: these

More information

HYPERDRIVE IMPLEMENTATION AND ANALYSIS OF A PARALLEL, CONJUGATE GRADIENT LINEAR SOLVER PROF. BRYANT PROF. KAYVON 15618: PARALLEL COMPUTER ARCHITECTURE

HYPERDRIVE IMPLEMENTATION AND ANALYSIS OF A PARALLEL, CONJUGATE GRADIENT LINEAR SOLVER PROF. BRYANT PROF. KAYVON 15618: PARALLEL COMPUTER ARCHITECTURE HYPERDRIVE IMPLEMENTATION AND ANALYSIS OF A PARALLEL, CONJUGATE GRADIENT LINEAR SOLVER AVISHA DHISLE PRERIT RODNEY ADHISLE PRODNEY 15618: PARALLEL COMPUTER ARCHITECTURE PROF. BRYANT PROF. KAYVON LET S

More information

Research Article A PETSc-Based Parallel Implementation of Finite Element Method for Elasticity Problems

Research Article A PETSc-Based Parallel Implementation of Finite Element Method for Elasticity Problems Mathematical Problems in Engineering Volume 2015, Article ID 147286, 7 pages http://dx.doi.org/10.1155/2015/147286 Research Article A PETSc-Based Parallel Implementation of Finite Element Method for Elasticity

More information

The GPU as a co-processor in FEM-based simulations. Preliminary results. Dipl.-Inform. Dominik Göddeke.

The GPU as a co-processor in FEM-based simulations. Preliminary results. Dipl.-Inform. Dominik Göddeke. The GPU as a co-processor in FEM-based simulations Preliminary results Dipl.-Inform. Dominik Göddeke dominik.goeddeke@mathematik.uni-dortmund.de Institute of Applied Mathematics University of Dortmund

More information

Advanced Numerical Techniques for Cluster Computing

Advanced Numerical Techniques for Cluster Computing Advanced Numerical Techniques for Cluster Computing Presented by Piotr Luszczek http://icl.cs.utk.edu/iter-ref/ Presentation Outline Motivation hardware Dense matrix calculations Sparse direct solvers

More information

Multi-GPU Scaling of Direct Sparse Linear System Solver for Finite-Difference Frequency-Domain Photonic Simulation

Multi-GPU Scaling of Direct Sparse Linear System Solver for Finite-Difference Frequency-Domain Photonic Simulation Multi-GPU Scaling of Direct Sparse Linear System Solver for Finite-Difference Frequency-Domain Photonic Simulation 1 Cheng-Han Du* I-Hsin Chung** Weichung Wang* * I n s t i t u t e o f A p p l i e d M

More information

Thermomechanical and hydraulic industrial simulations using MUMPS at EDF

Thermomechanical and hydraulic industrial simulations using MUMPS at EDF K u f Thermomechanical and hydraulic industrial simulations using MUMPS at EDF MUMPS User group meeting 15 april 2010 O.Boiteau, C.Denis, F.Zaoui MUltifrontal Massively Parallel sparse direct Solver 1

More information

Contributions au partitionnement de graphes parallèle multi-niveaux

Contributions au partitionnement de graphes parallèle multi-niveaux 1 Habilitation à Diriger des Recherches École doctorale de Mathématiques et d'informatique Université de Bordeaux 1 Contributions au partitionnement de graphes parallèle multi-niveaux (Contributions to

More information

Ilya Lashuk, Merico Argentati, Evgenii Ovtchinnikov, Andrew Knyazev (speaker)

Ilya Lashuk, Merico Argentati, Evgenii Ovtchinnikov, Andrew Knyazev (speaker) Ilya Lashuk, Merico Argentati, Evgenii Ovtchinnikov, Andrew Knyazev (speaker) Department of Mathematics and Center for Computational Mathematics University of Colorado at Denver SIAM Conference on Parallel

More information

Parallel Graph Coloring with Applications to the Incomplete-LU Factorization on the GPU

Parallel Graph Coloring with Applications to the Incomplete-LU Factorization on the GPU Parallel Graph Coloring with Applications to the Incomplete-LU Factorization on the GPU M. Naumov, P. Castonguay and J. Cohen NVIDIA, 2701 San Tomas Expressway, Santa Clara, CA 95050 Abstract In this technical

More information

Contents. F10: Parallel Sparse Matrix Computations. Parallel algorithms for sparse systems Ax = b. Discretized domain a metal sheet

Contents. F10: Parallel Sparse Matrix Computations. Parallel algorithms for sparse systems Ax = b. Discretized domain a metal sheet Contents 2 F10: Parallel Sparse Matrix Computations Figures mainly from Kumar et. al. Introduction to Parallel Computing, 1st ed Chap. 11 Bo Kågström et al (RG, EE, MR) 2011-05-10 Sparse matrices and storage

More information

Hartwig Anzt, Edmond Chow, Daniel B. Szyld, and Jack Dongarra. Report Novermber Revised February 2016

Hartwig Anzt, Edmond Chow, Daniel B. Szyld, and Jack Dongarra. Report Novermber Revised February 2016 Domain Overlap for Iterative Sparse Triangular Solves on GPUs Hartwig Anzt, Edmond Chow, Daniel B. Szyld, and Jack Dongarra Report 15-11-24 Novermber 2015. Revised February 2016 Department of Mathematics

More information

FOR P3: A monolithic multigrid FEM solver for fluid structure interaction

FOR P3: A monolithic multigrid FEM solver for fluid structure interaction FOR 493 - P3: A monolithic multigrid FEM solver for fluid structure interaction Stefan Turek 1 Jaroslav Hron 1,2 Hilmar Wobker 1 Mudassar Razzaq 1 1 Institute of Applied Mathematics, TU Dortmund, Germany

More information

Memory Hierarchy Management for Iterative Graph Structures

Memory Hierarchy Management for Iterative Graph Structures Memory Hierarchy Management for Iterative Graph Structures Ibraheem Al-Furaih y Syracuse University Sanjay Ranka University of Florida Abstract The increasing gap in processor and memory speeds has forced

More information

A Compiler for Parallel Finite Element Methods. with Domain-Decomposed Unstructured Meshes JONATHAN RICHARD SHEWCHUK AND OMAR GHATTAS

A Compiler for Parallel Finite Element Methods. with Domain-Decomposed Unstructured Meshes JONATHAN RICHARD SHEWCHUK AND OMAR GHATTAS Contemporary Mathematics Volume 00, 0000 A Compiler for Parallel Finite Element Methods with Domain-Decomposed Unstructured Meshes JONATHAN RICHARD SHEWCHUK AND OMAR GHATTAS December 11, 1993 Abstract.

More information

NAG Fortran Library Routine Document F11DSF.1

NAG Fortran Library Routine Document F11DSF.1 NAG Fortran Library Routine Document Note: before using this routine, please read the Users Note for your implementation to check the interpretation of bold italicised terms and other implementation-dependent

More information

Iterative methods for use with the Fast Multipole Method

Iterative methods for use with the Fast Multipole Method Iterative methods for use with the Fast Multipole Method Ramani Duraiswami Perceptual Interfaces and Reality Lab. Computer Science & UMIACS University of Maryland, College Park, MD Joint work with Nail

More information

Sparse Training Data Tutorial of Parameter Server

Sparse Training Data Tutorial of Parameter Server Carnegie Mellon University Sparse Training Data Tutorial of Parameter Server Mu Li! CSD@CMU & IDL@Baidu! muli@cs.cmu.edu High-dimensional data are sparse Why high dimension?! make the classifier s job

More information

Sparse Direct Solvers for Extreme-Scale Computing

Sparse Direct Solvers for Extreme-Scale Computing Sparse Direct Solvers for Extreme-Scale Computing Iain Duff Joint work with Florent Lopez and Jonathan Hogg STFC Rutherford Appleton Laboratory SIAM Conference on Computational Science and Engineering

More information

F k G A S S1 3 S 2 S S V 2 V 3 V 1 P 01 P 11 P 10 P 00

F k G A S S1 3 S 2 S S V 2 V 3 V 1 P 01 P 11 P 10 P 00 PRLLEL SPRSE HOLESKY FTORIZTION J URGEN SHULZE University of Paderborn, Department of omputer Science Furstenallee, 332 Paderborn, Germany Sparse matrix factorization plays an important role in many numerical

More information

F11DCFP.1. NAG Parallel Library Routine Document

F11DCFP.1. NAG Parallel Library Routine Document F11 Sparse Linear Algebra F11DCFP NAG Parallel Library Routine Document Note: Before using this routine, please read the Users Note for your implementation to check for implementation-dependent details.

More information