Computing the rank of big sparse matrices modulo p using Gaussian elimination
Computing the rank of big sparse matrices modulo p using Gaussian elimination
Charles Bouillaguet (1), Claire Delaplace (1, 2)
(1) CRIStAL, Université de Lille; (2) IRISA, Université de Rennes 1
JNCF, 16 January 2017
Claire Delaplace · Sparse elimination Mod p · JNCF
Sparse Linear Algebra: Background
Sparse linear algebra modulo p (coefficients: int)
Operations:
- Rank
- Linear systems
- Kernel
- etc.
Two families of algorithms:
- Direct methods (Gaussian elimination, LU, ...)
- Iterative methods (Wiedemann, Lanczos, ...)
Sparse Linear Algebra: Related Work
Algorithms:
- Comparison between sparse Gaussian elimination and the Wiedemann algorithm [Dumas & Villard 02]
- Direct methods in the numerical world (e.g. [Davis 06])
- Pivot selection heuristic for Gröbner basis matrices [Faugère & Lachartre 10]
Software:
- Exact: not much (LinBox, GBLA (Gröbner bases), MAGMA)
- Numeric: many (SuperLU, UMFPACK, ...)
Sparse Linear Algebra: PLUQ Factorization
$$ L U = P A Q^{-1} $$
- L has a non-zero diagonal
- U has a unit diagonal
- A can be rectangular
- A can be rank deficient
Sparse Linear Algebra: Right-looking and Left-looking LU
- Usual right-looking algorithm
- Left-looking GPLU algorithm [Gilbert & Peierls 88]
[Figures: block diagrams of both algorithms, with blocks labelled U, L, S and A and the data accessed at each step highlighted]
Sparse Linear Algebra: An up-looking variant of GPLU
- Row-by-row version
- Adapted to computer algebra
[Figure: block diagram with blocks labelled L, A and U and the data accessed at each step highlighted]
Sparse Linear Algebra: GPLU Algorithm, Application to Exact Linear Algebra
- Never used before for exact computations
- We implemented it
- We benchmarked it against LinBox (sparse right-looking)
Our benchmarks show:
- GPLU works best when U is very sparse
- Sometimes GPLU outperforms the right-looking algorithm (often)
- Sometimes the right-looking algorithm outperforms GPLU (less often)
⟹ Can we take advantage of both methods?
A New Hybrid Algorithm: Our Work (CASC 2016)
Description:
- Find many pivots without performing any arithmetic operations.
- Compute the Schur complement S, using an up-looking algorithm.
- Compute the rank of S.
Rank of S:
- Recurse
- Dense rank computation
- Wiedemann algorithm
- ...
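The "dense rank computation" option for S can be illustrated with plain Gaussian elimination over GF(p). This is only a minimal sketch, assuming p is prime and S is small enough to be stored as a list of lists; the function name `rank_mod_p` is illustrative and is not taken from SpaSM or LinBox.

```python
def rank_mod_p(rows, p):
    """Rank of a dense matrix over GF(p), p prime, by Gaussian elimination."""
    m = [[x % p for x in row] for row in rows]  # work on a reduced copy
    ncols = len(m[0]) if m else 0
    rank = 0
    for col in range(ncols):
        # look for a row at or below `rank` with a non-zero entry in this column
        pivot = next((r for r in range(rank, len(m)) if m[r][col]), None)
        if pivot is None:
            continue  # no pivot in this column
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], -1, p)  # modular inverse (Python 3.8+)
        m[rank] = [(x * inv) % p for x in m[rank]]
        for r in range(rank + 1, len(m)):
            f = m[r][col]
            if f:
                m[r] = [(a - f * b) % p for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank
```

The total rank is then the number of pivots found up front plus the rank of S.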
A New Hybrid Algorithm: Example
[Figure sequence on an example matrix, with captions: "Set of pivots found without any arithmetical operations", "Schur Complement", "Parallelizable"]
A New Hybrid Algorithm: Initial Pivots Selection Heuristic [Faugère & Lachartre 10]
Description:
- Each row is mapped to the column of its leftmost coefficient.
- When several rows have the same leftmost coefficient, select the sparsest.
- Move the selected rows before the others and sort them by increasing position of the leftmost coefficient.
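The three steps above can be sketched as follows. This is an illustrative sketch only: it assumes each sparse row is stored as a dict mapping column index to non-zero value, and the name `faugere_lachartre_pivots` is hypothetical rather than an actual SpaSM function.

```python
def faugere_lachartre_pivots(rows):
    """rows: list of dicts {column: non-zero value}.

    Returns the selected pivots as a list of (row, column) pairs,
    sorted by increasing pivot column.
    """
    best = {}  # leftmost column -> index of the sparsest row starting there
    for i, row in enumerate(rows):
        if not row:
            continue  # an empty row cannot hold a pivot
        j = min(row)  # column of the leftmost non-zero coefficient
        if j not in best or len(row) < len(rows[best[j]]):
            best[j] = i
    # sort the selected rows by increasing position of their leftmost coefficient
    return [(best[j], j) for j in sorted(best)]


rows = [{0: 1, 2: 3}, {0: 2}, {1: 5, 2: 1}, {}, {2: 4}]
pivots = faugere_lachartre_pivots(rows)
```

Once the selected rows are moved to the top in this order, each one has no entry to the left of its pivot column, so they are already in echelon form and no arithmetic is needed to use them as pivots.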
A New Hybrid Algorithm: Schur Complement Computation
P denotes the permutation that pushes the pre-computed pivots to the top of A. Ignoring the permutation over the columns of A:
$$ PA = \begin{pmatrix} A_{00} & A_{01} \\ A_{10} & A_{11} \end{pmatrix} = \begin{pmatrix} L_{00} & \\ L_{10} & L_{11} \end{pmatrix} \begin{pmatrix} U_{00} & U_{01} \\ & U_{11} \end{pmatrix} $$
Definition: the Schur complement S of PA with respect to $U_{00}$ is given by
$$ S = A_{11} - A_{10} U_{00}^{-1} U_{01} $$
Denote by $(a_{i0} \; a_{i1})$ the i-th row of $(A_{10} \; A_{11})$, and consider the following system:
$$ (x_0 \; x_1) \begin{pmatrix} U_{00} & U_{01} \\ & \mathrm{Id} \end{pmatrix} = (a_{i0} \; a_{i1}) $$
We obtain $x_1 = a_{i1} - a_{i0} U_{00}^{-1} U_{01}$, so $x_1$ is the i-th row of S.
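To make the triangular solve concrete, here is a minimal sketch of computing one row of S by eliminating the pivotal columns from a sparse row. It assumes p is prime, rows are dicts mapping column to value, and the pivot rows are given in increasing pivot-column order with no entry to the left of their pivot (as produced by the initial heuristic). Unlike the slides, it does not assume a unit diagonal and divides by the pivot instead; the name `schur_row` is illustrative.

```python
def schur_row(a, pivots, p):
    """Eliminate the pivotal columns from sparse row `a` modulo prime p.

    a      : dict {column: value}, one row of (A10 A11)
    pivots : list of (pivot_column, pivot_row_dict), sorted by increasing
             column, each pivot row having no entry left of its pivot column
    Returns the matching row of S as a dict over non-pivotal columns.
    """
    x = {j: v % p for j, v in a.items() if v % p}
    for j, u in pivots:
        c = x.get(j, 0)
        if c == 0:
            continue  # nothing to cancel in this pivotal column
        f = c * pow(u[j], -1, p) % p  # multiplier that cancels x[j]
        for k, v in u.items():
            nv = (x.get(k, 0) - f * v) % p
            if nv:
                x[k] = nv
            else:
                x.pop(k, None)  # keep the row sparse: drop exact zeros
    return x


pivots = [(0, {0: 1, 2: 2}), (1, {1: 3, 2: 1})]
row_in_span = schur_row({0: 2, 1: 3, 2: 5}, pivots, 7)
row_outside = schur_row({0: 1, 2: 3}, pivots, 7)
```

Each call touches only its own row and the read-only pivot rows, which is why the rows of S can be computed independently.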
A New Hybrid Algorithm: Schur Complement Computation
In a nutshell:
- Each row of S can be computed independently
- Guess whether S is sparse or dense by computing some random rows
- If S is sparse: compute S using the previous method
- The Schur complement computation is parallelizable
A New Hybrid Algorithm: Performance of the Hybrid Algorithm (CASC 2016)
- Experiments carried out on an Intel Core i with 8 GB of RAM
- Experiments carried out on all matrices from SIMC with integer coefficients
- Only one core used
Hybrid versus right-looking (LinBox) and GPLU (time in s)
Matrix | Right-looking | GPLU | Hybrid
GL7d/GL7d
Margulies/cat_ears_4_
Homology/ch7-8.b
Homology/ch7-8.b
Hybrid versus Wiedemann (time in s)
Matrix | Wiedemann | Hybrid
M0,6-D
relat
relat
Recent Improvement: Better Pivots Selection Heuristic
New: improvement of the pivots selection heuristic
Why:
- Enables us to reduce the number of elimination steps
- Keeps U sparse ⟹ fast elimination steps
How:
- Find the largest set: NP-complete
- Find a large set using heuristics
- Our idea: first find the F.-L. pivots, then search for more.
Recent Improvement: How to Find Remaining Pivots
Choice of new pivots:
- The pivotal submatrix must form a DAG
- If there is a cycle, the entry can't be selected.
[Figure sequence on an example matrix, with captions: "One candidate", "No entries on upper rows", "New pivot found!", "Three candidates", "May be eliminated", "Can't be eliminated", "New pivot found!"]
Recent Improvement: Our New Pivot Selection Algorithm
In a nutshell:
- Find the F.-L. pivots
- Push the F.-L. pivots to the top of A.
- For every non-pivotal column j, if its uppermost coefficient a_ij lies on a non-pivotal row, select a_ij
- Perform a graph search to find more pivots
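The graph-search step can be sketched as a cycle test: a candidate entry (i, j) is accepted only if, starting from row i and following pivotal columns to their pivot rows, no pivot row with an entry in column j is reached. The code below is one possible reading of that idea, not the authors' implementation; rows are assumed to be dicts mapping column to non-zero value, and `can_add_pivot` and `extend_pivots` are hypothetical names.

```python
from collections import deque


def can_add_pivot(rows, pivot_row_of, i, j):
    """True if entry (i, j) can join the pivots without creating a cycle.

    rows         : list of dicts {column: non-zero value}
    pivot_row_of : dict {pivotal column: its pivot row}
    """
    seen = set()
    queue = deque(c for c in rows[i] if c in pivot_row_of)
    while queue:
        c = queue.popleft()
        if c in seen:
            continue
        seen.add(c)
        r = pivot_row_of[c]
        if j in rows[r]:
            return False  # a path leads from row i back to column j: cycle
        queue.extend(k for k in rows[r] if k in pivot_row_of and k not in seen)
    return True


def extend_pivots(rows, pivot_row_of):
    """Greedily add pivots on non-pivotal rows and columns (mutates the dict)."""
    pivotal_rows = set(pivot_row_of.values())
    for i, row in enumerate(rows):
        if i in pivotal_rows:
            continue
        for j in sorted(row):
            if j not in pivot_row_of and can_add_pivot(rows, pivot_row_of, i, j):
                pivot_row_of[j] = i
                pivotal_rows.add(i)
                break
    return pivot_row_of


rows = [{0: 1, 1: 1}, {1: 1, 2: 1}, {0: 1, 2: 1}]
found = extend_pivots(rows, {0: 0})
```

In this small example row 1 gains a pivot in column 1, while entry (2, 2) is rejected because rows 0, 1 and 2 would then form a cycle and the pivotal submatrix could no longer be permuted into triangular form.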
Recent Improvement: About Graph Search
Remark:
- This method is parallelizable.
- The worst-case complexity of the search is $O(n\,|A|)$ (same as Wiedemann)
- In practice: much faster than Wiedemann
Recent Improvement: Example
[Figure sequence on an example matrix, with captions: "F.-L. pivots", "More pivots after graph search"]
Conclusion
- Implemented in C and C++ in the SpaSM (SPArse direct Solver Modulo p) library and publicly available at:
- Will be implemented in LinBox
Open questions:
- Can we still improve the pivots selection?
- Is it possible to adapt this method to matrices from CADO-NFS?
Thank you for your time!
More information2.13. Worst-Case Linear Time Selection
2.13. Worst-Case Linear Time Selection Section authors: Sergej Roytman. Geoff Greene, Seyit Camtepe, Armand Cistaro, and Divide & Conquer 1.5 Selection Divide & Conquer Selection Linear-Time Selection
More informationMaths for Signals and Systems Linear Algebra in Engineering. Some problems by Gilbert Strang
Maths for Signals and Systems Linear Algebra in Engineering Some problems by Gilbert Strang Problems. Consider u, v, w to be non-zero vectors in R 7. These vectors span a vector space. What are the possible
More informationGPU-Accelerated Parallel Sparse LU Factorization Method for Fast Circuit Analysis
GPU-Accelerated Parallel Sparse LU Factorization Method for Fast Circuit Analysis Abstract: Lower upper (LU) factorization for sparse matrices is the most important computing step for circuit simulation
More informationBig Data Management and NoSQL Databases
NDBI040 Big Data Management and NoSQL Databases Lecture 10. Graph databases Doc. RNDr. Irena Holubova, Ph.D. holubova@ksi.mff.cuni.cz http://www.ksi.mff.cuni.cz/~holubova/ndbi040/ Graph Databases Basic
More informationUser Guide for GLU V2.0
User Guide for GLU V2.0 Lebo Wang and Sheldon Tan University of California, Riverside June 2017 Contents 1 Introduction 1 2 Using GLU within a C/C++ program 2 2.1 GLU flowchart........................
More informationCS 770G - Parallel Algorithms in Scientific Computing
CS 770G - Parallel lgorithms in Scientific Computing Dense Matrix Computation II: Solving inear Systems May 28, 2001 ecture 6 References Introduction to Parallel Computing Kumar, Grama, Gupta, Karypis,
More informationGRAPH CENTERS USED FOR STABILIZATION OF MATRIX FACTORIZATIONS
Discussiones Mathematicae Graph Theory 30 (2010 ) 349 359 GRAPH CENTERS USED FOR STABILIZATION OF MATRIX FACTORIZATIONS Pavla Kabelíková Department of Applied Mathematics FEI, VSB Technical University
More informationLecture 17: Array Algorithms
Lecture 17: Array Algorithms CS178: Programming Parallel and Distributed Systems April 4, 2001 Steven P. Reiss I. Overview A. We talking about constructing parallel programs 1. Last time we discussed sorting
More information10/26/ Solving Systems of Linear Equations Using Matrices. Objectives. Matrices
6.1 Solving Systems of Linear Equations Using Matrices Objectives Write the augmented matrix for a linear system. Perform matrix row operations. Use matrices and Gaussian elimination to solve systems.
More informationPerformance Evaluation of a New Parallel Preconditioner
Performance Evaluation of a New Parallel Preconditioner Keith D. Gremban Gary L. Miller Marco Zagha School of Computer Science Carnegie Mellon University 5 Forbes Avenue Pittsburgh PA 15213 Abstract The
More informationIntel Math Kernel Library (Intel MKL) Sparse Solvers. Alexander Kalinkin Intel MKL developer, Victor Kostin Intel MKL Dense Solvers team manager
Intel Math Kernel Library (Intel MKL) Sparse Solvers Alexander Kalinkin Intel MKL developer, Victor Kostin Intel MKL Dense Solvers team manager Copyright 3, Intel Corporation. All rights reserved. Sparse
More informationGroups of two-state devices are used to represent data in a computer. In general, we say the states are either: high/low, on/off, 1/0,...
Chapter 9 Computer Arithmetic Reading: Section 9.1 on pp. 290-296 Computer Representation of Data Groups of two-state devices are used to represent data in a computer. In general, we say the states are
More informationNumerical Algorithms
Chapter 10 Slide 464 Numerical Algorithms Slide 465 Numerical Algorithms In textbook do: Matrix multiplication Solving a system of linear equations Slide 466 Matrices A Review An n m matrix Column a 0,0
More informationPerformance Analysis of BLAS Libraries in SuperLU_DIST for SuperLU_MCDT (Multi Core Distributed) Development
Available online at www.prace-ri.eu Partnership for Advanced Computing in Europe Performance Analysis of BLAS Libraries in SuperLU_DIST for SuperLU_MCDT (Multi Core Distributed) Development M. Serdar Celebi
More informationNull space computation of sparse singular matrices with MUMPS
Null space computation of sparse singular matrices with MUMPS Xavier Vasseur (CERFACS) In collaboration with Patrick Amestoy (INPT-IRIT, University of Toulouse and ENSEEIHT), Serge Gratton (INPT-IRIT,
More informationChemical Engineering 541
Chemical Engineering 541 Computer Aided Design Methods Direct Solution of Linear Systems 1 Outline 2 Gauss Elimination Pivoting Scaling Cost LU Decomposition Thomas Algorithm (Iterative Improvement) Overview
More informationPERFORMANCE ANALYSIS OF LOAD FLOW COMPUTATION USING FPGA 1
PERFORMANCE ANALYSIS OF LOAD FLOW COMPUTATION USING FPGA 1 J. Johnson, P. Vachranukunkiet, S. Tiwari, P. Nagvajara, C. Nwankpa Drexel University Philadelphia, PA Abstract Full-AC load flow constitutes
More informationSparse LU Decomposition using FPGA
Sparse LU Decomposition using FPGA Jeremy Johnson 1, Timothy Chagnon 1, Petya Vachranukunkiet 2, Prawat Nagvajara 2, and Chika Nwankpa 2 CS 1 and ECE 2 Departments Drexel University, Philadelphia, PA jjohnson@cs.drexel.edu,tchagnon@drexel.edu,pv29@drexel.edu,
More informationGenericity and efficiency in exact linear algebra with the FFLAS-FFPACK and LinBox libraries
Genericity and efficiency in exact linear algebra with the FFLAS-FFPACK and LinBox libraries Clément Pernet & the LinBox group U. Joseph Fourier (Grenoble 1, Inria/LIP AriC) Séminaire Performance et Généricité,
More informationSorting and Selection
Sorting and Selection Introduction Divide and Conquer Merge-Sort Quick-Sort Radix-Sort Bucket-Sort 10-1 Introduction Assuming we have a sequence S storing a list of keyelement entries. The key of the element
More informationA GPU-based Approximate SVD Algorithm Blake Foster, Sridhar Mahadevan, Rui Wang
A GPU-based Approximate SVD Algorithm Blake Foster, Sridhar Mahadevan, Rui Wang University of Massachusetts Amherst Introduction Singular Value Decomposition (SVD) A: m n matrix (m n) U, V: orthogonal
More informationNested-Dissection Orderings for Sparse LU with Partial Pivoting
Nested-Dissection Orderings for Sparse LU with Partial Pivoting Igor Brainman 1 and Sivan Toledo 1 School of Mathematical Sciences, Tel-Aviv University Tel-Aviv 69978, ISRAEL Email: sivan@math.tau.ac.il
More informationCS 395T Lecture 12: Feature Matching and Bundle Adjustment. Qixing Huang October 10 st 2018
CS 395T Lecture 12: Feature Matching and Bundle Adjustment Qixing Huang October 10 st 2018 Lecture Overview Dense Feature Correspondences Bundle Adjustment in Structure-from-Motion Image Matching Algorithm
More informationCOSC6365. Introduction to HPC. Lecture 21. Lennart Johnsson Department of Computer Science
Introduction to HPC Lecture 21 Department of Computer Science Most slides from UC Berkeley CS 267 Spring 2011, Lecture 12, Dense Linear Algebra (part 2), Parallel Gaussian Elimination. Jim Demmel Dense
More informationLSRN: A Parallel Iterative Solver for Strongly Over- or Under-Determined Systems
LSRN: A Parallel Iterative Solver for Strongly Over- or Under-Determined Systems Xiangrui Meng Joint with Michael A. Saunders and Michael W. Mahoney Stanford University June 19, 2012 Meng, Saunders, Mahoney
More informationA Study of Clustering Techniques and Hierarchical Matrix Formats for Kernel Ridge Regression
A Study of Clustering Techniques and Hierarchical Matrix Formats for Kernel Ridge Regression Xiaoye Sherry Li, xsli@lbl.gov Liza Rebrova, Gustavo Chavez, Yang Liu, Pieter Ghysels Lawrence Berkeley National
More informationOptimizing Parallel Sparse Matrix-Vector Multiplication by Corner Partitioning
Optimizing Parallel Sparse Matrix-Vector Multiplication by Corner Partitioning Michael M. Wolf 1,2, Erik G. Boman 2, and Bruce A. Hendrickson 3 1 Dept. of Computer Science, University of Illinois at Urbana-Champaign,
More informationAnalysis of Algorithms. Unit 4 - Analysis of well known Algorithms
Analysis of Algorithms Unit 4 - Analysis of well known Algorithms 1 Analysis of well known Algorithms Brute Force Algorithms Greedy Algorithms Divide and Conquer Algorithms Decrease and Conquer Algorithms
More informationLecture 5: Matrices. Dheeraj Kumar Singh 07CS1004 Teacher: Prof. Niloy Ganguly Department of Computer Science and Engineering IIT Kharagpur
Lecture 5: Matrices Dheeraj Kumar Singh 07CS1004 Teacher: Prof. Niloy Ganguly Department of Computer Science and Engineering IIT Kharagpur 29 th July, 2008 Types of Matrices Matrix Addition and Multiplication
More information