Sparse Matrices. O. Rheinbach, TU Bergakademie Freiberg


Sparse Matrices. Many matrices in computing contain only a very small percentage of nonzeros. Such matrices are called sparse (German: dünn besetzt). Often an upper bound on the number of nonzeros per row can be given (e.g., 100), independent of the matrix size. This means that for increasing problem size the matrices become sparser and sparser. [Figure: spy plot of the sparsity pattern of a sparse matrix]

Solution of Systems of Equations. Iterative methods such as the (Preconditioned) Conjugate Gradient method (PCG), GMRES, and others use only matrix-vector multiplications and need little additional memory. Direct sparse or banded solvers such as LU decomposition (Gaussian elimination) start with a symbolic step that allocates additional memory for the fill-in created during elimination, followed by a second step in which the numerical elimination takes place.

Preconditioned CG Method (PCG). The PCG method is an iterative method to solve linear systems Ax = b where A is symmetric positive definite. It works by minimizing the energy functional 1/2 x^T A x - x^T b. The minimum is the solution of Ax = b, since the necessary condition is 0 = ∇(1/2 x^T A x - x^T b) = Ax - b. PCG needs only multiplications with the matrix A and with an (optional) preconditioner matrix or operator M^{-1}. Simple preconditioners are, e.g., Jacobi or Gauß-Seidel (not optimal). Optimal preconditioners result in a constant number of iterations (for a given error tolerance). Optimal or almost-optimal preconditioners are much more sophisticated algorithms than Jacobi or Gauß-Seidel.

Preconditioned CG Method (PCG) - Algorithm

/* Preconditioned CG method for Ax = b */
i = 0
r = b - A*x                 /* initial residual */
d = M^{-1}*r                /* preconditioned residual, first search direction */
delta_new = <r, d>
while delta_new > eps do
    q = A*d                 /* one matrix-vector product per iteration */
    alpha = delta_new / <d, q>
    x = x + alpha*d
    r = r - alpha*q
    s = M^{-1}*r            /* apply the preconditioner */
    delta_old = delta_new
    delta_new = <r, s>
    beta = delta_new / delta_old
    d = s + beta*d
    i = i + 1
done
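As an illustration of the loop above, a minimal C sketch is given below. The matrix-vector product and the preconditioner are passed in as function pointers; the names pcg, matvec_fn, and the ctx parameter are chosen for this sketch only and are not taken from the slides.

#include <stdlib.h>

/* Both A*x and M^{-1}*x are supplied by the caller as y = f(x). */
typedef void (*matvec_fn)(int n, const double *x, double *y, void *ctx);

static double dot(int n, const double *a, const double *b) {
    double s = 0.0;
    for (int i = 0; i < n; i++) s += a[i] * b[i];
    return s;
}

/* Preconditioned CG for A x = b; x holds the initial guess on entry
   and the approximate solution on return. Returns the iteration count. */
int pcg(int n, matvec_fn A, matvec_fn Minv, void *ctx,
        const double *b, double *x, double eps, int maxit) {
    double *r = malloc(n * sizeof *r), *d = malloc(n * sizeof *d);
    double *q = malloc(n * sizeof *q), *s = malloc(n * sizeof *s);

    A(n, x, r, ctx);                                /* r = b - A*x */
    for (int i = 0; i < n; i++) r[i] = b[i] - r[i];
    Minv(n, r, d, ctx);                             /* d = M^{-1}*r */
    double delta_new = dot(n, r, d);

    int it = 0;
    while (delta_new > eps && it < maxit) {
        A(n, d, q, ctx);                            /* q = A*d */
        double alpha = delta_new / dot(n, d, q);
        for (int i = 0; i < n; i++) { x[i] += alpha * d[i]; r[i] -= alpha * q[i]; }
        Minv(n, r, s, ctx);                         /* s = M^{-1}*r */
        double delta_old = delta_new;
        delta_new = dot(n, r, s);
        double beta = delta_new / delta_old;
        for (int i = 0; i < n; i++) d[i] = s[i] + beta * d[i];
        it++;
    }
    free(r); free(d); free(q); free(s);
    return it;
}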

Naive Coordinate Format for Sparse Matrices. Use three arrays or linked lists (row, column, value). Insertion/deletion of entries (+). Matrix operations (?)
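A minimal C sketch of such a coordinate (COO) container is shown below; the struct and field names are chosen for illustration only.

/* Coordinate (COO) storage: one (row, column, value) triple per nonzero. */
typedef struct {
    int     nnz;  /* number of stored nonzeros      */
    int    *row;  /* row index of each entry        */
    int    *col;  /* column index of each entry     */
    double *val;  /* numerical value of each entry  */
} coo_matrix;

Inserting an entry only means appending one more triple, which is why insertion and deletion are cheap in this format.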

Compressed Sparse Row Format (CSR) for Sparse Matrices. Use three arrays:
val stores the nonzero entries of the matrix in row-wise order,
cols stores the column index of each entry,
rowstart stores the indices of the start of the rows in val and cols.
The diagonal is (often) stored first within each row to allow fast access to the diagonal entries.

Example (5 x 5 matrix, 12 nonzeros):
 4 -1 -1  0  0
-1  4  0 -1 -1
-1  0  4  0  0
 0 -1  0  4  0
 0  0  0  0  4

index     0  1  2  3  4  5  6  7  8  9 10 11
val       4 -1 -1  4 -1 -1 -1  4 -1  4 -1  4
cols      0  1  2  1  0  3  4  2  0  3  1  4
rowstart  0  3  7  9 11 12
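In C, the example could be stored as follows (a minimal sketch; the array names mirror those on the slide):

/* CSR storage of the 5x5 example matrix (12 nonzeros),
   with the diagonal entry stored first in each row. */
double val[12]     = { 4, -1, -1,   4, -1, -1, -1,   4, -1,   4, -1,   4 };
int    cols[12]    = { 0,  1,  2,   1,  0,  3,  4,   2,  0,   3,  1,   4 };
int    rowstart[6] = { 0, 3, 7, 9, 11, 12 };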

Compressed Row Format
+ compact and efficient
+ very general
+ fast for the most important operations
+ sorting the entries within each row allows fast access to individual entries by binary search (see the sketch below)
- insertion of entries is very inefficient (this has to be accepted; if frequent insertion is necessary, use a different format first and then convert)
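A minimal C sketch of such a lookup is given below; it assumes the column indices within each row are sorted in increasing order (i.e., without the diagonal-first convention mentioned earlier), and the function name csr_get is chosen for this sketch.

/* Returns A(i,j) for a CSR matrix whose column indices are sorted
   within each row; structural zeros are returned as 0.0. */
double csr_get(const int *rowstart, const int *cols, const double *val,
               int i, int j) {
    int lo = rowstart[i], hi = rowstart[i + 1] - 1;
    while (lo <= hi) {                 /* binary search within row i */
        int mid = lo + (hi - lo) / 2;
        if (cols[mid] == j) return val[mid];
        if (cols[mid] < j)  lo = mid + 1;
        else                hi = mid - 1;
    }
    return 0.0;                        /* entry is not stored */
}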

Compressed Sparse Column Format (CSC) for Sparse Matrices. Equivalent to compressed row storage of A^T. For the same 5 x 5 example matrix:

index     0  1  2  3  4  5  6  7  8  9 10 11
val       4 -1 -1  4 -1 -1  4 -1  4 -1  4 -1
rows      0  1  2  1  0  3  2  0  3  1  4  1
colstart  0  3  6  8 10 12
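In C (a minimal sketch, mirroring the CSR example above; array names follow the slide):

/* CSC storage of the same 5x5 example matrix (12 nonzeros),
   with the diagonal entry stored first in each column. */
double val[12]     = { 4, -1, -1,   4, -1, -1,   4, -1,   4, -1,   4, -1 };
int    rows[12]    = { 0,  1,  2,   1,  0,  3,   2,  0,   3,  1,   4,  1 };
int    colstart[6] = { 0, 3, 6, 8, 10, 12 };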

Knuth Scheme. A combination of CSR and CSC. The fields nextr and nextc allow fast traversal of columns AND rows: they are links to the next entry in the same row or column (linked lists), and the last entry in a row/column links back to the first. The diagonal is not sorted to the front. To find the first entry in a row or column, both rowstart and colstart are present. For the same 5 x 5 example matrix:

index     0  1  2  3  4  5  6  7  8  9 10 11
val       4 -1 -1 -1  4 -1 -1 -1  4 -1  4  4
row       0  0  0  1  1  1  1  2  2  3  3  4
col       0  1  2  0  1  3  4  0  2  1  3  4
nextr     1  2  0  4  5  6  3  8  7 10  9 11
nextc     3  4  8  7  9 10 11  0  2  1  5  6
rowstart  0  3  7  9 11
colstart  0  1  2  5  6
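A minimal C sketch of traversing one (nonempty) column in this scheme; the function name visit_column is chosen for this sketch, the array names follow the slide.

/* Visit all stored entries of column j in the Knuth scheme.
   colstart[j] is the index of the first entry of column j;
   nextc links the entries of a column in a circular list. */
void visit_column(int j, const int *colstart, const int *row,
                  const double *val, const int *nextc,
                  void (*visit)(int i, double a_ij)) {
    int first = colstart[j];
    int k = first;
    do {
        visit(row[k], val[k]);   /* entry A(row[k], j) */
        k = nextc[k];
    } while (k != first);        /* the circular list ends where it started */
}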

Hypersparse Matrices. In typical sparse matrices the number of elements in each row and column is bounded. Hypersparse matrices are matrices where almost all rows and columns are entirely zero. Examples are restriction operators, adjacency matrices of hypersparse graphs, ... Compressed Row and Compressed Column storage are inefficient for hypersparse matrices: in CSR, rowstart has as many entries as there are rows; in CSC, colstart has as many entries as there are columns; both can be much larger than the number of nonzeros in the matrix. To avoid this, rowstart or colstart, respectively, can be compressed to save space, e.g., by a run-length encoding. Remark: This is also important when implementing an efficient sparse matrix-matrix multiplication.
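One common way to compress the row pointer, sketched below, is to store pointers only for the nonempty rows (a doubly compressed, DCSR-style layout); this is an assumption about how the compression could be realized, not the exact scheme from the slides.

/* DCSR-style storage: rowstart entries are kept only for nonempty rows.
   For a hypersparse matrix, nrows_nonempty is far smaller than the total
   number of rows, so the row pointer no longer dominates the memory. */
typedef struct {
    int     nrows_nonempty; /* number of rows that contain nonzeros    */
    int    *rowidx;         /* indices of the nonempty rows            */
    int    *rowstart;       /* start of each nonempty row in val/cols  */
    int    *cols;           /* column indices, as in CSR               */
    double *val;            /* values, as in CSR                       */
} dcsr_matrix;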

Example of Sparse Operations: matrix-vector multiplication in compressed row format and in compressed column format.

Matrix-Vector Multiplication in Compressed Row Format

/* Computes v = A*w, where A is stored in compressed row format */
for i = 0..n-1
    v(i) = 0
    left  = rowstart(i)
    right = rowstart(i+1) - 1
    for j = left..right
        v(i) = v(i) + val(j)*w(cols(j))
    end
end

index     0  1  2  3  4  5  6  7  8  9 10 11
val       4 -1 -1  4 -1 -1 -1  4 -1  4 -1  4
cols      0  1  2  1  0  3  4  2  0  3  1  4
rowstart  0  3  7  9 11 12
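The same loop written as plain C (a minimal sketch; the function name csr_matvec is not taken from the slides):

/* v = A*w for an n-row matrix A stored in CSR format. */
void csr_matvec(int n, const int *rowstart, const int *cols,
                const double *val, const double *w, double *v) {
    for (int i = 0; i < n; i++) {
        double sum = 0.0;
        for (int j = rowstart[i]; j < rowstart[i + 1]; j++)
            sum += val[j] * w[cols[j]];   /* gather from w along row i */
        v[i] = sum;
    }
}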

Matrix-Vector Multiplication in Compressed Column Format

/* Computes v = A*w, where A is stored in compressed column format */
for i = 0..n-1
    v(i) = 0
end
for i = 0..n-1                  /* loop over the columns of A */
    top    = colstart(i)
    bottom = colstart(i+1) - 1
    for j = top..bottom
        v(rows(j)) = v(rows(j)) + val(j)*w(i)
    end
end

index     0  1  2  3  4  5  6  7  8  9 10 11
val       4 -1 -1  4 -1 -1  4 -1  4 -1  4 -1
rows      0  1  2  1  0  3  2  0  3  1  4  1
colstart  0  3  6  8 10 12

Banded LU-Decomposition, Symbolic Step: Find the Skyline
[Figure: the example matrix with its skyline (band) structure marked]

Banded LU-Decomposition, Symbolic Step. Step: allocate memory, copy, sort.
[Figure: the example matrix with explicit zeros allocated inside the skyline]
Data structure after the symbolic step (explicit zeros are allocated as space for the fill-in):
val       4 -1 -1   4 -1  0 -1 -1   4 -1  0  0  0   4 -1  0  0   4 -1  0  0
cols      0  1  2   1  0  2  3  4   2  0  1  3  4   3  1  2  4   4  1  2  3
rowstart  0  3  8  13 17 21
Banded LU decomposition: complexity is of order n * (bandwidth)^2.