1.5D PARALLEL SPARSE MATRIX-VECTOR MULTIPLY


ENVER KAYAASLAN, BORA UÇAR, AND CEVDET AYKANAT
(Independent Researcher; CNRS and University of Lyon, France; Bilkent University, Turkey)

Abstract. There are three common parallel sparse matrix-vector multiply algorithms: 1D row-parallel, 1D column-parallel and 2D row-column-parallel. The 1D parallel algorithms offer the advantage of having only one communication phase. On the other hand, the 2D parallel algorithm is more scalable thanks to a high level of flexibility in distributing fine-grain tasks, but it suffers from two communication phases. Here, we introduce the novel concept of heterogeneous messages, where a heterogeneous message may contain both input-vector entries and partially computed output-vector entries. This concept not only leads to a decreased number of messages but also enables fusing the input- and output-communication phases into a single phase. These findings are utilized to propose a 1.5D parallel sparse matrix-vector multiply algorithm called local row-column-parallel. The proposed algorithm requires a local fine-grain partitioning, where locality refers to the constraint that each fine-grain task is assigned to the processor that contains either its input-vector entry, or its output-vector entry, or both. This constraint, nevertheless, happens to be not very restrictive, so that we achieve a partitioning quality close to that of the 2D parallel algorithm. We propose two methods for local fine-grain partitioning. The first method is based on a novel directed hypergraph partitioning model that minimizes total communication volume while maintaining a load balance constraint as well as an additional locality constraint, which is handled by adopting and adapting a recent and simple yet effective approach. The second method has two parts: the first part finds a distribution of the input- and output-vectors, and the second part finds a nonzero/task distribution that exactly minimizes total communication volume while keeping the vector distribution intact. We conduct our experiments on a large set of test matrices to evaluate the partitioning qualities and partitioning times of the proposed 1.5D methods.

Key words. sparse matrix partitioning, parallel sparse matrix-vector multiplication, directed hypergraph model, bipartite vertex cover, combinatorial scientific computing

AMS subject classifications. 05C50, 05C65, 05C70, 65F10, 65F50, 65Y05

1. Introduction. The sparse matrix-vector multiply is a fundamental operation in many iterative solvers, such as those for linear systems, eigensystems and least squares problems. This renders the parallelization of sparse matrix-vector multiply an important problem. Since the same sparse matrix is multiplied many times during the iterations of such applications, several comprehensive sparse matrix partitioning models and methods have been proposed and implemented for scaling parallel sparse matrix-vector multiply operations on distributed-memory systems.

The parallel sparse matrix-vector multiply operation is composed of fine-grain tasks of multiply-and-add operations, where each fine-grain task involves an input-vector entry, a nonzero, and a partial result on an output-vector entry. Here, each fine-grain task is associated with a separate nonzero and is assumed to be performed by the processor that contains the associated nonzero, by the owner-computes rule. In the literature, there are three basic sparse matrix-vector multiply algorithms: row-parallel, column-parallel and row-column-parallel. The row- and column-parallel algorithms are 1D parallel, whereas the row-column-parallel algorithm is 2D parallel.
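The fine-grain view can be made concrete with a small sketch. The following Python snippet is illustrative only; the function name and data layout are our own assumptions, not part of the paper. It performs y <- Ax as one multiply-and-add task per nonzero a_ij, each reading x_j and updating a partial result on y_i, as in the owner-computes rule described above.

# A minimal illustrative sketch of the fine-grain view of y <- A x.
from collections import defaultdict

def spmv_fine_grain(nonzeros, x):
    """nonzeros: iterable of (i, j, a_ij) triplets; x: indexable input vector."""
    y = defaultdict(float)
    for i, j, a_ij in nonzeros:   # each iteration is one fine-grain task
        y[i] += a_ij * x[j]       # performed by the owner of a_ij under owner-computes
    return dict(y)

# Example: a 3x3 matrix with 4 nonzeros.
nonzeros = [(0, 0, 2.0), (0, 2, 1.0), (1, 1, 3.0), (2, 0, 4.0)]
print(spmv_fine_grain(nonzeros, [1.0, 2.0, 3.0]))   # {0: 5.0, 1: 6.0, 2: 4.0}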
In row-parallel sparse matrix-vector multiply, all fine-grain tasks associated with the nonzeros at a row are combined into a composite task of the inner product of a sparse row vector and a dense input vector. This row-oriented combination requires rowwise partitioning, where the nonzeros at a row and the respective output-vector entry are all assigned to the same processor. Similarly, in column-parallel sparse matrix-vector
multiply, all fine-grain tasks associated with the nonzeros at a column are combined into a composite task of a daxpy operation over a dense output vector, where the operation involves a sparse column vector and an input-vector entry. This column-oriented combination requires columnwise partitioning, where the nonzeros at a column and the respective input-vector entry are all assigned to the same processor. In row-parallel sparse matrix-vector multiply, all messages are communicated in an input-communication phase called expand, where each message contains only input-vector entries. In column-parallel sparse matrix-vector multiply, on the other hand, all messages are communicated in an output-communication phase called fold, where each message contains only partially computed output-vector entries. In row-column-parallel sparse matrix-vector multiply, there is no restriction of any kind on distributing input- and output-vector entries and nonzeros, which is also referred to as fine-grain partitioning. In the row-column-parallel algorithm, some messages are communicated in the expand phase and some messages are communicated in the fold phase. Each message of the expand phase contains only input-vector entries as in the row-parallel algorithm, whereas each message of the fold phase contains only partially computed output-vector entries as in the column-parallel algorithm. In all three sparse matrix-vector multiply algorithms, the messages are homogeneous, that is, each message contains either only input-vector entries or only partially computed output-vector entries.

In order to solve each of the above-mentioned three partitioning problems, a different hypergraph model is proposed, where vertex partitioning with minimum cutsize while maintaining balance on part weights exactly corresponds to matrix partitioning with minimum total communication volume while maintaining computational load balance on processors. These hypergraph models are as follows: the column-net hypergraph model [
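As a concrete illustration of the expand phase and the rowwise combination, the following simulated sketch (the data layout and the function name row_parallel_spmv are illustrative assumptions, not the paper's code) runs the 1D row-parallel algorithm: each processor owns a set of rows together with the matching output entries, first gathers the nonlocal input-vector entries it needs, and then computes one inner product per owned row.

# A simulated sketch of 1D row-parallel SpMV; communication is mimicked by dictionary
# lookups instead of real messages.
def row_parallel_spmv(rows, x_part, A_rows):
    """rows[k]: row indices owned by processor k; x_part[k]: {j: x_j} owned by processor k;
       A_rows[i]: list of (j, a_ij) nonzeros of row i."""
    K = len(rows)
    owner_of_x = {j: k for k in range(K) for j in x_part[k]}
    y = {}
    for k in range(K):                                   # play the role of processor P_k
        needed = {j for i in rows[k] for j, _ in A_rows[i]}
        x_local = dict(x_part[k])
        for j in needed - set(x_local):                  # expand: messages carry only x entries
            x_local[j] = x_part[owner_of_x[j]][j]
        for i in rows[k]:                                # composite task: sparse inner product
            y[i] = sum(a * x_local[j] for j, a in A_rows[i])
    return y

A_rows = {0: [(0, 2.0), (2, 1.0)], 1: [(1, 3.0)], 2: [(0, 4.0)]}
print(row_parallel_spmv([[0, 1], [2]], [{0: 1.0, 1: 2.0}, {2: 3.0}], A_rows))  # {0: 5.0, 1: 6.0, 2: 4.0}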

expand-fold phase. The proposed local row-column-parallel algorithm requires local fine-grain partitioning, where a fine-grain partition is said to be local if each fine-grain task is local either to its input-vector entry, or to its output-vector entry, or to both. This flexibility in assigning fine-grain tasks brings an opportunity to perform sparse matrix-vector multiply in parallel with a partitioning time and partitioning quality close to those of the 1D and 2D parallel algorithms, respectively. We propose two methods to obtain a 1.5D local fine-grain partition, each with a different setting and approach; some preliminary studies on these methods are given in our recent work [

Fig.: A fine-grain task a_ij and its parallel computation: the owner of x_j sends x_j to the owner of a_ij, which computes the partial result ŷ_i ← ŷ_i + a_ij x_j and sends it to the owner of y_i, where y_i ← y_i + ŷ_i is accumulated.

Fig.: A task-and-data distribution Π(y ← Ax) of matrix-vector multiply on a sample sparse matrix A, together with the induced block structure.

In column-parallel sparse matrix-vector multiply, the basic computational units are the columns. For an input-vector entry x_j assigned to processor P_k, the fine-grain tasks associated with the nonzeros of A_*j = {a_ij ∈ A : 1 ≤ i ≤ m} are combined into a composite task of the daxpy operation ŷ^(k) ← ŷ^(k) + A_*j x_j, which is to be carried out on P_k, where ŷ^(k) is the partially computed output vector of P_k. As a result, a task-and-data distribution Π(y ← Ax) of matrix-vector multiply on A for the column-parallel algorithm should satisfy the following condition:

    a_ij ∈ A^(k) whenever x_j ∈ x^(k),

and in the literature this kind of distribution is known as columnwise partitioning [
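A small simulated sketch (the function name column_parallel_spmv and the data layout are illustrative assumptions, not the paper's code) shows the composite daxpy tasks and the fold phase: each processor owns a set of columns and the matching x entries, performs one daxpy per owned column, and the partial output entries are then folded at their owners.

# A simulated sketch of 1D column-parallel SpMV with a fold phase.
from collections import defaultdict

def column_parallel_spmv(cols, x, A_cols):
    """cols[k]: column indices owned by processor k; A_cols[j]: list of (i, a_ij) nonzeros of column j."""
    partial = [defaultdict(float) for _ in cols]          # partially computed output of each processor
    for k, owned in enumerate(cols):
        for j in owned:                                   # composite daxpy task for column j
            for i, a in A_cols[j]:
                partial[k][i] += a * x[j]
    y = defaultdict(float)
    for part in partial:                                  # fold: messages carry only partial y entries
        for i, v in part.items():
            y[i] += v
    return dict(y)

A_cols = {0: [(0, 2.0), (2, 4.0)], 1: [(1, 3.0)], 2: [(0, 1.0)]}
print(column_parallel_spmv([[0, 1], [2]], [1.0, 2.0, 3.0], A_cols))  # {0: 5.0, 2: 4.0, 1: 6.0}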

Algorithm: The row-column-parallel sparse matrix-vector multiply.
For each processor P_k:
1. (expand) For each nonzero column stripe A_*k^(l) with l ≠ k: (a) form the vector x̂_l^(k), which contains only those entries of x^(k) corresponding to nonzero columns in A_*k^(l), and (b) send the vector x̂_l^(k) to P_l.
2. For each nonzero row stripe A_l*^(k) with l ≠ k: compute (a) y_k^(l) ← A_lk^(k) x^(k) and (b) y_k^(l) ← y_k^(l) + Σ_{r≠k} A_lr^(k) x̂_k^(r).
3. (fold) For each nonzero row stripe A_l*^(k) with l ≠ k: (a) form the vector ŷ_k^(l), which contains only those entries of y_k^(l) corresponding to nonzero rows in A_l*^(k), and (b) send the vector ŷ_k^(l) to P_l.
4. Compute the output subvector: (a) y^(k) ← A_kk^(k) x^(k), (b) y^(k) ← y^(k) + Σ_{l≠k} A_kl^(k) x̂_k^(l), and (c) y^(k) ← y^(k) + Σ_{l≠k} ŷ_l^(k).

of such messages. Then, the total reduction in the number of messages equals the number of heterogeneous messages of the local row-column-parallel algorithm.

Task-communication dependency graph. We first introduce a two-way categorization of input- and output-vector entries and a four-way categorization of fine-grain tasks (
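The message-count argument can be sketched as follows (illustrative code; the representation of message pairs is our own assumption): whenever a processor pair would exchange both an expand message and a fold message in the same direction, the two are fused into one heterogeneous message, so each such pair saves exactly one message.

# A minimal sketch of the reduction in message count obtained by fusing expand and fold
# messages between the same ordered processor pair into a single heterogeneous message.
def message_reduction(expand_pairs, fold_pairs):
    """expand_pairs, fold_pairs: sets of (sender, receiver) processor pairs."""
    heterogeneous = expand_pairs & fold_pairs   # pairs whose two messages can be fused
    return len(heterogeneous)

print(message_reduction({(0, 1), (2, 1)}, {(0, 1), (1, 0)}))   # 1 message saved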

Fig.: (a) Task-communication dependency graph; (b)-(e) topological orderings of this graph for different sparse matrix-vector multiply algorithms: (b) row-column-parallel, (c) row-parallel, (d) column-parallel, (e) local row-column-parallel. Expand: input-communication phase; fold: output-communication phase; NL: nonlocal tasks.

a dependency on the output-communication phase, however, the nonlocal tasks are linked with both communication phases. Figure

In the column-parallel algorithm, each of the fine-grain tasks is either input-output-local or input-local due to the columnwise partitioning condition (

Fig.: A sample local fine-grain partition, in which one nonzero is an input-output-local task, one is an input-local task, and two are output-local tasks.

With respect to the block structure induced by the vector distribution, the matrix A is expressed as a sum of per-processor matrices in which each diagonal block A_kk belongs entirely to P_k and each off-diagonal block A_kl is split between P_k and P_l, that is,

    A_kl = A_kl^(k) + A_kl^(l),

where A_kl^(k) and A_kl^(l) contain the nonzeros of A_kl assigned to P_k and P_l, respectively. For instance, A_12 = A_12^(1) + A_12^(2), and similarly for the other off-diagonal blocks. Figure
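The per-block split can be sketched in code (the data structures and the helper name split_blocks are illustrative assumptions): given the owners of the vector entries and a locality-respecting assignment of nonzeros, each nonzero of block A_kl lands either in the part of P_k or in the part of P_l.

# A minimal sketch of building the block structure A_kl = A_kl^(k) + A_kl^(l) induced by
# the vector distribution and a local fine-grain partition.
from collections import defaultdict

def split_blocks(nonzeros, nz_owner, x_owner, y_owner):
    """Returns blocks[(k, l)][p]: nonzeros of block A_kl (rows of P_k, columns of P_l) assigned to p."""
    blocks = defaultdict(lambda: defaultdict(list))
    for (i, j, a) in nonzeros:
        k, l = y_owner[i], x_owner[j]
        p = nz_owner[(i, j)]
        assert p in (k, l)           # locality: the task owner also owns x_j or y_i (or both)
        blocks[(k, l)][p].append((i, j, a))
    return blocks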

Algorithm: The local row-column-parallel sparse matrix-vector multiply.
For each processor P_k:
1. For each nonzero off-diagonal block A_lk^(k): compute y_k^(l) ← A_lk^(k) x^(k). [input-local tasks of P_k]
2. (expand-fold) For each nonzero off-diagonal block A_lk = A_lk^(k) + A_lk^(l):
   (a) form the vector x̂_l^(k), which contains only those entries of x^(k) corresponding to nonzero columns in A_lk^(l),
   (b) form the vector ŷ_k^(l), which contains only those entries of y_k^(l) corresponding to nonzero rows in A_lk^(k),
   (c) send the vector [x̂_l^(k), ŷ_k^(l)] to processor P_l.
3. Compute the output subvector:
   (a) y^(k) ← A_kk x^(k), [input-output-local tasks of P_k]
   (b) y^(k) ← y^(k) + Σ_{l≠k} A_kl^(k) x̂_k^(l), [output-local tasks of P_k]
   (c) y^(k) ← y^(k) + Σ_{l≠k} ŷ_l^(k). [input-local tasks of other processors]

Fig.: An illustration of Algorithm
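The following simulated sketch (illustrative data structures and the hypothetical helper local_rcp_spmv; not the paper's implementation) mirrors the algorithm above at the level of individual nonzeros: every task is executed by the owner of its x entry or of its y entry, and for each ordered processor pair a single combined message carries both the x entries the receiver still needs and the partial output entries computed on its behalf.

# A simulated sketch of local row-column-parallel (1.5D) SpMV with heterogeneous messages.
from collections import defaultdict

def local_rcp_spmv(nonzeros, nz_owner, x_owner, y_owner, x):
    """nonzeros: list of (i, j, a_ij); nz_owner[(i, j)], x_owner[j], y_owner[i]: processor ids."""
    msgs = defaultdict(lambda: [set(), defaultdict(float)])   # (src, dst) -> [x indices, partial y]
    y = defaultdict(float)
    for (i, j, a) in nonzeros:
        k = nz_owner[(i, j)]
        assert k in (x_owner[j], y_owner[i]), "locality constraint violated"
        if k == y_owner[i]:
            y[i] += a * x[j]                       # input-output-local or output-local task
            if k != x_owner[j]:
                msgs[(x_owner[j], k)][0].add(j)    # x_j rides in the message from its owner
        else:                                      # input-local task: x_j is local, y_i is remote
            msgs[(k, y_owner[i])][1][i] += a * x[j]
    for (src, dst), (x_indices, yhat) in msgs.items():   # expand-fold: one (possibly heterogeneous)
        for i, v in yhat.items():                        # message per communicating pair;
            y[i] += v                                    # x_indices are read directly in this simulation
    return dict(y), len(msgs)

nonzeros = [(0, 0, 2.0), (0, 2, 1.0), (1, 1, 3.0), (2, 0, 4.0)]
nz_owner = {(0, 0): 0, (0, 2): 0, (1, 1): 0, (2, 0): 1}
result, nmsg = local_rcp_spmv(nonzeros, nz_owner, {0: 0, 1: 0, 2: 1}, {0: 0, 1: 1, 2: 1}, [1.0, 2.0, 3.0])
print(result, nmsg)   # {0: 5.0, 2: 4.0, 1: 6.0} 2  (the message from P0 to P1 is heterogeneous)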

Two proposed methods for local row-column-parallel partitioning. In this section, we propose two methods to find a local row-column-parallel partition, which is required for 1.5D local row-column-parallel sparse matrix-vector multiply. One method finds vector and nonzero distributions simultaneously, whereas the other employs two parts in which vector and nonzero distributions are found separately.

A directed hypergraph model for simultaneous vector and nonzero distribution. In this method, we adopt the elementary hypergraph model for fine-grain partitioning of [

Fig.: An illustration of attaining a local fine-grain partition through vertex partitioning of the directed hypergraph model that satisfies the locality constraints: (a) a sparse matrix, (b) the directed hypergraph model, (c) a local hypergraph partition, (d) the resulting local fine-grain partition. The input- and output-data vertices are drawn with triangles and rectangles, respectively.

j has a smaller number of nonzeros than row i, and it is amalgamated into v_y(i) in the opposite case, where the ties are broken arbitrarily. The result is a reduced hypergraph that contains only input- and output-data vertices amalgamated with task vertices, where the weight of a data vertex is equal to the number of task vertices amalgamated into that data vertex. As a result, the locality constraint on vertex partitioning of the initial directed hypergraph naturally holds through vertex partitioning of the reduced hypergraph, for which the net directions become irrelevant. A vertex partition of this reduced hypergraph can be obtained by any existing hypergraph partitioning tool and can then be trivially decoded as a local fine-grain partition. Figure
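The amalgamation rule described above can be sketched directly (the helper name amalgamate is hypothetical; this is not a PaToH or paper interface): each task vertex is merged into the data vertex of the sparser of its column and its row, and the weight of a reduced data vertex counts the tasks merged into it.

# A minimal sketch of the task-vertex amalgamation rule.
from collections import Counter

def amalgamate(nonzeros):
    """nonzeros: iterable of (i, j) coordinates; returns the weight of each reduced data vertex."""
    row_nnz = Counter(i for i, j in nonzeros)
    col_nnz = Counter(j for i, j in nonzeros)
    weight = Counter()
    for i, j in nonzeros:
        if col_nnz[j] < row_nnz[i]:
            weight[("x", j)] += 1    # merge the task vertex of a_ij into v_x(j)
        else:
            weight[("y", i)] += 1    # merge into v_y(i) (ties are resolved this way here)
    return weight

print(amalgamate([(0, 0), (0, 2), (1, 1), (2, 0)]))   # each data vertex gets weight 1 in this tiny example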

Fig.: An illustration of local fine-grain partitioning through task-vertex amalgamations: (a) task-vertex amalgamations, (b) the reduced hypergraph, (c) a partition of the reduced hypergraph, (d) the resulting local fine-grain partition. The input- and output-data vertices are drawn with triangles and rectangles, respectively.

the recursive-bisection framework, which distorts the locality of task vertices so that a partition obtained in further recursive steps is no longer a local fine-grain partition.

Optimal nonzero distribution to minimize total communication volume. This method is composed of two parts. The first part is to find a vector distribution (Π(x), Π(y)), and the second part is to find a nonzero/task distribution Π(A) that exactly minimizes total communication volume over all possible local fine-grain partitions that abide by the vector distribution (Π(x), Π(y)) of the first part. In this way, we generate a local fine-grain partition Π(y ← Ax) = (Π(A), Π(x), Π(y)). The first part can be accomplished by any conventional data partitioning method, such as 1D partitioning, and this section is devoted to the second part of the method. Consider the block structure (

Fig.: A sample sparse matrix A and its block structure induced by an input-data distribution Π(x) and an output-data distribution Π(y) over three parts.

be performed independently for minimizing total communication volume. In the local row-column-parallel algorithm, P_l sends the message [x̂_k^(l), ŷ_l^(k)] to P_k, where x̂_k^(l) corresponds to the nonzero columns of A_kl^(k) and ŷ_l^(k) corresponds to the nonzero rows of A_kl^(l), for a nonzero/task distribution A_kl = A_kl^(k) + A_kl^(l). Then, we can derive the following formula for the communication volume φ_kl from P_l to P_k:

    φ_kl = n̂(A_kl^(k)) + m̂(A_kl^(l)),

where n̂(.) and m̂(.) refer to the number of nonzero columns and nonzero rows of the input submatrix, respectively. The total communication volume φ is then computed by summing the communication volumes incurred by each nonzero off-diagonal block of the block structure. Then, the problem of our interest can be described as follows.

Problem. Given A and a vector distribution (Π(x), Π(y)), find a nonzero/task distribution Π(A) such that each nonzero off-diagonal block satisfies A_kl = A_kl^(k) + A_kl^(l) and each diagonal block satisfies A_kk = A_kk^(k) for the block structure induced by (Π(x), Π(y)), minimizing the total communication volume φ = Σ_{k≠l} φ_kl.

Let G_kl = (U_kl ∪ V_kl, E_kl) be the bipartite graph representation of A_kl, where U_kl and V_kl are the sets of vertices corresponding to the rows and columns of A_kl, respectively, and E_kl is the set of edges corresponding to the nonzeros of A_kl. Based on this notation, the following theorem states a correspondence between the problem of distributing the nonzeros/tasks of A_kl to minimize the communication volume φ_kl from P_l to P_k and the problem of finding a minimum vertex cover of G_kl.

Theorem. Let A_kl be a nonzero off-diagonal block and G_kl = (U_kl ∪ V_kl, E_kl) be its bipartite graph representation.
1. For any vertex cover S_kl of G_kl, there is a nonzero distribution A_kl = A_kl^(k) + A_kl^(l) such that |S_kl| ≥ n̂(A_kl^(k)) + m̂(A_kl^(l)).
2. For any nonzero distribution A_kl = A_kl^(k) + A_kl^(l), there is a vertex cover S_kl of G_kl such that |S_kl| = n̂(A_kl^(k)) + m̂(A_kl^(l)).
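A tiny sketch of the per-block volume formula (the helper name block_volume is illustrative): the message from P_l to P_k carries one x entry per nonzero column of A_kl^(k) and one partial output entry per nonzero row of A_kl^(l).

# A minimal sketch of phi_kl = n_hat(A_kl^(k)) + m_hat(A_kl^(l)).
def block_volume(nz_k, nz_l):
    """nz_k, nz_l: (i, j) coordinates of the nonzeros of A_kl assigned to P_k and to P_l."""
    n_hat = len({j for _, j in nz_k})   # nonzero columns of A_kl^(k): x entries sent to P_k
    m_hat = len({i for i, _ in nz_l})   # nonzero rows of A_kl^(l): partial y entries sent to P_k
    return n_hat + m_hat

print(block_volume([(0, 1), (2, 1)], [(1, 0), (1, 3)]))   # 1 + 1 = 2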

Proof. We prove the two parts of the theorem separately.
1. Take any vertex cover S_kl of G_kl. Consider any nonzero distribution A_kl = A_kl^(k) + A_kl^(l) such that

    a_ij ∈ A_kl^(k) if v_j ∈ S_kl and u_i ∉ S_kl,
    a_ij ∈ A_kl^(l) if u_i ∈ S_kl and v_j ∉ S_kl,
    a_ij ∈ A_kl^(k) or A_kl^(l) if v_j ∈ S_kl and u_i ∈ S_kl.

Since v_j ∈ S_kl for every a_ij ∈ A_kl^(k) and u_i ∈ S_kl for every a_ij ∈ A_kl^(l), we have |S_kl ∩ V_kl| ≥ n̂(A_kl^(k)) and |S_kl ∩ U_kl| ≥ m̂(A_kl^(l)), which in turn leads to |S_kl| ≥ n̂(A_kl^(k)) + m̂(A_kl^(l)).
2. Take any nonzero distribution A_kl = A_kl^(k) + A_kl^(l). Consider S_kl = {u_i ∈ U_kl : a_ij ∈ A_kl^(l) for some j} ∪ {v_j ∈ V_kl : a_ij ∈ A_kl^(k) for some i}, so that |S_kl| = n̂(A_kl^(k)) + m̂(A_kl^(l)). Now, consider a nonzero a_ij ∈ A_kl and its corresponding edge {u_i, v_j} ∈ E_kl. If a_ij ∈ A_kl^(k) then v_j ∈ S_kl. Otherwise, u_i ∈ S_kl since a_ij ∈ A_kl^(l). So, S_kl is a vertex cover of G_kl.

At this point, however, it is still not clear how the reduction from the problem of distributing nonzeros/tasks to the problem of finding a minimum vertex cover holds. For this purpose, using Theorem
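The assignment rule used in the first part of the proof translates directly into code (the helper distribute_from_cover is illustrative; ties are sent to P_k here).

# A minimal sketch of deriving a nonzero distribution of block A_kl from a vertex cover.
def distribute_from_cover(nonzeros, cover_rows, cover_cols):
    """nonzeros: (i, j) pairs of A_kl; cover_rows/cover_cols: row/column vertices in the cover."""
    part_k, part_l = [], []
    for i, j in nonzeros:
        if j in cover_cols:          # column covered: assign to P_k, so x_j will be communicated
            part_k.append((i, j))
        else:
            assert i in cover_rows   # a vertex cover touches every edge, so the row is covered
            part_l.append((i, j))    # assign to P_l, so the partial y_i will be communicated
    return part_k, part_l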

Fig.: The minimum vertex cover model for A_kl to minimize the communication volume φ_kl from P_l to P_k. The minimum vertex cover S_kl determines which input-vector entries and which partial output-vector entries P_l sends to P_k.

Algorithm: Nonzero/task distribution to minimize total communication volume.
1: procedure NonzeroTaskDistributeVolume(A, Π(x), Π(y))
2:   for each nonzero off-diagonal block A_kl do Equation (
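Because each block's subproblem is a minimum vertex cover of a bipartite graph, it can be solved exactly in polynomial time via maximum matching (König's theorem). The sketch below assumes the networkx library and is not the paper's implementation; row and column indices are wrapped in tagged tuples so that they never collide.

# A sketch of computing a minimum vertex cover of one off-diagonal block A_kl.
import networkx as nx

def min_cover_of_block(nonzeros):
    """nonzeros: (i, j) coordinates of A_kl; returns (covered rows, covered columns)."""
    G = nx.Graph()
    rows = {("u", i) for i, _ in nonzeros}
    G.add_edges_from((("u", i), ("v", j)) for i, j in nonzeros)
    matching = nx.bipartite.maximum_matching(G, top_nodes=rows)        # Hopcroft-Karp matching
    cover = nx.bipartite.to_vertex_cover(G, matching, top_nodes=rows)  # Koenig's construction
    return {i for t, i in cover if t == "u"}, {j for t, j in cover if t == "v"}

print(min_cover_of_block([(1, 5), (1, 7), (2, 5), (3, 6)]))   # the two sets together have size 3

The nonzero distribution of the block then follows from the cover as in the proof above, and the volumes of all off-diagonal blocks add up to the total communication volume.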

Fig.: (a) A minimum vertex cover for each nonzero off-diagonal block of Figure

running with default parameters and setting the maximum allowable imbalance ratio as %. Since PaToH depends on randomization, we report the geometric mean of ten different runs for each partitioning instance. In all experiments, we report the results using a generic tool called performance profiles [
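For readers unfamiliar with performance profiles, the following sketch (illustrative only, with a made-up structure for the results table) computes, for each method, the fraction of test instances on which its value is within a factor tau of the best value achieved by any method on that instance.

# A minimal sketch of a performance profile in the sense of Dolan and More.
def performance_profile(results, taus):
    """results[method][instance]: measured quantity (e.g., total communication volume)."""
    instances = list(next(iter(results.values())))
    best = {p: min(results[m][p] for m in results) for p in instances}
    return {m: [sum(results[m][p] <= tau * best[p] for p in instances) / len(instances)
                for tau in taus]
            for m in results}

results = {"1D": {"mat1": 120, "mat2": 80}, "1.5D-H": {"mat1": 100, "mat2": 90}}
print(performance_profile(results, [1.0, 1.2]))   # {'1D': [0.5, 1.0], '1.5D-H': [0.5, 1.0]}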

Methods compared in the experiments:

Method   SpMV algorithm              Partitioning
1D       row-parallel                rowwise
2D       row-column-parallel         fine-grain
1.5D-H   local row-column-parallel   local fine-grain
1.5D-V   local row-column-parallel   local fine-grain

Fig.: Performance profiles that compare total communication volume ((a), (c)) and load balance ((b), (d)) using test matrices with no dense rows/columns, for two values of K.

respectively, for the larger value of K. Figures

Fig.: Performance profiles that compare total communication volume ((a), (c)) and load balance ((b), (d)) using test matrices with dense rows/columns, for two values of K.

performances in terms of total communication volume as expected. Figure

Fig.: Performance profiles that compare (a) total message count and (b) maximum message count for the three methods 1D, 2D and 1.5D-H, and (c) maximum communication volume per processor and (d) partitioning time for all methods, using all test matrices for the larger value of K.

still be favorable to other methods for particular matrices due to the low communication volume it may lead to. In short, if the sparse matrix contains dense rows/columns, then 1.5D-H seems to be the method of choice in general; otherwise, 1.5D-V and 2D are reasonable alternatives competing with each other.

Conclusion and further discussions. This paper introduced 1.5D parallelism for sparse matrix-vector multiply. We presented the local row-column-parallel sparse matrix-vector multiply algorithm that uses this introduced 1.5D parallelism. This algorithm is the fourth parallel algorithm in the literature for sparse matrix-vector multiply, in addition to the well-known 1D row-parallel, 1D column-parallel and 2D row-column-parallel ones. In this paper, we also proposed two methods (1.5D-H and 1.5D-V) to distribute tasks and data in accordance with the requirements of the proposed 1.5D parallel algorithm. Using an extensive set of matrices from the UFL sparse

matrix collection, we compared the partitioning qualities of these two methods against the baseline 1D and 2D methods. The experiments suggest the use of the local row-column-parallel sparse matrix-vector multiply with a local fine-grain partition obtained by the proposed directed hypergraph model for matrices that contain dense rows/columns, as we observe a performance close to that of 2D fine-grain partitioning in terms of the partitioning quality but with a considerably smaller number of messages and significantly better efficiency. We consider the problem mainly from a theoretical point of view and leave the performance of the 1.5D parallel sparse matrix-vector multiply algorithms in terms of parallel multiply timings as future work. We note that the main ideas behind the proposed 1.5D parallelism, such as heterogeneous messaging and avoiding nonlocal tasks by a locality constraint on partitioning, are of course not restricted to the parallel sparse matrix-vector multiply operation, and these ideas can be extended to other parallel computations as well.

REFERENCES

[1] Ümit Çatalyürek and Cevdet Aykanat, Hypergraph-partitioning-based decomposition for parallel sparse-matrix vector multiplication, IEEE Transactions on Parallel and Distributed Systems, 10 (1999), pp. 673-693.
[2] ———, A fine-grain hypergraph model for 2D decomposition of sparse matrices, in Parallel and Distributed Processing Symposium, International, (2001).
[3] ———, PaToH (Partitioning Tool for Hypergraphs), in Encyclopedia of Parallel Computing, Springer, 2011.
[4] Ümit Çatalyürek, Cevdet Aykanat, and Bora Uçar, On two-dimensional sparse matrix partitioning: Models, methods, and a recipe, SIAM Journal on Scientific Computing, 32 (2010), pp. 656-683.
[5] Elizabeth D. Dolan and Jorge J. Moré, Benchmarking optimization software with performance profiles, Mathematical Programming, 91 (2002), pp. 201-213.
[6] Enver Kayaaslan, Bora Uçar, and Cevdet Aykanat, Semi-two-dimensional partitioning for parallel sparse matrix-vector multiplication, in Parallel and Distributed Processing Symposium Workshop (IPDPSW), 2015 IEEE International, IEEE, 2015.
[7] Daniël M. Pelt and Rob H. Bisseling, A medium-grain method for fast 2D bipartitioning of sparse matrices, in Parallel and Distributed Processing Symposium, 2014 IEEE 28th International, IEEE, 2014.
[8] Bora Uçar and Cevdet Aykanat, Revisiting hypergraph models for sparse matrix partitioning, SIAM Review, 49 (2007), pp. 595-603.
[9] ———, Partitioning sparse matrices for parallel preconditioned iterative methods, SIAM Journal on Scientific Computing, 29 (2008), pp. 1683-1709.
