A Graph Based Method for Generating the Fiedler Vector of Irregular Problems

Michael Holzrichter (1) and Suely Oliveira (2)

(1) Texas A&M University, College Station, TX 77843-3112
(2) The University of Iowa, Iowa City, IA 52242, USA, oliveira@cs.uiowa.edu,
WWW home page: http://www.cs.uiowa.edu

Abstract. In this paper we present new algorithms for spectral graph partitioning. Previously, the best partitioning methods were based on a combination of combinatorial algorithms and application of the Lanczos method, when the graph allows the latter to be cheap enough. Our new algorithms are purely spectral. They calculate the Fiedler vector of the original graph and use information about the problem in the form of a preconditioner for the graph Laplacian. In addition, we use a favorable subspace for starting the Davidson algorithm, and a reordering of variables for locality of memory references.

1 Introduction

Many algorithms have been developed to partition a graph into k parts such that the number of edges cut is small. This problem arises in many areas, including finding fill-in reducing permutations of sparse matrices and mapping irregular data structures to the nodes of a parallel computer. Kernighan and Lin developed an effective combinatorial method based on swapping vertices [12]. Multilevel extensions of the Kernighan-Lin algorithm have proven effective for graphs with large numbers of vertices [11]. In recent years spectral approaches have received significant attention [15]. Like combinatorial methods, multilevel versions of spectral methods have proven effective for large graphs [1]. The previous multilevel spectral algorithm of [1] is based on the application of spectral algorithms at the various graph levels. Previous (multilevel) combinatorial algorithms coarsen a graph until it has a small number of nodes and can be partitioned more cheaply by well known techniques (which may include spectral partitioning).
After a partition is found for the coarsest graph, the partition is then successively interpolated onto finer graphs and refined [10, 11]. Our focus is on finding a partition of the graph using purely spectral techniques. We use the graphical structure of the problem to develop a multilevel spectral partitioning algorithm applied to the original graph. Our algorithm is based on a well known algorithm for the calculation of the Fiedler vector: the Davidson algorithm. (This research was supported by NSF grant ASC-9528912, and currently by NSF/DARPA DMS-9874015.) The structure of the problem is incorporated in the form of a graphical preconditioner to the Davidson algorithm.

2 Spectral Methods

The graph Laplacian of a graph G is L = D − A, where A is G's adjacency matrix and D is the diagonal matrix whose entry d_{i,i} equals the degree of vertex v_i. One property of L is that its smallest eigenvalue is 0, with corresponding eigenvector (1, 1, ..., 1). If G is connected, all other eigenvalues are greater than 0. Fiedler [5, 6] explored the properties of the eigenvector associated with the second smallest eigenvalue. (These are now known as the "Fiedler vector" and "Fiedler value," respectively.) Spectral methods partition G based on the Fiedler vector of L. The reason why the Fiedler vector is useful for partitioning is explained in more detail in the next section.

3 Bisection as a Discrete Optimization Problem

It is well known that for any vector x,

    x^T L x = Σ_{(i,j)∈E} (x_i − x_j)^2

holds (see for example Pothen et al. [15]). Note that there is one term in the sum for each edge in the graph. Consider a vector x whose construction is based upon a partition of the graph into subgraphs P_1 and P_2: assign x_i = +1 if vertex v_i is in partition P_1, and x_i = −1 if v_i is in partition P_2. Using this assignment, if an edge connects two vertices in the same partition then x_i = x_j and the corresponding term in the Dirichlet sum is 0. The only non-zero terms in the Dirichlet sum are those corresponding to edges with end points in separate partitions. Since the Dirichlet sum has one term for each edge, and the only non-zero terms are those corresponding to edges between P_1 and P_2, it follows that

    x^T L x = 4 × (number of edges between P_1 and P_2).

An x which minimizes this expression corresponds to a partition which minimizes the number of edges between the partitions.
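The identity above is easy to check numerically. The sketch below uses a hypothetical six-vertex graph (two triangles joined by a single edge, not an example from the paper), builds L = D − A, and verifies that x^T L x equals four times the number of cut edges:

```python
import numpy as np

# Hypothetical example graph: triangles {0,1,2} and {3,4,5}
# joined by the single edge (2,3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
n = 6

A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1

D = np.diag(A.sum(axis=1))   # degree matrix
L = D - A                    # graph Laplacian

# Partition vector: +1 on P1 = {0,1,2}, -1 on P2 = {3,4,5}
x = np.array([1, 1, 1, -1, -1, -1])

cut = sum(1 for i, j in edges if x[i] != x[j])
assert x @ L @ x == 4 * cut   # Dirichlet sum identity
print(int(x @ L @ x), cut)    # → 4 1
```

Only the edge (2,3) crosses the partition, so the Dirichlet sum has exactly one non-zero term, (1 − (−1))^2 = 4.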
The graph partitioning problem has been transformed into a discrete optimization problem:

    minimize (1/4) x^T L x

such that

  1. e^T x = 0, where e = (1, 1, ..., 1)^T,
  2. x^T x = n,
  3. x_i = ±1.

Condition (1) stipulates that the number of vertices in each partition be equal, and condition (3) stipulates that every vertex be assigned to one of the partitions. If we remove condition (3), the above problem can be solved using Lagrange multipliers. We seek to minimize f(x) subject to g_1(x) = 0 and g_2(x) = 0. This involves finding Lagrange multipliers λ_1 and λ_2 such that

    ∇f(x) − λ_1 ∇g_1(x) − λ_2 ∇g_2(x) = 0.

For this optimization problem, f(x) = (1/2) x^T L x, g_1(x) = e^T x, and g_2(x) = (1/2)(x^T x − n). The solution must satisfy

    L x − λ_1 e − λ_2 x = 0,

that is, (L − λ_2 I) x = λ_1 e. Premultiplying by e^T gives

    e^T L x − λ_2 e^T x = λ_1 e^T e = n λ_1.

But e^T L = 0 and e^T x = 0, so λ_1 = 0. Thus

    (L − λ_2 I) x = 0

and x is an eigenvector of L. The above development has shown how finding a partition of a graph which minimizes the number of edges cut can be transformed into an eigenvalue problem involving the graph Laplacian. This is the foundation of spectral methods. Theory exists to show the effectiveness of this approach. Guattery and Miller [7] present examples of graphs for which spectral partitioning does not work well. Nevertheless, spectral partitioning methods work well on bounded-degree planar graphs and finite element meshes [17]. Chan et al. [3] show that spectral partitioning is optimal in the sense that the partition vector induced by it is the closest partition vector to the second eigenvector in any l_s norm with s ≥ 1. Nested dissection is predicated upon finding a good partition of a graph. Simon and Teng [16] examined the quality of the p-way partition produced by recursive bisection when p is a power of two. They show that recursive bisection in some cases produces a p-way partition which is much worse than the optimal p-way partition. However, if one is willing to relax the problem slightly, they show that recursive bisection will find a good partition.
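The relaxed problem can be exercised directly with a dense eigensolver: take the eigenvector of the second-smallest eigenvalue of L and split at its median (the median cut analyzed by Chan et al. [3]). The following is a minimal sketch on a hypothetical six-vertex path graph, not the paper's Davidson-based method:

```python
import numpy as np

# Hypothetical graph: a 6-vertex path, whose natural bisection
# is {0,1,2} vs {3,4,5}.
edges = [(i, i + 1) for i in range(5)]
n = 6
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1
L = np.diag(A.sum(axis=1)) - A

w, V = np.linalg.eigh(L)             # eigenvalues in ascending order
fiedler = V[:, 1]                    # eigenvector of 2nd-smallest eigenvalue
part = fiedler > np.median(fiedler)  # median cut -> balanced halves
print(part)
```

For a path, the Fiedler vector varies monotonically along the vertices, so the median cut recovers the two halves exactly (up to the eigenvector's arbitrary sign).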
4 Our Approach

We developed techniques which use a multilevel representation of a graph to accelerate the computation of its Fiedler vector. The new techniques:

  - provide a framework for a multilevel preconditioner,
  - obtain a favorable initial starting point for the eigensolver,
  - reorder the data structures to improve locality of memory references.

The multilevel representation of the input graph G_0 consists of a series of graphs {G_0, G_1, ..., G_n} obtained by successive coarsening. Coarsening G_i is accomplished by finding a maximum matching of its vertices and then combining each pair of matched vertices to form a new vertex of G_{i+1}; unmatched vertices are replicated in G_{i+1} without modification. Connectivity of G_i is maintained by connecting two vertices in G_{i+1} with an edge if, and only if, their constituent vertices in G_i were connected by an edge. The coarsening process concludes when a graph G_n is obtained with sufficiently few vertices.

We used the Davidson algorithm [4, 2, 14] as our eigensolver. The Davidson algorithm is a subspace algorithm which iteratively builds a sequence of nested subspaces. A Rayleigh-Ritz process finds a vector in each subspace which approximates the desired eigenvector. If the Ritz vector is not sufficiently close to an eigenvector, the subspace is augmented by adding a new dimension and the process repeats. The Davidson algorithm allows the incorporation of a preconditioner; we used the structure of the graph in the development of our Davidson preconditioner. More details about this algorithm are given in [8, 9].

4.1 Multilevel Preconditioner

We will refer to our new preconditioned Davidson algorithm as PDA. The preconditioner approximates the inverse of the graph Laplacian matrix. It operates in a manner similar to multigrid methods for solving discretizations of PDEs [13]. However, our preconditioner differs from these methods in that we do not rely on obtaining coarser problems by decimating a regular discretization; our method works with irregular or unknown discretizations because it uses the coarse graphs for the multilevel framework. PDA can be considered our main contribution to the current research on partitioning algorithms. The next two subsections describe other features of our algorithms.
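The matching-based coarsening described at the start of this section can be sketched in a few lines. The sketch below substitutes a greedy (maximal) matching with deterministic neighbor order for the paper's matching step, which is an assumption made for reproducibility rather than the authors' exact procedure:

```python
def coarsen(edges, n):
    """One coarsening level: contract a greedily grown matching;
    unmatched vertices are copied to the coarse graph unchanged."""
    adj = {v: set() for v in range(n)}
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)

    mate = {}
    for v in range(n):                       # greedy maximal matching
        if v in mate:
            continue
        for u in sorted(adj[v]):             # deterministic neighbor order
            if u not in mate:
                mate[v], mate[u] = u, v
                break

    coarse_id, label = {}, 0
    for v in range(n):                       # one coarse vertex per pair
        if v in coarse_id:
            continue
        coarse_id[v] = label
        if v in mate:
            coarse_id[mate[v]] = label       # merged pair shares a label
        label += 1

    # Coarse vertices are adjacent iff some constituent vertices were.
    coarse_edges = {tuple(sorted((coarse_id[i], coarse_id[j])))
                    for i, j in edges if coarse_id[i] != coarse_id[j]}
    return sorted(coarse_edges), label

# 4-cycle with a chord: pairs {0,1} and {2,3} merge, one coarse edge remains.
print(coarsen([(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)], 4))  # → ([(0, 1)], 2)
```

Iterating `coarsen` until `label` falls below a threshold yields the series {G_0, G_1, ..., G_n}.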
These additional features are aimed at speeding up each iteration of PDA. The numerical results of Sec. 5 will show that we were successful in improving purely spectral partitioning algorithms.

4.2 Favorable Initial Subspace

The second main strategy utilizes the multilevel representation of G_0 to construct a favorable initial subspace for the Davidson algorithm. We call this strategy the Nested Davidson Algorithm (NDA). NDA works by running the Davidson algorithm on the graphs {G_0, G_1, ..., G_n} in order from G_n to G_0. The k basis vectors spanning the subspace for G_{i+1} are used to construct l basis vectors for the initial subspace for G_i. The l new basis vectors are constructed from linear combinations of the basis vectors for G_{i+1} by interpolating them onto G_i and orthonormalizing. Once the initial basis vectors for G_i have been constructed, the Davidson algorithm is run on G_i, and the process repeats for G_{i−1}.
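A bare-bones Davidson iteration for the Fiedler pair can be sketched as below. Two simplifications are assumptions of this sketch, not the paper's implementation: the multilevel preconditioner of PDA is replaced by a simple diagonal (Jacobi) preconditioner, and the favorable initial subspace of NDA is replaced by a random starting vector. The search basis is kept orthogonal to the all-ones null vector, so the smallest Ritz value targets the second-smallest eigenvalue of L:

```python
import numpy as np

def davidson_fiedler(L, tol=1e-8, max_iter=50):
    """Davidson sketch for the Fiedler pair of a connected graph's
    Laplacian L.  The basis is deflated against the known null vector
    (all ones); a diagonal preconditioner stands in for a multilevel one."""
    n = L.shape[0]
    e = np.ones(n) / np.sqrt(n)
    diag = np.diag(L)
    basis = []

    def orthonormalize(v):
        v = v - (e @ v) * e                  # remove the null-vector component
        for b in basis:                      # Gram-Schmidt against the basis
            v = v - (b @ v) * b
        nv = np.linalg.norm(v)
        return v / nv if nv > 1e-12 else None

    t = np.random.default_rng(0).standard_normal(n)
    theta, u = 0.0, e
    for _ in range(max_iter):
        v = orthonormalize(t)
        if v is None:                        # subspace exhausted: Ritz pair exact
            break
        basis.append(v)
        V = np.column_stack(basis)
        w, S = np.linalg.eigh(V.T @ L @ V)   # Rayleigh-Ritz step
        theta, u = w[0], V @ S[:, 0]         # smallest Ritz pair
        r = L @ u - theta * u                # residual
        if np.linalg.norm(r) < tol:
            break
        # Davidson correction with a safeguarded diagonal preconditioner
        denom = np.where(np.abs(diag - theta) > 1e-3, diag - theta, 1e-3)
        t = r / denom
    return theta, u

# Usage: path graph on 6 vertices; the Fiedler vector changes sign once
# along the path, splitting it into two halves.
A = np.zeros((6, 6))
for i in range(5):
    A[i, i + 1] = A[i + 1, i] = 1
L = np.diag(A.sum(axis=1)) - A
theta, u = davidson_fiedler(L)
print(theta)   # approaches 2 - 2*cos(pi/6) ≈ 0.2679
```

NDA would replace the random start with interpolated coarse-level basis vectors, and PDA would replace the diagonal solve in the correction step with the multilevel preconditioner.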

4.3 Locality of Memory References

No assumption is made about the regularity of the input graphs. Furthermore, the manner in which the coarse graphs are constructed results in data structures with irregular storage in memory. Irregular storage has the potential of reducing locality of memory accesses and thereby the effectiveness of cache memories. We developed a method called Multilevel Graph Reordering (MGR) which uses the multilevel representation of G_0 to reorder the data structures in memory to improve locality of reference. Intuitively, we permute the graph Laplacian matrices to increase the concentration of non-zero elements along the diagonal. This improved locality of reference during relaxation operations, which represented a major portion of the time required to compute the Fiedler vector. The relabeling of the graph is accomplished by imposing a tree on the vertices of the graphs {G_0, G_1, ..., G_n}. This tree is traversed in a depth-first manner, and the vertices are relabeled in the order in which they are visited. After the relabeling is complete, the data structures are rearranged in memory such that the data structures for the i-th vertex are stored at lower memory addresses than those for the (i+1)-st vertex. An example of reordering by MGR is shown in Figures 1 and 2. The vertices are shown, as well as the tree overlain on the vertices. (The edges of the graphs are not shown.) If a vertex lies to the right of another in the figure, then its data structures occupy higher memory addresses. Notice that after reordering, vertices with a common parent are placed next to each other in memory. This indicates that they are connected by an edge and will be referenced together during relaxation operations. Relaxation operations represented a large fraction of the total work done by our algorithms; this reordering has a positive effect on locality of reference during such operations.
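The depth-first relabeling can be sketched as follows. The tree encoding (a dict mapping each node to its children, with fine-graph vertices as leaves) is a hypothetical representation chosen for this sketch:

```python
def mgr_order(children, root):
    """Multilevel Graph Reordering sketch: traverse the coarsening tree
    depth-first and relabel fine-graph vertices (the leaves) in the
    order they are reached."""
    order, stack = [], [root]
    while stack:
        node = stack.pop()
        kids = children.get(node, [])
        if kids:
            stack.extend(reversed(kids))   # preserve left-to-right order
        else:
            order.append(node)             # leaf = fine-graph vertex
    return order                           # order[new_label] = old vertex id

# A tree shaped like Fig. 1: one root, two coarse nodes, four fine vertices.
tree = {"A": ["B", "C"], "B": [0, 1], "C": [2, 3]}
print(mgr_order(tree, "A"))                # → [0, 1, 2, 3]
```

Applying the resulting permutation to the rows and columns of the Laplacian places vertices with a common parent (i.e. vertices merged during coarsening) next to each other, which is what concentrates the non-zeros near the diagonal.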
[Fig. 1. Storage of data structures before reordering: levels 3 (A), 2 (B, C), 1 (D, E, F), 0 (G, I, H, J).]

[Fig. 2. Storage of data structures after reordering: levels 3 (A), 2 (B, C), 1 (E, D, F), 0 (G, J, I, H).]

The algorithms described above are complementary and can be used concurrently while computing the Fiedler vector. We call the resulting algorithm, when PDA, NDA and MGR are combined, the Multilevel Davidson Algorithm (MDA).

5 Numerical Results

The performance of our new algorithms was tested by computing the Fiedler vector of graphs extracted from large, sparse matrices. These matrices ranged in size from approximately 10,000 × 10,000 to more than 200,000 × 200,000, with the number of edges ranging from approximately 300,000 to more than 12 million. The matrices come from a variety of application domains, including finite element models and a matrix from financial portfolio optimization. Figure 3 shows the ratio of the time taken by MDA (using PDA, NDA and MGR concurrently) to the time taken by the Lanczos algorithm to compute the Fiedler vector. The ratio for most graphs was between 0.2 and 0.5, indicating that MDA usually computed the Fiedler vector two to five times as fast as the Lanczos algorithm. For some graphs MDA was even faster; for the gearbox graph MDA was only slightly faster. Our studies indicate that the two main factors in MDA's improved performance shown in Figure 3 are the preconditioner of PDA and the favorable initial subspace provided by NDA. We have observed that NDA does indeed find favorable subspaces: the Fiedler vector had a very small component perpendicular to the initial subspace. We were also able to measure the effect of MGR on locality of memory references. Our studies found that MGR increased the number of times a primary cache line was reused by 18 percent on average, indicating improved locality in the pattern of memory accesses.

6 Conclusions

We have developed new algorithms which utilize the multilevel representation of a graph to accelerate the computation of the Fiedler vector for that graph.

[Fig. 3. Ratio of Davidson to Lanczos execution time for 3dtube, bcsstk31, bcsstk32, bcsstk35, cfd1, cfd2, ct20stif, finan512, gearbox, nasasrb, pwt, pwtk.]

The Preconditioned Davidson Algorithm uses the graph structure of the problem as a framework for a multilevel preconditioner. A favorable initial subspace for the Davidson algorithm is provided by the Nested Davidson algorithm. The Multilevel Graph Reordering algorithm improves the locality of memory references by finding a permutation of the graph Laplacian matrices which concentrates the non-zero entries along the diagonal and reordering the data structures in memory accordingly. More numerical results can be found in [9]. The full algorithms for the methods described in this paper are available in [8]. PDA can be considered our main contribution to spectral partitioning of irregular graphs.

References

1. S. Barnard and H. Simon. A fast multilevel implementation of recursive spectral bisection for partitioning unstructured problems. In Proceedings of the Sixth SIAM Conference on Parallel Processing for Scientific Computing, Norfolk, Virginia, 1993. SIAM.
2. L. Borges and S. Oliveira. A parallel Davidson-type algorithm for several eigenvalues. Journal of Computational Physics, 144:763–770, August 1998.
3. T. Chan, P. Ciarlet Jr., and W. K. Szeto. On the optimality of the median cut spectral bisection graph partitioning method. SIAM Journal on Computing, 18(3):943–948, 1997.
4. E. Davidson. The iterative calculation of a few of the lowest eigenvalues and corresponding eigenvectors of large real-symmetric matrices. Journal of Computational Physics, 17:87–94, 1975.
5. M. Fiedler. Algebraic connectivity of graphs. Czechoslovak Mathematical Journal, 23:298–305, 1973.
6. M. Fiedler. A property of eigenvectors of non-negative symmetric matrices and its application to graph theory. Czechoslovak Mathematical Journal, 25:619–632, 1975.
7. S. Guattery and G. L. Miller. On the quality of spectral separators. SIAM Journal on Matrix Analysis and Applications, 19:701–719, 1998.

8. M. Holzrichter and S. Oliveira. New spectral graph partitioning algorithms. Submitted.
9. M. Holzrichter and S. Oliveira. New graph partitioning algorithms. The University of Iowa TR-120, 1998.
10. G. Karypis and V. Kumar. Multilevel k-way partitioning scheme for irregular graphs. To appear in the Journal of Parallel and Distributed Computing.
11. G. Karypis and V. Kumar. A fast and high quality multilevel scheme for partitioning irregular graphs. SIAM Journal on Scientific Computing, 20(1):359–392, 1999.
12. B. Kernighan and S. Lin. An efficient heuristic procedure for partitioning graphs. The Bell System Technical Journal, 49:291–307, February 1970.
13. S. McCormick. Multilevel Adaptive Methods for Partial Differential Equations. Society for Industrial and Applied Mathematics, Philadelphia, Pennsylvania, 1989.
14. S. Oliveira. A convergence proof of an iterative subspace method for eigenvalue problems. In F. Cucker and M. Shub, editors, Foundations of Computational Mathematics Selected Papers, pages 316–325. Springer, January 1997.
15. A. Pothen, H. D. Simon, and K.-P. Liou. Partitioning sparse matrices with eigenvectors of graphs. SIAM Journal on Matrix Analysis and Applications, 11(3):430–452, 1990.
16. H. D. Simon and S.-H. Teng. How good is recursive bisection? SIAM Journal on Scientific Computing, 18(5):1436–1445, July 1997.
17. D. Spielman and S.-H. Teng. Spectral partitioning works: planar graphs and finite element meshes. In 37th Annual Symposium on Foundations of Computer Science, Burlington, Vermont, October 1996. IEEE Press.