Parallel Incremental Graph Partitioning Using Linear Programming


Syracuse University
SURFACE
College of Engineering and Computer Science - Former Departments, Centers, Institutes and Projects
College of Engineering and Computer Science
1994

Parallel Incremental Graph Partitioning Using Linear Programming
Chao Wei Ou, Syracuse University
Sanjay Ranka, Syracuse University

Follow this and additional works at: https://surface.syr.edu/lcsmth_other
Part of the Computer Sciences Commons

Recommended Citation
Ou, Chao Wei and Ranka, Sanjay, "Parallel Incremental Graph Partitioning Using Linear Programming" (1994). College of Engineering and Computer Science - Former Departments, Centers, Institutes and Projects. 4. https://surface.syr.edu/lcsmth_other/4

This Article is brought to you for free and open access by the College of Engineering and Computer Science at SURFACE. It has been accepted for inclusion in College of Engineering and Computer Science - Former Departments, Centers, Institutes and Projects by an authorized administrator of SURFACE. For more information, please contact surface@syr.edu.

Chao-Wei Ou and Sanjay Ranka
School of Computer and Information Science, Syracuse University, Syracuse, NY 13244-4100

This research was supported in part by DARPA under contract #DABT63-91-C-0028.

Abstract

Partitioning graphs into equally large groups of nodes while minimizing the number of edges between different groups is an extremely important problem in parallel computing. For instance, efficiently parallelizing several scientific and engineering applications requires the partitioning of data or tasks among processors such that the computational load on each node is roughly the same, while communication is minimized. Obtaining exact solutions is computationally intractable, since graph partitioning is NP-complete. For a large class of irregular and adaptive data parallel applications (such as adaptive meshes), the computational structure changes from one phase to another in an incremental fashion. In incremental graph-partitioning problems the partitioning of the graph needs to be updated as the graph changes over time; a small number of nodes or edges may be added or deleted at any given instant. In this paper we use a linear programming-based method to solve the incremental graph-partitioning problem. All the steps used by our method are inherently parallel and hence our approach can be easily parallelized. By using an initial solution for the graph partitions derived from recursive spectral bisection-based methods, our methods can achieve repartitioning at considerably lower cost than can be obtained by applying recursive spectral bisection from scratch. Further, the quality of the partitioning achieved is comparable to that achieved by applying recursive spectral bisection to the incremental graphs from scratch.

1 Introduction

Graph partitioning is a well-known problem for which fast solutions are extremely important in parallel computing and in research areas such as circuit partitioning for VLSI design. For instance, parallelization of many scientific and engineering problems requires partitioning the data among the processors in such a fashion that the computation load on each node is balanced, while communication is minimized. This is a graph-partitioning problem, where nodes of the graph represent computational tasks, and edges describe the communication between tasks, with each partition corresponding to one processor. Optimal partitioning would allow optimal parallelization of the computations with the load balanced over various processors and with minimized communication time. For many applications, the computational graph can be derived only at runtime and requires that graph partitioning also be done in parallel. Since graph partitioning is NP-complete, obtaining suboptimal solutions quickly is desirable and often satisfactory.

For a large class of irregular and adaptive data parallel applications such as adaptive meshes [], the computational structure changes from one phase to another in an incremental fashion. In "incremental graph-partitioning" problems, the partitioning of the graph needs to be updated as the graph changes over time; a small number of nodes or edges may be added or deleted at any given instant. A solution of the previous graph-partitioning problem can be utilized to partition the updated graph, such that the time required will be much less than the time required to reapply a partitioning algorithm to the entire updated graph. If the graph is not repartitioned, it may lead to imbalance in the time required for computation on each node and cause considerable deterioration in the overall performance. For many of these problems the graph may be modified after every few iterations (albeit incrementally), and so the remapping must have a lower cost relative to the computational cost of executing the few iterations for which the computational structure remains fixed. Unless this incremental partitioning can itself be performed in parallel, it may become a bottleneck.

Several suboptimal methods have been suggested for finding good solutions to the graph-partitioning problem. Important heuristics include recursive coordinate bisection, recursive graph bisection, recursive spectral bisection, mincut-based methods, clustering techniques, geometry-based mapping, block-based spatial decomposition, and scattered decomposition [, 4,, 6, 5, 8, 9,, ]. For many applications, the computational graph is such that the vertices correspond to two- or three-dimensional coordinates and the interaction between computations is limited to vertices that are physically proximate. In this paper we concentrate on methods for which such information is not available, and which therefore have wider applicability.

Our incremental graph-partitioning algorithm uses linear programming. Starting from an initial partitioning obtained with recursive spectral bisection, which is regarded as one of the best-known methods for graph partitioning, our methods can partition the new graph at considerably lower cost than repartitioning from scratch. The quality of partitioning achieved is close to that achieved by applying recursive spectral bisection from scratch. Further, our algorithms are inherently parallel.

The rest of the paper is outlined as follows. Section 1.1 defines the incremental graph-partitioning problem. Section 2 describes the linear programming-based incremental graph partitioning. Experimental results of our methods on sample meshes are described in Section 3. Conclusions are given in Section 4.

1.1 Problem definition

Consider a graph $G = (V, E)$, where $V$ represents a set of vertices, $E$ represents a set of undirected edges, the number of vertices is given by $n = |V|$, and the number of edges is given by $m = |E|$. The graph-partitioning problem can be defined as an assignment scheme $M : V \to P$ that maps vertices to partitions. We denote by $B(q)$ the set of vertices assigned to a partition $q$, i.e., $B(q) = \{v \in V : M(v) = q\}$. The weight $w_i$ corresponds to the computation cost (or weight) of the vertex $v_i$. The cost of an edge $w_e(v_1, v_2)$ is given by the amount of interaction between vertices $v_1$ and $v_2$.
The weight of every partition can be defined as

$$W(q) = \sum_{v_i \in B(q)} w_i. \tag{1}$$

The cost of all the outgoing edges from a partition represents the total amount of communication cost and is given by

$$C(q) = \sum_{v_i \in B(q),\; v_j \notin B(q)} w_e(v_i, v_j). \tag{2}$$

We would like to make an assignment such that the time spent by every node is minimized, i.e., $\min_q \,(W(q) + \beta C(q))$, where $\beta$ represents the ratio of cost of unit computation/cost of unit communication on a machine. Assuming computational loads are nearly balanced ($W(0) \approx W(1) \approx \cdots \approx W(p-1)$), the second term needs to be minimized. In the literature $\sum_q C(q)$ has also been used to represent the communication.

Assume that a solution is available for a graph $G(V, E)$ by using one of the many available methods in the literature, i.e., the mapping function $M$ is available such that

$$|B(0)| \approx |B(1)| \approx \cdots \approx |B(p-1)| \tag{3}$$

and the communication cost is close to optimal. Let $G'(V', E')$ be an incremental graph of $G(V, E)$:

$$V' = V \cup V_1 - V_2, \quad \text{where } V_2 \subseteq V, \tag{4}$$

i.e., some vertices are added and some vertices are deleted. Similarly,

$$E' = E \cup E_1 - E_2, \quad \text{where } E_2 \subseteq E,\; E' \cap E \neq \emptyset, \tag{5}$$

i.e., some edges are added and some are deleted. We would like to find a new mapping $M' : V' \to P$ such that the new partitioning is as load balanced as possible and the communication cost is minimized. The methods described in this paper assume that $G'(V', E')$ is sufficiently similar to $G(V, E)$ that this can be achieved, i.e., the number of vertices and edges added/deleted is a small fraction of the original number of vertices and edges.

2 Incremental partitioning

In this section we formulate incremental graph partitioning in terms of linear programming. A high-level overview of the four phases of our incremental graph-partitioning algorithm is shown in Figure 1. Some notation is in order. Let $P$ be the number of partitions and let $B'(i)$ represent the set of vertices in partition $i$.
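As a concrete reading of the partition weight $W(q)$ and the communication cost $C(q)$ from the problem definition, the following Python sketch computes both for a given mapping. The adjacency-dictionary representation, the function name `partition_costs`, and the optional weight arguments are assumptions made for illustration; the paper does not prescribe a data structure.

```python
def partition_costs(adj, mapping, vertex_weight=None, edge_weight=None):
    """Compute W(q) and C(q) for every partition q.

    adj:           {vertex: iterable of neighbours} for an undirected graph
    mapping:       {vertex: partition}, i.e. the assignment M
    vertex_weight: optional {vertex: w_i}; unit weights if omitted
    edge_weight:   optional {frozenset({u, v}): w_e}; unit costs if omitted
    """
    W, C = {}, {}
    for v, q in mapping.items():
        W[q] = W.get(q, 0) + (1 if vertex_weight is None else vertex_weight[v])
        C.setdefault(q, 0)
        for u in adj[v]:
            if mapping[u] != q:  # this edge leaves partition q
                C[q] += 1 if edge_weight is None else edge_weight[frozenset((u, v))]
    return W, C


# Hypothetical example: the path 0-1-2-3 split into two halves.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
mapping = {0: 0, 1: 0, 2: 1, 3: 1}
W, C = partition_costs(adj, mapping)
print(W, C)  # {0: 2, 1: 2} {0: 1, 1: 1}
```

Each cut edge contributes to $C(q)$ for both of the partitions it joins, so summing $C(q)$ over all partitions counts every cut edge twice.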

Let $\lambda$ represent the average load for each partition, $\lambda = \frac{1}{P} \sum_{i} |B'(i)|$. The four steps are described in detail in the following sections.

Step 1: Assign the new vertices to one of the partitions (given by $M'$).
Step 2: Layer each partition to find the closest partition for each vertex (given by $L'$).
Step 3: Formulate the linear programming problem based on the mapping of Step 2 and balance loads (i.e., modify $M'$), minimizing the total number of changes in $M'$.
Step 4: Refine the mapping in Step 3 to reduce the communication cost.

Figure 1: The different steps used in our incremental graph-partitioning algorithm.

2.1 Assigning an initial partition to the new nodes

The first step of the algorithm is to assign an initial partition to the nodes of the new graph (given by $M'(V')$). A simple method for initializing $M'(V')$ is given as follows. Let

$$M'(v) = M(v) \quad \text{for all } v \in V - V_2. \tag{6}$$

For all the vertices $v \in V_1$,

$$M'(v) = M(x), \quad \text{where } x \text{ attains } \min_{x \in V - V_2} d(v, x); \tag{7}$$

$d(v, x)$ is the shortest distance in the graph $G'(V', E')$. For the examples considered in this paper we assume that $G'$ is connected. If this is not the case, several other strategies can be used. If the graph $(V \cup V_1, E \cup E_1)$ is connected, this graph can be used instead of $G'$ for the calculation of $M'(V')$. If it is not connected, then the new nodes that are not connected to any of the old nodes can be clustered together (into potentially disjoint clusters) and assigned to the partition that has the least number of vertices.

Figure 2: (a) Initial graph. (b) Incremental graph, with the new vertices marked.

For the rest of the paper we will assume that $M'(v)$ can be calculated using the definition in (7), although the strategies developed in this paper are, in general, independent of this mapping. Further, for ease of presentation, we will assume that the edge and the vertex weights are of unit value. All of our algorithms can be easily modified if this is not the case. Figure 2(a) describes the mapping of each of the vertices of a graph. Figure 2(b) describes the mapping of the additional vertices using the above strategy.

2.2 Layering each partition

The above mapping would ordinarily generate partitions of unequal size. We would like to move vertices from one partition to another to achieve load balancing, while keeping the communication cost as small as possible. This is achieved by making sure that the vertices transferred between two partitions are close to the boundary of the two partitions. We assign each vertex of a given partition to a different partition it is close to (ties are broken arbitrarily):

$$L'(v) = M'(x), \tag{8}$$

where $x$ is such that

$$\min_{x \notin B'(M'(v))} d(v, x) \tag{9}$$

is satisfied; $d(v, x)$ is the shortest distance in the graph between $v$ and $x$.

    { map[v[j]] represents the mapping of vertex j. }
    { adj_i[j] represents the jth element of the local adjacent list in partition i. }
    { xadj_i[v[j]] represents the starting address of vertex j in the local adjacent list of partition i. }
    { S_i(j, k) represents the set of vertices of partition i at a distance k from a node in partition j. }
    { Neighbor_i represents the set of partitions which have common boundaries with partition i. }

    For each partition i do
        For vertex v[j] in V_i do
            For k <- xadj_i[v[j]] to xadj_i[v[j+1]] do
                if map[adj_i[k]] != i
                    Count_i[map[adj_i[k]]] := Count_i[map[adj_i[k]]] + 1
            if Count_i[l] > 0 for some l
                Add v[j] into S_i(tag, 0)   { where Count_i[tag] = max_l Count_i[l] }
                V_i := V_i - {v[j]}
        level := 0
        repeat
            For k in Neighbor_i do
                For vertex v[j] in S_i(k, level) do
                    For l <- xadj_i[v[j]] to xadj_i[v[j+1]] do
                        if adj_i[l] not in S_i(k, level)
                            count_i[adj_i[l]][k] := count_i[adj_i[l]][k] + 1
                            Add adj_i[l] into tmpS
            level := level + 1
            For vertex v[j] in tmpS do
                Add v[j] into S_i(tag, level)   { where count_i[j][tag] = max_l count_i[j][l] }
                V_i := V_i - {v[j]}
        until (V_i is empty)
        For j in Neighbor_i do
            alpha_ij := sum over k < level of |S_i(j, k)|

Figure 3: Layering algorithm.

Figure 4: Labeling the nodes of a graph to the closest outside partition. (a) A microscopic view of the layering for a graph near the boundary of three partitions. (b) Layering of the graph in Figure 2(b); no edges are shown.
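With unit edge lengths, the initial assignment of (7) and the layering of (8)-(9) both reduce to breadth-first searches. The sketch below is a simplified illustration under that assumption, not the algorithm of the layering figure: it breaks ties by whichever neighbour is reached first rather than by majority counts, it assumes the graph is connected, and the function names and adjacency-dictionary representation are hypothetical.

```python
from collections import deque


def assign_new_vertices(adj, old_mapping):
    """Eqs. (6)-(7): old vertices keep their partition; each new vertex takes
    the partition of a nearest old vertex (multi-source BFS, unit edge lengths).
    adj describes the updated graph, so deleted vertices simply do not appear."""
    mapping = {v: old_mapping[v] for v in adj if v in old_mapping}
    queue = deque(mapping)
    while queue:
        v = queue.popleft()
        for u in adj[v]:
            if u not in mapping:  # first visit comes from a nearest old vertex
                mapping[u] = mapping[v]
                queue.append(u)
    return mapping


def layer_partitions(adj, mapping):
    """Eqs. (8)-(9): label[v] is the partition of a nearest vertex outside v's
    own partition, and level[v] is the distance to it."""
    label, level = {}, {}
    queue = deque()
    for v in adj:  # level 1: vertices with a neighbour in another partition
        for u in adj[v]:
            if mapping[u] != mapping[v]:
                label[v], level[v] = mapping[u], 1
                queue.append(v)
                break
    while queue:  # grow the layers inward, staying inside each partition
        v = queue.popleft()
        for u in adj[v]:
            if u not in label and mapping[u] == mapping[v]:
                label[u], level[u] = label[v], level[v] + 1
                queue.append(u)
    return label, level


def movable_counts(mapping, label):
    """Number of vertices of partition i labelled with partition j, per (i, j)."""
    counts = {}
    for v, j in label.items():
        counts[(mapping[v], j)] = counts.get((mapping[v], j), 0) + 1
    return counts


# Hypothetical example: path 0-1-2-3 partitioned as {0, 1, 2} | {3}, then
# vertices 4 and 5 are appended after vertex 3.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
mapping = assign_new_vertices(adj, {0: 0, 1: 0, 2: 0, 3: 1})
label, level = layer_partitions(adj, mapping)
print(mapping)                         # {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
print(movable_counts(mapping, label))  # {(0, 1): 3, (1, 0): 3}
```

The search in `layer_partitions` may stay inside one partition because every vertex on a shortest path to the nearest outside vertex, except the last, lies in the same partition as the starting vertex. The per-pair counts returned by `movable_counts` play the role of the upper bounds on how many vertices may move between two neighbouring partitions.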

A simple algorithm to perform the layering is given in Figure 3. It assumes the graph is connected. Let $\alpha_{ij}$ represent the number of such vertices of partition $i$ that can be moved to partition $j$. For the example case of Figure 2, labels of all the vertices are given in Figure 4. A label of $j$ on a vertex in partition $i$ corresponds to the fact that this vertex belongs to the set that contributed to $\alpha_{ij}$.

2.3 Load balancing

Let $l_{ij}$ represent the number of vertices to be moved from partition $i$ to partition $j$ to achieve load balance. There are several methods for load balancing. However, since one of our goals is to minimize the communication cost, we would like to minimize $\sum l_{ij}$, because this would correspond to a minimization of the amount of vertex movement (or "deformity") in the original partitions. Thus, the load-balancing step can be formally defined as the following linear programming problem:

Minimize

$$\sum_{0 \le i \ne j < P} l_{ij} \tag{10}$$

subject to

$$l_{ij} \le \alpha_{ij}, \quad 0 \le i \ne j < P, \tag{11}$$

$$\sum_{0 \le j < P} (l_{ij} - l_{ji}) = |B'(i)| - \lambda, \quad 0 \le i < P, \tag{12}$$

where $\lambda$ is the average load per partition. Constraint (12) corresponds to the load balance condition. The above formulation is based on the assumption that changes to the original graph are small and the initial partitioning is well balanced. Hence, moving the boundaries by a small amount will give balanced partitioning with low communication cost.

There are several approaches to solving the above linear programming problem. We decided to use the simplex method because it has been shown to work well in practice and because it can be easily parallelized. (We have used a dense version of the simplex algorithm. The total time can potentially be reduced by using a sparse representation.) The simplex formulation of the example in Figure 2 is given in Figure 5. In the corresponding solution two of the $l_{ij}$ are nonzero and all other values are zero. The new partitioning is given in Figure 6.

Figure 5: Linear programming formulation and its solution based on the mapping of the graph in Figure 2(b) using the labeling information in Figure 4(b). The figure lists the bounds of constraint (11), one balance equation of constraint (12) per partition, and the solution obtained using the simplex method.

Figure 6: The new partition of the graph in Figure 2(b) after the load-balancing step (panels: initial partitions, incremental partitions).

The above set of constraints may not have a feasible solution. One approach is to relax the constraint in (11) and not have $l_{ij} \le \alpha_{ij}$ as a constraint. Clearly, this would achieve load balance but may lead to major modifications in the mapping. Another approach is to replace the constraint in (12) by

$$\sum_{0 \le j < P} (l_{ij} - l_{ji}) = \frac{|B'(i)| - \lambda}{\Delta}, \quad 0 \le i < P. \tag{13}$$

Assuming $C > \Delta > 1$, this would not achieve load balancing in one step, but several such steps can be applied to achieve load balancing. If a feasible solution cannot be found with a reasonable value of $\Delta$ (within an upper bound $C$), it would be better to start partitioning from scratch or solve the problem by adding only a fraction of the nodes at a given time, i.e., solve the problem in multiple stages. Typically, such cases arise when all the new nodes correspond to a few partitions and the amount of incremental change is greater than the size of one partition.

2.4 Refinement of partitions

The formulation in the previous section achieves load balance but does not try explicitly to reduce the number of cross-edges. The minimization term in (10) and the constraint in (11) indirectly keep the cross-edges to a minimum under the assumption that the initial partition is good. In this section we describe a linear programming-based strategy to reduce the number of cross-edges, while still maintaining the load balance. This is achieved by finding all the vertices of partition $i$ on the boundary of partitions $i$ and $j$ such that the cost of edges to the vertices in $j$ is larger than the cost of edges to local vertices (Figure 7), i.e., the total cost of cross-edges will decrease by moving the vertex from partition $i$ to $j$, which will affect the load balance. In the following a linear programming formulation is given that moves the vertices while keeping the load balance.

Let $M'' : V' \to P$ represent the mapping of each vertex after the load-balancing step. Let $out(k, j)$ represent the number of edges of vertex $k$ in partition $M''(k)$ connected to partition $j$ ($j \ne M''(k)$), and let $in(k)$ represent the number of vertices a vertex $k$ is connected to in partition $M''(k)$. Let $b_{ij}$ represent the number of vertices in partition $i$ which have more outgoing edges to partition $j$ than local edges.
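The counts $b_{ij}$ can be obtained in a single pass over the vertices. The following Python sketch is one possible reading of the definition, with a `strict` flag that decides whether a vertex with equally many outgoing and local edges is counted. The function name and the adjacency-dictionary input are assumptions, and only vertices with at least one edge into partition $j$ are considered as candidates for $j$.

```python
def refinement_candidates(adj, mapping, strict=False):
    """Return {(i, j): [vertices of partition i eligible to move to j]}.

    A vertex v in partition i is eligible for j when out(v, j) - in(v) >= 0,
    where out(v, j) counts its edges into partition j and in(v) counts its
    edges inside partition i. With strict=True the comparison becomes > 0.
    """
    candidates = {}
    for v, i in mapping.items():
        local = 0
        outgoing = {}
        for u in adj[v]:
            j = mapping[u]
            if j == i:
                local += 1
            else:
                outgoing[j] = outgoing.get(j, 0) + 1
        for j, count in outgoing.items():
            gain = count - local  # change in cross-edges saved by moving v to j
            if gain > 0 or (gain == 0 and not strict):
                candidates.setdefault((i, j), []).append(v)
    return candidates


# Hypothetical example: vertex 0 sits in partition 0 but has two of its three
# edges going into partition 1.
adj = {0: [1, 2, 3], 1: [0], 2: [0, 3], 3: [0, 2]}
mapping = {0: 0, 1: 0, 2: 1, 3: 1}
b = {pair: len(vs) for pair, vs in refinement_candidates(adj, mapping).items()}
print(b)  # {(0, 1): 1, (1, 0): 2}
```

In this example vertices 2 and 3 each have one local edge and one edge into partition 0, so they are counted only in the non-strict case; with `strict=True` the result shrinks to the single candidate vertex 0.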
Figure 7: Choosing vertices for refinement. (a) Microscopic view of a vertex which can be moved from partition $i$ to $j$, reducing the number of cross edges. (b) The set of vertices with the above property in the partition of Figure 6.

$$b_{ij} = |\{v \in B''(i) : out(v, j) - in(v) \ge 0\}|$$

We would like to maximize the number of vertices moved so that moving a vertex will not increase the cost of cross-edges. The inequality in the above definition can be changed to a strict inequality. We leave the equality, however, since by including such vertices the number of points that can be moved can be larger (because these vertices can be moved to satisfy load balance constraints without affecting the number of cross-edges). The refinement problem can now be posed as the following linear programming problem:

Maximize

$$\sum_{0 \le i \ne j < P} l_{ij} \tag{14}$$

such that

$$l_{ij} \le b_{ij}, \quad 0 \le i \ne j < P, \tag{15}$$

$$\sum_{0 \le j < P} (l_{ij} - l_{ji}) = 0, \quad 0 \le i < P. \tag{16}$$

Figure 8: Formulation of the refinement step using linear programming and its solution. The figure lists the bounds of constraint (15), the load-balancing constraint (16) for each partition, and the solution obtained using the simplex method.

Figure 9: The new partition of the graph in Figure 6 after the refinement step (panels: incremental partitions, refined partitions).

This refining step can be applied iteratively until the effective gain by the movement of vertices is small. After a few steps, the inequalities ($l_{ij} \le b_{ij}$) need to be replaced by strict inequalities ($l_{ij} < b_{ij}$); otherwise, vertices having an equal number of local and nonlocal vertices may move between boundaries without reducing the total cost. The simplex formulation of the example in Figure 6 is given in Figure 8 and the new partitioning after refinement is given in Figure 9.

3 Experimental results

In this section, we present experimental results of the linear programming-based incremental partitioning presented in the previous section (we will use the term Incremental Graph Partitioner (IGP) to refer to this algorithm). The timings are given for partitions on a one-node and a multi-node CM-5. We have used two sets of adaptive meshes for our experiments. These meshes were generated using the DIME environment [11]. The initial mesh of the first set is given in Figure 11. The other incremental meshes are generated by making refinements in a localized area of the initial mesh. These meshes represent a sequence of refinements in a localized area. The number of nodes in the meshes are 7, 96, , 5, and 9, respectively. The partitioning of the initial mesh (of size 7 nodes) was determined using Recursive Spectral Bisection.
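Both linear programs of the previous section, load balancing and refinement, have the same shape: one variable per ordered pair of neighbouring partitions, an upper bound on each variable, and one balance equation per partition. As a rough sketch of that shared structure (not the dense simplex implementation used for the experiments), the Python below assembles the coefficient matrix and checks a candidate solution. The names `build_transfer_lp` and `is_feasible` and the dictionary input format are assumptions for illustration.

```python
def build_transfer_lp(num_parts, upper, rhs):
    """Assemble  sum_j (l_ij - l_ji) = rhs[i]  with  0 <= l_ij <= upper[(i, j)].

    upper: {(i, j): bound} for ordered pairs of distinct partitions
    rhs:   one value per partition: the surplus |B'(i)| minus the average load
           for load balancing, or 0 for refinement
    Returns (variables, A_eq, b_eq, bounds); column k of A_eq belongs to
    variables[k]. The objective is the plain sum of the variables.
    """
    variables = sorted(pair for pair, cap in upper.items() if cap > 0)
    A_eq = [[0] * len(variables) for _ in range(num_parts)]
    for col, (i, j) in enumerate(variables):
        A_eq[i][col] = 1    # l_ij vertices leave partition i ...
        A_eq[j][col] = -1   # ... and arrive in partition j
    bounds = [(0, upper[pair]) for pair in variables]
    return variables, A_eq, list(rhs), bounds


def is_feasible(x, A_eq, b_eq, bounds):
    """Check a candidate vector x against the bounds and balance equations."""
    in_bounds = all(lo <= xi <= hi for xi, (lo, hi) in zip(x, bounds))
    balanced = all(sum(a * xi for a, xi in zip(row, x)) == b
                   for row, b in zip(A_eq, b_eq))
    return in_bounds and balanced


# Hypothetical example: partitions of sizes 12, 8 and 10 (average load 10).
sizes = [12, 8, 10]
average = sum(sizes) / len(sizes)
upper = {(0, 1): 3, (1, 0): 3, (1, 2): 2, (2, 1): 2}
variables, A_eq, b_eq, bounds = build_transfer_lp(3, upper, [s - average for s in sizes])
print(variables)                                      # [(0, 1), (1, 0), (1, 2), (2, 1)]
print(is_feasible([2, 0, 0, 0], A_eq, b_eq, bounds))  # True
```

Moving two vertices from partition 0 to partition 1 balances this example. The returned lists are in the equality-constraint form that general-purpose LP solvers accept (for instance the `A_eq`, `b_eq`, and `bounds` arguments of `scipy.optimize.linprog`); for the refinement step, which maximizes the sum, the objective would be negated when using a minimizing solver. The number of variables and constraints depends only on the partitions and their adjacencies, not on the number of vertices.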
The RSB partitioning of the initial mesh was used by algorithm IGP to determine the partition of the first incremental mesh (of size 96). The repartitioning of the next set of refinements (with , 5, and 9 nodes, respectively) was achieved using the partitioning obtained by using the IGP for the previous mesh in the sequence. The results show that, even after multiple refinements, the quality of partitioning achieved is comparable to that achieved by recursive spectral bisection from scratch; thus this method can be used for repartitioning for several stages. The time required by repartitioning is about half of the time required for partitioning using RSB. The algorithm provides a speedup of around 5 to on a -node CM-5. Most of the time spent by our algorithm is in the solution of the linear programming formulation using the simplex method.

Initial graph (Figure 11)
Partitioner V E Total Max Min
SB 7 85 74 56 5

V = 96, E = 6
Partitioner Time-s Time-p Total Max Min
SB.7 7 56
IGP 4.75.68 747 55 4
IGPR 6.87.88 7 54 4

V = , E = 5
Partitioner Time-s Time-p Total Max Min
SB 4.5 7 56 4
IGP.6.7 75 54
IGPR 6.4.5 77 54

V = 5, E = 48
Partitioner Time-s Time-p Total Max Min
SB 4.96 76 57 4
IGP 5.89.9 757 56
IGPR 8..8 74 56

V = 9, E = 548
Partitioner Time-s Time-p Total Max Min
SB 8. 774 6 4
IGP 5.69.94 85 6 4
IGPR 8.4.6 779 59 4

Time unit in seconds. p - parallel timing on a -node CM-5. s - timing on a one-node CM-5. SB - Spectral Bisection. IGP - Incremental Graph Partitioner. IGPR - Incremental Graph Partitioner with Refinement.

Figure 10: Incremental graph partitioning using linear programming and its comparison with spectral bisection from scratch for meshes in Figure 11.

Figure 11: Test graph A: an irregular graph with 7 nodes and 85 edges. The refinement graph with 9 nodes and 548 edges.

Figure 12: A mesh with 66 nodes and 47 edges.

Figure 13: A refinement of the mesh in Figure 12 with 67 extra nodes.

The cost of the simplex method depends on the number of variables ($v$) and the number of constraints ($c$). Each iteration in the dense matrix formulation requires $O(vc)$ time. The values of $v$ and $c$ depend largely on the number of partitions and the number of edges between the partitions (corresponding to e and l as described in Section 2.3 and Section 2.4, respectively). The values of $v$ and $c$ for the formulation corresponding to performing the load-balancing step for the mesh in Figure 11 with V = 96 and E = 6 for partitions are 88 and 6, respectively. These costs are independent of the number of vertices in the mesh and depend on the number of partitions. Thus, for large meshes the performance should be much better. Our software currently implements the simplex method using a dense matrix formulation. Since the matrix is highly sparse, this cost can be substantially reduced by using a sparse representation. Clearly, the latter would be more difficult to parallelize. Another option is to use a multilevel approach and apply incremental partitioning recursively. We are currently exploring this approach. Since most of the time (even for large meshes) is spent on the solution of the linear programming problem using the simplex method, any improvements in the time required will have a major impact on the total time required for partitioning.

The next data set corresponds to a highly irregular mesh with 66 nodes and 47 edges. This data set was generated to study the effect of different amounts of new data added to the original mesh. Figures 14(b), 14(c), 14(d), and 14(e) correspond to meshes with 68, 9, 9, and 67 additional nodes over the mesh in Figure 12. The partitioning achieved by algorithm IGP for the mesh in Figure 13, using the partition of the mesh in Figure 12, is given in Figure 14. The number of stages required (by choosing an appropriate value of $\Delta$, as described in Section 2.3) varied across the four cases. (The number of stages was chosen by trial and error, but can be determined by the load imbalance.) It is worth noting that although the load imbalance created by the additional nodes was severe, the quality of partitioning achieved for each of the cases was close to that of applying Recursive Spectral Bisection from scratch. Further, the sequential time is at least an order of magnitude better than that of Recursive Spectral Bisection. The CM-5 implementation improved the time required by a factor of 5 to . The time required for repartitioning Figure 14(b) and Figure 14(c) is close to that required for the meshes in Figure 11. The timings for the meshes in Figure 14(d) and 14(e) are larger because they use multiple stages.

(a) Initial graph (Figure 12)
Partitioner V E Total Max Min
66 47 8 7 8

(b) V = 4, E = 65
Partitioner Time-s Time-p Total Max Min
SB 8.5 7 78 9
IGP.9. 9 86 84
IGPR 4.7.8 4 7 8

(c) V = 5, E = 888
Partitioner Time-s Time-p Total Max Min
SB 84.6 99 66 87
IGP 8.89.8 95 9 9
IGPR 9.. 6 6 85

(d) V = 95, E = 58
Partitioner Time-s Time-p Total Max Min
SB 85.5 57 69 94
IGP() 5.98.8 48 56 9
IGPR 4.86.76 9 9 85

(e) V = 88, E = 487
Partitioner Time-s Time-p Total Max Min
SB 94.8 58 58 94
IGP() 76.78.66 57
IGPR 89.48 4.9 7 7 96

Time unit in seconds. p - parallel timing on a -node CM-5. s - timing on a one-node CM-5. SB - Spectral Bisection. IGP - Incremental Graph Partitioner. IGPR - Incremental Graph Partitioner with Refinement.

Figure 14: Incremental graph partitioning using linear programming and its comparison with spectral bisection from scratch for meshes in Figure 12 and Figure 13.

The above results show that the IGP, at a fraction of the cost, can be effectively used for repartitioning to achieve solutions similar in quality to those obtained by applying recursive spectral bisection from scratch. Further, the algorithm can be parallelized effectively.

4 Conclusions

In this paper we have presented a novel linear programming-based formulation for solving incremental graph-partitioning problems. The quality of partitioning produced by our methods is close to that achieved by applying the best partitioning methods from scratch. Further, the time needed is a small fraction of the latter and our algorithms are inherently parallel. We believe the methods described in this paper are of critical importance to the parallelization of the adaptive and incremental problems described earlier.

References

[1] I. Angus, G. Fox, J. Kim, and D. Walker. Solving Problems on Concurrent Processors, volume 2. Prentice Hall, Englewood Cliffs, NJ, 1990.
[2] Alok Choudhary, Geoffrey C. Fox, Seema Hiranandani, Ken Kennedy, Charles Koelbel, Sanjay Ranka, and Joel Saltz. Software Support for Irregular and Loosely Synchronous Problems. In Proceedings of the Conference on High Performance Computing for Flight Vehicles, 1992. To appear.
[3] F. Ercal. Heuristic Approaches to Task Allocation for Parallel Computing. Ph.D. thesis, Ohio State University, 1988.
[4] G. C. Fox and W. Furmanski. Load Balancing Loosely Synchronous Problems with a Neural Network. 1988.
[5] G. C. Fox, M. Johnson, G. Lyzenga, S. Otto, J. Salmon, and D. Walker. Solving Problems on Concurrent Processors, volume 1. Prentice Hall, Englewood Cliffs, NJ, 1988.
[6] Geoffrey C. Fox. Graphical Approach to Load Balancing and Sparse Matrix Vector Multiplication on the Hypercube. 1988. M. Schultz, Ed., Springer-Verlag, Berlin.
[7] Harpal Maini, Kishan Mehrotra, Chilukuri Mohan, and Sanjay Ranka. Genetic Algorithms for Graph Partitioning and Incremental Graph Partitioning. Supercomputing '94.
[8] S. Nolting. Nonlinear Adaptive Finite Element Systems on Distributed Memory Computers. In Proceedings of European Distributed Memory Computing Conference, April 1991.
[9] A. Pothen, H. Simon, and K.-P. Liou. Partitioning Sparse Matrices with Eigenvectors of Graphs. SIAM Journal of Matrix Analysis and Application, 11(3), July 1990.
[10] H. Simon. Partitioning of Unstructured Mesh Problems for Parallel Processing. In Proceedings of the Conference on Parallel Methods on Large Scale Structural Analysis and Physics Applications. Pergamon Press, 1991.
[11] R.D. Williams. DIME: Distributed Irregular Mesh Environment. California Institute of Technology, February 1990.
[12] R.D. Williams. Performance of Dynamic Load-Balancing Algorithm for Unstructured Mesh Calculations. Concurrency: Practice and Experience, 3:457-481, 1991.