PuLP: Scalable Multi-Objective Multi-Constraint Partitioning for Small-World Networks


PuLP: Scalable Multi-Objective Multi-Constraint Partitioning for Small-World Networks. George M. Slota 1,2, Kamesh Madduri 2, Sivasankaran Rajamanickam 1. 1 Sandia National Laboratories, 2 The Pennsylvania State University. gslota@psu.edu, madduri@cse.psu.edu, srajama@sandia.gov. BigData14, 28 Oct 2014. 1 / 37

Highlights We present PuLP, a multi-constraint, multi-objective partitioner designed for small-world graphs, with shared-memory parallelism. PuLP demonstrates an average speedup of 14.5× relative to state-of-the-art partitioners. PuLP requires 8–39× less memory than state-of-the-art partitioners. PuLP produces partitions with comparable or better quality than state-of-the-art partitioners for small-world graphs. 2 / 37

Overview PuLP: Partitioning Using Label Propagation Overview Graph partitioning formulation Label propagation Using label propagation for partitioning PuLP Algorithm Degree-weighted label prop Label propagation for balancing constraints and minimizing objectives Label propagation for iterative refinement Results Performance comparisons with other partitioners Partitioning quality with different objectives 3 / 37

Overview Partitioning Graph Partitioning: Given a graph G(V, E) and p processes or tasks, partition V into p disjoint subsets and assign each task one subset of vertices along with their incident edges from G. Balance constraints: (weighted) vertices per part, (weighted) edges per part. Quality metrics: edge cut, communication volume, maximal per-part edge cut. We consider: balancing edges and vertices per part; minimizing edge cut (EC) and maximal per-part edge cut (EC_max). 4 / 37

Overview Partitioning - Objectives and Constraints Lots of graph algorithms follow a certain iterative model (BFS, SSSP, FASCIA subgraph counting (Slota and Madduri 2014)): computation, synchronization, communication, synchronization, computation, etc. Computational load is proportional to the vertices and edges per part; communication load is proportional to the total edge cut and the max per-part cut. We want to minimize the maximal time among tasks for each computation/communication stage. 5 / 37
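
To make the cost model concrete, here is a minimal sketch (not from the talk; the `iteration_time` helper and the weights `alpha` and `beta` are illustrative assumptions): each task's time is its computation cost plus its communication cost, and the iteration lasts as long as the slowest task, which is why the max per-part cut matters even when the total cut is fixed.

```python
# Hypothetical per-iteration cost model motivating PuLP's objectives.
# Each task's time = alpha * (its vertices + edges) + beta * (its cut edges);
# the iteration takes as long as the slowest task.

def iteration_time(parts, alpha=1.0, beta=2.0):
    """parts: list of (num_vertices, num_edges, cut_edges) per task."""
    return max(alpha * (nv + ne) + beta * cut for nv, ne, cut in parts)

# Same total cut (200) in both partitions, but one task with a large
# per-part cut dominates the iteration time:
balanced = [(100, 400, 50)] * 4
skewed = [(100, 400, 10)] * 3 + [(100, 400, 170)]
assert sum(p[2] for p in balanced) == sum(p[2] for p in skewed)
assert iteration_time(skewed) > iteration_time(balanced)
```

This is why the talk treats EC_max as a separate objective: minimizing total cut alone can still leave one task with disproportionate communication.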

Overview Partitioning - Balance Constraints Balance vertices and edges:

    (1 - ɛ_l) |V| / p ≤ |V(π_i)| ≤ (1 + ɛ_u) |V| / p   (1)
    |E(π_i)| ≤ (1 + η_u) |E| / p                       (2)

ɛ_l and ɛ_u: lower and upper vertex imbalance ratios. η_u: upper edge imbalance ratio. V(π_i): set of vertices in part π_i. E(π_i): set of edges with both endpoints in part π_i. 6 / 37
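
Constraints (1) and (2) translate directly into a checking routine; the `satisfies_constraints` helper and its example numbers below are illustrative, not from the talk:

```python
def satisfies_constraints(n_verts, n_edges, parts, eps_l, eps_u, eta_u):
    """parts: list of (|V(pi_i)|, |E(pi_i)|) per part. Returns True iff
    every part meets the vertex bounds (1) and the edge upper bound (2)."""
    p = len(parts)
    v_lo = (1 - eps_l) * n_verts / p   # lower vertex bound
    v_hi = (1 + eps_u) * n_verts / p   # upper vertex bound
    e_hi = (1 + eta_u) * n_edges / p   # upper edge bound
    return all(v_lo <= nv <= v_hi and ne <= e_hi for nv, ne in parts)

# A graph with 100 vertices and 400 intra-part edges, split four ways:
parts = [(26, 105), (25, 100), (24, 95), (25, 100)]
assert satisfies_constraints(100, 400, parts, eps_l=0.1, eps_u=0.1, eta_u=0.1)
# Tightening the vertex imbalance ratios to 1% rejects the same partition:
assert not satisfies_constraints(100, 400, parts, eps_l=0.01, eps_u=0.01, eta_u=0.1)
```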

Overview Partitioning - Objectives Given a partition Π, the set of cut edges C(G, Π) and the cut edges per part C(G, π_k) are

    C(G, Π) = { (u, v) ∈ E : Π(u) ≠ Π(v) }                (3)
    C(G, π_k) = { (u, v) ∈ C(G, Π) : u ∈ π_k ∨ v ∈ π_k }  (4)

Our partitioning problem is then to minimize the total edge cut EC and the max per-part edge cut EC_max:

    EC(G, Π) = |C(G, Π)|                  (5)
    EC_max(G, Π) = max_k |C(G, π_k)|      (6)

7 / 37
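
Definitions (3)-(6) can be computed in one pass over the edge list; this `edge_cut_metrics` helper is an illustrative sketch (note that by Eq. (4) a cut edge counts toward both parts it touches):

```python
from collections import Counter

def edge_cut_metrics(edges, part):
    """edges: iterable of (u, v) pairs; part: dict vertex -> part id.
    Returns (EC, EC_max) as defined in Eq. (3)-(6)."""
    per_part = Counter()
    ec = 0
    for u, v in edges:
        if part[u] != part[v]:
            ec += 1                   # edge belongs to C(G, Pi), Eq. (3)
            per_part[part[u]] += 1    # counted for both endpoint parts,
            per_part[part[v]] += 1    # per Eq. (4)
    ec_max = max(per_part.values(), default=0)
    return ec, ec_max

edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
part = {0: 0, 1: 0, 2: 1, 3: 1}
# Cut edges are (1,2), (3,0), (0,2), so EC = 3; here every cut edge
# touches both parts, so EC_max = 3 as well.
assert edge_cut_metrics(edges, part) == (3, 3)
```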

Overview Partitioning - HPC Approaches (Par)METIS (Karypis et al.), PT-SCOTCH (Pellegrini et al.), Chaco (Hendrickson et al.), etc. Multilevel methods: coarsen the input graph in several iterative steps; at the coarsest level, partition the graph via local methods following the balance constraints and quality objectives; then iteratively uncoarsen the graph, refining the partitioning. Problem 1: designed for traditional HPC scientific problems (e.g., meshes), so they support limited balance constraints and quality objectives. Problem 2: the multilevel approach brings high memory requirements, can run slowly, and can lack scalability. 8 / 37

Overview Label Propagation Label propagation: randomly initialize a graph with some p labels, then iteratively assign each vertex the label held by the most of its neighbors to generate clusters (Raghavan et al. 2007). Clustering algorithm: dense clusters hold the same label. Fast: each iteration is O(n + m), usually run for a fixed iteration count (it doesn't necessarily converge). Naïvely parallel: only per-vertex label updates. Observation: possible applications for large-scale small-world graph partitioning. 9 / 37
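
A minimal sketch of plain label propagation as described (fixed iteration count, per-vertex plurality updates); the function name and adjacency-dict representation are illustrative assumptions, not PuLP's actual code:

```python
import random
from collections import Counter

def label_propagation(adj, p, iters=10, seed=1):
    """adj: dict vertex -> list of neighbors. Assigns each vertex one of
    p random initial labels, then repeatedly adopts the most frequent
    label among its neighbors. Runs a fixed number of sweeps rather than
    waiting for convergence, as in the talk."""
    rng = random.Random(seed)
    label = {v: rng.randrange(p) for v in adj}
    for _ in range(iters):
        for v in adj:  # each sweep is O(n + m)
            counts = Counter(label[u] for u in adj[v])
            if counts:
                label[v] = counts.most_common(1)[0][0]
    return label

# Two triangles joined by a single edge, a toy small-world-ish input:
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
labels = label_propagation(adj, p=2)
assert set(labels) == set(adj) and all(l in (0, 1) for l in labels.values())
```

Note that nothing here bounds cluster sizes, which is exactly the balance problem PuLP layers on top.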

Overview Partitioning - Big Data Approaches Methods designed for small-world graphs (e.g., social networks and web graphs) exploit label propagation/clustering for partitioning. Multilevel methods use label propagation to coarsen the graph (Wang et al. 2014, Meyerhenke et al. 2014); single-level methods use label propagation to directly create the partitioning (Ugander and Backstrom 2013, Vaquero et al. 2013). Problem 1: multilevel methods can still lack scalability and might also require running a traditional partitioner at the coarsest level. Problem 2: single-level methods can produce sub-optimal partition quality. 10 / 37

Overview PuLP PuLP: Partitioning Using Label Propagation. Utilize label propagation for: vertex-balanced partitions, minimizing edge cut (PuLP); vertex- and edge-balanced partitions, minimizing edge cut (PuLP-M); vertex- and edge-balanced partitions, minimizing edge cut and maximal per-part edge cut (PuLP-MM); or any combination of the above: multi-objective, multi-constraint. 11 / 37

Algorithms Primary Algorithm Overview PuLP-MM Algorithm. Constraint 1: balance vertices; Constraint 2: balance edges. Objective 1: minimize edge cut; Objective 2: minimize per-part edge cut. Pseudocode gives default iteration counts.

    Initialize p random partitions
    Execute 3 iterations of degree-weighted label propagation (LP)
    for k1 = 1 iterations do
        for k2 = 3 iterations do
            Balance partitions with 5 LP iterations to satisfy constraint 1
            Refine partitions with 10 FM iterations to minimize objective 1
        for k3 = 3 iterations do
            Balance partitions with 2 LP iterations to satisfy constraint 2
                and minimize objective 2 with 5 FM iterations
            Refine partitions with 10 FM iterations to minimize objective 1

12 / 37

Algorithms Primary Algorithm Overview

    Initialize p random partitions
    Execute degree-weighted label propagation (LP)
    for k1 iterations do
        for k2 iterations do
            Balance partitions with LP to satisfy vertex constraint
            Refine partitions with FM to minimize edge cut
        for k3 iterations do
            Balance partitions with LP to satisfy edge constraint
                and minimize max per-part cut
            Refine partitions with FM to minimize edge cut

13 / 37

Algorithms Primary Algorithm Overview Randomly initialize p partitions (p = 4) Network shown is the Infectious network dataset from KONECT (http://konect.uni-koblenz.de/) 14 / 37

Algorithms Primary Algorithm Overview After random initialization, we then perform label propagation to create partitions. Initial observations: partitions are unbalanced; for high p, some partitions end up empty. Edge cut is good, but can be better. PuLP solutions: impose loose balance constraints and explicitly refine later; use degree weightings to cluster around high-degree vertices, letting low-degree vertices form the boundary between partitions. 15 / 37
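
The degree-weighting idea can be sketched as a single update step in which neighbor labels vote with weight proportional to the neighbor's degree; the `degree_weighted_choice` helper below is an illustrative assumption, not PuLP's actual implementation:

```python
from collections import defaultdict

def degree_weighted_choice(v, adj, label):
    """One degree-weighted label propagation update for vertex v:
    each neighbor's label votes with weight equal to that neighbor's
    degree, so high-degree vertices pull their neighborhoods into the
    same part and low-degree vertices end up on part boundaries."""
    votes = defaultdict(float)
    for u in adj[v]:
        votes[label[u]] += len(adj[u])  # weight = degree of the neighbor
    return max(votes, key=votes.get) if votes else label[v]

# Star around high-degree vertex 0 (label 0), with a short tail 3-4:
adj = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0, 4], 4: [3]}
label = {0: 0, 1: 1, 2: 1, 3: 0, 4: 1}
# Vertex 4's only neighbor is vertex 3 (degree 2, label 0):
assert degree_weighted_choice(4, adj, label) == 0
# Vertex 3 hears label 0 with weight deg(0)=3 vs. label 1 with weight deg(4)=1:
assert degree_weighted_choice(3, adj, label) == 0
```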


Algorithms Primary Algorithm Overview Part assignment after random initialization. Network shown is the Infectious network dataset from KONECT (http://konect.uni-koblenz.de/) 17 / 37

Algorithms Primary Algorithm Overview Part assignment after degree-weighted label propagation. Network shown is the Infectious network dataset from KONECT (http://konect.uni-koblenz.de/) 18 / 37

Algorithms Primary Algorithm Overview After label propagation, we balance vertices among partitions and minimize edge cut (baseline PuLP ends here). Observations: partitions are still unbalanced in terms of edges; edge cut is good, but max per-part cut isn't necessarily. PuLP-M and PuLP-MM solutions: maintain vertex balance while explicitly balancing edges; alternate between minimizing total edge cut and max per-part cut (in PuLP-MM; PuLP-M only minimizes total edge cut). 19 / 37


Algorithms Primary Algorithm Overview Part assignment after balancing for vertices and minimizing edge cut. Network shown is the Infectious network dataset from KONECT (http://konect.uni-koblenz.de/) 22 / 37


Algorithms Primary Algorithm Overview Part assignment after balancing for edges and minimizing total edge cut and max per-part edge cut. Network shown is the Infectious network dataset from KONECT (http://konect.uni-koblenz.de/) 25 / 37

Results Test Environment and Graphs Test system: Compton, Intel Xeon E5-2670 (Sandy Bridge), dual-socket, 16 cores, 64 GB memory. Test graphs: LAW graphs from UF Sparse Matrix, SNAP, MPI, Koblenz; real (one R-MAT), small-world, 60 K–70 M vertices, 275 K–2 B edges. Test algorithms: METIS (single constraint, single objective), METIS-M (multi constraint, single objective), ParMETIS (METIS-M running in parallel), KaFFPa (single constraint, single objective), PuLP (single constraint, single objective), PuLP-M (multi constraint, single objective), PuLP-MM (multi constraint, multi objective). Metrics: 2–128 partitions, serial and parallel running times, memory utilization, edge cut, max per-part edge cut. 26 / 37

Results Running Times - Serial (top), Parallel (bottom) In serial, PuLP-MM runs 1.7× faster (geometric mean) than the next fastest. In parallel, PuLP-MM runs 14.5× faster (geometric mean) than the next fastest (ParMETIS times are the fastest of 1 to 256 cores). [Figures: serial and parallel running times vs. number of partitions (2–128) on LiveJournal, R-MAT, and Twitter, comparing PuLP, PuLP-M, and PuLP-MM against METIS, METIS-M, and KaFFPa-FS (serial) and ParMETIS (parallel).] 27 / 37

Results Memory utilization for 128 partitions PuLP utilizes minimal memory, O(n), 8–39× less than other partitioners. Savings are mostly from avoiding a multilevel approach.

    Network      METIS-M   KaFFPa   PuLP-MM   Graph Size   Improv.
    LiveJournal  7.2 GB    5.0 GB   0.44 GB   0.33 GB      21×
    Orkut        21 GB     13 GB    0.99 GB   0.88 GB      23×
    R-MAT        42 GB     -        1.2 GB    1.02 GB      35×
    DBpedia      46 GB     -        2.8 GB    1.6 GB       28×
    WikiLinks    103 GB    42 GB    5.3 GB    4.1 GB       25×
    sk-2005      121 GB    -        16 GB     13.7 GB      8×
    Twitter      487 GB    -        14 GB     12.2 GB      39×

28 / 37

Results Performance - Edge Cut and Edge Cut Max PuLP-M produces better edge cut than METIS-M over most graphs. PuLP-MM produces better max per-part edge cut than METIS-M over most graphs. [Figures: edge cut ratio and max per-part cut ratio vs. number of partitions (2–128) on LiveJournal, R-MAT, and Twitter for PuLP-M, PuLP-MM, and METIS-M.] 29 / 37

Results Balanced communication uk-2005 graph from LAW, METIS-M (left) vs. PuLP-MM (right). Blue: low comm; white: average comm; red: high comm. PuLP reduces the max inter-part communication requirements and balances the total communication load across all tasks. [Figure: 16×16 part-to-part communication heatmaps for the two partitioners.] 30 / 37

Future Work Explore techniques for avoiding local minima, such as simulated annealing. Further parallelize in distributed environments for massive-scale graphs. Demonstrate the performance of PuLP partitions within graph analytics. Explore tradeoffs and interactions among the various parameters and iteration counts. 31 / 37

Conclusions We presented PuLP, a multi-constraint, multi-objective partitioner designed for small-world graphs, with shared-memory parallelism. PuLP demonstrates an average speedup of 14.5× relative to state-of-the-art partitioners. PuLP requires 8–39× less memory than state-of-the-art partitioners. PuLP produces partitions with comparable or better quality than METIS/ParMETIS for small-world graphs. 32 / 37

Conclusions Questions? 32 / 37

Acknowledgments DOE Office of Science through the FASTMath SciDAC Institute. Sandia National Laboratories is a multi-program laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-AC04-94AL85000. NSF grants ACI-1253881 and OCI-0821527. Used NERSC hardware for generating partitionings, supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231. 33 / 37

Backup slides 34 / 37

Results Running Times - Serial (top), Parallel (bottom) PuLP is faster than the others over most tests in serial. In parallel, PuLP is always faster than the others, running 14.5× faster (geometric mean). [Figures: serial and parallel running times vs. number of partitions (2–128) on LiveJournal, Orkut, R-MAT, DBpedia, WikiLinks, sk-2005, and Twitter, comparing PuLP variants against METIS, METIS-M, and KaFFPa-FS (serial) and ParMETIS (parallel).] 35 / 37

Results Memory utilization for 128 partitions PuLP utilizes minimal memory, O(n). Savings are mostly from avoiding a multilevel approach.

    Network      METIS-M   KaFFPa   PuLP-MM   Graph Size   Improv.
    LiveJournal  7.2 GB    5.0 GB   0.44 GB   0.33 GB      21×
    Orkut        21 GB     13 GB    0.99 GB   0.88 GB      23×
    R-MAT        42 GB     -        1.2 GB    1.02 GB      35×
    DBpedia      46 GB     -        2.8 GB    1.6 GB       28×
    WikiLinks    103 GB    42 GB    5.3 GB    4.1 GB       25×
    sk-2005      121 GB    -        16 GB     13.7 GB      8×
    Twitter      487 GB    -        14 GB     12.2 GB      39×

36 / 37

Results Performance - Edge Cut and Edge Cut Max PuLP-M produces better edge cut than METIS-M over most graphs. PuLP-MM produces better max edge cut than METIS-M over most graphs. Taken together, these demonstrate the tradeoff of multi-objective partitioning. [Figures: edge cut ratio and max per-part cut ratio vs. number of partitions (2–128) on LiveJournal, Orkut, R-MAT, DBpedia, WikiLinks, sk-2005, and Twitter for PuLP-M, PuLP-MM, and METIS-M.] 37 / 37

Results Performance - Edge Cut and Edge Cut Max PuLP-M produces better edge cut than METIS-M over most graphs. PuLP-MM produces better max edge cut than METIS-M over most graphs. Taken together, these demonstrate the tradeoff of multi-objective partitioning. [Figure: edge cut ratio vs. number of partitions (2–128) across all Lab for Web Algorithmics graphs: Amazon, Arabic, CNR, DBLP, Enron, eu, Hollywood, in, indochina, it, LJournal, sk-2005, uk-2002, uk-2005, webbase.] 37 / 37

Results Performance - Edge Cut and Edge Cut Max PuLP-M produces better edge cut than METIS-M over most graphs. PuLP-MM produces better max edge cut than METIS-M over most graphs. Taken together, these demonstrate the tradeoff of multi-objective partitioning. [Figure: max per-part cut ratio vs. number of partitions (2–128) across all Lab for Web Algorithmics graphs: Amazon, Arabic, CNR, DBLP, Enron, eu, Hollywood, in, indochina, it, LJournal, sk-2005, uk-2002, uk-2005, webbase.] 37 / 37