Parallel Algorithm for Multilevel Graph Partitioning and Sparse Matrix Ordering

1 Parallel Algorithm for Multilevel Graph Partitioning and Sparse Matrix Ordering. George Karypis and Vipin Kumar. Brian Shi, CSci /09/2017

2 Outline: Introduction; Graph Partitioning Problem; Multilevel Graph Partitioning Algorithm; Parallelizing the Multilevel Graph Partitioning Algorithm; Initial Partitioning Phase; Uncoarsening Phase; Parallel Refinement Algorithm; Parallel Multilevel Sparse Matrix Ordering Algorithm; Performance, Scaling Analysis, and Results

3 Graph Partitioning Problem. The graph partitioning problem is NP-complete. Definition: Let $G = (V, E)$ be a graph with $|V| = n$. A $p$-way partition of $G$ consists of subsets $V_1, \dots, V_p \subseteq V$ such that $V$ is the disjoint union of $V_1, \dots, V_p$, $|V_i| = n/p$ for each $i$, and the edge-cut (the number of edges whose endpoints lie in different subsets) is minimized.
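To make the objective concrete, here is a minimal sketch that evaluates the edge-cut and balance of a given partition. The graph (two triangles joined by one edge), the function names, and the dictionary-based representation are illustrative assumptions, not part of the slides.

```python
def edge_cut(edges, part):
    """Count edges whose endpoints lie in different parts."""
    return sum(1 for u, v in edges if part[u] != part[v])


def part_sizes(part, p):
    """Return the number of vertices assigned to each of the p parts."""
    sizes = [0] * p
    for q in part.values():
        sizes[q] += 1
    return sizes


# Hypothetical example: two triangles {0,1,2} and {3,4,5} joined by edge (2,3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
good = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}  # splits along the bridge
bad = {0: 0, 1: 1, 2: 0, 3: 1, 4: 0, 5: 1}   # same balance, many cut edges

print(edge_cut(edges, good), part_sizes(good, 2))
print(edge_cut(edges, bad), part_sizes(bad, 2))
```

Both partitions satisfy the balance condition, so the edge-cut alone distinguishes them.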

4 Multilevel Graph Partitioning Algorithm. Consists of three main phases. 1. Coarsening Phase: A matching of G is constructed and the matched edges are contracted. 2. Initial Partitioning Phase: A partition of the smaller, coarsened graph is computed. 3. Uncoarsening Phase: The partitioned graph is uncontracted and mapped back to the original graph.

5 Coarsening Phase. A sequence of smaller graphs $\{G_l\}$ is constructed by finding maximal matchings, starting from $G_0 = G$. Graph Matching: Let $G = (V, E)$ be an undirected graph; a matching of $G$ is a subset of edges $M \subseteq E$ such that no vertex in $V$ is incident to more than one edge in $M$. The endpoints of each matched edge are contracted into a single vertex. Matching schemes: Random Matching, Heavy Edge Matching.
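One coarsening step can be sketched serially as follows. This is a minimal illustration, assuming a symmetric weighted adjacency dictionary and a random visiting order; the function names and data layout are hypothetical, and the slides do not specify these details.

```python
import random


def heavy_edge_matching(adj, seed=0):
    """Greedy maximal matching that prefers the heaviest free incident edge.

    adj: {u: {v: weight}}, assumed symmetric. Returns match, where
    match[u] is u's partner, or u itself if u stays unmatched.
    """
    rng = random.Random(seed)
    order = list(adj)
    rng.shuffle(order)  # visit vertices in random order
    match = {}
    for u in order:
        if u in match:
            continue
        free = [(w, v) for v, w in adj[u].items() if v not in match and v != u]
        if free:
            _, v = max(free)  # heaviest edge to a still-unmatched neighbour
            match[u], match[v] = v, u
        else:
            match[u] = u
    return match


def contract(adj, match):
    """Collapse each matched pair into one coarse vertex, summing edge weights."""
    coarse_id = {u: min(u, match[u]) for u in adj}
    coarse = {c: {} for c in set(coarse_id.values())}
    for u, nbrs in adj.items():
        cu = coarse_id[u]
        for v, w in nbrs.items():
            cv = coarse_id[v]
            if cu != cv:  # the contracted edge itself disappears
                coarse[cu][cv] = coarse[cu].get(cv, 0) + w
    return coarse, coarse_id


# Hypothetical path graph 0-1-2-3 with a heavy middle edge.
adj = {0: {1: 1}, 1: {0: 1, 2: 5}, 2: {1: 5, 3: 1}, 3: {2: 1}}
match = heavy_edge_matching(adj, seed=1)
coarse, coarse_id = contract(adj, match)
```

Replacing `max(free)` with a random choice from `free` would give the random matching variant.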

6 Partitioning and Uncoarsening. 1. Initial Partitioning Phase: A partition of the smaller, coarsened graph is computed. 2. A serial algorithm is used: Greedy Graph Growing Partition (GGGP). 3. The partitioned graph is projected back to the original graph. 4. Refinement algorithms are used to reduce the edge-cut and improve load balance.

7 Parallelizing the Multilevel Graph Partitioning Algorithm. Assumption: p-way partitioning with p processors. Parallelism is exploited in the recursive nature of the bisection algorithm: p processors perform one bisection, then p/2 processors perform the bisection of each half. Goal: a parallel algorithm for graph bisection.

8 Coarsening Phase: Main Idea. Assume $p = 2^{2r}$ processors arranged in a 2-D array, and $(V_0, E_0) = (V, E)$. Distribute the vertices into $\sqrt{p}$ subsets: $V_0^0, V_0^1, \dots, V_0^{\sqrt{p}-1}$. Processor $P_{ij}$ contains the subset of edges in $E_0$ with incident vertices in $V_0^i$ and $V_0^j$.
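The 2-D layout can be sketched as a plain dictionary keyed by processor coordinates. This is a minimal illustration, assuming the processors form a square grid of side sqrt(p), a contiguous block assignment of vertices to subsets, and that each edge is stored in both orientations (as in a symmetric adjacency matrix); none of these choices is stated in the slides.

```python
import math


def distribute_edges(edges, n, p):
    """Assign each edge (u, v) to processor (owner(u), owner(v)) on a q x q grid."""
    q = math.isqrt(p)
    assert q * q == p, "p must be a perfect square"
    block = -(-n // q)  # ceil(n / q) vertices per subset (assumed block layout)

    def owner(v):
        return v // block

    grid = {(i, j): [] for i in range(q) for j in range(q)}
    for u, v in edges:
        grid[(owner(u), owner(v))].append((u, v))
        grid[(owner(v), owner(u))].append((v, u))
    return grid


# Hypothetical 8-vertex graph on p = 4 processors: subsets {0..3} and {4..7}.
grid = distribute_edges([(0, 1), (1, 5), (4, 6), (3, 7)], n=8, p=4)
for coords in sorted(grid):
    print(coords, grid[coords])
```

Diagonal processors hold the edges internal to one vertex subset, which is why the matchings can be computed there.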

10 Coarsening Phase. The edge matchings $M_0^i$ are computed on the diagonal processors. After completion, each $P_{ii}$ does two broadcasts, one along its row and one along its column, to distribute $M_0^i$. $E_1 = M_0 = \bigcup_i M_0^i$. Each $P_{ij}$ then contains the edges with incident vertices in $V_1^i$ and $V_1^j$. Once $G_k$ is coarse enough, the number of working processors is reduced by folding.

12 Initial Partitioning Phase. The coarsest graph is contained in a single processor, so it can be partitioned sequentially using GGGP. GGGP is run several times from different random starting vertices. The coarsest graph can be copied to multiple processors so that these trials run concurrently. The partition with the smallest edge-cut is kept.
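The multi-trial idea can be sketched serially as below. This is a simplified greedy growing heuristic written for illustration, assuming unit vertex weights and a gain defined as edge weight into the region minus edge weight out of it; it is not claimed to match the GGGP implementation used in the paper, and all names are hypothetical.

```python
import random


def cut_size(adj, side):
    """Total weight of edges whose endpoints are on different sides."""
    return sum(w for u in adj for v, w in adj[u].items()
               if u < v and side[u] != side[v])


def gggp_bisect(adj, trials=4, seed=0):
    """Grow a region from several random seeds; keep the smallest edge-cut."""
    rng = random.Random(seed)
    vertices = sorted(adj)
    half = len(vertices) // 2
    best = None
    for _ in range(trials):
        region = {rng.choice(vertices)}
        while len(region) < half:
            frontier = {v for u in region for v in adj[u] if v not in region}
            if not frontier:  # disconnected graph: fall back to any outside vertex
                frontier = set(vertices) - region

            def gain(v):
                return sum(w if x in region else -w for x, w in adj[v].items())

            region.add(max(sorted(frontier), key=gain))
        side = {v: 0 if v in region else 1 for v in vertices}
        cut = cut_size(adj, side)
        if best is None or cut < best[0]:
            best = (cut, side)
    return best


# Hypothetical example: two unit-weight triangles joined by the edge (2, 3).
adj = {0: {1: 1, 2: 1}, 1: {0: 1, 2: 1}, 2: {0: 1, 1: 1, 3: 1},
       3: {2: 1, 4: 1, 5: 1}, 4: {3: 1, 5: 1}, 5: {3: 1, 4: 1}}
cut, side = gggp_bisect(adj)
```

Because the trials are independent, each could run on a separate processor holding a copy of the coarsest graph, followed by a minimum reduction over the resulting cuts.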

13 Uncoarsening Phase. Coarse graphs are projected back to the original graph. Refinements are made at each step of the projection. Serial algorithms (Kernighan-Lin variants) are used while the coarse graph resides in one processor. The processor folding is then reversed.

14 Parallel Refinement Algorithm. Each $P_{ij}$ computes the local gain $lg_v$ for each $v \in V_0^j$. The total gain of $v$, $g_v$, is computed via a sum reduction along the columns. Each $P_{ii}$ selects a set of vertices $U_i \subset V_0^i$ with positive gain. $U_i$ is broadcast row- and column-wise. The local gains are recomputed and the process repeats. Balancing partitions: start vertex swaps from the heavier part of the partition; use a load-balancing iteration if there is more than 2% imbalance.
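The gain computation can be simulated serially as follows. This is a minimal sketch, assuming unit edge weights, a 2 x 2 processor grid stored as a dictionary, and a gain defined as (cut edges) minus (uncut edges) incident to a vertex; the column sum reduction is imitated by a loop rather than a real collective operation, and all names are hypothetical.

```python
def local_gains(block_edges, part):
    """Local gain lg_v on one processor: +1 per cut edge, -1 per uncut edge."""
    lg = {}
    for u, v in block_edges:
        lg[v] = lg.get(v, 0) + (1 if part[u] != part[v] else -1)
    return lg


def total_gains(grid, part, q):
    """Sum local gains over the rows i of each column j (the sum reduction)."""
    g = {}
    for j in range(q):
        for i in range(q):
            for v, val in local_gains(grid[(i, j)], part).items():
                g[v] = g.get(v, 0) + val
    return g


# Hypothetical layout: P_ij holds edges (u, v) with u in subset i, v in subset j.
grid = {
    (0, 0): [(0, 1), (1, 0)],
    (0, 1): [(1, 5), (3, 7)],
    (1, 0): [(5, 1), (7, 3)],
    (1, 1): [(4, 6), (6, 4)],
}
part = {0: 0, 1: 1, 2: 0, 3: 0, 4: 1, 5: 1, 6: 1, 7: 1}
g = total_gains(grid, part, q=2)
positive = sorted(v for v, val in g.items() if val > 0)
```

Vertices in `positive` are the candidates a diagonal processor would select into its set of vertices to move.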

15 Parallel Multilevel Sparse Matrix Ordering Algorithm. Assume a bisection is already constructed. Let A, B be the boundary vertices. A vertex separator needs to be constructed.

16 Parallel Vertex Cover Algorithm. Processors are arranged in a 2-D array. $A_i = A \cap V_0^i$ and $B_i = B \cap V_0^i$. Each $P_{ij}$ stores $A_i$ and $B_j$ and locally computes a minimal cover of the edges between $A_i$ and $B_j$. Denote this local cover by $A_{ij}^c \cup B_{ij}^c$, with $A_{ij}^c \subseteq A$ and $B_{ij}^c \subseteq B$. The union of the local covers $A_{ij}^c \cup B_{ij}^c$ across all processors forms a cover.

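17 Parallel Vertex Cover Algorithm: Constructing a Minimal Cover. Broadcast $B_j^c = \bigcup_i B_{ij}^c$ to the processors in the same column. Each $P_{ij}$ removes from $A_{ij}^c$ the vertices whose edges are all covered by $B_j^c$; the reduced set is again denoted $A_{ij}^c$. Broadcast $A_i^c = \bigcup_j A_{ij}^c$ to the processors in the same row. Each $P_{ij}$ then removes vertices from $B_j^c$; the union of the remaining vertices is again denoted $B_j^c$. A minimal cover is then obtained as $S = \left( \bigcup_{i=0}^{\sqrt{p}-1} A_i^c \right) \cup \left( \bigcup_{j=0}^{\sqrt{p}-1} B_j^c \right)$.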
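The two pruning passes can be sketched serially, ignoring the processor grid and the broadcasts. This is a minimal illustration, assuming the crossing edges are given as (a, b) pairs and that the initial sets already form a cover; the function name and the example data are hypothetical.

```python
def prune_cover(edges, a_cover, b_cover):
    """Shrink a vertex cover of the crossing edges to a minimal one.

    edges: (a, b) pairs with a on side A and b on side B.
    a_cover, b_cover: sets whose union covers every edge.
    """
    # Pass 1: keep a only if some edge at a is NOT covered by b_cover.
    a_kept = {a for a in a_cover
              if any(x == a and b not in b_cover for x, b in edges)}
    # Pass 2: keep b only if some edge at b is NOT covered by the kept A side.
    b_kept = {b for b in b_cover
              if any(y == b and a not in a_kept for a, y in edges)}
    return a_kept, b_kept


# Hypothetical boundary: vertices 0, 1, 2 on side A; 10, 11 on side B.
edges = [(0, 10), (1, 10), (2, 11)]
a_kept, b_kept = prune_cover(edges, a_cover={0, 1, 2}, b_cover={10})
separator = a_kept | b_kept
```

After both passes every remaining vertex covers at least one edge that nothing else covers, so no vertex can be dropped without uncovering an edge.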

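21 Scaling Analysis. Let $n = |V_0|$ and $m = |E_0|$. Time to broadcast: $T_{\text{broadcast}} = O\left(\frac{n}{\sqrt{p}}\right)$. Time for bisection: $T_{\text{bisection}} = O\left(\frac{m}{p}\right) + O\left(\frac{n}{\sqrt{p}}\right)$. Overall run time of the p-way partitioning algorithm: $T_{\text{partition}} = \left( O\left(\frac{m}{p}\right) + O\left(\frac{n}{\sqrt{p}}\right) \right) \log(p)$.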
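To get a feel for how the two terms trade off, the sketch below evaluates a simple cost model of the form (m/p + n/sqrt(p)) log p. It assumes that reading of the run-time expression, drops all constants hidden by the O-notation, and uses a base-2 logarithm; the function names and input sizes are hypothetical, so the numbers are only relative indicators.

```python
import math


def bisection_cost(m, n, p):
    """Model cost of one parallel bisection: m/p work plus n/sqrt(p) communication."""
    return m / p + n / math.sqrt(p)


def partition_cost(m, n, p):
    """Model cost of p-way partitioning: one bisection cost per level, log2(p) levels."""
    return bisection_cost(m, n, p) * math.log2(p)


# Hypothetical graph sizes; compare how the model behaves as p grows.
for p in (4, 16, 64):
    print(p, partition_cost(m=1600, n=400, p=p))
```

In this model the n/sqrt(p) communication term shrinks more slowly than the m/p work term, so it dominates at larger processor counts.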

22 Test Cases

23 Results

24 References
G. Karypis and V. Kumar, A parallel algorithm for multilevel graph partitioning and sparse matrix ordering, Journal of Parallel and Distributed Computing, vol. 48, no. 1, p ,
G. Karypis and V. Kumar, Multilevel graph partitioning schemes, ICPP (3), pp ,
A. Grama, G. Karypis, V. Kumar, and A. Gupta, Introduction to Parallel Computing. Addison-Wesley, second ed., 2003.


More information

Mathematics and Computer Science

Mathematics and Computer Science Technical Report TR-2006-010 Revisiting hypergraph models for sparse matrix decomposition by Cevdet Aykanat, Bora Ucar Mathematics and Computer Science EMORY UNIVERSITY REVISITING HYPERGRAPH MODELS FOR

More information

Chapter 9 Graph Algorithms

Chapter 9 Graph Algorithms Chapter 9 Graph Algorithms 2 Introduction graph theory useful in practice represent many real-life problems can be slow if not careful with data structures 3 Definitions an undirected graph G = (V, E)

More information

Graph Partitioning for Scalable Distributed Graph Computations

Graph Partitioning for Scalable Distributed Graph Computations Graph Partitioning for Scalable Distributed Graph Computations Aydın Buluç ABuluc@lbl.gov Kamesh Madduri madduri@cse.psu.edu 10 th DIMACS Implementation Challenge, Graph Partitioning and Graph Clustering

More information

Paths. Path is a sequence of edges that begins at a vertex of a graph and travels from vertex to vertex along edges of the graph.

Paths. Path is a sequence of edges that begins at a vertex of a graph and travels from vertex to vertex along edges of the graph. Paths Path is a sequence of edges that begins at a vertex of a graph and travels from vertex to vertex along edges of the graph. Formal Definition of a Path (Undirected) Let n be a nonnegative integer

More information

Principles of Parallel Algorithm Design: Concurrency and Mapping

Principles of Parallel Algorithm Design: Concurrency and Mapping Principles of Parallel Algorithm Design: Concurrency and Mapping John Mellor-Crummey Department of Computer Science Rice University johnmc@rice.edu COMP 422/534 Lecture 3 28 August 2018 Last Thursday Introduction

More information

Graph partitioning. Prof. Richard Vuduc Georgia Institute of Technology CSE/CS 8803 PNA: Parallel Numerical Algorithms [L.27] Tuesday, April 22, 2008

Graph partitioning. Prof. Richard Vuduc Georgia Institute of Technology CSE/CS 8803 PNA: Parallel Numerical Algorithms [L.27] Tuesday, April 22, 2008 Graph partitioning Prof. Richard Vuduc Georgia Institute of Technology CSE/CS 8803 PNA: Parallel Numerical Algorithms [L.27] Tuesday, April 22, 2008 1 Today s sources CS 194/267 at UCB (Yelick/Demmel)

More information

Multi-level sequential circuit partitioning for test vector generation for low power test in VLSI

Multi-level sequential circuit partitioning for test vector generation for low power test in VLSI MultiCraft International Journal of Engineering, Science and Technology INTERNATIONAL JOURNAL OF ENGINEERING, SCIENCE AND TECHNOLOGY www.ijest-ng.com 2010 MultiCraft Limited. All rights reserved Multi-level

More information

K-Ways Partitioning of Polyhedral Process Networks: a Multi-Level Approach

K-Ways Partitioning of Polyhedral Process Networks: a Multi-Level Approach 2015 IEEE International Parallel and Distributed Processing Symposium Workshops K-Ways Partitioning of Polyhedral Process Networks: a Multi-Level Approach Riccardo Cattaneo, Mahdi Moradmand, Donatella

More information

On Partitioning FEM Graphs using Diffusion

On Partitioning FEM Graphs using Diffusion On Partitioning FEM Graphs using Diffusion Stefan Schamberger Universität Paderborn, Fakultät für Elektrotechnik, Informatik und Mathematik Fürstenallee 11, D-33102 Paderborn email: schaum@uni-paderborn.de

More information

Graph Partitioning Algorithms for Distributing Workloads of Parallel Computations

Graph Partitioning Algorithms for Distributing Workloads of Parallel Computations Graph Partitioning Algorithms for Distributing Workloads of Parallel Computations Bradford L. Chamberlain October 13, 1998 Abstract This paper surveys graph partitioning algorithms used for parallel computing,

More information

Chapter 9 Graph Algorithms

Chapter 9 Graph Algorithms Chapter 9 Graph Algorithms 2 Introduction graph theory useful in practice represent many real-life problems can be if not careful with data structures 3 Definitions an undirected graph G = (V, E) is a

More information

Application of Fusion-Fission to the multi-way graph partitioning problem

Application of Fusion-Fission to the multi-way graph partitioning problem Application of Fusion-Fission to the multi-way graph partitioning problem Charles-Edmond Bichot Laboratoire d Optimisation Globale, École Nationale de l Aviation Civile/Direction des Services de la Navigation

More information

CMPSCI 311: Introduction to Algorithms Practice Final Exam

CMPSCI 311: Introduction to Algorithms Practice Final Exam CMPSCI 311: Introduction to Algorithms Practice Final Exam Name: ID: Instructions: Answer the questions directly on the exam pages. Show all your work for each question. Providing more detail including

More information

Evolving Multi-level Graph Partitioning Algorithms

Evolving Multi-level Graph Partitioning Algorithms Evolving Multi-level Graph Partitioning Algorithms Aaron S. Pope, Daniel R. Tauritz and Alexander D. Kent Department of Computer Science, Missouri University of Science and Technology, Rolla, Missouri

More information

Technical Report. OSUBMI-TR-2009-n02/ BU-CE Hypergraph Partitioning-Based Fill-Reducing Ordering

Technical Report. OSUBMI-TR-2009-n02/ BU-CE Hypergraph Partitioning-Based Fill-Reducing Ordering Technical Report OSUBMI-TR-2009-n02/ BU-CE-0904 Hypergraph Partitioning-Based Fill-Reducing Ordering Ümit V. Çatalyürek, Cevdet Aykanat and Enver Kayaaslan April 2009 The Ohio State University Department

More information

CSE 431/531: Algorithm Analysis and Design (Spring 2018) Greedy Algorithms. Lecturer: Shi Li

CSE 431/531: Algorithm Analysis and Design (Spring 2018) Greedy Algorithms. Lecturer: Shi Li CSE 431/531: Algorithm Analysis and Design (Spring 2018) Greedy Algorithms Lecturer: Shi Li Department of Computer Science and Engineering University at Buffalo Main Goal of Algorithm Design Design fast

More information

Parallel Graph Partitioning on Multicore Architectures

Parallel Graph Partitioning on Multicore Architectures Parallel Graph Partitioning on Multicore Architectures Xin Sui 1, Donald Nguyen 1, Martin Burtscher 2, and Keshav Pingali 1,3 1 Department of Computer Science, University of Texas at Austin 2 Department

More information

ENCAPSULATING MULTIPLE COMMUNICATION-COST METRICS IN PARTITIONING SPARSE RECTANGULAR MATRICES FOR PARALLEL MATRIX-VECTOR MULTIPLIES

ENCAPSULATING MULTIPLE COMMUNICATION-COST METRICS IN PARTITIONING SPARSE RECTANGULAR MATRICES FOR PARALLEL MATRIX-VECTOR MULTIPLIES SIAM J. SCI. COMPUT. Vol. 25, No. 6, pp. 1837 1859 c 2004 Society for Industrial and Applied Mathematics ENCAPSULATING MULTIPLE COMMUNICATION-COST METRICS IN PARTITIONING SPARSE RECTANGULAR MATRICES FOR

More information

The PRAM model. A. V. Gerbessiotis CIS 485/Spring 1999 Handout 2 Week 2

The PRAM model. A. V. Gerbessiotis CIS 485/Spring 1999 Handout 2 Week 2 The PRAM model A. V. Gerbessiotis CIS 485/Spring 1999 Handout 2 Week 2 Introduction The Parallel Random Access Machine (PRAM) is one of the simplest ways to model a parallel computer. A PRAM consists of

More information