Lecture 13: Minimum Spanning Trees Steven Skiena

Department of Computer Science, State University of New York, Stony Brook, NY 11794-4400 http://www.cs.stonybrook.edu/~skiena

Problem of the Day: Your job is to arrange n rambunctious children in a straight line, facing front. You are given a list of m statements of the form "i hates j". If i hates j, then you do not want to put i somewhere behind j, because then i is capable of throwing something at j. 1. Give an algorithm that orders the line (or says that it is not possible) in O(m + n) time.

2. Suppose instead you want to arrange the children in rows, such that if i hates j then i must be in a lower numbered row than j. Give an efficient algorithm to find the minimum number of rows needed, if it is possible.

Weighted Graph Algorithms: Beyond DFS/BFS exists an alternate universe of algorithms for edge-weighted graphs. Our adjacency list representation quietly supported these graphs:

typedef struct edgenode {
    int y;                     /* adjacent vertex */
    int weight;                /* edge weight */
    struct edgenode *next;     /* next edge in list */
} edgenode;

Minimum Spanning Trees: A tree is a connected graph with no cycles. A spanning tree is a subgraph of G which has the same set of vertices as G and is a tree. A minimum spanning tree of a weighted graph G is a spanning tree of G whose edges sum to minimum weight. There can be more than one minimum spanning tree in a graph: consider a graph with identical weight edges.

Find the Minimum Spanning Tree: (figure: three example graphs (a), (b), (c))

Why Minimum Spanning Trees? The minimum spanning tree problem has a long history: the first algorithm dates back to 1926! MST is taught in algorithm courses because: (1) it arises in many graph applications, (2) it is a problem where the greedy algorithm always gives the optimal answer, and (3) clever data structures are necessary to make it work efficiently. Greedy algorithms decide what to do next by selecting the best local option from all available choices.

Applications of Minimum Spanning Trees Minimum spanning trees are useful in constructing networks, by describing the way to connect a set of sites using the smallest total amount of wire. Minimum spanning trees provide a reasonable way for clustering points in space into natural groups. What are natural clusters in the friendship graph?

Minimum Spanning Trees and Net Partitioning: One of the war stories in the text describes how to partition a graph into compact subgraphs by deleting large edges from the minimum spanning tree. (figure: four stages (a)-(d) of the partitioning)

Minimum Spanning Trees and TSP: For points in the Euclidean plane, the MST yields a good heuristic for the traveling salesman problem: the optimum traveling salesman tour is at most twice the length of the minimum spanning tree.

Prim's Algorithm: Prim's algorithm starts from one vertex and grows the rest of the tree an edge at a time. As a greedy algorithm, which edge should we pick? The cheapest edge that grows the tree by one vertex without creating a cycle.

Prim's Algorithm in Action: (figure: an example weighted graph G and the trees built by Prim(G,A) and Kruskal(G); animation at https://upload.wikimedia.org/wikipedia/en/3/33/prim-algorithm-animation-2.gif)

Prim's Algorithm (Pseudocode): During execution each vertex v is either in the tree, fringe (meaning there exists an edge from a tree vertex to v), or unseen (meaning v is more than one edge away).

Prim-MST(G)
    Select an arbitrary vertex s to start the tree from.
    While (there are still non-tree vertices)
        Select the minimum-weight edge between a tree and a non-tree vertex.
        Add the selected edge and vertex to the tree.

This creates a spanning tree, since no cycle can be introduced, but is it minimum?

Why is Prim Correct? (Proof by Contradiction) If Prim's algorithm is not correct, there must be some graph G where it does not give the minimum cost spanning tree. If so, there must be a first edge (x, y) Prim adds such that the partial tree V cannot be extended into a minimum spanning tree. (figure: the tree grown from s, the edge (x, y) crossing to the non-tree side, and an alternative crossing edge (v1, v2))

The Contradiction: But if (x, y) is not in MST(G), then there must be a path in MST(G) from x to y, because the tree is connected. Let (v1, v2) be the other edge on this path with exactly one end in V. Replacing (v1, v2) with (x, y), we get a spanning tree with smaller weight, since W(v1, v2) > W(x, y). Thus we did not have the MST!! If W(v1, v2) = W(x, y), then the new tree has the same weight, so we could not have made a fatal mistake by picking (x, y). Thus Prim's algorithm is correct!

How Fast is Prim's Algorithm? That depends on what data structures are used. In the simplest implementation, we can simply mark each vertex as tree or non-tree and always search from scratch:

    Select an arbitrary vertex to start.
    While (there are non-tree vertices)
        select the minimum-weight edge between the tree and the fringe
        add the selected edge and vertex to the tree

This can be done in O(nm) time, by doing a DFS or BFS to loop through all edges, with a constant-time test per edge, and a total of n iterations.

Prim's Implementation: To do this faster, we must quickly identify the fringe vertices and the minimum-cost edge associated with them.

void prim(graph *g, int start)
{
    int i;                      /* counter */
    edgenode *p;                /* temporary pointer */
    bool intree[MAXV];          /* is the vertex in the tree yet? */
    int distance[MAXV];         /* cost of adding vertex to the tree */
    int v;                      /* current vertex to process */
    int w;                      /* candidate next vertex */
    int weight;                 /* edge weight */
    int dist;                   /* best current distance from the tree */

    for (i = 1; i <= g->nvertices; i++) {
        intree[i] = FALSE;
        distance[i] = MAXINT;
        parent[i] = -1;         /* parent[] is the global tree array */
    }

    distance[start] = 0;
    v = start;

    while (intree[v] == FALSE) {
        intree[v] = TRUE;
        p = g->edges[v];
        while (p != NULL) {
            w = p->y;
            weight = p->weight;
            if ((distance[w] > weight) && (intree[w] == FALSE)) {
                distance[w] = weight;
                parent[w] = v;
            }
            p = p->next;
        }

        /* select the cheapest remaining fringe vertex as the next v */
        v = 1;
        dist = MAXINT;
        for (i = 1; i <= g->nvertices; i++)
            if ((intree[i] == FALSE) && (dist > distance[i])) {
                dist = distance[i];
                v = i;
            }
    }
}

Prim's Analysis: Finding the minimum-weight fringe edge takes O(n) time, because we iterate through the distance array to find the minimum. After adding a vertex v to the tree, we run through its adjacency list in O(n) time to check whether it provides a cheaper way to connect its neighbors to the tree; if so, we update the distance value. The total time is n x O(n) = O(n^2).

Kruskal's Algorithm: Since an easy lower bound argument shows that every edge must be looked at to find the minimum spanning tree, and the number of edges m = O(n^2), Prim's algorithm is optimal on dense graphs. The complexity of Prim's algorithm is independent of the number of edges; Kruskal's algorithm is faster on sparse graphs. Kruskal's algorithm is also greedy: it repeatedly adds the smallest edge to the spanning tree that does not create a cycle.

Kruskal's Algorithm in Action: (figure: the same example graph, with the trees built by Prim(G,A) and Kruskal(G))

Kruskal is Correct (Proof by Contradiction): If Kruskal's algorithm is not correct, there must be some graph G where it does not give the minimum cost spanning tree. If so, there must be a first edge (x, y) Kruskal adds such that the set of edges cannot be extended into a minimum spanning tree. When we added (x, y), there was no path between x and y, or adding it would have created a cycle. Thus adding (x, y) to the optimal tree must create a cycle. But at least one edge in this cycle must have been considered after (x, y), so it must be heavier.

Deleting this heavy edge leaves a better MST than the optimal tree, yielding a contradiction! Thus Kruskal s algorithm is correct!

How Fast is Kruskal's Algorithm? What is the simplest implementation? Sort the m edges in O(m lg m) time. For each edge in order, test whether it creates a cycle in the forest we have built thus far: if so, discard it; else add it to the forest. With a BFS/DFS, this cycle test can be done in O(n) time (since the tree has at most n edges). The total time is O(mn), but can we do better?

Fast Component Tests Give Fast MST: Kruskal's algorithm builds up connected components. Any edge where both vertices are in the same connected component creates a cycle. Thus if we can quickly maintain which vertices are in which component, we do not have to test for cycles!

Same_component(v1, v2): Do vertices v1 and v2 lie in the same connected component of the current graph?

Merge_components(C1, C2): Merge the given pair of connected components into one component.

Fast Kruskal Implementation:

    Put the edges into a heap (or sort them)
    count = 0
    while (count < n - 1) do
        get the next (cheapest) edge (v, w)
        if (component(v) != component(w))
            add (v, w) to T
            merge component(v) and component(w)
            count = count + 1

If we can test and merge components in O(log n), we can find the MST in O(m log m)! Question: Is O(m log n) better than O(m log m)? Not asymptotically: since m < n^2, lg m < 2 lg n, so O(lg m) = O(lg n).

Union-Find Programs: We need a data structure for maintaining sets which can test whether two elements are in the same set and merge two sets together. These can be implemented by union and find operations, where: Find(i) returns the label of the root of the tree containing element i, by walking up the parent pointers until there is nowhere to go; Union(i, j) links the root of one of the trees (say the one containing i) to the root of the tree containing the other (say j), so find(i) now equals find(j).

Same Component Test: Is si in the same set as sj?

    t = Find(si)
    u = Find(sj)
    Return (Is t = u?)

Merge Components Operation: Put si and sj in the same set.

    t = Find(si)
    u = Find(sj)
    Union(t, u)

Union-Find Trees: We are interested in minimizing the time it takes to execute any sequence of unions and finds. A simple implementation is to represent each set as a tree, with pointers from a node to its parent. Each element is contained in a node, and the name of the set is the key at the root. (figure: a forest of union-find trees and its parent-pointer array representation)

Worst Case for Union-Find: In the worst case, these structures can be very unbalanced:

    For i = 1 to n/2 do
        UNION(i, i+1)
    For i = 1 to n/2 do
        FIND(1)

Who's The Daddy? We want to limit the height of our trees, which is affected by unions. When we union, we can make the tree with fewer nodes the child. (figure: FIND(s) and UNION(s, t) on two trees) Since the number of nodes is related to the height, the height of the final tree will increase only if both subtrees are of equal height! If Union(t, v) attaches the root of v as a subtree of t iff the number of nodes in t is greater than or equal to the number in v, then after any sequence of unions, any tree with k nodes has height at most lg k.

Proof: By induction on the number of nodes k: a tree with k = 1 node has height 0. Suppose a union merges T1 (with k1 nodes and height d1) and T2 (with k2 nodes and height d2) into a tree with k = k1 + k2 nodes and height d, where T2 is made the child (so k1 >= k2). If d1 > d2, then d = d1 <= lg k1 <= lg(k1 + k2) = lg k. If d1 <= d2, then d = d2 + 1 <= lg k2 + 1 = lg 2k2 <= lg(k1 + k2) = lg k.

Can We Do Better? We can do unions and finds in O(log n), good enough for Kruskal's algorithm. But can we do better? The ideal union-find tree has depth 1: the root with n - 1 leaves. On a find, if we are traversing a path to the root anyway, why not change the pointers along it to point directly to the root? (figure: a depth-1 tree with n - 1 leaves)

(figure: FIND(4) flattens the tree, repointing every node on the search path directly at the root) This path compression will let us do better than O(n log n) for n union-finds. O(n)? Not quite... Difficult analysis shows that it takes O(n alpha(n)) time, where alpha(n) is the inverse Ackermann function and alpha(number of atoms in the universe) = 5.