Minimum Spanning Trees (Forests)

Given an undirected graph G = (V, E) in which each edge e has a weight w(e), find a subgraph T of G of minimum total weight such that every pair of vertices connected in G is also connected in T. If G is connected, then T is a tree; otherwise it is a forest.

Minimum-Spanning-Tree problem

Given a connected, undirected graph G = (V, E), where V is the set of vertices, E is the set of possible interconnections between pairs of vertices, and each edge (u, v) ∈ E has a weight w(u, v) specifying the cost to connect u and v, we wish to find an acyclic subset T of E that connects all of the vertices and whose total weight is minimized. Such a T is called a minimum spanning tree. Since T is acyclic and connects all of the vertices, it must form a tree, which we call a spanning tree since it "spans" the graph G. All spanning trees have exactly |V| - 1 edges.

Two "greedy" algorithms solve the minimum-spanning-tree problem: Kruskal's algorithm and Prim's algorithm. Both follow the generic procedure below.

GENERIC-MST(G, w)
1  A ← ∅
2  while A does not form a spanning tree
3      do find an edge (u, v) that is safe for A
4         A ← A ∪ {(u, v)}
5  return A

An edge (u, v) is a safe edge for A if A ∪ {(u, v)} is also a subset of an MST.

Note that after line 1, the set A trivially satisfies the invariant that it is a subset of a minimum spanning tree. The loop in lines 2-4 maintains the invariant. When the set A is returned in line 5, therefore, it must be a minimum spanning tree. The tricky part is, of course, finding a safe edge in line 3. One must exist: when line 3 is executed, the invariant dictates that there is a minimum spanning tree T such that A is contained in T, and since A is not yet a spanning tree, there is an edge (u, v) ∈ T such that (u, v) ∉ A. Any such edge is safe for A.
The algorithms of Kruskal and Prim

These two greedy minimum-spanning-tree algorithms each use a specific rule to determine a safe edge in line 3 of GENERIC-MST.
- In Prim's algorithm, the set A forms a single tree. The safe edge added to A is always a least-weight edge connecting the tree to a vertex not in the tree.
- In Kruskal's algorithm, the set A is a forest. The safe edge added to A is always a least-weight edge in the graph that connects two distinct components.

Kruskal's algorithm

Kruskal's algorithm is based directly on the generic minimum-spanning-tree algorithm. It finds a safe edge to add to the growing forest by finding, of all the edges that connect any two trees in the forest, an edge (u, v) of least weight. Let C1 and C2 denote the two trees that are connected by (u, v). Since (u, v) must be a light edge connecting C1 to some other tree, (u, v) is a safe edge for C1. Kruskal's algorithm is a greedy algorithm, because at each step it adds to the forest an edge of least possible weight.

Like the algorithm for computing connected components, it uses a disjoint-set data structure to maintain several disjoint sets of elements. Each set contains the vertices in a tree of the current forest. The operation FIND-SET(u) returns a representative element from the set that contains u. Thus, we can determine whether two vertices u and v belong to the same tree by testing whether FIND-SET(u) equals FIND-SET(v). The combining of trees is accomplished by the UNION procedure.

Prim's algorithm

Prim's algorithm:
- is a special case of the generic minimum-spanning-tree algorithm;
- operates much like Dijkstra's algorithm for finding shortest paths in a graph;
- has the property that the edges in the set A always form a single tree.

The tree starts from an arbitrary root vertex r and grows until it spans all the vertices in V. At each step, a light edge connecting a vertex in the tree to a vertex not yet in the tree is added to the tree.
By the cut lemma stated under "Why greed is good", this rule adds only edges that are safe for A; therefore, when the algorithm terminates, the edges in A form a minimum spanning tree. This strategy is "greedy" since the tree is augmented at each step with an edge that contributes the minimum amount possible to the tree's weight.

Why greed is good

Definition: Given a graph G = (V, E), a cut of G is a partition of V into two non-empty pieces, S and V - S.

Lemma: For every cut (S, V - S) of G, there is a minimum spanning tree (or forest) containing any cheapest edge crossing the cut, i.e. an edge connecting some node in S with some node in V - S. Call such an edge safe.
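To make the lemma concrete, here is a minimal sketch of how one might find a cheapest edge crossing a given cut. The function name `light_edge`, the edge format `(u, v, weight)`, and the small example graph are all assumptions made for illustration; they do not come from the slides.

```python
def light_edge(edges, S):
    """Return (weight, u, v) for a cheapest edge with exactly one endpoint in S.

    edges: iterable of (u, v, weight) triples (hypothetical format).
    S: set of vertices forming one side of the cut (S, V - S).
    Returns None if no edge crosses the cut.
    """
    # An edge crosses the cut when exactly one endpoint lies in S.
    crossing = [(w, u, v) for (u, v, w) in edges if (u in S) != (v in S)]
    return min(crossing) if crossing else None


# Hypothetical example graph, not taken from the slides.
edges = [("a", "b", 4), ("a", "c", 1), ("b", "c", 2), ("c", "d", 5)]
print(light_edge(edges, {"a"}))       # (1, 'a', 'c')
print(light_edge(edges, {"a", "c"}))  # (2, 'b', 'c')
```

By the lemma, each edge returned here belongs to some minimum spanning tree of the example graph.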
[Figure legend: blue = tree edges; green = safe and light edges; non-tree edges are shown in another color.]

An edge is said to cross the cut (S, V - S) if one endpoint is in S and the other is in V - S.
A cut respects a set A of edges if no edge in A crosses the cut.
An edge is a light edge crossing a cut if its weight is the minimum of any edge crossing the cut.

[Figure: weighted undirected graph]

First Greedy Algorithm
- Start at a vertex v.
- Add the cheapest edge adjacent to v.
- Repeatedly add the cheapest edge that joins the vertices explored so far to the rest of the graph.
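The three steps above can be sketched directly, without any special data structure. This is only an illustrative sketch: the name `prim_naive`, the `(u, v, weight)` edge format, and the example graph are assumptions, not part of the slides. Each round scans every edge, which is the source of the O(n * m) running time.

```python
def prim_naive(vertices, edges, start):
    """Grow a tree from `start`, scanning all edges at every step.

    vertices: collection of vertex names.
    edges: list of (u, v, weight) triples (hypothetical format).
    Returns the list of chosen edges.
    """
    in_tree = {start}
    tree = []
    while len(in_tree) < len(vertices):
        best = None
        for u, v, w in edges:
            # Consider only edges joining the explored part to the rest.
            if (u in in_tree) != (v in in_tree):
                if best is None or w < best[2]:
                    best = (u, v, w)
        if best is None:
            break  # no edge leaves the tree: the graph is disconnected
        tree.append(best)
        in_tree.update(best[:2])
    return tree


# Hypothetical example graph, not taken from the slides.
edges = [("a", "b", 4), ("a", "c", 1), ("b", "c", 2), ("c", "d", 5)]
tree = prim_naive("abcd", edges, "a")
print(tree)                       # [('a', 'c', 1), ('b', 'c', 2), ('c', 'd', 5)]
print(sum(w for _, _, w in tree))  # 8
```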
Naive Implementation & Analysis

Computing the minimum-weight edge at each stage costs O(m) per step. Each step adds one new vertex, and there are n vertices in total, so the running time is O(n * m) overall.

Second Greedy Algorithm
- Start with the vertices and no edges.
- Repeatedly add the cheapest edge that joins two different components, i.e. that doesn't create a cycle.
On the example graph, the second greedy algorithm produces the same tree as Prim's algorithm.

The greedy algorithms always choose safe edges

Prim's algorithm always chooses the cheapest edge from the current tree to the rest of the graph. This is the cheapest edge across a cut that has the vertices of that tree on one side.
Prim's algorithm with Priority Queues

For each vertex u not in the tree, maintain the current cheapest edge from the tree to u. Store u in a priority queue with key = weight of this edge.

Operations:
- n-1 insertions (each vertex added once)
- n-1 delete-mins (each vertex deleted once): pick the vertex of smallest key, remove it from the priority queue, and add its edge to the graph
- < m decrease-keys (each edge updates one vertex): relaxation of edges

Priority queue implementations (n = # of vertices, m = # of edges):
- Array: insert O(1), delete-min O(n), decrease-key O(1); total O(n + n^2 + m) = O(n^2)
- Heap: insert, delete-min, decrease-key all O(log n); total O(m log n)

The greedy algorithms always choose safe edges

Kruskal's algorithm always chooses the cheapest edge connecting two pieces of the graph that aren't yet connected. This is the cheapest edge across any cut that has those two pieces on different sides and doesn't split any current pieces.
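The priority-queue version of Prim's algorithm described above might look like the following sketch. One substitution should be stated plainly: Python's `heapq` module has no decrease-key operation, so this sketch pushes a fresh entry whenever a cheaper edge is found and skips stale entries when they are popped ("lazy deletion"). The heap can then hold up to O(m) entries, which gives O(m log m) = O(m log n), the same bound as the heap row above. The name `prim_heap`, the adjacency-list format, and the example graph are assumptions for illustration.

```python
import heapq


def prim_heap(adj, root):
    """Prim's algorithm with a binary heap and lazy deletion.

    adj: dict mapping each vertex to a list of (neighbor, weight) pairs
         (hypothetical format); vertex names must be mutually comparable.
    Returns (parent, total): the parent map pi and the tree weight.
    """
    parent = {}
    total = 0
    done = set()
    heap = [(0, root, None)]  # entries are (key, vertex, proposed parent)
    while heap:
        key, u, p = heapq.heappop(heap)
        if u in done:
            continue  # stale entry: u was already extracted with a smaller key
        done.add(u)
        parent[u] = p
        total += key
        for v, w in adj[u]:
            if v not in done:
                # Stands in for decrease-key: push a new candidate edge for v.
                heapq.heappush(heap, (w, v, u))
    return parent, total


# Hypothetical example graph, not taken from the slides.
adj = {
    "a": [("b", 4), ("c", 1)],
    "b": [("a", 4), ("c", 2)],
    "c": [("a", 1), ("b", 2), ("d", 5)],
    "d": [("c", 5)],
}
parent, total = prim_heap(adj, "a")
print(parent)  # {'a': None, 'c': 'a', 'b': 'c', 'd': 'c'}
print(total)   # 8
```

A true decrease-key would keep the heap at no more than n entries; the lazy variant trades a larger heap for simpler code.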
Cuts and Spanning Trees

[Figure: cuts and spanning trees on the example graph]

Implementation & Analysis
- First sort the edges by weight: O(m log m).
- Go through the edges from smallest to largest: if the endpoints of edge e are currently in different components, then add e to the graph; else skip it.
- A union-find data structure handles the last part. Total cost of the last part: O(m α(n)), where α(n) << log m.
- Overall: O(m log n), since m ≤ n^2 implies log m ≤ 2 log n.
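The sort-then-scan procedure above can be sketched as follows. The union-find here uses union by size with path halving, which is one common way to obtain near-constant amortized cost per operation; the slides do not say which variant they intend, so treat that choice, the name `kruskal`, the `(u, v, weight)` edge format, and the example graph as assumptions.

```python
def kruskal(vertices, edges):
    """Kruskal's algorithm: sort edges, keep those joining two components.

    vertices: iterable of vertex names.
    edges: list of (u, v, weight) triples (hypothetical format).
    Returns the list of chosen edges in the order they were added.
    """
    parent = {v: v for v in vertices}  # each vertex starts as its own component
    size = {v: 1 for v in vertices}

    def find(x):
        # Follow parent pointers to the representative, halving the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        # Merge the components of a and b; return False if already merged.
        ra, rb = find(a), find(b)
        if ra == rb:
            return False
        if size[ra] < size[rb]:
            ra, rb = rb, ra
        parent[rb] = ra  # attach the smaller tree under the larger one
        size[ra] += size[rb]
        return True

    tree = []
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        if union(u, v):  # endpoints were in different components
            tree.append((u, v, w))
    return tree


# Hypothetical example graph, not taken from the slides.
edges = [("a", "b", 4), ("a", "c", 1), ("b", "c", 2), ("c", "d", 5)]
print(kruskal("abcd", edges))  # [('a', 'c', 1), ('b', 'c', 2), ('c', 'd', 5)]
```

On this small example the edge (a, b, 4) is skipped because a and b are already connected through c when it is examined.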
Union-find disjoint sets data structure

Maintaining components
- Start with n different components, one per vertex.
- Find the components of the two endpoints of e: 2m finds.
- Union two components when an edge connecting them is added: n-1 unions.

[Figure: weighted undirected graph on vertices A through H]

Example: Prim's algorithm on this graph, with Extract-Min executed |V| times

Build Q:
vertex          A     B    C    D    E    F    G    H
key (initial)   0     INF  INF  INF  INF  INF  INF  INF
π (final)       NULL  A    A    E    F    A    A    F

1) EXTRACT A: π(B) ← A; π(C) ← A; π(F) ← A; π(G) ← A
2) EXTRACT F: π(H) ← F; π(E) ← F
3) EXTRACT B: C, no change
4) EXTRACT C: π(D) ← C; E, no change
5) EXTRACT E: π(D) ← E
6) EXTRACT D: no work
7) EXTRACT G
8) EXTRACT H

At completion, the edges { (v, π(v)) } form an MST.

Adjacency:
A: B, C, F, G
B: A, C
C: A, B, D, E
D: C, E
E: C, D, F
F: A, E, H
G: A
H: F