Shortest Path Problem

CLRS Chapters 24.1-3, 24.5, 25.2

Martin Zachariasen, DIKU, March 15, 2007

Shortest path problem (and variants)
Properties of shortest paths
Algorithmic framework
Bellman-Ford algorithm
Shortest paths in DAGs
Dijkstra's algorithm
All-pairs shortest paths

Shortest path problem

Consider an edge-weighted directed graph G = (V, E):

Each edge (u, v) ∈ E has an associated real-valued weight w(u, v).
The weight w(p) of a path p in G is the sum of the weights of the edges along p.
The shortest-path weight from vertex u to vertex v is:

$$\delta(u, v) = \begin{cases} \min\{ w(p) : u \overset{p}{\leadsto} v \} & \text{if there is a path from } u \text{ to } v \\ \infty & \text{otherwise} \end{cases}$$

A shortest path from u to v is any path p from u to v with w(p) = δ(u, v).

Variants of the shortest-paths problem

Single-source: Find a shortest path from a given source s ∈ V to each vertex v ∈ V.
Single-destination: Find a shortest path to a given destination t ∈ V from each vertex v ∈ V.
Single-pair: Find a shortest path from u to v for given vertices u, v ∈ V.
All-pairs: Find a shortest path for every pair of vertices u, v ∈ V.

Properties of shortest paths

Optimal substructure: Subpaths of shortest paths are also shortest paths. Let p = ⟨v_1, v_2, ..., v_k⟩ be a shortest path from v_1 to v_k. Then the subpath p' = ⟨v_i, ..., v_j⟩, i ≤ j, is a shortest path from v_i to v_j.

Cycles on shortest paths:
Positive-weight: Cannot happen, since removing the cycle would create a shorter path.
Zero-weight: Can happen, but the cycle may be removed without changing the weight of the path.
Negative-weight: Shortest paths are not well defined.

We may therefore assume that shortest paths contain at most |V| - 1 edges.
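The definitions of w(p) and δ(u, v) can be tried out on a tiny graph. The sketch below is only an illustration: the graph `G_EXAMPLE`, its weights, and the helper names `path_weight` and `brute_force_delta` are made up here and do not come from the slides. It relies on the observation that, without negative-weight cycles, some shortest path is simple, so enumerating simple paths is enough. This takes exponential time and is not one of the algorithms in these notes.

```python
import math

# Hypothetical example graph: weights[u][v] = w(u, v) for each edge (u, v).
G_EXAMPLE = {
    "s": {"a": 4, "b": 1},
    "a": {"t": 1},
    "b": {"a": 2, "t": 5},
    "t": {},
}

def path_weight(weights, path):
    """Return w(p), the sum of the edge weights along the vertex sequence path."""
    return sum(weights[u][v] for u, v in zip(path, path[1:]))

def brute_force_delta(weights, u, v):
    """Return delta(u, v) by enumerating all simple paths from u to v.

    Assumes there is no negative-weight cycle; returns math.inf if v is
    not reachable from u.
    """
    best = math.inf
    stack = [[u]]                      # partial simple paths starting at u
    while stack:
        path = stack.pop()
        last = path[-1]
        if last == v:
            best = min(best, path_weight(weights, path))
            continue
        for nxt in weights[last]:
            if nxt not in path:        # keep the path simple (no repeated vertex)
                stack.append(path + [nxt])
    return best
```

Here the path s, b, a, t has weight 1 + 2 + 1 = 4, which is smaller than the weight 5 of the path s, a, t, so δ(s, t) = 4.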
Triangle inequality

Triangle inequality for shortest paths: For all (u, v) ∈ E, we have δ(s, v) ≤ δ(s, u) + w(u, v).

Proof: The path s ⇝ u → v is a path from s to v that has weight δ(s, u) + w(u, v), assuming that we use a shortest path from s to u. Since δ(s, v) is the weight of a shortest path from s to v, the theorem follows.

(We will present other named properties in framed boxes later.)

Algorithmic framework: Definitions

Assumption: The input graph G is represented by adjacency lists containing (references to) edge weights.

All algorithms will maintain the following attributes for each vertex v ∈ V:

Shortest-path estimate d[v]: Initially, d[v] = ∞. Decreases as the algorithm progresses. At termination, d[v] = δ(s, v).
Shortest-path predecessor π[v]: Initially, π[v] = NIL. At termination, π[v] is the parent of v in the shortest-path tree, i.e., the predecessor of v on a shortest path from s to v.

Note that these attributes are identical to those used by breadth-first search (BFS).

Algorithmic framework: Fundamental procedures

Initialization of attributes:

    INITIALIZE-SINGLE-SOURCE(G, s)
        for each vertex v ∈ V[G]
            do d[v] ← ∞
               π[v] ← NIL
        d[s] ← 0

Relaxation or updating of attributes:

    RELAX(u, v, w)
        if d[v] > d[u] + w(u, v)
            then d[v] ← d[u] + w(u, v)
                 π[v] ← u

All algorithms start with a single call to INITIALIZE-SINGLE-SOURCE, and then RELAX is called zero or more times.

Algorithmic framework: Properties (I)

Upper bound property: We have d[v] ≥ δ(s, v) at all times for all v ∈ V. Once d[v] = δ(s, v), it never changes.

Proof: By induction on the number of relaxations.
Basis: Clearly true after initialization.
Inductive step: Consider relaxation of edge (u, v) ∈ E. By the inductive hypothesis, d[x] ≥ δ(s, x) for all x ∈ V prior to the relaxation. Assume that d[v] is changed by the relaxation. Then

    d[v] = d[u] + w(u, v) ≥ δ(s, u) + w(u, v) ≥ δ(s, v),

where the last step is the triangle inequality.

No-path property: If δ(s, v) = ∞ (meaning that there is no path from s to v), then d[v] = ∞ always.

Proof: Follows directly from the upper bound property.
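A minimal Python sketch of INITIALIZE-SINGLE-SOURCE and RELAX, for readers who want to experiment. The dictionary representation of d and π, the use of `None` for NIL, the boolean return value of `relax`, and the three-vertex example are assumptions made for this illustration, not part of the slides.

```python
import math

def initialize_single_source(vertices, s):
    """Return (d, pi) with d[v] = infinity and pi[v] = None (NIL), except d[s] = 0."""
    d = {v: math.inf for v in vertices}
    pi = {v: None for v in vertices}
    d[s] = 0
    return d, pi

def relax(u, v, w, d, pi):
    """Relax edge (u, v) of weight w; return True if d[v] was lowered."""
    if d[v] > d[u] + w:
        d[v] = d[u] + w
        pi[v] = u
        return True
    return False

# Hypothetical usage on three vertices.
d, pi = initialize_single_source(["s", "a", "b"], "s")
relax("s", "a", 4, d, pi)    # d["a"] drops from infinity to 4
relax("a", "b", -1, d, pi)   # d["b"] drops from infinity to 3
relax("s", "b", 5, d, pi)    # no change, because 3 <= 0 + 5
```

Relaxing an edge whose tail still has d[u] = ∞ changes nothing, since ∞ > ∞ + w is false, which matches the no-path property.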
Algorithmic framework: Properties (II)

Convergence property: Assume that s ⇝ u → v is a shortest path and d[u] = δ(s, u). If we relax edge (u, v), then d[v] = δ(s, v) at all times afterward.

Proof: If we relax edge (u, v) at a time when d[u] = δ(s, u), after the relaxation we have:

    d[v] ≤ d[u] + w(u, v) = δ(s, u) + w(u, v) = δ(s, v)

Since we also have that d[v] ≥ δ(s, v) by the upper bound property, we get d[v] = δ(s, v).

Bellman-Ford algorithm

Solves the single-source shortest-paths problem in the general case in which edge weights may be negative. Detects negative-weight cycles reachable from s.

Algorithm: All edges are first relaxed once in arbitrary order. Then all edges are relaxed again in the same order. This process is repeated until every edge has been relaxed |V| - 1 times. Finally, negative-weight cycles are detected by checking if

    d[v] > d[u] + w(u, v)

for any edge (u, v) ∈ E. The algorithm returns FALSE if such an edge exists and TRUE otherwise.

Running time: O(VE).

Correctness of Bellman-Ford algorithm

Path-relaxation property: Let p = ⟨v_0, v_1, ..., v_k⟩ be a shortest path from s = v_0 to v_k. If we relax, in order, (v_0, v_1), (v_1, v_2), ..., (v_{k-1}, v_k), then d[v_k] = δ(s, v_k) afterward. This holds regardless of other intermixed relaxations.

Proof: Induction to show that d[v_i] = δ(s, v_i) after edge (v_{i-1}, v_i) is relaxed.
Basis: d[v_0] = d[s] = δ(s, s) = 0.
Inductive step: Assume d[v_{i-1}] = δ(s, v_{i-1}). By the convergence property, when edge (v_{i-1}, v_i) is relaxed, we have d[v_i] = δ(s, v_i).

Since we may assume that any shortest path contains at most |V| - 1 edges, the Bellman-Ford algorithm relaxes each of the edges of any shortest path in order, namely once in each main iteration!

Correctness of B-F negative-weight cycle detection

1. There are no negative-weight cycles reachable from s: At termination d[v] = δ(s, v) for all v ∈ V, so for every edge (u, v) ∈ E:

    d[v] = δ(s, v) ≤ δ(s, u) + w(u, v) = d[u] + w(u, v)

Therefore Bellman-Ford returns TRUE.

2. There exists a negative-weight cycle reachable from s: Let c = ⟨v_0, v_1, ..., v_k⟩, where v_0 = v_k, be a negative-weight cycle. Suppose Bellman-Ford returns TRUE. Then d[v_i] ≤ d[v_{i-1}] + w(v_{i-1}, v_i) for i = 1, ..., k, and summing around the cycle we have

$$\sum_{i=1}^{k} d[v_i] \le \sum_{i=1}^{k} \bigl( d[v_{i-1}] + w(v_{i-1}, v_i) \bigr)$$

Since v_0 = v_k (and each d[v_i] is finite because c is reachable from s),

$$\sum_{i=1}^{k} d[v_i] = \sum_{i=1}^{k} d[v_{i-1}]$$

we get

$$0 \le \sum_{i=1}^{k} w(v_{i-1}, v_i)$$

which is a contradiction to the assumption that c was a negative-weight cycle.
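The Bellman-Ford description can be sketched in a few lines of Python. This is an illustration under assumptions, not the slides' own code: the edge-list input format `(u, v, weight)`, the function name `bellman_ford`, and the returned triple `(ok, d, pi)` are choices made here.

```python
import math

def bellman_ford(vertices, edges, s):
    """Single-source shortest paths; edges is a list of (u, v, weight) triples.

    Returns (ok, d, pi). ok is False if a negative-weight cycle is
    reachable from s, in which case d and pi are not meaningful.
    """
    d = {v: math.inf for v in vertices}
    pi = {v: None for v in vertices}
    d[s] = 0
    for _ in range(len(vertices) - 1):   # |V| - 1 passes over all edges
        for u, v, w in edges:
            if d[v] > d[u] + w:          # RELAX(u, v, w)
                d[v] = d[u] + w
                pi[v] = u
    # Final check: an edge that can still be relaxed reveals a negative cycle.
    ok = all(d[v] <= d[u] + w for u, v, w in edges)
    return ok, d, pi
```

The two nested loops perform (|V| - 1) · |E| relaxations, which gives the O(VE) running time stated above.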
Shortest paths in directed acyclic graphs (DAGs)

Solves the single-source shortest-paths problem for directed acyclic graphs (DAGs). Edge weights may be both positive and negative. Since there are no cycles, there are no negative-weight cycles either.

Algorithm: Topologically sort the vertices of the DAG. Process the vertices in topological order, and for each vertex u, relax all its outgoing edges (u, v).

Running time: O(V + E).

Correctness of DAG shortest-paths algorithm

1. If v is not reachable from s, then d[v] = ∞ by the no-path property.
2. If v is reachable, then we consider any shortest path p = ⟨v_0, v_1, ..., v_k⟩ from v_0 = s to v_k = v.

Main observation: The edges of p are relaxed in order: (v_0, v_1), (v_1, v_2), ..., (v_{k-1}, v_k). Why? The vertices are processed in topological order, so v_0 must precede v_1, which must precede v_2, etc.

By the path-relaxation property, we have d[v_k] = δ(s, v_k) at termination.

Dijkstra's algorithm

Solves the single-source shortest-paths problem in the case where edge weights are non-negative: w(u, v) ≥ 0 for every edge (u, v) ∈ E.

Algorithm: Maintains a set S of vertices whose final shortest-path weight from the source s has been determined. In each iteration, the algorithm greedily chooses the vertex u from V \ S with the smallest estimate d[u], relaxes all its outgoing edges (u, v), and adds u to S.

The algorithm is almost identical to breadth-first search. The only differences are that d[v] and π[v] may be updated more than once, and that the vertex u is chosen from V \ S in a different way (by a min-priority queue Q keyed on d).

Running time of Dijkstra's algorithm

Maximum size of priority queue Q is |V|. At most |V| EXTRACT-MIN operations, and at most |E| DECREASE-KEY operations.

Q represented by a simple array:
EXTRACT-MIN: O(V) time.
DECREASE-KEY: O(1) time.
In total O(V^2 + E) = O(V^2) time.

Q represented by a binary min-heap:
EXTRACT-MIN: O(log V) time.
DECREASE-KEY: O(log V) time.
In total O(V log V + E log V) time.

Q represented by a Fibonacci heap:
EXTRACT-MIN: O(log V) time (amortized).
DECREASE-KEY: O(1) time (amortized).
In total O(V log V + E) time.
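Both single-source algorithms of this part can be sketched in Python as follows. Several things here are assumptions for illustration: the adjacency-list format `adj[u] = [(v, weight), ...]` with every vertex present as a key, the function names, and the recursive depth-first topological sort. Python's `heapq` module has no DECREASE-KEY operation, so the Dijkstra sketch pushes a fresh entry instead and skips stale ones when they are extracted. This is a common substitute, not the priority queue analysed above. The heap may then hold up to |E| entries, and the running time remains O(E log V). Vertex labels are assumed to be mutually comparable so that ties in the heap can be broken.

```python
import heapq
import math

def dag_shortest_paths(adj, s):
    """adj[u] = list of (v, weight); the graph is assumed to be acyclic."""
    order, seen = [], set()

    def visit(u):                      # depth-first search for the topological sort
        seen.add(u)
        for v, _ in adj[u]:
            if v not in seen:
                visit(v)
        order.append(u)                # u finishes after all its descendants

    for u in adj:
        if u not in seen:
            visit(u)
    order.reverse()                    # reverse finishing order = topological order

    d = {v: math.inf for v in adj}
    d[s] = 0
    for u in order:                    # relax outgoing edges in topological order
        for v, w in adj[u]:
            if d[v] > d[u] + w:
                d[v] = d[u] + w
    return d

def dijkstra(adj, s):
    """adj[u] = list of (v, weight) with weight >= 0 (assumed, not checked)."""
    d = {v: math.inf for v in adj}
    d[s] = 0
    done = set()                       # the set S of finished vertices
    heap = [(0, s)]
    while heap:
        du, u = heapq.heappop(heap)    # EXTRACT-MIN
        if u in done:
            continue                   # stale entry, superseded by a smaller key
        done.add(u)
        for v, w in adj[u]:
            if d[v] > du + w:
                d[v] = du + w
                heapq.heappush(heap, (d[v], v))   # stands in for DECREASE-KEY
    return d
```

Note that `dag_shortest_paths` accepts negative weights, whereas `dijkstra` may return wrong distances if any weight is negative.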
Correctness of Dijkstra's algorithm

It suffices to show that for each vertex u ∈ V, we have d[u] = δ(s, u) at the time when u is added to S.

For the purpose of contradiction, let u ∈ V be the first vertex added to S for which d[u] ≠ δ(s, u).

Consider a shortest path s ⇝ x → y ⇝ u from s to u, where x ∈ S and y ∈ V \ S. We may have x = s and/or y = u.

When x was added to S, we had d[x] = δ(s, x), and since edge (x, y) was relaxed, by the convergence property we have d[y] = δ(s, y). Therefore, since edge weights are non-negative and by the upper bound property,

    d[y] = δ(s, y) ≤ δ(s, u) ≤ d[u]

By the choice of u (minimum element in Q) we also have

    d[y] ≥ d[u]

so therefore

    d[y] = δ(s, y) = δ(s, u) = d[u]

which is a contradiction to our assumption that d[u] ≠ δ(s, u).

All-pairs shortest path problem

Consider an edge-weighted directed graph G = (V, E) with n = |V|.

G is represented by an adjacency matrix W = (w_ij) such that

$$w_{ij} = \begin{cases} 0 & \text{if } i = j \\ \text{weight of directed edge } (i, j) & \text{if } i \ne j \text{ and } (i, j) \in E \\ \infty & \text{if } i \ne j \text{ and } (i, j) \notin E \end{cases}$$

We would like to find a shortest (least-weight) path between every pair of vertices i and j. The output should be a distance matrix D = (d_ij), where d_ij is the weight of a shortest path from vertex i to vertex j.

The algorithmic approach is dynamic programming.

(Here we focus on how to compute the weight of a shortest path; the path itself can be reconstructed by using a so-called predecessor matrix.)

Step 1: Structure of an optimal solution

Intermediate vertex of a path: Any vertex on the path other than its endpoints.

If a path i ⇝ k ⇝ j is a shortest path from vertex i to vertex j, then each of the subpaths i ⇝ k and k ⇝ j is also a shortest path. Why? Assume not and use a cut-and-paste argument to obtain a contradiction. Thus we have optimal substructure!

Step 2: Recursive solution (Floyd-Warshall)

Let

$$d_{ij}^{(k)} = \text{weight of a shortest path } i \leadsto j \text{ for which all intermediate vertices are from the vertex set } \{1, 2, \ldots, k\} \subseteq V$$

Optimal value for problem: D = D^(n) = (d_ij^(n)).

Recursive computation of d_ij^(k):

$$d_{ij}^{(0)} = w_{ij}$$

For k = 1, ..., n:

$$d_{ij}^{(k)} = \min\left( d_{ij}^{(k-1)},\; d_{ik}^{(k-1)} + d_{kj}^{(k-1)} \right)$$
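The recurrence can be evaluated for k = 1, ..., n in turn with three nested loops. The sketch below is one possible Python rendering, with some assumptions of its own: vertices are numbered 0, ..., n-1 instead of 1, ..., n, missing edges are stored as `math.inf`, and the graph is assumed to have no negative-weight cycles. It updates a single matrix in place rather than keeping D^(k-1) and D^(k) separately. That is safe under the no-negative-cycle assumption because row k and column k do not change during iteration k.

```python
import math

def floyd_warshall(W):
    """W is an n x n matrix with W[i][i] = 0 and math.inf for missing edges.

    Returns D with D[i][j] = weight of a shortest path from i to j.
    """
    n = len(W)
    D = [row[:] for row in W]          # D^(0) = W (copy, so W is left untouched)
    for k in range(n):                 # now allow vertex k as an intermediate vertex
        for i in range(n):
            for j in range(n):
                if D[i][k] + D[k][j] < D[i][j]:
                    D[i][j] = D[i][k] + D[k][j]
    return D
```

Each of the n^3 loop iterations does constant work, so the sketch runs in Θ(n^3) time and uses Θ(n^2) space.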
Step 3: Bottom-up computation (Floyd-Warshall)

Straightforward: Compute D^(k) in order of increasing value of k, that is, allowing the paths to use more and more intermediate vertices.

Total number of subproblems is Θ(n^3): (#choices for k) · (#choices for i) · (#choices for j).
Number of choices when solving a subproblem: O(1).
Total running time: Θ(n^3).

Transitive closure of a directed graph

Given: Directed graph G = (V, E).

Find: The transitive closure of G, i.e., the graph G* = (V, E*) where

    E* = {(i, j) : there exists a path from vertex i to vertex j in G}

Here we assume that G is represented by an adjacency matrix.

One solution: Assign a weight of 1 to every edge in G and compute all-pairs shortest paths in G. Then there exists a path from vertex i to vertex j if and only if the shortest-path distance is d_ij < ∞.

We may design a simpler (but not asymptotically faster) algorithm that replaces the min and + operations in FLOYD-WARSHALL with logical OR and logical AND operations.
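The OR/AND variant might look as follows in Python. The function name `transitive_closure`, the 0-based vertex numbering, and the boolean adjacency-matrix input are assumptions made for this sketch. As with w_ii = 0 in the weighted case, every vertex is taken to reach itself by the empty path.

```python
def transitive_closure(adj):
    """adj[i][j] is truthy iff (i, j) is an edge.

    Returns T with T[i][j] True iff there is a path from i to j.
    """
    n = len(adj)
    # T^(0): direct edges, plus the empty path from each vertex to itself.
    T = [[bool(adj[i][j]) or i == j for j in range(n)] for i in range(n)]
    for k in range(n):                 # allow vertex k as an intermediate vertex
        for i in range(n):
            for j in range(n):
                # OR replaces min, AND replaces +
                T[i][j] = T[i][j] or (T[i][k] and T[k][j])
    return T
```

The loop structure is the same as in the shortest-path computation, so the running time is again Θ(n^3), but only one bit per matrix entry is needed.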