Algorithms for Data Science

Similar documents
Analysis of Algorithms, I

Algorithms for Data Science

Single Source Shortest Path (SSSP) Problem

Shortest Path Problem

CSC 373 Lecture # 3 Instructor: Milad Eftekhar

Lecture 11: Analysis of Algorithms (CS ) 1

Shortest path problems

The Shortest Path Problem. The Shortest Path Problem. Mathematical Model. Integer Programming Formulation

Introduction to Algorithms / Algorithms I Lecturer: Michael Dinitz Topic: Shortest Paths Date: 10/13/15

Breadth-First Search, 1. Slides for CIS 675 DPV Chapter 4. Breadth-First Search, 3. Breadth-First Search, 2

UC Berkeley CS 170: Efficient Algorithms and Intractable Problems Handout 7 Lecturer: David Wagner February 13, Notes 7 for CS 170

Single Source Shortest Path

Lecture #7. 1 Introduction. 2 Dijkstra s Correctness. COMPSCI 330: Design and Analysis of Algorithms 9/16/2014

from notes written mostly by Dr. Carla Savage: All Rights Reserved

CSE 502 Class 23. Jeremy Buhler Steve Cole. April BFS solves unweighted shortest path problem. Every edge traversed adds one to path length

CSE 101. Algorithm Design and Analysis Miles Jones Office 4208 CSE Building Lecture 8: Negative Edges

Minimum Spanning Trees

Description of The Algorithm

1 Dijkstra s Algorithm

Week 12: Minimum Spanning trees and Shortest Paths

Directed Graphs. DSA - lecture 5 - T.U.Cluj-Napoca - M. Joldos 1

Algorithms. All-Pairs Shortest Paths. Dong Kyue Kim Hanyang University

The Shortest Path Problem

Week 11: Minimum Spanning trees

Artificial Intelligence

Algorithms (V) Path in Graphs. Guoqiang Li. School of Software, Shanghai Jiao Tong University

Title. Ferienakademie im Sarntal Course 2 Distance Problems: Theory and Praxis. Nesrine Damak. Fakultät für Informatik TU München. 20.

Lecture I: Shortest Path Algorithms

MA 252: Data Structures and Algorithms Lecture 36. Partha Sarathi Mandal. Dept. of Mathematics, IIT Guwahati

CS420/520 Algorithm Analysis Spring 2009 Lecture 14

Single Source Shortest Paths

CS 341: Algorithms. Douglas R. Stinson. David R. Cheriton School of Computer Science University of Waterloo. February 26, 2019

Algorithm Design and Analysis

Shortest Paths. Nishant Mehta Lectures 10 and 11

CSC 8301 Design & Analysis of Algorithms: Warshall s, Floyd s, and Prim s algorithms

Outline. (single-source) shortest path. (all-pairs) shortest path. minimum spanning tree. Dijkstra (Section 4.4) Bellman-Ford (Section 4.

Shortest Paths. Nishant Mehta Lectures 10 and 11

CS 4349 Lecture October 23rd, 2017

Dynamic-Programming algorithms for shortest path problems: Bellman-Ford (for singlesource) and Floyd-Warshall (for all-pairs).

1 i n (p i + r n i ) (Note that by allowing i to be n, we handle the case where the rod is not cut at all.)

8.1 Introduction. Background: Breadth First Search. CS125 Lecture 8 Fall 2014

1 Dynamic Programming

1 Shortest Paths. 1.1 Breadth First Search (BFS) CS 124 Section #3 Shortest Paths and MSTs 2/13/2018

CS 561, Lecture 10. Jared Saia University of New Mexico

Algorithms (IX) Yijia Chen Shanghai Jiaotong University

CS 161 Lecture 11 BFS, Dijkstra s algorithm Jessica Su (some parts copied from CLRS) 1 Review

Dijkstra s Shortest Path Algorithm

More Graph Algorithms: Topological Sort and Shortest Distance

Module 27: Chained Matrix Multiplication and Bellman-Ford Shortest Path Algorithm

1 More on the Bellman-Ford Algorithm

Algorithms on Graphs: Part III. Shortest Path Problems. .. Cal Poly CSC 349: Design and Analyis of Algorithms Alexander Dekhtyar..

CSE 101. Algorithm Design and Analysis Miles Jones Office 4208 CSE Building Lecture 6: BFS and Dijkstra s

Unit-5 Dynamic Programming 2016

Question Points Score TOTAL 50

Computer Science & Engineering 423/823 Design and Analysis of Algorithms

Dijkstra s Algorithm. Dijkstra s algorithm is a natural, greedy approach towards

Chapter 25: All-Pairs Shortest-Paths

Algorithms (VII) Yijia Chen Shanghai Jiaotong University

LEC13: SHORTEST PATH. CSE 373 Analysis of Algorithms Fall 2016 Instructor: Prof. Sael Lee. Lecture slide courtesy of Prof.

Dijkstra s Algorithm

Theory of Computing. Lecture 7 MAS 714 Hartmut Klauck

Chapter 6. Dynamic Programming

Graphs. Part II: SP and MST. Laura Toma Algorithms (csci2200), Bowdoin College

Classical Shortest-Path Algorithms

CS 561, Lecture 10. Jared Saia University of New Mexico

CS161 - Minimum Spanning Trees and Single Source Shortest Paths

1 Shortest Paths. 1.1 Breadth First Search (BFS) CS 124 Section #3 Shortest Paths and MSTs 2/13/2018

(Refer Slide Time: 00:18)

Lecture 18 Solving Shortest Path Problem: Dijkstra s Algorithm. October 23, 2009

Algorithms (VII) Yijia Chen Shanghai Jiaotong University

1 Non greedy algorithms (which we should have covered

Dynamic Programming II

Introduction to Algorithms. Lecture 11

Single Source Shortest Path: The Bellman-Ford Algorithm. Slides based on Kevin Wayne / Pearson-Addison Wesley

Object-oriented programming. and data-structures CS/ENGRD 2110 SUMMER 2018

Dr. Alexander Souza. Winter term 11/12

Solution. violating the triangle inequality. - Initialize U = {s} - Add vertex v to U whenever DIST[v] decreases.

Lecture 14: Shortest Paths Steven Skiena

Algorithm Design and Analysis

Introduction to Optimization

Design and Analysis of Algorithms 演算法設計與分析. Lecture 13 December 18, 2013 洪國寶

DIJKSTRA'S ALGORITHM. By Laksman Veeravagu and Luis Barrera

4/8/11. Single-Source Shortest Path. Shortest Paths. Shortest Paths. Chapter 24

CS350: Data Structures Dijkstra s Shortest Path Alg.

1 The Shortest Path Problem

5.4 Shortest Paths. Jacobs University Visualization and Computer Graphics Lab. CH : Algorithms and Data Structures 456

Weighted Graph Algorithms Presented by Jason Yuan

Addis Ababa University, Amist Kilo July 20, 2011 Algorithms and Programming for High Schoolers. Lecture 12

Parallel Graph Algorithms

Introduction Single-source shortest paths All-pairs shortest paths. Shortest paths in graphs

Introduction. I Given a weighted, directed graph G =(V, E) with weight function

COL 702 : Assignment 1 solutions

Final Exam Solutions

Homework 2 Solutions

CSE 326: Data Structures Dijkstra s Algorithm. James Fogarty Autumn 2007

Outline. Computer Science 331. Definitions: Paths and Their Costs. Computation of Minimum Cost Paths

CS4800: Algorithms & Data Jonathan Ullman

Algorithms and Data Structures

Dijkstra s Algorithm and Priority Queue Implementations. CSE 101: Design and Analysis of Algorithms Lecture 5

Algorithms and Data Structures 2014 Exercises and Solutions Week 9

Transcription:

Algorithms for Data Science CSOR W4246 Eleni Drinea Computer Science Department Columbia University Shortest paths in weighted graphs (Bellman-Ford, Floyd-Warshall)

Outline 1 Shortest paths in graphs with non-negative edge weights (Dijkstra s algorithm) Implementations Graphs with negative edge weights: why Dijkstra fails 2 Single-source shortest paths (negative edges): Bellman-Ford A DP solution An alternative formulation of Bellman-Ford 3 All-pairs shortest paths (negative edges): Floyd-Warshall

Today 1 Shortest paths in graphs with non-negative edge weights (Dijkstra s algorithm) Implementations Graphs with negative edge weights: why Dijkstra fails 2 Single-source shortest paths (negative edges): Bellman-Ford A DP solution An alternative formulation of Bellman-Ford 3 All-pairs shortest paths (negative edges): Floyd-Warshall

Graphs with non-negative weights Input a weighted, directed graph G = (V, E, w); function w : E R + assigns non-negative real-valued weights to edges; a source (origin) vertex s V. Output: for every vertex v V 1. the length of a shortest s-v path; 2. a shortest s-v path.

Dijkstra s algorithm (Input: G = (V, E, w), s V ) Output: arrays dist, prev with n entries such that 1. dist(v) stores the length of the shortest s-v path 2. prev(v) stores the node before v in the shortest s-v path At all times, maintain a set S of nodes for which the distance from s has been determined. Initially, dist(s) = 0, S = {s}. Each time, add to S the node v V S that 1. has an edge from some node in S; 2. minimizes the following quantity among all nodes v V S Set prev(v) = u. d(v) = min {dist(u) + w(u, v)} u S:(u,v) E

Implementation Dijkstra-v1(G = (V, E, w), s V ) Initialize(G, s) S = {s} while S V do Select a node v V S with at least one edge from S so that d(v) = {dist[u] + w(u, v)} S = S {v} dist[v] = d(v) prev[v] = u end while Initialize(G, s) for v V do dist[v] = prev[v] = NIL end for dist[s] = 0 min u S,(u,v) E

Improved implementation (I) Idea: Keep a conservative overestimate of the true length of the shortest s-v path in dist[v] as follows: when u is added to S, update dist[v] for all v with (u, v) E. Dijkstra-v2(G = (V, E, w), s V ) Initialize(G, s) S = while S V do Pick u so that dist[u] is minimum among all nodes in V S S = S {u} for (u, v) E do Update(u, v) end for end while Update(u, v) if dist[v] > dist[u] + w(u, v) then dist[v] = dist[u] + w(u, v) prev[v] = u end if

Improved implementation (II): binary min-heap Idea: Use a priority queue implemented as a binary min-heap: store vertex u with key dist[u]. Required operations: Insert, ExtractMin; DecreaseKey for Update; each takes O(log n) time. Dijkstra-v3(G = (V, E, w), s V ) Initialize(G, s) Q = {V ; dist} S = while Q do u = ExtractMin(Q) S = S {u} for (u, v) E do Update(u, v) end for end while Running time: O(n log n + m log n) = O(m log n) When is Dijkstra-v3() better than Dijkstra-v2()?

Example graph with negative edge weights s 5 2 a b 1-4 c

Dijkstra s output and correct output for example graph Dijkstra s output Correct shortest paths s 2 (2) a 1 (3) c s 2 (2) a (1) c 5 b 5 b -4 (5) (5) Dijkstra s algorithm will first include a to S and then c, thus missing the shorter path from s to b to c.

Negative edge weights: why Dijkstra s algorithm fails Intuitively, a path may start on long edges but then compensate along the way with short edges. Formally, in the proof of correctness of the algorithm, the last statement about P does not hold anymore: even if the length of path P v is smaller than the length of the subpath s x y, negative edges on the subpath y v may now result in P being shorter than P v.

Bigger problems in graphs with negative edges? s 5 2 a b 1-4 1 c dist(a) =?

Bigger problems in graphs with negative edges? s 5 2 a b 1-4 1 c 1. dist(v) goes to for every v on the cycle (a, b, c, a) 2. no solution to shortest paths when negative cycles need to detect negative cycles

Today 1 Shortest paths in graphs with non-negative edge weights (Dijkstra s algorithm) Implementations Graphs with negative edge weights: why Dijkstra fails 2 Single-source shortest paths (negative edges): Bellman-Ford A DP solution An alternative formulation of Bellman-Ford 3 All-pairs shortest paths (negative edges): Floyd-Warshall

Single-source shortest paths, negative edge weights Input: weighted directed graph G = (V, E, w) with w : E R; a source (origin) vertex s V. Output: 1. If G has a negative cycle reachable from s, answer negative cycle in G. 2. Else, compute for every v V 2.1 the length of a shortest s-v path; 2.2 a shortest s-v path.

Properties of shortest paths Suppose the problem has a solution for an input graph. Can there be negative cycles in the graph? Can there be positive cycles in the graph? Can the shortest paths contain positive cycles? Consider a shortest s-t path; are its subpaths shortest? In other words, does the problem exhibit optimal substructure?

Towards a DP solution Key observation: if there are no negative cycles, a path cannot become shorter by traversing a cycle. Fact 1. If G has no negative cycles, then there is a shortest s-v path that is simple, thus has at most n 1 edges. Fact 2. The shortest paths problem exhibits optimal substructure. Facts 1 and 2 suggest a DP solution.

Subproblems s w v Let OP T (i, v) = cost of a shortest s-v path with at most i edges Consider a shortest s-v path using at most i edges. If the path uses at most i 1 edges, then If the path uses i edges, then OP T (i, v) = OP T (i, v) = OP T (i 1, v). min {OP T (i 1, x) + w(x, v)}. x:(x,v) E

Recurrence Let OP T (i, v) = cost of a shortest s-v path using at most i edges Then OP T (i, v) = 0, if i = 0, v = s {, if i = 0, v s OP T (i 1, v) min min {OP T (i 1, x) + w(x, v)}, if i > 0 x:(x,v) E

Example of Bellman-Ford Compute shortest s-v paths in the graph below, for all v V. 4 a 1 s -1 2 c 5 b -1

Pseudocode n n dynamic programming table M such that M[i, v] = OP T (i, v). Bellman-Ford(G = (V, E, w), s V ) for v V do M[0, v] = end for M[0, s] = 0 for i = 1,..., n 1 do for v V (in any order) do end for end for M[i, v] = min M[i 1, v] min x:(x,v) E { M[i 1, x] + w(x, v) }

Running time & Space Running time: O(nm) Space: Θ(n 2 ) can be improved (coming up) To reconstruct actual shortest paths, also keep array prev of size n such that prev[v] = predecessor of v in current shortest s-v path.

Improving space requirements Only need two rows of M at all times. Actually, only need one (see Remark 1)! Thus drop the index i from M[i, v] and only use it as a counter for #repetitions. { { } } M[v] = min M[v], min M[x] + w(x, v) x:(x,v) E Remark 1. Throughout the algorithm, M[v] is the length of some s-v path. After i repetitions, M[v] is no larger than the length of the current shortest s-v path with at most i edges. Early termination condition: if at some iteration i no value in M changed, then stop (why?) This allows us to detect negative cycles!

An alternative way to view Bellman-Ford s v1 v2 vk-1 v Let P = (s = v 0, v 1, v 2,..., v k = v) be a shortest s-v path. Then P can contain at most n 1 edges. How can we correctly compute dist(v) on this path?

Key observations about subroutine Update(u, v) Recall subroutine Update from Dijkstra s algorithm: Fact 3. Update(u, v) : dist(v) = min{dist(v), dist(u) + w(u, v)} Suppose u is the last node before v on the shortest s-v path, and suppose dist(u) has been correctly set. The call Update(u, v) returns the correct value for dist(v). Fact 4. No matter how many times Update(u, v) is performed, it will never make dist(v) too small. That is, Update is a safe operation: performing few extra updates can t hurt.

Performing the correct sequence of updates Suppose we update the edges on the shortest path P in the order they appear on the path (though not necessarily consecutively). Hence we update (s, v 1 ), (v 1, v 2 ), (v 2, v 3 ),..., (v k 1, v). This sequence of updates correctly computes dist(v 1 ), dist(v 2 ),..., dist(v) (by induction and Fact 3). How can we guarantee that this specific sequence of updates occurs?

A concrete example Consider the shortest s-b path, which uses edges (s, a), (a, b). How can we guarantee that our algorithm will update these two edges in this order? (More updates in between are allowed.) 4 a 1 s -1 2 c 5 b -1

Bellman-Ford algorithm Update all m edges in the graph, n 1 times in a row! By Fact 4, it is ok to update an edge several times in between. All we need is to update the edges on the path in this particular order. This is guaranteed if we update all edges n 1 times in a row.

Pseudocode We will use Initialize and Update from Dijkstra s algorithm. Initialize(G, s) for v V do dist[v] = prev[v] = NIL end for dist[s] = 0 Update(u, v) if dist[v] > dist[u] + w(u, v) then dist[v] = dist[u] + w(u, v) prev[v] = u end if

Bellman-Ford Bellman-Ford(G = (V, E, w), s) Initialize(G, s) for i = 1,..., n 1 do for (u, v) E do Update(u, v) end for end for Running time? Space?

Detecting negative cycles s 5 2 a b 1-4 1 c

Detecting negative cycles s 5 2 a b 1-4 1 c 1. dist(v) goes to for every v on the cycle. 2. Any shortest s-v path can have at most n 1 edges. 3. Update all edges n times (instead of n 1): if dist(v) changes for any v V, then there is a negative cycle.

Today 1 Shortest paths in graphs with non-negative edge weights (Dijkstra s algorithm) Implementations Graphs with negative edge weights: why Dijkstra fails 2 Single-source shortest paths (negative edges): Bellman-Ford A DP solution An alternative formulation of Bellman-Ford 3 All-pairs shortest paths (negative edges): Floyd-Warshall

All pairs shortest-paths Input: a directed, weighted graph G = (V, E, w) with real edge weights Output: an n n matrix D such that D[i, j] = length of shortest path from i to j

Solving all pairs shortest-paths 1. Straightforward solution: run Bellman Ford once for every vertex (O(n 2 m) time). 2. Improved solution: Floyd-Warshall s dynamic programming algorithm (O(n 3 ) time).

Towards a DP formulation Consider a shortest s-t path P. This path uses some intermediate vertices: that is, if P = (s, v 1, v 2,..., v k, t), then v 1,..., v k are intermediate vertices. For simplicity, relabel the vertices in V as {1, 2, 3,..., n} and consider a shortest i-j path where intermediate vertices may only be from {1, 2,..., k}. Goal: compute the length of a shortest i-j path for every pair of vertices (i, j), using {1, 2,..., n} as intermediate vertices.

Example 4 a 1 s -1 2 c 5 b -1 Rename {s, a, b, c} as {1, 2, 3, 4} 4 2 1 1 5-1 3 2-1 4

Examples of shortest paths Shortest (1, 2)-path using {} or {1} is P. Shortest (1, 2)-path using {1,2,3,4} is P. 1 P 2 1 4 5 2-1 3 4 Shortest (1, 3)-path using {} or {1} is P. Shortest (1, 3)-path using {1,2} or {1,2,3} is P1. Shortest (1, 3)-path using {1,2,3,4} is P1. 1 4 2 P1 P1 5-1 1 4 P 3

A shortest i-j path using nodes from {1,..., k} Consider a shortest i-j path P where intermediate nodes may only be from the set of nodes {1, 2,..., k}. i j Fact: any subpath of P must be shortest itself.

A useful observation Focus on the last node k from the set {1, 2,..., k}. Either 1. P completely avoids k: then a shortest i-j path with intermediate nodes from {1,..., k} is the same as a shortest i-j path with intermediate nodes from {1,..., k 1}. 2. Or, k is an intermediate node of P. k i j Decompose P into an i-k subpath P 1 and a k-j subpath P 2. i. P 1, P 2 are shortest subpaths themselves. ii. All intermediate nodes of P 1, P 2 are from {1,..., k 1}.

Subproblems Let OP T k (i, j) = cost of shortest i j path P using {1,..., k} as intermediate vertices 1. Either k does not appear in P, hence OP T k (i, j) = OP T k 1 (i, j) 2. Or, k appears in P, hence OP T k (i, j) = OP T k 1 (i, k) + OP T k 1 (k, j)

Recurrence Hence w(i, j), if k = 0 OP T k (i, j) = { OP Tk 1 (i, j) min, if k 1 OP T k 1 (i, k) + OP T k 1 (k, j) We want OP T n (i, j). Time/space requirements?

Floyd-Warshall on example graph Let D k [i, j] = OP T k (i, j). D 0 = 0 4 5 0-1 1 2 0-1 0 D 1 = 0 4 5 0-1 1 2 0-1 0 D 2 = 0 4 3 5 0-1 1 2 0-1 0 D 3 = 0 4 3 2 0-1 -2 2 0-1 0

Space requirements A single n n dynamic programming table D, initialized to w(i, j) (the adjacency matrix of G). Let {1,..., k} be the set of intermediate nodes that may be used for the shortest i-j path. After the k-th iteration, D[i, j] contains the length of some i-j path that is no larger than the length of the shortest i-j path using {1,..., k} as intermediate nodes.

The Floyd-Warshall algorithm Floyd-Warshall(G = (V, E, w)) for k = 1 to n do for i = 1 to n do for j = 1 to n do D[i, j] = min(d[i, j], D[i, k] + D[k, j]) end for end for end for Running time: O(n 3 ) Space: Θ(n 2 )