Chapter 3. Graphs. Slides by Kevin Wayne. Copyright 2005 Pearson-Addison Wesley. All rights reserved.

3.1 Basic Definitions and Applications

Undirected Graphs
Undirected graph. G = (V, E)
- V = nodes.
- E = edges between pairs of nodes.
- Captures pairwise relationship between objects.
- Graph size parameters: n = |V|, m = |E|.
V = { 1, 2, 3, 4, 5, 6, 7, 8 }
E = { (1,2), (1,3), (2,3), (2,4), (2,5), (3,5), (3,7), (3,8), (4,5), (5,6), (7,8) }
n = 8, m = 11

Some Graph Applications
Graph              Nodes                  Edges
transportation     street intersections   highways
communication      computers              fiber optic cables
World Wide Web     web pages              hyperlinks
social             people                 relationships
food web           species                predator-prey
software systems   functions              function calls
scheduling         tasks                  precedence constraints
circuits           gates                  wires

World Wide Web
Web graph.
- Node: web page.
- Edge: hyperlink from one page to another.
[Figure: web graph on the pages cnn.com, google.com, novell.com, cnnsi.com, timewarner.com, hbo.com, sorpranos.com]

Ecological Food Web
Food web graph.
- Node = species.
- Edge = from prey to predator.
Reference: http://www.twingroves.district96.k12.il.us/wetlands/salamander/salgraphics/salfoodweb.giff

Graph Representation: Adjacency Matrix
Adjacency matrix. n-by-n matrix with $A_{uv} = 1$ iff (u, v) is an edge, where n = number of vertices.
- Two representations of each edge.
- Space proportional to $n^2$.
- Checking if (u, v) is an edge takes Θ(1) time. (Checking k pairs (u, v) will cost Θ(k) time.)
- Identifying all edges takes $\Theta(n^2)$ time.
     1  2  3  4  5  6  7  8
 1   0  1  1  0  0  0  0  0
 2   1  0  1  1  1  0  0  0
 3   1  1  0  0  1  0  1  1
 4   0  1  0  0  1  0  0  0
 5   0  1  1  1  0  1  0  0
 6   0  0  0  0  1  0  0  0
 7   0  0  1  0  0  0  0  1
 8   0  0  1  0  0  0  1  0
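
A minimal Python sketch of the adjacency matrix for the 8-node example graph. The names `EDGES`, `build_adjacency_matrix`, `matrix_has_edge` and `matrix_all_edges` are illustrative choices, not from the slides, and index 0 is left unused so that node labels match row and column numbers.

```python
# Edge list of the 8-node example graph from Section 3.1.
EDGES = [(1, 2), (1, 3), (2, 3), (2, 4), (2, 5), (3, 5),
         (3, 7), (3, 8), (4, 5), (5, 6), (7, 8)]


def build_adjacency_matrix(n, edges):
    """Return an (n+1) x (n+1) 0/1 matrix; row 0 and column 0 are unused padding."""
    a = [[0] * (n + 1) for _ in range(n + 1)]
    for u, v in edges:
        a[u][v] = 1  # each undirected edge is represented twice
        a[v][u] = 1
    return a


def matrix_has_edge(a, u, v):
    """Edge test by a single array lookup: Theta(1) time."""
    return a[u][v] == 1


def matrix_all_edges(a):
    """List every edge once by scanning the upper triangle: Theta(n^2) time."""
    n = len(a) - 1
    return [(u, v)
            for u in range(1, n + 1)
            for v in range(u + 1, n + 1)
            if a[u][v] == 1]


A = build_adjacency_matrix(8, EDGES)
print(matrix_has_edge(A, 3, 7))  # True
print(matrix_has_edge(A, 1, 4))  # False
```

The scan in `matrix_all_edges` touches every cell above the diagonal even when the graph has few edges, which is the $\Theta(n^2)$ cost noted above.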

Graph Representation: Adjacency List
Adjacency list. Node-indexed array of lists, where n = number of vertices and m = number of edges.
- Two representations of each edge.
- Space proportional to m + n.
- Checking if (u, v) is an edge takes O(deg(u)) time, where deg(u) = number of neighbours of u. (Checking k pairs (u, v) may cost up to Θ(kn) time.)
- Identifying all edges takes Θ(m + n) time.
 1: 2, 3
 2: 1, 3, 4, 5
 3: 1, 2, 5, 7, 8
 4: 2, 5
 5: 2, 3, 4, 6
 6: 5
 7: 3, 8
 8: 3, 7
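
A minimal Python sketch of the adjacency list for the same example graph. A dictionary of lists stands in for the node-indexed array, and the function names are illustrative rather than taken from the slides.

```python
# Edge list of the 8-node example graph from Section 3.1.
EDGES = [(1, 2), (1, 3), (2, 3), (2, 4), (2, 5), (3, 5),
         (3, 7), (3, 8), (4, 5), (5, 6), (7, 8)]


def build_adjacency_list(n, edges):
    """Return {node: list of neighbours} for nodes 1..n."""
    adj = {u: [] for u in range(1, n + 1)}
    for u, v in edges:
        adj[u].append(v)  # each undirected edge is represented twice
        adj[v].append(u)
    return adj


def list_has_edge(adj, u, v):
    """Edge test by scanning u's list: O(deg(u)) time."""
    return v in adj[u]


def list_all_edges(adj):
    """List every edge once: Theta(m + n) time over all lists."""
    return [(u, v) for u in adj for v in adj[u] if u < v]


ADJ = build_adjacency_list(8, EDGES)
print(ADJ[3])  # [1, 2, 5, 7, 8]
```

The total length of all lists is 2m (22 for this graph), which is where the m + n space bound comes from.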

Paths and Connectivity
Def. A path in an undirected graph G = (V, E) is a sequence P of nodes $v_1, v_2, \ldots, v_{k-1}, v_k$ with the property that each consecutive pair $v_i, v_{i+1}$ is joined by an edge in E.
Def. A path is simple if all nodes are distinct.
Def. An undirected graph is connected if for every pair of nodes u and v, there is a path between u and v.

Cycles
Def. A cycle is a path $v_1, v_2, \ldots, v_{k-1}, v_k$ in which $v_1 = v_k$, k > 2, and the first k-1 nodes are all distinct.
cycle C = 1-2-4-5-3-1

Trees
Def. An undirected graph is a tree if it is connected and does not contain a cycle.
Theorem. Let G be an undirected graph on n nodes. Any two of the following statements imply the third.
- G is connected.
- G does not contain a cycle.
- G has n-1 edges.
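
One way to use the theorem in practice: to test whether a graph is a tree it is enough to check two of the three statements, for instance "connected" and "n-1 edges". The sketch below assumes the graph is given as a dictionary mapping every node to its neighbour list; `is_tree` is a hypothetical helper name.

```python
def is_tree(adj):
    """Return True iff the undirected graph adj is connected and has n-1 edges."""
    n = len(adj)
    if n == 0:
        return False
    # Every undirected edge appears in two neighbour lists.
    m = sum(len(neighbours) for neighbours in adj.values()) // 2
    if m != n - 1:
        return False
    # Connectivity check: explore everything reachable from an arbitrary start.
    start = next(iter(adj))
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == n


path = {1: [2], 2: [1, 3], 3: [2]}
triangle = {1: [2, 3], 2: [1, 3], 3: [1, 2]}
print(is_tree(path))      # True
print(is_tree(triangle))  # False
```

By the theorem, a graph that passes both checks cannot contain a cycle, so no separate cycle search is needed.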

Rooted Trees
Rooted tree. Given a tree T, choose a root node r and orient each edge away from r.
Importance. Models hierarchical structure.
[Figure: a tree, and the same tree rooted at 1, with labels for the root r, the parent of v, the node v, and a child of v]

3.2 Graph Traversal

Connectivity
s-t connectivity problem. Given two nodes s and t, is there a path between s and t?
s-t shortest path problem. Given two nodes s and t, what is the length (number of edges) of the shortest path between s and t?
Applications.
- Facebook.
- Maze traversal.
- Erdős number.
- Fewest number of hops in a communication network.

Breadth First Search
BFS intuition. Explore outward from s in all possible directions, adding nodes one "layer" at a time.
BFS algorithm.
- $L_0 = \{ s \}$.
- $L_1$ = all neighbours of $L_0$.
- $L_2$ = all nodes that do not belong to $L_0$ or $L_1$, and that have an edge to a node in $L_1$.
- $L_{i+1}$ = all nodes that do not belong to an earlier layer, and that have an edge to a node in $L_i$.
Theorem. For each i, $L_i$ consists of all nodes at distance exactly i from s. There is a path from s to t iff t appears in some layer.
[Figure: layers $L_0 = \{ s \}, L_1, L_2, \ldots, L_{n-1}$]
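
The layer-by-layer description can be sketched in Python as follows. This is one possible implementation under the assumption that the graph is a dictionary of neighbour lists; `bfs_layers` and `shortest_path_length` are illustrative names, and the graph is the 8-node example from Section 3.1.

```python
# Adjacency lists of the 8-node example graph from Section 3.1.
ADJ = {
    1: [2, 3],
    2: [1, 3, 4, 5],
    3: [1, 2, 5, 7, 8],
    4: [2, 5],
    5: [2, 3, 4, 6],
    6: [5],
    7: [3, 8],
    8: [3, 7],
}


def bfs_layers(adj, s):
    """Return the BFS layers [L_0, L_1, ...] of the nodes reachable from s."""
    layers = [[s]]
    discovered = {s}
    while layers[-1]:
        next_layer = []
        for u in layers[-1]:
            for v in adj[u]:
                if v not in discovered:  # v belongs to no earlier layer
                    discovered.add(v)
                    next_layer.append(v)
        layers.append(next_layer)
    layers.pop()  # drop the final, empty layer
    return layers


def shortest_path_length(adj, s, t):
    """Return the number of edges on a shortest s-t path, or None if none exists."""
    for i, layer in enumerate(bfs_layers(adj, s)):
        if t in layer:
            return i
    return None


print(bfs_layers(ADJ, 1))               # [[1], [2, 3], [4, 5, 7, 8], [6]]
print(shortest_path_length(ADJ, 1, 6))  # 3
```

Each node is added to `discovered` once and each neighbour list is scanned once, so the work is proportional to m + n.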

Breadth First Search
Property. Let T be a BFS tree of G = (V, E), and let (x, y) be an edge of G. Then the levels of x and y differ by at most 1.
[Figure: BFS tree with layers $L_0, L_1, L_2, L_3$]

Breadth First Search: Analysis
Theorem. BFS, implemented with one list L[i] per layer, runs in O(m + n) time if the graph is given by its adjacency list representation.
Pf. Easy to prove $O(n^2)$ running time:
- at most n lists L[i]
- each node occurs on at most one list
- when we consider node u, there are at most n incident edges (u, v), and we spend O(1) processing each edge
Actually runs in O(m + n) time:
- when we consider node u, there are deg(u) incident edges (u, v)
- total time processing edges is $\sum_{u \in V} \deg(u) = 2m$, since each edge (u, v) is counted exactly twice in the sum: once in deg(u) and once in deg(v)
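
As a concrete check of the degree-sum step, take the 8-node example graph from Section 3.1, which has m = 11 edges. Reading the degrees of nodes 1 through 8 off its edge set gives:

```latex
\begin{align*}
\sum_{u \in V} \deg(u)
  &= \deg(1) + \deg(2) + \cdots + \deg(8) \\
  &= 2 + 4 + 5 + 2 + 4 + 1 + 2 + 2 \\
  &= 22 = 2 \cdot 11 = 2m .
\end{align*}
```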

Connected Component
Connected component. Find all nodes reachable from s.
Connected component containing node 1 = { 1, 2, 3, 4, 5, 6, 7, 8 }.

Connected Component
Connected component. Find all nodes reachable from s.
[Figure: node s inside the set R, and an edge (u, v); caption: it's safe to add v]
Theorem. Upon termination, R is the connected component containing s.
- BFS = explore in order of distance from s.
- DFS = explore in a different way.
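
A hedged sketch of growing the set R: start with R = { s } and keep adding any node v that has an edge from a node u already in R. The stack-based exploration order below is one choice among many (it happens to behave like DFS). The nodes 9, 10 and 11 are hypothetical additions to the example graph, included only so that there is something outside the component of node 1.

```python
# The 8-node example graph plus hypothetical nodes 9, 10 (joined by an edge)
# and 11 (isolated), added for illustration only.
ADJ = {
    1: [2, 3],
    2: [1, 3, 4, 5],
    3: [1, 2, 5, 7, 8],
    4: [2, 5],
    5: [2, 3, 4, 6],
    6: [5],
    7: [3, 8],
    8: [3, 7],
    9: [10],
    10: [9],
    11: [],
}


def connected_component(adj, s):
    """Return the set R of all nodes reachable from s."""
    R = {s}
    to_explore = [s]  # nodes in R whose edges have not been scanned yet
    while to_explore:
        u = to_explore.pop()
        for v in adj[u]:
            if v not in R:  # edge (u, v) with u in R: it is safe to add v
                R.add(v)
                to_explore.append(v)
    return R


print(connected_component(ADJ, 1) == {1, 2, 3, 4, 5, 6, 7, 8})  # True
print(connected_component(ADJ, 9) == {9, 10})                   # True
```

Replacing the stack with a queue would visit nodes in order of distance from s, which is the BFS variant.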

3.4 Testing Bipartiteness

Bipartite Graphs
Def. An undirected graph G = (V, E) is bipartite if the nodes can be coloured red or blue such that every edge has one red and one blue end.
Applications.
- Stable marriage: men = red, women = blue.
- Scheduling: machines = red, jobs = blue.
[Figure: a bipartite graph]

Testing Bipartiteness
Testing bipartiteness. Given a graph G, is it bipartite?
- Many graph problems become:
  - easier if the underlying graph is bipartite (matching)
  - tractable if the underlying graph is bipartite (independent set)
- Before attempting to design an algorithm, we need to understand the structure of bipartite graphs.
[Figure: a bipartite graph G on nodes $v_1, \ldots, v_7$, and another drawing of G]

An Obstruction to Bipartiteness
Lemma. If a graph G is bipartite, it cannot contain an odd length cycle.
Pf. Not possible to 2-colour the odd cycle, let alone G.
[Figure: a bipartite (2-colourable) graph and a non-bipartite (not 2-colourable) graph]

Bipartite Graphs
Lemma. Let G be a connected graph, and let $L_0, \ldots, L_k$ be the layers produced by BFS starting at node s. Exactly one of the following holds.
(i) No edge of G joins two nodes of the same layer, and G is bipartite.
(ii) An edge of G joins two nodes of the same layer, and G contains an odd-length cycle (and hence is not bipartite).
Pf. (i)
- Suppose no edge joins two nodes in the same layer.
- By the BFS property (the levels of the two ends of any edge differ by at most 1), this implies all edges join nodes on adjacent layers.
- Bipartition: red = nodes on odd levels, blue = nodes on even levels.
[Figure: Case (i), layers $L_1, L_2, L_3$]

Bipartite Graphs
Pf. (ii) of the lemma above.
- Suppose (x, y) is an edge with x, y in the same level $L_j$.
- Let z = lca(x, y) = lowest common ancestor of x and y in the BFS tree, and let $L_i$ be the level containing z.
- Consider the cycle that takes the edge from x to y, then the path from y to z, then the path from z to x, where both paths use only edges of the BFS tree.
- Its length is 1 + (j-i) + (j-i) = 2(j-i) + 1, which is odd.
[Figure: Case (ii), showing the edge (x, y), z = lca(x, y), the path from y to z, and the path from z to x]
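
The lemma suggests a test: run BFS from any node, then look for an edge whose two ends lie in the same layer. The sketch below assumes, as the lemma does, that the graph is connected and given as a dictionary of neighbour lists. The name `two_colouring` and the return convention (a colouring dictionary, or None when the graph is not bipartite) are illustrative choices.

```python
from collections import deque


def two_colouring(adj, s):
    """Return {node: 'red' or 'blue'} if the connected graph adj is bipartite, else None."""
    # BFS from s, recording the layer index of every node.
    level = {s: 0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in level:
                level[v] = level[u] + 1
                queue.append(v)
    # Case (ii): an edge inside one layer closes an odd-length cycle.
    for u in adj:
        for v in adj[u]:
            if level[u] == level[v]:
                return None
    # Case (i): odd levels red, even levels blue.
    return {u: 'red' if level[u] % 2 == 1 else 'blue' for u in adj}


four_cycle = {1: [2, 4], 2: [1, 3], 3: [2, 4], 4: [3, 1]}
five_cycle = {1: [2, 5], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4, 1]}
print(two_colouring(four_cycle, 1))  # {1: 'blue', 2: 'red', 3: 'blue', 4: 'red'}
print(two_colouring(five_cycle, 1))  # None
```

For a disconnected graph the same routine would have to be run once per connected component; that extension is not shown here.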

Obstruction to Bipartiteness
Corollary. A graph G is bipartite iff it contains no odd length cycle.
[Figure: a bipartite (2-colourable) graph, and a non-bipartite (not 2-colourable) graph containing a 5-cycle C]

3.5 Connectivity in Directed Graphs

Directed Graphs
Directed graph. G = (V, E)
- Edge (u, v) goes from node u to node v.
Ex. Web graph: hyperlink points from one web page to another.
- Directedness of graph is crucial.
- Modern web search engines exploit hyperlink structure to rank web pages by importance.

Graph Search
Directed reachability. Given a node s, find all nodes reachable from s.
Directed s-t shortest path problem. Given two nodes s and t, what is the length of the shortest path from s to t?
Graph search. BFS extends naturally to directed graphs.
Web crawler. Start from web page s. Find all web pages linked from s, either directly or indirectly.

Strong Connectivity
Def. Nodes u and v are mutually reachable if there is a path from u to v and also a path from v to u.
Def. A graph is strongly connected if every pair of nodes is mutually reachable.
Lemma. Let s be any node. G is strongly connected iff every node is reachable from s, and s is reachable from every node.
Pf. (⇒) Follows from definition.
Pf. (⇐) Path from u to v: concatenate u-s path with s-v path. Path from v to u: concatenate v-s path with s-u path. (This is ok even if the paths overlap.)
[Figure: nodes s, u, v with the connecting paths]

Strong Connectivity: Algorithm
Theorem. Can determine if G is strongly connected in O(m + n) time.
Pf.
- Pick any node s.
- Run BFS from s in G.
- Run BFS from s in $G^{rev}$, the graph obtained by reversing the orientation of every edge in G.
- Return true iff all nodes reached in both BFS executions.
- Correctness follows immediately from previous lemma.
[Figure: a strongly connected graph and a graph that is not strongly connected]
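
The two-BFS test can be sketched as below. It assumes a directed graph stored as a dictionary that maps every node (including nodes with no outgoing edges) to the list of nodes it points to. The helper names `reachable`, `reverse_graph` and `is_strongly_connected` are illustrative.

```python
from collections import deque


def reachable(adj, s):
    """Return the set of nodes reachable from s by directed BFS."""
    seen = {s}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen


def reverse_graph(adj):
    """Return G^rev: the same nodes with every edge (u, v) replaced by (v, u)."""
    rev = {u: [] for u in adj}
    for u in adj:
        for v in adj[u]:
            rev[v].append(u)
    return rev


def is_strongly_connected(adj):
    """True iff every node is reachable from s and s is reachable from every node."""
    if not adj:
        return True
    s = next(iter(adj))  # any node works, by the lemma
    n = len(adj)
    return (len(reachable(adj, s)) == n
            and len(reachable(reverse_graph(adj), s)) == n)


directed_cycle = {1: [2], 2: [3], 3: [1]}
directed_path = {1: [2], 2: [3], 3: []}
print(is_strongly_connected(directed_cycle))  # True
print(is_strongly_connected(directed_path))   # False
```

Building the reversed graph and running each BFS all take O(m + n) time, matching the bound in the theorem.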

3.6 DAGs and Topological Ordering

Directed Acyclic Graphs
Def. A DAG is a directed graph that contains no directed cycles.
Ex. Precedence constraints: edge $(v_i, v_j)$ means $v_i$ must precede $v_j$.
Def. A topological order of a directed graph G = (V, E) is an ordering of its nodes as $v_1, v_2, \ldots, v_n$ so that for every edge $(v_i, v_j)$ we have i < j.
[Figure: a DAG on nodes $v_1, \ldots, v_7$, and a topological ordering $v_1, v_2, v_3, v_4, v_5, v_6, v_7$]

Precedence Constraints
Precedence constraints. Edge $(v_i, v_j)$ means task $v_i$ must occur before $v_j$.
Applications.
- Course prerequisite graph: course $v_i$ must be taken before $v_j$.
- Compilation: module $v_i$ must be compiled before $v_j$.
- Pipeline of computing jobs: output of job $v_i$ needed to determine input of job $v_j$.

Directed Acyclic Graphs
Lemma. If G has a topological order, then G is a DAG.
Pf. (by contradiction)
- Suppose that G has a topological order $v_1, \ldots, v_n$ and that G also has a directed cycle C. Let's see what happens.
- Let $v_i$ be the lowest-indexed node in C, and let $v_j$ be the node just before $v_i$ in C; thus $(v_j, v_i)$ is an edge.
- By our choice of i, we have i < j.
- On the other hand, since $(v_j, v_i)$ is an edge and $v_1, \ldots, v_n$ is a topological order, we must have j < i, a contradiction.
[Figure: the directed cycle C through $v_i$ and $v_j$, drawn on the supposed topological order $v_1, \ldots, v_n$]

Directed Acyclic Graphs
Lemma. If G has a topological order, then G is a DAG.
Q. Does every DAG have a topological ordering?
Q. If so, how do we compute one?

Directed Acyclic Graphs
Lemma. If G is a DAG, then G has a node with no incoming edges.
Pf. (by contradiction)
- Suppose that G is a DAG and every node has at least one incoming edge. Let's see what happens.
- Pick any node v, and begin following edges backward from v. Since v has at least one incoming edge (u, v) we can walk backward to u.
- Then, since u has at least one incoming edge (x, u), we can walk backward to x.
- Repeat until we visit a node, say w, twice. (This must happen because G has only finitely many nodes.)
- Let C denote the sequence of nodes encountered between successive visits to w. C is a cycle, contradicting the assumption that G is a DAG.
[Figure: the backward walk through nodes v, u, x, and w]

Directed Acyclic Graphs
Lemma. If G is a DAG, then G has a topological ordering.
Pf. (by induction on n)
- Base case: true if n = 1.
- Given DAG on n > 1 nodes, find a node v with no incoming edges.
- G - { v } is a DAG, since deleting v cannot create cycles.
- By inductive hypothesis, G - { v } has a topological ordering.
- Place v first in topological ordering; then append nodes of G - { v } in topological order. This is valid since v has no incoming edges.
[Figure: a DAG with a node v that has no incoming edges]

Topological Ordering Algorithm: Example
Step 0. Remaining nodes: $v_1, v_2, v_3, v_4, v_5, v_6, v_7$. Topological order: (empty)
Step 1. Remaining nodes: $v_2, v_3, v_4, v_5, v_6, v_7$. Topological order: $v_1$
Step 2. Remaining nodes: $v_3, v_4, v_5, v_6, v_7$. Topological order: $v_1, v_2$
Step 3. Remaining nodes: $v_4, v_5, v_6, v_7$. Topological order: $v_1, v_2, v_3$
Step 4. Remaining nodes: $v_5, v_6, v_7$. Topological order: $v_1, v_2, v_3, v_4$
Step 5. Remaining nodes: $v_6, v_7$. Topological order: $v_1, v_2, v_3, v_4, v_5$
Step 6. Remaining nodes: $v_7$. Topological order: $v_1, v_2, v_3, v_4, v_5, v_6$
Step 7. No nodes remain. Topological order: $v_1, v_2, v_3, v_4, v_5, v_6, v_7$.

Topological Sorting Algorithm: Running Time
Theorem. Algorithm finds a topological order in O(m + n) time.
Pf.
- Maintain the following information:
  - count[w] = number of remaining incoming edges, for each node w
  - S = set of remaining nodes with no incoming edges
- Initialization: O(m + n) via single scan through graph.
- Update: to delete v
  - remove v from S
  - decrement count[w] for all edges from v to w, and add w to S if count[w] hits 0
  - this is O(1) per edge
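
The bookkeeping in the proof can be sketched in Python as follows, using the same names `count` and `S`. The graph is assumed to be a dictionary mapping every node to the list of nodes it points to. The small DAG `dag` is a made-up example, not the one from the slide figure, and returning None on a cyclic input is an added convention.

```python
def topological_order(adj):
    """Return a topological order of the directed graph adj, or None if it has a cycle."""
    # Initialization: one scan through the graph, O(m + n).
    count = {w: 0 for w in adj}
    for v in adj:
        for w in adj[v]:
            count[w] += 1
    S = [v for v in adj if count[v] == 0]

    order = []
    while S:
        v = S.pop()           # remove v from S and delete it from the graph
        order.append(v)
        for w in adj[v]:      # O(1) work per outgoing edge
            count[w] -= 1
            if count[w] == 0:
                S.append(w)
    # If some node was never freed, the remaining nodes contain a directed cycle.
    return order if len(order) == len(adj) else None


# Hypothetical DAG used only for illustration.
dag = {'a': ['b', 'c'], 'b': ['d'], 'c': ['d'], 'd': []}
order = topological_order(dag)
position = {v: i for i, v in enumerate(order)}
print(all(position[v] < position[w] for v in dag for w in dag[v]))  # True
print(topological_order({1: [2], 2: [1]}))                          # None
```

A DAG can have several valid topological orders, so the usage example checks the defining property (every edge goes forward in the order) rather than one specific sequence.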