Graph Representation
  Adjacency-list representation of G = (V, E):
    An array Adj of |V| lists, one for each vertex in V
    Each list Adj[u] contains all the vertices v such that there is an edge between u and v, i.e. the vertices adjacent to u (in arbitrary order)
    Can be used for both directed and undirected graphs
  Example (undirected graph on vertices 1 to 5):
    Adj[1]: 2 → 5 /
    Adj[2]: 1 → 5 → 3 → 4 /
    Adj[3]: 2 → 4 /
    Adj[4]: 2 → 5 → 3 /
    Adj[5]: 4 → 1 → 2 /
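The adjacency lists on this slide can be written down directly. The sketch below is one possible encoding, assuming a Python dict of lists stands in for the array Adj; the names `adj` and `neighbors` are illustrative and not part of the slides.

```python
# Adjacency lists of the 5-vertex undirected example graph.
# A dict keyed by vertex number plays the role of the array Adj[1..5].
adj = {
    1: [2, 5],
    2: [1, 5, 3, 4],
    3: [2, 4],
    4: [2, 5, 3],
    5: [4, 1, 2],
}

def neighbors(u):
    """Return the vertices adjacent to u, in their stored (arbitrary) order."""
    return adj[u]

print(neighbors(2))  # [1, 5, 3, 4]
```

Because the graph is undirected, every vertex appears in the list of each of its neighbors.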
Properties of Adjacency-List Representation
  Sum of the lengths of all the adjacency lists:
    Directed graph: |E|
      Edge (u, v) appears only once, in u's list
    Undirected graph: 2|E|
      u and v appear in each other's adjacency lists, so edge (u, v) appears twice
  [Figures: a directed graph on vertices 1 to 4; the undirected graph on vertices 1 to 5]
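A quick check of the two sums, as a sketch. The undirected graph is the 5-vertex example from the slides; the directed graph is a hypothetical 4-vertex example, since the edges of the slide's directed figure are not recoverable from the text.

```python
undirected = {1: [2, 5], 2: [1, 5, 3, 4], 3: [2, 4], 4: [2, 5, 3], 5: [4, 1, 2]}
directed = {1: [2, 4], 2: [4], 3: [], 4: [3]}  # hypothetical directed graph

def total_list_length(adj):
    """Sum of the lengths of all adjacency lists."""
    return sum(len(nbrs) for nbrs in adj.values())

# Each undirected edge {u, v} is stored twice; a frozenset collapses the two copies.
undirected_edges = {frozenset((u, v)) for u in undirected for v in undirected[u]}
directed_edges = {(u, v) for u in directed for v in directed[u]}

print(total_list_length(undirected), len(undirected_edges))  # 14 7
print(total_list_length(directed), len(directed_edges))      # 4 4
```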
Properties of Adjacency-List Representation
  Memory required: Θ(V + E)
  Preferred when the graph is sparse: |E| ≪ |V|²
  Disadvantage: no quick way to determine whether there is an edge between vertices u and v
  Time to list all vertices adjacent to u: Θ(degree(u))
  Time to determine whether (u, v) ∈ E: O(degree(u))
Graph Representation
  Adjacency-matrix representation of G = (V, E):
    Assume vertices are numbered 1, 2, ..., |V|
    The representation consists of a |V| × |V| matrix A = (a_ij), where
      a_ij = 1 if (i, j) ∈ E
      a_ij = 0 otherwise
  Example (undirected graph on vertices 1 to 5):
          1  2  3  4  5
      1   0  1  0  0  1
      2   1  0  1  1  1
      3   0  1  0  1  0
      4   0  1  1  0  1
      5   1  1  0  1  0
  For an undirected graph, matrix A is symmetric: a_ij = a_ji, i.e. A = A^T
Properties of Adjacency Matrix Representation
  Memory required: Θ(V²), independent of the number of edges in G
  Preferred when:
    The graph is dense: |E| is close to |V|²
    We need to quickly determine whether there is an edge between two vertices
  Time to list all vertices adjacent to u: Θ(V)
  Time to determine whether (u, v) ∈ E: Θ(1)
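A minimal sketch of the matrix representation for the 5-vertex undirected example, illustrating the constant-time edge test and the Θ(V) neighbor scan. The edge list is read off the example matrix; the function names are illustrative.

```python
n = 5
edges = [(1, 2), (1, 5), (2, 3), (2, 4), (2, 5), (3, 4), (4, 5)]

# A[i-1][j-1] holds a_ij (Python lists are 0-indexed, the slides number from 1).
A = [[0] * n for _ in range(n)]
for u, v in edges:
    A[u - 1][v - 1] = 1
    A[v - 1][u - 1] = 1  # undirected: store both directions

def has_edge(u, v):
    """Edge test by a single array lookup."""
    return A[u - 1][v - 1] == 1

def neighbors(u):
    """Listing neighbors requires scanning a whole row of length |V|."""
    return [v for v in range(1, n + 1) if A[u - 1][v - 1] == 1]

is_symmetric = A == [list(col) for col in zip(*A)]  # A equals its transpose
print(has_edge(1, 5), has_edge(1, 3), neighbors(2), is_symmetric)
```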
Weighted Graphs
  Weighted graphs = graphs in which each edge (u, v) has an associated weight w(u, v)
    w: E → R, the weight function
  Storing the weights of a graph:
    Adjacency list: store w(u, v) along with vertex v in u's adjacency list
    Adjacency matrix: store w(u, v) at location (u, v) in the matrix
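Both storage options can be sketched as follows. The three-vertex graph and its weights are hypothetical, and using infinity to mark "no edge" in the matrix is a common convention assumed here, not something the slides specify.

```python
import math

# Adjacency lists: each entry pairs a neighbor v with the weight w(u, v).
w_adj = {
    "a": [("b", 2.5), ("c", 1.0)],
    "b": [("c", 4.0)],
    "c": [],
}

# Adjacency matrix: W[i][j] holds w(u, v); math.inf marks a missing edge (assumed convention).
index = {"a": 0, "b": 1, "c": 2}
W = [[math.inf] * len(index) for _ in index]
for u, nbrs in w_adj.items():
    for v, weight in nbrs:
        W[index[u]][index[v]] = weight

print(W[index["a"]][index["c"]])  # 1.0
```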
Searching in a Graph
  Graph searching = systematically following the edges of the graph so as to visit its vertices
  Two basic graph-searching algorithms:
    Breadth-first search
    Depth-first search
  They differ in the order in which they explore the unvisited edges of the graph
  Graph algorithms are typically elaborations of these basic graph-searching algorithms
Breadth-First Search (BFS)
  Input:
    A graph G = (V, E) (directed or undirected)
    A source vertex s ∈ V
  Goal:
    Explore the edges of G to discover every vertex reachable from s, taking the ones closest to s first
  Output:
    d[v] = distance (smallest number of edges) from s to v, for all v ∈ V
    A breadth-first tree rooted at s that contains all reachable vertices
Breadth-First Search (cont.)
  Discover vertices in increasing order of distance from the source s: search in breadth, not in depth
  Find all vertices at 1 edge from s, then all vertices at 2 edges from s, and so on
  [Figure: example graph with vertices labeled in order of discovery]
Breadth-First Search (cont.)
  Keeping track of progress:
    Color each vertex white, gray, or black
    Initially, all vertices are white
    When a vertex is discovered, it becomes gray
    After all its adjacent vertices have been discovered, the vertex becomes black
    Use a FIFO queue Q to maintain the set of gray vertices
  [Figure: three snapshots of BFS from source vertex 1 on the graph with vertices 1 to 5]
Breadth-First Tree
  BFS constructs a breadth-first tree
    Initially it contains only the root (source vertex s)
    When vertex v is discovered while scanning the adjacency list of a vertex u, vertex v and edge (u, v) are added to the tree
    u is the predecessor (parent) of v in the breadth-first tree
    A vertex is discovered only once ⇒ it has at most one parent
  [Figure: breadth-first tree rooted at source vertex 1 on the graph with vertices 1 to 5]
BFS Additional Data Structures
  G = (V, E) is represented using adjacency lists
  color[u]: the color of vertex u, for all u ∈ V
  π[u]: the predecessor of u
    If u = s (root) or u has not yet been discovered, then π[u] = NIL
  d[u]: the distance from the source s to vertex u
  Use a FIFO queue Q to maintain the set of gray vertices
  [Figure: graph with vertices 1 to 5 and source vertex 1, each discovered vertex annotated with its d and π values]
BFS(G, s)
  for each u ∈ V[G] - {s}
      do color[u] ← WHITE
         d[u] ← ∞
         π[u] ← NIL
  color[s] ← GRAY
  d[s] ← 0
  π[s] ← NIL
  Q ← ∅
  Q ← ENQUEUE(Q, s)
  [Figure: graph on vertices r, s, t, u, v, w, x, y after initialization; d[s] = 0, Q: s]
BFS(G, s) (cont.)
  while Q ≠ ∅
      do u ← DEQUEUE(Q)
         for each v ∈ Adj[u]
             do if color[v] = WHITE
                   then color[v] ← GRAY
                        d[v] ← d[u] + 1
                        π[v] ← u
                        ENQUEUE(Q, v)
         color[u] ← BLACK
  [Figure: first iteration on the example graph; s is dequeued, then w and r are discovered with d = 1. Queue: s, then w, then w, r]
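The pseudocode translates almost line for line into Python. This is a sketch, assuming a dict-of-lists graph and `collections.deque` as the FIFO queue. The edge set of the example graph is an assumption: it was chosen to be consistent with the distances and queue contents shown in the slides, because the figure itself did not survive.

```python
from collections import deque
import math

def bfs(adj, s):
    """Breadth-first search from s; returns (d, pi) as dicts keyed by vertex."""
    color = {u: "WHITE" for u in adj}
    d = {u: math.inf for u in adj}
    pi = {u: None for u in adj}   # None plays the role of NIL
    color[s] = "GRAY"
    d[s] = 0
    q = deque([s])                # FIFO queue of gray vertices
    while q:
        u = q.popleft()
        for v in adj[u]:
            if color[v] == "WHITE":
                color[v] = "GRAY"
                d[v] = d[u] + 1
                pi[v] = u
                q.append(v)
        color[u] = "BLACK"
    return d, pi

# Assumed edges for the example graph on r, s, t, u, v, w, x, y.
adj = {
    "r": ["s", "v"], "s": ["w", "r"], "t": ["w", "x", "u"], "u": ["t", "x", "y"],
    "v": ["r"], "w": ["s", "t", "x"], "x": ["w", "t", "u", "y"], "y": ["x", "u"],
}
d, pi = bfs(adj, "s")
print(d)
```

With these adjacency lists the computed distances match the slide's final picture: r = 1, s = 0, t = 2, u = 3, v = 2, w = 1, x = 2, y = 3.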
Example
  [Figure: BFS from source s on the graph with vertices r, s, t, u (top row) and v, w, x, y (bottom row), one snapshot per step]
  Queue contents after each step:
    Q: s
    Q: w, r
    Q: r, t, x
    Q: t, x, v
    Q: x, v, u
    Q: v, u, y
    Q: u, y
    Q: y
    Q: ∅
  Final distances: d[r] = 1, d[s] = 0, d[t] = 2, d[u] = 3, d[v] = 2, d[w] = 1, d[x] = 2, d[y] = 3
  COSC3101A
Analysis of BFS
  for each u ∈ V - {s}                 O(V)
      do color[u] ← WHITE
         d[u] ← ∞
         π[u] ← NIL
  color[s] ← GRAY                      Θ(1)
  d[s] ← 0
  π[s] ← NIL
  Q ← ∅
  Q ← ENQUEUE(Q, s)
Analysis of BFS (cont.)
  while Q ≠ ∅
      do u ← DEQUEUE(Q)                        Θ(1)
         for each v ∈ Adj[u]                    scan Adj[u]
             do if color[v] = WHITE
                   then color[v] ← GRAY
                        d[v] ← d[u] + 1
                        π[v] ← u
                        ENQUEUE(Q, v)          Θ(1)
         color[u] ← BLACK
  Queue operations: each vertex is enqueued and dequeued at most once, at Θ(1) per operation: O(V)
  Scanning Adj[u] for all vertices in the graph:
    Each adjacency list is scanned only once, when its vertex is dequeued
    Sum of the lengths of all adjacency lists = Θ(E)
    Scanning operations: O(E)
  Total running time for BFS = O(V + E)
Shortest Paths Property
  BFS finds the shortest-path distance from the source vertex s ∈ V to each vertex in the graph
  Shortest-path distance = δ(s, u)
    Minimum number of edges in any path from s to u
  [Figure: example graph with source s; d values r = 1, s = 0, t = 2, u = 3, v = 2, w = 1, x = 2, y = 3]
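Beyond the distance itself, an actual shortest path can be read off the predecessor values by walking from the target back to the source. The sketch below assumes the same example edges as before (reconstructed, not taken from the lost figure); `bfs_parents` and `shortest_path` are illustrative names.

```python
from collections import deque

def bfs_parents(adj, s):
    """Run BFS from s and return the predecessor map pi (reachable vertices only)."""
    pi = {s: None}
    q = deque([s])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in pi:       # v is still undiscovered
                pi[v] = u
                q.append(v)
    return pi

def shortest_path(pi, v):
    """Follow predecessors from v back to the source; None if v is unreachable."""
    if v not in pi:
        return None
    path = []
    while v is not None:
        path.append(v)
        v = pi[v]
    return path[::-1]

adj = {
    "r": ["s", "v"], "s": ["w", "r"], "t": ["w", "x", "u"], "u": ["t", "x", "y"],
    "v": ["r"], "w": ["s", "t", "x"], "x": ["w", "t", "u", "y"], "y": ["x", "u"],
}
pi = bfs_parents(adj, "s")
print(shortest_path(pi, "y"))  # ['s', 'w', 'x', 'y']
```

The path has 3 edges, which agrees with δ(s, y) = 3 in the example.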
Depth-First Search
  Input:
    G = (V, E) (no source vertex given!)
  Goal:
    Explore the edges of G to discover every vertex in V, always continuing from the most recently discovered vertex
    The search may be repeated from multiple sources
  Output:
    Two timestamps on each vertex:
      d[v] = discovery time
      f[v] = finishing time (done examining v's adjacency list)
    A depth-first forest
Depth-First Search
  Search deeper in the graph whenever possible
  Edges are explored out of the most recently discovered vertex v that still has unexplored edges
  After all edges of v have been explored, the search backtracks to the parent of v
  The process continues until all vertices reachable from the original source have been discovered
  If undiscovered vertices remain, choose one of them as a new source and repeat the search from that vertex
  DFS creates a depth-first forest
DFS Additional Data Structures
  Global variable: time (the time step)
    Incremented when vertices are discovered or finished
  color[u]: similar to BFS
    White before discovery, gray while being processed, black when finished
  π[u]: predecessor of u
  d[u], f[u]: discovery and finishing times
    1 ≤ d[u] < f[u] ≤ 2|V|
  Timeline from 0 to 2|V|: u is WHITE before d[u], GRAY between d[u] and f[u], BLACK after f[u]
DFS(G)
  for each u ∈ V[G]
      do color[u] ← WHITE
         π[u] ← NIL
  time ← 0
  for each u ∈ V[G]
      do if color[u] = WHITE
            then DFS-VISIT(u)
  Every time DFS-VISIT(u) is called from DFS, u becomes the root of a new tree in the depth-first forest
DFS-VISIT(u)
  color[u] ← GRAY
  time ← time + 1
  d[u] ← time
  for each v ∈ Adj[u]
      do if color[v] = WHITE
            then π[v] ← u
                 DFS-VISIT(v)
  color[u] ← BLACK
  time ← time + 1
  f[u] ← time
  [Figure: first steps of DFS; the first vertex is discovered at time = 1 (timestamp 1/), the next at time 2 (timestamp 2/)]
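DFS and DFS-VISIT can be sketched in Python with a nested function sharing the `time` counter. The six-vertex directed graph below is an assumption: its edges were chosen so that the resulting timestamps agree with those in the example slides (1/8, 2/7, 3/6, 4/5, 9/12, 10/11), since the figure's edges are not recoverable from the text. Vertex names are illustrative.

```python
def dfs(adj):
    """Depth-first search over all vertices; returns (d, f, pi)."""
    color = {u: "WHITE" for u in adj}
    pi = {u: None for u in adj}
    d, f = {}, {}
    time = 0

    def visit(u):
        nonlocal time
        color[u] = "GRAY"
        time += 1
        d[u] = time               # discovery time
        for v in adj[u]:
            if color[v] == "WHITE":
                pi[v] = u
                visit(v)
        color[u] = "BLACK"
        time += 1
        f[u] = time               # finishing time

    for u in adj:                 # restart from every still-white vertex
        if color[u] == "WHITE":
            visit(u)
    return d, f, pi

# Assumed example graph (directed).
adj = {"u": ["v", "x"], "v": ["y"], "w": ["y", "z"], "x": ["v"], "y": ["x"], "z": ["z"]}
d, f, pi = dfs(adj)
print({v: f"{d[v]}/{f[v]}" for v in adj})
```

The recursion depth equals the depth of the DFS tree, so for very deep graphs an explicit stack would be needed in Python.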
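Example
  [Figure: successive snapshots of DFS on a directed graph. Vertices are stamped with discovery/finishing times as the search proceeds (1/, 2/, 3/, 4/, then 4/5, 3/6, 2/7), and non-tree edges are labeled B (back) and F (forward) as they are explored]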
Example (cont.)
  [Figure: remaining snapshots of the DFS. Final timestamps: 1/8, 2/7, 3/6, 4/5, 9/12, 10/11; non-tree edges labeled B (back), F (forward), C (cross)]
  The results of DFS may depend on:
    The order in which vertices are examined in the main loop of procedure DFS
    The order in which the neighbors of a vertex are visited in DFS-VISIT
Edge Classification
  Tree edge (reaches a WHITE vertex):
    (u, v) is a tree edge if v was first discovered by exploring edge (u, v)
  Back edge (reaches a GRAY vertex):
    (u, v) connects a vertex u to an ancestor v in a depth-first tree
    Self-loops (in directed graphs) are also back edges
  [Figure: DFS snapshots showing a tree edge and an edge labeled B (back)]
Edge Classification
  Forward edge (reaches a BLACK vertex and d[u] < d[v]):
    Non-tree edge (u, v) that connects a vertex u to a descendant v in a depth-first tree
  Cross edge (reaches a BLACK vertex and d[u] > d[v]):
    Can go between vertices in the same depth-first tree (as long as there is no ancestor/descendant relation) or between different depth-first trees
  [Figure: DFS snapshots showing edges labeled F (forward), B (back), and C (cross)]
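The color-based rules above can be applied while the search runs. The following sketch classifies every edge of a directed graph at the moment it is explored; only discovery times are tracked, because the rules need nothing else. The graph is the same assumed six-vertex example used to match the slide timestamps, and `classify_edges` is an illustrative name.

```python
def classify_edges(adj):
    """Return a dict mapping each directed edge (u, v) to its DFS edge type."""
    color = {u: "WHITE" for u in adj}
    d = {}
    kinds = {}
    time = 0

    def visit(u):
        nonlocal time
        color[u] = "GRAY"
        time += 1
        d[u] = time
        for v in adj[u]:
            if color[v] == "WHITE":
                kinds[(u, v)] = "tree"
                visit(v)
            elif color[v] == "GRAY":
                kinds[(u, v)] = "back"       # v is an ancestor of u (or u itself)
            elif d[u] < d[v]:
                kinds[(u, v)] = "forward"    # BLACK and discovered after u
            else:
                kinds[(u, v)] = "cross"      # BLACK and discovered before u
        color[u] = "BLACK"

    for u in adj:
        if color[u] == "WHITE":
            visit(u)
    return kinds

# Assumed example graph (directed).
adj = {"u": ["v", "x"], "v": ["y"], "w": ["y", "z"], "x": ["v"], "y": ["x"], "z": ["z"]}
for edge, kind in classify_edges(adj).items():
    print(edge, kind)
```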
Analysis of DFS(G)
  for each u ∈ V[G]                    Θ(V)
      do color[u] ← WHITE
         π[u] ← NIL
  time ← 0
  for each u ∈ V[G]                    Θ(V), exclusive of the time for DFS-VISIT
      do if color[u] = WHITE
            then DFS-VISIT(u)
Analysis of DFS-VISIT(u)
  color[u] ← GRAY
  time ← time + 1
  d[u] ← time
  for each v ∈ Adj[u]
      do if color[v] = WHITE
            then π[v] ← u
                 DFS-VISIT(v)
  color[u] ← BLACK
  time ← time + 1
  f[u] ← time
  DFS-VISIT is called exactly once for each vertex
  The loop in DFS-VISIT(v) executes |Adj[v]| times
  Total: Σ_{v ∈ V} |Adj[v]| + Θ(V) = Θ(E) + Θ(V) = Θ(V + E)
Properties of DFS
  u = π[v] ⇔ DFS-VISIT(v) was called during a search of u's adjacency list
  Vertex v is a descendant of vertex u in the depth-first forest ⇔ v is discovered during the time in which u is gray
  [Figure: a DFS path with timestamps 1/, 2/, 3/]
Parenthesis Theorem
  In any DFS of a graph G, for all u, v, exactly one of the following holds:
    1. [d[u], f[u]] and [d[v], f[v]] are disjoint, and neither of u and v is a descendant of the other
    2. [d[v], f[v]] is entirely within [d[u], f[u]], and v is a descendant of u
    3. [d[u], f[u]] is entirely within [d[v], f[v]], and u is a descendant of v
  Example timestamps (d/f): s 1/10, z 2/9, y 3/6, x 4/5, w 7/8, t 11/16, v 12/13, u 14/15
  Reading the time line 1 to 16, with "(u" at d[u] and "u)" at f[u]:
    (s (z (y (x x) y) (w w) z) s) (t (v v) (u u) t)
  Well-formed expression: the parentheses are properly nested
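The parenthesization can be generated mechanically from the timestamps, which also makes it easy to check the theorem on an example. This sketch uses the discovery and finishing times listed on the slide; the helper name `nested_or_disjoint` is illustrative.

```python
# Discovery and finishing times from the example.
d = {"s": 1, "z": 2, "y": 3, "x": 4, "w": 7, "t": 11, "v": 12, "u": 14}
f = {"s": 10, "z": 9, "y": 6, "x": 5, "w": 8, "t": 16, "v": 13, "u": 15}

# Emit "(v" at time d[v] and "v)" at time f[v], in time order.
events = sorted([(d[v], f"({v}") for v in d] + [(f[v], f"{v})") for v in f])
expr = " ".join(token for _, token in events)
print(expr)  # (s (z (y (x x) y) (w w) z) s) (t (v v) (u u) t)

def nested_or_disjoint(a, b):
    """True if the intervals of a and b are disjoint or one contains the other."""
    disjoint = f[a] < d[b] or f[b] < d[a]
    a_in_b = d[b] < d[a] and f[a] < f[b]
    b_in_a = d[a] < d[b] and f[b] < f[a]
    return disjoint or a_in_b or b_in_a

print(all(nested_or_disjoint(a, b) for a in d for b in d if a != b))  # True
```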
Other Properties of DFS
  Corollary
    Vertex v is a proper descendant of u ⇔ d[u] < d[v] < f[v] < f[u]
  Theorem (White-Path Theorem)
    In a depth-first forest of a graph G, vertex v is a descendant of u if and only if at time d[u] there is a path u ⇝ v consisting of only white vertices
  [Figure: the DFS example with timestamps 1/8, 2/7, 9/12, 4/5, 3/6, 10/11, highlighting vertices u and v]
Topological Sort
  Topological sort of a directed acyclic graph G = (V, E): a linear ordering of the vertices such that if there is an edge (u, v), then u appears before v in the ordering
  Directed acyclic graphs (DAGs)
    Used to represent precedence among events or processes that have a partial order
    Example 1: a before b and b before c, hence a before c
    Example 2: b before c and a before c. What about a and b?
  Topological sort helps us establish a total order
Topological Sort
  TOPOLOGICAL-SORT(V, E)
    1. Call DFS(V, E) to compute finishing times f[v] for each vertex v
    2. When each vertex is finished, insert it onto the front of a linked list
    3. Return the linked list of vertices
  Running time: Θ(V + E)
  Example (getting dressed), DFS timestamps d/f:
    undershorts 11/16   socks 17/18   pants 12/15   shoes 13/14   watch 9/10
    shirt 1/8           belt 6/7      tie 2/5       jacket 3/4
  Resulting order (decreasing finishing time):
    socks, undershorts, pants, shoes, watch, shirt, belt, tie, jacket
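The three steps can be sketched in Python with a `collections.deque` standing in for the linked list, since `appendleft` inserts at the front in constant time. The edges of the clothing DAG are an assumption (the figure did not survive extraction), so the order printed here depends on that assumed edge set and on dictionary order; any output that respects every edge is a valid topological sort.

```python
from collections import deque

def topological_sort(adj):
    """Return the vertices of a DAG in topological order (assumes adj is acyclic)."""
    visited = set()
    order = deque()               # plays the role of the linked list

    def visit(u):
        visited.add(u)
        for v in adj[u]:
            if v not in visited:
                visit(v)
        order.appendleft(u)       # u is finished: insert at the front

    for u in adj:
        if u not in visited:
            visit(u)
    return list(order)

# Assumed precedence edges for the getting-dressed example.
adj = {
    "shirt": ["belt", "tie"], "tie": ["jacket"], "jacket": [], "belt": ["jacket"],
    "watch": [], "undershorts": ["pants", "shoes"], "pants": ["belt", "shoes"],
    "shoes": [], "socks": ["shoes"],
}
order = topological_sort(adj)
print(order)
```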
Topological Sort
  Topological sort: an ordering of the vertices along a horizontal line so that all directed edges go from left to right
  [Figure: the getting-dressed DAG redrawn with its vertices in the order socks, undershorts, pants, shoes, watch, shirt, belt, tie, jacket; every edge points left to right]
Lemma
  A directed graph G is acyclic ⇔ a DFS on G yields no back edges.
  Proof:
    "⇒": acyclic ⇒ no back edge
      Prove the contrapositive: a back edge implies a cycle
      Assume there is a back edge (u, v)
        ⇒ v is an ancestor of u
        ⇒ there is a path from v to u in G (v ⇝ u)
        ⇒ the path v ⇝ u together with the back edge (u, v) yields a cycle
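The lemma gives a direct cycle test for directed graphs: run DFS and report a cycle as soon as an explored edge reaches a GRAY vertex. A minimal sketch, with small hypothetical graphs as inputs and `has_cycle` as an illustrative name:

```python
def has_cycle(adj):
    """Return True if the directed graph contains a cycle (i.e. DFS finds a back edge)."""
    color = {u: "WHITE" for u in adj}

    def visit(u):
        color[u] = "GRAY"
        for v in adj[u]:
            if color[v] == "GRAY":            # back edge (u, v)
                return True
            if color[v] == "WHITE" and visit(v):
                return True
        color[u] = "BLACK"
        return False

    return any(color[u] == "WHITE" and visit(u) for u in adj)

dag = {"a": ["b", "c"], "b": ["c"], "c": []}      # hypothetical acyclic graph
cyclic = {"a": ["b"], "b": ["c"], "c": ["a"]}     # hypothetical 3-cycle
print(has_cycle(dag), has_cycle(cyclic))          # False True
```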
Strongly Connected Components
  Given a directed graph G = (V, E):
    A strongly connected component (SCC) of G is a maximal set of vertices C ⊆ V such that for every pair of vertices u, v ∈ C, we have both u ⇝ v and v ⇝ u
The Transpose of a Graph
  G^T = transpose of G
    G^T is G with all edges reversed
    G^T = (V, E^T), E^T = {(u, v) : (v, u) ∈ E}
  If using adjacency lists, we can create G^T in Θ(V + E) time
  [Figure: a directed graph on vertices 1 to 5 and its transpose]
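Building G^T from adjacency lists takes one pass over all vertices and edges. A sketch with a hypothetical four-vertex graph, since the edges of the slide's figure are not recoverable:

```python
def transpose(adj):
    """Return the adjacency lists of G^T: every edge (u, v) becomes (v, u)."""
    adj_t = {u: [] for u in adj}          # Theta(V)
    for u in adj:
        for v in adj[u]:                  # Theta(E) in total
            adj_t[v].append(u)
    return adj_t

g = {1: [2], 2: [3, 4], 3: [1], 4: []}    # hypothetical directed graph
g_t = transpose(g)
print(g_t)  # {1: [3], 2: [1], 3: [2], 4: [2]}
```

Transposing twice gives back the original edge set.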
Finding the SCCs
  Observation: G and G^T have the same SCCs
    u and v are reachable from each other in G ⇔ they are reachable from each other in G^T
  Idea for computing the SCCs of a directed graph G = (V, E):
    Make two depth-first searches: one on G and one on G^T
  [Figure: a directed graph on vertices 1 to 5 and its transpose]
Example
  DFS on the initial graph G (timestamps d/f):
    a 13/14   b 11/16   c 1/10   d 8/9
    e 12/15   f 3/4     g 2/7    h 5/6
  Vertices in decreasing order of finishing time:
    b 16, e 15, a 14, c 10, d 9, g 7, h 6, f 4
  DFS on G^T, choosing start vertices in that order:
    start at b: visit a, e
    start at c: visit d
    start at g: visit f
    start at h
  Strongly connected components:
    C1 = {a, b, e}, C2 = {c, d}, C3 = {f, g}, C4 = {h}
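The two-pass procedure illustrated above can be sketched as follows: a first DFS on G records vertices in order of finishing, and a second DFS on G^T starts from vertices in decreasing finishing time, with each tree of the second search forming one component. The edge set of the eight-vertex graph is an assumption, chosen to be consistent with the components on the slide; the intermediate timestamps may differ from the slide's because they depend on visiting order.

```python
def strongly_connected_components(adj):
    """Return the SCCs of a directed graph as a list of sets."""
    visited = set()

    def visit(u, graph, out):
        visited.add(u)
        for v in graph[u]:
            if v not in visited:
                visit(v, graph, out)
        out.append(u)                      # appended when u finishes

    # Pass 1: DFS on G, recording vertices in increasing finishing time.
    finish_order = []
    for u in adj:
        if u not in visited:
            visit(u, adj, finish_order)

    # Build G^T.
    adj_t = {u: [] for u in adj}
    for u in adj:
        for v in adj[u]:
            adj_t[v].append(u)

    # Pass 2: DFS on G^T in decreasing finishing time; each tree is one SCC.
    visited.clear()
    components = []
    for u in reversed(finish_order):
        if u not in visited:
            tree = []
            visit(u, adj_t, tree)
            components.append(set(tree))
    return components

# Assumed edges for the example graph on a..h.
adj = {
    "a": ["b"], "b": ["c", "e", "f"], "c": ["d", "g"], "d": ["c", "h"],
    "e": ["a", "f"], "f": ["g"], "g": ["f", "h"], "h": ["h"],
}
sccs = strongly_connected_components(adj)
print(sorted(sorted(c) for c in sccs))
```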
Component Graph
  The component graph G^SCC = (V^SCC, E^SCC):
    V^SCC = {v_1, v_2, ..., v_k}, where v_i corresponds to strongly connected component C_i
    There is an edge (v_i, v_j) ∈ E^SCC if G contains a directed edge (x, y) for some x ∈ C_i and y ∈ C_j
  The component graph is a DAG
  [Figure: the example graph on a to h and its component graph with vertices {a, b, e}, {c, d}, {f, g}, {h}]
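Given the components, the edge set E^SCC follows from a single scan of the edges of G. In this sketch the component numbering comes from the example (C1 = {a, b, e}, C2 = {c, d}, C3 = {f, g}, C4 = {h}), while the edges of G are an assumption consistent with those components.

```python
# Assumed edges for the example graph on a..h.
adj = {
    "a": ["b"], "b": ["c", "e", "f"], "c": ["d", "g"], "d": ["c", "h"],
    "e": ["a", "f"], "f": ["g"], "g": ["f", "h"], "h": ["h"],
}
# Component index of every vertex, as in the example.
comp_of = {"a": 1, "b": 1, "e": 1, "c": 2, "d": 2, "f": 3, "g": 3, "h": 4}

# Keep one edge (i, j) for every edge of G that crosses between components.
e_scc = {
    (comp_of[x], comp_of[y])
    for x in adj
    for y in adj[x]
    if comp_of[x] != comp_of[y]
}
print(sorted(e_scc))  # [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4)]
```

Every edge here goes from a lower-numbered to a higher-numbered component, so this particular component graph is visibly acyclic.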
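Lemma 1
  Let C and C' be distinct SCCs in G
  Let u, v ∈ C, and u', v' ∈ C'
  Suppose there is a path u ⇝ u' in G
  Then there cannot also be a path v' ⇝ v in G
  Proof
    Suppose there is a path v' ⇝ v
    Then there exists a path u ⇝ u' ⇝ v' (u' and v' are in the same SCC C')
    and there exists a path v' ⇝ v ⇝ u (v and u are in the same SCC C)
    So u and v' are reachable from each other and cannot be in separate SCCs: contradiction!
  [Figure: components C and C' with vertices u, v in C and u', v' in C']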
Notations
  Extend the notation for d (discovery time) and f (finishing time) to sets of vertices U ⊆ V:
    d(U) = min_{u ∈ U} { d[u] } (earliest discovery time)
    f(U) = max_{u ∈ U} { f[u] } (latest finishing time)
  Example (timestamps d/f: a 13/14, b 11/16, c 1/10, d 8/9, e 12/15, f 3/4, g 2/7, h 5/6):
    C1 = {a, b, e}: d(C1) = 11, f(C1) = 16
    C2 = {c, d}:    d(C2) = 1,  f(C2) = 10
    C3 = {f, g}:    d(C3) = 2,  f(C3) = 7
    C4 = {h}:       d(C4) = 5,  f(C4) = 6
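The set-valued d and f are just a minimum and a maximum over the members' timestamps. A short sketch using the timestamps and components from the example; `d_of` and `f_of` are illustrative names.

```python
# Discovery and finishing times from the example DFS on G.
d = {"a": 13, "b": 11, "c": 1, "d": 8, "e": 12, "f": 3, "g": 2, "h": 5}
f = {"a": 14, "b": 16, "c": 10, "d": 9, "e": 15, "f": 4, "g": 7, "h": 6}

def d_of(U):
    """Earliest discovery time among the vertices of U."""
    return min(d[u] for u in U)

def f_of(U):
    """Latest finishing time among the vertices of U."""
    return max(f[u] for u in U)

components = {"C1": {"a", "b", "e"}, "C2": {"c", "d"}, "C3": {"f", "g"}, "C4": {"h"}}
for name, C in components.items():
    print(name, d_of(C), f_of(C))
```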