Graph Theory: Starting Out
Administrivia: To read: Chapter 7, Sections 1-3 (Ensley/Crawley). Problem Set 5 sent out; due Monday 12/8 in class. There will be two review days next week (Wednesday and Friday). FCQs this week. Final exam Tues. 12/16, 10:30-1.
Let's Start Where Graph Theory Started
The Königsberg Bridge Problem Is there a walk that you can take in Königsberg such that you go over each bridge exactly once? (Ideally, a walk that ends in the same place that it began.)
The Euler Trail Problem Given a graph of vertices and edges, determine whether there is a trail that will use every edge exactly once. (The Euler trail/circuit problem)
The Euler Trail Problem: An Easy Problem Let's pause to take a deeper look at this problem. Think of it as a little puzzle. You may notice, first of all, that whenever you try to find a trail that uses all edges of this graph, you can't seem to do it. Can we prove there's no solution?
A Brief Pause for Vocabulary There's a lot of specialized vocabulary in graph theory. See especially the definitions on pages 508-511 in the textbook. Here, we'll focus on the following definitions: walk, path, trail; cycle, circuit; degree of a vertex.
Euler Trail, Continued We just showed that a graph cannot have an Euler trail if it has more than two vertices of odd degree. What about graphs that don't have this dangerous property? Is there a way of finding an Euler trail for any such (connected) graph?
Fleury's Algorithm Start with an odd-degree vertex of the graph (remember there are at most two); if there are no odd-degree vertices, choose any vertex at random. Step 1: If there is a non-bridge edge E to traverse, choose that; if not, choose any edge E. Traverse that edge E, and delete it from the graph. Keep doing Step 1 until you're done!
One little matter: What's a Bridge Edge (or Non-Bridge Edge)? A bridge edge is one that, if you subtract it from the graph, separates the graph into two distinct components that can't be reached from one another. (In other words, the bridge edge is the only way to get from some set of vertices to another.)
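Given the bridge test, Fleury's algorithm can be sketched in Python. Everything here (the adjacency-dict representation, the helper names, the assumption of no self-loops) is my own illustration, not from the text:

```python
def connected_over_active(adj):
    # True if every vertex that still has edges is reachable from
    # any other such vertex (vertices with no edges left are ignored).
    start = next((v for v in adj if adj[v]), None)
    if start is None:
        return True               # no edges remain at all
    seen, stack = {start}, [start]
    while stack:
        v = stack.pop()
        for w in adj[v]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return all(v in seen for v in adj if adj[v])

def is_bridge(adj, u, v):
    # Temporarily delete edge (u, v); it is a bridge if the
    # remaining active graph falls apart.
    adj[u].remove(v)
    adj[v].remove(u)
    answer = not connected_over_active(adj)
    adj[u].append(v)
    adj[v].append(u)
    return answer

def fleury(graph):
    adj = {v: list(ws) for v, ws in graph.items()}   # mutable copy
    odd = [v for v in adj if len(adj[v]) % 2 == 1]
    if len(odd) not in (0, 2):
        return None                                  # no Euler trail
    current = odd[0] if odd else next(iter(adj))
    trail = [current]
    while adj[current]:
        nxt = None
        for w in list(adj[current]):   # copy: is_bridge mutates the list
            if not is_bridge(adj, current, w):
                nxt = w
                break
        if nxt is None:
            nxt = adj[current][0]      # forced: only bridges remain
        adj[current].remove(nxt)       # traverse and delete the edge
        adj[nxt].remove(current)
        current = nxt
        trail.append(current)
    return trail
```

Run on the Königsberg multigraph (four odd vertices) it returns None, matching Euler's argument; run on a graph with zero or two odd vertices it walks out an Euler trail.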
As an exercise
Addendum: How Many Odd Vertices Can a Graph Have? Remember that our rule is: if a connected graph has two or fewer odd vertices, then it has an Euler trail. In fact, no graph can have only one odd vertex. So to have an Euler trail, we're looking for graphs with either two or zero odd vertices. To have an Euler circuit, only graphs with zero odd vertices will do. A final puzzle: why can't a graph have only one odd vertex? Try to create a graph that does; why won't that work?
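A hint toward the puzzle: every edge contributes to the degree of exactly two vertices, so the sum of all degrees is always twice the number of edges, hence even. A quick sketch (the sample graph here is my own illustration):

```python
def degree_sum(adj):
    """Sum of all vertex degrees in an adjacency-list graph."""
    return sum(len(ws) for ws in adj.values())

# A triangle A-B-C plus a pendant edge C-D: 4 edges in total.
g = {'A': ['B', 'C'], 'B': ['A', 'C'], 'C': ['A', 'B', 'D'], 'D': ['C']}
# degree_sum(g) is 2 * 4 = 8.  Since the degree sum is always even,
# the number of odd-degree vertices must itself be even -- so a graph
# with exactly one odd vertex is impossible.
```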
Say, How Do You Determine if a Graph is Connected or Not? Depth-first search:

    DFS-connected?(current, others, done):
        if all vertices are accounted for in done (together with current): return TRUE
        if others is empty: return FALSE
        if current is already in the done set:
            return DFS-connected?(first(others), rest(others), done)
        else:
            let new-others = adjacent-vertices(current) prepended to others
            return DFS-connected?(first(new-others), rest(new-others), done with current added)

    Start with: DFS-connected?(vertex1, adjacent-vertices(vertex1), {vertex1})
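The pseudocode above might be rendered in Python along these lines (iterative rather than recursive, but the same worklist-plus-done-set idea; the adjacency-dict representation is an assumption on my part):

```python
def dfs_connected(graph, start):
    """Return True iff every vertex of `graph` is reachable from `start`.

    `graph` maps each vertex to a list of adjacent vertices.  The
    worklist is used as a stack, so the search goes depth-first.
    """
    done = set()
    worklist = [start]
    while worklist:
        v = worklist.pop()            # most recently added vertex first
        if v in done:
            continue                  # already expanded; skip it
        done.add(v)
        worklist.extend(graph[v])     # push this vertex's neighbours
    return done == set(graph)         # all vertices accounted for?
```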
Depth-first search of a tree: a connected graph with no cycles in it.
[Worked trace of DFS-connected? on the example tree: each row records the current vertex, the remaining others list, and the growing done set. Once every vertex is accounted for, the procedure returns TRUE.]
Breadth-first search: an alternative way of searching a graph (in this case, a tree)
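The contrast with depth-first search is easy to make concrete (same assumed adjacency-dict representation as before): the only change is swapping the stack for a queue, so vertices are explored level by level.

```python
from collections import deque

def bfs_order(graph, start):
    """Visit vertices breadth-first, returning them in visiting order.

    Using a FIFO queue instead of a stack means every vertex at
    distance d from `start` is visited before any vertex at d + 1.
    """
    order, seen = [], {start}
    queue = deque([start])
    while queue:
        v = queue.popleft()           # oldest vertex first: breadth-first
        order.append(v)
        for w in graph[v]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return order
```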
Here's a superficially similar problem to the Euler trail problem: Given a graph of vertices and edges, determine whether there is a cycle that will move you from vertex to vertex, visiting each vertex exactly once (except for the final step, which lands you where you began). This is the Hamiltonian cycle problem.
What separates the easy from the apparently hard kinds of problems? One sort is called a polynomial-time problem: you can write a program that will solve the problem in a relatively short amount of time (a polynomial function of the size of the input). The Euler trail problem is polynomial time.
What separates these last two kinds of problems? The other sort is called a non-deterministic polynomial-time problem: no known program solves it in polynomial time (the obvious programs take exponential time). On the other hand, if you're lucky enough to guess an answer to the problem, that answer can be written down and checked in polynomial time. The Hamiltonian circuit problem is non-deterministic polynomial time.
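The guess-and-check half of this is easy to make concrete: verifying a proposed Hamiltonian cycle takes only polynomial time, even though nobody knows how to find one quickly. A sketch (the representation and function name are my own, not from the text):

```python
def is_hamiltonian_cycle(graph, candidate):
    """Check a guessed Hamiltonian cycle in polynomial time.

    `candidate` is a list of vertices that starts and ends at the
    same vertex.  Each check below is at worst linear or quadratic
    in the size of the graph -- cheap, even though *finding* such a
    cycle has no known polynomial-time algorithm.
    """
    if len(candidate) < 2 or candidate[0] != candidate[-1]:
        return False
    interior = candidate[:-1]
    # Must visit every vertex exactly once (before returning home).
    if set(interior) != set(graph) or len(interior) != len(set(interior)):
        return False
    # Every consecutive pair must actually be an edge of the graph.
    return all(b in graph[a] for a, b in zip(candidate, candidate[1:]))
```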
Another easy (polynomial-time) example: Bipartite matching (the dance-hall problem). But compare... the tripartite (men/women/dogs) matching problem!
P and NP P is the name given to the set of all problems that can be reliably solved in polynomial time: that is, the time you need to solve the problem grows as a polynomial function of the input size. NP is the name given to the set of all problems for which you can guess and check a solution in polynomial time: that is, the time you need to solve the problem if you're extremely lucky in your guess grows as a polynomial function of the input size. (NP means "non-deterministic polynomial.")
A little digression: What's Polynomial Time? We want to know how much time a problem takes to solve as the size of the problem gets larger. How long does it take (say) to sort a list of length N? How long to solve a Tower-of-Hanoi problem of tower-size N? How long to find a Hamiltonian circuit in a graph with N vertices and M edges?...
Polynomial Time (continued) If the time we need to solve the problem grows about as fast as a polynomial function of the problem size (say, N^2 or N^3 or N^50), then we have a polynomial-time problem. The Euler trail problem is polynomial-time in the number of vertices (or edges) of the graph, for example.
Note: Some Problems are Worse than NP The Tower of Hanoi problem is an example: think of the size of the problem as measured by the number n of discs to move. The only way to solve the problem (even just to write out a solution) requires an exponential number of steps (approximately 2^n). So you can't guess or check a shorter solution.
Another example of a problem in NP that isn't known to be in P: the graph clique problem.
Some additional topics Notice that P is a subset of NP: any problem that is in P is clearly also in NP. The issue is whether there is any problem that is in NP that is not in P. There are some problems in NP that appear, as far as we know, to require exponential time to solve deterministically. Finally, the idea of an NP-complete problem: a problem in NP to which every other NP problem can be reduced, so a polynomial-time algorithm for it would put all of NP inside P.
Graphs in Computer Science How do you represent a graph? Different representations make certain computations easier or harder to accomplish.
A Typical Graph Data Structure An adjacency matrix: here, row i, column j contains the number of edges linking vertex i to vertex j.

    0 1 0 0 1 0
    1 0 1 0 1 0
    0 1 0 1 0 0
    0 0 1 0 1 1
    1 1 0 1 0 0
    0 0 0 1 0 0
Working with an adjacency matrix Note that you can represent multiple edges and loops with this data structure. It's easy to test whether a graph is simple: no entry larger than 1, and all 0's along the diagonal. It's easy to test (e.g.) the degree condition for an Euler trail in a simple graph: do more than two rows have an odd sum? If so, no Euler trail exists. If you multiply the matrix by itself (matrix multiplication), you get a matrix giving the number of walks of length 2 between vertex i and vertex j.
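These row tests are one-liners in practice. A sketch, using a list-of-lists matrix (the function names are mine, not from the text):

```python
def is_simple(adj):
    """A graph is simple if it has no loops (zeros on the diagonal)
    and no multiple edges (every entry is 0 or 1)."""
    n = len(adj)
    return (all(adj[i][i] == 0 for i in range(n)) and
            all(adj[i][j] in (0, 1) for i in range(n) for j in range(n)))

def may_have_euler_trail(adj):
    """Necessary degree condition for an Euler trail: at most two
    rows (vertices) have an odd sum (odd degree).
    Connectivity still has to be checked separately."""
    odd_rows = sum(1 for row in adj if sum(row) % 2 == 1)
    return odd_rows <= 2
```

On the matrix from the previous slide, four rows have odd sums, so the degree test already rules out an Euler trail.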
An Aside on Matrix Multiplication You have two n x n matrices, A and B. Their product can be written A x B, A B, or just AB. Important note: for matrices, AB does not (in general) equal BA. (That is, matrix multiplication is not commutative.) To get the i, j entry of AB, you take the dot product of the ith row of A and the jth column of B.
OK, what's the dot product? (a, b, c, d) . (e, f, g, h) = ae + bf + cg + dh. Note that the dot product multiplies two equal-length rows of numbers (vectors), and returns a single number. That's why it is also called the scalar product. The example here takes the dot product of two 4-vectors.
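In code this is a one-liner (a small sketch):

```python
def dot(u, v):
    """Dot (scalar) product: multiply componentwise, then sum."""
    assert len(u) == len(v), "vectors must have equal length"
    return sum(a * b for a, b in zip(u, v))

# e.g. dot([1, 2, 3, 4], [5, 6, 7, 8]) = 5 + 12 + 21 + 32 = 70
```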
    0 1 0 0 1 0     0 1 0 0 1 0     2 1 1 1 1 0
    1 0 1 0 1 0     1 0 1 0 1 0     1 3 0 2 1 0
    0 1 0 1 0 0  X  0 1 0 1 0 0  =  1 0 2 0 2 1
    0 0 1 0 1 1     0 0 1 0 1 1     1 2 0 3 0 0
    1 1 0 1 0 0     1 1 0 1 0 0     1 1 2 0 3 1
    0 0 0 1 0 0     0 0 0 1 0 0     0 0 1 0 1 1

This is a two-step adjacency matrix: each entry (i, j) represents the number of distinct two-step walks between vertex i and vertex j.
Now we can step back and think about this process Suppose you have an adjacency matrix A for a graph. In a sense, A can be thought of as representing all length-1 walks. If a[i,j] is 1, there is a length-1 walk (i.e., an edge) between vertices i and j. If a[i,j] is 0, there are no length-1 walks between vertices i and j.
How many distinct walks of length m between two vertices i and j? You multiply A by itself m times to get A^m. Note, by the way, that matrix multiplication, while not commutative, is associative: so you can multiply AAA as (AA)A or as A(AA). The (i, j)th entry of A^m is just the number of distinct walks of length m from i to j. (If you show this for A.A, remembering the meaning of the dot product that produces each entry, you can see how it works for length-2 walks. The general rule is then proved by induction.) So we have solved a combinatorial problem by a simple computational method.
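The whole procedure fits in a few lines; a sketch using the matrix from the earlier slides (the helper names are my own):

```python
def mat_mult(A, B):
    """Multiply two square matrices: entry (i, j) of the product is
    the dot product of row i of A with column j of B."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def count_walks(adj, m, i, j):
    """Number of distinct walks of length m from vertex i to vertex j:
    the (i, j) entry of the m-th power of the adjacency matrix."""
    result = adj
    for _ in range(m - 1):
        result = mat_mult(result, adj)
    return result[i][j]
```

For example, on the slide's matrix, count_walks(A, 2, 0, 0) recovers the entry 2 in the upper-left corner of the two-step matrix.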