Subsequence Definition CS 461, Lecture 8 Jared Saia University of New Mexico Assume given sequence X = x 1, x 2,..., x m Let Z = z 1, z 2,..., z l Then Z is a subsequence of X if there exists a strictly increasing sequence i 1, i 2,..., i k of indices such that for all j = 1, 2,..., k, x ij = z j 2 Today s Outline Example Longest Common Subsequence Intro to Greedy Algs Let X = A, B, C, B, A, B, D, C, Z = A, C, A, B, C Then, Z is a subsequence of X 1 3
Common Subsequence Brute Force Given two sequences X and Y, we say that Z is a common subsequence of X and Y if Z is a subsequence of X and Z is a subsequence of Y Example: X = A, B, D, C, B, A, B, C, Y = A, D, B, C, D, B, A, B Then Z = A, B, B, A, B is a common subsequence Z is not a longest common subsequence(lcs) of X and Y though since the common subsequence Z = A, B, C, B, A, B is longer Q: Is Z a longest common subsequence? Brute Force approach is to enumerate all possible subsequences of X, check to see if its a subsequence of Y, and then keep track of the longest common subsequence of both X and Y This is slow. Q: How many subsequences of X are there? 4 6 LCS Problem Terminology We are given two sequences X = x 1, x 2,..., x m and Y = y 1, y 2,..., y n Goal: Find a maximum-length common subsequence of X and Y Given a sequence X = x 1, x 2,... x m, for i = 0, 1,..., m, let X i be the i-th prefix of X i.e. X i = x 1, x 2,..., x i Example: if X = A, B, D, C, X 0 = and X 3 = A, B, D 5 7
Optimal Substructure Recursive Solution Lemma 1: Let X = x 1, x 2,..., x m and let Y = y 1, y 2,..., y n be sequences and let Z = z 1, z 2,..., z k be any LCS of X and Y. Then: If x m = y n, then z k = x m = y n and Z k 1 is a LCS of X m 1 and Y n 1 If x m y n, then z k x m implies that Z is a LCS of X m 1 and Y If x m y n, then z k y n implies that Z is an LCS of X and Y n 1 Let X = x 1, x 2,..., x m and Y = y 1, y 2,..., y n be arbitrary sequences Based on Lemma 1, there are two main possibilities for the LCS of X and Y : If x m = y n, LCS(X, Y ) is LCS(X m 1, Y n 1 ) appended to x m = y n Otherwise, either LCS(X, Y ) is LCS(X m 1, Y ) or LCS(X, Y n 1 ) (whichever is larger) 8 10 In-Class Exercise Recursive solution Prove each of the three statements in the previous slide Hint: Use proof by contradiction Let c(i, j) be the length of an LCS of the sequence X i, Y j Note that c(i, j) = 0 if i or j is 0 Thus we have: c(i, j) = 0 if i = 0 or j = 0 c(i, j) = c(i 1, j 1) + 1 if i, j > 0 and x i = y j c(i, j) = max(c(i, j 1), c(i 1, j)) if i, j > 0 and x i y j 9 11
DP Solution Example This is already enough to write up a recursive function, however the naive recursive function will take exponential time Instead, we can use dynamic programming and solve from the bottom up Code for doing this is on p. 353 and 355 of the text, basically it uses the same ideas we ve seen before of filling in entries in a table from the bottom up. A B D C B A B C 0 0 0 0 0 0 0 0 0 A 0 1 1 1 1 1 1 1 1 D 0 1 1 2 2 2 2 2 2 B 0 1 2 2 2 3 3 3 3 C 0 1 2 2 3 3 3 3 4 D 0 1 2 3 3 3 3 3 4 B 0 1 2 3 3 4 4 4 4 A 0 1 2 3 3 4 5 5 5 B 0 1 2 3 3 4 5 6 6 12 14 Example Reconstruction Consider X = A, B, D, C, B, A, B, C, Y = A, D, B, C, D, B, A, B The next slide gives the table constructed by the DP algorithm for computing the LCS of X and Y The bold numbers represent one possible path giving a LCS. The arrows keep track of where the minimum is obtained from To find a reconstruction, first find a path of edges leading from the bottom right corner to the top left corner In this path, the target of each diagonal arrow gives a character to include in the LCS In our example, A, B, C, B, A, B is the LCS we get by following the edges along the only path from the bottom right to the top left 13 15
Take Away Activity Selection We ve seen four different DP type algorithms In each case, we did the following 1) found a recurrence for the solution 2) built solutions to the recurrence from the bottom up You should be prepared to do this on your own now! You are given a list of programs to run on a single processor Each program has a start time and a finish time However the processor can only run one program at any given time, and there is no preemption (i.e. once a program is running, it must be completed) 16 18 Greedy Algorithms Another Motivating Problem Greed is Good - Michael Douglas in Wall Street A greedy algorithm always makes the choice that looks best at the moment Greedy algorithms do not always lead to optimal solutions, but for many problems they do In the next week or two, we will see several problems for which greedy algorithms produce optimal solutions including: activity selection, fractional knapsack, and making change When we study graph theory, we will also see that greedy algorithms work well for computing shortest paths and finding minimum spanning trees. Suppose you are at a film fest, all movies look equally good, and you want to see as many complete movies as possible This problem is also exactly the same as the activity selection problem. 17 19
Example Greedy Activity Selector Imagine you are given the following set of start and stop times for activities 1. Sort the activities by their finish times 2. Schedul the first activity in this list 3. Now go through the rest of the sorted list in order, scheduling activities whose start time is after (or the same as) the last scheduled activity (note: code for this algorithm is in section 16.1) time 20 22 Ideas Greedy Algorithm Sorting the activities by their finish times There are many ways to optimally schedule these activities Brute Force: examine every possible subset of the activites and find the largest subset of non-overlapping activities Q: If there are n activities, how many subsets are there? The book also gives a DP solution to the problem time 21 23
Greedy Scheduling of Activities Optimality time The big question here is: Does the greedy algorithm give us an optimal solution??? Surprisingly, the answer turns out to be yes 24 26 Analysis Todo Let n be the total number of activities The algorithm first sorts the activities by finish time taking O(n log n) Then the algorithm visits each activity exactly once, doing a constant amount of work each time. This takes O(n) Thus total time is O(n log n) Finish Chapter 15 Start Chapter 16 25 27