Dynamic Programming II


June 9, 2014

DP: Longest common subsequence

Biologists often need to find out how similar two DNA sequences are. DNA sequences are strings over the bases A, C, T and G. How to define similarity?
- one string is a substring of the other
- the number of changes (mutations) needed to turn one string into the other
- the longest common subsequence of two strings S1 and S2: a longest string S3 appearing in each of S1 and S2 (in the same order, but not necessarily consecutively)

Definition. Z = z_1 z_2 ... z_k is a subsequence of S = s_1 s_2 ... s_n if there exists an increasing sequence of indexes 1 ≤ i_1 < i_2 < ... < i_k ≤ n such that z_j = s_{i_j}.
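The subsequence test in this definition amounts to a single left-to-right scan. A minimal Python sketch (the function name is mine):

```python
def is_subsequence(z, s):
    """True iff z appears in s in the same order, not necessarily consecutively."""
    it = iter(s)
    # `ch in it` advances the iterator past the first match, so order is enforced.
    return all(ch in it for ch in z)

print(is_subsequence("GCCA", "GGCACTGTAC"))  # True: e.g. positions 1, 3, 5, 9
```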

Example. S = GGCACTGTAC and Z = GCCA: Z is a subsequence of S.

Definition. Z is a common subsequence of X and Y if it is a subsequence of both X and Y. A longest such Z is called a longest common subsequence (LCS).

Example. Consider X = GGCACTGTAC and Y = CATGTCACGG. Then ATAC and GCAG are common subsequences of X and Y. A longest common subsequence is CATGTAC.

Brute-force approach: list all subsequences of X and, for each, test whether it is a subsequence of Y. If X has length m, there are 2^m subsequences of X, so this takes exponential time. We should apply the dynamic programming approach instead.
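For concreteness, here is that brute force in Python, scanning subsequences from longest to shortest (the names are mine; this is exactly the approach being ruled out, so it is only usable on tiny inputs):

```python
from itertools import combinations

def is_subseq(z, s):
    it = iter(s)
    return all(ch in it for ch in z)

def lcs_brute_force(x, y):
    """Try all 2^m subsequences of x, longest first; return the first
    that is also a subsequence of y. Exponential time."""
    for k in range(len(x), -1, -1):
        for idx in combinations(range(len(x)), k):
            z = "".join(x[i] for i in idx)
            if is_subseq(z, y):
                return z
    return ""

print(len(lcs_brute_force("GGCACTGTAC", "CATGTCACGG")))  # 7
```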

Optimal substructure of LCS

Claim. Let Z = z_1 ... z_k be an LCS of X = x_1 ... x_m and Y = y_1 ... y_n. Then
1. if x_m = y_n, then z_k = x_m = y_n and Z_{1..k-1} is an LCS of X_{1..m-1} and Y_{1..n-1};
2. if x_m ≠ y_n and z_k ≠ x_m, then Z is an LCS of X_{1..m-1} and Y;
3. if x_m ≠ y_n and z_k ≠ y_n, then Z is an LCS of X and Y_{1..n-1}.

Proof.
1. If z_k ≠ x_m = y_n, then Z x_m would be a common subsequence of X and Y longer than Z, a contradiction. Clearly, Z_{1..k-1} is a common subsequence of X_{1..m-1} and Y_{1..n-1}. If it were not a longest one, let W be an LCS of X_{1..m-1} and Y_{1..n-1}; then W z_k would be a common subsequence of X and Y longer than Z, again a contradiction ("cut-and-paste").
2. Clearly, since z_k ≠ x_m, Z is a common subsequence of X_{1..m-1} and Y; if it were not a longest one, use the cut-and-paste technique again.
3. Similar to case 2.

Hence an LCS of two sequences contains within it an LCS of prefixes of these two sequences: the optimal substructure property.

A recursive solution

To find an LCS of X = x_1 ... x_m and Y = y_1 ... y_n:
- if x_m = y_n, then find an LCS of X_{1..m-1} and Y_{1..n-1} and append x_m = y_n to it;
- if x_m ≠ y_n, then find an LCS of X and Y_{1..n-1} and an LCS of X_{1..m-1} and Y, and take the longer of the two.

Let c[i, j] be the length of an LCS of X_{1..i} and Y_{1..j}. Recursive formula:

c[i, j] = 0                            if i = 0 or j = 0
c[i, j] = c[i-1, j-1] + 1              if i, j > 0 and x_i = y_j
c[i, j] = max(c[i, j-1], c[i-1, j])    if i, j > 0 and x_i ≠ y_j
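The recurrence for c[i, j] can be transcribed directly into a memoized ("top-down") Python function; a sketch, with the 1-based indices of the notes mapped onto Python's 0-based strings:

```python
from functools import lru_cache

def lcs_length(x, y):
    """Length of an LCS of x and y via the memoized recurrence for c[i, j]."""
    @lru_cache(maxsize=None)
    def c(i, j):
        # c(i, j) = length of an LCS of x[1..i] and y[1..j]
        if i == 0 or j == 0:
            return 0
        if x[i - 1] == y[j - 1]:               # x_i = y_j
            return c(i - 1, j - 1) + 1
        return max(c(i, j - 1), c(i - 1, j))   # x_i != y_j
    return c(len(x), len(y))

print(lcs_length("GGCACTGTAC", "CATGTCACGG"))  # 7
```

Memoization is what turns the exponential recursion into one evaluation per subproblem, of which there are only (m+1)(n+1).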

Computing the length of an LCS

A recursive algorithm based on this formula would take exponential time; however, there are only (m + 1)(n + 1) distinct subproblems (the overlapping-subproblems property). The entries of table c[0..m, 0..n] are filled in row-major order: the first row from left to right, then the second row from left to right, etc. A table b[1..m, 1..n] stores the information needed to construct an optimal solution: it records which case produced each entry, i.e., whether c[i, j] = c[i, j-1], c[i, j] = c[i-1, j], or c[i, j] = c[i-1, j-1] + 1.

LCS-Length(X, Y)
1.  m := length[X]
2.  n := length[Y]
3.  for i := 1 to m do c[i, 0] := 0
4.  for j := 0 to n do c[0, j] := 0
5.  for i := 1 to m
6.      for j := 1 to n
7.          if x_i = y_j
8.              c[i, j] := c[i-1, j-1] + 1
9.              b[i, j] := "↖"
10.         else if c[i-1, j] ≥ c[i, j-1]
11.             c[i, j] := c[i-1, j]
12.             b[i, j] := "↑"
13.         else c[i, j] := c[i, j-1]
14.             b[i, j] := "←"
15. return c and b

Time complexity: O(mn)
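A direct Python rendering of LCS-Length; here the letters 'D', 'U', 'L' stand in for the diagonal, up and left arrows recorded in b:

```python
def lcs_length(X, Y):
    """Bottom-up LCS-Length: fill c in row-major order and record in b
    which case produced each entry ('D' = diagonal, 'U' = up, 'L' = left)."""
    m, n = len(X), len(Y)
    c = [[0] * (n + 1) for _ in range(m + 1)]   # c[i][0] = c[0][j] = 0
    b = [[''] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if X[i - 1] == Y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
                b[i][j] = 'D'
            elif c[i - 1][j] >= c[i][j - 1]:
                c[i][j] = c[i - 1][j]
                b[i][j] = 'U'
            else:
                c[i][j] = c[i][j - 1]
                b[i][j] = 'L'
    return c, b

c, b = lcs_length("CATGTCACGG", "GGCACTGTAC")
print(c[-1][-1])  # 7
```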

The c table for x = CATGTCACGG (rows, x_i) and y = GGCACTGTAC (columns, y_j):

         j   0   1   2   3   4   5   6   7   8   9  10
                 G   G   C   A   C   T   G   T   A   C
 i = 0       0   0   0   0   0   0   0   0   0   0   0
 i = 1   C   0   0   0   1   1   1   1   1   1   1   1
 i = 2   A   0   0   0   1   2   2   2   2   2   2   2
 i = 3   T   0   0   0   1   2   2   3   3   3   3   3
 i = 4   G   0   1   1   1   2   2   3   4   4   4   4
 i = 5   T   0   1   1   1   2   2   3   4   5   5   5
 i = 6   C   0   1   1   2   2   3   3   4   5   5   6
 i = 7   A   0   1   1   2   3   3   3   4   5   6   6
 i = 8   C   0   1   1   2   3   4   4   4   5   6   7
 i = 9   G   0   1   2   2   3   4   4   5   5   6   7
 i = 10  G   0   1   2   2   3   4   4   5   5   6   7

The length of an LCS is c[10, 10] = 7.

A second example, with rows x = ACGCTAC (x_i) and columns y = CTGACA (y_j):

       j   0   1   2   3   4   5   6
               C   T   G   A   C   A
 i = 0     0   0   0   0   0   0   0
 i = 1  A  0   0   0   0   1   1   1
 i = 2  C  0   1   1   1   1   2   2
 i = 3  G  0   1   1   2   2   2   2
 i = 4  C  0   1   1   2   2   3   3
 i = 5  T  0   1   2   2   2   3   3
 i = 6  A  0   1   2   2   3   3   4
 i = 7  C  0   1   2   2   3   4   4

An LCS has length c[7, 6] = 4.

Constructing an LCS from table b:

PRINT-LCS(b, X, i, j)
1. if i = 0 or j = 0 then return
2. if b[i, j] = "↖" then PRINT-LCS(b, X, i-1, j-1); print x_i
3. else if b[i, j] = "↑" then PRINT-LCS(b, X, i-1, j)
4. else PRINT-LCS(b, X, i, j-1)

The initial call is PRINT-LCS(b, X, m, n); it takes O(m + n) time.
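PRINT-LCS in Python, bundled with the table-filling step so the sketch is self-contained; it collects the characters into a list instead of printing them (letters 'D', 'U', 'L' play the role of the arrows):

```python
def lcs_tables(X, Y):
    """Fill the c and b tables as in LCS-Length."""
    m, n = len(X), len(Y)
    c = [[0] * (n + 1) for _ in range(m + 1)]
    b = [[''] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if X[i - 1] == Y[j - 1]:
                c[i][j], b[i][j] = c[i - 1][j - 1] + 1, 'D'
            elif c[i - 1][j] >= c[i][j - 1]:
                c[i][j], b[i][j] = c[i - 1][j], 'U'
            else:
                c[i][j], b[i][j] = c[i][j - 1], 'L'
    return c, b

def print_lcs(b, X, i, j, out):
    """Walk b backwards from (i, j); diagonal moves contribute x_i to the LCS."""
    if i == 0 or j == 0:
        return
    if b[i][j] == 'D':
        print_lcs(b, X, i - 1, j - 1, out)
        out.append(X[i - 1])
    elif b[i][j] == 'U':
        print_lcs(b, X, i - 1, j, out)
    else:
        print_lcs(b, X, i, j - 1, out)

X, Y = "GGCACTGTAC", "CATGTCACGG"
c, b = lcs_tables(X, Y)
out = []
print_lcs(b, X, len(X), len(Y), out)
print("".join(out))  # one LCS of length 7
```

Which length-7 LCS is produced depends on how ties are broken on line 10 of LCS-Length; any of them is a valid answer.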

Exercise: Matrix multiplications

Given: a chain of matrices (A_1, A_2, ..., A_n), with A_i having dimension p_{i-1} × p_i.
Goal: compute the product A_1 A_2 ⋯ A_n as fast as possible.

Clearly, the time to multiply two matrices depends on their dimensions. Does the order of multiplication (i.e., the parenthesization) matter?

Example: n = 4. Possible orders:
(A_1 (A_2 (A_3 A_4)))
(A_1 ((A_2 A_3) A_4))
((A_1 A_2)(A_3 A_4))
((A_1 (A_2 A_3)) A_4)
(((A_1 A_2) A_3) A_4)

Suppose A_1 is 10 × 100, A_2 is 100 × 5, A_3 is 5 × 50, and A_4 is 50 × 10. Assume that multiplying a (p × q)-matrix by a (q × r)-matrix takes pqr steps (the straightforward algorithm).

Order 2, (A_1 ((A_2 A_3) A_4)): 100·5·50 + 100·50·10 + 10·100·10 = 85,000
Order 5, (((A_1 A_2) A_3) A_4): 10·100·5 + 10·5·50 + 10·50·10 = 12,500

It seems it might be a good idea to find a good order.
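The two costs are easy to check in Python under the pqr cost model, taking A_1 as 10×100, A_2 as 100×5, A_3 as 5×50 and A_4 as 50×10:

```python
# p is chosen so that A_i has shape p[i-1] x p[i], as in the notes
p = [10, 100, 5, 50, 10]

# Order 2, (A1((A2A3)A4)): innermost products first
cost_order2 = 100 * 5 * 50 + 100 * 50 * 10 + 10 * 100 * 10
# Order 5, (((A1A2)A3)A4):
cost_order5 = 10 * 100 * 5 + 10 * 5 * 50 + 10 * 50 * 10

print(cost_order2, cost_order5)  # 85000 12500
```

A factor of almost 7 between two parenthesizations of the same four matrices.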

How many orders are there? Can we just check all of them? (We look only at fully parenthesized matrix products.)

Let P(n) be the number of orders for a sequence of n matrices. Clearly, P(1) = 1 (only one matrix). If n ≥ 2, a matrix product is the product of two matrix sub-products, and the top-level split may occur between the k-th and (k+1)-st position, for any k = 1, 2, ..., n-1 (the "top-level multiplication"). Thus

P(n) = 1                              if n = 1
P(n) = Σ_{k=1}^{n-1} P(k) · P(n-k)    if n ≥ 2

Unfortunately, P(n) = Ω(4^n / n^{3/2}), and thus (easier to see) P(n) = Ω(2^n). So the brute-force approach (checking all parenthesizations) is no good.
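This is the Catalan recurrence (P(n) is the (n-1)-st Catalan number); a quick memoized evaluation shows the rapid growth:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def P(n):
    """Number of full parenthesizations of a chain of n matrices."""
    if n == 1:
        return 1
    # split the top-level multiplication between position k and k+1
    return sum(P(k) * P(n - k) for k in range(1, n))

print([P(n) for n in range(1, 9)])  # [1, 1, 2, 5, 14, 42, 132, 429]
```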

We will use the dynamic programming approach to solve this problem optimally. The four basic steps when designing a dynamic programming algorithm:
1. Characterize the structure of an optimal solution
2. Recursively define the value of an optimal solution
3. Compute the value of an optimal solution in a bottom-up fashion
4. Construct an optimal solution from the computed information

1. Characterizing the structure

Let A_{i,j} = A_i ⋯ A_j for i ≤ j. If i < j, then any parenthesization of A_{i,j} must split the product at some k, i ≤ k < j, i.e., compute A_{i,k}, then A_{k+1,j}, and then A_{i,k} · A_{k+1,j}. Hence, for some k, the cost of computing A_{i,j} is the cost of computing A_{i,k}, plus the cost of computing A_{k+1,j}, plus the cost of multiplying A_{i,k} and A_{k+1,j}.

Optimal substructure: suppose an optimal parenthesization of A_{i,j} splits the product between A_k and A_{k+1}. Then the parenthesizations of A_{i,k} and A_{k+1,j} within this optimal parenthesization must also be optimal (otherwise, substituting an optimal parenthesization of A_{i,k} (resp. A_{k+1,j}) into the current parenthesization of A_{i,j} would yield a better solution, a contradiction).

Use optimal substructure to construct an optimal solution:
1. split the problem into two subproblems (choosing an optimal split),
2. find optimal solutions to the subproblems,
3. combine the optimal subproblem solutions.

2. A recursive solution

Let m[i, j] denote the minimum number of multiplications needed to compute A_{i,j} = A_i A_{i+1} ⋯ A_j (the full problem is m[1, n]). Recursive definition of m[i, j]:
- if i = j, then m[i, j] = m[i, i] = 0 (no multiplication needed);
- if i < j, assume the optimal split is at k, i ≤ k < j. Since each matrix A_i is p_{i-1} × p_i, A_{i,k} is p_{i-1} × p_k and A_{k+1,j} is p_k × p_j, so m[i, j] = m[i, k] + m[k+1, j] + p_{i-1} p_k p_j.

We do not know the optimal value of k. There are j - i possibilities, k = i, i+1, ..., j-1, hence

m[i, j] = 0                                                          if i = j
m[i, j] = min_{i ≤ k < j} { m[i, k] + m[k+1, j] + p_{i-1} p_k p_j }  if i < j

We also keep track of the optimal splits: s[i, j] = k such that m[i, j] = m[i, k] + m[k+1, j] + p_{i-1} p_k p_j (s[i, j] is the value of k at which we split the product A_{i,j} to obtain an optimal parenthesization).

3. Computing the optimal costs

We want to compute m[1, n], the minimum cost of multiplying A_1 A_2 ⋯ A_n. Computing it recursively would take Ω(2^n) steps: the same subproblems are computed over and over again. However, if we compute in a bottom-up fashion, we can reduce the running time to polynomial in n.

The recursive equation shows that the cost m[i, j] (a product of j - i + 1 matrices) depends only on smaller subproblems: for k = i, ..., j-1, A_{i,k} is a product of k - i + 1 < j - i + 1 matrices and A_{k+1,j} is a product of j - k < j - i + 1 matrices. The algorithm should therefore fill the table m in order of increasing chain length.

Matrix-Chain-Order(p)
1.  n := length[p] - 1
2.  for i := 1 to n
3.      m[i, i] := 0
4.  for l := 2 to n
5.      for i := 1 to n - l + 1
6.          j := i + l - 1
7.          m[i, j] := ∞
8.          for k := i to j - 1
9.              q := m[i, k] + m[k+1, j] + p_{i-1} p_k p_j
10.             if q < m[i, j]
11.                 m[i, j] := q
12.                 s[i, j] := k
13. return m and s

Time complexity: O(n³)
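Matrix-Chain-Order in Python, together with a small recursive helper (my addition) that reads an optimal parenthesization off the s table. On the dimensions used earlier (10×100, 100×5, 5×50, 50×10) the optimum beats both orders that were tried by hand:

```python
import math

def matrix_chain_order(p):
    """Bottom-up Matrix-Chain-Order: m[i][j] = minimum multiplication count
    for A_i..A_j, s[i][j] = optimal split k (1-based i, j, as in the notes)."""
    n = len(p) - 1
    m = [[0] * (n + 1) for _ in range(n + 1)]
    s = [[0] * (n + 1) for _ in range(n + 1)]
    for l in range(2, n + 1):              # l = chain length
        for i in range(1, n - l + 2):
            j = i + l - 1
            m[i][j] = math.inf
            for k in range(i, j):
                q = m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j]
                if q < m[i][j]:
                    m[i][j], s[i][j] = q, k
    return m, s

def parenthesize(s, i, j):
    """Read an optimal parenthesization of A_i..A_j off the s table."""
    if i == j:
        return f"A{i}"
    k = s[i][j]
    return f"({parenthesize(s, i, k)}{parenthesize(s, k + 1, j)})"

p = [10, 100, 5, 50, 10]
m, s = matrix_chain_order(p)
print(m[1][4], parenthesize(s, 1, 4))  # 8000 ((A1A2)(A3A4))
```

Note that the optimal order here, ((A_1 A_2)(A_3 A_4)) at cost 8,000, is neither order 2 (85,000) nor order 5 (12,500).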