Advanced Algorithms Class Notes for Monday, November 10, 2014

Size: px
Start display at page:

Download "Advanced Algorithms Class Notes for Monday, November 10, 2014"

Transcription

1 Advanced Algorithms Class Notes for Monday, November 10, 2014 Bernard Moret Divide-and-Conquer: Matrix Multiplication Divide-and-conquer is especially useful in computational geometry, but also in numerical algebra. One of the most celebrated and influential algorithms in algebra is the Fast Fourier Transform (FFT) algorithm: it is a typical use of divide-and-conquer. Here we look at another famous algorithm: Strassen s algorithm for fast matrix multiplication. The origin of this algorithm goes back to the days when multiplying two values (not matrices, just scalars) was much more expensive than adding them so that programmers and algorithm designers strove to avoid multiplications. Strassen s algorithm started with an attempt at reducing the number of scalar multiplications needed to compute the product of two 2 2 matrices: ( a11 a 12 a 21 a 22 ) ( b11 b 12 b 21 b 22 ) ( c11 c = 12 c 21 c 22 Computed in the standard manner, with each entry of the product computed separately, this product requires 2 scalar multiplications for each entry, or 8 in all. Strassen found a way to compute 7 subexpressions, each using only one multiplication, such that all 8 entries can then be computed from these subexpressions using only additions and multiplications. The subexpressions are not obvious: Strassen used the following: a 11 b 11 (a 12 a 21 + a 11 a 22 )b 22 a 12 b 21 a 22 (b 11 + b 22 b 12 b 21 ) (a 11 a 21 )(b 22 b 12 ) (a 21 + a 22 a 11 )(b 22 b 12 + b 11 ) (a 21 + a 22 )(b 12 b 11 ) Using these, one can compute the product of two 2 2 matrices with 7 multiplications and 15 additions and subtractions, instead of 8 multiplications and 4 additions and subtractions. Today, of course, multiplication is almost as fast as addition for standard number representation (single or double precision numbers) although it remains much slower than addition for unbounded-precision arithmetic. Thus Strassen s find would be little more than a footnote in the history of computing if he had not quickly realized that his algorithm could be used recursively in a divide-and-conquer scheme. As stated, his algorithm computes the product of 2 2 matrices by reducing the number of products of 1 1 matrices (scalars); by using this approach recursively, we can take 4 4 matrices, divide each into four 2 2 matrices, and compute the product of 4 4 matrices using 7 products (and 15 additions/subtractions) of 2 2 matrices. When reduced to multiplications and additions of scalars, this yields a total of 7 7=49 scalar multiplications and(7 15)+(15 4)= 165 additions and subtractions, whereas the standard method would need = 64 scalar ) 1

2 multiplications and = 48 additions. If we move to 8 8 matrices, we now get 7 7 7=343 scalar multiplications instead of 512, and (7 165)+(15 16) = 1395 additions and subtractions instead of = 448. Notice that the ratio of excess additions and subtractions decreases steadily: the standard method requires (n 1) n 2, or Θ(n 3 ) additions, whereas Strassen s algorithm uses f(n)=7 f(n/2)+15n 2 /4 additions and subtractions, which yields f(n)= Θ(n log 2 7 ). In other words, the penalty sustained on 2 2 matrices in terms of additions and subtractions is quickly overshadowed by the fact that we have fewer matrix multiplications to perform, thereby reducing the number of both multiplicative and additive scalar operations. In terms of scalar multiplications, the standard method uses exactly n 3 of them, where Strassen s uses f(n)=7 f(n/2) in all, which is again Θ(n log 2 7 ). Hence the standard method computes the product of two n n matrices in Θ(n 3 ) time, whereas, using divide-and-conquer and a clever way to handle matrices divided into four equal submatrices, Strassen s algorithm takes O(n log 2 7 ) time, a substantial improvement. Strassen s algorithm is by no means the fastest matrix multiplication algorithm in asymptotic terms, but it is the fastest practical algorithm. The Copper-Winograd algorithm, inspired from Strassen s algorithm, but far more complex, runs (in some of its variations) in O(n 2.38 ), but its coefficients are so large that the algorithm remains impractical today. We just add a couple of other examples of divide-and-conquer for numerical programming. We can compute Fibonacci numbers exactly in logarithmic time by considering products of Fibonacci numbers for arguments divided by 2 (obviously, using floating-point numbers, we can compute in constant time by using the irrational function that solves the recurrence, but we get numerical errors for large values). We can multiply two polynomials of degree n in Θ(n log 2 3 ) time (instead of Θ(n 2 ) by the obvious method) by devising a way to compute (a 1 n+a 0 ) (b 1 n+b 0 ) with only three multiplications instead of four this is a direct analog of Strassen s algorithm. (In this case, we can compute the product even faster using FFTs, since it takes Θ(n log n) to compute the forward FFT, linear time to add the two transformed functions in Fourier space, then again Θ(nlog n) to reverse the FFT. However, FFTs require non-integral operations, whereas the divide-and-conquer uses only integral operations.) Dynamic Programming Dynamic programming is the last main non-randomized algorithmic design method. It can be viewed simply as a lazy evaluation approach, or as a generalization of divide-andconquer. The first applies to simple computations, for instance, computing Fibonacci numbers by storing the value returned by a recursive call and making such calls only when the desired value is not already stored. The second is more to the point: divide-and-conquer, 2

3 as we have seen, can always divide in any manner without affecting the correctness or optimality of the answer returned the choice of how to divide affects only the running time, not the answer. Whenever different divisions could return different answers, we must turn to another algorithmic paradigm and in such cases dynamic programming usually works. Trivial DPs: lazy evaluation Consider our old friends the Fibonacci numbers. If we run the code int function Fib(int n) int answer if (n==0) or (n==1) answer = 1 else answer = Fib(n-1) + Fib(n-2) return answer then the running time of our procedure is exponential! That is because it will compute the function over and over again for the same argument, within different branches of the recursion tree. (In fact, the running time is Θ(Fib(n)).) If, however, we use a global array initialized to a recognizable marker value, then we can avoid recomputing values as follows: int array storefib[0..n] storefib[0] = 1 storefib[1] = 1 for i=2 to n storefib[i] = -1 int function Fib(int n) if (storefib[n] == -1) storefib[n] = Fib(n-1) + Fib(n-2) return storefib[n] This new version runs in linear time. Obviously, it is much simpler and more efficient to use a direct loop to fill in the array, and even better to rotate values among three variables and so reduce the storage requirements, but these are minor issues compared to the gulf between exponential and linear time that was bridged by the simple expedient of storing the solutions to subproblems and looking them up whenever possible. Note, however, that such an approach relies on the fundamental premise that the number of subinstances is a polynomial function of the size of the instance. (For Fibonacci, this is a linear function: n subinstances when solving the instance F(n).) This is an absolute requirement for using dynamic programming, as otherwise both storage and running time will explode. We shall now discover a second requirement: when we must compute an optimal solution according to some given objective function (as opposed to a simple value, such as F(n)), the objective function for the full instance must be computable from the values of the objective function for the subinstances. 3

4 Matrix chain products A classic problem in dynamic programming is the so-called matrix chain product. Suppose you are to multiply many matrices in a single product: B= A i i where each matrix is rectangular and the dimensions all match that is, if matrix A I is m n, then matrix A i+1 is necessarily n k for some k. Because the matrices are rectangular, we cannot use fancy algorithms such as Strassen s: to compute the product A i a i+1 just described, producing a m k matrix, will take O(mnk) time, using the basic method of computing each element of the product independently by taking the scalar product of the corresponding row of the first matrix and column of the second matrix. Because matrix multiplication is associative, however, the order in which the various matrices are multiplied does not affect the answer, yet may have a strong effect on the running time. Consider the product of four matrices X = A B C D We can parenthesize this product as ((AB)C)D and the cost is then the cost of the product AB, or = , followed by the cost of multiplying the result by C, for an additional cost of = , followed by the cost of multiplying the result by D, for an additional cost of = and thus a total cost of In contrast, we could have parenthesized the product as A((BC)D) for a total cost of just = , at just one-hundredth of the cost! Thus the order of association (the parenthesization) matters very much. Given a long chain of matrix products of that type, then, what is the optimal parenthesization (where the cost is just the sum of the cost of each successive multiplication and the cost of the multiplication is just the product of the three dimensions of the two matrices)? If you look at the outermost layer of parentheses, they divide the expression into two sub-expressions, each of which defines the product of a certain number of matrices; the cost of these two sub-expressions can be computed recursively, while combining the two is a simple matrix product. Unfortunately, we do not know where to place the outermost pair of parentheses in our example, placing them to define(abc)d gave us a much worse result than placing them to define A(BCD). This is a classic situation for dynamic programming: we shall simply try all possible placements of this first pair. The number of choices is fortunately limited by the fact that, while the product is associative, it is not commutative the ordering of the matrices in the product is fixed. The outermost pair of parentheses defines the last matrix multiplication to be run and that multiplication can be chosen from just n 1 possibilities for chain of n matrices (from a 1 (2 n) split to a (1 n 1) n split). The cost of the split (1 i) (i+1 n) is the cost of the optimal way to compute the product of the first i matrices, plus the cost of the optimal way to compute the product of the last n i matrices, plus the cost of multiplying together the two resulting matrices, and we seek the choice of i that will minimize this expression. Now 4

5 this is just the first level; in order to compute the optimal way to compute the product of the first i matrices, we will seek a j that gives us an optimal decomposition into a product of the first j matrices, a product of the remaining i j matrices, and the matrix product of the two results. Thus, in general, we will need to find (and store) the optimal way to compute the product of the chain of matrices starting with matrix A i and ending with matrix A i+k, for all valid indices 1 i i+k n. There are Θ(n 2 ) such pairs and computing the optimal value for any of them involves finding the minimum value among k choices, hence the algorithm will require Θ(n 2 ) space and Θ(n 3 ) time. DP in general Obviously, a dynamic program is much more computationally demanding than a divideand-conquer algorithm: the divide-and-conquer algorithm does not test all possible ways of dividing a subproblem, but chooses one a priori (such as choosing the middle one, for instance), whereas the dynamic program must test every choice, in what amounts to a recursive setup (to compute the value of a choice, we must find out the optimal way to choose roots for its subtrees, and so forth all the way down). That the dynamic program is even able to run in polynomial time is due to a combination of two crucial factors. We have already remarked on the first: the number of subproblems the size of the array to be filled in is polynomial. The second is the computation of the value of an optimal solution to a subproblem: it requires only knowledge of the optimal solutions to smaller subproblems and can be computed in polynomial time using that knowledge. Another way to put it is that we can set up a recurrence relation to describe the optimal cost of a subproblem, a recurrence that depends only on entries for smaller subproblems; it is sometimes known in the literature as the principle of optimality. The principle of optimality by itself enables us to devise a dynamic program for an optimization problem, but in order for that algorithm to be useful, we also need to know that the number of subproblems is polynomially bounded. For instance, it is quite easy to devise a dynamic program for the Travelling Salesperson problem, but that DP will require exponential storage and take exponential time, because it will have an exponential number of subproblems. Sequence alignment Sequence alignment and its many variants have become the computational mainstay of research in biology, in part because of the advent of inexpensive, high-throughput sequencing machines: life sciences researchers today sequence anything at hand, from indiscriminate samplings of environments (gut flora, seawater, dirt, etc.) to understand microbial communities to targeted tissues to understand metabolism, aging, disease, etc. Much of what is done with sequences from a computational point of view predates the advent of DNA sequencing: computer scientists had already investigated many of the same problems in the context of strings, editors, spellcheckers, etc. The most basic purpose of sequence alignment is the computation of a similarity (or distance) measure between pairs of sequences. One such measure, the Levenshtein dis- 5

6 tance, was published (in Russian) in 1965 by Russian scientist Levenshtein, and corresponds pretty much exactly to today s sequence alignment. The Levenshtein distance is an example of an edit distance: given two structures from some family and given a collection of allowable operations on these structures, find the shortest (or least-cost) series of operations that will transform one of the given structures into the other. The operations are editing operations, their name directly suggesting their provenance from string operations. Levenshtein defined his metric in terms of single-character editing: the allowable operations are substitutions (or one character for another) and so-called indels, an abbreviation for single-character insertions and deletions. (However, his paper also discusses reversals as another operation.) The alignment version of this edit distance definition is simply the layout, in an indexed array, of the two strings, establishing common indexing that aligns the two strings, allowing gaps (empty array entries) in one string where an indel is indicated, and putting into 1-1 correspondence identical characters as well as characters that are the result of a substitution (biologists would call that a mutation ). For instance, consider the two strings ACCATT and ACATA; the edit distance is 2: we need to delete/insert a C, and substitute an A for a T. In terms of alignment, we would create the following pairwise alignment: A C C A T T A C A T A where the dash was used to indicate a gap corresponding to an indel, where the last position shows the substitution between A and T, and where all other positions are exactly matched. Many such edit distances have been proposed in Computer Science. The Damerau- Levenshtein distance, published in 1964 in the US by Damerau, adds one more operation, an exchange of adjacent characters a common problem in typing and hence a reasonable edit operation. The older Hamming distance (introduced in 1950 by Hamming) removes the indels, allowing only substitutions. All of these distances can be fine-tuned to the application by giving different costs to the specific substitutions, indels, and exchanges; for instance, typing substitutions depend directly on the language and the arrangement of characters on the keyboard. Naturally, computer scientists developped algorithms to compute these various edit distances, with or without weights on the operations. For two strings of length m and n, the Levenshtein and Damerau-Levenshtein distances can be computed in O(mn) time the full algorithm is generally credited to Lowrance and Wagner (1975); the Hamming distance, of course, is trivially computed in linear time, since it assumes that the two strings have the same length and thus does not need to align them, only to compare the character pairs at each index. In computational biology, these algorithms were re-invented and, in two slightly different variations, go by the name of Needleman-Wunsch (1970), which is exactly the Levenshtein problem, known in biology as global pairwise sequence alignment, and Smith-Waterman (1981), which solves a variant of the previous known as local pairwise sequence alignment. 6

Framework for Design of Dynamic Programming Algorithms

Framework for Design of Dynamic Programming Algorithms CSE 441T/541T Advanced Algorithms September 22, 2010 Framework for Design of Dynamic Programming Algorithms Dynamic programming algorithms for combinatorial optimization generalize the strategy we studied

More information

Unit-5 Dynamic Programming 2016

Unit-5 Dynamic Programming 2016 5 Dynamic programming Overview, Applications - shortest path in graph, matrix multiplication, travelling salesman problem, Fibonacci Series. 20% 12 Origin: Richard Bellman, 1957 Programming referred to

More information

Dynamic Programming II

Dynamic Programming II June 9, 214 DP: Longest common subsequence biologists often need to find out how similar are 2 DNA sequences DNA sequences are strings of bases: A, C, T and G how to define similarity? DP: Longest common

More information

Dynamic Programming. Lecture Overview Introduction

Dynamic Programming. Lecture Overview Introduction Lecture 12 Dynamic Programming 12.1 Overview Dynamic Programming is a powerful technique that allows one to solve many different types of problems in time O(n 2 ) or O(n 3 ) for which a naive approach

More information

CSC 373: Algorithm Design and Analysis Lecture 8

CSC 373: Algorithm Design and Analysis Lecture 8 CSC 373: Algorithm Design and Analysis Lecture 8 Allan Borodin January 23, 2013 1 / 19 Lecture 8: Announcements and Outline Announcements No lecture (or tutorial) this Friday. Lecture and tutorials as

More information

15-451/651: Design & Analysis of Algorithms January 26, 2015 Dynamic Programming I last changed: January 28, 2015

15-451/651: Design & Analysis of Algorithms January 26, 2015 Dynamic Programming I last changed: January 28, 2015 15-451/651: Design & Analysis of Algorithms January 26, 2015 Dynamic Programming I last changed: January 28, 2015 Dynamic Programming is a powerful technique that allows one to solve many different types

More information

Algorithm Design Techniques part I

Algorithm Design Techniques part I Algorithm Design Techniques part I Divide-and-Conquer. Dynamic Programming DSA - lecture 8 - T.U.Cluj-Napoca - M. Joldos 1 Some Algorithm Design Techniques Top-Down Algorithms: Divide-and-Conquer Bottom-Up

More information

Richard Feynman, Lectures on Computation

Richard Feynman, Lectures on Computation Chapter 8 Sorting and Sequencing If you keep proving stuff that others have done, getting confidence, increasing the complexities of your solutions for the fun of it then one day you ll turn around and

More information

CS473 - Algorithms I

CS473 - Algorithms I CS473 - Algorithms I Lecture 4 The Divide-and-Conquer Design Paradigm View in slide-show mode 1 Reminder: Merge Sort Input array A sort this half sort this half Divide Conquer merge two sorted halves Combine

More information

Computer Science 385 Analysis of Algorithms Siena College Spring Topic Notes: Divide and Conquer

Computer Science 385 Analysis of Algorithms Siena College Spring Topic Notes: Divide and Conquer Computer Science 385 Analysis of Algorithms Siena College Spring 2011 Topic Notes: Divide and Conquer Divide and-conquer is a very common and very powerful algorithm design technique. The general idea:

More information

CS473-Algorithms I. Lecture 10. Dynamic Programming. Cevdet Aykanat - Bilkent University Computer Engineering Department

CS473-Algorithms I. Lecture 10. Dynamic Programming. Cevdet Aykanat - Bilkent University Computer Engineering Department CS473-Algorithms I Lecture 1 Dynamic Programming 1 Introduction An algorithm design paradigm like divide-and-conquer Programming : A tabular method (not writing computer code) Divide-and-Conquer (DAC):

More information

Divide and Conquer Strategy. (Page#27)

Divide and Conquer Strategy. (Page#27) MUHAMMAD FAISAL MIT 4 th Semester Al-Barq Campus (VGJW01) Gujranwala faisalgrw123@gmail.com Reference Short Questions for MID TERM EXAMS CS502 Design and Analysis of Algorithms Divide and Conquer Strategy

More information

Towards a Memory-Efficient Knapsack DP Algorithm

Towards a Memory-Efficient Knapsack DP Algorithm Towards a Memory-Efficient Knapsack DP Algorithm Sanjay Rajopadhye The 0/1 knapsack problem (0/1KP) is a classic problem that arises in computer science. The Wikipedia entry http://en.wikipedia.org/wiki/knapsack_problem

More information

Chain Matrix Multiplication

Chain Matrix Multiplication Chain Matrix Multiplication Version of November 5, 2014 Version of November 5, 2014 Chain Matrix Multiplication 1 / 27 Outline Outline Review of matrix multiplication. The chain matrix multiplication problem.

More information

Chapter 3 Dynamic programming

Chapter 3 Dynamic programming Chapter 3 Dynamic programming 1 Dynamic programming also solve a problem by combining the solutions to subproblems. But dynamic programming considers the situation that some subproblems will be called

More information

Resources matter. Orders of Growth of Processes. R(n)= (n 2 ) Orders of growth of processes. Partial trace for (ifact 4) Partial trace for (fact 4)

Resources matter. Orders of Growth of Processes. R(n)= (n 2 ) Orders of growth of processes. Partial trace for (ifact 4) Partial trace for (fact 4) Orders of Growth of Processes Today s topics Resources used by a program to solve a problem of size n Time Space Define order of growth Visualizing resources utilization using our model of evaluation Relating

More information

CSCE 411 Design and Analysis of Algorithms

CSCE 411 Design and Analysis of Algorithms CSCE 411 Design and Analysis of Algorithms Set 4: Transform and Conquer Slides by Prof. Jennifer Welch Spring 2014 CSCE 411, Spring 2014: Set 4 1 General Idea of Transform & Conquer 1. Transform the original

More information

Lecture 10. Sequence alignments

Lecture 10. Sequence alignments Lecture 10 Sequence alignments Alignment algorithms: Overview Given a scoring system, we need to have an algorithm for finding an optimal alignment for a pair of sequences. We want to maximize the score

More information

Dynamic Programming. Design and Analysis of Algorithms. Entwurf und Analyse von Algorithmen. Irene Parada. Design and Analysis of Algorithms

Dynamic Programming. Design and Analysis of Algorithms. Entwurf und Analyse von Algorithmen. Irene Parada. Design and Analysis of Algorithms Entwurf und Analyse von Algorithmen Dynamic Programming Overview Introduction Example 1 When and how to apply this method Example 2 Final remarks Introduction: when recursion is inefficient Example: Calculation

More information

Computer Sciences Department 1

Computer Sciences Department 1 1 Advanced Design and Analysis Techniques (15.1, 15.2, 15.3, 15.4 and 15.5) 3 Objectives Problem Formulation Examples The Basic Problem Principle of optimality Important techniques: dynamic programming

More information

The divide-and-conquer paradigm involves three steps at each level of the recursion: Divide the problem into a number of subproblems.

The divide-and-conquer paradigm involves three steps at each level of the recursion: Divide the problem into a number of subproblems. 2.3 Designing algorithms There are many ways to design algorithms. Insertion sort uses an incremental approach: having sorted the subarray A[1 j - 1], we insert the single element A[j] into its proper

More information

6. Algorithm Design Techniques

6. Algorithm Design Techniques 6. Algorithm Design Techniques 6. Algorithm Design Techniques 6.1 Greedy algorithms 6.2 Divide and conquer 6.3 Dynamic Programming 6.4 Randomized Algorithms 6.5 Backtracking Algorithms Malek Mouhoub, CS340

More information

We augment RBTs to support operations on dynamic sets of intervals A closed interval is an ordered pair of real

We augment RBTs to support operations on dynamic sets of intervals A closed interval is an ordered pair of real 14.3 Interval trees We augment RBTs to support operations on dynamic sets of intervals A closed interval is an ordered pair of real numbers ], with Interval ]represents the set Open and half-open intervals

More information

1 Motivation for Improving Matrix Multiplication

1 Motivation for Improving Matrix Multiplication CS170 Spring 2007 Lecture 7 Feb 6 1 Motivation for Improving Matrix Multiplication Now we will just consider the best way to implement the usual algorithm for matrix multiplication, the one that take 2n

More information

Sequence alignment is an essential concept for bioinformatics, as most of our data analysis and interpretation techniques make use of it.

Sequence alignment is an essential concept for bioinformatics, as most of our data analysis and interpretation techniques make use of it. Sequence Alignments Overview Sequence alignment is an essential concept for bioinformatics, as most of our data analysis and interpretation techniques make use of it. Sequence alignment means arranging

More information

Mouse, Human, Chimpanzee

Mouse, Human, Chimpanzee More Alignments 1 Mouse, Human, Chimpanzee Mouse to Human Chimpanzee to Human 2 Mouse v.s. Human Chromosome X of Mouse to Human 3 Local Alignment Given: two sequences S and T Find: substrings of S and

More information

Divide and Conquer Algorithms

Divide and Conquer Algorithms CSE341T 09/13/2017 Lecture 5 Divide and Conquer Algorithms We have already seen a couple of divide and conquer algorithms in this lecture. The reduce algorithm and the algorithm to copy elements of the

More information

Presentation for use with the textbook, Algorithm Design and Applications, by M. T. Goodrich and R. Tamassia, Wiley, Dynamic Programming

Presentation for use with the textbook, Algorithm Design and Applications, by M. T. Goodrich and R. Tamassia, Wiley, Dynamic Programming Presentation for use with the textbook, Algorithm Design and Applications, by M. T. Goodrich and R. Tamassia, Wiley, 25 Dynamic Programming Terrible Fibonacci Computation Fibonacci sequence: f = f(n) 2

More information

Sankalchand Patel College of Engineering - Visnagar Department of Computer Engineering and Information Technology. Assignment

Sankalchand Patel College of Engineering - Visnagar Department of Computer Engineering and Information Technology. Assignment Class: V - CE Sankalchand Patel College of Engineering - Visnagar Department of Computer Engineering and Information Technology Sub: Design and Analysis of Algorithms Analysis of Algorithm: Assignment

More information

The Citizen s Guide to Dynamic Programming

The Citizen s Guide to Dynamic Programming The Citizen s Guide to Dynamic Programming Jeff Chen Stolen extensively from past lectures October 6, 2007 Unfortunately, programmer, no one can be told what DP is. You have to try it for yourself. Adapted

More information

17/05/2018. Outline. Outline. Divide and Conquer. Control Abstraction for Divide &Conquer. Outline. Module 2: Divide and Conquer

17/05/2018. Outline. Outline. Divide and Conquer. Control Abstraction for Divide &Conquer. Outline. Module 2: Divide and Conquer Module 2: Divide and Conquer Divide and Conquer Control Abstraction for Divide &Conquer 1 Recurrence equation for Divide and Conquer: If the size of problem p is n and the sizes of the k sub problems are

More information

Computational biology course IST 2015/2016

Computational biology course IST 2015/2016 Computational biology course IST 2015/2016 Introduc)on to Algorithms! Algorithms: problem- solving methods suitable for implementation as a computer program! Data structures: objects created to organize

More information

An Analysis of Pairwise Sequence Alignment Algorithm Complexities: Needleman-Wunsch, Smith-Waterman, FASTA, BLAST and Gapped BLAST

An Analysis of Pairwise Sequence Alignment Algorithm Complexities: Needleman-Wunsch, Smith-Waterman, FASTA, BLAST and Gapped BLAST An Analysis of Pairwise Sequence Alignment Algorithm Complexities: Needleman-Wunsch, Smith-Waterman, FASTA, BLAST and Gapped BLAST Alexander Chan 5075504 Biochemistry 218 Final Project An Analysis of Pairwise

More information

Algorithms IV. Dynamic Programming. Guoqiang Li. School of Software, Shanghai Jiao Tong University

Algorithms IV. Dynamic Programming. Guoqiang Li. School of Software, Shanghai Jiao Tong University Algorithms IV Dynamic Programming Guoqiang Li School of Software, Shanghai Jiao Tong University Dynamic Programming Shortest Paths in Dags, Revisited Shortest Paths in Dags, Revisited The special distinguishing

More information

DDS Dynamic Search Trees

DDS Dynamic Search Trees DDS Dynamic Search Trees 1 Data structures l A data structure models some abstract object. It implements a number of operations on this object, which usually can be classified into l creation and deletion

More information

Data Structures and Algorithms Week 8

Data Structures and Algorithms Week 8 Data Structures and Algorithms Week 8 Dynamic programming Fibonacci numbers Optimization problems Matrix multiplication optimization Principles of dynamic programming Longest Common Subsequence Algorithm

More information

Treewidth and graph minors

Treewidth and graph minors Treewidth and graph minors Lectures 9 and 10, December 29, 2011, January 5, 2012 We shall touch upon the theory of Graph Minors by Robertson and Seymour. This theory gives a very general condition under

More information

Attendance (2) Performance (3) Oral (5) Total (10) Dated Sign of Subject Teacher

Attendance (2) Performance (3) Oral (5) Total (10) Dated Sign of Subject Teacher Attendance (2) Performance (3) Oral (5) Total (10) Dated Sign of Subject Teacher Date of Performance:... Actual Date of Completion:... Expected Date of Completion:... ----------------------------------------------------------------------------------------------------------------

More information

Chapter 1: Number and Operations

Chapter 1: Number and Operations Chapter 1: Number and Operations 1.1 Order of operations When simplifying algebraic expressions we use the following order: 1. Perform operations within a parenthesis. 2. Evaluate exponents. 3. Multiply

More information

Introduction to Algorithms

Introduction to Algorithms Lecture 1 Introduction to Algorithms 1.1 Overview The purpose of this lecture is to give a brief overview of the topic of Algorithms and the kind of thinking it involves: why we focus on the subjects that

More information

FastA & the chaining problem

FastA & the chaining problem FastA & the chaining problem We will discuss: Heuristics used by the FastA program for sequence alignment Chaining problem 1 Sources for this lecture: Lectures by Volker Heun, Daniel Huson and Knut Reinert,

More information

Course Number 432/433 Title Algebra II (A & B) H Grade # of Days 120

Course Number 432/433 Title Algebra II (A & B) H Grade # of Days 120 Whitman-Hanson Regional High School provides all students with a high- quality education in order to develop reflective, concerned citizens and contributing members of the global community. Course Number

More information

Homework 2. Sample Solution. Due Date: Thursday, May 31, 11:59 pm

Homework 2. Sample Solution. Due Date: Thursday, May 31, 11:59 pm Homework Sample Solution Due Date: Thursday, May 31, 11:59 pm Directions: Your solutions should be typed and submitted as a single pdf on Gradescope by the due date. L A TEX is preferred but not required.

More information

Lecture 8: Mergesort / Quicksort Steven Skiena

Lecture 8: Mergesort / Quicksort Steven Skiena Lecture 8: Mergesort / Quicksort Steven Skiena Department of Computer Science State University of New York Stony Brook, NY 11794 4400 http://www.cs.stonybrook.edu/ skiena Problem of the Day Give an efficient

More information

An Interesting Way to Combine Numbers

An Interesting Way to Combine Numbers An Interesting Way to Combine Numbers Joshua Zucker and Tom Davis October 12, 2016 Abstract This exercise can be used for middle school students and older. The original problem seems almost impossibly

More information

Memoization/Dynamic Programming. The String reconstruction problem. CS124 Lecture 11 Spring 2018

Memoization/Dynamic Programming. The String reconstruction problem. CS124 Lecture 11 Spring 2018 CS124 Lecture 11 Spring 2018 Memoization/Dynamic Programming Today s lecture discusses memoization, which is a method for speeding up algorithms based on recursion, by using additional memory to remember

More information

Module 27: Chained Matrix Multiplication and Bellman-Ford Shortest Path Algorithm

Module 27: Chained Matrix Multiplication and Bellman-Ford Shortest Path Algorithm Module 27: Chained Matrix Multiplication and Bellman-Ford Shortest Path Algorithm This module 27 focuses on introducing dynamic programming design strategy and applying it to problems like chained matrix

More information

Cilk, Matrix Multiplication, and Sorting

Cilk, Matrix Multiplication, and Sorting 6.895 Theory of Parallel Systems Lecture 2 Lecturer: Charles Leiserson Cilk, Matrix Multiplication, and Sorting Lecture Summary 1. Parallel Processing With Cilk This section provides a brief introduction

More information

Practice Problems for the Final

Practice Problems for the Final ECE-250 Algorithms and Data Structures (Winter 2012) Practice Problems for the Final Disclaimer: Please do keep in mind that this problem set does not reflect the exact topics or the fractions of each

More information

Chapter 3:- Divide and Conquer. Compiled By:- Sanjay Patel Assistant Professor, SVBIT.

Chapter 3:- Divide and Conquer. Compiled By:- Sanjay Patel Assistant Professor, SVBIT. Chapter 3:- Divide and Conquer Compiled By:- Assistant Professor, SVBIT. Outline Introduction Multiplying large Integers Problem Problem Solving using divide and conquer algorithm - Binary Search Sorting

More information

Lectures by Volker Heun, Daniel Huson and Knut Reinert, in particular last years lectures

Lectures by Volker Heun, Daniel Huson and Knut Reinert, in particular last years lectures 4 FastA and the chaining problem We will discuss: Heuristics used by the FastA program for sequence alignment Chaining problem 4.1 Sources for this lecture Lectures by Volker Heun, Daniel Huson and Knut

More information

Sorting. There exist sorting algorithms which have shown to be more efficient in practice.

Sorting. There exist sorting algorithms which have shown to be more efficient in practice. Sorting Next to storing and retrieving data, sorting of data is one of the more common algorithmic tasks, with many different ways to perform it. Whenever we perform a web search and/or view statistics

More information

Matrix algorithms: fast, stable, communication-optimizing random?!

Matrix algorithms: fast, stable, communication-optimizing random?! Matrix algorithms: fast, stable, communication-optimizing random?! Ioana Dumitriu Department of Mathematics University of Washington (Seattle) Joint work with Grey Ballard, James Demmel, Olga Holtz, Robert

More information

Optimization II: Dynamic Programming

Optimization II: Dynamic Programming Chapter 12 Optimization II: Dynamic Programming In the last chapter, we saw that greedy algorithms are efficient solutions to certain optimization problems. However, there are optimization problems for

More information

(Refer Slide Time 04:53)

(Refer Slide Time 04:53) Programming and Data Structure Dr.P.P.Chakraborty Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur Lecture 26 Algorithm Design -1 Having made a preliminary study

More information

Week 12: Running Time and Performance

Week 12: Running Time and Performance Week 12: Running Time and Performance 1 Most of the problems you have written in this class run in a few seconds or less Some kinds of programs can take much longer: Chess algorithms (Deep Blue) Routing

More information

UC Berkeley CS 170: Efficient Algorithms and Intractable Problems Handout 1 Lecturer: Michael Jordan August 28, Notes 1 for CS 170

UC Berkeley CS 170: Efficient Algorithms and Intractable Problems Handout 1 Lecturer: Michael Jordan August 28, Notes 1 for CS 170 UC Berkeley CS 170: Efficient Algorithms and Intractable Problems Handout 1 Lecturer: Michael Jordan August 28, 2005 Notes 1 for CS 170 1 Topics to be covered 1. Data Structures and big-o Notation: a quick

More information

Lecture 8. Dynamic Programming

Lecture 8. Dynamic Programming Lecture 8. Dynamic Programming T. H. Cormen, C. E. Leiserson and R. L. Rivest Introduction to Algorithms, 3rd Edition, MIT Press, 2009 Sungkyunkwan University Hyunseung Choo choo@skku.edu Copyright 2000-2018

More information

Run Times. Efficiency Issues. Run Times cont d. More on O( ) notation

Run Times. Efficiency Issues. Run Times cont d. More on O( ) notation Comp2711 S1 2006 Correctness Oheads 1 Efficiency Issues Comp2711 S1 2006 Correctness Oheads 2 Run Times An implementation may be correct with respect to the Specification Pre- and Post-condition, but nevertheless

More information

CS 506, Sect 002 Homework 5 Dr. David Nassimi Foundations of CS Due: Week 11, Mon. Apr. 7 Spring 2014

CS 506, Sect 002 Homework 5 Dr. David Nassimi Foundations of CS Due: Week 11, Mon. Apr. 7 Spring 2014 CS 506, Sect 002 Homework 5 Dr. David Nassimi Foundations of CS Due: Week 11, Mon. Apr. 7 Spring 2014 Study: Chapter 4 Analysis of Algorithms, Recursive Algorithms, and Recurrence Equations 1. Prove the

More information

11. Divide and Conquer

11. Divide and Conquer The whole is of necessity prior to the part. Aristotle 11. Divide and Conquer 11.1. Divide and Conquer Many recursive algorithms follow the divide and conquer philosophy. They divide the problem into smaller

More information

Algorithm Design and Analysis

Algorithm Design and Analysis Algorithm Design and Analysis LECTURE 16 Dynamic Programming (plus FFT Recap) Adam Smith 9/24/2008 A. Smith; based on slides by E. Demaine, C. Leiserson, S. Raskhodnikova, K. Wayne Discrete Fourier Transform

More information

n = 1 What problems are interesting when n is just 1?

n = 1 What problems are interesting when n is just 1? What if n=1??? n = 1 What problems are interesting when n is just 1? Sorting? No Median finding? No Addition? How long does it take to add one pair of numbers? Multiplication? How long does it take to

More information

Introduction to Algorithms

Introduction to Algorithms Introduction to Algorithms Dynamic Programming Well known algorithm design techniques: Brute-Force (iterative) ti algorithms Divide-and-conquer algorithms Another strategy for designing algorithms is dynamic

More information

Lecturers: Sanjam Garg and Prasad Raghavendra March 20, Midterm 2 Solutions

Lecturers: Sanjam Garg and Prasad Raghavendra March 20, Midterm 2 Solutions U.C. Berkeley CS70 : Algorithms Midterm 2 Solutions Lecturers: Sanjam Garg and Prasad aghavra March 20, 207 Midterm 2 Solutions. (0 points) True/False Clearly put your answers in the answer box in front

More information

Partha Sarathi Manal

Partha Sarathi Manal MA 515: Introduction to Algorithms & MA353 : Design and Analysis of Algorithms [3-0-0-6] Lecture 29 http://www.iitg.ernet.in/psm/indexing_ma353/y09/index.html Partha Sarathi Manal psm@iitg.ernet.in Dept.

More information

1 More on the Bellman-Ford Algorithm

1 More on the Bellman-Ford Algorithm CS161 Lecture 12 Shortest Path and Dynamic Programming Algorithms Scribe by: Eric Huang (2015), Anthony Kim (2016), M. Wootters (2017) Date: May 15, 2017 1 More on the Bellman-Ford Algorithm We didn t

More information

Section 3.1 Gaussian Elimination Method (GEM) Key terms

Section 3.1 Gaussian Elimination Method (GEM) Key terms Section 3.1 Gaussian Elimination Method (GEM) Key terms Rectangular systems Consistent system & Inconsistent systems Rank Types of solution sets RREF Upper triangular form & back substitution Nonsingular

More information

Algorithmics. Some information. Programming details: Ruby desuka?

Algorithmics. Some information. Programming details: Ruby desuka? Algorithmics Bruno MARTIN, University of Nice - Sophia Antipolis mailto:bruno.martin@unice.fr http://deptinfo.unice.fr/~bmartin/mathmods.html Analysis of algorithms Some classical data structures Sorting

More information

Dynamic Programming. Outline and Reading. Computing Fibonacci

Dynamic Programming. Outline and Reading. Computing Fibonacci Dynamic Programming Dynamic Programming version 1.2 1 Outline and Reading Matrix Chain-Product ( 5.3.1) The General Technique ( 5.3.2) -1 Knapsac Problem ( 5.3.3) Dynamic Programming version 1.2 2 Computing

More information

Dynamic Programming Part I: Examples. Bioinfo I (Institut Pasteur de Montevideo) Dynamic Programming -class4- July 25th, / 77

Dynamic Programming Part I: Examples. Bioinfo I (Institut Pasteur de Montevideo) Dynamic Programming -class4- July 25th, / 77 Dynamic Programming Part I: Examples Bioinfo I (Institut Pasteur de Montevideo) Dynamic Programming -class4- July 25th, 2011 1 / 77 Dynamic Programming Recall: the Change Problem Other problems: Manhattan

More information

FastA and the chaining problem, Gunnar Klau, December 1, 2005, 10:

FastA and the chaining problem, Gunnar Klau, December 1, 2005, 10: FastA and the chaining problem, Gunnar Klau, December 1, 2005, 10:56 4001 4 FastA and the chaining problem We will discuss: Heuristics used by the FastA program for sequence alignment Chaining problem

More information

CMPSCI 250: Introduction to Computation. Lecture #14: Induction and Recursion (Still More Induction) David Mix Barrington 14 March 2013

CMPSCI 250: Introduction to Computation. Lecture #14: Induction and Recursion (Still More Induction) David Mix Barrington 14 March 2013 CMPSCI 250: Introduction to Computation Lecture #14: Induction and Recursion (Still More Induction) David Mix Barrington 14 March 2013 Induction and Recursion Three Rules for Recursive Algorithms Proving

More information

CS303 (Spring 2008) Solutions to Assignment 7

CS303 (Spring 2008) Solutions to Assignment 7 CS303 (Spring 008) Solutions to Assignment 7 Problem 1 Here is our implementation of a class for long numbers. It includes addition, subtraction, and both ways of multiplying. The rest of the code, not

More information

10/24/ Rotations. 2. // s left subtree s right subtree 3. if // link s parent to elseif == else 11. // put x on s left

10/24/ Rotations. 2. // s left subtree s right subtree 3. if // link s parent to elseif == else 11. // put x on s left 13.2 Rotations MAT-72006 AA+DS, Fall 2013 24-Oct-13 368 LEFT-ROTATE(, ) 1. // set 2. // s left subtree s right subtree 3. if 4. 5. // link s parent to 6. if == 7. 8. elseif == 9. 10. else 11. // put x

More information

Lecture 12: Dynamic Programming Part 1 10:00 AM, Feb 21, 2018

Lecture 12: Dynamic Programming Part 1 10:00 AM, Feb 21, 2018 CS18 Integrated Introduction to Computer Science Fisler, Nelson Lecture 12: Dynamic Programming Part 1 10:00 AM, Feb 21, 2018 Contents 1 Introduction 1 2 Fibonacci 2 Objectives By the end of these notes,

More information

Brief review from last class

Brief review from last class Sequence Alignment Brief review from last class DNA is has direction, we will use only one (5 -> 3 ) and generate the opposite strand as needed. DNA is a 3D object (see lecture 1) but we will model it

More information

Special course in Computer Science: Advanced Text Algorithms

Special course in Computer Science: Advanced Text Algorithms Special course in Computer Science: Advanced Text Algorithms Lecture 8: Multiple alignments Elena Czeizler and Ion Petre Department of IT, Abo Akademi Computational Biomodelling Laboratory http://www.users.abo.fi/ipetre/textalg

More information

CSE373: Data Structures and Algorithms Lecture 4: Asymptotic Analysis. Aaron Bauer Winter 2014

CSE373: Data Structures and Algorithms Lecture 4: Asymptotic Analysis. Aaron Bauer Winter 2014 CSE373: Data Structures and Algorithms Lecture 4: Asymptotic Analysis Aaron Bauer Winter 2014 Previously, on CSE 373 We want to analyze algorithms for efficiency (in time and space) And do so generally

More information

l So unlike the search trees, there are neither arbitrary find operations nor arbitrary delete operations possible.

l So unlike the search trees, there are neither arbitrary find operations nor arbitrary delete operations possible. DDS-Heaps 1 Heaps - basics l Heaps an abstract structure where each object has a key value (the priority), and the operations are: insert an object, find the object of minimum key (find min), and delete

More information

So far... Finished looking at lower bounds and linear sorts.

So far... Finished looking at lower bounds and linear sorts. So far... Finished looking at lower bounds and linear sorts. Next: Memoization -- Optimization problems - Dynamic programming A scheduling problem Matrix multiplication optimization Longest Common Subsequence

More information

Curriculum Map: Mathematics

Curriculum Map: Mathematics Curriculum Map: Mathematics Course: Honors Advanced Precalculus and Trigonometry Grade(s): 11-12 Unit 1: Functions and Their Graphs This chapter will develop a more complete, thorough understanding of

More information

Recurrences and Memoization: The Fibonacci Sequence

Recurrences and Memoization: The Fibonacci Sequence Chapter 7 Recurrences and Memoization: The Fibonacci Sequence Copyright Oliver Serang, 208 University of Montana Department of Computer Science The Fibonacci sequence occurs frequently in nature and has

More information

Subset sum problem and dynamic programming

Subset sum problem and dynamic programming Lecture Notes: Dynamic programming We will discuss the subset sum problem (introduced last time), and introduce the main idea of dynamic programming. We illustrate it further using a variant of the so-called

More information

CS 231: Algorithmic Problem Solving

CS 231: Algorithmic Problem Solving CS 231: Algorithmic Problem Solving Naomi Nishimura Module 5 Date of this version: June 14, 2018 WARNING: Drafts of slides are made available prior to lecture for your convenience. After lecture, slides

More information

Lecture 13: Divide and Conquer (1997) Steven Skiena. skiena

Lecture 13: Divide and Conquer (1997) Steven Skiena.   skiena Lecture 13: Divide and Conquer (1997) Steven Skiena Department of Computer Science State University of New York Stony Brook, NY 11794 4400 http://www.cs.sunysb.edu/ skiena Problem Solving Techniques Most

More information

Advanced Algorithms. Problem solving Techniques. Divide and Conquer הפרד ומשול

Advanced Algorithms. Problem solving Techniques. Divide and Conquer הפרד ומשול Advanced Algorithms Problem solving Techniques. Divide and Conquer הפרד ומשול 1 Divide and Conquer A method of designing algorithms that (informally) proceeds as follows: Given an instance of the problem

More information

Algorithms (IX) Guoqiang Li. School of Software, Shanghai Jiao Tong University

Algorithms (IX) Guoqiang Li. School of Software, Shanghai Jiao Tong University Algorithms (IX) Guoqiang Li School of Software, Shanghai Jiao Tong University Q: What we have learned in Algorithm? Algorithm Design Algorithm Design Basic methodologies: Algorithm Design Algorithm Design

More information

(a) (4 pts) Prove that if a and b are rational, then ab is rational. Since a and b are rational they can be written as the ratio of integers a 1

(a) (4 pts) Prove that if a and b are rational, then ab is rational. Since a and b are rational they can be written as the ratio of integers a 1 CS 70 Discrete Mathematics for CS Fall 2000 Wagner MT1 Sol Solutions to Midterm 1 1. (16 pts.) Theorems and proofs (a) (4 pts) Prove that if a and b are rational, then ab is rational. Since a and b are

More information

Longest Common Subsequence, Knapsack, Independent Set Scribe: Wilbur Yang (2016), Mary Wootters (2017) Date: November 6, 2017

Longest Common Subsequence, Knapsack, Independent Set Scribe: Wilbur Yang (2016), Mary Wootters (2017) Date: November 6, 2017 CS161 Lecture 13 Longest Common Subsequence, Knapsack, Independent Set Scribe: Wilbur Yang (2016), Mary Wootters (2017) Date: November 6, 2017 1 Overview Last lecture, we talked about dynamic programming

More information

U.C. Berkeley CS170 : Algorithms, Fall 2013 Midterm 1 Professor: Satish Rao October 10, Midterm 1 Solutions

U.C. Berkeley CS170 : Algorithms, Fall 2013 Midterm 1 Professor: Satish Rao October 10, Midterm 1 Solutions U.C. Berkeley CS170 : Algorithms, Fall 2013 Midterm 1 Professor: Satish Rao October 10, 2013 Midterm 1 Solutions 1 True/False 1. The Mayan base 20 system produces representations of size that is asymptotically

More information

1. NUMBER SYSTEMS USED IN COMPUTING: THE BINARY NUMBER SYSTEM

1. NUMBER SYSTEMS USED IN COMPUTING: THE BINARY NUMBER SYSTEM 1. NUMBER SYSTEMS USED IN COMPUTING: THE BINARY NUMBER SYSTEM 1.1 Introduction Given that digital logic and memory devices are based on two electrical states (on and off), it is natural to use a number

More information

Pairwise Sequence Alignment: Dynamic Programming Algorithms. COMP Spring 2015 Luay Nakhleh, Rice University

Pairwise Sequence Alignment: Dynamic Programming Algorithms. COMP Spring 2015 Luay Nakhleh, Rice University Pairwise Sequence Alignment: Dynamic Programming Algorithms COMP 571 - Spring 2015 Luay Nakhleh, Rice University DP Algorithms for Pairwise Alignment The number of all possible pairwise alignments (if

More information

Lecture 2: Getting Started

Lecture 2: Getting Started Lecture 2: Getting Started Insertion Sort Our first algorithm is Insertion Sort Solves the sorting problem Input: A sequence of n numbers a 1, a 2,..., a n. Output: A permutation (reordering) a 1, a 2,...,

More information

Outline. CS38 Introduction to Algorithms. Fast Fourier Transform (FFT) Fast Fourier Transform (FFT) Fast Fourier Transform (FFT)

Outline. CS38 Introduction to Algorithms. Fast Fourier Transform (FFT) Fast Fourier Transform (FFT) Fast Fourier Transform (FFT) Outline CS8 Introduction to Algorithms Lecture 9 April 9, 0 Divide and Conquer design paradigm matrix multiplication Dynamic programming design paradigm Fibonacci numbers weighted interval scheduling knapsack

More information

CSC-140 Assignment 5

CSC-140 Assignment 5 CSC-140 Assignment 5 Please do not Google a solution to these problems, cause that won t teach you anything about programming - the only way to get good at it, and understand it, is to do it! 1 Introduction

More information

IN101: Algorithmic techniques Vladimir-Alexandru Paun ENSTA ParisTech

IN101: Algorithmic techniques Vladimir-Alexandru Paun ENSTA ParisTech IN101: Algorithmic techniques Vladimir-Alexandru Paun ENSTA ParisTech License CC BY-NC-SA 2.0 http://creativecommons.org/licenses/by-nc-sa/2.0/fr/ Outline Previously on IN101 Python s anatomy Functions,

More information

Lecture 3: February Local Alignment: The Smith-Waterman Algorithm

Lecture 3: February Local Alignment: The Smith-Waterman Algorithm CSCI1820: Sequence Alignment Spring 2017 Lecture 3: February 7 Lecturer: Sorin Istrail Scribe: Pranavan Chanthrakumar Note: LaTeX template courtesy of UC Berkeley EECS dept. Notes are also adapted from

More information

PROGRAM EFFICIENCY & COMPLEXITY ANALYSIS

PROGRAM EFFICIENCY & COMPLEXITY ANALYSIS Lecture 03-04 PROGRAM EFFICIENCY & COMPLEXITY ANALYSIS By: Dr. Zahoor Jan 1 ALGORITHM DEFINITION A finite set of statements that guarantees an optimal solution in finite interval of time 2 GOOD ALGORITHMS?

More information

6.001 Notes: Section 4.1

6.001 Notes: Section 4.1 6.001 Notes: Section 4.1 Slide 4.1.1 In this lecture, we are going to take a careful look at the kinds of procedures we can build. We will first go back to look very carefully at the substitution model,

More information