CS241 - week 1 review

Special classes of algorithms:
  logarithmic: $O(\log n)$
  linear: $O(n)$
  quadratic: $O(n^2)$
  polynomial: $O(n^k)$, $k \ge 1$
  exponential: $O(a^n)$, $a > 1$

Classifying algorithms is generally done in terms of worst-case running time:
  $O(f(n))$: Big Oh, asymptotic upper bound
  $\Omega(f(n))$: Big Omega, asymptotic lower bound
  $\Theta(f(n))$: Theta, asymptotic tight bound

Plotting run-time graphically

For functions $f(n)$ and $g(n)$, $f(n)$ is $O(g(n))$ if there are positive constants $c$ and $n_0$ such that:
  $f(n) \le c\,g(n)$ for $n \ge n_0$
conclusion: $2n+6$ is $O(n)$ since $2n+6 \le 4n$ for $n \ge 3$.
[Figure: plot of $f(n) = 2n+6$, $g(n) = n$ and $c\,g(n) = 4n$ against $n$, with $n_0 = 3$.]

Plotting run-time graphically

On the other hand, $n^2$ is not $O(n)$ because there are no $c$ and $n_0$ such that:
  $n^2 \le cn$ for $n \ge n_0$
No matter how large $c$ is chosen, there is an $n$ big enough that $n^2 > cn$ (any $n > c$ will do).
[Figure: plot of $n^2$ against $cn$.]

Prefix Averages - Algorithm #1

An algorithm for computing prefix averages

Algorithm prefixAverages1(X):
Input: An n-element array X of numbers.
Output: An n-element array A of numbers such that A[i] is the average of elements X[1], ..., X[i].
  1. Create an array A such that length[A] = length[X]
  2. n ← length[X]
  3. for i ← 1 to n do
  4.     a ← 0
  5.     for j ← 1 to i do
  6.         a ← a + X[j]
  7.     A[i] ← a / i
  8. return array A

Analysis:
  a single statement in a loop body: 1 step
  inner loop (line 5): i iterations, with i = 1, 2, ..., n
  outer loop (line 3): n iterations

Run time: Prefix Averages - Algorithm #1

  cost   statement
  c_1    Create an array A such that length[A] = length[X]
  c_2    n ← length[X]
  c_3    for i ← 1 to n do
  c_4        a ← 0
  c_5        for j ← 1 to i do
  c_6            a ← a + X[j]
  c_7        A[i] ← a / i
  c_8    return array A

$$T(n) = c_1 n + c_2 + c_3(n+1) + c_4 n + c_5 \sum_{j=1}^{n}(j+1) + c_6 \sum_{j=1}^{n} j + c_7 n + c_8$$
$$= (c_5/2 + c_6/2)\,n^2 + (3c_5/2 + c_6/2 + c_1 + c_3 + c_4 + c_7)\,n + (c_2 + c_3 + c_8)$$
$$= an^2 + bn + c = \Theta(n^2)$$

Prefix Averages - Algorithm #2

A better algorithm for computing prefix averages:

Algorithm prefixAverages2(X):
Input: An n-element array X of numbers.
Output: An n-element array A of numbers such that A[i] is the average of elements X[1], ..., X[i].
  Create an array A such that length[A] = length[X]
  n ← length[X]
  s ← 0
  for i ← 1 to n do
      s ← s + X[i]
      A[i] ← s / i
  return array A

Analysis: the single loop runs for n iterations of constant work each, so the running time is $\Theta(n)$.
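The two prefix-average algorithms above can be sketched in Python as follows. This is a minimal illustration, not code from the course: it uses 0-based lists instead of the 1-based arrays in the pseudocode, and the function names are chosen here for the example.

```python
def prefix_averages1(x):
    """Quadratic version: recompute every prefix sum from scratch."""
    n = len(x)
    a = [0.0] * n
    for i in range(n):              # outer loop: n iterations
        total = 0
        for j in range(i + 1):      # inner loop: i + 1 iterations (0-based i)
            total += x[j]
        a[i] = total / (i + 1)      # average of x[0..i]
    return a


def prefix_averages2(x):
    """Linear version: carry the running sum s from one prefix to the next."""
    n = len(x)
    a = [0.0] * n
    s = 0
    for i in range(n):              # single loop: n iterations
        s += x[i]
        a[i] = s / (i + 1)
    return a
```

Both functions return the same list for the same input; only the number of additions differs, which is where the $\Theta(n^2)$ versus $\Theta(n)$ gap comes from.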

Math Review

Logarithms
A logarithm is an inverse exponential function. Saying $b^x = y$ is equivalent to saying $\log_b y = x$.

notation convention for logarithms:
  $\lg n = \log_2 n$ (binary logarithm)
  $\ln n = \log_e n$ (natural logarithm)

properties of logarithms:
  $\log_b(xy) = \log_b x + \log_b y$
  $\log_b(x/y) = \log_b x - \log_b y$
  $\log_b x^a = a \log_b x$
  $\log_b a = \log_x a / \log_x b$
  $a = b^{\log_b a}$ (e.g., $n = 2^{\lg n} = n^{\lg 2}$)

While exponential functions grow very fast, log functions grow very slowly.

Math Review

Iterated logarithm function ($\lg^* n$):
  - "log star of n"
  - Very slow growing, e.g. $\lg^*(2^{65536}) = 5$ (and $2^{65536}$ is much larger than the number of atoms in the observable universe!!)

More Math Review

Floor: $\lfloor x \rfloor$ = the largest integer $\le x$
Ceiling: $\lceil x \rceil$ = the smallest integer $\ge x$

Summations: (see Appendix A, p.1058)
Geometric, Telescoping & Harmonic series: (see Appendix A)

Handy Asymptotic Facts
  a) If $T(n)$ is a polynomial function of degree $k$, then $T(n) = O(n^k)$
  b) $\log^k n = O(n)$ for any constant $k$
  c) $n^k = O(2^n)$ for any constant $k > 0$
  d) $n! = o(n^n)$
  e) $n! = \omega(2^n)$
  f) $\lg(n!) = \Theta(n \lg n)$
  g) The base of a logarithm doesn't matter asymptotically, but the base of an exponential function and the degree of a polynomial do matter asymptotically

Proving Correctness - Insertion Sort

Insertion-Sort(A)
  for j ← 2 to length(A) do
      key ← A[j]
      i ← j - 1
      while i > 0 and A[i] > key do
          A[i+1] ← A[i]
          i ← i - 1
      A[i+1] ← key

Loop invariant: At the start of each iteration of the for loop, the subarray A[1...j-1] consists of the elements originally in A[1...j-1], but in sorted order.

We need to show that the loop invariant is true
  prior to the first iteration,
  before each subsequent iteration, so it remains true for the next iteration,
  when the loop terminates.

Proving Correctness - Insertion Sort

Basis: When j = 2, A[1...j-1] has a single element and is therefore trivially sorted.

Inductive step: During the (k-1)st iteration, j = k, there are k-1 sorted items in the subarray A[1...k-1], and key = A[k]. A[k-1], A[k-2], A[k-3] and so on are each moved one position to the right until either a value not greater than key is found or all k-1 values have been shifted right; the value of key is then inserted. Due to the total ordering on integers, key will be inserted in the right position. Therefore, the loop invariant holds at the start of the kth iteration.

Termination: The for loop ends when j = n+1. By the inductive hypothesis, we have that the subarray A[1...n] is in sorted order. Therefore, the entire array is sorted and the algorithm is correct.
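The insertion sort pseudocode and its loop invariant can be exercised with a short Python sketch. The `check_invariant` flag is an addition made for this illustration; it asserts only the "sorted order" part of the invariant on the 0-based prefix `a[0..j-1]`, not that the prefix holds the original elements.

```python
def insertion_sort(a, check_invariant=False):
    """Sort the list a in place (0-based indices) and return it."""
    for j in range(1, len(a)):
        if check_invariant:
            # Invariant at the start of each iteration: a[0..j-1] is sorted.
            assert all(a[t] <= a[t + 1] for t in range(j - 1))
        key = a[j]
        i = j - 1
        # Shift every element greater than key one position to the right.
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key
    return a
```

Calling `insertion_sort(data, check_invariant=True)` on a few small lists is a cheap way to see the basis and inductive step of the proof in action; it is evidence, not a substitute for the proof.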

Divide-and-Conquer Algorithms

The divide-and-conquer paradigm (Ch.2)
  divide the problem into a number of subproblems
  conquer the subproblems (solve them)
  combine the subproblem solutions to get the solution to the original problem

Example: Merge Sort
  divide the n-element sequence to be sorted into two n/2-element sequences.
  conquer the subproblems recursively using merge sort.
  combine the resulting two sorted n/2-element sequences by merging.

Merge-Sort

Merge(A,p,q,r)
  n_1 ← q-p+1; n_2 ← r-q
  Create arrays L[1...n_1+1] and R[1...n_2+1]
  for i ← 1 to n_1 do
      L[i] ← A[p+i-1]
  for i ← 1 to n_2 do
      R[i] ← A[q+i]
  L[n_1+1] ← R[n_2+1] ← ∞
  i ← j ← 1
  for k ← p to r do
      if L[i] ≤ R[j] then
          A[k] ← L[i]
          i ← i+1
      else
          A[k] ← R[j]
          j ← j+1

Merge-Sort(A,p,r)
  if p < r then
      q ← ⌊(p+r)/2⌋
      Merge-Sort(A,p,q)
      Merge-Sort(A,q+1,r)
      Merge(A,p,q,r)

Initial call: Merge-Sort(A, 1, length(A))

The Merge subroutine takes $\Theta(n)$ time to merge n elements that are divided into two sorted arrays of n/2 elements each.

Analyzing Divide-and-Conquer Algorithms

A recursive algorithm can often be described by a recurrence equation that describes the overall runtime on a problem of size n in terms of the runtime on smaller inputs.

For divide-and-conquer algorithms, we get recurrences like:
$$T(n) = \begin{cases} \Theta(1) & \text{if } n \le c \\ a\,T(n/b) + D(n) + C(n) & \text{otherwise} \end{cases}$$
where
  a = the number of subproblems we divide the problem into
  n/b = the size of the subproblems (in terms of n)
  D(n) = time to divide the size n problem into subproblems
  C(n) = time to combine the subproblem solutions to get the answer for the problem of size n

Analyzing Merge-Sort

Recurrence for worst-case running time for Merge-Sort:
$$T(n) = \begin{cases} \Theta(1) & \text{if } n = 1 \\ 2\,T(n/2) + \Theta(n) & \text{otherwise} \end{cases}$$
  a = 2 (two subproblems)
  n/b = n/2 (each subproblem has size approx. n/2)
  D(n) = $\Theta(1)$ (divide: just compute midpoint of array)
  C(n) = $\Theta(n)$ (merge: merging can be done by scanning sorted subarrays)
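The Merge and Merge-Sort pseudocode above translates to Python roughly as follows. This is a sketch under a few assumptions made for the example: indices are 0-based and inclusive, the input holds finite numbers (so `math.inf` can serve as the sentinel ∞), and the default arguments on `merge_sort` are a convenience that is not part of the slides.

```python
import math


def merge(a, p, q, r):
    """Merge the sorted runs a[p..q] and a[q+1..r] (inclusive) in place."""
    left = a[p:q + 1] + [math.inf]        # sentinel marks the end of the run
    right = a[q + 1:r + 1] + [math.inf]
    i = j = 0
    for k in range(p, r + 1):             # one pass over r - p + 1 slots: Theta(n)
        if left[i] <= right[j]:
            a[k] = left[i]
            i += 1
        else:
            a[k] = right[j]
            j += 1


def merge_sort(a, p=0, r=None):
    """Sort a[p..r] in place and return a; defaults sort the whole list."""
    if r is None:
        r = len(a) - 1
    if p < r:
        q = (p + r) // 2                  # divide: midpoint, Theta(1)
        merge_sort(a, p, q)               # conquer the left half
        merge_sort(a, q + 1, r)           # conquer the right half
        merge(a, p, q, r)                 # combine, Theta(n)
    return a
```

The three commented steps in `merge_sort` line up with D(n), the two recursive calls, and C(n) in the recurrence.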

Recursion Tree for Merge-Sort

Recurrence for worst-case running time for Merge-Sort:
$$T(n) = \begin{cases} c & \text{if } n = 1 \\ 2\,T(n/2) + cn & \text{otherwise} \end{cases}$$
[Figure: recursion tree with root cost cn, two children of cost cn/2 each, and so on down to n leaves of cost c; height h = lg n; the costs on each level sum to cn.]
Total: $cn \lg n + cn$

Solving Recurrences

There are 3 general methods for solving recurrences (Ch. 4)
  1. Iteration (recursion tree): Convert the recurrence to a summation and bound the summation (not always straightforward).
  2. Substitution (Guess & Verify): Guess a solution and verify it is correct with an inductive proof.
  3. Apply the "Master Theorem": If the recurrence has the form $T(n) = aT(n/b) + f(n)$ then there is a formula that can (often) be applied.

To make the solutions simpler, we will
  ignore floors and ceilings (justification in text)
  assume base cases are constant, i.e., $T(n) = \Theta(1)$ (constant time) for n small enough

Solving Recurrences: Iteration

Example: $T(n) = 4T(n/2) + n$
$$\begin{aligned}
T(n) &= 4T(n/2) + n \\
&= 4(4T(n/4) + n/2) + n && \text{expand} \\
&= 16T(n/4) + 2n + n && \text{simplify} \\
&= 16(4T(n/8) + n/4) + 2n + n && \text{expand} \\
&= 64T(n/8) + 4n + 2n + n && \text{simplify} \\
&= \cdots \\
&= 4^{\lg n}\,T(1) + \cdots + 4n + 2n + n && \lg n \text{ levels} \\
&= c\,4^{\lg n} + n \sum_{k=0}^{\lg n - 1} 2^k && \text{convert to summation} \\
&= c\,n^{\lg 4} + n\,(2^{\lg n} - 1) && 4^{\lg n} = n^{\lg 4} = n^2 \\
&= c\,n^2 + n(n - 1) && 2^{\lg n} = n^{\lg 2} = n \\
&= \Theta(n^2)
\end{aligned}$$

Solving Recurrences: Iteration

Intuitive Help: Can represent this as a recursion tree and identify computation with each node/level in the tree.
  root represents computation (D(n) + C(n)) at top level of recursion
  node at level i represents subproblem at level i in the recursion
  height of tree is number of levels in the recursion
  T(n) = sum of all nodes in the tree

Recursion Tree: $T(n) = 4T(n/2) + n$
[Figure: recursion tree with root cost cn and four children per node, each of cost cn/2 at the next level; level sums are cn, 2cn, 4cn, ...; height h = lg n; $4^{\lg n}$ leaves of cost $\Theta(1)$ each.]
Total: $\Theta(n^2)$

Solving Recurrences: Substitution

This method involves
  guessing form of solution
  use mathematical induction to find the constants and verify solution (can't disregard lower order terms here)
Can use this method to find an upper or a lower bound (do both to obtain a tight bound)

Example: $T(n) = 4T(n/2) + n$ (find upper bound)
guess $T(n) = O(n^3)$ and try to show $T(n) \le cn^3$ for some $c > 0$. basis?
$$\begin{aligned}
T(n) &= 4T(n/2) + n \\
&\le 4\,(c\,(n/2)^3) + n && \text{by inductive hypothesis} \\
&= (c/2)\,n^3 + n \\
&= cn^3 - ((c/2)\,n^3 - n) \\
&\le cn^3 && \text{if } c \ge 2 \text{ and } n \ge 1
\end{aligned}$$
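The iteration result for $T(n) = 4T(n/2) + n$ can be sanity-checked numerically. The sketch below assumes n is a power of two and a base case T(1) = c, which is how the derivation treats it; the function names are made up for this example.

```python
def t_recursive(n, c=1):
    """Evaluate T(n) = 4*T(n/2) + n directly, with T(1) = c (n a power of two)."""
    if n == 1:
        return c
    return 4 * t_recursive(n // 2, c) + n


def t_closed_form(n, c=1):
    """Closed form reached by iteration: c*n^2 + n*(n - 1)."""
    return c * n * n + n * (n - 1)


# Compare the two on n = 1, 2, 4, ..., 1024.
for k in range(11):
    n = 2 ** k
    assert t_recursive(n) == t_closed_form(n)
    # The substitution guess T(n) <= 2*n^3 also holds here, since T(1) = 1 <= 2.
    assert t_recursive(n) <= 2 * n ** 3
```

Agreement on a finite range is only a check on the algebra; the induction is still what proves the bound for all n.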

Substitution (Guess & Verify)

Example 2: Give an upper bound for $T(n) = 2T(n/2) + n$
guess $T(n) = O(n)$ and try to show $T(n) \le cn$ for some $c > 0$ (you have to find c). basis?
assume $T(k) \le ck$ for $k < n$, and prove $T(n) \le cn$.
$$\begin{aligned}
T(n) &= 2T(n/2) + n \\
&\le 2c\,(n/2) + n && \text{by inductive hypothesis} \\
&= cn + n = O(n) && \text{WRONG!!!}
\end{aligned}$$
why? The induction has to establish the exact form $T(n) \le cn$, and $cn + n \le cn$ fails for every choice of c.

Example 2: Show $T(n) = 2T(n/2) + n$ is $\Omega(n \lg n)$ using the substitution method.

Master Theorem

The master method provides a 'cookbook' method for solving recurrences of a certain form.

Master Theorem: Let $a \ge 1$ and $b > 1$ be constants, let $f(n)$ be a function, and let $T(n)$ be defined on nonnegative integers as:
$$T(n) = a\,T(n/b) + f(n)$$
Then, $T(n)$ can be bounded asymptotically as follows:
  1. $T(n) = \Theta(n^{\log_b a})$ if $f(n) = O(n^{\log_b a - \varepsilon})$ for some constant $\varepsilon > 0$
  2. $T(n) = \Theta(n^{\log_b a} \log n)$ if $f(n) = \Theta(n^{\log_b a})$
  3. $T(n) = \Theta(f(n))$ if $f(n) = \Omega(n^{\log_b a + \varepsilon})$ for some constant $\varepsilon > 0$ and if $a f(n/b) \le c f(n)$ for some constant $c < 1$ and all sufficiently large n

Intuition: Compare $f(n)$ with $\Theta(n^{\log_b a})$.
  case 1: $f(n)$ is "polynomially smaller than" $\Theta(n^{\log_b a})$
  case 2: $f(n)$ is "asymptotically equal to" $\Theta(n^{\log_b a})$
  case 3: $f(n)$ is "polynomially larger than" $\Theta(n^{\log_b a})$

What is $\log_b a$? The number of times we divide a by b to reach O(1).

Example: $T(n) = 9T(n/3) + n$
  $a = 9$, $b = 3$, $f(n) = n$, $n^{\log_b a} = n^{\log_3 9} = n^2$
  compare $f(n) = n$ with $n^2$: $n = O(n^{2-\varepsilon})$ (so $f(n)$ is polynomially smaller than $n^{\log_b a}$)
  case 1 applies: $T(n) = \Theta(n^2)$

Example (in class): $T(n) = T((2/3)n) + 1$
Example (in class): $T(n) = 3T(n/4) + n \lg n$

Problem 1: (in class) $T(n) = 2T(n/2) + n^3$
Problem 2: (in class) $T(n) = 4T(n/2) + n^2/\lg n$
Problem 3: (in class) $T(n) = 7T(n/3) + n^2$
Problem 4: (in class) $T(n) = 7T(n/2) + n$
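For the special case where the driving function is a plain power, $f(n) = n^d$, choosing a Master Theorem case reduces to comparing d with $\log_b a$. The helper below is a hypothetical illustration of that comparison only: it does not handle driving functions such as $n \lg n$ or $n^2/\lg n$, and it relies on the fact that for $f(n) = n^d$ with $d > \log_b a$ the regularity condition holds because $a/b^d < 1$.

```python
import math


def master_case_polynomial(a, b, d):
    """Classify T(n) = a*T(n/b) + n**d; return (case number, bound as text)."""
    if a < 1 or b <= 1:
        raise ValueError("the Master Theorem needs a >= 1 and b > 1")
    critical = math.log(a) / math.log(b)      # the exponent log_b a
    if math.isclose(d, critical):
        return 2, f"Theta(n^{critical:g} log n)"
    if d < critical:
        return 1, f"Theta(n^{critical:g})"
    # d > log_b a: a*(n/b)^d = (a / b^d) * n^d with a / b^d < 1 (regularity).
    return 3, f"Theta(n^{d:g})"


# The worked example from the slides: T(n) = 9T(n/3) + n.
case, bound = master_case_polynomial(9, 3, 1)
```

The floating-point comparison uses `math.isclose` so that exponents which are equal on paper are not separated by rounding error.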

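Definitions (the limit characterizations assume the limit exists)

"Big Oh"
  $g(n) = O(f(n))$ if $\exists\, c > 0$ and $n_0 > 0$ s.t. $g(n) \le c f(n)$ $\forall\, n > n_0$.
  $g(n) = O(f(n))$ iff $\lim_{n \to \infty} g(n)/f(n) = c$ for some $c \ge 0$

"little Oh"
  $g(n) = o(f(n))$ if $\forall\, c > 0$, some $n_0 > 0$ s.t. $g(n) < c f(n)$ $\forall\, n > n_0$.
  $g(n) = o(f(n))$ iff $\lim_{n \to \infty} g(n)/f(n) = 0$

"Big Omega"
  $g(n) = \Omega(f(n))$ if $\exists\, c > 0$ and $n_0 > 0$ s.t. $g(n) \ge c f(n)$ $\forall\, n > n_0$.
  $g(n) = \Omega(f(n))$ iff $\lim_{n \to \infty} f(n)/g(n) = c$ for some $c \ge 0$

"little Omega"
  $g(n) = \omega(f(n))$ if $\forall\, c > 0$, some $n_0 > 0$ s.t. $g(n) > c f(n)$ $\forall\, n > n_0$.
  $g(n) = \omega(f(n))$ iff $\lim_{n \to \infty} g(n)/f(n) = \infty$

"Big Theta"
  $g(n) = \Theta(f(n))$ if $\exists\, c_1, c_2 > 0$ and $n_0 > 0$ s.t. $c_1 f(n) \le g(n) \le c_2 f(n)$ $\forall\, n > n_0$.
  $g(n) = \Theta(f(n))$ iff $\lim_{n \to \infty} g(n)/f(n) = c$ for some $c > 0$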
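The Big Oh definition asks for witness constants c and $n_0$. A finite check like the one below cannot prove an asymptotic claim, but it can catch a wrong pair of constants quickly. The function name and the `upto` cutoff are assumptions made for this sketch.

```python
def is_witness(g, f, c, n0, upto):
    """Check g(n) <= c*f(n) for every integer n with n0 < n <= upto."""
    return all(g(n) <= c * f(n) for n in range(n0 + 1, upto + 1))


# 2n + 6 <= 4n holds for all n > 2, so (c, n0) = (4, 2) passes the check.
linear_ok = is_witness(lambda n: 2 * n + 6, lambda n: n, c=4, n0=2, upto=10_000)

# n^2 <= 100n breaks as soon as n > 100, so this pair is rejected.
quadratic_ok = is_witness(lambda n: n * n, lambda n: n, c=100, n0=1, upto=10_000)
```

A rejected pair only rules out those particular constants; showing that no pair works, as for $n^2$ versus $n$, still needs the argument that any fixed c is exceeded once n > c.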

