The Running Time of Programs

The 90-10 Rule

Many programs exhibit the property that most of their running time is spent in a small fraction of the source code. An informal rule states that 90% of the running time is spent in 10% of the code. While the exact percentages vary from program to program, the 90-10 rule says that most programs exhibit significant locality in where their running time is spent.

Measuring the Running Time

We can measure the running time of a program simply by using wall-clock time. However, this methodology has several disadvantages, including:

    Input dependence
    Machine dependence

We want a more analytical method for measuring the running time of a program instead of relying on wall-clock time. If we consider a program at the highest level, it accepts certain inputs, processes them, and outputs the result. We want to come up with a running-time measurement that depends only on the program's input size. In this way, the measurement is independent of the type of input and of the hardware on which the program runs.

What should be the standard measurement for the running time of a program that is independent of hardware and input type?

    [Diagram: Input -> Program -> Output]

Introducing Asymptotic Bound

Asymptotic bound analysis relates the running time of a program to its input size. This relation will depend on the algorithms and data structures that the program uses to process the input.

Asymptotic Upper Bound (Big O)

Big O of f(n) is a set of functions T(n) with the property that there exist a positive constant C and a value N such that for all n >= N:

    T(n) <= C*f(n)

The diagram below shows how the above definition can be visualized:
    [Figure: for all n >= N, the curve T(n) lies below the curve C*f(n)]

We normally use the overloaded notation T(n) = O(f(n)) when we say that T(n) is of order f(n). When we prove a property about Big O, we have the freedom to choose arbitrary positive constants. So, we can see, for example, that 10,000n is in O(n), since we can choose N to be 0 and C to be 10,000.

Is n^2 in O(n)? We would have to demonstrate that we can find C and N to satisfy the following condition:

    n^2 <= Cn for every n >= N

But no matter how large we choose C, there will always be an n value greater than C (since n is a variable) that makes n^2 > Cn. Therefore, n^2 is not in O(n).

Examples:

    4n^3 + 3n^2 + 4 is order O(n^3)
    4n is order O(n^3)
When we use big O to analyze the running time of a program, the variable n is the size of the input and T(n) is the running time. As mentioned before, the "shape" of the function T(n) will depend on the algorithms and data structures employed in the program. If T(n) = 4n^3 + n^2 + 1, the running time of the program is 4n^3 + n^2 + 1, which is O(n^3) (pronounced "big O of n cubed").

An asymptotic bound is a measure of the order of growth of the function T(n). We definitely want T(n) to have a small order so that the running time grows "reasonably" with the increase in the input size. And what is a reasonable order of growth? It has been commonly accepted wisdom in computer science that a reasonable order is polynomial order; that is, T(n) is O(n^d) for some d >= 1.

When we consider a polynomial function, the term with the highest exponent dominates. To appreciate why this is so, consider T(n) = 4n^3 + n^2 + 1. If n = 10^20, we have T(n) = 4(10^60) + 10^40 + 1. We can see that 4(10^60) >> 10^40, so the latter term becomes insignificant (a 20-order-of-magnitude difference!). So, the order of T(n) is determined by the highest exponent.

Can we prove that T(n) is actually O(n^3)? If we choose C to be 6, we ask if we can find N such that for all n >= N:

    4n^3 + n^2 + 1 <= 6n^3

Since n^2 <= n^3 and 1 <= n^3 for n >= 1, we have:

    4n^3 + n^2 + 1 <= 4n^3 + n^3 + n^3 = 6n^3

So if we choose N = 1, the RHS is always greater than or equal to the LHS for n >= N.

Tightness

First, we generally want the tightest big oh upper bound we can prove. That is, if T(n) is O(n^2), we want to say so, rather than make the technically true but weaker statement that T(n) is O(n^3). On the other hand, this way lies madness, because if we like O(n^2) as an expression of running time, we should like O(0.5n^2) even better, because it is tighter, and we should like O(0.01n^2) still more, and so on.
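The choice of C = 6 and N = 1 above can be spot-checked numerically. The following sketch (my own illustration; the function names are arbitrary) evaluates both sides of the inequality over a range of n. Note that a finite check is only evidence; the proof is the term-by-term argument above.

```python
# Spot-check the bound 4n^3 + n^2 + 1 <= 6n^3 for n >= N = 1.
def T(n):
    return 4 * n**3 + n**2 + 1

def bound(n, C=6):
    return C * n**3

# Collect any n in [1, 10000] where the claimed bound fails.
violations = [n for n in range(1, 10_001) if T(n) > bound(n)]
```

At n = 1 the two sides are exactly equal (both are 6), which is why N cannot be pushed below 1 with this constant.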
However, since constant factors don't matter in big oh expressions, there is really no point in trying to make the estimate of running time tighter by shrinking the constant factor. Thus, whenever possible, we try to use a big oh expression that has a constant factor of 1. More precisely, we shall say that f(n) is a tight big oh bound on T(n) if

1. T(n) is O(f(n)), and
2. f(n) is also O(T(n))

The following lists some of the more common running times for programs and their informal names:

    O(1)         constant
    O(log n)     logarithmic
    O(n)         linear
    O(n log n)   n log n
    O(n^2)       quadratic
    O(n^3)       cubic
    O(2^n)       exponential

Examples of Analysis of the Running Time

1. Finding the closest pair of points

Given n points in a coordinate (x, y) plane, find the two points with the closest distance to each other.
We propose a program that implements the following algorithm to tackle this problem. The algorithm checks all pairs of points and updates the "minimum distance" value each time it calculates the distance between two points.

What is the running time T(n) of this program? We will first assume that the calculation of the distance between two points takes a constant amount of time and, hence, is independent of the number of input points.

How many pairs of points do we need to consider? If there are 4 points, we start with the first one and check it against the other three points. Then, we pick the second point and check it against all the others except the first, and so on.

In general, for n points, we need to check

    (n-1) + (n-2) + ... + 1 = n(n-1)/2 pairs

Therefore, T(n) = C*n(n-1)/2, so T(n) = O(n^2).

Consider the pseudo code for solving this problem:

    for (i = 0; i < numpoints; i++) {
        for (j = i + 1; j < numpoints; j++) {
            Find distance from point i to point j and store in d
            if (d < minimum distance) update minimum distance to d
        }
    }

We see that there are two loops, one nested inside the other. The outer loop will iterate numpoints times (which is n times if numpoints equals n). The inner loop will iterate fewer than numpoints times, but it will never go beyond numpoints. So, we can say that both loops have numpoints as their upper bound, and, hence, we can deduce the running time of this code by arguing that the code will iterate at most numpoints x numpoints times. Hence, it has a running time T(n) of O(n^2).
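The brute-force pair check above can be sketched as runnable code. This is a minimal illustration; the function and variable names are my own choices, not from the notes.

```python
# Brute-force closest pair: check all n(n-1)/2 pairs -- O(n^2).
from math import dist, inf

def closest_pair(points):
    """Return the closest pair of points and their distance."""
    best = inf
    best_pair = None
    n = len(points)
    for i in range(n):                 # outer loop: n iterations
        for j in range(i + 1, n):      # inner loop: at most n iterations
            d = dist(points[i], points[j])   # constant-time distance
            if d < best:               # update the minimum distance
                best = d
                best_pair = (points[i], points[j])
    return best_pair, best
```

For 4 points this performs exactly 3 + 2 + 1 = 6 distance calculations, matching the counting argument above.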
2. Array processing

An input to a given program is an array of size n. We want to "compress" this array by reducing each run of duplicated adjacent elements to just one. For example, given the input:

    1, 1, 2, 2, 2, 3, 1, 1

the program returns:

    1, 2, 3, 1

(The first run of 1s is reduced from 2 to 1, and the run of 2s from 3 to 1.)

Consider the following pseudo code used to solve this problem:

    i = 0; j = 0;
    while (i < array.length) {
        array[j] = array[i];
        do {
            i++;
        } while ((i < array.length) && (array[i] == array[j]));
        j++;
    }

It has two nested loops, just like in the first example. So, can we argue that the running time T(n) of this code is O(n^2), where n is the size of the array (array.length)? Not quite. If we look closely, the variable i increases every time the code enters the inner loop, and i cannot go above array.length. Therefore, this code has running time T(n) = O(n).

3. Insertion Sort

Move the key forward starting with the first position. Maintain the invariant that the data before and including the key index is sorted.

    8 2 4 9 3 6
    2 8 4 9 3 6
    2 4 8 9 3 6
    2 4 8 9 3 6
    2 3 4 8 9 6
    2 3 4 6 8 9

Consider two types of inputs to insertion sort:

Case I (Best Case Analysis): 2 3 4 6 8 9
The data is already sorted; hence, there are no exchange operations.

Case II (Worst Case Analysis): 9 8 6 4 3 2
The data is sorted in reverse order, so every pair of elements in the array must be exchanged for every key position.
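The compression pass above can be sketched in Python as follows. This is my own rendering of the pseudo code; the name compress is not from the notes.

```python
# Collapse runs of equal adjacent elements in place -- O(n), because
# the index i only ever moves forward through the array.
def compress(arr):
    """Compress runs in place; return the compressed length."""
    if not arr:
        return 0
    i = 0
    j = 0
    n = len(arr)
    while i < n:
        arr[j] = arr[i]              # keep one representative of the run
        i += 1
        while i < n and arr[i] == arr[j]:
            i += 1                   # skip the rest of the run
        j += 1
    return j
```

Even though the loops are nested, every iteration of either loop advances i, so the total work across all iterations is bounded by n.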
The example case we gave earlier can be considered the average case (not best, not worst, but in between). When we analyze the running time, we usually do it for the worst case, since we want to bound the running time and be able to say that no matter what type of input a program accepts, its running time is never above this bound.

Insertion Sort
Input: Array A[0, 1, ..., n-1]

    for (j = 1; j < n; j++) {
        key = A[j];                        // part a
        i = j - 1;                         // part a
        while ((i >= 0) && (A[i] > key)) {
            A[i+1] = A[i];                 // part c
            i--;                           // part c
        }
        A[i+1] = key;                      // part b
    }

Analyze the worst-case runtime:

    Part a + part b = k (constant time per iteration)
    Part c = m (constant time per shift)

    Iteration 1 runtime: k + m
    Iteration 2 runtime: k + 2m (shift twice when the key moves to the 2nd position)
    ...
    Iteration n-1 runtime: k + (n-1)m

    T(n) = k(n-1) + (1 + 2 + 3 + ... + (n-1))m
         = k(n-1) + m(n-1)(n/2)
         = O(n^2)
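The worst-case count of shifts can be checked by instrumenting the sort. This is a minimal sketch of the pseudo code above (the shift counter is my own addition for illustration).

```python
# Insertion sort, counting the "part c" shifts. In the worst case
# (reverse-sorted input) the j-th insertion shifts j elements, so the
# total is 1 + 2 + ... + (n-1) = n(n-1)/2 shifts.
def insertion_sort(A):
    shifts = 0
    for j in range(1, len(A)):
        key = A[j]                   # part a
        i = j - 1                    # part a
        while i >= 0 and A[i] > key:
            A[i + 1] = A[i]          # part c: shift one element right
            i -= 1
            shifts += 1
        A[i + 1] = key               # part b
    return shifts
```

For the reverse-sorted input 9 8 6 4 3 2 (n = 6), this performs 6*5/2 = 15 shifts; for already-sorted input it performs 0, matching the best-case analysis.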
4. Merge sort

Input: Array A[0, 1, ..., n-1]

MS(A[0, 1, ..., n-1])
1. if n = 1, return
2. Recursively sort A[0 .. n/2 - 1] (array 1) and A[n/2 .. n-1] (array 2)
3. Merge arrays 1 and 2 together

    8 3 2 9 7 1 5 4
    8 3 2 9 | 7 1 5 4
    8 3 | 2 9 | 7 1 | 5 4
    8 | 3 | 2 | 9 | 7 | 1 | 5 | 4
    3 8 | 2 9 | 1 7 | 4 5
    2 3 8 9 | 1 4 5 7
    1 2 3 4 5 7 8 9
How do we merge two sorted arrays?

Maintain pointers l and r, each pointing to an element in the left and right arrays, respectively. Compare the two pointed-to elements; if one is less than or equal to the other, bring it down to the third "merge" array and advance its pointer.

    [Figure: the merge array grows one element per comparison:
     1 / 1 2 / 1 2 3 / 1 2 3 4 / 1 2 3 4 5 / 1 2 3 4 5 7]

When r reaches the end of its array, we can bring down the rest of the elements, starting from the one pointed to by l, one by one in order:

    1 2 3 4 5 7 8 9
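The merge step described above can be sketched as follows. The function name merge and the list-based representation are my own choices for illustration.

```python
# Merge two already-sorted lists in O(n): each comparison moves exactly
# one element into the result, and there are at most n elements to move.
def merge(left, right):
    merged = []
    l, r = 0, 0                      # pointers into the two arrays
    while l < len(left) and r < len(right):
        if left[l] <= right[r]:      # bring the smaller element down
            merged.append(left[l])
            l += 1
        else:
            merged.append(right[r])
            r += 1
    merged.extend(left[l:])          # one side is exhausted:
    merged.extend(right[r:])         # copy the remainder in order
    return merged
```

For example, merging 2 3 8 9 with 1 4 5 7 produces 1 2 3 4 5 7 8 9, exactly the last merge level in the diagram above.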
Since merge sort involves recursive function calls, to calculate its running time we need a basis for calculating the running time of recursive functions. Consider the recursive factorial below:

    int fact(int n)
    {
    (1)     if (n <= 1)
    (2)         return 1;                /* basis */
            else
    (3)         return n * fact(n-1);    /* induction */
    }

Since there is only one function, fact, involved, we shall use T(n) for the unknown running time of this function. We shall use n, the value of the argument, as the size of the argument. Clearly, recursive calls made by fact when the argument is n have a smaller argument, n-1 to be precise.

For the basis of the inductive definition of T(n) we shall take n = 1, since no recursive call is made by fact when its argument is 1. With n = 1, the condition of line (1) is true, and so the call to fact executes lines (1) and (2). Each takes O(1) time, and so the running time of fact in the basis case is O(1). That is, T(1) is O(1).

Now consider what happens when n > 1. The condition of line (1) is false, and so we execute only lines (1) and (3). Line (1) takes O(1) time, and line (3) takes O(1) for the multiplication and return, plus T(n-1) for the recursive call to fact. That is, for n > 1, the running time of fact is O(1) + T(n-1). We can thus define T(n) by the following recurrence relation:

    BASIS. T(1) = O(1).
    INDUCTION. T(n) = O(1) + T(n-1), for n > 1.

We now invent constant symbols to stand for the constants hidden within the various big oh expressions. In this case, we can replace the O(1) in the basis by some constant a, and the O(1) in the induction by some constant b. These changes give us the following recurrence relation:

    BASIS. T(1) = a.
    INDUCTION. T(n) = b + T(n-1), for n > 1.

Now we must solve this recurrence for T(n). We can calculate the first few values easily. T(1) = a by the basis. Then, by the inductive rule, we have:

    T(2) = b + T(1) = a + b

Continuing to use the inductive rule, we get

    T(3) = b + T(2) = b + (a + b) = a + 2b

Then

    T(4) = b + T(3) = b + (a + 2b) = a + 3b
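The pattern a, a+b, a+2b, a+3b suggests the closed form T(n) = a + (n-1)b. Before proving it by induction, we can sanity-check the guess mechanically (my own illustration, with arbitrary constants a and b):

```python
# Compare the recurrence T(1) = a, T(n) = b + T(n-1) against the
# guessed closed form T(n) = a + (n-1)b for a range of n.
def T_recurrence(n, a, b):
    return a if n == 1 else b + T_recurrence(n - 1, a, b)

def T_closed(n, a, b):
    return a + (n - 1) * b
```

Agreement on sample values is not a proof, but it is the "compute, guess, then prove by induction" workflow the next paragraph describes.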
By this point, it should be no surprise if we guess that T(n) = a + (n-1)b for all n >= 1. Indeed, computing some sample values, then guessing a solution, and finally proving our guess correct by an inductive proof is a common method of dealing with recurrences.

Now we are ready to calculate the runtime of merge sort. First, we need to consider the running time of the merge procedure. Merging two sorted arrays, each of size n/2, to get a resulting sorted array of size n takes:

    T(n) = cn = O(n)

This is so because we use constant time to compare the two elements at the fronts of the left and right arrays and bring the smaller of the two down to the merged array, and we have at most n elements to bring down.

We are now ready to set up the recurrence relation for the running time of merge sort as follows:

    T(n) = T(n/2) + T(n/2) + cn    (1)

Clearly, T(1) = c, and for n > 1 we perform repeated substitution:

    T(n/2) = T(n/4) + T(n/4) + cn/2
    T(n/4) = T(n/8) + T(n/8) + cn/4

This substitution unfolds into a tree: the root costs cn, its two children cost c(n/2) each, the four grandchildren cost c(n/4) each, and so on. We then sum all the nodes at the same level of the tree:

    cn                                  = cn
    c(n/2) + c(n/2)                     = cn
    c(n/4) + c(n/4) + c(n/4) + c(n/4)   = cn
    ...

Every level sums to cn. As the tree grows down to the bottom level, the height of the tree is log2 n, so taking the sum over all levels gives the total for merge sort as a whole:

    T(n) = cn log2 n

Hence, T(n) = O(n log n). And if we substitute T(n) = cn log2 n into equation (1) above, we can verify that this is indeed a correct solution to it.
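Putting the MS procedure and the merge step together gives a complete merge sort. This is a minimal sketch in Python; the names are my own, and for brevity the merge is inlined rather than calling a separate procedure.

```python
# Merge sort: split in half, recursively sort, merge -- O(n log n).
def merge_sort(A):
    if len(A) <= 1:                  # basis: a 0- or 1-element array is sorted
        return A
    mid = len(A) // 2
    left = merge_sort(A[:mid])       # recursively sort array 1
    right = merge_sort(A[mid:])      # recursively sort array 2
    merged = []                      # merge step: O(n) at this level
    l = r = 0
    while l < len(left) and r < len(right):
        if left[l] <= right[r]:
            merged.append(left[l]); l += 1
        else:
            merged.append(right[r]); r += 1
    return merged + left[l:] + right[r:]
```

Running it on the example input 8 3 2 9 7 1 5 4 reproduces the final row of the recursion diagram.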
5. Recursive Fibonacci

    Fib(int n) {
        if (n == 1) return 1;
        if (n == 2) return 1;
        return Fib(n-1) + Fib(n-2);
    }

Call tree for Fib(6):

    Fib(6)
    |-- Fib(5)
    |   |-- Fib(4)
    |   |   |-- Fib(3)
    |   |   |   |-- Fib(2)
    |   |   |   `-- Fib(1)
    |   |   `-- Fib(2)
    |   `-- Fib(3)
    |       |-- Fib(2)
    |       `-- Fib(1)
    `-- Fib(4)
        |-- Fib(3)
        |   |-- Fib(2)
        |   `-- Fib(1)
        `-- Fib(2)

This tree is different from that of merge sort. Its height is n, not log2 n as in merge sort. Therefore, the number of nodes in this tree is on the order of 2^n, and, hence, the running time of Fib() is of exponential order. Calculating Fib(100) with this code on a state-of-the-art machine takes around 100,000 years!

Notice that this tree is not quite a complete binary tree, so the running time is not quite O(2^n); the base will be less than 2. In fact, the base turns out to be the golden ratio, which is approximately 1.62. So, to be more precise, the running time of this Fib is O(1.62^n).

The bloat in the running time is due to the many redundant repeated calculations. For example, the above tree depicts the same calculation of Fib(3) three times.

6. Fibonacci with Memoization

To improve the running time of the recursive Fibonacci, we want to put the already calculated value resulting from a particular call in the memo. The next time the same call is encountered, just get the
resulting value from the memo instead of generating further recursive calls. Our memo is a static array A, as shown:

      x    1    1    .    .   ...
    A[0] A[1] A[2] A[3] A[4]  ...

The following is the code for Fibonacci with memoization:

    FibMemo(n) {
        if (n <= computed) return A[n];
        A[n] = FibMemo(n-1) + FibMemo(n-2);
        computed = n;
        return A[n];
    }

    Initialize: A[1] = 1; A[2] = 1; computed = 2;

Call tree for FibMemo(6): only the left spine FibMemo(6) -> FibMemo(5) -> FibMemo(4) -> FibMemo(3) performs actual calculation; the right children FibMemo(4), FibMemo(3), FibMemo(2), FibMemo(1) get their values from the memo.

The running time of FibMemo is O(n), since effectively there are only n calls to FibMemo when calculating the Fibonacci number for n.
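FibMemo can be sketched in Python as follows. Here a dictionary (my own choice) plays the role of the static array A and the `computed` counter from the notes.

```python
# Memoized Fibonacci: each value 1..n is computed exactly once, so the
# running time drops from exponential to O(n).
def fib_memo(n, memo=None):
    if memo is None:
        memo = {1: 1, 2: 1}          # initialize: A[1] = A[2] = 1
    if n in memo:
        return memo[n]               # get the value from the memo
    memo[n] = fib_memo(n - 1, memo) + fib_memo(n - 2, memo)  # actual calculation
    return memo[n]
```

With memoization, values such as Fib(50) return instantly, whereas the plain recursive version already takes minutes in this range and Fib(100) is out of reach entirely.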