Algorithms and Data Structures


Algorithms and Data Structures

Spring 2019

Alexis Maciel
Department of Computer Science
Clarkson University

Copyright © 2019 Alexis Maciel


Contents

1 Analysis of Algorithms
    Introduction
    Measuring Exact Running Times
    Analysis
    Asymptotic Analysis
    Other Asymptotic Relations
    Some Common Asymptotic Running Times
    Basic Strategies
    Analyzing Summations
    Worst-Case and Average-Case Analysis
    The Binary Search Algorithm

2 Recursion
    The Technique
    When to Use Recursion
    Tail Recursion
    Analysis of Recursive Algorithms

3 Sorting
    Selection Sort
    Insertion Sort
    Mergesort
    Quicksort
    Analysis of Quicksort
    Partitioning Algorithm
    A Selection Algorithm
    A Lower Bound for Comparison-Based Sorting
    Sorting in Linear Time

4 Heaps
    Priority Queues
    The Heap Data Structure
    Heap Operations
    Heapsort
    Building a Heap in Linear Time

5 Hash Tables
    Maps
    Direct-Address Tables
    The Hash Table Data Structure
    Analysis

Index

Chapter 1 Analysis of Algorithms

In this chapter, we will learn how to analyze algorithms in order to evaluate their efficiency. We will also discuss the relative advantages and disadvantages of analysis compared to measuring exact running times through testing.

1.1 Introduction

In general, an algorithm is efficient if it uses a small amount of computational resources. The two resources that are most often considered are running time and memory space. An example of another resource is randomness. (Algorithms that use randomness are usually studied in a course such as CS447 Computer Algorithms.)

In this chapter, we will learn how to evaluate the efficiency of algorithms. We will focus on running time but the main concepts and techniques we will learn also apply to other resources. We will learn that the efficiency of algorithms can be evaluated by analyzing them. The analysis can be done from pseudocode,

which allows us to choose efficient algorithms without having to implement the inefficient ones. Note that algorithm analysis is also useful for the analysis of data structures, if only because data structure operations are algorithms.

Study Questions

What does it mean for an algorithm to be efficient?

What two computational resources are most often considered?

1.2 Measuring Exact Running Times

When choosing or designing an algorithm for a particular problem, there are two questions that can be asked: Is the algorithm fast enough? Is it as fast as possible?

The first question is perhaps the more pragmatic. To be able to answer that question, however, we need to know exactly what is meant by fast enough. One possibility would be precise time targets such as 5 ms. Now, the running time of an algorithm depends on several factors, including what data it is used on, what computer it runs on and exactly how it is coded. (The input data could be arguments, input files or data entered by the user.) If all that information is available, then tests can be run to accurately determine if the algorithm is fast enough. But very often, there are no precise time targets to meet. In that case, the safest approach is to choose the fastest algorithm among the available alternatives. So how can we determine which of several possible algorithms is fastest?
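One answer, which the text develops next, is to time candidate implementations over a range of input sizes. The sketch below is illustrative Python, not part of the notes; the three toy algorithms are stand-ins for real candidates with logarithmic, linear and quadratic running times.

```python
import time

def log_steps(n):
    # Stand-in for a Θ(log n) algorithm: repeatedly halve n.
    count = 0
    while n > 1:
        n //= 2
        count += 1
    return count

def linear_steps(n):
    # Stand-in for a Θ(n) algorithm: touch every value once.
    total = 0
    for i in range(n):
        total += i
    return total

def quadratic_steps(n):
    # Stand-in for a Θ(n²) algorithm: touch every pair of indices.
    count = 0
    for i in range(n):
        for j in range(n):
            count += 1
    return count

def measure(algorithm, n, repeats=3):
    # Best wall-clock time over a few runs, in seconds.
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        algorithm(n)
        best = min(best, time.perf_counter() - start)
    return best

# Compare over a wide range of input sizes, as the text recommends.
for n in (10, 100, 1000):
    times = {f.__name__: measure(f, n)
             for f in (log_steps, linear_steps, quadratic_steps)}
    print(n, times)
```

On most machines the quadratic stand-in falls far behind as n grows, while the gap at n = 10 is barely visible, which is exactly why a wide range of sizes is needed.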

                 n = 10     n = 10³     n = 10⁶
    log n µs     3 µs       10 µs       20 µs
    n µs         10 µs      1 ms        1 s
    n² µs        100 µs     1 s         12 days

Table 1.1: Running times of three algorithms

An obvious way is to implement each of the algorithms, run them and measure their running times. The choice of what computer to use probably doesn't matter much since if an algorithm is significantly faster than another on one computer, the same is probably true on most if not all computers. A more delicate issue is what inputs to use for the tests. Very often, we need an algorithm that will run well on a wide variety of inputs. So we could run tests on various inputs and compute the average running time of each algorithm.

But the running time of an algorithm can vary greatly, especially as a function of the size of the input. For example, suppose that three algorithms have running times log n µs, n µs and n² µs, where n is the size of the input. (In these notes, as is customary in computer science, when the base of a log is not specified, it is assumed to be 2.) Table 1.1 shows what these running times are for various input sizes. When the input size is only 10, the difference between the running times of these three algorithms is not that large. But at n = 10³, the difference is significant and at n = 10⁶, it is huge. Therefore, when comparing algorithms by measuring their running times, it is important to use a wide range of input sizes.

So we can determine which of several algorithms will be the fastest as follows: implement the algorithms, run them on a wide variety of inputs, and measure the running times. Of course, for the comparisons to be valid, the algorithms must be coded in the same language and run on the same computer and similar inputs.

This approach has several significant disadvantages. First, it requires that all the algorithms be implemented, even those that will end up not being used. Second, writing test drivers and running tests takes time, especially since we must test on a good number of inputs of each size to make sure we have a representative sample. Third, because all the algorithms being compared must be implemented in the same language and tested on the same computer and on similar inputs, earlier tests done on different computers, with different inputs, or using different programming languages often need to be repeated.

In the rest of this chapter, we will learn that it is possible to evaluate the running time of an algorithm in a way that addresses these problems.

Study Questions

When comparing the efficiency of algorithms, why is it usually important to compare running times over a wide range of input sizes?

What are three significant weaknesses of comparing algorithms by measuring exact running times?

1.3 Analysis

Our goal is to find a way to assess the running time of an algorithm without having to implement and test it. We also want this assessment to be valid for all implementations of the algorithm and for all the computers on which the

algorithm may run. And, of course, to be useful, this assessment should allow us to compare the running times of various algorithms.

Let's consider an example. Figure 1.1 shows pseudocode for an algorithm that prints the contents of an array.

    for i = 0 to n − 1
        print a[i]

Figure 1.1: Printing the contents of an array

The running time of this algorithm can be determined as follows. Before the first iteration of the loop, i is initialized and its value is compared to n − 1. At every iteration of the loop, an array element is accessed, then printed, i is incremented and then again compared to n − 1. The loop is executed n times. Therefore, the running time of the algorithm is

    t(n) = c_assign + c_comp + (c_index + c_print + c_incr + c_comp) n

where the c constants are the running times of the various basic operations performed by the algorithm. For example, c_assign is the time it takes to assign a value to an integer variable. We can simplify this expression by letting a = c_index + c_print + c_incr + c_comp and b = c_assign + c_comp. The running time of the algorithm can then be written as

    t(n) = an + b

If we knew the exact values of the constants a and b, this expression would allow us to determine the exact running time of the algorithm on inputs of any size. But the values of these constants depend on exactly how the algorithm is implemented and on which computer the algorithm will run. Recall that we want to assess the running time of an algorithm without having to implement

it. We also want this assessment to be valid for all computers. Therefore, we will not determine the values of the constants and instead focus on the general form of the running time as a function of n. In our example, the running time of the printing algorithm is a linear function of n.

Is that useful information? Knowing that the running time is a linear function doesn't allow us to determine the exact running time of the algorithm for any input size. But suppose that another algorithm has a running time that's a quadratic function of n, for example. Then we know that when n is large enough, the printing algorithm runs faster, much faster, than this other algorithm. This basic fact about linear and quadratic functions is apparent in the numbers that were given in Table 1.1. Therefore, it is useful to know that the running time of the printing algorithm is a linear function of n.

So analyzing an algorithm to determine the general form of its running time is a useful alternative to the measurement of exact running times through testing. It is useful because it can be used to determine that an algorithm will be faster than another one on every input that is large enough.

Analysis has three main advantages over measuring exact running times through testing. First, analysis can be carried out from pseudocode, without having to implement the algorithms. Second, analysis does not require writing test drivers or performing possibly time-consuming tests. Third, each algorithm needs to be analyzed only once because the results of the analysis are valid for every (reasonable) implementation of the algorithm and every computer and data the algorithm may run on.

On the other hand, analysis has three main disadvantages compared to measuring exact running times. First, it is not as precise. For example, it does not allow us to distinguish between two linear-time algorithms or to determine if an algorithm meets specific time targets.
Second, analysis is valid only for large enough inputs, not for small ones. Third, analysis may require difficult mathematics, although this is usually not the case with reasonably simple algorithms.

In general, analysis is a convenient and reliable way of quickly identifying large differences in running times. When more accuracy is needed, or when the analysis is too difficult, which can happen, we must then resort to measuring exact running times through testing.

In the following sections, we will learn more about the analysis of algorithms. But first, we end this section with a couple of general notes about analysis.

First, when analyzing the running time of the array printing algorithm, we gave separate names to the running times of the various basic operations. This is not really necessary. For example, we could simply consider that these constants are all equal to 1. This is acceptable because it doesn't change the general form of the function t(n); it only changes the values of the constant factors, which are unknown anyway. For example, for the array printing algorithm, we would have still obtained a linear function. (Note that considering that the running times of the basic operations are all 1 means that t(n) equals the number of operations performed by the algorithm.)

Second, when analyzing the running time of the array printing algorithm, we said that the running times of basic operations such as integer operations and array indexing are constant, meaning that they are independent of the size of the array. It turns out that this isn't exactly true. For example, the time it takes to assign a value to an integer variable depends on the number of bits used to store that integer. And the number of bits required to store an integer x is approximately log x. (At Clarkson, the representation of integer values in a computer's memory, as well as some aspects of the implementation of the basic operations on those integers, are covered in a course such as CS241 Computer Organization.)

Now, there are two types of integers in the array printing algorithm. On one hand, there are the integers i and n whose value is at most n. On the other

hand, there are the array elements whose values could in principle be much larger than that. In general, we consider that integers are small if their values are at most polynomial in n, that is, at most n^c for some constant c, where n is the size of the input or the parameter of interest. This implies that operations on small integers can be performed in time approximately c log n, where c is a constant. If we assume that the arrays contain only small numbers, which is a common assumption, then the running time of the array printing algorithm is really of the form

    t(n) = (an + b) c log n

Since this extra factor of log n usually occurs in every algorithm that solves the same problem, it is common practice to omit it. In other words, it is common to do as we did earlier and pretend that all basic operations can be executed in constant time. But note that this can only be done for basic operations performed on small numbers. For example, if the arrays could contain large numbers, we would need to know how large and then take this into account in our analysis.

Study Questions

As described in this section, what does analysis seek to determine?

What are three advantages and three disadvantages of analysis over the measurement of exact running times through testing?

1.4 Asymptotic Analysis

In the previous section, we saw that the general form of the running time of an algorithm, when expressed as a function of a parameter such as its input size, is a useful measure of the efficiency of the algorithm. For example, if we determine

that an algorithm has a linear running time then we know that it will run faster than any quadratic-time algorithm on every input that is large enough. But what should we make of a running time of the form t(n) = an + b log n + c? How does that compare to linear and quadratic running times, for example?

The key point to remember is that analysis allows us to compare the running times of algorithms when the input is large enough. When n is large enough, the terms b log n and c are insignificant compared to an. In other words, the dominant term an is the one that will essentially determine the running time for large enough values of n. This means that when n is large enough, an + b log n + c will behave essentially like the linear function an. We can make this more precise by taking the limit of the ratio of t(n) and its dominant term an:

    lim_{n→∞} t(n)/(an) = lim_{n→∞} (an)/(an) + lim_{n→∞} (b log n)/(an) + lim_{n→∞} c/(an) = 1 + 0 + 0 = 1

Once again, this says that when n is large enough, t(n) grows essentially like an. This is useful information because, for example, it tells us that when n is large enough, t(n) will be much smaller than any quadratic running time. Therefore, to compare t(n) to other running times, it is useful to simplify t(n) to an. In fact, since we don't know the value of a, we can simplify t(n) even further to just n. Note that

    lim_{n→∞} t(n)/n = a

and that a > 0 since running times are positive. In general, to make it easier to compare a running time t(n) to other running

times, we find a simple function f(n) such that

    lim_{n→∞} t(n)/f(n) = c

for some constant c > 0.

A problem occurs when the above limit does not exist. Consider, for example, the possibility that the running time of an algorithm is the following function:

    t(n) = n    if n is odd
           2n   if n is even

This running time seems similar to n but the limit lim_{n→∞} t(n)/n does not exist. Now, it is still true that t(n) grows like n. The key is the following pair of inequalities:

    n ≤ t(n) ≤ 2n

In other words, t(n) is sandwiched between two constant factors of n. This forces t(n) to grow like n. In particular, when n is large enough, t(n) will be much smaller than any quadratic running time because t(n) is no larger than the linear function 2n.

This idea of a running time being sandwiched between two constant factors of a simpler function is therefore an important concept. It can be defined precisely as follows.

Definition 1.1 Suppose that f and g are functions that map positive integers to nonnegative real numbers (f, g : ℤ>0 → ℝ≥0). We say that f(n) is Θ(g(n)) ("Theta of g of n") if there are positive constants a and b and a number n₀ such that for every n ≥ n₀,

    a g(n) ≤ f(n) ≤ b g(n)

This definition says that when n is large enough, f(n) is sandwiched between two constant multiples of g(n). And this forces f(n) to grow like g(n), which means that when n is large enough, f(n) will be much smaller than any function that grows much faster than g(n) and much larger than any function that grows much slower than g(n). For example, in the case of our previous running time t(n), we have that t(n) is Θ(n) and, when n is large enough, t(n) is much smaller than any quadratic function.

We can now revise what we said earlier: when analyzing an algorithm, to make it easier to compare its running time t(n) to other running times, we find a simple function f(n) such that t(n) is Θ(f(n)). This type of analysis is called asymptotic analysis. The function f(n) can be called the asymptotic running time of the algorithm. (Note that this is not a precisely defined concept since we haven't defined what we mean by "simple".)

When a function f(n) is Θ(g(n)), we say that f(n) is asymptotically equivalent to g(n). Some people also say that f(n) is order g(n). And when f(n) is Θ(g(n)), we often write f(n) = Θ(g(n)). But note that this equal sign is not a real equal sign. In particular, it doesn't make sense to write Θ(g(n)) = f(n).

Note the conditions on the functions f and g in the definition of Θ: these functions map positive integers to nonnegative real numbers. It is possible to define the Θ notation in a more general way but it would be less convenient. And this restriction on the functions is not a problem for our purposes since our functions will either be running times, which are always nonnegative, or simple functions like log n, n or n², which are also nonnegative. Note that we exclude 0 from the domain of the functions because log n is not defined when n = 0 and we include 0 in the range of the functions because log n = 0 when n = 1.
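Definition 1.1 can be exercised concretely. The following sketch is illustrative Python, not part of the notes; it checks the sandwich inequalities for the odd/even running time introduced above, with g(n) = n, a = 1, b = 2 and n₀ = 1.

```python
def t(n):
    # The running time from this section: n when n is odd, 2n when n is even.
    return n if n % 2 == 1 else 2 * n

# Check the sandwich a*g(n) <= t(n) <= b*g(n) with g(n) = n, a = 1, b = 2.
for n in range(1, 10_000):
    assert 1 * n <= t(n) <= 2 * n

# The ratio t(n)/n oscillates between 1 and 2, so lim t(n)/n does not exist,
# yet t(n) is still Θ(n) because the ratio is trapped between the constants.
ratios = {t(n) / n for n in range(1, 100)}
print(sorted(ratios))  # prints [1.0, 2.0]
```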
Now, when analyzing running times of algorithms, limits are still useful because they often give us an easy way to establish that a function is Θ of another one:

Theorem 1.2 Suppose that f, g : ℤ>0 → ℝ≥0. If

    lim_{n→∞} f(n)/g(n) = c

for some constant c > 0, then f(n) is Θ(g(n)).

For example, the first running time of this section was t(n) = an + b log n + c. We saw earlier that

    lim_{n→∞} (an + b log n + c)/n = a

In addition, a must be positive for this running time to be positive. Therefore, t(n) is Θ(n).

To prove the theorem, we would need a precise definition of the concept of a limit. We won't go that far but with an intuitive understanding of what a limit is, we can still give a fairly convincing sketch of the proof.

Proof (Sketch.) Suppose that

    lim_{n→∞} f(n)/g(n) = c

for some constant c > 0. Intuitively, this means that as n grows, f(n)/g(n) gets closer and closer to c. Eventually, it must be that f(n)/g(n) is no larger than 2c and no smaller than c/2, and that it remains that way forever. Say that this happens when n ≥ n₀. Then we have that for every n ≥ n₀,

    c/2 ≤ f(n)/g(n) ≤ 2c

which implies that

    (c/2) g(n) ≤ f(n) ≤ 2c g(n)

This proves that f(n) is Θ(g(n)) (with a = c/2 and b = 2c).

Often, there is no need to explicitly compute limits because it is clear what the dominant term of the running time is. This can be stated precisely as follows:

Theorem 1.3 Suppose that f, g : ℤ>0 → ℝ≥0. If c g(n) is the dominant term of f(n), in the sense that f(n) = c g(n) + r(n), with c > 0 and

    lim_{n→∞} r(n)/(c g(n)) = 0

then f(n) is Θ(g(n)).

Proof By using the previous theorem, all we need to do is compute the limit of the ratio of the two functions:

    lim_{n→∞} f(n)/g(n) = lim_{n→∞} (c g(n))/g(n) + lim_{n→∞} r(n)/g(n) = c + 0 = c

Since c > 0, f(n) is Θ(g(n)).

For example, consider again the running time t(n) = an + b log n + c. The term an dominates in the sense of the theorem since

    lim_{n→∞} (b log n + c)/(an) = 0

Therefore, t(n) is Θ(n).

To summarize, in the previous section, we said that the goal of analysis is to determine the general form of the running time t(n) of an algorithm. We can now be more precise: the goal is to find a simple function f(n) that's asymptotically equivalent to t(n). In other words, a function f(n) such that t(n) is

Θ(f(n)). This is called asymptotic analysis and the function f(n) can be called the asymptotic running time of the algorithm.

The function f(n) can often be found by simply finding the dominant term of the running time. In other cases, we search for a function f(n) such that

    lim_{n→∞} t(n)/f(n) = c > 0

When this limit doesn't exist, we have to work directly with the inequalities of the Θ definition; that is, we find positive constants a and b such that

    a f(n) ≤ t(n) ≤ b f(n)

when n is large enough.

Several examples of asymptotic running times and how they compare to each other will be given in a coming section. Later in this chapter, we will learn basic strategies for analyzing the running time of simple algorithms.

Study Questions

What does it mean for a running time to be asymptotically equivalent to a function f(n)?

What does f(n) = Θ(g(n)) mean?

What is asymptotic analysis? What is the asymptotic running time of an algorithm?

What is the main advantage of simplifying the running time of an algorithm?

How can a limit be used to determine the asymptotic running time of an algorithm?

How can the dominant term of a running time allow us to determine the asymptotic running time of an algorithm?

Exercises

Below is a series of statements of the form f(n) = Θ(g(n)). Prove each of these statements. First do it by computing limits. Then do it by explicitly finding, in each case, positive constants a and b and a number n₀ such that a g(n) ≤ f(n) ≤ b g(n) for every n ≥ n₀. Justify your answers.

a) n + 10 = Θ(n)
b) n² + n = Θ(n²)
c) 3n² − n = Θ(n²)
d) 3n² − n + 10 = Θ(n²)

Show that if c and d are any two numbers greater than 1, then log_c n = Θ(log_d n). (This implies that when using the Θ notation, it is not necessary to specify the base of logarithms.)

1.5 Other Asymptotic Relations

Besides Θ, there are a few other asymptotic relations that are useful when comparing running times of algorithms. The following definition includes Θ for completeness.

Definition 1.4 Suppose that f, g : ℤ>0 → ℝ≥0.

1. We say that f(n) is Θ(g(n)) ("Theta of g of n") if there are positive constants a and b and a number n₀ such that for every n ≥ n₀,

    a g(n) ≤ f(n) ≤ b g(n)

2. We say that f(n) is O(g(n)) ("big-oh of g of n") if there is a positive constant b and a number n₀ such that for every n ≥ n₀,

    f(n) ≤ b g(n)

3. We say that f(n) is Ω(g(n)) ("big-omega of g of n") if there is a positive constant a and a number n₀ such that for every n ≥ n₀,

    f(n) ≥ a g(n)

4. We say that f(n) is o(g(n)) ("little-oh of g of n") if

    lim_{n→∞} f(n)/g(n) = 0

5. We say that f(n) is ω(g(n)) ("little-omega of g of n") if

    lim_{n→∞} f(n)/g(n) = ∞

Intuitively, Θ says that when n is large enough, f(n) is about the same as g(n), O says that f(n) is not much larger than g(n), Ω says that f(n) is not much smaller than g(n), o says that f(n) is much smaller than g(n), and ω says that

f(n) is much larger than g(n). In other words, Θ is essentially like =, O is like ≤, Ω is like ≥, o is like <, and ω is like >.

For example, n is o(n²) because

    lim_{n→∞} n/n² = lim_{n→∞} 1/n = 0

And consider the function

    f(n) = n    if n is odd
           n²   if n is even

All we can say is that f(n) is O(n²) and f(n) is Ω(n).

These asymptotic relations have many interesting and useful properties. For example, we know that if a < b, then it is also true that a ≤ b. So if little-oh is like < and big-oh is like ≤, then the following should be true:

Theorem 1.5 If f(n) is o(g(n)), then f(n) is O(g(n)).

Proof (Sketch.) Suppose that f(n) is o(g(n)). This means that

    lim_{n→∞} f(n)/g(n) = 0

Once again, intuitively, this means that as n grows, f(n)/g(n) gets closer and closer to 0. Eventually, it must be that f(n)/g(n) becomes no larger than 1 and remains that way forever. Say that this happens for n ≥ n₀. Then for every n ≥ n₀, f(n)/g(n) ≤ 1, which implies that f(n) ≤ g(n). This shows that f(n) is O(g(n)) (with b = 1).

Here's another example. We know that if a < b, then it is not true that a ≥ b.

So if little-oh is like < and big-omega is like ≥, then the following should be true:

Theorem 1.6 If f(n) is o(g(n)), then f(n) is not Ω(g(n)) (and therefore not Θ(g(n)) either).

Proof Suppose that f(n) is o(g(n)). This means that

    lim_{n→∞} f(n)/g(n) = 0

Now, suppose that f(n) is also Ω(g(n)). This means that there exists a > 0 such that f(n) ≥ a g(n), when n is large enough. This implies that

    lim_{n→∞} f(n)/g(n) ≥ lim_{n→∞} (a g(n))/g(n) = a > 0

Clearly, the limit of the ratio f(n)/g(n) cannot be both equal to 0 and greater than 0. So if f(n) is o(g(n)), then f(n) cannot be Ω(g(n)).

In the previous section, we said that if a running time t(n) is Θ(n), then, when n is large enough, t(n) is much smaller than any quadratic running time. We can now be more precise: t(n) is little-oh of any quadratic running time. Intuitively, this should be clear. The fact that t(n) is Θ(n) means that t(n) is about the same as n. If another running time t₂(n) is Θ(n²), then that running time is about the same as n². Since n is o(n²), this should imply that t(n) is o(t₂(n)). This intuition can be verified as follows. We want to show that

    lim_{n→∞} t(n)/t₂(n) = 0

So we want to show that the ratio t(n)/t₂(n) is small. To do that, we can show that t(n) is small while t₂(n) is large. The fact that t(n) is Θ(n) implies that there exists b > 0 such that t(n) ≤ bn, when n is large enough. The fact that t₂(n) is Θ(n²) implies that there exists a > 0 such that t₂(n) ≥ an², when n is large enough. Therefore, when n is large enough,

    t(n)/t₂(n) ≤ (bn)/(an²) = b/(an)

This implies that

    lim_{n→∞} t(n)/t₂(n) ≤ lim_{n→∞} b/(an) = 0

On the other hand,

    lim_{n→∞} t(n)/t₂(n) ≥ 0

because both t(n) and t₂(n) are running times so they're nonnegative. Therefore,

    lim_{n→∞} t(n)/t₂(n) = 0

which means that t(n) is o(t₂(n)).

Study Questions

What are the five asymptotic relations mentioned in this section and what do they mean, both intuitively and formally (precisely)?

Exercises

Show that if t₁(n) ≤ t₂(n) and t₂(n) is O(n²), then t₁(n) is O(n²).

Is it true that if t₁(n) ≤ t₂(n) and t₂(n) is Θ(n²), then t₁(n) is Θ(n²)? Justify your answer. (To justify a no answer, give a counterexample.)

Show that if t(n) is Θ(n log n), then t(n) is o(n²).

1.6 Some Common Asymptotic Running Times

Table 1.2 gives a list of common asymptotic running times. In this table, both c and k are constants.

    RUNNING TIME                    COMMON NAME    TYPICAL EXAMPLE
    Θ(1)                            constant       a single basic operation
    Θ(log n)                        logarithmic    fast searching algorithms
    Θ(n)                            linear         simple searching algorithms
    Θ(n log n)                      n log n        fast sorting algorithms
    Θ(n²)                           quadratic      simple sorting algorithms
    O(n^k), for k > 0               polynomial     most algorithms that are fast
                                                   enough to be useful in practice
    Θ(c^(n^k)), for c > 1, k > 0    exponential    exhaustive searches of very
                                                   large sets

Table 1.2: Some common running times
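Each running time in Table 1.2 grows faster than the one before it, and this can be checked numerically. The sketch below is illustrative Python, not part of the notes; it evaluates the ratio of each function to the next at growing values of n and verifies that the ratios shrink, consistent with the earlier function being little-oh of the next.

```python
import math

# Successive pairs from Table 1.2: each earlier function over the next one.
ratios = {
    "log n / n": lambda n: math.log2(n) / n,
    "n / (n log n)": lambda n: n / (n * math.log2(n)),
    "n log n / n²": lambda n: (n * math.log2(n)) / n ** 2,
    "n² / 2^n": lambda n: n ** 2 / 2 ** n,
}

for name, ratio in ratios.items():
    values = [ratio(n) for n in (10, 100, 1000)]
    # Each list of sampled ratios is strictly decreasing.
    assert values[0] > values[1] > values[2]
    print(name, [f"{v:.3g}" for v in values])
```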

The expression Θ(1) may seem a bit odd. According to the definition of Θ, a running time t(n) is Θ(1) if, when n is large enough, a ≤ t(n) ≤ b, for some a, b > 0. In other words, when n is large enough, t(n) is bounded above and below by constants. Note that we call a Θ(1) running time constant even though, strictly speaking, the running time may not be a constant function. The same comment applies to the other running times in this table. For example, we call a Θ(n) running time linear even though it may not be a function of the form an + b.

In contrast to the other running times in this table, polynomial is typically viewed as an upper bound, so that, for example, an algorithm that runs in logarithmic time is also viewed as running in polynomial time.

The running times of Table 1.2 are listed in increasing order in the sense that each running time is little-oh of the next, except for Θ(n²), which is just a subset of O(n^k) when k ≥ 2. We already saw numbers that show how large a difference there is between logarithmic, linear and quadratic running times (Table 1.1). Table 1.3 provides some numbers that compare linear, quadratic and exponential running times. Table 1.1 makes it clear that quadratic-time algorithms are usually impractical on large inputs while Table 1.3 makes it clear that exponential-time algorithms are typically useless even for relatively small inputs.

Exercises

To each of Tables 1.1 and 1.3, add rows for the running times n log² n and n³.
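Entries like those in Tables 1.1 and 1.3 need not be computed by hand. The sketch below is illustrative Python (the helper name pretty is my own, not from the notes); it evaluates each running-time function in microseconds at a few sizes and converts the results to readable units, including the n log² n and n³ rows that the exercise asks for.

```python
import math

def pretty(us):
    # Convert a duration in microseconds to a human-readable string.
    seconds = us / 1e6
    if seconds < 1:
        return f"{us:g} µs"
    if seconds < 86400:
        return f"{seconds:g} s"
    days = seconds / 86400
    if days < 365:
        return f"{days:.0f} days"
    return f"{days / 365:.2g} years"

functions = {
    "log n": lambda n: math.log2(n),
    "n": lambda n: n,
    "n log² n": lambda n: n * math.log2(n) ** 2,
    "n²": lambda n: n ** 2,
    "n³": lambda n: n ** 3,
    "2^n": lambda n: 2 ** n,
}

# One row per running time, one column per input size, as in Table 1.3.
for name, f in functions.items():
    row = [pretty(f(n)) for n in (10, 20, 40)]
    print(f"{name:10}", row)
```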

                n = 10    n = 20    n = 40     n = 60            n = 80
    n µs        10 µs     20 µs     40 µs      60 µs             80 µs
    n² µs       0.1 ms    0.4 ms    1.6 ms     3.6 ms            6.4 ms
    2ⁿ µs       1 ms      1 s       13 days    ≈ 37,000 years    ≈ 38 billion years

Table 1.3: More execution times

1.7 Basic Strategies

In this section, we will do several examples of analysis that will illustrate some important basic strategies.

Algorithms often consist of a sequence of steps performed by other algorithms. The running time of these algorithms is simply the sum of the running times of the other algorithms.

Example 1.7 Suppose that an algorithm consists of three steps, performed by algorithms A, B and C, in that order. Let T_A(n), T_B(n) and T_C(n) denote the running times of these algorithms. Then the running time of the overall algorithm is

    T(n) = T_A(n) + T_B(n) + T_C(n)

Now, suppose that the running times of these algorithms are Θ(n), Θ(1) and Θ(n), respectively. Then we can write that

    T(n) = Θ(n) + Θ(1) + Θ(n)

This simply means that T(n) is the sum of three functions and these functions are Θ(n), Θ(1) and Θ(n), respectively. Since the Θ(n) terms dominate, we get

that T(n) is Θ(n).

Intuitively, it should be clear that the last sentence of this example is correct: T(n) is Θ(n) because the Θ(n) terms dominate. But, if necessary (because we want to be completely sure, or to convince someone else), this intuition can be verified as follows. When n is large enough, we have that

    T(n) ≤ b₁n + b₂ + b₃n ≤ (b₁ + b₂ + b₃)n

Similarly, again when n is large enough,

    T(n) ≥ a₁n + a₂ + a₃n ≥ a₁n

because a₂ and a₃ are positive. Therefore, when n is large enough,

    a₁n ≤ T(n) ≤ (b₁ + b₂ + b₃)n

which implies that T(n) is Θ(n).

Here's another example with different running times.

Example 1.8 Suppose that the running times of the algorithms A, B and C are now Θ(n), Θ(n²) and Θ(1), respectively. Then

    T(n) = Θ(n) + Θ(n²) + Θ(1) = Θ(n²)

because the Θ(n²) function will dominate.

Algorithms often contain loops. Some loops are very easy to analyze while others require a bit more work. Here are some examples, in increasing order of complexity.

    for i = 0 to n − 1
        print a[i]

Figure 1.2: Printing the contents of an array

    for i = 0 to n − 1
        for j = 0 to n − 1
            print a[i,j]

Figure 1.3: Printing the contents of a two-dimensional array

Example 1.9 Figure 1.2 shows an algorithm we considered earlier in this chapter. This algorithm prints the contents of an array of size n. A key observation about this algorithm is that the operations that control the loop as well as the operations in the body of the loop all run in constant time. Some of these operations are performed only once (the assignment i = 0, for example), while other operations are executed n times (the test i < n and the operations in the body of the loop, for example). This implies that the running time of the loop is of the form T(n) = an + b. Clearly, T(n) is Θ(n).

Example 1.10 Figure 1.3 shows an algorithm that prints the contents of a two-dimensional array. The algorithm consists of two nested loops. The usual strategy for analyzing nested loops is to work from the inside out. The inner loop can be analyzed as in the previous example. Its running time T₁(n) is Θ(n). The operations that control the outer loop run in constant time but its body, which is the inner loop, does not run in constant time. It runs in linear time. As before, the total running time of the operations that control the outer loop is Θ(n). Since the inner loop always runs in time T₁(n), the total running time of

the inner loop is nT₁(n). Therefore, the running time of the outer loop is

    T(n) = Θ(n) + nT₁(n) = Θ(n) + nΘ(n)

It should be clear that the term nΘ(n) is Θ(n²). This implies that T(n) is Θ(n²).

The fact that a function that's nΘ(n) is also Θ(n²) should be clear, intuitively. But it can also be easily justified as follows. Let f(n) be the Θ(n) function in the term nΘ(n). When n is large enough, we have that an ≤ f(n) ≤ bn. Therefore, an² ≤ nf(n) ≤ bn², which implies that nf(n) is Θ(n²).

The next example shows that sometimes the analysis of loops involves summations.

Example 1.11  Figure 1.4 shows an algorithm that prints the lower left triangle of a two-dimensional array.

    for i = 0 to n - 1
        for j = 0 to i - 1
            print a[i,j]

Figure 1.4: Printing the lower left triangle of a two-dimensional array

Once again, we analyze the inner loop first. The only difference compared to the previous example is that the inner loop repeats i times instead of n times. This means that the running time of the inner loop varies with i. In fact, the running time of the inner loop is Θ(i). This has the important consequence that we cannot simply multiply the running time of the inner loop by the number of times it is repeated. Instead, we need to add the running times of all the executions of the inner loop. In other words, if T₁(i) is the running time of the

inner loop, then the running time of the outer loop is

    T(n) = Θ(n) + ∑_{i=0}^{n-1} T₁(i)

where the Θ(n) term is, as before, the total running time of the operations that control the outer loop. So we need to analyze the summation

    ∑_{i=0}^{n-1} T₁(i)

We're going to take advantage of the fact that we know the exact form of the running time of the inner loop: T₁(i) = ai + b. Therefore,

    ∑_{i=0}^{n-1} T₁(i) = ∑_{i=0}^{n-1} (ai + b) = a ∑_{i=0}^{n-1} i + ∑_{i=0}^{n-1} b = a ∑_{i=0}^{n-1} i + bn

We can then use the well-known formula

    ∑_{i=0}^{k} i = k(k + 1)/2

This implies that

    ∑_{i=0}^{n-1} T₁(i) = a(n - 1)n/2 + bn = (a/2)n² - (a/2)n + bn

Therefore, the summation is Θ(n²) and the running time of the outer loop is Θ(n) + Θ(n²), which is Θ(n²).
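These loop analyses can be checked empirically. The sketch below (plain Python, not part of the original notes; a counter stands in for the constant-time work done by the print statements) counts how many times the innermost operation of Figures 1.2, 1.3 and 1.4 executes:

```python
# Count executions of the innermost operation in the loops of
# Figures 1.2, 1.3 and 1.4.

def single_loop(n):          # Figure 1.2: one pass over the array
    count = 0
    for i in range(n):
        count += 1
    return count

def nested_loops(n):         # Figure 1.3: inner loop runs n times per i
    count = 0
    for i in range(n):
        for j in range(n):
            count += 1
    return count

def triangle_loops(n):       # Figure 1.4: inner loop repeats i times
    count = 0
    for i in range(n):
        for j in range(i):
            count += 1
    return count

for n in (10, 100, 1000):
    assert single_loop(n) == n                    # Θ(n)
    assert nested_loops(n) == n * n               # Θ(n²)
    assert triangle_loops(n) == (n - 1) * n // 2  # Θ(n²), by the formula
```

The counts match the analysis exactly: n, n², and (n - 1)n/2.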

    if (n < 10)
        sort using simple sorting algorithm
    else
        sort using fast sorting algorithm

Figure 1.5: A hybrid sorting algorithm

The last example of this section concerns hybrid algorithms, which are algorithms that combine at least two other algorithms and include a strategy for choosing which of the other algorithms to use.

Example 1.12  A sorting algorithm is an algorithm that arranges the elements in a sequence according to some order. For example, the elements of an array could be sorted in nondecreasing order. Later in these notes, we will learn that there are simple sorting algorithms that run in time Θ(n²) and more complex sorting algorithms that run much faster, in time Θ(n log n). On small inputs, however, the simple sorting algorithms often run faster than the more complex ones. The hybrid sorting algorithm shown in Figure 1.5 takes advantage of that fact by choosing a simple sorting algorithm when the input is small and a fast sorting algorithm when the input is large.

Now, what is the overall running time of this algorithm, Θ(n²) or Θ(n log n)? The important thing to remember is that the asymptotic running time of an algorithm is determined by its running time on large inputs. When n ≥ 10, the running time of the hybrid algorithm equals the running time of the fast sorting algorithm, and when n is large enough, that running time is bounded below and above by constant multiples of n log n. Therefore, the running time of the hybrid algorithm is Θ(n log n).
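A concrete sketch of this hybrid strategy, assuming insertion sort as the simple Θ(n²) algorithm and mergesort as the fast Θ(n log n) one (both are covered later in these notes; the cutoff of 10 follows Figure 1.5):

```python
# A hybrid sort in the style of Figure 1.5: insertion sort on
# small inputs, recursive mergesort otherwise. Sorts a in place.

def insertion_sort(a):
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0 and a[j] > key:   # shift larger elements right
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key

def hybrid_sort(a):
    if len(a) < 10:
        insertion_sort(a)              # simple algorithm on small inputs
    else:
        mid = len(a) // 2
        left, right = a[:mid], a[mid:]
        hybrid_sort(left)
        hybrid_sort(right)
        # merge the two sorted halves back into a
        i = j = k = 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                a[k] = left[i]; i += 1
            else:
                a[k] = right[j]; j += 1
            k += 1
        a[k:] = left[i:] + right[j:]
```

Note that the recursion itself switches to insertion sort once the subarrays shrink below the cutoff, which is exactly how many practical library sorts are organized.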

Exercises

What is the asymptotic running time of each of the following algorithms, as a function of n? Don't forget to simplify and use the Θ notation. Justify your answers.

a)  for i = 1 to n
        for j = 1 to 2n+1
            print

b)  for i = 1 to 10
        for j = 1 to n
            print

c)  for i = 1 to n
        for j = i to i+5
            print

d)  for i = 1 to n
        for j = i to n
            print

e)  for i = 1 to n
        for j = 1 to 2^(i+1)
            print

f)  for i = 1 to n·n
        for j = 1 to i
            print

    A
    if (n < 100)
        B
    else
        for j = 1 to n
            C

Figure 1.6: Algorithm for the exercise below

Consider the algorithm shown in Figure 1.6. Let T_A(n), T_B(n) and T_C(n) denote the running times of algorithms A, B and C, respectively. What is the asymptotic running time of this algorithm, as a function of n, under each of the following sets of assumptions? Justify your answers.

a) T_A(n) = Θ(n), T_B(n) = Θ(n²) and T_C(n) = Θ(log n).

b) T_A(n) = Θ(n²), T_B(n) = Θ(n²) and T_C(n) = Θ(log n).

c) T_A(n) = Θ(n²), T_B(n) = Θ(n³) and T_C(n) = Θ(log n).

Justify the fact that if T(n) = Θ(n) + Θ(n²) + Θ(1), then T(n) = Θ(n²).

Suppose that T(n) = Θ(log n)Θ(n). That is, T(n) is the product of two functions that are Θ(log n) and Θ(n), respectively. Show that T(n) = Θ(n log n).

1.8 Analyzing Summations

In one of the examples of the previous section, the running time contained the summation

    ∑_{i=1}^{n-1} i

We used a well-known formula to get the exact value of this summation. But what if we didn't know this formula, or what if one didn't exist for the particular summation we were dealing with? In this section, we will learn a technique that can sometimes be used to get asymptotic bounds on summations.

Getting an upper bound on this summation is easy. Since every term in the summation is at most n - 1, we get that

    ∑_{i=1}^{n-1} i ≤ ∑_{i=1}^{n-1} (n - 1) = (n - 1)² = n² - 2n + 1

This implies that the summation is O(n²). We can get a lower bound in a similar way. Every term in the summation is at least 1. Therefore,

    ∑_{i=1}^{n-1} i ≥ ∑_{i=1}^{n-1} 1 = n - 1

This implies that the summation is Ω(n). Unfortunately, this lower bound is much smaller than the O(n²) upper bound. Typically, what we need are matching upper and lower bounds. So we need to improve either the O(n²) upper bound (by lowering it) or the Ω(n) lower bound (by increasing it). Or both.

Let's try to improve the lower bound. One way to do that is to split the

summation in the middle:

    ∑_{i=1}^{n-1} i = ∑_{i=1}^{n/2-1} i + ∑_{i=n/2}^{n-1} i

In the last summation, every term is at least n/2. Therefore,

    ∑_{i=1}^{n-1} i ≥ ∑_{i=n/2}^{n-1} n/2 = (n/2)(n - n/2) = n²/4

This implies that the summation is Ω(n²). Since this lower bound matches the O(n²) upper bound, we can now conclude that the summation is Θ(n²).

All this is correct except for one small technicality: if n is not even, then n/2 is not an integer and we cannot have a summation start at i = n/2 when n/2 is not an integer. Usually, the most convenient way of dealing with this kind of issue is to use the floor and ceiling notations. In general, the floor of x is the greatest integer less than or equal to x. The ceiling of x is the smallest integer greater than or equal to x. The floor and ceiling of x are denoted ⌊x⌋ and ⌈x⌉. For example, ⌊3/2⌋ = 1 and ⌈3/2⌉ = 2. On the other hand, ⌊5⌋ = ⌈5⌉ = 5.

In our example, we can split the summation at ⌈n/2⌉. We then get that

    ∑_{i=1}^{n-1} i ≥ ∑_{i=⌈n/2⌉}^{n-1} i ≥ ∑_{i=⌈n/2⌉}^{n-1} n/2 = (n/2)(n - ⌈n/2⌉)

Using the fact that n/2 ≤ ⌈n/2⌉ ≤ n/2 + 1/2, we get

    ∑_{i=1}^{n-1} i ≥ (n/2)(n - n/2 - 1/2) = (n/2)(n/2 - 1/2) = n²/4 - n/4

This implies the desired Ω(n²) lower bound.

Exercises

Obtain an Ω(n²) lower bound on the summation ∑_{i=1}^{n-1} i by splitting the summation at ⌊n/2⌋ (instead of ⌈n/2⌉).

1.9 Worst-Case and Average-Case Analysis

Consider the sequential search algorithm shown in Figure 1.7.

    for i = 0 to n - 1
        if (a[i] == x)
            return i
    return -1

Figure 1.7: A sequential search of an array

What is the running time of this algorithm? The accurate answer is that it depends on the location of the first occurrence of x in the array. We can talk of at least three different running times for a given algorithm. All are functions of the input size.

The best-case running time is the minimum running time required on inputs of size n. In the case of the sequential search algorithm, the best case occurs when x is the first element of the array. In that case, the running time is constant.

The worst-case running time is the maximum running time required on inputs of size n. In our example, the worst case occurs when x is not found. In that case, the running time is linear in n.
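The pseudocode of Figure 1.7 translates directly into runnable form; the sketch below (a Python rendering, not from the notes themselves) also makes the best and worst cases concrete:

```python
# A direct Python transcription of the sequential search of
# Figure 1.7: return the index of the first occurrence of x in
# the array a, or -1 if x does not occur.

def sequential_search(a, x):
    for i in range(len(a)):
        if a[i] == x:
            return i      # found: stop at the first occurrence
    return -1             # not found: the whole array was scanned

a = [14, 3, 27, 3, 9]
assert sequential_search(a, 14) == 0   # best case: x is the first element
assert sequential_search(a, 3) == 1    # the first occurrence is returned
assert sequential_search(a, 8) == -1   # worst case: x is not found
```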

The average-case running time is the average running time required on inputs of size n. This running time is usually more difficult to determine, in part because it requires knowing how likely each input of size n is. For example, for the sequential search, how likely is it that x will not be found? Given that it is found, how likely is it that it will be found in each of the possible positions?

In this example, one possible approach is to determine the average-case running time for the two separate cases of a successful and an unsuccessful search. If the search is unsuccessful, the running time will always be the same, so the average and worst-case running times are the same: Θ(n). In the case of a successful search, a common approach, when lacking any more precise knowledge of the particular application we have in mind, is to assume that each location is equally likely. It is easy to see that the running time of the search is of the form ak + b, where k is the position (a number from 1 to n) of the first occurrence of x. The average running time can then be calculated by taking the average over all possible positions k:

    (1/n) ∑_{k=1}^{n} (ak + b) = (1/n) (a ∑_{k=1}^{n} k + bn) = (1/n) (a·n(n + 1)/2 + bn) = a(n + 1)/2 + b

Therefore, the average running time of a successful search is Θ(n).

In general, the best-case running time is not very useful. The worst-case running time is much more useful and has the advantage of giving us a guarantee, because it is an upper bound on the running time required for all inputs (that are large enough). A possible disadvantage of the worst-case running time is that this upper bound may be much larger than the running time required by most inputs. In other words, the worst-case running time can be overly pessimistic. An example of this occurs with the quicksort algorithm, one of the fast sorting algorithms we will study later in these notes. This algorithm has a worst-case

running time of Θ(n²) while the mergesort algorithm, another fast sorting algorithm, has a Θ(n log n) worst-case running time. This might indicate that quicksort is much slower than mergesort. However, in practice, quicksort usually runs faster than mergesort. This apparent contradiction can be explained in part by the fact that the average-case running time of quicksort is Θ(n log n), just like the worst-case running time of mergesort. And the fact that quicksort tends to run faster than mergesort in practice probably indicates that the inputs that cause quicksort to take quadratic time occur only rarely. This illustrates how the average-case running time can be more realistic than the worst-case running time.

However, as we said earlier, the average-case running time can be more difficult to determine because it requires knowledge of the probability distribution of the inputs. In addition, average-case analysis usually requires additional calculations. This was the case with the sequential search algorithm, although the calculations there were fairly easy. The average-case analysis of quicksort, on the other hand, is significantly more complicated than its worst-case analysis. (We will do both later in these notes.)

In cases where even the worst-case analysis of an algorithm proves difficult, it is possible to get an estimate of its asymptotic running time by testing the algorithm on randomly generated inputs of various sizes and seeing what kind of function best fits the data. But note that this gives an estimate of the average-case running time, since there is no guarantee that randomly generated inputs will include the worst-case ones. This kind of empirical analysis can be especially useful if the average-case analysis is difficult and we suspect that the worst-case running time may be too pessimistic.
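In the same empirical spirit, the average-case formula derived earlier for a successful sequential search can be checked by averaging over every possible position of x (a sketch; the iteration count stands in for the running time ak + b with a = 1 and b = 0):

```python
# Average the number of loop iterations of a successful
# sequential search over all n equally likely positions of x.
# The analysis predicts an average of (n + 1)/2, which is Θ(n).

def iterations(a, x):
    count = 0
    for v in a:
        count += 1
        if v == x:
            break
    return count

n = 1000
a = list(range(n))   # x = k is found after k + 1 iterations
average = sum(iterations(a, x) for x in a) / n
assert average == (n + 1) / 2
```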
We sometimes say that the running time of an algorithm is O(n), without specifying if we are talking about the worst-case, average-case or best-case running time. Strictly speaking, this statement is ambiguous. However, it is usually

understood as giving a bound on the running time of all inputs of length n. Since big-oh is an upper bound, this statement is therefore equivalent to saying that the worst-case running time of the algorithm is O(n). Similarly, when we say that the running time of an algorithm is Ω(n), we're saying that the best-case running time of the algorithm is Ω(n). And when we say that the running time T(n) of an algorithm is Θ(n), we're saying both things: that the best-case running time is Ω(n) and that the worst-case running time is O(n). Note that this implies that all three running times (worst-case, average-case and best-case) are Θ(n).

Study Questions

What are the best-case, worst-case and average-case running times of an algorithm?

What is an advantage and a disadvantage of the worst-case running time compared to the average-case running time?

Exercises

Consider an algorithm that runs a test on its input x. If the test succeeds, the algorithm runs algorithm A on x. If the test fails, the algorithm runs algorithm B on x. Suppose that algorithms A and B run in time Θ(n) and Θ(n²), respectively. Assuming that for every n, inputs of length n are equally likely to pass or fail the test, what are the worst-case, average-case and best-case running times of this algorithm?

1.10 The Binary Search Algorithm

It is fairly obvious that searching a collection of data for a particular element, or for an element that satisfies a particular property, is a frequent operation. In this section, we will learn that under certain conditions, it is possible to search very efficiently by using an algorithm called binary search. We will also analyze the running time of this algorithm.

The simplest way of searching a sequence such as an array or a vector is to scan it from one end to the other, examining elements one by one. This is the sequential search we analyzed in the previous section. We found that its running time is linear in the length of the sequence.

If the sequence happens to be ordered, then the search can be done more quickly. For example, consider an array of integers sorted in increasing order. When looking for a particular integer, we can stop searching as soon as we find the integer we are looking for or an integer that is larger than the integer we are looking for. The running time of this modified sequential search is still linear, but we can expect unsuccessful searches to be 50% faster, on average.

A much more dramatic improvement in the running time can be obtained for sorted sequences that provide constant-time access to their elements, such as arrays and vectors. The idea is to go straight to the middle of the sequence and compare the element we are looking for with the middle element of the sequence. Because the sequence is sorted, this comparison tells us if the element we are looking for should be located in the first or second half of the sequence. We then only need to search that half. This searching algorithm is called a binary search.

The algorithm is described in Figure 1.8. In this high-level description of the algorithm, s is a sorted sequence and e is the element being searched for. Figure 1.9 shows a sample run of the algorithm on a sequence of integers.
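As a concrete counterpart to that high-level description, here is one possible sketch for a sorted Python list (the convention of taking the middle element at, or immediately to the right of, the middle is an assumption that matches Figure 1.8's halving scheme):

```python
# Binary search over a sorted list s for the element e:
# repeatedly halve the range [lo, hi] until one element remains,
# then compare e to it. Returns the index of e, or -1.

def binary_search(s, e):
    if not s:
        return -1
    lo, hi = 0, len(s) - 1
    while lo < hi:                  # s[lo..hi] has more than one element
        mid = (lo + hi + 1) // 2    # middle, or immediately to its right
        if e < s[mid]:
            hi = mid - 1            # continue in the left half
        else:
            lo = mid                # continue in the right half
    return lo if s[lo] == e else -1

s = [12, 25, 37, 38, 51, 63, 74]
assert binary_search(s, 25) == 1
assert binary_search(s, 74) == 6
assert binary_search(s, 40) == -1
```

Each iteration at least halves the range, which is what gives the logarithmic running time analyzed later in this section.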

    while (s contains more than one element) {
        locate middle of s
        if (e < middle element of s)
            s = left half of s
        else
            s = right half of s
    }
    compare e to only element in s

Figure 1.8: The binary search algorithm

    e = 25
    s = [ ... ]    middle = 38
        [ ... ]    middle = 25
        [25 37]    middle = 37
        [25]       Found!

Figure 1.9: A run of the binary search algorithm

The middle element is taken to be the one at the middle or to the immediate right of the middle.

The pseudocode of Figure 1.8 is very general. If we wanted to implement the algorithm, we would need to make it more precise and use some care. One issue is that we need to specify how the middle element of the sequence can be located. This depends on the type of sequence we're searching. Another issue is that the lines s = left half of s and


/633 Introduction to Algorithms Lecturer: Michael Dinitz Topic: Sorting lower bound and Linear-time sorting Date: 9/19/17 601.433/633 Introduction to Algorithms Lecturer: Michael Dinitz Topic: Sorting lower bound and Linear-time sorting Date: 9/19/17 5.1 Introduction You should all know a few ways of sorting in O(n log n)

More information

Computer Science 210 Data Structures Siena College Fall Topic Notes: Complexity and Asymptotic Analysis

Computer Science 210 Data Structures Siena College Fall Topic Notes: Complexity and Asymptotic Analysis Computer Science 210 Data Structures Siena College Fall 2017 Topic Notes: Complexity and Asymptotic Analysis Consider the abstract data type, the Vector or ArrayList. This structure affords us the opportunity

More information

Analysis of Algorithms

Analysis of Algorithms Analysis of Algorithms Data Structures and Algorithms Acknowledgement: These slides are adapted from slides provided with Data Structures and Algorithms in C++ Goodrich, Tamassia and Mount (Wiley, 2004)

More information

Data Structures and Algorithms. Part 2

Data Structures and Algorithms. Part 2 1 Data Structures and Algorithms Part 2 Werner Nutt 2 Acknowledgments The course follows the book Introduction to Algorithms, by Cormen, Leiserson, Rivest and Stein, MIT Press [CLRST]. Many examples displayed

More information

Homogeneous and Non Homogeneous Algorithms

Homogeneous and Non Homogeneous Algorithms Homogeneous and Non Homogeneous Algorithms Paparrizos K. Ioannis Department of Informatics Aristotle University of Thessaloniki 54124 Thessaloniki, Greece E-mail: ipapa@csd.auth.gr Abstract Motivated by

More information

Algorithm. Algorithm Analysis. Algorithm. Algorithm. Analyzing Sorting Algorithms (Insertion Sort) Analyzing Algorithms 8/31/2017

Algorithm. Algorithm Analysis. Algorithm. Algorithm. Analyzing Sorting Algorithms (Insertion Sort) Analyzing Algorithms 8/31/2017 8/3/07 Analysis Introduction to Analysis Model of Analysis Mathematical Preliminaries for Analysis Set Notation Asymptotic Analysis What is an algorithm? An algorithm is any well-defined computational

More information

Homogeneous and Non-homogeneous Algorithms

Homogeneous and Non-homogeneous Algorithms Homogeneous and Non-homogeneous Algorithms Ioannis Paparrizos Abstract Motivated by recent best case analyses for some sorting algorithms and based on the type of complexity we partition the algorithms

More information

0.1 Welcome. 0.2 Insertion sort. Jessica Su (some portions copied from CLRS)

0.1 Welcome. 0.2 Insertion sort. Jessica Su (some portions copied from CLRS) 0.1 Welcome http://cs161.stanford.edu My contact info: Jessica Su, jtysu at stanford dot edu, office hours Monday 3-5 pm in Huang basement TA office hours: Monday, Tuesday, Wednesday 7-9 pm in Huang basement

More information

COMP Data Structures

COMP Data Structures COMP 2140 - Data Structures Shahin Kamali Topic 5 - Sorting University of Manitoba Based on notes by S. Durocher. COMP 2140 - Data Structures 1 / 55 Overview Review: Insertion Sort Merge Sort Quicksort

More information

Algorithms in Systems Engineering IE172. Midterm Review. Dr. Ted Ralphs

Algorithms in Systems Engineering IE172. Midterm Review. Dr. Ted Ralphs Algorithms in Systems Engineering IE172 Midterm Review Dr. Ted Ralphs IE172 Midterm Review 1 Textbook Sections Covered on Midterm Chapters 1-5 IE172 Review: Algorithms and Programming 2 Introduction to

More information

10/5/2016. Comparing Algorithms. Analyzing Code ( worst case ) Example. Analyzing Code. Binary Search. Linear Search

10/5/2016. Comparing Algorithms. Analyzing Code ( worst case ) Example. Analyzing Code. Binary Search. Linear Search 10/5/2016 CSE373: Data Structures and Algorithms Asymptotic Analysis (Big O,, and ) Steve Tanimoto Autumn 2016 This lecture material represents the work of multiple instructors at the University of Washington.

More information

Big-O-ology. Jim Royer January 16, 2019 CIS 675. CIS 675 Big-O-ology 1/ 19

Big-O-ology. Jim Royer January 16, 2019 CIS 675. CIS 675 Big-O-ology 1/ 19 Big-O-ology Jim Royer January 16, 2019 CIS 675 CIS 675 Big-O-ology 1/ 19 How do you tell how fast a program is? Answer? Run some test cases. Problem You can only run a few test cases. There will be many

More information

CSE 146. Asymptotic Analysis Interview Question of the Day Homework 1 & Project 1 Work Session

CSE 146. Asymptotic Analysis Interview Question of the Day Homework 1 & Project 1 Work Session CSE 146 Asymptotic Analysis Interview Question of the Day Homework 1 & Project 1 Work Session Comparing Algorithms Rough Estimate Ignores Details Or really: independent of details What are some details

More information

How fast is an algorithm?

How fast is an algorithm? CS533 Class 03: 1 c P. Heeman, 2017 Overview Chapter 2: Analyzing Algorithms Chapter 3: Growth of Functions Chapter 12 CS533 Class 03: 2 c P. Heeman, 2017 How fast is an algorithm? Important part of designing

More information

Intro. Speed V Growth

Intro. Speed V Growth Intro Good code is two things. It's elegant, and it's fast. In other words, we got a need for speed. We want to find out what's fast, what's slow, and what we can optimize. First, we'll take a tour of

More information

AXIOMS FOR THE INTEGERS

AXIOMS FOR THE INTEGERS AXIOMS FOR THE INTEGERS BRIAN OSSERMAN We describe the set of axioms for the integers which we will use in the class. The axioms are almost the same as what is presented in Appendix A of the textbook,

More information

Algorithm Analysis. Applied Algorithmics COMP526. Algorithm Analysis. Algorithm Analysis via experiments

Algorithm Analysis. Applied Algorithmics COMP526. Algorithm Analysis. Algorithm Analysis via experiments Applied Algorithmics COMP526 Lecturer: Leszek Gąsieniec, 321 (Ashton Bldg), L.A.Gasieniec@liverpool.ac.uk Lectures: Mondays 4pm (BROD-107), and Tuesdays 3+4pm (BROD-305a) Office hours: TBA, 321 (Ashton)

More information

CS 506, Sect 002 Homework 5 Dr. David Nassimi Foundations of CS Due: Week 11, Mon. Apr. 7 Spring 2014

CS 506, Sect 002 Homework 5 Dr. David Nassimi Foundations of CS Due: Week 11, Mon. Apr. 7 Spring 2014 CS 506, Sect 002 Homework 5 Dr. David Nassimi Foundations of CS Due: Week 11, Mon. Apr. 7 Spring 2014 Study: Chapter 4 Analysis of Algorithms, Recursive Algorithms, and Recurrence Equations 1. Prove the

More information

Another Sorting Algorithm

Another Sorting Algorithm 1 Another Sorting Algorithm What was the running time of insertion sort? Can we do better? 2 Designing Algorithms Many ways to design an algorithm: o Incremental: o Divide and Conquer: 3 Divide and Conquer

More information

Complexity of Algorithms. Andreas Klappenecker

Complexity of Algorithms. Andreas Klappenecker Complexity of Algorithms Andreas Klappenecker Example Fibonacci The sequence of Fibonacci numbers is defined as 0, 1, 1, 2, 3, 5, 8, 13, 21, 34,... F n 1 + F n 2 if n>1 F n = 1 if n =1 0 if n =0 Fibonacci

More information

Algorithm Analysis. Spring Semester 2007 Programming and Data Structure 1

Algorithm Analysis. Spring Semester 2007 Programming and Data Structure 1 Algorithm Analysis Spring Semester 2007 Programming and Data Structure 1 What is an algorithm? A clearly specifiable set of instructions to solve a problem Given a problem decide that the algorithm is

More information

Beyond Counting. Owen Kaser. September 17, 2014

Beyond Counting. Owen Kaser. September 17, 2014 Beyond Counting Owen Kaser September 17, 2014 1 Introduction Combinatorial objects such as permutations and combinations are frequently studied from a counting perspective. For instance, How many distinct

More information

6.001 Notes: Section 4.1

6.001 Notes: Section 4.1 6.001 Notes: Section 4.1 Slide 4.1.1 In this lecture, we are going to take a careful look at the kinds of procedures we can build. We will first go back to look very carefully at the substitution model,

More information

Lecture 5: Running Time Evaluation

Lecture 5: Running Time Evaluation Lecture 5: Running Time Evaluation Worst-case and average-case performance Georgy Gimel farb COMPSCI 220 Algorithms and Data Structures 1 / 13 1 Time complexity 2 Time growth 3 Worst-case 4 Average-case

More information

Outline and Reading. Analysis of Algorithms 1

Outline and Reading. Analysis of Algorithms 1 Outline and Reading Algorithms Running time ( 3.1) Pseudo-code ( 3.2) Counting primitive operations ( 3.4) Asymptotic notation ( 3.4.1) Asymptotic analysis ( 3.4.2) Case study ( 3.4.3) Analysis of Algorithms

More information

COE428 Lecture Notes Week 1 (Week of January 9, 2017)

COE428 Lecture Notes Week 1 (Week of January 9, 2017) COE428 Lecture Notes: Week 1 1 of 10 COE428 Lecture Notes Week 1 (Week of January 9, 2017) Table of Contents COE428 Lecture Notes Week 1 (Week of January 9, 2017)...1 Announcements...1 Topics...1 Informal

More information

Computer Science 385 Analysis of Algorithms Siena College Spring Topic Notes: Divide and Conquer

Computer Science 385 Analysis of Algorithms Siena College Spring Topic Notes: Divide and Conquer Computer Science 385 Analysis of Algorithms Siena College Spring 2011 Topic Notes: Divide and Conquer Divide and-conquer is a very common and very powerful algorithm design technique. The general idea:

More information

Intro to Algorithms. Professor Kevin Gold

Intro to Algorithms. Professor Kevin Gold Intro to Algorithms Professor Kevin Gold What is an Algorithm? An algorithm is a procedure for producing outputs from inputs. A chocolate chip cookie recipe technically qualifies. An algorithm taught in

More information

Design and Analysis of Algorithms

Design and Analysis of Algorithms Design and Analysis of Algorithms CSE 5311 Lecture 8 Sorting in Linear Time Junzhou Huang, Ph.D. Department of Computer Science and Engineering CSE5311 Design and Analysis of Algorithms 1 Sorting So Far

More information

Final Exam in Algorithms and Data Structures 1 (1DL210)

Final Exam in Algorithms and Data Structures 1 (1DL210) Final Exam in Algorithms and Data Structures 1 (1DL210) Department of Information Technology Uppsala University February 0th, 2012 Lecturers: Parosh Aziz Abdulla, Jonathan Cederberg and Jari Stenman Location:

More information

Algorithms. Algorithms 1.4 ANALYSIS OF ALGORITHMS

Algorithms. Algorithms 1.4 ANALYSIS OF ALGORITHMS ROBERT SEDGEWICK KEVIN WAYNE Algorithms ROBERT SEDGEWICK KEVIN WAYNE 1.4 ANALYSIS OF ALGORITHMS Algorithms F O U R T H E D I T I O N http://algs4.cs.princeton.edu introduction observations mathematical

More information

Introduction to the Analysis of Algorithms. Algorithm

Introduction to the Analysis of Algorithms. Algorithm Introduction to the Analysis of Algorithms Based on the notes from David Fernandez-Baca Bryn Mawr College CS206 Intro to Data Structures Algorithm An algorithm is a strategy (well-defined computational

More information

Today s Outline. CSE 326: Data Structures Asymptotic Analysis. Analyzing Algorithms. Analyzing Algorithms: Why Bother? Hannah Takes a Break

Today s Outline. CSE 326: Data Structures Asymptotic Analysis. Analyzing Algorithms. Analyzing Algorithms: Why Bother? Hannah Takes a Break Today s Outline CSE 326: Data Structures How s the project going? Finish up stacks, queues, lists, and bears, oh my! Math review and runtime analysis Pretty pictures Asymptotic analysis Hannah Tang and

More information

Fundamental mathematical techniques reviewed: Mathematical induction Recursion. Typically taught in courses such as Calculus and Discrete Mathematics.

Fundamental mathematical techniques reviewed: Mathematical induction Recursion. Typically taught in courses such as Calculus and Discrete Mathematics. Fundamental mathematical techniques reviewed: Mathematical induction Recursion Typically taught in courses such as Calculus and Discrete Mathematics. Techniques introduced: Divide-and-Conquer Algorithms

More information

HOMEWORK FILE SOLUTIONS

HOMEWORK FILE SOLUTIONS Data Structures Course (CSCI-UA 102) Professor Yap Spring 2012 HOMEWORK FILE February 13, 2012 SOLUTIONS 1 Homework 2: Due on Thu Feb 16 Q1. Consider the following function called crossproduct: int crossproduct(int[]

More information

Algorithm Analysis. Algorithm Efficiency Best, Worst, Average Cases Asymptotic Analysis (Big Oh) Space Complexity

Algorithm Analysis. Algorithm Efficiency Best, Worst, Average Cases Asymptotic Analysis (Big Oh) Space Complexity Algorithm Analysis Algorithm Efficiency Best, Worst, Average Cases Asymptotic Analysis (Big Oh) Space Complexity ITCS 2214:Data Structures 1 Mathematical Fundamentals Measuring Algorithm Efficiency Empirical

More information

Jana Kosecka. Linear Time Sorting, Median, Order Statistics. Many slides here are based on E. Demaine, D. Luebke slides

Jana Kosecka. Linear Time Sorting, Median, Order Statistics. Many slides here are based on E. Demaine, D. Luebke slides Jana Kosecka Linear Time Sorting, Median, Order Statistics Many slides here are based on E. Demaine, D. Luebke slides Insertion sort: Easy to code Fast on small inputs (less than ~50 elements) Fast on

More information

STA141C: Big Data & High Performance Statistical Computing

STA141C: Big Data & High Performance Statistical Computing STA141C: Big Data & High Performance Statistical Computing Lecture 2: Background in Algorithms Cho-Jui Hsieh UC Davis April 5/April 10, 2017 Time Complexity Analysis Time Complexity There are always many

More information

CS 137 Part 7. Big-Oh Notation, Linear Searching and Basic Sorting Algorithms. November 10th, 2017

CS 137 Part 7. Big-Oh Notation, Linear Searching and Basic Sorting Algorithms. November 10th, 2017 CS 137 Part 7 Big-Oh Notation, Linear Searching and Basic Sorting Algorithms November 10th, 2017 Big-Oh Notation Up to this point, we ve been writing code without any consideration for optimization. There

More information

Section 05: Solutions

Section 05: Solutions Section 05: Solutions 1. Asymptotic Analysis (a) Applying definitions For each of the following, choose a c and n 0 which show f(n) O(g(n)). Explain why your values of c and n 0 work. (i) f(n) = 5000n

More information

LECTURE 9 Data Structures: A systematic way of organizing and accessing data. --No single data structure works well for ALL purposes.

LECTURE 9 Data Structures: A systematic way of organizing and accessing data. --No single data structure works well for ALL purposes. LECTURE 9 Data Structures: A systematic way of organizing and accessing data. --No single data structure works well for ALL purposes. Input Algorithm Output An algorithm is a step-by-step procedure for

More information

Chapter 6 INTRODUCTION TO DATA STRUCTURES AND ALGORITHMS

Chapter 6 INTRODUCTION TO DATA STRUCTURES AND ALGORITHMS Chapter 6 INTRODUCTION TO DATA STRUCTURES AND ALGORITHMS 1 Reference books: The C Programming Language by Brian W. Kernighan and Dennis M. Ritchie Programming in C (3rd Edition) by Stephen G. Kochan. Data

More information

ENGI 4892: Data Structures Assignment 2 SOLUTIONS

ENGI 4892: Data Structures Assignment 2 SOLUTIONS ENGI 4892: Data Structures Assignment 2 SOLUTIONS Due May 30 in class at 1:00 Total marks: 50 Notes: You will lose marks for giving only the final answer for any question on this assignment. Some steps

More information

U.C. Berkeley CS170 : Algorithms, Fall 2013 Midterm 1 Professor: Satish Rao October 10, Midterm 1 Solutions

U.C. Berkeley CS170 : Algorithms, Fall 2013 Midterm 1 Professor: Satish Rao October 10, Midterm 1 Solutions U.C. Berkeley CS170 : Algorithms, Fall 2013 Midterm 1 Professor: Satish Rao October 10, 2013 Midterm 1 Solutions 1 True/False 1. The Mayan base 20 system produces representations of size that is asymptotically

More information

Introduction to Computer Science

Introduction to Computer Science Introduction to Computer Science Program Analysis Ryan Stansifer Department of Computer Sciences Florida Institute of Technology Melbourne, Florida USA 32901 http://www.cs.fit.edu/ ryan/ 24 April 2017

More information

CS 137 Part 8. Merge Sort, Quick Sort, Binary Search. November 20th, 2017

CS 137 Part 8. Merge Sort, Quick Sort, Binary Search. November 20th, 2017 CS 137 Part 8 Merge Sort, Quick Sort, Binary Search November 20th, 2017 This Week We re going to see two more complicated sorting algorithms that will be our first introduction to O(n log n) sorting algorithms.

More information

Comparison of x with an entry in the array

Comparison of x with an entry in the array 1. Basic operations in algorithm An algorithm to solve a particular task employs some set of basic operations. When we estimate the amount of work done by an algorithm we usually do not consider all the

More information

CS126 Final Exam Review

CS126 Final Exam Review CS126 Final Exam Review Fall 2007 1 Asymptotic Analysis (Big-O) Definition. f(n) is O(g(n)) if there exists constants c, n 0 > 0 such that f(n) c g(n) n n 0 We have not formed any theorems dealing with

More information

CS2 Algorithms and Data Structures Note 1

CS2 Algorithms and Data Structures Note 1 CS2 Algorithms and Data Structures Note 1 Analysing Algorithms This thread of the course is concerned with the design and analysis of good algorithms and data structures. Intuitively speaking, an algorithm

More information

Data Structures and Algorithms CSE 465

Data Structures and Algorithms CSE 465 Data Structures and Algorithms CSE 465 LECTURE 2 Analysis of Algorithms Insertion Sort Loop invariants Asymptotic analysis Sofya Raskhodnikova and Adam Smith The problem of sorting Input: sequence a 1,

More information