COSC242 Lecture 7
Mergesort and Quicksort

We saw last time that the time complexity function for Mergesort is T(n) = n + n log n. It is not hard to see that T(n) = O(n log n). After all, for n ≥ 2 we have log n ≥ 1, so n + n log n ≤ n log n + n log n = 2(n log n). So Mergesort is O(n log n), and in fact Θ(n log n), whereas Insertion Sort is Θ(n²) in the worst case.

However, we must remember that this comparison is about how well the sorting algorithm scales up when the values of n get larger. For small values of n, insertion sort beats mergesort! This surprising fact offers us an opportunity to improve mergesort.

Also worth remembering is that insertion sort is very efficient if the input data is already sorted (or nearly sorted), because it only needs to make O(n) comparisons in this case. Many sorting algorithms, mergesort included, perform as badly on already-sorted data as in the worst case.

COSC242 Lecture 7 / Slide 1
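The bound above is easy to check numerically; a quick Python sketch (not part of the lecture) confirming that n + n log n ≤ 2(n log n) for small n:

```python
import math

# For n >= 2 we have log2(n) >= 1, hence n <= n*log2(n), and so
# n + n*log2(n) <= n*log2(n) + n*log2(n) = 2*n*log2(n).
for n in range(2, 10_000):
    assert n + n * math.log2(n) <= 2 * n * math.log2(n)
print("bound holds for all n in 2..9999")
```

The same one-line argument (log n ≥ 1 once n ≥ 2) is what justifies dropping the low-order n term when we say Mergesort is O(n log n).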
Improving Mergesort

My colleague Nathan Rountree reported that on a test machine, for inputs of size n, insertion sort ran in 8n² steps and mergesort in 64n log n steps (in the worst case). If n = 2, then insertion sort runs in 8 · 4 = 32 steps while mergesort takes 64 · 2 · 1 = 128 steps. When does insertion sort stop beating mergesort? If 8n² < 64n log n then n < 8 log n, so n/8 < log n, so 2^(n/8) < n. By trial and error, 2^5 = 32 where 5 · 8 = 40, and 2^6 = 64 where 6 · 8 = 48, so 2^(n/8) > n for n = 48. Thus the crossover lies at some n with 40 ≤ n ≤ 48.

So we can improve mergesort by rewriting it to do insertion sort on arrays of length 40 or less.

Other improvements exist. Mergesort does not sort in place, since the merge routine copies items into a new array, so one can try to reduce the amount of copying, especially if you have big records.

COSC242 Lecture 7 / Slide 2
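The trial-and-error search can also be automated. A small Python sketch using the same step counts (the constants 8 and 64 were measured on one particular machine, so the exact crossover is illustrative only):

```python
import math

# Reported step counts: insertion sort ~ 8*n^2, mergesort ~ 64*n*log2(n).
# Mergesort pulls ahead once 8*n^2 >= 64*n*log2(n), i.e. once
# n >= 8*log2(n).  Find the smallest such n by simple trial.
n = 2
while n < 8 * math.log2(n):
    n += 1
print("mergesort starts winning at n =", n)  # n = 44, inside 40..48
```

This agrees with the hand estimate: the crossover (here n = 44) falls between 40 and 48, so a cutoff of 40 for switching to insertion sort is a sensible choice.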
Quicksort

Mergesort divides the input array A[low..high] into two pieces of similar size without looking at the contents. We get pieces A[low..mid] and A[(mid + 1)..high] merely by using the location mid = ⌊(low + high) / 2⌋.

Quicksort is based on the following subtle idea. In a sorted output array, the value of the middle key also divides the array into two pieces of similar size: all keys to the left are smaller and all keys to the right are bigger. Can we use this median value to help with the sorting?

Well, if we had an easy way to pick some x so that about half the values in the input array A[low..high] are ≤ x and the rest are ≥ x, then we could stick x into cell A[mid] and rearrange the entries so that smaller elements are put on the left of x and larger entries are put on the right of x. This would not yet be the sorted output array, but we could recursively apply the same process to the pieces A[low..mid] and A[(mid + 1)..high]. The recursive descent stops when we reach a piece of length 1, because an array of length 0 or 1 is sorted already.

COSC242 Lecture 7 / Slide 3
Picking the pivot

There is a major problem with the above method of splitting A. To split A into two equal pieces, the pivot x should be the median. But the preprocessing to find the median key value is costly. On the other hand, if we don't choose the pivot to be the median (or close to it), then the two pieces on either side of pivot x are of unequal size, which is also potentially costly.

How should we pick the pivot? The original approach simply picked the first key in the array as pivot x, and argued that for a large input array we may assume keys are randomly distributed, so there should be roughly as many keys smaller than x as larger than x. Our pseudocode uses this idea. But it's not the best. We'll look at some improvements in the next lecture.

COSC242 Lecture 7 / Slide 4
The Quicksort Algorithm

Suppose that the choice of pivot and the rearranging of entries is done by a special Partition algorithm. (And let's postpone the question of how the Partition algorithm works.) Then to sort the subarray of A[0..(n - 1)] stretching from location low to location high, do this:

Quicksort(A, low, high)
1: if A[low..high] has 2 or more elements then
2:     Partition(A, low, high), returning the index mid
       {everything > the chosen pivot is to the right of mid}
3:     Quicksort(A, low, mid)
4:     Quicksort(A, mid + 1, high)

To sort A[0..(n - 1)], the initial call is Quicksort(A, 0, n - 1). Quicksort provides the control structure for the sort, but the work of comparing and rearranging the keys is done by the (non-recursive) Partition algorithm.

COSC242 Lecture 7 / Slide 5
The Partition Algorithm

Partition picks the pivot x and then produces an array with two parts, one containing keys ≤ x, the other keys ≥ x. For now, we pick the first key of the array to be x, the pivot.

Partition(A, low, high)
 1: x ← A[low]
 2: i ← low - 1
 3: j ← high + 1
 4: loop
 5:     repeat
 6:         j ← j - 1
 7:     until A[j] ≤ x
 8:     repeat
 9:         i ← i + 1
10:     until A[i] ≥ x
11:     if i < j then
12:         swap A[i] and A[j]
13:     else
14:         return j

COSC242 Lecture 7 / Slide 6
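Transcribed into Python, the two routines on the last two slides might look like this (a sketch; the names `partition` and `quicksort` are ours, and a 0-indexed Python list stands in for the array A):

```python
def partition(a, low, high):
    """Partition a[low..high] around pivot x = a[low] (as on the slide).

    Returns an index mid such that every key in a[low..mid] is <= x
    and every key in a[mid+1..high] is >= x."""
    x = a[low]
    i = low - 1
    j = high + 1
    while True:
        j -= 1
        while a[j] > x:        # repeat j <- j - 1 until a[j] <= x
            j -= 1
        i += 1
        while a[i] < x:        # repeat i <- i + 1 until a[i] >= x
            i += 1
        if i < j:
            a[i], a[j] = a[j], a[i]
        else:
            return j

def quicksort(a, low, high):
    """Sort a[low..high] in place: the control structure from Slide 5."""
    if low < high:             # 2 or more elements
        mid = partition(a, low, high)
        quicksort(a, low, mid)
        quicksort(a, mid + 1, high)

data = [7, 3, 2, 11, 5, 5, 1]
quicksort(data, 0, len(data) - 1)
print(data)  # [1, 2, 3, 5, 5, 7, 11]
```

Note that the recursion uses Quicksort(A, low, mid), not mid - 1: with this Partition scheme the pivot is not guaranteed to sit exactly at mid, so the left piece must include it.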
Exercises

Try stepping through the Partition algorithm for each of the following. Check that the result is an index mid that partitions the array: every key in A[low..mid] is ≤ the pivot x, and every key in A[(mid + 1)..high] is ≥ x.

A = {7, 3, 2, 11, 5}
B = {5, 5, 5, 5, 5}
C = {1, 2, 3, 4, 5}
D = {5, 4, 3, 2, 1}

In which cases did the Partition algorithm seem to do a lot of unnecessary work, and what was the reason?

Do duplicate keys ever arise in practice? And would you ever want to keep those records with duplicate keys in their original relative order for some reason? And does Partition do this?

COSC242 Lecture 7 / Slide 7
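To check your hand-simulation, here is a sketch that runs a Python transcription of Partition (the name `partition` is ours) over the four exercise arrays, printing the returned index and verifying the ≤ / ≥ property:

```python
def partition(a, low, high):
    # Transcription of the Partition pseudocode; pivot x is a[low].
    x = a[low]
    i, j = low - 1, high + 1
    while True:
        j -= 1
        while a[j] > x:
            j -= 1
        i += 1
        while a[i] < x:
            i += 1
        if i < j:
            a[i], a[j] = a[j], a[i]
        else:
            return j

results = {}
for name, arr in [("A", [7, 3, 2, 11, 5]),
                  ("B", [5, 5, 5, 5, 5]),
                  ("C", [1, 2, 3, 4, 5]),
                  ("D", [5, 4, 3, 2, 1])]:
    x = arr[0]
    mid = partition(arr, 0, len(arr) - 1)
    # Post-condition: keys up to mid are <= x, keys after mid are >= x.
    assert all(k <= x for k in arr[:mid + 1])
    assert all(k >= x for k in arr[mid + 1:])
    results[name] = mid
    print(name, "-> mid =", mid, arr)
```

Notice that the already-sorted array C returns mid = 0, the most lopsided split possible; this foreshadows why the first-key pivot choice can be a poor one.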