Merge Sort

[figure: merge-sort tree for 7 2 9 4 → 2 4 7 9]
Divide-and-Conquer

Divide-and-conquer is a general algorithm design paradigm:
- Divide: divide the input data S into two disjoint subsets S1 and S2
- Recur: solve the subproblems associated with S1 and S2
- Conquer: combine the solutions for S1 and S2 into a solution for S

The base case for the recursion is a subproblem of size 0 or 1.

Merge-sort is a sorting algorithm based on the divide-and-conquer paradigm. Like heap-sort, it has O(n log n) running time. Unlike heap-sort:
- it does not use an auxiliary priority queue
- it accesses data in a sequential manner (suitable to sort data on a disk)

© 2014 Goodrich, Tamassia, Goldwasser
Merge-Sort

Merge-sort on an input sequence S with n elements consists of three steps:
- Divide: partition S into two sequences S1 and S2 of about n/2 elements each
- Recur: recursively sort S1 and S2
- Conquer: merge S1 and S2 into a unique sorted sequence

Algorithm mergeSort(S, C)
  Input: sequence S with n elements, comparator C
  Output: sequence S sorted according to C
  if S.size() > 1
    (S1, S2) <- partition(S, n/2)
    mergeSort(S1, C)
    mergeSort(S2, C)
    S <- merge(S1, S2)
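The three steps translate directly into Python (my rendering, not from the slides; the standard library's stable `heapq.merge` stands in for the merge step, which the next slide spells out in pseudocode):

```python
from heapq import merge  # stable O(n) merge of two sorted iterables

def merge_sort(S):
    """Divide, recur, conquer: returns a new sorted list."""
    if len(S) <= 1:                  # base case: size 0 or 1
        return S
    mid = len(S) // 2
    S1, S2 = S[:mid], S[mid:]        # Divide: split into two halves
    return list(merge(merge_sort(S1), merge_sort(S2)))  # Recur + Conquer

print(merge_sort([7, 2, 9, 4]))      # [2, 4, 7, 9]
```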
Merging Two Sorted Sequences

The conquer step of merge-sort consists of merging two sorted sequences A and B into a sorted sequence S containing the union of the elements of A and B.
Merging two sorted sequences, each with n/2 elements and implemented by means of a doubly linked list, takes O(n) time.

Algorithm merge(A, B)
  Input: sequences A and B with n/2 elements each
  Output: sorted sequence of A ∪ B
  S <- empty sequence
  while ¬A.isEmpty() ∧ ¬B.isEmpty()
    if A.first().element() < B.first().element()
      S.addLast(A.remove(A.first()))
    else
      S.addLast(B.remove(B.first()))
  while ¬A.isEmpty()
    S.addLast(A.remove(A.first()))
  while ¬B.isEmpty()
    S.addLast(B.remove(B.first()))
  return S
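A Python rendering of merge (a sketch; lists and indices stand in for the slides' doubly linked lists and removals):

```python
def merge(A, B):
    """Merge two sorted lists A and B into one sorted list in O(n) time."""
    S = []
    i = j = 0
    while i < len(A) and j < len(B):
        if A[i] < B[j]:                 # take the smaller front element
            S.append(A[i]); i += 1
        else:
            S.append(B[j]); j += 1
    S.extend(A[i:])                      # drain whichever input remains
    S.extend(B[j:])
    return S

print(merge([24, 45, 63, 85], [17, 31, 50, 96]))
# [17, 24, 31, 45, 50, 63, 85, 96]
```

The input values are the ones used in the Merge Example slides below.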
Merge-Sort Tree

An execution of merge-sort is depicted by a binary tree:
- each node represents a recursive call of merge-sort and stores
  - the unsorted sequence before the execution and its partition
  - the sorted sequence at the end of the execution
- the root is the initial call
- the leaves are calls on subsequences of size 0 or 1

[figure: merge-sort tree for 7 2 9 4 → 2 4 7 9]
Execution Example

Ten slides trace merge-sort on the sequence 7 2 9 4 3 8 6 1, producing 1 2 3 4 6 7 8 9. Each slide shows the merge-sort tree with the current step highlighted:
- Partition
- Recursive call, partition
- Recursive call, partition
- Recursive call, base case
- Recursive call, base case
- Merge
- Recursive call, base case, merge
- Merge
- Recursive call, merge, merge
- Merge (final)
Merge - Example

Step-by-step merge of S1 = (24 45 63 85) and S2 = (17 31 50 96):

  S1: 24 45 63 85   S2: 17 31 50 96   S: (empty)
  S1: 24 45 63 85   S2: 31 50 96      S: 17
  S1: 45 63 85      S2: 31 50 96      S: 17 24
  S1: 45 63 85      S2: 50 96         S: 17 24 31
  S1: 63 85         S2: 50 96         S: 17 24 31 45
  S1: 63 85         S2: 96            S: 17 24 31 45 50
  S1: (empty)       S2: (empty)       S: 17 24 31 45 50 63 85 96
Analysis of Merge-Sort

The height h of the merge-sort tree is O(log n):
- at each recursive call we divide the sequence in half

The overall amount of work done at the nodes of depth i is O(n):
- we partition and merge 2^i sequences of size n/2^i
- we make 2^(i+1) recursive calls

Thus, the total running time of merge-sort is O(n log n).

  depth  #seqs  size
  0      1      n
  1      2      n/2
  i      2^i    n/2^i
Recurrences

Running times of algorithms with recursive calls can be described using recurrences. A recurrence is an equation or inequality that describes a function in terms of its value on smaller inputs.

T(n) =
- time for solving the trivial problem (base case), or
- (number of subproblems) * T(n / subproblem-size factor) + dividing + combining
Recurrence Equation Analysis

The conquer step of merge-sort consists of merging two sorted sequences, each with n/2 elements and implemented by means of a doubly linked list; it takes at most bn steps, for some constant b. Likewise, the base case (n < 2) takes at most b steps. Therefore, if we let T(n) denote the running time of merge-sort:

  T(n) = b               if n < 2
  T(n) = 2T(n/2) + bn    if n >= 2

We can therefore analyze the running time of merge-sort by finding a closed-form solution to the above equation.
- That is, a solution that has T(n) only on the left-hand side.
Iterative Substitution

In the iterative substitution, or "plug-and-chug," technique, we iteratively apply the recurrence equation to itself and see if we can find a pattern:

  T(n) = 2T(n/2) + bn
       = 2(2T(n/2^2) + b(n/2)) + bn
       = 2^2 T(n/2^2) + 2bn
       = 2^3 T(n/2^3) + 3bn
       = 2^4 T(n/2^4) + 4bn
       = ...
       = 2^i T(n/2^i) + ibn

Note that the base case, T(n) = b, occurs when 2^i = n, that is, i = log n. Thus,

  T(n) = bn + bn log n

and T(n) is O(n log n).
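The closed form can be checked mechanically (a sketch of mine, not from the slides): evaluate the recurrence directly for powers of two and compare with bn + bn log n.

```python
import math

def T(n, b=1):
    """T(n) = 2*T(n/2) + b*n for n >= 2, and T(n) = b for n < 2."""
    if n < 2:
        return b
    return 2 * T(n // 2, b) + b * n

# closed form from the substitution: T(n) = b*n + b*n*log2(n), with b = 1
for n in (2, 4, 8, 16, 1024):
    assert T(n) == n + n * math.log2(n)
print(T(1024))   # 11264
```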
The Recursion Tree

Draw the recursion tree for the recurrence relation and look for a pattern:

  T(n) = b               if n < 2
  T(n) = 2T(n/2) + bn    if n >= 2

  depth  #seqs  size    time
  0      1      n       bn
  1      2      n/2     bn
  i      2^i    n/2^i   bn

Total time = bn + bn log n (last level plus all previous levels)
Summary of Sorting Algorithms

  Algorithm       Time         Notes
  selection-sort  O(n^2)       slow, in-place; for small data sets (< 1K)
  insertion-sort  O(n^2)       slow, in-place; for small data sets (< 1K)
  heap-sort       O(n log n)   fast, in-place; for large data sets (1K - 1M)
  merge-sort      O(n log n)   fast, sequential data access; for huge data sets (> 1M)
Quick-Sort

[figure: quick-sort tree for 7 4 9 6 2 → 2 4 6 7 9]
Quick-Sort

Quick-sort is a randomized sorting algorithm based on the divide-and-conquer paradigm:
- Divide: pick a random element x (called pivot) and partition S into
  - L: elements less than x
  - E: elements equal to x
  - G: elements greater than x
- Recur: sort L and G
- Conquer: join L, E and G
Partition

We partition an input sequence as follows:
- We remove, in turn, each element y from S, and
- We insert y into L, E or G, depending on the result of the comparison with the pivot x

Each insertion and removal is at the beginning or at the end of a sequence, and hence takes O(1) time.
Thus, the partition step of quick-sort takes O(n) time.

Algorithm partition(S, p)
  Input: sequence S, position p of pivot
  Output: subsequences L, E, G of the elements of S less than, equal to, or greater than the pivot, resp.
  L, E, G <- empty sequences
  x <- S.remove(p)
  while ¬S.isEmpty()
    y <- S.remove(S.first())
    if y < x
      L.addLast(y)
    else if y = x
      E.addLast(y)
    else  { y > x }
      G.addLast(y)
  return L, E, G
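The partition and the surrounding quick-sort translate directly into Python (my sketch; list comprehensions stand in for the slides' O(1) removals and insertions):

```python
import random

def quick_sort(S):
    """Randomized quick-sort via a three-way partition into L, E, G."""
    if len(S) <= 1:
        return S
    x = random.choice(S)            # random pivot
    L = [y for y in S if y < x]     # elements less than the pivot
    E = [y for y in S if y == x]    # elements equal to the pivot
    G = [y for y in S if y > x]     # elements greater than the pivot
    return quick_sort(L) + E + quick_sort(G)   # recur on L and G, then join

print(quick_sort([7, 4, 9, 6, 2]))  # [2, 4, 6, 7, 9]
```

Each element is compared with the pivot, so the partition step is O(n), as claimed on the slide.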
Quick-Sort Tree

An execution of quick-sort is depicted by a binary tree:
- each node represents a recursive call of quick-sort and stores
  - the unsorted sequence before the execution and its pivot
  - the sorted sequence at the end of the execution
- the root is the initial call
- the leaves are calls on subsequences of size 0 or 1

[figure: quick-sort tree for 7 4 9 6 2 → 2 4 6 7 9]
Execution Example

Seven slides trace quick-sort on the sequence 7 2 9 4 3 7 6 1, producing 1 2 3 4 6 7 7 9. Each slide shows the quick-sort tree with the current step highlighted:
- Pivot selection
- Partition, recursive call, pivot selection
- Partition, recursive call, base case
- Recursive call, base case, join
- Recursive call, pivot selection
- Partition, recursive call, base case
- Join, join
Best-case Running Time

If we are lucky, the partition splits the array evenly:

  T(n) = b               if n < 2
  T(n) = 2T(n/2) + bn    if n >= 2

  depth  #seqs  size    time
  0      1      n       bn
  1      2      n/2     bn
  i      2^i    n/2^i   bn

Total time = bn + bn log n (last level plus all previous levels)
Worst-case Running Time

The worst case for quick-sort occurs when the pivot is the unique minimum or maximum element and the input sequence is in ascending or descending order.
- One of L and G has size n - 1 and the other has size 0
- The running time is proportional to the sum n + (n - 1) + ... + 2 + 1

Thus, the worst-case running time of quick-sort is O(n^2).

  depth  time
  0      n
  1      n - 1
  ...    ...
  n - 1  1
Expected Time Analysis

Fix the input; the expectation is over the different randomly selected pivots.
Let T(n) be the expected number of comparisons needed to quick-sort n numbers. Each split position j has probability 1/n, and a split at position j costs T(j-1) + T(n-j) plus the n - 1 comparisons of the partition, so

  T(n) = (1/n) * sum_{j=1..n} [ T(j-1) + T(n-j) + n - 1 ]
       = (2/n) * sum_{j=0..n-1} T(j) + n - 1
Expected Time Analysis (2)

Since we have

  T(n-1) = (2/(n-1)) * sum_{j=0..n-2} T(j) + n - 2,

it follows that

  sum_{j=0..n-2} T(j) = ((n-1)/2) * (T(n-1) - n + 2).

Substituting into the expression for T(n):

  T(n) = ((n-1)/n) * (T(n-1) - n + 2) + (2/n) * T(n-1) + n - 1
       = ((n+1)/n) * T(n-1) + 2(n-1)/n
Expected Time Analysis (3)

  T(n) = ((n+1)/n) * T(n-1) + 2(n-1)/n
       < ((n+1)/n) * T(n-1) + 2
       = ((n+1)/(n-1)) * T(n-2) + 2(n+1)/n + 2
       < ((n+1)/(n-2)) * T(n-3) + 2(n+1) * (1/(n-1) + 1/n) + 2
       < ((n+1)/(n-3)) * T(n-4) + 2(n+1) * (1/(n-2) + 1/(n-1) + 1/n) + 2
       < ...
       < (n+1) * T(0) + 2(n+1) * (1/2 + 1/3 + ... + 1/n) + 2
       = 2(n+1) * (1/2 + 1/3 + ... + 1/n) + 2
Expected Time Analysis (4)

  T(n) < 2(n+1) * (1/2 + 1/3 + ... + 1/n) + 2
       <= 2(n+1) * ∫_1^n (1/x) dx + 2
       = 2(n+1) log n + 2
       = O(n log n)
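The recurrence itself can be tabulated numerically (a sketch of mine, not from the slides) to confirm it stays under the derived bound:

```python
import math

def expected_comparisons(n_max):
    """Tabulate T(n) = (2/n) * sum_{j=0}^{n-1} T(j) + n - 1, with T(0) = 0."""
    T = [0.0] * (n_max + 1)
    prefix = 0.0                  # running sum T(0) + ... + T(n-1)
    for n in range(1, n_max + 1):
        T[n] = 2.0 * prefix / n + n - 1
        prefix += T[n]
    return T

T = expected_comparisons(1000)
for n in (10, 100, 1000):
    assert T[n] < 2 * (n + 1) * math.log(n) + 2   # the O(n log n) bound
```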
In-Place Quick-Sort

Quick-sort can be implemented to run in-place.
In the partition step, we use replace operations to rearrange the elements of the input sequence such that:
- the elements less than the pivot have rank less than h
- the elements equal to the pivot have rank between h and k
- the elements greater than the pivot have rank greater than k

The recursive calls consider:
- elements with rank less than h
- elements with rank greater than k

Algorithm inPlaceQuickSort(S, l, r)
  Input: sequence S, ranks l and r
  Output: sequence S with the elements of rank between l and r rearranged in increasing order
  if l >= r
    return
  i <- a random integer between l and r
  x <- S.elemAtRank(i)
  (h, k) <- inPlacePartition(x)
  inPlaceQuickSort(S, l, h - 1)
  inPlaceQuickSort(S, k + 1, r)
In-Place Partitioning

Perform the partition using two indices j and k to split S into L and E ∪ G (a similar method can split E ∪ G into E and G).

  3 2 5 1 0 7 3 5 9 2 7 9 8 9 7 6 9   (pivot = 6)

Repeat until j and k cross:
- Scan j to the right until finding an element >= x.
- Scan k to the left until finding an element < x.
- Swap the elements at indices j and k.
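A runnable sketch of the two-index scheme (my code: it does the simpler two-way split into L and E ∪ G described above, returning a single split index rather than the slides' pair (h, k)):

```python
import random

def in_place_partition(S, l, r):
    """Two-index partition of S[l..r] around pivot x = S[l].
    Returns p such that S[l..p] <= x <= S[p+1..r]."""
    x = S[l]
    j, k = l - 1, r + 1
    while True:
        j += 1
        while S[j] < x:            # scan j right to an element >= x
            j += 1
        k -= 1
        while S[k] > x:            # scan k left to an element <= x
            k -= 1
        if j >= k:                 # indices crossed: partition done
            return k
        S[j], S[k] = S[k], S[j]    # swap and keep scanning

def in_place_quick_sort(S, l=0, r=None):
    if r is None:
        r = len(S) - 1
    if l >= r:
        return
    i = random.randint(l, r)       # random pivot rank
    S[l], S[i] = S[i], S[l]        # move the pivot to the front for the scan
    p = in_place_partition(S, l, r)
    in_place_quick_sort(S, l, p)
    in_place_quick_sort(S, p + 1, r)

data = [3, 2, 5, 1, 0, 7, 3, 5, 9, 2, 7, 9, 8, 9, 7, 6, 9]
in_place_quick_sort(data)
print(data)   # [0, 1, 2, 2, 3, 3, 5, 5, 6, 7, 7, 7, 8, 9, 9, 9, 9]
```

Anchoring the pivot at S[l] before the scans guarantees the split index p satisfies l <= p < r, so both recursive calls shrink.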
Summary of Sorting Algorithms

  Algorithm       Time                  Notes
  selection-sort  O(n^2)                in-place; slow (good for small inputs)
  insertion-sort  O(n^2)                in-place; slow (good for small inputs)
  quick-sort      O(n log n) expected   in-place, randomized; fastest (good for large inputs)
  heap-sort       O(n log n)            in-place; fast (good for large inputs)
  merge-sort      O(n log n)            sequential data access; fast (good for huge inputs)
Radix Sort

Considers the structure of the keys.
Assume keys are represented in a base-M number system (M = radix); e.g., if M = 2, the keys are represented in binary.

  weight:  8 4 2 1
  8     =  1 0 0 0   (b = 4 bits)
  bit #:   3 2 1 0

Sorting is performed by comparing bits in the same position.
Extends to keys that are alphanumeric strings.
Radix Exchange Sort

All the keys are represented with a fixed number of bits.
Examine bits from left to right:
- sort the array with respect to the leftmost bit
  (e.g., the bit column 1 0 1 0 0 becomes 0 0 0 1 1)
- partition the array into a top subarray (leftmost bit 0) and a bottom subarray (leftmost bit 1)
- recursion:
  - recursively sort the top subarray, ignoring the leftmost bit
  - recursively sort the bottom subarray, ignoring the leftmost bit

Complexity: for n numbers of b bits each, O(bn).
Radix Exchange Sort

Partition:
- repeat:
  - scan top-down to find a key starting with 1
  - scan bottom-up to find a key starting with 0
  - swap the keys
- scan until the indices cross

The complexity of the partition is O(n).

Example on the leftmost bit: keys 1 1 0 1 0; the first exchange gives 0 1 0 1 1, the second gives 0 0 1 1 1.
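The whole algorithm fits in a few lines of Python (my sketch; `bit` is the index of the current, most significant, bit under consideration):

```python
def radix_exchange_sort(A, lo, hi, bit):
    """Sort A[lo..hi] by examining bits from position `bit` downward."""
    if lo >= hi or bit < 0:
        return
    mask = 1 << bit
    j, k = lo, hi
    while j <= k:                               # partition on the current bit
        while j <= k and not (A[j] & mask):     # top-down: find a key starting with 1
            j += 1
        while j <= k and (A[k] & mask):         # bottom-up: find a key starting with 0
            k -= 1
        if j < k:
            A[j], A[k] = A[k], A[j]             # swap the two keys
    radix_exchange_sort(A, lo, j - 1, bit - 1)  # recurse on the 0-keys ...
    radix_exchange_sort(A, j, hi, bit - 1)      # ... and on the 1-keys

A = [9, 2, 13, 1, 14, 2]
radix_exchange_sort(A, 0, len(A) - 1, 3)        # 4-bit keys
print(A)   # [1, 2, 2, 9, 13, 14]
```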
Radix Exchange Sort

[figure: the array before sorting; after sorting on the leftmost bit, it is split at the value 2^(b-1); each half is then recursively sorted on the second-from-leftmost bit]
Radix Exchange Sort vs. Quicksort

Similarities:
- both partition the array
- both recursively sort the sub-arrays

Differences:
- method of partitioning:
  - radix exchange divides the array based on greater than or less than 2^(b-1)
  - quicksort partitions based on greater than or less than some element of the array
- time complexity:
  - radix exchange: O(bn)
  - quicksort: average case O(n log n), worst case O(n^2)
Straight Radix Sort

Examines bits from right to left:
- for k := 0 to b - 1
  - sort the array in a stable way, looking only at bit k
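The loop above can be sketched in Python (my code; each pass is stable because the list comprehensions preserve input order within each bit class):

```python
def straight_radix_sort(A, b):
    """LSD radix sort on b-bit keys: one stable pass per bit, right to left."""
    for k in range(b):
        zeros = [x for x in A if not (x >> k) & 1]   # bit k is 0, input order kept
        ones  = [x for x in A if (x >> k) & 1]       # bit k is 1, input order kept
        A[:] = zeros + ones
    return A

print(straight_radix_sort([0b1001, 0b0010, 0b1101, 0b0001, 0b1110], 4))
# [1, 2, 9, 13, 14]
```

The input values are the 4-bit integers from the example on the next slide.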
Example

Sorting a sequence of 4-bit integers, one stable pass per bit (right to left):

  input   bit 0   bit 1   bit 2   bit 3
  1001    0010    1001    1001    0001
  0010    1110    1101    0001    0010
  1101    1001    0001    0010    1001
  0001    1101    0010    1101    1101
  1110    0001    1110    1110    1110

© 2014 Goodrich, Tamassia, Goldwasser
Sort in a Stable Way

In a stable sort, the initial relative order of equal keys is unchanged.
For example, observe the first pass of the sort. The relative order of the keys ending in 0 is unchanged, and the same is true for the keys ending in 1:

  1001        0010
  0010        1110
  1101   →    1001
  0001        1101
  1110        0001
Correctness

We show that any two keys are in the correct relative order at the end of the algorithm:
- Given two keys, let k be the leftmost bit position where they differ.
- At step k the two keys are put in the correct relative order.
- Because of stability, the successive steps do not change the relative order of the two keys.
Example: Decimal Numbers

[figure omitted: straight radix sort on decimal keys, one stable pass per digit]
Straight Radix Sort: Time Complexity

for k = 0 to b - 1
- sort the array in a stable way, looking only at bit k

Suppose we can perform the stable sort above in O(n) time; then the total time complexity is O(bn).
We can indeed perform a stable sort based on the keys' k-th digit in O(n) time.
- how? (see bucket sort, next)
Bucket Sort

Basics:
- n numbers, each a key in {1, 2, 3, ..., m}
- stable
- time: O(n + m)

For example, take m = 3 and an array of keys in {1, 2, 3} (note that there are two 2's and two 1's) [array figure omitted].
First, we create m buckets. Each element of the array is put in one of the m buckets. Finally, we pull the elements from the buckets back into the array, which yields the array sorted in a stable way. [bucket figures omitted]
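A Python sketch (my code; the optional `key` function is an addition so that the same routine gives the stable per-digit pass that straight radix sort needs):

```python
def bucket_sort(A, m, key=lambda x: x):
    """Stable bucket sort: every key(x) must lie in range(m). O(n + m) time."""
    buckets = [[] for _ in range(m)]          # one bucket per possible key
    for x in A:                               # left-to-right scan keeps it stable
        buckets[key(x)].append(x)
    return [x for b in buckets for x in b]    # concatenate buckets in key order

print(bucket_sort([2, 3, 1, 2, 1], 4))        # [1, 1, 2, 2, 3]

# the stable O(n) pass that straight radix sort needs: sort on bit k only
A = [0b1001, 0b0010, 0b1101, 0b0001, 0b1110]
for k in range(4):
    A = bucket_sort(A, 2, key=lambda x: (x >> k) & 1)
print(A)                                      # [1, 2, 9, 13, 14]
```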
In-Place Sorting

A sorting algorithm is said to be in-place if:
- it uses no auxiliary data structures (however, O(1) auxiliary variables are allowed)
- it updates the input sequence only by means of the operations replaceElement and swapElements

Which sorting algorithms seen so far can be made to work in place?
Lower Bound for Comparison-Based Sorting

[figure: decision tree]
- internal node: a comparison
- external node: a permutation
- algorithm execution: a root-to-leaf path
How Fast Can We Sort?

Proposition: The running time of any comparison-based algorithm for sorting an n-element sequence S is Ω(n log n).

Justification: The running time of a comparison-based sorting algorithm must be equal to or greater than the depth of the decision tree T associated with this algorithm. Each internal node of T is associated with a comparison that establishes the ordering of two elements of S. Each external node of T represents a distinct permutation of the elements of S. Hence T must have at least n! external nodes, which in turn implies T has a height of at least log(n!). Since n! has at least n/2 terms that are greater than or equal to n/2, we have:

  log(n!) >= (n/2) log(n/2)

Total time complexity: Ω(n log n).
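The step from log(n!) to the stated bound can be expanded slightly (a standard estimate, not spelled out on the slide):

```latex
\log(n!) \;=\; \sum_{i=1}^{n} \log i
        \;\ge\; \sum_{i=\lceil n/2 \rceil}^{n} \log i
        \;\ge\; \frac{n}{2}\,\log\frac{n}{2}
        \;=\; \Omega(n \log n).
```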
Problems

1. Write a recursive procedure to convert a binary search tree to a doubly linked list in place.
2. Given two binary search trees (not necessarily balanced), write an algorithm that merges the two given trees into a balanced search tree.
3. Given a balanced binary search tree of n nodes and a target sum, write a function that returns true if there is a pair that adds up to the sum, and false otherwise.
4. Explain how to use an AVL tree or a red-black tree to sort n comparable elements in O(n log n) time in the worst case.
5. Let T and U be (2,4) trees storing n and m entries, respectively, such that all the entries in T have keys less than the keys of all entries in U. Describe an O(log n + log m) algorithm for joining T and U into a single (2,4) tree that stores all the entries in T and U.