CS 3343 Fall 2007 Sorting Carola Wenk Slides courtesy of Charles Leiserson with small changes by Carola Wenk CS 3343 Analysis of Algorithms 1 How fast can we sort? All the sorting algorithms we have seen so far are comparison sorts: only use comparisons to determine the relative order of elements. E.g., insertion sort, merge sort, quicksort, heapsort. The best worst-case running time that we ve seen for comparison sorting is O(n log n). Is O(n log n) the best we can do? Decision trees can help us answer this question. CS 3343 Analysis of Algorithms 2 Sort a 1, a 2,, a n The left subtree shows subsequent comparisons if a i a j. The right subtree shows subsequent comparisons if a i a j. CS 3343 Analysis of Algorithms 3 The left subtree shows subsequent comparisons if a i a j. The right subtree shows subsequent comparisons if a i a j. CS 3343 Analysis of Algorithms 4
9 4 9 6 The left subtree shows subsequent comparisons if a i a j. The right subtree shows subsequent comparisons if a i a j. CS 3343 Analysis of Algorithms 5 The left subtree shows subsequent comparisons if a i a j. The right subtree shows subsequent comparisons if a i a j. CS 3343 Analysis of Algorithms 6 4 6 a 1 a 2 a 3 The left subtree shows subsequent comparisons if a i a j. The right subtree shows subsequent comparisons if a i a j. CS 3343 Analysis of Algorithms 7 a 3 a 1 a 2 a 2 a 3 a 1 a 3 a 2 a 1 a 1 a 3 a 2 a 2 a 3 a 1 4 6 9 Each leaf contains a permutation π(1), π(2),, π(n) to indicate that the ordering a π(1) a π(2) L a π(n) has been established. CS 3343 Analysis of Algorithms 8
Decision-tree model A decision tree can model the execution of any comparison sorting algorithm: One tree for each input size n. The tree contains all possible comparisons (= if-branches) that could be executed for any input of size n. The tree contains all comparisons along all possible instruction traces (= control flows) for all inputs of size n. For one input, only one path to a leaf is executed. Running time = length of the path taken. Worst-case running time = height of tree. Lower bound for comparison sorting Theorem. Any decision tree that can sort n elements must have height Ω(n log n). Proof. The tree must contain n! leaves, since there are n! possible permutations. A height-h binary tree has 2 h leaves. Thus, n! 2 h. h log(n!) (log is mono. increasing) log ((n/e) n ) (Stirling s formula) = n log n n log e = Ω(n log n). CS 3343 Analysis of Algorithms 9 CS 3343 Analysis of Algorithms 10 Lower bound for comparison sorting Corollary. Heapsort and merge sort are asymptotically optimal comparison sorting algorithms. Sorting in linear time Counting sort: No comparisons between elements. Input: A[1.. n], where A[ j] {1, 2,, k}. Output: B[1.. n], sorted. Auxiliary storage: C[1.. k]. CS 3343 Analysis of Algorithms 11 CS 3343 Analysis of Algorithms 12
Counting sort Counting-sort example 1. for i 1 to k do C[i] 0 2. for j 1 to n 3. for i 2 to k C[i] = {key i} C: CS 3343 Analysis of Algorithms 13 CS 3343 Analysis of Algorithms 14 Loop 1 Loop 2 C: 00 00 00 00 C: 00 00 00 11 1. for i 1 to k do C[i] 0 2. for j 1 to n CS 3343 Analysis of Algorithms 15 CS 3343 Analysis of Algorithms 16
Loop 2 Loop 2 C: 11 00 00 11 C: 11 00 11 11 2. for j 1 to n 2. for j 1 to n CS 3343 Analysis of Algorithms 17 CS 3343 Analysis of Algorithms 18 Loop 2 Loop 2 C: 11 00 11 22 C: 11 00 22 22 2. for j 1 to n 2. for j 1 to n CS 3343 Analysis of Algorithms 19 CS 3343 Analysis of Algorithms 20 2001 by Charles E. Leiserson; small changes by Carola
Loop 3 Loop 3 C: 11 00 22 22 C: 11 00 22 22 C': 11 11 22 22 C': 11 11 33 22 3. for i 2 to k C[i] = {key i} 3. for i 2 to k C[i] = {key i} CS 3343 Analysis of Algorithms 21 CS 3343 Analysis of Algorithms 22 Loop 3 C: 11 00 22 22 C: 11 11 33 55 C': 11 11 33 55 33 C': 11 11 33 55 3. for i 2 to k C[i] = {key i} CS 3343 Analysis of Algorithms 23 CS 3343 Analysis of Algorithms 24 2001 by Charles E. Leiserson; small changes by Carola
C: 11 11 33 55 C: 11 11 22 55 33 C': 11 11 22 55 33 44 C': 11 11 22 55 CS 3343 Analysis of Algorithms 25 CS 3343 Analysis of Algorithms 26 C: 11 11 22 55 C: 11 11 22 44 33 44 C': 11 11 22 44 33 33 44 C': 11 11 22 44 CS 3343 Analysis of Algorithms 27 CS 3343 Analysis of Algorithms 28 2001 by Charles E. Leiserson; small changes by Carola
C: 11 11 22 44 C: 11 11 11 44 33 33 44 C': 11 11 11 44 11 33 33 44 C': 11 11 11 44 CS 3343 Analysis of Algorithms 29 CS 3343 Analysis of Algorithms 30 C: 11 11 11 44 C: 00 11 11 44 11 33 33 44 C': 00 11 11 44 11 33 33 44 44 C': 00 11 11 44 CS 3343 Analysis of Algorithms 31 CS 3343 Analysis of Algorithms 32 2001 by Charles E. Leiserson; small changes by Carola
11 33 33 44 44 C: 00 11 11 44 C': 00 11 11 33 Analysis Θ(k) Θ(n) Θ(k) Θ(n) Θ(n + k) 1. for i 1 to k do C[i] 0 2. for j 1 to n 3. for i 2 to k CS 3343 Analysis of Algorithms 33 CS 3343 Analysis of Algorithms 34 Running time Stable sorting If k = O(n), then counting sort takes Θ(n) time. But, sorting takes Ω(n log n) time! Where s the fallacy? Answer: Comparison sorting takes Ω(n log n) time. Counting sort is not a comparison sort. In fact, not a single comparison between elements occurs! Counting sort is a stable sort: it preserves the input order among equal elements. 11 33 33 44 44 Exercise: What other sorts have this property? CS 3343 Analysis of Algorithms 35 CS 3343 Analysis of Algorithms 36