Unit 7: Sorting

- Simple sorting algorithms
- Quicksort
- Improving Quicksort

Overview of Sorting

Given a collection of items, we want to arrange them in increasing or decreasing order. You have probably seen a number of sorting algorithms, including:
- selection sort
- insertion sort
- bubble sort
- quicksort
- tree sort using BSTs

In terms of efficiency:
- the average complexity of the first three is O(n^2)
- the average complexity of quicksort and tree sort is O(n lg n), but their worst case is still O(n^2), which is not acceptable

In this section, we:
- review insertion, selection and bubble sort
- discuss quicksort and its average/worst-case analysis
- show how to eliminate tail recursion
- present another sorting algorithm called heapsort
Selection Sort

Assume that the data are integers stored in an array, in locations 0 to size-1, and that sorting is in ascending order.

    for i = 0 to size-1 do
        x = location with the smallest value in locations i to size-1
        swap data[i] and data[x]

If the array has n items, the i-th step performs n-i operations: the first step performs n operations, the second step performs n-1 operations, ..., the last step performs 1 operation.

Total cost: n + (n-1) + (n-2) + ... + 2 + 1 = n*(n+1)/2, which is O(n^2).

Insertion Sort

    for i = 0 to size-1 do
        temp = data[i]
        x = first location from 0 to i with a value greater than or equal to temp
        shift all values from x to i-1 one location forward
        data[x] = temp

The interesting operations are comparisons and shifts; the i-th step performs i comparison and shift operations.

Total cost: 1 + 2 + ... + (n-1) + n = n*(n+1)/2, which is O(n^2).
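The two pseudocode sketches above might be realized in C as follows. This is a minimal sketch: the function names, the use of `long` for the data, and the `size_t` indices are our own choices, not prescribed by the slides.

```c
#include <stddef.h>

/* Selection sort: repeatedly find the smallest remaining item
   and swap it to the front of the unsorted region. */
void selection_sort(long data[], size_t size) {
    for (size_t i = 0; i + 1 < size; i++) {
        size_t x = i;                  /* location of smallest value in i..size-1 */
        for (size_t j = i + 1; j < size; j++)
            if (data[j] < data[x])
                x = j;
        long tmp = data[i];            /* swap data[i] and data[x] */
        data[i] = data[x];
        data[x] = tmp;
    }
}

/* Insertion sort: shift larger items one location forward,
   then drop data[i] into the slot that opens up. */
void insertion_sort(long data[], size_t size) {
    for (size_t i = 1; i < size; i++) {
        long temp = data[i];
        size_t x = i;
        while (x > 0 && data[x - 1] > temp) {
            data[x] = data[x - 1];     /* shift one location forward */
            x--;
        }
        data[x] = temp;                /* first location with value >= temp */
    }
}
```

Note that the insertion-sort loop starts at i = 1 rather than i = 0, since a one-element prefix is already sorted; the cost analysis is unchanged.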
Bubble Sort

n passes; each pass swaps out-of-order adjacent elements.

    for j = size-1 down to 1 do
        for i = 0 to j-1 do
            if data[i] > data[i+1]
                swap data[i] and data[i+1]

The j-th pass of the inner for-loop performs j operations. Therefore, the total cost of the algorithm is n + (n-1) + ... + 1 = n*(n+1)/2, which is O(n^2).

Tree Sort

Insert each element into a BST or AVL tree, then traverse the tree inorder and place the elements back into the array.

    tree = an empty BST or AVL tree
    for i = 0 to size-1 do
        insert data[i] in tree
    for i = 0 to size-1 do
        get the next inorder item in tree
        store the item in data[i]

Inserting n items into the BST or AVL tree takes O(n log n) time on average. Traversing the tree and storing the items back in the array takes O(n) time.
Average cost: O(n log n + n) = O(n log n). If BSTs are used, the worst-case cost is O(n^2).
Problem: the algorithm needs an additional O(n) space for the tree.
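The bubble sort pseudocode above might look like this in C. This is a sketch under the same assumptions as before (`long` data, our own function name); after pass j, the largest item of data[0..j] has "bubbled" to position j, so the inner loop stops one slot earlier each time.

```c
#include <stddef.h>

/* Bubble sort: n-1 passes; each pass swaps out-of-order
   adjacent elements, pushing the largest remaining item
   to the end of the region data[0..j]. */
void bubble_sort(long data[], size_t size) {
    if (size == 0)
        return;
    for (size_t j = size - 1; j > 0; j--) {
        for (size_t i = 0; i < j; i++) {
            if (data[i] > data[i + 1]) {   /* out of order: swap */
                long tmp = data[i];
                data[i] = data[i + 1];
                data[i + 1] = tmp;
            }
        }
    }
}
```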
Quicksort

Quicksort is a 'divide and conquer' (recursive) method:
- pick one element of the array as the pivot
- partition the array into two regions:
  the left region, which holds the items less than or equal to the pivot
  the right region, which holds the items greater than the pivot
- apply quicksort to the left region
- apply quicksort to the right region

Quicksort Code

    typedef long int Item;

    // Partitions a[first..last] around the pivot a[first].
    // It returns the final position of the pivot.
    int partition( Item a[], int first, int last ) {
        Item pivot = a[first];
        int left = first, right = last;
        while (left < right) {
            // search from the right for an item <= pivot
            while ( a[right] > pivot )
                right--;
            // search from the left for an item > pivot
            while ( left < right && a[left] <= pivot )
                left++;
            // swap the items if left and right have not crossed
            if ( left < right ) {
                Item tmp = a[left];
                a[left] = a[right];
                a[right] = tmp;
            }
        }
        // place the pivot in its correct position
        a[first] = a[right];
        a[right] = pivot;
        return right;
    }

    void quicksort( Item a[], int first, int last ) {
        int pos;                      // final position of the pivot
        if (first < last) {           // region has more than 1 item
            pos = partition( a, first, last );
            quicksort( a, first, pos-1 );
            quicksort( a, pos+1, last );
        }
    }
Quicksort Partition

The partition algorithm does the following:
- picks the first item as the pivot
- divides the array into two regions:
  the left region, which contains all items <= pivot
  the right region, which contains all items > pivot
- places the pivot in the right slot (no need to move it any more)
- returns the pivot's final position

After that, quicksort has to recursively sort the left region (the part before the pivot) and the right region (the part after the pivot).

Partition starts searching from the two ends of the array and swaps the items that are in the wrong region. When the two searches meet, the array is partitioned; then the pivot is placed in its final position.

Quicksort - Example

    initial:        17 32 68 16 14 15 44 22

    1st partition:  17 32 68 16 14 15 44 22
                        l           r         (l stops at 32, r stops at 15: swap)
                    17 15 68 16 14 32 44 22
                           l     r            (l stops at 68, r stops at 14: swap)
                    17 15 14 16 68 32 44 22   (the searches meet at 16)
                    16 15 14 17 68 32 44 22   (pivot 17 placed in its final position)

    2nd partition:  16 15 14      ->  14 15 16
    3rd partition:  14 15         ->  14 15
    4th partition:  68 32 44 22   ->  22 32 44 68
    5th partition:  22 32 44      ->  22 32 44
    6th partition:  32 44         ->  32 44
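As a check on the trace above, the following self-contained sketch repeats the partition routine from the "Quicksort Code" slide and applies it to the example array. (The wrapper is our own; only `partition` itself comes from the slides.)

```c
typedef long int Item;

/* Same partition routine as on the "Quicksort Code" slide:
   left region <= pivot, right region > pivot, pivot = a[first]. */
int partition(Item a[], int first, int last) {
    Item pivot = a[first];
    int left = first, right = last;
    while (left < right) {
        while (a[right] > pivot)                 /* scan from the right */
            right--;
        while (left < right && a[left] <= pivot) /* scan from the left */
            left++;
        if (left < right) {                      /* swap misplaced pair */
            Item tmp = a[left];
            a[left] = a[right];
            a[right] = tmp;
        }
    }
    a[first] = a[right];                         /* drop pivot into place */
    a[right] = pivot;
    return right;
}
```

Running it on the slide's example array {17, 32, 68, 16, 14, 15, 44, 22} returns position 3 and leaves the array as 16 15 14 17 68 32 44 22, exactly as in the trace.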
Quicksort Time

Average time:
- Partition is O(n): it is just a loop through the array.
- Assuming we split the array in half each time, we need 2 recursive calls on half of the array, 4 on a quarter of the array, and so on, so the algorithm has at most log n levels.
- Therefore, the average time complexity of the algorithm is O(n log n).

Average space:
- It can be shown that the maximum number of frames on the stack is log n, so the average space is O(log n).

Worst time:
- The array is already sorted: partition splits the array into an empty region and a region with n-1 items. Then time = n + (n-1) + ... + 2 + 1, which is O(n^2).

Worst space:
- It will create n stack frames, so the space is O(n).

Improvements of Quicksort

How can we improve the worst time case?
- Select as pivot the median of the left, right and middle items; this significantly reduces the probability of the worst case.

How can we improve the worst space case?
- Remove tail recursion (a recursive call done as the last operation) and replace it with iteration.

With these improvements, the time/space complexity of quicksort is:

             Average     Worst
    time     O(n lg n)   O(n^2)
    space    O(lg n)     O(1)

Further improvements: quicksort is inefficient on small arrays, so stop using quicksort when the partition size is small (e.g. < 50) and use insertion sort for that part of the array.
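The two improvements above might be sketched in C as follows. This is a hedged sketch, reusing the partition convention of the earlier slides; the `median3` helper, the smaller-region-first recursion, and all names are our own illustrative choices. Note that the sketch recurses on the smaller region and loops on the larger one, which bounds the stack at O(lg n) frames.

```c
typedef long int Item;

static void swap_items(Item *x, Item *y) { Item t = *x; *x = *y; *y = t; }

/* Median-of-three: order a[first], a[mid], a[last], then move
   the median into a[first] so it becomes the pivot. */
static void median3(Item a[], int first, int last) {
    int mid = first + (last - first) / 2;
    if (a[mid]  < a[first]) swap_items(&a[mid],  &a[first]);
    if (a[last] < a[first]) swap_items(&a[last], &a[first]);
    if (a[last] < a[mid])   swap_items(&a[last], &a[mid]);
    swap_items(&a[first], &a[mid]);   /* median is now the pivot */
}

/* Partition as on the "Quicksort Code" slide. */
static int partition(Item a[], int first, int last) {
    Item pivot = a[first];
    int left = first, right = last;
    while (left < right) {
        while (a[right] > pivot) right--;
        while (left < right && a[left] <= pivot) left++;
        if (left < right) swap_items(&a[left], &a[right]);
    }
    a[first] = a[right];
    a[right] = pivot;
    return right;
}

/* Quicksort with median-of-three pivoting and the tail call
   replaced by a loop: recurse on the smaller region, iterate
   on the larger one. */
void quicksort(Item a[], int first, int last) {
    while (first < last) {
        median3(a, first, last);
        int pos = partition(a, first, last);
        if (pos - first < last - pos) {   /* left region is smaller */
            quicksort(a, first, pos - 1);
            first = pos + 1;              /* iterate on the right region */
        } else {
            quicksort(a, pos + 1, last);
            last = pos - 1;               /* iterate on the left region */
        }
    }
}
```

A production version would also add the small-partition cutoff mentioned above, switching to insertion sort when last - first falls below some threshold; that is omitted here to keep the sketch short.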