Sorting
Should we worry about speed?

Task Description
- We have an array of n values in any order
- We need to have the array sorted in ascending or descending order of values

Selection Sort
- Select the smallest value in the array
- Swap that value with the value in the first array location
- Repeatedly do:
  - Select the smallest value of the remaining values
  - Swap that value with the value in the first of the remaining locations
- Stop when no values remain
Algorithm
    public static void selectionsort(int[] data, int n)
    {
        int i, j, smallest;
        int temp;
        for (i = 0; i < n - 1; i++)
        {
            // find the index of the smallest value in data[i..n-1]
            smallest = i;
            for (j = i + 1; j < n; j++)
                if (data[j] < data[smallest])
                    smallest = j;
            /* swap data[smallest] and data[i]; see "Swapping the data" */
        }
    }

Swapping the data
- Can we do this?
      data[smallest] = data[i];
      data[i] = data[smallest];
- This would overwrite one of the values (which?)
- The correct solution is:
      temp = data[i];
      data[i] = data[smallest];
      data[smallest] = temp;

Running Time for Selection Sort
- Worst case: O(n^2)
- Average case: O(n^2)
- Best case: O(n^2)
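To see why the best case is no better than the worst case, the sketch below counts the comparisons selection sort makes. The class name `SelectionSortDemo` and the returned counter are illustrative additions, not part of the slides; the loop structure follows the algorithm above, with the swap filled in.

```java
import java.util.Arrays;

class SelectionSortDemo {
    // Sorts data[0..n-1] in ascending order and returns how many
    // element comparisons were made (the counter is an illustrative extra).
    static long selectionSort(int[] data, int n) {
        long comparisons = 0;
        for (int i = 0; i < n - 1; i++) {
            int smallest = i;
            for (int j = i + 1; j < n; j++) {
                comparisons++;
                if (data[j] < data[smallest]) {
                    smallest = j;
                }
            }
            // swap data[smallest] and data[i] through a temporary
            int temp = data[i];
            data[i] = data[smallest];
            data[smallest] = temp;
        }
        return comparisons;
    }

    public static void main(String[] args) {
        int[] sorted = {1, 2, 3, 4, 5, 6};
        int[] reversed = {6, 5, 4, 3, 2, 1};
        System.out.println(selectionSort(sorted, sorted.length));     // 15
        System.out.println(selectionSort(reversed, reversed.length)); // 15
        System.out.println(Arrays.toString(reversed)); // [1, 2, 3, 4, 5, 6]
    }
}
```

Both inputs cost n(n-1)/2 = 15 comparisons for n = 6, because the inner loop always scans every remaining value regardless of the initial order.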
Insertion Sort
- Divide the array into 2 subarrays: the first is sorted and the second is unsorted
- Initially the sorted subarray has one element, and the unsorted subarray has n-1 elements
- Repeat the following:
  - Take an element from the unsorted subarray and insert it in the correct place in the sorted subarray
- Stop when the sorted subarray has all n elements

Algorithm
    public static void insertionsort(int[] data, int n)
    {
        int i, j, temp;
        for (i = 1; i < n; i++)
            for (j = i; j > 0; j--)
                if (data[j] < data[j-1])
                {
                    /* swap data[j] and data[j-1] */
                    temp = data[j];
                    data[j] = data[j-1];
                    data[j-1] = temp;
                }
    }

Improved Algorithm
    public static void insertionsort(int[] data, int n)
    {
        int i, j, temp;
        boolean done;
        for (i = 1; i < n; i++)
        {
            j = i;
            done = false;
            // stop moving left as soon as data[j] is in its correct place
            while ((j > 0) && (done != true))
            {
                if (data[j] < data[j-1])
                {
                    /* swap data[j] and data[j-1] */
                    temp = data[j];
                    data[j] = data[j-1];
                    data[j-1] = temp;
                }
                else
                    done = true;
                j--;
            }
        }
    }
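A minimal sketch of the improved algorithm with a comparison counter added, to show the effect of stopping early. The class name `InsertionSortDemo` and the counter are illustrative assumptions; the early-exit `done` flag is the one from the improved algorithm above.

```java
class InsertionSortDemo {
    // Sorts data[0..n-1] in ascending order and returns the number of
    // element comparisons made (the counter is an illustrative extra).
    static long insertionSort(int[] data, int n) {
        long comparisons = 0;
        for (int i = 1; i < n; i++) {
            int j = i;
            boolean done = false;
            while (j > 0 && !done) {
                comparisons++;
                if (data[j] < data[j - 1]) {
                    // swap data[j] and data[j-1], then keep moving left
                    int temp = data[j];
                    data[j] = data[j - 1];
                    data[j - 1] = temp;
                    j--;
                } else {
                    done = true; // data[j] is already in its correct place
                }
            }
        }
        return comparisons;
    }

    public static void main(String[] args) {
        int[] sorted = {1, 2, 3, 4, 5, 6};
        int[] reversed = {6, 5, 4, 3, 2, 1};
        System.out.println(insertionSort(sorted, sorted.length));     // 5
        System.out.println(insertionSort(reversed, reversed.length)); // 15
    }
}
```

On already sorted input each element needs a single comparison, giving n-1 in total; on reversed input every element travels to the front, giving n(n-1)/2.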
Running Time for Insertion Sort
- Worst case: O(n^2)
- Average case: O(n^2)
- Best case (if the data is already sorted): O(n)

Recursive Sorting Algorithms
- A faster group of sorting algorithms exists
- This group uses divide-and-conquer techniques based on recursion
- The data is recursively divided into smaller pieces, which are sorted separately and then combined

Mergesort
Repeatedly do the following:
- Divide the array into 2 (approximately) equal-sized subarrays
- Recursively call the same method to sort each of the 2 subarrays
- Merge the 2 sorted subarrays to form one sorted array
Algorithm
    public static void mergesort(int data[], int first, int n)
    {
        int n1, n2;
        if (n > 1)
        {
            n1 = n / 2;
            n2 = n - n1;   // why not n2 = n1?
            mergesort(data, ?1, ?2);
            mergesort(data, ?3, ?4);
            merge(data, first, n1, n2);
        }
    }

Merging the Data
1. Allocate a temp array, and set copied, copied1, and copied2 to zero.
2. while (both halves of the array have more elements to copy):
       if (data[first + copied1] <= data[first + n1 + copied2])
           temp[copied++] = data[first + (copied1++)];
       else
           temp[copied++] = data[first + n1 + (copied2++)];
3. Copy any remaining entries from the left or right subarray
4. Copy the elements from temp back to data
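The four merging steps above can be put together into a runnable sketch. The slides leave the recursive arguments as ?1 to ?4; the values used here (`first, n1` and `first + n1, n2`) are one consistent way to fill them in, and the class name `MergesortDemo` is an illustrative assumption.

```java
import java.util.Arrays;

class MergesortDemo {
    // Sorts the n elements data[first..first+n-1] in ascending order.
    static void mergesort(int[] data, int first, int n) {
        if (n > 1) {
            int n1 = n / 2;   // size of the left half
            int n2 = n - n1;  // size of the right half (one larger when n is odd)
            mergesort(data, first, n1);       // assumed values for ?1, ?2
            mergesort(data, first + n1, n2);  // assumed values for ?3, ?4
            merge(data, first, n1, n2);
        }
    }

    // Merges the sorted runs data[first..first+n1-1] and
    // data[first+n1..first+n1+n2-1] into one sorted run.
    static void merge(int[] data, int first, int n1, int n2) {
        // Step 1: temp array and counters
        int[] temp = new int[n1 + n2];
        int copied = 0, copied1 = 0, copied2 = 0;

        // Step 2: take the smaller front element while both halves have entries
        while (copied1 < n1 && copied2 < n2) {
            if (data[first + copied1] <= data[first + n1 + copied2]) {
                temp[copied++] = data[first + (copied1++)];
            } else {
                temp[copied++] = data[first + n1 + (copied2++)];
            }
        }

        // Step 3: copy whatever remains in the left or the right half
        while (copied1 < n1) {
            temp[copied++] = data[first + (copied1++)];
        }
        while (copied2 < n2) {
            temp[copied++] = data[first + n1 + (copied2++)];
        }

        // Step 4: copy temp back into data
        System.arraycopy(temp, 0, data, first, n1 + n2);
    }

    public static void main(String[] args) {
        int[] data = {38, 27, 43, 3, 9, 82, 10};
        mergesort(data, 0, data.length);
        System.out.println(Arrays.toString(data)); // [3, 9, 10, 27, 38, 43, 82]
    }
}
```

Setting n2 = n - n1 rather than n2 = n1 matters when n is odd: integer division truncates n / 2, so two halves of size n1 would leave the last element out.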
Running Time for Mergesort
- Worst case = Average case = Best case = O(n log n)
- Price paid: extra allocated arrays
- Appropriate for sorting data in files:
  - Divide the file into smaller files
  - The smaller files will eventually fit in an array
  - Sort the array (possibly using other methods)
  - Merge the sorted files
  - Replace the original file with the sorted file

Quicksort
- Pick a value (call it the pivot value) and put it in its correct location in the array
- Put all the values smaller than the pivot value before the pivot in the array
- Put all the values larger than the pivot value after the pivot in the array

Algorithm
    public static void quicksort(int data[], int first, int n)
    {
        int n1, n2, pivotindex;
        if (n > 1)
        {
            pivotindex = partition(data, first, n);
            n1 = pivotindex - first;
            n2 = n - n1 - 1;   // why -1?
            quicksort(data, ?1, ?2);
            quicksort(data, ?3, ?4);
        }
    }
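The outline above relies on a partition method and leaves the recursive arguments as ?1 to ?4. The sketch below is one way to complete it, under two assumptions that are not stated in the slides: the pivot is the first element of the range, and partitioning is done with a single left-to-right scan. The class name `QuicksortDemo` and the `swap` helper are illustrative.

```java
import java.util.Arrays;

class QuicksortDemo {
    static void swap(int[] data, int a, int b) {
        int temp = data[a];
        data[a] = data[b];
        data[b] = temp;
    }

    // Assumed partition scheme: uses data[first] as the pivot, moves every
    // smaller value in front of it, and returns the pivot's final index.
    static int partition(int[] data, int first, int n) {
        int pivot = data[first];
        int boundary = first; // last index holding a value smaller than pivot
        for (int k = first + 1; k < first + n; k++) {
            if (data[k] < pivot) {
                boundary++;
                swap(data, boundary, k);
            }
        }
        swap(data, first, boundary); // put the pivot in its correct location
        return boundary;
    }

    // Sorts the n elements data[first..first+n-1] in ascending order.
    static void quicksort(int[] data, int first, int n) {
        if (n > 1) {
            int pivotIndex = partition(data, first, n);
            int n1 = pivotIndex - first; // values before the pivot
            int n2 = n - n1 - 1;         // values after it; -1 excludes the pivot
            quicksort(data, first, n1);          // assumed values for ?1, ?2
            quicksort(data, pivotIndex + 1, n2); // assumed values for ?3, ?4
        }
    }

    public static void main(String[] args) {
        int[] data = {5, 3, 8, 1, 9, 2};
        quicksort(data, 0, data.length);
        System.out.println(Arrays.toString(data)); // [1, 2, 3, 5, 8, 9]
    }
}
```

Unlike mergesort, all the work happens before the recursive calls and no temporary array is needed; the pivot is already in its final position, which is why it is excluded from both subarrays.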
Partitioning the Data
[Figures: partitioning the data]

Running Time for Quicksort
- Average case = Best case = O(n log n)
- Worst case = O(n^2)
- The worst case occurs, for example, when the array is already sorted and the pivot is always the first (or last) element
- Choice of pivot is critical in improving the worst case
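As an illustration of how much the pivot choice matters, the sketch below compares a first-element pivot with a median-of-three pivot on sorted input. Median-of-three is a common technique brought in here for illustration and is not taken from the slides. The helper `medianOfThreeToFront`, the single-scan partition, and the comparison counter are all assumptions of this sketch, and only comparisons made inside `partition` are counted.

```java
class PivotChoiceDemo {
    static long comparisons; // comparisons made inside partition only

    static void swap(int[] data, int a, int b) {
        int temp = data[a];
        data[a] = data[b];
        data[b] = temp;
    }

    // Hypothetical helper: moves the median of the first, middle, and last
    // values of data[first..first+n-1] into data[first].
    static void medianOfThreeToFront(int[] data, int first, int n) {
        int mid = first + (n - 1) / 2;
        int last = first + n - 1;
        // order the three samples so that data[mid] holds their median
        if (data[mid] < data[first]) swap(data, mid, first);
        if (data[last] < data[first]) swap(data, last, first);
        if (data[last] < data[mid]) swap(data, last, mid);
        swap(data, first, mid);
    }

    // Partitions around data[first] and returns the pivot's final index.
    static int partition(int[] data, int first, int n) {
        int pivot = data[first];
        int boundary = first;
        for (int k = first + 1; k < first + n; k++) {
            comparisons++;
            if (data[k] < pivot) {
                boundary++;
                swap(data, boundary, k);
            }
        }
        swap(data, first, boundary);
        return boundary;
    }

    static void quicksort(int[] data, int first, int n, boolean useMedianOfThree) {
        if (n > 1) {
            if (useMedianOfThree) {
                medianOfThreeToFront(data, first, n);
            }
            int pivotIndex = partition(data, first, n);
            int n1 = pivotIndex - first;
            int n2 = n - n1 - 1;
            quicksort(data, first, n1, useMedianOfThree);
            quicksort(data, pivotIndex + 1, n2, useMedianOfThree);
        }
    }

    // Sorts data and returns the number of comparisons partition made.
    static long countComparisons(int[] data, boolean useMedianOfThree) {
        comparisons = 0;
        quicksort(data, 0, data.length, useMedianOfThree);
        return comparisons;
    }

    public static void main(String[] args) {
        int[] sorted = new int[100];
        for (int i = 0; i < sorted.length; i++) {
            sorted[i] = i;
        }
        System.out.println(countComparisons(sorted.clone(), false)); // 4950
        System.out.println(countComparisons(sorted.clone(), true));
    }
}
```

With the first element as pivot, each partition of sorted input peels off a single value, so 100 values cost 99 + 98 + ... + 1 = 4950 comparisons. With the median-of-three pivot the same input splits roughly in half at every level, and the count is far smaller.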
Lower Bound for Sorting
- Comparison-based sorting algorithms require, in the worst case, at least the following number of comparisons: log_2(N!)
- This is approximately equal to: N log_2 N - 1.44 N
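The approximation can be recovered from Stirling's formula. The short derivation below assumes base-2 logarithms, which is consistent with the constant 1.44 (approximately log_2 e), and drops lower-order terms:

```latex
\begin{align*}
\ln N! &= \sum_{k=1}^{N} \ln k \;\approx\; N \ln N - N \\
\log_2 N! &= \frac{\ln N!}{\ln 2}
  \;\approx\; N \log_2 N - \frac{N}{\ln 2}
  \;=\; N \log_2 N - N \log_2 e
  \;\approx\; N \log_2 N - 1.44\,N
\end{align*}
```

The leading term N log_2 N is why no comparison-based sort can do better than O(n log n) in the worst case, a bound that mergesort meets.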