Divide and Conquer Sorting
CSE 6 Data Structures
Lecture 18

Insertion Sort
- What if the first k elements of the array are already sorted (here k = 3)?
    4, 7, 12, 5, 19, 16
- We can shift the tail of the sorted elements down and then insert the next element into its proper position, and we get k+1 sorted elements.
    4, 5, 7, 12, 19, 16

Divide and Conquer
Very important strategy in computer science:
- Divide the problem into smaller parts
- Independently solve the parts
- Combine these solutions to get the overall solution
Idea 1: Divide the array into two halves, recursively sort the left and right halves, then merge the two halves. Known as Mergesort.
Idea 2: Partition the array into small items and large items, then recursively sort the two sets. Known as Quicksort.

Mergesort
[Figure: an unsorted array split at its midpoint.]
- Divide it in two at the midpoint
- Conquer each side in turn (by recursively sorting)
- Merge the two halves together

Mergesort Example
[Figure: the array is divided repeatedly until each piece has 1 element; the pieces are then merged pairwise, level by level, until the whole array is sorted.]

Auxiliary Array
The merging requires an auxiliary array.
[Figure: two sorted halves and the auxiliary array that receives the merged result.]
[Figure: Auxiliary Array, continued. At each step the smaller front element of the two sorted halves is copied into the auxiliary array.]

Merging
[Figure: three cases are shown: the normal step (pointers first, second, and target), the right half completed first (the rest of the left half is copied), and the left half completed first.]

Merging
    Merge(A[], T[] : integer array, left, right : integer) : {
      mid, i, j, k, l, target : integer;
      mid := (right + left)/2;
      i := left; j := mid + 1; target := left;
      while i <= mid and j <= right do
        if A[i] <= A[j] then T[target] := A[i]; i := i + 1;
        else T[target] := A[j]; j := j + 1;
        target := target + 1;
      if i > mid then //left completed//
        for k := left to target-1 do A[k] := T[k];
      if j > right then //right completed//
        k := mid; l := right;
        while k >= i do A[l] := A[k]; k := k-1; l := l-1;
        for k := left to target-1 do A[k] := T[k];
    }

Recursive Mergesort
    Mergesort(A[], T[] : integer array, left, right : integer) : {
      if left < right then
        mid := (left + right)/2;
        Mergesort(A,T,left,mid);
        Mergesort(A,T,mid+1,right);
        Merge(A,T,left,right);
    }

    MainMergesort(A[1..n]: integer array, n : integer) : {
      T[1..n]: integer array;
      Mergesort(A,T,1,n);
    }
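The recursive pseudocode above can be sketched in Python as follows. This is a minimal illustration, not the lecture's own code: it uses 0-based inclusive indices, passes `mid` to `merge` explicitly, and copies the leftover run through the auxiliary array instead of shifting it in place. The function names are chosen here for illustration.

```python
def merge(a, t, left, mid, right):
    """Merge sorted runs a[left..mid] and a[mid+1..right] (inclusive) using t."""
    i, j, target = left, mid + 1, left
    while i <= mid and j <= right:
        if a[i] <= a[j]:           # <= takes ties from the left run (stable)
            t[target] = a[i]
            i += 1
        else:
            t[target] = a[j]
            j += 1
        target += 1
    while i <= mid:                # right run completed first: copy rest of left
        t[target] = a[i]
        i += 1
        target += 1
    while j <= right:              # left run completed first: copy rest of right
        t[target] = a[j]
        j += 1
        target += 1
    a[left:right + 1] = t[left:right + 1]   # copy the merged run back into a


def mergesort(a, t, left, right):
    """Sort a[left..right] in place, using t as scratch space."""
    if left < right:
        mid = (left + right) // 2
        mergesort(a, t, left, mid)
        mergesort(a, t, mid + 1, right)
        merge(a, t, left, mid, right)


def main_mergesort(a):
    """Allocate the auxiliary array once and sort the whole list."""
    t = [None] * len(a)
    mergesort(a, t, 0, len(a) - 1)
    return a
```

Allocating `t` once in `main_mergesort`, as the pseudocode does, avoids creating a new temporary array in every recursive call.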
Iterative Mergesort
[Figure: successive passes over the array: merge by 1, merge by 2, merge by 4, merge by 8. For a longer array a further pass merges by 16, followed by a copy back.]

Iterative pseudocode
    Sort(array A of length N)
      Let m = 2, let B be temp array of length N
      While m/2 < N
        For i = 1 ... N in increments of m
          merge A[i ... i+m/2-1] and A[i+m/2 ... i+m-1] into B[i ... i+m-1]
          (ranges are clipped at N)
        Swap role of A and B
        m = m*2
      If needed, copy B back to A

Mergesort Analysis
- Let T(N) be the running time for an array of N elements
- Mergesort divides the array in half and calls itself on the two halves. After returning, it merges both halves using a temporary array
- Each recursive call takes T(N/2) and merging takes O(N)

Mergesort Recurrence Relation
The recurrence relation for T(N) is:
- $T(1) \le c$
  base case: a 1-element array takes constant time
- $T(N) \le 2T(N/2) + dN$
  sorting N elements takes the time to sort the left half, plus the time to sort the right half, plus O(N) time to merge the two halves
- $T(N) = O(N \log N)$

Properties of Mergesort
- Not in-place: requires an auxiliary array
- Very few comparisons
- Iterative Mergesort reduces copying
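The iterative pseudocode earlier in this section can be sketched as a bottom-up mergesort. This is a sketch under stated assumptions: `width` is the length of the runs that are already sorted (half of the pseudocode's m), ranges are half-open and clipped at the end of the list, and the function returns a new list instead of sorting in place. The names `merge_runs` and `iterative_mergesort` are illustrative.

```python
def merge_runs(src, dst, lo, mid, hi):
    """Merge sorted src[lo:mid] and src[mid:hi] into dst[lo:hi] (half-open ranges)."""
    i, j = lo, mid
    for k in range(lo, hi):
        # Take from the left run while it has elements and its head is not larger.
        if i < mid and (j >= hi or src[i] <= src[j]):
            dst[k] = src[i]
            i += 1
        else:
            dst[k] = src[j]
            j += 1


def iterative_mergesort(a):
    """Return a sorted copy of a, merging runs of width 1, 2, 4, ..."""
    n = len(a)
    src, dst = list(a), [None] * n
    width = 1                                  # length of the already-sorted runs
    while width < n:
        for lo in range(0, n, 2 * width):
            mid = min(lo + width, n)           # clip at the end of the list
            hi = min(lo + 2 * width, n)
            merge_runs(src, dst, lo, mid, hi)
        src, dst = dst, src                    # swap roles instead of copying back
        width *= 2
    return src
```

Swapping the roles of the two arrays after each pass is what reduces copying: each element is moved once per pass, and no separate copy-back step is needed between passes.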
Quicksort
- Quicksort uses a divide and conquer strategy, but does not require the O(N) extra space that MergeSort does
- Partition the array into left and right sub-arrays
  - the elements in the left sub-array are all less than the pivot
  - the elements in the right sub-array are all greater than the pivot
- Recursively sort the left and right sub-arrays
- Concatenate the left and right sub-arrays in O(1) time

Four easy steps
To sort an array S:
1. If the number of elements in S is 0 or 1, then return. The array is sorted.
2. Pick an element v in S. This is the pivot value.
3. Partition $S - \{v\}$ into two disjoint subsets, $S_1 = \{\text{all values } x \le v\}$ and $S_2 = \{\text{all values } x \ge v\}$.
4. Return QuickSort($S_1$), v, QuickSort($S_2$)

The steps of QuickSort
[Figure, from Weiss: select a pivot value from S; partition S into S1 and S2; QuickSort(S1) and QuickSort(S2); presto, S is sorted.]

Details, details
The algorithm so far lacks quite a few of the details:
- Picking the pivot: want a value that will cause |S1| and |S2| to be non-zero, and close to equal in size if possible
- Implementing the actual partitioning
- Dealing with cases where the element equals the pivot

Alternative Pivot Rules
- Choose A[left]
  - Fast, but too biased, enables the worst case
- Choose A[random], left <= random <= right
  - Completely unbiased
  - Will cause a relatively even split, but slow
- Median of three: A[left], A[right], A[(left+right)/2]
  - The standard; tends to be unbiased, and does a little sorting on the side

Quicksort Partitioning
- Need to partition the array into left and right sub-arrays
  - the elements in the left sub-array are <= pivot
  - the elements in the right sub-array are >= pivot
- How do the elements get to the correct partition?
  - Choose an element from the array as the pivot
  - Make one pass through the rest of the array and swap as needed to put elements in partitions
Example
Partitioning is done in-place.
[Figure: a 10-element array with indices 0 through 9, before and after the median-of-three step.]
Choose the pivot as the median of three. Place the pivot and the largest at the right and the smallest at the left.

One implementation (there are others):
- median (of three) finds the pivot and sorts left, center, right
- Swap the pivot with the next to last element
- Set pointers i and j to the start and end of the array
- Increment i until you hit an element A[i] > pivot
- Decrement j until you hit an element A[j] < pivot
- Swap A[i] and A[j]
- Repeat until i and j cross
- Swap the pivot (= A[N-2]) with A[i]

Example
[Figure: i moves to the right until it finds an element larger than the pivot; j moves to the left until it finds an element smaller than the pivot; the two are swapped.]

Example
[Figure: the scan continues until i and j cross, then the pivot is swapped into place, leaving S1 < pivot, the pivot, and S2 > pivot.]

Recursive Quicksort
    Quicksort(A[]: integer array, left, right : integer): {
      pivotindex : integer;
      if left + CUTOFF <= right then
        pivot := median(A,left,right);
        pivotindex := Partition(A,left,right-1,pivot);
        Quicksort(A, left, pivotindex - 1);
        Quicksort(A, pivotindex + 1, right);
      else
        Insertionsort(A,left,right);
    }
Don't use quicksort for small arrays. CUTOFF = 10 is reasonable.

Quicksort Best Case Performance
- The algorithm always chooses the best pivot and splits the sub-arrays in half at each recursion
  - T(0) = T(1) = O(1): constant time if 0 or 1 element
  - For N > 1, 2 recursive calls plus linear time for partitioning
  - $T(N) = 2T(N/2) + O(N)$: same recurrence relation as Mergesort
  - $T(N) = O(N \log N)$
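The recursive Quicksort pseudocode above can be sketched in Python as follows. This is a minimal sketch, not the lecture's exact code: indices are 0-based and inclusive, `partition` receives `right` instead of `right-1`, and both scans stop on elements equal to the pivot (an assumption made here so that the values left at `a[left]` and `a[right - 1]` by the median-of-three step act as sentinels). The helper names are illustrative.

```python
CUTOFF = 10  # below this size, fall back to insertion sort (value suggested in the text)


def insertion_sort(a, left, right):
    """Sort a[left..right] (inclusive) in place."""
    for k in range(left + 1, right + 1):
        item = a[k]
        p = k
        while p > left and a[p - 1] > item:
            a[p] = a[p - 1]        # shift the tail of the sorted prefix down
            p -= 1
        a[p] = item


def median3(a, left, right):
    """Sort a[left], a[center], a[right]; park the median at a[right-1] and return it."""
    center = (left + right) // 2
    if a[center] < a[left]:
        a[left], a[center] = a[center], a[left]
    if a[right] < a[left]:
        a[left], a[right] = a[right], a[left]
    if a[right] < a[center]:
        a[center], a[right] = a[right], a[center]
    a[center], a[right - 1] = a[right - 1], a[center]
    return a[right - 1]


def partition(a, left, right, pivot):
    """Partition a[left..right] around the pivot parked at a[right-1]; return its final index."""
    i, j = left, right - 1
    while True:
        i += 1
        while a[i] < pivot:        # stops at the pivot itself at the latest
            i += 1
        j -= 1
        while a[j] > pivot:        # stops at a[left] at the latest
            j -= 1
        if i >= j:                 # pointers crossed
            break
        a[i], a[j] = a[j], a[i]
    a[i], a[right - 1] = a[right - 1], a[i]   # move the pivot to its final position
    return i


def quicksort(a, left=0, right=None):
    """Sort a[left..right] in place and return a."""
    if right is None:
        right = len(a) - 1
    if left + CUTOFF <= right:
        pivot = median3(a, left, right)
        pivot_index = partition(a, left, right, pivot)
        quicksort(a, left, pivot_index - 1)
        quicksort(a, pivot_index + 1, right)
    else:
        insertion_sort(a, left, right)
    return a
```

Sub-arrays with fewer than `CUTOFF + 1` elements are handed to `insertion_sort`, so the partitioning code never runs on a range too short for the median-of-three sentinels to be valid.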
Quicksort Worst Case Performance
- The algorithm always chooses the worst pivot: one sub-array is empty at each recursion
  - $T(N) \le a$ for $N \le C$
  - $T(N) \le T(N-1) + bN$
  - $\le T(N-2) + b(N-1) + bN$
  - $\le T(C) + b(C+1) + \cdots + bN$
  - $\le a + b\,(C + (C+1) + (C+2) + \cdots + N)$
  - $T(N) = O(N^2)$
- Fortunately, average case performance is O(N log N) (see text for proof)

Properties of Quicksort
- No iterative version (without using a stack).
- Pure quicksort is not good for small arrays.
- "In-place", but uses auxiliary storage because of recursive calls.
- O(n log n) average case performance, but O(n^2) worst case performance.

Folklore
- "Quicksort is the best in-memory sorting algorithm."
- Mergesort and Quicksort make different tradeoffs regarding the cost of a comparison and the cost of a swap

Features of Sorting Algorithms
- In-place
  - Sorted items occupy the same space as the original items. (No copying required, only O(1) extra space if any.)
- Stable
  - Items in the input with the same value end up in the same order as when they began.

How fast can we sort?
- Heapsort, Mergesort, and Quicksort all run in O(N log N) best case running time
- Can we do any better?
- No, if the basic action is a comparison.

Sorting Model
- Recall our basic assumption: we can only compare two elements at a time
  - we can only reduce the possible solution space by half each time we make a comparison
- Suppose you are given N elements
  - Assume no duplicates
- How many possible orderings can you get?
  - Example: a, b, c (N = 3)
Permutations
- How many possible orderings can you get?
  - Example: a, b, c (N = 3)
  - (a b c), (a c b), (b a c), (b c a), (c a b), (c b a)
  - 6 orderings = 3 · 2 · 1 = 3! (i.e., "3 factorial")
  - All the possible permutations of a set of 3 elements
- For N elements
  - N choices for the first position, (N-1) choices for the second position, ..., (2) choices, 1 choice
  - $N(N-1)(N-2)\cdots(2)(1) = N!$ possible orderings

Decision Tree
[Figure: decision tree for a, b, c. The root holds all six orderings; each edge is one comparison (a < b or a > b, then a < c or a > c, b < c or b > c, c < a or c > a); each leaf holds a single ordering.]
The leaves contain all the possible orderings of a, b, c.

Decision Trees
- A Decision Tree is a Binary Tree such that:
  - Each node = a set of orderings, i.e., the remaining solution space
  - Each edge = 1 comparison
  - Each leaf = 1 unique ordering
  - How many leaves for N distinct elements? N!, i.e., a leaf for each possible ordering
- Only 1 leaf has the ordering that is the desired correctly sorted arrangement

Decision Tree Example
[Figure: the same decision tree, with the path to the actual order highlighted and the set of possible orders shown shrinking at each node along the way.]

Decision Trees and Sorting
- Every sorting algorithm corresponds to a decision tree
  - Finds the correct leaf by choosing edges to follow, i.e., by making comparisons
  - Each decision reduces the possible solution space by one half
- Run time is >= the maximum number of comparisons
  - the maximum number of comparisons is the length of the longest path in the decision tree, i.e. the height of the tree

Lower bound on Height
- A binary tree of height h has at most how many leaves? $L \le 2^h$
- The decision tree has how many leaves: $L = N!$
- A binary tree with L leaves has height at least: $h \ge \log_2 L$
- So the decision tree has height: $h \ge \log_2(N!)$
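As a small numeric illustration of the height bound, the sketch below computes the smallest integer h with 2^h >= N!, that is, the ceiling of log2(N!), and prints it next to N log2 N for a few values of N. The function name is hypothetical, and the calculation uses exact integer arithmetic to avoid floating-point rounding.

```python
import math


def min_worst_case_comparisons(n):
    """Return ceil(log2(n!)): the least possible height of a decision tree with n! leaves."""
    leaves = math.factorial(n)
    # For an integer x >= 1, (x - 1).bit_length() equals ceil(log2(x)) exactly.
    return (leaves - 1).bit_length()


for n in (3, 4, 8, 16):
    # Columns: N, lower bound on worst-case comparisons, N * log2(N) for reference.
    print(n, min_worst_case_comparisons(n), round(n * math.log2(n), 1))
```

For N = 3 this gives 3: no comparison-based algorithm can sort every arrangement of three distinct elements with only two comparisons, because a tree of height 2 has at most 4 leaves and 6 are needed.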
log(N!) is Ω(N log N)
$\log(N!) = \log\big(N(N-1)(N-2)\cdots(2)(1)\big)$
$= \log N + \log(N-1) + \log(N-2) + \cdots + \log 2 + \log 1$
$\ge \log N + \log(N-1) + \log(N-2) + \cdots + \log\frac{N}{2}$  (select just the first N/2 terms)
$\ge \frac{N}{2}\log\frac{N}{2}$  (each of the selected terms is $\ge \log\frac{N}{2}$)
$= \frac{N}{2}(\log N - \log 2) = \frac{N}{2}\log N - \frac{N}{2}$
$= \Omega(N \log N)$

Ω(N log N)
- Run time of any comparison-based sorting algorithm is Ω(N log N)
- Can we do better if we don't use comparisons?

BucketSort (aka BinSort)
If all values to be sorted are known to be between 1 and K, create an array count of size K, increment counts while traversing the input, and finally output the result.
Example: K = 5. Input = (5,1,3,4,3,2,1,1,5,4,5)
[Figure: the count array, with one cell for each value 1 through 5.]
Running time to sort n items?

BucketSort Complexity: O(n+K)
- Case 1: K is a constant
  - BinSort is linear time
- Case 2: K is variable
  - Not simply linear time
- Case 3: K is constant but large
  - ???

Fixing impracticality: RadixSort
- Radix = "the base of a number system"
  - We'll use 10 for convenience, but it could be anything
- Idea: BucketSort on each digit, least significant to most significant (lsd to msd)

Radix Sort Example (1st pass)
[Figure: the input data is distributed into buckets 0 through 9 by 1's digit, then read back in bucket order to give the list after the 1st pass.]
This example uses B=10 and base 10 digits for simplicity of demonstration. Larger bucket counts should be used in an actual implementation.
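The BucketSort counting scheme described earlier in this section can be sketched as follows. This is a minimal sketch assuming plain integer keys in the range 1..K with no attached records; the function name and the range check are illustrative additions.

```python
def bucket_sort(values, k):
    """Sort integers known to lie in 1..k using a count array."""
    count = [0] * (k + 1)            # index 0 is unused so that count[v] matches value v
    for v in values:                 # one pass over the input: O(n)
        if not 1 <= v <= k:
            raise ValueError(f"value {v} is outside 1..{k}")
        count[v] += 1
    result = []
    for v in range(1, k + 1):        # one pass over the counts: O(K)
        result.extend([v] * count[v])
    return result


print(bucket_sort([5, 1, 3, 4, 3, 2, 1, 1, 5, 4, 5], 5))
```

The two loops make the O(n+K) cost visible: the first touches every input item once, the second touches every count cell once. When K is far larger than n, the second loop dominates.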
Radix Sort Example (2nd pass)
[Figure: the list after the 1st pass is distributed into buckets 0 through 9 by 10's digit, then read back in bucket order to give the list after the 2nd pass.]

Radix Sort Example (3rd pass)
[Figure: the list after the 2nd pass is distributed into buckets 0 through 9 by 100's digit, then read back in bucket order to give the list after the 3rd pass.]
Invariant: after k passes the low order k digits are sorted.

Your Turn: RadixSort
Input: 16, 8, 66, 41, 416, 11, 8
- BucketSort on lsd: (buckets 0 through 9)
- BucketSort on next-higher digit: (buckets 0 through 9)
- BucketSort on msd: (buckets 0 through 9)

Radixsort: Complexity
- How many passes?
- How much work per pass?
- Total time?
- Conclusion?
- In practice
  - RadixSort is only good for a large number of elements with relatively small values
  - Hard on the cache compared to MergeSort/QuickSort

Summary of sorting
Sorting choices:
- O(N^2): Bubblesort, Insertion Sort
- O(N log N) average case running time:
  - Heapsort: in-place, not stable.
  - Mergesort: O(N) extra space, stable.
  - Quicksort: claimed fastest in practice, but O(N^2) worst case. Needs extra storage for recursion. Not stable.
- O(N): Radix Sort: fast and stable. Not comparison based. Not in-place.
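The RadixSort idea (a stable bucket pass on each digit, least significant first) can be sketched as follows. This is a sketch assuming non-negative integer keys; the function name, the list-of-lists buckets, and the sample input are illustrative choices, not taken from the lecture.

```python
def radix_sort(values, base=10):
    """LSD radix sort for non-negative integers; returns a new sorted list."""
    result = list(values)
    if not result:
        return result
    largest = max(result)
    place = 1                                    # 1's digit, then base's digit, ...
    while place <= largest:
        buckets = [[] for _ in range(base)]
        for v in result:
            # Appending keeps items in arrival order, so each pass is stable.
            buckets[(v // place) % base].append(v)
        # Read the buckets back in order 0 .. base-1.
        result = [v for bucket in buckets for v in bucket]
        place *= base
    return result


print(radix_sort([478, 537, 9, 721, 3, 38, 123, 67]))
```

Stability of each pass is what maintains the invariant that after k passes the low order k digits are sorted: items that tie on the current digit keep the order established by the earlier passes. The number of passes is the number of digits of the largest key in the chosen base, and each pass costs O(n + base).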