Sorting Review. Sorting. Comparison Sorting. CSE 680 Prof. Roger Crawfis. Assumptions


Sorting Review
Introduction to Algorithms: Quicksort
CSE 680, Prof. Roger Crawfis

    Insertion Sort                  T(n) = Θ(n^2)     In-place
    Merge Sort                      T(n) = Θ(n lg n)  Not in-place
    Selection Sort (from homework)  T(n) = Θ(n^2)     In-place
    Heap Sort                       T(n) = Θ(n lg n)  In-place

Seems pretty good. Can we do better?

Sorting Assumptions
1. No knowledge of the keys or numbers we are sorting on.
2. Each key supports a comparison interface or operator.
3. Sorting entire records, as opposed to numbers, is an implementation detail.
4. Each key is unique (just for convenience).

Comparison Sorting
Given a set of n values, there can be n! permutations of these values. So if we look at the behavior of the sorting algorithm over all possible n! inputs, we can determine the worst-case complexity of the algorithm.

Decision Tree
The decision tree model uses a full binary tree. A full binary tree (sometimes proper binary tree or 2-tree) is a tree in which every node other than the leaves has two children.
Each internal node represents a comparison; we ignore control, data movement, and all other operations and just count comparisons.
Each leaf represents one possible result (a permutation of the elements in sorted order).
The height of the tree (i.e., the longest path) is the lower bound.

Decision Tree Model

                     1:2
               <=   /   \   >
                 2:3     1:3
             <= /  \ >  <= /  \ >
        <1,2,3>   1:3  <2,1,3>  2:3
             <= /  \ >     <= /  \ >
        <1,3,2> <3,1,2> <2,3,1> <3,2,1>

An internal node i:j indicates a comparison between a_i and a_j. Suppose three elements <a1, a2, a3> with instance <6,8,5>. A leaf node <π(1), π(2), π(3)> indicates the ordering a_π(1) <= a_π(2) <= a_π(3). The sorting path for <6,8,5> is 1:2 (6 <= 8), then 2:3 (8 > 5), then 1:3 (6 > 5), ending at the leaf <3,1,2>. There are in total 3! = 6 possible permutations (paths).

Decision Tree Model
The longest path is the worst-case number of comparisons; the length of the longest path is the height of the decision tree.

Theorem 8.1: Any comparison sort algorithm requires Ω(n lg n) comparisons in the worst case.

Proof: Suppose the height of a decision tree is h, and the number of leaves (i.e., permutations) is n!. Since a binary tree of height h has at most 2^h leaves, n! <= 2^h, so h >= lg(n!) = Ω(n lg n) (by equation 3.18). That is to say, any comparison sort in the worst case needs on the order of at least n lg n comparisons.

QuickSort Design
Follows the divide-and-conquer paradigm.
Divide: Partition (separate) the array A[p..r] into two (possibly empty) subarrays A[p..q-1] and A[q+1..r], such that each element in A[p..q-1] is <= A[q], and A[q] is <= each element in A[q+1..r]. The index q is computed as part of the partitioning procedure.
Conquer: Sort the two subarrays by recursive calls to quicksort.
Combine: The subarrays are sorted in place, so no work is needed to combine them.
How do the divide and combine steps of quicksort compare with those of merge sort?
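The inequality h >= lg(n!) in the proof can be checked numerically. A small sketch (ours, not from the slides; the class and method names are arbitrary) comparing lg(n!), the minimum decision-tree height, against n lg n for a few values of n:

```java
public class ComparisonLowerBound {
    // lg(n!) computed as a sum of lg(k), avoiding overflow of n! itself.
    static double lgFactorial(int n) {
        double s = 0.0;
        for (int k = 2; k <= n; k++) s += Math.log(k) / Math.log(2);
        return s;
    }

    public static void main(String[] args) {
        for (int n = 2; n <= 64; n *= 2) {
            double lower = lgFactorial(n);                 // minimum tree height
            double nLgN = n * Math.log(n) / Math.log(2);   // the Ω(n lg n) bound
            System.out.printf("n=%2d  lg(n!)=%7.1f  n lg n=%7.1f%n", n, lower, nLgN);
        }
    }
}
```

Note that lg(n!) <= n lg n always holds (since n! <= n^n), while lg(n!) >= (n/2) lg(n/2), which is why the lower bound is Θ(n lg n) and not merely O(n lg n).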

Pseudocode

    Quicksort(A, p, r)
        if p < r then
            q := Partition(A, p, r);
            Quicksort(A, p, q - 1);
            Quicksort(A, q + 1, r)

    Partition(A, p, r)
        x, i := A[r], p - 1;
        for j := p to r - 1 do
            if A[j] <= x then
                i := i + 1;
                A[i] <-> A[j]
        A[i + 1] <-> A[r];
        return i + 1

Example

    note: pivot x = A[r] = 6
    initially:        2 5 8 3 9 4 1 7 10 6
    next iteration:   2 5 8 3 9 4 1 7 10 6
    next iteration:   2 5 8 3 9 4 1 7 10 6
    next iteration:   2 5 8 3 9 4 1 7 10 6
    next iteration:   2 5 3 8 9 4 1 7 10 6
    next iteration:   2 5 3 8 9 4 1 7 10 6
    next iteration:   2 5 3 4 9 8 1 7 10 6
    next iteration:   2 5 3 4 1 8 9 7 10 6
    next iteration:   2 5 3 4 1 8 9 7 10 6
    next iteration:   2 5 3 4 1 8 9 7 10 6
    after final swap: 2 5 3 4 1 6 9 7 10 8

Partitioning
Select the last element A[r] in the subarray A[p..r] as the pivot, the element around which to partition.
As the procedure executes, the array is partitioned into four (possibly empty) regions:
1. A[p..i]: all entries in this region are <= pivot.
2. A[i+1..j-1]: all entries in this region are > pivot.
3. A[r] = pivot.
4. A[j..r-1]: not yet known how they compare to the pivot.
Conditions 1-3 hold before each iteration of the for loop and constitute a loop invariant. (4 is not part of the loop invariant.)
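The pseudocode above translates directly to Java. This is a sketch of the slides' Lomuto-style Partition (the class name is ours); running it on the slide's example array reproduces the trace:

```java
import java.util.Arrays;

public class LomutoPartition {
    // Partition a[p..r] around the pivot x = a[r]; returns the pivot's final index.
    static int partition(int[] a, int p, int r) {
        int x = a[r];               // pivot: last element of the subarray
        int i = p - 1;              // boundary of the "<= pivot" region
        for (int j = p; j <= r - 1; j++) {
            if (a[j] <= x) {
                i++;
                int t = a[i]; a[i] = a[j]; a[j] = t;     // A[i] <-> A[j]
            }
        }
        int t = a[i + 1]; a[i + 1] = a[r]; a[r] = t;     // A[i+1] <-> A[r]
        return i + 1;
    }

    public static void main(String[] args) {
        int[] a = {2, 5, 8, 3, 9, 4, 1, 7, 10, 6};       // the slide's example
        int q = partition(a, 0, a.length - 1);
        // pivot 6 ends at index 5: [2, 5, 3, 4, 1, 6, 9, 7, 10, 8]
        System.out.println(q + " " + Arrays.toString(a));
    }
}
```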

Correctness of Partition
We use the loop invariant.

Initialization: Before the first iteration, A[p..i] and A[i+1..j-1] are empty, so conditions 1 and 2 are satisfied (trivially). r is the index of the pivot, so condition 3 is satisfied.

Maintenance:
Case 1: A[j] > x. Increment j only; the loop invariant is maintained.
Case 2: A[j] <= x. Increment i, swap A[i] and A[j], then increment j. Condition 1 is maintained (the element swapped into A[i] is <= x), condition 2 is maintained (the element swapped into A[j] was > x), and A[r] is unaltered, so condition 3 is maintained.

Termination: When the loop terminates, j = r, so all elements in A are partitioned into one of the three cases: A[p..i] <= pivot, A[i+1..j-1] > pivot, and A[r] = pivot. The last two lines swap A[i+1] and A[r], so the pivot moves from the end of the array to between the two subarrays. Thus, procedure Partition correctly performs the divide step.
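The three invariant conditions can also be checked mechanically. A sketch (ours, not from the slides) that asserts conditions 1-3 at the top of every loop iteration when assertions are enabled (java -ea):

```java
public class PartitionInvariant {
    static int partition(int[] a, int p, int r) {
        int x = a[r], i = p - 1;
        for (int j = p; j <= r - 1; j++) {
            // Loop invariant, checked before each iteration:
            for (int k = p; k <= i; k++) assert a[k] <= x;        // 1. A[p..i] <= pivot
            for (int k = i + 1; k <= j - 1; k++) assert a[k] > x; // 2. A[i+1..j-1] > pivot
            assert a[r] == x;                                     // 3. A[r] = pivot
            if (a[j] <= x) {
                i++;
                int t = a[i]; a[i] = a[j]; a[j] = t;
            }
        }
        int t = a[i + 1]; a[i + 1] = a[r]; a[r] = t;
        return i + 1;
    }

    public static void main(String[] args) {
        int[] a = {2, 5, 8, 3, 9, 4, 1, 7, 10, 6};
        System.out.println(partition(a, 0, a.length - 1));  // prints 5
    }
}
```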

Complexity of Partition
PartitionTime(n) is given by the number of iterations in the for loop: Θ(n), where n = r - p + 1.

Quicksort Overview
To sort a[left...right]:
1. If left < right:
   1.1. Partition a[left...right] such that all of a[left...p-1] are less than a[p], and all of a[p+1...right] are >= a[p].
   1.2. Quicksort a[left...p-1].
   1.3. Quicksort a[p+1...right].
2. Terminate.

Partitioning in Quicksort
A key step in the quicksort algorithm is partitioning the array. We choose some (any) number p in the array to use as a pivot, and partition the array into three parts: numbers less than p, p itself, and numbers greater than or equal to p.

Alternative Partitioning
Choose an array value (say, the first) to use as the pivot.
Starting from the left end, find the first element that is greater than or equal to the pivot.
Searching backward from the right end, find the first element that is less than the pivot.
Interchange (swap) these two elements.
Repeat, searching from where we left off, until done.

Alternative Partitioning
To partition a[left...right]:
1. Set pivot = a[left], l = left + 1, r = right;
2. While l < r, do
   2.1. while l < right and a[l] < pivot, set l = l + 1
   2.2. while r > left and a[r] >= pivot, set r = r - 1
   2.3. if l < r, swap a[l] and a[r]
3. Set a[left] = a[r], a[r] = pivot
4. Terminate

Example of partitioning

    choose pivot:     4 3 6 9 2 4 3 1 2 1 8 9 3 5 6
    search:           4 3 6 9 2 4 3 1 2 1 8 9 3 5 6
    swap:             4 3 3 9 2 4 3 1 2 1 8 9 6 5 6
    search:           4 3 3 9 2 4 3 1 2 1 8 9 6 5 6
    swap:             4 3 3 1 2 4 3 1 2 9 8 9 6 5 6
    search:           4 3 3 1 2 4 3 1 2 9 8 9 6 5 6
    swap:             4 3 3 1 2 2 3 1 4 9 8 9 6 5 6
    search:           4 3 3 1 2 2 3 1 4 9 8 9 6 5 6
    swap with pivot:  1 3 3 1 2 2 3 4 4 9 8 9 6 5 6

Partition Implementation (Java)

    static int Partition(int[] a, int left, int right) {
        int p = a[left], l = left + 1, r = right;
        while (l < r) {
            while (l < right && a[l] < p) l++;
            while (r > left && a[r] >= p) r--;
            if (l < r) {
                int temp = a[l];
                a[l] = a[r];
                a[r] = temp;
            }
        }
        a[left] = a[r];
        a[r] = p;
        return r;
    }

Quicksort Implementation (Java)

    static void Quicksort(int[] array, int left, int right) {
        if (left < right) {
            int p = Partition(array, left, right);
            Quicksort(array, left, p - 1);
            Quicksort(array, p + 1, right);
        }
    }

Analysis of Quicksort: Best Case
Suppose each partition operation divides the array almost exactly in half. Then the depth of the recursion is log2 n, because that is how many times we can halve n. We note that each partition is linear over its subarray, and all the partitions at one level cover the array.

Best-Case Analysis
We cut the array size in half each time, so the depth of the recursion is log2 n. At each level of the recursion, all the partitions at that level do work that is linear in n:

    O(log2 n) * O(n) = O(n log2 n)

Hence in the best case, quicksort has time complexity O(n log2 n). What about the worst case?

Worst Case
In the worst case, partitioning always divides the size-n array into these three parts: a length-one part containing the pivot itself, a length-zero part, and a length n-1 part containing everything else. We don't recurse on the zero-length part. Recursing on the length n-1 part requires (in the worst case) recursing to depth n-1.
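The best-case argument corresponds to the standard divide-and-conquer recurrence (not written out on the slides): Θ(n) partitioning work at each of the log2 n levels,

```latex
T(n) = 2\,T(n/2) + \Theta(n) \;\Longrightarrow\; T(n) = \Theta(n \log n),
```

which follows from case 2 of the Master Theorem.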

Worst Case for Quicksort
In the worst case, recursion may be n levels deep (for an array of size n), but the partitioning work done at each level is still n:

    O(n) * O(n) = O(n^2)

So the worst case for quicksort is O(n^2). When does this happen? There are many arrangements that could make this happen; here are two common cases: when the array is already sorted, and when the array is inversely sorted (sorted in the opposite order).

Typical Case for Quicksort
If the array is sorted to begin with, quicksort is terrible: O(n^2). It is possible to construct other bad cases. However, quicksort is usually O(n log2 n). The constants are so good that quicksort is generally the faster algorithm; most real-world sorting is done by quicksort.

Picking a Better Pivot
Before, we picked the first element of the subarray to use as a pivot. If the array is already sorted, this results in O(n^2) behavior, and it is no better if we pick the last element. We could do an optimal quicksort (guaranteed O(n log n)) if we always picked a pivot value that exactly cuts the array in half. Such a value is called a median: half of the values in the array are larger, half are smaller. The easiest way to find the median is to sort the array and pick the value in the middle (!)
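The worst-case behavior corresponds to the recurrence (standard, not written out on the slides): each call strips off one element and still pays linear partitioning cost,

```latex
T(n) = T(n-1) + \Theta(n) = \Theta\!\left(\sum_{k=1}^{n} k\right) = \Theta(n^2).
```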

Median of Three
Obviously, it doesn't make sense to sort the array in order to find the median to use as a pivot. Instead, compare just three elements of our (sub)array: the first, the last, and the middle. Take the median (middle value) of these three as the pivot. It is possible (but not easy) to construct cases which will make this technique O(n^2).

Quicksort for Small Arrays
For very small arrays (N <= 20), quicksort does not perform as well as insertion sort. A good cutoff range is N = 10. Switching to insertion sort for small arrays can save about 15% in the running time.

Mergesort vs Quicksort
Both run in O(n lg n). Compared with quicksort, mergesort does fewer comparisons but moves elements more. In Java, an element comparison is expensive but moving elements is cheap; therefore, mergesort is used in the standard Java library for generic sorting.
In C++, copying objects can be expensive while comparing objects often is relatively cheap; therefore, quicksort is the sorting routine commonly used in C++ libraries.
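The median-of-three selection described above might be coded as follows. This is a sketch under our own naming (the slides do not give code for it):

```java
public class MedianOfThree {
    // Return the index of the median of a[lo], a[mid], a[hi],
    // where mid is the middle position of the subarray.
    static int medianOfThree(int[] a, int lo, int hi) {
        int mid = lo + (hi - lo) / 2;
        if (a[lo] > a[mid]) {
            if (a[mid] > a[hi]) return mid;      // a[lo] > a[mid] > a[hi]
            return (a[lo] > a[hi]) ? hi : lo;    // a[mid] is smallest
        } else {
            if (a[lo] > a[hi]) return lo;        // a[mid] >= a[lo] > a[hi]
            return (a[mid] > a[hi]) ? hi : mid;  // a[lo] is smallest
        }
    }

    public static void main(String[] args) {
        int[] a = {4, 3, 6, 9, 2, 4, 3, 1, 2, 1, 8, 9, 3, 5, 6};
        // first = 4, middle = 1, last = 6 -> median is 4, at index 0
        System.out.println(medianOfThree(a, 0, a.length - 1));
    }
}
```

The chosen index would then be swapped into the pivot position (first or last element) before partitioning, so either partitioning scheme above can use it unchanged.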