Lecture 9: Sorting Algorithms

Similar documents
CS2223: Algorithms Sorting Algorithms, Heap Sort, Linear-time sort, Median and Order Statistics

EECS 2011M: Fundamentals of Data Structures

Sorting Algorithms. For special input, O(n) sorting is possible. Between O(n 2 ) and O(nlogn) E.g., input integer between O(n) and O(n)

Presentation for use with the textbook, Algorithm Design and Applications, by M. T. Goodrich and R. Tamassia, Wiley, Merge Sort & Quick Sort

Sorting. Bubble Sort. Pseudo Code for Bubble Sorting: Sorting is ordering a list of elements.

Comparison Sorts. Chapter 9.4, 12.1, 12.2

4. Sorting and Order-Statistics

RAM with Randomization and Quick Sort

The divide and conquer strategy has three basic parts. For a given problem of size n,

COMP Data Structures

Sorting: Given a list A with n elements possessing a total order, return a list with the same elements in non-decreasing order.

SAMPLE OF THE STUDY MATERIAL PART OF CHAPTER 6. Sorting Algorithms

Plotting run-time graphically. Plotting run-time graphically. CS241 Algorithmics - week 1 review. Prefix Averages - Algorithm #1

Sorting. Sorting. Stable Sorting. In-place Sort. Bubble Sort. Bubble Sort. Selection (Tournament) Heapsort (Smoothsort) Mergesort Quicksort Bogosort

Lecture 2: Getting Started

Problem. Input: An array A = (A[1],..., A[n]) with length n. Output: a permutation A of A, that is sorted: A [i] A [j] for all. 1 i j n.

Data Structures and Algorithms Chapter 4

Sorting. Sorting in Arrays. SelectionSort. SelectionSort. Binary search works great, but how do we create a sorted array in the first place?

Sorting is a problem for which we can prove a non-trivial lower bound.

II (Sorting and) Order Statistics

CS 303 Design and Analysis of Algorithms

CPSC 311 Lecture Notes. Sorting and Order Statistics (Chapters 6-9)

Deterministic and Randomized Quicksort. Andreas Klappenecker

Divide and Conquer CISC4080, Computer Algorithms CIS, Fordham Univ. Instructor: X. Zhang

Computer Science & Engineering 423/823 Design and Analysis of Algorithms

Object-oriented programming. and data-structures CS/ENGRD 2110 SUMMER 2018

Sorting. Task Description. Selection Sort. Should we worry about speed?

Lecture #2. 1 Overview. 2 Worst-Case Analysis vs. Average Case Analysis. 3 Divide-and-Conquer Design Paradigm. 4 Quicksort. 4.

DIVIDE & CONQUER. Problem of size n. Solution to sub problem 1

Divide and Conquer CISC4080, Computer Algorithms CIS, Fordham Univ. Acknowledgement. Outline

CS 5321: Advanced Algorithms Sorting. Acknowledgement. Ali Ebnenasir Department of Computer Science Michigan Technological University

COT 5407: Introduction to Algorithms. Giri Narasimhan. ECS 254A; Phone: x3748

The divide-and-conquer paradigm involves three steps at each level of the recursion: Divide the problem into a number of subproblems.

Data Structures and Algorithms

Bin Sort. Sorting integers in Range [1,...,n] Add all elements to table and then

Sorting algorithms Properties of sorting algorithm 1) Adaptive: speeds up to O(n) when data is nearly sorted 2) Stable: does not change the relative

Sorting Shabsi Walfish NYU - Fundamental Algorithms Summer 2006

Data Structures and Algorithms Week 4

Copyright 2009, Artur Czumaj 1

Sorting and Selection

IS 709/809: Computational Methods in IS Research. Algorithm Analysis (Sorting)

CS Algorithms and Complexity

Deliverables. Quick Sort. Randomized Quick Sort. Median Order statistics. Heap Sort. External Merge Sort

Sorting. There exist sorting algorithms which have shown to be more efficient in practice.

Data Structures. Sorting. Haim Kaplan & Uri Zwick December 2013

CS 310 Advanced Data Structures and Algorithms

Cpt S 122 Data Structures. Sorting

Total Points: 60. Duration: 1hr

SORTING AND SELECTION

The Limits of Sorting Divide-and-Conquer Comparison Sorts II

e-pg PATHSHALA- Computer Science Design and Analysis of Algorithms Module 10

Analysis of Algorithms. Unit 4 - Analysis of well known Algorithms

Module 3: Sorting and Randomized Algorithms

CS 561, Lecture 1. Jared Saia University of New Mexico

Module 3: Sorting and Randomized Algorithms. Selection vs. Sorting. Crucial Subroutines. CS Data Structures and Data Management

We can use a max-heap to sort data.

Sorting (I) Hwansoo Han

Lecture 5: Sorting Part A

COMP2012H Spring 2014 Dekai Wu. Sorting. (more on sorting algorithms: mergesort, quicksort, heapsort)

Programming II (CS300)

7. Sorting I. 7.1 Simple Sorting. Problem. Algorithm: IsSorted(A) 1 i j n. Simple Sorting

1 Probabilistic analysis and randomized algorithms

Algorithms and Data Structures (INF1) Lecture 7/15 Hua Lu

Sorting & Growth of Functions

Module 2: Classical Algorithm Design Techniques

CS125 : Introduction to Computer Science. Lecture Notes #38 and #39 Quicksort. c 2005, 2003, 2002, 2000 Jason Zych

CHAPTER 7 Iris Hui-Ru Jiang Fall 2008

Algorithms and Data Structures for Mathematicians

Sorting. Weiss chapter , 8.6

CSE 143. Two important problems. Searching and Sorting. Review: Linear Search. Review: Binary Search. Example. How Efficient Is Linear Search?

Outline. CS 561, Lecture 6. Priority Queues. Applications of Priority Queue. For NASA, space is still a high priority, Dan Quayle

Computer Science 385 Analysis of Algorithms Siena College Spring Topic Notes: Divide and Conquer

CSE373: Data Structure & Algorithms Lecture 21: More Comparison Sorting. Aaron Bauer Winter 2014

17/05/2018. Outline. Outline. Divide and Conquer. Control Abstraction for Divide &Conquer. Outline. Module 2: Divide and Conquer

Design and Analysis of Algorithms

Divide & Conquer. 2. Conquer the sub-problems by solving them recursively. 1. Divide the problem into number of sub-problems

QuickSort and Others. Data Structures and Algorithms Andrei Bulatov

Run Times. Efficiency Issues. Run Times cont d. More on O( ) notation

BM267 - Introduction to Data Structures

Lecture 19 Sorting Goodrich, Tamassia

Fundamental problem in computing science. putting a collection of items in order. Often used as part of another algorithm

O(n): printing a list of n items to the screen, looking at each item once.

Divide-and-Conquer CSE 680

Sorting Algorithms. + Analysis of the Sorting Algorithms

Classic Data Structures Introduction UNIT I

Giri Narasimhan. COT 5993: Introduction to Algorithms. ECS 389; Phone: x3748

Sorting. Popular algorithms: Many algorithms for sorting in parallel also exist.

Introduction to Algorithms

Algorithms - Ch2 Sorting

Merge Sort

COMP Analysis of Algorithms & Data Structures

Motivation of Sorting

2/26/2016. Divide and Conquer. Chapter 6. The Divide and Conquer Paradigm

CS61BL. Lecture 5: Graphs Sorting

CSE 373: Data Structures and Algorithms

Divide and Conquer Sorting Algorithms and Noncomparison-based

Sorting is ordering a list of objects. Here are some sorting algorithms

Jana Kosecka. Linear Time Sorting, Median, Order Statistics. Many slides here are based on E. Demaine, D. Luebke slides

ECE 2574: Data Structures and Algorithms - Basic Sorting Algorithms. C. L. Wyatt

Divide and Conquer. Algorithm D-and-C(n: input size)

Transcription:

Lecture 9: Sorting Algorithms Bo Tang @ SUSTech, Spring 2018

Sorting problem Sorting Problem Input: an array A[1..n] with n integers Output: a sorted array A (in ascending order) Problem is: sort A[1..n] Input: 8 6 1 3 7 2 5 4 Output: 1 2 3 4 5 6 7 8 2

Our Roadmap Comparison-based Sorting Quadratic Cost Selection Sort, Insertion Sort, Bubble Sort O(n log n) Cost Heap Sort, Merge Sort Quick Sort Other sorting algorithms Counting sort, radix sort, bucket sort 3

Selection Sort 4

Selection Sort Idea of a selection sort method Start with empty hand, all cards on table Pick the smallest card from table Insert the card into the hand 5

Selection Sort Algorithm SelectionSort Input: an array A of n numbers Output : an array A of n numbers in the ascending order Selection-Sort ( A[1..n] ) 1. for integer i 1 to n 1 2. k i 3. for integer j i+1 to n 4. if A[k] > A[j] then 5. k j 6. swap A[i] and A[k] 8 6 1 3 7 2 5 4 1 8 6 3 7 2 5 4 sorted unsorted 1 2 8 6 3 7 5 4 sorted unsorted 6

Selection Sort Time Complexity Selection Sort Input: an array A of n numbers Output : an array A of n numbers in the ascending order Selection-Sort ( A[1..n] ) 1. for integer i 1 to n 1 Cost: n-1=o(n) 2. k i Cost: n-1=o(n) 3. for integer j i+1 to n Cost: n-1+n-2+ +1=O(n 2 ) 4. if A[k] > A[j] then Cost: O(n 2 ) 5. k j Cost: O(n 2 ) 6. swap A[i] and A[k] Cost: O(n) Selection sort total cost: O(n)+O(n)+O(n 2 ) +O(n 2 ) +O(n 2 ) +O(n)=O(n 2 ) 7

Insertion Sort 8

Insertion Sort Idea of a insertion sort method One input each iteration, growing a sorted output list Remove one element from input data Find the location it belongs within the sorted list Repeat until no input elements remain 9

Insertion Sort Algorithm InsertionSort Input: an array A of n numbers Output : an array A of n numbers in the ascending order Insertion-Sort ( A[1..n] ) 1. for integer i 1 to n 1 2. for integer j i to 0 3. if A[j-1] > A[j] then 4. swap A[j-1] and A[j] 8 6 1 3 7 2 5 4 6 8 1 3 7 2 5 4 sorted unsorted sorted unsorted 8 6 1 3 7 2 5 4 1 6 8 3 7 2 5 10 4

Insertion Sort Time Complexity Insertion Sort Input: an array A of n numbers Output : an array A of n numbers in the ascending order Insertion-Sort ( A[1..n] ) 1. for integer i 1 to n Cost: n-1=o(n) 2. for integer j i to 2 Cost: 1+ +n-2=o(n 2 ) 3. if A[j-1] > A[j] then Cost: O(n 2 ) 4. swap A[j-1] and A[j] Cost: O(n 2 ) Insertion sort total cost: O(n)+O(n 2 ) +O(n 2 ) +O(n 2 ) =O(n 2 ) 11

Bubble Sort 12

Bubble Sort Idea of a bubble sort method For each pass Compare the pair of adjacent item Swap them if they are in the wrong order Repeat the pass through until no swaps are needed 13

Bubble Sort Algorithm BubbleSort Input: an array A of n numbers Output : an array A of n numbers in the ascending order Bubble-Sort ( A[1..n] ) 1. for integer i 1 to n 1 2. for integer j 2 to n 3. if A[j-1] > A[j] then 4. swap A[j-1] and A[j] 8 6 1 3 7 2 5 4 unsorted sorted 6 1 3 7 2 5 4 8 1 3 6 2 5 4 7 8 unsorted sorted 1 3 2 5 4 6 7 14 8

Bubble Sort Time Complexity Bubble Sort Input: an array A of n numbers Output : an array A of n numbers in the ascending order Bubble-Sort ( A[1..n] ) 1. for integer i 1 to n 1 Cost: n-1=o(n) 2. for integer j 2 to n Cost: n-1+ +1=O(n 2 ) 3. if A[j-1] > A[j] then Cost: O(n 2 ) 4. swap A[j-1] and A[j] Cost: O(n 2 ) Bubble sort total cost: O(n)+O(n 2 ) +O(n 2 ) +O(n 2 ) =O(n 2 ) 15

Pop Quiz We say a sorting algorithm is stable if it does not change the relative order of elements with equal keys, which of the following is/are stable () A: Selection sort B: Insertion Sort C: Bubble Sort Watch a video: 1) Which sorting algorithm is used in that video? 2) I am No.??? 16

Heapsort Heap Construction & Deletion 17

Heapsort Idea of a heap sort method Build min-heap for items in Array A Delete-min of the heap (maintain min-heap property) Repeat step 2 until heap is empty 1 3 2 4 5 6 8 7 18

Heapsort Heap Sort Input: an array A of n numbers Output : an array A of n numbers in the ascending order Heapsort ( A[1..n] ) 1. min-heap T build_heap(a) 2. i 1 3. while (T is not empty) 4. A[i++] delete_min(t) Heapsort total cost: O(n)+O(1) +O(n) +O(nlogn) =O(nlogn) Cost: O(n) or O(nlogn) Cost: O(1) Cost: O(n) Cost: O(nlogn) 19

1 3 2 4 7 8 Heapsort 2 3 5 5 4 7 8 6 6 4 3 5 1 8 6 1 3 7 2 5 4 6 7 8 Heap 1 3 2 4 7 8 5 6 1 2 Delete-min Delete-min 2 3 5 4 7 8 6 3 4 5 6 7 8 20

Merge Sort (Divide-and-Conquer) 21

Divide and Conquer Divide and Conquer: an algorithmic technique Divide: divide the problem into smaller subproblems Conquer: solve each subproblem recursively Combine: combine the solution of subproblems into the solution of the original problem n 1. divide 3. combine n/b n/b n/b 2. conquer 22

Example: Merge Sort Sorting problem Input: an array A[1..n] with n integers Output: a sorted array A (in ascending order) Original problem is: sort A[1..n] 8 6 1 3 7 2 5 4 What is a subproblem? Sort a subarray A[l..r] 7 2 5 4 23

Merge Sort Merge Sort Divide: divide the array into two subarrays of n/2 numbers each Conquer: sort two subarrays recursively Combine: merge two sorted subarrays into a sorted array Merge-Sort(A, n) 1. if n > 1 2. p n/2 3. B[1..p] A[1..p] 4. C[1..n-p] A[p+1..n] 5. Merge-Sort(B, p) 6. Merge-Sort(C, n-p) 7. A[1..n] Merge(B, p, C, n-p) We ll discuss the Combine phase ( Merge function) later 24

25 Merge Sort 8 6 1 3 7 2 5 4 8 6 1 3 7 2 5 4 8 6 1 3 7 2 5 4 8 6 1 3 7 2 5 4 6 8 1 3 2 7 4 5 1 3 6 8 2 4 5 7 1 2 3 4 5 6 7 8 divide divide divide combine combine combine

Merge Sort: Combine Phase L R A Sorted arrays i j 1 3 6 8 2 4 5 7 Merge(L, n L, R, n R ) 1. n n L + n R 2. let A[1..n] be a new array 3. i 1; j 1 4. for k 1 to n 5. if i n L and (j>n R or L[i] R[j]) 6. A[k] L[i] ; i i + 1 7. else 8. A[k] R[j] ; j j + 1 9. return A 26

Running time of Merge Merge(L, n L, R, n R ) 1. n n L + n R 2. let A[1..n] be a new array 3. i 1; j 1 4. for k 1 to n 5. if i n L and (j>n R or L[i] R[j]) 6. A[k] L[i] ; i i + 1 7. else 8. A[k] R[j] ; j j + 1 9. return A Sorted arrays Let n = n L + n R be the total number of items Time of merge: O(n) time Line 1: O(1) Line 2: O(n) Line 3: O(1) Lines 4-8: O(n) 27

Running time of Merge Sort Merge-Sort(A, n) 1. if n > 1 2. p n/2 3. B[1..p] A[1..p] 4. C[1..n-p] A[p+1..n] 5. Merge-Sort(B, p) 6. Merge-Sort(C, n-p) 7. A[1..n] Merge(B, p, C, n-p) Let T(n) be the running time of Merge Sort Lines 3, 4 take O(n) time Line 5 takes T(n/2) time Line 6 takes T(n/2) time Line 7 takes O(n) time Thus, we obtain the recurrence T(n) = 2 T(n/2) + O(n) Solving it, we get: T(n) = O(nlogn) 28

Time Complexity T(n): time complexity of algorithm at input size n Divide the problem into a subproblems Size of each subproblem is n/b Combine phase takes f(n) time Note: a and b can have different values 1. divide n 3. combine n/b n/b n/b 2. conquer a subproblems Recurrence equation: T(n) = a T(n/b) + f(n) E.g., Merge Sort: T(n) = 2 T(n/2) + O(n) 29

Methods for Solving Recurrences Recurrence equation: T(n) = a T(n/b) + f(n) Two methods for solving recurrences Master theorem Substitution method Master theorem It could be proved by carefully applying the expansion method, the details are tedious and omitted from this course Substitution method (we skip it here) It is mathematical induction 30

Master Theorem Recurrence equation: T(n) = a T(n/b) + f(n) Let T(n) be a function that return a positive value for every integer n>0. We know that: T 1 = O(1) T n = αt n β + O(n γ ) for (n 2) where α 1, β > 1, andγ 0. Then: If log β α < γ, then T n = O(n γ ) If log β α = γ, then T n = O(n γ log n) If log β α > γ, then T n = O(n log β α ) 31

Master Theorem Consider the recurrence of binary search: T(1) c1 T(n) T(n/2) + c2 (for n 2) Hence, α = 1, β = 2, and γ = 0. Since log β α = log 2 1 = 0 = γ, we know that T n = O n 0 log n = O(log n). Consider the recurrence of merge sort: T(1) c1 T(n) = 2 T(n/2) + O(n) = 2 T(n/2) + c2 n (for n 2) Hence, α = 2, β = 2, and γ = 1. Since log β α = log 2 2 = 1 = γ, we know that T n = O n 1 log n = O(nlog n). 32

Quick Sort RAM with Randomization 33

Deterministic & Randomized So far in CS203, all our algorithms are deterministic, namely, they do not involve any randomization. We will introduce randomized algorithms, e.g., quick sort in the sorting problem. Randomized algorithms play an important role in computer science, they often simpler, and sometimes can be provably faster as well. Recall the core of the RAM model is a set of atomic operations, we extend this set with: RANDOM(x, y): given integers x and y (x <= y), this operation returns an integer chosen uniformly at random in [x,y], i.e., x, x+1,, y has the same probability of being returned. 34

Randomized Algorithm Example Find-a-Zero: Given an array of integers with size n, among which there is at least 0. Design an algorithm to report an arbitrary position of A that contains a 0 Suppose A = (9,18,0,0,15,0), an algorithm can report 3,4 or 6, consider the following randomized algorithm 1. do 2. r RANDOM(1,n) 3. until A[r]=0 4. return r What is the cost of the algorithm? It depends If all numbers in A are 0, O(1) time. If A has only one 0, O(n) expected time. As before, we care about the worst expected time: O(n) 35

Quick Sort Idea of a quick sort method Randomly pick an integer p in A, call it the pivot Re-arrange the integers in an array A such that All the integers smaller than p are positioned before p in A All the integers larger than p are positioned after p in A Sort the part of A before p recursively Sort the part of A after p recursively 36

Quicksort Quick Sort Input: an array A of n numbers Output : an array A of n numbers in the ascending order Quicksort ( A[1..n], lo=1, hi=n) 1. p partition(a, lo, hi) 2. Quicksort(A,lo,p-1) 3. Quicksort(A,p+1,hi) 37

Quicksort Partition(A, lo, hi) 1. p RANDOM(lo, hi); 2. pivot A[p]; 3. L lo, R hi 4. for integer i from lo to hi 5. if(a[i] < pivot) A [L++]A[i] 6. else A [R--]A[i] 7. A[lo, hi] A 8. return L; Question: If we set p lo or hi in Line 1, quick sort is still correct? What are the difference between p lo/hi and p RANDOM(lo, hi)? 38

Quicksort Example 8 6 1 3 7 2 5 4 pivot 1 3 2 5 4 6 8 7 pivot 1 2 3 5 4 6 7 8 pivot pivot pivot 1 2 3 4 5 6 7 8 pivot pivot pivot pivot pivot 39

Quicksort Time Complexity Quicksort s running time is not attractive in the worst case: it is O(n 2 ) (why?) However, quick sort is fast in expectation, i.e., O(nlogn), remember this holds for every input array A. Whether quicksort has any advantage over merge sort? which guarantees O(nlogn) in the worst case. No in theory, but there is an advantage in practice Quicksort permits a faster implementation that leads to a smaller hidden constant compared to merge sort. (why?) 40

Quicksort Time Complexity Let X be the number of comparisons in quicksort algorithm. The running is bounded by O(n+x). We prove that E[X]=O(n log n) Denote e i be the i-th smallest integer in A, consider e i and e j for any i,j such that i!=j What is the probability that quicksort compares e i and e j? Every element will be selected as pivot precisely once e i and e j are not compared, if any element between them gets selected as a pivot before them. Therefore, e i and e j are compared if and only if either one is the first among e i, e i+1,, e j picked as a pivot The probability is 2/(j-i+1) (random pivot selection) 41

Quicksort Time Complexity Define random variable X ij to be 1, if e i and e j are compared. Otherwise, X ij to be 0. Thus, we have Pr[X ij = 1] = 2/(j-i+1), that is E[X ij ] = 2/(j-i+1) Since X = σ i,j X ij, hence: E X = σ i,j E[X ij ] = σ i,j 2 j i+1 = 2 σ n 1 i=1 σn 1 j=i+1 j i+1 = 2 σ n 1 i=1 O(log(j i + 1)) (1+1/2+ +1/n = O(logn)) = 2 σ n 1 i=1 O(log n) = O n log n Harmonic series: 1+1/2+ +1/n, which is frequently encountered in computer science. 42

Summary Sort Average Space Stable Selection O(n 2 ) O(1) Yes Insertion O(n 2 ) O(1) Yes Bubble O(n 2 ) O(1) Yes Heap O(nlogn) O(1) No Merge O(nlogn) Depends Yes Quick O(nlogn) O(1) Yes Comparison lower bound of sorting algorithm: Ω n log n We omit the proof here. 43

Other Sorting Methods 44

Other Sorting Algorithms Counting sort (Chapter 8.2) it is applicable when each input is known to belong to a particular set, S, of possibilities. The algorithm runs in O( S + n) time and O( S ) memory where n is the length of the input. Radix sort (Chapter 8.3) radix sort is an algorithm that sorts numbers by processing individual digits. n numbers consisting of k digits each are sorted in O(n k) time Bucket sort (Chapter 8.4) Bucket sort is a divide and conquer sorting algorithm that generalizes counting sort by partitioning an array into a finite number of buckets. 45

Thank You! 46