Comparison Based Sorting Algorithms. Algorithms and Data Structures: Lower Bounds for Sorting. Comparison Based Sorting Algorithms

Similar documents
Algorithms and Data Structures: Lower Bounds for Sorting. ADS: lect 7 slide 1

Lower bound for comparison-based sorting

Lower Bound on Comparison-based Sorting

How fast can we sort? Sorting. Decision-tree example. Decision-tree example. CS 3343 Fall by Charles E. Leiserson; small changes by Carola

CS2223: Algorithms Sorting Algorithms, Heap Sort, Linear-time sort, Median and Order Statistics

Chapter 8 Sort in Linear Time

Sorting. CMPS 2200 Fall Carola Wenk Slides courtesy of Charles Leiserson with small changes by Carola Wenk

Chapter 8 Sorting in Linear Time

Lecture: Analysis of Algorithms (CS )

How many leaves on the decision tree? There are n! leaves, because every permutation appears at least once.

Design and Analysis of Algorithms

Jana Kosecka. Linear Time Sorting, Median, Order Statistics. Many slides here are based on E. Demaine, D. Luebke slides

CSE 3101: Introduction to the Design and Analysis of Algorithms. Office hours: Wed 4-6 pm (CSEB 3043), or by appointment.

Lecture 5: Sorting Part A

II (Sorting and) Order Statistics

EECS 2011M: Fundamentals of Data Structures

CSE373: Data Structure & Algorithms Lecture 21: More Comparison Sorting. Aaron Bauer Winter 2014

MergeSort. Algorithm : Design & Analysis [5]

08 A: Sorting III. CS1102S: Data Structures and Algorithms. Martin Henz. March 10, Generated on Tuesday 9 th March, 2010, 09:58

Algorithms Lab 3. (a) What is the minimum number of elements in the heap, as a function of h?

COMP Data Structures

/633 Introduction to Algorithms Lecturer: Michael Dinitz Topic: Sorting lower bound and Linear-time sorting Date: 9/19/17

8 SortinginLinearTime

4. Sorting and Order-Statistics

CS 561, Lecture 1. Jared Saia University of New Mexico

Module 3: Sorting and Randomized Algorithms

Module 3: Sorting and Randomized Algorithms. Selection vs. Sorting. Crucial Subroutines. CS Data Structures and Data Management

How to Win Coding Competitions: Secrets of Champions. Week 3: Sorting and Search Algorithms Lecture 7: Lower bound. Stable sorting.

Introduction to Algorithms

Lower and Upper Bound Theory. Prof:Dr. Adnan YAZICI Dept. of Computer Engineering Middle East Technical Univ. Ankara - TURKEY

Introduction to Algorithms 6.046J/18.401J

Partha Sarathi Manal

Question 7.11 Show how heapsort processes the input:

Presentation for use with the textbook, Algorithm Design and Applications, by M. T. Goodrich and R. Tamassia, Wiley, Merge Sort & Quick Sort

Properties of a heap (represented by an array A)

Binary heaps (chapters ) Leftist heaps

Module 3: Sorting and Randomized Algorithms

The complexity of Sorting and sorting in linear-time. Median and Order Statistics. Chapter 8 and Chapter 9

CS : Data Structures

Introduction to Algorithms

CPSC 311 Lecture Notes. Sorting and Order Statistics (Chapters 6-9)

University of Waterloo CS240, Winter 2010 Assignment 2

Data Structures and Algorithms Chapter 4

Sorting isn't only relevant to records and databases. It turns up as a component in algorithms in many other application areas surprisingly often.

Data Structures. Sorting. Haim Kaplan & Uri Zwick December 2013

Examination Questions Midterm 2

University of Waterloo CS240 Winter 2018 Assignment 2. Due Date: Wednesday, Jan. 31st (Part 1) resp. Feb. 7th (Part 2), at 5pm

CSE 373 MAY 24 TH ANALYSIS AND NON- COMPARISON SORTING

Reading for this lecture (Goodrich and Tamassia):

Scribe: Sam Keller (2015), Seth Hildick-Smith (2016), G. Valiant (2017) Date: January 25, 2017

Next. 1. Covered basics of a simple design technique (Divideand-conquer) 2. Next, more sorting algorithms.

Sorting Algorithms. CptS 223 Advanced Data Structures. Larry Holder School of Electrical Engineering and Computer Science Washington State University

Data Structures and Algorithms Week 4

Lecture 3. Recurrences / Heapsort

COMP Analysis of Algorithms & Data Structures

Lecture 8: Mergesort / Quicksort Steven Skiena

SORTING. Insertion sort Selection sort Quicksort Mergesort And their asymptotic time complexity

Comparisons. Θ(n 2 ) Θ(n) Sorting Revisited. So far we talked about two algorithms to sort an array of numbers. What is the advantage of merge sort?

Outline. Computer Science 331. Three Classical Algorithms. The Sorting Problem. Classical Sorting Algorithms. Mike Jacobson. Description Analysis

Trees (Part 1, Theoretical) CSE 2320 Algorithms and Data Structures University of Texas at Arlington

COT 6405 Introduction to Theory of Algorithms

DIVIDE AND CONQUER ALGORITHMS ANALYSIS WITH RECURRENCE EQUATIONS

CPSC 320 Midterm 2 Thursday March 13th, 2014

7. Sorting I. 7.1 Simple Sorting. Problem. Algorithm: IsSorted(A) 1 i j n. Simple Sorting

1. Covered basics of a simple design technique (Divideand-conquer) 2. Next, more sorting algorithms.

Lecture Notes for Advanced Algorithms

COSC 311: ALGORITHMS HW1: SORTING

Comparisons. Heaps. Heaps. Heaps. Sorting Revisited. Heaps. So far we talked about two algorithms to sort an array of numbers

SEARCHING, SORTING, AND ASYMPTOTIC COMPLEXITY. Lecture 11 CS2110 Spring 2016

Can we do faster? What is the theoretical best we can do?

ECE608 - Chapter 8 answers

CSE 326: Data Structures Sorting Conclusion

Outline. Computer Science 331. Heap Shape. Binary Heaps. Heap Sort. Insertion Deletion. Mike Jacobson. HeapSort Priority Queues.

COMP Analysis of Algorithms & Data Structures

1 Lower bound on runtime of comparison sorts

Binary Heaps in Dynamic Arrays

Data Structures and Algorithms. Roberto Sebastiani

Stacks, Queues, and Priority Queues. Inf 2B: Heaps and Priority Queues. The PriorityQueue ADT

Sorting and Selection

9/10/2018 Algorithms & Data Structures Analysis of Algorithms. Siyuan Jiang, Sept

Heapsort. Heap data structure

Cosc 241 Programming and Problem Solving Lecture 17 (30/4/18) Quicksort

l Find a partner to bounce ideas off of l Assignment 2 l Proofs should be very concrete l Proving big-o, must satisfy definition

Overview of Presentation. Heapsort. Heap Properties. What is Heap? Building a Heap. Two Basic Procedure on Heap

SORTING AND SELECTION

MA 252: Data Structures and Algorithms Lecture 9. Partha Sarathi Mandal. Dept. of Mathematics, IIT Guwahati

lecture notes September 2, How to sort?

We will show that the height of a RB tree on n vertices is approximately 2*log n. In class I presented a simple structural proof of this claim:

CISC 621 Algorithms, Midterm exam March 24, 2016

Introduction to Algorithms and Data Structures. Lecture 12: Sorting (3) Quick sort, complexity of sort algorithms, and counting sort

CS61B Lectures #28. Today: Lower bounds on sorting by comparison Distribution counting, radix sorts

1 Interlude: Is keeping the data sorted worth it? 2 Tree Heap and Priority queue

CS2 Algorithms and Data Structures Note 1

Algorithms Chapter 8 Sorting in Linear Time

The priority is indicated by a number, the lower the number - the higher the priority.

Problem. Input: An array A = (A[1],..., A[n]) with length n. Output: a permutation A of A, that is sorted: A [i] A [j] for all. 1 i j n.

CSC Design and Analysis of Algorithms

The divide and conquer strategy has three basic parts. For a given problem of size n,

Theory and Algorithms Introduction: insertion sort, merge sort

QuickSort

Transcription:

Comparison Based Sorting Algorithms Algorithms and Data Structures: Lower Bounds for Sorting Definition 1 A sorting algorithm is comparison based if comparisons A[i] < A[j], A[i] A[j], A[i] = A[j], A[i] A[j], A[i] > A[j] are the only ways in which it accesses the input elements. ADS: lect 7 slide 1 Comparison Based Sorting Algorithms Definition 1 A sorting algorithm is comparison based if comparisons A[i] < A[j], A[i] A[j], A[i] = A[j], A[i] A[j], A[i] > A[j] are the only ways in which it accesses the input elements. Comparison based sorting algorithms are generic in the sense that they can be used for all types of elements that are comparable (such as objects of type Comparable in Java). ADS: lect 7 slide 2 Comparison Based Sorting Algorithms Definition 1 A sorting algorithm is comparison based if comparisons A[i] < A[j], A[i] A[j], A[i] = A[j], A[i] A[j], A[i] > A[j] are the only ways in which it accesses the input elements. Comparison based sorting algorithms are generic in the sense that they can be used for all types of elements that are comparable (such as objects of type Comparable in Java). Example 2 Insertion-Sort, Quicksort, Merge-Sort, Heapsort are all comparison based. ADS: lect 7 slide 2 ADS: lect 7 slide 2

ADS: lect 7 slide 5 ADS: lect 7 slide 6 The Decision Tree Model Abstractly, we may describe the behaviour of a comparison-based sorting algorithm S on an input array A = A[1],..., A[n] by a decision tree: A[k] < A[l] k : l A[i] < A[j] A[i] = A[j] i : j p : q A[i] > A[j] r : s A Simplifying Assumption In the following, we assume that all keys of elements of the input array of a sorting algorithm are distinct. (It is ok to restrict to a special case, because we want to prove a lower bound.) Thus the outcome A[i] = A[j] in a comparison will never occur, and the decision tree is in fact a binary tree: A[i] < A[j] i : j A[i] > A[j] t : u k : l r : s A[k] < A[l] t : u At each leaf of the tree the output of the algorithm on the corresponding execution branch will be displayed. Outputs of sorting algorithms correspond to permutations of the input array. ADS: lect 7 slide 3 ADS: lect 7 slide 4 Example Insertion sort for n = 3: For a comparison based sorting algorithm S: 2 : 3 1 : 2 1 : 3 C S (n) = worst-case number of comparisons performed by S on an input array of size n. A[1] A[2] A[3] 1 : 3 A[2] A[1] A[3] 2 : 3 A[1] A[3] A[2] A[3] A[1] A[2] A[2] A[3] A[1] A[3] A[2] A[1] In insertion sort, when we get the result of a comparison, we often swap some elements of the array. In showing decision trees, we don t implement a swap. Our indices always refer to the original elements at that position in the array. To understand what I mean, draw the evolving array of InsertionSort beside this decision tree.

For a comparison based sorting algorithm S: C S (n) = worst-case number of comparisons performed by S on an input array of size n. For a comparison based sorting algorithm S: C S (n) = worst-case number of comparisons performed by S on an input array of size n. Theorem 3 For all comparison based sorting algorithms S we have C S (n) = Ω(n lg n). Theorem 3 For all comparison based sorting algorithms S we have C S (n) = Ω(n lg n). Corollary 4 The worst-case running time of any comparison based sorting algorithm is Ω(n lg n). ADS: lect 7 slide 6 Proof of Theorem 3 uses Decision-Tree Model of sorting. It is an Information-Theoretic Lower Bound: ADS: lect 7 slide 6 Proof of Theorem 3 uses Decision-Tree Model of sorting. It is an Information-Theoretic Lower Bound: Information-Theoretic means that it is based on the amount of information that an instance of the problem can encode. ADS: lect 7 slide 7 ADS: lect 7 slide 7

Proof of Theorem 3 uses Decision-Tree Model of sorting. It is an Information-Theoretic Lower Bound: Information-Theoretic means that it is based on the amount of information that an instance of the problem can encode. For sorting, the input can encode n! outputs. Proof of Theorem 3 uses Decision-Tree Model of sorting. It is an Information-Theoretic Lower Bound: Information-Theoretic means that it is based on the amount of information that an instance of the problem can encode. For sorting, the input can encode n! outputs. Proof does not make any assumption about how the sorting might be done (except it is comparison-based). Proof of Theorem 3 ADS: lect 7 slide 7 Observation 5 For every n, C S (n) is the height of the decision tree of S on inputs n (the longest path from the root to a leaf is the maximum number of comparisons that algorithm S will do on an input of length n). We shall prove a lower bound for the height of the decision tree for any algorithm S. Remark Maybe you are wondering... was it really ok to assume all keys are distinct? Proof of Theorem 3 ADS: lect 7 slide 7 Observation 5 For every n, C S (n) is the height of the decision tree of S on inputs n (the longest path from the root to a leaf is the maximum number of comparisons that algorithm S will do on an input of length n). We shall prove a lower bound for the height of the decision tree for any algorithm S. Remark Maybe you are wondering... was it really ok to assume all keys are distinct? It is ok - because the problem of sorting n keys (with no distinctness assumption) is more general than the problem of sorting n distinct keys. The worst-case for sorting certainly is as bad as the worst-case for all-distinct keys sorting. ADS: lect 7 slide 8 ADS: lect 7 slide 8

height of decision tree 2 height of decision tree 2 2 C S (n). height of decision tree 2 2 C S (n). height of decision tree 2 2 C S (n). Thus C S (n) lg(n!) = Ω(n lg(n)). Thus C S (n) lg(n!) = Ω(n lg(n)). To obtain the last inequality, we can use the following inequality: n n/2 n! n n This tells us that lg n! lg(n n/2 ) = (n/2) lg n = Ω(n lg(n)). Thm 3 QED

An Average Case Lower Bound For any comparison based sorting algorithm S: A S (n) = average number of comparisons performed by S on an input array of size n. Theorem 7 For all comparison based sorting algorithms S we have A S (n) = Ω(n lg n). An Average Case Lower Bound For any comparison based sorting algorithm S: A S (n) = average number of comparisons performed by S on an input array of size n. Theorem 7 For all comparison based sorting algorithms S we have A S (n) = Ω(n lg n). ADS: lect 7 slide 10 Average root-leaf length in Binary tree Definition 9 For any binary tree T, let AvgRL(T ) denote the average root-to-leaf length for T. Definition 10 A near-complete binary tree T is a binary tree in which every internal node has exactly two child nodes, and all leaves are either at depth h or depth h 1. Theorem 11 Any near-complete binary tree T with leaf set L(T ), L(T ) 4, has Average root-to-leaf length AvgRL(T ) at least lg( L(T ) )/2. Theorem 12 For any binary tree T, there is a near-complete binary tree T such that L(T ) = L(T ) (same leaf set) and such that AvgRL(T ) AvgRL(T ). Hence AvgRL(T ) lg( L(T ) )/2 holds for all binary trees. Proof of Theorems 11 and 12 via BOARD notes. ADS: lect 7 slide 11 Proof uses the fact that the average length of a path from the root to a leaf in a binary tree with l leaves is Ω(lg l) (Theorem 11 and 12). Corollary 8 The average-case running time of any comparison based sorting algorithm is Ω(n lg n). Implications of These Lower Bounds ADS: lect 7 slide 10 Theorem 3 and Theorem 7 are significant because they hold for all comparison-based algorithms S. They imply the following: 1. By Thm 3, any comparison-based algorithm for sorting which has a worst-case running-time of O(n lg n) is asymptotically optimal (ie, apart from the constant factor inside the O term, it is as good as possible in terms of worst-case analysis). This includes algorithms like MergeSort, HeapSort. 2. By Thm 7, any comparison-based algorithm for sorting which has an average-case running-time of O(n lg n) is the best you can hope for in terms of average-case analysis (apart from the constant factor inside the O term). This is accomplished by MergeSort and HeapSort. In Lecture 8, we will see that it is also true for QuickSort (for the average-case, not the worst-case). ADS: lect 7 slide 12

Lecture 9 (after average-case analysis of QuickSort) We will show how in a special case of sorting (when the inputs are numbers, coming from the range {1, 2,..., n k } for some constant k, we can sort in linear time (not a comparison-based algorithm). Reading Assignment [CLRS] Section 8.1 (2nd and 3rd edition) or [CLR] Section 9.1 Well-worth reading - this is a nice chapter of CLRS (not too long). Problems 1. Draw (simplified) decision trees for Insertion Sort and Quicksort for n = 4. 2. Exercise 8.1-1 of [CLRS] (both 2nd and 3rd ed). 3. Resolve the complexity (in terms of no-of-comparisons) of sorting 4 numbers. 3.1 Give an algorithm which sorts any 4 numbers and which uses at most 5 comparisons in the worst-case. 3.2 Prove (using the decision-tree model) that there is no algorithm to sort 4 numbers, which uses less than 5 comparisons in the worst-case. ADS: lect 7 slide 13 ADS: lect 7 slide 14