COMP26120: Algorithms and Imperative Programming
Basic sorting algorithms
Ian Pratt-Hartmann
Room KB2.38; email: ipratt@cs.man.ac.uk
2017-18

Reading for this lecture (Goodrich and Tamassia): Secs. 8.1, 8.2, 8.3 (pp. 241–258); Sec. 1.1.5 (pp. 11–16).

Outline: Quicksort; Mergesort; A lower bound?

Consider the problem of sorting a list of numbers (in ascending order). Quicksort is a sorting algorithm which works well in practice.

quicksort(L)
    if length of L ≤ 1 return L
    remove the first element, x, from L
    L_≤ := elements of L less than or equal to x
    L_> := elements of L greater than x
    L_l := quicksort(L_≤)
    L_r := quicksort(L_>)
    return L_l + [x] + L_r
end

The element x is sometimes referred to as the pivot.

Example: quicksort([2,9,1,3,4,0]). The call tree below shows each recursive call together with the list it eventually returns.

quicksort([2,9,1,3,4,0])  →  [0,1,2,3,4,9]
    quicksort([1,0])  →  [0,1]
        quicksort([0])  →  [0]
        quicksort([ ])  →  [ ]
    quicksort([9,3,4])  →  [3,4,9]
        quicksort([3,4])  →  [3,4]
            quicksort([ ])  →  [ ]
            quicksort([4])  →  [4]
        quicksort([ ])  →  [ ]
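For concreteness, here is a minimal executable sketch of this pseudocode in Python (the function name and the use of list comprehensions are my own choices, not from the slides):

def quicksort(L):
    # Lists of length <= 1 are already sorted.
    if len(L) <= 1:
        return L
    x, rest = L[0], L[1:]               # x is the pivot
    le = [y for y in rest if y <= x]    # L_<= : elements <= x
    gt = [y for y in rest if y > x]     # L_>  : elements >  x
    return quicksort(le) + [x] + quicksort(gt)

print(quicksort([2, 9, 1, 3, 4, 0]))    # prints [0, 1, 2, 3, 4, 9]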

Let's see how much work is done. The worst case occurs when, for each recursive call, one of L_≤ or L_> is empty; this happens, for example, when the input list is already sorted, since the pivot is then the minimum of the list. Here n recursive calls are made (ignoring calls with [ ]), with the argument one element shorter each time. Before each recursive call, L_≤ and L_> must be calculated, requiring O(|L|) steps. So if |L| = n, the total work is of order n + (n−1) + ... + 1 = ½n(n+1), i.e. O(n²) (because O(½n(n+1)) = O(n²)).
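This is easy to observe experimentally. The sketch below (my own instrumentation, not from the slides) counts how many elements are partitioned against a pivot, on sorted versus shuffled input of the same length:

import random

def quicksort_counting(L, counter):
    # counter[0] counts elements partitioned against a pivot
    # (one comparison each, conceptually).
    if len(L) <= 1:
        return L
    x, rest = L[0], L[1:]
    counter[0] += len(rest)
    le = [y for y in rest if y <= x]
    gt = [y for y in rest if y > x]
    return quicksort_counting(le, counter) + [x] + quicksort_counting(gt, counter)

for data in (list(range(200)), random.sample(range(200), 200)):
    c = [0]
    quicksort_counting(data, c)
    print(c[0])   # 19900 = 200*199/2 for sorted input; far fewer when shuffled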

Outline: Quicksort; Mergesort; A lower bound?

Here is an algorithm with lower complexity. First, consider the problem of merging two sorted lists to form a third sorted list:

merge([1, 3, 5], [0, 2, 4, 6, 7])  →  [0, 1, 2, 3, 4, 5, 6, 7]

The following algorithm will work.

merge(L_1, L_2)
    if L_1 = [ ] return L_2
    if L_2 = [ ] return L_1
    x_i := first element of L_i (i = 1, 2)
    L'_i := L_i minus its first element (i = 1, 2)
    if x_1 ≤ x_2 return [x_1] + merge(L'_1, L_2)
    return [x_2] + merge(L_1, L'_2)
end
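In Python, the same idea (a direct transcription; slicing stands in for "minus its first element"):

def merge(L1, L2):
    # If either list is empty, the other is already the answer.
    if not L1:
        return L2
    if not L2:
        return L1
    # Emit the smaller head and merge what remains.
    if L1[0] <= L2[0]:
        return [L1[0]] + merge(L1[1:], L2)
    return [L2[0]] + merge(L1, L2[1:])

print(merge([1, 3, 5], [0, 2, 4, 6, 7]))   # [0, 1, 2, 3, 4, 5, 6, 7]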

When merge(L_1, L_2) is called, at most one recursive call is made, in which |L_1| + |L_2| decreases by 1. Therefore, at most O(n) recursive calls are made, where n = |L_1| + |L_2| is the length of the input. A constant number of operations is executed for each recursive call. Therefore, it takes at most O(n) time to run.

We can now present our sorting algorithm.

mergesort(L)
    if |L| ≤ 1 return L
    split L into two roughly equal halves, L = L_l + L_r
    return merge(mergesort(L_l), mergesort(L_r))
end

This algorithm clearly returns a sorted list with exactly the original elements.
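A Python rendering, reusing the merge function sketched above (the halving point is my choice; any roughly equal split works):

def mergesort(L):
    # Lists of length <= 1 are already sorted.
    if len(L) <= 1:
        return L
    mid = len(L) // 2   # split into two roughly equal halves
    return merge(mergesort(L[:mid]), mergesort(L[mid:]))

print(mergesort([2, 9, 1, 3, 4, 0]))   # [0, 1, 2, 3, 4, 9]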

How many times is the algorithm called recursively? The following analysis gives a rather disappointing bound. Each recursive call gives rise to two others at one greater depth of recursion; thus, at each depth of recursion, there are twice as many recursive calls. The maximum depth of recursion is ⌈log2 n⌉. Therefore, the number of calls is at most 2^⌈log2 n⌉ ≤ 2n. (Actually, a better bound is n − 1: can you see why?) The time taken to merge is at most O(n), so this suggests (prima facie) a complexity bound of O(n²).

But in fact it's not that bad. (Picture: the mergesort recursion tree, with depth log2 n and total width n at every level.) The total length of the lists processed at each level of recursion is constant at |L| = n, and the total amount of work done for each call is linear in the lengths of its arguments, so the work per level is O(n) in total. The number of times L can be halved is O(log n). Hence, the time complexity of mergesort is O(n log n).

Or do some algebra. Let the worst-case time taken by mergesort on any list of length n be bounded by t(n). Then, ignoring constant factors,

    t(n) = 2 t(n/2) + n,

and, without loss of generality, we may as well assume that t(2) ≤ 2. A simple induction shows that t(n) ≤ n log2 n. For n > 2 (and cheating quite a lot), we have

    t(n) = 2 t(n/2) + n
         ≤ 2 (n/2) log2(n/2) + n    (ind. hyp.)
         = n (log2 n − 1) + n
         = n log2 n.
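A quick numeric check of the induction (my own sketch; the recurrence is evaluated exactly at powers of two, with equality in place of ≤ and t(2) = 2):

import math

def t(n):
    # t(n) = 2 t(n/2) + n with base case t(2) = 2, for n a power of two.
    return 2 if n <= 2 else 2 * t(n // 2) + n

for k in range(2, 11):
    n = 2 ** k
    print(n, t(n), n * math.log2(n))   # here t(n) equals n log2 n exactly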

Outline: Quicksort; Mergesort; A lower bound?

Can we do any better than O(n log2 n)? In the study of algorithms, lower complexity bounds are in general extraordinarily hard to obtain. In the case of sorting, however, we have a qualified lower bound: any algorithm which sorts a list using only number-comparison operations requires time at least proportional to n log2 n to run. Let us see why this is so.

First, some basic facts about trees. Suppose we have a full binary tree of depth d, with n vertices in total, of which l are leaves. (Picture: a full binary tree with d = 3, n = 15 and l = 8.) Starting with the root at level 0, the number of vertices on level k is 2^k. Hence

    l = 2^d        n = Σ_{k=0}^{d} 2^k = 2^{d+1} − 1.

Otherwise expressed:

    d = log2 l     d = log2(n + 1) − 1.

If the tree is binary branching, but not full, then these equalities are replaced by inequalities. (Picture: the same tree with d = 3, but with some vertices missing.) Hence

    l ≤ 2^d        n ≤ 2^{d+1} − 1.

Otherwise expressed:

    d ≥ log2 l     d ≥ log2(n + 1) − 1.

Now suppose we have an algorithm which sorts a list by making comparisons and branching as a result of each comparison. The possible runs of that algorithm may be arranged as a binary tree of some depth d. Assume, without loss of generality, that the inputs of length n are the integers 1, ..., n in some order π(1), ..., π(n), where π is a permutation. The algorithm will then apply the inverse permutation π^(−1) to sort the list.

There are n! permutations of the numbers 1, ..., n, each requiring a different output, and hence n! leaves in the computation tree. The maximum running time, t(n), on inputs of length n is the (maximum) depth d of the tree. From our inequality d ≥ log2 l we obtain, assuming n even:

    t(n) ≥ log2(n!) ≥ log2((n/2)^(n/2)) = (n/2) log2(n/2) = ½ n (log2 n − 1).

Here the second inequality holds because n! contains at least n/2 factors that are each at least n/2.
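Numerically, the simplified bound does indeed stay below log2(n!), as the derivation requires (a sketch; log2(n!) is computed via the log-gamma function to avoid enormous integers):

import math

def log2_factorial(n):
    # log2(n!) = lgamma(n + 1) / ln 2, avoiding the huge integer n!.
    return math.lgamma(n + 1) / math.log(2)

for n in (8, 64, 1024, 2**20):
    print(n, round(log2_factorial(n)), round(n / 2 * (math.log2(n) - 1)))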

The following is a very handy way of talking about lower bounds. If f : N → N is a function, then Ω(f) denotes the set of functions

    { g : N → N | there exist n_0 ∈ N and c ∈ R⁺ s.t. for all n > n_0, g(n) ≥ c·f(n) }.

Thus, Ω(f) denotes a set of functions: intuitively, the functions that grow essentially at least as fast as f.

Notice that for n ≥ 4, ½ n log2 n ≥ n. In particular, for sufficiently large n, ½ n (log2 n − 1) ≥ ¼ n log2 n. That is, ½ n (log2 n − 1) ∈ Ω(n log2 n). Thus, we are guaranteed that any sorting algorithm based on comparisons has running time in Ω(n log2 n). Warning: this doesn't by itself provide a guarantee of the complexity of any particular algorithm whatsoever. On the other hand, no one has done any better so far...