COSC242 Lecture 7 Mergesort and Quicksort

We saw last time that the time complexity function for Mergesort is T(n) = n + n log n. It is not hard to see that T(n) = O(n log n). After all, for n >= 2,

    n + n log n <= n log n + n log n = 2(n log n).

So Mergesort is O(n log n), and in fact Theta(n log n), whereas Insertion Sort is Theta(n^2) in the worst case. However, we must remember that this comparison is about how well a sorting algorithm scales as n gets larger. For small values of n, insertion sort beats mergesort! This surprising fact offers us an opportunity to improve mergesort.

Also worth remembering is that insertion sort is very efficient if the input data is already sorted (or nearly sorted), because it needs only O(n) comparisons in that case. Many sorting algorithms, mergesort included, perform as badly on already-sorted data as in the worst case.

COSC242 Lecture 7 / Slide 1
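To make the comparison concrete, here is a minimal insertion sort in C. This is an illustrative sketch, not the course's official code; the comment on the inner loop shows where the O(n) behaviour on sorted input comes from.

/* Minimal insertion sort sketch (illustration only).
 * Sorts a[0..n-1] into non-decreasing order. */
void insertion_sort(int a[], int n) {
    for (int i = 1; i < n; i++) {
        int key = a[i];
        int j = i - 1;
        /* Shift keys larger than 'key' one place right.
         * On already-sorted input this loop body never runs,
         * so the whole sort makes only n - 1 comparisons. */
        while (j >= 0 && a[j] > key) {
            a[j + 1] = a[j];
            j--;
        }
        a[j + 1] = key;
    }
}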

Improving Mergesort

My colleague Nathan Rountree reported that on a test machine, for inputs of size n, insertion sort ran in 8n^2 steps and mergesort in 64n log n steps (in the worst case). If n = 2, then insertion sort runs in 8 x 4 = 32 steps while mergesort takes 64 x 2 x 1 = 128 steps. When does insertion sort stop beating mergesort? If 8n^2 < 64n log n then n < 8 log n, so n/8 < log n, so 2^(n/8) < n. By trial and error, 2^5 = 32 where 5 x 8 = 40, and 2^6 = 64 where 6 x 8 = 48, so 2^(n/8) > n for n = 48. Thus the crossover lies somewhere in 40 <= n <= 48. So we can improve mergesort by rewriting it to do insertion sort on arrays of length 40 or less.

Other improvements exist. Mergesort does not sort in place, since the merge routine copies items into a new array, so one can try to reduce the amount of copying, especially if you have big records.

COSC242 Lecture 7 / Slide 2
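A sketch of this hybrid in C, assuming a cutoff of 40 and a caller-supplied workspace array at least as long as the input; the helper names (insertion_sort_range, merge, hybrid_mergesort) and the workspace design are illustrative assumptions, not the course's implementation.

#include <string.h>

#define CUTOFF 40   /* switch to insertion sort at or below this length */

/* Insertion sort on the subarray a[low..high] (range variant of the
 * sketch on the previous slide). */
static void insertion_sort_range(int a[], int low, int high) {
    for (int i = low + 1; i <= high; i++) {
        int key = a[i];
        int j = i - 1;
        while (j >= low && a[j] > key) {
            a[j + 1] = a[j];
            j--;
        }
        a[j + 1] = key;
    }
}

/* Merge the sorted runs a[low..mid] and a[mid+1..high] via the
 * workspace array, then copy the result back into a. */
static void merge(int a[], int work[], int low, int mid, int high) {
    int i = low, j = mid + 1, k = low;
    while (i <= mid && j <= high)
        work[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
    while (i <= mid)  work[k++] = a[i++];
    while (j <= high) work[k++] = a[j++];
    memcpy(a + low, work + low, (size_t)(high - low + 1) * sizeof a[0]);
}

/* Mergesort that falls back to insertion sort on small pieces. */
void hybrid_mergesort(int a[], int work[], int low, int high) {
    if (high - low + 1 <= CUTOFF) {
        insertion_sort_range(a, low, high);
        return;
    }
    int mid = (low + high) / 2;
    hybrid_mergesort(a, work, low, mid);
    hybrid_mergesort(a, work, mid + 1, high);
    merge(a, work, low, mid, high);
}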

Quicksort

Mergesort divides the input array A[low..high] into two pieces of similar size without looking at the contents: we get the pieces A[low..mid] and A[(mid + 1)..high] merely by setting mid = (low + high) / 2. Quicksort is based on the following subtle idea. In a sorted output array, the value of the middle key also divides the array into two pieces of similar size: all keys to its left are smaller and all keys to its right are bigger. Can we use this median value to help with the sorting?

Well, if we had an easy way to pick some x so that about half the values in the input array A[low..high] are <= x and the rest are >= x, then we could stick x into cell A[mid] and rearrange the entries so that smaller elements are put to the left of x and larger entries are put to the right of x. This would not yet be the sorted output array, but we could recursively apply the same process to the pieces A[low..mid] and A[(mid + 1)..high]. The recursive descent stops when we reach a piece of length 1, because an array of length 0 or 1 is already sorted.

COSC242 Lecture 7 / Slide 3

Picking the pivot

There is a major problem with the above method of splitting A. To split A into two equal pieces, the pivot x should be the median, but the preprocessing needed to find the median key value is costly. On the other hand, if we don't choose the pivot to be the median (or close to it), then the two pieces on either side of pivot x are of unequal size, which is also potentially costly. How should we pick the pivot?

The original approach simply picked the first key in the array as pivot x, and argued that for a large input array we may assume the keys are randomly distributed, so there should be roughly as many keys smaller than x as larger than x. Our pseudocode uses this idea. But it's not the best; we'll look at some improvements in the next lecture.

COSC242 Lecture 7 / Slide 4

The Quicksort Algorithm

Suppose that the choice of pivot and the rearranging of entries is done by a special Partition algorithm. (And let's postpone the question of how the Partition algorithm works.) Then to sort the subarray of A[0..(n - 1)] stretching from location low to location high, do this:

Quicksort(A, low, high)
1: if A[low..high] has 2 or more elements then
2:     Partition(A, low, high), returning the index mid
       {everything >= the chosen pivot is to the right of mid}
3:     Quicksort(A, low, mid)
4:     Quicksort(A, mid + 1, high)

To sort A[0..(n - 1)], the initial call is Quicksort(A, 0, n - 1). Quicksort provides the control structure for the sort, but the work of comparing and rearranging the keys is done by the (non-recursive) Partition algorithm.

COSC242 Lecture 7 / Slide 5
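In C, this control structure might look as follows. This is a sketch under the assumption that partition (next slide) returns a Hoare-style index mid; note that the left recursion includes mid itself, exactly as in the pseudocode.

/* Sketch of the Quicksort control structure.
 * Assumes a Hoare-style partition (next slide) returning mid such that
 * a[low..mid] holds keys <= pivot and a[mid+1..high] holds keys >= pivot. */
int partition(int a[], int low, int high);   /* defined on the next slide */

void quicksort(int a[], int low, int high) {
    if (low < high) {                        /* 2 or more elements */
        int mid = partition(a, low, high);
        quicksort(a, low, mid);              /* note: mid, not mid - 1 */
        quicksort(a, mid + 1, high);
    }
}

/* Initial call, as in the slide: quicksort(A, 0, n - 1); */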

The Partition Algorithm

Partition picks the pivot x and then produces an array with two parts, one containing keys <= x, the other keys >= x. For now, we pick the first key of the array to be x, the pivot.

Partition(A, low, high)
 1: x <- A[low]
 2: i <- low - 1
 3: j <- high + 1
 4: loop
 5:     repeat
 6:         j <- j - 1
 7:     until A[j] <= x
 8:     repeat
 9:         i <- i + 1
10:     until A[i] >= x
11:     if i < j then
12:         swap A[i] and A[j]
13:     else
14:         return j

COSC242 Lecture 7 / Slide 6
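A direct C translation of this pseudocode, offered as a sketch; the do/while loops play the role of the repeat/until loops above.

/* Hoare-style partition, translating the pseudocode directly.
 * Uses a[low] as the pivot and returns an index j such that
 * a[low..j] holds keys <= pivot and a[j+1..high] holds keys >= pivot. */
int partition(int a[], int low, int high) {
    int x = a[low];                     /* pivot: first key of the subarray */
    int i = low - 1;
    int j = high + 1;
    for (;;) {
        do { j--; } while (a[j] > x);   /* repeat j <- j - 1 until A[j] <= x */
        do { i++; } while (a[i] < x);   /* repeat i <- i + 1 until A[i] >= x */
        if (i < j) {
            int tmp = a[i];             /* swap A[i] and A[j] */
            a[i] = a[j];
            a[j] = tmp;
        } else {
            return j;
        }
    }
}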

Exercises

Try stepping through the Partition algorithm for each of the following. Check that the result is an index that partitions the array, with all keys smaller than the pivot x to its left and all keys larger than the pivot to its right.

A = {7, 3, 2, 11, 5}
B = {5, 5, 5, 5, 5}
C = {1, 2, 3, 4, 5}
D = {5, 4, 3, 2, 1}

In which cases did the Partition algorithm seem to do a lot of unnecessary work, and what was the reason? Do duplicate keys ever arise in practice? And would you ever want to keep records with duplicate keys in their original relative order for some reason? And does Partition do this?

COSC242 Lecture 7 / Slide 7
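If you want to check your hand traces, here is a small harness (a sketch; it assumes the partition sketch above, and try_partition is a hypothetical helper introduced only for this demonstration).

#include <stdio.h>

int partition(int a[], int low, int high);   /* sketch from the previous slide */

/* Run Partition once on an exercise array and show where it splits,
 * marking the returned index with a '|'. */
static void try_partition(const char *name, int a[], int n) {
    int pivot = a[0];                        /* Partition uses the first key */
    int mid = partition(a, 0, n - 1);
    printf("%s: pivot %d splits at index %d ->", name, pivot, mid);
    for (int k = 0; k < n; k++)
        printf(" %d%s", a[k], k == mid ? " |" : "");
    printf("\n");
}

int main(void) {
    int a[] = {7, 3, 2, 11, 5};
    int b[] = {5, 5, 5, 5, 5};
    int c[] = {1, 2, 3, 4, 5};
    int d[] = {5, 4, 3, 2, 1};
    try_partition("A", a, 5);
    try_partition("B", b, 5);
    try_partition("C", c, 5);
    try_partition("D", d, 5);
    return 0;
}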