CS125 : Introduction to Computer Science. Lecture Notes #40 : Advanced Sorting Analysis. © 2005, 2004 Jason Zych


Lecture 40 : Advanced Sorting Analysis

Analyzing MergeSort

Mergesort can be described in this manner:

   time to sort n values = time to sort left half +
                           time to sort right half +
                           time to merge sorted halves

which is commonly written in the form of a recurrence relation, i.e. a recursive mathematical formula. If T(n) is the time to run MergeSort on n values, then:

   T(n) = T(n/2) + T(n/2) + time to merge
   T(1) = 1                               // base case is constant time

or, after doing some algebra:

   T(n) = 2 * T(n/2) + time to merge
   T(1) = 1

You'll learn about recurrence relations in a different course, and learn how to solve them. For the moment, they simply make for a convenient notation for the mergesort running time, before we get our final result.

Now, what is the running time for merge? Well, if we have n total values among the two sorted collections we are merging, and we spend constant time on each step to write one of those values into the temporary array, then we have linear time total. And then, when we write the temporary array back into the original array, that likewise takes constant time per value written, so that step is linear time as well. So, overall, for a given step of MergeSort, the merge work is linear; each value is written to the temporary array once and then written back once.

So, the recurrence relation looks like this:

   T(n) = 2 * T(n/2) + linear
   T(1) = 1

That describes mergesort: two recursive calls on an array half the size, plus linear work to merge the sorted halves. Now, you don't have the mathematical tools to solve that yet; eventually, you'll learn all about solving recurrence relations. But in this case, we can still analyze Mergesort; we just need a different technique than solving the recurrence relation. Each step breaks the array in half, and then we sort each of those halves before trying to merge the entire array. So if we start out with n elements, consider the following diagram, where the number we list (n, n/2, etc.) is the number of cells in the array at that level of recursion:

                                  n
                               /     \
                      left side       right side
                         n/2              n/2
                       /     \          /     \
                  left of   right of  left of   right of
                   left       left     right      right
                   n/4        n/4       n/4        n/4
                  /   \      /   \     /   \      /   \
                n/8   n/8  n/8   n/8 n/8   n/8  n/8   n/8

                             ... etc.

Notice that each level adds up to n. The first level indicates the Merge of the entire array that we do at the end. On the second level, we have two recursive calls (the two we made from our first call to Mergesort). Each one eventually runs Merge on an array of size n/2, but we double that time because we have two such recursive calls. So that level, too, adds up to n. The next level involves the third level of recursion: the calls we make, from the calls we made, from our first call. Since we had already recursively called on half the array on the level above, when we made two calls on half of THAT array, we end up with recursive calls on arrays one fourth the size of the original. So the Merge for such a call takes time n/4, and we have four such calls, so again, the total time for all the 3-levels-deep recursive calls is n. And so on. Each row will total to n, in terms of time. And we have a logarithmic number of rows, since we keep cutting the array size in half as we move from one level to the level below it.

So if each row has time n, and there are a logarithmic number of rows, then the running time for Mergesort has an order of growth equal to:

   linear times logarithmic
     OR
   n * log-base-2 of n
     OR
   n * lg n            // lg is a shorthand for log-base-2

but "linear times logarithmic" is good enough for our purposes here. It means the algorithm takes longer than linear time (which makes sense, since the last merge *alone* is linear time) but not quite as long as quadratic time. (Just as constant is less than logarithmic is less than linear, likewise n * constant is less than n * lg n is less than n * n; i.e. linear is less than n * lg n is less than quadratic.)

This turns out to be exactly the result we'd get if we mathematically solved the recurrence relation we listed earlier. Our diagram above was just an alternate way of arriving at the same result. And note that Mergesort does not take any advantage of the array being already sorted; the same work gets done regardless. So it's n * lg n for any case: worst, average, or even best.
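As a quick sanity check (this unrolling is an addition to these notes, assuming the merge work on n values is c * n for some constant c), we can expand the recurrence a few times and watch the same answer appear:

   T(n) = 2 * T(n/2) + c*n
        = 4 * T(n/4) + c*n + c*n
        = 8 * T(n/8) + c*n + c*n + c*n
          ...
        = n * T(1) + c*n * lg n        // after lg n halvings

Each halving contributes another c*n of merge work, and there are lg n halvings before we hit the base case, so the total is linear times logarithmic, matching the diagram.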
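For concreteness, here is a minimal Java sketch of the algorithm just analyzed. This code is an illustration added to these notes, not code from the course; the class and method names are my own, but the shape (two recursive calls on half the array, plus a linear merge through a temporary array) is exactly the one the recurrence describes.

   // Mergesort sketch: sorts a[lo..hi] (inclusive), using temp as scratch space.
   public class MergeSortSketch {

       static void mergeSort(int[] a, int[] temp, int lo, int hi) {
           if (lo >= hi) return;              // base case: 0 or 1 cells, constant time
           int mid = (lo + hi) / 2;
           mergeSort(a, temp, lo, mid);       // T(n/2): sort left half
           mergeSort(a, temp, mid + 1, hi);   // T(n/2): sort right half
           merge(a, temp, lo, mid, hi);       // linear: merge the sorted halves
       }

       // Merge the sorted halves a[lo..mid] and a[mid+1..hi].
       static void merge(int[] a, int[] temp, int lo, int mid, int hi) {
           int i = lo, j = mid + 1;
           // One value written to temp per pass: linear work total.
           for (int k = lo; k <= hi; k++) {
               if      (i > mid)        temp[k] = a[j++];   // left half exhausted
               else if (j > hi)         temp[k] = a[i++];   // right half exhausted
               else if (a[i] <= a[j])   temp[k] = a[i++];   // next value from left
               else                     temp[k] = a[j++];   // next value from right
           }
           // Copy back: again one value per step, so linear work.
           for (int k = lo; k <= hi; k++) a[k] = temp[k];
       }

       public static void main(String[] args) {
           int[] a = {5, 2, 9, 1, 7, 3};
           mergeSort(a, new int[a.length], 0, a.length - 1);
           System.out.println(java.util.Arrays.toString(a));   // [1, 2, 3, 5, 7, 9]
       }
   }

Note that the temp array here is the "linear extra memory" charged against Mergesort in the summary below.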

Analyzing Quicksort

Partition is linear: each step takes constant time to place one value beyond the left wall or right wall, so partitioning an array of n values takes time that is linear in n.

So, in the best case, we get the best possible pivot for quicksort, and the two halves of the array are equal in size. In that case, we have the same recurrence relation as for Mergesort:

   T(n) = linear + 2 * T(n/2)
   T(1) = 1

So quicksort in the best case is n * lg n. And on average, we expect quicksort to get a good pivot most of the time, so we can expect the average case to be similar to the best case, and thus to *also* be n * lg n. (That's not a proof, of course; the proof that the average case is basically equal to the best case is beyond this course. I just want to give you a gut feeling that it's true, so that you remember it better.)

The worst case for quicksort is when the pivot is always as low as possible. In that case, everything, or everything but one value, in the partition ends up on the right side of the pivot. So each recursive call to sort the right side is sorting only one or two fewer cells than the previous call...this is basically the same as SelectionSort at that point, and is thus quadratic.
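Here is a matching Java sketch, again an illustration added to these notes rather than course code. The notes describe a partition with two walls; this version uses a simpler single-wall scheme (often called Lomuto partitioning, with the last cell as the pivot), but it has the same linear cost, since each pass of the loop does constant work on one value.

   // Quicksort sketch: partition around a pivot, then recurse on each side.
   public class QuickSortSketch {

       static void quickSort(int[] a, int lo, int hi) {
           if (lo >= hi) return;              // 0 or 1 cells: already sorted
           int p = partition(a, lo, hi);      // linear work
           quickSort(a, lo, p - 1);           // sort values below the pivot
           quickSort(a, p + 1, hi);           // sort values above the pivot
       }

       // Partition a[lo..hi] around the pivot a[hi]; values left of the
       // wall are smaller than the pivot. Returns the pivot's final index.
       static int partition(int[] a, int lo, int hi) {
           int pivot = a[hi];
           int wall = lo;
           for (int i = lo; i < hi; i++) {
               if (a[i] < pivot) {            // constant work per value
                   int t = a[i]; a[i] = a[wall]; a[wall] = t;
                   wall++;
               }
           }
           int t = a[wall]; a[wall] = a[hi]; a[hi] = t;   // pivot to the wall
           return wall;
       }

       public static void main(String[] args) {
           int[] a = {5, 2, 9, 1, 7, 3};
           quickSort(a, 0, a.length - 1);
           System.out.println(java.util.Arrays.toString(a));   // [1, 2, 3, 5, 7, 9]
       }
   }

With this pivot choice, the bad case shows up on already-sorted input: every partition puts all remaining values on one side of the pivot, giving the quadratic behavior described above.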

Summary:

Quicksort
   best case:    n * lg n
   average case: n * lg n
   worst case:   quadratic
   advantage:    on average, the fastest known comparison-based sorting algorithm; i.e. the only faster algorithms we know of are designed for specific kinds of data, rather than working on anything we can compare. Both quicksort and mergesort have the same order of growth, but in terms of the constant factors on that n * lg n term, quicksort's constants are lower.
   disadvantage: the quadratic worst case. It's very unlikely to occur, but it *can* happen, making quicksort not the best choice if you *need* to be at n * lg n time.

Mergesort
   best case:    n * lg n
   average case: n * lg n
   worst case:   n * lg n
   advantage:    n * lg n in all cases, even the worst case
   disadvantage: uses linear extra memory (the temporary array needed by merge)...this is especially a problem if you have a VERY large array to sort, since you might not even *have* enough memory left over to create another array that size. Also, not quite as fast as quicksort's average case, when you consider the constant factors (since they both have n * lg n order of growth).

InsertionSort
   best case:    linear
   average case: quadratic, though about half the time of the worst case
   worst case:   quadratic
   advantage:    works well for partially sorted or completely sorted arrays; also good for small arrays, since quicksort and mergesort tend to be overkill for such arrays
   disadvantage: quadratic for any randomly arranged array, i.e. it's not better than quadratic unless the array *is* already sorted to a large degree

SelectionSort
   best case:    quadratic
   average case: quadratic
   worst case:   quadratic
   advantage:    fewest number of swaps...in situations where swapping our data can take a very long time because there is a very large amount of it to move back and forth (sorting integers is not one of them), swapping might have such a LARGE constant on it that it overshadows everything else. In that case, we might want selection sort.
   disadvantage: quadratic in all cases; can't even be improved upon if the array is partially sorted. Works okay for small arrays, but insertionsort works better for those arrays...so SelectionSort is generally not useful. It's only useful in the one case listed as an advantage above.
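To see why InsertionSort's best case is linear (a claim in the summary above), here is one more minimal Java sketch, again my own illustration rather than course code: on an already-sorted array, the inner loop's condition fails immediately, so each of the n outer steps does only constant work.

   // Insertion sort sketch (illustrative, not from the course notes).
   static void insertionSort(int[] a) {
       for (int i = 1; i < a.length; i++) {
           int item = a[i];
           int j = i;
           // Shift larger values one cell to the right. If a[0..i-1] is
           // already sorted and a[i-1] <= item, this loop exits at once.
           while (j > 0 && a[j - 1] > item) {
               a[j] = a[j - 1];
               j--;
           }
           a[j] = item;                       // drop item into its slot
       }
   }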