Quick Sort. Biostatistics 615 Lecture 9


Previously: Elementary Sorts Selection Sort Insertion Sort Bubble Sort Try to state an advantage for each one


Last Lecture: Shell Sort Gradually bring order to the array by sorting sub-arrays that include every k-th element, for a decreasing sequence of increments where the last k = 1 Produces a series of nearly sorted arrays, where insertion sort is very efficient Theoretically, N (log N)^2 complexity

Pictorial Representation Array gradually gains order Eventually, we approach the ideal case where insertion sort is O(N)

Shell Sort Properties h-sorting a k-sorted array produces an array that is h- and k-sorted If h and k are relatively prime, a[i] must be smaller than any a[i+j] where j > (h-1)(k-1) No more than (h-1)(k-1)/g comparisons to g-sort an array that is h- and k-sorted

Property I Result of h-sorting an array that is k-ordered is an h- and k-ordered array Consider 4 elements: a[i] <= a[i+k] and a[i+h] <= a[i+k+h] After sorting, a[i] contains the minimum and a[i+k+h] contains the maximum of the 4 elements

Property II No more than (h-1)(k-1)/g comparisons to g-sort a file that is h- and k-sorted, for h and k relatively prime Elements further apart than (h-1)(k-1) can be reached by a series of steps of size h or k (i.e. through elements known to be in order)

Property II Consider h and k sorted arrays Say h = 4 and k = 5 Elements that must be in order

Property II Consider h and k sorted arrays Say h = 4 and k = 5 More elements that must be in order

Property II Combining the previous series gives the desired property: elements more than (h-1)(k-1) positions apart must be in order

An optimal series? Considering the two previous properties A series where every sub-array is known to be 2- and 3- ordered could be sorted with a single round of comparisons How many increments must be used for such a sequence?

Optimal Performance? Consider a triangle of increments, where each element is double the number above to the right and three times the number above to the left. Fewer than log2(N) * log3(N) increments are needed:

1
2 3
4 6 9
8 12 18 27
16 24 36 54 81
32 48 72 108 162 243

Today: Quick Sort Most widely used sorting algorithm (possibly excepting the bubble sorts that should be banished!) Extremely efficient: O(N log N) on average A divide-and-conquer algorithm

The Inventor of Quicksort Sir Charles A. R. Hoare 1980 ACM Turing Award British computer scientist Studied statistics as a graduate student Made major contributions to developing computer languages

C. A. R. Hoare Quote "I conclude that there are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies and the other way is to make it so complicated that there are no obvious deficiencies."

Caution! Quicksort is fragile! Small mistakes can be hard to spot Shellsort is more robust (but slower) The worst case running time is O(N^2) Can be avoided in most cases

Divide-And-Conquer Divide a problem into smaller subproblems Find a partitioning element such that: All elements to the right are greater All elements to the left are smaller Sort right and left sub-arrays independently

C Code: QuickSort

void quicksort(item * a, int start, int stop)
{
    int i;
    if (stop <= start) return;
    i = partition(a, start, stop);
    quicksort(a, start, i - 1);
    quicksort(a, i + 1, stop);
}

Quicksort Notes Each round places one element into position The partitioning element Recursive calls handle each half of array What would be: a good partitioning element? a bad partitioning element?

Partitioning Choose an arbitrary element Any suggestions? Place all smaller values to the left Place all larger values to the right

Partitioning [Diagram: smaller elements on the left, larger elements on the right, the partitioning element at the end; the up and down pointers scan toward each other.]

C Code: Partitioning

int partition(item * a, int start, int stop)
{
    int up = start, down = stop - 1;
    item part = a[stop];
    if (stop <= start) return start;
    while (1) {
        while (isless(a[up], part)) up++;
        while (isless(part, a[down]) && (up < down)) down--;
        if (up >= down) break;
        Exchange(a[up], a[down]);
        up++, down--;
    }
    Exchange(a[up], a[stop]);
    return up;
}

Partitioning Notes The check (up < down) is required when the partitioning element is also the smallest element About N + 1 comparisons: N - 1 comparisons of each element vs. the partitioning element, plus extra comparisons when the pointers cross

Quick Sort The array is successively subdivided around partitioning elements Within each section, items are arranged randomly

Complexity of Quick Sort Best case: C_N = 2 C_{N/2} + N, which gives C_N ≈ N log2 N. Random input: C_N = N + 1 + (2/N) * sum_{k=1}^{N} C_{k-1}, which gives C_N ≈ 2N log_e N ≈ 1.4 N log2 N.

Complexity of Quick Sort [Plot: number of comparisons (thousands, 0-90) vs. number of elements (thousands, 0-5), comparing the series Cost, Cost2, and Optimal.]

Improvements to Quicksort Sorting small sub-arrays Quicksort is great for large arrays Inefficient for very small ones Choosing a better partitioning element A poor choice could make sort quadratic!

Small Sub-arrays Most recursive calls are for small subarrays Commonplace for many recursive programs Brute-force algorithms often better for small problems In this case, insertion sort is a good option

Sorting Small Sub-arrays Possibility 1: if (stop - start <= M) insertion(a, start, stop); Possibility 2: if (stop - start <= M) return; then make a single call to insertion() at the end 5 < M < 25 gives ~10% speed increase

Improved Partitioning How do we avoid picking smallest or largest element for partitioning? Could take a random element Could take a sample

Median-of-Three Partitioning Take a sample of three elements Usually the first, last and middle elements Sort these three elements Partition around their median Makes the worst case much less likely to occur

C Code: Median-of-Three Partitioning

void quicksort(item * a, int start, int stop)
{
    int i;

    // Leave small subsets for insertion sort
    if (stop - start <= M) return;

    // Place median of 3 in position stop - 1
    Exchange(a[(start + stop)/2], a[stop - 1]);
    CompExch(a[start], a[stop - 1]);
    CompExch(a[start], a[stop]);
    CompExch(a[stop - 1], a[stop]);

    // The first and last elements are pre-partitioned
    i = partition(a, start + 1, stop - 1);
    quicksort(a, start, i - 1);
    quicksort(a, i + 1, stop);
}

Summary Together, median-of-three partitioning and using insertion sort for small subfiles improve performance about 20% Problem for next lecture: Avoiding very deep recursions

Selection A problem related to sorting Find the k-th smallest element in an array Examples: minimum, maximum, median

Selection (small k) We can solve the problem in O(Nk), which is O(N) for constant k One approach: perform k passes; in pass j, find the j-th smallest element Another approach: maintain a small array with the k smallest elements seen so far

Selection for large k One option is to sort the array, but we only need to bring the k-th element into position Focus on one side of the current partition

C Code: Selection

// Places the k-th smallest element in position k
void select(item * a, int start, int stop, int k)
{
    int i;
    if (stop <= start) return;
    i = partition(a, start, stop);
    if (i > k) select(a, start, i - 1, k);
    if (i < k) select(a, i + 1, stop, k);
}

C Code: Without Recursion

// Places the k-th smallest element in position k within the array
void select(item * a, int start, int stop, int k)
{
    int i;
    while (start < stop) {
        i = partition(a, start, stop);
        if (i >= k) stop = i - 1;
        if (i <= k) start = i + 1;
    }
}

Selection The quicksort-based method is O(N) Rough argument: the first pass touches N elements, the second about N/2, the third about N/4, and so on, so the total N + N/2 + N/4 + ... is less than 2N Common application: finding the k smallest values in a simulation to save for further analyses

Recommended Reading Sedgewick, Chapter 7 The original description by Hoare (1962) in Computer Journal 5:10-15.