Lecture 2: Divide & Conquer Paradigm, Merge Sort and Quicksort



Outline
1. Divide and Conquer
2. Merge Sort
3. Quick Sort

In-Class Quizzes URL: http://m.socrative.com/ Room Name: 4f2bb99e

Divide and Conquer Paradigm
D&C is a popular algorithmic technique with lots of applications. It consists of three steps (a minimal code sketch follows below):
1. Divide the problem into a number of sub-problems
2. Conquer the sub-problems by solving them recursively
3. Combine the solutions to the sub-problems into a solution for the original problem
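To make the three steps concrete, here is a minimal, hypothetical Python sketch of the D&C skeleton, specialized to a toy problem (summing a list); the function name and the example are mine, not from the slides:

    def divide_and_conquer(problem):
        # Base case: a problem small enough to solve directly
        if len(problem) <= 1:
            return problem[0] if problem else 0
        mid = len(problem) // 2
        left, right = problem[:mid], problem[mid:]    # 1. Divide
        left_solution = divide_and_conquer(left)      # 2. Conquer (recursively)
        right_solution = divide_and_conquer(right)
        return left_solution + right_solution         # 3. Combine

    print(divide_and_conquer([3, 1, 4, 1, 5]))  # prints 14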

Divide and Conquer Paradigm
When can you use it?
- The sub-problems are easier to solve than the original problem
- The number of sub-problems is small
- The solution to the original problem can be obtained easily once the sub-problems are solved

Recursion and Recurrences
Typically, D&C algorithms use recursion, as it makes coding simpler. Non-recursive variants can be designed, but are often slower.
If all sub-problems are of equal size, the running time can be analyzed via the recurrence T(n) = a T(n/b) + D(n) + C(n), where
- a: the number of sub-problems to solve
- b: the factor by which the problem size shrinks
- D(n): the time complexity of the divide step
- C(n): the time complexity of the combine step

D&C Approach to Sorting
How to use D&C in sorting?
- Partition the array into sub-groups
- Sort each sub-group recursively
- Combine the sorted sub-groups if needed

Why Study Merge Sort?
- One of the simplest and most efficient sorting algorithms
- Time complexity is Θ(n log n) (a vast improvement over Bubble, Selection and Insertion sorts)
- A transparent application of the D&C paradigm
- A good showcase for time complexity analysis

Merge Sort: High-Level Idea
- Divide the array into two equal partitions, L and R; if n is not even, L has ⌈n/2⌉ elements and R has ⌊n/2⌋
- Sort the left partition L recursively
- Sort the right partition R recursively
- Merge the two sorted partitions into the output array

Merge Sort Pseudocode:
MergeSort(A, p, r):
    if p < r:
        q = ⌊(p+r)/2⌋
        MergeSort(A, p, q)
        MergeSort(A, q+1, r)
        Merge(A, p, q, r)

Merge Sort - Divide (figure from http://web.stanford.edu/class/cs161/slides/0623_mergesort.pdf)

Merge Sort - Combine (figure from http://web.stanford.edu/class/cs161/slides/0623_mergesort.pdf)

Merging Two Sorted Lists (figures from http://web.stanford.edu/class/cs161/slides/0623_mergesort.pdf)


Merging Two Sorted Lists
Merge pseudocode (merge sorted lists A and B, of total length n, into C):
Merge(A, B, C):
    i = j = 1
    for k = 1 to n:
        if j > length(B) or (i ≤ length(A) and A[i] ≤ B[j]):
            C[k] = A[i]
            i = i + 1
        else:
            C[k] = B[j]
            j = j + 1
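Here is a runnable Python sketch of the MergeSort and Merge pseudocode above (0-based indexing, returning a new list rather than sorting in place; variable names are mine):

    def merge(left, right):
        # Merge two already-sorted lists into one sorted list
        out, i, j = [], 0, 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                out.append(left[i])
                i += 1
            else:
                out.append(right[j])
                j += 1
        out.extend(left[i:])     # at most one of these two is non-empty
        out.extend(right[j:])
        return out

    def merge_sort(a):
        if len(a) <= 1:          # base case: a list of length 0 or 1 is sorted
            return a
        mid = len(a) // 2
        return merge(merge_sort(a[:mid]), merge_sort(a[mid:]))

    print(merge_sort([5, 2, 4, 7, 1, 3, 2, 6]))  # [1, 2, 2, 3, 4, 5, 6, 7]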

Analyzing Merge Sort: Master Method
Quiz! The general recurrence for D&C is T(n) = a T(n/b) + D(n) + C(n).
What is a? What is b? What is D(n)? What is C(n)?

Analyzing Merge Sort: Master Method
For Merge Sort: a = 2, b = 2, D(n) = O(1), C(n) = O(n).
Combining, we get T(n) = 2T(n/2) + O(n).
Using the Master method, we get T(n) = O(n log n).
(If you are picky: T(n) = T(⌈n/2⌉) + T(⌊n/2⌋) + O(n).)
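Spelling out the Master method step (a standard case check, not shown on the slide):

    T(n) = 2T(n/2) + O(n), so a = 2, b = 2, f(n) = Θ(n)
    n^(log_b a) = n^(log_2 2) = n, and f(n) = Θ(n^(log_b a))
    This is case 2 of the Master theorem, so T(n) = Θ(n^(log_b a) · log n) = Θ(n log n)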

Analyzing Merge Sort: Recursion Tree (figures from the CLRS book)

Merge Sort vs. Insertion Sort
- Merge Sort (O(n log n)) is in general much faster than Insertion Sort (O(n^2))
- For nearly sorted arrays, Insertion Sort is faster
- Merge Sort is Θ(n log n) in both the best and the worst case
- Overhead: recursive calls and extra space for copying
- Insertion Sort is in-place and adaptive
- Merge Sort is easily parallelizable

Quicksort
Quicksort is a very popular and elegant sorting algorithm, invented by Tony Hoare. Hoare also invented the null reference (he later called it his "billion-dollar mistake"; why?).
"There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies, and the other way is to make it so complicated that there are no obvious deficiencies. The first method is far more difficult."

Quicksort
- The fastest of the fast sorting algorithms, with lots (and lots) of ways to tune it
- The default sorting algorithm in most languages
- A simple but innovative use of D&C
- On average it takes Θ(n log n), but in the worst case it might require O(n^2); this occurs rarely in practice if coded properly

Quicksort
Quicksort is a D&C algorithm, but it uses a different style than Merge Sort: it does more work in the Divide phase and almost no work in the Combine phase. It is one of the very few algorithms with this property.

Partitioning Choices
Sorting by D&C: divide into two sub-arrays, sort each, and merge. Different partitioning ideas lead to different sorting algorithms.
Given an array A with n elements, how can we split it into two sub-arrays L and R?
- L has the first n-1 elements and R has the last element
- R has the largest element of A and L has the rest of the n-1 elements
- L has the first n/2 elements and R has the rest
- Choose a pivot p; L has the elements less than p and R has the elements greater than p
(From http://www.cs.bu.edu/fac/gkollios/cs113/slides/quicksort.ppt)

Partitioning Choices
D&C is not a silver bullet! Different partitioning ideas lead to different sorting algorithms:
- Insertion Sort, O(n^2): L has the first n-1 elements and R has the last element
- Bubble Sort, O(n^2): R has the largest element of A and L has the rest of the n-1 elements
- Merge Sort, O(n log n): L has the first n/2 elements and R has the rest
- Quick Sort, O(n log n) average case: choose a pivot p; L has the elements less than p and R has the elements greater than p

Quicksort Pseudocode:
QuickSort(A, p, r):
    if p < r:
        q = Partition(A, p, r)
        QuickSort(A, p, q-1)
        QuickSort(A, q+1, r)


Quicksort Design Objectives
- Choose a good pivot
- Do the partitioning efficiently and in-place

Partition Subroutine
Given a pivot, partition A into two sub-arrays L and R:
- All elements less than the pivot are in L
- All elements greater than the pivot are in R
- Return the new index of the pivot after the rearrangement
Note: the elements within L and R need not be sorted during partition (just ≤ pivot and ≥ pivot, respectively).


Partition Subroutine
Note: this is the CLRS version of the Partition subroutine, which uses the last element as the pivot. To use it with other pivot-picking strategies, first swap the chosen pivot element with the last element.
Partition(A, p, r):
    x = A[r]    // x is the pivot
    i = p - 1
    for j = p to r-1:
        if A[j] <= x:
            i = i + 1
            exchange A[i] with A[j]
    exchange A[i+1] with A[r]
    return i+1
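A runnable Python sketch of the QuickSort and Partition pseudocode above (0-based indexing, last element as pivot; function and variable names are mine):

    def partition(a, p, r):
        x = a[r]                          # pivot: last element of a[p..r]
        i = p - 1
        for j in range(p, r):
            if a[j] <= x:                 # grow the region of elements <= pivot
                i += 1
                a[i], a[j] = a[j], a[i]
        a[i + 1], a[r] = a[r], a[i + 1]   # place the pivot between the two regions
        return i + 1                      # final index of the pivot

    def quicksort(a, p=0, r=None):
        if r is None:
            r = len(a) - 1
        if p < r:
            q = partition(a, p, r)
            quicksort(a, p, q - 1)
            quicksort(a, q + 1, r)

    data = [2, 8, 7, 1, 3, 5, 6, 4]
    quicksort(data)                       # sorts in place
    print(data)                           # [1, 2, 3, 4, 5, 6, 7, 8]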

Partition Example 1 (figure from https://www.cs.rochester.edu/~gildea/csc282/slides/c07-quicksort.pdf)

Partition Example 2 (figures from https://www.cs.rochester.edu/~gildea/csc282/slides/c07-quicksort.pdf)

Partition Subroutine
Time complexity: O(n) - why?
Correctness: why does it work?

Quicksort: Best Case Scenario

Quicksort: Worst Case Scenario

Quicksort: Analysis
Recurrence relations for Quicksort:
- Best case: T(n) = 2T(n/2) + n. Same as Merge Sort: Θ(n log n).
- Worst case: T(n) = T(n-1) + n. Same as Bubble Sort: O(n^2).
- Average case: T(n) = (1/n) * sum_{p=1}^{n} [T(p-1) + T(n-p)] + n. The analysis is very tricky, but yields O(n log n).
Intuition: even an uneven split is okay, as long as it is between 25:75 and 75:25. Looking at all possible arrays of size n, we expect (on average) such a split to happen about half the time.
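One way to make the 25:75 intuition precise (a standard argument, not worked out on this slide): if every split is at least as balanced as 25:75, the running time satisfies

    T(n) ≤ T(n/4) + T(3n/4) + cn

for some constant c. Each level of the recursion tree does at most cn total work, and the tree has depth at most log_{4/3} n = O(log n), so T(n) = O(n log n).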

Quicksort: Average Case Scenario

Strategies to Pick a Pivot
- Pick the first, middle or last element as the pivot
- Pick the median-of-3 as the pivot, i.e., the median of the first, middle and last elements (see the sketch below)
- Bad news: all of these strategies work well in practice, but still have a worst-case time complexity of O(n^2)
- Pick the true median as the pivot
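A small Python sketch of the median-of-3 idea (the helper name is mine, not from the slides): order the first, middle and last elements so that their median ends up in the last position, after which the CLRS-style Partition above can be used unchanged.

    def median_of_three_to_end(a, p, r):
        m = (p + r) // 2
        # Order a[p], a[m], a[r] so that a[m] holds the median of the three
        if a[m] < a[p]:
            a[p], a[m] = a[m], a[p]
        if a[r] < a[p]:
            a[p], a[r] = a[r], a[p]
        if a[r] < a[m]:
            a[m], a[r] = a[r], a[m]
        a[m], a[r] = a[r], a[m]   # move the median into the pivot slot (index r)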

Randomized Quicksort
Randomized-Partition(A, p, r):
    i = Random(p, r)
    exchange A[r] with A[i]
    return Partition(A, p, r)

Randomized-QuickSort(A, p, r):
    if p < r:
        q = Randomized-Partition(A, p, r)
        Randomized-QuickSort(A, p, q-1)
        Randomized-QuickSort(A, q+1, r)
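A runnable Python sketch of the randomized version; it repeats the same Lomuto-style partition as the earlier Quicksort sketch so the block is self-contained (naming is mine):

    import random

    def partition(a, p, r):               # same partition as in the Quicksort sketch
        x = a[r]
        i = p - 1
        for j in range(p, r):
            if a[j] <= x:
                i += 1
                a[i], a[j] = a[j], a[i]
        a[i + 1], a[r] = a[r], a[i + 1]
        return i + 1

    def randomized_partition(a, p, r):
        i = random.randint(p, r)          # pivot position chosen uniformly at random
        a[r], a[i] = a[i], a[r]           # move it to the end, then partition as usual
        return partition(a, p, r)

    def randomized_quicksort(a, p=0, r=None):
        if r is None:
            r = len(a) - 1
        if p < r:
            q = randomized_partition(a, p, r)
            randomized_quicksort(a, p, q - 1)
            randomized_quicksort(a, q + 1, r)

    data = [13, 19, 9, 5, 12, 8, 7, 4, 21, 2, 6, 11]
    randomized_quicksort(data)
    print(data)                           # [2, 4, 5, 6, 7, 8, 9, 11, 12, 13, 19, 21]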

Randomized Quicksort: Adversarial Analysis
- For every deterministic pivot-picking strategy, it is easy to construct a worst-case input
- Doing so is much harder against a randomized strategy
- Idea: pick the pivot randomly, or shuffle the data and then use a deterministic strategy
- Expected time complexity is O(n log n)

Sorting in the Real World
- Sorting is a fundamental problem with intense ongoing research
- There is no single best algorithm: Merge, Quick, Heap and Insertion sort all excel in some scenarios
- Most programming languages implement sorting via a tuned Quicksort (e.g., Java 6 and below) or a combination of Merge Sort and Insertion Sort (Python, Java 7, Perl, etc.)

Summary
Major concepts:
- D&C: Divide, Conquer and Combine
- Merge Sort and Quicksort
- Randomization as a strategy