CSE 373: Data Structures and Algorithms

CSE 373: Data Structures and Algorithms Lecture 20: More Sorting Instructor: Lilian de Greef Quarter: Summer 2017

Today: More sorting algorithms!
- Merge sort analysis
- Quicksort
- Bucket sort
- Radix sort

Divide and conquer
A very important technique in algorithm design:
1. Divide the problem into smaller parts
2. Independently solve the simpler parts (think recursion, or parallelism)
3. Combine the solutions of the parts to produce the overall solution
Two great sorting methods are fundamentally divide-and-conquer: merge sort and quicksort.

Merge Sort
Merge sort: repeatedly
- Sort the left half of the elements
- Sort the right half of the elements
- Merge the two sorted halves into a sorted whole
To sort the array from position lo to position hi:
- If the range is 1 element long, it is already sorted!
- Else:
  - Sort from lo to (hi+lo)/2
  - Sort from (hi+lo)/2 to hi
  - Merge the two halves together
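
The steps above can be sketched in Java roughly as follows. This is a minimal illustration, not the lecture's own code: the class name MergeSortSketch, the helper merge, and the choice to copy both halves into temporary arrays are assumptions made for the example. The range is half-open, so hi is one past the last element.

```java
import java.util.Arrays;

class MergeSortSketch {
    // Sorts arr[lo..hi): lo is inclusive, hi is exclusive.
    static void mergeSort(int[] arr, int lo, int hi) {
        if (hi - lo <= 1) return;        // 0 or 1 element: already sorted
        int mid = lo + (hi - lo) / 2;    // midpoint, written to avoid int overflow
        mergeSort(arr, lo, mid);         // sort the left half
        mergeSort(arr, mid, hi);         // sort the right half
        merge(arr, lo, mid, hi);         // merge the two sorted halves
    }

    // Merges the sorted ranges arr[lo..mid) and arr[mid..hi) in linear time.
    static void merge(int[] arr, int lo, int mid, int hi) {
        int[] left = Arrays.copyOfRange(arr, lo, mid);
        int[] right = Arrays.copyOfRange(arr, mid, hi);
        int i = 0, j = 0, k = lo;
        while (i < left.length && j < right.length) {
            // "<=" takes from the left half on ties, which keeps the sort stable
            arr[k++] = (left[i] <= right[j]) ? left[i++] : right[j++];
        }
        while (i < left.length) arr[k++] = left[i++];
        while (j < right.length) arr[k++] = right[j++];
    }
}
```

Calling MergeSortSketch.mergeSort(a, 0, a.length) sorts the whole array a.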

Linked lists and big data
We defined sorting over an array, but sometimes you want to sort linked lists.
One approach:
- Convert to array: O(n)
- Sort: O(n log n)
- Convert back to list: O(n)
Merge sort works very nicely on linked lists directly
- Heapsort and quicksort do not
- Insertion sort and selection sort do, but they're slower
Merge sort is also the sort of choice for external sorting
- Linear merges minimize disk accesses
- And can leverage multiple disks to get streaming accesses

Analysis
Having defined an algorithm and argued it is correct, we should analyze its running time and space.
To sort n elements, we:
- Return immediately if n = 1
- Else do 2 subproblems of size n/2 and then an O(n) merge
Recurrence relation:
  T(1) = c1
  T(n) = 2T(n/2) + c2 n

Analysis intuitively
This recurrence is common; you just know it's O(n log n).
Merge sort is relatively easy to intuit (best, worst, and average):
- The recursion tree will have height log n
- At each level we do a total amount of merging equal to n

Analysis more formally
(One of the recurrence classics)
For simplicity, ignore constants (let the constants be 1):
  T(1) = 1
  T(n) = 2T(n/2) + n
       = 2(2T(n/4) + n/2) + n = 4T(n/4) + 2n
       = 4(2T(n/8) + n/4) + 2n = 8T(n/8) + 3n
       ...
       = 2^k T(n/2^k) + kn
We continue to recurse until we reach the base case T(1). For T(1), n/2^k = 1, i.e., k = log n.
So the total amount of work is
  2^k T(n/2^k) + kn = 2^(log n) T(1) + n log n = n + n log n = O(n log n)
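
As a sanity check on the derivation, the recurrence can be evaluated directly and compared with the closed form n + n log n. The sketch below assumes n is a power of two and uses the simplified constants T(1) = 1; the class and method names are made up for illustration.

```java
class MergeSortRecurrence {
    // The recurrence itself: T(1) = 1, T(n) = 2T(n/2) + n (n a power of two).
    static long t(long n) {
        return (n == 1) ? 1 : 2 * t(n / 2) + n;
    }

    // The closed form n + n*log2(n), valid when n is a power of two.
    static long closedForm(long n) {
        long log2 = 63 - Long.numberOfLeadingZeros(n);  // exact log2 for powers of two
        return n + n * log2;
    }
}
```

For example, t(8) unrolls to 2(2(2·1 + 2) + 4) + 8 = 32, and closedForm(8) is 8 + 8·3 = 32.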

Divide-and-Conquer Sorting
Two great sorting methods are fundamentally divide-and-conquer:
1. Merge sort:
   - Sort the left half of the elements (recursively)
   - Sort the right half of the elements (recursively)
   - Merge the two sorted halves into a sorted whole
2. Quicksort:
   - Pick a pivot element
   - Divide the elements into less-than-pivot and greater-than-pivot
   - Sort the two divisions (recursively on each)
   - The answer is sorted-less-than, followed by the pivot, followed by sorted-greater-than

Quicksort Overview
1. Pick a pivot element
2. Partition all the data into:
   A. The elements less than the pivot
   B. The pivot
   C. The elements greater than the pivot
3. Recursively sort A and C
4. The final answer is A-B-C
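
The overview maps almost line for line onto a version that builds new lists instead of partitioning in place. This is only a sketch for intuition: picking the first element as the pivot is an assumption, and elements equal to the pivot are kept together with it in the middle group so that duplicates are handled.

```java
import java.util.ArrayList;
import java.util.List;

class QuicksortOverviewSketch {
    // Returns a new sorted list; the input list is not modified.
    static List<Integer> quicksort(List<Integer> data) {
        if (data.size() <= 1) return new ArrayList<>(data);
        int pivot = data.get(0);                  // assumption: first element as pivot
        List<Integer> less = new ArrayList<>();     // A: elements less than the pivot
        List<Integer> equal = new ArrayList<>();    // B: the pivot (and any duplicates of it)
        List<Integer> greater = new ArrayList<>();  // C: elements greater than the pivot
        for (int x : data) {
            if (x < pivot) less.add(x);
            else if (x > pivot) greater.add(x);
            else equal.add(x);
        }
        List<Integer> result = quicksort(less);     // sorted A
        result.addAll(equal);                       // then B
        result.addAll(quicksort(greater));          // then sorted C
        return result;
    }
}
```

This version uses extra space proportional to the input at every level of recursion, which is why in-place partitioning is preferred in practice.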

Real-world example demo time!

Think in Terms of Sets
S = {13, 81, 92, 43, 65, 31, 57, 26, 75, 0}
Select pivot value: 65
Partition S: S1 = {0, 13, 26, 43, 31, 57}, pivot 65, S2 = {75, 92, 81}
Quicksort(S1) and Quicksort(S2): S1 = {0, 13, 26, 31, 43, 57}, S2 = {75, 81, 92}
S = 0 13 26 31 43 57 65 75 81 92
Presto! S is sorted [Weiss]

Example, Showing Recursion
Input:      8 2 9 4 5 3 1 6
Divide:     2 4 3 1 | 5 | 8 9 6
Divide:     2 1 | 3 | 4        6 | 8 | 9
Divide:     1 | 2              (1 element)
Conquer:    1 2
Conquer:    1 2 3 4            6 8 9
Conquer:    1 2 3 4 5 6 8 9

Details
Have not yet explained:
- How to pick the pivot element
  - Any choice is correct: the data will end up sorted
  - But as the analysis will show, we want the two partitions to be about equal in size
- How to implement partitioning
  - In linear time
  - In place

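Pivots
Best pivot? The median
- Halves the problem each time
- Example: 8 2 9 4 5 3 1 6 with pivot 5 gives 2 4 3 1 | 5 | 8 9 6
Worst pivot? The greatest/least element
- Leaves a partition of size n - 1
- Example: 8 2 9 4 5 3 1 6 with pivot 1 gives 1 | 8 2 9 4 5 3 6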

Potential pivot rules
While sorting arr from lo to hi-1:
- Pick arr[lo] or arr[hi-1]
  - Fast, but the worst case occurs with mostly sorted input
- Pick a random element in the range
  - Does as well as any technique, but (pseudo)random number generation can be slow
  - Still probably the most elegant approach
- Median of 3, e.g., arr[lo], arr[hi-1], arr[(hi+lo)/2]
  - Common heuristic that tends to work well
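
A median-of-3 rule could look like the sketch below. The method name medianOfThreeIndex and the decision to return an index (so the caller can swap the pivot into place) are choices made for this example, not something the slides specify.

```java
class MedianOfThreeSketch {
    // Returns the index of the median of arr[lo], arr[mid], arr[hi-1],
    // for the half-open range arr[lo..hi) with at least one element.
    static int medianOfThreeIndex(int[] arr, int lo, int hi) {
        int mid = lo + (hi - lo) / 2;   // same position as (hi+lo)/2, without overflow
        int last = hi - 1;
        int a = arr[lo], b = arr[mid], c = arr[last];
        if ((a <= b && b <= c) || (c <= b && b <= a)) return mid;   // b is in the middle
        if ((b <= a && a <= c) || (c <= a && a <= b)) return lo;    // a is in the middle
        return last;                                                // otherwise c is
    }
}
```

With the array 8 1 4 9 0 3 5 2 7 6, lo = 0 and hi = 10, the candidates are 8, 3 and 6, so the method returns index 9 (value 6).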

Partitioning
Conceptually simple, but the hardest part to code up correctly
- After picking the pivot, need to partition in linear time, in place
One approach (there are slightly fancier ones):
1. Swap pivot with arr[lo]
2. Use two fingers i and j, starting at lo+1 and hi-1
3. while (i < j)
     if (arr[j] > pivot) j--
     else if (arr[i] < pivot) i++
     else swap arr[i] with arr[j]
4. Swap pivot with arr[i] *
* Skip step 4 if the pivot ends up being the least element
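
One way to turn the two-finger idea into running code is sketched below. It is a variant, not a transcription of the steps above: it assumes the pivot has already been swapped into arr[lo], treats elements equal to the pivot as belonging to the left side so duplicates cannot stall the loop, and advances both fingers after each swap. The class name and the returned pivot index are assumptions for the example.

```java
class PartitionSketch {
    // Partitions arr[lo..hi) around the pivot stored at arr[lo].
    // Returns the pivot's final index p: arr[lo..p) <= pivot < arr[p+1..hi).
    static int partition(int[] arr, int lo, int hi) {
        int pivot = arr[lo];
        int i = lo + 1;      // left finger: everything before i is <= pivot
        int j = hi - 1;      // right finger: everything after j is > pivot
        while (i <= j) {
            if (arr[j] > pivot) {
                j--;                     // arr[j] is already on the correct side
            } else if (arr[i] <= pivot) {
                i++;                     // arr[i] is already on the correct side
            } else {
                swap(arr, i, j);         // both are on the wrong side: exchange them
                i++;
                j--;
            }
        }
        swap(arr, lo, j);    // j now marks the last element <= pivot
        return j;
    }

    static void swap(int[] arr, int x, int y) {
        int tmp = arr[x];
        arr[x] = arr[y];
        arr[y] = tmp;
    }
}
```

On 6 1 4 9 0 3 5 2 7 8 with lo = 0 and hi = 10, this performs one swap (9 with 2), then moves the pivot, giving 5 1 4 2 0 3 6 9 7 8 and returning index 6. If the pivot is the least element, j ends at lo and the final swap is a no-op.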

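Example
Step one: pick the pivot as median of 3 (lo = 0, hi = 10)
  index: 0 1 2 3 4 5 6 7 8 9
  value: 8 1 4 9 0 3 5 2 7 6
  The candidates are arr[0] = 8, arr[9] = 6, and arr[5] = 3, so the pivot is 6.
Step two: move the pivot to the lo position
  index: 0 1 2 3 4 5 6 7 8 9
  value: 6 1 4 9 0 3 5 2 7 8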

Example
Now partition in place (often there is more than one swap during a partition; this is a short example):
  6 1 4 9 0 3 5 2 7 8   start
  6 1 4 9 0 3 5 2 7 8   move fingers (i stops at 9, j stops at 2)
  6 1 4 2 0 3 5 9 7 8   swap
  6 1 4 2 0 3 5 9 7 8   move fingers (i and j meet at 5)
  5 1 4 2 0 3 6 9 7 8   move pivot

Analysis
Best case: the pivot is always the median
  T(0) = T(1) = 1
  T(n) = 2T(n/2) + n   (linear-time partition)
  Same recurrence as merge sort: O(n log n)
Worst case: the pivot is always the smallest or largest element
  T(0) = T(1) = 1
  T(n) = T(n-1) + n
  Basically the same recurrence as selection sort: O(n^2)
Average case (e.g., with random pivot)
  O(n log n), not responsible for proof (in text)

Cutoffs
For small n, all that recursion tends to cost more than doing a quadratic sort
- Remember, asymptotic complexity is for large n
Common engineering technique: switch algorithms below a cutoff
- Reasonable rule of thumb: use insertion sort for n < 10
Notes:
- Could also use a cutoff for merge sort
- Cutoffs are also the norm with parallel algorithms (switch to a sequential algorithm)
- None of this affects asymptotic complexity

Cutoff pseudocode
  void quicksort(int[] arr, int lo, int hi) {
    if (hi - lo < CUTOFF)
      insertionsort(arr, lo, hi);
    else
      ...
  }
Notice how this cuts out the vast majority of the recursive calls
- Think of the recursive calls to quicksort as a tree
- The cutoff trims out the bottom layers of the tree
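
A complete version of the cutoff idea might look like the following sketch. The else branch here is one possible filling, not the lecture's: for brevity it uses a simpler Lomuto-style partition with the middle element as pivot instead of the two-finger scheme, and CUTOFF = 10 follows the rule of thumb above. All names other than quicksort and CUTOFF are assumptions.

```java
class CutoffQuicksortSketch {
    static final int CUTOFF = 10;   // rule of thumb: insertion sort below 10 elements

    // Sorts arr[lo..hi): lo is inclusive, hi is exclusive.
    static void quicksort(int[] arr, int lo, int hi) {
        if (hi - lo < CUTOFF) {
            insertionSort(arr, lo, hi);   // small range: skip the recursion entirely
            return;
        }
        int p = partition(arr, lo, hi);
        quicksort(arr, lo, p);            // elements before the pivot
        quicksort(arr, p + 1, hi);        // elements after the pivot
    }

    static void insertionSort(int[] arr, int lo, int hi) {
        for (int i = lo + 1; i < hi; i++) {
            int x = arr[i];
            int j = i - 1;
            while (j >= lo && arr[j] > x) {   // shift larger elements one slot right
                arr[j + 1] = arr[j];
                j--;
            }
            arr[j + 1] = x;
        }
    }

    // Lomuto-style partition (a swapped-in, simpler scheme); pivot = middle element.
    static int partition(int[] arr, int lo, int hi) {
        swap(arr, lo + (hi - lo) / 2, hi - 1);    // park the pivot at the end
        int pivot = arr[hi - 1];
        int store = lo;                            // arr[lo..store) holds elements < pivot
        for (int i = lo; i < hi - 1; i++) {
            if (arr[i] < pivot) swap(arr, i, store++);
        }
        swap(arr, store, hi - 1);                  // put the pivot between the two sides
        return store;
    }

    static void swap(int[] arr, int x, int y) {
        int tmp = arr[x];
        arr[x] = arr[y];
        arr[y] = tmp;
    }
}
```

Because every range shorter than CUTOFF is handled without recursing, the calls that would have formed the bottom layers of the recursion tree never happen.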

Practice with comparison sort!
A comparison sorting algorithm is operating on an array of 8 integers. After its 4th loop or recursive call, the array looks like:
  4 8 11 15 42 29 18 37
Which of these sorting algorithms can it be?
A) Heapsort
B) Merge sort
C) Insertion sort
D) Quicksort using median of 3

How Fast Can We Sort?
Heapsort and merge sort have O(n log n) worst-case running time
Quicksort has O(n log n) average-case running time
These bounds are all tight, actually Θ(n log n)
Comparison sorting in general is Ω(n log n)
- An amazing computer-science result: it proves that all the clever programming in the world cannot comparison-sort in linear time
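
The slide states the Ω(n log n) bound without proof. For reference, the standard decision-tree argument can be sketched as follows; this derivation is an addition and not part of the lecture. It assumes n distinct keys, models any comparison sort as a binary tree whose leaves are the n! possible orderings, and writes h for the height of that tree (the worst-case number of comparisons).

```latex
\begin{align*}
2^{h} &\ge n!
  && \text{(a binary tree of height $h$ has at most $2^{h}$ leaves)} \\
h &\ge \log_2 (n!) = \sum_{i=1}^{n} \log_2 i \\
  &\ge \sum_{i=\lceil n/2 \rceil}^{n} \log_2 i
  && \text{(drop the smaller terms)} \\
  &\ge \frac{n}{2} \log_2 \frac{n}{2}
  && \text{(at least $n/2$ terms, each at least $\log_2 (n/2)$)} \\
  &= \Omega(n \log n)
\end{align*}
```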

The Big Picture
Surprising amount of juicy computer science: 2-3 lectures
- Simple algorithms, O(n^2): insertion sort, selection sort, Shell sort
- Fancier algorithms, O(n log n): heap sort, merge sort, quick sort (avg)
- Comparison lower bound: Ω(n log n)
- Specialized algorithms, O(n): bucket sort, radix sort
- Handling huge data sets: external sorting
How??? Change the model: assume more than compare(a,b)

Bucket Sort (a.k.a. BinSort)
If all values to be sorted are known to be integers between 1 and K (or any small range):
- Create an array of size K
- Put each element in its proper bucket (a.k.a. bin)
- If the data is only integers, there is no need to store more than a count of how many times each bucket has been used
Output the result via a linear pass through the array of buckets.
Example: K = 5
  input: (5, 1, 3, 4, 3, 2, 1, 1, 5, 4, 5)
  count array: 1 -> 3, 2 -> 1, 3 -> 2, 4 -> 2, 5 -> 3
  output: 1, 1, 1, 2, 3, 3, 4, 4, 5, 5, 5
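
For integer-only data, the counting version of bucket sort can be sketched as below. The method name bucketSort and the choice to return a new array are assumptions for the example; the count array has K + 1 slots so that value v can be counted at index v, leaving index 0 unused.

```java
class BucketSortSketch {
    // Sorts values known to lie in 1..k by counting how often each one occurs.
    static int[] bucketSort(int[] input, int k) {
        int[] count = new int[k + 1];              // count[v] = occurrences of value v
        for (int x : input) count[x]++;            // one O(n) pass over the input
        int[] output = new int[input.length];
        int next = 0;
        for (int value = 1; value <= k; value++) { // one O(k) pass over the buckets
            for (int c = 0; c < count[value]; c++) {
                output[next++] = value;
            }
        }
        return output;
    }
}
```

No two elements are ever compared with each other, which is why the comparison lower bound does not apply.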

Analyzing Bucket Sort
Overall: O(n+K)
- Linear in n, but also linear in K
- The Ω(n log n) lower bound does not apply because this is not a comparison sort
Good when K is smaller (or not much larger) than n
- We don't spend time doing comparisons of duplicates
Bad when K is much larger than n
- Wasted space; wasted time during the linear O(K) pass
For data in addition to integer keys, use a list at each bucket

Bucket Sort with Data
Most real lists aren't just keys; we have data
Each bucket is a list (say, a linked list)
To add to a bucket, insert in O(1) (at the beginning, or keep a pointer to the last element)
Example: spice level; scale 1-5; 1 = mild, 5 = very spicy
  Input: 5: Habanero, 3: Jalapeño, 5: Ghost pepper, 1: Bell pepper
  Buckets: 1 -> Bell pepper; 2 -> (empty); 3 -> Jalapeño; 4 -> (empty); 5 -> Habanero, Ghost pepper
  Result: Bell pepper, Jalapeño, Habanero, Ghost pepper
Easy to keep stable; Habanero is still before Ghost pepper
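
With data attached to each key, every bucket becomes a list. The sketch below uses java.util.LinkedList and appends at the tail, which is one way to get O(1) insertion while keeping the sort stable; the Item class and the method name sortByKey are hypothetical names introduced for the example.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

class BucketSortWithDataSketch {
    // A key in 1..k plus some attached data (here just a name).
    static final class Item {
        final int key;
        final String name;
        Item(int key, String name) {
            this.key = key;
            this.name = name;
        }
    }

    // Returns the names ordered by key; items with equal keys keep their input order.
    static List<String> sortByKey(List<Item> input, int k) {
        List<List<Item>> buckets = new ArrayList<>();
        for (int i = 0; i <= k; i++) buckets.add(new LinkedList<>());  // index 0 unused
        for (Item item : input) {
            buckets.get(item.key).add(item);   // append at the tail: O(1), preserves order
        }
        List<String> result = new ArrayList<>();
        for (int key = 1; key <= k; key++) {
            for (Item item : buckets.get(key)) result.add(item.name);
        }
        return result;
    }
}
```

Appending to the end of each bucket is what keeps Habanero ahead of Ghost pepper; inserting at the front would reverse items that share a key.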

Interactive Visualizations
Comparison sort (including quicksort): http://www.cs.usfca.edu/~galles/visualization/comparisonsort.html
Bucket sort: http://www.cs.usfca.edu/~galles/visualization/bucketsort.html and http://www.cs.usfca.edu/~galles/visualization/countingsort.html
Radix sort: http://www.cs.usfca.edu/~galles/visualization/radixsort.html