Shell s sort. CS61B Lecture #25. Sorting by Selection: Heapsort. Example of Shell s Sort

Similar documents
CS61B Lectures # Purposes of Sorting. Some Definitions. Classifications. Sorting supports searching Binary search standard example

CS61B Lectures #28. Today: Lower bounds on sorting by comparison Distribution counting, radix sorts

Beyond Comparison: Distribution. Necessary Choices

Jana Kosecka. Linear Time Sorting, Median, Order Statistics. Many slides here are based on E. Demaine, D. Luebke slides

COMP Analysis of Algorithms & Data Structures

Module 3: Sorting and Randomized Algorithms. Selection vs. Sorting. Crucial Subroutines. CS Data Structures and Data Management

Module 3: Sorting and Randomized Algorithms

Design and Analysis of Algorithms

08 A: Sorting III. CS1102S: Data Structures and Algorithms. Martin Henz. March 10, Generated on Tuesday 9 th March, 2010, 09:58

COMP Data Structures

Module 3: Sorting and Randomized Algorithms

CS S-11 Sorting in Θ(nlgn) 1. Base Case: A list of length 1 or length 0 is already sorted. Recursive Case:

Algorithmic Analysis and Sorting, Part Two. CS106B Winter

CPSC 311 Lecture Notes. Sorting and Order Statistics (Chapters 6-9)

Data Structures and Algorithms

Scan and Quicksort. 1 Scan. 1.1 Contraction CSE341T 09/20/2017. Lecture 7

Today: Sorting algorithms: why? Insertion Sort. Inversions. CS61B Lecture #26. Last modified: Sun Oct 22 18:22: CS61B: Lecture #26 1

Lecture Notes for Advanced Algorithms

Algorithmic Analysis and Sorting, Part Two

CS 171: Introduction to Computer Science II. Quicksort

CS125 : Introduction to Computer Science. Lecture Notes #40 Advanced Sorting Analysis. c 2005, 2004 Jason Zych

The Limits of Sorting Divide-and-Conquer Comparison Sorts II

Sorting. Bubble Sort. Pseudo Code for Bubble Sorting: Sorting is ordering a list of elements.

Faster Sorting Methods

CSC 273 Data Structures

Sorting. Sorting in Arrays. SelectionSort. SelectionSort. Binary search works great, but how do we create a sorted array in the first place?

Today: Sorting algorithms: why? Insertion Sort. Inversions. CS61B Lecture #26. Last modified: Wed Oct 24 13:43: CS61B: Lecture #26 1

COSC242 Lecture 7 Mergesort and Quicksort

lecture notes September 2, How to sort?

Algorithms and Data Structures (INF1) Lecture 7/15 Hua Lu

Question 7.11 Show how heapsort processes the input:

Searching, Sorting. part 1

Overview of Sorting Algorithms

Lecture 5: Sorting Part A

Introduction. e.g., the item could be an entire block of information about a student, while the search key might be only the student's name

Lecture 3: Sorting 1

CSE373: Data Structure & Algorithms Lecture 21: More Comparison Sorting. Aaron Bauer Winter 2014

Introduction to Algorithms

Lecture 8: Mergesort / Quicksort Steven Skiena

4.4 Algorithm Design Technique: Randomization

Algorithms in Systems Engineering ISE 172. Lecture 12. Dr. Ted Ralphs

11/8/2016. Chapter 7 Sorting. Introduction. Introduction. Sorting Algorithms. Sorting. Sorting

Sorting algorithms Properties of sorting algorithm 1) Adaptive: speeds up to O(n) when data is nearly sorted 2) Stable: does not change the relative

CSE 3101: Introduction to the Design and Analysis of Algorithms. Office hours: Wed 4-6 pm (CSEB 3043), or by appointment.

Algorithms Lab 3. (a) What is the minimum number of elements in the heap, as a function of h?

Divide and Conquer Sorting Algorithms and Noncomparison-based

Sorting. Quicksort analysis Bubble sort. November 20, 2017 Hassan Khosravi / Geoffrey Tien 1

Data Structures. Sorting. Haim Kaplan & Uri Zwick December 2013

Data Structures Brett Bernstein

Selection (deterministic & randomized): finding the median in linear time

CS61BL. Lecture 5: Graphs Sorting

CS 171: Introduction to Computer Science II. Quicksort

Examination Questions Midterm 2

/633 Introduction to Algorithms Lecturer: Michael Dinitz Topic: Sorting lower bound and Linear-time sorting Date: 9/19/17

COMP Analysis of Algorithms & Data Structures

Real-world sorting (not on exam)

CS125 : Introduction to Computer Science. Lecture Notes #38 and #39 Quicksort. c 2005, 2003, 2002, 2000 Jason Zych

CS61B, Spring 2003 Discussion #15 Amir Kamil UC Berkeley 4/28/03

COSC 311: ALGORITHMS HW1: SORTING

Programming II (CS300)

UNIVERSITY OF CALIFORNIA Department of Electrical Engineering and Computer Sciences Computer Science Division. P. N. Hilfinger.

We can use a max-heap to sort data.

Lecture 23: Priority Queues, Part 2 10:00 AM, Mar 19, 2018

CS 360 Exam 1 Fall 2014 Name. 1. Answer the following questions about each code fragment below. [8 points]

Computational Optimization ISE 407. Lecture 16. Dr. Ted Ralphs

Sorting and Selection

PRAM Divide and Conquer Algorithms

4. Sorting and Order-Statistics

having any value between and. For array element, the plot will have a dot at the intersection of and, subject to scaling constraints.

Sorting Algorithms. CptS 223 Advanced Data Structures. Larry Holder School of Electrical Engineering and Computer Science Washington State University

Lower bound for comparison-based sorting

L14 Quicksort and Performance Optimization

CSE 373: Data Structures and Algorithms

Sorting Algorithms Spring 2019 Mentoring 10: 18 April, Asymptotics Potpourri

1 Sorting: Basic concepts

Sorting. Sorting. Stable Sorting. In-place Sort. Bubble Sort. Bubble Sort. Selection (Tournament) Heapsort (Smoothsort) Mergesort Quicksort Bogosort

CS 3343 (Spring 2013) Exam 2

CS240 Fall Mike Lam, Professor. Quick Sort

CS Data Structures and Algorithm Analysis

Introduction to Computers and Programming. Today

Data Structures and Algorithms Week 4

DO NOT. UNIVERSITY OF CALIFORNIA Department of Electrical Engineering and Computer Sciences Computer Science Division. P. N.

Sorting. Weiss chapter , 8.6

Lecture Notes 14 More sorting CSS Data Structures and Object-Oriented Programming Professor Clark F. Olson

(Refer Slide Time: 01.26)

CS1 Lecture 30 Apr. 2, 2018

Unit Outline. Comparing Sorting Algorithms Heapsort Mergesort Quicksort More Comparisons Complexity of Sorting 2 / 33

Cpt S 122 Data Structures. Sorting

Algorithms and Data Structures: Lower Bounds for Sorting. ADS: lect 7 slide 1

CSE 373: Data Structures & Algorithms More Sor9ng and Beyond Comparison Sor9ng

How fast can we sort? Sorting. Decision-tree example. Decision-tree example. CS 3343 Fall by Charles E. Leiserson; small changes by Carola

CSE 373: Data Structures and Algorithms

Randomized Algorithms: Selection

Tree Data Structures CSC 221

1 Tree Sort LECTURE 4. OHSU/OGI (Winter 2009) ANALYSIS AND DESIGN OF ALGORITHMS

Principles of Algorithm Design

Better sorting algorithms (Weiss chapter )

Data Structures and Algorithms Chapter 4

Divide and Conquer Algorithms: Advanced Sorting

How many leaves on the decision tree? There are n! leaves, because every permutation appears at least once.

Transcription:

CS6B Lecture #5 Today: Sorting, t. Standard methods Properties of standard methods Selection Readings for Today: DS(IJ), Chapter 8; Readings for Next Topic: Balanced searches, DS(IJ), Chapter ; Shell s sort Idea: Improve insertion sort by first sorting distant elements: First sort subsequences of elements k apart: sort items #, k, ( k ), ( k ),..., then sort items #, + k, + ( k ), + ( k ),..., then sort items #, + k, + ( k ), + ( k ),..., then etc. sort items # k, ( k ), ( k ),..., Each time an item moves, reduce #inversions by as much as k +. Now sort subsequences of elements k apart: sort items #, k, ( k ), ( k ),..., then sort items #, + k, + ( k ), + ( k ),...,. End at plain insertion sort ( = apart), but with most inversions gone. Sort is Θ(N.5 ) (take CS7 for why!). Last modified: Thu Oct 8 6:: 4 CS6B: Lecture #5 Last modified: Thu Oct 8 6:: 4 CS6B: Lecture #5 Example of Shell s Sort Sorting by Selection: Heapsort #I #C 5 4 8 7 6 5 4 4 8 7 6 5 4 5 Idea: Keep selecting smallest (or largest) element. Really bad idea on a simple list or vector. But we ve already seen it in action: use heap. Gives O(N lg N) algorithm (N remove-first operations). Since we remove items from end of heap, we use that area to accumulate result: 7 6 5 4 4 8 5 4 4 6 5 7 8 4 5 4 4 5 6 7 8 4 5 - I: Inversions left. C: Comparisons needed to sort subsequences. original: heapified: - 7 4 4 7-7 - 4 7-4 7-4 - 7 4-7 4-7 4 Last modified: Thu Oct 8 6:: 4 CS6B: Lecture #5 Last modified: Thu Oct 8 6:: 4 CS6B: Lecture #5 4

Merge Sorting Idea: Divide data in equal parts; recursively sort halves; merge results. Already seen analysis: Θ(N lg N). Good for external sorting: First break data into small enough chunks to fit in memory and sort. Then repeatedly merge into bigger and bigger sequences. Can merge K sequences of arbitrary size on sedary storage using Θ(K) storage. For internal sorting, use binomial comb to orchestrate: Illustration of Internal Merge Sort L: (, 5, 5,,, 6,, -,,, 8) : () : : element processed : : : elements processed : : (5) : (, 5) : (, 5) : : elements processed elements processed : : : (8) : : (, 6) : (, ) : (, 5,, 5) : (, 5,, 5) : : (-,,, 5, 6,,, 5) 4 elements processed 6 elements processed elements processed Last modified: Thu Oct 8 6:: 4 CS6B: Lecture #5 5 Last modified: Thu Oct 8 6:: 4 CS6B: Lecture #5 6 Idea: Quicksort: Speed through Probability Partition data into pieces: everything > a pivot value at the high end of the sequence to sorted, and everything on the low end. Repeat recursively on the high and low pieces. For speed, stop when pieces are small enough and do insertion sort on the whole thing. Reason: insertion sort has low stant factors. By design, no item will move out of its will move out of its piece [why?], so when pieces are small, #inversions is, too. Have to choose pivot well. E.g.: median of first, last and middle items of sequence. Example of Quicksort In this example, we tinue until pieces are size 4. Pivots for next step are starred. Arrange to move pivot to dividing line each time. Last step is insertion sort. 6 8-4 -7-5 5 4 -* -4-5 -7-8 5 4 6* -4-5 -7-5 * 6 * 4 8-4 -5-7 - 5 6 8 4 Now everything is close to right, so just do insertion sort: -7-5 -4-5 6 8 4 Last modified: Thu Oct 8 6:: 4 CS6B: Lecture #5 7 Last modified: Thu Oct 8 6:: 4 CS6B: Lecture #5 8

Probabalistic time: Performance of Quicksort If choice of pivots good, divide data in two each time: Θ(N lg N) with a good stant factor relative to merge or heap sort. If choice of pivots bad, most items on one side each time: Θ(N ). Ω(N lg N) in st case, so insertion sort tter for nearly ordered input s. Interesting point: randomly shuffling the data fore sorting makes Ω(N ) time very unlikely! Quick Selection The Selection Problem: forgiven k, find k th smallestelementindata. Obvious method: sort, select element #k, time Θ(N lg N). If k some stant, easily do in Θ(N) time: Go through array, keep smallest k items. Get probably Θ(N) time for all k by adapting quicksort: Partition around some pivot, p, as in quicksort, arrange that pivot ends up at dividing line. Suppose that in the result, pivot is at index m, all elements pivot have indicies m. If m = k, you re done: p is answer. If m > k, recursively select k th from left half of sequence. If m < k, recursively select (m k ) th from right half of sequence. Last modified: Thu Oct 8 6:: 4 CS6B: Lecture #5 Last modified: Thu Oct 8 6:: 4 CS6B: Lecture #5 Selection Example Problem: Find just item # in the sorted version of array: Initial tents: 5 6-4 7 4 4 4* 5 46 Looking for # to left of pivot 4: -4 7 4* 4 5 5 4 46 6 Looking for #6 to right of pivot 4: -4 4 7 * 4 5 5 4 46 6 4 Looking for # to right of pivot : -4 4 7 4 5 5 4 46 6 Just two elements; just sort and return #: -4 4 7 4 5 5 4 46 6 Result: Selection Performance For this algorithm, if m roughly in middle each time, cost is C(N) =, if N =, N + C(N/), otherwise. = N + N/ +... + = N Θ(N) But in worst case, get Θ(N ), as for quicksort. By another, non-obvious algorithm, get Θ(N) worst-case time for all k (take CS7). Last modified: Thu Oct 8 6:: 4 CS6B: Lecture #5 Last modified: Thu Oct 8 6:: 4 CS6B: Lecture #5

Better than N lg N? Canprove that if all you do to keys is comparethem then sorting must take Ω(N lg N). Basic idea: there are N! possible ways the input data could scrambled. Therefore, your program must prepared to do N! different combinations of move operations. Therefore, there must N! possible combinations of outcomes of all the if tests in your program(we re assuming that comparisons are -way). Since each if test goes two ways, numr of possible different outcomes for k if tests is k. Thus, need enough tests so that k > N!, which means k Ω(lg N!). Using Stirling s approximation, this tells us that m! πm m m + Θ e k Ω(N lg N). m, Last modified: Thu Oct 8 6:: 4 CS6B: Lecture #5 Beyond Comparison: Distribution Counting But suppose do more than compare keys? For example, how we sort a of N integer keys whose values range from to kn, for some small stant k? One technique: count the numr of items <, <, etc. If M p =#items with value < p, then in sorted order, the j th item with value p must #M p + j. Gives linear-time algorithm. Last modified: Thu Oct 8 6:: 4 CS6B: Lecture #5 4 Distribution Counting Example Suppose all items are tween and as in this example: 7 4 5 7 6 7 4 < < 6 < 7 < 4 < 4 5 < 5 6 < 6 7 < 7 8 6 < 8 4 4 5 6 7 6 Counts line gives # occurrences of each key. 6 < Counts Running sum 7 7 6 Running sum gives cumulative count of keys each value......which tells us where to put each key: The first instance of key k goes into slot m, where m is the numr of key instances that are < k. Last modified: Thu Oct 8 6:: 4 CS6B: Lecture #5 5 Radix Sort Idea: Sort keys one character at a time. Can use distribution counting for each digit. Can work either right to left (LSD radix sort) or left to right (MSD radix sort) LSD radix sort is venerable: used for punched cards. Pass (by char #) Initial:,,,,,,,, t t d n t,,,,,,,, t Pass (by char #) Pass (by char #) t t a e o,,,,,,, t, b c l s,, t,,,,,, Last modified: Thu Oct 8 6:: 4 CS6B: Lecture #5 6

MSD Radix Sort A bit more complied: must keep lists from each step separate But, stop processing -element lists A posn,,,,,,,, t,, t /,,, / / /, t /,,, / / / / t /,,, / / / / t /,, / / / / / t / / / / / / Performance of Radix Sort Radix sort takes Θ(B) time where B is total size of the key data. Have measured other sorts as function of #records. How to compare? To have N different records, must have keys at least Θ(lg N) long [why?] Furthermore, comparison actually takes time Θ(K) where K is size of key in worst case [why?] So N lg N comparisons really means N(lg N) operations. While radix sort takes B = N lg N time. On the other hand, must work to get good stant factors with radix sort. Last modified: Thu Oct 8 6:: 4 CS6B: Lecture #5 7 Last modified: Thu Oct 8 6:: 4 CS6B: Lecture #5 8 And Don t Forget Search Trees Idea: A search tree is in sorted order, when read in inorder. Need balance to really use for sorting [next topic]. Given balance, same performance as heapsort: N insertions in time lg N each, plus Θ(N) to traverse, gives Θ(N + N lg N) = Θ(N lg N) Summary Insertion sort: Θ(N k) comparisons and moves, where k is maximum amount data is displaced from final position. Good for small datas or almost ordered data s. Quicksort: Θ(N lg N) with good stant factor if data is not pathological. Worst case O(N ). Merge sort: Θ(N lg N) guaranteed. Good for external sorting. Heapsort, treesort with guaranteed balance: Θ(N lg N) guaranteed. Radix sort, distribution sort: Θ(B) (numr of bytes). Also good for external sorting. Last modified: Thu Oct 8 6:: 4 CS6B: Lecture #5 Last modified: Thu Oct 8 6:: 4 CS6B: Lecture #5