Problem Set 4 Solutions


Design and Analysis of Algorithms
Massachusetts Institute of Technology, 6.046J/18.410J
Profs. Erik Demaine, Srini Devadas, and Nancy Lynch
March 5, 2015

Problem Set 4 Solutions

This problem set is due at 11:59pm on Thursday, March 5, 2015.

Exercise 4-1. Read CLRS, Chapter 17.
Exercise 4-2. Exercise 17.1-3.
Exercise 4-3. Exercise 17.2-2.
Exercise 4-4. Exercise 17.3-2.
Exercise 4-5. Read CLRS, Chapter 7.
Exercise 4-6. Exercise 7.1-3.
Exercise 4-7. Exercise 7.2-5.
Exercise 4-8. Exercise 7.4-4.

Problem 4-1. Extreme FIFO Queues [25 points]

Design a data structure that maintains a FIFO queue of integers, supporting operations ENQUEUE, DEQUEUE, and FIND-MIN, each in O(1) amortized time. In other words, any sequence of m operations should take time O(m). You may assume that, in any execution, all the items that get enqueued are distinct.

(a) [5 points] Describe your data structure. Include clear invariants describing its key properties. Hint: Use an actual queue plus auxiliary data structure(s) for bookkeeping.

Solution: For example, we might use a FIFO queue Main and an auxiliary linked list Min, satisfying the following invariants:

1. Item x appears in Min if and only if x is the minimum element of some tail-segment of Main.
2. Min is sorted in increasing order, front to back.

(b) [5 points] Describe carefully, in words or pseudo-code, your ENQUEUE, DEQUEUE and FIND-MIN procedures.

Solution:

ENQUEUE(x)
1  Add x to the end of Main.
2  Starting at the end of Min, examine its elements and remove those that are larger than x; stop as soon as an element smaller than x is found (or Min becomes empty).
3  Add x to the end of Min.

DEQUEUE()
1  Remove and return the first element x of Main.
2  If x is the first element of Min, remove it from Min.

FIND-MIN()
1  Return the first element of Min.
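The following Python sketch mirrors the procedures above using two deques. It is an added illustration, not part of the original solution; the class name MinQueue and its method names are hypothetical.

from collections import deque

class MinQueue:
    """FIFO queue with amortized O(1) ENQUEUE, DEQUEUE, and FIND-MIN.
    main holds the actual queue; mn holds the tail-minima (Min above),
    sorted in increasing order front to back."""

    def __init__(self):
        self.main = deque()   # the actual FIFO queue
        self.mn = deque()     # candidate minima

    def enqueue(self, x):
        self.main.append(x)
        # Elements of Min larger than x can no longer be tail-minima.
        while self.mn and self.mn[-1] > x:
            self.mn.pop()
        self.mn.append(x)

    def dequeue(self):
        x = self.main.popleft()
        # If x is at the front of Min, it leaves Min as well.
        if self.mn and self.mn[0] == x:
            self.mn.popleft()
        return x

    def find_min(self):
        return self.mn[0]     # smallest element currently in the queue

For example, after enqueuing 5, 3, 8, 1 the structure holds main = [5, 3, 8, 1] and mn = [1], so find_min() returns 1 in constant time.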

(c) [5 points] Prove that your operations give the right answers. Hint: You may want to prove that their correctness follows from your data structure invariants. In that case you should also sketch arguments for why the invariants hold.

Solution: This solution is for the choices of data structure and procedures given above; your own may be different.

The only two operations that return answers are DEQUEUE and FIND-MIN. DEQUEUE returns the first element of Main, which is correct because Main maintains the actual queue. FIND-MIN returns the first element of Min. This is the smallest element of Min because Min is sorted in increasing order (by Invariant 2 above). The smallest element of Main is the minimum of the tail-segment consisting of all of Main, which is the smallest of all the tail-mins of Main; by Invariant 1, this is the smallest element in Min. Therefore, FIND-MIN returns the smallest element of Main, as needed.

Proofs for the invariants: The invariants are vacuously true in the initial state. We argue that ENQUEUE and DEQUEUE preserve them; FIND-MIN does not affect them.

It is easy to see that both operations preserve Invariant 2. A DEQUEUE operation can only remove an element from Min, so the order of the remaining elements is preserved. For ENQUEUE(x), we remove elements from the end of Min until we find one that is less than x, and then add x to the end of Min. Because Min was in sorted order prior to the ENQUEUE(x), when we stop removing elements we know that all the remaining elements in Min are less than x. Since we do not change the order of any elements previously in Min, all the elements are still in sorted order.

So it remains to prove Invariant 1. There are two directions.

The new Min list contains all the tail-mins.

ENQUEUE(x): x is the minimum element of the singleton tail-segment of Main, and it is added to Min. Additionally, since every tail-segment now contains the value x, elements with value greater than x can no longer be tail-mins. So, after their removal, Min still contains all the tail-mins.

DEQUEUE of element x: The only element that could be removed from Min is x. It is fine to remove x, because it can no longer be a tail-min once it is no longer in Main. All other tail-mins remain in Min.

All elements of the new Min are tail-mins.

ENQUEUE(x): x is the only value that is added to Min, and it is the min of the singleton tail-segment. Every other element y remaining in the Min list was a tail-min before the ENQUEUE and is less than x, so y is still a tail-min after the ENQUEUE.

DEQUEUE of element x: We claim that if x is in Min before the operation, then it is the first element of Min, and therefore it is removed from Min as well. Indeed, if x is in Min, it must be the minimum element of some tail-segment of Main; since x is the first element of Main, that tail-segment must be the entire queue, so x is the smallest element of Main. Hence x is the smallest element of Min, which means it is the first element of Min. Every other element y in Min was a tail-min before the DEQUEUE, and is still a tail-min after the DEQUEUE.
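As a sanity check on part (c), the invariants can also be verified mechanically. The sketch below is an added illustration, not part of the original solution; it assumes the hypothetical MinQueue class sketched earlier is defined in the same file, and checks both invariants after every operation of a random sequence.

import random

def tail_mins(seq):
    """Reference computation: the minima of all tail-segments of seq,
    found by scanning from right to left."""
    result, running_min = set(), None
    for x in reversed(seq):
        running_min = x if running_min is None or x < running_min else running_min
        result.add(running_min)
    return result

def check_invariants(q):
    mins = list(q.mn)
    assert mins == sorted(mins)                      # Invariant 2: Min is sorted
    assert set(mins) == tail_mins(list(q.main))      # Invariant 1: Min = set of tail-mins

q = MinQueue()
for _ in range(2000):
    if q.main and random.random() < 0.5:
        q.dequeue()
    else:
        q.enqueue(random.random())    # distinct with probability 1
    check_invariants(q)
print("both invariants held after every operation")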

(d) [10 points] Analyze the time complexity: the worst-case cost for each operation, and the amortized cost of any sequence of m operations.

Solution: DEQUEUE and FIND-MIN are O(1) operations in the worst case. ENQUEUE is O(m) in the worst case. To see that the cost can be this large, suppose that ENQUEUE operations are performed for the elements 2, 3, 4, ..., m-1, m, in order. After these, Min contains {2, 3, 4, ..., m-1, m}. Then perform ENQUEUE(1). This takes Ω(m) time, because all the other entries in Min are removed one by one.

However, the total cost of any sequence of m operations is O(m), i.e., O(1) amortized per operation. To see this, we use a potential argument. First, define the actual costs of the operations as follows:

The cost of any FIND-MIN operation is 1.
The cost of any DEQUEUE operation is 2, for the removal from Main and the possible removal from Min.
The cost of an ENQUEUE operation is 2 + s, where the 2 pays for adding x to Main and to Min, and s is the number of elements removed from Min.

Define the potential function $\Phi = |Min|$, the number of elements in Min. Now consider a sequence $o_1, o_2, \ldots, o_m$ of operations and let $c_i$ denote the actual cost of operation $o_i$. Let $\Phi_i$ denote the value of the potential function after exactly $i$ operations, and let $\Phi_0$ denote the initial value of $\Phi$, which here is 0. Define the amortized cost of operation $o_i$ to be $\hat{c}_i = c_i + \Phi_i - \Phi_{i-1}$.

We claim that $\hat{c}_i \le 3$ for every $i$. If we show this, then we know that the actual cost of the entire sequence of operations satisfies

$\sum_{i=1}^{m} c_i = \sum_{i=1}^{m} \hat{c}_i + \Phi_0 - \Phi_m \le \sum_{i=1}^{m} \hat{c}_i \le 3m.$

This yields the needed O(m) bound.

To show that $\hat{c}_i \le 3$ for every $i$, we consider the three types of operations.

If $o_i$ is a FIND-MIN operation, then Min does not change, so $\hat{c}_i = 1 + \Phi_i - \Phi_{i-1} = 1 \le 3$.

If $o_i$ is a DEQUEUE, then the lengths of the lists cannot increase, so $\hat{c}_i = c_i + \Phi_i - \Phi_{i-1} \le 2 + 0 \le 3$.

If $o_i$ is an ENQUEUE, then $\hat{c}_i = c_i + \Phi_i - \Phi_{i-1} = (2 + s) + (1 - s) = 3$, where s is the number of elements removed from Min and the +1 accounts for adding x to Min.

Thus, in every case, $\hat{c}_i \le 3$, as claimed.

Alternatively, we could use the accounting method, with the same actual costs as above. Assign each ENQUEUE an amortized cost of 3, each DEQUEUE an amortized cost of 2, and each FIND-MIN an amortized cost of 1. Then we must argue that

$\sum_{i=1}^{m} \hat{c}_i \ge \sum_{i=1}^{m} c_i$

for any sequence of operations with costs as above. This is so because each ENQUEUE(x) contributes an amortized cost of 3, which covers its own actual cost of 2 plus the possible later cost of removing x from Min; the s removals performed during an ENQUEUE are paid for by the credits deposited by the operations that enqueued those elements.
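The accounting argument can also be observed directly: each enqueued element is added to Min exactly once and removed from Min at most once, so over any sequence of operations the number of elements popped from Min during ENQUEUEs never exceeds the number of ENQUEUEs. The standalone sketch below is an added illustration, not part of the original solution; it counts both quantities over a random operation sequence.

import random
from collections import deque

def count_work(m):
    """Run m random operations on the Main/Min structure and count
    ENQUEUEs versus elements popped from Min during ENQUEUEs (step 2)."""
    main, mn = deque(), deque()
    enqueues = removals = 0
    for _ in range(m):
        if main and random.random() < 0.5:   # DEQUEUE
            x = main.popleft()
            if mn and mn[0] == x:
                mn.popleft()
        else:                                # ENQUEUE
            x = random.random()              # distinct with probability 1
            main.append(x)
            while mn and mn[-1] > x:
                mn.pop()
                removals += 1
            mn.append(x)
            enqueues += 1
    return enqueues, removals

enq, rem = count_work(100_000)
assert rem <= enq   # total work on Min is bounded by the number of ENQUEUEs
print(f"ENQUEUEs: {enq}, removals from Min during ENQUEUEs: {rem}")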

Problem 4-2. Quicksort Analysis [25 points]

In this problem, we will analyze the time complexity of QUICKSORT in terms of error probabilities, rather than in terms of expectation. Suppose the array to be sorted is A[1 .. n], and write $x_i$ for the element that starts in array location A[i] (before QUICKSORT is called). Assume that all the $x_i$ values are distinct.

In solving this problem, it will be useful to recall a claim from lecture. Here it is, slightly restated:

Claim: Let $c > 1$ be a real constant, and let $\alpha$ be a positive integer. Then, with probability at least $1 - n^{-\alpha}$, $3(\alpha + c) \lg n$ tosses of a fair coin produce at least $c \lg n$ heads.

Note: High-probability bounds, and this Claim, will be covered in Tuesday's lecture.

(a) [5 points] Consider a particular element $x_i$. Consider a recursive call of QUICKSORT on a subarray A[p .. p+m-1] of size $m \ge 2$ which includes element $x_i$. Prove that, with probability at least 1/2, either this call to QUICKSORT chooses $x_i$ as the pivot element, or the next recursive call to QUICKSORT containing $x_i$ involves a subarray of size at most $\frac{3}{4}m$.

Solution: Suppose the chosen pivot is the element of rank x in the subarray, that is, the x-th smallest of its m elements. If $\lfloor m/4 \rfloor + 1 \le x \le m - \lfloor m/4 \rfloor$, then both subarrays produced by the partition have size at most $\frac{3}{4}m$. Moreover, the number of ranks x in this range is at least m/2, so the probability of choosing such a pivot is at least 1/2. In that case, either $x_i$ is the pivot value or it lies in one of the two subarrays, each of size at most $\frac{3}{4}m$.

(b) [9 points] Consider a particular element $x_i$. Prove that, with probability at least $1 - n^{-2}$, the total number of times the algorithm compares $x_i$ with pivots is at most $d \lg n$, for a particular constant d. Give a value for d explicitly.

Solution: We use part (a) and the Claim. By part (a), each time QUICKSORT is called on a subarray containing $x_i$, with probability at least 1/2 either $x_i$ is chosen as the pivot value or else the size of the subarray containing $x_i$ is reduced to at most 3/4 of what it was before the call. Let us say that a call is successful if either of these two cases happens; thus each call is successful with probability at least 1/2.

Now, at most $\log_{4/3} n$ successful calls can occur for subarrays containing $x_i$ during an execution, because after that many successful calls the size of the subarray containing $x_i$ would be reduced to 1. Using the change-of-base formula for logarithms, $\log_{4/3} n = c \lg n$, where $c = \log_{4/3} 2$.

Now we can model the sequence of calls to QUICKSORT for subarrays containing $x_i$ as a sequence of tosses of a fair coin, where heads corresponds to a successful call. By the Claim, with $c = \log_{4/3} 2$ and $\alpha = 2$, we conclude that, with probability at least $1 - n^{-2}$, we have at least $c \lg n$ successful calls within $d \lg n$ total calls, where $d = 3(2 + c)$. Each comparison of $x_i$ with a pivot occurs as part of one of these calls, so with probability at least $1 - n^{-2}$, the total number of times the algorithm compares $x_i$ with pivots is at most $d \lg n = 3(2 + c) \lg n = 3(2 + \log_{4/3} 2) \lg n$. The required value of d is $3(2 + \log_{4/3} 2) \le 14$.

(c) [6 points] Now consider all of the elements $x_1, x_2, \ldots, x_n$. Apply your result from part (b) to prove that, with probability at least $1 - \frac{1}{n}$, the total number of comparisons made by QUICKSORT on the given array input is at most $d' n \lg n$, for a particular constant $d'$. Give a value for $d'$ explicitly. Hint: The union bound may be useful for your analysis.

Solution: Using a union bound over the n elements of the original array A, we get that, with probability at least $1 - n \cdot n^{-2} = 1 - \frac{1}{n}$, every element of the array is compared with pivots at most $d \lg n$ times, with d as in part (b). Therefore, with probability at least $1 - \frac{1}{n}$, the total number of such comparisons is at most $d n \lg n$; taking $d' = d$ works fine. Since every comparison made during an execution of QUICKSORT compares some element with a pivot, we get the same probabilistic bound for the total number of comparisons.
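The bounds in parts (b) and (c) are easy to probe experimentally. The sketch below is an added illustration, not part of the original solution: it runs a simple randomized quicksort on n distinct keys, counts how many times each element is compared with a pivot, and reports the maximum count next to $14 \lg n$.

import random
from math import log2

def quicksort(items, counts):
    """Randomized quicksort on (key, original_index) pairs.
    counts[i] records how many times the element that started at index i
    has been compared with a pivot."""
    if len(items) <= 1:
        return items
    pivot = random.choice(items)
    smaller, larger = [], []
    for item in items:
        if item is pivot:
            continue
        counts[item[1]] += 1          # one comparison of this element with the pivot
        (smaller if item[0] < pivot[0] else larger).append(item)
    return quicksort(smaller, counts) + [pivot] + quicksort(larger, counts)

n = 10_000
keys = random.sample(range(10 * n), n)          # distinct keys
counts = [0] * n
result = quicksort([(k, i) for i, k in enumerate(keys)], counts)
assert [k for k, _ in result] == sorted(keys)   # sanity check
print("max comparisons with pivots for a single element:", max(counts))
print("14 lg n is about", round(14 * log2(n)))
print("total comparisons:", sum(counts), "versus 14 n lg n, about", round(14 * n * log2(n)))

In typical runs the observed maximum is far below $14 \lg n$; the constant 14 comes from a worst-case tail bound, not from typical behavior.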

(d) [5 points] Generalize your results above to obtain a bound on the number of comparisons made by QUICKSORT that holds with probability $1 - n^{-\alpha}$, for any positive integer $\alpha$, rather than just probability $1 - \frac{1}{n}$ (i.e., $\alpha = 1$).

Solution: The modifications are easy. The Claim and part (a) are unchanged. For part (b), we now prove that, with probability at least $1 - n^{-(\alpha+1)}$, the total number of times the algorithm compares $x_i$ with pivots is at most $d \lg n$, for $d = 3(\alpha + 1 + c)$; the argument is the same as before, but we apply the Claim with $\alpha + 1$ in place of 2. Then, for part (c), the union bound over the n elements shows that, with probability at least $1 - n \cdot n^{-(\alpha+1)} = 1 - n^{-\alpha}$, the total number of times the algorithm compares any element with a pivot, and hence the total number of comparisons, is at most $d n \lg n$, where $d = 3(\alpha + 1 + c)$.
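As a concrete instance of part (d) (an added worked example, not part of the original solution), take $\alpha = 2$ and $c = \log_{4/3} 2 \approx 2.41$:

$d = 3(\alpha + 1 + c) = 3(3 + \log_{4/3} 2) \approx 16.2,$

so with probability at least $1 - n^{-2}$ the total number of comparisons is at most $d\, n \lg n \approx 16.2\, n \lg n$. Larger $\alpha$ trades a larger constant $d$ for a smaller failure probability $n^{-\alpha}$.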

MIT OpenCourseWare
http://ocw.mit.edu

6.046J / 18.410J Design and Analysis of Algorithms
Spring 2015

For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.