CS2 Algorithms and Data Structures Note 6


Priority Queues and Heaps

In this lecture, we will discuss priority queues, another important ADT. Like stacks and queues, priority queues store arbitrary collections of elements. Recall that stacks have a LIFO access policy and queues a FIFO policy. Priority queues have a more complicated policy: each element stored in a priority queue has a priority, and the next element to be removed is the one with the highest priority.

Priority queues have numerous applications. For example, a priority queue can be used to schedule jobs on a shared computer. Some jobs are more urgent than others and thus get higher priorities. A priority queue keeps track of the jobs to be performed and their priorities. When a job finishes or is interrupted, the job with the highest priority is performed next.

6.1 The PriorityQueue ADT

A PriorityQueue stores a collection of elements. Associated with each element is a key, which is taken from some linearly ordered set (such as the integers). Keys are just a way to represent the priorities of elements; smaller keys mean higher priorities. In its most basic form, the ADT supports the following operations:

insertItem(k, e): Insert element e with key k.

minElement(): Return an element with minimum key; an error occurs if the priority queue is empty.

removeMin(): Return and remove an element with minimum key; an error occurs if the priority queue is empty.

isEmpty(): Return TRUE if the priority queue is empty and FALSE otherwise.

Note that the keys have a quite different function here than they have in dictionaries. Keys in priority queues are used only internally to determine the position of an element in the queue, whereas in dictionaries keys are the main means of accessing elements.

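To fix ideas, here is one possible Java rendering of the ADT. It is only a sketch: the note does not prescribe concrete signatures, so the generic element type E and the choice of int keys are assumptions made here.

// A possible Java rendering of the PriorityQueue ADT described above.
// The element type E and the int keys are assumptions of this sketch.
public interface PriorityQueue<E> {
    void insertItem(int k, E e); // insert element e with key k
    E minElement();              // element with minimum key; error if empty
    E removeMin();               // remove and return an element with minimum key
    boolean isEmpty();           // TRUE if the priority queue is empty
}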

6.2 The search tree implementation

A straightforward implementation of the PriorityQueue ADT is based on binary search trees. Observing that the element with the smallest key is always stored in the leftmost internal node of the tree, we can easily implement all methods of PriorityQueue such that their time complexity is Θ(h), where h is the height of the tree (except isEmpty(), which only requires time Θ(1)). If we use AVL trees, this amounts to a time complexity of Θ(lg(n)).

6.3 Abstract Heaps

A heap is a data structure that is particularly well suited for implementing the PriorityQueue ADT. Although also based on binary trees, heaps are quite different from binary search trees, and they are usually implemented using arrays, which makes them quite efficient in practice. Before we explain the heap data structure, we introduce an abstract model of a heap that will be quite helpful for understanding the algorithms implementing the priority queue methods on heaps.

We need some more terminology for binary trees: for i ≥ 0, the ith level of a tree consists of all vertices that have distance i from the root. Thus the 0th level consists just of the root, the first level consists of the children of the root, the second level of the grandchildren of the root, etc. We say that a binary tree of height h is almost complete if the levels 0, ..., h − 1 have the maximum possible number of nodes (i.e., level i has 2^i nodes), and on level h all internal nodes are left of all leaves. Figure 6.1 shows an almost complete tree and two trees that are not almost complete.

[Figure 6.1. The first tree is almost complete; the second and third are not.]

Theorem 6.2. An almost complete binary tree with n internal nodes has height ⌊lg(n)⌋ + 1.

Proof: We first recall that a complete binary tree, i.e., a binary tree where all levels have the maximum possible number of nodes, of height h has 2^h − 1 internal nodes. This can be proved by a simple induction on h. Since the number of internal nodes of an almost complete tree of height h is greater than the number of internal nodes of a complete tree of height h − 1 and at most the number of internal nodes of a complete tree of height h, for the number n of internal nodes of an almost complete tree of height h we have

2^(h−1) ≤ n ≤ 2^h − 1.

Since for all n with 2^(h−1) ≤ n ≤ 2^h − 1 we have ⌊lg(n)⌋ = h − 1, this implies the theorem.

In our abstract model, a heap is an almost complete binary tree whose internal nodes store items such that the following heap condition is satisfied:

(H) For every node i other than the root, the key stored at i is greater than or equal to the key stored at the parent of i.

The last node of a heap of height h is the rightmost internal node in the hth level.

Insertion

To insert an item into the heap, we create a new last node and insert the item there. It may happen that the key of the new item is smaller than the key of its parent, its grandparent, etc. To repair this, we bubble the new item up the heap until it has found its position.

Algorithm insertItem(k, e)
1. Create new last node v.
2. while v is not the root and k < v.parent.key do
3.     store the item stored at v.parent at v
4.     v ← v.parent
5. store (k, e) at v

Algorithm 6.3

Consider the insertion algorithm 6.3. The correctness of the algorithm should be obvious (we could prove it by an induction). To analyse the complexity, let us assume that each line of code can be executed in time Θ(1). This needs to be justified, as it depends on the concrete way we store the data; we will do so in the next section. The loop in lines 2–4 is iterated at most h times, where h is the height of the tree. By Theorem 6.2 we have h ∈ Θ(lg(n)). Thus the time complexity of insertItem is Θ(lg(n)).

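Anticipating the array representation of Section 6.4 (where the parent of a node v > 0 has index (v − 1)/2), a minimal Java sketch of insertItem could look as follows. The class and its field names (key, elem, size) are assumptions of this sketch, not the note's heap.java, and the capacity check is omitted.

// A minimal sketch of insertItem (Algorithm 6.3) on an array-based heap.
public class Heap<E> {
    private int[] key;      // key[v]: key stored at node v
    private Object[] elem;  // elem[v]: element stored at node v
    private int size;       // number of items; the last node is size - 1

    public Heap(int capacity) {
        key = new int[capacity];
        elem = new Object[capacity];
    }

    public void insertItem(int k, E e) {
        int v = size++;                       // create a new last node v
        // bubble up: move ancestors with larger keys down one level each
        while (v > 0 && k < key[(v - 1) / 2]) {
            key[v] = key[(v - 1) / 2];
            elem[v] = elem[(v - 1) / 2];
            v = (v - 1) / 2;
        }
        key[v] = k;                           // store (k, e) at v
        elem[v] = e;
    }
}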

Finding the Minimum

Property (H) implies that the minimum key is always stored at the root of a heap. Thus the method minElement just has to return the element stored at the root of the heap, which can be done in time Θ(1) (with any reasonable implementation of a heap).

Removing the Minimum

The item with the minimum key is stored at the root, but we cannot easily remove the root of a tree. So what we do is replace the item at the root by the item stored at the last node, remove the last node, and then repair the heap. The repairing is done in a separate procedure called heapify (Algorithm 6.4). Applied to a node v of a tree such that the subtrees rooted at the children of v already satisfy the heap property (H), it turns the subtree rooted at v into a tree satisfying (H) (without changing the shape of the tree).

Algorithm heapify(v)
1. if v.left is an internal node and v.left.key < v.key then
2.     s ← v.left
3. else
4.     s ← v
5. if v.right is an internal node and v.right.key < s.key then
6.     s ← v.right
7. if s ≠ v then
8.     exchange the items of v and s
9.     heapify(s)

Algorithm 6.4

The algorithm heapify works as follows: it sets s to the node storing the smallest of the keys of v and its children. This is done in lines 1–6. Since the children of v may be leaves and thus not store items, some care is required. If s = v, the key of v is smaller than or equal to the keys of its children. Thus the heap property (H) is satisfied at v. Since the subtrees rooted at the children are already heaps, (H) is satisfied at every node in these subtrees, so it is satisfied in the subtree rooted at v, and no further action is required. If s ≠ v, then s is the child of v with the smaller key. In this case, the items of s and v are exchanged, which means that (H) is now satisfied at v. (H) is still satisfied in the subtree of v that is not rooted at s, because this subtree has not changed. It may be, however, that (H) is no longer satisfied at s, because s has a larger key now. Therefore, we have to apply heapify recursively to s. Eventually, we will reach a point where we call heapify(v) for a v that has no internal nodes as children, and the algorithm terminates.

So what is the time complexity of heapify? It can easily be expressed in terms of the height h of v. Again assuming that each line of code requires time Θ(1) and observing that the recursive call is made to a node of height h − 1, we obtain

T_heapify(h) = Θ(1) + T_heapify(h − 1).

Moreover, the recursive call is only made if the height of v is greater than 1; thus heapify applied to a node of height 1 only requires constant time. To solve the recurrence, we use the following lemma:

Lemma 6.5. Let f : N → R such that

f(n) = Θ(1) + f(n − 1)

for all n ≥ 1. Then f(n) is Θ(n).

Proof: We first prove that f(n) is O(n). Let n_0 ∈ N and c > 0 be such that f(n) ≤ c + f(n − 1) for all n ≥ n_0. Let d = f(n_0). I claim that

f(n) ≤ d + c · (n − n_0)   (6.1)

for all n ≥ n_0. We prove (6.1) by induction on n − n_0. As the induction basis, we note that (6.1) holds for n = n_0. For the induction step, suppose that n > n_0 and that (6.1) holds for n − 1. Then

f(n) ≤ c + f(n − 1) ≤ c + d + c · (n − 1 − n_0) = d + c · (n − n_0).

This proves (6.1). To complete the proof that f(n) is O(n), we just note that d + c · (n − n_0) ≤ d + c · n ∈ O(n). The proof that f(n) is Ω(n) can be done completely analogously; I leave it as an exercise.

Applying Lemma 6.5 to the function T_heapify(h) yields

T_heapify(h) ∈ Θ(h).   (6.2)

Finally, Algorithm 6.6 shows how to use heapify for removing the minimum.

Algorithm removeMin()
1. e ← root.element
2. root.item ← last.item
3. delete last
4. heapify(root)
5. return e

Algorithm 6.6

Since the height of the root of the heap, which is just the height of the heap, is Θ(lg(n)) by Theorem 6.2, the time complexity of removeMin is Θ(lg(n)).

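Continuing the array-based Java sketch from the insertion example (same assumed field names; a child of node v sits at index 2v + 1 or 2v + 2, as explained in the next section, and is an internal node exactly if its index is smaller than size), heapify and removeMin might be rendered as the following methods of the Heap class:

// Sketches of heapify (Algorithm 6.4) and removeMin (Algorithm 6.6).
@SuppressWarnings("unchecked")
public E removeMin() {
    if (size == 0) throw new java.util.NoSuchElementException();
    E e = (E) elem[0];       // the minimum item is stored at the root
    size--;                  // remove the last node ...
    key[0] = key[size];      // ... after moving its item to the root
    elem[0] = elem[size];
    heapify(0);              // repair the heap property
    return e;
}

private void heapify(int v) {
    int s = v;                              // node with the smallest key so far
    int l = 2 * v + 1, r = 2 * v + 2;       // children of v, if internal
    if (l < size && key[l] < key[s]) s = l;
    if (r < size && key[r] < key[s]) s = r;
    if (s != v) {                           // (H) is violated at v:
        int tk = key[v]; key[v] = key[s]; key[s] = tk;         // exchange the
        Object te = elem[v]; elem[v] = elem[s]; elem[s] = te;  // items of v and s
        heapify(s);                         // and repair the subtree below
    }
}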

6.4 An array-based implementation

Based on the algorithms described in the last section, it is straightforward to design a tree-based heap data structure. However, one of the reasons that heaps are so efficient is that they can be stored in arrays in a straightforward way. The items of the heap are stored in an array H, and the internal nodes of the tree are associated with the indices of the array by numbering them level-wise from left to right. Thus the root is associated with 0, the left child of the root with 1, the right child of the root with 2, the leftmost grandchild of the root with 3, etc. (see Figure 6.7).

[Figure 6.7. Storing a heap in an array. The small numbers next to the nodes are the indices associated with them.]

So we represent the nodes of the heap by natural numbers. Observe that for every node v the left child of v is 2v + 1 and the right child of v is 2v + 2 (if they are internal nodes). The root is 0, and for every node v > 0 the parent of v is ⌊(v − 1)/2⌋. We store the number of items the heap contains in a variable size. Thus the last node is size − 1.

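The level-wise numbering can be summed up in three one-line helpers (the names are mine, not taken from heap.java). A quick sanity check: parent(left(v)) == v and parent(right(v)) == v hold for every v ≥ 0, thanks to integer division.

// The index arithmetic of the array representation.
static int left(int v)   { return 2 * v + 1; }
static int right(int v)  { return 2 * v + 2; }
static int parent(int v) { return (v - 1) / 2; } // = floor((v-1)/2) for v > 0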

Observe that we only ever have to insert or remove the last node of the heap. Using dynamic arrays, this can be done in amortised time Θ(1). Therefore, using the algorithms described in the previous section, the methods of PriorityQueue can be implemented with the time complexities shown in Table 6.8.

method        time complexity
insertItem    Θ(lg(n)) amortised
minElement    Θ(1)
removeMin     Θ(lg(n)) amortised
isEmpty       Θ(1)

Table 6.8.

In many important priority queue applications we know the size of the queue in advance. Then we can just fix the size of the array when we set up the priority queue, and there is no need to use dynamic arrays. In this situation, the time complexities Θ(lg(n)) for insertItem and removeMin are not amortised, but simple worst-case time complexities.

An implementation of the heap data structure can be found in the file heap.java, which is available on the CS2 lecture notes web page. In addition to the PriorityQueue methods, it has a useful constructor that takes an array H storing items and turns it into a heap. Algorithm 6.9 describes the algorithm behind this method in pseudo-code.

Algorithm buildHeap(H)
1. n ← H.length
2. for v ← ⌊n/2⌋ − 1 downto 0 do
3.     heapify(v)

Algorithm 6.9

To understand this algorithm, we first observe that ⌊n/2⌋ − 1 is the maximum v such that 2v + 1 ≤ n − 1, i.e., it is the last node of the heap that has at least one child. Recall that, applied to a node v of a tree such that the subtrees rooted at the children of v already satisfy the heap property (H), the procedure heapify(v) turns the subtree rooted at v into a tree satisfying (H). So the algorithm turns all subtrees of the tree into heaps in a bottom-up fashion. Since subtrees of height 1 are automatically heaps, the first node where something may actually have to be done is ⌊n/2⌋ − 1, the last node of height at least 2.

Straightforward reasoning shows that the time complexity of buildHeap is O(n · lg(n)), because each call of heapify requires time O(lg(n)). Surprisingly, a more refined argument shows that we can do better:

Theorem 6.10. The time complexity of buildHeap is Θ(n), where n is the length of the array H.

Proof: Let h(v) denote the height of a node v, and let m = ⌊n/2⌋ − 1. By (6.2), we obtain

T_buildHeap(n) = Σ_{v=0}^{m} ( Θ(1) + T_heapify(h(v)) ) = Θ(m) + Σ_{v=0}^{m} Θ(h(v)) = Θ(m) + Θ( Σ_{v=0}^{m} h(v) ).

Since Θ(m) = Θ(n), all that remains to prove is that Σ_{v=0}^{m} h(v) ∈ O(n). Observe that

Σ_{v=0}^{m} h(v) = Σ_{i=1}^{h} i · (number of nodes of height i)
                 ≤ Σ_{i=0}^{h−1} (h − i) · (number of nodes at level i)
                 ≤ Σ_{i=0}^{h−1} (h − i) · 2^i,

where h = ⌊lg(n)⌋ + 1. The rest is just math. We use the fact that

Σ_{i=0}^{∞} (i + 1) / 2^i = 4.   (6.3)

Then

Σ_{i=0}^{h−1} (h − i) · 2^i ≤ n · Σ_{i=0}^{h−1} (h − i) / 2^(h−1−i)   (since n ≥ 2^(h−1))
                           = n · Σ_{i=0}^{h−1} (i + 1) / 2^i
                           ≤ 4n   (by (6.3)).

Thus Σ_{v=0}^{m} h(v) ≤ 4n ∈ O(n), which completes our proof.

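Returning to Algorithm 6.9: on the array representation sketched earlier, the idea behind the buildHeap constructor takes only a few lines. This is again a sketch under the assumed field names (the heap's size field plays the role of n = H.length), not the actual code of heap.java.

// A sketch of buildHeap (Algorithm 6.9): heapify every node that has at
// least one child, from the last such node, size/2 - 1, down to the root.
private void buildHeap() {
    for (int v = size / 2 - 1; v >= 0; v--) {
        heapify(v);
    }
}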

6.5 Reading Assignment

Section 2.4 of [GT]. The presentation of the material is quite different there, but it's essentially the same. You can omit the parts on sorting; we will cover them in class later.

Martin Grohe