
The Binary Heap

A binary heap is a data structure that implements the abstract data type priority queue. That is, a binary heap stores a set of elements with a total order (meaning that every two elements can be compared), and supports the following operations on this set:

- findmin() returns the smallest element;
- deletemin() removes the smallest element from the set and returns it;
- insert(el) inserts a new element into the set;
- __len__() returns the current size (number of elements) of the set.

A complete binary tree is a binary tree in which every level is full, except possibly the bottom level, which is filled from left to right as in Fig. 1. For a given number of elements, the shape of the complete binary tree is prescribed completely by this description.

Figure 1: A complete binary tree.

A binary heap is a complete binary tree whose elements satisfy the heap-order property: no node contains an element less than its parent's element. Every subtree of a binary heap is also a binary heap, because every subtree is complete and satisfies the heap-order property.

Findmin. There are many different heaps possible for the same data, as there are many different ways to satisfy the heap-order property. However, all possible heaps have one feature in common: a smallest element must be in the root node. This implies that the findmin operation is very easy: it just returns the element from the root node.

Insert. To insert an element into the binary heap, we first have to add a node to the tree. Because we need to maintain that the tree is complete, there is only one possible place where we can add the node: if the lowest level is not yet full, we add the node at the next open position on this level; if the lowest level is full, we add a new node at the leftmost position on the next level. See Fig. 2 for both cases.

Figure 2: Two cases of adding a node.
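The same abstract data type is available in Python's standard library through the heapq module, which maintains a 0-indexed binary min-heap inside a plain list. A quick illustration of the four operations (heapq is the library's own API, not the PriorityQueue class developed below):

```python
import heapq

pq = []                      # heapq keeps the heap inside an ordinary list
for el in [7, 3, 9, 1]:
    heapq.heappush(pq, el)   # insert(el)

print(pq[0])                 # findmin(): the smallest element sits at index 0 -> 1
print(heapq.heappop(pq))     # deletemin(): removes and returns the minimum  -> 1
print(len(pq))               # current size of the set                       -> 3
```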

Of course, the new element may violate the heap-order property. We correct this by having the element bubble up the tree until the heap-order property is satisfied. More precisely, we compare the new element x with its parent; if x is less, we exchange x with its parent, and then repeat the procedure with x and its new parent. Fig. 3 shows the four stages during the insertion of the new element 2.

Figure 3: Adding element 2.

When we finish, is the heap-order property satisfied? Yes, if the heap-order property was satisfied before the insertion. Let's look at a typical exchange of x with a parent p during the insertion operation. Since the heap-order property was satisfied before the insertion, we know that p ≤ s (where s is x's sibling, see Fig. 4), p ≤ l, and p ≤ r (where l and r are x's children). We only swap if x < p, which implies that x < s; after the swap, x is the parent of s. After the swap, p is the parent of l and r. All other relationships in the subtree rooted at x are maintained, so after the swap, the subtree rooted at x has the heap-order property.

Figure 4: Maintaining the heap-order property during bubble-up.

Deletemin. The deletemin() operation starts by removing the entry at the root node and saving it for the return value. This leaves a gaping hole at the root. We fill the hole with the last entry in the tree (which we call x), so that the tree is still complete. It is unlikely that x is the minimum element. Fortunately, both subtrees rooted at the root's children are heaps, and thus the new minimum element is one of these two children. We bubble x down the heap as

follows: if x has a child that is smaller, swap x with the child holding the smaller element. Next, compare x with its new children; if x still violates the heap-order property, again swap x with the child holding the smaller element. Continue until x is less than or equal to its children, or reaches a leaf. Fig. 5 shows the stages in this process. Note that if the element bubbled down in the original heap had been 10 instead, it would not bubble down all the way to a leaf, but would remain on a higher level.

Figure 5: Removing the minimum element.

Array-based implementation. Of course a complete binary tree can be implemented as a linked data structure, with Node objects that store references to their parent and children. However, because the tree is complete, we can actually store it very compactly in an array or Python list. To do so, simply number the nodes of the tree from top to bottom, left to right, starting with one for the root, so that the nodes are numbered from 1 to n. We observe that the left child of node i has index 2i and the right child has index 2i + 1. The parent of node i has index ⌊i/2⌋. We will simply store node i in slot i of an array, and we can move from a node to its parent or its children by doing simple arithmetic on the index. (We started numbering from 1 for convenience: the numbers are nicer. Of course this means that we waste slot 0 of the array; not a big problem, but it could be fixed by changing the numbering scheme slightly.)

Implementation. We now give a complete implementation in Python. Our PriorityQueue class can be constructed either by providing a Python list to be used, or by passing None (in which case the PriorityQueue class will create a Python list itself). We use the doubling technique for growing the list. Initially, we create a list of some default capacity, and set _size to zero.
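Before the class methods, the index arithmetic of the array-based layout can be sanity-checked in isolation. This is a sketch with illustrative free functions (they are not part of the PriorityQueue class): parent/child navigation plus a heap-order check over a 1-indexed array with slot 0 unused.

```python
def parent(i): return i // 2
def left(i):   return 2 * i
def right(i):  return 2 * i + 1

def is_min_heap(data, n):
    # data[1..n] holds the heap; slot 0 is unused.
    # Heap order: every node (index >= 2) is at least its parent.
    return all(data[i // 2] <= data[i] for i in range(2, n + 1))

data = [None, 2, 3, 9, 7, 5]          # a valid min-heap in slots 1..5
print(is_min_heap(data, 5))           # True
print(left(2), right(2), parent(5))   # 4 5 2
```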

The methods __len__ and findmin() are very easy:

```python
def __len__(self):
    return self._size

def findmin(self):
    if self._size == 0:
        raise ValueError("empty priority queue")
    return self._data[1]
```

The insert method first ensures there is enough space to add a node to the tree. For efficiency, we do not actually place x in the new node, but keep it empty (a "hole"). Then we bubble the hole up in the tree until x is no longer smaller than the parent of the hole, and finally place x in the hole.

```python
def insert(self, x):
    if self._size + 1 == len(self._data):
        # double the size of the array storing the data
        self._data.extend([None] * len(self._data))
    self._size += 1
    hole = self._size
    self._data[0] = x
    # Percolate up
    while x < self._data[hole // 2]:
        self._data[hole] = self._data[hole // 2]
        hole //= 2
    self._data[hole] = x
```

Note that we use a little trick here: we put x into _data[0]. This allows us to skip a test for hole == 1 in the while-loop. Since the root has no parent, the loop should end there. However, since hole // 2 == 0 for hole == 1, at the root we will compare x to _data[0], that is, to itself, so the comparison will return False and the loop will end automatically.

Finally, the hardest operation is deletemin. It makes use of two internal methods. First, the method _smaller_child(hole, in_hole) checks whether one of the children of the node with index hole is smaller than the element in_hole. If so, it returns the index of the smaller child; otherwise it returns zero:

```python
def _smaller_child(self, hole, in_hole):
    if 2 * hole <= self._size:
        child = 2 * hole
        if child != self._size and self._data[child + 1] < self._data[child]:
            child += 1
        if self._data[child] < in_hole:
            return child
    return 0
```

The internal method _percolate_down(i) bubbles down the element in node i:

```python
def _percolate_down(self, i):
    in_hole = self._data[i]
    hole = i
    child = self._smaller_child(hole, in_hole)
    while child != 0:
        self._data[hole] = self._data[child]
        hole = child
        child = self._smaller_child(hole, in_hole)
    self._data[hole] = in_hole
```
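For experimenting with the sift-down logic outside the class, the same idea can be written as a standalone function on a 1-indexed list. This is a sketch; the free-function form and its name are illustrative and not part of the PriorityQueue class:

```python
def percolate_down(data, i, size):
    # Sift the element at index i down in the 1-indexed heap data[1..size].
    in_hole = data[i]
    while 2 * i <= size:
        child = 2 * i
        # Pick the smaller of the two children, if a right child exists.
        if child != size and data[child + 1] < data[child]:
            child += 1
        if data[child] < in_hole:
            data[i] = data[child]   # move the smaller child up
            i = child
        else:
            break
    data[i] = in_hole

data = [None, 9, 2, 3, 7, 5]   # 9 at the root violates heap order
percolate_down(data, 1, 5)
print(data)                    # [None, 2, 5, 3, 7, 9]
```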

With these internal methods, deletemin() is rather easy:

```python
def deletemin(self):
    min_item = self.findmin()   # raises an error if the queue is empty
    self._data[1] = self._data[self._size]
    self._size -= 1
    self._percolate_down(1)
    return min_item
```

Analysis. Since the heap is always a complete binary tree, its height is O(log n) at any time, where n is the current size of the set. It follows that the operations deletemin and insert take time O(log n). The operations __len__ and findmin take constant time.

Now imagine we have n elements and we want to insert them into a binary heap. Calling insert n times would take total time O(n log n). Perhaps surprisingly, we can actually do better than this by creating an array for the n elements, throwing the elements into this array in some arbitrary order, and then running the following method:

```python
def _build_heap(self):
    for i in range(self._size // 2, 0, -1):
        self._percolate_down(i)
```

The method works backward from the last internal node (non-leaf node) to the root node, that is, in reverse level-order. When we visit a node this way, we bubble its entry down the heap using _percolate_down, as in deletemin. We are making use of the fact that if the two subtrees of a node i are heaps, then after calling _percolate_down(i) the subtree rooted at node i is a heap. By induction, this implies that when _build_heap has completed, the entire tree is a binary heap.

The running time of _build_heap is somewhat tricky to analyze. Let h_i denote the height of node i, that is, the length of a longest path from node i to a leaf. The running time of _percolate_down(i) is clearly O(h_i), and so the running time of _build_heap is O(Σ_{i=1}^{⌊n/2⌋} h_i). How can we bound this sum? Here is a simple and elegant argument. For simplicity, let's assume that n = 2^{h+1} − 1, so the tree has height h and is perfect: all leaves are on level h. Clearly, adding nodes on the bottom level can only increase the value of the sum.
Now, for every node i there is a path P_i of length h_i from i to a leaf, as follows: from i, go down to its right child; from there, always go down to the left child until you reach a leaf. It is easy to see that for two different nodes i and j, the paths P_i and P_j do not share any edge. It follows that the total number of edges of all the paths P_i, for 1 ≤ i ≤ n, is at most the number of edges of the tree, which is n − 1. So we have Σ_{i=1}^{n} h_i ≤ n − 1, and the running time of _build_heap is O(n).

Heap Sort

Any priority queue can be used for sorting:

```python
def pqsort(a):
    pq = PriorityQueue()
    for e in a:
        pq.insert(e)
    for i in range(len(a)):
        a[i] = pq.deletemin()
```

The binary heap is particularly well suited to implement a sorting algorithm, because we can use the array containing the input data to implement the binary heap. We therefore obtain an in-place sorting algorithm, which sorts the data inside the array.
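As a numerical sanity check of the bound Σ h_i ≤ n − 1 from the _build_heap analysis above, the following sketch sums node heights in a perfect tree with n = 15 nodes (the function name is illustrative, not part of the PriorityQueue class):

```python
def node_height(i, n):
    # Height of node i in a 1-indexed complete binary tree with n nodes.
    # In a complete tree, repeatedly descending to the left child follows
    # a longest path from i down to a leaf.
    h = 0
    while 2 * i <= n:
        i = 2 * i
        h += 1
    return h

n = 15  # perfect tree of height 3: heights are 3, 2+2, 1*4, 0*8
total = sum(node_height(i, n) for i in range(1, n + 1))
print(total, "<=", n - 1)   # 11 <= 14
```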

Initially, we run _build_heap to put the objects inside the array into heap order. We then remove them one by one using deletemin. Every time an element is removed from the heap, the heap needs one less array slot. We can use this slot to store the element that we just removed from the heap, and obtain the following algorithm:

```python
def heapsort(a):
    heap = PriorityQueue(a)
    for i in range(len(a) - 1, 1, -1):
        a[i] = heap.deletemin()
    # After len(a) - 2 deletions, the largest element is already in slot 1.
```

Here is an example run:

```python
>>> a = [0, 9, 5, 2, 7, 3]
>>> heapsort(a)
>>> a
[0, 9, 7, 5, 3, 2]
```

The element in slot 0 of the array was not sorted, since our PriorityQueue implementation doesn't use this slot. This could be fixed by changing the indexing of the array. Also, the array is sorted backwards, because we have to fill it starting at the last index. This can be fixed by using a max-heap instead of a min-heap.
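The max-heap variant mentioned above can be sketched as a self-contained, 0-indexed, in-place heapsort that produces ascending order. This is a sketch under the same ideas as the text (build heap, then repeatedly extract); the function names are illustrative and not part of the PriorityQueue class:

```python
def sift_down(a, i, size):
    # Sift a[i] down in the 0-indexed max-heap a[0..size-1].
    while 2 * i + 1 < size:
        child = 2 * i + 1
        # Pick the larger of the two children, if a right child exists.
        if child + 1 < size and a[child + 1] > a[child]:
            child += 1
        if a[child] > a[i]:
            a[i], a[child] = a[child], a[i]
            i = child
        else:
            break

def heapsort_ascending(a):
    n = len(a)
    for i in range(n // 2 - 1, -1, -1):   # build a max-heap in O(n)
        sift_down(a, i, n)
    for end in range(n - 1, 0, -1):       # move the maximum to the end
        a[0], a[end] = a[end], a[0]
        sift_down(a, 0, end)

a = [5, 2, 9, 1, 7]
heapsort_ascending(a)
print(a)   # [1, 2, 5, 7, 9]
```

Because the maximum is repeatedly swapped into the slot freed at the back of the array, the result comes out ascending, with no wasted slot 0.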