

CS4 Lecture Notes /30/0
Heaps, Heapsort, Priority Queues

Sorting problem so far:
  Insertion Sort: in place, O(n^2) worst case
  Merge Sort: not in place, O(n lg n) worst case
  Quicksort: in place, O(n^2) worst case, O(n lg n) expected case
  Heapsort: in place, O(n lg n) worst case

Heap: a data structure and associated algorithms, NOT GARBAGE COLLECTION.

A heap data structure is an array of objects that can be viewed as a complete binary tree such that:
  Each tree node corresponds to an element of the array.
  The tree is complete except possibly the lowest level, which is filled from left to right.
  The heap property holds for every node I in the tree except the root: value at Parent(I) ≥ value at I.

Example: [Figure: an example array drawn as a binary tree, with each node labeled by its array index.]

Note that the elements are not sorted; the heap property only guarantees that the max element is at the root of the tree.
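
The heap property above can be checked mechanically. The following is a minimal sketch, assuming Python lists with 0-based indexing, so the parent of index i is (i - 1) // 2 rather than the 1-based floor(I/2) used in these notes. The function name is_max_heap and the sample values are illustrative and are not taken from the lecture.

```python
def is_max_heap(a):
    """Return True if list `a` satisfies the max-heap property.

    Assumes 0-based indexing: the parent of index i is (i - 1) // 2.
    """
    # Every node except the root must be no larger than its parent.
    return all(a[(i - 1) // 2] >= a[i] for i in range(1, len(a)))


example = [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]  # illustrative values only
print(is_max_heap(example))          # True
print(is_max_heap(sorted(example)))  # False: ascending order puts the min at the root
```

Note that a heap is only partially ordered: the check compares each node with its parent and says nothing about siblings.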

The height of a node in the tree is the number of edges on the longest simple downward path from the node to a leaf; e.g. in the example tree the height of a leaf is 0, the height of node 4 is 1, and the height of node 1 (the root) is 3. The height of the tree is the height of the root. As in any complete binary tree of size n, this is ⌊lg n⌋.

Caveats: 2^h nodes at level h. 2^(h+1) - 1 total nodes in a complete binary tree of height h.

A heap represented as an array A has two attributes:
  1. Length(A): size of the array
  2. HeapSize(A): size of the heap
The property Length(A) ≥ HeapSize(A) must be maintained. (Why?)

The heap property is stated as A[Parent(I)] ≥ A[I]. The root of the tree is A[1].

Formulas to compute parents and children in an array:
  Parent(I) = A[⌊I/2⌋]
  Left Child(I) = A[2I]
  Right Child(I) = A[2I+1]
(Show how to represent the above tree as an array in the example.)

Where might we want to use heaps? Consider the Priority Queue problem: given a sequence of objects with varying degrees of priority, we want to deal with the highest-priority item first.
  Managing air traffic control: want to do the most important tasks first. Jobs are placed in a queue with a priority, and controllers take jobs off the top of the queue.
  Scheduling jobs on a processor: critical applications need high priority.
  Event-driven simulator with time of occurrence as key: use a min-heap, which keeps the smallest element on top, to get the next occurring event.

To support these operations we need to extract the maximum element from the heap:

  HEAP-EXTRACT-MAX(A)
      remove A[1]
      A[1] ← A[n]        ; n is HeapSize(A), the length of the heap, not the array
      n ← n-1            ; decrease size of heap
      Heapify(A,1,n)     ; remake heap to conform to heap properties

Runtime: Θ(1) + Heapify time

Differences from book:
  no error handling
  n instead of HeapSize(A)
  slightly higher abstraction
  passing n to the Heapify routine

Note: Successive removals will result in items in reverse sorted order!

We will look at:
  Heapify: maintain the heap property
  Build Heap: how to initially build a heap
  Heapsort: sorting using a heap

Heapify: Maintain the heap property by floating a value down the heap, starting at I, until it is in the right position.

  Heapify(A,I,n)    ; Array A, heapify node I, heapsize is n
      ; Note that the left and right subtrees of I are also heaps
      ; Make I's subtree be a heap
      if 2I ≤ n and A[2I] > A[I]    ; see which is largest of current node and its children
          then largest ← 2I
          else largest ← I
      if 2I+1 ≤ n and A[2I+1] > A[largest]
          then largest ← 2I+1
      if largest ≠ I
          then swap A[I] ↔ A[largest]
               Heapify(A,largest,n)

Differences from book: 2I and 2I+1 instead of left and right, n instead of heapsize.

Example: Heapify(A,1,10). [Figures: the array A and its tree, redrawn after each step below.]
  Find the largest of the children and swap. All subtrees are valid heaps, so we know the children are the maximums of their subtrees.
  Next is Heapify(A,2,10).
  Next is Heapify(A,4,10).
  Next is Heapify(A,8,10). On this iteration we have reached a leaf and are finished.
(Consider what happens if we started at node 3 with n=7.)

Runtime: Intuitively this is Θ(lg n), since we make one trip down a path of the tree and we have an almost complete binary tree. This would correspond to:
$$T(n) = T\left(\frac{n}{2}\right) + \Theta(1)$$
The worst-case runtime is actually:
$$T(n) = T\left(\frac{2n}{3}\right) + \Theta(1)$$
That is, the subproblem can be as large as 2/3 the size of the problem.

Consider the number of nodes on the left side vs. the right in the most unbalanced state. In the worst case the bottom level of the left subtree of the root is completely filled and the bottom level of the right subtree is empty, so the right subtree is one level shorter than the left. This is the most unbalanced a heap will ever become, because the lowest level is filled from left to right.

For any complete binary tree with n non-leaf nodes and l leaves, where the lowest level is full, l = n+1. That is, about half of the tree is leaves. For such a tree of height h, the number of leaves is 2^h and the number of nodes (not counting leaves) is 2^h - 1. Let h be the height of the left subtree. Then in the worst case the left subtree has about 2^h leaves + 2^h nodes, and the right subtree has about 2^h nodes. If we take the ratio of the left subtree over the total number of nodes:
$$\frac{2^h + 2^h}{2^h + 2^h + 2^h} = \frac{2}{3}$$
So in the worst case each recursive call of Heapify works on at most 2/3 of the nodes.
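
The Heapify procedure can be sketched in Python as follows. This is a hedged illustration, not the lecture's code: it assumes 0-based indexing (children of i are 2*i + 1 and 2*i + 2 instead of 2I and 2I+1), it replaces the recursive call with an equivalent loop, and the sample array is made up for the demonstration.

```python
def heapify(a, i, n):
    """Float a[i] down until the subtree rooted at i is a max-heap.

    Assumes both child subtrees of i are already max-heaps and that only
    the first n entries of `a` belong to the heap (0-based indexing).
    """
    while True:
        left, right = 2 * i + 1, 2 * i + 2
        largest = i
        if left < n and a[left] > a[largest]:
            largest = left
        if right < n and a[right] > a[largest]:
            largest = right
        if largest == i:
            return  # a[i] is at least as large as both children
        a[i], a[largest] = a[largest], a[i]
        i = largest  # continue one level down, like the recursive call


# Illustrative values: only the root violates the heap property.
a = [1, 14, 10, 8, 7, 9, 3, 2, 4, 6]
heapify(a, 0, len(a))
print(a)  # [14, 8, 10, 4, 7, 9, 3, 2, 1, 6]
```

The loop visits one node per level, which is where the Θ(lg n) intuition comes from.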

Given:
$$T(n) = T\left(\frac{2n}{3}\right) + \Theta(1)$$
we can solve by the master theorem. Case 2: a=1, b=3/2. Is
$$\Theta(1) = \Theta\left(n^{\log_{3/2} 1}\right) = \Theta(n^0)?$$
YES, so T(n) = Θ(f(n) lg n) = Θ(lg n).

Building The Heap: Given an array A, we want to build this array into a heap. Note: Leaves are already a heap! Start from the leaves and build up from there.

  Build-Heap(A,n)
      for I ← n downto 1        ; could start at ⌊n/2⌋
          do Heapify(A,I,n)

Start with the leaves (last ½ of A) and consider each leaf as a 1-element heap. Call Heapify on the parents of the leaves, and continue to call Heapify, moving up the tree to the root.

Example: Build-Heap(A,10). [Figures: the array A and its tree, redrawn after each step below.]
  Heapify(A,10,10) exits since this is a leaf.
  Heapify(A,9,10) exits since this is a leaf.
  Heapify(A,8,10) exits since this is a leaf.
  Heapify(A,7,10) exits since this is a leaf.
  Heapify(A,6,10) exits since this is a leaf.
  Heapify(A,5,10) puts the largest of A[5] and its children, A[10], into A[5].
  Heapify(A,4,10).
  Heapify(A,3,10).
  Heapify(A,2,10): the first iteration swaps, and this calls Heapify(A,5,10).
  Heapify(A,1,10): calls Heapify(A,2,10), which calls Heapify(A,5,10).
Finished heap.

Running Time: We have a loop of n iterations, and each iteration calls Heapify, which runs in Θ(lg n). This implies a bound of O(n lg n). This is correct, but is a loose bound! We can do better. Note: This is a good approach in general. Start with whatever bound you can determine, then try to tighten it.

Key observation: Each time Heapify is run within the loop, it is not run on the entire tree. We run it on subtrees, which have a lower height, so these calls do not take lg n time. Since most of the tree's nodes are near the leaves, most of the time the heaps are small compared to n.

Better Bound for Build-Heap:

Property: In an n-element heap there are at most ⌈n/2^h⌉ nodes of height h. (Here the leaves are at height 1 and the root is at lg n; this is backwards from normal.)

The time required by Heapify when called in Build-Heap on a node at height h is O(h); h = lg n for the entire tree.

Cost of Build-Heap is:
$$T(n) = \sum_{h=1}^{\lg n} (\#\text{ nodes at height } h)(\text{Heapify time})$$
$$T(n) = \sum_{h=1}^{\lg n} \frac{n}{2^h}\, O(h)$$
$$T(n) = O\left(n \sum_{h=1}^{\lg n} \frac{h}{2^h}\right)$$

We know that, for |x| < 1,
$$\sum_{k=0}^{\infty} k x^k = \frac{x}{(1-x)^2}$$
If x = 1/2 then (1/2)^h = 1/2^h, so:
$$\sum_{h=0}^{\infty} \frac{h}{2^h} = \frac{1/2}{(1 - 1/2)^2} = 2$$

Substitute this back in, which is safe because the sum from 0 to infinity is LARGER than the sum from 1 to lg n. This means we are working with a somewhat looser upper bound on the right hand side:
$$T(n) \le O\left(n \sum_{h=0}^{\infty} \frac{h}{2^h}\right)$$
$$T(n) \le O(2n)$$
$$T(n) = O(n)$$
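
A bottom-up Build-Heap can be sketched as below. This is an illustration under stated assumptions: 0-based Python lists, a loop that starts at the last node with a child (the "could start at n/2" shortcut) instead of at the last leaf, and a sample array chosen for the demonstration, not the one used in the lecture's example.

```python
def heapify(a, i, n):
    """Float a[i] down within a[0:n]; children of i are 2*i + 1 and 2*i + 2."""
    while True:
        largest = i
        for child in (2 * i + 1, 2 * i + 2):
            if child < n and a[child] > a[largest]:
                largest = child
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest


def build_heap(a):
    """Rearrange list `a` in place into a max-heap, bottom-up."""
    n = len(a)
    # Indices n//2 .. n-1 are leaves (already 1-element heaps), so start
    # at the last node that has a child and work back to the root.
    for i in range(n // 2 - 1, -1, -1):
        heapify(a, i, n)


data = [4, 1, 3, 2, 16, 9, 10, 14, 8, 7]  # illustrative values only
build_heap(data)
print(data)  # [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
```

Skipping the leaves does not change the result, since Heapify on a leaf exits immediately; it only avoids about n/2 trivial calls.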

DONE! Build-Heap is O(n).

HeapSort: Once we can build a heap and heapify a heap, sorting is easy. The idea is:

  HeapSort(A,n)
      Build-Heap(A,n)
      for I ← n downto 2
          do Swap(A[1] ↔ A[I])
             Heapify(A,1,I-1)

Slightly different from the book. The book removes the root and puts it into a sorted list. Here we stick it at the end of the array and the loop decreases the size of the heap, so the element is not touched again.

Example: HeapSort(A,7), where A is already a heap. [Figures: the array A and its tree, redrawn after each step below.]
  Swap root with A[7], then Heapify(A,1,6).
  Swap root with A[6], then Heapify(A,1,5).
  Swap root with A[5], then Heapify(A,1,4).
  Swap root with A[4], then Heapify(A,1,3).
  Swap root with A[3], then Heapify(A,1,2).
  Swap root with A[2].
We are done!

Runtime is O(n lg n), since we call Heapify n-1 times, and each call starts at the root of the remaining heap and costs O(lg n).

Note: In-place sort; it requires no extra storage, unlike Merge Sort, which uses extra space in the recursion.

Variations on heaps: a heap could have the min on top instead of the max.
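
The HeapSort procedure described earlier can be sketched in Python as follows. This is a minimal sketch under assumptions: 0-based indexing, an iterative heapify helper, and made-up sample data. The name heap_sort is illustrative.

```python
def heapify(a, i, n):
    """Float a[i] down within a[0:n]; children of i are 2*i + 1 and 2*i + 2."""
    while True:
        largest = i
        for child in (2 * i + 1, 2 * i + 2):
            if child < n and a[child] > a[largest]:
                largest = child
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest


def heap_sort(a):
    """Sort list `a` in place in ascending order using a max-heap."""
    n = len(a)
    # Build-Heap: heapify every node that has a child, bottom-up.
    for i in range(n // 2 - 1, -1, -1):
        heapify(a, i, n)
    # Repeatedly move the max to the end and shrink the heap by one.
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]
        heapify(a, 0, end)  # the heap now occupies a[0:end]


data = [5, 2, 14, 7, 10, 6, 9]  # illustrative values only
heap_sort(data)
print(data)  # [2, 5, 6, 7, 9, 10, 14]
```

The only extra storage is a few index variables, which is what makes the sort in place.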

A heap could also be a k-ary tree instead of binary.

Priority Queues: A priority queue is a data structure for maintaining a set S of elements, each with an associated key value. Operations:
  Insert(S,x) puts element x into set S
  Max(S) returns the largest element in set S
  Extract-Max(S) removes and returns the largest element in set S
Uses: job scheduling, event-driven simulation, etc.

We can model priority queues nicely with a heap. This is nice because heaps allow us to do fast queue maintenance.

Max(S): Just return the root element. Takes O(1) time.

  Heap-Insert(A,key)
      n ← n+1
      I ← n
      while I > 1 and A[⌊I/2⌋] < key
          do A[I] ← A[⌊I/2⌋]
             I ← ⌊I/2⌋
      A[I] ← key

Idea: same as Heapify. Start from a new node, and propagate its value up to the right level.

Example: Insert a new element, starting at a new node on the bottom, I=8. [Figures: the tree, redrawn after each step below.]
  Bubble up: I=4.
  Bubble up again: I=2.
  At this point the parent is larger, so the algorithm stops.

Runtime = O(lg n), since we make at most one pass up the levels of the tree.

  Heap-Extract-Max(A,n)
      max ← A[1]
      A[1] ← A[n]
      n ← n-1
      Heapify(A,1,n)
      return max

Idea: Make the nth element the root, then call Heapify to fix. Uses a constant amount of time plus the time to call Heapify, which is O(lg n). Total time is then O(lg n).

Example: Extract(A,7). [Figures: the tree, redrawn after each step below.]
  Remove the root and save it in max.
  Move A[7] into A[1].
  Heapify(A,1,6).
We have a new heap that is valid, with the old max being returned. A stale copy of the moved element is still sitting in A[7], but since n is updated to equal 6, it will be overwritten if a new element is added, and otherwise ignored.

In general: Heaps give logarithmic or constant-time queue operations, as opposed to linear.

  Length   O(lg n)   O(n)
  8        3         8
  16       4         16
  32       5         32
  256      8         256
  1024     10        1024
  4096     12        4096

Consider the cost to traverse if we used linked lists!

Recap of sorts so far: Insertion, Merge, Heap, Quicksort.
Quicksort is actually used most frequently despite its O(n^2) worst-case runtime.
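
To tie the priority queue operations together, here is a small sketch of a heap-backed max-priority queue. It is an illustration only: the class name MaxPriorityQueue and its method names are hypothetical, indexing is 0-based (parent of i is (i - 1) // 2), and a Python list stands in for the array A with its heap size n. Like the notes, it does no error handling beyond what Python raises on an empty list.

```python
class MaxPriorityQueue:
    """Max-priority queue backed by a binary max-heap in a list."""

    def __init__(self):
        self._a = []

    def insert(self, key):
        """Heap-Insert: open a new leaf and bubble the key up. O(lg n)."""
        a = self._a
        a.append(key)
        i = len(a) - 1
        # Shift smaller parents down until the key's level is found.
        while i > 0 and a[(i - 1) // 2] < key:
            a[i] = a[(i - 1) // 2]
            i = (i - 1) // 2
        a[i] = key

    def max(self):
        """Return the largest key without removing it. O(1)."""
        return self._a[0]

    def extract_max(self):
        """Remove and return the largest key. O(lg n)."""
        a = self._a
        top = a[0]
        last = a.pop()  # shrink the heap by one
        if a:
            a[0] = last  # move the last element to the root
            self._heapify(0)
        return top

    def _heapify(self, i):
        a, n = self._a, len(self._a)
        while True:
            largest = i
            for child in (2 * i + 1, 2 * i + 2):
                if child < n and a[child] > a[largest]:
                    largest = child
            if largest == i:
                return
            a[i], a[largest] = a[largest], a[i]
            i = largest


pq = MaxPriorityQueue()
for priority in [3, 14, 7, 10]:  # illustrative priorities
    pq.insert(priority)
print(pq.max())          # 14
print(pq.extract_max())  # 14
print(pq.extract_max())  # 10
```

Because a Python list shrinks on pop, no stale copy of the moved element is left behind, unlike the fixed-array version in the notes.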