W4231: Analysis of Algorithms

W4231: Analysis of Algorithms
10/5/1999 (revised 10/6/1999)
More hashing. Binomial heaps.

Lectures Next Week
No lecture on Tues. October 12 and Thurs. October 14. Extra lecture (2 1/2 hours) on Mon. October 11 in 1127 Mudd, 11:30am-2:00pm.

Office Hours and Homework
Extra office hours on Fri. October 8, 3-4pm. No office hours on Thurs. October 14. Homework is due in class next Monday, or to Dario, in his office, next Tues. 3-4pm.

Midterm
The midterm is tentatively scheduled for Tues. Oct. 19, 7-9pm. An alternate date/time will be set for (CVN and in-class) students with serious conflicts with the scheduled one (three so far). The syllabus for the midterm includes everything so far, plus binomial heaps (today) and the notion of amortization (next time). No Fibonacci heaps.

Definition of Universal Hashing
A random hash function h : {1,...,M} → {1,...,m} is universal if for every two different inputs x and y,

    Pr[h(x) = h(y)] ≤ 1/m.

Analysis of Use of Universal Hashing
Let y_1,...,y_n be an arbitrary set of keys that have been inserted in the table, and let x be an arbitrary key. Call C the random variable (depending on the choice of h) that counts the number of collisions between h(x) and h(y_1),...,h(y_n), i.e. C = |{j : h(y_j) = h(x)}|. Call C_y the random variable (depending on the choice of h) that is 1 if h(x) = h(y), and 0 otherwise.

Then for every y,

    E[C_y] = 0 · Pr[h(x) ≠ h(y)] + 1 · Pr[h(x) = h(y)] = Pr[h(x) = h(y)] ≤ 1/m.

Since C = Σ_{i=1}^{n} C_{y_i}, we have

    E[C] = E[Σ_{i=1}^{n} C_{y_i}] = Σ_{i=1}^{n} E[C_{y_i}] ≤ n/m = α.

The running time for a delete or find on x is O(C), so the average running time is O(α).

Interpretation
If we build the table with a random universal hash function, then no matter which sequence of keys we have to insert, and which key we are considering at a given moment, the average (over h) of the time needed to do a find or a delete on x is O(n/m), the load factor. This is the same as having random inputs.

Applications of Hashing
The Set data structure is useful in several applications; in compilers it is often implemented with hash functions. In the static dictionary problem, we are interested in building in one shot a data structure for n given elements (with a possibly huge preprocessing time) and then only want to do find; worst-case constant-time implementations are possible using universal hashing as a fundamental primitive. It is possible to come up with random hash functions such that, given a value v, it is very hard to come up with an x such that h(x) = v (e.g. MD5 from RSA); in order to sign an email, it is sufficient to sign a hash of the email. Sometimes distributed caching (e.g. for web servers) is done using hashing: a hash function distributes URLs among machines dedicated to caching. Hashing based on sums mod m can be a disaster.

Hashing, Conclusions
Hashing works very well in practice. Typically it is not necessary to pick h at random, and experimental analysis with typical inputs will help find values of m for which the mod hash function is good. Random choices give stronger guarantees of efficiency. A tolerable downside of hashing: high worst-case complexity. A sometimes intolerable downside of hashing: we need to know the maximum number n of elements to be put in the table, so that we can choose m. If we under-estimate it, collisions become frequent and we may need to reconstruct from scratch a bigger table. If we over-estimate it, we waste space, and we may need to construct from scratch a smaller table and free the memory of the bigger one. Even assuming periodic reconstructions, if done in the right way, one can maintain constant amortized complexity on average.
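The notes do not fix a concrete universal family, so the following is only an illustrative sketch. A standard construction (Carter-Wegman style, not from the lecture) picks a prime p at least as large as the key universe M and sets h_{a,b}(x) = ((a·x + b) mod p) mod m for random a ∈ {1,...,p-1} and b ∈ {0,...,p-1}; any two distinct keys then collide with probability at most 1/m over the choice of (a, b), which is exactly the property used in the analysis above. The prime P and all names below are assumptions made for the example.

    import random

    P = 2_147_483_647   # assumed prime larger than the key universe M

    def random_universal_hash(m):
        """Pick a random member h_{a,b} of the universal family."""
        a = random.randrange(1, P)
        b = random.randrange(0, P)
        return lambda x: ((a * x + b) % P) % m

    # Usage: a chained hash table whose expected find/delete cost is
    # O(1 + n/m) = O(1 + alpha), regardless of which keys are inserted.
    m = 1024
    h = random_universal_hash(m)
    table = [[] for _ in range(m)]
    for key in (15, 2**20 + 3, 42):
        table[h(key)].append(key)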

Binomial Heaps
Binomial heaps implement the abstract data structure Priority Queue, with the union operation. The operations of the abstract data structure are:
- H := create(); returns an empty data structure.
- p := insert(x, H); puts element x in H and returns a pointer to it.
- p := find-min(H); gives a pointer to the element of H with the minimum key.
- delete-min(H); removes from H the element having the smallest key.
- H := union(H_1, H_2); creates a new data structure containing all the elements of H_1 and H_2.
- decrease-key(p, k); changes the key of the element pointed to by p to k.
Note: there is no operation to find (a pointer to) an element with a given key.
Main result: all operations run in O(log n) worst-case time.
Note: AVL trees and red-black trees (which we will not see) can implement all the operations in O(log n) time except union().

Binomial Tree
A binomial heap is a forest of binomial trees, so we had better start by defining binomial trees. There is a binomial tree B_k for every integer k ≥ 0. Inductive definition: a single node is the binomial tree B_0. Given two binomial trees B_i, join them by making the root of one of them be the first child of the other; this gives B_{i+1}.

Properties of B_k
- It has 2^k nodes.
- It has depth k (depth = number of edges in the longest path from the root to a leaf).
- The root has k children, which is the largest degree in the tree.
- There are (k choose i) nodes at level i in the tree. (Hence the name.)

Representing a binomial tree
Each node is a record containing an element of the data type we want to store, together with pointers to its father, its leftmost child, and its next sibling in the left-to-right order. The tree itself can be represented as a record containing a pointer to the root and an integer field (for a tree B_k the integer field will be k). We will concentrate on binomial trees that are heap-ordered: the key stored in a node is smaller than (or equal to) the keys stored at its children.
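As a concrete rendering of this representation, here is a minimal sketch; the class and field names (BinomialNode, BinomialTree, parent/child/sibling) are my own, not from the lecture.

    class BinomialNode:
        """One node of a heap-ordered binomial tree, in the
        father / leftmost-child / next-sibling representation."""
        def __init__(self, key, item=None):
            self.key = key        # key used for the heap order
            self.item = item      # the stored element
            self.parent = None    # "father" pointer
            self.child = None     # leftmost child
            self.sibling = None   # next sibling in left-to-right order

    class BinomialTree:
        """A pointer to the root of a B_k together with the integer k,
        so the tree has 2**k nodes."""
        def __init__(self, root, order):
            self.root = root
            self.order = order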

Merging two binomial trees of the same size
Consider the operation T := tree-join(T_1, T_2) that, given two heap-ordered binomial trees T_1 and T_2 of the same size, returns a heap-ordered binomial tree containing the union of T_1 and T_2. This can be done in O(1) time, by making the root with the larger key the first child of the root with the smaller key.

Binomial Heap
A binomial heap H is a forest of binomial trees in which the trees all have different parameters k. It is represented as a list of the roots. [More precisely, a list of records, each record containing a pointer to a root and an integer field telling the size of the tree.] The list is ordered by increasing size of the trees. Elements in the binomial trees are heap-ordered (the value of the key at a node is smaller than the value of the keys at its children).

Structure and the find-min Operation
If there are n items in H, no tree can have parameter k bigger than log n (since B_k has 2^k nodes). Since all the trees are different, there are at most log n trees. The minimum is one of the roots; hence the minimum can be found in O(log n) time by scanning the list of roots.

Union
Join the lists of roots of H_1 and H_2. This is similar to the Merge() phase of MergeSort, except that when we find two trees of the same size we do a tree-join() operation. At most one tree-join() is done per element in the lists, for a total of O(log n_1 + log n_2) = O(log n) time.
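Below is a sketch of tree-join() and union() under the representation sketched earlier; a heap is taken to be a Python list of the hypothetical BinomialTree records, kept in increasing order of tree size. The carry handling mirrors binary addition, in the spirit of the MergeSort analogy above.

    def tree_join(t1, t2):
        """Join two heap-ordered binomial trees of the same order k into one
        B_{k+1}: the root with the smaller key stays on top.  O(1) time."""
        assert t1.order == t2.order
        small, big = (t1, t2) if t1.root.key <= t2.root.key else (t2, t1)
        big.root.parent = small.root
        big.root.sibling = small.root.child   # becomes the new first child
        small.root.child = big.root
        return BinomialTree(small.root, small.order + 1)

    def union(h1, h2):
        """Union of two binomial heaps (lists of trees sorted by order).
        Like the merge step of MergeSort, joining equal-order trees as we go,
        in the style of binary addition with carries.  O(log n) time."""
        merged = sorted(h1 + h2, key=lambda t: t.order)
        result, carry = [], None
        for tree in merged:
            if carry is None:
                carry = tree
            elif carry.order == tree.order:
                carry = tree_join(carry, tree)   # the carry propagates upward
            elif carry.order < tree.order:
                result.append(carry)
                carry = tree
            else:                                # carry outgrew this order
                result.append(tree)
        if carry is not None:
            result.append(carry)
        return result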

Insert
To implement insert(x, H):
1. Create a binomial heap H' that contains only x. O(1) time.
2. Let H be union(H, H'). O(log n) time.
Total: O(log n) time.

decrease-key(p, k, H)
Reduce the value of the key of the element pointed to by p to k. If x becomes smaller than its father, exchange it with the father, and proceed recursively to bubble x up, the same as with standard heaps. Each swap takes constant time, and there are at most as many swaps as the depth: O(log n) time.

delete-min(H)
The minimum is always the root of some tree T. Remove the tree T from the heap H, delete the root from T, and make a new binomial heap H' containing the subtrees of the root. Do a union(H, H'). Total time: O(log n).

delete(p, H)
This operation is not required by the description of the abstract data structure, but it is implementable with the operations we have so far. Do a decrease-key(p, k, H), choosing k such that after the decrease the key of the element pointed to by p is smaller than the current minimum of H. Then do a delete-min(H).
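Putting the pieces together, here is a hedged sketch of these operations in terms of the union() above, again using the hypothetical BinomialNode/BinomialTree classes. Keys are assumed to be numbers so that delete() can decrease to -infinity, and decrease-key is simplified to swap key/item pairs upward (which would invalidate outside pointers, a bookkeeping detail the sketch ignores).

    def insert(key, heap, item=None):
        """insert = union with a one-element heap.  O(log n)."""
        return union(heap, [BinomialTree(BinomialNode(key, item), 0)])

    def find_min(heap):
        """The minimum is one of the at most log n roots; scan them."""
        return min((t.root for t in heap), key=lambda node: node.key)

    def decrease_key(node, new_key):
        """Lower node's key and bubble it up toward its root.  O(log n) swaps."""
        node.key = new_key
        while node.parent is not None and node.key < node.parent.key:
            node.key, node.parent.key = node.parent.key, node.key
            node.item, node.parent.item = node.parent.item, node.item
            node = node.parent

    def delete_min(heap):
        """Remove the tree whose root is minimal, turn its subtrees into a
        binomial heap H', and union H' back in.  O(log n)."""
        t = min(heap, key=lambda tr: tr.root.key)
        heap.remove(t)
        subtrees, child, order = [], t.root.child, t.order - 1
        while child is not None:             # children are B_{k-1}, ..., B_0
            nxt = child.sibling
            child.parent = child.sibling = None
            subtrees.append(BinomialTree(child, order))
            child, order = nxt, order - 1
        subtrees.reverse()                   # list must be by increasing order
        return t.root, union(heap, subtrees)

    def delete(node, heap):
        """delete = decrease the key below the current minimum (here: to
        -infinity, overwriting the original key), then delete-min."""
        decrease_key(node, float("-inf"))
        return delete_min(heap)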