INF2220: algorithms and data structures Series 3


Universitetet i Oslo, Institutt for Informatikk
A. Maus, R.K. Runde, I. Yu

Topic: Map and Hashing (Exercises with hints for solution)
Issued: 7. 09. 2016
Classroom

Exercise 1 (Hash table complexity) What is the complexity of finding order information, such as the max, min, or range of the stored values, in a hash table?

Solution: [of Exercise 1] Hash tables are not really made for that. In particular, they don't contain or represent order information. In a way, a good hashing function should scatter the keys without any regard for their order. Since that's the case, there is no other way to determine the maximum etc. than going through the whole table. So the complexity is O(n). Note that there is no distinction between best, worst, and average case: one needs to search through all of it, as said.

Exercise 2 (Extendible hashing) Assume that you have an empty hash table, and that M = 4. Show the hash table you get by inserting the following numbers: {000100, 001000, 100000, 011011, 111000, 010100, 101010, 010101, 111001, 001010, 000111, 001001, 101110, 100100}

Solution: [of Exercise 2] See Figure 1.

directory entries: 000 001 010 011 100 101 110 111
stored values (in the order shown in the figure): 000100 000111 001000 001001 001010 010100 010101 011011 100000 100100 101010 101110 111000 111001

Figure 1: solution of Exercise 2

Exercise 3 (Deletion from a hash table) Explain how deletion is performed both with probing and with separate chaining hash tables.
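As a hint for Exercise 3, one common way to handle the two cases can be sketched as follows. With probing, a removed slot is usually only marked as deleted (lazy deletion), because emptying it would cut the probe sequence of keys stored behind it; with separate chaining, the element is simply unlinked from the list in its slot. The class name, the method names, and the sentinel values EMPTY and DELETED are illustrative assumptions, not taken from the course material, and keys are assumed to be non-negative integers.

```java
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

// Illustrative sketch; all names are hypothetical.
class DeletionSketch {
    // Sentinels for the probing table; assumes all real keys are >= 0.
    static final int EMPTY = -1;
    static final int DELETED = -2;

    static int[] newProbingTable(int size) {
        int[] table = new int[size];
        Arrays.fill(table, EMPTY);
        return table;
    }

    // Linear probing insert; assumes the key is not yet stored and a free slot exists.
    static void insert(int[] table, int key) {
        int i = key % table.length;
        while (table[i] >= 0) {            // slot holds a real key
            i = (i + 1) % table.length;
        }
        table[i] = key;                    // EMPTY and DELETED slots can both be reused
    }

    // Returns the slot of key, or -1. The search stops at EMPTY but walks past DELETED.
    static int find(int[] table, int key) {
        int i = key % table.length;
        for (int probes = 0; probes < table.length && table[i] != EMPTY; probes++) {
            if (table[i] == key) return i;
            i = (i + 1) % table.length;
        }
        return -1;
    }

    // Lazy deletion: mark the slot instead of emptying it.
    static boolean remove(int[] table, int key) {
        int i = find(table, key);
        if (i < 0) return false;
        table[i] = DELETED;
        return true;
    }

    // Separate chaining: unlink the key from the list in its slot.
    static boolean removeChained(List<LinkedList<Integer>> table, int key) {
        return table.get(key % table.size()).remove(Integer.valueOf(key));
    }

    public static void main(String[] args) {
        int[] table = newProbingTable(13);
        insert(table, 42);                 // 42 mod 13 = 3
        insert(table, 3);                  // collides at slot 3, stored in slot 4
        remove(table, 42);                 // slot 3 becomes DELETED, not EMPTY
        System.out.println(find(table, 3)); // 3 is still reachable in slot 4
    }
}
```

If remove set the slot back to EMPTY instead, the search for 3 in the example would stop at slot 3 and wrongly report the key as missing.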

Exercise 4 (Hash table behavior) Given the input {42,39,57,3,18,5,67,13,70,26}, a fixed table size of 13, and the hash function H(X) = X mod 13, show the resulting

1. Linear probing hash-table
2. Separate chaining hash-table

Solution: [of Exercise 4] Exercises like this are not uncommon in earlier exams, and not just for hash tables, of course. Note: last year's solution had an error (13 and 26 should have swapped places).

1. Linear probing: That one is pretty simple: just calculate X mod 13; if the slot is filled (a "conflict"), move (typically) to the right until you find a free slot. Note that for such modulo-based hashing functions it is convenient that the first slot of the array has index 0 (as is the case in Java). [1]

[1] Some older/weird languages may start with 1, in which case one has to shift the mod-function by one.

The hash table is rather small, but perhaps one gets a feeling for what the problem of primary clustering is. One can compare that with the figure below (with separate chaining): for instance, the linked list at slot 0 has length 3, so with linear probing its elements grow into the cluster starting at slot 2, and so on, such that in the end there is one big cluster.

As a hint for an exam: if the task is (like here) to build a separate chaining table and a probing table from the same data, it seems slightly faster and less error-prone to do the separate chaining table first. Of course, one cannot do the separate chains first and then look only at the result to construct the other one: the order of the original insertions plays a role, and it is not completely preserved in the separate chaining table.

index:     0  1  2  3  4  5  6  7  8  9 10 11 12
elements: 39 13 67 42  3 57 18  5 70 26

2. Separate chaining hash table: The solution here is slightly inefficient. As you can see from the linked lists, a new element is inserted at the end. There is no reason to do that. Inserting at the head is expected to be slightly more efficient, because it avoids traversing the list. In practice there is perhaps not such a big difference. If one has a good hashing function (and consequently not too heavy clustering), the linked lists should be reasonably short. If they get too long, one would increase the size of the array. But still, there is no need to insert at the end.

In the book [Weiss, p. 198, Fig 5.10] the method List.add (from the Java library) is used, whose documentation states "Appends the specified element to the end of this list (optional operation)". It is to be expected that the Java implementation of a linked list is more subtle than a naive singly linked list, so the above remark may not be too relevant. If, however, one builds the linked list from scratch, adding at the beginning is more appropriate.

Note: remembering esoteric details of the Java library, such as the fact that add appends to the end, is not exam material. Knowing that adding to a linked list at the beginning or at the end can make a difference (depending on how the list is represented) is something one should be aware of.
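Both tables of Exercise 4 can be checked with a few lines of code. The following is only a sketch under stated assumptions: the class and method names are made up for illustration, free slots are represented by null, and the probing table is assumed never to fill up. The flag atHead switches between the head insertion recommended above and the tail insertion used in the printed solution.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

// Illustrative sketch; all names are hypothetical.
class Exercise4Sketch {
    // Linear probing: on a collision, step to the right (wrapping around) until a slot is free.
    static Integer[] linearProbing(int[] keys, int size) {
        Integer[] table = new Integer[size];   // null marks a free slot
        for (int key : keys) {
            int i = key % size;
            while (table[i] != null) {         // assumes the table never becomes full
                i = (i + 1) % size;
            }
            table[i] = key;
        }
        return table;
    }

    // Separate chaining: one list per slot; insert at the head or at the tail of the list.
    static List<LinkedList<Integer>> chaining(int[] keys, int size, boolean atHead) {
        List<LinkedList<Integer>> table = new ArrayList<>();
        for (int i = 0; i < size; i++) {
            table.add(new LinkedList<>());
        }
        for (int key : keys) {
            LinkedList<Integer> chain = table.get(key % size);
            if (atHead) {
                chain.addFirst(key);
            } else {
                chain.addLast(key);
            }
        }
        return table;
    }

    public static void main(String[] args) {
        int[] input = {42, 39, 57, 3, 18, 5, 67, 13, 70, 26};
        System.out.println(Arrays.toString(linearProbing(input, 13)));
        System.out.println(chaining(input, 13, false));
    }
}
```

Since java.util.LinkedList is doubly linked, addFirst and addLast both take constant time here; the difference discussed above only shows up for a hand-written singly linked list without a tail pointer.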

index:     0  1  2  3  4  5  6  7  8  9 10 11 12
List of   39    67 42    57
elements: 13        3    18
          26              5
                         70

Exercise 5 (Hash table behavior) Given the input {4371,1323,6173,4199,4344,9679,1989}, a fixed table size of 10, and the hash function H(X) = X mod 10, show the resulting

1. Linear probing hash-table
2. Quadratic probing hash-table
3. Separate chaining hash-table

Solution: [of Exercise 5]

1. Linear probing hash-table.

index:       0    1    2    3    4    5    6    7    8    9
elements: 9679 4371 1989 1323 6173 4344                4199

2. Quadratic probing hash-table.

index:       0    1    2    3    4    5    6    7    8    9
elements: 9679 4371      1323 6173 4344           1989 4199

3. Separate chaining hash-table: that one is of course almost too boring.

index:       0    1    2    3    4    5    6    7    8    9
List of        4371      1323 4344                     4199
elements:                6173                          9679
                                                       1989

Exercise 6 (Hash table behavior) Given the input {15,78,56,25,19,38,57,76,34,53,72,91}, a fixed table size of 19, and the hash function H(X) = X mod 19,

1. show the resulting Quadratic probing hash-table
2. show the resulting Double probing (double hashing) hash-table. Note that you first have to find the largest prime number which is smaller than the size of the hash table.

Solution: [of Exercise 6]
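The probe sequences behind the quadratic and double probing tables of Exercises 5 and 6 can be sketched in code. This is a hedged illustration, not the course's reference solution: the names are hypothetical, and the second hash function R - (X mod R), with R the largest prime below the table size (17 for size 19), is an assumption based on the usual textbook choice. With that assumption the sketch yields the tables given for these exercises.

```java
// Illustrative sketch; all names are hypothetical.
class ProbingSketch {
    // Maps (key, attempt number i, table size) to the slot to try on the i-th attempt.
    interface Probe {
        int slot(int key, int i, int size);
    }

    // Inserts every key at the first free slot of its probe sequence; null marks a free slot.
    static Integer[] fill(int[] keys, int size, Probe probe) {
        Integer[] table = new Integer[size];
        for (int key : keys) {
            boolean placed = false;
            for (int i = 0; i < size && !placed; i++) {
                int slot = probe.slot(key, i, size);
                if (table[slot] == null) {
                    table[slot] = key;
                    placed = true;
                }
            }
            if (!placed) {
                throw new IllegalStateException("no free slot found for " + key);
            }
        }
        return table;
    }

    // Quadratic probing: the i-th attempt looks at (H(X) + i*i) mod size.
    static Integer[] quadratic(int[] keys, int size) {
        return fill(keys, size, (key, i, n) -> (key % n + i * i) % n);
    }

    // Double hashing: the i-th attempt looks at (H(X) + i * H2(X)) mod size,
    // with the assumed second hash H2(X) = r - (X mod r).
    static Integer[] doubleHashing(int[] keys, int size, int r) {
        return fill(keys, size, (key, i, n) -> (key % n + i * (r - key % r)) % n);
    }

    public static void main(String[] args) {
        int[] input = {15, 78, 56, 25, 19, 38, 57, 76, 34, 53, 72, 91};
        System.out.println(java.util.Arrays.toString(quadratic(input, 19)));
        System.out.println(java.util.Arrays.toString(doubleHashing(input, 19, 17)));
    }
}
```

For example, 1989 in Exercise 5 tries slots 9, 0, 3 and finally 8 under quadratic probing, and 34 in Exercise 6 has step size 17 under double hashing and tries slots 15, 13, 11, 9 and finally 7.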

1. Quadratic probing hash-table:

index:     0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18
elements: 19 38 78    57 53 25       76       72 91    15 34    56

2. Double probing hash-table. Double probing is described on page 203.

index:     0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18
elements: 19    78 53       25 34    76 91 57    38    15 72    56

Exercise 7 (Complexity questions: Binary hash map) A class BinaryHashMap serves as the basis for the first 3 questions in this assignment. The internal array (of lists) used by BinaryHashMap has length 2. Consequently, the (very basic) hash function hashes all keys with an even length to 0 and all keys with an odd length to 1. The following complexity questions should be answered in big-O notation, both for the average case and for the worst case.

1. What is the complexity of locating an element in an unordered list?

2. What is the complexity of locating an element inside the BinaryHashMap? binhash is an object of BinaryHashMap.
   Object o = binhash.get("a key");
   Hint: the elements are distributed among two unordered lists.

3. What is the complexity of inserting an element into the BinaryHashMap?
   binhash.put("some key", some_obj_pointer);
   Hint: keys are unique, so we cannot simply add it to one of our internal lists. [2]

4. Let M be the number of list pointers inside a hash table, and assume that we have a hash function which is perfect. I.e., if we fill the table with M key-value pairs, we have zero collisions and all our internal lists contain one element each. What is the complexity of locating an element in a hash table with a perfect hash function, if it contains N elements and has M internal list pointers? I.e., big-O expressed with N and M.

[2] It's not a law of nature that hash tables as a concept work only with unique keys. However, the binary hash map of this exercise should be a degenerate version of Java's hash map (java.util.HashMap<K,V>). That is a hash table implementation of a map, i.e. a function associating values with keys (a mapping from keys to values). Hence, keys must be unique. Note in passing: the put method of Java's hash map overwrites the old value (if there is one) with the newly inserted one and gives back the old value (if there has been one). This detail does not affect the complexity.

Solution: [for Exercise 7]

1. Finding an element in an unordered list:
   (a) worst case: n comparisons, i.e. O(n). The worst case is that the element is the last one we inspect (equally if the element is not there; in that case we have to check them all as well until we are sure).
   (b) average case: about n/2 comparisons, i.e. O(n). On average, one has to check half the list to find an element. According to the definitions we learnt, constant factors such as 1/2 are irrelevant in the asymptotic considerations underlying the big-O notation. So the correct answer is O(n) as well, or linear complexity. Similarly for the other questions.

2. Finding an element in the given binary hash map:
   worst case: n comparisons, i.e. O(n). The worst case is that the hash table is completely unbalanced in the sense that all elements are chained to one slot, and furthermore that the element is at the end of that list.
   average case: (n/2)/2 = n/4 comparisons, i.e. O(n). One has two lists with n/2 elements each on average, and one has to check half of a list on average to find an element.

3. Insertion:
   (a) worst case: n comparisons, i.e. O(n)
   (b) average case: n/4 comparisons, i.e. O(n)
   We first have to look for the element as before, i.e. we pay the same penalty as before. Updating the pointer to the new element can be done in constant time and can therefore be ignored.

4. If the elements are distributed perfectly over the internal lists as described, each list contains N/M elements. The complexity of finding an element in a list of length N/M is
   (a) worst case: N/M
   (b) average case: (N/M)/2 = N/(2M)
   i.e. O(N/M) in both cases.

Exercise 8 We are going to implement a few functions inside the class BinaryHashMap.

1. Implement the function boolean remove(String key), which returns false if the element is not present inside the BinaryHashMap, and true otherwise.

2. Implement the function String[] keys(), which returns all keys inside the BinaryHashMap. (HINT: we know how large this array has to be from this.size)

3. Implement the function Object[] toArray(), which should return all values from the hash table.

Exercise 9 Write an implementation of a hash table which uses separate chaining to handle hash collisions. The implementation should include inserting, deleting, and searching for an element in the hash table.
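As a starting point for Exercises 8 and 9, a separate chaining map could be sketched as follows. This is one possible design under stated assumptions, not the course's reference solution: the class name ChainedHashMap, its members, and the idea of passing the hash function as a parameter are all made up for illustration. Note how put first searches the chain, as discussed in the solution of Exercise 7, and only then adds a new entry at the head.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.function.ToIntFunction;

// Illustrative sketch; ChainedHashMap and its members are hypothetical names.
class ChainedHashMap<V> {
    private static final class Entry<V> {
        final String key;
        V value;
        Entry(String key, V value) { this.key = key; this.value = value; }
    }

    private final List<LinkedList<Entry<V>>> buckets = new ArrayList<>();
    private final ToIntFunction<String> hash;
    private int size;

    ChainedHashMap(int capacity, ToIntFunction<String> hash) {
        for (int i = 0; i < capacity; i++) buckets.add(new LinkedList<>());
        this.hash = hash;
    }

    // The chain (list) responsible for the given key.
    private LinkedList<Entry<V>> chainFor(String key) {
        return buckets.get(Math.floorMod(hash.applyAsInt(key), buckets.size()));
    }

    // Linear search through one chain; returns null if the key is absent.
    private Entry<V> find(String key) {
        for (Entry<V> e : chainFor(key)) {
            if (e.key.equals(key)) return e;
        }
        return null;
    }

    V get(String key) {
        Entry<V> e = find(key);
        return e == null ? null : e.value;
    }

    // Keys are unique: overwrite an existing entry, otherwise insert at the head.
    // Returns the previous value, or null if there was none.
    V put(String key, V value) {
        Entry<V> e = find(key);
        if (e != null) {
            V old = e.value;
            e.value = value;
            return old;
        }
        chainFor(key).addFirst(new Entry<>(key, value));
        size++;
        return null;
    }

    boolean remove(String key) {
        Entry<V> e = find(key);
        if (e == null) return false;
        chainFor(key).remove(e);
        size--;
        return true;
    }

    int size() { return size; }
}
```

With this sketch, new ChainedHashMap<Object>(2, String::length) behaves like the BinaryHashMap of Exercise 7: keys of even length end up in list 0 and keys of odd length in list 1.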