Hash Table. A hash function h maps keys of a given type into integers in a fixed interval [0,m-1]

Size: px
Start display at page:

Download "Hash Table. A hash function h maps keys of a given type into integers in a fixed interval [0,m-1]"

Transcription

1 Exercise # 8- Hash Tables Hash Tables Hash Function Uniform Hash Hash Table Direct Addressing A hash function h maps keys of a given type into integers in a fixed interval [0,m-1] 1 Pr h( key) i, where m is the size of the hash table. m A hash table for a given key type consists of: Hash function h: keys-set [0,m-1] Array (called table) of size m K is a set whose elements' keys are in the range [0,m-1]. Use a table of size m and store each element x in index x.key. Disadvantage: when K << m waste of space Chaining h(k) = k mod m (This is an example of a common hash function) If h(k) is occupied, add the new element in the head of the chain at index h(k) Open Addressing Basic idea: if slot is full, try another slot until an open slot is found (Probing). Search is similar to insert and is a good solution for fixed sets. Deletion need to mark slot during deletion, search time is no longer dependent on α. Linear Probing: h(k,i) = (h'(k) + i)mod m 0 i m-1 h'(k) - common hash function First try h(k,0) = h'(k), if it is occupied, try h(k,1) etc.... Advantage: simplicity Disadvantage: clusters, uses Θ(m) permutations of index addressing sequences Double Hashing: h(k,i) = (h1(k) + i h2(k))mod m 0 i m-1 h1 hash function h2 step function also a hash function First try h(k,0) = h1(k), if it is occupied, try h(k,1) etc.... Advantage: less clusters, uses Θ(m*m) permutations of index addressing sequences

2 Load Factor α n m, Hash table with m slots that stores n elements (keys) Average (expected) Search Time Open Addressing unsuccessful search: O(1+1/(1-α)) successful search: O(1+ (1/α)ln( 1/(1-α))) Chaining unsuccessful search: Θ (1 + α) successful search: Θ (1 + α/2) = Θ (1 + α) Dictionary ADT The dictionary ADT models a searchable collection of key-element items, and supports the following operations: Insert Delete Search Bloom Filter A Bloom filter model consists of a bit-array of size m (the bloom filter) and k hash functions h1, h2,... hk. It supports insertion and search queries only: Insert at O(1) Search(x) in O(1), with (1-e ) probability of getting a false positive kn / m k Remarks: 1. The efficiency of hashing is examined in the average case, not the worst case. 2. Load factor defined above is the average number of elements that are hashed to the same value. 3. Let us consider chaining: The load factor is the average number of elements that are stored in a single linked list. If we examine the worst-case, then all n keys are hashed to the same cell and form a linked list of length n. Thus, running time of searching of an element in the worst case is (Θ(n)+ the time of computing the hash function). So, it is clear that we are not using hash tables due to their worst case performance, and when analyzing running time of hashing operations we refer to the average case.

3 Question 1 Given a hash table with m=11 entries and the following hash function h1 and step function h2: h1(key) = key mod m h2(key) = {key mod (m-1)} + 1 Insert the keys {22, 1, 13, 11, 24, 33, 18, 42, 31} in the given order (from left to right) to the hash table using each of the following hash methods: a. Chaining with h1 h(k) = h1(k) b. Linear-Probing with h1 h(k,i) = (h1(k)+i) mod m c. Double-Hashing with h1 as the hash function and h2 as the step function h(k,i) = (h1(k) + ih2(k)) mod m Solution: Chaining Linear Probing Double Hashing

4 Question 2 a. For the previous h1 and h2, is it OK to use h2 as a hash function and h1 as a step function? b. Why is it important that the result of the step function and the table size will not have a common divisor? That is, if hstep is the step function, why is it important to enforce GCD(hstep(k),m) = 1 for every k? Solution: a. No, since h1 can return 0 and h2 skips the entry 0. For example, key 11 h1(11) = 0, and all the steps are of length 0, h(11,i) = h2(11) + i*h1(11) = h2(11) = 2. The sequence of index addressing for key 11 consists of only one index, 2. b. Suppose GCD(hstep(k),m) = d >1 for some k. The search after k would only address 1/d of the hash table, as the steps will always get us to the same cells. If those cells are occupied we may not find an available slot even if the hast table if not full. Example: m = 48, hstep(k) = 24,GCD(48,24) = 24 only There are two guidelines on choosing m and step function so that GCD(hstep(k),m)=1: 2 cells of the table would be addressed. 1. The size of the table m is a prime number, and hstep(k) < m for every k. (An example for this is illustrated in Question1 ). 2. The size of the table m is 2 x, and the step function returns only odd values. Question 3 Consider two sets of integers, S = {s1, s2,..., sm} and T = {t1, t2,..., tn}, m n. a. Describe a deterministic algorithm that checks whether S is a subset of T. What is the running time of your algorithm? b. Devise an algorithm that uses a hash table of size m to test whether S is a subset of T. What is the average running time of your algorithm? c. Suppose we have a bloom filter of size r, and k hash functions h1, h2, hk : U [0, r-1] (U is the set of all possible keys). Describe an algorithm that checks whether S is a subset of T at O(n) worst time. What is the probability of getting a positive result although S is not a subset of T? (As a function of n, m, r and S T, you can assume k = O(1))

5 Solution: a) Subset(T,S,n,m) T=sort(T) for each sj S found = BinarySearch(T, sj) if (!found) return "S is not a subset of T" return "S is a subset of T" Time Complexity: O(nlogn+mlogn)=O(nlogn) b) The solution that uses a hash table of size m with chaining: SubsetWithHashTable(T,S,n,m) for each ti T insert(ht, ti) for each sj S found = search(ht, sj) if (!found) return "S is not a subset of T" return "S is a subset of T" Time Complexity Analysis: Inserting all elements ti to HT: n times O(1) = O(n) If S T then there are m successful searches: m (1 + α 2 ) = O(m) If S T then in the worst case all the elements except the last one, sm is in S, are in T, so there are (m-1) successful searches and one unsuccessful search: (m 1) (1 + α 2 ) + (1 + α) = O(m) Time Complexity: O(n + m) = O(n) c) We first insert all the elements in T to the Bloom filter. Then for each element x in S we query search(x), once we get one negetive result we return FALSE and break, if we get positive results on all the m queries we return TRUE. Time Complexity Analysis: n Insertions and m search queries to the bloom filter: n O(k) + m O(k) = (n+m) O(1) = O(n) The probability of getting a false positive result: Denote t = S T, for all t elements in S T we will get a positive result, the probability kn / r that we get a positive result for each of the m-t elements in S\T is ~ (1-e ) k and thus the probability of getting a positive result for all these m-t elements and so returning a kn / r k ( m t) TRUE answer although S T is ~ (1-e ) Note that the above analysis (b and c) gives an estimation of average run time

6 Question 4 You are given an array A[1..n] with n distinct real numbers and some value X. You need to determine whether there are two numbers in A[1..n] whose sum equals X. a. Do it in deterministic O(nlogn) time. b. Do it in expected O(n) time. Solution a: Sort A (O(nlogn) time). Pick A[1]. Perform a binary search (O(logn) time) on A to find an element that is equal to X-A[1]. If found, you re done. Otherwise, repeat the same with A[2], A[3], Overall O(nlogn) time Solution b: 1. Choose a universal hash function h from a universal collection of hash functions H. 2. Store all elements of array A in a Hash table of size n using h and chaining. 3. For (i = 1; i <= n; i++) { } Return false If ((A[i] is in Table) && (X-A[i] is in Table)) { Return true Time Complexity: We don't know if the elements inside the array distribute randomly. If they do not distribute randomly then using a "regular" hash function can result in α = O(n) and then the runtime could be O(n^2). If we use a universal hash function then regardless of the keys distribution the probability that two keys collide is no more than 1/m = 1/n. As seen in class if we use chaining and a universal hash function then the expected search time is O(1) and the expected runtime for the algorithm is O(n). Question 5 - Rolling hash Rolling hash is used when we want to enable fast search for parts of data. For example searching sub paragraphs in a long paragraph. In Rolling hash, a hash value is calculated for every fixed size part of the input in a sliding window manner. Example:

7 Input: Hello my name is Inigo Montoya Window size: 7 The result is a hash value for every 8 characters: Window Hash value example Hello my name is Inigo Montoya hash on Hello m = 20 Hello my name is Inigo Montoya hash on ello my = 571 Hello my name is Inigo Montoya hash on llo my = 12 Hello my name is Inigo Montoya hash on lo my n = 322 Hello my name is Inigo Montoya hash on o my na = 9 Hello my name is Inigo Montoya hash on my nam = 111 Hello my name is Inigo Montoya hash on my name = 7 Usually the window size is 512 and characters which means 512 bytes and each window differ from its predecessor in 1 character in the beginning and 1 in the end. Due to the fact that mod is a heavy command (if we are using mod as hash), and mod on a number with 512X8 bits (2^4096 ) is especially heavy, it is much easier to calculate the hash value of the window based on the difference from the preceding window. In a given text, the 1 st character is: a (ascii between 0 and 255) and the (k+1) th character is b (ascii between 0 and 255). If the hash value of the first k size window w1 is X, and the hush function is mod m, what is the value of the second window w2?

8 Solution: a is the most significant byte in a 8k bit number and b is the list significant byte. Thus: w2 = (w1 a*2 8(k-1) )*2 8 +b w2 mod m = ((w1 a*2 8(k-1) )*2^8 +b) mod m = (w1*2 8 a*2 8k +b) mod m = ((w1 mod m)* (2 8 mod m) (a mod m)*(2 8k mod m) + b mod m) mod m = X * (2 8 mod m) (a mod m)*(2 8k mod m) + b mod m) mod m Note that X is known from the last window, a,b and 2 8 are small numbers easy to calculate mod and 2 8k mod m can be calculated once in the beginning and saved. Question 6 You have a large amount of data chunks that need to be stored in servers in the cloud. In order to for the data to be distributed evenly, each data chunked is hashed to decide what server it will be stored in. In case a server stops working, its data chunks are supposed to be distributed evenly among the remaining servers. The data chunks of the remaining server obviously need to continue being stored in their server. If we implement such a system by numbering the servers and using modulo the number of servers as hash function, we will have a problem when a server stops working and the number of server changes, all the hash values will change. Suggest a way to implement such a system.

9 Solution: Consistent hashing The servers and data chunks will be hashed to the same table (array). Each server will be mapped by a number of hash functions so it will have a number of values in the table. Each server gets the chunks on its right until it meets another server: In this case: Chunk 1 is mapped to server 1 Chunk 2 is mapped to server 3 (the array is circular) Chunk 3 is mapped to server 2 Chunk 4 is mapped to server 2 Chunk 5 is mapped to server 2 Chunk 6 is mapped to server 2 Chunk 7 is mapped to server 3 Chunk 8 is mapped to server 1 If server 2 stops working, the other server still have their data and the data of server 2 is distributed among them: Now: Chunk 1 is mapped to server 1 Chunk 2 is mapped to server 3 (the array is circular) Chunk 3 is mapped to server 1 Chunk 4 is mapped to server 3 Chunk 5 is mapped to server 1 Chunk 6 is mapped to server 3 Chunk 7 is mapped to server 3 Chunk 8 is mapped to server 1

10 Question 7 You have a hash table with α=0.5. What is the probability that 3 or more keys will be mapped in to the same index? Solution: According to Poisson distribution, if the expected number of keys in the same index is λ, than the probability to get exactly k keys in 1 index is: λk e λ k! In our case λ=0.5 so: The probability for an index to be empty is 0.50 e 0.5 0! = e 0.5 The probability for an index to hold exactly 1 key is 0.51 e 0.5 The probability for an index to hold exactly 2 keys is 0.52 e 0.5 1! 2! = 0.5e 0.5 = 0.125e 0.5 And thus the probability of 3 or more keys is 1 e e e 0.5 = 0.014

Practical Session 8- Hash Tables

Practical Session 8- Hash Tables Practical Session 8- Hash Tables Hash Function Uniform Hash Hash Table Direct Addressing A hash function h maps keys of a given type into integers in a fixed interval [0,m-1] 1 Pr h( key) i, where m is

More information

Hash Tables. Hashing Probing Separate Chaining Hash Function

Hash Tables. Hashing Probing Separate Chaining Hash Function Hash Tables Hashing Probing Separate Chaining Hash Function Introduction In Chapter 4 we saw: linear search O( n ) binary search O( log n ) Can we improve the search operation to achieve better than O(

More information

Hashing. Yufei Tao. Department of Computer Science and Engineering Chinese University of Hong Kong

Hashing. Yufei Tao. Department of Computer Science and Engineering Chinese University of Hong Kong Department of Computer Science and Engineering Chinese University of Hong Kong In this lecture, we will revisit the dictionary search problem, where we want to locate an integer v in a set of size n or

More information

Data Structures and Algorithms. Chapter 7. Hashing

Data Structures and Algorithms. Chapter 7. Hashing 1 Data Structures and Algorithms Chapter 7 Werner Nutt 2 Acknowledgments The course follows the book Introduction to Algorithms, by Cormen, Leiserson, Rivest and Stein, MIT Press [CLRST]. Many examples

More information

Data Structures and Algorithms. Roberto Sebastiani

Data Structures and Algorithms. Roberto Sebastiani Data Structures and Algorithms Roberto Sebastiani roberto.sebastiani@disi.unitn.it http://www.disi.unitn.it/~rseba - Week 07 - B.S. In Applied Computer Science Free University of Bozen/Bolzano academic

More information

DATA STRUCTURES AND ALGORITHMS

DATA STRUCTURES AND ALGORITHMS LECTURE 11 Babeş - Bolyai University Computer Science and Mathematics Faculty 2017-2018 In Lecture 9-10... Hash tables ADT Stack ADT Queue ADT Deque ADT Priority Queue Hash tables Today Hash tables 1 Hash

More information

of characters from an alphabet, then, the hash function could be:

of characters from an alphabet, then, the hash function could be: Module 7: Hashing Dr. Natarajan Meghanathan Professor of Computer Science Jackson State University Jackson, MS 39217 E-mail: natarajan.meghanathan@jsums.edu Hashing A very efficient method for implementing

More information

HO #13 Fall 2015 Gary Chan. Hashing (N:12)

HO #13 Fall 2015 Gary Chan. Hashing (N:12) HO #13 Fall 2015 Gary Chan Hashing (N:12) Outline Motivation Hashing Algorithms and Improving the Hash Functions Collisions Strategies Open addressing and linear probing Separate chaining COMP2012H (Hashing)

More information

Data Structures and Algorithm Analysis (CSC317) Hash tables (part2)

Data Structures and Algorithm Analysis (CSC317) Hash tables (part2) Data Structures and Algorithm Analysis (CSC317) Hash tables (part2) Hash table We have elements with key and satellite data Operations performed: Insert, Delete, Search/lookup We don t maintain order information

More information

Dictionary. Dictionary. stores key-value pairs. Find(k) Insert(k, v) Delete(k) List O(n) O(1) O(n) Sorted Array O(log n) O(n) O(n)

Dictionary. Dictionary. stores key-value pairs. Find(k) Insert(k, v) Delete(k) List O(n) O(1) O(n) Sorted Array O(log n) O(n) O(n) Hash-Tables Introduction Dictionary Dictionary stores key-value pairs Find(k) Insert(k, v) Delete(k) List O(n) O(1) O(n) Sorted Array O(log n) O(n) O(n) Balanced BST O(log n) O(log n) O(log n) Dictionary

More information

Chapter 9: Maps, Dictionaries, Hashing

Chapter 9: Maps, Dictionaries, Hashing Chapter 9: 0 1 025-612-0001 2 981-101-0002 3 4 451-229-0004 Maps, Dictionaries, Hashing Nancy Amato Parasol Lab, Dept. CSE, Texas A&M University Acknowledgement: These slides are adapted from slides provided

More information

Today s Outline. CS 561, Lecture 8. Direct Addressing Problem. Hash Tables. Hash Tables Trees. Jared Saia University of New Mexico

Today s Outline. CS 561, Lecture 8. Direct Addressing Problem. Hash Tables. Hash Tables Trees. Jared Saia University of New Mexico Today s Outline CS 561, Lecture 8 Jared Saia University of New Mexico Hash Tables Trees 1 Direct Addressing Problem Hash Tables If universe U is large, storing the array T may be impractical Also much

More information

Fundamental Algorithms

Fundamental Algorithms Fundamental Algorithms Chapter 7: Hash Tables Michael Bader Winter 2011/12 Chapter 7: Hash Tables, Winter 2011/12 1 Generalised Search Problem Definition (Search Problem) Input: a sequence or set A of

More information

Understand how to deal with collisions

Understand how to deal with collisions Understand the basic structure of a hash table and its associated hash function Understand what makes a good (and a bad) hash function Understand how to deal with collisions Open addressing Separate chaining

More information

Hashing Techniques. Material based on slides by George Bebis

Hashing Techniques. Material based on slides by George Bebis Hashing Techniques Material based on slides by George Bebis https://www.cse.unr.edu/~bebis/cs477/lect/hashing.ppt The Search Problem Find items with keys matching a given search key Given an array A, containing

More information

CSC263 Week 5. Larry Zhang.

CSC263 Week 5. Larry Zhang. CSC263 Week 5 Larry Zhang http://goo.gl/forms/s9yie3597b Announcements PS3 marks out, class average 81.3% Assignment 1 due next week. Response to feedbacks -- tutorials We spent too much time on working

More information

Hash Tables Outline. Definition Hash functions Open hashing Closed hashing. Efficiency. collision resolution techniques. EECS 268 Programming II 1

Hash Tables Outline. Definition Hash functions Open hashing Closed hashing. Efficiency. collision resolution techniques. EECS 268 Programming II 1 Hash Tables Outline Definition Hash functions Open hashing Closed hashing collision resolution techniques Efficiency EECS 268 Programming II 1 Overview Implementation style for the Table ADT that is good

More information

CSE100. Advanced Data Structures. Lecture 21. (Based on Paul Kube course materials)

CSE100. Advanced Data Structures. Lecture 21. (Based on Paul Kube course materials) CSE100 Advanced Data Structures Lecture 21 (Based on Paul Kube course materials) CSE 100 Collision resolution strategies: linear probing, double hashing, random hashing, separate chaining Hash table cost

More information

Successful vs. Unsuccessful

Successful vs. Unsuccessful Hashing Search Given: Distinct keys k 1, k 2,, k n and collection T of n records of the form (k 1, I 1 ), (k 2, I 2 ),, (k n, I n ) where I j is the information associated with key k j for 1

More information

Tirgul 7. Hash Tables. In a hash table, we allocate an array of size m, which is much smaller than U (the set of keys).

Tirgul 7. Hash Tables. In a hash table, we allocate an array of size m, which is much smaller than U (the set of keys). Tirgul 7 Find an efficient implementation of a dynamic collection of elements with unique keys Supported Operations: Insert, Search and Delete. The keys belong to a universal group of keys, U = {1... M}.

More information

Hash Table and Hashing

Hash Table and Hashing Hash Table and Hashing The tree structures discussed so far assume that we can only work with the input keys by comparing them. No other operation is considered. In practice, it is often true that an input

More information

Module 5: Hashing. CS Data Structures and Data Management. Reza Dorrigiv, Daniel Roche. School of Computer Science, University of Waterloo

Module 5: Hashing. CS Data Structures and Data Management. Reza Dorrigiv, Daniel Roche. School of Computer Science, University of Waterloo Module 5: Hashing CS 240 - Data Structures and Data Management Reza Dorrigiv, Daniel Roche School of Computer Science, University of Waterloo Winter 2010 Reza Dorrigiv, Daniel Roche (CS, UW) CS240 - Module

More information

AAL 217: DATA STRUCTURES

AAL 217: DATA STRUCTURES Chapter # 4: Hashing AAL 217: DATA STRUCTURES The implementation of hash tables is frequently called hashing. Hashing is a technique used for performing insertions, deletions, and finds in constant average

More information

HASH TABLES. Goal is to store elements k,v at index i = h k

HASH TABLES. Goal is to store elements k,v at index i = h k CH 9.2 : HASH TABLES 1 ACKNOWLEDGEMENT: THESE SLIDES ARE ADAPTED FROM SLIDES PROVIDED WITH DATA STRUCTURES AND ALGORITHMS IN C++, GOODRICH, TAMASSIA AND MOUNT (WILEY 2004) AND SLIDES FROM JORY DENNY AND

More information

1 Probability Review. CS 124 Section #8 Hashing, Skip Lists 3/20/17. Expectation (weighted average): the expectation of a random quantity X is:

1 Probability Review. CS 124 Section #8 Hashing, Skip Lists 3/20/17. Expectation (weighted average): the expectation of a random quantity X is: CS 124 Section #8 Hashing, Skip Lists 3/20/17 1 Probability Review Expectation (weighted average): the expectation of a random quantity X is: x= x P (X = x) For each value x that X can take on, we look

More information

Hashing. 1. Introduction. 2. Direct-address tables. CmSc 250 Introduction to Algorithms

Hashing. 1. Introduction. 2. Direct-address tables. CmSc 250 Introduction to Algorithms Hashing CmSc 250 Introduction to Algorithms 1. Introduction Hashing is a method of storing elements in a table in a way that reduces the time for search. Elements are assumed to be records with several

More information

Hash Tables. Reading: Cormen et al, Sections 11.1 and 11.2

Hash Tables. Reading: Cormen et al, Sections 11.1 and 11.2 COMP3600/6466 Algorithms 2018 Lecture 10 1 Hash Tables Reading: Cormen et al, Sections 11.1 and 11.2 Many applications require a dynamic set that supports only the dictionary operations Insert, Search

More information

1. [1 pt] What is the solution to the recurrence T(n) = 2T(n-1) + 1, T(1) = 1

1. [1 pt] What is the solution to the recurrence T(n) = 2T(n-1) + 1, T(1) = 1 Asymptotics, Recurrence and Basic Algorithms 1. [1 pt] What is the solution to the recurrence T(n) = 2T(n-1) + 1, T(1) = 1 2. O(n) 2. [1 pt] What is the solution to the recurrence T(n) = T(n/2) + n, T(1)

More information

Dictionaries and Hash Tables

Dictionaries and Hash Tables Dictionaries and Hash Tables 0 1 2 3 025-612-0001 981-101-0002 4 451-229-0004 Dictionaries and Hash Tables 1 Dictionary ADT The dictionary ADT models a searchable collection of keyelement items The main

More information

Outline. hash tables hash functions open addressing chained hashing

Outline. hash tables hash functions open addressing chained hashing Outline hash tables hash functions open addressing chained hashing 1 hashing hash browns: mixed-up bits of potatoes, cooked together hashing: slicing up and mixing together a hash function takes a larger,

More information

Algorithms with numbers (2) CISC4080, Computer Algorithms CIS, Fordham Univ.! Instructor: X. Zhang Spring 2017

Algorithms with numbers (2) CISC4080, Computer Algorithms CIS, Fordham Univ.! Instructor: X. Zhang Spring 2017 Algorithms with numbers (2) CISC4080, Computer Algorithms CIS, Fordham Univ.! Instructor: X. Zhang Spring 2017 Acknowledgement The set of slides have used materials from the following resources Slides

More information

Algorithms with numbers (2) CISC4080, Computer Algorithms CIS, Fordham Univ. Acknowledgement. Support for Dictionary

Algorithms with numbers (2) CISC4080, Computer Algorithms CIS, Fordham Univ. Acknowledgement. Support for Dictionary Algorithms with numbers (2) CISC4080, Computer Algorithms CIS, Fordham Univ. Instructor: X. Zhang Spring 2017 Acknowledgement The set of slides have used materials from the following resources Slides for

More information

Algorithms and Data Structures

Algorithms and Data Structures Lesson 4: Sets, Dictionaries and Hash Tables Luciano Bononi http://www.cs.unibo.it/~bononi/ (slide credits: these slides are a revised version of slides created by Dr. Gabriele D Angelo)

More information

5. Hashing. 5.1 General Idea. 5.2 Hash Function. 5.3 Separate Chaining. 5.4 Open Addressing. 5.5 Rehashing. 5.6 Extendible Hashing. 5.

5. Hashing. 5.1 General Idea. 5.2 Hash Function. 5.3 Separate Chaining. 5.4 Open Addressing. 5.5 Rehashing. 5.6 Extendible Hashing. 5. 5. Hashing 5.1 General Idea 5.2 Hash Function 5.3 Separate Chaining 5.4 Open Addressing 5.5 Rehashing 5.6 Extendible Hashing Malek Mouhoub, CS340 Fall 2004 1 5. Hashing Sequential access : O(n). Binary

More information

Hashing 1. Searching Lists

Hashing 1. Searching Lists Hashing 1 Searching Lists There are many instances when one is interested in storing and searching a list: A phone company wants to provide caller ID: Given a phone number a name is returned. Somebody

More information

CS 241 Analysis of Algorithms

CS 241 Analysis of Algorithms CS 241 Analysis of Algorithms Professor Eric Aaron Lecture T Th 9:00am Lecture Meeting Location: OLB 205 Business HW5 extended, due November 19 HW6 to be out Nov. 14, due November 26 Make-up lecture: Wed,

More information

CS/COE 1501

CS/COE 1501 CS/COE 1501 www.cs.pitt.edu/~lipschultz/cs1501/ Hashing Wouldn t it be wonderful if... Search through a collection could be accomplished in Θ(1) with relatively small memory needs? Lets try this: Assume

More information

Worst-case running time for RANDOMIZED-SELECT

Worst-case running time for RANDOMIZED-SELECT Worst-case running time for RANDOMIZED-SELECT is ), even to nd the minimum The algorithm has a linear expected running time, though, and because it is randomized, no particular input elicits the worst-case

More information

COMP171. Hashing.

COMP171. Hashing. COMP171 Hashing Hashing 2 Hashing Again, a (dynamic) set of elements in which we do search, insert, and delete Linear ones: lists, stacks, queues, Nonlinear ones: trees, graphs (relations between elements

More information

HashTable CISC5835, Computer Algorithms CIS, Fordham Univ. Instructor: X. Zhang Fall 2018

HashTable CISC5835, Computer Algorithms CIS, Fordham Univ. Instructor: X. Zhang Fall 2018 HashTable CISC5835, Computer Algorithms CIS, Fordham Univ. Instructor: X. Zhang Fall 2018 Acknowledgement The set of slides have used materials from the following resources Slides for textbook by Dr. Y.

More information

Introduction. hashing performs basic operations, such as insertion, better than other ADTs we ve seen so far

Introduction. hashing performs basic operations, such as insertion, better than other ADTs we ve seen so far Chapter 5 Hashing 2 Introduction hashing performs basic operations, such as insertion, deletion, and finds in average time better than other ADTs we ve seen so far 3 Hashing a hash table is merely an hashing

More information

Hashing. Hashing Procedures

Hashing. Hashing Procedures Hashing Hashing Procedures Let us denote the set of all possible key values (i.e., the universe of keys) used in a dictionary application by U. Suppose an application requires a dictionary in which elements

More information

Data Structure Lecture#22: Searching 3 (Chapter 9) U Kang Seoul National University

Data Structure Lecture#22: Searching 3 (Chapter 9) U Kang Seoul National University Data Structure Lecture#22: Searching 3 (Chapter 9) U Kang Seoul National University U Kang 1 In This Lecture Motivation of collision resolution policy Open hashing for collision resolution Closed hashing

More information

Open Addressing: Linear Probing (cont.)

Open Addressing: Linear Probing (cont.) Open Addressing: Linear Probing (cont.) Cons of Linear Probing () more complex insert, find, remove methods () primary clustering phenomenon items tend to cluster together in the bucket array, as clustering

More information

CSE 2320 Notes 13: Hashing

CSE 2320 Notes 13: Hashing CSE 30 Notes 3: Hashing (Last updated 0/7/7 : PM) CLRS.-. (skip.3.3) 3.A. CONCEPTS Goal: Achieve faster operations than balanced trees (nearly Ο( ) epected time) by using randomness in key sets by sacrificing

More information

CHAPTER 9 HASH TABLES, MAPS, AND SKIP LISTS

CHAPTER 9 HASH TABLES, MAPS, AND SKIP LISTS 0 1 2 025-612-0001 981-101-0002 3 4 451-229-0004 CHAPTER 9 HASH TABLES, MAPS, AND SKIP LISTS ACKNOWLEDGEMENT: THESE SLIDES ARE ADAPTED FROM SLIDES PROVIDED WITH DATA STRUCTURES AND ALGORITHMS IN C++, GOODRICH,

More information

DATA STRUCTURES/UNIT 3

DATA STRUCTURES/UNIT 3 UNIT III SORTING AND SEARCHING 9 General Background Exchange sorts Selection and Tree Sorting Insertion Sorts Merge and Radix Sorts Basic Search Techniques Tree Searching General Search Trees- Hashing.

More information

Today: Finish up hashing Sorted Dictionary ADT: Binary search, divide-and-conquer Recursive function and recurrence relation

Today: Finish up hashing Sorted Dictionary ADT: Binary search, divide-and-conquer Recursive function and recurrence relation Announcements HW1 PAST DUE HW2 online: 7 questions, 60 points Nat l Inst visit Thu, ok? Last time: Continued PA1 Walk Through Dictionary ADT: Unsorted Hashing Today: Finish up hashing Sorted Dictionary

More information

Hashing. Dr. Ronaldo Menezes Hugo Serrano. Ronaldo Menezes, Florida Tech

Hashing. Dr. Ronaldo Menezes Hugo Serrano. Ronaldo Menezes, Florida Tech Hashing Dr. Ronaldo Menezes Hugo Serrano Agenda Motivation Prehash Hashing Hash Functions Collisions Separate Chaining Open Addressing Motivation Hash Table Its one of the most important data structures

More information

HASH TABLES.

HASH TABLES. 1 HASH TABLES http://en.wikipedia.org/wiki/hash_table 2 Hash Table A hash table (or hash map) is a data structure that maps keys (identifiers) into a certain location (bucket) A hash function changes the

More information

CSE 214 Computer Science II Searching

CSE 214 Computer Science II Searching CSE 214 Computer Science II Searching Fall 2017 Stony Brook University Instructor: Shebuti Rayana shebuti.rayana@stonybrook.edu http://www3.cs.stonybrook.edu/~cse214/sec02/ Introduction Searching in a

More information

Topic HashTable and Table ADT

Topic HashTable and Table ADT Topic HashTable and Table ADT Hashing, Hash Function & Hashtable Search, Insertion & Deletion of elements based on Keys So far, By comparing keys! Linear data structures Non-linear data structures Time

More information

Lecture 7: Efficient Collections via Hashing

Lecture 7: Efficient Collections via Hashing Lecture 7: Efficient Collections via Hashing These slides include material originally prepared by Dr. Ron Cytron, Dr. Jeremy Buhler, and Dr. Steve Cole. 1 Announcements Lab 6 due Friday Lab 7 out tomorrow

More information

Lecture 16. Reading: Weiss Ch. 5 CSE 100, UCSD: LEC 16. Page 1 of 40

Lecture 16. Reading: Weiss Ch. 5 CSE 100, UCSD: LEC 16. Page 1 of 40 Lecture 16 Hashing Hash table and hash function design Hash functions for integers and strings Collision resolution strategies: linear probing, double hashing, random hashing, separate chaining Hash table

More information

CS 270 Algorithms. Oliver Kullmann. Generalising arrays. Direct addressing. Hashing in general. Hashing through chaining. Reading from CLRS for week 7

CS 270 Algorithms. Oliver Kullmann. Generalising arrays. Direct addressing. Hashing in general. Hashing through chaining. Reading from CLRS for week 7 Week 9 General remarks tables 1 2 3 We continue data structures by discussing hash tables. Reading from CLRS for week 7 1 Chapter 11, Sections 11.1, 11.2, 11.3. 4 5 6 Recall: Dictionaries Applications

More information

Hashing. Manolis Koubarakis. Data Structures and Programming Techniques

Hashing. Manolis Koubarakis. Data Structures and Programming Techniques Hashing Manolis Koubarakis 1 The Symbol Table ADT A symbol table T is an abstract storage that contains table entries that are either empty or are pairs of the form (K, I) where K is a key and I is some

More information

Data Structures and Algorithms(10)

Data Structures and Algorithms(10) Ming Zhang "Data Structures and Algorithms" Data Structures and Algorithms(10) Instructor: Ming Zhang Textbook Authors: Ming Zhang, Tengjiao Wang and Haiyan Zhao Higher Education Press, 2008.6 (the "Eleventh

More information

CITS2200 Data Structures and Algorithms. Topic 15. Hash Tables

CITS2200 Data Structures and Algorithms. Topic 15. Hash Tables CITS2200 Data Structures and Algorithms Topic 15 Hash Tables Introduction to hashing basic ideas Hash functions properties, 2-universal functions, hashing non-integers Collision resolution bucketing and

More information

Hashing for searching

Hashing for searching Hashing for searching Consider searching a database of records on a given key. There are three standard techniques: Searching sequentially start at the first record and look at each record in turn until

More information

Introduction hashing: a technique used for storing and retrieving information as quickly as possible.

Introduction hashing: a technique used for storing and retrieving information as quickly as possible. Lecture IX: Hashing Introduction hashing: a technique used for storing and retrieving information as quickly as possible. used to perform optimal searches and is useful in implementing symbol tables. Why

More information

Hashing. October 19, CMPE 250 Hashing October 19, / 25

Hashing. October 19, CMPE 250 Hashing October 19, / 25 Hashing October 19, 2016 CMPE 250 Hashing October 19, 2016 1 / 25 Dictionary ADT Data structure with just three basic operations: finditem (i): find item with key (identifier) i insert (i): insert i into

More information

CMSC 341 Lecture 16/17 Hashing, Parts 1 & 2

CMSC 341 Lecture 16/17 Hashing, Parts 1 & 2 CMSC 341 Lecture 16/17 Hashing, Parts 1 & 2 Prof. John Park Based on slides from previous iterations of this course Today s Topics Overview Uses and motivations of hash tables Major concerns with hash

More information

Hashing Algorithms. Hash functions Separate Chaining Linear Probing Double Hashing

Hashing Algorithms. Hash functions Separate Chaining Linear Probing Double Hashing Hashing Algorithms Hash functions Separate Chaining Linear Probing Double Hashing Symbol-Table ADT Records with keys (priorities) basic operations insert search create test if empty destroy copy generic

More information

Hashing. Introduction to Data Structures Kyuseok Shim SoEECS, SNU.

Hashing. Introduction to Data Structures Kyuseok Shim SoEECS, SNU. Hashing Introduction to Data Structures Kyuseok Shim SoEECS, SNU. 1 8.1 INTRODUCTION Binary search tree (Chapter 5) GET, INSERT, DELETE O(n) Balanced binary search tree (Chapter 10) GET, INSERT, DELETE

More information

Introducing Hashing. Chapter 21. Copyright 2012 by Pearson Education, Inc. All rights reserved

Introducing Hashing. Chapter 21. Copyright 2012 by Pearson Education, Inc. All rights reserved Introducing Hashing Chapter 21 Contents What Is Hashing? Hash Functions Computing Hash Codes Compressing a Hash Code into an Index for the Hash Table A demo of hashing (after) ARRAY insert hash index =

More information

Fast Lookup: Hash tables

Fast Lookup: Hash tables CSE 100: HASHING Operations: Find (key based look up) Insert Delete Fast Lookup: Hash tables Consider the 2-sum problem: Given an unsorted array of N integers, find all pairs of elements that sum to a

More information

Module 2: Classical Algorithm Design Techniques

Module 2: Classical Algorithm Design Techniques Module 2: Classical Algorithm Design Techniques Dr. Natarajan Meghanathan Associate Professor of Computer Science Jackson State University Jackson, MS 39217 E-mail: natarajan.meghanathan@jsums.edu Module

More information

1 / 22. Inf 2B: Hash Tables. Lecture 4 of ADS thread. Kyriakos Kalorkoti. School of Informatics University of Edinburgh

1 / 22. Inf 2B: Hash Tables. Lecture 4 of ADS thread. Kyriakos Kalorkoti. School of Informatics University of Edinburgh 1 / 22 Inf 2B: Hash Tables Lecture 4 of ADS thread Kyriakos Kalorkoti School of Informatics University of Edinburgh 2 / 22 Dictionaries A Dictionary stores key element pairs, called items. Several elements

More information

CS 350 Algorithms and Complexity

CS 350 Algorithms and Complexity CS 350 Algorithms and Complexity Winter 2019 Lecture 12: Space & Time Tradeoffs. Part 2: Hashing & B-Trees Andrew P. Black Department of Computer Science Portland State University Space-for-time tradeoffs

More information

1 CSE 100: HASH TABLES

1 CSE 100: HASH TABLES CSE 100: HASH TABLES 1 2 Looking ahead.. Watch out for those deadlines Where we ve been and where we are going Our goal so far: We want to store and retrieve data (keys) fast 3 Tree structures BSTs: simple,

More information

Unit #5: Hash Functions and the Pigeonhole Principle

Unit #5: Hash Functions and the Pigeonhole Principle Unit #5: Hash Functions and the Pigeonhole Principle CPSC 221: Basic Algorithms and Data Structures Jan Manuch 217S1: May June 217 Unit Outline Constant-Time Dictionaries? Hash Table Outline Hash Functions

More information

Hash Tables. Hash functions Open addressing. March 07, 2018 Cinda Heeren / Geoffrey Tien 1

Hash Tables. Hash functions Open addressing. March 07, 2018 Cinda Heeren / Geoffrey Tien 1 Hash Tables Hash functions Open addressing Cinda Heeren / Geoffrey Tien 1 Hash functions A hash function is a function that map key values to array indexes Hash functions are performed in two steps Map

More information

We assume uniform hashing (UH):

We assume uniform hashing (UH): We assume uniform hashing (UH): the probe sequence of each key is equally likely to be any of the! permutations of 0,1,, 1 UH generalizes the notion of SUH that produces not just a single number, but a

More information

Acknowledgement HashTable CISC4080, Computer Algorithms CIS, Fordham Univ.

Acknowledgement HashTable CISC4080, Computer Algorithms CIS, Fordham Univ. Acknowledgement HashTable CISC4080, Computer Algorithms CIS, Fordham Univ. Instructor: X. Zhang Spring 2018 The set of slides have used materials from the following resources Slides for textbook by Dr.

More information

Cpt S 223. School of EECS, WSU

Cpt S 223. School of EECS, WSU Hashing & Hash Tables 1 Overview Hash Table Data Structure : Purpose To support insertion, deletion and search in average-case constant t time Assumption: Order of elements irrelevant ==> data structure

More information

Week 9. Hash tables. 1 Generalising arrays. 2 Direct addressing. 3 Hashing in general. 4 Hashing through chaining. 5 Hash functions.

Week 9. Hash tables. 1 Generalising arrays. 2 Direct addressing. 3 Hashing in general. 4 Hashing through chaining. 5 Hash functions. Week 9 tables 1 2 3 ing in ing in ing 4 ing 5 6 General remarks We continue data structures by discussing hash tables. For this year, we only consider the first four sections (not sections and ). Only

More information

III Data Structures. Dynamic sets

III Data Structures. Dynamic sets III Data Structures Elementary Data Structures Hash Tables Binary Search Trees Red-Black Trees Dynamic sets Sets are fundamental to computer science Algorithms may require several different types of operations

More information

Hash Tables and Hash Functions

Hash Tables and Hash Functions Hash Tables and Hash Functions We have seen that with a balanced binary tree we can guarantee worst-case time for insert, search and delete operations. Our challenge now is to try to improve on that...

More information

CS 310 Hash Tables, Page 1. Hash Tables. CS 310 Hash Tables, Page 2

CS 310 Hash Tables, Page 1. Hash Tables. CS 310 Hash Tables, Page 2 CS 310 Hash Tables, Page 1 Hash Tables key value key value Hashing Function CS 310 Hash Tables, Page 2 The hash-table data structure achieves (near) constant time searching by wasting memory space. the

More information

Quiz 1 Practice Problems

Quiz 1 Practice Problems Introduction to Algorithms: 6.006 Massachusetts Institute of Technology March 7, 2008 Professors Srini Devadas and Erik Demaine Handout 6 1 Asymptotic Notation Quiz 1 Practice Problems Decide whether these

More information

COSC160: Data Structures Hashing Structures. Jeremy Bolton, PhD Assistant Teaching Professor

COSC160: Data Structures Hashing Structures. Jeremy Bolton, PhD Assistant Teaching Professor COSC160: Data Structures Hashing Structures Jeremy Bolton, PhD Assistant Teaching Professor Outline I. Hashing Structures I. Motivation and Review II. Hash Functions III. HashTables I. Implementations

More information

Algorithms and Data Structures

Algorithms and Data Structures Algorithms and Data Structures Open Hashing Ulf Leser Open Hashing Open Hashing: Store all values inside hash table A General framework No collision: Business as usual Collision: Chose another index and

More information

CMSC 341 Hashing (Continued) Based on slides from previous iterations of this course

CMSC 341 Hashing (Continued) Based on slides from previous iterations of this course CMSC 341 Hashing (Continued) Based on slides from previous iterations of this course Today s Topics Review Uses and motivations of hash tables Major concerns with hash tables Properties Hash function Hash

More information

CS 10: Problem solving via Object Oriented Programming Winter 2017

CS 10: Problem solving via Object Oriented Programming Winter 2017 CS 10: Problem solving via Object Oriented Programming Winter 2017 Tim Pierson 260 (255) Sudikoff Day 11 Hashing Agenda 1. Hashing 2. CompuLng Hash funclons 3. Handling collisions 1. Chaining 2. Open Addressing

More information

CSCI 104 Hash Tables & Functions. Mark Redekopp David Kempe

CSCI 104 Hash Tables & Functions. Mark Redekopp David Kempe CSCI 104 Hash Tables & Functions Mark Redekopp David Kempe Dictionaries/Maps An array maps integers to values Given i, array[i] returns the value in O(1) Dictionaries map keys to values Given key, k, map[k]

More information

CMSC 341 Hashing. Based on slides from previous iterations of this course

CMSC 341 Hashing. Based on slides from previous iterations of this course CMSC 341 Hashing Based on slides from previous iterations of this course Hashing Searching n Consider the problem of searching an array for a given value q If the array is not sorted, the search requires

More information

Advanced Algorithmics (6EAP) MTAT Hashing. Jaak Vilo 2016 Fall

Advanced Algorithmics (6EAP) MTAT Hashing. Jaak Vilo 2016 Fall Advanced Algorithmics (6EAP) MTAT.03.238 Hashing Jaak Vilo 2016 Fall Jaak Vilo 1 ADT asscociative array INSERT, SEARCH, DELETE An associative array (also associative container, map, mapping, dictionary,

More information

Hashing. CptS 223 Advanced Data Structures. Larry Holder School of Electrical Engineering and Computer Science Washington State University

Hashing. CptS 223 Advanced Data Structures. Larry Holder School of Electrical Engineering and Computer Science Washington State University Hashing CptS 223 Advanced Data Structures Larry Holder School of Electrical Engineering and Computer Science Washington State University 1 Overview Hashing Technique supporting insertion, deletion and

More information

CSCI 104 Log Structured Merge Trees. Mark Redekopp

CSCI 104 Log Structured Merge Trees. Mark Redekopp 1 CSCI 10 Log Structured Merge Trees Mark Redekopp Series Summation Review Let n = 1 + + + + k = σk i=0 n = k+1-1 i. What is n? What is log (1) + log () + log () + log (8)++ log ( k ) = 0 + 1 + + 3+ +

More information

TABLES AND HASHING. Chapter 13

TABLES AND HASHING. Chapter 13 Data Structures Dr Ahmed Rafat Abas Computer Science Dept, Faculty of Computer and Information, Zagazig University arabas@zu.edu.eg http://www.arsaliem.faculty.zu.edu.eg/ TABLES AND HASHING Chapter 13

More information

Multiple-choice (35 pt.)

Multiple-choice (35 pt.) CS 161 Practice Midterm I Summer 2018 Released: 7/21/18 Multiple-choice (35 pt.) 1. (2 pt.) Which of the following asymptotic bounds describe the function f(n) = n 3? The bounds do not necessarily need

More information

CSCI Analysis of Algorithms I

CSCI Analysis of Algorithms I CSCI 305 - Analysis of Algorithms I 04 June 2018 Filip Jagodzinski Computer Science Western Washington University Announcements Remainder of the term 04 June : lecture 05 June : lecture 06 June : lecture,

More information

Hashing HASHING HOW? Ordered Operations. Typical Hash Function. Why not discard other data structures?

Hashing HASHING HOW? Ordered Operations. Typical Hash Function. Why not discard other data structures? HASHING By, Durgesh B Garikipati Ishrath Munir Hashing Sorting was putting things in nice neat order Hashing is the opposite Put things in a random order Algorithmically determine position of any given

More information

( D. Θ n. ( ) f n ( ) D. Ο%

( D. Θ n. ( ) f n ( ) D. Ο% CSE 0 Name Test Spring 0 Multiple Choice. Write your answer to the LEFT of each problem. points each. The time to run the code below is in: for i=n; i>=; i--) for j=; j

More information

Hash Open Indexing. Data Structures and Algorithms CSE 373 SP 18 - KASEY CHAMPION 1

Hash Open Indexing. Data Structures and Algorithms CSE 373 SP 18 - KASEY CHAMPION 1 Hash Open Indexing Data Structures and Algorithms CSE 373 SP 18 - KASEY CHAMPION 1 Warm Up Consider a StringDictionary using separate chaining with an internal capacity of 10. Assume our buckets are implemented

More information

CSE 332: Data Structures & Parallelism Lecture 10:Hashing. Ruth Anderson Autumn 2018

CSE 332: Data Structures & Parallelism Lecture 10:Hashing. Ruth Anderson Autumn 2018 CSE 332: Data Structures & Parallelism Lecture 10:Hashing Ruth Anderson Autumn 2018 Today Dictionaries Hashing 10/19/2018 2 Motivating Hash Tables For dictionary with n key/value pairs insert find delete

More information

Hashing. 7- Hashing. Hashing. Transform Keys into Integers in [[0, M 1]] The steps in hashing:

Hashing. 7- Hashing. Hashing. Transform Keys into Integers in [[0, M 1]] The steps in hashing: Hashing 7- Hashing Bruno MARTI, University of ice - Sophia Antipolis mailto:bruno.martin@unice.fr http://www.i3s.unice.fr/~bmartin/mathmods.html The steps in hashing: 1 compute a hash function which maps

More information

BBM371& Data*Management. Lecture 6: Hash Tables

BBM371& Data*Management. Lecture 6: Hash Tables BBM371& Data*Management Lecture 6: Hash Tables 8.11.2018 Purpose of using hashes A generalization of ordinary arrays: Direct access to an array index is O(1), can we generalize direct access to any key

More information

Hash Tables Hash Tables Goodrich, Tamassia

Hash Tables Hash Tables Goodrich, Tamassia Hash Tables 0 1 2 3 4 025-612-0001 981-101-0002 451-229-0004 Hash Tables 1 Hash Functions and Hash Tables A hash function h maps keys of a given type to integers in a fixed interval [0, N 1] Example: h(x)

More information

Hash[ string key ] ==> integer value

Hash[ string key ] ==> integer value Hashing 1 Overview Hash[ string key ] ==> integer value Hash Table Data Structure : Use-case To support insertion, deletion and search in average-case constant time Assumption: Order of elements irrelevant

More information