Open Addressing: Linear Probing (cont.)

Similar documents
Hash Tables. Gunnar Gotshalks. Maps 1

CMSC 341 Hashing (Continued) Based on slides from previous iterations of this course

AAL 217: DATA STRUCTURES

Introduction. hashing performs basic operations, such as insertion, better than other ADTs we ve seen so far

! A Hash Table is used to implement a set, ! The table uses a function that maps an. ! The function is called a hash function.

CSE373: Data Structures & Algorithms Lecture 17: Hash Collisions. Kevin Quinn Fall 2015

HASH TABLES. Goal is to store elements k,v at index i = h k

Dynamic Dictionaries. Operations: create insert find remove max/ min write out in sorted order. Only defined for object classes that are Comparable

Hashing Techniques. Material based on slides by George Bebis

General Idea. Key could be an integer, a string, etc e.g. a name or Id that is a part of a large employee structure

Topic HashTable and Table ADT

Chapter 5 Hashing. Introduction. Hashing. Hashing Functions. hashing performs basic operations, such as insertion,

Fast Lookup: Hash tables

Hash[ string key ] ==> integer value

Hashing. CptS 223 Advanced Data Structures. Larry Holder School of Electrical Engineering and Computer Science Washington State University

CSE 214 Computer Science II Searching

Understand how to deal with collisions

COMP171. Hashing.

Hash Open Indexing. Data Structures and Algorithms CSE 373 SP 18 - KASEY CHAMPION 1

Hashing. 1. Introduction. 2. Direct-address tables. CmSc 250 Introduction to Algorithms

Hash Tables. Hash functions Open addressing. March 07, 2018 Cinda Heeren / Geoffrey Tien 1

Hash Tables Outline. Definition Hash functions Open hashing Closed hashing. Efficiency. collision resolution techniques. EECS 268 Programming II 1

CSE100. Advanced Data Structures. Lecture 21. (Based on Paul Kube course materials)

Lecture 16. Reading: Weiss Ch. 5 CSE 100, UCSD: LEC 16. Page 1 of 40

Cpt S 223. School of EECS, WSU

THINGS WE DID LAST TIME IN SECTION

CSE 373 Autumn 2012: Midterm #2 (closed book, closed notes, NO calculators allowed)

CMSC 341 Lecture 16/17 Hashing, Parts 1 & 2

Chapter 27 Hashing. Liang, Introduction to Java Programming, Eleventh Edition, (c) 2017 Pearson Education, Inc. All rights reserved.

CHAPTER 9 HASH TABLES, MAPS, AND SKIP LISTS

Data Structures And Algorithms

Hash Tables. Hashing Probing Separate Chaining Hash Function

Algorithms and Data Structures

lecture23: Hash Tables

Introducing Hashing. Chapter 21. Copyright 2012 by Pearson Education, Inc. All rights reserved

Introduction to Hashing

Algorithms and Data Structures

Outline. hash tables hash functions open addressing chained hashing

Data Structure Lecture#22: Searching 3 (Chapter 9) U Kang Seoul National University

Chapter 27 Hashing. Objectives

Hashing HASHING HOW? Ordered Operations. Typical Hash Function. Why not discard other data structures?

5. Hashing. 5.1 General Idea. 5.2 Hash Function. 5.3 Separate Chaining. 5.4 Open Addressing. 5.5 Rehashing. 5.6 Extendible Hashing. 5.

DATA STRUCTURES AND ALGORITHMS

Dictionaries-Hashing. Textbook: Dictionaries ( 8.1) Hash Tables ( 8.2)

Hash Tables. Johns Hopkins Department of Computer Science Course : Data Structures, Professor: Greg Hager

HASH TABLES. Hash Tables Page 1

Lecture 4. Hashing Methods

Data Structures and Algorithm Analysis (CSC317) Hash tables (part2)

Symbol Table. Symbol table is used widely in many applications. dictionary is a kind of symbol table data dictionary is database management

Hashing as a Dictionary Implementation

Hashing. Dr. Ronaldo Menezes Hugo Serrano. Ronaldo Menezes, Florida Tech

COSC160: Data Structures Hashing Structures. Jeremy Bolton, PhD Assistant Teaching Professor

Dictionary. Dictionary. stores key-value pairs. Find(k) Insert(k, v) Delete(k) List O(n) O(1) O(n) Sorted Array O(log n) O(n) O(n)

Hash Tables Hash Tables Goodrich, Tamassia

1. [1 pt] What is the solution to the recurrence T(n) = 2T(n-1) + 1, T(1) = 1

HASH TABLES. CSE 332 Data Abstractions: B Trees and Hash Tables Make a Complete Breakfast. An Ideal Hash Functions.

Data and File Structures Laboratory

Chapter 9: Maps, Dictionaries, Hashing

Hashing. Data organization in main memory or disk

Data Structures (CS 1520) Lecture 23 Name:

CPSC 259 admin notes

Unit #5: Hash Functions and the Pigeonhole Principle

Hashing. Manolis Koubarakis. Data Structures and Programming Techniques

Adapted By Manik Hosen

Hash Tables. Hash functions Open addressing. November 24, 2017 Hassan Khosravi / Geoffrey Tien 1

csci 210: Data Structures Maps and Hash Tables

Module 3: Hashing Lecture 9: Static and Dynamic Hashing. The Lecture Contains: Static hashing. Hashing. Dynamic hashing. Extendible hashing.

Today: Finish up hashing Sorted Dictionary ADT: Binary search, divide-and-conquer Recursive function and recurrence relation

Table ADT and Sorting. Algorithm topics continuing (or reviewing?) CS 24 curriculum

Introduction hashing: a technique used for storing and retrieving information as quickly as possible.

More on Hashing: Collisions. See Chapter 20 of the text.

Hashing. Hashing Procedures

CS 350 : Data Structures Hash Tables

UNIT III BALANCED SEARCH TREES AND INDEXING

CSE 5311 Notes 5: Hashing

Tables. The Table ADT is used when information needs to be stored and acessed via a key usually, but not always, a string. For example: Dictionaries

Hash table basics mod 83 ate. ate. hashcode()

CS 206 Introduction to Computer Science II

Lecture 17. Improving open-addressing hashing. Brent s method. Ordered hashing CSE 100, UCSD: LEC 17. Page 1 of 19

HASH TABLES.

Data and File Structures Chapter 11. Hashing

CITS2200 Data Structures and Algorithms. Topic 15. Hash Tables

Hashing 1. Searching Lists

Data Structures - CSCI 102. CS102 Hash Tables. Prof. Tejada. Copyright Sheila Tejada

DATA STRUCTURES AND ALGORITHMS

Hash-Based Indexes. Chapter 11

CSE 332: Data Structures & Parallelism Lecture 10:Hashing. Ruth Anderson Autumn 2018

Algorithms in Systems Engineering ISE 172. Lecture 12. Dr. Ted Ralphs

CSI33 Data Structures

On my honor I affirm that I have neither given nor received inappropriate aid in the completion of this exercise.

Introduction To Hashing

Dictionaries and Hash Tables

Fundamentals of Database Systems Prof. Arnab Bhattacharya Department of Computer Science and Engineering Indian Institute of Technology, Kanpur

ECE 242 Data Structures and Algorithms. Hash Tables I. Lecture 24. Prof.

Part I Anton Gerdelan

CMSC 341 Hashing. Based on slides from previous iterations of this course

Successful vs. Unsuccessful

CS 2150 (fall 2010) Midterm 2

Direct Addressing Hash table: Collision resolution how handle collisions Hash Functions:

Hash table basics mod 83 ate. ate

Transcription:

Open Addressing: Linear Probing (cont.) Cons of Linear Probing () more complex insert, find, remove methods () primary clustering phenomenon items tend to cluster together in the bucket array, as clustering gets worse, insert(k,e) takes longer because it must step all the way through a cluster to find a vacant bucket as clustering gets worse, find(k) takes longer since elements get placed further and further from their correct hashed index Example [ consecutive insertion of k=4567, 069, 073 ] Different keys that hash into different addresses compete with each other in successive rehashes. 3 4 5 7496 4567 069 073 should be at 3 should be at 4

Open Addressing: Linear Probing (cont.) Average RT in a non-full hash table, with no previous removals, the average running time of insert, find, remove is RT average (n) = + + - λ, ( - λ) for, successful search for unsuccessful search NOTE: ) What would be the worst case RT of all three methods?! ) Both formulas collapse for λ=. Why?!

Open Addressing: Linear Probing (cont.) 3 Example 4 [ linear probing ] h(k) = k mod 3 Insert keys: 8, 4,, 44, 59, 3, 3, 73, in this order. h(8) = 5 h(4) = h() = 9 h(44) = 5+ h(59) = 7 h(3) = 6++ h(3) = 5+++++ h(73) = 8+++ 4 8 44 59 3 3 73 0 3 4 5 6 7 8 9 0 cluster

Open Addressing: Quadratic Probing 4 Quadratic Probing involves iteratively trying the following buckets A [( h(k) j ) modn] + i i+ i+ i+3 where j=0,,,, until finding an empty bucket eliminates primary clusters by probing far-away buckets however, creates secondary clusters if two keys have the same initial probe position, then then they share the same probe sequence can be avoided by making the probe sequence a function of the key not just the home location if N is not prime, quadratic probing may not find an empty bucket even if one exists even if N is prime, may not find an empty slot if the bucket array is half-full

Open Addressing: Double Hashing 5 Double Hashing uses a secondary hash function h (k) and places the colliding item in st available cell of the series k k i i i A [( k + j h'(k) ) modn] where j=0,,,, until finding an empty bucket linear & quadratic probing are key independent; double hashing defines key dependant probing sequence h (k) determines the size of steps h cannot have 0 values; common choice: h (k) = q (k mod q) i 3 q < N, and q is a prime number possible values for h (k) are,,.., q

Open Addressing: Double Hashing (cont.) 6 Example 5 [ double hashing ] h(k) = k mod 3, h (k) = 7 k mod 7 Insert keys: 8, 4,, 44, 59, 3, 3, 73, in this order. Probes: [ k + j*h (k) ] mod 3, j=0,,, h(k) h(8) = 5 h(4) = h() = 9 h(44) = 5 h(59) = 7 h(3) = 6 h(3) = 5 h(73) = 8 h (k) h (44) = 5 h (3) = 4 5 9 5, 0 7 6 5, 9, 0 8 wraparound 3 4 8 3 59 73 44 0 3 4 5 6 7 8 9 0

Open Addressing: Double Hashing (cont.) 7 Pros of drastically reduces clustering and requires fewer Double Hashing comparisons than linear probing as a result, smaller hash tables can be used Cons of similar as in the case of linear/quadratic probing, Double Hashing the performance degrades as the table fills up Average RT in a non-full hash table, with no previous removals, the average running time of insert, find, remove is RT average (n) = - ln, - λ ( - λ) λ, for successful search for unsuccessful search Using more than one hash function is called rehashing. While having more than hash functions can be desirable, such schemes are difficult to implement.

Collision Handling Schemes: Comparison 8 NOTE: Quadratic Probing and Double Hashing have identical performance.

Collision Handling Schemes: Comparison (cont.) 9 () In all 4 cases, hashing efficiency decreases as load factor increases. () When λ=0.5, i.e. hash table is half-full, all 4 methods have nearly equal efficiency. (3) For Linear Probing, Quadratic Probing and Double Hashing λ should be kept below /3 to maintain reasonably good performance. if λ goes above /3 we could improve performance by resizing bucket array and using new hash function (4) Quadratic and Double Hashing perform better than Linear Probing on average, but they also suffer when table fills up. all open addressing schemes have linear O(n) worst case RT when λ! Would you ever use open addressing instead of Chaining?! (When the hash function is one-to-one on the set of all possible keys perfect hashing.) Would you ever use Linear instead of Quadratic Prob. and Double Hashing?! (When fast rehashing is required.)

Hash Tables: Conclusions 0 Hash Tables when we can afford a large tablesize (i.e. small load should be factor λ, below /3) and occasionally slow search used when Hash Tables should NOT be used if () if traversal in sorted order is required since hashing function distributes the elements to table positions somewhat randomly, order cannot be guaranteed () if range queries are required (3) if a guaranteed (better than O(n)) search time needs to be provided!!! In all three cases use binary search tree instead!