Announcements. Hash Functions. Hash Functions 4/17/18 HASHING

Similar documents
Announcements. Submit Prelim 2 conflicts by Thursday night A6 is due Nov 7 (tomorrow!)

Lecture 18. Collision Resolution

HASH TABLES cs2420 Introduction to Algorithms and Data Structures Spring 2015

Hash table basics mod 83 ate. ate

Hash table basics. ate à. à à mod à 83

Hash table basics mod 83 ate. ate. hashcode()

Introduction hashing: a technique used for storing and retrieving information as quickly as possible.

Topic 22 Hash Tables

CMSC 341 Hashing (Continued) Based on slides from previous iterations of this course

Prelim 2. CS 2110, 16 November 2017, 7:30 PM Total Question Name Short Heaps Tree Collections Sorting Graph

MIDTERM EXAM THURSDAY MARCH

CS 206 Introduction to Computer Science II

Hash tables. hashing -- idea collision resolution. hash function Java hashcode() for HashMap and HashSet big-o time bounds applications

Prelim 2 SOLUTION. CS 2110, 16 November 2017, 7:30 PM Total Question Name Short Heaps Tree Collections Sorting Graph

Implementing Hash and AVL

Linked lists (6.5, 16)

csci 210: Data Structures Maps and Hash Tables

CS 310: Maps and Sets

Introduction to Hashing

CSE 143. Lecture 28: Hashing

Data Structures - CSCI 102. CS102 Hash Tables. Prof. Tejada. Copyright Sheila Tejada

Hash table basics mod 83 ate. ate

Hash[ string key ] ==> integer value

CS61B Lecture #24: Hashing. Last modified: Wed Oct 19 14:35: CS61B: Lecture #24 1

11/27/12. CS202 Fall 2012 Lecture 11/15. Hashing. What: WiCS CS Courses: Inside Scoop When: Monday, Nov 19th from 5-7pm Where: SEO 1000

Lecture 16 More on Hashing Collision Resolution

Open Addressing: Linear Probing (cont.)

CMSC 132: Object-Oriented Programming II. Hash Tables

HASH TABLE BY AKARSH KUMAR

Hashing as a Dictionary Implementation

Dynamic Dictionaries. Operations: create insert find remove max/ min write out in sorted order. Only defined for object classes that are Comparable

CS1020 Data Structures and Algorithms I Lecture Note #15. Hashing. For efficient look-up in a table

CS 310: Maps and Sets and Trees

Abstract Data Types (ADTs) Queues & Priority Queues. Sets. Dictionaries. Stacks 6/15/2011

Review. CSE 143 Java. A Magical Strategy. Hash Function Example. Want to implement Sets of objects Want fast contains( ), add( )

Cpt S 223. School of EECS, WSU

NAME: c. (true or false) The median is always stored at the root of a binary search tree.

CSC 321: Data Structures. Fall 2016

Outline. hash tables hash functions open addressing chained hashing

Announcements. Container structures so far. IntSet ADT interface. Sets. Today s topic: Hashing (Ch. 10) Next topic: Graphs. Break around 11:45am

Hashing. CptS 223 Advanced Data Structures. Larry Holder School of Electrical Engineering and Computer Science Washington State University

Announcements HEAPS & PRIORITY QUEUES. Abstract vs concrete data structures. Concrete data structures. Abstract data structures

Fall 2017 Mentoring 9: October 23, Min-Heapify This. Level order, bubbling up. Level order, bubbling down. Reverse level order, bubbling up

! A Hash Table is used to implement a set, ! The table uses a function that maps an. ! The function is called a hash function.

STANDARD ADTS Lecture 17 CS2110 Spring 2013

Priority Queue. 03/09/04 Lecture 17 1

Data Structures And Algorithms

Announcements. Today s topic: Hashing (Ch. 10) Next topic: Graphs. Break around 11:45am

Lecture 5 Data Structures (DAT037) Ramona Enache (with slides from Nick Smallbone)

Standard ADTs. Lecture 19 CS2110 Summer 2009

Logistics. Homework 10 due tomorrow Review on Monday. Final on the following Friday at 3pm in CHEM 102. Come with questions

BINARY HEAP cs2420 Introduction to Algorithms and Data Structures Spring 2015

(f) Given what we know about linked lists and arrays, when would we choose to use one data structure over the other?

lecture23: Hash Tables

HEAPS & PRIORITY QUEUES

CS 310 Advanced Data Structures and Algorithms

Tutorial #11 SFWR ENG / COMP SCI 2S03. Interfaces and Java Collections. Week of November 17, 2014

CSE373 Fall 2013, Second Midterm Examination November 15, 2013

CSC 321: Data Structures. Fall 2017

EXAMINATIONS 2015 COMP103 INTRODUCTION TO DATA STRUCTURES AND ALGORITHMS

COMP 250. Lecture 27. hashing. Nov. 10, 2017

13 A: External Algorithms II; Disjoint Sets; Java API Support

Recitation 9. Prelim Review

Java HashMap Interview Questions

CS 3410 Ch 20 Hash Tables

CS : Data Structures

Table ADT and Sorting. Algorithm topics continuing (or reviewing?) CS 24 curriculum

Lecture 10: Introduction to Hash Tables

COMP 9024, Assignment 4, 11s2

HASH TABLES. Goal is to store elements k,v at index i = h k

CS 310: Hash Table Collision Resolution

Logic, Algorithms and Data Structures ADT:s & Hash tables. By: Jonas Öberg, Lars Pareto

Introducing Hashing. Chapter 21. Copyright 2012 by Pearson Education, Inc. All rights reserved

Compsci 201 Hashing. Jeff Forbes February 7, /7/18 CompSci 201, Spring 2018, Hashiing

5. Hashing. 5.1 General Idea. 5.2 Hash Function. 5.3 Separate Chaining. 5.4 Open Addressing. 5.5 Rehashing. 5.6 Extendible Hashing. 5.

Data Structures and Object-Oriented Design VIII. Spring 2014 Carola Wenk

Lecture 16. Reading: Weiss Ch. 5 CSE 100, UCSD: LEC 16. Page 1 of 40

EXAMINATIONS 2016 TRIMESTER 2

Understand how to deal with collisions

Introduction. hashing performs basic operations, such as insertion, better than other ADTs we ve seen so far

I have neither given nor received any assistance in the taking of this exam.

Lecture 4. Hashing Methods

Hash Tables Outline. Definition Hash functions Open hashing Closed hashing. Efficiency. collision resolution techniques. EECS 268 Programming II 1

Mapping Structures. Chapter An Example: Language Dictionaries

CS 310: Hash Table Collision Resolution Strategies

stacks operation array/vector linked list push amortized O(1) Θ(1) pop Θ(1) Θ(1) top Θ(1) Θ(1) isempty Θ(1) Θ(1)

Why do we need hashing?

Hashing Techniques. Material based on slides by George Bebis

Hashing. 5/1/2006 Algorithm analysis and Design CS 007 BE CS 5th Semester 2

Topic HashTable and Table ADT

Hash Tables. Gunnar Gotshalks. Maps 1

CIT-590 Final Exam. Name: Penn Key (Not ID number): If you write a number above, you will lose 1 point

Collections Questions

Acknowledgement HashTable CISC4080, Computer Algorithms CIS, Fordham Univ.

AP Computer Science 4325

CSCI 104 Hash Tables & Functions. Mark Redekopp David Kempe

Algorithms with numbers (2) CISC4080, Computer Algorithms CIS, Fordham Univ.! Instructor: X. Zhang Spring 2017

Algorithms with numbers (2) CISC4080, Computer Algorithms CIS, Fordham Univ. Acknowledgement. Support for Dictionary

Building Java Programs

Questions. 6. Suppose we were to define a hash code on strings s by:

Transcription:

Announcements Submit Prelim conflicts by tomorrow night A7 Due FRIDAY A8 will be released on Thursday HASHING CS110 Spring 018 Hash Functions Hash Functions 1 0 4 1 Requirements: 1) deterministic ) return a number in [0..n] Properties of a good hash: 1) fast ) collision-resistant ) evenly distributed 4) hard to invert 1 0 4 1 Requirements: 1) deterministic ) return a number in [0..n] Which of the following functions f: Object -> int are hash functions: a) f(x) = x b) f(x) = x.hashcode() c) f(x) = &x d) f(x) = 0 Example: SHA-56 Example: hashcode() 6 Method defined in java.lang.object Default implementation: uses memory address of object If you override equals, you must override hashcode!!! String overrides hashcode: s. hashcode() = s[0] 1 567 + s[1] 1 569 + + s[n 1] 1

Application: Error Detection Application: Integrity 7 Hash functions are used for error detection E.g., hash of uploaded file should be the same as hash of original file (if different, file was corrupted) Hash functions are used to "sign" messages Provides integrity guarantees in presence of an active adversary Principals share some secret sk Send (m, h(m,sk)) Application: Password Storage Hash functions are used to store passwords Could store plaintext passwords Problem: Password files get stolen Could store (username, h(password)) Problem: password reuse Instead, store (username, s, h(password, s)) 10 Application: Hash Set Data Structure add(val x) lookup(int i) find(val x) ArrayList 1 0 LinkedList 1 0 TreeSet HashSet 1 O(log n) O(log n) 0 1 1 Expected time Worst-case: Hash Tables So what goes wrong? Idea: finding an element in an array takes constant time when you know which index it is stored in add( ) Hash mod 6 5 hunction k1 hashindex k hashindex b MA

Can we have perfect hash functions? Perfect hash functions map each value to a different index in the hash table Impossible in practice don t know size of the array Number of possible values far far exceeds the array size no point in a perfect hash function if it takes too much time to compute Collision Resolution Two ways of handling collisions: 1. Chaining. Open Addressing Chaining add( ) add( ) lookup("") Open Addressing New York hashindex probing: Find another available space add( ) hashindex bucket/chain (linked list) VA MA VA Different probing strategies Load Factor 18 When a collision occurs, how do we search for an empty space? Load factor linear probing: search the array in order: i, i+1, i+, i+... quadratic probing: search the array in nonlinear sequence: i, i+1, i+, i+... clustering: problem where nearby hashes have very similar probe sequence so we get more collisions What happens when the array becomes too full? i.e. load factor gets a lot bigger than ½? no longer expected constant time operations best range 0 1 waste of memory too slow

Resizing Solution: Dynamic resizing double the size. reinsert / rehash all elements to new array Why not simply copy into first half? a e b c d Poll a d e b c Note: Using linear probing, no resizing What is the final state of the hash table if you use open addressing with quadratic probing (assume no resizing)? a e d b c 0 1 0 1 4 5 6 7 84 9 5 10 11 a a c b e b c d Note: Using quadratic probing, no resizing Note: Using quadratic probing, resizing if load > ½ 4

Collision Resolution Summary Application: Hash Map 5 Chaining store entries in separate chains (linked lists) can have higher load factor/degrades gracefully as load factor increases Open Addressing store all entries in table use linear or quadratic probing to place items uses less memory clustering can be a problem need to be more careful with choice of hash function Map<K,V>{ void put(k key, V value); void update(k key, V value); V get(k key); V remove(k key); } Application: Hash Map Idea: finding an element in an array takes constant time when you know which index it is stored in put("california", ) get("california") Hash California mod 6 5 hunction b MA 8 HashMap in Java Computes hash using key.hashcode() No duplicate keys Uses chaining to handle collisions Default load factor is.75 Java 8 attempts to mitigate worst-case performance by switching to a BST-based chaining! Hash Maps in the Real World 9 Network switches Distributed storage Database indexing Index lookup (e.g., Dijkstra's shortest-path algorithm) Useful in lots of applications 5