9.2 Hash Tables. When implementing a map with a hash table, the goal is to store item (k, e) at index i = h(k)


1 9.2 Hash Tables
A hash function h() maps keys of a given type to integers in a fixed interval [0, N - 1]. Example: h(x) = x mod N is a hash function for integer keys. The integer h(x) is called the hash value of key x. A hash table for a given key type consists of a hash function h() and an array (called the table) of size N. When implementing a map with a hash table, the goal is to store item (k, e) at index i = h(k).

2 Bucket Array
A bucket array for a hash table is an array A of size N, where each cell of A is thought of as a bucket (a collection of key-value pairs), and the integer N defines the capacity of the array.

3 Hash Functions
A hash function is usually specified as the composition of two functions:
  Hash code h1: keys → integers
  Compression function h2: integers → [0, N - 1]
The hash code is applied first, and the compression function is applied next on the result, i.e., h(x) = h2(h1(x)). The goal of the hash function is to disperse the keys in an apparently random way.

4 Hash Codes
Memory address: we reinterpret the memory address of the key object as an integer. Good in general, except for numeric and string keys.
Integer cast: we reinterpret the bits of the key as an integer. Suitable for keys of length less than or equal to the number of bits of the integer type (e.g., char, short, int and float in C++).
Component sum: we partition the bits of the key into components of fixed length (e.g., 16 or 32 bits) and we sum the components (ignoring overflows). Suitable for numeric keys of fixed length greater than or equal to the number of bits of the integer type (e.g., long and double in C++).

5 Polynomial Hash Codes
Polynomial accumulation: we partition the bits of the key into a sequence of components of fixed length (e.g., 8, 16 or 32 bits), $x_0, x_1, \ldots, x_{k-1}$.
We evaluate the polynomial $p(a) = x_0 a^{k-1} + x_1 a^{k-2} + \cdots + x_{k-2} a + x_{k-1}$ at a fixed value $a$, ignoring overflows. In nested form, $p(a) = x_{k-1} + a(x_{k-2} + a(x_{k-3} + \cdots + a(x_2 + a(x_1 + a x_0)) \cdots ))$.
Especially suitable for strings (e.g., the choice a = 33 gives at most 6 collisions on a set of 50,000 English words).
The polynomial $p(a)$ can be evaluated in $O(k)$ time using Horner's rule. The following polynomials are successively computed, each from the previous one in $O(1)$ time: $p_0(a) = x_0$ and $p_i(a) = x_i + a\,p_{i-1}(a)$ for $i = 1, 2, \ldots, k-1$. We have $p(a) = p_{k-1}(a)$.
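As an illustration of Horner's rule, here is a minimal sketch of a polynomial hash code for strings. The function name polyHashCode and the choice of 32-bit unsigned arithmetic are assumptions made for this example; each character is treated as one 8-bit component, and unsigned wraparound plays the role of "ignoring overflows".

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Polynomial hash code of a string, evaluated with Horner's rule.
// The characters act as the components x_0, ..., x_{k-1}; `a` is the fixed
// evaluation point (33 is the value quoted on the slide).
// Unsigned 32-bit arithmetic wraps around, i.e. overflows are ignored.
std::uint32_t polyHashCode(const std::string& key, std::uint32_t a = 33) {
    assert(a != 0);                 // a = 0 would discard all but the last component
    std::uint32_t h = 0;
    for (unsigned char c : key) {
        h = a * h + c;              // p_i(a) = x_i + a * p_{i-1}(a)
    }
    return h;
}
```

For example, polyHashCode("ab") evaluates 97 * 33 + 98, so "ab" and "ba" receive different codes, which a plain component sum would not achieve.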

6 Cyclic Shift Hash Codes
    int hashcode(const char* p, int len) {
        unsigned int h = 0;
        for (int i = 0; i < len; i++) {
            h = (h << 5) | (h >> 27);     // 5-bit cyclic shift of a 32-bit value
            h += (unsigned int) p[i];     // add in the next character
        }
        return h;
    }
Experimental results: comparison of the number of collisions (total and max) for various shift amounts on 25,000 English words. [Table values not recovered in this transcription.]

7 Compression Functions
Division: h2(k) = k mod N. The size N of the hash table is usually chosen to be a prime. The reason has to do with number theory and is beyond the scope of this course.
Multiply, Add and Divide (MAD): h2(k) = (ak + b) mod N, where a and b are nonnegative integers such that a mod N ≠ 0. Otherwise, every integer would map to the same value b.
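The MAD rule can be sketched in a few lines. The function name madCompress and the restriction to tables smaller than 2^32 cells are assumptions of this example; reducing each factor modulo N first keeps the intermediate product within 64 bits without changing the result.

```cpp
#include <cassert>
#include <cstdint>

// Multiply, Add and Divide (MAD) compression: maps a hash code to [0, N - 1].
// a, b and N are chosen by the caller; a mod N must be nonzero.
std::uint64_t madCompress(std::uint32_t hashCode,
                          std::uint64_t a, std::uint64_t b, std::uint64_t N) {
    assert(N > 0 && N <= 0xFFFFFFFFu);   // keeps (a % N) * (hashCode % N) below 2^64
    assert(a % N != 0);                  // otherwise every key would map to b mod N
    return ((a % N) * (hashCode % N) + b % N) % N;
}
```

With a = 3, b = 5 and N = 11, the hash code 10 is sent to (3 * 10 + 5) mod 11 = 2.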

8 Collision Handling Collisions occur when different elements are mapped to the same cell Separate Chaining: let each cell in the table point to a linked list of entries that map there Separate chaining is simple, but requires additional memory outside the table DS 9-8

9 Map with Separate Chaining
Delegate operations to a list-based map at each cell:
Algorithm find(k):
  Output: the position of the matching entry of the map, or end if there is no key k in the map
  return A[h(k)].find(k)      // delegate the find(k) to the list-based map at A[h(k)]
Algorithm put(k,v):
  p = A[h(k)].put(k,v)        // delegate the put(k,v) to the list-based map at A[h(k)]
  n = n + 1                   // only if k was not already in the map; replacing a value leaves n unchanged
  return p
Algorithm erase(k):
  Output: none
  A[h(k)].erase(k)            // delegate the erase(k) to the list-based map at A[h(k)]
  n = n - 1

10 Linear Probing Open addressing: the colliding item is placed in a different cell of the table Linear probing: handles collisions by placing the colliding item in the next (circularly) available table cell Each table cell inspection is referred to as a probe Colliding items lump together, causing future collisions to cause a longer sequence of probes Example: h(x) = x mod 11 Insert new element with key = 15 DS 9-10
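The probing loop can be sketched as follows, using h(x) = x mod N as in the slide's example. The names Table, EMPTY and linearProbeInsert are hypothetical, keys are assumed to be non-negative integers, and -1 is used as the marker for an empty cell. The figure showing the table contents before key 15 is inserted did not survive transcription, so any stored keys used to exercise this code are made up.

```cpp
#include <cassert>
#include <vector>

const int EMPTY = -1;                 // marker for an unused cell (assumption)
using Table = std::vector<int>;       // open-addressing table of non-negative keys

// Inserts key using linear probing with h(x) = x mod N.
// Returns the index where the key was stored, or -1 if the table is full.
int linearProbeInsert(Table& table, int key) {
    assert(!table.empty() && key >= 0);
    const int N = static_cast<int>(table.size());
    int i = key % N;                                  // home cell h(key)
    for (int probes = 0; probes < N; ++probes) {      // each inspection is one probe
        if (table[i] == EMPTY) {
            table[i] = key;
            return i;
        }
        i = (i + 1) % N;                              // next cell, wrapping around
    }
    return -1;                                        // every cell is occupied
}
```

In a table of size 11 that already holds the made-up keys 26 and 37 in cells 4 and 5, inserting 15 (home cell 4) takes three probes and lands in cell 6.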

11 Quadratic Probing
Iteratively try the buckets A[(i + f(j)) mod N], for j = 0, 1, 2, ..., where f(j) = j^2. As with linear probing, the quadratic-probing strategy complicates the removal operation, but it does avoid the kinds of clustering patterns that occur with linear probing. However, it creates its own kind of clustering, called secondary clustering.

12 Double Hashing
Double hashing uses a secondary hash function h'(k). If h maps some key k to a bucket A[i], with i = h(k), that is already occupied, then iteratively try the buckets A[(i + f(j)) mod N] next, for j = 1, 2, 3, ..., where f(j) = j · h'(k). The secondary hash function h'(k) is not allowed to evaluate to zero. Common choice: h'(k) = q - (k mod q), for some prime number q < N. The table size N must be a prime number to allow probing of all the cells.
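To make the probe sequence concrete, the sketch below lists the cells visited for one key. It assumes non-negative integer keys, the primary hash h(k) = k mod N and the common secondary hash q - (k mod q); the function names are hypothetical.

```cpp
#include <cassert>
#include <vector>

// Secondary hash h'(k) = q - (k mod q); the result lies in [1, q], so it is never zero.
int secondaryHash(int k, int q) {
    assert(k >= 0 && q > 0);
    return q - (k % q);
}

// Cells inspected for key k: (h(k) + j * h'(k)) mod N for j = 0, 1, ..., N - 1,
// with h(k) = k mod N. For prime N the sequence visits every cell exactly once.
std::vector<int> probeSequence(int k, int N, int q) {
    assert(N > 0 && q < N);
    std::vector<int> cells;
    const int home = k % N;
    const int step = secondaryHash(k, q);
    for (int j = 0; j < N; ++j) {
        cells.push_back((home + j * step) % N);
    }
    return cells;
}
```

With N = 13, q = 7 and k = 18, the home cell is 5 and the step is 7 - 4 = 3, so the probes go 5, 8, 11, 1, 4, and so on through all 13 cells.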

13 Load Factor
In the worst case, searches, insertions and removals on a hash table take O(n) time. The worst case occurs when all the keys inserted into the map collide. The load factor λ = n/N (the number of entries n divided by the table size N) affects the performance of a hash table. Assuming that the hash values are like random numbers, it can be shown that the expected number of probes for an insertion with open addressing is 1 / (1 - λ). The expected running time of all the dictionary ADT operations in a hash table is O(1). In practice, hashing is very fast provided the load factor is not close to 100%.

14 Re-hashing
Resizing: if the load factor of a hash table goes significantly above a specified threshold, it is common to resize the table, with the new array's size being at least double the previous size. Once we have allocated a new bucket array, we must define a new hash function to go with it. Given this new hash function, we then reinsert every item from the old array into the new array.
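The resizing steps can be sketched with a small separate-chaining set of integer keys. Everything here is illustrative rather than taken from the slides: the class name ChainedSet, the threshold of 0.75, the growth rule 2N + 1 and the hash k mod N are all assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <vector>

// Minimal separate-chaining set of non-negative int keys that re-hashes
// when the load factor n/N would exceed maxLoad.
class ChainedSet {
public:
    explicit ChainedSet(std::size_t capacity = 5, double maxLoad = 0.75)
        : buckets_(capacity), n_(0), maxLoad_(maxLoad) { assert(capacity > 0); }

    void insert(int key) {
        assert(key >= 0);
        if (contains(key)) return;                           // keys are distinct
        if (static_cast<double>(n_ + 1) / buckets_.size() > maxLoad_)
            rehash(2 * buckets_.size() + 1);                 // at least double the size
        buckets_[index(key, buckets_.size())].push_back(key);
        ++n_;
    }
    bool contains(int key) const {
        for (int k : buckets_[index(key, buckets_.size())])
            if (k == key) return true;
        return false;
    }
    std::size_t size() const { return n_; }
    std::size_t capacity() const { return buckets_.size(); }

private:
    // The compression step depends on N, so a new N means a new hash function.
    static std::size_t index(int key, std::size_t N) {
        return static_cast<std::size_t>(key) % N;
    }
    void rehash(std::size_t newCapacity) {
        std::vector<std::list<int>> fresh(newCapacity);      // new bucket array
        for (const auto& bucket : buckets_)
            for (int k : bucket)
                fresh[index(k, newCapacity)].push_back(k);   // reinsert every item
        buckets_.swap(fresh);
    }
    std::vector<std::list<int>> buckets_;
    std::size_t n_;
    double maxLoad_;
};
```

Starting from 5 buckets, ten insertions trigger two resizes in this sketch (to 11 and then 23 buckets), and every key remains findable afterwards.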

15 HashMap (1): C++ Hash Map Implementation
    /** HashMap.h (1) */
    #ifndef HASHMAP_H
    #define HASHMAP_H
    #include <list>
    #include <vector>
    #include "Entry.h"
    #include "Exceptions.h"

    template <typename K, typename V, typename H>
    class HashMap {
    public:                                             // public types
        typedef Entry<const K,V> Entry;                 // a (key, value) pair
        class Iterator;
    public:                                             // public functions
        HashMap(int capacity = 101);                    // constructor
        int size() const;                               // number of entries
        bool empty() const;                             // is the map empty?

16 HashMap (2)
    /** HashMap.h (2) */
        Iterator find(const K& k);                      // find entry with key k
        Iterator put(const K& k, const V& v);           // insert/replace (k,v)
        void erase(const K& k);                         // remove entry with key k
        void erase(const Iterator& p);                  // erase entry at p
        Iterator begin();                               // iterator to first entry
        Iterator end();                                 // iterator to end entry
    protected:                                          // protected types
        typedef std::list<Entry> Bucket;                // a bucket of entries for separate chaining
        typedef std::vector<Bucket> BktArray;           // a bucket array for hash table
        // HashMap utilities here
        Iterator finder(const K& k);                    // find utility
        Iterator inserter(const Iterator& p, const Entry& e);   // insert utility
        void eraser(const Iterator& p);                 // remove utility
        typedef typename BktArray::iterator BItor;      // bucket iterator
        typedef typename Bucket::iterator EItor;        // entry iterator
        static void nextentry(Iterator& p)              // bucket's next entry
            { ++p.ent; }
        static bool endofbkt(const Iterator& p)         // end of bucket?
            { return p.ent == p.bkt->end(); }

17 HashMap (3)
    /** HashMap.h (3) */
    private:
        int n;                                          // number of entries
        H hash;                                         // the hash comparator
        BktArray B;                                     // bucket array (hash table)
    public:                                             // public types
        // Iterator class declaration
        class Iterator {                                // a HashMap::Iterator (& position)
        private:
            const BktArray* ba;                         // which bucket array in the hash table
            BItor bkt;                                  // which bucket in the bucket array (hash table)
            EItor ent;                                  // which entry in the bucket
        public:
            Iterator(const BktArray& a, const BItor& b, const EItor& q = EItor())
                : ba(&a), bkt(b), ent(q) { }
            Iterator() {}                               // default constructor
            Entry& operator*() const;                   // get entry
            bool operator==(const Iterator& p) const;   // are iterators equal?
            bool operator!=(const Iterator& p) const;   // are iterators different?
            Iterator& operator++();                     // advance to next entry
            friend class HashMap;                       // give HashMap access
        };
    };
    #endif

18 Bucket and Bucket Array
[Figure: the bucket array B with cells B[0], B[1], ..., B[99]. The array has type BktArray (typedef std::vector<Bucket> BktArray) and is traversed with a BItor; each cell is a Bucket (typedef std::list<Entry> Bucket) and is traversed with an EItor.]

19 HashMap (4)
    /** HashMap.cpp (1) */
    #include <iostream>
    #include "HashMap.h"
    #include "Entry.h"
    using namespace std;

    template <typename K, typename V, typename H>       // constructor
    HashMap<K,V,H>::HashMap(int capacity) : n(0), B(capacity) { }

    template <typename K, typename V, typename H>       // number of entries
    int HashMap<K,V,H>::size() const { return n; }

    template <typename K, typename V, typename H>       // is the map empty?
    bool HashMap<K,V,H>::empty() const { return size() == 0; }

20 HashMap (5)
    /** HashMap.cpp (2) */
    template <typename K, typename V, typename H>       // iterator to front
    typename HashMap<K,V,H>::Iterator HashMap<K,V,H>::begin() {
        if (empty()) return end();                      // empty - return end
        BItor bkt = B.begin();                          // else search for an entry
        while (bkt->empty()) ++bkt;                     // find nonempty bucket
        return Iterator(B, bkt, bkt->begin());          // return first of bucket
    }

    template <typename K, typename V, typename H>       // iterator to end
    typename HashMap<K,V,H>::Iterator HashMap<K,V,H>::end()
        { return Iterator(B, B.end()); }

    template <typename K, typename V, typename H>       // get entry
    typename HashMap<K,V,H>::Entry& HashMap<K,V,H>::Iterator::operator*() const
        { return *ent; }

    template <typename K, typename V, typename H>       // are iterators equal?
    bool HashMap<K,V,H>::Iterator::operator==(const Iterator& p) const {
        if (ba != p.ba || bkt != p.bkt) return false;   // ba or bkt differ?
        else if (bkt == ba->end()) return true;         // both at the end?
        else return (ent == p.ent);                     // else use entry to decide
    }

21 HashMap (6)
    /** HashMap.cpp (3) */
    template <typename K, typename V, typename H>       // are iterators different?
    bool HashMap<K,V,H>::Iterator::operator!=(const Iterator& p) const {
        if (ba != p.ba || bkt != p.bkt) return true;    // ba or bkt differ?
        else if (bkt == ba->end()) return false;        // both at the end?
        else return (ent != p.ent);                     // else use entry to decide
    }

    template <typename K, typename V, typename H>       // advance to next entry
    typename HashMap<K,V,H>::Iterator& HashMap<K,V,H>::Iterator::operator++() {
        ++ent;                                          // next entry in bucket
        if (endofbkt(*this)) {                          // at end of bucket?
            ++bkt;                                      // go to next bucket
            while (bkt != ba->end() && bkt->empty())    // find nonempty bucket
                ++bkt;
            if (bkt == ba->end()) return *this;         // end of bucket array?
            ent = bkt->begin();                         // first nonempty entry
        }
        return *this;                                   // return self
    }

22 HashMap (7)
    /** HashMap.cpp (4) */
    template <typename K, typename V, typename H>       // find utility
    typename HashMap<K,V,H>::Iterator HashMap<K,V,H>::finder(const K& k) {
        int i = hash(k) % B.size();                     // get hash index i
    #ifdef HASH_TEST
        cout << i;
    #endif
        BItor bkt = B.begin() + i;                      // the ith bucket
        Iterator p(B, bkt, bkt->begin());               // start of ith bucket
        while (!endofbkt(p) && (*p).key() != k)         // search for k
            nextentry(p);
        return p;                                       // return final position
    }

    template <typename K, typename V, typename H>       // find key
    typename HashMap<K,V,H>::Iterator HashMap<K,V,H>::find(const K& k) {
        Iterator p = finder(k);                         // look for k
        if (endofbkt(p))                                // didn't find it?
            return end();                               // return end iterator
        else
            return p;                                   // return its position
    }

23 HashMap (8)
    /** HashMap.cpp (5) */
    template <typename K, typename V, typename H>       // insert utility
    typename HashMap<K,V,H>::Iterator HashMap<K,V,H>::inserter(const Iterator& p, const Entry& e) {
        EItor ins = p.bkt->insert(p.ent, e);            // insert before p using insert() of list<Entry>
        n++;                                            // one more entry
        return Iterator(B, p.bkt, ins);                 // return this position
    }

    template <typename K, typename V, typename H>       // insert/replace (k,v)
    typename HashMap<K,V,H>::Iterator HashMap<K,V,H>::put(const K& k, const V& v) {
        Iterator p = finder(k);                         // search for k
        if (endofbkt(p)) {                              // k not found?
            return inserter(p, Entry(k, v));            // insert at end of bucket
        }
        else {                                          // found it?
            p.ent->setvalue(v);                         // replace value with v
            return p;                                   // return this position
        }
    }

24 HashMap (9)
    /** HashMap.cpp (6) */
    template <typename K, typename V, typename H>       // remove utility
    void HashMap<K,V,H>::eraser(const Iterator& p) {
        p.bkt->erase(p.ent);                            // remove entry from bucket
        n--;                                            // one fewer entry
    }

    template <typename K, typename V, typename H>       // remove entry at p
    void HashMap<K,V,H>::erase(const Iterator& p)
        { eraser(p); }

    template <typename K, typename V, typename H>       // remove entry with key k
    void HashMap<K,V,H>::erase(const K& k) {
        Iterator p = finder(k);                         // find k
        if (endofbkt(p))                                // not found?
            throw NonexistentElement("Erase of nonexistent");   // ...error
        eraser(p);                                      // remove it
    }

25 Entry.h
    /** Entry.h */
    #ifndef ENTRY_H                                     // guard: this header is included from several files
    #define ENTRY_H
    #include <iostream>
    using namespace std;

    template <typename K, typename V>
    class Entry {                                       // a (key, value) pair
        friend ostream& operator<<(ostream& fout, const Entry& ent) {
            fout << "Entry [key: " << ent.key() << ", value : " << ent.value() << "]";
            return fout;
        }
    public:                                             // public functions
        Entry(const K& k = K(), const V& v = V())       // constructor
            : _key(k), _value(v) { }
        const K& key() const { return _key; }           // get key
        const V& value() const { return _value; }       // get value
        void setkey(const K& k) { _key = k; }           // set key
        void setvalue(const V& v) { _value = v; }       // set value
    private:                                            // private data
        K _key;                                         // key
        V _value;                                       // value
    };
    #endif

26 class Task
    /** Task.h */
    #include <iostream>
    #include <string>
    using namespace std;

    class Task {
        friend ostream& operator<<(ostream&, const Task&);
    public:
        Task();                                         // constructor
        Task(string tname, int pri, int dur);
        string gettaskname() { return taskname; }
        int getpriority() { return priority; }
        int getduration() { return duration; }
        void settaskname(string n) { taskname = n; }
        void setpriority(int);
        void setduration(int);
        bool operator>(Task& t) { return (priority > t.priority); }
        bool operator<(Task& t) { return (priority < t.priority); }
    private:
        string taskname;                                // name of task
        int priority;                                   // priority of task in scheduling: 0 ~ 49
        int duration;                                   // duration of processing time, [ms]
    };

27 class LessThan
    /** LessThan.h */
    #ifndef LESSTHAN_H
    #define LESSTHAN_H
    using namespace std;

    template<class T>
    class LessThan {
    public:
        bool operator()(T& p, T& q) const { return p < q; }
    };
    #endif

28 CyclicShiftHashCode
    /** CyclicShiftHashCode.h */
    #include <string>
    using namespace std;

    class CyclicShiftHashCode {
    public:
        int operator() (const string key);
    };

    /** CyclicShiftHashCode.cpp */
    #include <string>
    #include "CyclicShiftHashCode.h"

    int CyclicShiftHashCode::operator() (const string key) {
        int len = key.length();
        unsigned int h = 0;
        for (int i = 0; i < len; i++) {
            h = (h << 5) | (h >> 27);                   // 5-bit cyclic shift
            h += (unsigned int) key.at(i);              // add in the next character
        }
        return h;
    }

29 Exceptions
    /** Exceptions.h */
    #include <string>
    #include <iostream>
    using namespace std;

    class RuntimeException {
    private:
        string errormsg;
    public:
        RuntimeException(const string& err) { errormsg = err; }
        string getmessage() const { return errormsg; }
    };

    class NonexistentElement : public RuntimeException {
    public:
        NonexistentElement(const string& err) : RuntimeException(err) {}
    };

30 main.cpp (1)
    /** main.cpp (1) */
    #include <iostream>
    #include <cstdlib>                                  // rand()
    #include "Task.h"
    #include "HashMap.h"
    #include "HashMap.cpp"
    #include "LessThan.h"
    #include "CyclicShiftHashCode.h"
    #include "Entry.h"
    #include <string>
    #define NUM_TASKS 20

    int main()
    {
        HashMap<string, Task, CyclicShiftHashCode> *pthm;
        pthm = new HashMap<string, Task, CyclicShiftHashCode>;
        Task* ptask[NUM_TASKS];
        string tname = "";
        char tmpstr[10], idstr[10];
        int priority = -1;
        int duration = 0;
        int task_id = 0;
        LessThan<Task> lessthan_task;

31 main.cpp (2)
    /** main.cpp (2) */
        cout << "\n Generate " << NUM_TASKS << " tasks and put them into TaskHashMap" << endl;
        for (int i = 0; i < NUM_TASKS; i++) {
            task_id = rand();
            _itoa_s(task_id, tmpstr, 10);               // MSVC-specific integer-to-string conversion
            tname = "t_" + string(tmpstr);
            priority = rand() % 50;
            duration = rand() % 100;
            ptask[i] = new Task(tname, priority, duration);
            cout << " Generated : " << *ptask[i];
            cout << ", put task into ";
            pthm->put(tname, *ptask[i]);                // finder() prints the bucket index when HASH_TEST is defined
            cout << "-th bucket" << endl;
        } // end for
        cout << " Print HashMap for Tasks: " << endl;
        HashMap<string, Task, CyclicShiftHashCode>::Iterator p;
        for (p = pthm->begin(); p != pthm->end(); ++p) {
            cout << *p << endl;
        }
        cout << endl;
        for (int i = 0; i < NUM_TASKS; i++)
            delete ptask[i];                            // each Task was allocated with new
        delete pthm;
        return 0;
    }

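32 Execution Result
[Figure: console output of the test program. For each generated task it reports the bucket the task is stored in, e.g. "is stored in 3-rd bucket" and "is stored in 97-th bucket".]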

33 9.3 Ordered Map
An ordered map keeps the entries of a map sorted according to some total order and can look up keys and values based on this ordering. The ordered map includes all the functions of the standard map ADT plus the following:
firstentry(k): return an iterator to the entry with the smallest key value; if the map is empty, it returns end
lastentry(k): return an iterator to the entry with the largest key value; if the map is empty, it returns end
ceilingentry(k): return an iterator to the entry with the least key value greater than or equal to k; if there is no such entry, it returns end
floorentry(k): return an iterator to the entry with the greatest key value less than or equal to k; if there is no such entry, it returns end
lowerentry(k): return an iterator to the entry with the greatest key value less than k; if there is no such entry, it returns end
higherentry(k): return an iterator to the entry with the least key value greater than k; if there is no such entry, it returns end
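The four neighbor queries can be sketched on top of the standard library's ordered container. This is not the slides' implementation: std::map stands in for the ordered map, int keys and string values are chosen arbitrarily, and the free-function names (including the helper keyOf) are assumptions.

```cpp
#include <cassert>
#include <iterator>
#include <map>
#include <string>

using OrderedMap = std::map<int, std::string>;
using Iter = OrderedMap::const_iterator;

// Least key >= k, or end if there is no such entry.
Iter ceilingEntry(const OrderedMap& m, int k) { return m.lower_bound(k); }

// Least key > k, or end if there is no such entry.
Iter higherEntry(const OrderedMap& m, int k) { return m.upper_bound(k); }

// Greatest key <= k: the entry just before the first key > k.
Iter floorEntry(const OrderedMap& m, int k) {
    Iter it = m.upper_bound(k);
    return it == m.begin() ? m.end() : std::prev(it);
}

// Greatest key < k: the entry just before the first key >= k.
Iter lowerEntry(const OrderedMap& m, int k) {
    Iter it = m.lower_bound(k);
    return it == m.begin() ? m.end() : std::prev(it);
}

// Convenience accessor; the iterator must refer to an actual entry.
int keyOf(const OrderedMap& m, Iter it) {
    assert(it != m.end());
    return it->first;
}
```

With keys 10, 20 and 30 stored, ceilingEntry(m, 15) refers to key 20, floorEntry(m, 15) to key 10, and lowerEntry(m, 10) returns end.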

34 Ordered Search Table and Binary Search
Ordered search table: if the keys in a map come from a total order, we can store the map's entries in a vector L in increasing order of the keys.
Space requirement of an ordered search table: O(n)
Accessing an element of L by its index: O(1)
Update operations in a search table (insert(k, v), erase(k)): O(n) in the worst case

35 Binary Search (1)
Algorithm BinarySearch(L, k, low, high)
  Input: an ordered vector L storing n entries, a search key k, and integers low and high
  Output: an entry of L with key equal to k and index between low and high, if such an entry exists, and otherwise the special sentinel end
  if (low > high) then
    return end
  else
    mid ← ⌊(low + high) / 2⌋
    entry ← L.at(mid)
    if (k = entry.key()) then
      return entry
    else if (k < entry.key()) then
      return BinarySearch(L, k, low, mid - 1)
    else
      return BinarySearch(L, k, mid + 1, high)
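A direct C++ rendering of the pseudocode might look like the sketch below. The Entry alias, the use of index -1 in place of the end sentinel, and the wrapper findKey are assumptions of this example; the midpoint is computed as low + (high - low) / 2, which gives the same value as the floor of (low + high) / 2 for non-negative indices without risking overflow.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

using Entry = std::pair<int, std::string>;   // (key, value)

// Recursive binary search over L[low..high].
// Returns the index of the entry with key k, or -1 as the "end" sentinel.
int binarySearch(const std::vector<Entry>& L, int k, int low, int high) {
    if (low > high) return -1;                        // empty range: not found
    const int mid = low + (high - low) / 2;
    if (k == L[mid].first) return mid;
    if (k < L[mid].first) return binarySearch(L, k, low, mid - 1);
    return binarySearch(L, k, mid + 1, high);
}

// Searches the whole table; L must be sorted by key.
int findKey(const std::vector<Entry>& L, int k) {
    assert(std::is_sorted(L.begin(), L.end()));
    return binarySearch(L, k, 0, static_cast<int>(L.size()) - 1);
}
```

Each recursive call halves the candidate range, so a table of n entries is searched in O(log n) steps.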

36 Binary Search (2) DS 9-36

37 Comparing Map implementations Map implementation with un-ordered list Map implementation with hash table Map implementation with ordered search table Method Un-ordered List Hash Table Ordered Search Table DS 9-37

38 Applications of ordered maps - Flight databases Flight objects: k = (origin, destination, date, time) although a user typically wants to exactly match the origin and destination cities, as well as the departure date, he/she will probably be content with any departure time that is close to his or her requested departure time ceilingentry(k) : returns the flight between the desired cities on the desired date, with departure time at the desired time or after floorentry(k) : returns the flight with departure time at the desired or before DS 9-38

39 9.4 Skip List
A skip list for a set S of distinct (key, element) items is a series of lists S_0, S_1, ..., S_h such that:
Each list S_i contains the special keys +∞ and -∞.
List S_0 contains the keys of S in nondecreasing order.
Each list is a subsequence of the previous one, i.e., S_0 ⊇ S_1 ⊇ ... ⊇ S_h.
List S_h contains only the two special keys.
We show how to use a skip list to implement the dictionary ADT.
[Figure: a skip list with lists S_0 to S_3, with one tower and one level labelled.]

40 Search and Update Operations in a Skip List
We search for a key k in a skip list as follows: we start at the first position of the top list and use these traversal operations:
after(p): return the position following p on the same level
before(p): return the position preceding p on the same level
below(p): return the position below p in the same tower
above(p): return the position above p in the same tower
If we try to drop down past the bottom list, we return null.
Algorithm SkipSearch(k)
  Input: a search key k
  Output: position p in the bottom list S_0 such that the entry at p has the largest key less than or equal to k
  p ← s                              // start position
  while (below(p) ≠ NULL) do
    p ← below(p)                     // drop down
    while (k ≥ key(after(p))) do
      p ← after(p)                   // scan forward
  return p

41 Insertion
To insert an entry (x, e) into a skip list, we use a randomized algorithm:
We repeatedly toss a coin until we get tails, and we denote with i the number of times the coin came up heads.
If i ≥ h, we add to the skip list new lists S_{h+1}, ..., S_{i+1}, each containing only the two special keys.
We search for x in the skip list and find the positions p_0, p_1, ..., p_i of the items with largest key less than x in each list S_0, S_1, ..., S_i.
For j ← 0, ..., i, we insert item (x, e) into list S_j after position p_j.
Example: insert key 15, with i = 2. [Figure: the skip list before and after the insertion, with positions p_0, p_1 and p_2 marked.]

42 SkipInsert
Algorithm SkipInsert(k, v)
  Input: key k and value v
  Output: topmost position of the entry inserted in the skip list
  p ← SkipSearch(k)
  q ← NULL
  e ← (k, v)
  i ← -1
  repeat
    i ← i + 1
    if (i ≥ h) then
      h ← h + 1                                      // add a new level to the skip list
      t ← after(s)
      s ← insertafterabove(NULL, s, (-∞, NULL))      // level-h left node
      insertafterabove(s, t, (+∞, NULL))             // level-h right node
    q ← insertafterabove(p, q, e)                    // add a position to the tower of the new entry
    while (above(p) = NULL) do
      p ← before(p)                                  // scan backward
    p ← above(p)                                     // jump up to the higher level
  until coinflip() = tails
  n ← n + 1
  return q
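The search and insertion steps can be tried out with the compact sketch below. It deliberately differs from the quad-node design described in these slides: each node stores a tower of forward links only, the head node plays the role of the -∞ sentinel, a null link plays the role of +∞, and no empty top list is maintained. The class name SkipList, its members and the seeded std::mt19937 coin are all assumptions of this example.

```cpp
#include <cassert>
#include <climits>
#include <random>
#include <vector>

// Skip list of distinct int keys; next[l] is a node's successor on level l.
class SkipList {
    struct Node {
        int key;
        std::vector<Node*> next;
        Node(int k, int levels) : key(k), next(levels, nullptr) {}
    };
public:
    explicit SkipList(unsigned seed = 1) : head_(new Node(INT_MIN, 1)), rng_(seed) {}
    ~SkipList() {
        for (Node* p = head_; p != nullptr; ) { Node* nx = p->next[0]; delete p; p = nx; }
    }
    SkipList(const SkipList&) = delete;
    SkipList& operator=(const SkipList&) = delete;

    int height() const { return static_cast<int>(head_->next.size()); }

    bool contains(int k) const {
        const Node* p = head_;
        for (int l = height() - 1; l >= 0; --l)              // drop down one level
            while (p->next[l] && p->next[l]->key <= k)       // scan forward
                p = p->next[l];
        return p != head_ && p->key == k;
    }

    void insert(int k) {
        assert(k > INT_MIN);                                 // INT_MIN is reserved for the head
        if (contains(k)) return;
        int levels = 1;
        while (coin_(rng_)) ++levels;                        // one extra level per "heads"
        if (levels > height()) head_->next.resize(levels, nullptr);
        Node* node = new Node(k, levels);
        Node* p = head_;
        for (int l = height() - 1; l >= 0; --l) {
            while (p->next[l] && p->next[l]->key < k) p = p->next[l];
            if (l < levels) {                                // splice the tower into level l
                node->next[l] = p->next[l];
                p->next[l] = node;
            }
        }
    }

    std::vector<int> keys() const {                          // contents of the bottom list
        std::vector<int> out;
        for (const Node* p = head_->next[0]; p; p = p->next[0]) out.push_back(p->key);
        return out;
    }
private:
    Node* head_;
    std::mt19937 rng_;
    std::bernoulli_distribution coin_{0.5};                  // fair coin
};
```

Whatever the coin tosses turn out to be, the bottom level always holds the keys in increasing order; the tosses only affect how many levels each key appears on, and therefore how fast the search runs.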

43 Randomized Algorithms
A randomized algorithm performs coin tosses (i.e., uses random bits) to control its execution. It contains statements of the type:
  b ← random()
  if b = 0
    do A ...
  else { b = 1 }
    do B ...
Its running time depends on the outcomes of the coin tosses.
We analyze the expected running time of a randomized algorithm under the following assumptions: the coins are unbiased, and the coin tosses are independent.
The worst-case running time of a randomized algorithm is often large but has very low probability (e.g., it occurs when all the coin tosses give "heads").
We use a randomized algorithm to insert items into a skip list.

44 insertion of an entry (key 42) before insertion after insertion DS 9-44

45 Deletion
To remove an entry with key x from a skip list, we proceed as follows:
We search for x in the skip list and find the positions p_0, p_1, ..., p_i of the items with key x, where position p_j is in list S_j.
We remove positions p_0, p_1, ..., p_i from the lists S_0, S_1, ..., S_i.
We remove all but one list containing only the two special keys.
Example: remove key 34. [Figure: the skip list before and after the removal, with positions p_0, p_1 and p_2 marked.]

46 Removal in a Skip List ( key = 25) DS 9-46

47 Implementation
We can implement a skip list with quad-nodes. A quad-node stores:
  the entry
  a link to the previous node (pprev)
  a link to the next node (pnext)
  a link to the node below (pbelow)
  a link to the node above (pabove)
Also, we define special keys PLUS_INF and MINUS_INF, and we modify the key comparator to handle them.
[Figure: a quad-node holding entry x with its links pprev, pnext, pbelow and pabove.]

48 Space Usage
The space used by a skip list depends on the random bits used by each invocation of the insertion algorithm. We use the following two basic probabilistic facts:
Fact 1: The probability of getting $i$ consecutive heads when flipping a coin is $1/2^i$.
Fact 2: If each of $n$ entries is present in a set with probability $p$, the expected size of the set is $np$.
Consider a skip list with $n$ entries. By Fact 1, we insert an entry in list $S_i$ with probability $1/2^i$. By Fact 2, the expected size of list $S_i$ is $n/2^i$.
The expected number of nodes used by the skip list is $\sum_{i=0}^{h} \frac{n}{2^i} = n \sum_{i=0}^{h} \frac{1}{2^i} < 2n$.
Thus, the expected space usage of a skip list with $n$ items is $O(n)$.

49 Height
The running time of the search and insertion algorithms is affected by the height $h$ of the skip list. We show that with high probability, a skip list with $n$ items has height $O(\log n)$. We use the following additional probabilistic fact:
Fact 3: If each of $n$ events has probability $p$, the probability that at least one event occurs is at most $np$.
Consider a skip list with $n$ entries. By Fact 1, we insert an entry in list $S_i$ with probability $1/2^i$. By Fact 3, the probability that list $S_i$ has at least one item is at most $n/2^i$.
By picking $i = 3\log n$, we have that the probability that $S_{3\log n}$ has at least one entry is at most $n/2^{3\log n} = n/n^3 = 1/n^2$.
Thus a skip list with $n$ entries has height at most $3\log n$ with probability at least $1 - 1/n^2$.

50 Search and Update Times
The search time in a skip list is proportional to the number of drop-down steps plus the number of scan-forward steps. The drop-down steps are bounded by the height of the skip list and thus are O(log n) with high probability. To analyze the scan-forward steps, we use yet another probabilistic fact:
Fact 4: The expected number of coin tosses required in order to get tails is 2.
When we scan forward in a list, the destination key does not belong to a higher list, so a scan-forward step is associated with a former coin toss that gave tails. By Fact 4, in each list the expected number of scan-forward steps is 2. Thus, the expected number of scan-forward steps is O(log n).
We conclude that a search in a skip list takes O(log n) expected time. The analysis of insertion and deletion gives similar results.

51 Summary of Skip List A skip list is a data structure for dictionaries that uses a randomized insertion algorithm In a skip list with n entries The expected space used is O(n) The expected search, insertion and deletion time is O(log n) Using a more complex probabilistic analysis, one can show that these performance bounds also hold with high probability Skip lists are fast and simple to implement in practice DS 9-51

52 9.5 Dictionary The dictionary ADT models a searchable collection of key-element entries The main operations of a dictionary are searching, inserting, and deleting items Multiple items with the same key are allowed Applications: word-definition pairs credit card authorizations DNS mapping of host names (e.g., datastructures.net) to internet IP addresses (e.g., ) DS 9-52

53 Dictionary ADT
Dictionary ADT methods:
size(): return the number of entries in D
empty(): return true if D is empty and false otherwise
find(k): if there is an entry with key k, return an iterator p referring to any such entry, else return the special iterator end
findall(k): return a pair of iterators (b, e) such that all entries with key value k lie in the range [b, e), starting at b and ending just prior to e
insert(k, v): insert an entry with key k and value v into D, returning an iterator referring to the newly created entry
erase(k): remove from D an arbitrary entry with key equal to k; an error condition occurs if D has no such entry
erase(p): remove from D the entry referenced by iterator p; an error condition occurs if D has no such entry
begin(): return an iterator to the first entry of the dictionary
end(): return an iterator to a position just beyond the end of the dictionary
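For experimenting with these operations, the standard library's std::multimap already allows several entries per key. The sketch below is only a stand-in for the ADT, not the slides' hash-based implementation: std::multimap keeps its entries sorted by key, and the names Dict, findAll, valuesFor and eraseOne are assumptions.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

using Dict = std::multimap<int, std::string>;
using DictIter = Dict::const_iterator;

// findall(k): the half-open iterator range [b, e) of all entries with key k.
std::pair<DictIter, DictIter> findAll(const Dict& d, int k) {
    return d.equal_range(k);
}

// Collects the values of all entries with key k by walking the range [b, e).
std::vector<std::string> valuesFor(const Dict& d, int k) {
    std::vector<std::string> out;
    std::pair<DictIter, DictIter> range = findAll(d, k);
    for (DictIter it = range.first; it != range.second; ++it)
        out.push_back(it->second);
    return out;
}

// erase(k): removes one entry with key k; it is an error if none exists.
void eraseOne(Dict& d, int k) {
    Dict::iterator it = d.find(k);
    assert(it != d.end());
    d.erase(it);
}
```

After inserting (5,A), (7,B), (2,C), (8,D) and (2,E) with emplace, valuesFor(d, 2) yields C followed by E, since entries with equal keys keep their insertion order in a std::multimap.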

54 Example
Operation      Output          Dictionary
empty()        true            {}
insert(5,A)    p1: [(5,A)]     {(5,A)}
insert(7,B)    p2: [(7,B)]     {(5,A),(7,B)}
insert(2,C)    p3: [(2,C)]     {(5,A),(7,B),(2,C)}
insert(8,D)    p4: [(8,D)]     {(5,A),(7,B),(2,C),(8,D)}
insert(2,E)    p5: [(2,E)]     {(5,A),(7,B),(2,C),(2,E),(8,D)}
find(7)        p2: [(7,B)]     {(5,A),(7,B),(2,C),(2,E),(8,D)}
find(4)        end             {(5,A),(7,B),(2,C),(2,E),(8,D)}
find(2)        p3: [(2,C)]     {(5,A),(7,B),(2,C),(2,E),(8,D)}
findall(2)     (p3, p4)        {(5,A),(7,B),(2,C),(2,E),(8,D)}
size()         5               {(5,A),(7,B),(2,C),(2,E),(8,D)}
erase(5)       -               {(7,B),(2,C),(2,E),(8,D)}
erase(p3)      -               {(7,B),(2,E),(8,D)}
find(2)        p5: [(2,E)]     {(7,B),(2,E),(8,D)}

55 C++ Dictionary Implementation class HashDict (1) template <typename K, typename V, typename H> class HashDict : public HashMap<K,V,H> { public: // public functions typedef typename HashMap<K,V,H>::Iterator Iterator; typedef typename HashMap<K,V,H>::Entry Entry; // Range class declaration class Range { // an iterator range private: Iterator _begin; // front of range Iterator _end; // end of range public: Range(const Iterator& b, const Iterator& e) // constructor : _begin(b), _end(e) { } Iterator& begin() { return _begin; } // get beginning Iterator& end() { return _end; } // get end }; public: // public functions HashDict(int capacity = 100); // constructor Range findall(const K& k); // find all entries with k Iterator insert(const K& k, const V& v); // insert pair (k,v) }; DS 9-55

56 class HashDict (2)
    // finder, inserter and endofbkt are inherited from the dependent base class
    // HashMap<K,V,H>, so they are reached through this-> here.
    template <typename K, typename V, typename H>       // constructor
    HashDict<K,V,H>::HashDict(int capacity) : HashMap<K,V,H>(capacity) { }

    template <typename K, typename V, typename H>       // insert pair (k,v)
    typename HashDict<K,V,H>::Iterator HashDict<K,V,H>::insert(const K& k, const V& v) {
        Iterator p = this->finder(k);                   // find key
        Iterator q = this->inserter(p, Entry(k, v));    // insert it here
        return q;                                       // return its position
    }

    template <typename K, typename V, typename H>       // find all entries with k
    typename HashDict<K,V,H>::Range HashDict<K,V,H>::findall(const K& k) {
        Iterator b = this->finder(k);                   // look up k
        Iterator p = b;
        while (!this->endofbkt(p) && (*p).key() == (*b).key()) {   // find next unequal key
            ++p;
        }
        return Range(b, p);                             // return range of positions
    }

57 Implementations with Location-aware Entries
Location-aware entry: an entry augmented with a private location variable and the protected functions location() and setlocation(p).
There are several possible ways to implement the Dictionary ADT with location-aware entries:
Unordered list: maintain the location variable of each entry e to point to e's position in the underlying linked list L.
Hash table: use the location variable of each entry e to point to e's position in the list L implementing the bucket A[h(k)].
Ordered search table: maintain the location variable of each entry e to be e's index in the ordered table T.
Skip list: maintain the location variable of each entry e to point to e's position in the bottom level of the skip list S.

58 Homework DS-9 DS-9.1 Projects P-9.3 (map ADT using a vector) DS-9.2 Projects P-9.11 (implementation of the skip-list, map and dictionary ADT with skip-list) DS 9-58


More information

Dictionaries and Hash Tables

Dictionaries and Hash Tables Dictionaries and Hash Tables 0 1 2 3 025-612-0001 981-101-0002 4 451-229-0004 Dictionaries and Hash Tables 1 Dictionary ADT The dictionary ADT models a searchable collection of keyelement items The main

More information

Hash Tables. Johns Hopkins Department of Computer Science Course : Data Structures, Professor: Greg Hager

Hash Tables. Johns Hopkins Department of Computer Science Course : Data Structures, Professor: Greg Hager Hash Tables What is a Dictionary? Container class Stores key-element pairs Allows look-up (find) operation Allows insertion/removal of elements May be unordered or ordered Dictionary Keys Must support

More information

CS2210 Data Structures and Algorithms

CS2210 Data Structures and Algorithms CS2210 Data Structures and Algorithms Lecture 5: Hash Tables Instructor: Olga Veksler 0 1 2 3 025-612-0001 981-101-0002 4 451-229-0004 2004 Goodrich, Tamassia Outline Hash Tables Motivation Hash functions

More information

Elementary Data Structures 2

Elementary Data Structures 2 Elementary Data Structures Priority Queues, & Dictionaries Priority Queues Sell 00 IBM $ Sell 300 IBM $0 Buy 00 IBM $9 Buy 400 IBM $8 Priority Queue ADT A priority queue stores a collection of items An

More information

Chapter 9: Maps, Dictionaries, Hashing

Chapter 9: Maps, Dictionaries, Hashing Chapter 9: 0 1 025-612-0001 2 981-101-0002 3 4 451-229-0004 Maps, Dictionaries, Hashing Nancy Amato Parasol Lab, Dept. CSE, Texas A&M University Acknowledgement: These slides are adapted from slides provided

More information

DataStruct 8. Heaps and Priority Queues

DataStruct 8. Heaps and Priority Queues 2013-2 DataStruct 8. Heaps and Priority Queues Michael T. Goodrich, et. al, Data Structures and Algorithms in C++, 2 nd Ed., John Wiley & Sons, Inc., 2011. November 13, 2013 Advanced Networking Technology

More information

Hashing. Manolis Koubarakis. Data Structures and Programming Techniques

Hashing. Manolis Koubarakis. Data Structures and Programming Techniques Hashing Manolis Koubarakis 1 The Symbol Table ADT A symbol table T is an abstract storage that contains table entries that are either empty or are pairs of the form (K, I) where K is a key and I is some

More information

CSCI-1200 Data Structures Fall 2009 Lecture 20 Hash Tables, Part II

CSCI-1200 Data Structures Fall 2009 Lecture 20 Hash Tables, Part II Review from Lecture 19 CSCI-1200 Data Structures Fall 2009 Lecture 20 Hash Tables, Part II Operators as non-member functions, as member functions, and as friend functions. A hash table is a table implementation

More information

CH 9 : MAPS AND DICTIONARIES

CH 9 : MAPS AND DICTIONARIES CH 9 : MAPS AND DICTIONARIES 1 ACKNOWLEDGEMENT: THESE SLIDES ARE ADAPTED FROM SLIDES PROVIDED WITH DATA STRUCTURES AND ALGORITHMS IN C++, GOODRICH, TAMASSIA AND MOUNT (WILEY 2004) AND SLIDES FROM JORY

More information

COSC 2011 Section N Data structure for efficient Thursday, April

COSC 2011 Section N Data structure for efficient Thursday, April Skip Lists - Description: (1) COSC 2011 Data structure for efficient Thursday, April 26 2001 Overview Skip Lists Description, Definition ADT Searching Insertion Removal realization of the Ordered Dictionary.

More information

Introducing Hashing. Chapter 21. Copyright 2012 by Pearson Education, Inc. All rights reserved

Introducing Hashing. Chapter 21. Copyright 2012 by Pearson Education, Inc. All rights reserved Introducing Hashing Chapter 21 Contents What Is Hashing? Hash Functions Computing Hash Codes Compressing a Hash Code into an Index for the Hash Table A demo of hashing (after) ARRAY insert hash index =

More information

Open Addressing: Linear Probing (cont.)

Open Addressing: Linear Probing (cont.) Open Addressing: Linear Probing (cont.) Cons of Linear Probing () more complex insert, find, remove methods () primary clustering phenomenon items tend to cluster together in the bucket array, as clustering

More information

HO #13 Fall 2015 Gary Chan. Hashing (N:12)

HO #13 Fall 2015 Gary Chan. Hashing (N:12) HO #13 Fall 2015 Gary Chan Hashing (N:12) Outline Motivation Hashing Algorithms and Improving the Hash Functions Collisions Strategies Open addressing and linear probing Separate chaining COMP2012H (Hashing)

More information

CSCI-1200 Data Structures Fall 2018 Lecture 21 Hash Tables, part 1

CSCI-1200 Data Structures Fall 2018 Lecture 21 Hash Tables, part 1 Review from Lecture 20 CSCI-1200 Data Structures Fall 2018 Lecture 21 Hash Tables, part 1 Finishing binary search trees & the ds set class Operators as non-member functions, as member functions, and as

More information

Cpt S 223. School of EECS, WSU

Cpt S 223. School of EECS, WSU Hashing & Hash Tables 1 Overview Hash Table Data Structure : Purpose To support insertion, deletion and search in average-case constant t time Assumption: Order of elements irrelevant ==> data structure

More information

9.4.1 Search and Update Operations in a Skip List

9.4.1 Search and Update Operations in a Skip List 400 Chapter 9. Maps and Dictionaries 9.4.1 Search and Update Operations in a Skip List The skip list structure allows for simple dictionary search and update algorithms. In fact, all of the skip list search

More information

Hash Table. Ric Glassey

Hash Table. Ric Glassey Hash Table Ric Glassey glassey@kth.se Overview Hash Table Aim: Describe the map abstract data type with efficient insertion, deletion and search operations Motivation: List data structures are divided

More information

Introduction to Hashing

Introduction to Hashing Lecture 11 Hashing Introduction to Hashing We have learned that the run-time of the most efficient search in a sorted list can be performed in order O(lg 2 n) and that the most efficient sort by key comparison

More information

csci 210: Data Structures Maps and Hash Tables

csci 210: Data Structures Maps and Hash Tables csci 210: Data Structures Maps and Hash Tables Summary Topics the Map ADT Map vs Dictionary implementation of Map: hash tables READING: GT textbook chapter 9.1 and 9.2 Map ADT A Map is an abstract data

More information

Hashing. October 19, CMPE 250 Hashing October 19, / 25

Hashing. October 19, CMPE 250 Hashing October 19, / 25 Hashing October 19, 2016 CMPE 250 Hashing October 19, 2016 1 / 25 Dictionary ADT Data structure with just three basic operations: finditem (i): find item with key (identifier) i insert (i): insert i into

More information

Chapter 20 Hash Tables

Chapter 20 Hash Tables Chapter 20 Hash Tables Dictionary All elements have a unique key. Operations: o Insert element with a specified key. o Search for element by key. o Delete element by key. Random vs. sequential access.

More information

Hashing (Κατακερματισμός)

Hashing (Κατακερματισμός) Hashing (Κατακερματισμός) Manolis Koubarakis 1 The Symbol Table ADT A symbol table T is an abstract storage that contains table entries that are either empty or are pairs of the form (K, I) where K is

More information

Hash[ string key ] ==> integer value

Hash[ string key ] ==> integer value Hashing 1 Overview Hash[ string key ] ==> integer value Hash Table Data Structure : Use-case To support insertion, deletion and search in average-case constant time Assumption: Order of elements irrelevant

More information

CS302 - Data Structures using C++

CS302 - Data Structures using C++ CS302 - Data Structures using C++ Topic: Dictionaries and their Implementations Kostas Alexis The ADT Dictionary A collection of data about certain cities (slide to be referenced later) City Country Population

More information

Lecture 10 March 4. Goals: hashing. hash functions. closed hashing. application of hashing

Lecture 10 March 4. Goals: hashing. hash functions. closed hashing. application of hashing Lecture 10 March 4 Goals: hashing hash functions chaining closed hashing application of hashing Computing hash function for a string Horner s rule: (( (a 0 x + a 1 ) x + a 2 ) x + + a n-2 )x + a n-1 )

More information

Introduction hashing: a technique used for storing and retrieving information as quickly as possible.

Introduction hashing: a technique used for storing and retrieving information as quickly as possible. Lecture IX: Hashing Introduction hashing: a technique used for storing and retrieving information as quickly as possible. used to perform optimal searches and is useful in implementing symbol tables. Why

More information

CSE 214 Computer Science II Searching

CSE 214 Computer Science II Searching CSE 214 Computer Science II Searching Fall 2017 Stony Brook University Instructor: Shebuti Rayana shebuti.rayana@stonybrook.edu http://www3.cs.stonybrook.edu/~cse214/sec02/ Introduction Searching in a

More information

Hash Tables. Gunnar Gotshalks. Maps 1

Hash Tables. Gunnar Gotshalks. Maps 1 Hash Tables Maps 1 Definition A hash table has the following components» An array called a table of size N» A mathematical function called a hash function that maps keys to valid array indices hash_function:

More information

Tables. The Table ADT is used when information needs to be stored and acessed via a key usually, but not always, a string. For example: Dictionaries

Tables. The Table ADT is used when information needs to be stored and acessed via a key usually, but not always, a string. For example: Dictionaries 1: Tables Tables The Table ADT is used when information needs to be stored and acessed via a key usually, but not always, a string. For example: Dictionaries Symbol Tables Associative Arrays (eg in awk,

More information

Non-numeric types, boolean types, arithmetic. operators. Comp Sci 1570 Introduction to C++ Non-numeric types. const. Reserved words.

Non-numeric types, boolean types, arithmetic. operators. Comp Sci 1570 Introduction to C++ Non-numeric types. const. Reserved words. , ean, arithmetic s s on acters Comp Sci 1570 Introduction to C++ Outline s s on acters 1 2 3 4 s s on acters Outline s s on acters 1 2 3 4 s s on acters ASCII s s on acters ASCII s s on acters Type: acter

More information

CMSC 341 Hashing (Continued) Based on slides from previous iterations of this course

CMSC 341 Hashing (Continued) Based on slides from previous iterations of this course CMSC 341 Hashing (Continued) Based on slides from previous iterations of this course Today s Topics Review Uses and motivations of hash tables Major concerns with hash tables Properties Hash function Hash

More information

Understand how to deal with collisions

Understand how to deal with collisions Understand the basic structure of a hash table and its associated hash function Understand what makes a good (and a bad) hash function Understand how to deal with collisions Open addressing Separate chaining

More information

Lecture 16. Reading: Weiss Ch. 5 CSE 100, UCSD: LEC 16. Page 1 of 40

Lecture 16. Reading: Weiss Ch. 5 CSE 100, UCSD: LEC 16. Page 1 of 40 Lecture 16 Hashing Hash table and hash function design Hash functions for integers and strings Collision resolution strategies: linear probing, double hashing, random hashing, separate chaining Hash table

More information

SETS AND MAPS. Chapter 9

SETS AND MAPS. Chapter 9 SETS AND MAPS Chapter 9 The set Functions Required methods: testing set membership (find) testing for an empty set (empty) determining set size (size) creating an iterator over the set (begin, end) adding

More information

Hashing. Hashing Procedures

Hashing. Hashing Procedures Hashing Hashing Procedures Let us denote the set of all possible key values (i.e., the universe of keys) used in a dictionary application by U. Suppose an application requires a dictionary in which elements

More information

Unit #5: Hash Functions and the Pigeonhole Principle

Unit #5: Hash Functions and the Pigeonhole Principle Unit #5: Hash Functions and the Pigeonhole Principle CPSC 221: Basic Algorithms and Data Structures Jan Manuch 217S1: May June 217 Unit Outline Constant-Time Dictionaries? Hash Table Outline Hash Functions

More information

General Idea. Key could be an integer, a string, etc e.g. a name or Id that is a part of a large employee structure

General Idea. Key could be an integer, a string, etc e.g. a name or Id that is a part of a large employee structure Hashing 1 Hash Tables We ll discuss the hash table ADT which supports only a subset of the operations allowed by binary search trees. The implementation of hash tables is called hashing. Hashing is a technique

More information

CSCI-1200 Data Structures Spring 2018 Lecture 16 Trees, Part I

CSCI-1200 Data Structures Spring 2018 Lecture 16 Trees, Part I CSCI-1200 Data Structures Spring 2018 Lecture 16 Trees, Part I Review from Lecture 15 Maps containing more complicated values. Example: index mapping words to the text line numbers on which they appear.

More information

Hashing. CptS 223 Advanced Data Structures. Larry Holder School of Electrical Engineering and Computer Science Washington State University

Hashing. CptS 223 Advanced Data Structures. Larry Holder School of Electrical Engineering and Computer Science Washington State University Hashing CptS 223 Advanced Data Structures Larry Holder School of Electrical Engineering and Computer Science Washington State University 1 Overview Hashing Technique supporting insertion, deletion and

More information

Today: Finish up hashing Sorted Dictionary ADT: Binary search, divide-and-conquer Recursive function and recurrence relation

Today: Finish up hashing Sorted Dictionary ADT: Binary search, divide-and-conquer Recursive function and recurrence relation Announcements HW1 PAST DUE HW2 online: 7 questions, 60 points Nat l Inst visit Thu, ok? Last time: Continued PA1 Walk Through Dictionary ADT: Unsorted Hashing Today: Finish up hashing Sorted Dictionary

More information

Introduction. hashing performs basic operations, such as insertion, better than other ADTs we ve seen so far

Introduction. hashing performs basic operations, such as insertion, better than other ADTs we ve seen so far Chapter 5 Hashing 2 Introduction hashing performs basic operations, such as insertion, deletion, and finds in average time better than other ADTs we ve seen so far 3 Hashing a hash table is merely an hashing

More information

CSC 427: Data Structures and Algorithm Analysis. Fall 2004

CSC 427: Data Structures and Algorithm Analysis. Fall 2004 CSC 47: Data Structures and Algorithm Analysis Fall 4 Associative containers STL data structures and algorithms sets: insert, find, erase, empty, size, begin/end maps: operator[], find, erase, empty, size,

More information

CS 350 : Data Structures Hash Tables

CS 350 : Data Structures Hash Tables CS 350 : Data Structures Hash Tables David Babcock (courtesy of James Moscola) Department of Physical Sciences York College of Pennsylvania James Moscola Hash Tables Although the various tree structures

More information

Data Structures And Algorithms

Data Structures And Algorithms Data Structures And Algorithms Hashing Eng. Anis Nazer First Semester 2017-2018 Searching Search: find if a key exists in a given set Searching algorithms: linear (sequential) search binary search Search

More information

Hashing. inserting and locating strings. MCS 360 Lecture 28 Introduction to Data Structures Jan Verschelde, 27 October 2010.

Hashing. inserting and locating strings. MCS 360 Lecture 28 Introduction to Data Structures Jan Verschelde, 27 October 2010. ing 1 2 3 MCS 360 Lecture 28 Introduction to Data Structures Jan Verschelde, 27 October 2010 ing 1 2 3 We covered STL sets and maps for a frequency table. The STL implementation uses a balanced search

More information

Topic HashTable and Table ADT

Topic HashTable and Table ADT Topic HashTable and Table ADT Hashing, Hash Function & Hashtable Search, Insertion & Deletion of elements based on Keys So far, By comparing keys! Linear data structures Non-linear data structures Time

More information

Dictionary. Dictionary. stores key-value pairs. Find(k) Insert(k, v) Delete(k) List O(n) O(1) O(n) Sorted Array O(log n) O(n) O(n)

Dictionary. Dictionary. stores key-value pairs. Find(k) Insert(k, v) Delete(k) List O(n) O(1) O(n) Sorted Array O(log n) O(n) O(n) Hash-Tables Introduction Dictionary Dictionary stores key-value pairs Find(k) Insert(k, v) Delete(k) List O(n) O(1) O(n) Sorted Array O(log n) O(n) O(n) Balanced BST O(log n) O(log n) O(log n) Dictionary

More information

Hashing as a Dictionary Implementation

Hashing as a Dictionary Implementation Hashing as a Dictionary Implementation Chapter 22 Contents The Efficiency of Hashing The Load Factor The Cost of Open Addressing The Cost of Separate Chaining Rehashing Comparing Schemes for Collision

More information

CSCI-1200 Data Structures Fall 2017 Lecture 17 Trees, Part I

CSCI-1200 Data Structures Fall 2017 Lecture 17 Trees, Part I Review from Lecture 16 CSCI-1200 Data Structures Fall 2017 Lecture 17 Trees, Part I Maps containing more complicated values. Example: index mapping words to the text line numbers on which they appear.

More information

CMSC 341 Lecture 16/17 Hashing, Parts 1 & 2

CMSC 341 Lecture 16/17 Hashing, Parts 1 & 2 CMSC 341 Lecture 16/17 Hashing, Parts 1 & 2 Prof. John Park Based on slides from previous iterations of this course Today s Topics Overview Uses and motivations of hash tables Major concerns with hash

More information

CSCE 110 PROGRAMMING FUNDAMENTALS

CSCE 110 PROGRAMMING FUNDAMENTALS CSCE 110 PROGRAMMING FUNDAMENTALS WITH C++ Prof. Amr Goneid AUC Part 15. Dictionaries (1): A Key Table Class Prof. amr Goneid, AUC 1 Dictionaries(1): A Key Table Class Prof. Amr Goneid, AUC 2 A Key Table

More information

Hashing. mapping data to tables hashing integers and strings. aclasshash_table inserting and locating strings. inserting and locating strings

Hashing. mapping data to tables hashing integers and strings. aclasshash_table inserting and locating strings. inserting and locating strings Hashing 1 Hash Functions mapping data to tables hashing integers and strings 2 Open Addressing aclasshash_table inserting and locating strings 3 Chaining aclasshash_table inserting and locating strings

More information

Linked List using a Sentinel

Linked List using a Sentinel Linked List using a Sentinel Linked List.h / Linked List.h Using a sentinel for search Created by Enoch Hwang on 2/1/10. Copyright 2010 La Sierra University. All rights reserved. / #include

More information

CSCI-1200 Data Structures Fall 2015 Lecture 24 Hash Tables

CSCI-1200 Data Structures Fall 2015 Lecture 24 Hash Tables CSCI-1200 Data Structures Fall 2015 Lecture 24 Hash Tables Review from Lecture 22 & 23 STL Queues & Stacks Definition of a Prioriry Queue / Binary Heap percolate_up and percolate_down A Heap as a Vector,

More information

1 Probability Review. CS 124 Section #8 Hashing, Skip Lists 3/20/17. Expectation (weighted average): the expectation of a random quantity X is:

1 Probability Review. CS 124 Section #8 Hashing, Skip Lists 3/20/17. Expectation (weighted average): the expectation of a random quantity X is: CS 124 Section #8 Hashing, Skip Lists 3/20/17 1 Probability Review Expectation (weighted average): the expectation of a random quantity X is: x= x P (X = x) For each value x that X can take on, we look

More information

Hashing Techniques. Material based on slides by George Bebis

Hashing Techniques. Material based on slides by George Bebis Hashing Techniques Material based on slides by George Bebis https://www.cse.unr.edu/~bebis/cs477/lect/hashing.ppt The Search Problem Find items with keys matching a given search key Given an array A, containing

More information

Hash Tables Outline. Definition Hash functions Open hashing Closed hashing. Efficiency. collision resolution techniques. EECS 268 Programming II 1

Hash Tables Outline. Definition Hash functions Open hashing Closed hashing. Efficiency. collision resolution techniques. EECS 268 Programming II 1 Hash Tables Outline Definition Hash functions Open hashing Closed hashing collision resolution techniques Efficiency EECS 268 Programming II 1 Overview Implementation style for the Table ADT that is good

More information

University of Illinois at Urbana-Champaign Department of Computer Science. Final Examination

University of Illinois at Urbana-Champaign Department of Computer Science. Final Examination University of Illinois at Urbana-Champaign Department of Computer Science Final Examination CS 225 Data Structures and Software Principles Fall 2009 7-10p, Tuesday, December 15 Name: NetID: Lab Section

More information

The dictionary problem

The dictionary problem 6 Hashing The dictionary problem Different approaches to the dictionary problem: previously: Structuring the set of currently stored keys: lists, trees, graphs,... structuring the complete universe of

More information

HASH TABLES.

HASH TABLES. 1 HASH TABLES http://en.wikipedia.org/wiki/hash_table 2 Hash Table A hash table (or hash map) is a data structure that maps keys (identifiers) into a certain location (bucket) A hash function changes the

More information

CSCI-1200 Data Structures Spring 2015 Lecture 18 Trees, Part I

CSCI-1200 Data Structures Spring 2015 Lecture 18 Trees, Part I CSCI-1200 Data Structures Spring 2015 Lecture 18 Trees, Part I Review from Lectures 17 Maps containing more complicated values. Example: index mapping words to the text line numbers on which they appear.

More information

Lecture 7: Efficient Collections via Hashing

Lecture 7: Efficient Collections via Hashing Lecture 7: Efficient Collections via Hashing These slides include material originally prepared by Dr. Ron Cytron, Dr. Jeremy Buhler, and Dr. Steve Cole. 1 Announcements Lab 6 due Friday Lab 7 out tomorrow

More information

Table ADT and Sorting. Algorithm topics continuing (or reviewing?) CS 24 curriculum

Table ADT and Sorting. Algorithm topics continuing (or reviewing?) CS 24 curriculum Table ADT and Sorting Algorithm topics continuing (or reviewing?) CS 24 curriculum A table ADT (a.k.a. Dictionary, Map) Table public interface: // Put information in the table, and a unique key to identify

More information

Algorithms and Data Structures

Algorithms and Data Structures Lesson 4: Sets, Dictionaries and Hash Tables Luciano Bononi http://www.cs.unibo.it/~bononi/ (slide credits: these slides are a revised version of slides created by Dr. Gabriele D Angelo)

More information

CSCE 110 PROGRAMMING FUNDAMENTALS

CSCE 110 PROGRAMMING FUNDAMENTALS CSCE 110 PROGRAMMING FUNDAMENTALS WITH C++ Prof. Amr Goneid AUC Part 16. Linked Lists Prof. amr Goneid, AUC 1 Linked Lists Prof. amr Goneid, AUC 2 Linked Lists The Linked List Structure Some Linked List

More information

Hashing. So what we do instead is to store in each slot of the array a linked list of (key, record) - pairs, as in Fig. 1. Figure 1: Chaining

Hashing. So what we do instead is to store in each slot of the array a linked list of (key, record) - pairs, as in Fig. 1. Figure 1: Chaining Hashing Databases and keys. A database is a collection of records with various attributes. It is commonly represented as a table, where the rows are the records, and the columns are the attributes: Number

More information

Dynamic Dictionaries. Operations: create insert find remove max/ min write out in sorted order. Only defined for object classes that are Comparable

Dynamic Dictionaries. Operations: create insert find remove max/ min write out in sorted order. Only defined for object classes that are Comparable Hashing Dynamic Dictionaries Operations: create insert find remove max/ min write out in sorted order Only defined for object classes that are Comparable Hash tables Operations: create insert find remove

More information

University of Waterloo Department of Electrical and Computer Engineering ECE 250 Data Structures and Algorithms. Final Examination

University of Waterloo Department of Electrical and Computer Engineering ECE 250 Data Structures and Algorithms. Final Examination University of Waterloo Department of Electrical and Computer Engineering ECE 250 Data Structures and Algorithms Instructor: Douglas Wilhelm Harder Time: 2.5 hours Aides: none 14 pages Final Examination

More information

Hashing Algorithms. Hash functions Separate Chaining Linear Probing Double Hashing

Hashing Algorithms. Hash functions Separate Chaining Linear Probing Double Hashing Hashing Algorithms Hash functions Separate Chaining Linear Probing Double Hashing Symbol-Table ADT Records with keys (priorities) basic operations insert search create test if empty destroy copy generic

More information

CSCI-1200 Data Structures Fall 2017 Lecture 23 Functors & Hash Tables, part II

CSCI-1200 Data Structures Fall 2017 Lecture 23 Functors & Hash Tables, part II CSCI-1200 Data Structures Fall 2017 Lecture 23 Functors & Hash Tables, part II Review from Lecture 22 A Heap as a Vector Building a Heap Heap Sort the single most important data structure known to mankind

More information

Data Structure Lecture#22: Searching 3 (Chapter 9) U Kang Seoul National University

Data Structure Lecture#22: Searching 3 (Chapter 9) U Kang Seoul National University Data Structure Lecture#22: Searching 3 (Chapter 9) U Kang Seoul National University U Kang 1 In This Lecture Motivation of collision resolution policy Open hashing for collision resolution Closed hashing

More information

COSC160: Data Structures Hashing Structures. Jeremy Bolton, PhD Assistant Teaching Professor

COSC160: Data Structures Hashing Structures. Jeremy Bolton, PhD Assistant Teaching Professor COSC160: Data Structures Hashing Structures Jeremy Bolton, PhD Assistant Teaching Professor Outline I. Hashing Structures I. Motivation and Review II. Hash Functions III. HashTables I. Implementations

More information

CSE030 Fall 2012 Final Exam Friday, December 14, PM

CSE030 Fall 2012 Final Exam Friday, December 14, PM CSE030 Fall 2012 Final Exam Friday, December 14, 2012 3-6PM Write your name here and at the top of each page! Name: Select your lab session: Tuesdays Thursdays Paper. If you have any questions or need

More information

CS1020 Data Structures and Algorithms I Lecture Note #15. Hashing. For efficient look-up in a table

CS1020 Data Structures and Algorithms I Lecture Note #15. Hashing. For efficient look-up in a table CS1020 Data Structures and Algorithms I Lecture Note #15 Hashing For efficient look-up in a table Objectives 1 To understand how hashing is used to accelerate table lookup 2 To study the issue of collision

More information

MIDTERM EXAM THURSDAY MARCH

MIDTERM EXAM THURSDAY MARCH Week 6 Assignments: Program 2: is being graded Program 3: available soon and due before 10pm on Thursday 3/14 Homework 5: available soon and due before 10pm on Monday 3/4 X-Team Exercise #2: due before

More information

Lab 2: ADT Design & Implementation

Lab 2: ADT Design & Implementation Lab 2: ADT Design & Implementation By Dr. Yingwu Zhu, Seattle University 1. Goals In this lab, you are required to use a dynamic array to design and implement an ADT SortedList that maintains a sorted

More information

Assignment 3 Suggested solutions

Assignment 3 Suggested solutions ECE-250 Algorithms and Data Structures (Winter 2012) Assignment 3 Suggested solutions 1 - Provide the definition of a function swap nodes that swaps two elements in a doubly-linked list by readjusting

More information

Lecture 4. Hashing Methods

Lecture 4. Hashing Methods Lecture 4 Hashing Methods 1 Lecture Content 1. Basics 2. Collision Resolution Methods 2.1 Linear Probing Method 2.2 Quadratic Probing Method 2.3 Double Hashing Method 2.4 Coalesced Chaining Method 2.5

More information

Tutorial AVL TREES. arra[5] = {1,2,3,4,5} arrb[8] = {20,30,80,40,10,60,50,70} FIGURE 1 Equivalent Binary Search and AVL Trees. arra = {1, 2, 3, 4, 5}

Tutorial AVL TREES. arra[5] = {1,2,3,4,5} arrb[8] = {20,30,80,40,10,60,50,70} FIGURE 1 Equivalent Binary Search and AVL Trees. arra = {1, 2, 3, 4, 5} 1 Tutorial AVL TREES Binary search trees are designed for efficient access to data. In some cases, however, a binary search tree is degenerate or "almost degenerate" with most of the n elements descending

More information

CSE 100: UNION-FIND HASH

CSE 100: UNION-FIND HASH CSE 100: UNION-FIND HASH Run time of Simple union and find (using up-trees) If we have no rule in place for performing the union of two up trees, what is the worst case run time of find and union operations

More information

CpSc212 Goddard Notes Chapter 10. Linked Lists

CpSc212 Goddard Notes Chapter 10. Linked Lists CpSc212 Goddard Notes Chapter 10 Linked Lists 10.1 Links and Pointers The linked list is not an ADT in its own right; rather it is a way of implementing many data structures. It is designed to replace

More information

Linear Lists. Advanced Algorithmics (4AP) Linear structures: Abstract Data Type (ADT) Lists: Array ADT

Linear Lists. Advanced Algorithmics (4AP) Linear structures: Abstract Data Type (ADT) Lists: Array ADT Linear Lists Advanced Algorithmics (4AP) Linear structures: Lists, Queues, Stacks, sorting, Jaak Vilo 2009 Spring Operations which one may want to perform on a linear list of n elements include: gain access

More information

Hash Table and Hashing

Hash Table and Hashing Hash Table and Hashing The tree structures discussed so far assume that we can only work with the input keys by comparing them. No other operation is considered. In practice, it is often true that an input

More information

Figure 1: Chaining. The implementation consists of these private members:

Figure 1: Chaining. The implementation consists of these private members: Hashing Databases and keys. A database is a collection of records with various attributes. It is commonly represented as a table, where the rows are the records, and the columns are the attributes: Number

More information

CS 241 Analysis of Algorithms

CS 241 Analysis of Algorithms CS 241 Analysis of Algorithms Professor Eric Aaron Lecture T Th 9:00am Lecture Meeting Location: OLB 205 Business HW5 extended, due November 19 HW6 to be out Nov. 14, due November 26 Make-up lecture: Wed,

More information

CE221 Programming in C++ Part 1 Introduction

CE221 Programming in C++ Part 1 Introduction CE221 Programming in C++ Part 1 Introduction 06/10/2017 CE221 Part 1 1 Module Schedule There are two lectures (Monday 13.00-13.50 and Tuesday 11.00-11.50) each week in the autumn term, and a 2-hour lab

More information

CSCI-1200 Data Structures Spring 2014 Lecture 19 Hash Tables

CSCI-1200 Data Structures Spring 2014 Lecture 19 Hash Tables CSCI-1200 Data Structures Spring 2014 Lecture 19 Hash Tables Announcements: Test 3 Information Test 3 will be held Monday, April 14th from 6-7:50pm in DCC 308 (Sections 1-5 & 7-9) and DCC 337 (Sections

More information

Multiple Choice (Questions 1 14) 28 Points Select all correct answers (multiple correct answers are possible)

Multiple Choice (Questions 1 14) 28 Points Select all correct answers (multiple correct answers are possible) Name Closed notes, book and neighbor. If you have any questions ask them. Notes: Segment of code necessary C++ statements to perform the action described not a complete program Program a complete C++ program

More information

Comp 11 - Summer Session Hashmap

Comp 11 - Summer Session Hashmap Lab 14 Comp 11 - Summer Session Hashmap 14.1 Description In this lab we are going to implement part of a hashmap. Our hashmap is limited to storing pairs with a key of string and a value of string for

More information