Cuckoo Filter: Practically Better Than Bloom
|
|
- Brett Webster
- 5 years ago
- Views:
Transcription
1 Cuckoo Filter: Practically Better Than Bloom Bin Fan (CMU/Google) David Andersen (CMU) Michael Kaminsky (Intel Labs) Michael Mitzenmacher (Harvard) 1
2 What is Bloom Filter? A Compact Data Structure Storing Set-membership Bloom Filters answer is item x in set Y by: definitely no, or probably yes with probability ε to be wrong false positive rate Benefit: not always precise but highly compact Typically a few bits per item Achieving lower ε (more accurate) requires spending more bits per 2 item
3 Example Use: Safe Browsing It is Good! Lookup( ) Probably Yes! No! Please verify Remote Server Known Malicious URLs Stored in Bloom Filter Scale to millions URLs 3
4 Bloom Filter Basics A Bloom Filter consists of m bits and k hash functions Example: m = 10, k = 3 Insert(x) Lookup(y) = not found hash 1 (x) hash 2 (x) hash 3 (x) hash 1 (y) hash 2 (y) hash 3 (y)
5 Succinct Data Structures for Approximate Set-membership Tests High Performance Low Space Cost Delete Support Bloom Filter Counting Bloom Filter Quotient Filter Can we achieve all three in practice? 5
6 Outline Background Cuckoo filter algorithm Performance evaluation Summary 6
7 Basic Idea: Store Fingerprints in Hash Table Fingerprint(x): A hash value of x Lower false positive rate ε, longer fingerprint 0: 1: 2: 3: 4: 5: 6: 7: FP(a) FP(c) FP(b) 7
8 Basic Idea: Store Fingerprints in Hash Table Fingerprint(x): A hash value of x Lower false positive rate ε, longer fingerprint Insert(x): add Fingerprint(x) to hash table 0: 1: 2: 3: 4: 5: 6: 7: FP(x) FP(a) FP(c) FP(b) 8
9 Basic Idea: Store Fingerprints in Hash Table Fingerprint(x): A hash value of x Lower false positive rate ε, longer fingerprint Insert(x): add Fingerprint(x) to hash table Lookup(x): search Fingerprint(x) in hashtable Lookup(x) = found 0: 1: 2: 3: 4: 5: 6: 7: FP(x) FP(a) FP(c) FP(b) 9
10 Basic Idea: Store Fingerprints in Hash Table Fingerprint(x): A hash value of x Lower false positive rate ε, longer fingerprint Insert(x): add Fingerprint(x) to hash table Lookup(x): search Fingerprint(x) in hashtable Delete(x): remove Fingerprint(x) from hashtable How to Construct Hashtable? Delete(x) 0: 1: 2: 3: 4: 5: 6: 7: FP(x) FP(a) FP(c) FP(b) 10
11 (Minimal) Perfect Hashing: No Collision but Update is Expensive Perfect hashing: maps all items with no collisions {a, b, c, d, e, f} f(x) FP(b) FP(e) FP(c) FP(d) FP(f) FP(a) 11
12 (Minimum) Perfect Hashing: No Collision but Update is Expensive Perfect hashing: maps all items with no collisions f(x) {a, b, c, d, e, f} {a, b, c, d, e, g} f(x) =? FP(b) FP(e) FP(c) FP(d) FP(f) FP(a) Changing set must recalculate f high cost/bad performance of update 12
13 Convention Hash Table: High Space Cost Chaining : Linear Probing bkt0 Lookup(x) FP(a) bkt1 FP(f) bkt2 FP(c) bkt3 FP(a) FP(c) FP(d) Lookup(x) Pointers low space utilization Making lookups O(1) requires large % table empty low space utilization Compare multiple fingerprints sequentially 13 more false positives
14 Cuckoo Hashing [Pagh2004] Good But.. High Space Utilization 4-way set-associative table: >95% entries occupied Fast Lookup: O(1) lookup(x) hash 1 (x) hash 2 (x) 0: 1: 2: 3: 4: 5: 6: 7: Standard cuckoo hashing doesn t work with fingerprints[pagh2004] Cuckoo hashing. 14
15 Standard Cuckoo Requires Storing Each Item 0: 1: Insert(x) h 1 (x) 2: 3: 4: b c h 2 (x) 5: 6: a 7: 15
16 Standard Cuckoo Requires Storing Each Item 0: 1: Insert(x) 2: 3: 4: b c h 2 (x) 5: 6: x Rehash a: alternate(a) = 4 Kick a to bucket 4 7: 16
17 Standard Cuckoo Requires Storing Each Item 0: 1: Insert(x) 2: 3: 4: b a Rehash c: alternate(c) = 1 Kick c to bucket 1 h 2 (x) 5: 6: x Rehash a: alternate(a) = 4 Kick a to bucket 4 7: 17
18 Standard Cuckoo Requires Storing Each Item Insert(x) 0: 1: 2: 3: 4: c b a Insert complete (or fail if MaxSteps reached) Rehash c: alternate(c) = 1 Kick c to bucket 1 h 2 (x) 5: 6: x Rehash a: alternate(a) = 4 Kick a to bucket 4 7: 18
19 Challenge: How to Perform Cuckoo? Cuckoo hashing requires rehashing and displacing existing items 0: 1: 2: 3: 4: 5: 6: 7: FP(b) FP(c) FP(a) Kick FP(c) to which bucket? Kick FP(a) to which bucket? With only fingerprint, how to calculate item s alternate bucket? 19
20 We Apply Partial-Key Cuckoo Standard Cuckoo Hashing: two independent hash functions for two buckets bucket1 = hash 1 (x) bucket2 = hash 2 (x) Partial-key Cuckoo Hashing: use one bucket and fingerprint to derive the other [Fan2013] bucket1 = hash(x) bucket2 = bucket1 hash(fp(x)) To displace existing fingerprint: alternate(x) = current(x) hash(fp(x)) [Fan2013] MemC3: Compact and Concurrent MemCache 20 with Dumber Caching and Smarter Hashing
21 Partial Key Cuckoo Hashing Perform cuckoo hashing on fingerprints 0: 1: 2: 3: 4: 5: 6: 7: FP(b) FP(c) FP(a) Kick FP(c) to 4 Kick FP(a) to 6 hash(fp(c)) hash(fp(a)) Can we still achieve high space utilization with partial-key cuckoo 21 hashing?
22 Fingerprints Must Be Long for Space Efficiency Table Space Utilization When fingerprint > 5 bits, high table space utilization Table size: n=128 million entries Fingerprint must be Ω(logn/b) bits in theory n: hash table size, b: bucket size see more analysis in paper 22
23 Space Efficiency More Space bits per item to achieve ε Lower bound More False Positive ε: target false positive rate 24
24 Space Efficiency More Space bits per item to achieve ε Bloom filter Lower bound More False Positive ε: target false positive rate 25
25 Space Efficiency More Space bits per item to achieve ε Bloom filter Lower bound Cuckoo filter More False Positive ε: target false positive rate 26
26 Space Efficiency More Space bits per item to achieve ε Bloom filter Lower bound Cuckoo filter Cuckoo filter + semisorting more compact than Bloom filter at 3% More False Positive ε: target false positive rate 27
27 Outline Background Cuckoo filter algorithm Performance evaluation Summary 28
28 Evaluation Compare cuckoo filter with Bloom filter (cannot delete) Blocked Bloom filter [Putze2007] (cannot delete) d-left counting Bloom filter [Bonomi2006] Cuckoo filter + semisorting More in the paper C++ implementation, single threaded [Putze2007] Cache-, hash- and space- efficient bloom filters. [Bonomi2006] Beyond Bloom filters: From approximate membership checks to approximate state machines. 29
29 Lookup Performance (MOPS) Hit Miss Cuckoo Cuckoo + semisort d-left counting Bloom blocked Bloom Bloom (no deletion) (no deletion) cf cfss dlbf bbf bf 30
30 Lookup Performance (MOPS) Hit Miss Cuckoo Cuckoo + semisort d-left counting Bloom blocked Bloom Bloom (no deletion) (no deletion) cf cfss dlbf bbf bf 31
31 Lookup Performance (MOPS) Hit Miss Cuckoo Cuckoo + semisort d-left counting Bloom blocked Bloom Bloom (no deletion) (no deletion) cf cfss dlbf bbf bf 32
32 Lookup Performance (MOPS) Hit Miss Cuckoo Cuckoo + semisort d-left counting Bloom blocked Bloom Bloom (no deletion) (no deletion) cf cfss dlbf bbf bf Cuckoo filter is among the fastest regardless 33 workloads.
33 Insert Performance (MOPS) Cuckoo Blocked Bloom d-left Bloom Standard Bloom Cuckoo + semisorting Cuckoo filter has decreasing insert rate, but overall is only slower than blocked Bloom 34 filter.
34 Summary Cuckoo filter, a Bloom filter replacement: Deletion support High performance Less Space than Bloom filters in practice Easy to implement Source code available in C++: 35
35 Othello Hashing and Its Applications for Network Processing Chen Qian Department of Computer Engineering Publications in ICNP 17, SIGMETRICS 17, MECOMM 17 and Bioinformatics
36 Background PhD in 2013 from UT Austin with Simon Lam Research: Computer networking SDN/NFV Internet of things Security 37
37 Motivation Network algorithms always prefer small memory and fast processing speed. Fast memory is precious resource on network devices Needs to reach the line rate to avoid being a bottleneck, under large traffic volume More important in networks with layer-two semantics 38
38 Othello Hashing Essentially a key-value lookup structure Keys can be any names, addresses, identifiers, etc. Values should not be too long. At most 64 bit. For example Key: network address; Value: link to forward a packet Key: virtual IP; Value: direct IP 39
39 Why Othello is special Minimal query time: only two memory read operations (cachelines) per query. Minimal memory cost: 10%-30% of existing hash tables (e.g., Cuckoo). Support dynamic updates: can be updated over a million times per second. 40
40 Idea of dynamic Othello lookups Controller Program Construct Update K-V Lookup Lookup Update via existing API of programmable networks 41 Optimize memory and query cost
41 How Othello works Basic version: Classifies keys to two sets XX and YY Equivalent to key lookups for a 1-bit value Query result ττ kk = 0 kk XX ττ kk = 1 kk YY Advance version: Classifies keys to 2 ll sets Equivalent to key lookups for a ll-bit value 42
42 Othello Query Structure nn is # of keys Two bitmaps aa, bb with size mm (mm in (1.33nn, 2nn)) h aa aa bb h bb Query is easy. Then how to construct it? mm bits ττ = is 0in set 1 = Y 1
43 Othello Control Structure: Construct GG: acyclic bipartite graph h aa aa uu 00 uu 11 uu 22 uu 33 uu 44 uu 55 uu 66 uu 77 kk h aa (kk) h bb (kk) 6 5 vv 00 vv 11 vv 22 vv 33 vv 44 vv 55 vv 66 vv 77 bb h bb 44
44 Othello Construct 45 h aa aa bb h bb If finding a cycle, use another pair kk <h a, h b > until an acyclic graph is built uu 00 vv 00 uu 11 uu 22 uu 33 uu 44 uu 55 uu 66 uu 77 h aa (kk) h bb (kk) For vv 11 vvn 22 names, vv 33 vv 44 vvthe 55 vvtime to find G is O(n)
45 Compute Bitmap aa uu 00 uu 11 uu 22 uu 33 uu 44 uu 55 1 uu 66 uu 77 kk h aa (kk) h bb (kk) set 6 5 Y vv 00 vv 11 vv 22 vv 33 vv 44 vv 55 vv 66 vv bb
46 Compute Bitmap aa uu 00 1 uu 11 uu 22 uu 33 0 uu 44 uu 55 1 uu 66 uu 77 kk h aa (kk) h bb (kk) set 6 5 Y 1 0 X 1 2 Y vv 00 vv 11 vv 22 vv 33 vv 44 vv 55 vv 66 vv X bb X If G is acyclic, easy to find a coloring plan 47
47 Name Addition color flip h aa aa uu 00 vv 00 1 uu 11 vv 11 uu 22 vv 22 uu 33 vv 33 0 uu 44 vv 44 uu 55 vv 55 uu 66 vv 66 uu 77 vv 77 bb h bb If G is acyclic, flipping is trivial 10 kk h aa (kk) h bb (kk) set 6 5 Y 1 0 X 1 2 Y 1 3 X 4 2 X 6 3 Y 48
48 L-Othello functionality Classifies names into 2 ll sets: ZZ 0, ZZ 1,, ZZ 2 ll 1 49 XX 1 YY1 ZZ 2 ZZ 0 ZZ 3 l < 8 for network devices ZZ 1 l Othellos can classify names to 2 l sets XX2 YY 2
49 Example Classify keys in 8 sets: ZZ 0, ZZ 1,, ZZ 7 Orthogonal separation of sets XX 3 = ZZ 0 ZZ 1 ZZ 2 ZZ 3 ; YY 3 = ZZ 4 ZZ 5 ZZ 6 ZZ 7. XX 2 = ZZ 0 ZZ 1 ZZ 4 ZZ 5 ; YY 2 = ZZ 2 ZZ 3 ZZ 6 ZZ 7. XX 1 = ZZ 0 ZZ 2 ZZ 4 ZZ 6 ; YY 1 = ZZ 1 ZZ 3 ZZ 5 ZZ 7. 6=(110) 2 kk YY 3 YY 2 XX 1 kk ZZ 6 ll Othellos : classify keys in 2 ll sets. 50
50 aa 1 uu uu 11 uu 22 uu 33 uu 44 uu 55 uu 66 uu 77 aa 2 uu 00 Same G, h a, h b uu 11 uu 22 uu 33 uu 44 uu 55 uu 66 Different coloring plan and bitmaps uu 77 bb 1 vv 00 vv 11 vv 22 vv 33 vv 44 vv 55 vv 66 vv bb 2 vv 00 vv 11 vv 22 vv 33 vv 44 Do we need 2l memory reads to query l Othellos? vv 55 vv 66 vv Othello 1 Same X UY Othello 2 51
51 aa 1 uu 00 h aa AA[0]AA[1] AA uu 11 uu 22 uu 33 uu 44 uu 55 uu 66 uu 77 aa 2 uu 00 uu 11 uu 22 uu 33 ττ kk = = 11 2 uu 44 uu 55 uu 66 uu 77 bb 1 vv vv 11 vv 22 vv 33 vv 44 vv 55 vv 66 vv 77 k is in set Z BB h bb bb 2 vv 00 vv 11 vv 22 vv 33 vv 44 vv 55 vv 66 vv CPUs can read l bits at one time Othello 1 Othello 2
52 Alien keys What is ττ kk = aa h aa kk not in SS? An arbitrary value bb[h bb kk ] when kk is ττ kk return 1 with when aa ii = 1 && bb jj = 0, or aa ii = 0&& bb jj = 1 53
53 Applications of Othello 1. Forwarding Information Base (FIB) 2. Software load balancer 3. Data placement and lookup 4. Private queries 5. Genomic sequencing search And more 54
54 A Concise FIB Resolving FIB explosion is crucial For layer-two interconnected data centers For OpenFlow-like fine-grained flow control Concise using l-othello is a portable solution In hardware devices Or software switches A Fast, Small, and Dynamic Forwarding Information Base, In ACM SIGMETRICS 2017 A 55 Concise Forwarding Information Base for Scalable and Fast Name Switching, in IEEE ICNP 2017.
55 Network-wide updating If all devices share a same set of network names/addresses Such as in layer-two Ethernet-based data centers All Othellos will share a same G. Hence network-wide updating is very efficient! Update consistency also provided 56
56 Implementation of three prototypes 1. Memory mode Query and control structures running on different threads. 2. CLICK modular router 3. Intel Data Plane Development Kit (DPDK) 57
57 Comparison: Buffalo Yu, Fabrikant, Rexford, in CoNEXT 09 Cuckoo hashing Zhou, Fan, Lim, Kaminsky, Andersen, in CoNEXT 13 and SIGCOMM 15 58
58 Comparison: Memory size FIB Example Memory Size Name Type # Names # Actions Concise Cuckoo Buffalo MAC (48 bits) 7* M 5.62M 2.64M MAC (48 bits) 5* M 40.15M 27.70M MAC (48 bits) 3* M M M IPv4 (32 bits) 1* M 4.27M 3.77M IPv6 (128 bits) 2* M 34.13M 11.08M OpenFlow (356 bits) 3* M 14.46M 1.67M OpenFlow (356 bits) 1.4* M 67.46M 18.21M File name (varied) K 19.32M 1.35M 59
59 Query speed 2x to 4x speed advantage 60
60 Update speed 61
61 For unknown network names 1. For data centers with most internal traffic Such situation is rare 2. For networks with much incoming traffic A filter can be installed at a firewall 3. Concise may include an r-bit checksum. A lookup still requires 2 memory accesses in total, as long as l + r <=
62 Thank You Chen Qian
MemC3: MemCache with CLOCK and Concurrent Cuckoo Hashing
MemC3: MemCache with CLOCK and Concurrent Cuckoo Hashing Bin Fan (CMU), Dave Andersen (CMU), Michael Kaminsky (Intel Labs) NSDI 2013 http://www.pdl.cmu.edu/ 1 Goal: Improve Memcached 1. Reduce space overhead
More informationScalable Enterprise Networks with Inexpensive Switches
Scalable Enterprise Networks with Inexpensive Switches Minlan Yu minlanyu@cs.princeton.edu Princeton University Joint work with Alex Fabrikant, Mike Freedman, Jennifer Rexford and Jia Wang 1 Enterprises
More informationCuckoo Hashing for Undergraduates
Cuckoo Hashing for Undergraduates Rasmus Pagh IT University of Copenhagen March 27, 2006 Abstract This lecture note presents and analyses two simple hashing algorithms: Hashing with Chaining, and Cuckoo
More informationAdvanced Algorithmics (6EAP) MTAT Hashing. Jaak Vilo 2016 Fall
Advanced Algorithmics (6EAP) MTAT.03.238 Hashing Jaak Vilo 2016 Fall Jaak Vilo 1 ADT asscociative array INSERT, SEARCH, DELETE An associative array (also associative container, map, mapping, dictionary,
More informationWhen Cycles are Cheap, Some Tables Can Be Huge
When Cycles are Cheap, Some Tables Can Be Huge Bin Fan, Dong Zhou, Hyeontaek Lim, Michael Kaminsky, David G. Andersen Carnegie Mellon University, Intel Labs Abstract The goal of this paper is to raise
More informationSwitch and Router Design. Packet Processing Examples. Packet Processing Examples. Packet Processing Rate 12/14/2011
// Bottlenecks Memory, memory, 88 - Switch and Router Design Dr. David Hay Ross 8b dhay@cs.huji.ac.il Source: Nick Mckeown, Isaac Keslassy Packet Processing Examples Address Lookup (IP/Ethernet) Where
More informationReliably Scalable Name Prefix Lookup! Haowei Yuan and Patrick Crowley! Washington University in St. Louis!! ANCS 2015! 5/8/2015!
Reliably Scalable Name Prefix Lookup! Haowei Yuan and Patrick Crowley! Washington University in St. Louis!! ANCS 2015! 5/8/2015! ! My Topic for Today! Goal: a reliable longest name prefix lookup performance
More informationRectangular Hash Table: Bloom Filter and Bitmap Assisted Hash Table with High Speed
Rectangular Hash Table: Bloom Filter and Bitmap Assisted Hash Table with High Speed Tong Yang, Binchao Yin, Hang Li, Muhammad Shahzad, Steve Uhlig, Bin Cui Xiaoming Li Peking University, China. Collaborative
More informationExtending DPDK Flow Classification Libraries for Determinism & Cloud Usages SAMEH GOBRIEL INTEL LABS
x Extending DPDK Flow Classification Libraries for Determinism & Cloud Usages SAMEH GOBRIEL INTEL LABS Contributors Yipeng Wang yipeng1.wang@intel.com Ren Wang ren.wang@intel.com Charlie Tai charlie.tai@intel.com
More informationA Filter Base Addressing Protocol for Node Auto- Configuration in Wireless Ad Hoc Network Using Cuckoo Filter.
A Filter Base Addressing Protocol for Node Auto- Configuration in Wireless Ad Hoc Network Using Cuckoo Filter. Mr. Bhushan A. Badgujar. Mrs. Swati patil. Mr. Mayur Agrawal. Computer Science &Engineering
More informationMemory-efficient and Ultra-fast Network Lookup and Forwarding using Othello Hashing
Memory-efficient and Ultra-fast Network Lookup and Forwarding using Othello Hashing Ye Yu, Student Member, IEEE and ACM, Djamal Belazzougui, Chen Qian, Member, IEEE and ACM, and Qin Zhang Abstract Network
More informationCS5112: Algorithms and Data Structures for Applications
CS5112: Algorithms and Data Structures for Applications Lecture 3: Hashing Ramin Zabih Some figures from Wikipedia/Google image search Administrivia Web site is: https://github.com/cornelltech/cs5112-f18
More informationCuckoo Linear Algebra
Cuckoo Linear Algebra Li Zhou, CMU Dave Andersen, CMU and Mu Li, CMU and Alexander Smola, CMU and select advertisement p(click user, query) = logist (hw, (user, query)i) select advertisement find weight
More information1 Probability Review. CS 124 Section #8 Hashing, Skip Lists 3/20/17. Expectation (weighted average): the expectation of a random quantity X is:
CS 124 Section #8 Hashing, Skip Lists 3/20/17 1 Probability Review Expectation (weighted average): the expectation of a random quantity X is: x= x P (X = x) For each value x that X can take on, we look
More informationSummary Cache: A Scalable Wide-Area Web Cache Sharing Protocol
Summary Cache: A Scalable Wide-Area Web Cache Sharing Protocol Li Fan, Pei Cao and Jussara Almeida University of Wisconsin-Madison Andrei Broder Compaq/DEC System Research Center Why Web Caching One of
More informationarxiv: v2 [cs.ni] 22 Nov 2017
Memory-efficient and Ultra-fast Network Lookup and Forwarding using Othello Hashing Ye Yu, Student Member, IEEE and ACM, Djamal Belazzougui, Chen Qian, Member, IEEE and ACM, and Qin Zhang arxiv:608.05699v2
More informationLecture 11: Packet forwarding
Lecture 11: Packet forwarding Anirudh Sivaraman 2017/10/23 This week we ll talk about the data plane. Recall that the routing layer broadly consists of two parts: (1) the control plane that computes routes
More informationMemC3: Compact and Concurrent MemCache with Dumber Caching and Smarter Hashing
To appear in Proceedings of the 1th USENIX Symposium on Networked Systems Design and Implementation (NSDI 13), Lombard, IL, April 213 MemC3: Compact and Concurrent MemCache with Dumber Caching and Smarter
More informationData Structures for IPv6 Network Traffic Analysis Using Sets and Bags. John McHugh, Ulfar Erlingsson
Data Structures for IPv6 Network Traffic Analysis Using Sets and Bags John McHugh, Ulfar Erlingsson The nature of the problem IPv4 has 2 32 possible addresses, IPv6 has 2 128. IPv4 sets can be realized
More informationA Concise Forwarding Information Base for Scalable and Fast Name Lookups
A Concise Forwarding Information Base for Scalable and Fast Name Lookups Ye Yu, Djamal Belazzougui, Chen Qian, and Qin Zhang University of Kentucky; CERIST; University of California, Santa Cruz; Indiana
More informationTOWARDS FAST IP FORWARDING
TOWARDS FAST IP FORWARDING IP FORWARDING PERFORMANCE IMPROVEMENT AND MEASUREMENT IN FREEBSD Nanako Momiyama Keio University 25th September 2016 EuroBSDcon 2016 OUTLINE Motivation Design and implementation
More informationBe Fast, Cheap and in Control with SwitchKV Xiaozhou Li
Be Fast, Cheap and in Control with SwitchKV Xiaozhou Li Raghav Sethi Michael Kaminsky David G. Andersen Michael J. Freedman Goal: fast and cost-effective key-value store Target: cluster-level storage for
More informationBloom Filters. References:
Bloom Filters References: Li Fan, Pei Cao, Jussara Almeida, Andrei Broder, Summary Cache: A Scalable Wide-Area Web Cache Sharing Protocol, IEEE/ACM Transactions on Networking, Vol. 8, No. 3, June 2000.
More informationCuckoo Filter-Based Location-Privacy Preservation in Database-Driven Cognitive Radio Networks
Cuckoo Filter-Based Location-Privacy Preservation in Database-Driven Cognitive Radio Networks Mohamed Grissa, Attila A Yavuz, and Bechir Hamdaoui Oregon State University, grissam,yavuza,hamdaoub@onidoregonstateedu
More informationA Comparison of Performance and Accuracy of Measurement Algorithms in Software
A Comparison of Performance and Accuracy of Measurement Algorithms in Software Omid Alipourfard, Masoud Moshref 1, Yang Zhou 2, Tong Yang 2, Minlan Yu 3 Yale University, Barefoot Networks 1, Peking University
More informationCS5112: Algorithms and Data Structures for Applications
CS5112: Algorithms and Data Structures for Applications Lecture 4.1: Applications of hashing Ramin Zabih Some figures from Wikipedia/Google image search Administrivia Web site is: https://github.com/cornelltech/cs5112-f18
More informationJinho Hwang (IBM Research) Wei Zhang, Timothy Wood, H. Howie Huang (George Washington Univ.) K.K. Ramakrishnan (Rutgers University)
Jinho Hwang (IBM Research) Wei Zhang, Timothy Wood, H. Howie Huang (George Washington Univ.) K.K. Ramakrishnan (Rutgers University) Background: Memory Caching Two orders of magnitude more reads than writes
More informationFAWN. A Fast Array of Wimpy Nodes. David Andersen, Jason Franklin, Michael Kaminsky*, Amar Phanishayee, Lawrence Tan, Vijay Vasudevan
FAWN A Fast Array of Wimpy Nodes David Andersen, Jason Franklin, Michael Kaminsky*, Amar Phanishayee, Lawrence Tan, Vijay Vasudevan Carnegie Mellon University *Intel Labs Pittsburgh Energy in computing
More informationHash Table Design and Optimization for Software Virtual Switches
Hash Table Design and Optimization for Software Virtual Switches P R E S E N T E R : R E N WA N G Y I P E N G WA N G, S A M E H G O B R I E L, R E N WA N G, C H A R L I E TA I, C R I S T I A N D U M I
More informationDifference Bloom Filter: a Probabilistic Structure for Multi-set Membership Query
Difference Bloom Filter: a Probabilistic Structure for Multi-set Membership Query Dongsheng Yang, Deyu Tian, Junzhi Gong, Siang Gao, Tong Yang, Xiaoming Li Department of Computer Secience, Peking University,
More informationADT asscociafve array INSERT, SEARCH, DELETE. Advanced Algorithmics (6EAP) Some reading MTAT Hashing. Jaak Vilo.
Jaak Vilo Advanced Algorithmics (6EAP) MTAT.03.238 Hashing Jaak Vilo 2013 Spring 1 ADT asscociafve array INSERT, SEARCH, DELETE An associa6ve array (also associa6ve container, map, mapping, dic6onary,
More informationChapter 5 Hashing. Introduction. Hashing. Hashing Functions. hashing performs basic operations, such as insertion,
Introduction Chapter 5 Hashing hashing performs basic operations, such as insertion, deletion, and finds in average time 2 Hashing a hash table is merely an of some fixed size hashing converts into locations
More informationOutline for Today. Towards Perfect Hashing. Cuckoo Hashing. The Cuckoo Graph. Analysis of Cuckoo Hashing. Reducing worst-case bounds
Cuckoo Hashing Outline for Today Towards Perfect Hashing Reducing worst-case bounds Cuckoo Hashing Hashing with worst-case O(1) lookups. The Cuckoo Graph A framework for analyzing cuckoo hashing. Analysis
More informationHashing Based Dictionaries in Different Memory Models. Zhewei Wei
Hashing Based Dictionaries in Different Memory Models Zhewei Wei Outline Introduction RAM model I/O model Cache-oblivious model Open questions Outline Introduction RAM model I/O model Cache-oblivious model
More information"Charting the Course... MOC A Planning, Deploying and Managing Microsoft Forefront TMG Course Summary
Description Course Summary The goal of this three-day instructor-led course is to provide students with the knowledge and skills necessary to effectively plan, deploy and manage Microsoft Forefront Threat
More informationFAWN: A Fast Array of Wimpy Nodes
FAWN: A Fast Array of Wimpy Nodes David G. Andersen, Jason Franklin, Michael Kaminsky *, Amar Phanishayee, Lawrence Tan, Vijay Vasudevan Carnegie Mellon University, * Intel Labs SOSP 09 CAS ICT Storage
More informationIntroduction. hashing performs basic operations, such as insertion, better than other ADTs we ve seen so far
Chapter 5 Hashing 2 Introduction hashing performs basic operations, such as insertion, deletion, and finds in average time better than other ADTs we ve seen so far 3 Hashing a hash table is merely an hashing
More informationSection 8: Monomials and Radicals
In this section, we are going to learn skills for: NGSS Standards MA.912.A.4.1 Simplify monomials and monomial expressions using the laws of integral exponents. MA.912.A.6.1 Simplify radical expressions.
More informationADT asscociafve array. Advanced Algorithmics (6EAP) Some reading INSERT, SEARCH, DELETE. MTAT Hashing. Jaak Vilo.
Jaak Vilo Advanced Algorithmics (6EAP) MTAT.03.238 Hashing Jaak Vilo 2011 Spring 1 ADT asscociafve array INSERT, SEARCH, DELETE An associa*ve array (also associa*ve container, map, mapping, dic*onary,
More informationHashing and sketching
Hashing and sketching 1 The age of big data An age of big data is upon us, brought on by a combination of: Pervasive sensing: so much of what goes on in our lives and in the world at large is now digitally
More informationarxiv: v2 [cs.ds] 11 Oct 2017
Adaptive Cuckoo Filters arxiv:1704.06818v2 [cs.ds] 11 Oct 2017 Michael Mitzenmacher 1, Salvatore Pontarelli 2, and Pedro Reviriego 3 1 Harvard University, 33 Oxford Street, Cambridge, MA 02138, USA., michaelm@eecs.harvard.edu
More informationOverview. Implementing Gigabit Routers with NetFPGA. Basic Architectural Components of an IP Router. Per-packet processing in an IP Router
Overview Implementing Gigabit Routers with NetFPGA Prof. Sasu Tarkoma The NetFPGA is a low-cost platform for teaching networking hardware and router design, and a tool for networking researchers. The NetFPGA
More informationDesign of a Near-Minimal Dynamic Perfect Hash Function on Embedded Device
Design of a Near-Minimal Dynamic Perfect Hash Function on Embedded Device Derek Pao, Xing Wang and Ziyan Lu Department of Electronic Engineering, City University of Hong Kong, HONG KONG E-mail: d.pao@cityu.edu.hk,
More informationA Distributed Hash Table for Shared Memory
A Distributed Hash Table for Shared Memory Wytse Oortwijn Formal Methods and Tools, University of Twente August 31, 2015 Wytse Oortwijn (Formal Methods and Tools, AUniversity Distributed of Twente) Hash
More informationCuckoo Cache a Technique to Improve Flow Monitoring Throughput
This article has been accepted for publication in IEEE Internet Computing but has not yet been fully edited. Some content may change prior to final publication. Cuckoo Cache a Technique to Improve Flow
More informationLecture 16: Router Design
Lecture 16: Router Design CSE 123: Computer Networks Alex C. Snoeren Eample courtesy Mike Freedman Lecture 16 Overview End-to-end lookup and forwarding example Router internals Buffering Scheduling 2 Example:
More informationModule 5: Hashing. CS Data Structures and Data Management. Reza Dorrigiv, Daniel Roche. School of Computer Science, University of Waterloo
Module 5: Hashing CS 240 - Data Structures and Data Management Reza Dorrigiv, Daniel Roche School of Computer Science, University of Waterloo Winter 2010 Reza Dorrigiv, Daniel Roche (CS, UW) CS240 - Module
More informationRBF: A New Storage Structure for Space- Efficient Queries for Multidimensional Metadata in OSS
RBF: A New Storage Structure for Space- Efficient Queries for Multidimensional Metadata in OSS Yu Hua 1, Dan Feng 1, Hong Jiang 2, Lei Tian 1 1 School of Computer, Huazhong University of Science and Technology,
More informationScalable Concurrent Hash Tables via Relativistic Programming
Scalable Concurrent Hash Tables via Relativistic Programming Josh Triplett September 24, 2009 Speed of data < Speed of light Speed of light: 3e8 meters/second Processor speed: 3 GHz, 3e9 cycles/second
More informationSILT: A MEMORY-EFFICIENT, HIGH-PERFORMANCE KEY- VALUE STORE PRESENTED BY PRIYA SRIDHAR
SILT: A MEMORY-EFFICIENT, HIGH-PERFORMANCE KEY- VALUE STORE PRESENTED BY PRIYA SRIDHAR AGENDA INTRODUCTION Why SILT? MOTIVATION SILT KV STORAGE SYSTEM LOW READ AMPLIFICATION CONTROLLABLE WRITE AMPLIFICATION
More informationReliably Scalable Name Prefix Lookup
Reliably Scalable Name Prefix Lookup Haowei Yuan & Patrick Crowley Washington University Department of Computer Science & Engineering St. Louis, Missouri 63130 {hyuan, pcrowley}@wustl.edu ABSTRACT Name
More informationNear Memory Key/Value Lookup Acceleration MemSys 2017
Near Key/Value Lookup Acceleration MemSys 2017 October 3, 2017 Scott Lloyd, Maya Gokhale Center for Applied Scientific Computing This work was performed under the auspices of the U.S. Department of Energy
More information2-3 Cuckoo Filters for Faster Triangle Listing and Set Intersection
2-3 Cuckoo Filters for Faster Triangle Listing and Set Intersection David Eppstein 1, Michael T. Goodrich 1, Michael Mitzenmacher 2, and Manuel Torres 1 1 University of California, Irvine 2 Harvard University
More informationFast Lookup: Hash tables
CSE 100: HASHING Operations: Find (key based look up) Insert Delete Fast Lookup: Hash tables Consider the 2-sum problem: Given an unsorted array of N integers, find all pairs of elements that sum to a
More informationAn Enhanced Bloom Filter for Longest Prefix Matching
An Enhanced Bloom Filter for Longest Prefix Matching Gahyun Park SUNY-Geneseo Email: park@geneseo.edu Minseok Kwon Rochester Institute of Technology Email: jmk@cs.rit.edu Abstract A Bloom filter is a succinct
More informationTUPLE PRUNING USING BLOOM FILTERS FOR PACKET CLASSIFICATION
... TUPLE PRUNING USING BLOOM FILTERS FOR PACKET CLASSIFICATION... TUPLE PRUNING FOR PACKET CLASSIFICATION PROVIDES FAST SEARCH AND A LOW IMPLEMENTATION COMPLEXITY. THE TUPLE PRUNING ALGORITHM REDUCES
More informationWrite-Optimized and High-Performance Hashing Index Scheme for Persistent Memory
Write-Optimized and High-Performance Hashing Index Scheme for Persistent Memory Pengfei Zuo, Yu Hua, and Jie Wu, Huazhong University of Science and Technology https://www.usenix.org/conference/osdi18/presentation/zuo
More informationFAWN as a Service. 1 Introduction. Jintian Liang CS244B December 13, 2017
Liang 1 Jintian Liang CS244B December 13, 2017 1 Introduction FAWN as a Service FAWN, an acronym for Fast Array of Wimpy Nodes, is a distributed cluster of inexpensive nodes designed to give users a view
More informationCounting with TinyTable: Every Bit Counts!
Counting with TinyTable: Every it Counts! Gil Einziger Roy Friedman Computer Science Department, Technion Haifa 32000, Israel {gilga,roy}@cs.technion.ac.il Abstract Counting loom filters CF and their variants
More informationModule 3: Hashing Lecture 9: Static and Dynamic Hashing. The Lecture Contains: Static hashing. Hashing. Dynamic hashing. Extendible hashing.
The Lecture Contains: Hashing Dynamic hashing Extendible hashing Insertion file:///c /Documents%20and%20Settings/iitkrana1/My%20Documents/Google%20Talk%20Received%20Files/ist_data/lecture9/9_1.htm[6/14/2012
More informationIX: A Protected Dataplane Operating System for High Throughput and Low Latency
IX: A Protected Dataplane Operating System for High Throughput and Low Latency Belay, A. et al. Proc. of the 11th USENIX Symp. on OSDI, pp. 49-65, 2014. Reviewed by Chun-Yu and Xinghao Li Summary In this
More informationCS3600 SYSTEMS AND NETWORKS
CS3600 SYSTEMS AND NETWORKS NORTHEASTERN UNIVERSITY Lecture 15: Networking overview Prof. (amislove@ccs.neu.edu) What is a network? 2 What is a network? Wikipedia: A telecommunications network is a network
More informationCSE 5311 Notes 5: Hashing
CSE 5311 Notes 5: Hashing (Last updated 2/18/18 1:33 PM) CLRS, Chapter 11 Review: 11.2: Chaining - related to perfect hashing method 11.3: Hash functions, skim universal hashing (aside: https://dl-acm-org.ezproy.uta.edu/citation.cfm?doid=3116227.3068772
More informationLast Lecture: Network Layer
Last Lecture: Network Layer 1. Design goals and issues 2. Basic Routing Algorithms & Protocols 3. Addressing, Fragmentation and reassembly 4. Internet Routing Protocols and Inter-networking 5. Router design
More informationChí Cao Minh 28 May 2008
Chí Cao Minh 28 May 2008 Uniprocessor systems hitting limits Design complexity overwhelming Power consumption increasing dramatically Instruction-level parallelism exhausted Solution is multiprocessor
More informationrecruitment Logo Typography Colourways Mechanism Usage Pip Recruitment Brand Toolkit
Logo Typography Colourways Mechanism Usage Primary; Secondary; Silhouette; Favicon; Additional Notes; Where possible, use the logo with the striped mechanism behind. Only when it is required to be stripped
More informationCS419: Computer Networks. Lecture 6: March 7, 2005 Fast Address Lookup:
: Computer Networks Lecture 6: March 7, 2005 Fast Address Lookup: Forwarding/Routing Revisited Best-match Longest-prefix forwarding table lookup We looked at the semantics of bestmatch longest-prefix address
More informationStochastic Pre-Classification for SDN Data Plane Matching
Stochastic Pre-Classification for SDN Data Plane Matching Luke McHale, C. Jasson Casey, Paul V. Gratz, Alex Sprintson Presenter: Luke McHale Ph.D. Student, Texas A&M University Contact: luke.mchale@tamu.edu
More informationHow Cuckoo Filter Can Improve Existing Approximate Matching Techniques
How Cuckoo Filter Can Improve Existing Approximate Matching Techniques Vikas Gupta 1 and Frank Breitinger 2 1 Netskope, Inc. vikasgupta.nit@gmail.com 2 Cyber Forensics Research and Education Group (UNHcFREG)
More informationSILT: A Memory-Efficient, High- Performance Key-Value Store
SILT: A Memory-Efficient, High- Performance Key-Value Store SOSP 11 Presented by Fan Ni March, 2016 SILT is Small Index Large Tables which is a memory efficient high performance key value store system
More informationPEARL. Programmable Virtual Router Platform Enabling Future Internet Innovation
PEARL Programmable Virtual Router Platform Enabling Future Internet Innovation Hongtao Guan Ph.D., Assistant Professor Network Technology Research Center Institute of Computing Technology, Chinese Academy
More informationSystem and Algorithmic Adaptation for Flash
System and Algorithmic Adaptation for Flash The FAWN Perspective David G. Andersen, Vijay Vasudevan, Michael Kaminsky* Amar Phanishayee, Jason Franklin, Iulian Moraru, Lawrence Tan Carnegie Mellon University
More informationHashCache: Cache Storage for the Next Billion
HashCache: Cache Storage for the Next Billion Anirudh Badam KyoungSoo Park Vivek S. Pai Larry L. Peterson Princeton University 1 Next Billion Internet Users 2 Next Billion Internet Users Schools, urban
More information===Lab Info=== * 90 points. * Due 11:59pm on Sunday, 9/20/2015 for Monday and Wednesday lab. ==Assignment==
===Lab Info=== * 90 points * Due 11:59pm on Sunday, 9/20/2015 for Monday and Wednesday lab ==Assignment== In this assignment, you will work on closed hashing with open addressing. You are to read in the
More informationSCALING SOFTWARE DEFINED NETWORKS. Chengyu Fan (edited by Lorenzo De Carli)
SCALING SOFTWARE DEFINED NETWORKS Chengyu Fan (edited by Lorenzo De Carli) Introduction Network management is driven by policy requirements Network Policy Guests must access Internet via web-proxy Web
More informationBloom Filters via d-left Hashing and Dynamic Bit Reassignment Extended Abstract
Bloom Filters via d-left Hashing and Dynamic Bit Reassignment Extended Abstract Flavio Bonomi 1 Michael Mitzenmacher 2 Rina Panigrahy 3 Sushil Singh 4 George Varghese 5 Abstract In recent work, the authors
More informationLecture 11: Perfect Hashing
Lecture 11: Perfect Hashing COSC242: Algorithms and Data Structures Brendan McCane Department of Computer Science, University of Otago Hashing Review The point of hashing is to get fast (O(1) on average)
More informationColoring Embedder: a Memory Efficient Data Structure for Answering Multi-set Query
Coloring Embedder: a Memory Efficient Data Structure for Answering Multi-set Query Tong Yang, Dongsheng Yang, Jie Jiang, Siang Gao, Bin Cui, Lei Shi, Xiaoming Li Department of Computer Science, Peking
More informationTowards High-performance Flow-level level Packet Processing on Multi-core Network Processors
Towards High-performance Flow-level level Packet Processing on Multi-core Network Processors Yaxuan Qi (presenter), Bo Xu, Fei He, Baohua Yang, Jianming Yu and Jun Li ANCS 2007, Orlando, USA Outline Introduction
More informationHashTable CISC5835, Computer Algorithms CIS, Fordham Univ. Instructor: X. Zhang Fall 2018
HashTable CISC5835, Computer Algorithms CIS, Fordham Univ. Instructor: X. Zhang Fall 2018 Acknowledgement The set of slides have used materials from the following resources Slides for textbook by Dr. Y.
More informationParallel Peeling Algorithms. Justin Thaler, Yahoo Labs Joint Work with: Michael Mitzenmacher, Harvard University Jiayang Jiang
Parallel Peeling Algorithms Justin Thaler, Yahoo Labs Joint Work with: Michael Mitzenmacher, Harvard University Jiayang Jiang The Peeling Paradigm Many important algorithms for a wide variety of problems
More informationParallel Peeling Algorithms. Justin Thaler, Yahoo Labs Joint Work with: Michael Mitzenmacher, Harvard University Jiayang Jiang
Parallel Peeling Algorithms Justin Thaler, Yahoo Labs Joint Work with: Michael Mitzenmacher, Harvard University Jiayang Jiang The Peeling Paradigm Many important algorithms for a wide variety of problems
More informationCSC 207 (16fa) Practice Exam 2 Page 1 of 13. Practice Exam 2 (Exam 2 date: Friday 11/18) Logistics:
CSC 207 (16fa) Practice Exam 2 Page 1 of 13 Practice Exam 2 (Exam 2 date: Friday 11/18) Logistics: In-class, written exam. Closed-book, notes, and technology. All relevant Java API documentation will be
More informationSummary Cache based Co-operative Proxies
Summary Cache based Co-operative Proxies Project No: 1 Group No: 21 Vijay Gabale (07305004) Sagar Bijwe (07305023) 12 th November, 2007 1 Abstract Summary Cache based proxies cooperate behind a bottleneck
More informationTopic HashTable and Table ADT
Topic HashTable and Table ADT Hashing, Hash Function & Hashtable Search, Insertion & Deletion of elements based on Keys So far, By comparing keys! Linear data structures Non-linear data structures Time
More informationParallel Numerical Algorithms
Parallel Numerical Algorithms http://sudalab.is.s.u-tokyo.ac.jp/~reiji/pna16/ [ 9 ] Shared Memory Performance Parallel Numerical Algorithms / IST / UTokyo 1 PNA16 Lecture Plan General Topics 1. Architecture
More informationSOFTWARE DEFINED NETWORKS. Jonathan Chu Muhammad Salman Malik
SOFTWARE DEFINED NETWORKS Jonathan Chu Muhammad Salman Malik Credits Material Derived from: Rob Sherwood, Saurav Das, Yiannis Yiakoumis AT&T Tech Talks October 2010 (available at:www.openflow.org/wk/images/1/17/openflow_in_spnetworks.ppt)
More informationFast and Space-Efficient Maps: Shrinking Big Data Down to Size. Prashant Pandey Stony Brook University, NY
Fast and Space-Efficient Maps: Shrinking Big Data Down to Size Prashant Pandey Stony Brook University, NY Sequence Read Archive (SRA) database growth Petabyte Terabytes Sequence Read Archive (SRA) database
More informationThe dictionary problem
6 Hashing The dictionary problem Different approaches to the dictionary problem: previously: Structuring the set of currently stored keys: lists, trees, graphs,... structuring the complete universe of
More informationSearching for Shared Resources: DHT in General
1 ELT-53206 Peer-to-Peer Networks Searching for Shared Resources: DHT in General Mathieu Devos Tampere University of Technology Department of Electronics and Communications Engineering Based on the original
More informationExtensible Network Security Services on Software Programmable Router OS. David Yau, Prem Gopalan, Seung Chul Han, Feng Liang
Extensible Network Security Services on Software Programmable Router OS David Yau, Prem Gopalan, Seung Chul Han, Feng Liang System Software and Architecture Lab Department of Computer Sciences Purdue University
More informationProblem. Context. Hash table
Problem In many problems, it is natural to use Hash table as their data structures. How can the hash table be efficiently accessed among multiple units of execution (UEs)? Context Hash table is used when
More informationCMSC 341 Lecture 16/17 Hashing, Parts 1 & 2
CMSC 341 Lecture 16/17 Hashing, Parts 1 & 2 Prof. John Park Based on slides from previous iterations of this course Today s Topics Overview Uses and motivations of hash tables Major concerns with hash
More informationThe Dynamic Cuckoo Filter
The Dynamic Cuckoo Filter Hanhua Chen, Liangyi Liao, Hai Jin, Jie Wu Services Computing Technology and System Lab Cluster and Grid Computing Lab Big Data Technology and System Lab School of Computer Science
More informationSearching for Shared Resources: DHT in General
1 ELT-53207 P2P & IoT Systems Searching for Shared Resources: DHT in General Mathieu Devos Tampere University of Technology Department of Electronics and Communications Engineering Based on the original
More informationRocker: switchdev prototyping vehicle
Rocker: switchdev prototyping vehicle Scott Feldman Somewhere in Oregon, USA sfeldma@gmail.com Abstract Rocker is an emulated network switch platform created to accelerate development of an in kernel network
More informationLecture 9 - Virtual Memory
CS 152 Computer Architecture and Engineering Lecture 9 - Virtual Memory Dr. George Michelogiannakis EECS, University of California at Berkeley CRD, Lawrence Berkeley National Laboratory http://inst.eecs.berkeley.edu/~cs152
More informationdebgr: An Efficient and Near-Exact Representation of the Weighted de Bruijn Graph Prashant Pandey Stony Brook University, NY, USA
debgr: An Efficient and Near-Exact Representation of the Weighted de Bruijn Graph Prashant Pandey Stony Brook University, NY, USA De Bruijn graphs are ubiquitous [Pevzner et al. 2001, Zerbino and Birney,
More informationHash Tables Outline. Definition Hash functions Open hashing Closed hashing. Efficiency. collision resolution techniques. EECS 268 Programming II 1
Hash Tables Outline Definition Hash functions Open hashing Closed hashing collision resolution techniques Efficiency EECS 268 Programming II 1 Overview Implementation style for the Table ADT that is good
More informationImproving STM Performance with Transactional Structs 1
Improving STM Performance with Transactional Structs 1 Ryan Yates and Michael L. Scott University of Rochester IFL, 8-31-2016 1 This work was funded in part by the National Science Foundation under grants
More information