Codes, Bloom Filters, and Overlay Networks. Michael Mitzenmacher

Similar documents
1 Reception efficiency

Bloom Filters and its Variants

All About Erasure Codes: - Reed-Solomon Coding - LDPC Coding. James S. Plank. ICL - August 20, 2004

CSCI-1680 Link Layer Reliability Rodrigo Fonseca

Efficient Content Delivery and Low Complexity Codes. Amin Shokrollahi

Lecture 4 September 17

IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 12, NO. 5, OCTOBER Informed Content Delivery Across Adaptive Overlay Networks

Informed Content Delivery Across Adaptive Overlay Networks

Lecture 4: CRC & Reliable Transmission. Lecture 4 Overview. Checksum review. CRC toward a better EDC. Reliable Transmission

P2P Dissemination / ALM

Takes a file of k packets, encodes it into n

A Digital Fountain Approach to Asynchronous Reliable Multicast

CSCI-1680 Link Layer I Rodrigo Fonseca

Summary of Raptor Codes

Networked Systems and Services, Fall 2018 Chapter 2. Jussi Kangasharju Markku Kojo Lea Kutvonen

Communications Software. CSE 123b. CSE 123b. Spring Lecture 3: Reliable Communications. Stefan Savage. Some slides couresty David Wetherall

Lec 19 - Error and Loss Control

CSCI-1680 Link Layer Reliability John Jannotti

Searching for Shared Resources: DHT in General

Parallel Peeling Algorithms. Justin Thaler, Yahoo Labs Joint Work with: Michael Mitzenmacher, Harvard University Jiayang Jiang

Parallel Peeling Algorithms. Justin Thaler, Yahoo Labs Joint Work with: Michael Mitzenmacher, Harvard University Jiayang Jiang

Today s Papers. Array Reliability. RAID Basics (Two optional papers) EECS 262a Advanced Topics in Computer Systems Lecture 3

Searching for Shared Resources: DHT in General

Measurement Based Routing Strategies on Overlay Architectures

Content Overlays (continued) Nick Feamster CS 7260 March 26, 2007

Advanced Data Types and New Applications

Lecture 7: Sliding Windows. CSE 123: Computer Networks Geoff Voelker (guest lecture)

Compressed Bloom Filters

Midterm Exam 2B Answer key

Bloom Filters. References:

Bloom Filter for Network Security Alex X. Liu & Haipeng Dai

CompSci 356: Computer Network Architectures Lecture 21: Overlay Networks Chap 9.4. Xiaowei Yang

Coding theory for scalable media delivery

DoS Attacks. Network Traceback. The Ultimate Goal. The Ultimate Goal. Overview of Traceback Ideas. Easy to launch. Hard to trace.

Error Detection Codes. Error Detection. Two Dimensional Parity. Internet Checksum Algorithm. Cyclic Redundancy Check.

CSE 123: Computer Networks Alex C. Snoeren. HW 1 due NOW!

Overview Content Delivery Computer Networking Lecture 15: The Web Peter Steenkiste. Fall 2016

Optimizing Joint Erasure- and Error-Correction Coding for Wireless Packet Transmissions

Overlay and P2P Networks. Introduction and unstructured networks. Prof. Sasu Tarkoma

Mobile Routing : Computer Networking. Overview. How to Handle Mobile Nodes? Mobile IP Ad-hoc network routing Assigned reading

CSC310 Information Theory. Lecture 21: Erasure (Deletion) Channels & Digital Fountain Codes. November 22, 2006 see

The TokuFS Streaming File System

Fair Queueing. Presented by Brighten Godfrey. Slides thanks to Ion Stoica (UC Berkeley) with slight adaptation by Brighten Godfrey

Multimedia Networking

Hashing and sketching

CSC310 Information Theory. Lecture 22: Erasure (Deletion) Channels & Digital Fountain Codes. November 30, 2005 see

Error-Correcting Codes

COMP/ELEC 429/556 Introduction to Computer Networks

4.1 COMPUTATIONAL THINKING AND PROBLEM-SOLVING

Client Level Framework for Parallel Downloading of Large File Systems

Raptor Codes for P2P Streaming

COOCHING: Cooperative Prefetching Strategy for P2P Video-on-Demand System

Networking Link Layer

BATS: Achieving the Capacity of Networks with Packet Loss

CS5112: Algorithms and Data Structures for Applications

Content distribution networks

Chapter 8 LOCATION SERVICES

What is the relationship between a domain name (e.g., youtube.com) and an IP address?

Module 15: Network Structures

Approximately Uniform Random Sampling in Sensor Networks

Carnegie Mellon Computer Science Department Spring 2015 Midterm Exam

Federated Array of Bricks Y Saito et al HP Labs. CS 6464 Presented by Avinash Kulkarni

Fast Approximate Reconciliation of Set Differences

Distributed Meta-data Servers: Architecture and Design. Sarah Sharafkandi David H.C. Du DISC

L3S Research Center, University of Hannover

A Digital Fountain Approach to Reliable Distribution of Bulk Data

Some portions courtesy Srini Seshan or David Wetherall

MergeSort, Recurrences, Asymptotic Analysis Scribe: Michael P. Kim Date: April 1, 2015

PRESENTED BY SARAH KWAN NETWORK CODING

CSCI-1680 Transport Layer III Congestion Control Strikes Back Rodrigo Fonseca

A Digital Fountain Approach to Reliable Distribution of Bulk Data

Tracking Frequent Items Dynamically: What s Hot and What s Not To appear in PODS 2003

Congestion Control In the Network

CS 344/444 Computer Network Fundamentals Final Exam Solutions Spring 2007

Broadcasting with Hard Deadlines in Wireless Multi-hop Networks Using Network Coding

1 Probability Review. CS 124 Section #8 Hashing, Skip Lists 3/20/17. Expectation (weighted average): the expectation of a random quantity X is:

MULTI-LAYER VIDEO STREAMING WITH HELPER NODES USING NETWORK CODING

Distributed System Chapter 16 Issues in ch 17, ch 18

Research Students Lecture Series 2015

Purity: building fast, highly-available enterprise flash storage from commodity components

In examining performance Interested in several things Exact times if computable Bounded times if exact not computable Can be measured

HP AutoRAID (Lecture 5, cs262a)

Analysis of algorithms

PartialSync: Efficient Synchronization of a Partial Namespace in NDN

Unicasts, Multicasts and Broadcasts

CIS 121 Data Structures and Algorithms with Java Spring 2018

CS370 Operating Systems

Scalable Enterprise Networks with Inexpensive Switches

CS132/EECS148 - Instructor: Karim El Defrawy Midterm Spring 2013 Time: 1hour May 2nd, 2013

Network Architecture

ITTC High-Performance Networking The University of Kansas EECS 881 Architecture and Topology

Virtual Memory. CSCI 315 Operating Systems Design Department of Computer Science

CS5412 CLOUD COMPUTING: PRELIM EXAM Open book, open notes. 90 minutes plus 45 minutes grace period, hence 2h 15m maximum working time.

Lecture 19. Lecturer: Aleksander Mądry Scribes: Chidambaram Annamalai and Carsten Moldenhauer

Topic Notes: Building Memory

Distributed Systems. 21. Content Delivery Networks (CDN) Paul Krzyzanowski. Rutgers University. Fall 2018

Move-to-front algorithm

CS November 2018

Last Lecture: Network Layer

CHAPTER 4 BLOOM FILTER

Transcription:

Codes, Bloom Filters, and Overlay Networks Michael Mitzenmacher 1

Today... Erasure codes Digital Fountain Bloom Filters Summary Cache, Compressed Bloom Filters Informed Content Delivery Combining the two Other Recent Work 2

Codes: High Level Idea Everyone thinks of data as an ordered stream. I need packets 1-1,000. Using codes, data is like water: You don t care what drops you get. You don t care if some spills. You just want enough to get through the pipe. I need 1,000 packets. 3

Erasure Codes n cn n n Message Encoding Received Message Encoding Algorithm Transmission Decoding Algorithm 4

Application: Trailer Distribution Problem Millions of users want to download a new movie trailer. 32 megabyte file, at 56 Kbits/second. Download takes around 75 minutes at full speed. 5

Point-to-Point Solution Features Good Users can initiate the download at their discretion. Users can continue download seamlessly after temporary interruption. Moderate packet loss is not a problem. Bad High server load. High network load. Doesn t scale well (without more resources). 6

Bad Broadcast Solution Features Users cannot initiate the download at their discretion. Users cannot continue download seamlessly after temporary interruption. Packet loss is a problem. Good Low server load. Low network load. Does scale well. 7

A Coding Solution: Assumptions We can take a file of n packets, and encode it into cn encoded packets. From any set of n encoded packets, the original message can be decoded. 8

Coding Solution 5 hours Encoding Encoding Copy 2 4 hours 3 hours File 2 hours Encoding Copy 1 1 hour Transmission User 1 Reception User 2 Reception 0 hours 9

Coding Solution Features Users can initiate the download at their discretion. Users can continue download seamlessly after temporary interruption. Moderate packet loss is not a problem. Low server load - simple protocol. Does scale well. Low network load. 10

So, Why Aren t We Using This... Encoding and decoding are slow for large files -- especially decoding. So we need fast codes to use a coding scheme. We may have to give something up for fast codes... 11

Performance Measures Time Overhead The time to encode and decode expressed as a multiple of the encoding length. Reception efficiency Ratio of packets in message to packets needed to decode. Optimal is 1. 12

Reception Efficiency Optimal Can decode from any n words of encoding. Reception efficiency is 1. Relaxation Decode from any (1+ε) n words of encoding Reception efficiency is 1/(1+ε). 13

Parameters of the Code Message n cn Encoding (1+ε)n Reception efficiency is 1/(1+ε) 14

Previous Work Reception efficiency is 1. Standard Reed-Solomon Time overhead is number of redundant packets. Uses finite field operations. Fast Fourier-based Time overhead is ln 2 n field operations. Reception efficiency is 1/(1+ε). Random mixed-length linear equations Time overhead is ln(1/ε)/ε. 15

Tornado Code Performance Reception efficiency is 1/(1+ε). Time overhead is ln(1/ε). Simple, fast, and practical. 16

Codes: Other Applications? Using codes, data is like water. What more can you do with this idea? Example --Parallel downloads: Get data from multiple sources, without the need for co-ordination. 17

Latest Improvements Practical problem with Tornado code: encoding length Must decide a priori -- what is right? Encoding/decoding time/memory proportional to encoded length. Luby transform: Encoding produced on-the-fly -- no encoding length. Encoding/decoding time/memory proportional to message length. 18

Coding Solution 5 hours 4 hours 3 hours File Encoding 2 hours 1 hour Transmission User 1 Reception User 2 Reception 19 0 hours

Bloom Filters: High Level Idea Everyone thinks they need to know exactly what everyone else has. Give me a list of what you have. Lists are long and unwieldy. Using Bloom filters, you can get small, approximate lists. Give me information so I can figure out what you have. 20

Lookup Problem Given a set S = {x 1,x 2,x 3, x n } on a universe U, want to answer queries of the form: Is y S. Example: a set of URLs from the universe of all possible URL strings. Bloom filter provides an answer in Constant time (time to hash). Small amount of space. But with some probability of being wrong. 21

Bloom Filters B B B Start with an m bit array, filled with 0s. 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Hash each item x j in S k times. If H i (x j ) = a, set B[a] = 1. 0 1 0 0 1 0 1 0 0 1 1 1 0 1 1 0 To check if y is in S, check B at H i (y). All k values must be 1. 0 1 0 0 1 0 1 0 0 1 1 1 0 1 1 0 Possible to have a false positive; all k values are 1, but y is not in S. B 0 1 0 0 1 0 1 0 0 1 1 1 0 1 1 0 22

Errors Assumption: We have good hash functions, look random. Given m bits for filter and n elements, choose number k of hash functions to minimize false positives: Let Let p f = = Pr[cell is Pr[false pos] empty] = (1 = (1 1/ m) As k increases, more chances to find a 0, but more 1 s in the array. p) Find optimal at k = (ln 2)m/n by calculus. k kn (1 e e kn / m kn / m ) k 23

Example False positive rate 0.1 0.09 0.08 0.07 0.06 0.05 0.04 0.03 0.02 0.01 0 Opt k = 8ln2 = 5.45... 0 1 2 3 4 5 6 7 8 9 10 Hash functions m/n = 8 24

Bloom Filters: Distributed Systems Web Cache 1 Web Cache 2 Web Cache 3 Web Cache 4 Web Cache 5 Web Cache 6 Send Bloom filters of URLs. False positives do not hurt much. Get errors from cache changes anyway. 25

Tradeoffs Three parameters. Size m/n : bits per item. Time k : number of hash functions. Error f : false positive probability. 26

Compression Insight: Bloom filter is not just a data structure, it is also a message. If the Bloom filter is a message, worthwhile to compress it. Compressing bit vectors is easy. Arithmetic coding gets close to entropy. Can Bloom filters be compressed? 27

Optimization, then Compression Optimize to minimize false positive. p = Pr[cell is empty] = f = Pr[false pos] k ( m ln 2) / n Atk = m (ln 2) /n, p = 1/2. Bloom filter looks like a random string. Can t compress it. = (1 = is optimal (1 1/ m) p) k kn (1 e e kn / m kn / m ) k 28

Tradeoffs With compression, four parameters. Compressed (transmission) size z/n : bits per item. Decompressed (stored) size m/n : bits per item. Time k : number of hash functions. Error f : false positive probability. 29

Does Compression Help? Claim: transmission cost limiting factor. Updates happen frequently. Machine memory is cheap. Can we reduce false positive rate by Increasing decompressed size (storage). Keeping transmission cost constant. 30

Errors: Compressed Filter Assumption: optimal compressor, z = mh(p). H(p) is entropy function; optimally get H(p) compressed bits per original table bit. Arithmetic coding close to optimal. Optimization: Given z bits for compressed filter and n elements, choose table size m and number of hash functions k to minimize f. p e kn / m Optimal found by calculus. ; f (1 e kn / m ) k ; z mh ( p) 31

Example False positive rate 0.1 0.09 0.08 0.07 0.06 0.05 0.04 0.03 0.02 Original Compressed z/n = 8 0.01 0 0 1 2 3 4 5 6 7 8 9 10 Hash functions 32

Results Atk = m (ln 2) /n, false positives are maximized with a compressed Bloom filter. Best case without compression is worst case with compression; compression always helps. Side benefit: Use fewer hash functions with compression; possible speedup. 33

Examples Array bits per elt. m/n 8 14 92 Trans. Bits per elt. z/n 8 7.923 7.923 Hash functions k 6 2 1 False positive rate f 0.0216 0.0177 0.0108 Array bits per elt. m/n 16 28 48 Trans. Bits per elt. z/n 16 15.846 15.829 Hash functions k 11 4 3 False positive rate f 4.59E-04 3.14E-04 2.22E-04 Examples for bounded transmission size. 20-50% of false positive rate. Simulations very close. Small overhead, variation in compression. 34

Examples Array bits per elt. m/n 8 12.6 46 Trans. Bits per elt. z/n 8 7.582 6.891 Hash functions k 6 2 1 False positive rate f 0.0216 0.0216 0.0215 Array bits per elt. m/n 16 37.5 93 Trans. Bits per elt. z/n 16 14.666 13.815 Hash functions k 11 3 2 False positive rate f 4.59E-04 4.54E-04 4.53E-04 Examples with fixed false probability rate. 5-15% compression for transmission size. Matches simulations. 35

Bloom Filters: Other Applications? Finding objects Oceanstore : Object Location Geographical Region Summary Service Data summaries IP Traceback Reconciliation methods Coming up... 36

Putting it all Together: Informed Content Delivery on Overlay Networks To appear in SIGCOMM 2002. Joint work with John Byers, Jeff Considine, Stan Rost. 37

Informed Delivery: Basic Idea Reliable multicast uses tree networks. On an overlay/p2p network, there may be other bandwidth/communication paths available. But I need coordination to use it wisely. 38

Application: Movie Distribution Problem Millions of users want to download a new movie. Or a CDN wants to populate thousands of servers with a new movie for those users. Big file -- for people with lots of bandwidth. People will being using P2P networks. 39

Motivating Example 40

Our Argument In CDNs/P2Ps with ample bandwidth, performance will benefit from additional connections If intelligent in collaborating on how to utilize the bandwidth Assuming a pair of end-systems has not received exactly the same content, it should reconcile the differences in received content 41

It s a Mad, Mad, Mad World Challenges Native Internet Asynchrony of connections, disconnections Heterogeneity of speed, loss rates Enormous client population Preemptable sessions Transience of hosts, routers and links Adaptive overlays In reconfiguring topologies, exacerbate some of the above 42

Environmental Fluidity Requires Flexible Content Paradigms Expect frequent reconfiguration Need scalable migration, preemption support Digital fountain to the rescue Stateless: servers can produce encoded continuously Time-invariant in memoryless encoding Tolerance to client differences Additivity of fountains 43

Environmental Fluidity Produces Opportunities Opportunities for reconciliation Significant discrepancies between working sets of peers receiving identical content Receiver with higher transfer rate or having arrived earlier will have more content Receivers with uncorrelated losses will have gaps in different portions of their working sets Parallel downloads Ephemeral connections of adaptive overlay networks 44

Reconciliation Problem With standard sequential ordering, reconciliation is not (necessarily) a problem. Using coding, must reconcile over a potentially large, unordered universe of symbols (using Luby s improved codes). How to reconcile peers with partial content in an informed manner? 45

Approximate Reconciliation with Bloom Filters Send a (compressed) Bloom filter of encoding packets held. Respondent can start sending encoding packets you do not have. False positives not so important. Coding already gives redundancy. You want useful packets as quickly as possible. Bloom filters require a small number of packet. 46

Additional Work Coarse estimation of overlap in 1 packet. Using sampling. Using min-wise independent samples. Approximate reconciliation trees. Enhanced data structure for when the number of discrepancies is small. Also based on Bloom filters. Re-coding. Combining coded symbols. 47

Reconciliation: Other Applications Approximate vs. Exact Reconciliation Communication complexity. Practical uses: Databases, handhelds, etc. 48

Public Relations: Latest Research (1) A Dynamic Model for File Sizes and Double Pareto Distributions A generative model that explains the empirically observed shape of file sizes in file systems. Lognormal body, Pareto tail. Combines multiplicative models from probability theory with random graph models similar to recent work on Web graphs. 49

Public Relations: Latest Research (2) Load Balancing with Memory Throw n balls into n bins. Randomly: maximum load is log n / log log n Best of 2 choices: log log n / log 2 Suppose you get to remember the best possibility from the last throw. 1 Random choice, 1 memory: log log n / 2 log τ Queueing variations also analyzed. 50

Public Relations: Latest Research (3) Verification Codes Low-Density Parity-Check codes for large alphabets: e.g. 32-bit integers, and random errors. Simple, efficient codes. Linear time. Based on XORs. Performance: better than worst-case Reed- Solomon codes. Extended to additional error models (code scrambling). 51

Conclusions I m interested in network problems. There are lots of interesting problems out there. New techniques, algorithms, data structures New analyses Finding the right way to apply known ideas I d love to work with MIT students, too. 52