CIS 700/005 Networking Meets Databases

Similar documents
Goals. EECS 122: Introduction to Computer Networks Overlay Networks and P2P Networks. Solution. Overlay Networks: Motivations.

CS 268: DHTs. How Did it Start? Model. Main Challenge. Napster. Other Challenges

EE 122: Peer-to-Peer Networks

EECS 122: Introduction to Computer Networks Overlay Networks and P2P Networks. Overlay Networks: Motivations

Overlay Networks: Motivations. EECS 122: Introduction to Computer Networks Overlay Networks and P2P Networks. Motivations (cont d) Goals.

How Did it Start? Model. Main Challenge. CS162 Operating Systems and Systems Programming Lecture 24. Peer-to-Peer Networks

EE 122: Peer-to-Peer (P2P) Networks. Ion Stoica November 27, 2002

Main Challenge. Other Challenges. How Did it Start? Napster. Model. EE 122: Peer-to-Peer Networks. Find where a particular file is stored

15-744: Computer Networking P2P/DHT

Scalability In Peer-to-Peer Systems. Presented by Stavros Nikolaou

CSCI-1680 P2P Rodrigo Fonseca

Lecture 6: Overlay Networks. CS 598: Advanced Internetworking Matthew Caesar February 15, 2011

Key Value Storage

March 10, Distributed Hash-based Lookup for Peer-to-Peer Systems. Sandeep Shelke Shrirang Shirodkar MTech I CSE

Badri Nath Rutgers University

P2P Network Structured Networks: Distributed Hash Tables. Pedro García López Universitat Rovira I Virgili

P2P: Distributed Hash Tables

CS514: Intermediate Course in Computer Systems

Distributed Hash Tables: Chord

*Adapted from slides provided by Stefan Götz and Klaus Wehrle (University of Tübingen)

Overlay and P2P Networks. Structured Networks and DHTs. Prof. Sasu Tarkoma

Flooded Queries (Gnutella). Centralized Lookup (Napster). Routed Queries (Freenet, Chord, etc.). Overview

CSCI-1680 Web Performance, Content Distribution P2P Rodrigo Fonseca

Content Overlays. Nick Feamster CS 7260 March 12, 2007

CSCI-1680 Web Performance, Content Distribution P2P John Jannotti

Distributed Hash Table

Distributed lookup services

Peer-to-peer computing research a fad?

L3S Research Center, University of Hannover

Telematics Chapter 9: Peer-to-Peer Networks

CS 640 Introduction to Computer Networks. Today's lecture. What is P2P? Lecture 30. Peer to peer applications

: Scalable Lookup

Architectures for Distributed Systems

Handling Churn in a DHT

Distributed Hash Tables Chord and Dynamo

Distributed Hash Tables

416 Distributed Systems. Mar 3, Peer-to-Peer Part 2

CompSci 356: Computer Network Architectures Lecture 21: Overlay Networks Chap 9.4. Xiaowei Yang

Part 1: Introducing DHTs

Peer-to-Peer Systems and Distributed Hash Tables

Chord : A Scalable Peer-to-Peer Lookup Protocol for Internet Applications

CS 3516: Advanced Computer Networks

P2P Traffic. Today, around 18-20% (North America); the big chunk now is video entertainment (e.g., Netflix, iTunes).

Peer-to-Peer Systems. Chapter General Characteristics

LECT-05, S-1 FP2P, Javed I.

Overview Computer Networking Lecture 16: Delivering Content: Peer to Peer and CDNs Peter Steenkiste

Motivation. The Impact of DHT Routing Geometry on Resilience and Proximity. Different components of analysis. Approach: Component-based analysis

The Impact of DHT Routing Geometry on Resilience and Proximity. Acknowledgement. Motivation

Today. Why might P2P be a win? What is a Peer-to-Peer (P2P) system? Peer-to-Peer Systems and Distributed Hash Tables

Peer-to-Peer Protocols and Systems. TA: David Murray Spring /19/2006

Distributed Systems. 16. Distributed Lookup. Paul Krzyzanowski. Rutgers University. Fall 2017

Searching for Shared Resources: DHT in General

Overlay networks. Today: overlay networks; P2P evolution; Pastry as a routing overlay example

Decentralized Object Location In Dynamic Peer-to-Peer Distributed Systems

Scaling Out Systems. COS 518: Advanced Computer Systems Lecture 6. Kyle Jamieson

08 Distributed Hash Tables

Introduction to P2P Computing

Overlay networks. To do: overlay networks; P2P evolution; DHTs in general, Chord and Kademlia. Turtles all the way down.

Last Time. CSE 486/586 Distributed Systems: Distributed Hash Tables. What We Want. Today's Question. What We Want. What We Don't Want

CS 347 Parallel and Distributed Data Processing

Introduction to Peer-to-Peer Systems

Peer to Peer I II 1 CS 138. Copyright 2015 Thomas W. Doeppner, Rodrigo Fonseca. All rights reserved.

Distributed Systems. 17. Distributed Lookup. Paul Krzyzanowski. Rutgers University. Fall 2016

Ken Birman. Cornell University. CS5410 Fall 2008.

Peer-to-Peer Systems. Network Science: Introduction. P2P History: 1999 to today

Distributed Hash Tables

Peer to peer systems: An overview

Last Time. CSE 486/586 Distributed Systems: Distributed Hash Tables. Today's Question. What We Want. What We Want. What We Don't Want

Peer-to-Peer Internet Applications: A Review

A Survey of Peer-to-Peer Content Distribution Technologies

CPSC 426/526. P2P Lookup Service. Ennan Zhai. Computer Science Department Yale University

CSE 124 Finding objects in distributed systems: Distributed hash tables and consistent hashing. March 8, 2016 Prof. George Porter

Distributed Knowledge Organization and Peer-to-Peer Networks

A Structured Overlay for Non-uniform Node Identifier Distribution Based on Flexible Routing Tables

CSE 486/586 Distributed Systems

ReCord: A Distributed Hash Table with Recursive Structure

The overlaying mechanism is called tunneling. Overlay Network Nodes. ATM links can be the physical layer for IP

P2P Network Structured Networks (IV) Distributed Hash Tables. Pedro García López Universitat Rovira I Virgili

PEER-TO-PEER NETWORKS, DHTS, AND CHORD

Chord: A Scalable Peer-to-peer Lookup Service For Internet Applications

Dynamic Load Sharing in Peer-to-Peer Systems: When some Peers are more Equal than Others

CS555: Distributed Systems [Fall 2017] Dept. Of Computer Science, Colorado State University

MULTI-DOMAIN VoIP PEERING USING OVERLAY NETWORK

Distributed Data Management. Profr. Dr. Wolf-Tilo Balke Institut für Informationssysteme Technische Universität Braunschweig

Structured Peer-to-Peer Networks

6 Structured P2P Networks

Distributed Hash Tables (DHT)

Cycloid: A constant-degree and lookup-efficient P2P overlay network

An Expressway over Chord in Peer-to-Peer Systems

INF5070 Media Storage and Distribution Systems: Peer-to-Peer Systems

Semester Thesis on Chord/CFS: Towards Compatibility with Firewalls and a Keyword Search

Modern Technology of Internet

Making Gnutella-like P2P Systems Scalable

DISTRIBUTED COMPUTER SYSTEMS ARCHITECTURES

Transcription:

Announcements
CIS 700/005 Networking Meets Databases, Boon Thau Loo, Spring.
- Paper summaries due at noon today.
- Office hours: Wed - pm (Levine).
- Project proposal due in Feb.
- Student presenter (Jan): A Scalable Distributed Information Management System (SIGCOMM 2004).
- Note: several slides are courtesy of CS 268 (Spring) at UC Berkeley.

Today
- Two assigned papers: Chord (SIGCOMM 2001) and Distributed Hash Tables (CACM).
- Focus on CAN and Chord.

How Did it Start?
- A killer application: Napster. Free music over the Internet.
- Key idea: share the content, storage, and bandwidth of individual (home) users.

Model
- Each user stores a subset of files.
- Each user has access to (can download) files from all users in the system.

Main Challenge
- Find where a particular file is stored.
(Diagram: nodes A-F; node A asks "E?" to locate file E.)

Other Challenges
- Scale: up to hundreds of thousands or millions of machines.
- Dynamicity: machines can come and go at any time.

Napster
- Assumes a centralized index system that maps files (songs) to the machines that are alive.
- How to find a file (song):
  - Query the index system; it returns a machine that stores the required file (ideally the closest/least-loaded machine).
  - FTP the file from that machine.
- Advantages: simplicity; easy to implement sophisticated search engines on top of the index system.
- Disadvantages: robustness, scalability (?).
(Diagram: Napster example with machines m1-m6; the central index maps each file A-F to the machine storing it, and a query for E returns m5.)

Gnutella
- Distributes the file location. Idea: flood the request.
- How to find a file:
  - Send the request to all neighbors.
  - Neighbors recursively multicast the request.
  - Eventually a machine that has the file receives the request, and it sends back the answer.
- Advantages: totally decentralized, highly robust.
- Disadvantages: not scalable; the entire network can be swamped with requests (to alleviate this problem, each request carries a TTL).
- Ad-hoc topology; queries are flooded for a bounded number of hops; no guarantees on recall.
(Diagram: a query for "xyz" floods the neighbor graph until a node holding "xyz" answers.)

Distributed Hash Tables (DHTs)
- Abstraction: a distributed hash-table data structure: put(key, item); item = get(key).
- Note: the item can be anything: a data object, document, file, or a pointer to a file.
- Proposals: CAN, Chord, Pastry, Tapestry, etc.
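A minimal sketch of the centralized-index idea on the Napster slide above; the class and machine names (CentralIndex, m1, m5) are illustrative assumptions, not Napster's actual protocol.

```python
# Sketch of a Napster-style centralized index (illustrative, not the real protocol).
# The index maps a file name to the set of live machines that store it.

class CentralIndex:
    def __init__(self):
        self.index = {}          # file name -> set of machine addresses

    def register(self, machine, files):
        """A machine announces the files it stores when it comes online."""
        for f in files:
            self.index.setdefault(f, set()).add(machine)

    def unregister(self, machine):
        """Remove a machine that has left (handles dynamicity)."""
        for machines in self.index.values():
            machines.discard(machine)

    def lookup(self, filename):
        """Return some machine storing the file; a real system would
        prefer the closest / least-loaded one."""
        machines = self.index.get(filename)
        return next(iter(machines)) if machines else None

index = CentralIndex()
index.register("m1", ["A", "B"])
index.register("m5", ["E", "F"])
print(index.lookup("E"))   # -> 'm5'; the client then FTPs the file from m5
```

The index is a single point of failure and a scaling bottleneck, which is exactly the robustness/scalability concern the slide raises.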

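The flooding scheme above can be sketched as a bounded graph search. The topology, file names, and TTL value below are assumptions; a real Gnutella node would also route the answer back along the query path.

```python
# Sketch of Gnutella-style flooding with a TTL (illustrative topology and names).

def flood_query(nodes, start, filename, ttl):
    """nodes: id -> {'files': set, 'neighbors': [ids]}.
    Returns the id of a node holding the file, or None."""
    seen = set()                      # suppress duplicate forwards
    frontier = [(start, ttl)]
    while frontier:
        node_id, hops_left = frontier.pop()
        if node_id in seen:
            continue
        seen.add(node_id)
        if filename in nodes[node_id]['files']:
            return node_id            # the answer travels back to the requester
        if hops_left > 0:             # the TTL bounds how far the flood spreads
            for nbr in nodes[node_id]['neighbors']:
                frontier.append((nbr, hops_left - 1))
    return None                       # no guarantee of recall: the file may exist beyond the TTL

nodes = {
    1: {'files': set(),   'neighbors': [2, 3]},
    2: {'files': {'xyz'}, 'neighbors': [1]},
    3: {'files': set(),   'neighbors': [1]},
}
print(flood_query(nodes, start=1, filename='xyz', ttl=2))   # -> 2
```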
DHT Design Goals
- Make sure that an item (file) stored in the system is always found.
- Scale to hundreds of thousands of nodes.
- Handle rapid arrival and failure of nodes.

Structured Networks (aka DHTs)
- Hash-table interface: put(key, item), get(key).
- O(log n) hops per lookup, with guarantees on recall.
(Diagram: put(K1, I1) routes item I1 to the node responsible for key K1; get(K1) retrieves it from that node.)

Keyword Search using DHTs
- Inverted lists, hashed by keyword (term), are stored in the DHT: e.g. (h(t1), {I1, I2}), (h(t2), {I1, I3}).
- Query "t1 AND t2": fetch both inverted lists and intersect them; e.g. {I1, I2} and {I1, I3} yield {I1}.

Usage of DHTs
- Rendezvous: the key (or identifier) is globally unique and static; the DHT functions as a dynamic and flat DNS. E.g. IP mobility, chat, Internet telephony, DNS, the Web!
- Storage: rendezvous applications use the DHT only to store small pointers (IP addresses, etc.); DHTs can also back more serious storage, such as file systems, e.g. the Chord File System (SOSP 2001).
- Internet-scale query processing: use the DHT as a global hash index; distributed joins and aggregations using DHTs, e.g. PIER (VLDB 2003, CIDR).
- Others: multicast, publish-subscribe, event notification, etc.
- Data-centric Internet architectures: i3 (SIGCOMM 2002), ROFL (SIGCOMM 2006).

DHT-based Research Projects
- File sharing [CFS, OceanStore, PAST, ...]
- Web cache [Squirrel, CoralCDN]
- Backup store [Pastiche]
- Database query engines [PIER, ...]
- Event notification [Scribe]
- Naming systems [ChordDNS, Twine, ...]
- Communication primitives [i3, ROFL]
- Multimedia streaming [SplitStream]
- Sensor networks [GHT: DHT + GPSR]
- Common thread: data is location-independent.
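The put()/get() abstraction and the inverted-list keyword search above can be sketched together. The ToyDHT class below is a stand-in built on assumptions: it hashes a key straight to one of n local dictionaries instead of routing in O(log n) hops, but the interface and the AND-query intersection match the slides.

```python
import hashlib

# Sketch of the DHT abstraction (put/get) plus keyword search via inverted lists.

class ToyDHT:
    def __init__(self, n_nodes):
        self.nodes = [dict() for _ in range(n_nodes)]   # each node's local store

    def _node_for(self, key):
        h = int(hashlib.sha1(key.encode()).hexdigest(), 16)
        return self.nodes[h % len(self.nodes)]          # real DHTs route here in O(log n) hops

    def put(self, key, item):
        self._node_for(key).setdefault(key, set()).update(item)

    def get(self, key):
        return self._node_for(key).get(key, set())

# Keyword search: publish one inverted list per term, intersect for AND queries.
dht = ToyDHT(n_nodes=8)
dht.put("t1", {"I1", "I2"})
dht.put("t2", {"I1", "I3"})
print(dht.get("t1") & dht.get("t2"))   # query "t1 AND t2" -> {'I1'}
```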

Content Addressable Network (CAN)
- Associate to each node and item a unique id in a d-dimensional Cartesian coordinate space.
- Properties:
  - Routing table size O(d).
  - Guarantees that a file is found in at most d * n^(1/d) steps, where n is the total number of nodes.
- The coordinate space is divided between the nodes: together the nodes cover the entire space, and each node covers either a square or a rectangle with a 1:2 or 2:1 aspect ratio.

CAN Example
- Node n1 joins: the first node covers the entire space.
- Node n2 joins: the space is divided between n1 and n2.
- Node n3 joins: n1's zone is divided between n1 and n3.
- Nodes n4 and n5 join, splitting further zones.
- Items f1-f4 are mapped to points in the space.
(Diagram: the 2-D space after all five joins, with the items' points marked.)
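One way to realize "associate to each node and item a unique id in a d-dimensional coordinate space" is to hash the identifier once per dimension. The per-dimension salting scheme below is an assumption for illustration, not the CAN paper's exact construction.

```python
import hashlib

# Sketch: map an identifier to a point in [0, 1)^d by hashing it once per dimension.

def to_point(identifier, d=2):
    coords = []
    for dim in range(d):
        # Salt the hash with the dimension index so coordinates are independent.
        h = hashlib.sha1(f"{dim}:{identifier}".encode()).hexdigest()
        coords.append(int(h, 16) / 16 ** len(h))   # scale the digest into [0, 1)
    return tuple(coords)

print(to_point("f1"))   # the item is stored by the node whose zone contains this point
```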

CAN: Item Storage
- Each item is stored by the node that owns the item's point in the coordinate space.
(Diagram: items f1-f4 placed in the zones of nodes n1-n5.)

CAN: Query Example
- Each node knows its neighbors in the d-dimensional space.
- Forward the query to the neighbor that is closest to the query id.
- Example: assume n1 queries for f4.
- Routing can go around some failures.

CAN: Node Joining
1) The new node discovers some node I already in the CAN.
2) The new node picks a random point (x, y) in the space.
3) The new node asks I to route to (x, y) and discovers the node J that owns that point.
4) J's zone is split in half; the new node owns one half.
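A sketch of the greedy forwarding described in "CAN: Query Example", simplified to a fixed grid of equal zones (real CAN zones are split dynamically as nodes join); the grid size and coordinates are assumptions.

```python
import math

# Sketch of CAN-style greedy routing on a fixed K x K grid of equal zones (d = 2).

K = 4   # node (i, j) owns the unit square [i, i+1) x [j, j+1)

def owner(point):
    """The grid cell (node) whose zone contains the point."""
    return (min(int(point[0]), K - 1), min(int(point[1]), K - 1))

def neighbors(cell):
    i, j = cell
    cand = [(i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)]
    return [(a, b) for a, b in cand if 0 <= a < K and 0 <= b < K]

def centre(cell):
    return (cell[0] + 0.5, cell[1] + 0.5)

def route(start, key_point):
    """Greedily forward to the neighbor whose zone is closest to the key's point."""
    node, path = start, [start]
    while node != owner(key_point):
        node = min(neighbors(node),
                   key=lambda c: math.dist(centre(c), key_point))
        path.append(node)
    return path

# A key hashing to (3.2, 0.7), looked up from node (0, 3):
print(route((0, 3), (3.2, 0.7)))   # each hop moves closer; roughly d * n^(1/d) hops
```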

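A sketch of the four join steps above: zones are axis-aligned rectangles, and the routing in step 3 is replaced by a direct scan. The node names and initial zone layout are assumptions.

```python
import random

# Sketch of CAN join: pick a random point, find the owner J, split J's zone in half.
# A zone is (x_lo, x_hi, y_lo, y_hi).

def split_zone(zone):
    """Split along the longer side, so zones stay squares or 1:2 rectangles."""
    x_lo, x_hi, y_lo, y_hi = zone
    if (x_hi - x_lo) >= (y_hi - y_lo):
        mid = (x_lo + x_hi) / 2
        return (x_lo, mid, y_lo, y_hi), (mid, x_hi, y_lo, y_hi)
    mid = (y_lo + y_hi) / 2
    return (x_lo, x_hi, y_lo, mid), (x_lo, x_hi, mid, y_hi)

def join(zones, keys):
    """zones: node -> zone; keys: node -> {point: value} stored in its zone."""
    x, y = random.random(), random.random()          # step 2: pick a random point
    j = next(n for n, (a, b, c, d) in zones.items()  # step 3: find the owner J
             if a <= x < b and c <= y < d)           #   (CAN would route to it)
    keep, give = split_zone(zones[j])                # step 4: split J's zone in half
    new_node = f"n{len(zones) + 1}"
    zones[j], zones[new_node] = keep, give
    # Hand over the (key, value) pairs that now fall in the new node's half.
    moved = {p: v for p, v in keys[j].items()
             if give[0] <= p[0] < give[1] and give[2] <= p[1] < give[3]}
    for p in moved:
        del keys[j][p]
    keys[new_node] = moved
    return new_node

zones = {"n1": (0.0, 1.0, 0.0, 1.0)}                 # the first node covers the whole space
keys = {"n1": {(0.8, 0.2): "f1"}}
print(join(zones, keys), zones)
```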
CAN: Node Departure
- A node explicitly hands over its zone and the associated (key, value) database to one of its neighbors.
- In case of network failure, this is handled by a take-over algorithm.
- Problem: the take-over mechanism does not regenerate the lost data.
- Solution: every node keeps a backup of its neighbors' data.

Chord
- Associate to each node and item a unique id in a one-dimensional space 0 .. 2^m - 1.
- Key design decision: decouple correctness from efficiency.
- Properties:
  - Routing table size O(log N), where N is the total number of nodes.
  - Guarantees that a file is found in O(log N) steps.

Review: Simple Hashing
- Given document XYZ, we need to choose a server to use.
- Suppose we use modulo: number the servers 0 .. n-1 and place document XYZ on server (hash(XYZ) mod n).
- What happens when a server fails? n changes to n-1, so nearly every document maps to a different server. Why might this be bad?

Consistent Hashing [Karger 97]
David Karger, et al. Consistent Hashing and Random Trees: Tools for Relieving Hot Spots on the World Wide Web. STOC 97.
- Desired features:
  - Balance: load is equal across buckets.
  - Smoothness: little impact on hash-bucket contents when buckets are added or removed.
  - Spread: only a small set of hash buckets may hold a given object.
  - Load: the number of objects assigned to a hash bucket is small.
- Construction:
  - Assign each of C hash buckets to random points on a mod 2^n circle, where n is the hash key size.
  - Map each object to a random position on the circle; the hash of the object is the closest clockwise bucket.
  - Smoothness: the addition of a bucket does not cause movement between existing buckets.
  - Spread and load: only a small set of buckets lie near any object.
  - Balance: no bucket is responsible for a large number of objects.

Identifier-to-Node Mapping Example
- Each node is responsible for the range of identifiers between its predecessor's id (exclusive) and its own id (inclusive).
- Each node maintains a pointer to its successor.
(Diagram: nodes on the identifier circle, each annotated with the id range it maps.)
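A sketch of the [Karger 97] construction above, using SHA-1 to place buckets and objects on a mod 2^160 circle; the bucket names are assumptions.

```python
import bisect
import hashlib

# Sketch of consistent hashing: buckets sit at (pseudo)random points on a circle;
# an object goes to the closest clockwise bucket.

def h(s):
    return int(hashlib.sha1(s.encode()).hexdigest(), 16)   # point on the mod 2^160 circle

class ConsistentHash:
    def __init__(self, buckets):
        self.points = sorted((h(b), b) for b in buckets)

    def bucket_for(self, obj):
        keys = [p for p, _ in self.points]
        i = bisect.bisect(keys, h(obj)) % len(self.points)  # wrap around the circle
        return self.points[i][1]

    def add_bucket(self, b):
        # Smoothness: only objects between the new bucket and its predecessor move.
        bisect.insort(self.points, (h(b), b))

ring = ConsistentHash(["server0", "server1", "server2"])
before = ring.bucket_for("XYZ")
ring.add_bucket("server3")
print(before, ring.bucket_for("XYZ"))   # most objects keep their bucket
```

Contrast with (hash mod n): adding or removing one bucket here moves only the objects adjacent to it on the circle, not nearly everything.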

Lookup
- Each node maintains a pointer to its successor.
- A lookup routes a packet (ID, data) to the node responsible for ID by following successor pointers around the ring.
(Diagram: lookup(ID) walks clockwise via successor pointers until it reaches the responsible node.)

Joining Operation
- A new node joins the ring via some existing node; initially its successor and predecessor are nil.
- Join: the new node sends join() to the existing node, which looks up and returns the new node's successor; the new node sets its successor pointer (its predecessor is still nil).
- Stabilize: the new node sends stabilize() to its successor; the successor updates its predecessor to the new node and sends a notify() back.
- The successor's old predecessor later sends its own stabilize() to that successor; the notify(pred=...) reply carries the successor's current predecessor, i.e. the new node, so the old predecessor updates its successor pointer to the new node.
- It then sends stabilize() to its new successor, the newly joined node, which sets its predecessor accordingly.
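A minimal sketch of the lookup described above: follow successor pointers around the ring until the key falls between a node and its successor. The node ids are assumptions (real ids come from hashing).

```python
# Sketch of Chord lookup using only successor pointers: O(N) hops.
# Finger tables (below) bring this down to O(log N).

succ = {5: 20, 20: 32, 32: 50, 50: 5}        # each node's successor pointer

def between(x, a, b):
    """True if x lies in the half-open ring interval (a, b]."""
    return (a < x <= b) if a < b else (x > a or x <= b)

def lookup(start, key):
    node = start
    while not between(key, node, succ[node]):
        node = succ[node]                    # forward along the ring
    return succ[node]                        # the node responsible for key

print(lookup(5, 33))                         # -> 50
```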

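A sketch of the join-plus-stabilization walkthrough above, following the shape of the Chord paper's pseudocode; the node ids (5, 20, 32, 50), the joining id 26, and the number of stabilization rounds are assumptions.

```python
# Sketch of Chord join + periodic stabilize/notify: correctness (the successor
# pointer) is established eagerly; everything else is repaired lazily.

def between(x, a, b):
    """True if x lies in the half-open ring interval (a, b]."""
    return (a < x <= b) if a < b else (x > a or x <= b)

class Node:
    def __init__(self, id):
        self.id = id
        self.successor = self                # a one-node ring points at itself
        self.predecessor = None

    def lookup(self, key):
        node = self
        while not between(key, node.id, node.successor.id):
            node = node.successor
        return node.successor

    def join(self, known):
        """Join via any known node; only the successor is set eagerly."""
        self.predecessor = None
        self.successor = known.lookup(self.id)

    def stabilize(self):
        """Adopt a node that slipped in between us and our successor,
        then tell our successor about ourselves via notify()."""
        x = self.successor.predecessor
        if x is not None and x is not self.successor and between(x.id, self.id, self.successor.id):
            self.successor = x
        self.successor.notify(self)

    def notify(self, candidate):
        """candidate believes it may be our predecessor."""
        if self.predecessor is None or between(candidate.id, self.predecessor.id, self.id):
            self.predecessor = candidate

ring = [Node(i) for i in (5, 20, 32, 50)]
for a, b in zip(ring, ring[1:] + ring[:1]):
    a.successor = b                          # ring 5 -> 20 -> 32 -> 50 -> 5
n = Node(26)
n.join(ring[0])
for _ in range(3):                           # a few stabilize rounds repair all pointers
    for node in ring + [n]:
        node.stabilize()
print(n.predecessor.id, n.successor.id)      # -> 20 32
```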
Joining Operation (cont'd)
- This completes the joining operation: all successor and predecessor pointers are consistent again.

Achieving Efficiency: Finger Tables
- Say the identifier space has m bits.
- The i-th entry of the finger table at the peer with id n is the first peer whose id >= (n + 2^i) mod 2^m.
(Diagram: a finger table on the ring, with fingers at distances +1, +2, +4, ..., +2^(m-1) from the node's id.)

CAN/Chord Optimizations
- Reduce latency:
  - Choose the finger that reduces the expected time to reach the destination ("route selection").
  - Choose the closest node from the range [n + 2^(i-1), n + 2^i) as the i-th finger ("proximity neighbor selection").
- Accommodate heterogeneous systems: multiple virtual nodes per physical node.

Achieving Robustness
- To improve robustness, each node maintains the k (k > 1) immediate successors instead of only one successor.
- In the notify() message, node A can send its k-1 successors to its predecessor B.
- Upon receiving the notify() message, B updates its successor list by concatenating the successor list received from A with A itself.

Good Papers to Follow Up
- Krishna Gummadi et al. The Impact of DHT Routing Geometry on Resilience and Proximity. SIGCOMM 2003.
- Sean Rhea et al. Handling Churn in a DHT. USENIX Annual Technical Conference, June 2004.
- Frank Dabek et al. Designing a DHT for Low Latency and High Throughput. NSDI 2004.
- Castro et al. Secure Routing for Structured Peer-to-Peer Overlay Networks. OSDI 2002.

Geometries We Compare
Geometry               Algorithm
Ring                   Chord, Symphony
Hypercube              CAN
Tree                   Plaxton
Hybrid (Tree + Ring)   Tapestry, Pastry
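A sketch of the finger-table rule above: entry i at node n points to the first live node with id >= (n + 2^i) mod 2^m. The ids and m = 7 are assumptions for the example.

```python
# Sketch of finger-table construction. With fingers, each lookup hop roughly
# halves the remaining ring distance, giving O(log N) steps.

M = 7
RING = 2 ** M
node_ids = sorted([5, 20, 32, 50, 80, 96, 112])

def successor(k):
    """First live node clockwise from k (wrapping around the ring)."""
    return next((n for n in node_ids if n >= k), node_ids[0])

def finger_table(n):
    return [successor((n + 2 ** i) % RING) for i in range(M)]

print(finger_table(80))
# 80+1, +2, +4, +8, +16 -> 96; 80+32 = 112 -> 112; (80+64) mod 128 = 16 -> 20
# i.e. [96, 96, 96, 96, 96, 112, 20]
```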

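A sketch of the successor-list idea from "Achieving Robustness" above: keeping the k immediate successors lets a node skip over a failed successor. The ids, k = 3, and the liveness set are assumptions.

```python
# Sketch of successor lists: the ring survives a successor failure because
# each node can fall back to the next entry in its list.

K = 3
node_ids = sorted([5, 20, 32, 50, 80])

def successor_list(n):
    """The k nodes that follow n clockwise on the ring."""
    i = node_ids.index(n)
    return [node_ids[(i + j) % len(node_ids)] for j in range(1, K + 1)]

def next_live(n, alive):
    """On failure of the immediate successor, use the next live entry."""
    return next(s for s in successor_list(n) if s in alive)

print(successor_list(20))                    # -> [32, 50, 80]
print(next_live(20, alive={5, 20, 50, 80}))  # 32 failed -> 50
```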
Summary of the Flexibility Analysis
- Flexibility in neighbor selection (FNS), ordering of geometries: Hypercube (1 option) << Tree, XOR, Ring, Hybrid (2^(i-1) options).
- Flexibility in route selection (FRS), ordering of geometries: Tree (1) << XOR, Hybrid (log N / 2) < Hypercube (log N / 2) < Ring (log N).

Available Software
- Chord: http://pdos.csail.mit.edu/chord/ (C++)
- FreePastry: http://freepastry.org/ (Java)
- Bamboo: http://bamboo-dht.org/ (Java)
- OpenDHT: http://opendht.org/ (Java/Perl/Python)
- p2psim: http://pdos.csail.mit.edu/p2psim/

Next Lecture
- This Thursday: Querying the Internet with PIER (VLDB 2003) and A Knowledge Plane for the Internet (SIGCOMM 2003).
- Summaries due at noon on Thursday.