Architectures for Distributed Systems

Distributed Systems and Middleware 2013
2: Architectures for Distributed Systems

Components
- A distributed system consists of components.
- Each component has a well-defined interface and can be replaced by another component with the same interface.

Architectural styles
- How should components be organized?
- How should components interact with each other?

System architecture
- How components are placed on real machines.

Architectural styles (1)
- Layered architecture: a component at layer i is allowed to call components at the underlying layer i-1. Requests flow down from layer N to layer 1, and responses flow back up.
- Method-call-based architecture: components (objects) are connected through a remote procedure call mechanism.

Architectural styles (2)
- Data-centered architecture: processes communicate through a common repository.
- Event-based architecture: processes communicate through the propagation of events. In a publish/subscribe (pub/sub) system, components publish events to an event bus, and only subscriber processes receive the published events.
- Shared data-space architecture: a combination of the data-centered and event-based styles.
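The event-based (publish/subscribe) style can be illustrated with a small in-process sketch. This is only an assumption-laden toy: the `EventBus` class, its method names, and the topic strings are hypothetical and do not come from the slides, and a real distributed event bus would deliver events over the network.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List


class EventBus:
    """Toy in-process event bus (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        # Maps a topic name to the callbacks of its subscribers.
        self._subscribers: DefaultDict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callable[[str], None]) -> None:
        """Register a component's callback for one topic."""
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, event: str) -> int:
        """Deliver the event to subscribers of the topic; return how many received it."""
        receivers = self._subscribers.get(topic, [])
        for callback in receivers:
            callback(event)
        return len(receivers)


bus = EventBus()
inbox_a: List[str] = []  # component A subscribes to "temperature"
inbox_b: List[str] = []  # component B subscribes to "humidity"
bus.subscribe("temperature", inbox_a.append)
bus.subscribe("humidity", inbox_b.append)

# Only component A receives this event; the publisher never names a receiver.
delivered = bus.publish("temperature", "21.5C")
```

The publisher and the subscribers only know the bus and the topic, which is the decoupling the event-based style aims for.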

System architectures

What is the system architecture?
- An instance of a distributed system after deciding on the components, their interaction, and their placement.
- According to placement, the following forms exist:
  - Centralized architectures
  - Decentralized architectures
  - Various hybrid forms

Centralized architectures

Client-server model
- Processes in a distributed system are divided into two groups: servers and clients.
- Server: a process implementing a specific service.
- Client: a process requesting a service from a server.
- Can be implemented in a LAN with a connectionless protocol and in a WAN with a reliable connection-oriented protocol.

Application layering
The client-server model has the following three levels:
- User-interface level
- Processing level
- Data level
Figure: simplified organization of an Internet search engine.

Multi-tiered architecture
- The three levels can be distributed across several machines.
- Two-tiered architecture: the three levels are distributed over two kinds of machines, clients and servers.
- Figure: the five possible two-tiered organizations, (a)-(e).

Three-tiered architecture
- A single server can be replaced by multiple servers running on different machines.
- A server may need to act as a client.
- This kind of distribution is called vertical distribution.

Question 2-1
(1) Show an example system for each of the architectural styles on pages 3-4.
(2) Show an example system for each of (a)-(e) in the two-tiered architecture on page 8.
(3) Show a system (other than an Internet search engine) that can be realized with a three-tiered architecture.

Decentralized architectures
- Vertical distribution: division at the logical level. This is only one of many possible ways of organizing a distributed system.
- Horizontal distribution: division at the physical level.
  - A client (or server) is physically split up into logically equivalent parts.
  - Each part operates on its own share of the complete data set.
  - Together, the parts balance the load.
- Architectures supporting horizontal distribution: peer-to-peer (P2P) systems.

Peer-to-peer (P2P) systems

What is a P2P system?
- Resources (files, bandwidth, computation power, services, etc.) are distributed and shared among peers (user processes).

Characteristics of P2P systems
- The processes that constitute a P2P system are all equal.
- Interaction between processes is symmetric: each process acts as a client and a server at the same time.
- Processes form a network called the overlay network, which consists of processes and overlay links (communication channels).
- A process cannot communicate directly with an arbitrary other process; it must send requests through the available communication channels (neighboring peers).

Example of overlay network
Figure: processes (peers) connected by overlay links (TCP connections) form the overlay network, which sits on top of a physical network of hosts A-F and switches (routers).

Types of P2P systems
- Hybrid P2P
- Structured P2P
- Unstructured P2P
- Hierarchical P2P

P2P-based file sharing

Goals
- Files are distributed across all peers.
- When a peer sends a query asking for a file to one of the other peers, it receives a reply indicating which peer retains the file.

File placement
- Centralized placement: one server retains all files. The server and the network become a bottleneck.
- Distributed placement: M files are distributed over N peers. Processing load and communication traffic can be balanced; the problem is how to route queries efficiently.

Query routing
- Storing the index to all files (the routing table) in one place, or in all peers, does not scale as the network grows.
- The routing table needs to be kept adequately small.

Hybrid P2P: example (Napster)
- The user searches for the peer responsible for the requested file with the index server.
- Figure: peer A, which requests file X, sends a query to the index server and receives a reply; peer A then sends a download request for X to peer E, which stores file X, and downloads X. Peers B, C, and D are also shown.
- Advantages: fast search, easy security guarantees, easy content management.
- Disadvantages: server maintenance, fault tolerance, size scalability.
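The hybrid (Napster-style) flow can be sketched as a central index plus a direct peer-to-peer download. This is a minimal single-process sketch under stated assumptions: the `IndexServer` class, the `fetch` helper, and the in-memory `peer_storage` dictionary are hypothetical stand-ins for networked components.

```python
from typing import Dict, Optional, Set


class IndexServer:
    """Central index mapping a file name to the peers that store it (hypothetical)."""

    def __init__(self) -> None:
        self._index: Dict[str, Set[str]] = {}

    def register(self, peer: str, filename: str) -> None:
        """A peer announces that it stores the given file."""
        self._index.setdefault(filename, set()).add(peer)

    def query(self, filename: str) -> Optional[str]:
        """Reply with one peer that stores the file, or None if nobody does."""
        peers = self._index.get(filename)
        return min(peers) if peers else None


# Each peer's local storage; only peer E stores file X, as in the slide's figure.
peer_storage: Dict[str, Dict[str, bytes]] = {
    "B": {},
    "E": {"X": b"contents of X"},
}

server = IndexServer()
for peer, files in peer_storage.items():
    for name in files:
        server.register(peer, name)


def fetch(filename: str) -> Optional[bytes]:
    """Ask the index server who has the file, then download directly from that peer."""
    holder = server.query(filename)
    if holder is None:
        return None
    return peer_storage[holder][filename]


data = fetch("X")
```

Only the lookup goes through the server; the file transfer itself is between peers, which is why the server is a single point of failure for search but not a bandwidth bottleneck for downloads.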

Structured P2P architectures

DHT (Distributed Hash Table)
- Refers to methods for efficiently and deterministically finding the peer that retains a file, given a query with the key (hash) of the file.
- Examples: CAN, Chord [1], Pastry [2], Tapestry

[1] I. Stoica, et al.: Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications, Proc. of ACM SIGCOMM '01, 2001.
[2] A. Rowstron, P. Druschel: Pastry: Scalable, distributed object location and routing for large-scale peer-to-peer systems, Proc. of IFIP/ACM Middleware 2001, 2001.

Challenges in structured P2P
To realize efficient file sharing:
- Where (in which peers) should files be stored?
- How should queries be routed to reach the peers holding the target files?
Figure: a user (peer) sends a query to the DHT, which maps data files (contents) to peers.

Chord: file placement
- Assign a responsible peer to each file in a fair manner.
- With a hash function, compute an m-bit key for each file.
- Store the file in the peer whose ID equals the key (if no peer has that ID, the file is stored in the first succeeding peer).
- Figure: the key space and the peer ID space each range from 0 to 2^m - 1 (red box: file, blue box: peer).
- Problem: when a query with a key is issued, how can the responsible peer be found? Searching the whole peer ID space one by one takes O(2^m).

Chord: routing
Tradeoff between lookup table size and the number of hops to reach the target peer (N is the number of all peers, N ≤ 2^m):
- If each peer has a complete lookup table for all files, the target peer can be reached in 1 hop, but the table size is O(N).
- If each peer has no index for files, the table size is O(1), but reaching the target peer takes O(N) hops.
- Chord approach: each peer has a partial view of the lookup table.
  - Each peer has a partial lookup table with O(log N) entries.
  - The target peer is reached in O(log N) hops by recursively narrowing the search space (similar to binary search).
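The placement rule (store a file at the peer whose ID equals the key, or else at the first succeeding peer) can be sketched as follows. The helper names `file_key` and `successor`, the 3-bit ring, and the peer IDs 0, 1, and 3 are assumptions chosen for illustration; reducing the SHA-1 digest modulo 2^m is also an assumption made to keep the numbers small.

```python
import hashlib
from bisect import bisect_left
from typing import List


def file_key(name: str, m: int) -> int:
    """Hash a file name with SHA-1 and reduce it to an m-bit key (reduction is an assumption)."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)


def successor(key: int, peer_ids: List[int]) -> int:
    """Return the first peer ID >= key, wrapping around to the smallest ID on the ring."""
    ring = sorted(peer_ids)
    index = bisect_left(ring, key)
    return ring[index % len(ring)]


peers = [0, 1, 3]  # hypothetical peers on a 3-bit ring (IDs 0..7)

# Key 1 has a peer with the same ID; key 2 goes to the next peer (3);
# key 6 has no larger peer ID, so it wraps around to peer 0.
placement = {key: successor(key, peers) for key in (1, 2, 6)}

# A file name is first hashed to a key, then placed the same way.
key_for_song = file_key("song.mp3", m=3)
peer_for_song = successor(key_for_song, peers)
```

With a sorted list of all peer IDs the lookup is a single binary search, but that corresponds to the "complete lookup table" end of the tradeoff; Chord's contribution is getting close to this with only O(log N) entries per peer.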

Chord: solution (1/2): dividing the search space into m intervals
- Peer ID space: each ID has m (= 8) bits, so IDs range from 0 to 2^8 - 1 = 255.
- To decide the routing table of peer i (e.g., peer 99), divide the search space into m (= 8) intervals of 2^0, 2^1, 2^2, 2^3, ..., 2^{m-1} entries. The m-th interval has 2^{8-1} = 128 entries.
- The peer with ID i (= 99) has a routing table with m (= 8) entries:

  Interval      Next peer to ask
  [100, 101)    Peer 100 (or the succeeding peer if peer 100 does not exist)
  [101, 103)    Peer 101 (or succeeding peer)
  [103, 107)    Peer 103 (or succeeding peer)
  ...
  [227, 99)     Peer 227 (or succeeding peer; peer 228 in this example)

- Note: for peer i, the 1st interval starts from i + 1, and the m-th interval starts from (i + 2^{m-1}) mod 2^m.

Chord: solution (2/2): step-by-step narrowing of the search space
(1) When peer 99 receives a query with key 233, it identifies the interval containing the key and forwards the query to peer 228.
(2) When peer 228 receives the query, it identifies the interval containing the key and forwards the query to peer 232.
(3) Peer 232 receives the query, identifies the interval (peer 233), and forwards the query to peer 233.

Worst case
- Figure: with N = 32, the query visits the 1st through 5th peers, each narrowing down the interval that contains the key specified in the query; log N = log_2 32 = 5.
- In general, the number of hops is O(log N) = O(m).

Chord: example (1/3)
M files are distributed across N peers; a file is searched for by its key.
(1) Assign a hash value (key) to each file.
  - The key has m bits and is decided by a hash function such as SHA-1 (N ≤ 2^m, M ≤ 2^m).
  - m = 128 or 160.
(2) Construct a virtual ring of peers.
  - The ring consists of IDs 0 to 2^m - 1.
  - All peers are associated with IDs on the ring.
(3) Decide a responsible peer for each file.
  - A file is retained by the peer with the smallest ID such that ID ≥ key.
  - This peer is represented by successor(key), which returns the ID of the first peer found starting from key.
  - Figure: example ring with m = 3.
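The interval boundaries for peer 99 follow directly from the rule that the k-th interval starts at (i + 2^{k-1}) mod 2^m. The sketch below computes them and finds the interval that contains key 233; the function names are hypothetical, and the code only covers the interval arithmetic, not which peers actually exist.

```python
from typing import List, Optional, Tuple


def finger_intervals(peer_id: int, m: int) -> List[Tuple[int, int]]:
    """Return the m half-open ring intervals [start_k, start_{k+1}) for a peer."""
    size = 2 ** m
    # start_1 .. start_{m+1}; the last one wraps back to the peer's own ID.
    starts = [(peer_id + 2 ** (k - 1)) % size for k in range(1, m + 2)]
    return [(starts[k], starts[k + 1]) for k in range(m)]


def in_interval(key: int, start: int, end: int, size: int) -> bool:
    """True if key lies in the half-open ring interval [start, end), allowing wrap-around."""
    return (key - start) % size < (end - start) % size


def interval_index(key: int, peer_id: int, m: int) -> Optional[int]:
    """1-based index of the interval containing key, or None if key is the peer's own ID."""
    size = 2 ** m
    for k, (start, end) in enumerate(finger_intervals(peer_id, m), start=1):
        if in_interval(key, start, end, size):
            return k
    return None


intervals_99 = finger_intervals(99, m=8)
k_for_233 = interval_index(233, peer_id=99, m=8)
```

For peer 99 the last interval is the one that wraps past 255 back to the peer itself, and key 233 falls into it, which is why the first hop in the walkthrough jumps roughly halfway around the ring.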

Chord: example (2/3)
(4) Construct a routing table, called the finger table, for each peer.
  - The finger table has m (= log N) entries.
  - The key space with 2^m items is divided into m intervals with 2^0, 2^1, ..., 2^{m-1} items.
  - Each entry specifies the ID of the next peer to search (succ.) for an interval.

Finger table of peer 0 (m = 3):
  1st int.: [1, 2)   next is peer 1
  2nd int.: [2, 4)   next is peer 3
  3rd int.: [4, 0)   next is peer 0

How to get the k-th interval for peer n (1 ≤ k ≤ m):
  start_k = (n + 2^{k-1}) mod 2^m
  int_k = [start_k, start_{k+1})
  succ_k = first node ≥ start_k

Chord: example (3/3)
What happens when peer 3 receives a query with key 1?
- Peer 3 does NOT have file 1, so it forwards the query to peer 0 based on its finger table.
- Peer 0 does NOT have file 1, so it forwards the query to peer 1.
- The query finally reaches peer 1, which has file 1.
How many hops does a query take?
- A query is forwarded at most m times (see page 23 for the worst case).

Question 2-2
Suppose Chord is used, and answer the following questions (write your answers on the next page).
- Set of IDs for peers: {2, 4, 6, 7, 9, 12, 15}
- Set of keys for files: {0, 2, 5, 8, 10, 13, 14, 15}
(1) In which peer is each file stored?
(2) Complete the finger table of peer 4.
(3) How is a query for file 0 routed when starting from peer 4?
(4) What happens when peer 4 leaves the network? After that, what happens when a new peer with ID 5 joins?

Answer for question 2-2
Figure: ring of IDs 0 to 15.
- Files to retain =
- Finger table (start, int., succ.):
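The finger table of peer 0 and the query walk from peer 3 to peer 1 can be reproduced with a short sketch. It assumes the ring from the example (m = 3, peers 0, 1, and 3) and uses global knowledge of all peer IDs to build each table, which a real Chord node would not have; the function names are hypothetical.

```python
from bisect import bisect_left
from typing import List, Tuple


def successor(key: int, ring: List[int]) -> int:
    """First peer ID >= key on a sorted ring, wrapping around to the smallest ID."""
    return ring[bisect_left(ring, key) % len(ring)]


def finger_table(n: int, ring: List[int], m: int) -> List[Tuple[int, int, int]]:
    """Entries (start_k, start_{k+1}, succ_k) for k = 1..m, following the slide's formulas."""
    size = 2 ** m
    starts = [(n + 2 ** (k - 1)) % size for k in range(1, m + 2)]
    return [(starts[k], starts[k + 1], successor(starts[k], ring)) for k in range(m)]


def owns(peer: int, key: int, ring: List[int], size: int) -> bool:
    """A peer is responsible for the keys in (predecessor, peer] on the ring."""
    pred = ring[ring.index(peer) - 1]
    if pred == peer:  # single-peer ring: that peer owns every key
        return True
    return 0 < (key - pred) % size <= (peer - pred) % size


def lookup(start_peer: int, key: int, ring: List[int], m: int) -> List[int]:
    """Return the list of peers a query visits, ending at the responsible peer."""
    size = 2 ** m
    path = [start_peer]
    current = start_peer
    while not owns(current, key, ring, size):
        # Find the interval containing the key and forward to its successor entry.
        for start, end, succ in finger_table(current, ring, m):
            if (key - start) % size < (end - start) % size:
                current = succ
                break
        path.append(current)
    return path


ring = [0, 1, 3]  # sorted peer IDs from the slide's m = 3 example
table_of_peer_0 = finger_table(0, ring, m=3)
path_for_key_1 = lookup(start_peer=3, key=1, ring=ring, m=3)
```

Each forwarding step either lands on the responsible peer or on a peer strictly closer to the key, so the loop always terminates.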

Unstructured P2P: architectures

Characteristics of unstructured P2P
- No limits on topology (connection of peers); flexible search.
- Each peer has a list of c peers selected at random: a partial view.
- The list is periodically exchanged between neighboring peers.
- All peers together compose a random graph.

Unstructured P2P: example (Gnutella)
- A file is searched for by flooding a request, without an index server.
- Figure: peer A (requests file X), peers B, C, D, and F, and peer E (storing file X); the figure marks the communication order of the flooded query and the returned hits.
- To join the network, a new peer must obtain the address of one of the existing peers.
- Each peer can keep connections with up to four peers.
- Advantages: fault tolerance, privacy preservation.
- Disadvantage: flooding puts pressure on network bandwidth.

Managing topology in unstructured P2P

Disadvantage of unstructured P2P
- Search is done by flooding queries, so efficiency is poor.

Improvement for efficiency: manage the topology in two layers
- Structured topology: a protocol for maintaining an optimal topology under a given criterion. Links go to neighbor peers selected optimally by that criterion (e.g., having common data, being geographically close).
- Random topology: a protocol for maintaining a random graph. Links go to neighbor peers selected at random.

Hierarchical P2P
- Unstructured P2P does not scale as the network grows: flooding a request will overload the entire network.
- A broker that collects resource usage for peers in each other's proximity makes it possible to quickly select a peer with sufficient resources.
- Superpeers: peers that maintain an index for all peers in their groups and act as brokers.
- Figure: regular peers attached to superpeers, which are connected in a superpeer network.
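Flooding search and its bandwidth cost can be made concrete with a small simulation. The overlay graph, the hop limit (`ttl`), and the message-counting convention below are all assumptions for illustration; the slides do not specify a hop limit or a topology.

```python
from collections import deque
from typing import Deque, Dict, List, Set, Tuple


def flood_search(
    overlay: Dict[str, List[str]],
    files: Dict[str, Set[str]],
    origin: str,
    filename: str,
    ttl: int,
) -> Tuple[List[str], int]:
    """Flood a query from origin; return (peers that reported a hit, messages sent)."""
    seen = {origin}
    queue: Deque[Tuple[str, int]] = deque([(origin, ttl)])
    hits: List[str] = []
    messages = 0
    while queue:
        peer, remaining = queue.popleft()
        if filename in files.get(peer, set()):
            hits.append(peer)
        if remaining == 0:
            continue  # hop limit reached: do not forward further
        for neighbor in overlay[peer]:
            messages += 1  # every forwarded copy costs bandwidth, even duplicates
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, remaining - 1))
    return hits, messages


# Hypothetical overlay: each peer only knows its direct neighbors.
overlay = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D", "F"],
    "F": ["E"],
}
files = {"E": {"X"}}

found_ttl3 = flood_search(overlay, files, "A", "X", ttl=3)
found_ttl2 = flood_search(overlay, files, "A", "X", ttl=2)
```

In this toy graph, nine messages are needed to find a single file three hops away, and a smaller hop limit misses it entirely, which illustrates why flooding scales poorly.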

Summary
Architectures for distributed systems
- Architectural styles: layered, object-based, event-based, data-centered
- System architectures
  - Centralized architecture: client-server model
  - Decentralized architecture: vertical/horizontal distribution of the C/S model
- Peer-to-peer systems: hybrid P2P, structured P2P, unstructured P2P, hierarchical P2P