Overlay and P2P Networks. Advanced topics and summary. Prof. Sasu Tarkoma

Similar documents
Overlay and P2P Networks. Introduction. Prof. Sasu Tarkoma

Overlay and P2P Networks. Introduction. Prof. Sasu Tarkoma

Scalable overlay Networks

Overlay and P2P Networks. Unstructured networks. Prof. Sasu Tarkoma

Overlay and P2P Networks. Unstructured networks. Prof. Sasu Tarkoma

Overlay and P2P Networks. Introduction. Prof. Sasu Tarkoma

Overlay and P2P Networks. Unstructured networks. PhD. Samu Varjonen

Scalable overlay Networks

Scalable overlay Networks

Overlay and P2P Networks. Unstructured networks: Freenet. Dr. Samu Varjonen

CIS 700/005 Networking Meets Databases

Page 1. How Did it Start?" Model" Main Challenge" CS162 Operating Systems and Systems Programming Lecture 24. Peer-to-Peer Networks"

CS555: Distributed Systems [Fall 2017] Dept. Of Computer Science, Colorado State University

Lecture 6: Overlay Networks. CS 598: Advanced Internetworking Matthew Caesar February 15, 2011

Lecture 2: January 24

Overlay and P2P Networks. Structured Networks and DHTs. Prof. Sasu Tarkoma

Peer-to-Peer Internet Applications: A Review

Introduction to Peer-to-Peer Systems

Today. Why might P2P be a win? What is a Peer-to-Peer (P2P) system? Peer-to-Peer Systems and Distributed Hash Tables

Today. Architectural Styles

Peer-to-Peer Systems. Network Science: Introduction. P2P History: P2P History: 1999 today

DISTRIBUTED COMPUTER SYSTEMS ARCHITECTURES

CSCI-1680 P2P Rodrigo Fonseca

Peer-to-peer systems and overlay networks

Today. Architectural Styles

Department of Computer Science Institute for System Architecture, Chair for Computer Networks. File Sharing

CS555: Distributed Systems [Fall 2017] Dept. Of Computer Science, Colorado State University

Peer-to-Peer Systems and Distributed Hash Tables

Architectures for distributed systems (Chapter 2)

P2P Network Structured Networks: Distributed Hash Tables. Pedro García López Universitat Rovira I Virgili

CS514: Intermediate Course in Computer Systems

A Survey of Peer-to-Peer Content Distribution Technologies

EE 122: Peer-to-Peer Networks

Overlay and P2P Networks. Introduction and unstructured networks. Prof. Sasu Tarkoma

Architectures for Distributed Systems

Overlay and P2P Networks. Structured Networks and DHTs. Prof. Sasu Tarkoma

Peer-To-Peer Techniques

March 10, Distributed Hash-based Lookup. for Peer-to-Peer Systems. Sandeep Shelke Shrirang Shirodkar MTech I CSE

Making Gnutella-like P2P Systems Scalable

Peer-to-Peer (P2P) Systems

EECS 122: Introduction to Computer Networks Overlay Networks and P2P Networks. Overlay Networks: Motivations

Flooded Queries (Gnutella) Centralized Lookup (Napster) Routed Queries (Freenet, Chord, etc.) Overview N 2 N 1 N 3 N 4 N 8 N 9 N N 7 N 6 N 9

Scalability In Peer-to-Peer Systems. Presented by Stavros Nikolaou

Overlay networks. Today. l Overlays networks l P2P evolution l Pastry as a routing overlay example

Overlay and P2P Networks. Applications. Prof. Sasu Tarkoma and

Indirect Communication

Telecommunication Services Engineering Lab. Roch H. Glitho

Distributed Hash Tables

CompSci 356: Computer Network Architectures Lecture 21: Overlay Networks Chap 9.4. Xiaowei Yang

Advanced Computer Networks

Motivation for peer-to-peer

Distributed Hash Tables

L3S Research Center, University of Hannover

Content Overlays. Nick Feamster CS 7260 March 12, 2007

Course Curriculum for Master Degree in Network Engineering and Security

15-744: Computer Networking P2P/DHT

Telematics Chapter 9: Peer-to-Peer Networks

Peer-to-Peer Protocols and Systems. TA: David Murray Spring /19/2006

Peer-to-peer computing research a fad?

Computer Security. 14. Blockchain & Bitcoin. Paul Krzyzanowski. Rutgers University. Spring 2019

Electronic Payment Systems (1) E-cash

FutureNet IV Fourth International Workshop on the Network of the Future, in conjunction with IEEE ICC

internet technologies and standards

Problems in Reputation based Methods in P2P Networks

Peer- Peer to -peer Systems

Peer-to-Peer Systems. Chapter General Characteristics

Peer to Peer Networks

Overlay Networks: Motivations. EECS 122: Introduction to Computer Networks Overlay Networks and P2P Networks. Motivations (cont d) Goals.

CS 640 Introduction to Computer Networks. Today s lecture. What is P2P? Lecture30. Peer to peer applications

CS 268: DHTs. Page 1. How Did it Start? Model. Main Challenge. Napster. Other Challenges

*Adapted from slides provided by Stefan Götz and Klaus Wehrle (University of Tübingen)

LECT-05, S-1 FP2P, Javed I.

Searching for Shared Resources: DHT in General

Searching for Shared Resources: DHT in General

Dynamo. Smruti R. Sarangi. Department of Computer Science Indian Institute of Technology New Delhi, India. Motivation System Architecture Evaluation

The Design and Implementation of a Next Generation Name Service for the Internet (CoDoNS) Presented By: Kamalakar Kambhatla

Indirect Communication

Distributed Knowledge Organization and Peer-to-Peer Networks

Scalable P2P architectures

Distributed Systems. 16. Distributed Lookup. Paul Krzyzanowski. Rutgers University. Fall 2017

Early Measurements of a Cluster-based Architecture for P2P Systems

CS 43: Computer Networks BitTorrent & Content Distribution. Kevin Webb Swarthmore College September 28, 2017

Security for Structured Peer-to-peer Overlay Networks. Acknowledgement. Outline. By Miguel Castro et al. OSDI 02 Presented by Shiping Chen in IT818

Horizontal or vertical scalability? Horizontal scaling is challenging. Today. Scaling Out Key-Value Storage

Introduction to P2P Computing

Lecture 8: Application Layer P2P Applications and DHTs

Peer to Peer Networks

Distributed Systems. 17. Distributed Lookup. Paul Krzyzanowski. Rutgers University. Fall 2016

L3S Research Center, University of Hannover

LECTURE 3: CONCURRENT & DISTRIBUTED ARCHITECTURES

A Case For OneSwarm. Tom Anderson University of Washington.

P2PNS: A Secure Distributed Name Service for P2PSIP

Large-Scale Data Stores and Probabilistic Protocols

Distributed Hash Table

CS 3516: Advanced Computer Networks

Chord : A Scalable Peer-to-Peer Lookup Protocol for Internet Applications

Peer to Peer Systems and Probabilistic Protocols

An Expresway over Chord in Peer-to-Peer Systems

Addressed Issue. P2P What are we looking at? What is Peer-to-Peer? What can databases do for P2P? What can databases do for P2P?

GNUnet Distributed Data Storage

Transcription:

Overlay and P2P Networks Advanced topics and summary Prof. Sasu Tarkoma 19.2.2014

Contents Advanced topics Security and BitCoin Bloom filters Publish/subscribe Synthesis

Key security issues Security is a weak point of many overlays and DHTs Not an issue with centrally managed system, but a significant concern for decentralized systems Freenet was our key example of a secure P2P system Also Skype with centralized login servers Overlays are vulnerable to the sybil attack One entity presents multiple identities for malicious intent.

Sybil attack Sybil attack (John Doceur, IPTPS '01) Problem: Malicious nodes overwhelm the network with identities Resolution: Without a central authority that certifies identifies (binding real-world person to nodeid) no realistic approach exists to completely stop the sybil attack With certain assumption sybil attacks can be mitigated Geometric distinctness certification (Bazzi & Konjevod, PODC 2005) Detect clock skew of the devices (Kohno, Broido, and claffy, UCSD, IEEE S&P 2005)

Eclipse attack A group of malicious nodes tries to dominate the neighbor set Start with Sybil and then neighbour discovery Adds malicious node identifiers to neighbour table For example: pick ids close to the target id Result: network partitions Solutions: Remember Freenet node identifier creation (the commitment protocol). Baseline: do not allow arbitary choice of node id.

Security Considerations Malicious nodes Attacker floods DHT with data Attacker returns incorrect data Attacker denies data or supplies incorrect routing info Basic solutions: rate limits and proof-of-work Self-authenticating data Redundancy k-redundant networks What if attackers have quorum? Need a way to control creation of node Ids Solution: secure node identifiers Use public keys

Security summary Wide-area overlay networks are vulnerable to sybil and eclipse attacks A production system typically utilizes centralized authentication Skype login servers Decentralized systems Freenet: verification to protect content, darknet for better protection Bitcoin: verification processes, proof-of-work If managed by one organization security becomes easier Dynamo: no security considerations

BitCoin Form of decentralized cryptographic currency Transaction system similar to credit card/google wallet Created by pseudonymous Satoshi Nakamoto in 2008 Maximum amount of bitcoins is 21 million Payment is broadcast in a signed message These are handled as blocks every ten minutes Added to public transaction record (block chain) Voting on transaction authentication Mining process for authenticating transactions and Discovering new bitcoins Validation based on proof of work Avoiding Sybil attack

BOB s private key Digital signature BOB s public key BOB Transaction BitCoin network ALICE ALICE s public key Transaction is signed by BOB so that the network can verify the transfer to ALICE. The signature is verified by BOB s public key. Signatures form a chain: previous owner signs and gives ownership to the new owner (public key). In our example BOB gives ownership to ALICE.

10 minutes of transactions are broadcast to miners Transactions Miners Block Miners create a block Block puzzle Miner solves a puzzle Proof of work Miner broadcasts solution to other miners Miners Miners verify the proof of work Verification Process starts again

BitCoin Network Steps from Satoshi s paper: 1) New transactions are broadcast to all nodes. 2) Each node collects new transactions into a block. 3) Each node works on finding a difficult proof-of-work for its block. 4) When a node finds a proof-of-work, it broadcasts the block to all nodes. 5) Nodes accept the block only if all transactions in it are valid and not already spent. 6) Nodes express their acceptance of the block by working on creating the next block in the chain, using the hash of the accepted block as the previous hash.

BitCoin Network A nonce in a block is incremented until the block s hash gives the required number of prefix zero bits (typical proof-of-work from Hashcash) Time to compute hash collision is exponential with the number of zero bits This is easy to verify, but any modification to block requires recomputation Blocks are chained and hence transaction history is relatively safe BitCoin increases the difficulty to keep block generation constant

BitCoin Anonymity BitCoin is not anonymous The block chain is public and contains public keys Transactions can be traced with two networks: The transaction network represents the flow of Bitcoins between transactions over time t1 and t2 à t3 à t4 The user network represents the flow of Bitcoins between users over time u1 à u3 à u3 and u4 If user to public key mapping is known, it is possible to trace user s transactions Reference: http://arxiv.org/pdf/1107.4524.pdf

A Short Primer on Bloom Filters

Bloom filters in Gnutella v0.7 Bloom filters are probabilistic structures used to store dictionaries A bit-vector that supports constant time querying of keywords Easy to merge two filters Many variants If space is at premium Number of hash functions (k) Size of filter (m) Number of elements in the inserted set (n) Decrease Less computation Higher false positive rate Smaller space requirements Higher false positive rate Lower false positive rate Increase More computation Lower false positive rate More space is needed Lower false positive rate Higher false positive rate

Bloom Filters Source: Tarkoma et al. Theory and Practice of Bloom Filters for distributed Systems. 2013.

Example Bloom filter x y z 0 1 0 1 1 1 0 0 0 0 0 1 0 1 0 0 1 0

BF False positive probability is given by: ( 1 ( 1 1 m) kn ) k ( 1 e kn/m) k. kn/m Bit not All set bit to one positions are set to After one inserting for k hash n elements functions with k opt = m k hash 9m functions ln 2 n 13n. the bit is still zero Optimal number of hash functions k: Size of filter given optimal number of hash functions: ivenby ( ) m = n ln p (ln 2) 2. Details in the survey paper available on course page.

BF False positive probability is given by: ( 1 ( 1 1 m) kn ) k ( 1 e kn/m) k. kn/m Optimal number of hash functions k: k opt = m n ln 2 9m 13n. Size of filter given optimal number of hash functions: ivenby ( ) m = n ln p (ln 2) 2. Details in the survey paper available on course page.

If false positive rate is fixed, the filter size grows linearly with inserted elements.

Filter Key feature C D P FN Output Standard Bloom filter Is element x in set S? N N N N Boolean Adaptive Bloom filter Frequency by increasing number of hash functions Y N N N Boolean Bloomier filter Frequency and function value Y N N N Freq., f(x) Compressed Bloom filter Compress filter for transmission N N N N Boolean Counting Bloom filter Element frequency queries and deletion Y Y N M Boolean or freq. Decaying Bloom filter Time-window Y Y N N Boolean Deletable Bloom filter Probabilistic element removal N Y N N Boolean Distance-sensitive Bloom filters Is x close to an item in S? N N N Y Boolean Dynamic Bloom filter Dynamic growth of the filter Y Y N N Boolean Filter Bank Mapping to elements and sets Y Y M N x, set, freq. Generalized Bloom filter Two set of hash functions to code x with 1s and 0s N N N Y Boolean Hierarchical Bloom filter String matching N N N N Boolean Memory-optimized Bloom filter Multiple-choice single hash function N N N N Boolean Popularity conscious Bloom filter Popularity-awareness with off-line tuning N N Y N Boolean Retouched Bloom filter Allow some false negatives for better false positive rate N N N Y Boolean Scalable Bloom filter Dynamic growth of the filter N N N N Boolean Secure Bloom filters Privacy-preserving cryptographic filters N N N N Boolean Space Code Bloom filter Frequency queries Y N M N Frequency Spectral Bloom filter Element frequency queries Y Y N M Frequency Split Bloom filter Set cardinality optimized multi-bf construct N N N N Boolean Stable Bloom filter Has element x been seen before? N Y N Y Boolean Variable-length Signature filter Popularity-aware with on-line tuning Y Y Y Y Boolean Weighted Bloom filter Assign more bits to popular elements N N Y N Boolean Source: Tarkoma et al. Theory and Practice of Bloom Filters for distributed Systems. 2013.

Source: Tarkoma et al. Theory and Practice of Bloom Filters for distributed Systems. 2013.

Worked Example Gnutella uses Bloom filters to store and disseminate keyword indexes 1-hop replication in the flat ultranode layer, much improved design over flooding An ultrapeer maintains about 30 leafs and thus 30 Bloom filters, one for each leaf One leaf has about 1000 keywords in our example Assuming false positive rate of 0.1, for 1000 keywords we need 4793 bits. For 30 000 keywords we need 143 776 bits. There is overlap (some keywords are popular)! Gnutella uses 2**16 (65536) bits that is sufficient even when aggregating leaf BFs Experiments report ultrapeer BF 65% full and leaf BF 3% full Today having hundreds of KBs in the BF is not a problem, Gnutella design is old and today s networks are much faster Linear growth if p is fixed! ivenby m = n ln p (ln 2) 2.

Pub/Sub Systems

Event-based Systems and Publish/subscribe Event delivery from publishers to subscribers Event is a message with content One-to-many, many-to-many Builds on messaging systems and store-and-forward A frequently used communication paradigm Decoupling in space and time Solutions from local operation to wide-area networking Proposed for mobile/pervasive computing The event service is a logically centralized service Basic primitives: subscribe, unsubscribe, publish Various routing topologies and semantics

A) Space decoupling Publish Notify Subscriber Publisher Event Notification Service Notify Subscriber Publisher Publish B) Time decoupling Subscriber Event Notification Service Publisher Buffering Notify Subscriber C) Sychronization decoupling Publisher Publish Subscriber Event Notification Service Buffering Notify

Subscriptions Subscriptions are described using filters Filter: a stateless Boolean function Defines a subspace of the content space A single event is a point in the content space Selects a subset of published events Expressive interest definition and contentmatching Content model is typically typed tuples or XML

Pub/Sub Service Subscriber Notification Consumer Receives notifications Subscriptions Notify Notification Engine Matches and sends notification to the appropriate consumers Notifications message instances Publisher Situation Subscription Manager Subscription management on behalf of the Publisher Subscriptions

Event Systems I Traditional MoM systems are message queue based (one-to-one) Event systems and publish/subscribe are one-to-many or many-tomany One object monitors another object Reacts to changes in the object Multiple objects can be notified about changes Events address problems with synchronous operation and polling In distributed environments a logically centralized service mediates events anonymous communication expressive semantics using filtering

Event Systems II Push versus Pull May be implemented using RPC, unicast, multicast, broadcast,.. Three main patterns Observer design pattern Used in Java / Jini Notifier architectural pattern Used by many research systems Event channel Used in CORBA Event/Notification Service Filtering improves scalability / accuracy Research topic: content-based routing

Event Message Flooding Subscription Flooding 2 F: X<4 1 F: X<4 1 2 X=2 X=2

Filter propagation Rendezvous routing F: X<4 F: X<4 X=2 1 X=2 3 2 3 2 2 1 2

Example of Siena-style Content-based Routing Publisher I want to publish information C Overlay Routing infrastructure I want information that matches my interests. A Subscriber B

Tuple Spaces Tuple-based model of coordination The shared tuple space is global and persistent Communication is decoupled in space and time implicit and content-based Primitives In, atomically read and removes a tuple Rd, non-destructive read Out, produce a tuple Eval, creates a process to evaluate tuples Examples: Linda, Lime, JavaSpaces, TSpaces

Research Examples

Pub/sub on network layer Idea to build a new network architecture based on pub/sub PSIRP and PURSUIT projects at HIIT Open source implementation Connect publishers and subscribers through a rendezvous interconnect (overlay) In-packet Bloom filters for stateless forwarding elements 38

Scope-specific Rendezvous core Scope X home Rendezvous Network D 1. Scope X is advertised in the Rendezvous interconnect pointer for scope X is stored in a rendezvous no Hierarchical DHT Rendezvous interconnect Scope X home Local Rendezvous Network A Local Rendezvous Network B Local Rendezvous Network C 2. A set of potential Rids for a scope is Advertised in the HRN of the scope Topology node Topology node Topology node 3.Subscriber rendezvous with the Sid: Rid 4. Join message along control links sets up a graphlet and accumulates the needed Fids Service container Control node Control node Control node Client Forwarding node Forwarding node Forwarding node Forwarding node Forwarding node Domain A Domain B Domain C 5. Payload communication takes place using Fids in the newly created graphlet

P2P Video-on-Demand delivery Joint work with Samuli Aalto and Pasi Lassila (Aalto University) Idea to add windowing mechanism to BitTorrent P2P Video-on-Demand analysis Implementation (cluster, PlanetLab) Simulation Analytical models (stochastic, fluid model) A model for understanding the evolution of the system Articles in Globecom 2008 and 2010, ITC 22, MAMA 2011, and ACM Perf. Eval. Review

Towards Optimal Keywordbased Content Dissemination in DHT P2P Networks Weixiong Rao Roman Vitenberg Sasu Tarkoma University of Helsinki, Finland {weixiong.rao, sasu.tarkoma}@cs.helsinki.fi University of Oslo, Norway romanvi@ifi.uio.no Faculty of Science Department of 4 1 December 2011

Keywords-based content dissemination in DHTs doc A B C D E High doc. publication cost A doc contains tens and even thousands of terms. Disseminating the doc to the home node of each term takes O(logN) hops The total hop count is O( d logn) Our contribution is reduction in the doc pub. cost. D E B C F A G H I K Home Node of Term K Real data sets show that a filter contains 2-3 J keywords. The registration cost is low. B E A B filter-2 DHT P2P networ ks filter-1

Comparisons

Unstructured or semi structured networks BitTorrent Freenet v0.7 Gnutella v0.4 Gnutella v0.7 Decentralization Centralized model Similar to DHTs, two modes (darknet and opennet), two tiers Foundation Tracker Keywords and text strings are used to identify data objects. Assumes small world structure for efficiency Routing function Tracker Clustering using node location and file identifier. Path folding optimization (opennet). Location swapping (darknet). Routing performance Guarantee to locate data, good performance for popular data Search based on Hop-To- Live, no guarantee to locate data. With small world property O(log(n) 2 ) hops are required, where n is the number of nodes. Flat topology (random graph), equal peers Flooding mechanism Flooding mechanism Search until Time-To-Live expires, no guarantee to locate data Random graph with two tiers. Two kinds of nodes, regular and ulta nodes. Ultra nodes are connectivity hubs Selective flooding using the ultra nodes Selective flooding mechanism Search until Time-To-Live expires, second tier improves efficiency, no guarantee to locate data Routing state Constant, choking may occur With small world property O(log(n) 2 ) Constant (reverse path state, max rate and TTL determine max state) Constant (regular to ultra, ultra to ultra). Ultra nodes have to manage leaf node state. Reliability Tracker keeps track of the peers and pieces No central point of failure Performance degrades when the number of peer grows. No central point. Performance degrades when the number of peer grows. Hubs are central points that can be taken out.

Structured networks CAN Chord Kademlia Koorde Pastry Tapestry Viceroy Foundation Multi-dimensional space (ddimensional torus) Circular space XOR metric de Bruijn graph Plaxton-style mesh Plaxton-style mesh Butterfly network Routing function Maps (key,value) pairs to coordinate space Matching key and nodeid Matching key and nodeid Matching key and nodeid Matching key and prefix in nodeid Suffix matching Routing using levels of tree, vicinity search System parameters Number of peers N, number of dimensions d Number of peers N Number of peers N, base of peer identifier B Number of peers N Number of peers N, base of peer identifier B Number of peers N, base of peer identifier B Number of peers N Routing performance O(dN 1/d ) O(log N) O(log B N) + small constant Between O(log log N) and O(log N), depending on state Routing state 2d log N Blog B N + B From constant to log N O(log B N) O(log B N) O(log N) 2Blog B N log B N Constant Joins/leaves 2d (log N) 2 log B N + small constant log N log B N log B N log N

Summary of Datastores and Related Systems Name Type Data placement Dynamo 1-hop DHT Keyà node, consistent hashing Chord PAST Wide-area DHT Wide-area DHT Keyà node, based on consistent hashing Keyà node, based on Plaxton mesh Freenet P2P Keyà closest node, approximate small world routing table Data can be located Yes Replication Yes, clockwise nodes Stabilization Eventual consistency: anti-entropy, gossip Yes Add-on Stabilization mechanism Yes No guarantee Yes, close-by nodes Yes, reversepath Part of Pastry routing Path folding (opennet) and location swapping (darknet) Gnutella P2P Client No guarantee No No (ultranode layer in principle)

Current research involving overlays Virtualized cloud systems and data centers Cluster-based data analytics and processing Information centric networking (ICN) Software defined networking (SDN) Social networks Data dissemination (live video, subscriptions)

Overview of current landscape

Google CDN The Cloud Yahoo Planetlab Amazon

DB SDN P2P/ DHT App DHT DHT Google P2P- CDN DB SDN CDN The Cloud DHT DHT Yahoo DB CH SDN P2P/ DHT App DHT DHT DHT PlanetLab Amazon DHT DB SDN DB DHT SDN DHT P2P/ DHT App Services span multiple sites and regions

SDN is about control/data separation and management of networks Enabler for new network applications ICN is a fundamental change to networking: named data items as first class citizens CDN-like behaviour in the network OpenFlow, Controllers, PSIRP, CCN, SDN ICN Virtualization Overlays Clean-slate

Synthesis We have covered building blocks for decentralized distributed systems Unstructured: Bittorrent, flooding, limited flooding, Bloom filter summaries, depth-first search with backtracking, layered Structured: consistent hashing and DHTs, small world and location swapping, network proximity, content placement, rendezvous Security: asymmetric and symmetric techniques, Sybil attack Additional techniques: gossip, anti-entropy, vector clocks, Applications: Search, file exchange, secure and high performance storage, multicast, DNS replacement, CDN, Future Internet?

Rendezvous, replication, multicast, Titfor-tat Track er Layered Bloom Filt. Limited flooding Flooding Node id clustering Location swapping Depth first Ring Hybrids Trees/ cubes Consistent hashing Proximity Coral Kademlia XOR BT Gnutella Freenet DHTs Random Unstructured Small world Structured N/A Internet (TCP/IP)

Consistent hashing alleviates network problems and eventual consistency can be achieved through replication and synchronization Example: Dynamo Replication, Gossip, etc. Selective flooding Consistent hash (O(1) DHT) Good for arbitrary data and search functions, can aggregate routing info, structure improves scalability Examples: Gnutella and Freenet Limited flooding / depth first / Bloom filters Example: BitTorrent Tracker Good for name/value data, note flat address space, one node is responsible, churn is a concern Examples: Lookup: Chord, CAN, Kademlia Storage: PAST Rendezvous: Scribe (for multicast), i3 DHT Search Storage Rendezvous Search Storage Rendezvous Search Storage Rendezvous Cluster Wide-area (unstructured) Internet (TCP/IP) Wide-area (structured)

Discussion Overlays as fundamental building blocks Web backends Distributed storage systems Cloud systems CDNs Network management platforms Cellular network control planes CDN federation is now in progress SDN federation? DHT federation?

Grading Course grading will be based on the final exam and the assignments/exercises. Course exam 4.3.2015 16:00 B123 Separate exam on 24.4.2015 16:00 in B123 You can use the exercise score with the first separate exam

Exam material: additional articles Article: G. DeCandia et al. Dynamo: Amazon s Highly Available Key-value Store. SOSP 2007. Article: Theory and Practice of Bloom Filters for Distributed Systems. IEEE Surveys and Tutorials. 2012. Slideset: Modeling and Analysis of Anonymous-Communication Systems. J. Feigenbaum. Presentation at WITS 2008. General idea of MIX and Onion Routing are part of the exam material. Article: R. Cohen et al. Resilience of the Internet to Random Breakdowns. Phys. Rev. Lett. 85, 4626 4628 (2000). Background info and derivation of the resiliency formula (power law lectures). Additional details not part of exam material. Article: D. Kreutz, F. M. V. Ramos, P. Verissimo, C. E. Rothenberg, S. Azodolmolky, S. Uhlig. Software-defined Networking: A Comprehensive Survey. Sections I,II, V and VI are part of the exam material. Proceedings of the IEEE, Vol. 103, Issue 1, Jan. 2015.

Exam Four essay questions, answer three, 18 max points, max 6 points per answer (4 bonus points from exercises) Example questions: Q1: A) Compare Gnutella and Kademlia. What are the key differences and similarities of the systems. B) Consider the structure and routing functions of the systems. C) How is content placed on the overlay nodes in these two systems? Q2: A) What problem does BitTorrent solve? B) Compare BitTorrent to HTTP. C) What are the key metrics in BitTorrent data delivery? Briefly consider the significance of each metric.

Main theme Prerequisites Approaches learning goals Meets learning goals Deepens learning goals Overlay and peerto-peer networks: definitions and systems Basics of data communications and distributed systems (Introduction to Data Communications, Distributed Systems) Knowledge of how to define the concepts of overlay and peer-topeer networks, and state their central features Ability to describe at least one system in detail Ability of being able to compare different overlay and p2p networks in a qualitative manner Ability to assess the suitability of different systems to different use cases Ability to give one s own definition of the central concepts and discuss the key design and deployment issues Distributed hash tables Basics of data communications and distributed systems (Introduction to Data Communications, Distributed Systems) Big-O-notation and basics of algorithmic complexity Knowledge of the concepts of structured and unstructured networks and the ability to classify solutions into these two categories Knowledge of the basics of distributed hash tables Ability to describe at least one distributed hash table algorithm in detail Ability of being able to compare different distributed hash table algorithms Ability of designing distributed hash table-based applications Knowledge of key performance issues of distributed hash table systems and the ability to analyze these systems The knowledge of choosing a suitable distributed hash table design for a problem Familiarity with the state of the art Reliability and performance modelling Content distribution Basics of probability theory Basics of reliability in distributed systems Introduction to Data Communications Ability to model and assess the reliability of overlay and peer-topeer networks by using probability theory Knowledge of the most important factors pertaining to reliability Knowledge of the basic content distribution solutions Ability to describe at least one overlay and p2p network based content distribution solution Security Basics of computer security Knowledge of the basic security issues with overlay and p2p networks Knowledge of the sybil attack concept Ability of analytically analyzing the reliability and performance of overlay and peer-to-peer networks Understanding of the design issues that are pertinent for reliable systems Knowledge of different content distribution systes and the ability to compare them in detail Knowledge of several content distribution techniques Ability to discuss how security problems and limitations can be solved Knowledge of how to prevent sybil attacks Familiarity with the state of the art Familiarity with the state of the art Knowledge of how to prevent sybil attacks Familiarity with the state of the art

Course Development General feedback More hands-on assignments? Separate programming task for more credits? MOOC based questions?

Have a nice Spring! Remember course feedback