A Survey on Peer-to-Peer File Systems


Christopher Chang
May 10, 2007
CSE 598D: Storage Systems
Department of Computer Science and Engineering
The Pennsylvania State University

Introduction

Demand for information through the Internet has skyrocketed in the past decade. More and more people are beginning to use the Internet as a storage and transport medium. A number of protocols have been created to facilitate the storage and transfer of this data. Since we are currently limited by the speed of the networks, we need efficient storage techniques that allow users to send and receive data at an acceptable rate. The simple, traditional way to do this is to have a server that serves the clients, which are the users. However, this can put heavy load on the server and impose heavy bandwidth requirements. One method of alleviating this problem is peer-to-peer communication, in which every peer or user acts as both a client and a server. I believe the popularity of this type of system will only increase as the Internet becomes more heavily used. Because bandwidth is limited, being able to spread the load and processing efficiently is necessary to improve performance.

This survey starts with the goals of a peer-to-peer file system, continues with a discussion of the Ivy file system, the PAST storage utility and the Samsara enforcement technique, and ends with a conclusion.

Peer-to-Peer File System

A peer-to-peer network relies on the nodes in the network to store and move files around. In the traditional client-server model, a small number of servers hold the files that clients access. In a peer-to-peer network, every node acts as both a client and a server, so there is no separate concept of a client or a server. Instead, the nodes are referred to as peers, and they communicate with each other to make and service requests.
Nodes store their own data as well as data from other nodes. Each object is replicated across a set of nodes, providing fault tolerance if a node fails. Nodes enter and leave without administrative action, allowing for a decentralized, self-organizing and scalable architecture. The power of the network therefore depends primarily on the computing power and bandwidth of its peers. This philosophy is becoming widely accepted today because the set of nodes on the Internet is always changing: people are constantly connecting and disconnecting, and a technique like this can adapt to those changes. You can compare this to a RAID-1 storage system, in which data is replicated to increase both parallelism and availability. It is also possible to split files into chunks, or stripes as they are called in RAID systems, in a peer-to-peer network.

One of the main objectives of a p2p network is to relieve the load placed on a limited number of servers that are accessed by many more clients. By spreading the responsibility across every user in the network, we can distribute the load more equally and possibly provide better performance. Although this approach requires that the peers provide some of their bandwidth, disk space and CPU time, the network automatically becomes more scalable. A server that can only handle a fixed number of clients at one time will be overloaded by a bigger burst, and performance will drop. In a peer-to-peer network, however, as more peers connect and demand increases, the capacity and capability of the system also increase.

In the ideal case, peers contribute at least as much as they consume. A problem arises when a peer refuses to contribute its resources and takes more than it gives back. Since there usually aren't strict rules in place, peers can often get away with this by leaving the network or by not servicing requests when they come in. This issue is addressed by a technique called Samsara, which is discussed later. Another advantage of peer-to-peer networks is increased availability, because data is often replicated over several peers. In case of a failure, the data can be recovered from another node that holds a replica.

Ivy File System

Ivy is a peer-to-peer file system designed by researchers at MIT. This read/write system provides integrity properties that relax the need to trust the peers. It uses a distributed hash table (DHT) to store logs that contain file data and metadata, and it provides the close-to-open semantics of NFS. Designed as a log-based file system, it provides a number of integrity properties that help enforce security. Data can be found using the table, but modifications are only made to the user's own log, allowing consistency without locking.

Logs and the DHT

Ivy consists of a set of logs, one for each participant. Logs can be read by everyone but modified only by their owner. These logs are stored in a distributed hash table, which does not require a dedicated server. Each participant also maintains a log-head that points to its most recent log record. The distributed peer-to-peer hash table is used to map keys from these logs to arbitrary values.
These key-value pairs are stored on different Internet hosts determined by the hash of the key. The hash table provides two types of integrity-checked blocks: content-hash blocks and public-key blocks. For the former, the block's key is the SHA-1 hash of the block's value. For the latter, the key is the owner's public key. Since the log-head is stored as a public-key block, its key does not change, which allows the participant to update its value without having to change the key. In theory, this guarantees read/write consistency at the expense of maintaining careful replication and updates.

Figure 1: View of the Ivy log system. White boxes are the content-hash blocks and the grey boxes are the public-key blocks.
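The two block types can be illustrated with a small sketch. It is not Ivy's implementation: `ToyDHT`, `append_record` and `read_log` are hypothetical names, the hash table is a local dictionary, signature checking on public-key blocks is omitted, and the record layout (previous key, a `|` separator, then the payload) is an assumption made only for illustration.

```python
import hashlib


class ToyDHT:
    """In-memory stand-in for a distributed hash table (hypothetical)."""

    def __init__(self):
        self.blocks = {}

    def put_content_hash(self, value: bytes) -> str:
        # Content-hash block: the key is the SHA-1 hash of the value.
        key = hashlib.sha1(value).hexdigest()
        self.blocks[key] = value
        return key

    def get_content_hash(self, key: str) -> bytes:
        value = self.blocks[key]
        # Any reader can check integrity by re-hashing the value.
        if hashlib.sha1(value).hexdigest() != key:
            raise ValueError("block does not match its content hash")
        return value

    def put_public_key(self, owner_key: str, value: bytes) -> None:
        # Public-key block: fixed key, mutable value (signature check omitted).
        self.blocks[owner_key] = value

    def get_public_key(self, owner_key: str) -> bytes:
        return self.blocks.get(owner_key, b"")


def append_record(dht: ToyDHT, owner_key: str, payload: bytes) -> str:
    """Store an immutable record and move the owner's log-head to it."""
    prev = dht.get_public_key(owner_key)  # key of the newest record, or b""
    key = dht.put_content_hash(prev + b"|" + payload)
    dht.put_public_key(owner_key, key.encode())
    return key


def read_log(dht: ToyDHT, owner_key: str) -> list:
    """Walk the log from the log-head back to the oldest record."""
    payloads = []
    key = dht.get_public_key(owner_key).decode()
    while key:
        prev, _, payload = dht.get_content_hash(key).partition(b"|")
        payloads.append(payload)
        key = prev.decode()
    return payloads
```

In this sketch the log-head keeps the same key across updates while every record gets a fresh key derived from its contents, so a reader can detect a modified record simply by re-hashing it.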

The log itself is a linked list of immutable log records, as shown in figure 1. Each record is a content-hash block and describes a single file system modification, similar to an NFS operation. The owner of a file and its permissions are stored, but they are not enforced by Ivy. Ivy relies on users to provide their own form of security for the data, such as encryption. Files and directories are identified by 160-bit i-numbers, which are stored in the log records that concern them.

Views

Ivy introduces a concept called a view. A view consists of the set of logs that make up the file system, as agreed upon by everyone in the system. A view-block points to all the log-heads in that view and represents the root directory of the file system. The file system is named by the view-block key, which allows the public keys of the participants to be verified. This approach provides integrity of the log records, but it requires that users trust the source of the file system name and also any new users that join the system.

Handling Requests

Requests that do not require an update can be satisfied in one pass through the log. For requests that do, Ivy scans the log, starting at the log-head, until it has gathered enough information to handle the request. This requires one or more passes through the log, followed by appending records to it. An appended record contains a description of the update along with the other required fields, and the log-head is updated to point to it. When a request comes in, the Ivy server checks all the logs to find the relevant information. The log records are ordered by sequence numbers and version vectors. Since the order obeys causality, all users should choose the same order. Ivy supports the basic file operations: file creation, file read, file name lookup (similar to open()), get file attributes, and directory listing. In addition, it has a file system creation operation that creates a new file system from scratch by creating a new log. The user mounts an Ivy file system as an NFS file system, using the root i-number as the NFS root handle.

Figure 2: Snapshot data structure.
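As a rough illustration of how version vectors let every participant arrive at the same causal order of log records, consider the sketch below. The record layout (`owner`, `vv`, `op`) and the tie-break used for concurrent records (sum of the vector entries, then owner name) are assumptions chosen for simplicity; the source only states that the order obeys causality, not how ties are broken.

```python
def compare_vv(a: dict, b: dict) -> str:
    """Classify two version vectors as equal, before, after or concurrent."""
    owners = set(a) | set(b)
    a_le_b = all(a.get(o, 0) <= b.get(o, 0) for o in owners)
    b_le_a = all(b.get(o, 0) <= a.get(o, 0) for o in owners)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"


def total_order(records: list) -> list:
    """Return records in a deterministic order that respects causality.

    If record x happened before record y, every entry of x's vector is
    <= the matching entry of y's and at least one is smaller, so x has a
    strictly smaller sum and is sorted first. Concurrent records are
    ordered by owner name (an assumed tie-break).
    """
    return sorted(records, key=lambda r: (sum(r["vv"].values()), r["owner"]))


records = [
    {"owner": "alice", "vv": {"zoe": 1, "alice": 1}, "op": "write /a"},
    {"owner": "mike", "vv": {"mike": 1}, "op": "create /b"},
    {"owner": "zoe", "vv": {"zoe": 1}, "op": "create /a"},
]
ordered = [r["op"] for r in total_order(records)]
```

Because the sort key depends only on the records themselves, any two participants holding the same set of records compute the same order without communicating.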

Snapshots

Participants periodically create a private snapshot of the entire state of the file system in order to improve performance. A snapshot avoids the need to traverse the entire log. Snapshots are stored in the distributed hash table to provide persistence. Each snapshot is built from the previous one. If no previous snapshot is available, a new one is either built from scratch, starting at the beginning of each log, or copied from another user's snapshot. The snapshot data structure is shown in figure 2, where H(x) is the content hash of x in the distributed hash table.

Semantics

Ivy provides write-read semantics in which updates are immediately visible, except during network partitions. It also offers NFS session semantics, also known as close-to-open semantics. This ensures that an application sees the data written by the last application that wrote and closed the file, and it avoids having to fetch the log-heads on every read. To execute concurrent operations, Ivy uses the agreed-upon ordering mentioned earlier. Exclusive creation of files and directories is implemented so that applications can use it to build locks. This guarantee does not hold across partitions. If the network is partitioned, Ivy allows updates in each partition, sacrificing consistency for availability. In this case, version vectors are used to provide consistency. Ivy also has a conflict resolution tool that manages concurrent writes to the same file in different partitions.

Ivy is a peer-to-peer system built around the log idea. It does not require the participants to trust one another, which makes it more flexible. To avoid locking mechanisms, each peer maintains its own log, along with a snapshot, to ensure integrity and improve performance. Although Ivy does not provide traditional file system semantics, conflict resolution tools are in place to handle the problems that arise as a result.
There are some references to NFS, as mentioned earlier. I believe this is mainly because p2p systems are very similar to distributed file systems, which require consistency and performance guarantees.

PAST

PAST is an Internet-based storage utility built on top of Pastry, a routing overlay for p2p systems. It provides persistent storage with high availability through caching and replication. Files are assigned identifiers that are used to manage routing and storage. It attempts to achieve high utilization by minimizing the routes for fetching data and by balancing the load on popular files. Pastry is a peer-to-peer routing scheme that works in conjunction with PAST's storage architecture. It routes a message to the node whose nodeid is closest to the 128 most significant bits of the fileid. These two identifiers are discussed below.

Identifiers

Nodes and files are assigned uniformly distributed identifiers (nodeid and fileid respectively). Each node is connected to the Internet and can initiate or route requests. New files can be replicated across nodes to increase availability. A node can be any host on the Internet that has the supporting software installed. When a node connects, it is assigned a 128-bit nodeid. When a request to add a new file comes in, the file is stored on the k PAST nodes whose nodeids are closest to the 128 most significant bits of the fileid. The value of k is chosen based on the availability requirements of the file.

Handling Requests

To insert a new file into the system, a fileid is first computed as the SHA-1 hash of the file name, the owner's public key and a random number. A file certificate is created as well, signed with the owner's private key. The file and certificate are then routed to the node whose nodeid is closest to the fileid. That node is responsible for forwarding copies to the remaining nodes. When a lookup request is received, the fileid is used to find a node that holds the file. Once the file and certificate are returned, the request message is routed no further. To remove a file from the system, the owner issues a reclaim certificate signed with the owner's private key. This certificate is sent to the k nodes that hold replicas of the file. Once they verify the certificate, they return a reclaim receipt to acknowledge the operation.

Storage Management

The storage management strategy is responsible for balancing the distribution of free space as the system as a whole starts to fill up. At the same time, it has to maintain the invariant that copies of each file are held by the k nodes closest to the fileid. Because these two responsibilities conflict, PAST resolves them with two mechanisms: replica diversion and file diversion. Replica diversion balances free space among the nodes in the same leaf set. It allows a node that is not one of the k numerically closest nodes to the fileid to store the file. File diversion balances free space among different parts of the nodeid space. When a leaf set is reaching its maximum storage capacity, a file is diverted to a different part of the nodeid space by regenerating the random number in its fileid. This is retried up to three times; if the fourth attempt also fails, an insertion failure is returned to the client.

Caching

Caching is done to improve performance when necessary (e.g., when a file is highly popular). Cached data is stored in the free space available on the nodes, so a node's cache size changes constantly with the amount of free space it has. A file involved in an insert or lookup operation is cached when its size is less than a configurable fraction c of the node's current cache size. When the space is needed for actual data, the cached data is removed.

PAST is a p2p storage utility that assigns identifiers to the peers and files in the system. These identifiers determine where files are located and, with the help of Pastry, allow efficient routing. PAST also has a storage manager that balances the available space across the system and a caching mechanism that helps improve performance. In addition to the load distribution this provides, PAST offers availability through the k replicas of each file stored on different nodes. Since nodes with adjacent nodeids are usually geographically diverse, the odds of them all leaving the system at once are very low. This distribution of data therefore allows for high availability. Once again, we see the desire for reliability and performance, two very important aspects of distributed file systems.
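The insertion path described under Handling Requests and Storage Management can be sketched as follows. This is a simplified model, not PAST's code: the function names are hypothetical, node closeness is taken as circular distance in the 128-bit nodeid space, replica diversion and certificates are left out, and an insert is assumed to succeed only when all k closest nodes have room.

```python
import hashlib
import os

NODE_BITS = 128
FILE_BITS = 160  # a SHA-1 digest is 160 bits long


def make_file_id(filename: str, owner_public_key: bytes, salt: bytes) -> int:
    """fileid = SHA-1(file name, owner's public key, random salt)."""
    digest = hashlib.sha1(filename.encode() + owner_public_key + salt).digest()
    return int.from_bytes(digest, "big")


def routing_key(file_id: int) -> int:
    """Keep only the 128 most significant bits of the 160-bit fileid."""
    return file_id >> (FILE_BITS - NODE_BITS)


def k_closest(node_ids, file_id: int, k: int) -> list:
    """Return the k nodeids closest to the fileid's routing key."""
    target = routing_key(file_id)
    space = 1 << NODE_BITS

    def distance(node_id: int) -> int:
        d = abs(node_id - target)
        return min(d, space - d)  # assumed circular distance

    return sorted(node_ids, key=distance)[:k]


def insert_file(filename, owner_public_key, size, free_space, k, max_attempts=4):
    """Try to place k replicas; each retry picks a fresh salt (file diversion).

    free_space maps nodeid -> free bytes. Returns (fileid, replica nodeids),
    or None once the fourth attempt has failed.
    """
    for _ in range(max_attempts):
        salt = os.urandom(8)
        file_id = make_file_id(filename, owner_public_key, salt)
        replicas = k_closest(list(free_space), file_id, k)
        if all(free_space[n] >= size for n in replicas):
            for n in replicas:
                free_space[n] -= size
            return file_id, replicas
    return None
```

Changing the salt changes the fileid, which moves the target to an unrelated region of the nodeid space; that is what lets a retry escape a leaf set that is nearly full.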

Samsara

The users in a peer-to-peer file system are assumed to contribute their fair share of bandwidth relative to their consumption rates. However, this is highly unlikely without some type of enforcement mechanism. Samsara attempts to provide one without the need for a trusted third party or a centralized infrastructure. There is more demand for storage than supply, so each node's consumption must be kept less than or equal to its contribution. Thieves, or cheaters, are identified and punished by having their files discarded. Without this balance, the system will fall apart. Figure 3, shown below, portrays the relationship between Samsara and Pastry, which was described earlier. Samsara is meant to lie on the same layer.

Figure 3

Storage Claims

Samsara aims to ensure that nodes consume no more than they contribute. This holds if all storage relationships are symmetric. In other words, if node A provides storage for node B, then B must provide storage for node A, resulting in an equal exchange. Because this is not the case in many peer-to-peer systems, Samsara introduces storage claims: incompressible placeholders that create a symmetric relationship between nodes. Once data has been traded for a storage claim, each node periodically checks the other to ensure that it is honoring the contract they formed. If a node notices that the contract has been violated, it can terminate the exchange and drop the data. If every node operates this way, we can safely assume that the system as a whole is acting fairly.

However, storage claims present two problems: they double the storage space required, and not all nodes are reliable, since there is no administrator. Samsara addresses the former by letting a node replace a claim that it has given to another node with one that it is itself responsible for storing. To further reduce the amount of storage, the owner of a storage claim does not store a copy of its own claim for verification purposes. The latter problem is resolved by punishing nodes that fail to reply satisfactorily to a query. When a node fails a query, the nodes that are storing its data drop that data with some probability.

A claim is computed from three values: a secret passphrase P, a private key K and a location in the storage space. Before joining the system, nodes logically fill their storage space with storage claims. Claims are fixed-size blocks formed from consecutive hash values. They do not need to be computed during initialization and can instead be computed only when necessary.
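The idea that a claim can be regenerated on demand from a secret, instead of being stored by its owner, can be shown with a small sketch. The construction below is an assumption made for illustration only: the 512-byte claim size, the way the passphrase and location are fed into SHA-1, and the name `make_claim` are not taken from the source, and the step involving the key K is omitted entirely.

```python
import hashlib

CLAIM_SIZE = 512  # assumed block size in bytes


def make_claim(passphrase: bytes, location: int, size: int = CLAIM_SIZE) -> bytes:
    """Rebuild the claim for one location from the owner's secret passphrase.

    The block is a run of consecutive SHA-1 values, each derived from the
    previous one, truncated to a fixed size. The use of the key K is omitted.
    """
    h = hashlib.sha1(passphrase + location.to_bytes(8, "big")).digest()
    block = bytearray(h)
    while len(block) < size:
        h = hashlib.sha1(h + passphrase).digest()
        block += h
    return bytes(block[:size])


# The owner stores only the passphrase and recomputes a claim when needed.
claim_7 = make_claim(b"secret passphrase", 7)
```

A node holding the claim cannot shrink it without knowing the passphrase, while the owner can recompute it at any time to check a reply, which is why the owner does not need to keep a copy.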

Figure 4

Queries

Queries are used to check up on other nodes and make sure they are fulfilling their end of the bargain. Because of network traffic and busy nodes, queries can be issued infrequently and replies do not have to be immediate. When a node replies to a query, it does not have to send the data back to prove that it holds it. A querying node sends a unique value h0 along with the list of n objects it wishes to verify. The responding node appends h0 to the first object in the list and computes the SHA-1 hash of the concatenation. This hash, h1, is then appended to the second object and hashed, resulting in h2. This continues until hn is computed and returned to the requester. This idea is shown in figure 4.

To make sure that nodes with temporary connection problems are not punished like those that are deliberately cheating, Samsara introduces a grace period for responding to a query, so a node is not punished for failing to reply to a single query. The grace period is usually set longer than a conservative estimate of the time needed to recover a failed node. Replicas of the data are removed once the grace period is exceeded.

Storage Management

To deal with low storage space, Samsara has a replacement technique that relies on the dependencies among data, claims and nodes. Note that replacement does not affect equal exchange, since all nodes still consume and contribute the same amount. When storage space becomes a problem, nodes can use the storage rights they have obtained through their claims by replacing the claims they have placed on remote nodes with the claims they store locally. This claim forwarding technique frees up space by using the claimed space on another node to store claims that would otherwise be held locally. Queries for a forwarded claim are forwarded to the new node as well.
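The chained hash described under Queries can be written in a few lines. This is a minimal sketch: `respond_to_query` is a hypothetical name, and the verifier is assumed to hold (or be able to recompute) its own copies of the objects so that it can derive the expected answer.

```python
import hashlib
import os


def respond_to_query(h0: bytes, objects: list) -> bytes:
    """Compute h_n, where h_i = SHA-1(object_i + h_(i-1))."""
    h = h0
    for obj in objects:
        # Append the previous value to the object, then hash the result.
        h = hashlib.sha1(obj + h).digest()
    return h


# The querying node picks a fresh h0 so that old answers cannot be replayed.
h0 = os.urandom(20)
stored = [b"claim block", b"file chunk 1", b"file chunk 2"]
expected = respond_to_query(h0, stored)  # computed by the querying node
reply = respond_to_query(h0, stored)     # computed by the responding node
honest = reply == expected
```

Only the final 20-byte value crosses the network, yet producing it requires every byte of every listed object, so a node that discarded any of them cannot answer correctly.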
There are many issues concerning the reliability of peers in a network. Peers are often untrusted and selfish. In a distributed file system, the nodes all obey a set of guidelines that is enforced by the system. In a peer-to-peer system, however, the users can be anyone, including you and me. We have complete control over whether or not we contribute to the system, and no enforcement policy prevents us from withholding our resources. Samsara hopes to solve these problems by creating IOUs on nodes that represent storage space promised to other nodes. Cheating will only result in punishment. However, I believe that there is a loophole in Samsara's current implementation. Consider a node that promises to store X amount of data. There is no mechanism to prevent a malicious node from dropping the data right away and ruining the guarantee. Another issue is that these placeholders only work for storage and memory. It will be hard to apply the same idea to resources like bandwidth and processor cycles. Despite the vulnerabilities in Samsara, I believe it is a very good start on the road to a nearly equal peer-to-peer system. In my opinion, complete equality cannot be preserved, since nodes can leave a system whenever they wish.

Conclusion

Peer-to-peer systems are becoming a widely used form of network on today's Internet. Some popular programs such as BitTorrent and Napster have been available to the public for a while now. In this survey, I discussed Ivy, one of the many interesting p2p systems available today. To route and service requests efficiently in a p2p system, Pastry and PAST were introduced. Then there was the problem of preserving the foundation on which peer-to-peer systems are based: equal or nearly equal contribution and consumption of resources. Samsara is one of the techniques that attempt to solve this problem.

In local storage systems, the rate at which data can be transferred is often limited by the capacity of the transport medium. To solve this problem, the notion of parallelism across several disks (RAID) was introduced. This technique was found to be a very good solution, providing both performance and reliability guarantees. Similarly, on the Internet, we are bounded by the bandwidth and connections of the endpoints. If we apply the idea of parallelism by distributing the load across several peers, we can achieve comparable results. Once peer-to-peer networks stabilize, I believe they will become one of the more popular forms of data transfer on the net, if not the best. As long as the problems are fixed, the efficiency and equality they provide will help improve the overall performance of the Internet.

References

B. Callaghan, B. Pawlowski, P. Staubach. NFS version 3 protocol specification. Network Working Group.
B.F. Cooper, H. Garcia-Molina. Peer-to-peer resource trading in a reliable distributed system. Proceedings of the First International Workshop on Peer-to-Peer Systems.
L. Cox, B. Noble. Samsara: Honor Among Thieves in Peer-to-Peer Storage. SOSP.
P. Druschel, A. Rowstron. PAST: A large-scale, persistent peer-to-peer storage utility. Proc. HotOS VIII.
A. Muthitacharoen, R. Morris, T. Gil, B. Chen. Ivy: A Read/Write Peer-to-Peer File System. OSDI.
M. Rosenblum, J. Ousterhout. The design and implementation of a log-structured file system. ACM Transactions on Computer Systems.
A. Rowstron, P. Druschel. Storage Management and Caching in PAST, A Large-scale, Persistent Peer-to-peer Storage Utility. SOSP.
Wikipedia: Peer-to-peer.


GFS Overview. Design goals/priorities Design for big-data workloads Huge files, mostly appends, concurrency, huge bandwidth Design for failures GFS Overview Design goals/priorities Design for big-data workloads Huge files, mostly appends, concurrency, huge bandwidth Design for failures Interface: non-posix New op: record appends (atomicity matters,

More information

Venugopal Ramasubramanian Emin Gün Sirer SIGCOMM 04

Venugopal Ramasubramanian Emin Gün Sirer SIGCOMM 04 The Design and Implementation of a Next Generation Name Service for the Internet Venugopal Ramasubramanian Emin Gün Sirer SIGCOMM 04 Presenter: Saurabh Kadekodi Agenda DNS overview Current DNS Problems

More information

Early Measurements of a Cluster-based Architecture for P2P Systems

Early Measurements of a Cluster-based Architecture for P2P Systems Early Measurements of a Cluster-based Architecture for P2P Systems Balachander Krishnamurthy, Jia Wang, Yinglian Xie I. INTRODUCTION Peer-to-peer applications such as Napster [4], Freenet [1], and Gnutella

More information

08 Distributed Hash Tables

08 Distributed Hash Tables 08 Distributed Hash Tables 2/59 Chord Lookup Algorithm Properties Interface: lookup(key) IP address Efficient: O(log N) messages per lookup N is the total number of servers Scalable: O(log N) state per

More information

Overlay Networks for Multimedia Contents Distribution

Overlay Networks for Multimedia Contents Distribution Overlay Networks for Multimedia Contents Distribution Vittorio Palmisano vpalmisano@gmail.com 26 gennaio 2007 Outline 1 Mesh-based Multicast Networks 2 Tree-based Multicast Networks Overcast (Cisco, 2000)

More information

Finding a needle in Haystack: Facebook's photo storage

Finding a needle in Haystack: Facebook's photo storage Finding a needle in Haystack: Facebook's photo storage The paper is written at facebook and describes a object storage system called Haystack. Since facebook processes a lot of photos (20 petabytes total,

More information

Storage management and caching in PAST, a large-scale, persistent peer-to-peer storage utility

Storage management and caching in PAST, a large-scale, persistent peer-to-peer storage utility Storage management and caching in PAST, a large-scale, persistent peer-to-peer storage utility Antony Rowstron Microsoft Research St. George House, 1 Guildhall Street Cambridge, CB2 3NH, United Kingdom.

More information

Example File Systems Using Replication CS 188 Distributed Systems February 10, 2015

Example File Systems Using Replication CS 188 Distributed Systems February 10, 2015 Example File Systems Using Replication CS 188 Distributed Systems February 10, 2015 Page 1 Example Replicated File Systems NFS Coda Ficus Page 2 NFS Originally NFS did not have any replication capability

More information

Peer-to-peer computing research a fad?

Peer-to-peer computing research a fad? Peer-to-peer computing research a fad? Frans Kaashoek kaashoek@lcs.mit.edu NSF Project IRIS http://www.project-iris.net Berkeley, ICSI, MIT, NYU, Rice What is a P2P system? Node Node Node Internet Node

More information

Addressed Issue. P2P What are we looking at? What is Peer-to-Peer? What can databases do for P2P? What can databases do for P2P?

Addressed Issue. P2P What are we looking at? What is Peer-to-Peer? What can databases do for P2P? What can databases do for P2P? Peer-to-Peer Data Management - Part 1- Alex Coman acoman@cs.ualberta.ca Addressed Issue [1] Placement and retrieval of data [2] Server architectures for hybrid P2P [3] Improve search in pure P2P systems

More information

Distributed Systems. 17. Distributed Lookup. Paul Krzyzanowski. Rutgers University. Fall 2016

Distributed Systems. 17. Distributed Lookup. Paul Krzyzanowski. Rutgers University. Fall 2016 Distributed Systems 17. Distributed Lookup Paul Krzyzanowski Rutgers University Fall 2016 1 Distributed Lookup Look up (key, value) Cooperating set of nodes Ideally: No central coordinator Some nodes can

More information

Pastis: a Highly-Scalable Multi-User Peer-to-Peer File System

Pastis: a Highly-Scalable Multi-User Peer-to-Peer File System Pastis: a Highly-Scalable Multi-User Peer-to-Peer File System Jean-Michel Busca 1, Fabio Picconi 2, and Pierre Sens 2 1 INRIA Rocquencourt Le Chesnay, France jean-michel.busca@inria.fr 2 LIP6, Université

More information

Slides for Chapter 10: Peer-to-Peer Systems. From Coulouris, Dollimore, Kindberg and Blair Distributed Systems: Concepts and Design

Slides for Chapter 10: Peer-to-Peer Systems. From Coulouris, Dollimore, Kindberg and Blair Distributed Systems: Concepts and Design Slides for Chapter 10: Peer-to-Peer Systems From Coulouris, Dollimore, Kindberg and Blair Distributed Systems: Concepts and Design Edition 5, Addison-Wesley 2012 Edited and supplemented by Jonne Itkonen,!

More information

Ivy: A Read/Write Peer-to-Peer File System

Ivy: A Read/Write Peer-to-Peer File System Ivy: A Read/Write Peer-to-Peer File System Athicha Muthitacharoen, Robert Morris, Thomer M. Gil, and Benjie Chen {athicha, rtm, thomer, benjie}@lcs.mit.edu MIT Laboratory for Computer Science 200 Technology

More information
