Presented By: Devarsh Patel

1 Dynamo: Amazon's Highly Available Key-value Store
Presented By: Devarsh Patel
CS5204 Operating Systems

2 Introduction
- Amazon's e-commerce platform
  - Requires performance, reliability and efficiency
  - To support continuous growth, the platform needs to be highly scalable
- Dynamo
  - A highly available and scalable distributed data store built for Amazon's platform
  - Used to manage the state of services that have very high reliability requirements and need tight control over the tradeoffs between availability, consistency, cost-effectiveness and performance
  - Provides a simple primary-key only interface to meet the requirements of applications like best seller lists, shopping carts, customer preferences, session management, etc.
  - A completely decentralized system with minimal need for manual administration

3 System Assumptions and Requirements
- Simple key-value interface
- Highly available
- Efficient in resource usage
- Simple scale-out scheme to address growth in data set size or request rates
- Each service that uses Dynamo runs its own Dynamo instances
- Used only by Amazon's internal services
  - Non-hostile environment
  - No security requirements like authentication and authorization
- Targets applications that operate with weaker consistency in favor of high availability
- Service level agreements (SLAs)
  - Measured at the 99.9th percentile of the latency distribution
  - Key factors: service latency at a given request rate
  - Example: response time of 300 ms for 99.9% of requests at a peak client load of 500 requests per second
  - State management is the main component of a service's SLA
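As a minimal sketch of what "measured at the 99.9th percentile" means in practice, the following computes a nearest-rank percentile over recorded latencies and checks it against the 300 ms example above. The function names and the nearest-rank definition are illustrative assumptions, not something the slides or Dynamo prescribe.

```python
import math
from fractions import Fraction


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    # Fraction(str(...)) keeps 99.9 exact, so the rank is not shifted by float rounding.
    rank = math.ceil(Fraction(str(pct)) * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def meets_sla(latencies_ms, limit_ms=300.0, pct=99.9):
    """True if the pct-th percentile latency is within limit_ms (hypothetical helper)."""
    return percentile(latencies_ms, pct) <= limit_ms
```

With 1000 samples, the 99.9th percentile is the second-slowest request, so a single outlier is tolerated but two slow requests already violate the SLA. An average would hide both.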

4 Design Considerations
- Designed to be an eventually consistent data store
- Always writeable data store
- Consistency vs. availability
  - To achieve a level of consistency, replication algorithms are forced to trade off the availability of the data under certain failure scenarios
  - To improve availability, Dynamo uses a weaker form of consistency (eventual consistency)
  - Allows optimistic replication techniques
  - Can lead to conflicting changes, which must be detected and resolved
  - Conflict resolution is performed during reads, by either the data store or the application
- Other key principles
  - Incremental scalability: scale out one storage node at a time
  - Symmetry: every node has the same set of responsibilities
  - Decentralization: favor decentralized peer-to-peer techniques
  - Heterogeneity: work distribution must be proportional to the capabilities of the individual servers

5 System Architecture
- Core distributed system techniques used in Dynamo: partitioning, replication, versioning, membership, failure handling and scaling

6 System Interface
- Two operations: get() and put()
- get(key)
  - Locates the object replicas associated with the key in the storage system and returns a single object, or a list of objects with conflicting versions, along with a context
- put(key, context, object)
  - Determines where the replicas of the object should be placed based on the associated key, and writes the replicas to disk
- The context encodes system metadata about the object
- An MD5 hash of the key generates a 128-bit identifier, which is used to determine the storage nodes responsible for the key
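The key-to-identifier step can be sketched with Python's standard hashlib. The function name and the choice to read the digest as a big-endian integer are assumptions made for illustration; the slides only state that MD5 yields a 128-bit identifier.

```python
import hashlib


def key_to_ring_position(key: bytes) -> int:
    """Map a key to a 128-bit integer position on the ring using MD5."""
    digest = hashlib.md5(key).digest()   # MD5 digests are 16 bytes = 128 bits
    return int.from_bytes(digest, "big")  # byte order is an arbitrary, fixed choice
```

Because the mapping is deterministic, any node can compute the same position for a key without consulting a central directory.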

7 Consistent Hashing Partitioning Algorithm
- Output range of the hash function is treated as a fixed circular space or ring
- Advantage
  - Departure or arrival of a node only affects its immediate neighbors
- Issues
  - Non-uniform data and load distribution
- Dynamo uses a variant of consistent hashing based on the concept of virtual nodes
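A small sketch of a consistent hashing ring with virtual nodes may make the "only neighbors are affected" property concrete. The class name `Ring`, the `tokens_per_node` parameter and the `"node#i"` token naming are hypothetical; this is not Dynamo's actual implementation.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Position on the ring: MD5 digest read as a 128-bit integer."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest(), "big")


class Ring:
    """Consistent hashing ring where each physical node owns several tokens (virtual nodes)."""

    def __init__(self, tokens_per_node=8):
        self.tokens_per_node = tokens_per_node
        self._tokens = []  # sorted list of (position, physical_node)

    def add_node(self, node):
        for i in range(self.tokens_per_node):
            bisect.insort(self._tokens, (_hash(f"{node}#{i}"), node))

    def remove_node(self, node):
        self._tokens = [t for t in self._tokens if t[1] != node]

    def owner(self, key):
        """First token clockwise from the key's position, wrapping around the ring."""
        if not self._tokens:
            raise LookupError("ring is empty")
        idx = bisect.bisect_left(self._tokens, (_hash(key), ""))
        return self._tokens[idx % len(self._tokens)][1]
```

When a node is added, a key either keeps its previous owner or moves to the new node; no key moves between two pre-existing nodes. Spreading several tokens per node is what evens out the otherwise non-uniform load.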

8 Replication
- Replicate data on multiple hosts
  - Reason: to achieve high availability and durability
  - Number of replicas is configured per-instance
- Preference list
  - List of nodes responsible for storing a particular key
- Figure 1: Partitioning and replication of keys in the Dynamo ring.
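One way to picture a preference list is to walk the ring clockwise from the key's position and collect the first N distinct physical nodes, skipping further tokens of nodes already chosen. The sketch below assumes a pre-sorted token list with small integer positions for readability; the function name and data layout are illustrative only.

```python
import bisect


def preference_list(tokens, key_position, n):
    """tokens: sorted list of (position, physical_node). Returns n distinct nodes clockwise from the key."""
    distinct = {node for _, node in tokens}
    if n > len(distinct):
        raise ValueError("not enough distinct physical nodes")
    # (key_position,) sorts before any (key_position, node), so this finds
    # the first token whose position is >= key_position.
    start = bisect.bisect_left(tokens, (key_position,))
    result = []
    for i in range(len(tokens)):
        node = tokens[(start + i) % len(tokens)][1]
        if node not in result:  # skip extra virtual nodes of an already chosen host
            result.append(node)
            if len(result) == n:
                break
    return result
```

For example, with tokens at 10 (A), 20 (A), 30 (B), 40 (C) and 50 (B), a key at position 5 with N=3 gets the list A, B, C: the second token of A is skipped so that replicas land on different hosts.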

9 Data Versioning
- Dynamo treats the result of each modification as a new and immutable version of the data
- Allows multiple versions of an object to be present in the system at the same time
- Problem
  - Version branching due to failures combined with concurrent updates, resulting in conflicting versions of an object
  - Updates in the presence of network partitions and node failures can result in an object having distinct version sub-histories

10 Data Versioning
- Uses vector clocks
  - A vector clock is a list of (node, counter) pairs
  - Determines whether two versions of an object are on parallel branches or have a causal ordering
- Conflicts require reconciliation
  - Conflicting versions are passed to the application as the output of the get operation
  - The application resolves the conflicts and puts a new (consistent) version

11 Data Versioning
Figure: Version evolution of an object over time
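The vector clock comparison can be sketched with plain dictionaries mapping node names to counters. The helper names are hypothetical. The node names Sx, Sy and Sz used in the usage note follow the version-evolution example in the Dynamo paper.

```python
def increment(clock, node):
    """Return a new clock with node's counter advanced by one."""
    new = dict(clock)
    new[node] = new.get(node, 0) + 1
    return new


def descends(a, b):
    """True if clock a has seen everything clock b has (a is equal to or newer than b)."""
    return all(a.get(node, 0) >= counter for node, counter in b.items())


def compare(a, b):
    """Classify the relationship between two versions."""
    a_ge, b_ge = descends(a, b), descends(b, a)
    if a_ge and b_ge:
        return "equal"
    if a_ge:
        return "descends"
    if b_ge:
        return "precedes"
    return "concurrent"  # parallel branches: needs reconciliation


def merge(a, b):
    """Element-wise maximum, used when a reconciled version supersedes both inputs."""
    return {node: max(a.get(node, 0), b.get(node, 0)) for node in set(a) | set(b)}
```

If Sx writes twice, and Sy and Sz then each update the second version independently, the resulting clocks `{Sx: 2, Sy: 1}` and `{Sx: 2, Sz: 1}` compare as concurrent. A client that reads both, reconciles them and writes back through Sx produces `{Sx: 3, Sy: 1, Sz: 1}`, which descends from both.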

12 Execution of get/put operations
- Two strategies to select a node
  - Route the request through a load balancer
  - Send the request directly to the coordinator node
- Coordinator
  - Node handling the read or write operation
  - First among the top N nodes in the preference list
- Quorum system
  - Two key configurable values: R and W
    - R: minimum number of nodes that must participate in a successful read operation
    - W: minimum number of nodes that must participate in a successful write operation
  - A quorum-like system requires R + W > N
  - (N, R, W) can be chosen to achieve the desired tradeoff
  - R and W are usually configured to be less than N, to provide better latency
- A write is successful if at least W-1 other nodes respond to the put() request (the coordinator's own local write counts toward W)
- A read is successful if R nodes respond to the get() request
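The reason for requiring R + W > N can be checked by brute force: only then does every possible read set share at least one node with every possible write set. The sketch below also encodes the success rules from this slide. All function names are illustrative assumptions.

```python
from itertools import combinations


def quorums_always_overlap(n, r, w):
    """Brute force: does every read set of size r intersect every write set of size w?"""
    nodes = range(n)
    return all(
        set(read_set) & set(write_set)
        for read_set in combinations(nodes, r)
        for write_set in combinations(nodes, w)
    )


def write_succeeded(replica_acks, w):
    """The coordinator's local write counts as one, so W-1 further acks suffice."""
    return replica_acks >= w - 1


def read_succeeded(responses, r):
    """A read needs responses from at least R nodes."""
    return responses >= r
```

For (N, R, W) = (3, 2, 2) the overlap check passes, while (3, 1, 1) fails it: a read could be served entirely by a replica that missed the latest write.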

13 Sloppy Quorum and Hinted Handoff
- All read and write operations are performed on the first N healthy nodes in the preference list
  - The coordinator is the first node in this group
- A replica sent to a stand-in node carries a hint in its metadata indicating the original node that should hold the replica
- Hinted replicas are stored by the available node and forwarded to the original node when it recovers
- Ensures that read and write operations do not fail because of temporary node or network failures
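A rough sketch of how write targets could be chosen under a sloppy quorum: healthy nodes among the first N keep their replica, and each unreachable one is replaced by the next healthy node further along the ring, which receives the replica together with a hint. The function name, the `(target, hint)` tuple format and the node letters are assumptions made for illustration.

```python
def plan_writes(ring_order, healthy, n):
    """Return (target_node, hint) pairs; hint names the intended owner when target is a stand-in."""
    intended = ring_order[:n]
    # Candidates for standing in: healthy nodes beyond the first n, in ring order.
    fallback = (node for node in ring_order[n:] if node in healthy)
    targets = []
    for node in intended:
        if node in healthy:
            targets.append((node, None))
        else:
            stand_in = next(fallback, None)
            if stand_in is not None:
                targets.append((stand_in, node))  # hinted replica
    return targets


def hints_to_forward(targets, recovered_node):
    """Stand-in nodes that should hand their replica back to recovered_node."""
    return [target for target, hint in targets if hint == recovered_node]
```

With ring order A, B, C, D, E, N=3 and A down, the plan is D (hinted for A), B and C. Once A recovers, D forwards the hinted replica to A and can then drop its local copy.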

14 Replica synchronization
- Uses Merkle trees to detect inconsistencies between replicas faster and to minimize the amount of transferred data
- Each node maintains a separate tree for each key range it hosts
- Advantage: each branch of the tree can be checked independently, without requiring nodes to download the entire tree or the entire data set
- Disadvantage: maintaining the Merkle trees adds overhead when a node joins or leaves the system, because key ranges change and the affected trees must be recalculated
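The "check each branch independently" idea can be sketched as follows: two replicas build hash trees over the same ordered key range, compare roots, and descend only into subtrees whose hashes differ. SHA-256, the list-of-levels layout and the function names are assumptions for this sketch; the slides do not say which hash or tree layout Dynamo uses.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build(leaves):
    """leaves: list of bytes values in key order. Returns levels; levels[0] = leaf hashes, levels[-1] = [root]."""
    if not leaves:
        raise ValueError("need at least one leaf")
    level = [_h(value) for value in leaves]
    levels = [level]
    while len(level) > 1:
        # Pair up neighbours; an unpaired last node is hashed on its own.
        level = [
            _h(level[i] + (level[i + 1] if i + 1 < len(level) else b""))
            for i in range(0, len(level), 2)
        ]
        levels.append(level)
    return levels


def differing_leaves(a, b):
    """Indices of leaves that differ, found by following only mismatching branches.

    Both trees must cover the same number of leaves.
    """
    top = len(a) - 1
    if a[top][0] == b[top][0]:
        return []  # roots match: replicas are in sync, nothing to transfer
    suspects = [0]
    for depth in range(top - 1, -1, -1):
        next_suspects = []
        for idx in suspects:
            for child in (2 * idx, 2 * idx + 1):
                if child < len(a[depth]) and a[depth][child] != b[depth][child]:
                    next_suspects.append(child)
        suspects = next_suspects
    return suspects
```

When the roots match, a single hash comparison confirms the whole key range is synchronized. When they differ, the number of hashes exchanged grows with the depth of the tree rather than with the size of the data set.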

15 Membership and Failure Detection
- Ring membership
  - Explicit mechanism to add a node to or remove a node from the ring
  - Done by an administrator using a command-line tool or a browser
  - A gossip-based protocol propagates membership, partitioning and placement information via periodic exchanges
  - Nodes eventually learn the key ranges of their peers and can forward requests to them
- External discovery
  - To prevent logical partitions, some nodes play the role of seeds
  - Seed nodes are discovered via an external mechanism and are known to all nodes
- Failure detection
  - Node failures are detected by lack of responsiveness; recovery is detected by periodic retries
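A toy sketch of gossip-style membership reconciliation: each node keeps a view mapping node names to a (version, status) pair, and when two nodes exchange views, the entry with the higher version wins. The per-entry version numbers, the status strings and the function names are assumptions of this sketch; the slides only say that membership information spreads through periodic exchanges.

```python
def merge_views(local, remote):
    """Combine two membership views, keeping the newer entry for each node."""
    merged = dict(local)
    for node, (version, status) in remote.items():
        if node not in merged or version > merged[node][0]:
            merged[node] = (version, status)
    return merged


def exchange(views, a, b):
    """One gossip exchange: afterwards both a and b hold the merged view."""
    merged = merge_views(views[a], views[b])
    views[a] = dict(merged)
    views[b] = dict(merged)
```

Starting from three nodes that each know only about themselves, the exchanges A-B, B-C and then A-B again are enough for all three to hold the same view. A real deployment would pick peers at random and repeat periodically.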

16 Experiences & Lessons Learned
- Main patterns in which Dynamo is used
  - Business logic specific reconciliation
  - Timestamp based reconciliation
  - High performance read engine
- Client applications can tune the values of N, R and W
- The common (N, R, W) configuration used by several instances of Dynamo is (3, 2, 2)

17 Experiences & Lessons Learned
- Balancing performance and durability

18 Experiences & Lessons Learned
- Ensuring uniform load distribution

19 Partitioning & Placement Strategies
Figure: Partitioning and placement of keys in the three strategies. A, B, and C depict the three unique nodes that form the preference list for the key k1 on the consistent hashing ring (N=3). The shaded area indicates the key range for which nodes A, B, and C form the preference list. Dark arrows indicate the token locations for various nodes.

20 Partitioning & Placement Strategies
- Strategy 1: T random tokens per node and partition by token value
  - A new node needs to steal its key ranges from other nodes
  - Bootstrapping of a new node is lengthy
    - Other nodes scan and transmit the key ranges for the new node as background activities
  - Disadvantages
    - Numerous nodes have to adjust their Merkle trees when a node joins or leaves the system
    - Archiving the entire key space is highly inefficient

21 Partitioning & Placement Strategies
- Strategy 2: T random tokens per node and equal sized partitions
  - Hash space is divided into Q equally sized partitions
  - Q >> N and Q >> S*T, where S is the number of nodes in the system
  - Advantages
    - Decoupling of partitioning and partition placement
    - Allows changing the placement scheme at run-time
- Strategy 3: Q/S tokens per node, equal sized partitions
  - Decoupling of partitioning and placement
  - Advantages
    - Faster bootstrapping/recovery
    - Ease of archival
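To illustrate how fixed, equal sized partitions decouple partitioning from placement, the sketch below maps a 128-bit ring position to one of Q partitions and then hands each of S nodes Q/S partitions. The round-robin placement and all names are assumptions for illustration; the slides do not specify how tokens are assigned to nodes.

```python
RING_SIZE = 2 ** 128  # size of the MD5-based hash space


def partition_of(position, q):
    """Index of the equal sized partition (0..q-1) containing a ring position."""
    return position * q // RING_SIZE


def assign_partitions(nodes, q):
    """Give each of the S nodes exactly Q/S partitions (round-robin, an assumed scheme)."""
    if q % len(nodes) != 0:
        raise ValueError("Q must be divisible by the number of nodes")
    placement = {node: [] for node in nodes}
    for partition in range(q):
        placement[nodes[partition % len(nodes)]].append(partition)
    return placement
```

Because partition boundaries never move, changing the placement only reassigns whole partitions. A partition can then be transferred or archived as a unit, which is consistent with the faster bootstrapping and easier archival listed above.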

22 Partitioning & Placement Strategies
- The strategies have different tuning parameters
- A fair way to compare them is to evaluate the skew in their load distributions for a fixed amount of space used to maintain membership information
- Strategy 3 achieves the best load balancing efficiency

23 Client-driven or Server-driven Coordination
- Any node can coordinate read requests; write requests are handled by a coordinator from the key's preference list
- The state machine for coordination can be in a load balancing server or incorporated into the client
- Client-driven coordination has lower latency because it avoids the extra network hop (redirection)

24 Thank You


More information

CS /15/16. Paul Krzyzanowski 1. Question 1. Distributed Systems 2016 Exam 2 Review. Question 3. Question 2. Question 5.

CS /15/16. Paul Krzyzanowski 1. Question 1. Distributed Systems 2016 Exam 2 Review. Question 3. Question 2. Question 5. Question 1 What makes a message unstable? How does an unstable message become stable? Distributed Systems 2016 Exam 2 Review Paul Krzyzanowski Rutgers University Fall 2016 In virtual sychrony, a message

More information

Distributed Data Analytics Partitioning

Distributed Data Analytics Partitioning G-3.1.09, Campus III Hasso Plattner Institut Different mechanisms but usually used together Distributing Data Replication vs. Replication Store copies of the same data on several nodes Introduces redundancy

More information

NoSQL systems: sharding, replication and consistency. Riccardo Torlone Università Roma Tre

NoSQL systems: sharding, replication and consistency. Riccardo Torlone Università Roma Tre NoSQL systems: sharding, replication and consistency Riccardo Torlone Università Roma Tre Data distribution NoSQL systems: data distributed over large clusters Aggregate is a natural unit to use for data

More information

Consistency and Replication (part b)

Consistency and Replication (part b) Consistency and Replication (part b) EECS 591 Farnam Jahanian University of Michigan Tanenbaum Chapter 6.1-6.5 Eventual Consistency A very weak consistency model characterized by the lack of simultaneous

More information

From Relational to Riak

From Relational to Riak www.basho.com From Relational to Riak December 2012 Table of Contents Table of Contents... 1 Introduction... 1 Why Migrate to Riak?... 1 The Requirement of High Availability...1 Minimizing the Cost of

More information

Chapter 11 - Data Replication Middleware

Chapter 11 - Data Replication Middleware Prof. Dr.-Ing. Stefan Deßloch AG Heterogene Informationssysteme Geb. 36, Raum 329 Tel. 0631/205 3275 dessloch@informatik.uni-kl.de Chapter 11 - Data Replication Middleware Motivation Replication: controlled

More information

Distributed Key-Value Stores UCSB CS170

Distributed Key-Value Stores UCSB CS170 Distributed Key-Value Stores UCSB CS170 Overview Key-Value Stores/Storage Architecture Replication management Key Value Stores: Important system service on a cluster of machines Handle huge volumes of

More information

02 - Distributed Systems

02 - Distributed Systems 02 - Distributed Systems Definition Coulouris 1 (Dis)advantages Coulouris 2 Challenges Saltzer_84.pdf Models Physical Architectural Fundamental 2/58 Definition Distributed Systems Distributed System is

More information

02 - Distributed Systems

02 - Distributed Systems 02 - Distributed Systems Definition Coulouris 1 (Dis)advantages Coulouris 2 Challenges Saltzer_84.pdf Models Physical Architectural Fundamental 2/60 Definition Distributed Systems Distributed System is

More information

Apache Cassandra - A Decentralized Structured Storage System

Apache Cassandra - A Decentralized Structured Storage System Apache Cassandra - A Decentralized Structured Storage System Avinash Lakshman Prashant Malik from Facebook Presented by: Oded Naor Acknowledgments Some slides are based on material from: Idit Keidar, Topics

More information

Clusters. Or: How to replace Big Iron with PCs. Robert Grimm New York University

Clusters. Or: How to replace Big Iron with PCs. Robert Grimm New York University Clusters Or: How to replace Big Iron with PCs Robert Grimm New York University Before We Dive into Clusters! Assignment 2: HTTP/1.1! Implement persistent connections, pipelining, and digest authentication!

More information

Distributed Hash Tables: Chord

Distributed Hash Tables: Chord Distributed Hash Tables: Chord Brad Karp (with many slides contributed by Robert Morris) UCL Computer Science CS M038 / GZ06 12 th February 2016 Today: DHTs, P2P Distributed Hash Tables: a building block

More information

Chapter 7 Consistency And Replication

Chapter 7 Consistency And Replication DISTRIBUTED SYSTEMS Principles and Paradigms Second Edition ANDREW S. TANENBAUM MAARTEN VAN STEEN Chapter 7 Consistency And Replication Data-centric Consistency Models Figure 7-1. The general organization

More information

Today CSCI Coda. Naming: Volumes. Coda GFS PAST. Instructor: Abhishek Chandra. Main Goals: Volume is a subtree in the naming space

Today CSCI Coda. Naming: Volumes. Coda GFS PAST. Instructor: Abhishek Chandra. Main Goals: Volume is a subtree in the naming space Today CSCI 5105 Coda GFS PAST Instructor: Abhishek Chandra 2 Coda Main Goals: Availability: Work in the presence of disconnection Scalability: Support large number of users Successor of Andrew File System

More information

Protecting Microsoft Exchange

Protecting Microsoft Exchange TECHNICAL WHITE PAPER: BACKUP EXEC TM 2014 PROTECTING MICROSOFT EXCHANGE Backup Exec TM 2014 Technical White Paper Protecting Microsoft Exchange Technical White Papers are designed to introduce Symantec

More information

Consistency in Distributed Systems

Consistency in Distributed Systems Consistency in Distributed Systems Recall the fundamental DS properties DS may be large in scale and widely distributed 1. concurrent execution of components 2. independent failure modes 3. transmission

More information

Consistency and Replication

Consistency and Replication Consistency and Replication Introduction Data-centric consistency Client-centric consistency Distribution protocols Consistency protocols 1 Goal: Reliability Performance Problem: Consistency Replication

More information

Trade- Offs in Cloud Storage Architecture. Stefan Tai

Trade- Offs in Cloud Storage Architecture. Stefan Tai Trade- Offs in Cloud Storage Architecture Stefan Tai Cloud computing is about providing and consuming resources as services There are five essential characteristics of cloud services [NIST] [NIST]: http://csrc.nist.gov/groups/sns/cloud-

More information

Chapter 4: Distributed Systems: Replication and Consistency. Fall 2013 Jussi Kangasharju

Chapter 4: Distributed Systems: Replication and Consistency. Fall 2013 Jussi Kangasharju Chapter 4: Distributed Systems: Replication and Consistency Fall 2013 Jussi Kangasharju Chapter Outline n Replication n Consistency models n Distribution protocols n Consistency protocols 2 Data Replication

More information

Consistency and Replication 1/65

Consistency and Replication 1/65 Consistency and Replication 1/65 Replicas and Consistency??? Tatiana Maslany in the show Orphan Black: The story of a group of clones that discover each other and the secret organization Dyad, which was

More information

Final Exam Logistics. CS 133: Databases. Goals for Today. Some References Used. Final exam take-home. Same resources as midterm

Final Exam Logistics. CS 133: Databases. Goals for Today. Some References Used. Final exam take-home. Same resources as midterm Final Exam Logistics CS 133: Databases Fall 2018 Lec 25 12/06 NoSQL Final exam take-home Available: Friday December 14 th, 4:00pm in Olin Due: Monday December 17 th, 5:15pm Same resources as midterm Except

More information

SCALABLE CONSISTENCY AND TRANSACTION MODELS

SCALABLE CONSISTENCY AND TRANSACTION MODELS Data Management in the Cloud SCALABLE CONSISTENCY AND TRANSACTION MODELS 69 Brewer s Conjecture Three properties that are desirable and expected from realworld shared-data systems C: data consistency A:

More information

Spotify. Scaling storage to million of users world wide. Jimmy Mårdell October 14, 2014

Spotify. Scaling storage to million of users world wide. Jimmy Mårdell October 14, 2014 Cassandra @ Spotify Scaling storage to million of users world wide! Jimmy Mårdell October 14, 2014 2 About me Jimmy Mårdell Tech Product Owner in the Cassandra team 4 years at Spotify

More information

Example File Systems Using Replication CS 188 Distributed Systems February 10, 2015

Example File Systems Using Replication CS 188 Distributed Systems February 10, 2015 Example File Systems Using Replication CS 188 Distributed Systems February 10, 2015 Page 1 Example Replicated File Systems NFS Coda Ficus Page 2 NFS Originally NFS did not have any replication capability

More information

Replication. Consistency models. Replica placement Distribution protocols

Replication. Consistency models. Replica placement Distribution protocols Replication Motivation Consistency models Data/Client-centric consistency models Replica placement Distribution protocols Invalidate versus updates Push versus Pull Cooperation between replicas Client-centric

More information

Making Non-Distributed Databases, Distributed. Ioannis Papapanagiotou, PhD Shailesh Birari

Making Non-Distributed Databases, Distributed. Ioannis Papapanagiotou, PhD Shailesh Birari Making Non-Distributed Databases, Distributed Ioannis Papapanagiotou, PhD Shailesh Birari Dynomite Ecosystem Dynomite - Proxy layer Dyno - Client Dynomite-manager - Ecosystem orchestrator Dynomite-explorer

More information

CS555: Distributed Systems [Fall 2017] Dept. Of Computer Science, Colorado State University

CS555: Distributed Systems [Fall 2017] Dept. Of Computer Science, Colorado State University CS 555: DISTRIBUTED SYSTEMS [REPLICATION & CONSISTENCY] Frequently asked questions from the previous class survey Shrideep Pallickara Computer Science Colorado State University L25.1 L25.2 Topics covered

More information

Distributed Systems. 09. State Machine Replication & Virtual Synchrony. Paul Krzyzanowski. Rutgers University. Fall Paul Krzyzanowski

Distributed Systems. 09. State Machine Replication & Virtual Synchrony. Paul Krzyzanowski. Rutgers University. Fall Paul Krzyzanowski Distributed Systems 09. State Machine Replication & Virtual Synchrony Paul Krzyzanowski Rutgers University Fall 2016 1 State machine replication 2 State machine replication We want high scalability and

More information

CS655: Advanced Topics in Distributed Systems [Fall 2013] Dept. Of Computer Science, Colorado State University

CS655: Advanced Topics in Distributed Systems [Fall 2013] Dept. Of Computer Science, Colorado State University CS 655: ADVANCED TOPICS IN DISTRIBUTED SYSTEMS Shrideep Pallickara Computer Science Colorado State University PROFILING HARD DISKS L4.1 L4.2 Characteristics of peripheral devices & their speed relative

More information

Today s topics. FAQs. Topics in BigTable. This material is built based on, CS435 Introduction to Big Data Spring 2017 Colorado State University

Today s topics. FAQs. Topics in BigTable. This material is built based on, CS435 Introduction to Big Data Spring 2017 Colorado State University Spring 07 CS5 BIG DATA Today s topics FAQs BigTable: Column-based storage system PART. DATA STORAGE AND FLOW MANAGEMENT Sangmi Lee Pallickara Computer Science, http://www.cs.colostate.edu/~cs5 FAQs PA

More information