Why distributed databases suck, and what to do about it. "Do you want a database that goes down or one that serves wrong data?"


Transcription

1 Why distributed databases suck, and what to do about it - Regaining consistency
"Do you want a database that goes down or one that serves wrong data?"

2 About the speaker
NoSQL team lead at Trifork, Aarhus, Denmark
Working with databases since '97
NoSQL since 2008
Danish Shared Medication Record
Migrating data from MySQL to Riak
Devel Riak clients
NoSQL architect on various international projects
RuneSkouLarsen

3 Agenda
Part 1: Working with eventual consistency
  NoSQL persistence landscape
  What is consistency
  Eventual vs. sequential consistency
  Conflicts and how to handle them
  CRDTs
  Consistency models of current OLTP databases
Part 2: Stronger consistency in distributed, fault tolerant systems
  Consensus
  Delta consistency
  Dynamic delta consistency

4 Polyglot persistence landscape
In-memory Neo4j VoltDB Redis EasyDB MongoDB CouchDB OLTP Riak Cassandra Voldemort CouchBase Analytics Hadoop

5 Why distributed databases?
Redundancy
Availability
Scaling
Getting closer to your users

6 What is Consistency
Consistency: All nodes see the same data at the same time
Eventual consistency
  Autonomous consistency
Sequential consistency
  Bureaucratic consistency

7 When to be Consistent with what
Eventual consistency
  Support disconnected operations
  Better to read a stale value than nothing
  Better to save writes somewhere than nothing
  Potentially anomalous application behavior
  Stale reads and conflicting writes
Sequential consistency
  Requires highly available connections
  Not suitable for certain scenarios:
    Disconnected clients (e.g. your phone)
    Apps might prefer potential inconsistency to loss of availability

8 Conflicting updates
[Diagram: User A and User B each write a value (A and B) to a separate replica, followed by asynchronous synchronization.]

9 Last Write Wins (LWW)
[Diagram: User A writes A at t=t0 and User B writes B at t=t1, followed by asynchronous synchronization.]
Assign timestamp to all objects
Simple but fragile: depends on precise synchronization of timers
Data is lost
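The Last Write Wins rule on this slide can be sketched in a few lines. This is a minimal illustration, not any particular database's implementation: the `Versioned` type and `lww_merge` function are hypothetical names, and the timestamps are assumed to come from each writer's own wall clock.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Versioned:
    value: str
    timestamp: float  # wall-clock time assigned by the writer (assumption)


def lww_merge(a: Versioned, b: Versioned) -> Versioned:
    """Keep the version with the later timestamp; the other write is discarded."""
    # Ties are broken in favor of `a` here, which is an arbitrary choice.
    return a if a.timestamp >= b.timestamp else b


a = Versioned("A", timestamp=100.0)  # user A writes at t0
b = Versioned("B", timestamp=101.0)  # user B writes at t1
merged = lww_merge(a, b)
print(merged.value)  # the value written by A no longer exists anywhere

# If B's clock runs 5 seconds slow, the earlier write wins instead.
b_skewed = Versioned("B", timestamp=101.0 - 5.0)
print(lww_merge(a, b_skewed).value)
```

The second call shows why the slide calls the approach fragile: the outcome depends entirely on how well the writers' clocks agree.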

10 Google Spanner
"As a distributed-systems developer, you're taught from, I want to say, childhood not to trust time. What we did is find a way that we could trust time and understand what it meant to trust time." Andrew Fikes

11 Detecting conflicts using Vector Clocks (1)
[Diagram: User A writes A with vclock=a:1 and User B writes B with vclock=a:1,b:1, followed by asynchronous synchronization.]
Assign vector clock to objects
Ancestors are removed, descendants remain

12 Detecting conflicts using Vector Clocks (2)
[Diagram: User A writes A with vclock=a:1 and User B writes B with vclock=b:1, followed by asynchronous synchronization.]
Spawn siblings when causality chain is broken
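The two vector clock cases above can be reproduced with a small comparison function. This is a sketch under assumptions: clocks are modelled as plain dictionaries from node name to counter, and `descends` and `reconcile` are illustrative names rather than the API of Riak or any other store.

```python
from typing import Dict, List

VClock = Dict[str, int]


def descends(a: VClock, b: VClock) -> bool:
    """True if clock `a` has seen every event recorded in clock `b`."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def reconcile(val_a: str, clock_a: VClock, val_b: str, clock_b: VClock) -> List[str]:
    """Return the values that survive synchronization."""
    if descends(clock_a, clock_b):
        return [val_a]          # b is an ancestor of (or equal to) a
    if descends(clock_b, clock_a):
        return [val_b]          # a is an ancestor of b
    return [val_a, val_b]       # concurrent writes: keep both as siblings


# Slide 11: B was written on top of A, so the ancestor A is removed.
print(reconcile("A", {"a": 1}, "B", {"a": 1, "b": 1}))

# Slide 12: neither clock descends from the other, so siblings are spawned.
print(reconcile("A", {"a": 1}, "B", {"b": 1}))
```

Unlike Last Write Wins, no write is silently dropped: a conflict is surfaced as siblings and left for a later resolution step.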

13 Semantic resolution
[Diagram: values A and B from User A and User B, asynchronous synchronization, and a merged value C.]
Keep both values as siblings
User does the merging
The only solution if you need to do intelligent merging or start outside processes.

14 Conflict-free Replicated Data Types
[Diagram: values A and B from User A and User B, asynchronous synchronization, and the merged value AB on both sides.]
Data structure intrinsically merges objects
Limited applicability

15 Conflict-free Replicated Data Types
Convergent (CvRDT)
  State is replicated
  Moves towards one value
Commutative (CmRDT)
  Operations to the state are replicated
  The order of operations is insignificant: a*b = b*a
CvRDT and CmRDT can emulate each other

16 CRDT examples: G-Set and 2P-Set
[Figure: a tombstone marked RIP.]
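The slide only names the two set types, so the following is a minimal state-based sketch of how they are commonly defined: a G-Set only grows and merges by union, while a 2P-Set pairs an add set with a tombstone set so that a removed element can never come back. Class and method names are illustrative.

```python
class GSet:
    """Grow-only set: elements can be added but never removed."""

    def __init__(self, items=()):
        self.items = set(items)

    def add(self, x):
        self.items.add(x)

    def contains(self, x):
        return x in self.items

    def merge(self, other):
        # Union is commutative, associative and idempotent, so replicas converge.
        return GSet(self.items | other.items)


class TwoPSet:
    """Two-phase set: one G-Set of additions, one G-Set of tombstones."""

    def __init__(self):
        self.added = GSet()
        self.removed = GSet()  # tombstones

    def add(self, x):
        self.added.add(x)

    def remove(self, x):
        if self.added.contains(x):
            self.removed.add(x)

    def contains(self, x):
        return self.added.contains(x) and not self.removed.contains(x)

    def merge(self, other):
        out = TwoPSet()
        out.added = self.added.merge(other.added)
        out.removed = self.removed.merge(other.removed)
        return out


replica_a, replica_b = TwoPSet(), TwoPSet()
replica_a.add("x")
replica_a.add("y")
replica_b.add("x")
replica_b.remove("x")

merged = replica_a.merge(replica_b)
print(merged.contains("x"), merged.contains("y"))
```

The tombstone is what makes the merge order irrelevant, at the cost that "x" can never be re-added, which is one face of the limited applicability mentioned earlier.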

17 CRDT References
CRDTs: Consistency without concurrency control, 2009, Institut National de Recherche en Informatique et en Automatique
A comprehensive study of Convergent and Commutative Replicated Data Types, 2011, Institut National de Recherche en Informatique et en Automatique
Sean Cribbs - Eventually Consistent Data Structures

18 Methods for handling conflicts
Last Write Wins
  Easy
  Data is lost
  Depends on timestamps
Semantic resolution
  Requires application/user involvement
  Generic solution
Conflict-free Data Types
  Data structure has built-in convergence
  Limited ability to model real-world problems

19 Consistency models of OLTP databases
Last write wins
  Riak, CouchDB/CouchBase, Cassandra
User resolvable conflicts
  Riak, Voldemort, CouchDB/CouchBase (but unreliable)
Active anti-entropy
  Riak (Soon)
Hinted handoff with sloppy quorums (highest write-availability)
  Riak, Cassandra
Strong consistency (read your own writes + strict quorums)
  Riak, Voldemort, Cassandra, CouchBase, MongoDB, Traditional SQL databases (Oracle, MySQL, etc.)

20 Consistency pH
ACID: Atomic, Consistent, Isolated, Durable
vs
BASE: Basically Available, Soft state, Eventual Consistency
[Scale from consistency to availability.]

21 Consensus
Protocol for agreeing on a decision
More than half the nodes must be in agreement (n/2+1)
Tolerates remaining nodes being down/slow/un-updated.
[Scale from consistency to availability, marking: Consensus.]
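The majority rule n/2+1 is easy to tabulate, reading the division as integer division. The helper names `quorum` and `max_down` are made up for this sketch.

```python
def quorum(n: int) -> int:
    """Smallest number of nodes that is more than half of n."""
    return n // 2 + 1


def max_down(n: int) -> int:
    """How many nodes may be down, slow or stale while a quorum is still reachable."""
    return n - quorum(n)


for n in range(3, 8):
    print(f"n={n}: quorum={quorum(n)}, maxdown={max_down(n)}")
```

Note that going from an odd n to the next even n raises the quorum without raising the number of tolerated failures, which is why odd replica counts are the usual choice.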

22 Example: Ensuring idempotence using consensus
Communication protocols are unreliable and requests can be resent even when they have already completed.
Clients assign a requestid.
If a request is resent, we should return the first answer instead of processing it again.
vnodes serialize writes in Riak.
We use Riak with N=3, PW=quorum to ensure strict quorums. (*)
(*) Riak has a bug in the P checks, but we have deemed it insignificant to our use.

23 Example: Ensuring idempotence using consensus
[Diagram labels: Doctor system, Pharmacy system, Proxy instance, Request idempotence, Requests.]

24 Example: Ensuring idempotence using consensus
[Diagram labels: Doctor system, Pharmacy system, Proxy instance, Request idempotence, reqid=xyz, Down.]
We tolerate one node down at a time
Assuming n<=nodes:
  n=3: quorum=2, maxdown=1
  n=4: quorum=3, maxdown=1
  n=5: quorum=3, maxdown=2
  n=6: quorum=4, maxdown=2
  n=7: quorum=4, maxdown=3

25 Example: Ensuring idempotence using consensus
[Diagram labels: Doctor system, Pharmacy system, Proxy instance, Request idempotence, reqid=xyz.]
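The request-ID idea from this example can be sketched as follows. This is only an illustration: `IdempotentProxy` is a hypothetical class, and the in-memory dictionary stands in for the strictly-quorum Riak bucket described in the talk. In a real deployment the lookup and the store would have to be serialized by the database, which a local dictionary does not demonstrate.

```python
from typing import Callable, Dict


class IdempotentProxy:
    """Returns the first answer for a request ID instead of re-processing it."""

    def __init__(self, handler: Callable[[str], str]):
        self._handler = handler
        # Stand-in for a strongly consistent store keyed by request ID (assumption).
        self._responses: Dict[str, str] = {}

    def handle(self, request_id: str, payload: str) -> str:
        if request_id in self._responses:
            return self._responses[request_id]  # resent request: replay the answer
        response = self._handler(payload)
        self._responses[request_id] = response
        return response


processed = []


def create_prescription(payload: str) -> str:
    processed.append(payload)
    return f"created #{len(processed)}: {payload}"


proxy = IdempotentProxy(create_prescription)
first = proxy.handle("xyz", "prescription for patient 1")
resent = proxy.handle("xyz", "prescription for patient 1")
print(first == resent, len(processed))
```

The client-chosen ID is what makes the retry safe: the proxy never has to guess whether two identical payloads are one request sent twice or two separate requests.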

26 Delta consistency
An update will propagate through the system and all replicas will be consistent after a fixed time period δ
Easy to understand for customer
[Scale from consistency to availability, marking: Consensus, Delta consistency.]

27 Example: Delta Consistency with prescription replication
We guarantee that prescriptions are replicated from Oracle to Riak within 20 minutes.
[Diagram labels: Prescription server, Drug medication server, Oracle Master, Oracle MView, Riak, Max 20 minutes.]
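A delta guarantee like this one is straightforward to monitor. The sketch below assumes that the commit time on the source and the arrival time on the replica are both recorded; the function names and the example timestamps are invented for illustration.

```python
from datetime import datetime, timedelta

DELTA = timedelta(minutes=20)  # the bound promised on the slide


def consistent_by(written_at: datetime, delta: timedelta = DELTA) -> datetime:
    """Latest time by which every replica must reflect the write."""
    return written_at + delta


def within_delta(written_at: datetime, replicated_at: datetime,
                 delta: timedelta = DELTA) -> bool:
    """True if the replica received the write inside the promised window."""
    return replicated_at - written_at <= delta


written = datetime(2013, 1, 1, 12, 0)
print(consistent_by(written))
print(within_delta(written, datetime(2013, 1, 1, 12, 19)))
print(within_delta(written, datetime(2013, 1, 1, 12, 21)))
```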

28 Dynamic Delta consistency
Same as Delta Consistency, but users can monitor directly how far behind we are
Define one or more authorities, and track how far behind they are.
All responses carry added information on the updatedness of the data for each authority.
Useful when delay is normally low (sub-second), but can be high in times of degraded service.
Useful for CQRS or temporarily offline systems
Pro/Con: Users have to understand what data delay means.
[Scale from consistency to availability, marking: Consensus, Delta consistency, Dynamic Delta consistency.]

29 Example: Dynamic Delta Consistency using mobile device
When beginning a sync, note the time on the authority
After completing a sync, store the time of last sync on one or both sides.
Expose updatedness of data.
[Diagram labels: Mobile device, Sync, Relay server, Riak.]

30 Example: Dynamic Delta Consistency using CQRS
Commands trigger async events
Events update views
Expose the oldest waiting event as updated_until on view, or now if no events are in queue.
[Diagram labels: Eventlog, View.]
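The updated_until rule for a CQRS view can be sketched like this. The `View` class and its methods are hypothetical, and the sketch assumes events are queued in timestamp order so that the head of the queue is the oldest waiting event.

```python
from collections import deque
from datetime import datetime
from typing import Deque, List, Tuple


class View:
    """A read model fed by an asynchronous event queue."""

    def __init__(self) -> None:
        # (event time, payload) pairs, oldest first (assumption).
        self.pending: Deque[Tuple[datetime, str]] = deque()
        self.applied: List[str] = []

    def enqueue(self, at: datetime, payload: str) -> None:
        self.pending.append((at, payload))

    def apply_next(self) -> None:
        if self.pending:
            _, payload = self.pending.popleft()
            self.applied.append(payload)

    def updated_until(self, now: datetime) -> datetime:
        """Time of the oldest waiting event, or `now` if the queue is empty."""
        return self.pending[0][0] if self.pending else now


view = View()
now = datetime(2013, 1, 1, 12, 0, 0)
view.enqueue(datetime(2013, 1, 1, 11, 59, 0), "prescription created")
view.enqueue(datetime(2013, 1, 1, 11, 59, 30), "prescription changed")
print(view.updated_until(now))  # the view is complete up to the first pending event
view.apply_next()
view.apply_next()
print(view.updated_until(now))  # nothing pending: the view is current
```

A reader of the view can compare updated_until with the current time and decide whether the data is fresh enough for the task at hand.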

31 Example: Dynamic Delta Consistency using multiple authorities
Setup is multiple datacenters; everybody replicates with everybody at intervals.
When a full sync is done, save the sync data in each data center.
Example: DC1 done syncing with DC2, sync started at time t.
When a datacenter is internally consistent (no pending handoffs for instance), it can expose the time of sync with the other authorities as updated_until timestamp.
[Diagram labels: DC1, DC2, DC3, full sync.]

32 Thank you!
RuneSkouLarsen

What Came First? The Ordering of Events in

What Came First? The Ordering of Events in What Came First? The Ordering of Events in Systems @kavya719 kavya the design of concurrent systems Slack architecture on AWS systems with multiple independent actors. threads in a multithreaded program.

More information

Self-healing Data Step by Step

Self-healing Data Step by Step Self-healing Data Step by Step Uwe Friedrichsen (codecentric AG) NoSQL matters Cologne, 29. April 2014 @ufried Uwe Friedrichsen uwe.friedrichsen@codecentric.de http://slideshare.net/ufried http://ufried.tumblr.com

More information

Eventual Consistency Today: Limitations, Extensions and Beyond

Eventual Consistency Today: Limitations, Extensions and Beyond Eventual Consistency Today: Limitations, Extensions and Beyond Peter Bailis and Ali Ghodsi, UC Berkeley - Nomchin Banga Outline Eventual Consistency: History and Concepts How eventual is eventual consistency?

More information

Large-Scale Key-Value Stores Eventual Consistency Marco Serafini

Large-Scale Key-Value Stores Eventual Consistency Marco Serafini Large-Scale Key-Value Stores Eventual Consistency Marco Serafini COMPSCI 590S Lecture 13 Goals of Key-Value Stores Export simple API put(key, value) get(key) Simpler and faster than a DBMS Less complexity,

More information

Why NoSQL? Why Riak?

Why NoSQL? Why Riak? Why NoSQL? Why Riak? Justin Sheehy justin@basho.com 1 What's all of this NoSQL nonsense? Riak Voldemort HBase MongoDB Neo4j Cassandra CouchDB Membase Redis (and the list goes on...) 2 What went wrong with

More information

CS Amazon Dynamo

CS Amazon Dynamo CS 5450 Amazon Dynamo Amazon s Architecture Dynamo The platform for Amazon's e-commerce services: shopping chart, best seller list, produce catalog, promotional items etc. A highly available, distributed

More information

Introduction to NoSQL Databases

Introduction to NoSQL Databases Introduction to NoSQL Databases Roman Kern KTI, TU Graz 2017-10-16 Roman Kern (KTI, TU Graz) Dbase2 2017-10-16 1 / 31 Introduction Intro Why NoSQL? Roman Kern (KTI, TU Graz) Dbase2 2017-10-16 2 / 31 Introduction

More information

CS 655 Advanced Topics in Distributed Systems

CS 655 Advanced Topics in Distributed Systems Presented by : Walid Budgaga CS 655 Advanced Topics in Distributed Systems Computer Science Department Colorado State University 1 Outline Problem Solution Approaches Comparison Conclusion 2 Problem 3

More information

Spotify. Scaling storage to million of users world wide. Jimmy Mårdell October 14, 2014

Spotify. Scaling storage to million of users world wide. Jimmy Mårdell October 14, 2014 Cassandra @ Spotify Scaling storage to million of users world wide! Jimmy Mårdell October 14, 2014 2 About me Jimmy Mårdell Tech Product Owner in the Cassandra team 4 years at Spotify

More information

Dynamo: Amazon s Highly Available Key-Value Store

Dynamo: Amazon s Highly Available Key-Value Store Dynamo: Amazon s Highly Available Key-Value Store DeCandia et al. Amazon.com Presented by Sushil CS 5204 1 Motivation A storage system that attains high availability, performance and durability Decentralized

More information

Haridimos Kondylakis Computer Science Department, University of Crete

Haridimos Kondylakis Computer Science Department, University of Crete CS-562 Advanced Topics in Databases Haridimos Kondylakis Computer Science Department, University of Crete QSX (LN2) 2 NoSQL NoSQL: Not Only SQL. User case of NoSQL? Massive write performance. Fast key

More information

Important Lessons. Today's Lecture. Two Views of Distributed Systems

Important Lessons. Today's Lecture. Two Views of Distributed Systems Important Lessons Replication good for performance/ reliability Key challenge keeping replicas up-to-date Wide range of consistency models Will see more next lecture Range of correctness properties L-10

More information

SCALABLE CONSISTENCY AND TRANSACTION MODELS

SCALABLE CONSISTENCY AND TRANSACTION MODELS Data Management in the Cloud SCALABLE CONSISTENCY AND TRANSACTION MODELS 69 Brewer s Conjecture Three properties that are desirable and expected from realworld shared-data systems C: data consistency A:

More information

Distributed Systems. Characteristics of Distributed Systems. Lecture Notes 1 Basic Concepts. Operating Systems. Anand Tripathi

Distributed Systems. Characteristics of Distributed Systems. Lecture Notes 1 Basic Concepts. Operating Systems. Anand Tripathi 1 Lecture Notes 1 Basic Concepts Anand Tripathi CSci 8980 Operating Systems Anand Tripathi CSci 8980 1 Distributed Systems A set of computers (hosts or nodes) connected through a communication network.

More information

Distributed Systems. Characteristics of Distributed Systems. Characteristics of Distributed Systems. Goals in Distributed System Designs

Distributed Systems. Characteristics of Distributed Systems. Characteristics of Distributed Systems. Goals in Distributed System Designs 1 Anand Tripathi CSci 8980 Operating Systems Lecture Notes 1 Basic Concepts Distributed Systems A set of computers (hosts or nodes) connected through a communication network. Nodes may have different speeds

More information

10. Replication. Motivation

10. Replication. Motivation 10. Replication Page 1 10. Replication Motivation Reliable and high-performance computation on a single instance of a data object is prone to failure. Replicate data to overcome single points of failure

More information

Building Consistent Transactions with Inconsistent Replication

Building Consistent Transactions with Inconsistent Replication DB Reading Group Fall 2015 slides by Dana Van Aken Building Consistent Transactions with Inconsistent Replication Irene Zhang, Naveen Kr. Sharma, Adriana Szekeres, Arvind Krishnamurthy, Dan R. K. Ports

More information

CompSci 516 Database Systems

CompSci 516 Database Systems CompSci 516 Database Systems Lecture 20 NoSQL and Column Store Instructor: Sudeepa Roy Duke CS, Fall 2018 CompSci 516: Database Systems 1 Reading Material NOSQL: Scalable SQL and NoSQL Data Stores Rick

More information

FAQs Snapshots and locks Vector Clock

FAQs Snapshots and locks Vector Clock //08 CS5 Introduction to Big - FALL 08 W.B.0.0 CS5 Introduction to Big //08 CS5 Introduction to Big - FALL 08 W.B. FAQs Snapshots and locks Vector Clock PART. LARGE SCALE DATA STORAGE SYSTEMS NO SQL DATA

More information

DYNAMO: AMAZON S HIGHLY AVAILABLE KEY-VALUE STORE. Presented by Byungjin Jun

DYNAMO: AMAZON S HIGHLY AVAILABLE KEY-VALUE STORE. Presented by Byungjin Jun DYNAMO: AMAZON S HIGHLY AVAILABLE KEY-VALUE STORE Presented by Byungjin Jun 1 What is Dynamo for? Highly available key-value storages system Simple primary-key only interface Scalable and Reliable Tradeoff:

More information

Distributed Systems. Lec 12: Consistency Models Sequential, Causal, and Eventual Consistency. Slide acks: Jinyang Li

Distributed Systems. Lec 12: Consistency Models Sequential, Causal, and Eventual Consistency. Slide acks: Jinyang Li Distributed Systems Lec 12: Consistency Models Sequential, Causal, and Eventual Consistency Slide acks: Jinyang Li (http://www.news.cs.nyu.edu/~jinyang/fa10/notes/ds-eventual.ppt) 1 Consistency (Reminder)

More information

Relational databases

Relational databases COSC 6397 Big Data Analytics NoSQL databases Edgar Gabriel Spring 2017 Relational databases Long lasting industry standard to store data persistently Key points concurrency control, transactions, standard

More information

Distributed Data Management Replication

Distributed Data Management Replication Felix Naumann F-2.03/F-2.04, Campus II Hasso Plattner Institut Distributing Data Motivation Scalability (Elasticity) If data volume, processing, or access exhausts one machine, you might want to spread

More information

The NoSQL Ecosystem. Adam Marcus MIT CSAIL

The NoSQL Ecosystem. Adam Marcus MIT CSAIL The NoSQL Ecosystem Adam Marcus MIT CSAIL marcua@csail.mit.edu / @marcua About Me Social Computing + Database Systems Easily Distracted: Wrote The NoSQL Ecosystem in The Architecture of Open Source Applications

More information

Building Consistent Transactions with Inconsistent Replication

Building Consistent Transactions with Inconsistent Replication Building Consistent Transactions with Inconsistent Replication Irene Zhang, Naveen Kr. Sharma, Adriana Szekeres, Arvind Krishnamurthy, Dan R. K. Ports University of Washington Distributed storage systems

More information

Distributed Systems. Catch-up Lecture: Consistency Model Implementations

Distributed Systems. Catch-up Lecture: Consistency Model Implementations Distributed Systems Catch-up Lecture: Consistency Model Implementations Slides redundant with Lec 11,12 Slide acks: Jinyang Li, Robert Morris, Dave Andersen 1 Outline Last times: Consistency models Strict

More information

Database Availability and Integrity in NoSQL. Fahri Firdausillah [M ]

Database Availability and Integrity in NoSQL. Fahri Firdausillah [M ] Database Availability and Integrity in NoSQL Fahri Firdausillah [M031010012] What is NoSQL Stands for Not Only SQL Mostly addressing some of the points: nonrelational, distributed, horizontal scalable,

More information

Riak. Distributed, replicated, highly available

Riak. Distributed, replicated, highly available INTRO TO RIAK Riak Overview Riak Distributed Riak Distributed, replicated, highly available Riak Distributed, highly available, eventually consistent Riak Distributed, highly available, eventually consistent,

More information

CSE-E5430 Scalable Cloud Computing Lecture 10

CSE-E5430 Scalable Cloud Computing Lecture 10 CSE-E5430 Scalable Cloud Computing Lecture 10 Keijo Heljanko Department of Computer Science School of Science Aalto University keijo.heljanko@aalto.fi 23.11-2015 1/29 Exam Registering for the exam is obligatory,

More information

CSE 544 Principles of Database Management Systems. Magdalena Balazinska Winter 2015 Lecture 14 NoSQL

CSE 544 Principles of Database Management Systems. Magdalena Balazinska Winter 2015 Lecture 14 NoSQL CSE 544 Principles of Database Management Systems Magdalena Balazinska Winter 2015 Lecture 14 NoSQL References Scalable SQL and NoSQL Data Stores, Rick Cattell, SIGMOD Record, December 2010 (Vol. 39, No.

More information

Jargons, Concepts, Scope and Systems. Key Value Stores, Document Stores, Extensible Record Stores. Overview of different scalable relational systems

Jargons, Concepts, Scope and Systems. Key Value Stores, Document Stores, Extensible Record Stores. Overview of different scalable relational systems Jargons, Concepts, Scope and Systems Key Value Stores, Document Stores, Extensible Record Stores Overview of different scalable relational systems Examples of different Data stores Predictions, Comparisons

More information

DISTRIBUTED EVENTUALLY CONSISTENT COMPUTATIONS

DISTRIBUTED EVENTUALLY CONSISTENT COMPUTATIONS LASP 2 DISTRIBUTED EVENTUALLY CONSISTENT COMPUTATIONS 3 EN TAL AV CHRISTOPHER MEIKLEJOHN 4 RESEARCH WITH: PETER VAN ROY (UCL) 5 MOTIVATION 6 SYNCHRONIZATION IS EXPENSIVE 7 SYNCHRONIZATION IS SOMETIMES

More information

Trade- Offs in Cloud Storage Architecture. Stefan Tai

Trade- Offs in Cloud Storage Architecture. Stefan Tai Trade- Offs in Cloud Storage Architecture Stefan Tai Cloud computing is about providing and consuming resources as services There are five essential characteristics of cloud services [NIST] [NIST]: http://csrc.nist.gov/groups/sns/cloud-

More information

Replication. Feb 10, 2016 CPSC 416

Replication. Feb 10, 2016 CPSC 416 Replication Feb 10, 2016 CPSC 416 How d we get here? Failures & single systems; fault tolerance techniques added redundancy (ECC memory, RAID, etc.) Conceptually, ECC & RAID both put a master in front

More information

Eventual Consistency Today: Limitations, Extensions and Beyond

Eventual Consistency Today: Limitations, Extensions and Beyond Eventual Consistency Today: Limitations, Extensions and Beyond Peter Bailis and Ali Ghodsi, UC Berkeley Presenter: Yifei Teng Part of slides are cited from Nomchin Banga Road Map Eventual Consistency:

More information

Distributed Systems (5DV147)

Distributed Systems (5DV147) Distributed Systems (5DV147) Replication and consistency Fall 2013 1 Replication 2 What is replication? Introduction Make different copies of data ensuring that all copies are identical Immutable data

More information

Basic vs. Reliable Multicast

Basic vs. Reliable Multicast Basic vs. Reliable Multicast Basic multicast does not consider process crashes. Reliable multicast does. So far, we considered the basic versions of ordered multicasts. What about the reliable versions?

More information

Advances in Data Management - NoSQL, NewSQL and Big Data A.Poulovassilis

Advances in Data Management - NoSQL, NewSQL and Big Data A.Poulovassilis Advances in Data Management - NoSQL, NewSQL and Big Data A.Poulovassilis 1 NoSQL So-called NoSQL systems offer reduced functionalities compared to traditional Relational DBMSs, with the aim of achieving

More information

Migrating Oracle Databases To Cassandra

Migrating Oracle Databases To Cassandra BY UMAIR MANSOOB Why Cassandra Lower Cost of ownership makes it #1 choice for Big Data OLTP Applications. Unlike Oracle, Cassandra can store structured, semi-structured, and unstructured data. Cassandra

More information

Design Patterns for Large- Scale Data Management. Robert Hodges OSCON 2013

Design Patterns for Large- Scale Data Management. Robert Hodges OSCON 2013 Design Patterns for Large- Scale Data Management Robert Hodges OSCON 2013 The Start-Up Dilemma 1. You are releasing Online Storefront V 1.0 2. It could be a complete bust 3. But it could be *really* big

More information

NoSQL Databases. Amir H. Payberah. Swedish Institute of Computer Science. April 10, 2014

NoSQL Databases. Amir H. Payberah. Swedish Institute of Computer Science. April 10, 2014 NoSQL Databases Amir H. Payberah Swedish Institute of Computer Science amir@sics.se April 10, 2014 Amir H. Payberah (SICS) NoSQL Databases April 10, 2014 1 / 67 Database and Database Management System

More information

Data Modeling and Databases Ch 14: Data Replication. Gustavo Alonso, Ce Zhang Systems Group Department of Computer Science ETH Zürich

Data Modeling and Databases Ch 14: Data Replication. Gustavo Alonso, Ce Zhang Systems Group Department of Computer Science ETH Zürich Data Modeling and Databases Ch 14: Data Replication Gustavo Alonso, Ce Zhang Systems Group Department of Computer Science ETH Zürich Database Replication What is database replication The advantages of

More information

Chapter 11 - Data Replication Middleware

Chapter 11 - Data Replication Middleware Prof. Dr.-Ing. Stefan Deßloch AG Heterogene Informationssysteme Geb. 36, Raum 329 Tel. 0631/205 3275 dessloch@informatik.uni-kl.de Chapter 11 - Data Replication Middleware Motivation Replication: controlled

More information

CISC 7610 Lecture 5 Distributed multimedia databases. Topics: Scaling up vs out Replication Partitioning CAP Theorem NoSQL NewSQL

CISC 7610 Lecture 5 Distributed multimedia databases. Topics: Scaling up vs out Replication Partitioning CAP Theorem NoSQL NewSQL CISC 7610 Lecture 5 Distributed multimedia databases Topics: Scaling up vs out Replication Partitioning CAP Theorem NoSQL NewSQL Motivation YouTube receives 400 hours of video per minute That is 200M hours

More information

Consistency and Replication. Some slides are from Prof. Jalal Y. Kawash at Univ. of Calgary

Consistency and Replication. Some slides are from Prof. Jalal Y. Kawash at Univ. of Calgary Consistency and Replication Some slides are from Prof. Jalal Y. Kawash at Univ. of Calgary Reasons for Replication Reliability/Availability : Mask failures Mask corrupted data Performance: Scalability

More information

MY CONVERSATION HAS RUN DRY

MY CONVERSATION HAS RUN DRY PARTITION TOLERANCE MY CONVERSATION HAS RUN DRY Many systems degrade, or otherwise change state, under partition BRING THE PIECES BACK TOGETHER REDISCOVER COMMUNICATION A EXAMPLE ANPLICATION 5 clients

More information

Replication in Distributed Systems

Replication in Distributed Systems Replication in Distributed Systems Replication Basics Multiple copies of data kept in different nodes A set of replicas holding copies of a data Nodes can be physically very close or distributed all over

More information

Advanced Databases ( CIS 6930) Fall Instructor: Dr. Markus Schneider. Group 17 Anirudh Sarma Bhaskara Sreeharsha Poluru Ameya Devbhankar

Advanced Databases ( CIS 6930) Fall Instructor: Dr. Markus Schneider. Group 17 Anirudh Sarma Bhaskara Sreeharsha Poluru Ameya Devbhankar Advanced Databases ( CIS 6930) Fall 2016 Instructor: Dr. Markus Schneider Group 17 Anirudh Sarma Bhaskara Sreeharsha Poluru Ameya Devbhankar BEFORE WE BEGIN NOSQL : It is mechanism for storage & retrieval

More information

NoSQL systems. Lecture 21 (optional) Instructor: Sudeepa Roy. CompSci 516 Data Intensive Computing Systems

NoSQL systems. Lecture 21 (optional) Instructor: Sudeepa Roy. CompSci 516 Data Intensive Computing Systems CompSci 516 Data Intensive Computing Systems Lecture 21 (optional) NoSQL systems Instructor: Sudeepa Roy Duke CS, Spring 2016 CompSci 516: Data Intensive Computing Systems 1 Key- Value Stores Duke CS,

More information

Integrity in Distributed Databases

Integrity in Distributed Databases Integrity in Distributed Databases Andreas Farella Free University of Bozen-Bolzano Table of Contents 1 Introduction................................................... 3 2 Different aspects of integrity.....................................

More information

EECS 498 Introduction to Distributed Systems

EECS 498 Introduction to Distributed Systems EECS 498 Introduction to Distributed Systems Fall 2017 Harsha V. Madhyastha Dynamo Recap Consistent hashing 1-hop DHT enabled by gossip Execution of reads and writes Coordinated by first available successor

More information

Eventual Consistency 1

Eventual Consistency 1 Eventual Consistency 1 Readings Werner Vogels ACM Queue paper http://queue.acm.org/detail.cfm?id=1466448 Dynamo paper http://www.allthingsdistributed.com/files/ amazon-dynamo-sosp2007.pdf Apache Cassandra

More information

Certified Program Models for Eventual Consistency

Certified Program Models for Eventual Consistency Certified Program Models for Eventual Consistency Edgar Pek, Pranav Garg, Muntasir Raihan Rahman, Indranil Gupta, P. Madhusudan University of Illinois at Urbana-Champaign {pek1, garg11, mrahman2, indy,

More information

MDCC MULTI DATA CENTER CONSISTENCY. amplab. Tim Kraska, Gene Pang, Michael Franklin, Samuel Madden, Alan Fekete

MDCC MULTI DATA CENTER CONSISTENCY. amplab. Tim Kraska, Gene Pang, Michael Franklin, Samuel Madden, Alan Fekete MDCC MULTI DATA CENTER CONSISTENCY Tim Kraska, Gene Pang, Michael Franklin, Samuel Madden, Alan Fekete gpang@cs.berkeley.edu amplab MOTIVATION 2 3 June 2, 200: Rackspace power outage of approximately 0

More information

Distributed PostgreSQL with YugaByte DB

Distributed PostgreSQL with YugaByte DB Distributed PostgreSQL with YugaByte DB Karthik Ranganathan PostgresConf Silicon Valley Oct 16, 2018 1 CHECKOUT THIS REPO: github.com/yugabyte/yb-sql-workshop 2 About Us Founders Kannan Muthukkaruppan,

More information

Just-Right Consistency. Centralised data store. trois bases. As available as possible As consistent as necessary Correct by design

Just-Right Consistency. Centralised data store. trois bases. As available as possible As consistent as necessary Correct by design Just-Right Consistency As available as possible As consistent as necessary Correct by design Marc Shapiro, UMC-LI6 & Inria Annette Bieniusa, U. Kaiserslautern Nuno reguiça, U. Nova Lisboa Christopher Meiklejohn,

More information

Strong Eventual Consistency and CRDTs

Strong Eventual Consistency and CRDTs Strong Eventual Consistency and CRDTs Marc Shapiro, INRIA & LIP6 Nuno Preguiça, U. Nova de Lisboa Carlos Baquero, U. Minho Marek Zawirski, INRIA & UPMC Large-scale replicated data structures Large, dynamic

More information

CS /15/16. Paul Krzyzanowski 1. Question 1. Distributed Systems 2016 Exam 2 Review. Question 3. Question 2. Question 5.

CS /15/16. Paul Krzyzanowski 1. Question 1. Distributed Systems 2016 Exam 2 Review. Question 3. Question 2. Question 5. Question 1 What makes a message unstable? How does an unstable message become stable? Distributed Systems 2016 Exam 2 Review Paul Krzyzanowski Rutgers University Fall 2016 In virtual sychrony, a message

More information

CMU SCS CMU SCS Who: What: When: Where: Why: CMU SCS

CMU SCS CMU SCS Who: What: When: Where: Why: CMU SCS Carnegie Mellon Univ. Dept. of Computer Science 15-415/615 - DB s C. Faloutsos A. Pavlo Lecture#23: Distributed Database Systems (R&G ch. 22) Administrivia Final Exam Who: You What: R&G Chapters 15-22

More information

There is a tempta7on to say it is really used, it must be good

There is a tempta7on to say it is really used, it must be good Notes from reviews Dynamo Evalua7on doesn t cover all design goals (e.g. incremental scalability, heterogeneity) Is it research? Complexity? How general? Dynamo Mo7va7on Normal database not the right fit

More information

Intuitive distributed algorithms. with F#

Intuitive distributed algorithms. with F# Intuitive distributed algorithms with F# Natallia Dzenisenka Alena Hall @nata_dzen @lenadroid A tour of a variety of intuitivedistributed algorithms used in practical distributed systems. and how to prototype

More information

10. Replication. CSEP 545 Transaction Processing Philip A. Bernstein Sameh Elnikety. Copyright 2012 Philip A. Bernstein

10. Replication. CSEP 545 Transaction Processing Philip A. Bernstein Sameh Elnikety. Copyright 2012 Philip A. Bernstein 10. Replication CSEP 545 Transaction Processing Philip A. Bernstein Sameh Elnikety Copyright 2012 Philip A. Bernstein 1 Outline 1. Introduction 2. Primary-Copy Replication 3. Multi-Master Replication 4.

More information
