NoSQL systems: sharding, replication and consistency. Riccardo Torlone Università Roma Tre


Data distribution
NoSQL systems: data distributed over large clusters
The aggregate is a natural unit to use for data distribution
Data distribution models:
- Single server (an option for some applications)
- Multiple servers
Orthogonal aspects of data distribution:
- Sharding: different data on different nodes
- Replication: the same data copied over multiple nodes

Sharding
Different parts of the data are placed on different servers: horizontal scalability
Ideal case: different users all talking to different server nodes
Data accessed together is kept on the same node: the aggregate is the unit of sharding!
Pros: it can improve both reads and writes
Cons: clusters use less reliable machines, so resilience decreases

Improving performance
Main rules of sharding:
1. Place the data close to where it's accessed
   - Orders for Boston: keep that data in your eastern US data center
2. Try to keep the load even
   - All nodes should receive roughly equal amounts of the load
3. Put together data that may be read in sequence
   - Same order, same node
Many NoSQL databases offer auto-sharding: the database itself takes on the responsibility of sharding
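A common way to implement rule 2 is to hash the aggregate's key and take it modulo the number of shards; choosing a shard key shared by related data (e.g. the order id) also satisfies rule 3. A minimal sketch (the key format is illustrative, not tied to any particular system):

```python
import hashlib

def shard_for(key: str, num_shards: int) -> int:
    """Map an aggregate's key to one of num_shards nodes.

    Hashing spreads keys roughly evenly across shards; using the
    same shard key for related data keeps it on the same node.
    """
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % num_shards

# Every access to order "order:42" is routed to the same shard:
shard = shard_for("order:42", 4)
```

Real auto-sharding systems typically use consistent hashing instead of a plain modulo, so that adding a node does not remap most keys.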

Replication
It comes in two forms:
- master-slave
- peer-to-peer

Master-Slave Replication
Master:
- the authoritative source for the data
- responsible for processing any updates to that data
- can be appointed manually or automatically
Slaves:
- a replication process synchronizes the slaves with the master
- after a failure of the master, a slave can be appointed as the new master very quickly

Pros and cons of Master-Slave Replication
Pros:
- More read requests: add more slave nodes and ensure that all read requests are routed to the slaves
- Should the master fail, the slaves can still handle read requests
- Good for read-intensive datasets
Cons:
- The master is a bottleneck: limited by its ability to process updates and to pass them on
- Its failure eliminates the ability to handle writes until the master is restored or a new master is appointed
- Inconsistency due to slow propagation of changes to the slaves
- Bad for datasets with heavy write traffic
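The routing described above (writes to the master, reads spread over slaves, with asynchronous propagation) can be sketched in a few lines. This is a toy in-memory model with hypothetical method names, not the API of any real database:

```python
class MasterSlaveStore:
    """Toy master-slave store: only the master accepts writes,
    reads are round-robined across slaves, which stay stale
    until the replication process runs."""

    def __init__(self, num_slaves: int = 2):
        self.master = {}
        self.slaves = [dict() for _ in range(num_slaves)]
        self._rr = 0  # round-robin index for read routing

    def write(self, key, value):
        self.master[key] = value          # only the master processes updates

    def replicate(self):
        for s in self.slaves:             # propagation is asynchronous:
            s.update(self.master)         # until it runs, slave reads are stale

    def read(self, key):
        self._rr = (self._rr + 1) % len(self.slaves)
        return self.slaves[self._rr].get(key)   # routed to a slave

store = MasterSlaveStore()
store.write("room:ace", "available")
stale = store.read("room:ace")     # None: the write has not propagated yet
store.replicate()
fresh = store.read("room:ace")     # now "available"
```

The gap between the write and `replicate()` is exactly the "slow propagation" inconsistency listed in the cons.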

Peer-to-Peer Replication
All the replicas have equal weight: they can all accept writes
The loss of any of them doesn't prevent access to the data store

Pros and cons of peer-to-peer replication
Pros:
- you can ride over node failures without losing access to data
- you can easily add nodes to improve performance
Cons: inconsistency!
- Slow propagation of changes to copies on different nodes
- Inconsistencies on read lead to problems, but are relatively transient
- Two people can update different copies of the same record stored on different nodes at the same time: a write-write conflict
- Inconsistent writes are forever
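One common way for peer-to-peer systems to detect (rather than silently overwrite) write-write conflicts is a version vector: each replica counts the updates it has applied, and two versions conflict when neither descends from the other. A minimal sketch, not tied to any specific system:

```python
def happened_before(a: dict, b: dict) -> bool:
    """True if version vector a precedes b (a is an ancestor of b)."""
    keys = set(a) | set(b)
    return a != b and all(a.get(k, 0) <= b.get(k, 0) for k in keys)

def in_conflict(a: dict, b: dict) -> bool:
    """Neither update descends from the other: a write-write conflict."""
    return a != b and not happened_before(a, b) and not happened_before(b, a)

# Two nodes update the same record concurrently:
london = {"london": 1}
mumbai = {"mumbai": 1}
assert in_conflict(london, mumbai)      # must be merged, not overwritten

# An update that already saw London's write is not a conflict:
merged = {"london": 1, "mumbai": 1}
assert happened_before(london, merged)
```

When a conflict is detected, the system either picks a winner or hands both versions to the application to merge.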

Sharding and Replication on Master-Slave
We can have multiple masters, but each data item has a single master
Two schemes:
- a node can be the master for some data and a slave for other data
- nodes are dedicated to master or slave duties

Sharding and Replication on P2P
Usually each shard is present on three nodes
A common strategy for column-family databases

Key points
There are two styles of distributing data:
- Sharding distributes different data across multiple servers: each server acts as the single source for a subset of data.
- Replication copies data across multiple servers: each bit of data can be found in multiple places.
A system may use either or both techniques.
Replication comes in two forms:
- Master-slave replication makes one node the authoritative copy that handles writes, while slaves synchronize with the master and may handle reads.
- Peer-to-peer replication allows writes to any node; the nodes coordinate to synchronize their copies of the data.
Master-slave replication reduces the chance of update conflicts, but peer-to-peer replication avoids loading all writes onto a single point of failure.

Consistency
The biggest change from a centralized relational database to a cluster-oriented NoSQL system
Relational databases: strong consistency
NoSQL systems: mostly eventual consistency

Conflicts
Read-read (or simply read) conflict: different people see different data at the same time
- Stale data: out-of-date data
Write-write conflict: two people updating the same data item at the same time
- If the server serializes them, one update is applied and immediately overwritten by the other (lost update)
Read-write conflict: a read in the middle of two logically-related writes
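The lost update can be reproduced in a few lines. A minimal sketch with hypothetical client names and a plain dict standing in for the store:

```python
store = {"phone": "555-0000"}

# Two clients read the same record...
client_a = dict(store)
client_b = dict(store)

# ...and each updates its own copy:
client_a["phone"] = "555-1111"
client_b["phone"] = "555-2222"

# The server serializes the writes: A's update is applied first
store.update(client_a)
# ...and immediately overwritten by B's, which never saw it:
store.update(client_b)
# Result: client A's update is silently lost
```

Neither client did anything wrong; the conflict arises purely from the interleaving, which is why it must be prevented (pessimistically) or detected (optimistically).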

Read conflict
Replication is a source of inconsistency

Read-write conflict
A read in the middle of two logically-related writes

Solutions
Pessimistic approach: prevent conflicts from occurring
- usually implemented with write locks managed by the system
Optimistic approach: lets conflicts occur, but detects them and takes action to sort them out
- conditional updates: test the value just before updating
- save both updates: record that they are in conflict and then merge them
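Conditional updates are usually implemented as a compare-and-set: each value carries a version, and a write succeeds only if the version the client read is still current. A minimal sketch (the class and method names are illustrative):

```python
class VersionedStore:
    """Toy store supporting optimistic conditional updates."""

    def __init__(self):
        self._data = {}  # key -> (value, version)

    def read(self, key):
        return self._data.get(key, (None, 0))

    def conditional_update(self, key, new_value, expected_version) -> bool:
        _, current = self._data.get(key, (None, 0))
        if current != expected_version:
            return False                       # conflict detected: caller retries
        self._data[key] = (new_value, current + 1)
        return True

s = VersionedStore()
_, v = s.read("room")
ok_first = s.conditional_update("room", "booked by Ann", v)      # succeeds
ok_second = s.conditional_update("room", "booked by Pathin", v)  # stale version, rejected
```

The losing client learns about the conflict and can re-read and retry, instead of silently causing a lost update.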

Pessimistic vs optimistic approach
Concurrency involves a fundamental tradeoff between:
- consistency (avoiding errors such as update conflicts), and
- availability (responding quickly to clients)
Pessimistic approaches often:
- severely degrade the responsiveness of a system
- lead to deadlocks, which are hard to prevent and debug

Forms of consistency
- Strong (or immediate) consistency: ACID transactions
- Logical consistency: no read-write conflicts (atomic transactions)
- Sequential consistency: updates are serialized
- Session (or read-your-writes) consistency: within a user's session
- Eventual consistency: you may have replication inconsistencies, but eventually all nodes will be updated to the same value
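Read-your-writes consistency can be provided even over stale replicas if the session tracks the version of its own last write and refuses older values. A minimal sketch under those assumptions (all names are illustrative):

```python
class Replica:
    def __init__(self):
        self.value, self.version = None, 0

class SessionClient:
    """Toy read-your-writes session: remembers the version of its own
    last write and only accepts replicas that have caught up to it."""

    def __init__(self, replicas):
        self.replicas = replicas
        self.last_seen = 0  # highest version written or read in this session

    def write(self, replica, value):
        replica.version += 1
        replica.value = value
        self.last_seen = replica.version

    def read(self):
        for r in self.replicas:
            if r.version >= self.last_seen:  # fresh enough for this session
                return r.value
        return None  # no replica has caught up yet: retry or go elsewhere

a, b = Replica(), Replica()
client = SessionClient([a, b])
client.write(a, "hello")
answer = client.read()   # served by replica a; stale replica b is skipped
```

In practice the same effect is often achieved more simply with sticky sessions, i.e. pinning a session's reads to the node that took its writes.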

Transactions on NoSQL databases
Graph databases tend to support ACID transactions
Aggregate-oriented NoSQL databases support atomic transactions, but only within a single aggregate
- To avoid inconsistency in our example: keep each order in a single aggregate
- Updates over multiple aggregates: possible inconsistent reads
Inconsistency window: the length of time an inconsistency is present
- Amazon's documentation: "the inconsistency window for the SimpleDB service is usually less than a second"

The CAP Theorem
Why you may need to relax consistency
- Proposed by Eric Brewer in 2000
- Formally proved by Seth Gilbert and Nancy Lynch in 2002
Given the properties of Consistency, Availability, and Partition tolerance, you can only get two:
- Consistency: all people see the same data at the same time
- Availability: every request receives a response, without guarantee that it contains the most recent write
- Partition tolerance: the system continues to operate despite communication breakages that separate the cluster into partitions unable to communicate with each other

Network partition

CA systems
A single-server system is the obvious example of a CA system
- e.g. a traditional RDBMS
CA cluster: if a partition occurs, all the nodes would have to go down
- feasible within a data center (where partitions are rare and partial)
Availability in CAP: every request received by a non-failing node in the system must result in a response
- a failed, unresponsive node doesn't imply a lack of availability
A system that suffers partitions (e.g. a distributed cluster) faces a tradeoff between consistency and availability:
- give up some consistency to get some availability

An example
- Ann is trying to book a room at the Ace Hotel in New York on a node of a booking system located in London
- Pathin is trying to do the same on a node located in Mumbai
- The booking system uses a peer-to-peer distribution model
- There is only one room available
- The network link between the nodes breaks

Possible solutions
- CA: neither user can book any hotel room
- CP: designate the Mumbai node as the master for the Ace Hotel
  - Pathin can make the reservation
  - Ann can see the (possibly inconsistent) room information, but cannot book the room
- AP: both nodes accept the hotel reservation. Overbooking!
These situations are closely tied to the domain: financial exchanges? Blogs? Shopping carts?
Issues:
- how tolerant you are of stale reads
- how long the inconsistency window can be
Common approach: BASE (Basically Available, Soft state, Eventual consistency)

Durability
You may want to trade off durability for higher performance
- Main-memory database: if the server crashes, any updates since the last flush will be lost
- Examples: keeping user-session state as temporary information, capturing sensor data from physical devices
Replication durability failure: a master processes an update but fails before that update is replicated to the other nodes

Practical approach: quorums
N: replication factor; W: number of nodes confirming a write; R: number of nodes contacted for a read
How many nodes do you need to contact to be sure you read the most up-to-date value?
- Read quorum: R + W > N
How many nodes need to be involved in a write to get consistency?
- Strong consistency: W = N; eventual consistency: W < N
- Write quorum (to avoid conflicting writes): W > N/2
In a master-slave distribution, W = 1 and R = 1 (to the master!) are enough
A replication factor of 3 is usually enough to have good resilience
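These inequalities can be checked mechanically. A small sketch (the function names are illustrative) showing why N = 3 with W = R = 2 is a popular configuration:

```python
def read_is_consistent(N: int, W: int, R: int) -> bool:
    """Read quorum: every read overlaps the latest confirmed write
    when R + W > N (the two sets of nodes must intersect)."""
    return R + W > N

def writes_cannot_conflict(N: int, W: int) -> bool:
    """Write quorum: two concurrent writes cannot both reach a
    majority when W > N/2."""
    return W > N / 2

# Typical configuration with replication factor N = 3:
assert read_is_consistent(N=3, W=2, R=2)       # quorum reads and writes
assert writes_cannot_conflict(N=3, W=2)
assert read_is_consistent(N=3, W=3, R=1)       # strong consistency: W = N
assert not read_is_consistent(N=3, W=1, R=1)   # eventual consistency: reads may be stale
```

With N = 3 and W = R = 2, any read contacts at least one node that confirmed the latest write, while each operation still tolerates one unreachable replica.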

Key points
- Write-write conflicts occur when two clients try to write the same data at the same time. Read-write conflicts occur when one client reads inconsistent data in the middle of another client's write. Read conflicts occur when two clients read inconsistent data.
- Pessimistic approaches lock data records to prevent conflicts; optimistic approaches detect conflicts and fix them.
- Distributed systems (with replicas) see read(-write) conflicts due to some nodes having received updates while other nodes have not.
- Eventual consistency means that at some point the system will become consistent, once all the writes have propagated to all the nodes.
- Clients usually want read-your-writes consistency: a client can write and then immediately read the new value. This can be difficult if the read and the write happen on different nodes.
- To get good consistency you need to involve many nodes in data operations, but this increases latency, so you often have to trade off consistency versus latency.
- The CAP theorem states that if you get a network partition, you have to trade off availability of data versus consistency.
- Durability can also be traded off against latency, particularly if you want to survive failures with replicated data.
- You do not need to contact all replicas to preserve strong consistency with replication; you just need a large enough quorum.