CAP and the Architectural Consequences


NoSQL matters Cologne, 2013-04-27
Martin Schönert (triagens GmbH)

Who am I
Martin Schönert, I work at triagens GmbH.
I have been in software development for 30 years: programmer, product manager, responsible for a data center, department head at a large company, software architect.
I am the architect of

The CAP Theorem: Consistency, Availability, Partition Tolerance
(diagram: a write arrives at one node and is replicated to the others)
(diagram: a read returns the actual, current data)
(diagram: a network partition cuts the nodes into disconnected groups)

Theorem: you can have at most two of these properties for any shared data system: Consistency, Availability, Tolerance to network Partitions.
Dr. Eric A. Brewer, "Towards Robust Distributed Systems", PODC Keynote, July 19, 2000. Proceedings of the Annual ACM Symposium on Principles of Distributed Computing, 2000.

Which was criticized in many articles and blog entries (below is just a small sample):
blog.voltdb.com/clarifications-cap-theorem-and-data-related-errors/
dbmsmusings.blogspot.de/2010/04/problems-with-cap-and-yahoos-little.html
codahale.com/you-cant-sacrifice-partition-tolerance/

"I really need to write an updated CAP theorem paper." Dr. Eric A. Brewer (Twitter, Oct. 2010)

Critique of CAP: CP
Was basically interpreted as: if anything at all goes wrong (real network partition, node failure, ...), immediately stop accepting any operation (read, write, ...) at all.
And was rejected because: you can still accept some operations (e.g. reads), or continue to accept all operations in one partition (e.g. the one with a quorum), ...

Critique of CAP: CA
Was basically interpreted as: the system gives up all of the ACID semantics, and at no time (even while not partitioned) does the system guarantee consistency.
This confusion arose partly because at the same time we had discussions about ACID vs. BASE and PACELC: if Partitioned, trade Availability vs. Consistency; Else, trade Latency vs. Consistency.

Critique of CAP: CA
Can you actually choose to not have partitions?
Yes: for small clusters (2-3 nodes) in one datacenter, where nodes and clients are connected through one switch.
No: not for systems with more nodes or distributed over several datacenters.

So let us take a better look at the situation: the system cycles from normal mode through partition detection into partition mode, then through partition recovery back to normal mode, while operations on the state continue throughout.

Detect the partition
This happens, at the latest, when one node tries to replicate an operation to another node and this times out. In that moment the node must make a decision:
go ahead with the operation (and risk consistency)
cancel the operation (and reduce availability)
Options:
a separate watchdog (to distinguish failed nodes from partitions)
heartbeats (to avoid that only one side detects the partition)
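The heartbeat option can be sketched in a few lines. This is a generic illustration, not code from any particular system; the class name, the peer labels, and the 3-second timeout are all assumptions made for the example:

```python
import time

class FailureDetector:
    """Minimal heartbeat-based failure detector: a peer that has not
    sent a heartbeat within `timeout` seconds is suspected of having
    failed or of being on the other side of a partition."""

    def __init__(self, peers, timeout=3.0, clock=time.monotonic):
        self.timeout = timeout
        self.clock = clock  # injectable clock, so the logic is testable
        self.last_seen = {peer: self.clock() for peer in peers}

    def heartbeat(self, peer):
        # Called whenever a heartbeat message arrives from `peer`.
        self.last_seen[peer] = self.clock()

    def suspected(self):
        # Peers whose last heartbeat is older than the timeout.
        now = self.clock()
        return {peer for peer, seen in self.last_seen.items()
                if now - seen > self.timeout}
```

Note that a timeout alone cannot distinguish a dead node from an unreachable one, which is exactly why a separate watchdog is suggested in addition to heartbeats.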

Partition Mode
Place restrictions:
on the nodes that accept operations: quorum
on the data on which a client can operate: data ownership (MESI, MOESI, ...); problematic for complex operations
on the operations: read only
on the semantics: delayed commit, async failure, record intent
or any combination of the above, possibly with human intervention (e.g. shut down one partition and make the other fully functional)
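The quorum restriction amounts to a simple majority test: a group of nodes may keep accepting writes only while it can reach a strict majority of the cluster, so that at most one side of any partition stays writable. A minimal sketch (the function name and the set-of-nodes representation are assumptions of this example):

```python
def has_quorum(reachable_nodes, cluster_size):
    """True if the reachable nodes form a strict majority of the
    cluster, i.e. this side of a partition may keep accepting writes."""
    return len(reachable_nodes) > cluster_size // 2
```

In a 5-node cluster split 3/2, only the 3-node side passes this test; in an even 2/2 split, neither side does, and write availability is lost entirely.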

Partition Recovery
Must merge states. Merging strategies:
last writer wins
commutative operators
lattice of operations
application controlled
opportunistic (at read time)
Eventual consistency: it IS NOT merely the fact that every operation is first committed on one node and later (eventually) replicated to other nodes. It IS the fact that the system will heal itself, i.e. converge to a consistent state without external intervention.
Must fix invariants, e.g. violations of uniqueness constraints.
Techniques: Merkle hash trees, hinted handoff.
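The simplest of these merging strategies, last writer wins, can be sketched like this. It is a toy illustration (the timestamped-value representation and the function name are assumptions of the example; real systems need vector clocks or similar to order concurrent writes safely):

```python
def lww_merge(replica_a, replica_b):
    """Merge two replicas of a key/value store in which every value
    carries the (logical) timestamp of the write that produced it.
    For each key, the write with the higher timestamp wins."""
    merged = dict(replica_a)
    for key, (ts, value) in replica_b.items():
        if key not in merged or ts > merged[key][0]:
            merged[key] = (ts, value)
    return merged
```

Because this merge is commutative and idempotent, both sides of a healed partition converge to the same state no matter in which order they exchange their data, which is the self-healing property described above.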

Massively Distributed Systems
Store so much data that hundreds of nodes are needed just to store it.
Not that common.
Main driver behind early NoSQL developments.
Receive a lot of publicity.

Consequences of CAP for massively distributed systems
Failures happen constantly: nodes die, network connections die, network routes flap.
Partition detection: partitions can be huge, the number of possible failure modes and fault lines is HUGE, and finding out the actual failure mode quickly is impossible, so the system must always operate under a worst-case assumption.
Must use resources well: if a node dies, its load must be distributed over multiple other nodes.

Consequences of CAP for massively distributed systems
Partition mode:
restricting operations to nodes with a quorum is impossible
restricting operations to read only is impossible
restricting operation semantics is possible (though always difficult)
restricting operations to own or borrowed data is sometimes necessary
Partition recovery:
must happen fully automatically
must merge states
must fix invariants
Consequence: no complex operations, or only local complex operations.

Further properties of massively distributed systems
Properties:
nodes fail often
new nodes are added regularly
nodes are not homogeneous
Consequence: distribution and redistribution of data must be fully automatic; the standard technique is consistent hashing.
No complex operations:
no scans over large parts of the data
no non-trivial joins
no multi-index operations
"The marvel is not that the bear dances well, but that the bear dances at all." (Russian proverb)
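Consistent hashing, the mechanism for fully automatic (re)distribution, can be sketched in a few lines. This is the generic textbook scheme (the hash choice and class name are assumptions of the example; production systems add virtual nodes for better balance):

```python
import bisect
import hashlib

def _point(key):
    # Map an arbitrary string onto a numeric position on the hash ring.
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class ConsistentHashRing:
    """Each key belongs to the first node found clockwise from the
    key's position on the ring. Adding or removing one node therefore
    only moves the keys on one arc; all other keys stay where they are."""

    def __init__(self, nodes):
        self._ring = sorted((_point(node), node) for node in nodes)
        self._points = [p for p, _ in self._ring]

    def node_for(self, key):
        # First node clockwise from the key (wrapping around the ring).
        idx = bisect.bisect(self._points, _point(key)) % len(self._ring)
        return self._ring[idx][1]
```

This locality is what makes redistribution cheap: when a node joins or leaves, only the keys on its arc of the ring change owners.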

My view of the (NoSQL) Database world
(chart: database families placed along two axes)
One axis separates DBs that manage an evolving state (OLTP) from those analyzing data (OLAP); the other ranges from operations on complex structures and complex queries to massively distributed.
Families shown: Map/Reduce, document stores, key/value stores, graph stores, column-oriented stores.