Distributed Systems. Day 13: Distributed Transaction. To Be or Not to Be Distributed.. Transactions

Similar documents
Distributed File System

ATOMIC COMMITMENT Or: How to Implement Distributed Transactions in Sharded Databases

Distributed System. Gang Wu. Spring,2018

CS /15/16. Paul Krzyzanowski 1. Question 1. Distributed Systems 2016 Exam 2 Review. Question 3. Question 2. Question 5.

CS October 2017

Large-Scale Key-Value Stores Eventual Consistency Marco Serafini

Goal A Distributed Transaction

Building Consistent Transactions with Inconsistent Replication

Distributed systems. Lecture 6: distributed transactions, elections, consensus and replication. Malte Schwarzkopf

CS 425 / ECE 428 Distributed Systems Fall 2017

CSE 544 Principles of Database Management Systems. Alvin Cheung Fall 2015 Lecture 14 Distributed Transactions

Exam 2 Review. October 29, Paul Krzyzanowski 1

Causal Consistency and Two-Phase Commit

) Intel)(TX)memory):) Transac'onal) Synchroniza'on) Extensions)(TSX))) Transac'ons)

Performance and Forgiveness. June 23, 2008 Margo Seltzer Harvard University School of Engineering and Applied Sciences

Distributed Systems. Characteristics of Distributed Systems. Lecture Notes 1 Basic Concepts. Operating Systems. Anand Tripathi

Distributed Systems. Characteristics of Distributed Systems. Characteristics of Distributed Systems. Goals in Distributed System Designs

Topics in Reliable Distributed Systems

Distributed Systems. Day 9: Replication [Part 1]

TAPIR. By Irene Zhang, Naveen Sharma, Adriana Szekeres, Arvind Krishnamurthy, and Dan Ports Presented by Todd Charlton

Transactions. CS 475, Spring 2018 Concurrent & Distributed Systems

) Intel)(TX)memory):) Transac'onal) Synchroniza'on) Extensions)(TSX))) Transac'ons)

Computing Parable. The Archery Teacher. Courtesy: S. Keshav, U. Waterloo. Computer Science. Lecture 16, page 1

Module 8 Fault Tolerance CS655! 8-1!

Last time. Distributed systems Lecture 6: Elections, distributed transactions, and replication. DrRobert N. M. Watson

Defining properties of transactions

SCALABLE CONSISTENCY AND TRANSACTION MODELS

EECS 498 Introduction to Distributed Systems

Replication in Distributed Systems

Integrity in Distributed Databases

Transactions. Intel (TX memory): Transactional Synchronization Extensions (TSX) 2015 Donald Acton et al. Computer Science W2

CSE 444: Database Internals. Section 9: 2-Phase Commit and Replication

CPS 512 midterm exam #1, 10/7/2016

CMU SCS CMU SCS Who: What: When: Where: Why: CMU SCS

Control. CS432: Distributed Systems Spring 2017

Consistency in Distributed Systems

Distributed Transactions

Replication and Consistency. Fall 2010 Jussi Kangasharju

Modern Database Concepts

Fault Tolerance. Goals: transparent: mask (i.e., completely recover from) all failures, or predictable: exhibit a well defined failure behavior

CSE 5306 Distributed Systems

FIT: A Distributed Database Performance Tradeoff. Faleiro and Abadi CS590-BDS Thamir Qadah

Distributed Transaction Management

NoSQL systems: sharding, replication and consistency. Riccardo Torlone Università Roma Tre

CSE 5306 Distributed Systems. Fault Tolerance

Concurrency Control II and Distributed Transactions

Exam 2 Review. Fall 2011

Distributed Databases

Chapter 4: Distributed Systems: Replication and Consistency. Fall 2013 Jussi Kangasharju

Fault Tolerance. o Basic Concepts o Process Resilience o Reliable Client-Server Communication o Reliable Group Communication. o Distributed Commit

Database Replication: A Tutorial

Fault Tolerance Causes of failure: process failure machine failure network failure Goals: transparent: mask (i.e., completely recover from) all

Distributed Systems Fault Tolerance

Eventual Consistency Today: Limitations, Extensions and Beyond

Recap. CSE 486/586 Distributed Systems Case Study: Amazon Dynamo. Amazon Dynamo. Amazon Dynamo. Necessary Pieces? Overview of Key Design Techniques

Module 8 - Fault Tolerance

Distributed Databases

CS5412: THE BASE METHODOLOGY VERSUS THE ACID MODEL

Databases. Laboratorio de sistemas distribuidos. Universidad Politécnica de Madrid (UPM)

) Intel)(TX)memory):) Transac'onal) Synchroniza'on) Extensions)(TSX))) Transac'ons)

Parallel Data Types of Parallelism Replication (Multiple copies of the same data) Better throughput for read-only computations Data safety Partitionin

TRANSACTION MANAGEMENT

Transactions. Kathleen Durant PhD Northeastern University CS3200 Lesson 9

Building Consistent Transactions with Inconsistent Replication

CS Amazon Dynamo

Consistency. CS 475, Spring 2018 Concurrent & Distributed Systems

Consistency in Distributed Storage Systems. Mihir Nanavati March 4 th, 2016

PRIMARY-BACKUP REPLICATION

) Intel)(TX)memory):) Transac'onal) Synchroniza'on) Extensions)(TSX))) Transac'ons)

Distributed Systems Consensus

Dynamo: Amazon s Highly Available Key-value Store. ID2210-VT13 Slides by Tallat M. Shafaat

CS 347 Parallel and Distributed Data Processing

Motivating Example. Motivating Example. Transaction ROLLBACK. Transactions. CSE 444: Database Internals

Important Lessons. Today's Lecture. Two Views of Distributed Systems

Consistency and Replication. Why replicate?

CSE 544 Principles of Database Management Systems. Magdalena Balazinska Fall 2007 Lecture 13 - Distribution: transactions

CSE 444: Database Internals. Lectures 13 Transaction Schedules

CS 347 Parallel and Distributed Data Processing

Building a transactional distributed

CMPT 354: Database System I. Lecture 11. Transaction Management

Homework #2 Nathan Balon CIS 578 October 31, 2004

CS 347 Parallel and Distributed Data Processing

CprE Fault Tolerance. Dr. Yong Guan. Department of Electrical and Computer Engineering & Information Assurance Center Iowa State University

Chapter 22. Transaction Management

Consensus in Distributed Systems. Jeff Chase Duke University

Transactions and ACID

Transaction Management. Pearson Education Limited 1995, 2005

Database Architectures

Recovering from a Crash. Three-Phase Commit

Principles of Software Construction: Objects, Design, and Concurrency

Lock-based concurrency control. Q: What if access patterns rarely, if ever, conflict? Serializability. Concurrency Control II (OCC, MVCC)

Fault Tolerance. Distributed Systems. September 2002

Synchronization Part II. CS403/534 Distributed Systems Erkay Savas Sabanci University

No compromises: distributed transactions with consistency, availability, and performance

No Compromises. Distributed Transactions with Consistency, Availability, Performance

Data Modeling and Databases Ch 14: Data Replication. Gustavo Alonso, Ce Zhang Systems Group Department of Computer Science ETH Zürich

Consistency and Scalability

Mobile and Heterogeneous databases Distributed Database System Transaction Management. A.R. Hurson Computer Science Missouri Science & Technology

Datenbanksysteme II: Implementation of Database Systems Synchronization of Concurrent Transactions

Transaction Models for Massively Multiplayer Online Games

Transcription:

Distributed Systems Day 13: Distributed Transaction To Be or Not to Be Distributed.. Transactions

Summary Background on Transactions ACID Semantics Distribute Transactions Terminology: Transaction manager,, Two Phase Commit Adding Isolation with Locks: optimistic V. pessimistic Performance Issues Consistency Models Serializability Versus Linearizability

https://cloud.google.com/spanner/docs/transactions

https://blog.couchbase.com/optimistic-or-pessimistic-locking-which-one-should-you-pick/

Hash table k 0 v 0 k 1 v 1 k 2 v 2 k 3 v 3 k 4 v 4 All Facebook Data k 5 v 5 Replication Lazy, Passive, Active Consistency for a single shard Distributed Transaction Consistent/Atomic change to data in multiple shards Multiple shards è Can use traditional replication techniques Clients send requests To all replicas FE FE Node A Node B Node C Shard 1 Shard 1 Shard 1 k 4 v 4 k 5 v 5 k 4 v 4 k 5 v 5 k 4 v 4 k 5 v 5 Partition data into shards, maps shards to server with consistent hashing Node D Node E Node F Shard 2 Shard 2 Shard 2 k 2 v 2 k 3 v 3 k 2 v 2 k 3 v 3 k 2 v 2 k 3 v 3 Maintain multiple copies For fault tolerance and to reduce latency

What is Transaction? A set of operations that need to be performed together. Example 1: transferring money between accounts Example 2: shopping cart checkout Initially Theo and Rodrigo have $100. Goal: Transfer $50 from Rodrigo to Theo. Read(R) Update(R, $50) Read(T) Update(T, $150) Rodrigo is in shard 1 Theo is in shard 2 Shard 1 Shard 2 Node A Node D Shard 1 Shard 2

What is Transaction? A set of operations that need to be performed together. Example 1: transferring money between accounts Example 2: shopping cart checkout Initially Theo and Rodrigo have $100. Goal: Transfer $50 from Rodrigo to Theo. Ideal: either the whole 4 operations happen or none happen Worst case: only a subset occur Read(R) Update(R, $50) Read(T) Update(T, $150) Rodrigo is in shard 1 Theo is in shard 2 Shard 1 Shard 2 Node A Node D Shard 1 Shard 2

Transaction Background ACID Semantics Atomicity Consistency Isolation Durability All or nothing semantics: all operations succeeds or fails. Transitions from one consistent state to another consistent state Intermediate states are not exposed to the outside world (no partial writes are exposed) Results of a committed transactions persists after the transaction (and through failures) Transactions are easy for traditional databases Traditional databases are on a single server à failure is ``all-or-nothing All components of the transactions fail Distributed transactions à different components can fail Need to provides Transaction semantics when only a subset of the components fail

Distributed Transactions Semantics Transaction Manager Server in charge of orchestrating the transaction Steps for transaction Client initiates a transaction TM gives client a Transaction ID (TID) Client submits operations to TM TM relays operations to replicas Client commits transaction TM performs two phase commit Node A Node B Node C FE Transaction Manager Shard 1 Shard 1 Shard 1 Node D Node E Node F Shard 2 Shard 2 Shard 2

Client-Side Code: Distributed Transactions Semantics tid = opentransaction(); RVal = a.get(tid, Rodrigo); a.update(tid, Rodrigo, RVal- 50); Tval = b.get(tid,theo); Transaction Manager b.update(tid,theo, Tval + 50); closetransaction(tid) Server in charge orof orchestrating the aborttransaction(tid) transaction Steps for transaction Client initiates a transaction TM gives client a Transaction ID (TID) Client submits operations to TM TM relays operations to replicas Client commits transaction TM performs two phase commit FE Transaction Manager Shard 1 Node A Node D Replica Shard Manager 1 Shard 2 Prepare operations Store operations locally in log Node But B do not commit Node operations E Shard 1 Transaction Manager TM hands out TIDs TM manages and relays operations to Replica Leaders TM keeps track of all Replicas involved in the transactions Shard 2 Shard 2 Node C Node F

Two Phase Commit

Two Phase Commit FE Provides Atomicity and Consistency NOT Isolation and Durability Assumptions: each server maintains a transaction log Transaction log is stored in persistent memory If failure à items in Transaction Log survives Terminology changes: <--- The transaction Manager ß Leader of a replica Transaction Manager Node A Node B Node C Shard 1 Shard 1 Shard 1 Node D Node E Node F Shard 2 Shard 2 Shard 2

Two Phase Commit Ready to Commit? Phase 1: sends request for votes s vote Vote [Yes/No] Phase 2: counts votes informs of transaction status At the end of Phase 2: either all participants commit or all abort Count Votes!!!!!! Abort/Commit?

State Diagram for Two-Phase Commit Init app commit/vote req Ready? pariticpants pariticpants pariticpants pariticpants any abort/abort Wait all commit/commit Vote: Commit or Abort Abort Commit Response

State Diagram for Two-Phase Commit Make a change pariticpants pariticpants pariticpants pariticpants vote req/abort Init vote req/commit Commit or Abort abort/ack Uncertain commit/ack Response Abort Commit

State Diagrams for Two Phase Commit Init app commit/vote req vote req/abort Init vote req/commit any abort/abort Wait all commit/commit abort/ack Uncertain commit/ack Abort Commit Abort Commit

Two Phase Commit Ready to Commit? Phase 1: sends request for votes s vote Vote [Yes/No] Phase 2: counts votes informs of transaction status At the end of Phase 2: either all participants commit or all abort Count Votes!!!!!! Abort/Commit?

Two Phase Commit With Failures Ready to Commit? What is the impact of failures on 2PC? Vote [Yes/No] 2PC is synchronous Failure == node failure or network failure Failure --> the protocol blocks/stalls fails, participants will be uncertain (waiting) fails, coordinator will be waiting Count Votes!!!!!! Abort/Commit?

Crash Points Init app commit/vote req vote req/abort Init vote req/commit any abort/abort Wait all commit/commit abort/ack Uncertain commit/ack Abort Commit Abort Commit

Crash Points Init app commit/vote req vote req/abort Init vote req/commit any abort/abort Wait all commit/commit abort/ack Uncertain commit/ack Abort Commit Abort Commit

Two Phase Commit With Failures What is the impact of failures on 2PC? 2PC is synchronous Failure == node failure or network failure Failure --> the protocol blocks/stalls fails, participants will be uncertain (waiting) fails, coordinator will be waiting Ready to Commit? Vote [Yes/No] Count Votes!!!!!! Detect failure using Timeouts Is this model Synchronous or Asynchronous? Abort/Commit?

Two Phase Commit With Failures Ready to Commit? What is the impact of failures on 2PC? 2PC is synchronous Failure == node failure or network failure Failure --> the protocol blocks/stalls Vote [Yes/No] fails, participants will be uncertain (waiting) fails, coordinator will be waiting Detect failure using Timeouts detects participant failure and assumes ABORT à Transaction terminates detects failure Why Can t automatically ABORT? Count Votes!!!!!! Abort/Commit?

Recovery from Failure vote req/abort Init vote req/commit in Uncertain state waiting for coordinator to say commit or abort abort/ack Uncertain commit/ack It detects failure of coordinator Using timeout Abort Commit in Uncertainstate can t assume either outcome - BAD things happen if participant makes wrong assumptions waits for coordinator to restart - On restart contact coordinator for final outcome

Recovery from Failure vote req/abort Init vote req/commit Uncertain abort/ack commit/ack Abort Commit in Uncertain state waiting for coordinator to say commit or abort It detects failure of coordinator Using timeout in Uncertain state can t assume either outcome - BAD things happen if participant makes wrong assumptions waits for coordinator to restart - On restart contact coordinator for final outcome

Recovery from Failure any abort/abort Abort Init Wait app commit/vote req all commit/commit Commit in wait state waiting for participant to say commit or abort It detects failure of participant Using timeout assumes they said no Takes no response as an abort Abort transaction!! If participant Fails, can make progress

Two Phase Commit With Failures Ready to Commit? What is the impact of failures on 2PC? 2PC is synchronous Failure == node failure or network failure Failure --> the protocol blocks/stalls Vote [Yes/No] fails, participants will be uncertain (waiting) fails, coordinator will be waiting Detect failure using Timeouts detects participant failure and assumes ABORT à transaction terminates detects failure The participant must wait for coordinator The transaction is stalled! Count Votes!!!!!! Abort/Commit?

Two Phase Commit and CAP Theorem! Ready to Commit? During a partition Does 2PC pick Availability or Consistency? Network Partition

CAP Theorem Given a Partition, you must pick between Availability and Consistency Pick Consistently: Some clients (not all) can change data consistently Pick Availability: All clients can change data but inconsistently C: Consistency (Linearizable) A: Availability P: Partition tolerance

Two Phase Commit and CAP Theorem! Ready to Commit? During a partition Does 2PC pick Availability or Consistency? Network Partition

Two Phase Commit With Failures What is the impact of failures on 2PC? 2PC is synchronous Failure == node failure or network failure Failure --> the protocol blocks/stalls ABORT Eventually transaction ends assumes ABORT Transaction ends Ready to Commit? Vote [Yes/No] fails, participants will be waiting fails, coordinator will be waiting Detect failure using Timeouts detects participant failure and assumes ABORT detects failure Why Can t automatically ABORT? that voted NO can abort However, Voted yes can not ABORT Count Votes!!!!!! Abort/Commit?

Two Phase Commit With Failures Ready to Commit? ABORT Eventually transaction ends What is the impact of failures on 2PC? 2PC is synchronous assumes ABORT Transaction ends Vote [Yes/No] Failure == node failure or network failure Failure --> the protocol blocks/stalls fails, participants will be waiting fails, coordinator will be waiting Count Votes!!!!!! Detect failure using Timeouts that voted NO can abort detects participant failure and However, Voted yes can not ABORT assumes ABORT detects failure Why Can t automatically ABORT? Abort/Commit?

Two Phase Commit: Adding Isolation with Locks

https://github.com/facebook/rocksdb/wiki/transactions

https://apacheignite.readme.io/docs/concurrency-modes-and-isolation-levels

Pessimistic Versus Optimistic Locking Trade-off: concurrency versus isolation Pessimistic: Get all locks before transaction Release all locks after transaction Release locks after commit/abort Prevents anyone else from using the data during transaction Locks prevent read/write of data Locks stop other transactions Optimistic No locks Get a copy of data before transaction After transaction check to make sure data has not changed If the data changed then ABORT!!!! Data changes means someone else changed the data

Pessimistic Versus Optimistic Locking Trade-off: concurrency versus isolation Pessimistic Optimistic Low level of concurrency Low performance Sequential ordering of transactions High level of concurrency High throughput: especially if all reads Many Transactions will abort if many writes

Two Phase Commit: Practical Issues

Practical Performance Issues with 2PC Ready to Commit? Synchronization: 2PC Overheads Multiple ``rounds of communication Three rounds of communication 3(N) messages During these rounds resources are frozen Vote [Yes/No] Abort/Commit?

Practical Performance Issues with 2PC Blocking: 2PC During failure à 2PC can block When 2PC blocks à then other transactions are unable to progress

Practical Performance Issues with 2PC Synchronization: 2PC Overheads Multiple ``rounds of communication Three rounds of communication 3(N) messages During these rounds resources are frozen Blocking: 2PC During failure à 2PC can block When 2PC blocks à then other transactions are unable to progress https://www.ics.uci.edu/~cs223/papers/cidr07p15.pdf

Distributed Transaction No Longer Considered Dead! 2007: Avoid Distributed transactions 2012: Google s Spanner Dist. Transaction!!! 2019: MS s Orleans Dist. Transaction to the Cloud!!! https://www.ics.uci.edu/~cs223/papers/cidr07p15.pdf https://thenewstack.io/microsoft-orleans-brings-distributed-transactions-to-cloud/

Distributed Transactions and ACID

How do you get ACID in Distributed Transactions? Two Phase Commit à A + C 2PC: atomic and consistent change from one state to another state Two Phase Commit + Locks à A + C + I Locks provide isolation by preventing concurrent transaction from accessing data Two Phase Commit + Locks + Logs à A + C + I + D Logs ensure the transactions persists

Consistency Models Revisited

Consistency Spectrum SLOWER BUT EASY TO PROGRAM Strict Serializability Sequential FAST BUT HARDER TO PROGRAM Eventual Linearizable Causal+ STRONG CONSISTENCY WEAK CONSISTENCY

Strict Serializability Total order + FIFO + Time à for a transaction After a transaction commits, all future reads will see committed data Requires 2PC + pessimistic Locks Low performance: reads/write have high latency and low throughput Strict Serializability V. Linearizability Strict Serializability = Linearizability for Transactions Linearizability = Total order+ FIFO + Real time for individual operations Strict Serializability = Total order+ FIFO + Real time for transactions (groups of operations)

Strict Serializability Total order + FIFO + Time à for a transaction After a transaction commits, all future reads will see committed data Requires 2PC + pessimistic Locks Low performance: reads/write have high latency and low throughput Strict Serializability V. Linearizability Strict Serializability = Linearizability for Transactions Linearizability = Total order+ FIFO + Real time for individual operations Strict Serializability = Total order+ FIFO + Real time for transactions (groups of operations)

Summary Background on Transactions ACID Semantics Distribute Transactions Terminology: Transaction manager,, Two Phase Commit Adding Isolation with Locks: optimistic V. pessimistic Performance Issues Consistency Models Serializability Versus Linearizability