Consistency. CS 475, Spring 2018 Concurrent & Distributed Systems


Review: 2PC, timeouts when the coordinator crashes. What if a bank doesn't hear back from the coordinator? If the bank voted no, it's OK to abort. If the bank voted yes, it can't decide to abort (maybe both banks voted yes and the coordinator heard this and committed), and it can't decide to commit (maybe the other bank voted no and the coordinator aborted). Hence, 2PC does NOT guarantee liveness, but it does guarantee safety.
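As a concrete illustration of that case analysis, here is a minimal sketch of a participant's timeout handling; the class, enum, and method names are hypothetical and not taken from the lecture or homework code.

enum Vote { YES, NO }

class TwoPCParticipant {
    private final Vote myVote;

    TwoPCParticipant(Vote myVote) {
        this.myVote = myVote;
    }

    // Called when we time out waiting for the coordinator's commit/abort decision.
    void onCoordinatorTimeout() {
        if (myVote == Vote.NO) {
            // We voted no, so the coordinator cannot have decided commit: safe to abort.
            abort();
        } else {
            // We voted yes: the coordinator may have decided either commit or abort,
            // so we can safely do neither. We must block until we learn the outcome,
            // which is why 2PC is safe but not live.
            blockUntilDecisionKnown();
        }
    }

    void abort() { /* undo any local effects of the transaction */ }
    void blockUntilDecisionKnown() { /* wait for the coordinator to recover, or ask peers */ }
}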

Review: 3PC has an answer for every crash/timeout. [Message diagram between the coordinator and participants A, B, C, D: the coordinator sends prepare while soliciting votes (a timeout here causes abort), collects the responses, sends pre-commit once commit is authorized (all voted yes; a timeout here also causes abort), collects the OKs, and finally sends commit. On the participant side: status Uncertain, timeout causes abort; status Prepared to commit, timeout causes commit; then status Committed.] This only works if a timeout == crash!

Review: Partitions. [Diagram: all four participants (A, B, C, D) vote yes, then a network partition splits the group. Participant A, already prepared to commit, follows its timeout rule and commits, while B, C, and D are still uncertain, follow their timeout rule, and abort. End state: A Committed; B, C, D Aborted. The timeout rules break when a timeout is caused by a partition rather than a crash.]

Review: FLP. Why can't we make a consensus/agreement protocol that tolerates node failures or network delays that may be transient? To tolerate a failure, you need to assume that eventually the failure will be resolved and the network will deliver the delayed packets. But the messages might be delayed forever. Hence, your protocol might never come to a result (it would not have the liveness property).

Review: Partition-Tolerant Consensus. Key idea: if you always have an odd number of nodes, then whenever the network splits in two there is a minority partition and a majority partition. Give up processing in the minority until the partition heals and the network resumes; the majority can continue processing. It is really, really, really complicated to design (Paxos) and implement (ZooKeeper, Raft) an algorithm that does this.

Announcements: HW4 is out! http://www.jonbell.net/gmu-cs-475-spring-2018/homework-4/ Today: Consistency & Memory Models: Strict Consistency, Sequential Consistency, Distributed Shared Memory. Additional readings: Tanenbaum 7-7.2.

Review: Properties of Transactions. Traditional properties: ACID. Atomicity: transactions are all or nothing. Consistency: guarantee some basic properties of the data; each transaction leaves the database in a valid state. Isolation: each transaction runs as if it is the only one; there is some valid serial ordering that represents what happens when transactions run concurrently. Durability: once committed, updates cannot be lost despite failures.

Consistency. The problem of consistency arises whenever some data is replicated, i.e., the data exists in (at least) two places at the same time. What is a "valid" state?

Consistency. [Diagram: a client sends Set A=5 to replica A, which forwards Set A=5 to replica B and gets OK back before acknowledging the client; a subsequent Read A returns 5.]

Consistency. Why do we think the prior slide was consistent? Whenever we read, we see the most recent writes. Does that come for free? No; remember NFS. What do you need to do to ensure that each read reflects the most recent write? Even programs running on a single computer have to obey some consistency model.

Quiz: What's the output?

class MyObj {
    int x = 0;
    int y = 0;

    void thread0() {
        x = 1;
        if (y == 0) System.out.println("OK");
    }

    void thread1() {
        y = 1;
        if (x == 0) System.out.println("OK");
    }
}

Possible outputs: "OK" once (from either thread), nothing at all, or, surprisingly, "OK" twice. WTF?

Quiz: What's the output? (Same code as above, with plain int fields.) [Diagram: x and y live on the heap, but the stacks of thread0 and thread1 can each hold their own cached copies of x and y.] Java Memory Model: threads are allowed to cache reads and writes.

Java Memory Model. [Diagram: thread0 runs on CPU 1 and thread1 on CPU 2. Each CPU has its own cache (roughly 7 ns to access) in front of shared main memory (roughly 100 ns to access).]

Quiz: What's the output?

class MyObj {
    volatile int x = 0;
    volatile int y = 0;

    void thread0() {
        x = 1;
        if (y == 0) System.out.println("OK");
    }

    void thread1() {
        y = 1;
        if (x == 0) System.out.println("OK");
    }
}

Quiz: What's the output? (Same code as above, with volatile fields.) [Diagram: x and y live only on the heap; neither thread's stack holds a cached copy.] Volatile keyword: no per-thread caching of variables.

Volatile Keyword. [Diagram: as before, thread0 on CPU 1 and thread1 on CPU 2, but the per-CPU caches are crossed out; every read and write of a volatile variable goes to main memory.]
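To make the quiz concrete, here is a self-contained, runnable version of the two-thread program; it is only a sketch, and the class name, static fields, and thread setup are mine rather than from the slides. With plain int fields the Java Memory Model permits both threads to print "OK" in the same run, because each thread may act on a stale cached value; declaring the fields volatile rules that outcome out, so at most one "OK" can appear.

public class MemoryModelQuiz {
    // Try changing these to "static volatile int" to see the difference.
    static int x = 0;
    static int y = 0;

    public static void main(String[] args) throws InterruptedException {
        Thread t0 = new Thread(() -> {
            x = 1;
            if (y == 0) System.out.println("OK from thread0");
        });
        Thread t1 = new Thread(() -> {
            y = 1;
            if (x == 0) System.out.println("OK from thread1");
        });
        t0.start();
        t1.start();
        t0.join();
        t1.join();
    }
}

In practice the double-"OK" outcome is rare on any single run; you typically have to run the program in a loop many times before the caching and reordering show up.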

Consistency. This is a consistency model! Constraints on the system state that are observable by applications. For example: when I write y=1, any future reads must say y=1 (except, in Java, if it's a non-volatile variable). Clearly, this often comes at a cost (see the simple example with volatile).

Strict Consistency. Each operation is stamped with a global (wall-clock, a.k.a. absolute) time. Rules: each read sees the latest write, and all operations on one CPU have timestamps in execution order.

Strict Consistency. [Three example executions of the quiz program, with operations placed on a global timeline: (1) CPU0 does W(X)1 then R(Y)1 while CPU1 does W(Y)1 then R(X)1; both reads see the latest writes, so this is strictly consistent. (2) CPU0's R(Y) returns 0 because it happens before CPU1's W(Y)1, while CPU1's R(X) returns 1; still strictly consistent. (3) Marked invalid: both R(Y) and R(X) return 0 even though both writes have already happened, which strict consistency forbids.]

Sequential Consistency. Strict consistency is often not practical: it requires globally synchronized clocks. Sequential consistency gets close, in an easier way. There is some total order of operations so that: each CPU's operations appear in order, and all CPUs see results according to that order (reads see the most recent write in that order).

Sequential Consistency. There is some total order of operations so that each CPU's operations appear in order and all CPUs see results according to that order (reads see the most recent write). Consider this case, noting that there are no locks to enforce an ordering:

P1: W(X)a
P2: W(X)b
P3: R(X)b, R(X)a
P4: R(X)b, R(X)a

Sequentially consistent (one valid total order is W(X)b, R(X)b, R(X)b, W(X)a, R(X)a, R(X)a), but NOT strictly consistent.

Sequential Consistency. Same rules as before, and again no locks enforce an ordering:

P1: W(X)a
P2: W(X)b
P3: R(X)b, R(X)a
P4: R(X)a, R(X)b

Not sequentially consistent: P3 observes X change from b to a while P4 observes it change from a to b, so no single total order of the two writes can explain both read sequences.

Sequential Consistency: Activity. http://b.socrative.com, room CS475

Distributed Shared Memory. [Diagram: two machines, each running one thread with its own stack and heap; a DSM layer spanning both machines holds the shared variables x and y.]

Naïve DSM. Assume each machine has a complete copy of memory. Reads come from local memory; writes broadcast the update to the other machines and then immediately continue.

class Machine1 {
    DSMInt x = 0;
    DSMInt y = 0;

    static void main(String[] args) {
        x = 1;
        if (y == 0) System.out.println("OK");
    }
}

class Machine2 {
    DSMInt x = 0;
    DSMInt y = 0;

    static void main(String[] args) {
        y = 1;
        if (x == 0) System.out.println("OK");
    }
}

Naïve DSM. (Same setup: reads are local; writes broadcast the update and then immediately continue.) [Diagram: Machine1 executes x = 1, so its local copy of x changes from 0 to 1, and an update message carrying x=1 is broadcast toward Machine2 while Machine1 keeps running.]

Naïve DSM. [Diagram: meanwhile Machine2 executes y = 1 and broadcasts its own update. Both update messages are still in flight, so Machine1 reads y == 0 and Machine2 reads x == 0, and both machines print "OK".] Is this correct?

Naïve DSM. It gets even more fun when we add a third host: many more interleavings are possible. Definitely not sequentially consistent. Who is at fault? The DSM system? The app? The developers of the app, if they thought it would be sequentially consistent.
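For concreteness, this is roughly what the naïve DSM write path described above might look like; the NaiveDSMInt and Replica names are hypothetical, invented just to illustrate "apply locally, broadcast, and continue without waiting."

import java.util.List;

// Hypothetical naive DSM integer: reads are served locally, writes are broadcast
// asynchronously, and the writer does not wait for any acknowledgment.
class NaiveDSMInt {
    private volatile int localCopy;          // this machine's replica of the value
    private final List<Replica> others;      // stubs representing the other machines

    NaiveDSMInt(int initial, List<Replica> others) {
        this.localCopy = initial;
        this.others = others;
    }

    int read() {
        return localCopy;                    // always answered from local memory
    }

    void write(int newValue) {
        localCopy = newValue;                // apply locally first
        for (Replica r : others) {
            r.sendUpdateAsync(newValue);     // fire-and-forget; do NOT wait
        }
        // We return immediately, so a read on another machine may still see the old
        // value, which is exactly why this scheme is not sequentially consistent.
    }
}

interface Replica {
    void sendUpdateAsync(int newValue);      // hypothetical asynchronous network stub
}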

Sequentially Consistent DSM. How do we get this system to behave similarly to Java's volatile keyword? We want to ensure that each machine's own operations appear in order, and that all machines see results according to some total order (each read sees the most recent write). Equivalently: every observed runtime ordering of operations can be explained by a sequential ordering of operations that follows the above rules.

Sequentially Consistent DSM. Each node must see the most recent writes before it reads that same data. Performance is not great: it might make writes expensive (you need to wait for the broadcast and ensure the other nodes heard your new value), and it might make reads expensive (you need to wait to make sure there are no pending writes you haven't heard about yet).

Sequentially Consistent DSM. Each processor issues requests in the order specified by the program and can't issue the next request until the previous one has finished. Requests to an individual memory location are served from a single FIFO queue, so writes occur in a single order, and once a read observes the effect of a write, it is ordered behind that write.

Sequentially Consistent DSM. [Diagram: as in the volatile picture, the per-CPU caches are bypassed, but now "main memory" sits behind a FIFO queue reached over the network, so the roughly 100 ns memory access can stretch to something on the order of a second.]
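One way to realize those rules is to funnel every write through a single ordering point and block the writer until the update has been applied everywhere. The sketch below assumes a hypothetical Sequencer interface and SequencedDSMInt class; they illustrate the FIFO-queue idea, not Ivy's or HW4's actual design.

// Sketch: every write goes through one sequencer that assigns a global order and
// returns only after every replica has applied the update, so any later read on
// any machine sees it. All names here are hypothetical.
class SequencedDSMInt {
    private volatile int localCopy;
    private final Sequencer sequencer;

    SequencedDSMInt(int initial, Sequencer sequencer) {
        this.localCopy = initial;
        this.sequencer = sequencer;
    }

    int read() {
        return localCopy;    // safe only because writes block until applied everywhere
    }

    void write(int newValue) throws InterruptedException {
        // The sequencer puts this write behind all earlier writes to the same
        // location (a single FIFO) and blocks until every replica has applied it.
        sequencer.applyEverywhereInOrder(this, newValue);
    }

    // Invoked by the sequencer on each machine's replica, in the same global order.
    void applyRemote(int newValue) {
        localCopy = newValue;
    }
}

interface Sequencer {
    void applyEverywhereInOrder(SequencedDSMInt variable, int newValue) throws InterruptedException;
}

The cost is exactly the one the slide warns about: each write now pays a round trip through the queue and to every replica before the program can continue.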

Ivy DSM: Integrated shared Virtual memory at Yale. Provides shared memory across a group of workstations. It might be easier to program with shared memory than with message passing: it makes things look a lot more like one huge computer with hundreds of CPUs instead of hundreds of computers with one CPU each.

Ivy Architecture. Each node keeps a cached copy of each piece of data it reads; if some data doesn't exist locally, the node requests it from a remote node. [Diagram: three nodes, each holding its own pool of cached data.]

Ivy provides sequential consistency. It supports multiple-readers, single-writer semantics via a write-invalidate update protocol: if I write some data, I must tell everyone who has cached it to get rid of their cached copy.

Ivy Architecture. [Diagram: one node writes X=1; its cached copy changes from x=0 to x=1 and it sends "invalidate x" to the other nodes, which discard their cached x=0. When another node later does Read X, it must fetch x from the node that holds the current copy.]

Ivy Implementation. Ownership of data moves to whoever last wrote it. There are still some tricky bits: How do we know who owns some data? How do we ensure there is only one owner per piece of data? How do we ensure all cached copies are invalidated on writes? Solution: a central manager node.

Ivy invariants: Every piece of data has exactly one current owner. The current owner is guaranteed to have a copy of that data. If the owner has write permission, no other copies can exist. If the owner has read permission, its copy is guaranteed to be identical to the other copies. The manager node knows about all of the copies. Sounds a lot like HW4? :)
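As a rough illustration of how a central manager could maintain those invariants, here is a condensed sketch of the write-invalidate bookkeeping; the CentralManager class, its method names, and the per-item string keys are all invented for this example, and real Ivy works at page granularity with more states than shown.

import java.util.*;

// Sketch of Ivy-style write-invalidate with a central manager. The manager tracks,
// for each item, the current owner and every node holding a cached copy.
class CentralManager {
    private final Map<String, String> owner = new HashMap<>();        // item -> owning node
    private final Map<String, Set<String>> copies = new HashMap<>();  // item -> nodes with a cached copy

    // A node asks for write access to an item.
    synchronized void requestWrite(String item, String writer) {
        // Multiple readers, single writer: invalidate every cached copy except the writer's.
        for (String node : copies.getOrDefault(item, Collections.emptySet())) {
            if (!node.equals(writer)) sendInvalidate(node, item);     // hypothetical RPC
        }
        Set<String> onlyWriter = new HashSet<>();
        onlyWriter.add(writer);
        copies.put(item, onlyWriter);
        owner.put(item, writer);                                      // ownership moves to the last writer
    }

    // A node asks for read access: it is given the owner's copy and recorded as a holder.
    synchronized void requestRead(String item, String reader) {
        forwardReadToOwner(owner.get(item), item, reader);            // hypothetical RPC
        copies.computeIfAbsent(item, k -> new HashSet<>()).add(reader);
    }

    private void sendInvalidate(String node, String item) { /* network call, omitted */ }
    private void forwardReadToOwner(String ownerNode, String item, String reader) { /* network call, omitted */ }
}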

HW4 Architecture. Each node keeps a cached copy of each piece of data it reads, and each node always has a copy of the most recent data. [Diagram: one node writes X=1 and pushes "update x=1" with the new value to every other node, so a later Read X on any node returns the up-to-date value.]

Ivy Architecture (shown again for comparison). [Diagram: one node writes X=1; its cached copy changes from x=0 to x=1 and it sends "invalidate x" to the other nodes, which discard their cached x=0. A later Read X on another node must fetch x from the node that holds the current copy.]

Ivy vs HW4. Ivy never copies the actual values until a replica reads them (unlike HW4); invalidate messages are probably smaller than the actual data! Ivy only sends update (invalidate) messages to replicas that have a copy of the data (unlike HW4); maybe most data is not actively shared. Ivy requires the lock server to keep track of a few more bits of information (which replica has which data). With near certainty, Ivy is a lot faster :)

Sequential Consistency. [Diagram, revisiting the earlier example: a client sends Set A=5 to replica A, which forwards Set A=5 to replica B and waits for its OK before acknowledging the client; a subsequent Read A returns 5 from either replica.]

Availability. Our protocol for sequential consistency does NOT guarantee that the system will be available! [Diagram: a client sends Set A=5 to replica A, but replica B has failed, so the forwarded Set A=5 never gets an OK; the write cannot complete and the client's Read A is left waiting.]

Consistent + Available. [Diagram: the client sends Set A=5 to replica A; replica A assumes replica B has failed, applies the write without it, replies OK, and answers a later Read A with 5.]

Still broken... [Diagram: the client writes A=5 at replica A, which assumes replica B has failed and acknowledges OK without it; but a later Read A served by replica B returns the stale value 6.]

Network Partitions. The communication links between nodes may fail arbitrarily, but other nodes might still be able to reach the "failed" node. [Diagram: replica A cannot reach replica B, assumes it has failed, and acknowledges Set A=5 on its own; a client that can still reach B issues Read A and gets the stale value 6.]

CAP Theorem. Pick two of three. Consistency: all nodes see the same data at the same time (strong consistency). Availability: individual node failures do not prevent the survivors from continuing to operate. Partition tolerance: the system continues to operate despite message loss (from network and/or node failure). You cannot have all three, ever*. (*If you relax your consistency guarantee, which we'll talk about in a few weeks, you might be able to get all three.)

CAP Theorem. C+A: provide strong consistency and availability, assuming there are no network partitions. C+P: provide strong consistency in the presence of network partitions; the minority partition is unavailable. A+P: provide availability even in the presence of partitions; no strong consistency guarantee.

Still broken... [Diagram: a client's Set A=5 is acknowledged with OK, but a Read A served by the other replica still returns the stale value 6.] The robot devil will return in lecture 23.