Virtual Synchrony. Jared Cantwell

Size: px

Start display at page:

Download "Virtual Synchrony. Jared Cantwell"

Myles Anderson
5 years ago
Views:

1 Virtual Synchrony Jared Cantwell

2 Review Mul7cast Causal and total ordering Consistent Cuts Synchronized clocks Impossibility of consensus Distributed file systems

3 Goal Distributed programming is hard What tools can make it easier? What assump3ons can make it easier? Distributed programming is hard! Let s go shopping!!! According to hhp://en.wikipedia.org/wiki/barbie, Barbie once said Math is hard! (misquoted).

4 The Process Group Approach to Reliable Distributed Compu7ng Ken Birman Professor, Cornell University ISIS toolkit mechanism for distributed programming Financial trading floors Telecommunica7ons switching

5 Virtual Synchrony Simplify distributed systems programming by assuming a synchronous environment Features: Process Groups Reliable Mul7cast Fault Tolerance Performance

6 Outline Problem / Mo7va7on Solu7on (Virtual Synchrony) Assump7ons Close Synchrony Virtual Synchrony

7 Outline Problem / Mo7va7on Solu7on (Virtual Synchrony) Assump7ons Close Synchrony Virtual Synchrony

8 Mo7va7on Distributed Programming is hard

9 Difficul7es No reliable mul7cast Membership churn Message ordering State transfers Failure atomicity

10 No Reliable Mul7cast Ideal Reality p q r UDP, TCP, Mul7cast not good enough What is the correct way to recover?

11 Membership Churn Receives new membership p q r Never sent Membership changes are not instant How to handle failure cases?

12 Message Ordering p 1 2 q r Everybody wants it! How can you know if you have it? How can you get it?

13 State Transfers p q r New nodes must get current state Does not happen instantly How do you handle nodes failing/joining?

14 Failure Atomicity Ideal Reality p x q r? Nodes can fail mid transmit Some nodes receive message, others do not Inconsistencies arise!

15 Mo7va7on Review Distributed programming is hard! No reliable mul7cast Membership churn Message ordering State transfers Failure atomicity

16 Outline Problem / Mo7va7on Solu7on (Virtual Synchrony) Assump7ons Close Synchrony Virtual Synchrony

17 Assump7ons WAN of LANs Unreliable network Flow control at lowest layer Clocks not synchronized No par77ons CAP Theorem?

18 Failure Model Nodes crash Network is lossy Can t dis7nguish difference

19 Outline Problem / Mo7va7on Solu7on (Virtual Synchrony) Assump7ons Close Synchrony Virtual Synchrony

20 Outline Problem / Mo7va7on Solu7on (Virtual Synchrony) Assump7ons Close Synchrony Model Significance Issues Virtual Synchrony

21 Model Events (all or nothing) Internal computa7on Message transmission & delivery Membership change

22 Model p q r t s Synchronous execu7on u Ken s Slides 2006

23 Significance Mul7cast is always reliable Membership is always consistent Totally ordered message delivery State transfer happens instantaneously Failure Atomicity Mul7cast is a single event

24 Issues Discrete event simulator Is it prac7cal? Impossible with failures Very expensive System progresses in lock step Limited by speed of other members

25 Outline Problem / Mo7va7on Solu7on (Virtual Synchrony) Assump7ons Close Synchrony Virtual Synchrony

26 Virtual Synchrony Outline Asynchronous Execu7on Virtual Synchrony ISIS Parallels Benefits Discussion

27 Asynchronous Execu7on Key to high throughput in distributed systems Only wait for responses (or too fast sends) Communica7on channel Acts as a pipeline Not limited by latency Not possible with Close Synchrony!!

28 Asynchronous Execu7on p q r t s u Ken s Slides 2006

29 Virtual Synchrony Close Synchrony + Asynchronous Indis7nguishable to applica7on So.when can synchronous execu7on be relaxed?

30 ISIS Communica7on Framework Membership Service VS primi7ves ABCAST CBCAST

31 ISIS Problem Crash and Lossy Network Indis3nguishable Solu7on: Membership list Nonresponsive or failed members are dropped Only listed members can par7cipate Re join protocol Does Membership exist in all distributed systems?

32 ISIS Atomic Broadcast (ABCAST) No message can be delivered to any user un7l all previous ABCAST messages have been delivered Costly to implement But not everyone needs such strong guarantees

33 ISIS Causal Atomic Broadcast (CBCAST) Sufficient for most programmers Concurrent messages commute Weaker than ABCAST

34 When to use CBCAST? Each thread corresponds to a different lock p r s 3 t When any conflic7ng mul7casts are uniquely ordered along a single causal chain..this is Virtual Synchrony Ken s Slides 2006

35 Parallels Logical 7me Replica7on in database systems Schneider s state machine approach Parallel processor architectures Distributed database systems

36 Benefits Assume a closely synchronous model Group state and state transfer Pipelined communica7on (async) Single event model Failure handling

37 Discussion Par11ons False posi7ves Most have them, VS admits it False nega7ves Depend on a 7meout

38 Summary Programming in distributed systems is hard Close Synchrony makes it easier Costs too much Take asynchronous when you can Virtual Synchrony Pipelined Easy to reason over

39 Authors Understanding the Limita7ons of Causally and Totally Ordered Communica7on David Cheriton Stanford PhD Waterloo Billionaire Dale Skeen PhD UC Berkeley 3 phase commit protocol

40 The flaws of CATOCS Unrecognized causality No seman3c ordering No Efficiency Gain (over State level Techniques) No Scalability

41 Unrecognized Causality p q r s External communica7on is unknown

42 Unrecognized Causality Database is external en7ty Causal rela7on exists, but CATOCS misses it

43 No Seman7c Ordering Serializa7on Messages can t be group together Implemen7ng eliminates CATOCS need Causal Memory Solu7on: state level logical clock

44 No Efficiency Gain S7ll need state level techniques False causality Reduces Performance Increased Memory Message overhead

45 No Efficiency Gain What if m2 happened to follow m1, but was not causally related? CATOCS would make False Causality

46 No Scalability quadra7c growth of expected message buffering RebuHal: Worst case Imprac7cal use case

47 Summary CATOCS sopware is overkill Communica7on system doesn t know everything Everything is beher at the applica7on level

48 Conclusions Distributed Programming is hard Close Synchrony Too costly Virtual Synchrony Limita7ons VS not perfect for all situa7ons

Virtual Synchrony. Ki Suh Lee Some slides are borrowed from Ken, Jared (cs ) and JusBn (cs )

Virtual Synchrony. Ki Suh Lee Some slides are borrowed from Ken, Jared (cs ) and JusBn (cs ) Virtual Synchrony Ki Suh Lee Some slides are borrowed from Ken, Jared (cs6410 2009) and JusBn (cs614 2005) The Process Group Approach to Reliable Distributed CompuBng Ken Birman Professor, Cornell University