EECS 591 DISTRIBUTED SYSTEMS


1 EECS 591 DISTRIBUTED SYSTEMS Manos Kapritsos Fall 2018 Slides by: Lorenzo Alvisi

2 3-PHASE COMMIT

Coordinator c
  1. Sends VOTE-REQ to all participants.
  3. If all votes are Yes, then send Precommit to all;
     else decide_c := Abort, send Abort to all who voted Yes, and halt.
  5. Collect Ack from all participants. When all Acks have been received:
     decide_c := Commit and send Commit to all.

Participant p_i
  2. Sends vote_i to the coordinator. If vote_i = No, then decide_i := Abort and halt.
  4. If p_i received Precommit, then send Ack.
  6. When p_i receives Commit, it sets decide_i := Commit and halts.
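The six steps above can be traced with a small failure-free simulation. This is only a sketch under stated assumptions: the function name `run_3pc`, the `Decision` enum, and the tuple-based message log are hypothetical and not part of the slides, and timeouts and crashes are not modeled.

```python
from enum import Enum


class Decision(Enum):
    COMMIT = "Commit"
    ABORT = "Abort"


def run_3pc(votes):
    """Failure-free 3PC run (illustrative only).

    votes maps a participant name to True (Yes) or False (No).
    Returns (coordinator_decision, participant_decisions, message_log).
    """
    log = []
    decisions = {}
    # Step 1: the coordinator sends VOTE-REQ to all participants.
    for p in votes:
        log.append(("coordinator", p, "VOTE-REQ"))
    # Step 2: each participant sends its vote; a No voter decides Abort and halts.
    for p, yes in votes.items():
        log.append((p, "coordinator", "Yes" if yes else "No"))
        if not yes:
            decisions[p] = Decision.ABORT
    # Step 3: abort if any vote is No; Abort goes only to the Yes voters.
    if not all(votes.values()):
        for p, yes in votes.items():
            if yes:
                log.append(("coordinator", p, "Abort"))
                decisions[p] = Decision.ABORT
        return Decision.ABORT, decisions, log
    # Step 3 (all Yes): the coordinator sends Precommit to all.
    for p in votes:
        log.append(("coordinator", p, "Precommit"))
    # Step 4: every participant acknowledges the Precommit.
    for p in votes:
        log.append((p, "coordinator", "Ack"))
    # Step 5: all Acks are in, so the coordinator decides Commit and sends it.
    for p in votes:
        log.append(("coordinator", p, "Commit"))
        # Step 6: the participant decides Commit on receipt.
        decisions[p] = Decision.COMMIT
    return Decision.COMMIT, decisions, log
```

For example, `run_3pc({"p1": True, "p2": False})` ends with every process deciding Abort, and no Abort message is sent to `p2`, which already aborted on its own No vote.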

3 TIMEOUT ACTIONS

Coordinator
  Step 3: the coordinator is waiting for votes from participants. Same as in 2PC.
  Step 5: the coordinator is waiting for Acks. The coordinator sends Commit.

Participant p_i
  Step 2: p_i is waiting for VOTE-REQ from the coordinator. Same as in 2PC.
  Step 4: p_i is waiting for Precommit. Run the termination protocol.
  Step 6: p_i is waiting for Commit. Run the termination protocol.

4 TERMINATION PROTOCOL: PROCESS STATES

At any time while running 3PC, each participant can be in exactly one of these four states:
  Aborted: not voted, voted No, or received Abort
  Uncertain: voted Yes but has not received Precommit
  Committable: received Precommit but not Commit
  Committed: received Commit
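The four states can be read as a function of what a participant has voted and received so far. The sketch below assumes a hypothetical helper `participant_state` and a hypothetical encoding of the vote as `None`, `"No"`, or `"Yes"`; neither comes from the slides.

```python
from enum import Enum


class State(Enum):
    ABORTED = "Aborted"
    UNCERTAIN = "Uncertain"
    COMMITTABLE = "Committable"
    COMMITTED = "Committed"


def participant_state(vote, got_precommit=False, got_commit=False, got_abort=False):
    """Classify a participant; vote is None (not voted), "No", or "Yes"."""
    if got_commit:
        return State.COMMITTED
    # Not voted, voted No, or received Abort all count as Aborted.
    if got_abort or vote != "Yes":
        return State.ABORTED
    if got_precommit:
        return State.COMMITTABLE
    # Voted Yes and has heard nothing further.
    return State.UNCERTAIN
```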

5 NOT ALL STATES ARE COMPATIBLE

[Table: pairwise compatibility of the four states. Rows and columns: Aborted, Uncertain, Committable, Committed.]

6 TERMINATION PROTOCOL

When a participant times out, it starts an election protocol to elect a new coordinator.
The new coordinator sends STATE-REQ to all processes that participated in the election.
The new coordinator collects the states and follows a set of termination rules.

7 TERMINATION PROTOCOL

The new coordinator collects the states and follows these termination rules:
  TR1: if some process decided Abort, then decide Abort, send Abort to all, and halt.
  TR2: if some process decided Commit, then decide Commit, send Commit to all, and halt.
  TR3: if all processes that reported their state are uncertain, then decide Abort, send Abort to all, and halt.
  TR4: if some process is committable but none is committed, then send Precommit to the uncertain processes, wait for Acks, send Commit to all, and halt.
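Rules TR1 to TR4 amount to a case analysis over the reported states. A minimal sketch follows, assuming the new coordinator has already collected a non-empty list of states (its own included). The names `State` and `termination_rule` and the action strings are illustrative, not taken from the slides.

```python
from enum import Enum


class State(Enum):
    ABORTED = "Aborted"
    UNCERTAIN = "Uncertain"
    COMMITTABLE = "Committable"
    COMMITTED = "Committed"


def termination_rule(states):
    """Return (rule_name, actions) for the states reported to the new coordinator."""
    if not states:
        raise ValueError("at least the new coordinator's own state is required")
    if State.ABORTED in states:
        return "TR1", ["decide Abort", "send Abort to all", "halt"]
    if State.COMMITTED in states:
        return "TR2", ["decide Commit", "send Commit to all", "halt"]
    if all(s is State.UNCERTAIN for s in states):
        return "TR3", ["decide Abort", "send Abort to all", "halt"]
    # Remaining case: some process is committable and none is committed.
    return "TR4", [
        "send Precommit to uncertain processes",
        "wait for Acks",
        "send Commit to all",
        "halt",
    ]
```

For instance, `termination_rule([State.UNCERTAIN, State.COMMITTABLE])` selects TR4, while a report consisting only of uncertain processes selects TR3.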

8 TERMINATION PROTOCOL AND FAILURES

Processes can fail while executing the termination protocol:
  if the coordinator c times out on a process p, it can just ignore p
  if c fails, a new coordinator is elected and the protocol is restarted (election protocol to follow)
  total failures will need special care

9 RECOVERING

If p fails before sending Yes, it decides Abort.
If p fails after having decided, it follows its decision.
If p fails after voting Yes but before receiving the decision value:
  p asks other processes for help
  3PC is non-blocking: p will receive a response with the decision
If p has received Precommit:
  p still needs to ask other processes (it cannot just Commit)
  No need to log Precommit! (or is there?)
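The recovery cases can be sketched as a decision over what a process finds in its own log after restarting. The log format here is an assumption made only for illustration: a list of the strings `"Yes"`, `"Commit"`, and `"Abort"`, written in the order the events occurred. The function name `recovery_action` is likewise hypothetical.

```python
def recovery_action(log):
    """Decide what a recovering process should do, given its pre-crash log records."""
    # Failed after having decided: follow the logged decision.
    if "Commit" in log:
        return "decide Commit"
    if "Abort" in log:
        return "decide Abort"
    # Failed before sending Yes: it is safe to decide Abort unilaterally.
    if "Yes" not in log:
        return "decide Abort"
    # Voted Yes but no decision logged: must ask the other processes.
    return "ask other processes"
```

Note that the last branch is taken whether or not a Precommit was received, which matches the slide's remark that a process that received Precommit still cannot just Commit.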

10 THE ELECTION PROTOCOL

Processes agree on a linear ordering (e.g. by pid).
Each process p maintains a set UP_p of all processes that it believes to be operational.
When p detects the failure of the coordinator c, it removes c from UP_p and chooses the smallest process q in UP_p to be the new coordinator.
If q = p, then p is the new coordinator.
Otherwise, p sends UR-ELECTED to q.
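The election step above fits in a few lines. This sketch assumes processes are identified by integer pids ordered with `<`, and uses hypothetical names (`on_coordinator_failure`, `up`) for the handler and for the set of processes believed to be operational.

```python
def on_coordinator_failure(me, up, failed):
    """Handle detection of the coordinator's failure at process `me`.

    up is the set of pids that `me` believes operational (it includes `me`).
    Returns (new_up, new_coordinator, message), where message is None when
    `me` itself becomes coordinator.
    """
    new_up = set(up) - {failed}
    # The smallest remaining pid in the agreed ordering becomes coordinator.
    new_coordinator = min(new_up)
    if new_coordinator == me:
        return new_up, new_coordinator, None
    return new_up, new_coordinator, ("UR-ELECTED", new_coordinator)
```

With `up = {1, 2, 3}` and coordinator 1 failing, process 2 elects itself, while process 3 returns the message `("UR-ELECTED", 2)`.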

11 WHAT IF?

What if p', which has not detected the failure of c, receives a STATE-REQ from q?
  it concludes that c must be faulty
  it removes from UP_p' every process q' < q
What if p' receives a STATE-REQ from c after it has changed the coordinator to q?
  p' ignores the request
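Both cases can be handled in one STATE-REQ handler. The sketch assumes integer pids where a smaller pid ranks earlier, so a sender ranked after the current coordinator implies that every earlier process has failed. That reading of the slide, and the names `on_state_req` and `up`, are assumptions.

```python
def on_state_req(up, coordinator, sender):
    """Process a STATE-REQ at some process.

    up: pids this process believes operational; coordinator: its current
    coordinator. Returns (new_up, new_coordinator, accepted).
    """
    if sender == coordinator:
        return set(up), coordinator, True
    if sender > coordinator:
        # The old coordinator and every process ranked before the sender
        # must be faulty, otherwise the sender would not have been elected.
        return {q for q in up if q >= sender}, sender, True
    # The request comes from a coordinator this process already replaced.
    return set(up), coordinator, False
```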

12 TOTAL FAILURE

Suppose that p is the first process to recover and that p is uncertain.
Can p decide Abort?
Some process could have decided Commit after p crashed!
p is blocked until some process q recovers such that either:
  q can recover independently, or
  q is the last process to fail: then q can simply invoke the termination protocol

13 DETERMINING THE LAST PROCESS TO FAIL

Suppose a set R of processes has recovered.
Does R contain the last process to fail?
  the last process to fail is in the UP set of every process
  so the last process to fail must be in the intersection of UP_p over all p in R
R contains the last process to fail if that intersection is contained in R:
  ∩_{p ∈ R} UP_p ⊆ R
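The test for whether the recovered set contains the last process to fail can be written as a set computation. The condition used here is the standard one (the intersection of the recovered processes' UP sets is a subset of the recovered set) and is stated as an assumption, since the formula itself did not survive in this transcript. `contains_last_to_fail` and `up_sets` are hypothetical names.

```python
def contains_last_to_fail(recovered, up_sets):
    """Check whether the recovered set must contain the last process to fail.

    recovered: non-empty set of process ids that have recovered.
    up_sets: maps each recovered process to the UP set it had logged.
    """
    # The last process to fail appears in every process's UP set, so it
    # must lie in the intersection of the UP sets of the recovered ones.
    candidates = set.intersection(*(set(up_sets[p]) for p in recovered))
    return candidates <= set(recovered)
```

If only process `a` has recovered and its UP set is `{a, b, c}`, the check fails: `b` or `c` might have failed last, so `a` has to keep waiting.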

14 ADMINISTRIVIA

Homework #1 due Wednesday before class
Research project
  Declare your team by Oct 1st (by email to me)
  Declare your topic by Oct 8 (by email to me)
  Not sure what to do? Come talk to me.

15 CONSENSUS AND RELIABLE BROADCAST

16 BROADCAST

If a process sends a message m, then every process eventually delivers m.
How can we adapt the spec for an environment where processes may fail?

17 RELIABLE BROADCAST

Validity: If the sender is correct and broadcasts a message m, then all correct processes eventually deliver m.
Agreement: If a correct process delivers a message m, then all correct processes eventually deliver m.
Integrity: Every correct process delivers at most one message, and if it delivers m, then some process must have broadcast m.
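One way to make the three properties concrete is to check them against the record of a finished run. The sketch below treats "eventually" as "by the end of the recorded run", which is a simplification. The function name and the data layout (`delivered` maps each process to the list of messages it delivered) are assumptions for illustration only.

```python
def check_reliable_broadcast(sender, broadcast, delivered, correct):
    """Check Validity, Agreement, and Integrity on a finished run.

    broadcast is the message the sender broadcast, or None if it sent nothing.
    """
    by_correct = [delivered.get(p, []) for p in correct]
    # Every message delivered by at least one correct process.
    seen = set().union(*map(set, by_correct))
    return {
        "validity": not (sender in correct and broadcast is not None)
        or all(broadcast in d for d in by_correct),
        "agreement": all(m in d for m in seen for d in by_correct),
        "integrity": all(
            len(d) <= 1 and all(m == broadcast for m in d) for d in by_correct
        ),
    }
```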

18 TERMINATING RELIABLE BROADCAST

Validity: If the sender is correct and broadcasts a message m, then all correct processes eventually deliver m.
Agreement: If a correct process delivers a message m, then all correct processes eventually deliver m.
Integrity: Every correct process delivers at most one message, and if it delivers m, then some process must have broadcast m.
Termination: Every correct process eventually delivers some message.

19 CONSENSUS Every process has a value to propose. After running a consensus algorithm, all processes should deliver the same value.

20 CONSENSUS

Validity: If all processes that propose a value propose v, then all correct processes eventually decide v.
Agreement: If a correct process decides v, then all correct processes eventually decide v.
Integrity: Every correct process decides at most one value, and if it decides v, then some process must have proposed v.
Termination: Every correct process eventually decides some value.
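The four consensus properties can be checked mechanically on the outcome of a finished run. This is a sketch that again reads "eventually" as "by the end of the run". `check_consensus` and its dictionary-based inputs are hypothetical; the use of a dictionary for decisions already limits each process to one decided value.

```python
def check_consensus(proposals, decisions, correct):
    """Check the consensus properties on a finished run.

    proposals: process -> proposed value; decisions: process -> decided value
    (absent if the process never decided); correct: set of correct processes.
    """
    proposed = set(proposals.values())
    decided = {decisions[p] for p in correct if p in decisions}
    everyone_decided = all(p in decisions for p in correct)
    return {
        # Only constrains runs in which a single value was proposed.
        "validity": len(proposed) != 1 or (everyone_decided and decided <= proposed),
        "agreement": not decided or (everyone_decided and len(decided) == 1),
        "integrity": decided <= proposed,
        "termination": everyone_decided,
    }
```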

21 PROPERTIES OF send(m) AND receive(m)

Benign failures:
  Validity: If p sends m to q, and p, q, and the link between them are correct, then q eventually receives m.
  Uniform* integrity: For every message m, q receives m at most once from p, and only if p sent m to q.

* A property is called uniform if it applies to both correct and faulty processes.

22 MODEL

Synchronous message passing
  Execution is a sequence of rounds
  In each round every process takes a step:
    sends messages to neighbors
    receives messages sent in that round
    changes its state
Network is fully connected
No communication failures

23 A SIMPLE CONSENSUS ALGORITHM

Process p_i:
  Initially V = {v_i}
  To execute propose(v_i):
    1. Send {v_i} to all
  decide(x) occurs as follows:
    2. for all j ≠ i, do
    3.   receive S_j from p_j
    4.   V := V ∪ S_j
    5. decide min(V)
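A failure-free run of this one-round algorithm can be simulated directly. The sketch assumes that each process merges every received set into its own set before taking the minimum; that merge step is a reconstruction, as are the names `simple_consensus` and `views`.

```python
def simple_consensus(proposals):
    """Simulate one synchronous round with no failures.

    proposals[i] is the value proposed by process i.
    Returns the list of decisions, one per process.
    """
    n = len(proposals)
    # Initially each process knows only its own proposal.
    views = [{v} for v in proposals]
    # Step 1: every process sends its singleton set to all others.
    inbox = [[{proposals[j]} for j in range(n) if j != i] for i in range(n)]
    # Receive loop: merge every received set into the local view.
    for i in range(n):
        for received in inbox[i]:
            views[i] |= received
    # Final step: decide the minimum value seen.
    return [min(view) for view in views]
```

With no failures every process ends the round with the same set, so `simple_consensus([3, 1, 2])` has all three processes decide 1.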

24 AN EXECUTION

[Figure: timeline of an execution; axis label: time]

25 AN EXECUTION

What should a process decide at the end of the round?
[Figure labels: start of round, end of round]

29 ECHOING VALUES

A process that receives a proposal in round 1 relays it to others during round 2.
Suppose p hasn't heard from q at the end of round 2. Can p decide?
[Figure labels: round 1, round 2]
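To experiment with the effect of echoing, the sketch below simulates synchronous rounds in which a crashing process reaches only some recipients. It makes several assumptions beyond the slides: processes relay their whole known set each round (a superset of the proposals received in round 1), a crash is described by a hypothetical `crashes` map from process to `(round, recipients_reached)`, and survivors decide the minimum value they know.

```python
def flood(proposals, rounds, crashes):
    """Simulate `rounds` synchronous rounds of relaying known values.

    crashes maps a process index to (round, recipients reached before crashing).
    Returns {surviving process: minimum value it knows}.
    """
    n = len(proposals)
    views = [{v} for v in proposals]
    alive = set(range(n))
    for r in range(1, rounds + 1):
        outgoing = []
        for i in sorted(alive):
            if i in crashes and crashes[i][0] == r:
                dests = crashes[i][1]  # partial send, then crash
            else:
                dests = [j for j in range(n) if j != i]
            # Snapshot the view at send time, before any delivery this round.
            outgoing += [(j, set(views[i])) for j in dests]
        # Processes that crash in this round are gone from now on.
        alive -= {i for i, (crash_round, _) in crashes.items() if crash_round == r}
        for j, values in outgoing:
            if j in alive:
                views[j] |= values
    return {i: min(views[i]) for i in sorted(alive)}
```

For example, with `proposals = [0, 5, 7]` and process 0 crashing in round 1 after reaching only process 1, one round leaves the survivors with different minima (`{1: 0, 2: 5}`), while a second round of relaying brings them to the same value (`{1: 0, 2: 0}`).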

