Assignment 12: Commit Protocols and Replication

Data Modelling and Databases Exercise dates: May 24 / May 25, 2018 Ce Zhang, Gustavo Alonso Last update: June 04, 2018 Spring Semester 2018 Head TA: Ingo Müller Assignment 12: Commit Protocols and Replication This assignment will be discussed during the exercise slots indicated above. If you want feedback for your copy, hand it in during the lecture on the Wednesday before (preferably stapled and with your e-mail address). You can also annotate your copy with questions you think should be discussed during the exercise session. If you have questions that are not answered by the solution we provide, send them to David (david.sidler@inf.ethz.ch). 1 2PC 1. Assuming a completely asynchronous system, is it always possible to achieve consensus? Explain your answer. 2. List all timeout possibilities of the 2PC protocol. Differentiate between the coordinator and the participants. Describe the consequences of each scenario. Coordinator Participant Participant timeout phase consequences 3. Assume the following scenario: We have one coordinator C and three participants P 1, P 2, P 3 running 2PC protocol. We define the event (A, B, MSG) as A sends the message MSG to

B. A and B can be any of the participants or the coordinator, i.e., A, B {C, P 1, P 2, P 3 }. Allowed messages are request to vote, voting yes, voting no, request to abort, and request to commit, i.e., MSG {REQ, YES, NO, ABORT, COM } respectively. We also define the event (A, FAIL) to be the failure of node A at that point. Consider now the following order of events: (a) (C, P 1, REQ) (b) (C, P 2, REQ) (c) (C, P 3, REQ) (d) (P 1, C, YES) (e) (P 2, C, YES) (f) (P 3, C, YES) (g) (C, P 1, COM ) (h) (C, P 2, COM ) (i) (C, P 3, COM ) For each of the fail-scenarios described in the table below, replace one of the events from (a) to (i) with a different event in order for the fail-scenario to take place. If there are multiple possibilities, replace the earliest one. Assume that all the actions following the given modification will also change according to the 2PC protocol. Scenario event timestep(a-i) 2PC aborts, but no node has failed A participant experiences a timeout waiting for a message The coordinator experiences a timeout waiting for a message 2PC blocks A Cooperative Termination Protocol is run and the protocol finishes 4. Give an example of a scenario where the 2PC protocol does not terminate. 5. Given your answer to the question 1.4, define an alternation of the 2PC protocol that would terminate in the same scenario. Disregard all other constraints.

2 3PC 1. A coordinator C and two participants P 1, P 2 run the three-phase-commit (3PC) protocol. The coordinator also acts as participant. We model the execution of the protocol as a series of events. An event can be one of the following: A message event of the form (A, B, MSG) means that node A sends the message MSG to node B, where A, B {C, P 1, P 2 } and the message MSG {REQ, YES, NO, PRE COM, ACK, ABORT, COM }, meaning request to vote, voting yes, voting no, pre-commit, acknowledge last message, request to abort, and request to commit respectively. A group communication event of the form (A, ask around ABORT) or (A, ask around COM ) means that node A initiates a round of group communication where all reachable nodes exchange all relevant information and then decide to abort or to commit accordingly. Group communication can be initiated after a time-out. We assume that no failures occur during group communication. A failure event of the form (A, FAIL) means that node A fails. The following sequence of events shows an execution of the 3PC protocol where no failures occur: time step event 1 (C, P 1, REQ) 2 (C, P 2, REQ) 3 (P 1, C, YES) 4 (P 2, C, YES) 5 (C, P 1, PRE COM ) 6 (C, P 2, PRE COM ) 7 (P 1, C, ACK) 8 (P 2, C, ACK) 9 (C, P 1, COM ) 10 (C, P 2, COM ) We now modify this sequence of events starting from some time step. Complete each new sequence with one possible next event such that it models a valid execution of the 3PC protocol. Sequence (i): 4 (P 2, C, NO) 5 2 (C, FAIL) 3 (P 1, C, YES) 4 Sequence (ii): Sequence (iii):

5 (C, FAIL) 6 Sequence (iv): 6 (C, FAIL) 7 Sequence (v): 4 (P 2, FAIL) 5 Sequence (vi): 6 (P 2, FAIL) 7 (C, P 2, PRE COM ) 8 (P 1, C, ACK) 9 2. In the commit protocols discussed in the lecture, participants vote whether to commit, then they decide by consensus. Give an example scenario in 3PC, which can t happen in 2PC, and in which a participant is forced to decide against its own vote. 3. Define a scenario in which 3PC violates at least one of the AC rules.

4. Compare the number of messages sent for 2PC and 3PC protocols when all the participants have committed the update. 5. Compare the runtimes of 2PC and 3PC protocols with a timeout occurring during the second timeout window of at least one participant (no decision in 2PC, no pre-commit in 3PC). 3 Liveness, safety, fault tolerance A protocol is live if each non-faulty process will eventually terminate. A protocol is safe if all processes that terminate arrive to the same decision (whether to commit or abort). The network is reliable if all messages arrive on time (the only possible failures are process failures). The network is unreliable if some messages may be lost. Fill the following table with true or false.

2PC is live 2PC is safe 3PC is live 3PC is safe there exists a protocol that is live and safe reliable network unreliable network 4 Replication 1. List three reasons why to use replication. 2. List two disadvantages shared by all replication types. 3. List the four main types of replication strategies.

4. Describe a scenario in which you would use a synchronous primary copy-strategy instead of an asynchronous update-everywhere strategy and state why. 5. Describe a scenario in which a synchronous strategy causes the database to loose consistency.