Self-Stabilizing Byzantine Digital Clock Synchronization


1 Self-Stabilizing Byzantine Digital Clock Synchronization. Ezra N. Hoch, Danny Dolev, Ariel Daliot. School of Engineering and Computer Science, The Hebrew University of Jerusalem, Israel

2 Problem Statement: Digital Clock Synchronization; Self-stabilizing; Byzantine faults

3 Model & Problem Statement. Model: n nodes, fully connected (complete graph). Synchronous: global beat system. Rapid beat interval: on the order of the message delivery time. No common initialization point. Nodes may be subject to transient faults. Permanent presence of Byzantine nodes (up to f < n/4). Problem statement: If enough good nodes weren't subject to transient faults for a sufficient period of time, then all these nodes attain the same DigiClock value, and with each consecutive beat increase it by 1 (mod Overlap).
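The goal state in the problem statement can be phrased as a check over a recorded run. The following is a minimal sketch, not part of the original slides; the function name `is_synchronized` and the `trace` layout (one list of good-node DigiClock values per beat) are assumptions made for illustration.

```python
def is_synchronized(trace, overlap):
    """Return True if the trace is a legal (synchronized) run.

    trace[b][i] is the DigiClock value of good node i at beat b
    (hypothetical layout, chosen for this sketch).
    """
    for b, clocks in enumerate(trace):
        # All good nodes must hold the same DigiClock value at this beat.
        if len(set(clocks)) != 1:
            return False
        # With each consecutive beat the value increases by 1 (mod Overlap).
        if b > 0 and clocks[0] != (trace[b - 1][0] + 1) % overlap:
            return False
    return True
```

For example, with Overlap = 7 the run `[[5, 5, 5], [6, 6, 6], [0, 0, 0]]` passes the check, since 6 + 1 wraps around to 0.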

4 What do we want to achieve? [Figure: clock displays of the nodes over four consecutive beats. Four nodes show 09:15:01, 09:15:02, 09:15:03, 09:15:04 in lockstep; one display shows 08:10:33.]

5 Why is this hard? [Figure: clock displays of the nodes over four consecutive beats. Each node advances by 1 per beat but from a different starting value: 20:10:10, 16:00:00, 09:15:01 and 12:45:03; one display shows 08:10:33.]

6 Intuitive Solution Overview
I. Agreed Stream: Produce a common stream of agreed values at all good nodes (using "rotating consensus"). (..., 20, 4, 23, 19, ...)
II. Continuous Agreed Stream: Transform the stream into a stream of consecutive, increasing integer values. (..., 27, 28, 29, 30, ...)
III. Update Clock: Update the internal digital clock in accordance with the common increasing stream.

7 Intuitive Solution Contd. [Figure: timeline in beats for four nodes, starting from the last transient fault (arbitrary state), followed by Stage I (agreed stream) and then Stages II-III (continuous agreed stream).]

8 Classic Byzantine Consensus. Problem statement: Each node has an initial value. All nodes need to agree on a common output value within a finite time, in the presence of Byzantine nodes. Agreement: All good nodes terminate with the same output value. Validity: If all good nodes have the same initial value v, then the output value is v. Termination: All good nodes terminate within Δ rounds. In addition, Solidarity: If the output value is v, then more than n/2 good nodes had v as their initial value; otherwise, the output value is the default value. Assumptions: all good nodes start in a consistent initial state; synchronous execution.
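To make the output conditions concrete, here is a small sketch of an idealized stand-in that returns an output consistent with Validity and Solidarity. It is not a Byzantine consensus protocol (there is no message exchange and no adversary); it only illustrates which outputs the properties allow. The name `ideal_consensus_output` and the use of `None` for the default value are assumptions of this sketch.

```python
from collections import Counter

DEFAULT = None  # hypothetical encoding of the consensus "default value"


def ideal_consensus_output(good_initial_values, n):
    """Output allowed by Validity and Solidarity for the given good inputs.

    good_initial_values: initial values of the good nodes only.
    n: total number of nodes, Byzantine nodes included.
    """
    if not good_initial_values:
        return DEFAULT
    value, count = Counter(good_initial_values).most_common(1)[0]
    # Solidarity: a non-default output v requires more than n/2 good nodes
    # to have started with v; otherwise the default value is returned.
    return value if count > n / 2 else DEFAULT
```

With n = 5 and good inputs `[3, 3, 3, 3]` the output is 3 (Validity). With good inputs `[1, 2, 3, 3]` no value is held by more than n/2 good nodes, so the output is the default value.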

9 Stage I: Agreed Stream. "Rotating Consensus": Execute Δ Byzantine Consensus instances simultaneously, each at a different round of its execution. At each beat: execute the current round of each of the Δ instances; output the value of the last terminated instance; invoke a new instance of Byzantine Consensus.

10 Stage I Contd. [Figure: at each of beats i, i+1 and i+2, the concurrent instances execute round 1, round 2, ..., round Δ-1 and round Δ respectively, and one instance produces an output at every beat.]

11 Stage I Summary. Starting from an arbitrary state: At the next beat, a new instance of Byzantine Consensus (BC_i) is initialized. After Δ beats, all nodes agree on the output value of the BC_i consensus instance. This holds for all consensus instances initialized after the last transient fault. Therefore, we have an agreed stream, such that at each beat, all good nodes have an agreed value associated with that beat.
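The pipelining idea behind the rotating consensus can be sketched as a queue of in-flight instances: one instance is started per beat, and the instance started a fixed number of beats earlier delivers its output. This is a simplified illustration under stated assumptions, not the authors' implementation: individual rounds are abstracted away, the class name `RotatingConsensus`, the parameter `delta` (the number of beats an instance needs to terminate) and the pluggable `decide` function are all hypothetical.

```python
from collections import deque


class RotatingConsensus:
    """One new consensus instance per beat, one output per beat (sketch)."""

    def __init__(self, delta, decide):
        self.delta = delta      # beats until an instance terminates (assumed)
        self.decide = decide    # maps an instance's inputs to its output
        self.pending = deque()  # inputs of running instances, oldest first

    def beat(self, inputs):
        """Start a new instance with `inputs`; return the output of the
        instance that terminates at this beat, or None if none has yet."""
        self.pending.append(list(inputs))
        if len(self.pending) > self.delta:
            # The instance started `delta` beats ago terminates now.
            return self.decide(self.pending.popleft())
        return None
```

Usage: with `delta = 3` and a toy `decide` that returns the first input, feeding `[b]` at beat b yields no output for beats 0 to 2 and then the outputs 0, 1, 2, ... from beat 3 on, one per beat.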

12 Stage II: Continuous Agreed Stream. At every beat, all nodes send their DigiClock value to all nodes. At each node, the variable most holds the value that was received from more than half of the nodes; if no such value exists, most holds 0. v holds the value associated with the current beat. v_prev holds the value v had at the previous beat. All `+` operations are performed mod Overlap.
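The definition of most can be written down directly. The sketch below assumes each node collects one received DigiClock value per sender into a list; the function name `compute_most` is hypothetical.

```python
from collections import Counter


def compute_most(received, n):
    """Value received from more than half of the n nodes, or 0 if none.

    received: DigiClock values received at this beat, one per sender.
    """
    if not received:
        return 0
    value, count = Counter(received).most_common(1)[0]
    # At most one value can be held by more than n/2 senders.
    return value if count > n / 2 else 0
```

For n = 4, the received values `[7, 7, 7, 2]` give most = 7, while `[7, 7, 2, 2]` give most = 0 because no value comes from more than half of the nodes.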

13 Stage II Contd. Use the following update rule: if (v = 0) or (v = v_prev + 1) then DigiClock := most + 1 else DigiClock := 0. Initialize the new instance of the Byzantine Consensus algorithm with DigiClock as the initial consensus value.
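As a sketch, the update rule translates into a small pure function. Two details are assumptions of this example rather than statements from the slides: the consensus default value is encoded as `None`, and a `v_prev` equal to the default value is treated as never satisfying v = v_prev + 1. The function name `update_digiclock` is hypothetical.

```python
def update_digiclock(v, v_prev, most, overlap):
    """Return the new DigiClock value according to the update rule.

    v, v_prev: current and previous values of the agreed stream
               (None encodes the default value in this sketch).
    most:      value received from more than half of the nodes, else 0.
    overlap:   all additions are performed mod overlap.
    """
    consecutive = v_prev is not None and v == (v_prev + 1) % overlap
    if v == 0 or consecutive:
        return (most + 1) % overlap  # first line of the rule
    return 0                         # second line of the rule
```

For example, with Overlap = 60: v = 5, v_prev = 4, most = 9 gives 10; v = 7, v_prev = 4 gives 0; and a default-valued v also gives 0.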

14 [Figure: the rotating consensus at beats i and i+1. Each instance advances by one round per beat; the output value v of the terminating instance and the other inputs feed the Update Rule.]

15 Why isn't Stage I enough? At each beat, all good nodes agree on some value. However, this is not enough. If we use v + Δ + 1 as the new initial value, we might get stuck in a repeating pattern. Other immediate update rules failed as well.

16 Stage II: Closure. if (v = 0) or (v = v_prev + 1) then DigiClock := most + 1 else DigiClock := 0. If the system has already converged, then v = v_prev + 1 and all good nodes have the same DigiClock value. Since the good nodes are more than half of all nodes, most will be the same at all good nodes. Therefore, all good nodes will increase their DigiClock value by 1, and the system stays in a legal state (synchronized clocks).
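The closure argument relies on the good nodes alone forming a majority: with f < n/4, the n - f good nodes are more than n/2, so whatever the Byzantine nodes send, every good node computes the same most. The sketch below checks this for one beat. The function names and the way Byzantine messages are modeled (independent random values per receiver) are assumptions of this example.

```python
import random
from collections import Counter


def most_at_node(received, n):
    """Value received from more than half of the n nodes, or 0 if none."""
    value, count = Counter(received).most_common(1)[0]
    return value if count > n / 2 else 0


def closure_step(n, f, clock, overlap, seed=0):
    """One beat from a converged state; returns the good nodes' new clocks."""
    rng = random.Random(seed)
    good = n - f
    new_clocks = []
    for _ in range(good):
        # Byzantine nodes may send a different arbitrary value to each node.
        byzantine = [rng.randrange(overlap) for _ in range(f)]
        most = most_at_node([clock] * good + byzantine, n)
        # Converged state: v == v_prev + 1, so the first line is executed.
        new_clocks.append((most + 1) % overlap)
    return new_clocks
```

With n = 9, f = 2 and all good nodes at DigiClock 17, every good node ends the beat at 18, independent of the Byzantine values.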

17 Stage II Contd. if (v = 0) or (v = v_prev + 1) then DigiClock := most + 1 else DigiClock := 0. Note that once all good nodes agree on DigiClock, they continue to agree. Hence, the values entered into the rotating consensus will be either 0 or an increment of the previous value. Hence, after Δ beats, DigiClock will increase by 1 each beat.

18 Stage II: Convergence. if (v = 0) or (v = v_prev + 1) then DigiClock := most + 1 else DigiClock := 0. Ensuring DigiClock is the same at all good nodes: All nodes execute either the first line or the second line. If during Δ+1 beats DigiClock isn't the same at all good nodes, then only the first line was executed, and no more than n/2 good nodes had the same DigiClock value. After Δ+1 beats, v would be equal to the default value (due to the solidarity requirement). Hence, the second line would be executed, and all good nodes will have the same DigiClock value.
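The interplay of the stages can be explored with a toy simulation. The sketch below is a deliberately simplified model and makes several assumptions that are not in the slides: there are no Byzantine nodes, consensus is replaced by an idealized majority function whose default value is `None`, an instance delivers its output `delta` beats after it starts, and the arbitrary initial state is drawn at random. All names are hypothetical. It illustrates the update rule feeding the rotating consensus; it does not reproduce the fault-tolerant protocol.

```python
import random
from collections import Counter, deque

DEFAULT = None  # hypothetical encoding of the consensus default value


def majority(values, n, fallback):
    """Value held by more than n/2 entries, else `fallback`."""
    value, count = Counter(values).most_common(1)[0]
    return value if count > n / 2 else fallback


def update(v, v_prev, most, overlap):
    """Update rule; a default-valued v or v_prev selects the second line."""
    consecutive = v_prev is not None and v == (v_prev + 1) % overlap
    return (most + 1) % overlap if v == 0 or consecutive else 0


def simulate(n, delta, overlap, beats, seed=0):
    """Return the DigiClock values of all nodes after each beat."""
    rng = random.Random(seed)
    # Arbitrary initial state: clocks, v, and the outputs of the instances
    # already in flight may all differ from node to node.
    clocks = [rng.randrange(overlap) for _ in range(n)]
    v = [rng.randrange(overlap) for _ in range(n)]
    pipeline = deque([rng.randrange(overlap) for _ in range(n)]
                     for _ in range(delta))
    history = []
    for _ in range(beats):
        most = majority(clocks, n, 0)        # same at all nodes (no faults)
        v_prev, v = v, pipeline.popleft()    # instance started delta beats ago
        clocks = [update(v[i], v_prev[i], most, overlap) for i in range(n)]
        # New instance, initialized with DigiClock; idealized agreed output.
        pipeline.append([majority(clocks, n, DEFAULT)] * n)
        history.append(clocks)
    return history
```

Running `simulate(n=7, delta=4, overlap=60, beats=40)` and inspecting the history shows the clocks becoming identical and then increasing by 1 (mod Overlap) at every beat. In this fault-free model that happens earlier than it would with Byzantine nodes present, because every node computes the same most from the start.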

19 Formal Algorithm

20 Convergence Timeline (in beats). Starting from an arbitrary state, the system converges through the following milestones, in order: all good nodes agree on the value of v; all good nodes agree on the value of v_prev; all good nodes execute the same lines of code; all good nodes agree on the value of DigiClock; all good nodes increase DigiClock by 1 each beat. [Figure: time axis in beats, marked at 3Δ+3.]

21 Complexity Analysis. Convergence in Ω(Δ), that is, Ω(f). Amortized message complexity per round is O(n^2).

22 Related Work. Previous self-stabilizing (digital) clock synchronization (non-Byzantine): Arora, Dolev, Gouda, Herman, Papatriantafilou, etc. There are not many previous results addressing self-stabilization with Byzantine faults, due to the complexity of the combined model. Dolev, Welch 95, "Self-Stabilizing Clock Synchronization in the Presence of Byzantine Faults": To the best of our knowledge, the only previous work operating in the same model (global beat, self-stabilizing, Byzantine). Its expected convergence time is exponential, as opposed to our deterministic linear time. It tolerates up to a third of the nodes being Byzantine, as opposed to our tolerance of a fourth.

23 Related Work Contd. Daliot and Dolev's Self-stabilizing Byzantine Clock Synchronization: No common external synchronization. Built on top of an underlying Self-stabilizing Byzantine Pulse with a large cycle length. An internal distributed pulse is difficult to attain. Doesn't take advantage of our stronger synchronous model: precision stays tight (on the order of the network delay), but not 0.

24 Contribution of Current Work. Deterministic linear convergence time (as opposed to the expected exponential time of previous work in the exact same model). A simple solution that takes advantage of the strength of the model (as opposed to a more complex solution with the same convergence time). The Rotating Consensus mechanism.

25 Future Directions. Can the Byzantine tolerance be improved to support f < n/3? What happens when the global beats are received at intervals that are shorter than the message delay? Can the rotating consensus be applied in a more general way, for example to create a general stabilizer of Byzantine-tolerant algorithms?

26 Questions?
