Distributed Systems 2 Introduction

Size: px

Start display at page:

Download "Distributed Systems 2 Introduction"

Silas Hubbard
5 years ago
Views:

1 Distributed Systems 2 Introduction Alberto Montresor University of Trento, Italy 2018/09/13 This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

2 Contents 1 Getting Started Two generals Common Knowledge Byzantine generals Summary 2 Themes of the course Impossible vs practical Classical vs extreme distributed systems Syllabus 3 Modeling Distributed Systems Computation Interaction Failures Time

3 Getting Started Two generals Two generals A thought experiment: Alberto Montresor (UniTN) DS - Introduction 2018/09/13 1 / 43

4 Getting Started Two generals A potential solution General A: attack at dawn! General B: ack, attack at dawn! General A: ack ack, attack at dawn! General B: ack ack ack, attack at dawn!... Theorem Under this scenario, there is no solution for the Two Generals Problem Proof. By contradiction on the number of messages exchanged. Alberto Montresor (UniTN) DS - Introduction 2018/09/13 2 / 43

5 DS - Introduction Getting Started Two generals A potential solution A potential solution General A: attack at dawn! General B: ack, attack at dawn! General A: ack ack, attack at dawn! General B: ack ack ack, attack at dawn!... Theorem Under this scenario, there is no solution for the Two Generals Problem Proof. By contradiction on the number of messages exchanged. Let s assume (by contradiction) that there is at least one solution to this problem under this scenario. If there is one, there could be many. If there are many, we can find one which uses the minimum number of messages. Take the last message of this protocol: it can be received or it can be lost The protocol should work in both cases. So we could avoid sending it at all! The resulting protocol uses less messages than the minimum, which is a contradiction

6 Getting Started Two generals Reality Check Atomic Commit An atomic commit is an operation in which a set of distinct changes is applied as a single operation. Example: ATM s withdrawal You whitdraw 100 euro from an ATM in Trento Your balance should be reduced by 100 euro Alberto Montresor (UniTN) DS - Introduction 2018/09/13 3 / 43

7 Getting Started Two generals A pragmatic solution Probabilistic protocol Assuming messengers are caught independently of each other with probability p General A: Send n messengers Attacks no matter what General B: Attacks if receives at least one messenger p n is the probability that the attack will be uncoordinated Trade-off: We can decrease the probability of failure by increasing n... But at the additional cost of sending more messengers! Without be ever certain that the attack will be coordinated! Alberto Montresor (UniTN) DS - Introduction 2018/09/13 4 / 43

8 Getting Started Common Knowledge Muddy children n children go playing Children are truthful, perceptive, intelligent Mom says: Don t get muddy! A bunch (say, k) get mud on their forehead Daddy comes, looks around, and says Some of you got a muddy forehead Daddy repeatedly ask: Do you know whether you have a muddy forehead? What happens? Slides from Lorenzo Alvisi Alberto Montresor (UniTN) DS - Introduction 2018/09/13 5 / 43

9 Getting Started Common Knowledge Muddy children Theorem The first k 1 times Daddy asks, the children says No The k-th time, the k children say Yes. Proof. By induction on k Alberto Montresor (UniTN) DS - Introduction 2018/09/13 6 / 43

10 DS - Introduction Getting Started Common Knowledge Muddy children Muddy children Theorem The first k 1 times Daddy asks, the children says No The k-th time, the k children say Yes. Proof. By induction on k Let k = 1. The first time daddy asks, the child with mud on his forehead say yes. Because all the other have no mud, and someone has mud on his forehead, it must be him. Let k > 1 Every child with mud see k 1 children with mud on the forehead. If there were k 1 children with mud, they would have said yes at the (k 1)-th time daddy asks, but they didn t. So there are actually k children with mud, and they all say yes at the k-th time daddy asks.

11 Getting Started Common Knowledge Muddy children Variation 1 Suppose k > 1 Every one knows that someone has a dirty forehead before Dad announces it Does Dad still need to speak up? Let p = Someone s forehead is dirty Every one knows p But, unless the father speak, if k = 2 not every one knows that everyone knows p! Suppose A and B are dirty. Before the father speaks A does not know whether B knows p If k = 3, not every one knows that every one knows that every one knows p... Alberto Montresor (UniTN) DS - Introduction 2018/09/13 7 / 43

12 Getting Started Common Knowledge Muddy children Variation 2... the father took every child aside and told them individually (without others noticing) that someone s forehead is muddy? Variation 3... every child had (unknown to the other children) put a miniature microphone on every other child so they can hear what the father says in private to them? Alberto Montresor (UniTN) DS - Introduction 2018/09/13 8 / 43

13 Getting Started Common Knowledge Two generals, reloaded There is an entire logic that formalizes what knowledge participants acquire while running a protocol J. Halpern and Y. Moses. Knowledge and Common Knowledge in a Distributed Environment. E.W. Dijkstra Prize Solving the Two Generals Problem requires common knowledge everyone knows that everyone knows that everyone knows... But: Common knowledge cannot be achieved by communicating through unreliable channels Alberto Montresor (UniTN) DS - Introduction 2018/09/13 9 / 43

14 Getting Started Common Knowledge A common knowledge puzzle Albert and Bernard just become friends with Cheryl, and they want to know when her birthday is. Cheryl gives them a list of 10 dates: May 15,16,19 June 17,18 July 14,16 August 14,15,17 Cheryl then tells Albert and Bernard separately the month and the day of her birthday, respectively. Albert: I don t know when Cheryl s birthday is, but I know that Bernard does not know too. Bernard: At first I didn t know when Cheryl s birthday is, but I know now. Albert: Then I also know when Cheryl s birthday is Alberto Montresor (UniTN) DS - Introduction 2018/09/13 10 / 43

15 Getting Started Byzantine generals Byzantine generals Scenario n Byzantine generals encircling a city They must decide whether to attack or retreat! Messengers are reliable and synchronous Generals may be traitors Nobody knows which generals are traitors Problem specification The generals require an algorithm to reach an agreement such that (i) all loyal generals decide on the same plan of action and (ii) a small number of traitorous generals cannot cause the loyal generals to adopt different plans. Alberto Montresor (UniTN) DS - Introduction 2018/09/13 11 / 43

16 Getting Started Byzantine generals A potential solution Wait for a majority of generals to agree WRONG! Possible scenario: 3 generals 1 vote attack, 1 vote retreat 1 traitorous general: sends a vote attack to the attack general sends a vote retreat to the retreat general Alberto Montresor (UniTN) DS - Introduction 2018/09/13 12 / 43

17 Getting Started Byzantine generals The problem is solvable Byzantine Fault Tolerance (1982) L. Lamport, R. Shostak, M. Pease, The Byzantine Generals Problem, ACM Trans. on Programming Languages and Systems, 4(3): , A protocol that given n processes,can tolerate up to t traitorous generals with n 3t + 1 Example: 4 generals can tolerate up to 1 byzantine general Practical Byzantine Fault Tolerance (2002) M. Castro and B. Liskov, Practical Byzantine Fault Tolerance and Proactive Recovery, ACM Trans. on Computer Systems, 20(4): , PBFT triggered a renaissance in BFT replication research Still going on... Alberto Montresor (UniTN) DS - Introduction 2018/09/13 13 / 43

18 Getting Started Byzantine generals Reality Check BFT sponsors in 1982 NASA The Ballistic Missile Defense System Command Army Research Office Nancy Lynch s book on Distributed Systems: The agreement problem is a simplified version of a problem that originally arose in the development of on-board aircraft control systems. BitCoin, a peer-to-peer digital currency system, is based on BFT. The 8-hour downtime of Amazon S3 in July 2008 is a well-known example of what happens when you don t use BFT. Alberto Montresor (UniTN) DS - Introduction 2018/09/13 14 / 43

19 Getting Started Summary Take-home lessons We need to properly model our distributed systems Reliable / unreliable communication Benign / malicious processes Solutions depend on the underlying model Approximate or probabilistic solutions Bounded solution No solution at all! Coordinating multiple processes is difficult Unexpected events: failures, malicious behavior Lack of common knowledge Alberto Montresor (UniTN) DS - Introduction 2018/09/13 15 / 43

Themes of the course Impossible vs practical Theory vs practice Yogi Berra says: In theory, theory and practice are the same. In practice, they are not.

20 Themes of the course Impossible vs practical Theory vs practice Yogi Berra says: In theory, theory and practice are the same. In practice, they are not. Yogi Berra Always go to other people s funerals, otherwise they won t go to yours I really didn t say everything I said Nobody goes there anymore; it s too crowded Alberto Montresor (UniTN) DS - Introduction 2018/09/13 16 / 43

21 Themes of the course Impossible vs practical First theme of the course Impossible vs practical Several papers about impossibility results: M. Fischer, N. Lynch, M. Paterson. Impossibility of Distributed Consensus with One Faulty Process. Journal of ACM, 32(2): , S. Gilbert, N. Lynch. Brewer s Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services. ACM SIGACT News, 33(2):51-59, Yet, many of these problems have practical solutions: T. Chandra and S. Toueg. Unreliable failure detectors for reliable distributed systems. Journal of the ACM, 43(2): , L. Lamport. Paxos made simple. ACM SIGACT News, 32(4):18 25, A general tension in Computer Science: M. Vardi. Solving the Unsolvable. Comm. of the ACM, 54(7):5, Alberto Montresor (UniTN) DS - Introduction 2018/09/13 17 / 43

22 Themes of the course Classical vs extreme distributed systems Beyond technology The Spanish Flu, Pandemic: killed 20M people in a relatively short time, more than World War I Virus goal: spread itself as quickly as possible Unreliable environment: viruses may be killed transmission may fail Transmission network is a complex graph Alberto Montresor (UniTN) DS - Introduction 2018/09/13 18 / 43

23 Themes of the course Classical vs extreme distributed systems Beyond technology Flocks of birds Flying in a flock is good: probability of being killed by a predator is reduced Flying in a flock is bad: probability of finding (enough) food is reduced Birds self-organize themselves in a flock No central authority Alberto Montresor (UniTN) DS - Introduction 2018/09/13 19 / 43

24 Themes of the course Classical vs extreme distributed systems Second theme of the course Classical vs extreme distributed systems Classical distributed system problems include agreement, total order broadcast, atomic commit, replication, etc. Extreme distributed system problems include self-* properties, scalability, full decentralization, etc. Special issue on Springer Computing (Sept. 2012) Alberto Montresor, Gusz Eiben, Maarten van Steen, editors Modern distributed systems may nowadays consist of hundreds of thousands of computers, ranging from high-end powerful machines to low-end resource-constrained wireless devices. We label them as extreme distributed systems, as they push scalability and complexity well beyond traditional scenarios. Alberto Montresor (UniTN) DS - Introduction 2018/09/13 20 / 43

25 Themes of the course Syllabus Topics Introduction Reliable broadcast Epidemic protocols Impossibility of consensus Consensus and failure detectors Complex networks P2P Epidemics: Beyond dissemination Rollback and Recovery Paxos Practical Byzantine Fault Tolerance Byzantine, altruistic, rational model Blockchains Alberto Montresor (UniTN) DS - Introduction 2018/09/13 21 / 43

26 Themes of the course Syllabus What s next in the course? Distributed system modeling Which kind of failures exist? Which kind of failures we tolerate? Problem specification Formal description of the problem Algorithms, algorithms, algorithms Pseudo-code descriptions of algorithms Proofs Just writing the code is not enough! Sometimes, impossibility proofs Reality checks Learn about real systems where these protocols are applied Alberto Montresor (UniTN) DS - Introduction 2018/09/13 22 / 43

27 Modeling Distributed Systems Contents Modeling distributed systems Computation: Processes, deterministic vs probabilistic behavior Interaction: Processes interact through messages, which result in: Communication, i.e. information flow Coordination, i.e. synchronization and ordering of activities Failures: Which kind of failures can occur? Benign vs malicious (Byzantine) Process vs communication Time: Determining whether we can make any assumption on time bounds on communication and computation speeds. Alberto Montresor (UniTN) DS - Introduction 2018/09/13 23 / 43

28 Modeling Distributed Systems Computation Computation Process: the unit of computation in a distributed system. Sometimes we may call it node, host, etc. Process set: denoted by Π, it is composed by a collection of n uniquely identified processes, like p 1, p 2,..., p n. Typical assumptions: The set is static (n is well-defined); Processes do know each other All processes run a copy of the same algorithm; the sum of all these copies constitutes the distributed algorithm But in extreme distributed systems: Dynamic set Too many, too dynamic to know them all Multiple algorithms Alberto Montresor (UniTN) DS - Introduction 2018/09/13 24 / 43

29 Modeling Distributed Systems Computation Deterministic vs probabilistic Deterministic process: the local computation and the messages sent by a process is determined by the current state and the messages previously received. Probabilistic process: processes may make used of random oracles to choose the local computation to be performed or the next message to be sent. Alberto Montresor (UniTN) DS - Introduction 2018/09/13 25 / 43

30 Modeling Distributed Systems Interaction Interaction Processes communicate through messages send(m, p): sends a message m to p receive(m): receives a messages m In some cases, messages may be uniquely identified by Sender of the message A sequence number local to the sender General assumption: every pair of processes is connected by a bi-directional communication channel Through routing Not true for P2P systems Alberto Montresor (UniTN) DS - Introduction 2018/09/13 26 / 43

31 DS - Introduction Modeling Distributed Systems Interaction Interaction Interaction Processes communicate through messages send(m, p): sends a message m to p receive(m): receives a messages m In some cases, messages may be uniquely identified by Sender of the message A sequence number local to the sender General assumption: every pair of processes is connected by a bi-directional communication channel Through routing Not true for P2P systems In the receive operation, we do not specify the original sender; can be Fully connected topology may be obtained through routing. For example, consider the following architectures: Fully connected mesh broadcast medium (Ethernet, wireless) Ring Internet with routers

32 Modeling Distributed Systems Failures Process failures In a distributed systems, both processes and communication channels may fail, i.e. depart from what is considered its correct behavior. Hadzilacos and Toueg provide a taxonomy. Benign process failures Fail-stop: A process stops executing events, and other processes may detect this fact. Crash: A process stops executing events Malicious process failures Arbitrary failure, or Byzantine: any type of error may occur. This may be caused by: A software bug A malicious behavior inspired by an intelligent adversary Alberto Montresor (UniTN) DS - Introduction 2018/09/13 27 / 43

33 Modeling Distributed Systems Failures Process failures A process that never fails is correct A process that eventually fails is faulty Several protocols are designed to work correctly if the number of failures f is bounded (for example, f < n/3). In some models, processes may perform a recovery action: After some time, a process may resume functioning It suffers amnesia: the local state maintained in volatile memory is lost To limit the effects of amnesia, a log can be maintained Alberto Montresor (UniTN) DS - Introduction 2018/09/13 28 / 43

34 DS - Introduction Modeling Distributed Systems Failures Process failures Process failures A process that never fails is correct A process that eventually fails is faulty Several protocols are designed to work correctly if the number of failures f is bounded (for example, f < n/3). In some models, processes may perform a recovery action: After some time, a process may resume functioning It suffers amnesia: the local state maintained in volatile memory is lost To limit the effects of amnesia, a log can be maintained To avoid the problem of amnesia completely, every read/write would have to pass through permanent memory; too expensive

35 Modeling Distributed Systems Failures Communication failures Benign communication failures Process p performs send of a message m to process q Message m is inserted in a local outgoing buffer of p (Send-omission) Message m is transmitted from p to q (Omission) Message m is inserted in a local incoming buffer of q (Receive-omission) Process q performs receive of m Malign communication failures Messages created out of nothing, duplicated messages, etc. These problems can easily be solved through encryption techniques. Alberto Montresor (UniTN) DS - Introduction 2018/09/13 29 / 43

36 Modeling Distributed Systems Failures Communication failures Possible causes of message failures: Buffer overflow in the operating system Congestion, routing errors in routers Partitioning: Processes are subdivided in disjoint sets called partitions Communication inside a partition is possible Communication between partitions is not possible When a partition disappears, we say that partitions merge Alberto Montresor (UniTN) DS - Introduction 2018/09/13 30 / 43

37 Modeling Distributed Systems Failures Modeling (faulty) communication channels The idea: the channels cannot systematically drop a specific message. This is the minimum abstraction needed to create reliable channels. Fair-Loss Channels Validity Fair Loss: If a message m is sent infinitely often by a process p to a process q and neither p and q crash, then q will receive m infinitely often Integrity Finite Duplication: If a message m is sent a finite number of times by a process p to a process q, then m cannot be received by q an infinite number of times Integrity No creation: If a message m is delivered by some process p, then m was previously sent by some process q to p Alberto Montresor (UniTN) DS - Introduction 2018/09/13 31 / 43

38 Modeling Distributed Systems Failures Modeling (correct) communication channels The idea: channels are reliable, messages are never lost. It can be implemented, but there is a price to be payed: asynchrony. Perfect Channels Validity Reliable delivery: If p sends a message to q, and neither of p and q crash, then q will eventually receive m Integrity No duplication: No message is delivered to a process more than once Integrity No creation: If a message m is delivered by some process p, then m was previously sent by some process q to p Alberto Montresor (UniTN) DS - Introduction 2018/09/13 32 / 43

39 Modeling Distributed Systems Failures An Example Algorithm Fair-loss Channel Perfect Channel upon init do Set sent Set delivered starttimer(timeout) upon timeout do foreach (m, q) sent do fairlosssend(m, q) starttimer(timeout) upon perfectsend(m, q) do fairlosssend(m, q) sent sent {(m, q)} upon fairlossreceive(m, q) do if m / delivered then delivered delivered {m} perfectreceive(m, q) Alberto Montresor (UniTN) DS - Introduction 2018/09/13 33 / 43

40 Modeling Distributed Systems Failures Safety and liveness Safety Something bad will never happen In other words, a distributed program should never enter an unacceptable state. No message is delivered to a process more than once. Liveness Something good eventually does happen In other words, a distributed program eventually enters a desirable state. If p sends a message to q, and neither of p and q crash, then eventually q will receive m. Alberto Montresor (UniTN) DS - Introduction 2018/09/13 34 / 43

41 Modeling Distributed Systems Time Time Global clock For presentation simplicity, it may be convenient to assume the presence of a global real-time clock, outside the control of processes. This can be used to provide a global ordering of steps in a distributed systems In reality: Each process is associated with a local clock Local clocks may not report the perfect time Clock drift rate: refers to the relative amount that a computer clock differs from a perfect reference clock. Synchronization is possible, but expensive: Atomic clocks GPS See: Google TrueTime API: Alberto Montresor (UniTN) DS - Introduction 2018/09/13 35 / 43

42 DS - Introduction Modeling Distributed Systems Time Time Time Global clock For presentation simplicity, it may be convenient to assume the presence of a global real-time clock, outside the control of processes. This can be used to provide a global ordering of steps in a distributed systems In reality: Each process is associated with a local clock Local clocks may not report the perfect time Clock drift rate: refers to the relative amount that a computer clock differs from a perfect reference clock. Synchronization is possible, but expensive: Atomic clocks GPS See: Google TrueTime API: GPS does not work into buildings Atomic clocks: cost not justified

43 Modeling Distributed Systems Time Time measures associated to communication Latency: The delay between the start of message sending from one process and the beginning of its receipt by another. Possible causes: the actual time for bit transmission (e.g., satellite link) the delay for accessing the network, especially in case of congestion the time taken by the operating system to handle the message both at sender and receiver Bandwidth: Total amount of information that can be transmitted over a communication channel in a given time. Jitter: Variation in the time taken to deliver a series of messages. Mostly related with multimedia data. Alberto Montresor (UniTN) DS - Introduction 2018/09/13 36 / 43

44 Modeling Distributed Systems Time Asynchronous vs synchronous Distributed Systems vs Time Distributed systems make difficult to reason about time, not only for lack of clock synchronization. It is also difficult to pose time bounds on events and communication. We may think about several different models: Asynchronous distributed systems Synchronous distributed systems Partially synchronous distributed systems Alberto Montresor (UniTN) DS - Introduction 2018/09/13 37 / 43

45 DS - Introduction Modeling Distributed Systems Time Asynchronous vs synchronous Asynchronous vs synchronous Distributed Systems vs Time Distributed systems make difficult to reason about time, not only for lack of clock synchronization. It is also difficult to pose time bounds on events and communication. We may think about several different models: Asynchronous distributed systems Synchronous distributed systems Partially synchronous distributed systems Asynchronous distributed systems No assumptions can be made. Most of the problems cannot be solved Synchronous distributed systems Precise assumptions are possible on computation, communication time and clocks. Not really realistic / difficult to implement Partially synchronous distributed systems Some assumptions can be made, others not, OR Assumptions can be made statistically, OR Assumptions hold for arbitrarily long periods of time

46 Modeling Distributed Systems Time Asynchronous vs synchronous Asynchronous distributed system There are no bounds on the relative speed of process execution. There are no bounds on message transmission delays. There are no bounds on clock drift. OR, since we cannot count on their precision at all, there are no clocks. Alberto Montresor (UniTN) DS - Introduction 2018/09/13 38 / 43

47 Modeling Distributed Systems Time Asynchronous vs synchronous Comments These are not assumptions! These are lack of assumptions! The worst possible model: services as simple as: failure detection time-based coordination are not possible Advantages: simple semantics easier to port to more powerful models More realistic: several sources of asynchrony are present in a large-scale network (like the Internet) Alberto Montresor (UniTN) DS - Introduction 2018/09/13 39 / 43

48 Modeling Distributed Systems Time Asynchronous vs synchronous Synchronous Distributed Systems Synchronous computation: There is a known upper bound on the relative speed of process execution. Synchronous communication: There is a known upper bound on message transmission delays. Synchronous clocks: Processes are equipped with local clocks. There is a known upper bound on the drift rates of local clocks with respect to a global real-time clock. Alberto Montresor (UniTN) DS - Introduction 2018/09/13 40 / 43

49 Modeling Distributed Systems Time Asynchronous vs synchronous Comments The best possible model. Can be built, but not with standard hardware/software. Synchronous Ethernet vs CSMA/CD Ethernet Real-time OS vs normal OS Many interesting properties: Timed failure detection (e.g., ping) Coordination based on time (e.g., lease) Worst-case performance analysis Synchronized clocks Alberto Montresor (UniTN) DS - Introduction 2018/09/13 41 / 43

50 Modeling Distributed Systems Time Asynchronous vs synchronous Partial synchrony For most systems we know of, it is relatively easy to define physical time bounds that are respected most of the time. There are however periods where the timing assumptions do not hold. Delays on processes: Machines may run out of memory, slowing down processes A typical case of no bound on relative speeds of processes Delays on messages: Network may congested, and messages may be dropped. Re-transmission protocols can ensure reliability, but at the price of asynchrony Messages may be re-transmitted an arbitrary number of times. Alberto Montresor (UniTN) DS - Introduction 2018/09/13 42 / 43

51 DS - Introduction Modeling Distributed Systems Time Asynchronous vs synchronous Asynchronous vs synchronous Partial synchrony For most systems we know of, it is relatively easy to define physical time bounds that are respected most of the time. There are however periods where the timing assumptions do not hold. Delays on processes: Machines may run out of memory, slowing down processes A typical case of no bound on relative speeds of processes Delays on messages: Network may congested, and messages may be dropped. Re-transmission protocols can ensure reliability, but at the price of asynchrony Messages may be re-transmitted an arbitrary number of times. In this sense, practical systems are partially synchronous

52 Modeling Distributed Systems Time Asynchronous vs synchronous How to express partial synchrony? A possibility is the following: Timing assumptions only hold eventually. Theoretically, it means: There is a time after which the system is synchronous forever The system is initially asynchronous and only after a long time becomes synchronous How to read it: The system is not always synchronous There is no known bound to the period in which it is asynchronous We expect that there are periods during which the system is synchronous Some of these periods are long enough to terminate protocol execution Alberto Montresor (UniTN) DS - Introduction 2018/09/13 43 / 43

53 Reading Material A. Panconesi. The coordinated attack and the jealous amazons. A. Panconesi. Coordination and the fall of Eastern Roman Empire. V. Hadzilacos and S. Toueg. A modular approach to fault-tolerant broadcasts and related problems. In S. Mullender, editor, Distributed Systems (2 nd ed.). Addison-Wesley,

54 Reality Check: Interesting links The S3 incident Solving the unsolvable The rise and fall of Corba Notes on Distributed Systems for Young Bloods

Distributed Algorithms Introduction

Distributed Algorithms Introduction Alberto Montresor University of Trento, Italy 2016/04/26 This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. Contents 1