CS 378 Intro to Distributed Computing
Lorenzo Alvisi
Harish Rajamani

What is a distributed system?
"A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable." (Leslie Lamport)
What is a distributed system?
A distributed system is software through which a collection of independent computers appears to its users as a single, coherent system.
[Figure: distributed applications run on a middleware layer, which sits on the local OS of each of machines A, B, and C, connected by a network]

Goals (and auto-goals) of a Distributed System
- Connecting resources and users
- Transparency
- Openness
- Scalability
Transparency

Transparency    Description
Access          Hides differences in data representation and invocation mechanisms
Location        Hides where an object resides
Migration       Hides that an object may move to another location
Relocation      Hides from a client the change of location of an object to which the client is bound
Replication     Hides that an object may be replicated, with replicas at different locations
Concurrency     Hides coordination of activities between objects
Failure         Hides the failure and recovery of an object
Persistence     Hides whether a resource is in memory or on disk
Openness
- Conform to well-defined interfaces
- Easily interact with other open systems
- Achieve independence from heterogeneity wrt:
  - Hardware
  - Platform
  - Languages
- Support different app/user-specific policies (ideally, provide only mechanisms!)

Scalability
- Size scalability: number of users and processes
- Geographical scalability: maximum distance between nodes
- Administrative scalability: number of administrative domains
Scaling
- Distribute: partition data and computation across multiple machines (Java applets, DNS, WWW)
- Replicate: make copies of data available at different machines (mirrored web sites, replicated file systems, replicated databases)
- Cache: allow client processes to access local copies (Web caches, file caching)
Replication and caching raise the issue of consistency.

A first course in Distributed Computing...
Two basic approaches:
- cover many interesting systems, and distill from them fundamental principles
- focus on a deep understanding of the fundamental principles, and see them instantiated in a few systems
A few intriguing questions
- How do we talk about a distributed execution?
- Can we draw global conclusions from local information?
- Can we coordinate operations without relying on synchrony?
- For the problems we know how to solve, how do we characterize the goodness of our solution?
- Are there problems that simply cannot be solved?
- What are useful notions of consistency, and how do we maintain them?
- What if part of the system is down? Can we still do useful work?
- What if instead part of the system becomes possessed and starts behaving arbitrarily? Are all bets off?
Global Predicate Detection and Event Ordering

Our problem: to compute predicates over the state of a distributed application.
Model
- Message passing
- No failures
- Two possible timing assumptions:
  1. Synchronous system
  2. Asynchronous system:
     - no upper bound on message delivery
     - no bound on relative process speeds
     - no centralized clock

Clock Synchronization
External clock synchronization keeps each processor clock within some maximum deviation from an external time source:
- can exchange info about the timing of events on different systems
- can take actions at real-time deadlines
- synchronization within 0.1 ms

Internal clock synchronization keeps processor clocks within some maximum deviation from each other:
- can measure the duration of distributed activities that start on one process and terminate on another
- can totally order events that occur on a distributed system
Clock Synchronization: Take 1
- Assume an upper bound max and a lower bound min on message delivery
- Guarantee that processes stay synchronized within max - min

Problem:
[Figure: histogram (% of messages vs. time in ms) from a 5000-message run at IBM Almaden; labeled values include 4.22, 4.48, and 93.17 ms]
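As a quick sanity check of Take 1 (a sketch with assumed delivery bounds, not the slides' numbers): if a sender stamps a message with its clock T and the receiver sets its estimate to T + (max + min)/2, the estimation error never exceeds (max - min)/2.

```python
import random

MIN, MAX = 4.0, 5.0  # assumed delivery-time bounds in ms (illustrative values)

def read_remote_clock(sender_time, delay):
    """Receiver sees sender_time after `delay`; the midpoint rule is the
    best possible correction when only the bounds [MIN, MAX] are known."""
    assert MIN <= delay <= MAX
    estimate = sender_time + (MIN + MAX) / 2   # receiver's estimate of sender's clock
    actual = sender_time + delay               # sender's true clock at receipt
    return estimate, abs(estimate - actual)

worst = 0.0
for _ in range(10_000):
    _, err = read_remote_clock(100.0, random.uniform(MIN, MAX))
    worst = max(worst, err)
assert worst <= (MAX - MIN) / 2    # error bound: (max - min) / 2
```

The bound is tight: the extreme delays MIN and MAX both produce exactly (max - min)/2 of error.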
Clock Synchronization: Take 2
- No upper bound on message delivery...
- ...but a lower bound min on message delivery
- Use a timeout max_p to detect process failures
- Slaves send messages to the master
- Master averages the slaves' values; computes a fault-tolerant average
- Precision: 4(max_p - min)

Probabilistic Clock Synchronization (Cristian)
- Master-slave architecture
- Master is connected to an external time source
- Slaves read the master's clock and adjust their own
How accurately can a slave read the master's clock?
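A sketch of the fault-tolerant averaging step from Take 2 (the specific rule, discarding the f largest and f smallest readings, is an assumption; the slides do not spell out which fault-tolerant average is used):

```python
def fault_tolerant_average(readings, f):
    """Average the readings after discarding the f largest and f smallest,
    so up to f arbitrarily faulty clock values cannot drag the result away."""
    assert len(readings) > 2 * f, "need more than 2f readings"
    trimmed = sorted(readings)[f:len(readings) - f]
    return sum(trimmed) / len(trimmed)

# one wildly faulty slave (999.0) is discarded by the trimming step
print(fault_tolerant_average([10.0, 10.2, 9.9, 10.1, 999.0], f=1))
```

With f = 1, the 999.0 outlier falls in the discarded top slot, and the result stays near the honest readings.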
The Idea
- Clock reading accuracy depends on the message round trip: if the round trip is small, master and slave cannot have drifted by much!
- Since there is no upper bound on message delivery, there is no certainty of an accurate enough reading...
- ...but a very accurate reading can be achieved by repeated attempts

Asynchronous systems
- Weakest possible assumptions (cf. the finite progress axiom)
- Weak assumptions ⇒ fewer vulnerabilities
- Asynchronous ≠ slow
- Interesting model w.r.t. failures (ah ah ah!)
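Cristian's round-trip idea above can be sketched as follows (simulation only; the exponential delay model and all names are my assumptions). The slave estimates the master's clock as T + rtt/2; the error is at most rtt/2 - min, so the slave simply retries until some round trip is fast enough.

```python
import random

MIN = 2.0      # assumed lower bound on one-way delay (ms)
TARGET = 1.5   # desired reading accuracy (ms)

def attempt_read(master_now, rng):
    """One round trip; master_now is the master's clock when the slave asks.
    Returns (estimate of master's clock at receipt, guaranteed error bound)."""
    out = MIN + rng.expovariate(1.0)    # delays have no upper bound...
    back = MIN + rng.expovariate(1.0)   # ...but never drop below MIN
    T = master_now + out                # timestamp carried in the master's reply
    rtt = out + back
    return T + rtt / 2, rtt / 2 - MIN   # Cristian's bound: |error| <= rtt/2 - min

def read_master(master_now, rng, max_tries=10_000):
    """Retry until a round trip is short enough to guarantee TARGET accuracy."""
    for _ in range(max_tries):
        est, bound = attempt_read(master_now, rng)
        if bound <= TARGET:
            return est, bound
    raise RuntimeError("no sufficiently fast round trip observed")

est, bound = read_master(100.0, random.Random(42))
assert bound <= TARGET   # the accepted reading meets the target accuracy
```

Note the trade-off the slide describes: no single attempt is guaranteed accurate, but short round trips occur with high probability, so repetition succeeds quickly in practice.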
Client-Server
- Processes exchange messages using Remote Procedure Call (RPC)
- A client requests a service by sending the server a message; the client blocks while waiting for a response
- The server computes the response (possibly asking other servers) and returns it to the client
[Figure: client c blocked waiting on server s]
Deadlock!

Goal
Design a protocol by which a processor can determine whether a global predicate (say, deadlock) holds.
Wait-For Graphs
- Draw an arrow from p_i to p_j if p_j has received a request (from p_i) but has not responded yet
- deadlock ⇔ cycle in WFG
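Once each process's wait-for edges are known, detecting the cycle is ordinary graph search. A minimal sketch (the dict representation and process names are illustrative):

```python
def has_cycle(wait_for):
    """wait_for maps each process to the set of processes it is waiting for.
    Depth-first search with three colors; a back edge to a GRAY node is a cycle."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {}

    def dfs(p):
        color[p] = GRAY
        for q in wait_for.get(p, ()):
            if color.get(q, WHITE) == GRAY:
                return True                      # cycle: q is on the current path
            if color.get(q, WHITE) == WHITE and dfs(q):
                return True
        color[p] = BLACK
        return False

    return any(color.get(p, WHITE) == WHITE and dfs(p) for p in list(wait_for))

assert has_cycle({"p1": {"p2"}, "p2": {"p1"}})     # mutual wait: deadlock
assert not has_cycle({"p1": {"p2"}, "p2": set()})  # simple chain: no deadlock
```

The hard part, as the following slides show, is not the graph search but collecting a wait-for graph that ever actually existed.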
The protocol
- p_0 sends a message to every other process p_i
- On receipt of p_0's message, p_i replies with its state and wait-for info

An execution
[Figure: space-time diagram of an execution of the protocol]
An execution
[Figure: space-time diagram; the wait-for information collected at different times assembles into a cycle that never existed in any actual global state]
Ghost Deadlock!
Houston, we have a problem...
- Asynchronous system: no centralized clock, etc. etc.
- Synchrony is useful to coordinate actions and order events
- Mmmmhhh...

Events and Histories
- Processes execute sequences of events
- Events can be of 3 types: local, send, and receive
- e_p^i is the i-th event of process p
- The local history h_p of process p is the sequence of events executed by process p
  - h_p^k : prefix that contains the first k events
  - h_p^0 : initial, empty sequence
- The history H is the set h_{p_0} ∪ h_{p_1} ∪ ... ∪ h_{p_{n-1}}
- NOTE: In H, local histories are interpreted as sets, rather than sequences, of events
Ordering events
- Observation 1: events in a local history are totally ordered
- Observation 2: for every message m, send(m) precedes receive(m)
Happened-before (Lamport [1978])
A binary relation → defined over events:
1. if e_i^k, e_i^l ∈ h_i and k < l, then e_i^k → e_i^l
2. if e_i = send(m) and e_j = receive(m), then e_i → e_j
3. if e → e' and e' → e'', then e → e''

Space-Time diagrams
A graphic representation of a distributed execution
[Figure: space-time diagram for processes p_0, p_1, p_2]
H and → impose a partial order
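The same Lamport [1978] paper pairs happened-before with logical clocks: every event increments a local counter, and a receive first takes the max with the timestamp piggybacked on the message, so e → e' implies LC(e) < LC(e'). A minimal sketch (the class is my addition, not on the slides):

```python
class LamportClock:
    """Logical clock satisfying the clock condition:
    e -> e' (happened-before) implies LC(e) < LC(e')."""
    def __init__(self):
        self.time = 0

    def local_event(self):
        self.time += 1
        return self.time

    def send(self):                    # a send is itself an event
        self.time += 1
        return self.time               # this timestamp travels with the message

    def receive(self, msg_time):
        self.time = max(self.time, msg_time) + 1
        return self.time

p, q = LamportClock(), LamportClock()
ts_send = p.send()                     # e_p = send(m)
ts_recv = q.receive(ts_send)           # e_q = receive(m), so e_p -> e_q
assert ts_send < ts_recv               # clock condition holds across the message
```

The converse does not hold: LC(e) < LC(e') does not imply e → e', which is exactly why these clocks order events without detecting concurrency.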
Runs and Consistent Runs
- A run is a total ordering of the events in H that is consistent with the local histories of the processors (Ex: h_1, h_2, ..., h_n is a run)
- A run is consistent if the total order imposed in the run is an extension of the partial order induced by →
- A single distributed computation may correspond to several consistent runs!
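Consistency of a run is mechanical to check: index each event by its position in the run and verify that every happened-before pair appears in that order. A small sketch with hypothetical events a, b, c:

```python
def is_consistent_run(run, happened_before):
    """run: a total order (list) of all events.
    happened_before: set of pairs (e1, e2) with e1 -> e2.
    The run is consistent iff the total order extends the partial order."""
    position = {e: i for i, e in enumerate(run)}
    return all(position[e1] < position[e2] for e1, e2 in happened_before)

# a -> b locally on one process; b = send(m), c = receive(m), so b -> c
hb = {("a", "b"), ("b", "c")}
assert is_consistent_run(["a", "b", "c"], hb)
assert not is_consistent_run(["a", "c", "b"], hb)   # delivers m before its send
```

Events unrelated by → may appear in either order, which is why one computation admits several consistent runs.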
Cuts
A cut C is a subset of the global history H:
C = h_1^{c_1} ∪ h_2^{c_2} ∪ ... ∪ h_n^{c_n}
The frontier of C is the set of events {e_1^{c_1}, e_2^{c_2}, ..., e_n^{c_n}}
[Figure: space-time diagram with a cut drawn across the processes]
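The definitions of cut and frontier translate directly into code (the local histories and the counts c_i below are hypothetical examples):

```python
def cut(histories, c):
    """histories: process -> its local history (list of events, in order).
    c: process -> how many initial events the cut includes (the c_i).
    Returns the cut (a prefix of every local history) and its frontier."""
    the_cut = {p: h[:c[p]] for p, h in histories.items()}
    frontier = {p: h[c[p] - 1] for p, h in histories.items() if c[p] > 0}
    return the_cut, frontier

h = {"p1": ["e11", "e12", "e13"], "p2": ["e21", "e22"]}
C, F = cut(h, {"p1": 2, "p2": 1})
assert C == {"p1": ["e11", "e12"], "p2": ["e21"]}
assert F == {"p1": "e12", "p2": "e21"}   # frontier: last event of each prefix
```

A process whose c_i is 0 contributes its empty prefix and no frontier event.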
Global states and cuts
- The global state of a distributed computation is an n-tuple of local states: Σ = (σ_1, ..., σ_n)
- To each cut (c_1, ..., c_n) corresponds a global state (σ_1^{c_1}, ..., σ_n^{c_n})