Data Replication CS 188 Distributed Systems February 3, 2015


2 Some Other Possibilities
What if the machines sharing files are portable and not always connected?
What if the machines communicate across the Internet?
What if the load on some files is too heavy for a single machine?

3 An Answer to These Questions
Replicate the data
Keep multiple copies of the data on different machines
Depending on details, make different copies available for different purposes

4 How Does This Help?
What if the machines sharing files are portable and not always connected?
  Put a replica of the data on the portable machine
What if the machines communicate across the Internet?
  Avoid expensive cross-Internet traffic by having replicas on both sides
What if the load on some files is too heavy for a single machine?
  Share the load among multiple replicas

5 Other Replication Advantages
Reliability
  If one machine fails, replicas of its data might be elsewhere
Flexibility
  Easier to assign data workloads to storage resources

6 The Replication Concept
[Figure: several overlapping copies of the same document, each beginning "When in the course of human events it becomes necessary for one people to..."]
There is a conceptual object (like a file)
We keep more than one physical copy of it
  Maybe several
Each copy is meant to be a full representation of the object
  So accessing any should be the same as accessing any other

7 Replication and Caching
The two are obviously similar
Caching usually implies it's temporary
  Replication usually implies it's permanent
Caching is usually for local use only
  Replication is usually for more general use
These distinctions are not actually binary, though
  Permanent isn't always really permanent
  Some caches service multiple machines

8 There Are Some Differences
For example, invalidation on write is feasible for cached data
  It isn't feasible for replicated data
One can always throw away a cached copy of data (modulo local needs)
  One can't always throw away a replica
  Especially the only one

9 Replication and Reading
If the data is read-only, the replication problem is easy
  IF...
The problems arise if the data is ever written
Life then becomes much more complicated

10 Read-Only Replication
Merely ensure that all copies start off the same
They never change
Accessing any copy as good as any other
Still a problem of finding and choosing replicas to access

11 Read-Only Data and Metadata
Usually we treat file metadata as part of the file
Maybe the data is read-only
  But is the metadata?
  How about access permissions?
  How about access time?
If metadata can be updated, you still have issues

12 Choosing Read-Only Replicas
Mostly a performance question
  Which one is closest?
  Which one is least loaded?
Initial placement might make a big difference
And what if replicas can move?
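
One way to picture the "closest versus least loaded" trade-off is as a single cost per replica. The sketch below is only an illustration: the `Replica` fields, the `load_weight` parameter, and the linear cost formula are assumptions made for this example, not something the slides prescribe.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    name: str
    rtt_ms: float  # hypothetical measured round-trip time to the replica
    load: float    # hypothetical utilization in the range [0, 1]


def choose_replica(replicas, load_weight=100.0):
    """Return the replica with the lowest combined distance/load cost.

    The cost formula (rtt_ms + load_weight * load) is an assumption;
    a real system would tune or replace it.
    """
    if not replicas:
        raise ValueError("no replicas available")
    return min(replicas, key=lambda r: r.rtt_ms + load_weight * r.load)


replicas = [
    Replica("near-but-busy", rtt_ms=5.0, load=0.95),
    Replica("far-but-idle", rtt_ms=40.0, load=0.10),
]
best = choose_replica(replicas)
print(best.name)
```

With `load_weight=0.0` the choice reduces to "which one is closest"; larger weights push the choice toward "which one is least loaded".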

13 Varying Read-Only Replication Factors
We can add or delete read-only replicas easily
  Some issues regarding open files
When should we add a replica?
When should we delete a replica?
When should we move a replica to a different location?

14 Replication and Writing
Life becomes complicated when you write replicated data
Physically, the write occurs at one copy
Logically, the write should be applied to all copies
Going from the physical reality to the logical goal is challenging

15 Illustrating the Problem
[Figure: two replicas, yellow and blue, whose contents change from "When in the course of human events it becomes necessary for one people to..." to "Fourscore and seven years ago, our forefathers brought forth..."]
We write to the yellow replica
The yellow and blue replicas should be the same, but they aren't
What do we do?
Problem solved!
But...

16 A Fly in the Ointment
[Figure: one replica still reads "When in the course of human events it becomes necessary for one people to..."; the other is being updated to "Fourscore and seven years ago, our forefathers brought forth..."]
We've gotten ourselves into this state
What if the writer's next access is to the other replica?

17 A Worse Situation
[Figure: one replica reads "When in the course of human events it becomes necessary for one people to..."; the other reads "Fourscore and seven years ago, our forefathers brought forth..."]
What if someone else reads the other copy?

18 An Even Worse Situation
[Figure: one replica is overwritten with "Ask not what your country can do for you, but what you can do for your country..."; the other reads "Fourscore and seven years ago, our forefathers brought forth..."]
What if someone else writes the other copy?

19 These Situations Arose Before Distributed Computing
What if there are two processes on one machine?
What if they read a file and then both choose to write it?
Or one writes without the other's knowledge?
Still problematic, but easier to solve

20 Single Machine Solutions
Have only one copy of shared data
  Replication advantages less on a single machine, anyway
Use locks to control access to shared data
Both solutions rely on a single piece of storage that both parties consult
  So they don't work on two machines

21 Cross-Machine Locking
Why can't I just share a lock between two machines?
A lock is really a piece of data
  Saying who holds it
Either you store it on one machine or on both
Storing on just one leads to performance and reliability problems
Storing on both gets us back to our original problem
  But now the shared data is the lock itself

22 Primary Copy Options
Only allow writes to one replica
  So no issue of conflicting writes to different replicas
Doesn't solve the read/write concurrency problem
Issues if the primary copy fails
  Or if its server is overloaded
  Or if there are network partitions

23 A Diversion Into Clocks
Ultimately, these issues relate to the question of ordering events
What order do things happen in?
  In a distributed system
One form of ordering used a lot in the real world is time
Can we use time to solve our problem?

24 Time Services
One way to make things happen in order is to timestamp them
  Read a clock and slap a timestamp on the event
As in normal life, things only happen in time order
Possible solution for ordering distributed events

25 Time Services and Replication
Maybe we can slap a timestamp on every write
And maybe use timestamps to control reads
The timestamps of multiple writes control the order in which they occur
Doesn't solve all the problems, but does solve some
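
As a rough illustration of letting write timestamps control the order, the sketch below has each replica keep the timestamp of the write it currently holds and ignore anything older. This "newest timestamp wins" policy, the `TimestampedReplica` class, and the assumption that timestamps are unique and comparable across machines are all choices made for this example; the slides do not commit to a specific policy.

```python
class TimestampedReplica:
    """Holds one value plus the timestamp of the write that produced it."""

    def __init__(self):
        self.value = None
        self.timestamp = float("-inf")  # no write seen yet

    def apply_write(self, value, timestamp):
        """Apply the write only if it is newer than the stored one.

        Returns True if the write was applied, False if it was stale.
        """
        if timestamp > self.timestamp:
            self.value = value
            self.timestamp = timestamp
            return True
        return False


yellow, blue = TimestampedReplica(), TimestampedReplica()
writes = [("When in the course...", 1), ("Fourscore and seven...", 2)]

# The yellow replica sees the writes in timestamp order.
for value, ts in writes:
    yellow.apply_write(value, ts)

# The blue replica receives the same writes in the opposite order.
for value, ts in reversed(writes):
    blue.apply_write(value, ts)

print(yellow.value == blue.value)
```

Both replicas converge on the same final value regardless of delivery order, which is one of the problems timestamps can solve. It does nothing for a reader who consults a replica before the newer write arrives.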

26 Using a Clock
[Figure: three nodes (Node 1, Node 2, Node 3) read a clock showing 3:15, 3:22, and 3:27 and timestamp the messages they send ("To B", "To C")]
Now B can know the proper order of writes

27 The Problem With Clocks
A clock is (ultimately) a physical resource
  So it's in exactly one place
We use messages to access remote places
  And messages take varying amounts of time to get from one place to another
So, with a single clock, can't guarantee proper ordering

28 Solutions to Clock Problems
Physical clocks
Logical clocks

29 Physical Clocks
Each node keeps its own local clock
  Modern machines always have them, anyway
Stamp each synchronizable event with the local clock
Problem becomes keeping the clocks synchronized

30 Globally Accessible Clocks
In the general case, this usually means GPS clocks
GPS satellites broadcast highly accurate clock signals
  Over the entire Earth's surface
Anyone with a GPS receiver that's working can hear it

31 Pros and Cons of Physical Clocks
+ Simplicity
- Need constant access to clock
- Transmission errors/delays damage synchronization
- Requires strong knowledge of transmission delays
- Never possible to reduce clock skew to zero

32 Logical Clocks
Don't try to keep track of passage of actual time
Use a logical mechanism to keep track of proper order of events
Essentially, assign artificial timestamps that maintain the causality required for the computation

33 When Are Logical Clocks Useful?
When relative order of events is the issue
  Rather than relationship to wall clock time
Often the case for operations of distributed applications
  Not always when there is a relationship to the real world

34 Lamport Clocks
Fundamental logical clock system
Each process P_i has a clock C_i
Each event is assigned a time at its processor
→ is the happens-before relation
  a → b means a happened before b
If a → b, C(a) < C(b)

35 Implementing Lamport Clocks
Whenever an event occurs, increment the local clock
  Assign new value to event
But how do we provide the correct global view?
  Since processes live on different processors

36 Handling Messages in Lamport Clocks
Processes communicate only via send and receive of messages
  Which are events
If P_i sends to P_j, C_i(send) < C_j(receive)
  Since send must happen-before receive
How do we force that?

37 Rules for Lamport Clocks
1) If a → b within the same process, C(a) < C(b)
2) If a is a sending event in P_i and b is the corresponding receiving event in P_j, then C(a) < C(b)
Enforcing Rule 1 is easy, since it's on the same processor

38 Enforcing Rule 2
Timestamp outgoing messages with time of send
Receiver j adds increment d to maximum of message timestamp and local clock
  C_j = max(C(a), C_j) + d
  C(b) = C_j
Ensures that receive event b gets a clock value after send event a
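
The two rules can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the `LamportClock` class and its method names are invented for the example, and the increment d is assumed to be 1.

```python
class LamportClock:
    """Logical clock for one process, following Rules 1 and 2."""

    def __init__(self, increment=1):
        self.time = 0
        self.d = increment  # the increment d from Rule 2 (assumed to be 1)

    def local_event(self):
        # Rule 1: every event in a process gets a larger value than the last.
        self.time += self.d
        return self.time

    def send(self):
        # Sending is itself an event; the result is the message timestamp.
        return self.local_event()

    def receive(self, message_timestamp):
        # Rule 2: C_j = max(C(a), C_j) + d, and the receive event gets C_j.
        self.time = max(message_timestamp, self.time) + self.d
        return self.time


p_i, p_j = LamportClock(), LamportClock()
c_a = p_i.local_event()          # event a on process i
c_send = p_i.send()              # i sends a message to j
c_receive = p_j.receive(c_send)  # j receives it
print(c_a, c_send, c_receive)
```

Running this gives C(a) = 1, C(send) = 2, and C(receive) = 3, so C(a) < C(send) < C(receive) as the rules require.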

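39 Lamport Clocks Example 1
[Figure: timelines for processes i and j; on i, event a (clock 1) is followed by send (clock 2); the message, stamped 2, arrives at j as the receive event]
C(a) = 1, C(send) = 2, C(receive) = 3
C(a) < C(send)
C(send) < C(receive)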

40 Properties of Lamport Clocks
Happens-before is transitive
  If a → b and b → c, then a → c
If a → b, then C(a) < C(b)
But the converse is not true
  C(a) < C(b) does not imply a → b
How can that happen?

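41 Lamport Clock Example 2
[Figure: timelines for processes i and j with no messages between them; on i, events a (clock 1) and b (clock 2); on j, event d (clock 1)]
C(a) = 1, C(b) = 2, C(d) = 1
C(a) < C(b)
C(d) < C(b)????!!!!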

42 The Sad Truth About Distributed Systems Concurrency
Abandon all hope ye who enter here
You've got to forget your godlike view
In the absence of a physical clock, YOU CAN'T ORDER ALL EVENTS PROPERLY!!!!!!!!
But perhaps you don't believe that...

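43 Lamport Clock Example 3
[Figure: timelines for processes i and j with no messages between them; events a and b on i, event d on j, drawn in a different real-time order than in Example 2]
C(a) = 1, C(b) = 2, C(d) = 1
But the order of events was different than before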

44 Why Do We Have This Problem?
Not really because we aren't keeping a physical clock
It's because we aren't communicating enough to derive the order
If each process sent the other a message after each local event, our examples would have proper ordering

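45 Obtaining the Proper Order for Example 2
[Figure: on i, events a (clock 1), b (clock 2), and send (clock 3); the synchronization message, stamped 3, is received at j; d then occurs on j with clock 5]
Synchronize
C(a) < C(b), C(b) < C(d)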

46 And For Example 3
[Figure: on j, event d is followed by send; the synchronization message is received at i before events a and b]
Synchronize
C(d) < C(a), C(a) < C(b)

47 But There's a Problem
What if we have true concurrency?
What if an event occurs while a synchronization message is in transit?

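48 Lamport Clocks Example 4
[Figure: on j, event d (clock 1) is followed by send (clock 2); while the synchronization message is in transit, event a occurs on i with clock 1]
Synchronize
C(d) = C(a)
Because of concurrency, you can't win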

49 Lamport Clocks and Partial Orders
Basic Lamport clocks only give a partial order
  They don't order events with equal times
Easy to provide a full order
  Number all processes
  Concatenate process number to clock

50 In Our Examples,
Say process i is numbered 1 and process j is numbered 2
In example 1, no equal times
In example 2, C(a) = 1,1  C(b) = 2,1  C(d) = 1,2
So C(a) is ordered before C(d)
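
Concatenating the process number to the clock amounts to comparing (clock, process number) pairs, with the process number used only to break ties. A minimal sketch using Python tuples, which compare element by element; the `events` dictionary is just a convenient representation chosen for this example:

```python
# (Lamport clock value, process number) for the events in example 2,
# with process i numbered 1 and process j numbered 2.
events = {
    "a": (1, 1),
    "b": (2, 1),
    "d": (1, 2),
}

# Tuples compare on the clock first and on the process number only for ties.
order = sorted(events, key=events.get)
print(order)
```

This yields the total order a, d, b: a and d share clock value 1, and the lower process number puts a first.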

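51 Fully Ordered Clocks in Example 1
[Figure: the Example 1 timelines relabeled with full timestamps: a = 1,1 and send = 2,1 on i; the receive event on j is tagged with process number 2]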

52 Fully Ordered Clocks in Example 2
[Figure: the Example 2 timelines relabeled with full timestamps: a = 1,1 and b = 2,1 on i; d = 1,2 on j]
But d still ordered before b

53 Don't Read Too Much Into This Ordering
In example 3, C(a) = 1,1  C(b) = 2,1  C(d) = 1,2
C(a) is still ordered before C(d)
  Even though we know d happened first
This ordering is complete, but somewhat arbitrary

54 Fully Ordered Clocks in Example 3
[Figure: the Example 3 timelines relabeled with full timestamps: a = 1,1 and b = 2,1 on i; d = 1,2 on j]

55 Vector Logical Clocks
In normal Lamport clocks, C(a) < C(b) does not imply a happened before b
  a and b might be concurrent, instead
Vector clocks allow us to distinguish those cases
  At the cost of keeping more information

56 How Vector Clocks Work
Each process keeps a vector of clocks
For n total processes, n vector elements, one per process
Each element of the vector is the newest clock from that process seen locally
Comparisons are done on full vectors

57 Vector Clock Example
[Figure: timelines for processes i, j, and k; a = 1,0,0 and e = 2,0,0 on i; b = 0,1,0 and f = 2,2,0 on j; event d on k]
When message arrives, set each vector element separately
Greater-than operation matches Lamport criteria exactly
C(a) < C(e) < C(f)
C(b) < C(f)
But, C(a) !< C(b)
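
The vector values in this example can be reproduced with a short sketch. The `VectorClock` class and the helper functions are invented for illustration. The sketch also assumes that e is the send of a message from i and f is the matching receive on j, which is one reading of the figure that is consistent with the values shown.

```python
class VectorClock:
    """Vector clock for process `pid` out of `n` processes."""

    def __init__(self, n, pid):
        self.v = [0] * n
        self.pid = pid

    def local_event(self):
        self.v[self.pid] += 1
        return tuple(self.v)

    def send(self):
        # A send is a local event; the resulting vector travels with the message.
        return self.local_event()

    def receive(self, message_vector):
        # Set each element separately to the newest value seen, then count
        # the receive itself as an event of this process.
        self.v = [max(x, y) for x, y in zip(self.v, message_vector)]
        self.v[self.pid] += 1
        return tuple(self.v)


def happened_before(x, y):
    """True if vector x is less than or equal to y everywhere and x != y."""
    return x != y and all(p <= q for p, q in zip(x, y))


def concurrent(x, y):
    return x != y and not happened_before(x, y) and not happened_before(y, x)


i, j = VectorClock(3, 0), VectorClock(3, 1)
a = i.local_event()  # (1, 0, 0)
b = j.local_event()  # (0, 1, 0)
e = i.send()         # (2, 0, 0), assumed to be a send to j
f = j.receive(e)     # (2, 2, 0)
print(a, b, e, f, concurrent(a, b))
```

Here `happened_before(a, e)`, `happened_before(e, f)`, and `happened_before(b, f)` all hold, while a and b are reported as concurrent, which a plain Lamport clock could not distinguish.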

58 Vector Clocks Pros and Cons
+ Partial ordering only where causal relationships exist
- Higher overheads for clock storage and message transport
  Potentially killers for huge numbers of processes
- Tricky (not impossible) when number of processes changes

59 What's This Got to Do With Replication?
Writes can be clock events
We can use vector clocks to keep track of writes to multiple replicas
Doesn't prevent concurrent writes
  But does detect them
Which leads to the possibility of optimistic replication
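
A minimal sketch of how per-replica vectors can detect, but not prevent, concurrent writes. Each replica counts its own writes in its slot of a version vector; when two replicas compare vectors, either one dominates (so the older copy can simply be refreshed) or neither does (a conflict). The `Replica` class, the `compare` function, and the returned labels are assumptions for illustration, not the design of any particular file system.

```python
def compare(v1, v2):
    """Classify v1 relative to v2: 'equal', 'older', 'newer', or 'conflict'."""
    le = all(a <= b for a, b in zip(v1, v2))
    ge = all(a >= b for a, b in zip(v1, v2))
    if le and ge:
        return "equal"
    if le:
        return "older"
    if ge:
        return "newer"
    return "conflict"  # neither dominates: the writes were concurrent


class Replica:
    def __init__(self, index, n):
        self.index = index
        self.version = [0] * n  # one counter per replica
        self.data = None

    def write(self, data):
        # A local write is a clock event for this replica.
        self.version[self.index] += 1
        self.data = data

    def sync_from(self, other):
        """Pull from another replica if it is strictly newer; report the relation."""
        relation = compare(self.version, other.version)
        if relation == "older":
            self.data = other.data
            self.version = list(other.version)
        return relation


yellow, blue = Replica(0, 2), Replica(1, 2)

yellow.write("Fourscore and seven years ago...")
first = blue.sync_from(yellow)   # blue is behind, so it copies the update

yellow.write("Ask not what your country can do for you...")
blue.write("When in the course of human events...")
second = blue.sync_from(yellow)  # both wrote independently
print(first, second)
```

The first synchronization propagates the update; the second reports a conflict and leaves both copies untouched, since deciding how to reconcile them is a separate policy question.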

Example File Systems Using Replication CS 188 Distributed Systems February 10, 2015

Example Replicated File Systems
NFS
Coda
Ficus

NFS
Originally NFS did not have any replication capability