Prophecy: Using History for High-Throughput Fault Tolerance

1 Prophecy: Using History for High-Throughput Fault Tolerance. Siddhartha Sen. Joint work with Wyatt Lloyd and Mike Freedman. Princeton University

2 Non-crash failures happen

3 Non-crash failures happen Model as Byzantine (malicious)

4 Mask Byzantine faults Service

5 Mask Byzantine faults Throughput Replicated service

9 Mask Byzantine faults Throughput Linearizability (strong consistency) Replicated service

10 Byzantine fault tolerance (BFT) Low throughput Modifies clients Long-lived sessions

11 Prophecy High throughput + good consistency No free lunch: Read-mostly workloads Slightly weakened consistency

12 Byzantine fault tolerance (BFT) Low throughput Modifies clients Long-lived sessions D-Prophecy Prophecy

13 Traditional BFT reads application Replica Group

14 Traditional BFT reads application Agree? Replica Group

15 A cache solution cache application Replica Group

16 A cache solution cache application Agree? Replica Group

17 A cache solution cache application Agree? Problems: Huge cache, Invalidation Replica Group

18 A compact cache cache application Requests req1 req2 req3 Responses resp1 resp2 resp3 Replica Group

19 A compact cache cache application Requests sketch(req1) sketch(req2) sketch(req3) Responses sketch(resp1) sketch(resp2) sketch(resp3) Replica Group
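
The compact cache on this slide can be illustrated with a small table that stores sketch(request) and sketch(response) pairs rather than full bodies. This is only a sketch under assumptions: the slides do not say which hash function or digest length is used, so the truncated SHA-256 below and the names `SketchTable`, `record`, `lookup` and `matches` are hypothetical.

```python
import hashlib
from typing import Dict, Optional

SKETCH_BYTES = 8  # assumed digest length; not taken from the slides


def sketch(data: bytes) -> bytes:
    """Return a short, fixed-size digest of a request or response."""
    return hashlib.sha256(data).digest()[:SKETCH_BYTES]


class SketchTable:
    """Maps sketch(request) to sketch(response) instead of storing full bodies."""

    def __init__(self) -> None:
        self._entries: Dict[bytes, bytes] = {}

    def record(self, request: bytes, response: bytes) -> None:
        """Remember the sketch of the response last agreed on for this request."""
        self._entries[sketch(request)] = sketch(response)

    def lookup(self, request: bytes) -> Optional[bytes]:
        """Return the stored response sketch, or None if the request is unknown."""
        return self._entries.get(sketch(request))

    def matches(self, request: bytes, response: bytes) -> bool:
        """True if the response hashes to the sketch stored for this request."""
        expected = self.lookup(request)
        return expected is not None and expected == sketch(response)
```

Each entry costs a fixed number of bytes regardless of how large the page is, which is what addresses the "huge cache" problem from the earlier slide.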

20 A sketcher sketcher application Replica Group

21 Executing a read sketch webpage Replica Group

24 Executing a read sketch webpage Agree? Replica Group

25 Executing a read sketch webpage Agree? Fast, load-balanced reads Replica Group

28 Executing a read sketch webpage key value store replicated state machine Replica Group

33 Executing a read sketch webpage Agree? Maintain a fresh cache Replica Group
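
The read path walked through on these slides can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: a replica is modelled as a plain function, a simple vote over f + 1 matching responses stands in for the real agreement protocol, and the names `read`, `replicated_read` and `Replica` are hypothetical.

```python
import hashlib
import random
from collections import Counter
from typing import Callable, Dict, List

# Hypothetical model: a replica is a function that executes a read request.
Replica = Callable[[bytes], bytes]


def sketch(data: bytes) -> bytes:
    """Short digest of a request or response (assumed: truncated SHA-256)."""
    return hashlib.sha256(data).digest()[:8]


def replicated_read(replicas: List[Replica], request: bytes, f: int) -> bytes:
    """Stand-in for agreement: accept a response given by at least f + 1 replicas."""
    counts = Counter(replica(request) for replica in replicas)
    response, votes = counts.most_common(1)[0]
    if votes < f + 1:
        raise RuntimeError("no response reached f + 1 matching votes")
    return response


def read(replicas: List[Replica], table: Dict[bytes, bytes],
         request: bytes, f: int) -> bytes:
    """Try a fast read against one replica; fall back to a replicated read."""
    key = sketch(request)
    expected = table.get(key)
    if expected is not None:
        # Fast path: a single randomly chosen replica executes the read.
        response = random.choice(replicas)(request)
        if sketch(response) == expected:
            return response
    # Slow path: all replicas execute the read, then the table is refreshed.
    response = replicated_read(replicas, request, f)
    table[key] = sketch(response)
    return response
```

Picking the replica at random is what spreads fast reads across the group, and refreshing the table on the slow path is one way to keep the cache fresh after the content changes.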

34 Did we achieve linearizability? NO!

40 Executing a read sketch webpage Agree? Fast reads may be stale Replica Group

41 Load balancing sketch webpage Replica Group

42 Load balancing sketch webpage Agree? Replica Group

43 Load balancing sketch webpage Agree? Pr(k stale) = g^k Replica Group
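
A small worked example may help with the formula on this slide, read here as g raised to the power k. It assumes that g is the fraction of replicas that return stale data and that each fast read picks its replica independently and uniformly at random; the slide does not define g, so both points are assumptions, and the function name `prob_all_stale` is hypothetical.

```python
def prob_all_stale(g: float, k: int) -> float:
    """Probability that k consecutive fast reads all hit a stale replica."""
    if not 0.0 <= g <= 1.0:
        raise ValueError("g must be a fraction between 0 and 1")
    if k < 0:
        raise ValueError("k must be non-negative")
    return g ** k


# With 1 of 4 replicas returning stale data, three stale reads in a row
# happen with probability 0.25 ** 3.
print(prob_all_stale(0.25, 3))  # prints 0.015625
```

Under these assumptions the chance of repeatedly seeing stale data falls off geometrically with the number of reads.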

44 D-Prophecy vs. BFT Traditional BFT: Each replica executes read Linearizability Replica Group D-Prophecy: One replica executes read Delay-once linearizability

45 Byzantine fault tolerance (BFT) Low throughput Modifies clients Long-lived sessions D-Prophecy Prophecy

46 Key exchange overhead

47 Key exchange overhead 11%

48 Key exchange overhead 3% 11%

49 Internet services Replica Group

50 A proxy solution Sketcher Proxy Replica Group

51 A proxy solution Consolidate sketchers Sketcher Proxy Replica Group

52 A proxy solution Consolidate sketchers Sketcher Replica Group

53 A proxy solution Sketcher must be fail-stop Sketcher Trusted Replica Group

54 A proxy solution Sketcher must be fail-stop Trust middlebox already Small and simple Sketcher Trusted Replica Group

55 Executing a read q Sketcher Trusted Replica Group

56 Executing a read Sketcher Trusted Replica Group

58 Executing a read Sketcher Req s(q) ( ) Trusted Resp Replica Group

62 Executing a read Sketcher Req s(q) ( ) Trusted Resp Replica Group

64 Prophecy Sketcher Trusted Replica Group

65 Prophecy Fast, load-balanced reads Sketcher Trusted Replica Group

66 Prophecy Fast reads may be stale Sketcher Req s(q) ( ) Trusted Resp Replica Group

67 Delay-once linearizability

69 Delay-once linearizability W, R, W, W, R, R, W, R

70 Delay-once linearizability Read-after-write property W, R, W, W, R, R, W, R

72 Example application Upload embarrassing photos 1. Remove colleagues from ACL 2. Upload photos 3. (Refresh) Weak may reorder Delay-once preserves order

73 Byzantine fault tolerance (BFT) Low throughput Modifies clients Long-lived sessions D-Prophecy Prophecy

74 Implementation Modified PBFT PBFT is stable, complete Competitive with Zyzzyva et al. C++, Tamer async I/O Sketcher: 2000 LOC PBFT library: 1140 LOC PBFT client: 1000 LOC

75 Evaluation Prophecy vs. proxied PBFT Proxied systems D-Prophecy vs. PBFT Non-proxied systems

76 Evaluation Prophecy vs. proxied PBFT Proxied systems We will study: Performance on null workloads Performance with real replicated service Where system bottlenecks, how to scale

77 Basic setup Sketcher (100) (concurrent) Replica Group (PBFT)

79 Fraction of failed fast reads

80 Alexa top sites: < 15% Fraction of failed fast reads

81 Small benefit on null reads

83 Apache webserver setup Sketcher Replica Group

84 Large benefit on real workload

85 Large benefit on real workload 3.7x

86 Large benefit on real workload 3.7x 2.0x

88 Benefit grows with work

91 Benefit grows with work 94μs (Apache)

92 Benefit grows with work 94μs (Apache) Null workloads are misleading!

94 Single sketcher bottlenecks

96 Scaling out

97 Scales linearly with replicas

98 Summary Prophecy good for Internet services Fast, load-balanced reads D-Prophecy good for traditional services Prophecy scales linearly while PBFT stays flat Limitations: Read-mostly workloads (meas. study corroborates) Delay-once linearizability (useful for many apps)

99 Thank You

100 Additional slides

101 Transitions Prophecy good for read-mostly workloads Are transitions rare in practice?

102 Measurement study Alexa top sites Access main page every 20 sec for 24 hrs
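
One way to turn such a trace into a transition rate is sketched below. The slides do not say how transitions were counted, so this assumes a transition is any change in page content between two consecutive fetches of the same page; the names `transition_fraction` and `SAMPLES_PER_DAY` are hypothetical.

```python
import hashlib
from typing import Iterable

# One fetch every 20 seconds for 24 hours gives 4320 samples per site.
SAMPLES_PER_DAY = 24 * 60 * 60 // 20


def transition_fraction(snapshots: Iterable[bytes]) -> float:
    """Fraction of consecutive snapshot pairs whose content differs."""
    digests = [hashlib.sha256(page).digest() for page in snapshots]
    if len(digests) < 2:
        return 0.0  # fewer than two fetches: no pair to compare
    changes = sum(1 for prev, cur in zip(digests, digests[1:]) if prev != cur)
    return changes / (len(digests) - 1)


# Five fetches with one content change among the four consecutive pairs.
print(transition_fraction([b"a", b"a", b"b", b"b", b"b"]))  # prints 0.25
```

A low fraction for a site suggests that most of its reads would find a matching sketch and take the fast path.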

103 Mostly static content

105 Mostly static content 15%

106 Dynamic content Rabin fingerprinting on transitions 43% differ by single contiguous change Sampled 4000 of them, over half due to: Load balancing directives Random IDs in links, function parameters
