Verdi: A Framework for Implementing and Formally Verifying Distributed Systems

1 Verdi: A Framework for Implementing and Formally Verifying Distributed Systems. James R. Wilcox, Doug Woos, Pavel Panchekha, Zach Tatlock, Xi Wang, Michael D. Ernst, Thomas Anderson

2-6 Challenges: Distributed systems run in unreliable environments. Many types of failure can occur. Fault-tolerance mechanisms are challenging to implement correctly.

7 Contributions (answering these challenges): Formalize the network as an operational semantics. Build semantics for a variety of fault models. Verify fault-tolerance as a transformation between semantics.

8-10 Verdi Workflow: Build and verify the system in a simple semantics (Client I/O, Key-value store). Apply a verified system transformer (VST), yielding KV + Consensus running on each node. End-to-end correctness follows by composition.

11 Contributions: Formalize the network as an operational semantics. Build semantics for a variety of fault models. Verify fault-tolerance as a transformation between semantics.

12 General Approach: Find the environments in your problem domain. Formalize these environments as operational semantics. Verify layers as transformations between semantics.

13-14 Verdi Successes. Applications: key-value store, lock service. Fault-tolerance mechanisms: sequence numbering, retransmission, primary-backup replication, consensus-based replication (linearizability).

15 Important data

16-20 Replicated KV store: the store is replicated across nodes for availability, but the environment is unreliable: machines crash, and messages are reordered, dropped, duplicated, or partitioned. Despite decades of research, such systems are still difficult to implement correctly, and implementations often have bugs.

21-24 Bug-free Implementations: there have been several inspiring successes in formal verification (CompCert, seL4, Jitk, Bedrock, Ironclad, Frenetic, Quark). Goal: formally verify distributed system implementations.

25-30 Formally Verify Distributed Implementations: separate independent system components, and verify the application logic (the key-value store) independently from the fault-tolerance mechanism (consensus). The plan: 1. Verify the application logic. 2. Verify the fault-tolerance mechanism. 3. Run the system!

31-33 1. Verify Application Logic: in a simple model, prove the key-value store implements a good map (Client I/O, Key-value store).

34-39 2. Verify Fault Tolerance Mechanism: apply a verified system transformer (VST) to the verified key-value store and prove its properties are preserved; the result runs KV + Consensus on each node, and end-to-end correctness follows by composition.

40 3. Run the System! Extract to OCaml, link the unverified shim, and run on real networks.

41 Verifying application logic

42-46 Simple One-node Model: on input "Set k v", the key-value state goes from {} to {k: v} and the node outputs "Resp k v". Trace: ["Set k v", "Resp k v"].

47-56 Simple One-node Model: the handler H_inp maps the system state σ and an input i to a new state σ′ and an output o, and each step records the input and output in the trace T:

    H_inp(σ, i) = (σ′, o)
    -------------------------------- (Input)
    (σ, T) ⟶_s (σ′, T ++ ⟨i, o⟩)

57-59 Simple One-node Model. Spec: operations have the expected behavior of a good map (e.g., a Get after a Set returns the value set; a Get after a Del finds nothing). Verify the system against the semantics by induction on executions to establish the safety property.
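The one-node model above can be sketched in plain code. Verdi systems are written in Coq and extracted to OCaml; this Python sketch only illustrates the shape of the handler and the trace, with all names invented here:

```python
# Sketch of the one-node semantics: a handler maps (state, input) to
# (state', output), and each step appends (input, output) to the trace.
# Illustrative only; these names do not come from the Verdi codebase.

def handle_input(state, inp):
    """Key-value handler: state is a dict, inp is a tuple like ('set', k, v)."""
    op = inp[0]
    if op == "set":
        _, k, v = inp
        return {**state, k: v}, ("resp", k, v)
    if op == "get":
        _, k = inp
        return state, ("resp", k, state.get(k))
    if op == "del":
        _, k = inp
        return {key: val for key, val in state.items() if key != k}, ("resp", k, None)
    raise ValueError(f"unknown op {op!r}")

def step(config, inp):
    """One step of the semantics: thread the state, extend the trace."""
    state, trace = config
    new_state, out = handle_input(state, inp)
    return new_state, trace + [(inp, out)]

def run(inputs):
    """Run a sequence of inputs from the empty state, collecting the trace."""
    config = ({}, [])
    for inp in inputs:
        config = step(config, inp)
    return config
```

Running `run([("set", "k", "v"), ("get", "k")])` ends with `k` bound to `v` and a two-entry trace, matching the good-map spec (a Get after a Set returns the value set).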

60 Verifying Fault Tolerance

61-63 The Raft Transformer: consensus provides a replicated state machine. Each node maintains a log of operations over the original system, the same inputs are applied on each node, and each node calls into the original system.

64 When an input is received: add it to the log and send it to the other nodes. When an op is replicated: apply it to the state machine and send the output.

65 For the KV store: the ops are Get, Set, and Del, and the state is the dictionary.

66 Raft Correctness: the transformer (VST) correctly transforms systems and preserves traces, which yields linearizability.
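The recipe above (add each input to the log, and once it is replicated apply it to the state machine) can be sketched as a replay function, assuming the consensus layer has already agreed on the log order. This is a toy model with invented names, not Verdi's verified Raft:

```python
# Toy model of the consensus transformer's replay step: once a log
# prefix is replicated, every node applies the same ops in the same
# order to its copy of the original state machine. Because replay is
# deterministic, all nodes reach the same state.

def apply_op(state, op):
    """Apply one op (get/set/del) to a dict state; return (state', output)."""
    kind = op[0]
    if kind == "set":
        _, k, v = op
        return {**state, k: v}, v
    if kind == "get":
        _, k = op
        return state, state.get(k)
    if kind == "del":
        _, k = op
        return {key: val for key, val in state.items() if key != k}, None
    raise ValueError(f"unknown op {kind!r}")

def replay(log):
    """Replay a replicated log prefix from the empty state."""
    state, outputs = {}, []
    for op in log:
        state, out = apply_op(state, op)
        outputs.append(out)
    return state, outputs
```

Two nodes that replay the same log prefix compute identical states and outputs, which is the agreement property the Raft transformer provides.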

67-70 Fault Model: model the global state, internal communication, and failure. Machines have names, and Σ maps each name to that machine's state (Σ[1], Σ[2], Σ[3]).

71-83 Fault Model: Messages. Node 1 sends Vote? to nodes 2 and 3, so the network holds the packets <1,2,Vote?> and <1,3,Vote?>. Delivering <1,2,Vote?> runs node 2's handler, which updates that node's state (Σ′[2] = σ′), produces output o, and may send new packets such as <2,1,+1>:

    H_net(dst, Σ[dst], src, m) = (σ′, o, P′)    Σ′ = Σ[dst ↦ σ′]
    ------------------------------------------------------------ (Deliver)
    ({(src, dst, m)} ⊎ P, Σ, T) ⟶_r (P ⊎ P′, Σ′, T ++ ⟨o⟩)

84-87 Fault Model: Failures. Messages can be dropped, messages can be duplicated, and machines can crash.

88-90 Fault Model: Drop. Any packet p in flight may disappear from the network:

    ({p} ⊎ P, Σ, T) ⟶_drop (P, Σ, T)    (Drop)
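The packet-multiset semantics can be sketched as a tiny simulator: the network is a bag of (src, dst, msg) packets, and deliver, drop, and duplicate are steps on that bag. This is an illustrative model with invented names, not Verdi's Coq semantics:

```python
# The network is a multiset of packets, modeled as a list of tuples
# (src, dst, msg). Each fault-model step rearranges this multiset,
# mirroring the (P, Sigma, T) rules on the slides.

def drop(packet, network):
    """Drop: remove one occurrence of packet from the network."""
    out = list(network)
    out.remove(packet)  # removes a single occurrence
    return out

def duplicate(packet, network):
    """Duplicate: a packet in flight may appear twice."""
    return list(network) + [packet]

def deliver(packet, handler, network):
    """Deliver: consume the packet and run the destination's handler,
    which may emit new packets (like H_net producing P')."""
    new_packets = handler(packet)
    return drop(packet, network) + new_packets
```

For example, delivering <1,2,Vote?> with a handler that replies <2,1,+1> removes the vote request from the bag and adds the reply.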

91 Toward Verifying Raft: a general theory of linearizability; 1k lines of implementation and 5k lines of proof for linearizability; state machine safety takes 30k lines. Most state invariants are proved; some are left to do.

92-97 Verified System Transformers: functions on systems that transform them between semantics while maintaining equivalent traces, so the correctness of the transformed system comes for free. The stack built so far: Raft consensus, primary-backup, sequence numbering and retransmission, and ghost variables, layered over the application.
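One of the simpler transformers, sequence numbering with deduplication, can be sketched as a wrapper around a handler: tag each message with a sequence number and ignore numbers already seen. This is a sketch of the idea; Verdi's actual transformer is defined and verified in Coq, and all names below are invented:

```python
# Sketch of a sequence-numbering transformer: wrap an underlying
# message handler so duplicated deliveries are filtered out. The
# wrapped state pairs the inner state with the set of sequence
# numbers already processed.

def wrap_handler(handle):
    """Lift a handler on inner state to one that deduplicates by seqno."""
    def wrapped(state, seqno, msg):
        inner, seen = state
        if seqno in seen:
            return (inner, seen)          # duplicate delivery: ignore
        return (handle(inner, msg), seen | {seqno})
    return wrapped

def initial(inner):
    """Initial wrapped state: inner state plus an empty seen-set."""
    return (inner, frozenset())
```

Delivering the same (seqno, msg) twice leaves the inner state as if it arrived once, which is the trace equivalence the transformer must prove is preserved.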

98-99 Running Verdi Programs: Coq extraction to OCaml, linked against a thin, unverified shim. Trusted computing base: the shim, Coq, OCaml, and the OS.

100 Performance Evaluation: compared with etcd, a similar open-source store, Verdi's store has about 10% performance overhead and is mostly disk/network bound. (etcd itself has had linearizability bugs.)

101 Previous Approaches: EventML [Schiper 2014], which verified Paxos using the Nuprl proof assistant; Mace [Killian 2007], model checking of distributed systems in C++; TLA+ [Lamport 2002], a specification language and logic.

102-104 Contributions: Formalize the network as an operational semantics. Build semantics for a variety of fault models. Verify fault-tolerance as a transformation between semantics. Thanks!

Programming and Proving with Distributed Protocols Disel: Distributed Separation Logic.

Programming and Proving with Distributed Protocols Disel: Distributed Separation Logic. Programming and Proving with Distributed Protocols Disel: Distributed Separation Logic ` {P } c {Q} http://distributedcomponents.net Ilya Sergey James R. Wilcox Zach Tatlock Distributed Systems Distributed

More information

Programming Language Abstractions for Modularly Verified Distributed Systems. James R. Wilcox Zach Tatlock Ilya Sergey

Programming Language Abstractions for Modularly Verified Distributed Systems. James R. Wilcox Zach Tatlock Ilya Sergey Programming Language Abstractions for Modularly Verified Distributed Systems ` {P } c {Q} James R. Wilcox Zach Tatlock Ilya Sergey Distributed Systems Distributed Infrastructure Distributed Applications

More information

An Empirical Study on the Correctness of Formally Verified Distributed Systems. Pedro Fonseca, Kaiyuan Zhang, Xi Wang, Arvind Krishnamurthy

An Empirical Study on the Correctness of Formally Verified Distributed Systems. Pedro Fonseca, Kaiyuan Zhang, Xi Wang, Arvind Krishnamurthy An Empirical Study on the Correctness of Formally Verified Distributed Systems Pedro Fonseca, Kaiyuan Zhang, Xi Wang, Arvind Krishnamurthy We need robust distributed systems Distributed systems are critical!

More information

Browser Security Guarantees through Formal Shim Verification

Browser Security Guarantees through Formal Shim Verification Browser Security Guarantees through Formal Shim Verification Dongseok Jang Zachary Tatlock Sorin Lerner UC San Diego Browsers: Critical Infrastructure Ubiquitous: many platforms, sensitive apps Vulnerable:

More information

Designing Systems for Push-Button Verification

Designing Systems for Push-Button Verification Designing Systems for Push-Button Verification Luke Nelson, Helgi Sigurbjarnarson, Xi Wang Joint work with James Bornholt, Dylan Johnson, Arvind Krishnamurthy, EminaTorlak, Kaiyuan Zhang Formal verification

More information

Hyperkernel: Push-Button Verification of an OS Kernel

Hyperkernel: Push-Button Verification of an OS Kernel Hyperkernel: Push-Button Verification of an OS Kernel Luke Nelson, Helgi Sigurbjarnarson, Kaiyuan Zhang, Dylan Johnson, James Bornholt, Emina Torlak, and Xi Wang The OS Kernel is a critical component Essential

More information

There Is More Consensus in Egalitarian Parliaments

There Is More Consensus in Egalitarian Parliaments There Is More Consensus in Egalitarian Parliaments Iulian Moraru, David Andersen, Michael Kaminsky Carnegie Mellon University Intel Labs Fault tolerance Redundancy State Machine Replication 3 State Machine

More information

IronFleet. Dmitry Bondarenko, Yixuan Chen

IronFleet. Dmitry Bondarenko, Yixuan Chen IronFleet Dmitry Bondarenko, Yixuan Chen A short survey How many people have the confidence that your Paxos implementation has no bug? How many people have proved that the implementation is correct? Why

More information

Replication in Distributed Systems

Replication in Distributed Systems Replication in Distributed Systems Replication Basics Multiple copies of data kept in different nodes A set of replicas holding copies of a data Nodes can be physically very close or distributed all over

More information

Implementing a Verified On-Disk Hash Table

Implementing a Verified On-Disk Hash Table Implementing a Verified On-Disk Hash Table Stephanie Wang Abstract As more and more software is written every day, so too are bugs. Software verification is a way of using formal mathematical methods to

More information

Toward Rigorous Design of Domain-Specific Distributed Systems

Toward Rigorous Design of Domain-Specific Distributed Systems FormaliSE 16 Toward Rigorous Design of Domain-Specific Distributed Systems Mohammed S. Al-Mahfoudh Ganesh Gopalakrishnan Ryan Statesman The University of Utah Outline Intro Nowadays situation solutions:

More information

An Empirical Study on the Correctness of Formally Verified Distributed Systems

An Empirical Study on the Correctness of Formally Verified Distributed Systems An Empirical Study on the Correctness of Formally Verified Distributed Systems Pedro Fonseca Kaiyuan Zhang Xi Wang Arvind Krishnamurthy University of Washington {pfonseca, kaiyuanz, xi, arvind}@cs.washington.edu

More information

Push-Button Verification of File Systems

Push-Button Verification of File Systems 1 / 25 Push-Button Verification of File Systems via Crash Refinement Helgi Sigurbjarnarson, James Bornholt, Emina Torlak, Xi Wang University of Washington October 26, 2016 2 / 25 File systems are hard

More information

Byzantine Fault Tolerant Raft

Byzantine Fault Tolerant Raft Abstract Byzantine Fault Tolerant Raft Dennis Wang, Nina Tai, Yicheng An {dwang22, ninatai, yicheng} @stanford.edu https://github.com/g60726/zatt For this project, we modified the original Raft design

More information

Push-Button Verification of File Systems

Push-Button Verification of File Systems 1 / 24 Push-Button Verification of File Systems via Crash Refinement Helgi Sigurbjarnarson, James Bornholt, Emina Torlak, Xi Wang University of Washington 2 / 24 File systems are hard to get right Complex

More information

EECS 498 Introduction to Distributed Systems

EECS 498 Introduction to Distributed Systems EECS 498 Introduction to Distributed Systems Fall 2017 Harsha V. Madhyastha Replicated State Machines Logical clocks Primary/ Backup Paxos? 0 1 (N-1)/2 No. of tolerable failures October 11, 2017 EECS 498

More information

A Reliable Broadcast System

A Reliable Broadcast System A Reliable Broadcast System Yuchen Dai, Xiayi Huang, Diansan Zhou Department of Computer Sciences and Engineering Santa Clara University December 10 2013 Table of Contents 2 Introduction......3 2.1 Objective...3

More information

Programming Distributed Systems

Programming Distributed Systems Annette Bieniusa Programming Distributed Systems Summer Term 2018 1/ 26 Programming Distributed Systems 09 Testing Distributed Systems Annette Bieniusa AG Softech FB Informatik TU Kaiserslautern Summer

More information

picoq: Parallel Regression Proving for Large-Scale Verification Projects

picoq: Parallel Regression Proving for Large-Scale Verification Projects picoq: Parallel Regression Proving for Large-Scale Verification Projects Karl Palmskog, Ahmet Celik, and Milos Gligoric The University of Texas at Austin, USA 1 / 29 Introduction Verification Using Proof

More information

Exploiting Commutativity For Practical Fast Replication. Seo Jin Park and John Ousterhout

Exploiting Commutativity For Practical Fast Replication. Seo Jin Park and John Ousterhout Exploiting Commutativity For Practical Fast Replication Seo Jin Park and John Ousterhout Overview Problem: replication adds latency and throughput overheads CURP: Consistent Unordered Replication Protocol

More information

Two phase commit protocol. Two phase commit protocol. Recall: Linearizability (Strong Consistency) Consensus

Two phase commit protocol. Two phase commit protocol. Recall: Linearizability (Strong Consistency) Consensus Recall: Linearizability (Strong Consistency) Consensus COS 518: Advanced Computer Systems Lecture 4 Provide behavior of a single copy of object: Read should urn the most recent write Subsequent reads should

More information

Velisarios: Byzantine Fault-Tolerant Protocols Powered by Coq

Velisarios: Byzantine Fault-Tolerant Protocols Powered by Coq Velisarios: Byzantine Fault-Tolerant Protocols Powered by Coq Vincent Rahli, Ivana Vukotic, Marcus Völp, Paulo Esteves-Verissimo SnT, University of Luxembourg, Esch-sur-Alzette, Luxembourg firstname.lastname@uni.lu

More information

Failures, Elections, and Raft

Failures, Elections, and Raft Failures, Elections, and Raft CS 8 XI Copyright 06 Thomas W. Doeppner, Rodrigo Fonseca. All rights reserved. Distributed Banking SFO add interest based on current balance PVD deposit $000 CS 8 XI Copyright

More information

CS 138: Practical Byzantine Consensus. CS 138 XX 1 Copyright 2017 Thomas W. Doeppner. All rights reserved.

CS 138: Practical Byzantine Consensus. CS 138 XX 1 Copyright 2017 Thomas W. Doeppner. All rights reserved. CS 138: Practical Byzantine Consensus CS 138 XX 1 Copyright 2017 Thomas W. Doeppner. All rights reserved. Scenario Asynchronous system Signed messages s are state machines It has to be practical CS 138

More information

Strong Consistency & CAP Theorem

Strong Consistency & CAP Theorem Strong Consistency & CAP Theorem CS 240: Computing Systems and Concurrency Lecture 15 Marco Canini Credits: Michael Freedman and Kyle Jamieson developed much of the original material. Consistency models

More information

Remote Procedure Call. Tom Anderson

Remote Procedure Call. Tom Anderson Remote Procedure Call Tom Anderson Why Are Distributed Systems Hard? Asynchrony Different nodes run at different speeds Messages can be unpredictably, arbitrarily delayed Failures (partial and ambiguous)

More information

Translation Validation for a Verified OS Kernel

Translation Validation for a Verified OS Kernel To appear in PLDI 13 Translation Validation for a Verified OS Kernel Thomas Sewell 1, Magnus Myreen 2, Gerwin Klein 1 1 NICTA, Australia 2 University of Cambridge, UK L4.verified sel4 = a formally verified

More information

Developing Correctly Replicated Databases Using Formal Tools

Developing Correctly Replicated Databases Using Formal Tools Developing Correctly Replicated Databases Using Formal Tools Nicolas Schiper Vincent Rahli Robbert Van Renesse Mark Bickford Robert L. Constable Department of Computer Science, Cornell University Abstract

More information

Distributed Consensus Protocols

Distributed Consensus Protocols Distributed Consensus Protocols ABSTRACT In this paper, I compare Paxos, the most popular and influential of distributed consensus protocols, and Raft, a fairly new protocol that is considered to be a

More information

Failure models. Byzantine Fault Tolerance. What can go wrong? Paxos is fail-stop tolerant. BFT model. BFT replication 5/25/18

Failure models. Byzantine Fault Tolerance. What can go wrong? Paxos is fail-stop tolerant. BFT model. BFT replication 5/25/18 Failure models Byzantine Fault Tolerance Fail-stop: nodes either execute the protocol correctly or just stop Byzantine failures: nodes can behave in any arbitrary way Send illegal messages, try to trick

More information

Byzantine Fault Tolerance

Byzantine Fault Tolerance Byzantine Fault Tolerance CS6450: Distributed Systems Lecture 10 Ryan Stutsman Material taken/derived from Princeton COS-418 materials created by Michael Freedman and Kyle Jamieson at Princeton University.

More information

Recall our 2PC commit problem. Recall our 2PC commit problem. Doing failover correctly isn t easy. Consensus I. FLP Impossibility, Paxos

Recall our 2PC commit problem. Recall our 2PC commit problem. Doing failover correctly isn t easy. Consensus I. FLP Impossibility, Paxos Consensus I Recall our 2PC commit problem FLP Impossibility, Paxos Client C 1 C à TC: go! COS 418: Distributed Systems Lecture 7 Michael Freedman Bank A B 2 TC à A, B: prepare! 3 A, B à P: yes or no 4

More information

CS6450: Distributed Systems Lecture 15. Ryan Stutsman

CS6450: Distributed Systems Lecture 15. Ryan Stutsman Strong Consistency CS6450: Distributed Systems Lecture 15 Ryan Stutsman Material taken/derived from Princeton COS-418 materials created by Michael Freedman and Kyle Jamieson at Princeton University. Licensed

More information

Causal Consistency and Two-Phase Commit

Causal Consistency and Two-Phase Commit Causal Consistency and Two-Phase Commit CS 240: Computing Systems and Concurrency Lecture 16 Marco Canini Credits: Michael Freedman and Kyle Jamieson developed much of the original material. Consistency

More information

Runway. A new tool for distributed systems design. Diego Ongaro Lead Software Engineer, Compute

Runway. A new tool for distributed systems design. Diego Ongaro Lead Software Engineer, Compute Runway A new tool for distributed systems design Diego Ongaro Lead Software Engineer, Compute Infrastructure @ongardie https://runway.systems Outline 1. Why we need new tools for distributed systems design

More information

Validating Key-value Based Implementations of the Raft Consensus Algorithm for Distributed Systems

Validating Key-value Based Implementations of the Raft Consensus Algorithm for Distributed Systems San Jose State University SJSU ScholarWorks Master's Projects Master's Theses and Graduate Research Spring 2018 Validating Key-value Based Implementations of the Raft Consensus Algorithm for Distributed

More information

Prophecy: Using History for High Throughput Fault Tolerance

Prophecy: Using History for High Throughput Fault Tolerance Prophecy: Using History for High Throughput Fault Tolerance Siddhartha Sen Joint work with Wyatt Lloyd and Mike Freedman Princeton University Non crash failures happen Non crash failures happen Model as

More information

Basic vs. Reliable Multicast

Basic vs. Reliable Multicast Basic vs. Reliable Multicast Basic multicast does not consider process crashes. Reliable multicast does. So far, we considered the basic versions of ordered multicasts. What about the reliable versions?

More information

Static and User-Extensible Proof Checking. Antonis Stampoulis Zhong Shao Yale University POPL 2012

Static and User-Extensible Proof Checking. Antonis Stampoulis Zhong Shao Yale University POPL 2012 Static and User-Extensible Proof Checking Antonis Stampoulis Zhong Shao Yale University POPL 2012 Proof assistants are becoming popular in our community CompCert [Leroy et al.] sel4 [Klein et al.] Four-color

More information

Genie. Distributed Systems Synthesis and Verification. Marc Rosen. EN : Advanced Distributed Systems and Networks May 1, 2017

Genie. Distributed Systems Synthesis and Verification. Marc Rosen. EN : Advanced Distributed Systems and Networks May 1, 2017 Genie Distributed Systems Synthesis and Verification Marc Rosen EN.600.667: Advanced Distributed Systems and Networks May 1, 2017 1 / 35 Outline Introduction Problem Statement Prior Art Demo How does it

More information

Designing for Understandability: the Raft Consensus Algorithm. Diego Ongaro John Ousterhout Stanford University

Designing for Understandability: the Raft Consensus Algorithm. Diego Ongaro John Ousterhout Stanford University Designing for Understandability: the Raft Consensus Algorithm Diego Ongaro John Ousterhout Stanford University Algorithms Should Be Designed For... Correctness? Efficiency? Conciseness? Understandability!

More information

Extend PB for high availability. PB high availability via 2PC. Recall: Primary-Backup. Putting it all together for SMR:

Extend PB for high availability. PB high availability via 2PC. Recall: Primary-Backup. Putting it all together for SMR: Putting it all together for SMR: Two-Phase Commit, Leader Election RAFT COS 8: Distributed Systems Lecture Recall: Primary-Backup Mechanism: Replicate and separate servers Goal #: Provide a highly reliable

More information

Recovering from a Crash. Three-Phase Commit

Recovering from a Crash. Three-Phase Commit Recovering from a Crash If INIT : abort locally and inform coordinator If Ready, contact another process Q and examine Q s state Lecture 18, page 23 Three-Phase Commit Two phase commit: problem if coordinator

More information

Turning proof assistants into programming assistants

Turning proof assistants into programming assistants Turning proof assistants into programming assistants ST Winter Meeting, 3 Feb 2015 Magnus Myréen Why? Why combine proof- and programming assistants? Why proofs? Testing cannot show absence of bugs. Some

More information

A Graphical Interactive Debugger for Distributed Systems

A Graphical Interactive Debugger for Distributed Systems A Graphical Interactive Debugger for Distributed Systems Doug Woos March 23, 2018 1 Introduction Developing correct distributed systems is difficult. Such systems are nondeterministic, since the network

More information

A Correctness Proof for a Practical Byzantine-Fault-Tolerant Replication Algorithm

A Correctness Proof for a Practical Byzantine-Fault-Tolerant Replication Algorithm. Miguel Castro and Barbara Liskov. Technical Memo MIT/LCS/TM-590, MIT Laboratory for Computer Science, June 1999.

CSE 124: Replicated State Machines. George Porter, November 8 and 10, 2017.

Ovid: A Software-Defined Distributed Systems Framework. Deniz Altinbuken and Robbert van Renesse, Cornell University.

Œuf: Minimizing the Coq Extraction TCB. Eric Mullen and Zachary Tatlock, University of Washington, USA.

Enhancing Throughput of Partially Replicated State Machines. Zhongmiao Li, Peter Van Roy, and Paolo Romano, NCA 2017.

Functional Programming in Coq. Nate Foster, Spring 2018.

Byzantine Fault Tolerance. Marco Canini, CS 240: Computing Systems and Concurrency, Lecture 11.

Replicated State Machines and RAFT. COS 418: Distributed Systems. (Primary-backup replication, state machine replication, and consensus.)

Specifying the Ethereum Virtual Machine for Theorem Provers. Yoichi Hirai, Ethereum Foundation. Cambridge, September 13, 2017.

Trends in Automated Verification. K. Rustan M. Leino, Automated Reasoning Group, Amazon Web Services, 2018.

Temporal NetKAT. Ryan Beckett, Michael Greenberg, and David Walker. Princeton University and Pomona College.

Paxos Made Moderately Complex Made Moderately Simple. Tom Anderson and Doug Woos.

Split-Brain Consensus: On A Raft Up Split Creek Without A Paddle. John Burke, Rasmus Rygaard, and Suzanne Stathatos, Stanford University.

From Crypto to Code. Greg Morrisett.

Today: Fault Tolerance. Agreement in the presence of faults: the two-army problem, the Byzantine generals problem, reliable communication, distributed commit (two-phase and three-phase commit), Paxos, and failure recovery.

Self-healing Data Step by Step. Uwe Friedrichsen (codecentric AG), NoSQL matters Cologne, 29 April 2014.

Practical Byzantine Fault Tolerance. Robert Grimm, New York University.

Practical Byzantine Fault Tolerance. Miguel Castro and Barbara Liskov, SOSP '99.

Formal verification of program obfuscations. Sandrine Blazy, joint work with Roberto Giacobazzi and Alix Trieu. IFIP WG 2.11, 2015-11-10.

Distributed Consensus: Making Impossible Possible. Heidi Howard, QCon London, 29 March 2016.

EECS 498: Introduction to Distributed Systems. Harsha V. Madhyastha, Fall 2017.

Exploiting Commutativity For Practical Fast Replication. Seo Jin Park and John Ousterhout.

Ironclad Apps: End-to-End Security via Automated Full-System Verification. Chris Hawblitzel, Jon Howell, Jay Lorch, Arjun Narayan, Bryan Parno, Danfeng Zhang, and Brian Zill.

Strong Consistency. Ryan Stutsman, CS6450: Distributed Systems, Lecture 11.

Fast Paxos. Leslie Lamport. Distributed Computing 19:79-103, 2006.

A Framework for the Formal Verification of Time-Triggered Systems. Lee Pike, Indiana University, Bloomington.

Kami: A Framework for (RISC-V) HW Verification. Murali Vijayaraghavan, Joonwon Choi, Adam Chlipala, Andy Wright, Sizhuo Zhang, Thomas Bourgeat, and Arvind, MIT.

Introduction to riak_ensemble. Joseph Blomstedt, Basho Technologies.

Distributed Systems 24: Fault Tolerance. Paul Krzyzanowski.

Distributed Consensus: Making Impossible Possible. Heidi Howard, University of Cambridge.

Building Consistent Transactions with Inconsistent Replication. Irene Zhang, Naveen Kr. Sharma, Adriana Szekeres, Arvind Krishnamurthy, and Dan R. K. Ports.

Consensus, impossibility results and Paxos. Ken Birman.

Causal Consistency. Michael Freedman, COS 418: Distributed Systems, Lecture 16.

Canopus: A Scalable and Massively Parallel Consensus Protocol. Bernard Wong, joint work with Sajjad Rizvi and Srinivasan Keshav, CoNEXT 2017.

Beyond FLP. (Based on Chapter 8 of Tanenbaum and Van Steen, Distributed Systems: Principles and Paradigms.)

VeriCon: Towards Verifying Controller Programs in SDNs. Thomas Ball, Nikolaj Bjorner, Aaron Gember, Shachar Itzhaky, Aleksandr Karbyshev, Mooly Sagiv, Michael Schapira, and Asaf Valadarsky.

Verasco: a Formally Verified C Static Analyzer. Jacques-Henri Jourdan, joint work with Vincent Laporte, Sandrine Blazy, Xavier Leroy, and David Pichardie. June 13, 2017, Montpellier.

Practical Byzantine Fault Tolerance and Proactive Recovery. Miguel Castro (Microsoft Research) and Barbara Liskov (MIT Laboratory for Computer Science).

Large-Scale Key-Value Stores: Eventual Consistency. Marco Serafini, COMPSCI 590S, Lecture 13.

Improving Interrupt Response Time in a Verifiable Protected Microkernel. Bernard Blackham, Yao Shi, and Gernot Heiser, The University of New South Wales and NICTA, EuroSys 2012.

An Extensible Programming Language for Verified Systems Software. Adam Chlipala, MIT CSAIL, WG 2.16 meeting, 2012.

Distributed Systems, Day 11: Replication, Part 3 (Raft).

340 Review, Fall 2016: Midterm 1 Review.

How Can You Trust Formally Verified Software? Alastair Reid, Arm Research.

Replicated State Machines. George Porter, May 2018.

Verification of Fault-Tolerant Protocols with Sally. Bruno Dutertre, Dejan Jovanović, and Jorge A. Navas, SRI International.

How much is a mechanized proof worth, certification-wise? Xavier Leroy, Inria Paris-Rocquencourt, PiP 2014.

Certified compilers. (Do you trust your compiler? Most software errors arise from source code, but the compiler itself may be flawed.)

Coordinating distributed systems, part II. Marko Vukolić, Distributed Systems and Cloud Computing.

Database Management Systems: Distributed Databases. Doug Shook.

Distributed System. Gang Wu, Spring 2018.

Negations in Refinement Type Systems. T. Tsukada (University of Tokyo), 14 March 2016, Shonan, Japan.

Handling Environments in a Nested Relational Algebra with Combinators and an Implementation in a Verified Query Compiler. Joshua Auerbach, Martin Hirzel, Louis Mandel, Avi Shinnar, and Jérôme Siméon, IBM.