What is Distributed Storage Good For?


1 Efficient Robust Storage using Secret Tokens
Dan Dobre, Matthias Majuntke, Marco Serafini and Neeraj Suri
Dependable Embedded Systems & SW Group
Neeraj Suri, EU-NSF ICT, March 2006

2 What is Distributed Storage Good For?
[Figure: timeline from the 1980s to 2009. Clients send requests to the distributed storage and may receive a bad reply or no reply; the remaining label reads "certificate".]

3 What Distributed Storage is Good?
Distributed storage that is robust:
  Regular & wait-free
  Resilient to Byzantine faults
  Optimally resilient: 3t+1 storage nodes (base objects)
AND stores unauthenticated data:
  No crypto overhead
  Invulnerable to attacks
Q: What is the latency of READ and WRITE?

4 Our Results
Unauthenticated Storage Optimal resilience WRITE 2 [ACKM06] Readers do not write READ 2 Adversary may guess the future READ 2 [GV06] READ t+1 [ACKM06] READ t+1 [ACKM06] Robust READ 2 READ 1 READ 2 [GV06,DMS08]

5 Outline
Model
Background
Our Results
  1. Lower Bound
  2. Unbounded Readers Algorithm
  3. Fast Readers Algorithm
Summary

6 Model
Asynchronous message-passing system
Clients: 1 honest writer and many (malicious) readers
Base objects: 3t+1, of which t may be Byzantine
Regular read/write storage:
  Initial value v_0
  WRITE(v) stores value v ≠ v_0 and returns ok
  READ returns the last written value or a concurrent one
Metric: latency = number of communication round-trips between clients and base objects
Honest clients can generate a secret (token)
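The counting behind this model can be checked with a few lines of Python. This is only a sketch: the function name `quorum_parameters` and the dictionary keys are made up for illustration, and the only facts taken from the slides are that there are 3t+1 base objects, that at most t are Byzantine, and that a client waits for n-t responses.

```python
def quorum_parameters(t: int) -> dict:
    """Sizes implied by the model for a fault threshold t (illustrative names)."""
    if t < 0:
        raise ValueError("t must be non-negative")
    n = 3 * t + 1                        # number of base objects (optimal resilience)
    quorum = n - t                       # responses a client can wait for without blocking
    overlap = 2 * quorum - n             # smallest intersection of any two quorums
    correct_in_overlap = overlap - t     # at worst, t objects in the overlap are Byzantine
    return {
        "n": n,
        "quorum": quorum,
        "overlap": overlap,
        "correct_in_overlap": correct_in_overlap,
    }


if __name__ == "__main__":
    for t in (1, 2, 3):
        print(t, quorum_parameters(t))
```

With 3t+1 objects, two quorums of size n-t always share t+1 objects, so at least one object in the overlap is correct.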

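7 Registers and Distributed Storage
Register types [La86]: safe, regular and atomic
[Figure: a writer stores the timestamped values (1,b) and then (2,y) over the initial value (0,v_0) and receives OK. READ labels: "return b", "return y", "return b or y".]

8 Registers and Distributed Storage
Register types [La86]: safe, regular and atomic
[Figure: the same run with the timestamped values (1,b) and (2,y). READ labels: "return y", "return v_0".]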

9 Optimal Resilience
3t+1 base objects are necessary [MAD02]
[Figure: base objects up to S 4 storing (1,b); label σ 1; the READ returns b.]

10 Optimal Resilience
Wait for ... to break ties?
[Figure: base objects up to S 4 with (1,b); label σ 1; the READ returns b.]

11 WRITE 2 [ACKM06]
Two phases: pre-write and write
What about the READ?
[Figure: base objects up to S 4; value b.]

12 Unauthenticated Storage Optimal resilience WRITE 2 [ACKM06] Readers do not write READ 2 Adversary can guess the future READ 2 [GV06] READ t+1 [ACKM06] READ t+1 [ACKM06] Robust READ 2 READ 1 READ 2 [GV06,DMS08]

13 Our First Result: A Lower Bound Theorem: Reading from unauthenticated storage with optimal resilience requires at least two communication rounds if readers do not write.

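14 Lower Bound (Run 1)
Cannot wait for ...
[Figure: base objects up to S 4; value b; the READ's return value is marked "return?".]

15 Lower Bound (Run 2)
WRITE precedes READ and Run 2 ~ Run 1
[Figure: base objects up to S 4; value b; the READ returns b.]

16 Lower Bound (Run 3)
Run 3 ~ Run 2
[Figure: base objects up to S 4; READ labels: "return b", "return v_0".]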

17 Unauthenticated Storage Optimal resilience WRITE 2 [ACKM06] Readers do not write READ 2 Adversary can guess the future READ 2 [GV06] READ t+1 [ACKM06] READ t+1 [ACKM06] Robust READ 2 READ 1 READ 2 [GV06,DMS08]

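18 Unbounded Readers Algorithm (t = 1)
[Figure: run over four base objects (labels σ 1, S 4); value b; the READ returns b.]

19 Unbounded Readers Algorithm (t = 1)
Wait condition: t+1 (2) pre-writes of b
[Figure: WRITE(b) in two rounds, pre-write (pw b, rnd 1) and write (write b, rnd 2); labels σ 1, S 4; the READ returns b.]

20 Unbounded Readers Algorithm (t = 1)
Wait condition: t+1 (2) pre-writes of b OR S-t (3) writes with smaller timestamp
[Figure: base objects up to S 4; the READ returns v_0.]

21 Unbounded Readers Algorithm (t = 1)
Wait condition: t+1 (2) pre-writes of b OR S-t (3) writes with smaller timestamp
[Figure: base objects up to S 4; value b; the READ returns b.]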
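The wait condition on these slides can be written as a small predicate. The following is a sketch under assumptions rather than the authors' code: the `Report` type, its fields `pw` and `w`, and the return labels "accept" and "reject" are hypothetical. It assumes each base object reports its last pre-written and last written (timestamp, value) pair, that t+1 matching pre-writes let the reader return b, and that S-t writes with a smaller timestamp let the reader fall back to an older value such as v_0.

```python
from typing import NamedTuple, Optional, Sequence, Tuple

Stamped = Tuple[int, str]  # (timestamp, value); a hypothetical encoding


class Report(NamedTuple):
    pw: Stamped  # last pre-written pair reported by one base object
    w: Stamped   # last written pair reported by the same base object


def evaluate_wait_condition(candidate: Stamped,
                            reports: Sequence[Report],
                            S: int, t: int) -> Optional[str]:
    """Return "accept", "reject", or None if the reader must keep waiting."""
    # First disjunct: t+1 base objects report a pre-write of the candidate.
    prewrite_support = sum(1 for r in reports if r.pw == candidate)
    if prewrite_support >= t + 1:
        return "accept"
    # Second disjunct: S-t base objects report a write with a smaller timestamp.
    older_writes = sum(1 for r in reports if r.w[0] < candidate[0])
    if older_writes >= S - t:
        return "reject"
    return None


if __name__ == "__main__":
    b = (1, "b")
    v0 = (0, "v0")
    print(evaluate_wait_condition(b, [Report(b, v0), Report(b, b)], S=4, t=1))
    print(evaluate_wait_condition(b, [Report(v0, v0)] * 3, S=4, t=1))
```

For t = 1 and S = 4 the thresholds are 2 and 3, matching the numbers in parentheses on the slides.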

22 Unbounded Readers Algorithm (t > 1)
S = 7, t = 2
Wait condition: t+1 (3) pre-writes of b OR S-t (5) writes with smaller timestamp
[Figure: base objects S 4 to S 7; value b; the READ's return value is marked "return?".]

23 Unbounded Readers Algorithm (t > 1)
S = 7, t = 2
Recall: at least t+1 (3) READ rounds are necessary [ACKM06]
[Figure: base objects S 4 to S 7; value b.]

24 Unbounded Readers Algorithm (t > 1)
S = 7, t = 2
Adversary guesses
Idea: bind the value to a secret token s in the write phase
[Figure: base objects S 4 to S 7; WRITE in two rounds, pre-write b (rnd1) and write b (rnd2).]

25 Unbounded Readers Algorithm (t > 1)
S = 7, t = 2
New condition: S-t reports with a smaller timestamp than b, OR with the same timestamp but a different secret token
[Figure: base objects S 4 to S 7; label σ 3; WRITE in two rounds, pre-write b (rnd1) and write <b,s> (rnd2); the READ returns v_0.]

26 Summary of Unbounded Readers Algorithm
Pros:
  Unbounded AND malicious readers
  READ latency down from t+1 to 2 rounds (tight)
  Graceful degradation
  Constant messages (except 2nd READ round with O(S))
Cons:
  Non-amnesic: stores unlimited # of values [CGK07]
  READs are blocking if secrecy is violated

27 Unauthenticated Storage Optimal resilience WRITE 2 [ACKM06] Readers do not write READ 2 Adversary can guess the future READ 2 [GV06] READ t+1 [ACKM06] READ t+1 [ACKM06] Robust READ 2 READ 1 READ 2 [GV06,DMS08]

28 The Fast Readers Algorithm
READ and WRITE(b) overlap
READs write
[Figure: base objects up to S 4; label σ 0; value b.]

29 The Fast Readers Algorithm
WRITE(b) precedes READ
Adversary guesses
Idea:
  (1) The reader writes a secret s together with a timestamp, e.g. <1,s>
  (2) The pre-write collects the info and the write phase writes it back: [<0,ε>, <1,s>, <0,ε>, ...]
[Figure: base objects up to S 4 holding <1,s>; label σ 0; a partly lost annotation reads "not both ... and ... are correct"; the READ returns b.]

30 Summary of Fast Readers Algorithm
Pros:
  Fast READs
  Graceful degradation
  Constant READ messages
Cons:
  Non-amnesic: stores unlimited # of values [CGK07]
  Old values may be read if secrecy is violated
  Storage requirements and WRITE messages linear in the # of readers

31 Summary
Optimal resilience WRITE 2 [ACKM06] Readers do not write READ 2 Adversary can guess the future READ 2 [GV06] READ t+1 [ACKM06] READ t+1 [ACKM06] Robust READ 2 READ 1 READ 2 [GV06,DMS08]

32 Fin Questions?

33 What bounds for Unauthenticated storage?
Observation: in the READ lower bounds, the adversary knows a value before it is written
Idea: write a secret together with each value (e.g. a random bit string)
Better READ latency:
  2 rounds are necessary and sufficient if readers don't write
  1 round is sufficient otherwise
Graceful degradation:
  Forged values are never read
  The first algorithm is always safe and the second is always live
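The idea of writing a secret together with each value can be illustrated with Python's standard library. This is a sketch under assumptions: the helper names `get_token`, `same_token` and `bind` are hypothetical, the 16-byte token length is an arbitrary choice, and the slides only say that the secret may be a random bit string.

```python
import hmac
import secrets
from typing import Tuple


def get_token(num_bytes: int = 16) -> bytes:
    """Draw an unpredictable random bit string to serve as the secret token."""
    return secrets.token_bytes(num_bytes)


def same_token(a: bytes, b: bytes) -> bool:
    """Compare two tokens in constant time."""
    return hmac.compare_digest(a, b)


def bind(ts: int, value: str) -> Tuple[int, str, bytes]:
    """Attach a fresh token to a timestamped value, as in the write phase."""
    return (ts, value, get_token())


if __name__ == "__main__":
    ts, value, token = bind(1, "b")
    print(ts, value, token.hex())
    print(same_token(token, token))
```

An adversary that has to report a (timestamp, value, token) triple before the writer has drawn the token can only guess it, which is what the READ condition relies on.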

34 Deriving the UR* Algorithm (t > 1)
S = 7, t = 2
Wait condition: t+1 (3) pre-writes of b OR S-t (5) writes with smaller timestamp
[Figure: base objects S 4 to S 7; value b; the READ returns b.]
* UR = Unbounded Readers

35 Deriving the UR* Algorithm (t > 1)
S = 7, t = 2
Wait condition: t+1 (3) pre-writes of b OR S-t (5) writes with smaller timestamp
[Figure: base objects S 4 to S 7; READ labels: "return b", "return v_0".]
* UR = Unbounded Readers

36 The UR* Algorithm

WRITE(v):
  1. pre-write (ts++, v) to all base objects
  2. wait for n-t responses
  3. s ← gettoken() /* random value */
  4. write (ts, v, s) to all base objects
  5. wait for n-t responses and return ok

READ:
  1. read from 2t+1 base objects
  2. collect all written values (*,*,*) in set C /* candidates */
  3. send C to all base objects
  4. wait for responses /* two sets PW and W */
       remove c ∈ C when c is missing from n-t W sets
     until ∃c: c = max_ts(C) AND c is contained in t+1 PW or W sets
  5. return c.val

* UR = Unbounded Readers
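Step 4 of the READ can be sketched as a pure function over the responses gathered so far. This is a reading of the slide under assumptions, not the authors' implementation: the types `Candidate` and `Response` and the function `select_value` are hypothetical, message passing and incremental arrival of responses are left out, PW sets are assumed to hold (ts, val) pairs without a token, and a return value of None stands for "keep waiting".

```python
from typing import FrozenSet, Iterable, NamedTuple, Optional, Sequence, Tuple


class Candidate(NamedTuple):
    ts: int      # timestamp chosen by the writer
    val: str     # written value
    token: str   # secret token bound to the value in the write phase


class Response(NamedTuple):
    pw: FrozenSet[Tuple[int, str]]  # pre-written (ts, val) pairs at one base object
    w: FrozenSet[Candidate]         # written (ts, val, token) triples at the same object


def select_value(candidates: Iterable[Candidate],
                 responses: Sequence[Response],
                 n: int, t: int) -> Optional[str]:
    """Apply the elimination and selection rule of READ step 4 once."""
    remaining = set(candidates)
    # Remove c when it is missing from the W sets of at least n-t responses.
    for c in list(remaining):
        missing = sum(1 for r in responses if c not in r.w)
        if missing >= n - t:
            remaining.discard(c)
    if not remaining:
        return None
    # The highest-timestamp survivor needs support from t+1 PW or W sets.
    best = max(remaining, key=lambda c: c.ts)
    support = sum(1 for r in responses
                  if best in r.w or (best.ts, best.val) in r.pw)
    return best.val if support >= t + 1 else None


if __name__ == "__main__":
    v0 = Candidate(0, "v0", "s0")
    b = Candidate(1, "b", "s1")
    old = Response(frozenset({(0, "v0")}), frozenset({v0}))
    print(select_value({v0, b}, [old, old, old], n=4, t=1))
```

In the example, three of four base objects have not seen b in their W sets, so b is dropped and the older value is returned, mirroring the "return v_0" case on the earlier slides.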

37 Argumentation
C1: Some correct base object reports that b is written in the first READ round
  ⇒ t+1 correct base objects report b in the second READ round
C2: not C1
  a) No correct base object reports b in the second round from its w field
     ⇒ n-t base objects report values with smaller timestamps
     ⇒ WRITE(b) is not completed
  b) Some correct base object reports b from its w field

38 Deriving the Algorithm (t > 1)
S = 7, t = 2
Wait condition: t+1 pre-writes of b OR S-t writes with smaller timestamp
[Figure: base objects S 4 to S 7; value b.]

39 Deriving the Algorithm (t > 1)
S = 7, t = 2
Wait condition: t+1 pre-writes of b OR S-t writes with smaller timestamp
[Figure: base objects S 4 to S 7.]

40 References
[CGK07] G. Chockler, R. Guerraoui, and I. Keidar: Amnesic Distributed Storage. DISC '07
[ACKM06] I. Abraham, G. Chockler, I. Keidar, and D. Malkhi: Byzantine Disk Paxos: Optimal Resilience with Byzantine Shared Memory. Distributed Computing 2006
[DMS08] D. Dobre, M. Majuntke, N. Suri: On the Time-Complexity of Amnesic Storage. OPODIS '08
[GV06] R. Guerraoui and M. Vukolić: How fast can a very robust read be? PODC '06
[La86] L. Lamport: On Interprocess Communication. Part I, Algorithms
[MAD02] J.-P. Martin, L. Alvisi, M. Dahlin: Small Byzantine Quorum Systems. DISC '02

41 Increasing Resilience is Critical
BFT protocols rely on failure independence
  Compromising s_i does not help compromising s_j
So, in addition to HW costs:
  n implementations of the service
  n versions to maintain
  n operating systems


More information

Cristina Nita-Rotaru. CS355: Cryptography. Lecture 17: X509. PGP. Authentication protocols. Key establishment.

Cristina Nita-Rotaru. CS355: Cryptography. Lecture 17: X509. PGP. Authentication protocols. Key establishment. CS355: Cryptography Lecture 17: X509. PGP. Authentication protocols. Key establishment. Public Keys and Trust Public Key:P A Secret key: S A Public Key:P B Secret key: S B How are public keys stored How

More information

Building Consistent Transactions with Inconsistent Replication

Building Consistent Transactions with Inconsistent Replication Building Consistent Transactions with Inconsistent Replication Irene Zhang, Naveen Kr. Sharma, Adriana Szekeres, Arvind Krishnamurthy, Dan R. K. Ports University of Washington Distributed storage systems

More information

ByzID: Byzantine Fault Tolerance from Intrusion Detection

ByzID: Byzantine Fault Tolerance from Intrusion Detection : Byzantine Fault Tolerance from Intrusion Detection Sisi Duan UC Davis sduan@ucdavis.edu Karl Levitt UC Davis levitt@ucdavis.edu Hein Meling University of Stavanger, Norway hein.meling@uis.no Sean Peisert

More information

CSE 344 MARCH 5 TH TRANSACTIONS

CSE 344 MARCH 5 TH TRANSACTIONS CSE 344 MARCH 5 TH TRANSACTIONS ADMINISTRIVIA OQ6 Out 6 questions Due next Wednesday, 11:00pm HW7 Shortened Parts 1 and 2 -- other material candidates for short answer, go over in section Course evaluations

More information

HQ Replication: A Hybrid Quorum Protocol for Byzantine Fault Tolerance

HQ Replication: A Hybrid Quorum Protocol for Byzantine Fault Tolerance HQ Replication: A Hybrid Quorum Protocol for Byzantine Fault Tolerance James Cowling 1, Daniel Myers 1, Barbara Liskov 1, Rodrigo Rodrigues 2, and Liuba Shrira 3 1 MIT CSAIL, 2 INESC-ID and Instituto Superior

More information

Detectable Byzantine Agreement Secure Against Faulty Majorities

Detectable Byzantine Agreement Secure Against Faulty Majorities Detectable Byzantine Agreement Secure Against Faulty Majorities Matthias Fitzi, ETH Zürich Daniel Gottesman, UC Berkeley Martin Hirt, ETH Zürich Thomas Holenstein, ETH Zürich Adam Smith, MIT (currently

More information

Dynamic Atomic Storage without Consensus

Dynamic Atomic Storage without Consensus Dynamic Atomic Storage without Consensus MARCOS K. AGUILERA, Microsoft Research IDIT KEIDAR, Technion DAHLIA MALKHI, Microsoft Research ALEXANDER SHRAER, Yahoo! Research This article deals with the emulation

More information

Distributed Systems. replication Johan Montelius ID2201. Distributed Systems ID2201

Distributed Systems. replication Johan Montelius ID2201. Distributed Systems ID2201 Distributed Systems ID2201 replication Johan Montelius 1 The problem The problem we have: servers might be unavailable The solution: keep duplicates at different servers 2 Building a fault-tolerant service

More information

Time-Efficient Asynchronous Service Replication. Dissertation. Dipl.-Inform. Dan Dobre

Time-Efficient Asynchronous Service Replication. Dissertation. Dipl.-Inform. Dan Dobre Time-Efficient Asynchronous Service Replication Vom Fachbereich Informatik der Technischen Universität Darmstadt genehmigte Dissertation zur Erlangung des akademischen Grades eines Doktor-Ingenieur (Dr.-Ing.)

More information

Routing v.s. Spanners

Routing v.s. Spanners Routing v.s. Spanners Spanner et routage compact : similarités et différences Cyril Gavoille Université de Bordeaux AlgoTel 09 - Carry-Le-Rouet June 16-19, 2009 Outline Spanners Routing The Question and

More information

PBFT: A Byzantine Renaissance. The Setup. What could possibly go wrong? The General Idea. Practical Byzantine Fault-Tolerance (CL99, CL00)

PBFT: A Byzantine Renaissance. The Setup. What could possibly go wrong? The General Idea. Practical Byzantine Fault-Tolerance (CL99, CL00) PBFT: A Byzantine Renaissance Practical Byzantine Fault-Tolerance (CL99, CL00) first to be safe in asynchronous systems live under weak synchrony assumptions -Byzantine Paxos! The Setup Crypto System Model

More information

TAPIR. By Irene Zhang, Naveen Sharma, Adriana Szekeres, Arvind Krishnamurthy, and Dan Ports Presented by Todd Charlton

TAPIR. By Irene Zhang, Naveen Sharma, Adriana Szekeres, Arvind Krishnamurthy, and Dan Ports Presented by Todd Charlton TAPIR By Irene Zhang, Naveen Sharma, Adriana Szekeres, Arvind Krishnamurthy, and Dan Ports Presented by Todd Charlton Outline Problem Space Inconsistent Replication TAPIR Evaluation Conclusion Problem

More information

T ransaction Management 4/23/2018 1

T ransaction Management 4/23/2018 1 T ransaction Management 4/23/2018 1 Air-line Reservation 10 available seats vs 15 travel agents. How do you design a robust and fair reservation system? Do not enough resources Fair policy to every body

More information

CONCURRENCY CONTROL, TRANSACTIONS, LOCKING, AND RECOVERY

CONCURRENCY CONTROL, TRANSACTIONS, LOCKING, AND RECOVERY CONCURRENCY CONTROL, TRANSACTIONS, LOCKING, AND RECOVERY George Porter May 18, 2018 ATTRIBUTION These slides are released under an Attribution-NonCommercial-ShareAlike 3.0 Unported (CC BY-NC-SA 3.0) Creative

More information

Exam Distributed Systems

Exam Distributed Systems Exam Distributed Systems 5 February 2010, 9:00am 12:00pm Part 2 Prof. R. Wattenhofer Family Name, First Name:..................................................... ETH Student ID Number:.....................................................

More information

G Distributed Systems: Fall Quiz II

G Distributed Systems: Fall Quiz II Computer Science Department New York University G22.3033-006 Distributed Systems: Fall 2008 Quiz II All problems are open-ended questions. In order to receive credit you must answer the question as precisely

More information

Advanced Systems Security: Multics

Advanced Systems Security: Multics Systems and Internet Infrastructure Security Network and Security Research Center Department of Computer Science and Engineering Pennsylvania State University, University Park PA Advanced Systems Security:

More information

Semi-Passive Replication in the Presence of Byzantine Faults

Semi-Passive Replication in the Presence of Byzantine Faults Semi-Passive Replication in the Presence of Byzantine Faults HariGovind V. Ramasamy Adnan Agbaria William H. Sanders University of Illinois at Urbana-Champaign 1308 W. Main Street, Urbana IL 61801, USA

More information

Secure Multiparty Computation: Introduction. Ran Cohen (Tel Aviv University)

Secure Multiparty Computation: Introduction. Ran Cohen (Tel Aviv University) Secure Multiparty Computation: Introduction Ran Cohen (Tel Aviv University) Scenario 1: Private Dating Alice and Bob meet at a pub If both of them want to date together they will find out If Alice doesn

More information

Low-Latency Network-Scalable Byzantine Fault-Tolerant Replication 12th EuroSys Doctoral Workshop (EuroDW 2018)

Low-Latency Network-Scalable Byzantine Fault-Tolerant Replication 12th EuroSys Doctoral Workshop (EuroDW 2018) Low-Latency Network-Scalable Byzantine Fault-Tolerant tion 12th EuroSys Doctoral Workshop (EuroDW 2018) Ines Messadi, TU Braunschweig, Germany, 2018-04-23 New PhD student (Second month) in the distributed

More information

5/17/17. Announcements. Review: Transactions. Outline. Review: TXNs in SQL. Review: ACID. Database Systems CSE 414.

5/17/17. Announcements. Review: Transactions. Outline. Review: TXNs in SQL. Review: ACID. Database Systems CSE 414. Announcements Database Systems CSE 414 Lecture 21: More Transactions (Ch 8.1-3) HW6 due on Today WQ7 (last!) due on Sunday HW7 will be posted tomorrow due on Wed, May 24 using JDBC to execute SQL from

More information

arxiv: v2 [cs.dc] 12 Sep 2017

arxiv: v2 [cs.dc] 12 Sep 2017 Efficient Synchronous Byzantine Consensus Ittai Abraham 1, Srinivas Devadas 2, Danny Dolev 3, Kartik Nayak 4, and Ling Ren 2 arxiv:1704.02397v2 [cs.dc] 12 Sep 2017 1 VMware Research iabraham@vmware.com

More information

Database Systems CSE 414

Database Systems CSE 414 Database Systems CSE 414 Lecture 21: More Transactions (Ch 8.1-3) CSE 414 - Spring 2017 1 Announcements HW6 due on Today WQ7 (last!) due on Sunday HW7 will be posted tomorrow due on Wed, May 24 using JDBC

More information

Group Replication: A Journey to the Group Communication Core. Alfranio Correia Principal Software Engineer

Group Replication: A Journey to the Group Communication Core. Alfranio Correia Principal Software Engineer Group Replication: A Journey to the Group Communication Core Alfranio Correia (alfranio.correia@oracle.com) Principal Software Engineer 4th of February Copyright 7, Oracle and/or its affiliates. All rights

More information