VALIDATING AN ANALYTICAL APPROXIMATION THROUGH DISCRETE SIMULATION

Similar documents
Analysis of Replication Control Protocols

Dynamic Management of Highly Replicated Data

An Available Copy Protocol Tolerating Network Partitions

Resilient Memory-Resident Data Objects

Voting with Regenerable Volatile Witnesses

to replicas and does not include a tie-breaking rule, a ordering to the sites. each time the replicated data are modied.

Availability of Coding Based Replication Schemes. Gagan Agrawal. University of Maryland. College Park, MD 20742

Annual Simulation Symposium

Self-Adaptive Two-Dimensional RAID Arrays

Voting with Witnesses: A Consistency Scheme for Replicated Files

A Leaner, More Efficient, Available Copy Protocol

Cost Reduction of Replicated Data in Distributed Database System

The Efkct of Failure and Repair Distributions on Consistency Protocols for Replicated Data Objects

Software Engineering: Integration Requirements

Henning Koch. Dept. of Computer Science. University of Darmstadt. Alexanderstr. 10. D Darmstadt. Germany. Keywords:

Lazy Agent Replication and Asynchronous Consensus for the Fault-Tolerant Mobile Agent System

AN EFFICIENT DESIGN OF VLSI ARCHITECTURE FOR FAULT DETECTION USING ORTHOGONAL LATIN SQUARES (OLS) CODES

Module 8 Fault Tolerance CS655! 8-1!

Integration of analytic model and simulation model for analysis on system survivability

International Journal of Scientific & Engineering Research Volume 8, Issue 5, May ISSN

Hunting for Bindings in Distributed Object-Oriented Systems

A queueing network model to study Proxy Cache Servers

A Path Decomposition Approach for Computing Blocking Probabilities in Wavelength-Routing Networks

Fault Tolerance. it continues to perform its function in the event of a failure example: a system with redundant components

Module 8 - Fault Tolerance

Analysis of Stochastic Model on a Two-Unit Hot Standby Combined Hardware-Software System

Discrete Event Simulation and Petri net Modeling for Reliability Analysis

Distributed Systems (5DV147)

TESTING MULTI-AGENT SYSTEMS FOR DEADLOCK DETECTION BASED ON UML MODELS

Identifying Stable File Access Patterns

A New Statistical Procedure for Validation of Simulation and Stochastic Models

Adapting Commit Protocols for Large-Scale and Dynamic Distributed Applications

A Technique for Managing Mirrored Disks

A Novel Quorum Protocol for Improved Performance

FEC Performance in Large File Transfer over Bursty Channels

Byzantine Consensus in Directed Graphs

On Dependability in Distributed Databases

Association Rules Mining using BOINC based Enterprise Desktop Grid

A Two-Expert Approach to File Access Prediction

Mobile and Heterogeneous databases Distributed Database System Transaction Management. A.R. Hurson Computer Science Missouri Science & Technology

Stochastic Petri nets

Chapter 19: Distributed Databases

Chapter 8 Fault Tolerance

these developments has been in the field of formal methods. Such methods, typically given by a

Chapter 18: Parallel Databases

A Simulation Based Comparative Study of Normalization Procedures in Multiattribute Decision Making

ENHANCEMENTS TO THE VOTING ALGORITHM

Distributed KIDS Labs 1

Improving Liquid Handling Robot Throughput by means of Direct Path Planning and Obstacle Avoidance Programming

Key words: B- Spline filters, filter banks, sub band coding, Pre processing, Image Averaging IJSER

SOFTWARE ARCHITECTURE For MOLECTRONICS

A Project Network Protocol Based on Reliability Theory

Module 4: Stochastic Activity Networks

Fault Tolerance. Basic Concepts

DISTRIBUTED SYSTEMS. Second Edition. Andrew S. Tanenbaum Maarten Van Steen. Vrije Universiteit Amsterdam, 7'he Netherlands PEARSON.

Fault Tolerance via the State Machine Replication Approach. Favian Contreras

PRIMARY-BACKUP REPLICATION

Data Replication Model For Remote Procedure Call Transactions

On Checkpoint Latency. Nitin H. Vaidya. In the past, a large number of researchers have analyzed. the checkpointing and rollback recovery scheme

Industrial Computing Systems: A Case Study of Fault Tolerance Analysis

Application of Importance Sampling in Simulation of Buffer Policies in ATM networks

Using Queuing theory the performance measures of cloud with infinite servers

The TOBIAS test generator and its adaptation to some ASE challenges Position paper for the ASE Irvine Workshop

Cheap Paxos. Leslie Lamport and Mike Massa. Appeared in The International Conference on Dependable Systems and Networks (DSN 2004)

Characterization of Boolean Topological Logics

Distributed Systems. Characteristics of Distributed Systems. Lecture Notes 1 Basic Concepts. Operating Systems. Anand Tripathi

Distributed Systems. Characteristics of Distributed Systems. Characteristics of Distributed Systems. Goals in Distributed System Designs

Web page recommendation using a stochastic process model

Comparison of pre-backoff and post-backoff procedures for IEEE distributed coordination function

Distributed Systems. 19. Fault Tolerance Paul Krzyzanowski. Rutgers University. Fall 2013

A Ripple Carry Adder based Low Power Architecture of LMS Adaptive Filter

HDL IMPLEMENTATION OF SRAM BASED ERROR CORRECTION AND DETECTION USING ORTHOGONAL LATIN SQUARE CODES

Quasi-Dynamic Network Model Partition Method for Accelerating Parallel Network Simulation

Approximations and Errors

Reusability Metrics for Object-Oriented System: An Alternative Approach

CHAPTER 5 PROPAGATION DELAY

Business Continuity and Disaster Recovery. Ed Crowley Ch 12

Fault Tolerant Domain Decomposition for Parabolic Problems

CODING TCPN MODELS INTO THE SIMIO SIMULATION ENVIRONMENT

Dep. Systems Requirements

'A New Approach to Simulation of Heavy Construction Operations

FAULT TOLERANCE. Fault Tolerant Systems. Faults Faults (cont d)

Storage and Network Resource Usage in Reactive and Proactive Replicated Storage Systems

Availability Study of Dynamic Voting Algorithms

Prediction-based diagnosis and loss prevention using qualitative multi-scale models

Last Class:Consistency Semantics. Today: More on Consistency

ON-LINE QUALITATIVE MODEL-BASED DIAGNOSIS OF TECHNOLOGICAL SYSTEMS USING COLORED PETRI NETS

Implementing Shared Registers in Asynchronous Message-Passing Systems, 1995; Attiya, Bar-Noy, Dolev

Monu Kumar Department of Mathematics, M.D. University, Rohtak , INDIA

Verification and Validation of X-Sim: A Trace-Based Simulator

Multilevel Fault-tolerance for Designing Dependable Wireless Networks

INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY

AN APPROACH FOR LOAD BALANCING FOR SIMULATION IN HETEROGENEOUS DISTRIBUTED SYSTEMS USING SIMULATION DATA MINING

Frequently Asked Questions Updated 2006 (TRIM version 3.51) PREPARING DATA & RUNNING TRIM

An Optimal Voting Scheme for Minimizing the. Overall Communication Cost in Replicated Data. December 14, Abstract

A Methodology for Disaster Tolerance Utilizing the Concepts of Axiomatic Design

16.1 Runge-Kutta Method

AUTOMATED METHODOLOGY FOR MODELING CRACK EXTENSION IN FINITE ELEMENT MODELS

Replication. Feb 10, 2016 CPSC 416

INTEGER SEQUENCE WINDOW BASED RECONFIGURABLE FIR FILTERS.

Transcription:

MATHEMATICAL MODELLING AND SCIENTIFIC COMPUTING, Vol. 8 (997) VALIDATING AN ANALYTICAL APPROXIMATION THROUGH DISCRETE ULATION Jehan-François Pâris Computer Science Department, University of Houston, Houston, TX 77204-3475 paris@cs.uh.edu ABSTRACT While Markov models have been extensively used to study the availability of replicated data, they cannot handle effectively network configurations where sites failures and network partitions have to be simultaneously considered. We had proposed in a previous paper a hierarchical decomposition method aimed at overcoming this limitation. While our method could provide closed form estimates of the availability of replicated objects whose replicas reside on networks subject to communication failures. We present here a simulation study measuring the quality of our estimates and attempting to improve upon them. KEYWORDS fault-tolerance, replicated systems, redundancy, voting. INTRODUCTION Managing replicated data can be a demanding task particularly when the replicas are stored at different sites of a computer network. Special replication control protocols have been devised to perform this task without user intervention and maintain the replicated data in a consistent state. Evaluating the performance of these protocols, and especially the data availabilities they afford is a very important issue because they vary greatly in their complexity, their communication overhead, and the protection they provide or do not provide against network partitions. Several techniques have been used to evaluate the availability of replicated data. Combinatorial models are very simple to use (Pu et al. 988, van Renesse and Tanenbaum 988) but cannot represent complex recovery modes as these found in some of the most efficient replication control protocols. Simulation models can be very accurate whenever all the parameters of the modeled system are known but this is rarely the case. As a result, stochastic models have become the method of choice for evaluating the availability of replicated data managed by protocols with complex recovery modes (Pâris 986, Jajodia and Mutchler 987, Ahamad and Ammar 987). Unfortunately, stochastic models that take simultaneously into account site failures and network partitions become quickly untractable because their number of states increases exponentially with the number of failure modes being considered. As a result, all analytical studies of the availability of replicated data have either assumed that the sites could not fail or that the network could not partition. One possible solution to this problem is the use of simplified Markov models. We had presented in a previous paper (Pâris 992) a hierarchical decomposition method aimed at overcoming this limitation. While our method could provide closed form estimates of the availability of replicated objects whose replicas reside on networks subject to communication failures, we had no way to evaluate the accuracy of these estimates. We present here a simulation study aimed at measuring the quality of our estimates and attempting to improve upon them.

THE HIERARCHICAL DECOMPOSITION METHOD Hierarchical decomposition reduces the complexity of the model itself by identifying parts of the system that can be studied in isolation and replaced by simpler equivalent components (Courtois 977, Ferrari et al. 983). This technique has been widely used in computer systems performance evaluation to solve queuing models too complex to be directly tractable. It can also be applied to the evaluation of the availability of replicated data objects (Pâris 99). First Ethernet A B G C Second Ethernet Figure : A small local area network with three nodes on two network segments Many local-area networks consist of several carrier-sense segments or token rings linked by gateway machines. Figure shows a very simple example of such networks: it consists of three nodes A, B and C located on two Ethernets linked by a gateway node G. Since gateways can fail without causing a total network failure, such networks can be partitioned. The key difference with conventional point-to-point networks is that sites that are on the same carrier-sense network or token ring will never be separated by a partition. We will refer to these entities as network segments (van Renesse and Tanenbaum 988). We propose to model these networks as a set of nodes and gateways with independent failure modes. When a node or a gateway fails, a repair process is immediately initiated at that site. Should several sites fail, the repair process will be performed in parallel on those failed sites. We assume that failures are exponentially distributed with mean failure rate, and that repairs are exponentially distributed with mean repair rate. The system is assumed to exist in statistical equilibrium and to be characterized by a discrete-state Markov process. Assume now that the three nodes A, B and C are used to store the three copies of a replicated file X. Under a static voting protocol such as majority consensus voting (Ellis 977, Gifford 979), the replica on node C will only be able to be counted in quorums when the gateway G is operational. For all practical purposes, a failure of G will thus have the same effect as a failure of C. We propose therefore to replace the subsystem consisting of site C and its gateway G by an aggregate site C that will remain operational as long as both C and G are operational. A replicated file having replicas on the two nodes A and B and the aggregate site C would have the same availability as the replicated file X but will be much easier to investigate since we will not have to consider gateway failures. To compute the failure rate and the repair rate of the aggregate site C, we need to observe that C will remain operational as long as both the gateway G and the node holding the replica C are both operational. The states of C and G can thus be represented by the state transition diagram of figure 2, which consist of four states numbered from 00 to. State represents the state of the aggregate site when the node and its gateway are both operational. States 0 and 0 represent states when either the site or its gateway have failed while site 00 corresponds to a failure of both entities. We can immediately derive the steady state probabilities for the four states:

0 0 00 Figure 2: State transition diagram for an aggregate site consisting of one node and one gateway 2 p = p = p = p ( + ),, ( + ) 2 0 0 2 00 2 = ( + ) Since the only available state for the aggregate site is state, the failure rate and the repair rate of the aggregate site C are given by: ' p = 2p and '( p ) = ( p + p ) 0 0 which have as unique solution '= 2 and '= + 2 This aggregation technique can be trivially extended to replicated objects consisting of an arbitrary number of replicas located on a network consisting of network segments linked by gateways. Note however that the hierarchical decomposition method assumes that replicas located on nodes that become part of a given aggregate site can never become a majority by themselves. This assumption is correct for majority consensus voting as long as an aggregate site does not contain a majority of the replicas. There are however several voting protocols that allow a minority of the replicas to have a majority of the votes. Whenever this is the case, the hierarchical decomposition method will underestimate the availability of the replicated data. 2 G A H B C Aggregate B Aggregate C Figure 3: Three nodes A, B and C on three network segments linked by two gateways THE VALIDATION STUDY Being aware that our hierarchical dccomposition method could sometimes underestimate the availability of the replicated data, we decided therefore to compare the data availabilities computed through our hierarchical methods with those obtained by simulating the same configurations over a wide range of site failure

and repair rates. The simulator we utilized in our study has been described in more detail elsewhere (Pâris et al. 988). It uses the batch means method to compute 95 percent confidence intervals of the unavailability of the replicated data. In order to reduce the initial bias, we excluded all measurements collected during the first simulated 80 days. We decided to investigate three distinct replica configurations managed by two distinct voting protocols. These configurations were: (a) three replicas on one Ethernet, (b) three replicas on two Ethernets linked by one gateway( as in our previous example), and (c) three replicas on three Ethernets linked by two gateways (as on figure 3). We selected majority consensus voting for its simplicity and the dynamic-linear voting protocol (DLV) (Jajodia and Mutchler 987) for its excellent performance. We did not investigate configurations with two replicas since voting protocols require a minimum of three voting sites to improve upon the availability of a single copy. Availability 0.98 0.96 0.94 0.92 0.9 0.88 0.86 0.84 0.82 NS 2 NS 3 NS 0 0.02 0.04 0.06 0.08 0. 0.2 0.4 0.6 0.8 0.2 ρ Figure 4: Compared availabilities for majority consensus voting The results for majority consensus voting are summarized on figure 4 where the three curves correspond to the three configurations under study and ρ = / represents the ratio of the failure rate over the repair rate. The solid curves were obtained by substituting the proper values of the availabilities of the three replicas to p, p 2 and p 3 in: A () 3 MCV = p p p ( p ) p p ( p ) p p ( p ) 2 3 + 2 3 + 2 3 + 3 pp2 while the isolated values were obtained through our simulator. As one can see, the results of our simulations were in perfect agreement with those derived from our hierarchical decomposition method. This was expected since none of the aggregate sites contained a majority of the replicas. As figure 5 shows, the results for dynamic-linear voting were quite different: while the simulation results agreed with the analytical results in two of the three configurations, the simulation results for three replicas on three network segments were systematically higher than those achieved through our hierarchical decomposition method. This discrepancy results from the fact that the dynamic-linear voting protocol adjusts its quorums to reflect changes in replica availability and network connectivity. Hence the failure of any of the three replicas will result into a the establishment of a new quorum. This new quorum will consist of the replica that comes first in lexicographic ordering of the two remaining replicas. Thus, if node A fails first, the replica on node B will become a quorum by itself because B > C. The hierarchical decomposition method will then underestimate the availability of the replicated data because it will assume that the replica on node B cannot be accessed when its gateway G is not operational. This can better seen on the (rather complex) state transition diagram for three replicas on three network segments under the dynamic-linear voting protocol. As figure 6 shows, the replicated data are in state 2 when the three nodes holding replicas and the two gateways are operational. A failure of node A appears on the diagram as a transition from state 2 to state 02. Since replica B is now a majority by itself, a failure of the aggregate site C would leave the replicated data in state 0, which still allows access while

a failure of the aggregate site B would move the system to state 0, which does not allow access. We speculated first that the discrepancy between the simulation data and the results of the hierarchical decomposition method could result from the fact that once the system is in state 0, it should remain in this state as long as B remains operational regardless of the status of the gateway G. 0.98 Availability 0.96 0.94 0.92 0.9 0.88 NS 2 NS 3 NS 0 0.02 0.04 0.06 0.08 0. 0.2 0.4 0.6 0.8 0.2 ρ Figure 5: Compared availabilities for dynamic-linear voting We decided to change the state transition rate between states 0 and 00 from to (which means halving it since = 2). At our great surprise, this had no significant effect on the availability figures. 02 2 02 0 2 2 0 2 0 00 0 00 0 Figure 6: State transition diagram for three replicas on three segments managed by DLV We then considered another explanation. When the system is in state 02 any failure of the gateway G will result in a failure of the aggregate site B and move the system to the unavailable state 0. This is not correct because a failure of G would leave B capable to continue to process requests since it already constitutes a majority by itself. We decided then to change the transition rate from state 02 to 0 from to. As figure 7 indicates, this brought the analytical results in much better agreement with the simulation results.

0.98 0.96 0.94 0.92 0.9 0.88 0 0.02 0.04 0.06 0.08 0. 0.2 0.4 0.6 0.8 0.2 Availability NEW ORIGINAL ρ Figure 7: Tuning the Markov model of the DLV protocol for three replicas on three segments DISCUSSION While Markov models have been extensively used to study the availability of replicated data, they cannot handle effectively network configurations where sites failures and network partitions have to be simultaneously considered. We had proposed in a previous paper a hierarchical decomposition method aimed at overcoming this limitation. While our method could provide closed form estimates of the availability of replicated objects whose replicas reside on networks subject to communication failures. We have presented here a simulation study aimed at measuring the quality of our estimates and attempting to improve upon them. We gained from our study a better understanding of the limitations of our hierarchical decomposition method and one possible method to improve upon its accuracy. REFERENCES Pu, C., J. D. Noe and A. Proudfoot, Regeneration of Replicated Objects: A Technique and its Eden Implementation, IEEE Transactions on Software Engineering, SE-4, 7 (988), pp. 936-945. van Renesse, R., and A. Tanenbaum, Voting with Ghosts, Proc. 8 th Int. Conf. on Distributed Computing Systems, (988), pp. 456-462. Pâris, J.-F., Voting with Witnesses: A Consistency Scheme for Replicated Files, Proc. 6 th Int. Conf. on Distributed Computing Systems, (986), pp. 606-62. Jajodia, S. and D. Mutchler, Enhancements to the Voting Algorithm, Proc. 3 th VLDB Conf., (987), pp. 399-405. Ahamad, M., and M. H. Ammar, Performance Characterization of Quorum-Consensus Algorithms for Replicated Data, IEEE Trans. on Software Engineering, SE-5, 4 (989), pp. 492-496. Ellis, C. A., Consistency and Correctness of Duplicate Database Systems, Operating Systems Review, (977). Gifford, D. K., Weighted Voting for Replicated Data, Proc. 7 th ACM Symp. on Operating System Principles, (979), pp. 50-6. Courtois, P. J., Decomposability: Queuing and Computer System Applications, Academic Press, New York (977). Ferrari, D., G. Serazzi and A. Zeigner, Measurement and Tuning of Computer Systems, Prentice-Hall, Englewood Cliffs, NJ (983). Pâris, J.-F., D. D. E. Long and A. Glockner, A Realistic Evaluation of Consistency Algorithms for Replicated Files, Proc. 2 st Annual Simulation Symp., (988), pp. 2-30. Pâris, J.-F., Evaluating the Impact of Network Partitions on Replicated Data Availability, in J.F. Meyer, R. D. Schlichting (eds.) Dependable Computing for Critical Applications, Springer Verlag, (992), pp. 49-65.