Structuring Redundancy for Fault Tolerance. CSE 598D: Fault Tolerant Software


What do we want to achieve?
[Figure: inputs feed Version 1, Version 2, ..., Version N; a voter combines their results into the outputs. Labels: error detection, damage assessment, state restoration, continued service.]

Robust Software
- The robust software approach does not use redundancy
- Robustness: the extent to which software can continue to operate correctly despite the introduction of invalid inputs, as defined in the program specification
- Handles:
  - Out-of-range inputs
  - Inputs of the wrong type
  - Inputs in the wrong format
- What happens when invalid input is detected?

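Robust Software
[Flowchart: Inputs -> "Valid input?". If true: continue software operation and produce the result. If false: request new input, OR use the last acceptable value, OR use a predefined default value; then raise an exception flag, handle exceptions, and continue software operation.]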
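The flowchart can be sketched in Python roughly as follows. This is a minimal illustration, not code from the lecture: the class name `RobustReader`, the range limits, and the default value are all assumptions made for the example.

```python
from typing import Optional


class RobustReader:
    """Hypothetical input filter: validate, else fall back and raise a flag."""

    def __init__(self, low: float, high: float, default: float) -> None:
        self.low = low
        self.high = high
        self.default = default                    # predefined default value
        self.last_good: Optional[float] = None    # last acceptable value
        self.exception_flag = False

    def read(self, raw: object) -> float:
        self.exception_flag = False
        try:
            value = float(raw)                    # rejects wrong type or wrong format
            if not (self.low <= value <= self.high):
                raise ValueError("out of range")  # rejects out-of-range input
        except (TypeError, ValueError):
            # Invalid input: raise the flag and substitute a fallback value.
            self.exception_flag = True
            if self.last_good is not None:
                value = self.last_good
            else:
                value = self.default
        else:
            self.last_good = value
        return value                              # software operation continues


reader = RobustReader(low=0.0, high=100.0, default=20.0)
print(reader.read("42.5"), reader.exception_flag)
print(reader.read("hot"), reader.exception_flag)
```

Requesting new input is the third option in the flowchart; it is left out here because it depends on the input source.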

Self-Checking Software
- Testing the input data by, for example, error-detecting codes and data type checks
- Testing the control sequences by, for example, setting bounds on loop iterations
- Testing the function of the process by, for example, performing a reasonableness check on the output
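A small sketch can show the three kinds of self-check in one routine. The function `checked_sqrt`, the iteration bound, and the tolerances are assumptions chosen for illustration only.

```python
import math

MAX_ITERATIONS = 100  # assumed bound on loop iterations


def checked_sqrt(x: float) -> float:
    """Newton's method for the square root, with three self-checks."""
    # Check 1: input data (type and range).
    if isinstance(x, bool) or not isinstance(x, (int, float)):
        raise ValueError("input must be a number")
    if not math.isfinite(x) or x < 0:
        raise ValueError("input must be finite and non-negative")
    if x == 0:
        return 0.0

    # Check 2: control sequence (bounded number of iterations).
    guess = float(x) if x >= 1 else 1.0
    for _ in range(MAX_ITERATIONS):
        next_guess = 0.5 * (guess + x / guess)
        converged = abs(next_guess - guess) <= 1e-12 * next_guess
        guess = next_guess
        if converged:
            break
    else:
        raise RuntimeError("iteration bound exceeded")

    # Check 3: reasonableness of the output.
    if abs(guess * guess - x) > 1e-9 * max(1.0, x):
        raise RuntimeError("output failed the reasonableness check")
    return guess


print(checked_sqrt(2.0))
```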

Assertions
- An assertion is a statement that enables you to test your assumptions about your program at specific points
- For example, if you write a function that calculates the speed of a particle, you might assert that the calculated speed is less than the speed of light
- Each assertion contains a Boolean expression that you believe will be true when the assertion executes. If it is not true, the system will throw an error.
- By verifying that the Boolean expression is indeed true, the assertion confirms your assumptions about the behavior of your program, increasing your confidence that the program is free of errors
Q: Why do we need assertions if we have an exception mechanism?
A: Exceptions are primarily used to handle unusual conditions arising during program execution. Assertions are used to specify conditions that a programmer assumes are true. When programming, if a programmer can swear that the value being passed into a particular method is positive no matter what a calling client passes, this can be documented with an assertion. Exceptions handle abnormal conditions arising in the course of the program; however, they do not guarantee smooth or correct execution of the program. Assertions help state scenarios that ensure the program is running smoothly. Assertions can be efficient tools to ensure correct execution of a program. They improve confidence in the program.
- Not very new: see Bob Floyd's original paper "Assigning meanings to programs" (1967)
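The particle-speed example might look like this in Python. The function name and its parameters are hypothetical; the exception guards against a bad call, while the assertion states what the programmer believes must hold afterwards.

```python
SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def particle_speed(distance_m: float, elapsed_s: float) -> float:
    # Exception: an unusual condition caused by the caller.
    if elapsed_s <= 0:
        raise ValueError("elapsed time must be positive")

    speed = distance_m / elapsed_s

    # Assertion: an assumption about the program's own result.
    assert abs(speed) < SPEED_OF_LIGHT, "speed is not below the speed of light"
    return speed


print(particle_speed(100.0, 10.0))
```

Note that Python skips `assert` statements when run with the `-O` option, which fits the slide's point: assertions document and check assumptions, they are not the error-handling mechanism.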

Robust Software
+ Detection of errors in the development and test process
- Cannot detect and tolerate less specific errors

Design Diversity
- Redundant, exact copies of software components alone cannot increase reliability
- Diversity: provision of identical services through separate designs and implementations, called modules, versions, variants, or alternatives
- Goal: make the variants as diverse and independent as possible, with the ultimate objective being the minimization of identical error causes
- When the variants fail, we want them to fail on disjoint subsets of the input space
- We want the reliability of the variants to be as high as possible (at least one variant will be operational at all times)

Design Diversity
- Begins with an initial requirements specification
- Specifications may also employ diversity (as long as functional equivalency is maintained)
- Each developer or development organization implements the variant to the specification and provides the outputs required by the specification

Design Diversity
[Figure: Variant 1, Variant 2, ..., Variant n feed a decider, which classifies the result as correct or incorrect.]
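A sketch of the variants-plus-decider structure is given below. The three variants, the seeded fault in the third, and the choice of a majority vote as the decider are all assumptions for illustration; the slides do not prescribe a particular decision algorithm.

```python
from collections import Counter
from typing import Callable, List, Sequence


def variant_1(values: Sequence[int]) -> int:
    return max(values)


def variant_2(values: Sequence[int]) -> int:
    return sorted(values)[-1]


def variant_3(values: Sequence[int]) -> int:
    # Seeded design fault: wrong unless the first element is the largest.
    return values[0]


def majority_decider(results: List[int]) -> int:
    """Return the result that a strict majority of variants agree on."""
    value, count = Counter(results).most_common(1)[0]
    if 2 * count <= len(results):
        raise RuntimeError("no majority among variant results")
    return value


def run_variants(variants: Sequence[Callable[[Sequence[int]], int]],
                 values: Sequence[int]) -> int:
    return majority_decider([variant(values) for variant in variants])


print(run_variants([variant_1, variant_2, variant_3], [3, 9, 4]))
```

The faulty variant is outvoted here because the other two fail on a disjoint (in this case empty) part of the input space; if two variants shared the same fault, the vote would return the wrong value.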

Variants and Adjudicator and Cost
- When significant independence in the variants' failure profiles can be achieved, a simple and efficient adjudicator can be used, and design diversity provides effective error recovery from design faults
- It is likely, however, that completely independent development cannot be achieved in practice
- Is design diversity costly?

Case Study
- Bishop presents a useful review of the research in this area
- Summarized findings:
  - A significant proportion of the faults found in the experiments were similar
  - The major cause of the common faults was the specification (any solution?)
  - The major deficiencies in the specifications were incompleteness and ambiguity. This caused or forced the programmer to make some incorrect and potentially common design choices
  - Diverse design specifications can potentially reduce specification-related common faults
  - In general, fewer faults seem to occur in strongly typed, tightly structured languages such as Modula-2 and Ada, while low-level assembler has the worst performance in terms of fault tolerance
  - A significant improvement in the reduction of identical and very similar faults was found by using the N-version design paradigm

Levels of Diversity
- Two aspects of the level of fault tolerance to consider:
  - Determining at what level of detail to decompose the system into modules that will be diversified
  - Determining which layers of the system to diversify (hardware, application software, system software, operators, and the interfaces between these components)
- Multilayer diversity?
- Problems: cost and speed

Systematic Diversity
- One way to add diversity at a potentially lower cost is systematic diversity, although it is typically used as a software technique for tolerating hardware faults
  - Utilization of different processor registers in the variants
  - Transformation of mathematical expressions
  - Different implementations of programming structures
  - Different memory usages
  - Using complementary branching conditions in the variants by transforming the branch statements
  - Different compilers, libraries, and linkers
  - Different optimization and code-generation options
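Two of the listed transformations, rewriting a mathematical expression and using a complementary branch condition, can be sketched as below. The functions and the computation are made up for the example; integer arithmetic is used so that the two variants agree exactly when no fault occurs.

```python
def variant_a(a: int, b: int, c: int, limit: int) -> int:
    # Branch on "a >= limit"; factored expression.
    if a >= limit:
        return limit * (b + c)
    return a * (b + c)


def variant_b(a: int, b: int, c: int, limit: int) -> int:
    # Complementary branch "a < limit"; distributed expression.
    if a < limit:
        return a * b + a * c
    return limit * b + limit * c


def compare_variants(a: int, b: int, c: int, limit: int) -> int:
    """Run both variants and flag any disagreement as a detected fault."""
    first = variant_a(a, b, c, limit)
    second = variant_b(a, b, c, limit)
    if first != second:
        raise RuntimeError("variants disagree: possible hardware fault")
    return first


print(compare_variants(2, 3, 4, 10))
```

Because both variants come from the same design, this arrangement targets hardware faults, as the slide notes; a design fault would appear in both.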

Data Diversity
- Limitations of some design diverse techniques led to the development of data diverse software fault tolerance techniques
- Data diverse techniques are meant to complement, rather than replace, design diverse techniques
- Steps:
  - Obtain a related set of points in the program data space, executing the same software on those points
  - Use a decision algorithm to determine the resulting output

Failure Domain and Failure Region
- Failure domain: the set of input points that cause program failure
- Failure region: the geometry of the failure domain
- The input space of most programs is a hyperspace of many dimensions
  - E.g., if a program reads and processes a set of 25 floating-point numbers, its input space has 25 dimensions
- The valid program space is defined by the specifications and by tested values and ranges

Basic Data Re-expression
[Figure: x -> execute P -> P(x); x -> re-expression y = R(x) -> execute P -> P(y).]
- The program, P, and R determine the relationship between P(x) and P(y)

Re-expression with Postexecution Adjustment
[Figure: x -> execute P -> P(x); x -> re-expression y = R(x) -> execute P -> P(y) -> adjust for re-expression -> A(P(y)).]
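The following sketch shows re-expression with a post-execution adjustment moving an input out of a failure region. Everything in it is an assumption made for illustration: the program `peak` with its seeded fault, the offset used by `re_express`, and the use of an acceptance test as the decision algorithm.

```python
from typing import List, Sequence

OFFSET = 1000  # assumed shift applied by the re-expression


def peak(values: Sequence[int]) -> int:
    """Program P with a seeded fault: starting from 0 is wrong
    whenever every input value is negative."""
    best = 0
    for v in values:
        if v > best:
            best = v
    return best


def re_express(values: Sequence[int]) -> List[int]:
    """R(x): shift every value by OFFSET."""
    return [v + OFFSET for v in values]


def adjust(result: int) -> int:
    """A(P(y)): undo the shift on the output."""
    return result - OFFSET


def acceptable(values: Sequence[int], result: int) -> bool:
    """Acceptance test: the result is an input value and no input exceeds it."""
    return result in values and all(result >= v for v in values)


def peak_with_data_diversity(values: Sequence[int]) -> int:
    result = peak(values)                        # P(x)
    if acceptable(values, result):
        return result
    result = adjust(peak(re_express(values)))    # A(P(R(x)))
    if acceptable(values, result):
        return result
    raise RuntimeError("no acceptable result")


print(peak([-5, -2, -9]), peak_with_data_diversity([-5, -2, -9]))
```

The re-expression only helps when the shifted input lands outside the failure region; an input whose values are all below `-OFFSET` would still fail and be rejected by the acceptance test.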

Re-expression via Decomposition and Recombination
[Figure: x -> execute P -> P(x); x -> decompose x into x_1, x_2, ..., x_N -> execute P on each part to obtain P(x_1), P(x_2), ..., P(x_N) -> recombine the P(x_i) -> F(P(x_i)).]

Sets in the Output Space
- Valid output set: {y | Valid(x, P_c(y))}
- Failure set: {y | not Valid(y, P_c(y))}
- Identical output set: {y | Correct(x, P_c(y))}
- These sets are important in the development of data re-expression algorithms

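Data Re-expression
[Figure: the re-expression R(x) = y maps the point x, which lies in the failure region of the input space, to a point y. The output space shows the failure set, the identical output set, and the valid output set.]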

Examples of Data Re-expression
- Intersection of line segments (exact)
- Sort function (exact)
- Sensor data (approximate)
- What about re-expression via decomposition and recombination?
    sin(a + b) = sin(a) cos(b) + cos(a) sin(b)
    cos(a) = sin(π/2 - a)
    sin(a + b) = sin(a) sin(π/2 - b) + sin(π/2 - a) sin(b)
- Data re-expression can be used on numeric data, character strings, differential equations, and other representations. For example, combining tree transformations, data storage re-ordering, and code storage re-ordering provides considerable diversity in the data processed by large fractions of a conventional compiler
- Caution: exact re-expression algorithms may have the defect of preserving precisely those aspects of the data that cause program failure
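The sine identity above can be exercised directly. In this sketch the program P is the library sine, the decomposition splits x into a + b, and the recombination is the last identity; the split point and the tolerance are assumptions.

```python
import math


def sin_recombined(a: float, b: float) -> float:
    """sin(a + b) computed only from sine evaluations at other points."""
    half_pi = math.pi / 2
    return (math.sin(a) * math.sin(half_pi - b)
            + math.sin(half_pi - a) * math.sin(b))


def sin_checked(x: float, split: float = 0.5, tol: float = 1e-9) -> float:
    a = x * split            # decompose x into a + b
    b = x - a
    direct = math.sin(x)                 # P(x)
    recombined = sin_recombined(a, b)    # F(P(x_1), ..., P(x_N))
    if abs(direct - recombined) > tol:
        raise RuntimeError("direct and recombined results disagree")
    return direct


print(sin_checked(1.2))
```

The recombined value never evaluates the sine at x itself, so a fault that affects only a small region around x need not affect it.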

Temporal Diversity
- Temporal diversity involves the performance or occurrence of an event at different times
  - E.g., beginning software execution at different times; effective for transient faults
- Temporal diversity by using data produced at different times can also provide inputs to a data diverse technique: temporal skewing of data
[Figure: input is received at times t_i, t_i+1, and t_i+2; each input goes through software execution, and the results are adjudicated and then accepted, or rejected and discarded.]
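Temporal skewing might be sketched as follows: the same software runs on samples received at three successive times, and an adjudicator discards a transient outlier. The `convert` function, the tolerance, and the median-based adjudicator are assumptions for the example, not the method from the lecture.

```python
import statistics
from typing import List, Sequence


def convert(raw: float) -> float:
    """Hypothetical software under execution: scale a raw sensor reading."""
    return raw * 0.5


def adjudicate(results: Sequence[float], tolerance: float) -> float:
    """Accept the median if a strict majority of results lie close to it."""
    median = statistics.median(results)
    close = [r for r in results if abs(r - median) <= tolerance]
    if 2 * len(close) <= len(results):
        raise RuntimeError("results rejected: no agreement across time")
    return median


def run_skewed(samples: Sequence[float], tolerance: float) -> float:
    results: List[float] = [convert(s) for s in samples]
    return adjudicate(results, tolerance)


# Samples received at t_i, t_i+1, t_i+2; the third carries a transient glitch.
print(run_skewed([20.0, 20.2, 110.0], tolerance=0.5))
```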

Architectural Structure for Diverse Software
- To aid in the avoidance of faults in the first place and the tolerance of the remaining faults, the system complexity must be controlled
- Structuring the hardware and software components that comprise these systems is a key factor in controlling the complexity
- Laprie and colleagues describe two structuring mechanisms:
  - Layering: we want each layer to have the fault tolerance mechanisms to handle the errors produced in that layer
  - Error confinement areas: described in terms of the system hardware and software architecture elements

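Xu and Randell Framework
- Targets the development of fault-tolerant applications
- Two abstract classes: Adjudicator and Variant
[Figure: class hierarchy. Adjudicator has subclasses Voter-1 and Voter-2, with Complex Voter-1 below them (user-defined adjudicators). Variant has subclasses Variant-1 and Variant-2, with Complex Variant-1 below them (user-defined variant hierarchy).]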
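The two-abstract-class idea can be sketched in Python as below. Only the class names `Adjudicator` and `Variant` come from the slide; the method names `execute` and `adjudicate`, the concrete subclasses, and the `run` helper are hypothetical and are not the framework's actual interface.

```python
from abc import ABC, abstractmethod
from collections import Counter
from typing import List, Sequence


class Variant(ABC):
    @abstractmethod
    def execute(self, x: int) -> int:
        """Compute the service result for input x."""


class Adjudicator(ABC):
    @abstractmethod
    def adjudicate(self, results: List[int]) -> int:
        """Select one output from the variants' results."""


class SquareByMultiply(Variant):
    def execute(self, x: int) -> int:
        return x * x


class SquareByPower(Variant):
    def execute(self, x: int) -> int:
        return x ** 2


class SquareByOddSum(Variant):
    def execute(self, x: int) -> int:
        # The sum of the first |x| odd numbers equals x squared.
        return sum(2 * i + 1 for i in range(abs(x)))


class MajorityVoter(Adjudicator):
    def adjudicate(self, results: List[int]) -> int:
        value, count = Counter(results).most_common(1)[0]
        if 2 * count <= len(results):
            raise RuntimeError("no majority")
        return value


def run(variants: Sequence[Variant], adjudicator: Adjudicator, x: int) -> int:
    return adjudicator.adjudicate([v.execute(x) for v in variants])


variants = [SquareByMultiply(), SquareByPower(), SquareByOddSum()]
print(run(variants, MajorityVoter(), 7))
```

Keeping the adjudicator behind its own abstract class lets an application swap the voting policy without touching the variants, which is the separation the class diagram suggests.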