Software Techniques for Dependable Computer-based Systems. Matteo SONZA REORDA
1 Software Techniques for Dependable Computer-based Systems Matteo SONZA REORDA
2 Summary
Introduction. State of the art: assertions, Algorithm-Based Fault Tolerance (ABFT), control flow checking, data duplication and comparison. A comprehensive approach. Conclusions.
Matteo SONZA REORDA, Politecnico di Torino
3 Introduction
Cost is becoming a major constraint for many safety-critical applications exploiting microprocessor-based systems. Software-implemented hardware fault tolerance (SIHFT) approaches are attractive: commercial components can be used, hardware redundancy can be avoided, and high flexibility can be reached.
4 SIHFT
The aim of SIHFT techniques is to address temporary faults in the hardware and to provide solutions based on software changes. SIHFT techniques are cheap and flexible, but introduce memory overhead and performance slow-down.
5 Evaluation parameters
Detection/correction capabilities. Cost: development, memory overhead, performance slow-down. Generality (e.g., with respect to the application characteristics). Flexibility (e.g., with respect to the addressed faults).
6 Assumptions
Microprocessor- (or microcontroller-) based systems. Only the software can be modified for system hardening purposes. Transient faults are addressed. Trading off between reliability and performance is acceptable. Software is correct (no bugs).
7 The dream
[Figure: an automatic transformation turns the software (SW) into a hardened version (SW*), giving a hardened system made of microprocessor (µP), memory and I/O devices.]
8 State of the art
Assertions. Algorithm-Based Fault Tolerance (ABFT). Control flow checking. Data duplication and comparison.
9 Assertions
Some invariant property is checked during code execution. Assertions were originally intended for identifying software bugs, but they can also detect hardware faults. Their insertion is often performed in an empirical way.
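As an illustration of the idea, here is a minimal sketch of executable assertions in C. It is not taken from the slides: the routine `average_temp`, the limits `TEMP_MIN` and `TEMP_MAX`, the `CHECK` macro and the `fault_detected` flag are all hypothetical names chosen for the example. The invariants checked are that each reading lies in the sensor range and that the average of in-range readings is itself in range.

```c
#include <stddef.h>

/* Hypothetical physical limits of a temperature sensor (illustrative values). */
#define TEMP_MIN (-40)
#define TEMP_MAX 125

/* Set to 1 when an invariant is violated; a real system would start recovery. */
static int fault_detected = 0;

/* Executable assertion: record the violation instead of aborting. */
#define CHECK(cond) do { if (!(cond)) fault_detected = 1; } while (0)

/* Average n readings; returns TEMP_MIN when n is 0. */
int average_temp(const int *samples, size_t n)
{
    long sum = 0;
    if (n == 0)
        return TEMP_MIN;
    for (size_t i = 0; i < n; i++) {
        /* Invariant 1: every reading lies within the sensor range. */
        CHECK(samples[i] >= TEMP_MIN && samples[i] <= TEMP_MAX);
        sum += samples[i];
    }
    int avg = (int)(sum / (long)n);
    /* Invariant 2: the average of in-range values is itself in range. */
    CHECK(avg >= TEMP_MIN && avg <= TEMP_MAX);
    return avg;
}
```

A bit flip in a high-order bit of a stored reading violates the first invariant, while a flip in a low-order bit keeps the value in range and goes unnoticed, which is one reason the detection capability of assertions tends to be limited.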
10 Limits
The attained fault detection capability is normally low. The insertion of assertions can hardly be automated.
11 ABFT
Introduced by Huang & Abraham in 1984. A set of techniques for hardening some common algorithms, e.g., matrix multiplication, matrix inversion, LU decomposition and FFT. It is based on encoding the input so that the algorithm produces an encoded output.
12 Example: matrix multiplication
Definitions, for an n-by-m matrix A:
A column checksum matrix of the matrix A is an (n+1)-by-m matrix $A_c$, which consists of the matrix A in the first n rows and a column summation vector in the (n+1)st row.
A row checksum matrix of the matrix A is an n-by-(m+1) matrix $A_r$, which consists of the matrix A in the first m columns and a row summation vector in the (m+1)st column.
The full checksum matrix $A_f$ of the matrix A is an (n+1)-by-(m+1) matrix, which is the column checksum matrix of the row checksum matrix of A.
13 Theorem
The result of a column checksum matrix $A_c$ multiplied by a row checksum matrix $B_r$ is a full checksum matrix $C_f$. The corresponding matrices A, B, and C satisfy $A \times B = C$.
14 Example
[Figure: matrix A extended with a checksum row, multiplied by matrix B extended with a checksum column, equals matrix C extended with both a checksum row and a checksum column.]
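The checksum construction can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the original Huang & Abraham code: the dimensions `N`, `M`, `P` and all function names are hypothetical, and integer arithmetic is used so that checksums can be compared exactly (with floating-point data a tolerance threshold would be needed).

```c
#define N 2   /* rows of A */
#define M 3   /* columns of A, rows of B */
#define P 2   /* columns of B */

/* Build the column checksum matrix A_c: A plus a last row of column sums. */
void make_column_checksum(int a[N][M], int ac[N + 1][M])
{
    for (int j = 0; j < M; j++) {
        ac[N][j] = 0;
        for (int i = 0; i < N; i++) {
            ac[i][j] = a[i][j];
            ac[N][j] += a[i][j];
        }
    }
}

/* Build the row checksum matrix B_r: B plus a last column of row sums. */
void make_row_checksum(int b[M][P], int br[M][P + 1])
{
    for (int i = 0; i < M; i++) {
        br[i][P] = 0;
        for (int j = 0; j < P; j++) {
            br[i][j] = b[i][j];
            br[i][P] += b[i][j];
        }
    }
}

/* Ordinary matrix product A_c * B_r; the result is the full checksum matrix C_f. */
void multiply(int ac[N + 1][M], int br[M][P + 1], int cf[N + 1][P + 1])
{
    for (int i = 0; i <= N; i++)
        for (int j = 0; j <= P; j++) {
            cf[i][j] = 0;
            for (int k = 0; k < M; k++)
                cf[i][j] += ac[i][k] * br[k][j];
        }
}

/* Return 1 if every row and column of C_f agrees with its checksum, else 0. */
int verify_full_checksum(int cf[N + 1][P + 1])
{
    for (int j = 0; j <= P; j++) {          /* column checks */
        int sum = 0;
        for (int i = 0; i < N; i++)
            sum += cf[i][j];
        if (sum != cf[N][j])
            return 0;
    }
    for (int i = 0; i <= N; i++) {          /* row checks */
        int sum = 0;
        for (int j = 0; j < P; j++)
            sum += cf[i][j];
        if (sum != cf[i][P])
            return 0;
    }
    return 1;
}
```

A single corrupted element of the result breaks exactly one row check and one column check, which is what allows the scheme to be extended from detection to localization of the faulty element.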
15 Improvements
ABFT can be tailored to different targets (fault detection, fault localization, etc.) by changing the adopted checksum.
16 Fault detection capabilities
ABFT normally has very good detection capabilities (>98%) with respect to faults in the data memory and faults in the arithmetic units. Some faults in the code memory are also detected.
17 Overhead
When used for fault detection, the ABFT overhead is 13% for memory and 32% for execution time. When tailored to single error correction, the overhead is 28% for memory and 73% for execution time.
18 ABFT: summary
Advantages: low memory overhead; low execution time overhead; high coverage of faults in the data.
Disadvantages: available for some algorithms only; high implementation cost; some faults escape (e.g., those affecting the data before encoding); determining the threshold for error identification can be difficult.
19 Control flow checking
Aims at detecting faults that change the execution flow of a program. It is based on dividing the program code into basic blocks, building the program graph, and checking at run time the correctness of each transition performed during program execution.
20 Basic blocks
A basic block is a maximal set of ordered instructions in which execution begins at the first instruction and terminates at the last instruction. There is no branching instruction in a basic block, except possibly for the last one. A basic block terminates at either an instruction branching to another basic block or an instruction receiving control from two or more places in the program.
21 Program graph
It is a graph where vertices are basic blocks and edges are transitions in the execution flow, i.e., branches, call and return instructions, etc.
22 Example
[Figure not preserved in the transcription.]
23 Checking control flow
It can be performed on-line by assigning a signature $s_i$ to each basic block $i$, computing an expected signature difference $d_i$ for each block, and evaluating during execution (each time a new block is entered) the function $f = G \oplus d_i$ on the run-time signature $G$, where $d_d = s_d \oplus s_s$, with $s_d$ and $s_s$ being the signatures of the entered basic block and of its predecessor, respectively. The transition is legal when the updated $G$ equals $s_d$.
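The signature update can be sketched in C as below. This is a simplified illustration, assuming XOR as the combining operation (as in the signature scheme of Oh et al.) and a hypothetical chain of four blocks in which every block has a single predecessor; the signature values and the names `sig`, `pred`, `cfc_reset` and `cfc_enter` are illustrative only.

```c
#define NUM_BLOCKS 4

/* Illustrative signatures, one per basic block, assigned before run time. */
static const unsigned sig[NUM_BLOCKS] = {0x1, 0x6, 0xB, 0xD};

/* Predecessor of each block in the hypothetical chain 0 -> 1 -> 2 -> 3.
 * Block 0 is the entry block and is listed as its own predecessor. */
static const int pred[NUM_BLOCKS] = {0, 0, 1, 2};

/* Run-time signature register G. */
static unsigned G;

/* Expected signature difference of a block: d_d = s_d XOR s_s. */
unsigned sig_diff(int block)
{
    return sig[block] ^ sig[pred[block]];
}

/* Start execution in the entry block. */
void cfc_reset(void)
{
    G = sig[0];
}

/* Called on entry to a block: update G and compare it with the block's
 * signature. Returns 1 for a legal transition, 0 when an error is detected. */
int cfc_enter(int block)
{
    G ^= sig_diff(block);
    return G == sig[block];
}
```

When the previous block really was the predecessor, G held $s_s$ and the update yields $s_s \oplus s_s \oplus s_d = s_d$; after an illegal jump G holds some other value and the comparison fails, unless two signatures happen to alias. Blocks with several predecessors need the additional handling mentioned in the slides and are not covered by this sketch.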
24 Example
[Figure not preserved in the transcription.]
25 Developing the idea
Suitable techniques have been developed for dealing with blocks that can be reached from several blocks and for optimizing the choice of the signatures assigned to blocks.
26 Experimental results
According to Oh et al., the percentage of branch faults escaping this mechanism is below 2.5%, memory overhead ranges from 26% to 64%, and execution time overhead ranges from 16% to 69%. The overhead mainly depends on the average basic block size. Some aliasing effects may arise.
27 Improvements
Several techniques have been proposed to reduce the overhead and improve the fault coverage by carefully choosing the signatures, cleverly performing the checks, and avoiding unnecessary checks.
28 Data duplication and comparison
Independently developed by a couple of groups in the last 5 years. Addresses faults in the data and code. Based on duplicating data and checking copies.
29 EDDI
Proposed in 2002 by Oh, Shirvani and McCluskey. Addresses faults affecting the code. Based on duplicating instructions and data at the assembly level. Reduces the execution time slow-down by careful instruction scheduling in superscalar architectures.
30 Example
Original code:
    ADD R3, R1, R2
Modified code:
    ADD R3, R1, R2
    ADD R23, R21, R22
    BNE R3, R23, error
31 EDDI: results
Detection capability for faults in the code segment exceeds 98%. Execution time overhead ranges from 72% to 111% for a two-way processor. Memory overhead is about 100%.
32 A comprehensive approach
Proposed by the Politecnico di Torino group in 1999. Main characteristics: works on high-level code (e.g., C language); limited assumptions on the addressed faults; high independence from the hardware; high fault coverage; its application can be automated; originally intended for fault detection only.
33 Basic idea
A set of rules has been defined. The rules transform high-level code into hardened code. The application of the rules can be automated.
34 Rules (I)
Duplicate every variable. Execute every operation on the two replicas. Check for consistency after each read access.
35 Example
Original code:
    int a, b;
    b = a+5;
Hardened code:
    int a1, b1, a2, b2;
    b1 = a1+5;
    b2 = a2+5;
    if(a1!=a2) error();
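To see the duplication rule catch a fault, the example can be wrapped in a small test bench. The sketch below is an assumption-laden illustration, not the authors' tool output: `set_a`, `inject_fault`, `hardened_step`, `on_error` and `error_flag` are hypothetical helpers, and the transient fault is simulated by XOR-ing a mask into one replica.

```c
/* Flag set by the error handler; stands in for the slide's error(). */
static int error_flag = 0;

static void on_error(void)
{
    error_flag = 1;
}

/* Two replicas of each variable of the original program. */
static int a1, a2, b1, b2;

/* Write the same value into both replicas of a. */
void set_a(int value)
{
    a1 = value;
    a2 = value;
}

/* Simulated transient fault: flip the bits selected by mask in replica a1. */
void inject_fault(int mask)
{
    a1 ^= mask;
}

/* Hardened form of: b = a + 5; */
void hardened_step(void)
{
    b1 = a1 + 5;        /* the operation is executed on both replicas */
    b2 = a2 + 5;
    if (a1 != a2)       /* consistency check after the read of a */
        on_error();
}
```

In a real build the compiler may merge the two replicas or drop the comparison as redundant, so the hardened code usually has to be compiled with such optimizations disabled or with the replicas declared `volatile`; that precaution is a general observation, not something stated in the slides.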
36 Rules (II)
Check the condition twice in every conditional statement.
37 Example
Original code:
    if(condition)
    {   /* block A */
    } else
    {   /* block B */
    }
Hardened code:
    if(condition)
    {   /* block A */
        if(!condition) error();   /* the condition must still be true here */
    } else
    {   /* block B */
        if(condition) error();    /* the condition must still be false here */
    }
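A runnable sketch of the double condition check follows. It is illustrative only: the condition `*x > limit`, the function `hardened_branch`, the handler `on_branch_error` and the `fault_mask` parameter, which simulates a transient fault striking between the two evaluations, are all assumptions made for the example.

```c
/* Flag set when the two evaluations of the condition disagree. */
static int branch_error = 0;

static void on_branch_error(void)
{
    branch_error = 1;
}

/* Hardened form of: if (x > limit) { A } else { B }.
 * fault_mask simulates a transient fault altering x between the first and
 * the second evaluation of the condition (0 means no fault).
 * Returns 1 if block A was taken, 0 if block B was taken. */
int hardened_branch(volatile int *x, int limit, int fault_mask)
{
    if (*x > limit) {
        *x ^= fault_mask;            /* simulated fault */
        if (!(*x > limit))           /* second check inside block A */
            on_branch_error();
        return 1;
    } else {
        *x ^= fault_mask;            /* simulated fault */
        if (*x > limit)              /* second check inside block B */
            on_branch_error();
        return 0;
    }
}
```

The `volatile` qualifier keeps the compiler from reusing the first evaluation, so the second check really re-reads the operand.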
38 Rules (III)
Introduce some control flow checking technique.
39 Rules (IV)
Additional rules have been defined to cover cycles and procedure call and return.
40 Rules (V)
Similar rules can be defined for other high-level languages.
41 Transformation tool
A prototype tool automating the application of the rules has been developed. It reads C code and produces hardened C code.
42 Overhead
The application of all the hardening rules to the whole code on unpipelined or RISC processor systems increases the size of the code and data areas (factors from 3 to 4 have been observed) and slows down program execution (factors of about 3 have been observed).
43 Experimental results
Several fault injection campaigns have been performed to evaluate the fault detection capabilities of the method: on a transputer-based system, on an 8051-based system, and on a LEON processor.
44 Experimental set up
Some sample benchmark programs have been selected. Hardened versions have been developed. Fault injection experiments have been performed.
45 Fault classification
Effect-less. No Answer. Latent. Wrong Answer (undetected incorrect output). Detection, either by hardware (through hardware mechanisms) or by software (through the transformation rules).
46 Transputer system: set up
Random SEUs are injected in memory. Software fault injection techniques have been exploited. Radiation experiments, performed at ONERA/DESP, Toulouse, France by the TIMA team, confirmed the fault injection results.
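The core of a software-based SEU injector is a single bit flip in a chosen memory location. The sketch below shows only that step, under the assumption that the campaign driver (not shown) picks the byte and bit at random; `inject_seu` and its parameters are hypothetical names and do not describe the tool actually used in these experiments.

```c
#include <stddef.h>
#include <stdint.h>

/* Flip one bit inside a memory region, modelling a single event upset.
 * byte_index selects the byte within the region, bit_index the bit (0..7). */
void inject_seu(void *region, size_t byte_index, unsigned bit_index)
{
    uint8_t *bytes = (uint8_t *)region;
    bytes[byte_index] ^= (uint8_t)(1u << (bit_index & 7u));
}
```

Injecting the same fault twice restores the original content, which makes it easy to reuse one memory image across many experiments.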
47 Faults in the memory data area
[Bar chart, values not preserved: percentage of Wrong Answer, SW Detected and Effect-less faults for the original and hardened programs, averaged over the benchmark programs.]
48 Faults in the memory code area
[Bar chart, values not preserved: percentage of Wrong Answer, No Answer, HW Detected, SW Detected and Effect-less faults for the original and hardened programs, averaged over the benchmark programs.]
49 8051 system: set up
Injected faults: SEUs in the code memory, SEUs in the processor memory elements (including hidden registers), and SETs in the combinational part. Emulation-based fault injection has been exploited.
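50 SEUs in the code memory
[Bar chart, values not preserved: Detected, Effect-less, Latent and Wrong Answer outcomes for the original and hardened programs; the number of injected SEUs is truncated in the source (",000").]
51 SEUs in processor elements
[Bar chart, values not preserved: Detected, Effect-less, Latent and Wrong Answer outcomes for the original and hardened programs; the number of injected SEUs is truncated in the source (",000").]
52 SETs on processor gates
[Bar chart, values mostly not preserved: Detected, Effect-less, Latent and Wrong Answer outcomes for the original and hardened programs; only the fragment "0 48" next to "Detected" survives, and the number of injected SETs is truncated in the source (",000").]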
53 LEON system: set up
Injected faults: SEUs in the processor memory elements (including register file and pipeline registers). Emulation-based fault injection has been exploited.
54 SEUs in pipeline registers
[Bar chart, values not preserved: Wrong Answer, Latent, Effect-less and Detected outcomes for the original and hardened programs; the number of injected SEUs is truncated in the source (",000").]
55 SEUs in register file
[Bar chart, values not preserved: Wrong Answer, Latent, Effect-less and Detected outcomes for the original and hardened programs; the number of injected SEUs is truncated in the source (",000").]
56 Observations (I)
The method detects faults not only in the external memory, but also inside the processor: cache, user registers, hidden registers (e.g., in the control unit or in the pipeline), and combinational logic.
57 Observations (II)
The method is able to detect any kind of fault creating a mismatch between the two replicas of a variable, and most of the faults affecting the code or the control part of the processor. The method is NOT able to detect some of the permanent faults affecting the execution units of the processor.
58 Observations (III)
The method can be applied flexibly: a subset of the rules may be applied (e.g., only duplication rules), a subset of the variables may be hardened, or a subset of the code may be hardened. In this way, the most suitable trade-off between detection capabilities and overhead can be attained.
59 Extension to fault tolerance
The rules can be extended to obtain fault tolerance. Only faults affecting data have been targeted. Some faults affecting the code can also be tolerated.
60 Rules (I)
Each variable is duplicated; two sets of variables, V0 and V1, are thus obtained. Every operation is repeated on the two replicas. A checksum variable is introduced for V0. The checksum is updated after each write operation.
61 Rules (II)
After each read operation, the two copies are checked for consistency. If a fault is detected, the checksum is exploited to identify the corrupted copy, and the safe copy is used to correct the other.
62 Example: original code
    int a, b;
    a = b+5;
63-68 Example: hardened code
    int a0, a1, b0, b1, c;  /* each variable is duplicated; a checksum variable c is introduced */
    c = c^a0;
    a0 = b0+5;              /* every operation is performed on the two copies */
    a1 = b1+5;
    c = c^a0;               /* c is updated: each time a variable is modified, the checksum must be updated */
    if(b0!=b1)              /* error detection: read variables are checked for correctness */
        if(chk()==c)        /* the checksum is exploited to identify the corrupted copy */
        {   b1 = b0;        /* b1 is wrong; correction takes place */
            a1 = a0;        /* a1 is wrong */
        } else
        {   b0 = b1;        /* b0 is wrong */
            a0 = a1;        /* a0 is wrong */
            c = chk();
        }
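The slides do not define `chk()`. The sketch below assumes it recomputes the XOR of all variables in V0, which is consistent with the incremental update `c = c^a0` performed before and after the write; `init_b` is a hypothetical helper added so that the example starts from a consistent state. Under those assumptions the hardened assignment can be exercised as follows.

```c
/* Replicated variables: V0 = {a0, b0}, V1 = {a1, b1}. */
static int a0, a1, b0, b1;

/* Checksum of V0, kept up to date incrementally. */
static int c;

/* Assumed definition: recompute the XOR checksum of V0 from scratch. */
static int chk(void)
{
    return a0 ^ b0;
}

/* Put both replica sets and the checksum into a consistent state. */
void init_b(int value)
{
    a0 = a1 = 0;
    b0 = b1 = value;
    c = chk();
}

/* Hardened, self-correcting form of: a = b + 5; */
void hardened_assign(void)
{
    c ^= a0;                 /* remove the old a0 from the checksum */
    a0 = b0 + 5;             /* operation on replica 0 */
    a1 = b1 + 5;             /* operation on replica 1 */
    c ^= a0;                 /* add the new a0 to the checksum */

    if (b0 != b1) {          /* the read variable b is inconsistent */
        if (chk() == c) {    /* V0 matches its checksum: V1 is wrong */
            b1 = b0;
            a1 = a0;
        } else {             /* V0 is corrupted: restore it from V1 */
            b0 = b1;
            a0 = a1;
            c = chk();
        }
    }
}
```

With b initialized to 7, flipping a bit of b0 before the assignment makes the recomputed checksum disagree with c, so V0 is restored from V1; flipping a bit of b1 instead leaves the checksum consistent, so V1 is restored from V0. In both cases a ends up as 12 in both replicas.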
69 Memory overhead
Data memory: variable duplication and the checksum introduce a memory overhead of about a factor of 2. Code memory: checksum computation and updating, operation duplication and coherency checking introduce an overhead of between a factor of 2 and 3.
70 Time overhead
It is caused by operation duplication, coherency checking, and checksum computation and updating. It amounts to about a factor of 2.5.
71 Experimental results
Set of benchmark C programs, in two versions: original and hardened. 8051-based system. Simulation-based fault injection experiments on both versions: random SEUs in data and code memory.
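72 Experimental results
[Bar chart (%): Effect-less, Wrong Answer and Time-Out outcomes for the original and hardened programs. Surviving value labels, in source order and with the first one truncated: ",48", "86,81", "12,71", "0,49", "0,64", "0,03"; annotation: "5 out of 1,000".]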
73 Conclusions
Software-implemented fault tolerance is promising (at least to complement other techniques). New techniques have been developed that are able to reduce development and product costs and to guarantee high detection/correction capabilities.
74 On-going work
Improving known techniques. Investigating hybrid solutions, in which duplication is performed in software and the consistency check is performed in hardware. Validating the method on real applications.
75 For more information
Matteo SONZA REORDA, Politecnico di Torino
More informationLecture 4: Instruction Set Design/Pipelining
Lecture 4: Instruction Set Design/Pipelining Instruction set design (Sections 2.9-2.12) control instructions instruction encoding Basic pipelining implementation (Section A.1) 1 Control Transfer Instructions
More informationAries: Transparent Execution of PA-RISC/HP-UX Applications on IPF/HP-UX
Aries: Transparent Execution of PA-RISC/HP-UX Applications on IPF/HP-UX Keerthi Bhushan Rajesh K Chaurasia Hewlett-Packard India Software Operations 29, Cunningham Road Bangalore 560 052 India +91-80-2251554
More informationFault Tolerant and BIST design of a FIFO cell
Fault Tolerant and design of a FIFO cell F. Corno, P. Prinetto, M. Sonza Reorda Politecnico di Torino Dipartimento di Automatica e Informatica Torino, Italy Abstract * This paper presents a design of a
More informationLECTURE 10. Pipelining: Advanced ILP
LECTURE 10 Pipelining: Advanced ILP EXCEPTIONS An exception, or interrupt, is an event other than regular transfers of control (branches, jumps, calls, returns) that changes the normal flow of instruction
More informationRTL Power Estimation and Optimization
Power Modeling Issues RTL Power Estimation and Optimization Model granularity Model parameters Model semantics Model storage Model construction Politecnico di Torino Dip. di Automatica e Informatica RTL
More informationSelf-repairing in a Micro-programmed Processor for Dependable Applications
Self-repairing in a Micro-programmed Processor for Dependable Applications A. BENSO, S. CHUSANO, P. PRNETTO, P. SMONOTT, G UGO Politecnico di Torino Dipartimento di Automatica e nformatica Corso duca degli
More informationEvaluating the Fault Tolerance Capabilities of Embedded Systems via BDM
Evaluating the Fault Tolerance Capabilities of Embedded Systems via BDM M. Rebaudengo, M. Sonza Reorda Politecnico di Torino Dipartimento di Automatica e Informatica Torino, Italy Abstract * Fault Injection
More informationError Resilience in Digital Integrated Circuits
Error Resilience in Digital Integrated Circuits Heinrich T. Vierhaus BTU Cottbus-Senftenberg Outline 1. Introduction 2. Faults and errors in nano-electronic circuits 3. Classical fault tolerant computing
More informationCS 352H Computer Systems Architecture Exam #1 - Prof. Keckler October 11, 2007
CS 352H Computer Systems Architecture Exam #1 - Prof. Keckler October 11, 2007 Name: Solutions (please print) 1-3. 11 points 4. 7 points 5. 7 points 6. 20 points 7. 30 points 8. 25 points Total (105 pts):
More informationHAFT Hardware-Assisted Fault Tolerance
HAFT Hardware-Assisted Fault Tolerance Dmitrii Kuvaiskii Rasha Faqeh Pramod Bhatotia Christof Fetzer Technische Universität Dresden Pascal Felber Université de Neuchâtel Hardware Errors in the Wild Online
More informationDLX Unpipelined Implementation
LECTURE - 06 DLX Unpipelined Implementation Five cycles: IF, ID, EX, MEM, WB Branch and store instructions: 4 cycles only What is the CPI? F branch 0.12, F store 0.05 CPI0.1740.83550.174.83 Further reduction
More informationEffective Software-Based Self-Test Strategies for On-Line Periodic Testing of Embedded Processors
Effective Software-Based Self-Test Strategies for On-Line Periodic Testing of Embedded Processors Antonis Paschalis Department of Informatics & Telecommunications University of Athens, Greece paschali@di.uoa.gr
More informationHardware-based Speculation
Hardware-based Speculation M. Sonza Reorda Politecnico di Torino Dipartimento di Automatica e Informatica 1 Introduction Hardware-based speculation is a technique for reducing the effects of control dependences
More informationCS 640 Introduction to Computer Networks. Role of data link layer. Today s lecture. Lecture16
Introduction to Computer Networks Lecture16 Role of data link layer Service offered by layer 1: a stream of bits Service to layer 3: sending & receiving frames To achieve this layer 2 does Framing Error
More informationArea Versus Detection Latency Trade-Offs in Self-Checking Memory Design
Area Versus Detection Latency Trade-Offs in Self-Checking Memory Design Omar Kebichi *, Yervant Zorian**, Michael Nicolaidis* * Reliable Integrated Systems Group, TIMA / INPG, 46 avenue Félix Viallet 38031
More informationFault Tolerant Parallel Filters Based on ECC Codes
Advances in Computational Sciences and Technology ISSN 0973-6107 Volume 11, Number 7 (2018) pp. 597-605 Research India Publications http://www.ripublication.com Fault Tolerant Parallel Filters Based on
More informationVerifying the Correctness of the PA 7300LC Processor
Verifying the Correctness of the PA 7300LC Processor Functional verification was divided into presilicon and postsilicon phases. Software models were used in the presilicon phase, and fabricated chips
More informationReport on automatic generation of test benches from system-level descriptions
COTEST/D2 Report on automatic generation of test benches from system-level descriptions Olga GOLOUBEVA, Matteo SONZA REORDA, Massimo VIOLANTE Politecnico di Torino Dipartimento di Automatica e Informatica
More informationOn the Automatic Generation of Software-Based Self-Test Programs for Functional Test and Diagnosis of VLIW Processors
On the Automatic Generation of Software-Based Self-Test Programs for Functional Test and Diagnosis of VLIW Processors Davide Sabena, Luca Sterpone, and Matteo Sonza Reorda Dipartimento di Automatica e
More informationChapter 3 (Cont III): Exploiting ILP with Software Approaches. Copyright Josep Torrellas 1999, 2001, 2002,
Chapter 3 (Cont III): Exploiting ILP with Software Approaches Copyright Josep Torrellas 1999, 2001, 2002, 2013 1 Exposing ILP (3.2) Want to find sequences of unrelated instructions that can be overlapped
More informationLogic Bug Detection and Localization Using Symbolic Quick Error Detection
Logic Bug Detection and Localization Using Symbolic Quick Error Detection 1 Logic Bug Detection and Localization Using Symbolic Quick Error Detection Eshan Singh, David Lin, Clark Barrett, and Subhasish
More informationSelf-checking combination and sequential networks design
Self-checking combination and sequential networks design Tatjana Nikolić Faculty of Electronic Engineering Nis, Serbia Outline Introduction Reliable systems Concurrent error detection Self-checking logic
More informationHW/SW Co-Detection of Transient and Permanent Faults with Fast Recovery in Statically Scheduled Data Paths
HW/SW Co-Detection of Transient and Permanent Faults with Fast Recovery in Statically Scheduled Data Paths Mario Schölzel Department of Computer Science Brandenburg University of Technology Cottbus, Germany
More informationOn the Optimal Design of Triple Modular Redundancy Logic for SRAM-based FPGAs
On the Optimal Design of Triple Modular Redundancy Logic for SRAM-based FPGAs F. Lima Kastensmidt, L. Sterpone, L. Carro, M. Sonza Reorda To cite this version: F. Lima Kastensmidt, L. Sterpone, L. Carro,
More informationAdvanced Computer Architecture
Advanced Computer Architecture Chapter 1 Introduction into the Sequential and Pipeline Instruction Execution Martin Milata What is a Processors Architecture Instruction Set Architecture (ISA) Describes
More informationS-SETA: Selective Software-Only Error-Detection Technique Using Assertions
IEEE TRANSACTIONS ON NUCLEAR SCIENCE 1 S-SETA: Selective Software-Only Error-Detection Technique Using Assertions Eduardo Chielle, Gennaro S. Rodrigues, Fernanda L. Kastensmidt, Sergio Cuenca-Asensi, Lucas
More informationRISC & Superscalar. COMP 212 Computer Organization & Architecture. COMP 212 Fall Lecture 12. Instruction Pipeline no hazard.
COMP 212 Computer Organization & Architecture Pipeline Re-Cap Pipeline is ILP -Instruction Level Parallelism COMP 212 Fall 2008 Lecture 12 RISC & Superscalar Divide instruction cycles into stages, overlapped
More informationDriver Assistance Pushes New Flash Functionalities
Driver Assistance Pushes New Flash Functionalities Anil Gupta Technical Executive Winbond Electronics Corporation Santa Clara, CA 1 Automotive and ADAS terminology ECC use to increase reliability of Flash
More informationTU Wien. Fault Isolation and Error Containment in the TT-SoC. H. Kopetz. TU Wien. July 2007
TU Wien 1 Fault Isolation and Error Containment in the TT-SoC H. Kopetz TU Wien July 2007 This is joint work with C. El.Salloum, B.Huber and R.Obermaisser Outline 2 Introduction The Concept of a Distributed
More informationControl Hazards. Prediction
Control Hazards The nub of the problem: In what pipeline stage does the processor fetch the next instruction? If that instruction is a conditional branch, when does the processor know whether the conditional
More informationFault mitigation strategies for CUDA GPUs
Fault mitigation strategies for CUDA GPUs Stefano Di Carlo, Giulio Gambardella, Ippazio Martella, Paolo Prinetto, Daniele Rolfo, Pascal Trotta Politecnico di Torino Dipartimento di Automatica e Informatica
More informationDependability Threats
Dependable Systems Dependability Threats Dr. Peter Tröger Operating Systems Group Dependability Dependability is defined as the trustworthiness of a computer system such that reliance can justifiable be
More informationChapter 3. The Data Link Layer. Wesam A. Hatamleh
Chapter 3 The Data Link Layer The Data Link Layer Data Link Layer Design Issues Error Detection and Correction Elementary Data Link Protocols Sliding Window Protocols Example Data Link Protocols The Data
More informationKESO Functional Safety and the Use of Java in Embedded Systems
KESO Functional Safety and the Use of Java in Embedded Systems Isabella S1lkerich, Bernhard Sechser Embedded Systems Engineering Kongress 05.12.2012 Lehrstuhl für Informa1k 4 Verteilte Systeme und Betriebssysteme
More informationOn the Consolidation of Mixed Criticalities Applications on Multicore Architectures
J Electron Test (2017) 33:65 76 DOI 10.1007/s10836-016-5636-7 On the Consolidation of Mixed Criticalities Applications on Multicore Architectures Stefano Esposito 1 Massimo Violante 1 Received: 21 July
More informationDual-Core Execution: Building A Highly Scalable Single-Thread Instruction Window
Dual-Core Execution: Building A Highly Scalable Single-Thread Instruction Window Huiyang Zhou School of Computer Science University of Central Florida New Challenges in Billion-Transistor Processor Era
More informationDynamic Control Hazard Avoidance
Dynamic Control Hazard Avoidance Consider Effects of Increasing the ILP Control dependencies rapidly become the limiting factor they tend to not get optimized by the compiler more instructions/sec ==>
More informationFAST FIR FILTERS FOR SIMD PROCESSORS WITH LIMITED MEMORY BANDWIDTH
Key words: Digital Signal Processing, FIR filters, SIMD processors, AltiVec. Grzegorz KRASZEWSKI Białystok Technical University Department of Electrical Engineering Wiejska
More informationA Performance Degradation Tolerable Cache Design by Exploiting Memory Hierarchies
A Performance Degradation Tolerable Cache Design by Exploiting Memory Hierarchies Abstract: Performance degradation tolerance (PDT) has been shown to be able to effectively improve the yield, reliability,
More informationBuilding Dependable COTS Microkernel-based Systems using MAFALDA
Building Dependable COTS Microkernel-based Systems using MAFALDA Jean-Charles Fabre, Manuel Rodríguez, Jean Arlat, Frédéric Salles and Jean-Michel Sizun LAAS-CNRS Toulouse, France PRDC-2000, UCLA, Los
More informationLecture 22: Fault Tolerance
Lecture 22: Fault Tolerance Papers: Token Coherence: Decoupling Performance and Correctness, ISCA 03, Wisconsin A Low Overhead Fault Tolerant Coherence Protocol for CMP Architectures, HPCA 07, Spain Error
More informationREAL TIME DIGITAL SIGNAL PROCESSING
REAL TIME DIGITAL SIGNAL PROCESSING UTN - FRBA 2011 www.electron.frba.utn.edu.ar/dplab Introduction Why Digital? A brief comparison with analog. Advantages Flexibility. Easily modifiable and upgradeable.
More informationUsing a Swap Instruction to Coalesce Loads and Stores
Using a Swap Instruction to Coalesce Loads and Stores Apan Qasem, David Whalley, Xin Yuan, and Robert van Engelen Department of Computer Science, Florida State University Tallahassee, FL 32306-4530, U.S.A.
More informationUniversität Dortmund. ARM Architecture
ARM Architecture The RISC Philosophy Original RISC design (e.g. MIPS) aims for high performance through o reduced number of instruction classes o large general-purpose register set o load-store architecture
More information