Fault-Tolerant Computing

Fault-Tolerant Computing Dealing with Low-Level Impairments Slide 1

About This Presentation
This presentation has been prepared for the graduate course ECE 257A (Fault-Tolerant Computing) by Behrooz Parhami, Professor of Electrical and Computer Engineering at University of California, Santa Barbara. The material contained herein can be used freely in classroom teaching or any other educational setting. Unauthorized uses are prohibited. Behrooz Parhami
Edition: First
Slide 2

Slide 3

The good news is that the tests don't show any other problems
Slide 4

Multilevel Model
[Figure: multilevel model of impairments. Levels: Component, Logic, Information, System, Service, Result. States: Ideal, Defective, Faulty, Erroneous, Malfunctioning, Degraded, Failed. Groupings: Low-Level Impaired, Mid-Level Impaired, High-Level Impaired. Legend: Initial entry, Deviation, Remedy, Tolerance. Annotations: Today, Next lecture.]
Slide 5

Overview of Fault Testing
FAULT TESTING (Engineering, Manufacturing, Maintenance)
Correct design? Correct implementation? Correct operation?
TEST GENERATION (Preset/Adaptive)
  FUNCTIONAL (Exhaustive/Heuristic)
  STRUCTURAL (Analytic/Heuristic)
TEST VALIDATION
  THEORETICAL
  EXPERIMENTAL
TEST APPLICATION
  EXTERNALLY CONTROLLED
  INTERNALLY CONTROLLED
Lower-level items, left to right in the diagram:
  FAULT MODEL: switch- or gate-level (single/multiple stuck-at, bridging, etc.)
  FAULT COVERAGE
  DIAGNOSIS EXTENT: none (checkout, go/no-go) to full resolution
  ALGORITHM: D-algorithm, Boolean difference, etc.
  SIMULATION: software (parallel, deductive, concurrent) or hardware (simulation engine)
  FAULT INJECTION
  MANUAL, AUTOMATIC (ATE), TEST MODE (BIST): off-line testing
  CONCURRENT: on-line testing (self-checked design)
Slide 6

Requirements and Setup for Testing
[Figure: a test pattern source drives the circuit under test (CUT); a comparator checks the CUT output against a reference value and reports Pass/Fail.]
Easier to test if direct access to some inner points is possible.
Testability requires controllability and observability (redundancy may reduce testability if we are not careful; e.g., TMR).
Reference value can come from a gold version or from a table.
Test patterns may be randomly generated, come from a preset list, or be selected according to previous test outcomes.
Test results may be compressed into a signature before comparing.
Test application may be off-line or on-line (concurrent).
Slide 7

Importance and Limitations of Testing
Important to detect faults as early as possible.
Approximate cost of catching a fault at various levels:
  Component   $1
  Board       $10
  System      $100
  Field       $1000
Test coverage may be well below 100% (model inaccuracies and impossibility of dealing with all combinations of the modeled faults).
"Trying to improve software quality by increasing the amount of testing is like trying to lose weight by weighing yourself more often." Steve C. McConnell
"Program testing can be used to show the presence of bugs, but never to show their absence!" Edsger W. Dijkstra
Slide 8

Fault Models at Different Abstraction Levels
A fault model is an abstract specification of the types of deviations in logic values that one expects in the circuit under test.
Can be specified at various levels: transistor, gate, function, system.
Transistor-level faults
  Caused by defects, shorts/opens, electromigration, transients, ...
  May lead to high current, incorrect output, intermediate voltage, ...
  Modeled as stuck-on/off, bridging, delay, coupling, crosstalk faults
  Quickly become intractable because of the large model space
Function-level faults
  Selected in an ad hoc manner based on the function of a block (decoder, ALU, memory)
System-level faults (malfunctions, in our terminology)
  Will discuss later in section dealing with mid-level impairments
Slide 9

Gate- or Logic-Level Fault Models
Most popular models (due to their accuracy and relative tractability).
Line stuck faults
  Stuck-at-0 (s-a-0)
  Stuck-at-1 (s-a-1)
Line bridging faults
  Unintended connection (wired OR/AND)
Line open faults
  Often can be modeled as s-a-0 or s-a-1
Delay faults (less tractable than the previous fault types)
  Signals experience unusual delays
Other faults
  Coupling, crosstalk
[Figure: example circuit with inputs A, B, C and lines S, K, annotated with a short (OR), an s-a-0 fault, and an open.]
Slide 10

Path Sensitization and D-Algorithm
The main idea behind test design: control the faulty point from inputs and propagate its behavior to some output.
Example: s-a-0 fault
  Test must force the line to 1
  Two possible tests: (A, B, C) = (0 1 1) or (1 0 1)
This method is formalized in the D-algorithm.
[Figure: circuit with inputs A, B, C and lines S, K, showing the backward trace that sets the s-a-0 line to 1 and the forward trace (sensitization) that carries the value 1/0 to the output.]
D-calculus
  1/0 on the diagram above is represented as D
  0/1 is represented as D̄ (D-bar)
Encounters difficulties with XOR gates (PODEM algorithm fixes this).
Slide 11
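The D-calculus notation can be illustrated with a small sketch. It is not the D-algorithm itself: it only evaluates composite values, each stored as a (fault-free, faulty) pair, for given input patterns. The circuit K = PC OR AB with P = A XOR B is an assumption taken from the Boolean difference slide later in the deck, and all function names are hypothetical.

```python
# Composite values as (fault-free bit, faulty bit) pairs.
ZERO, ONE, D, DBAR = (0, 0), (1, 1), (1, 0), (0, 1)
NAMES = {ZERO: "0", ONE: "1", D: "D", DBAR: "D-bar"}

def d_and(x, y):
    """AND applied separately to the fault-free and faulty circuits."""
    return (x[0] & y[0], x[1] & y[1])

def d_or(x, y):
    """OR applied separately to the fault-free and faulty circuits."""
    return (x[0] | y[0], x[1] | y[1])

def const(bit):
    """A line unaffected by the fault carries the same bit in both circuits."""
    return (bit, bit)

def output_k(a, b, c):
    """Composite value at K when line P is stuck at 0.

    Assumed circuit: P = A xor B, K = (P and C) or (A and B).
    """
    p = (a ^ b, 0)  # fault-free value of P, faulty value forced to 0
    return d_or(d_and(p, const(c)), d_and(const(a), const(b)))

for test in [(0, 1, 1), (1, 0, 1), (1, 1, 1)]:
    print(test, NAMES[output_k(*test)])
```

A pattern is a test for the fault exactly when the output evaluates to D or D-bar; the first two patterns are the tests listed on the slide, while the third leaves the fault unexcited.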

Selection of a Minimal Test Set
Each input pattern detects a subset of all possible faults of interest (according to our fault model).

  A B C | P s-a-0 | P s-a-1 | Q s-a-0 | Q s-a-1
  0 0 0 |    -    |    -    |    -    |    x
  0 1 1 |    x    |    -    |    x    |    -
  1 0 1 |    ?    |    ?    |    ?    |    ?
  1 1 1 |    ?    |    ?    |    ?    |    ?

Choosing a minimal test set is a covering problem.
[Figure: circuit with lines labeled A, B, C, E, F, G, H, J, K, L, M, N, P, Q, R, S.]
Equivalent faults: e.g., P s-a-0 ≡ L s-a-0 ≡ Q s-a-0 and Q s-a-1 ≡ R s-a-1 ≡ K s-a-1
Slide 12
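The covering view can be sketched as follows. The detection table is built by exhaustive fault simulation and the smallest covering subset is found by brute force, which is only practical for tiny circuits. The circuit structure (P = A XOR B, Q = P AND C, K = Q OR AB) is an assumption: it agrees with the two filled-in table rows above, but the slide's figure is not available here. All names are hypothetical.

```python
from itertools import combinations, product

FAULTS = [("P", 0), ("P", 1), ("Q", 0), ("Q", 1)]  # (line, stuck-at value)

def k_output(a, b, c, fault=None):
    """Output K of the assumed circuit, optionally with one stuck line."""
    p = a ^ b
    if fault and fault[0] == "P":
        p = fault[1]
    q = p & c
    if fault and fault[0] == "Q":
        q = fault[1]
    return q | (a & b)

# For each input pattern, the set of faults whose presence changes K.
detects = {
    t: {f for f in FAULTS if k_output(*t) != k_output(*t, fault=f)}
    for t in product((0, 1), repeat=3)
}

def minimal_test_set(detects, faults):
    """Smallest set of patterns that together detect every fault."""
    tests = list(detects)
    for size in range(1, len(tests) + 1):
        for subset in combinations(tests, size):
            covered = set().union(*(detects[t] for t in subset))
            if covered >= set(faults):
                return list(subset)
    return None

minimal = minimal_test_set(detects, FAULTS)
print(minimal)
```

Because the search tries subsets in order of increasing size, the first hit is a minimum cover; for realistic circuits a greedy or integer-programming heuristic would replace the exhaustive loop.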

Capabilities and Complexity of D-Algorithm
Reconvergent fan-out
  Consider the s input s-a-0
  Simple path sensitization does not allow us to propagate the fault to the primary output z
  PODEM solves the problem by setting y to 0
[Figure: circuit with inputs x, s, y and primary output z; the fault effect D fans out from s and reconverges before z.]
Worst-case complexity of D-algorithm is exponential in circuit size
  Must consider all path combinations
  XOR gates cause the behavior to approach the worst case
  Average case is much better; quadratic
PODEM: Path-oriented decision making
  Developed by Goel in 1981
  Also exponential, but in the number of circuit inputs, not its size
Slide 13

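Boolean Difference
$K = f(A, B, C) = AB \lor BC \lor CA$
$dK/dB = f(A, 0, C) \oplus f(A, 1, C) = CA \oplus (A \lor C) = A \oplus C$
$K = PC \lor AB$
$dK/dP = AB \oplus (C \lor AB) = C\,\overline{AB}$
[Figure: circuit with lines labeled A, B, C, E, F, G, H, J, K, L, M, N, P, Q, R, S; line P is marked s-a-0.]
Tests that detect P s-a-0 are solutions to the equation $P \cdot dK/dP = 1$:
$(A \oplus B)\, C\, \overline{AB} = 1 \Rightarrow C = 1,\ A \ne B$
Tests that detect P s-a-1 are solutions to the equation $\overline{P} \cdot dK/dP = 1$:
$\overline{(A \oplus B)}\, C\, \overline{AB} = 1 \Rightarrow C = 1,\ A = B = 0$
Slide 14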
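The Boolean difference is easy to check numerically for small functions. The sketch below evaluates it by enumeration rather than algebra; the helper name boolean_difference and the modeling of P as an extra input of K are illustrative assumptions, as is P = A XOR B (read off the substitution in the test equations).

```python
from itertools import product

def boolean_difference(f, index):
    """Return g with g(x) = f(x with x[index]=0) XOR f(x with x[index]=1)."""
    def diff(*args):
        lo, hi = list(args), list(args)
        lo[index], hi[index] = 0, 1
        return f(*lo) ^ f(*hi)
    return diff

def majority(a, b, c):
    """K = AB or BC or CA."""
    return (a & b) | (b & c) | (c & a)

def k_from_p(p, a, b, c):
    """K = PC or AB, with the internal line P treated as a free input."""
    return (p & c) | (a & b)

dk_db = boolean_difference(majority, 1)   # difference with respect to B
dk_dp = boolean_difference(k_from_p, 0)   # difference with respect to P

inputs = list(product((0, 1), repeat=3))

# Tests for P s-a-0: P must be 1 in the good circuit and K must depend on P.
tests_sa0 = [(a, b, c) for a, b, c in inputs if (a ^ b) & dk_dp(0, a, b, c)]
# Tests for P s-a-1: P must be 0 in the good circuit and K must depend on P.
tests_sa1 = [(a, b, c) for a, b, c in inputs if (1 - (a ^ b)) & dk_dp(0, a, b, c)]

print(tests_sa0, tests_sa1)
```

The value passed in the differentiated position is ignored, since both cofactors are evaluated regardless of it.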

Complexity of the Satisfiability Problem (SAT)
Decision problem: Is a Boolean expression satisfiable? (i.e., can we assign values to the variables to make the result 1?)
Theorem (Cook, 1971): SAT is NP-complete.
In fact, even restricted versions of SAT remain NP-complete.
According to the Boolean difference formulation, test generation can be converted to SAT (find the solutions to $P \cdot dK/dP = 1$).
To prove the NP-completeness of test generation, we need to show that SAT (or some other NP-complete problem) can be converted to test generation.
For a simple proof, see [Fuji85], pp. 113-114.
Slide 15

Design for Testability: Combinational
Increase controllability and observability via the insertion of degating mechanisms and control points.
Design for dual-mode operation
  Normal mode
  Test mode
[Figure: degating logic with Degate and Control/Observe lines.]
Partitioned design
  Normal mode
  Test mode for A
[Figure: blocks A and B connected through muxes, shown in normal mode and in test mode for A.]
Slide 16

Design for Testability: Sequential
Increase controllability and observability via provision of mechanisms to set and observe internal flip-flops.
Scan design
  Shift desired states into FF
  Shift out FF states to observe
[Figure: flip-flops (FF) chained between blocks of combinational logic, with a mode control input.]
Partial scan design: Mitigates the excessive overhead of a full scan design.
Slide 17
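The shift-in and shift-out steps of scan design can be pictured with a toy model. This is a behavioral sketch only, assuming a single chain in which the scan input feeds the first flip-flop and the last flip-flop drives the scan output; scan_shift is a hypothetical helper, not part of any tool.

```python
def scan_shift(chain, scan_in_bits):
    """Shift bits through a scan chain in test mode.

    chain: list of flip-flop states, chain[0] nearest the scan input.
    Returns (new chain state, bits observed at the scan output).
    """
    chain = list(chain)
    observed = []
    for bit in scan_in_bits:
        observed.append(chain[-1])     # last FF appears at scan out
        chain = [bit] + chain[:-1]     # every FF takes its neighbor's value
    return chain, observed

# Load a desired state into three flip-flops (first bit shifted ends up last).
loaded, _ = scan_shift([0, 0, 0], [1, 1, 0])
print(loaded)

# Later, shift the captured state out for observation while loading zeros.
_, observed = scan_shift(loaded, [0, 0, 0])
print(observed)
```

Between the two shifts a real design would switch to normal mode for one clock so the flip-flops capture the combinational logic's response; that step is omitted here.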

Boundary Scan for Board-Level Testing
Lets us apply arbitrary inputs to circuit parts whose inputs would otherwise not be externally accessible.
[Figure: any digital circuit surrounded by boundary scan elements, with Parallel in, Parallel out, Scan in, Scan out, Test clock, and Mode select connections.]
Boundary scan elements of multiple parts are cascaded together into a scan path.
From: http://www.asset-intertech.com/pdfs/boundaryscan_tutorial.pdf
Slide 18

Basic Boundary Scan Cell From: http://www.asset-intertech.com/pdfs/boundaryscan_tutorial.pdf Slide 19

Built-in Self-Test (BIST)
Ordinary testing
  [Figure: test pattern source, circuit under test (CUT), comparison against a reference value, Pass/Fail.]
Built-in self-testing
  [Figure: test pattern generation, circuit under test (CUT), decision, Pass/Fail.]
Test patterns may be generated (pseudo)randomly, e.g., via LFSRs.
Decision may be based on compressed test results.
Slide 20
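A minimal software sketch of the BIST loop follows, assuming a 4-bit LFSR whose feedback is the XOR of its two most significant bits (a choice that makes it cycle through all 15 nonzero states) and a ones-count signature as the compression scheme. The circuit under test, the injected fault, and every name in the code are illustrative assumptions, not details from the slide.

```python
def lfsr4_states(seed=0b0001):
    """Yield 15 successive states of a 4-bit LFSR (feedback = bit3 XOR bit2)."""
    state = seed
    for _ in range(15):
        yield state
        feedback = ((state >> 3) ^ (state >> 2)) & 1
        state = ((state << 1) | feedback) & 0xF

def cut(a, b, c, p_stuck_at_0=False):
    """Assumed circuit under test: K = (P and C) or (A and B), P = A xor B."""
    p = 0 if p_stuck_at_0 else a ^ b
    return (p & c) | (a & b)

def signature(circuit):
    """Compress the output stream into a single number (count of 1s)."""
    ones = 0
    for state in lfsr4_states():
        a, b, c = (state >> 2) & 1, (state >> 1) & 1, state & 1
        ones += circuit(a, b, c)
    return ones

reference = signature(cut)
observed = signature(lambda a, b, c: cut(a, b, c, p_stuck_at_0=True))
print("Pass" if observed == reference else "Fail")
```

A ones count is among the simplest compaction schemes and can alias (different faulty responses with the same count); hardware BIST more commonly compacts responses with another LFSR.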

Quantifying Testability: Controllability
Controllability C of a line has a value between 0 and 1.
Controllability transfer factor:
$CTF = 1 - \dfrac{|N(0) - N(1)|}{N(0) + N(1)}$
Examples: N(0) = 7, N(1) = 1 gives CTF = 0.25; N(0) = 1, N(1) = 7 gives CTF = 0.25.
k-input, 1-output components:
$C_{output} = \left(\sum_i C_{input\,i} / k\right) \times CTF$
[Figure: a component with input controllabilities 1.0, 0.3, 0.5 and output controllability 0.15.]
k-way fan-out: $C \rightarrow C / (1 + \log k)$ for each of the k fan-out lines.
A line with very low controllability is a good place for test point insertion.
[Figure: control point inserted on a line.]
Slide 21
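The controllability formulas can be exercised on the slide's own numbers. In this sketch N(0) and N(1) are taken to be the numbers of input patterns that produce output 0 and 1 (an interpretation consistent with the 7/1 example for a 3-input gate), and the logarithm is taken base 2, which the slide does not specify. Function names are hypothetical.

```python
import math
from itertools import product

def ctf(gate, k):
    """Controllability transfer factor of a k-input, 1-output component."""
    outputs = [gate(*bits) for bits in product((0, 1), repeat=k)]
    n1 = sum(outputs)
    n0 = len(outputs) - n1
    return 1 - abs(n0 - n1) / (n0 + n1)

def output_controllability(input_cs, transfer):
    """Average input controllability scaled by the transfer factor."""
    return sum(input_cs) / len(input_cs) * transfer

def fanout_controllability(c, k):
    """Controllability of each branch of a k-way fan-out (log base 2 assumed)."""
    return c / (1 + math.log2(k))

def and3(a, b, c):
    return a & b & c

transfer = ctf(and3, 3)
c_out = output_controllability([1.0, 0.3, 0.5], transfer)
print(transfer, round(c_out, 3))
```

With the figure's input values 1.0, 0.3, and 0.5 and CTF = 0.25, the output controllability works out to the 0.15 shown on the slide.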

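Quantifying Testability: Observability
Observability O of a line has a value between 0 and 1.
Observability transfer factor, where N(sp) counts sensitizations and N(ip) counts inhibitions:
$OTF = \dfrac{N(sp)}{N(sp) + N(ip)}$
Example: N(sp) = 1, N(ip) = 3 gives OTF = 0.25.
k-input, 1-output components:
$O_{input} = O_{output} \times OTF$
[Figure: a component with output observability 0.6 and input observabilities 0.15, 0.15, 0.15.]
k-way fan-out: $O = 1 - \prod_j (1 - O_j)$, with $O_j$ the observability of fan-out line j.
A line with very low observability is a good place for test point insertion.
[Figure: observation point inserted on a line.]
Slide 22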
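A matching sketch for observability, under the assumption that a "sensitization" is a setting of the other inputs for which toggling the chosen input toggles the output, and an "inhibition" is any other setting. That reading reproduces the slide's 1-versus-3 example for a 3-input gate; the function names are hypothetical.

```python
from itertools import product

def otf(gate, k, index):
    """Observability transfer factor from input `index` to the output."""
    sensitized = inhibited = 0
    for others in product((0, 1), repeat=k - 1):
        lo = others[:index] + (0,) + others[index:]
        hi = others[:index] + (1,) + others[index:]
        if gate(*lo) != gate(*hi):
            sensitized += 1
        else:
            inhibited += 1
    return sensitized / (sensitized + inhibited)

def input_observability(o_output, transfer):
    """O_input = O_output * OTF."""
    return o_output * transfer

def stem_observability(branch_os):
    """Observability of a fan-out stem: 1 - product of (1 - O_j)."""
    miss = 1.0
    for o in branch_os:
        miss *= 1 - o
    return 1 - miss

def and3(a, b, c):
    return a & b & c

o_in = input_observability(0.6, otf(and3, 3, 0))
print(round(o_in, 3), round(stem_observability([0.15, 0.6]), 3))
```

For the figure's output observability of 0.6 and OTF = 0.25, each input gets 0.15, matching the values on the slide.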

Quantifying Testability: Combined Measure
Testability = Controllability × Observability

  Controllabilities   1.0    0.3    0.5    0.15
  Observabilities     0.15   0.15   0.15   0.6
  Testabilities       0.15   0.045  0.075  0.09

Overall testability of a circuit = Average of line testabilities
Slide 23
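The combined measure is plain arithmetic, shown here on the four lines of the slide's example. The variable names are illustrative.

```python
controllabilities = [1.0, 0.3, 0.5, 0.15]
observabilities = [0.15, 0.15, 0.15, 0.6]

# Testability of each line is the product of its two measures.
testabilities = [c * o for c, o in zip(controllabilities, observabilities)]

# Overall testability is the average over all lines.
overall = sum(testabilities) / len(testabilities)

print([round(t, 3) for t in testabilities], round(overall, 3))
```

The per-line products reproduce the slide's testabilities, and their average for these four lines is 0.09.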