CMOS Testing: Part 1 Introduction Fault models Stuck-line (single and multiple) Bridging Stuck-open Test pattern generation Combinational circuit test generation Sequential circuit test generation ECE 261 Krish Chakrabarty 1 Testing Logic Verification Silicon Debug Manufacturing Test Fault Models Outline Observability and Controllability Design for Test Scan BIST Boundary Scan ECE 261 Krish Chakrabarty 3 1
Testing Testing is one of the most expensive parts of chips Logic verification accounts for > 50% of design effort for many chips Debug time after fabrication has enormous cost Shipping defective parts can sink a company Example: Intel FDIV bug Logic error not caught until > 1M units shipped Recall cost $450M (!!!) ECE 261 Krish Chakrabarty 4 Logic Verification Does the chip simulate correctly? Usually done at HDL level Verification engineers write test bench for HDL Can t test all cases Look for corner cases Try to break logic design Ex: 32-bit adder Test all combinations of corner cases as inputs: 0, 1, 2, 2 31-1, -1, -2 31, a few random numbers Good tests require ingenuity ECE 261 Krish Chakrabarty 5 2
Silicon Debug Test the first chips back from fabrication If you are lucky, they work the first time If not Logic bugs vs. electrical failures Most chip failures are logic bugs from inadequate simulation Some are electrical failures Crosstalk Dynamic nodes: leakage, charge sharing Ratio failures A few are tool or methodology failures (e.g. DRC) Fix the bugs and fabricate a corrected chip ECE 261 Krish Chakrabarty 6 Shmoo Plots How to diagnose failures? Hard to access chips Picoprobes Electron beam Laser voltage probing Built-in self-test Shmoo plots Vary voltage, frequency Look for cause of electrical failures ECE 261 Krish Chakrabarty 7 3
Need for Testing Physical defects are likely in manufacturing Missing connections (opens) Bridged connections (shorts) Imperfect doping, processing steps Packaging Yields are generally low Yield = Fraction of good die per wafer Need to weed out bad die before assembly Need to test during operation Electromagnetic interference, mechanical stress, electromigration, alpha particles ECE 261 Krish Chakrabarty 8 Manufacturing Test A speck of dust on a wafer is sufficient to kill chip Yield of any chip is < 100% Must test chips after manufacturing before delivery to customers to only ship good parts Manufacturing testers are very expensive Minimize time on tester Careful selection of test vectors ECE 261 Krish Chakrabarty 9 4
Testing Your Chips If you don t have a multimillion dollar tester: Build a breadboard with LED s and switches Hook up a logic analyzer and pattern generator Or use a low-cost functional chip tester ECE 261 Krish Chakrabarty 10 Motivation for Testing: XBox 360 Technical Problems The "Red Ring of Death": Three red lights on the Xbox 360 indicator, representing "general hardware failure ( http://en.wikipedia.org/wiki/3_red_lights_of_death) The Xbox 360 can be subject to a number of possible technical problems. Since the Xbox 360 console was released in 2005 the console gained reputation in the press in articles portraying poor reliability and relatively high failure rates. On 5 July 2007, Peter Moore published an open letter recognizing the problem and announcing 3 years warranty expansion for every Xbox 360 console that experiences the general hardware failure indicated by the three flashing red lights on the console. 5
XBox 360 Technical Problems and Test-Cost Projections July 5, 2007, Xbox issues to cost Microsoft $1 billion-plus. Unacceptable number of repairs leads to company extending warranties. Matt Rosoff, an analyst at the independent research group Directions on Microsoft, estimates that Microsoft s entertainment and devices division has lost more than $6 billion since 2002. Manufacturing cost Test cost ITRS Cost of Test The emergence of more advanced ICs and SOC semiconductor devices is causing test costs to escalate to as much as 50 percent of the total manufacturing cost. M. Kondrat, Bridging design and ATE cuts test cost - Test & Measurement - automatic test equipment, Electronic News, Sept 9, 2002. As a result, semiconductor test cost continues to increase in spite of the introduction of DFT, and can account for up to 25-50% of total manufacturing cost. T. Cooper, G. Flynn, G. Ganesan, R. Nolan, C. Tran, Motorola, Demonstration and Deployment of a Test Cost Reduction Strategy Using Design-for-Test (DFT) and Wafer Level Burn-In and Test, (6/29/2001) Future Fab Intl. Volume 11. Test may account for more than 70% of the total manufacturing cost - test cost does not directly scale with transistor count, dies size, device pin count, or process technology, ITRS 2003. 6
Cost escalates rapidly Motivation for Testing billion+ Rule of Ten Chip x1 Board x10 System x100 Some Real Defects in Chips Processing defects Missing contact windows Parasitic transistors Oxide breakdown... Material defects Bulk defects (cracks, crystal imperfections) Surface impurities (ion migration)... Time-dependent failures Dielectric breakdown Electromigration... Packaging failures Contact degradation Seal leaks... 7
Testing Levels and Test Costs Wafer Packaged chip Board System Field Concurrent checking Cost to detect a fault (per chip) Wafer: $0.01-$0.1 Packaged chip: $0.1-$1 Board: $1-$10 System: $10-$100 Field: $100-1000 ECE 261 Krish Chakrabarty 16 Manufacturing Testing Goal: Detect manufacturing detects Defects: layer-to-layer shorts discontinuous wires thin-oxide shorts to substrate or well Faults: nodes shorted to power or ground (stuck-at) nodes shorted to each other (bridging) inputs floating, outputs disconnected (stuck-open) Fault models ECE 261 Krish Chakrabarty 17 8
Testing : The Buzzwords Errors Permanent Intermittent Transient Faults Physical Logical Test Evaluation Fault coverage Fault simulation Types of testing Off-line, on-line Self-test vs external test DC (static) vc AC (at-speed) Edge-pin, guided-probe, bedof-nails, E-beam, in-circuit ECE 261 Krish Chakrabarty 18 Testing and Diagnosis Testing: Determine if the system (chip, board) is behaving correctly Diagnosis: Locate the cause of malfunctioning ECE 261 Krish Chakrabarty 19 9
Testing and Diagnosis Design Fabrication (Physical defects) Test pattern generation (Fault model) Wafer, IC, board Pass Test Equipment Fail 00100 10110 Reject Diagnosis, repair ECE 261 Krish Chakrabarty 20 Testing: The Inevitable Acronyms System under test UUT: Unit Under Test CUT: Circuit Under Test DUT: Device Under Test The tester ATE: Automatic Test Equipment Test generation ATPG: Automatic Test Pattern Generation Fault Models SSL: Single Stuck-Line MSL: Multiple Stuck-Line BF: Bridging Fault DFT: Design for Testability BIST: Built-in self-test LFSR:Linear-Feedback Shift- Register ECE 261 Krish Chakrabarty 21 10
Fault Models Defects are too many and too difficult to explicitly enumerate Abstraction (technology independence): presence of physical defect is modeled by changing the logic function (or delay) Reduced complexity: distinct physical defects may be represented by the same logical fault Generality: tests derived for logical faults may detect vaguely-understood or hard-to-analyze physical defects A test pattern detects a fault from the fault model ECE 261 Krish Chakrabarty 22 Single Stuck-Line (SSL) Model A single node in the circuit is stuck-at 1 (s-a-1) or 0 (s-a-0). s-a-0 A D Fault-free function z = AB+CD Faulty function z f = A s-a-1 A D Fault-free function z = AB+CD Faulty function z f = AB+D Number of possible stuck-at faults in a circuit with n lines? Number of faults reduced by finding equivalent classes ECE 261 Krish Chakrabarty 23 11
Stuck-At Faults How does a chip fail? Usually failures are shorts between two conductors or opens in a conductor This can cause very complicated behavior A simpler model: Stuck-At Assume all failures cause nodes to be stuck-at 0 or 1, i.e. shorted to GND or V DD Not quite true, but works well in practice ECE 261 Krish Chakrabarty 24 Examples ECE 261 Krish Chakrabarty 25 12
SSL Fault Detection A test pattern for fault x s-a-d is an input combination that 1) places d on x (activation), 2) propagates fault effect (D or D) to primary output D: 1/0, D: 0/1 Good circuit Bad circuit s-a-0 A E ABCE = 0011 is a test pattern for C s-a-0 ECE 261 Krish Chakrabarty 26 Multiple Stuck-Line (MSF) Faults More than one line may be stuck at a logic value s-a-0 A D Fault: {C s-a-0, x s-a-1} s-a-1 How many MSL fault can there be in a circuit with n nodes? How to get test patterns for MSL faults? Fault universe is too large, MSL fault model seldom used, especially since tests for SSL faults cover many MSL faults x ECE 261 Krish Chakrabarty 27 13
Observability & Controllability Observability: ease of observing a node by watching external output pins of the chip Controllability: ease of forcing a node to 0 or 1 by driving input pins of the chip Combinational logic is usually easy to observe and control Finite state machines can be very difficult, requiring many cycles to enter desired state Especially if state transition diagram is not known to the test engineer ECE 261 Krish Chakrabarty 28 Test Pattern Generation Manufacturing test ideally would check every node in the circuit to prove it is not stuck. Apply the smallest sequence of test vectors necessary to prove each node is not stuck. Good observability and controllability reduces number of test vectors required for manufacturing test. Reduces the cost of testing Motivates design-for-test ECE 261 Krish Chakrabarty 29 14
Test Pattern Generation Exhaustive testing: Apply 2 n pattern to n-input circuit Not practical for large n Advantage: Fault-model independent Fault-Oriented Test Generation Algorithm: A D s-a-0 y x 1) Set x to 1: activate fault 2) Justify D on x, propagate D to Set C and D to 1 Example test pattern: ABCD = 0011 Backtracking may be necessary Test generation is NP-complete Set y to 0 Set either A or B to 0 ECE 261 Krish Chakrabarty 30 Test Example SA1 SA0 A 3 {0110} {1110} A 2 {1010} {1110} A 1 {0100} {0110} A 0 {0110} {0111} n1 {1110} {0110} n2 {0110} {0100} n3 {0101} {0110} Y {0110} {1110} Minimum set: {0100, 0101, 0110, 0111, 1010, 1110} ECE 261 Krish Chakrabarty 31 15
Bridging Faults Models short circuits, pairs of nodes considered Number of bridging faults? Feedback vs non-feedback bridging faults bridge A A B z z f Wired-AND Wired-OR 0 0 0 0 0 0 0 1 0? 0 1 1 0 1? 0 1 1 1 1 1 1 1 z f =? What are the test patterns in this example? ECE 261 Krish Chakrabarty 32 V DD Stuck-Open Faults a a b Floating node b Fault-free circuit: z = a+b Faulty circuit: z f = a+b + ab ~ ~ : Previous value of Gnd Case 1: a = b = 1, z pulled down to 0 Case 2: a = 1, b =0, z retains previous state A test for a stuck-open fault requires two patterns {ab = 00, ab = 10} ECE 261 Krish Chakrabarty 33 16
Sequential Circuit Test Generation Primary inputs (controllable) n Combinational logic Primary outputs (observable) State inputs (not controllable) m Registers State outputs (not observable) Difficult problem! Exhaustive testing requires 2 m+n patterns (2 m states and 2 n transitions from each state) Every fault requires a sequence of patterns Initializing sequence: drive to known state Test activation Propagation sequence: propagate discrepancy to observable output ECE 261 Krish Chakrabarty 34 Sequential Circuit Test Generation Iterative-array model (pseudo-combinational circuit) A y x D Q A y y x y + ECE 261 Krish Chakrabarty 35 17
Sequential Circuit Test Generation A s-a-0 y x D Q Assume initial state of flip-flop is not known A 1 1 y y x ABC = 11X 1 y + A 0 1 1 1 y Backward traversal in time Test pattern sequence: {11X, 011} y + 0 y D D x ABC = 011 Current time frame ECE 261 Krish Chakrabarty 36 Summary Think about testing from the beginning Simulate as you go Plan for test after fabrication If you don t test it, it won t work! (Guaranteed) ECE 261 Krish Chakrabarty 37 18