A Strategy for Interconnect Testing in Stacked Mesh Network-on-Chip


2010 25th International Symposium on Defect and Fault Tolerance in VLSI Systems

Min-Ju Chan and Chun-Lung Hsu
Department of Electrical Engineering, National Dong Hwa University, 1, Sec. 2, Da Hsueh Rd., Shou-Feng, Hualien, 974, Taiwan, R.O.C.
cch@mail.ndhu.edu.tw

Abstract

3D IC processes have become a trend in recent years, but the progress of IC process technology brings related problems. In a 3D NoC architecture, the 3D IC process makes placement and routing more complex, and the more complex architecture increases the number of faults. A methodology is therefore needed to address this problem. At present, test approaches for NoC interconnect faults are based on the 2D architecture, and 3D simulation tools are still immature, so a feasible method for testing the 3D architecture is required. In this paper, we consider how to apply a mature interconnect test approach for the 2D NoC architecture to the 3D NoC architecture. This allows us to increase product yield through the replacement of defective chips.

Index Terms: built-in self-test (BIST), interconnect testing, network-on-chip (NoC).

1. Introduction

According to Moore's Law, the capability of IC process technology doubles roughly every 18 months. As a result, an IC of the same size can embed more blocks with different functions. Network-on-Chip (NoC) architectures have emerged as a revolutionary methodology for integrating a very high number of Intellectual Property (IP) blocks in a single die and for supporting systems with intensive parallel communication requirements [1]. The NoC architecture can increase the performance of a System-on-Chip (SoC) and outperforms mainstream bus architectures. At present, the conventional 2D IC limits the choices for floor planning.
Consequently, it restrains the performance improvement obtainable from NoC architectures. According to the International Technology Roadmap for Semiconductors (ITRS), new interconnect paradigms are needed for the longer term [2]. Recent work has already proposed a revolutionary approach to these problems: the introduction of the 3D IC. One major advantage of the 3D IC paradigm is that it allows dissimilar technologies, e.g., memory, analog, and MEMS, to be integrated in a single die. 3D ICs improve the performance of microprocessors by forming a processor and memory stack, and they offer better performance, functionality, and packaging density than traditional 2D ICs.

Current NoCs are implemented predominantly as 2D architectures, but the emergence of 3D ICs will bring a fundamental change. A 3D NoC offers shorter transmission distances and more transmission channels in its communication infrastructure than a 2D NoC, which gives it better throughput, latency, energy dissipation, and wiring area overhead. Because the distance between layers in a 3D NoC is very short, more blocks can be embedded without a significant change in die size. However, the dramatic increase in the number of blocks and interconnects makes the whole structure complex, which raises the fault probability and lowers the chip yield. Therefore, a methodology for testing the 3D NoC is increasingly needed. However, recent studies are almost all based on the 2D NoC [3], [4], [5]. Consequently, this paper studies how to use a mature 2D NoC test strategy on the 3D NoC.

1550-5774/10 $26.00 © 2010 IEEE. DOI 10.1109/DFT.2010.21

2. Test consideration

2.1 NoC testing approach

The test of a NoC-based SoC for manufacturing defects is usually divided into two parts: the test of the cores and the test of the communication infrastructure [3]. The test of the cores is usually based on reusing the NoC as a test access mechanism (TAM), which avoids the burden of adding extra hardware for a dedicated test bus. Recent works have addressed the test of the NoC infrastructure, including routers [6]-[9] and interconnect channels [10], [11]. Interconnect testing in NoC-based chips has usually been restricted to faults in wires within a single channel connecting two adjacent routers. However, this assumption is not reasonable in large NoC layouts. In realistic NoC layouts [12], the placement and routing of routers and channels are actually prone to even simpler faults, such as shorts between wires connecting the core to the network and between wires of distinct network channels. Grecu et al. [10] propose a built-in self-test (BIST) methodology for testing the channels of the communication platform; the methodology targets crosstalk faults.

The problem of detecting short faults in interconnects has been widely studied [3], [4], [5]. Most of these works aim at detecting faults in the interconnects between two adjacent routers. Some studies propose inserting the BIST block in the router. However, with the BIST block in the router, some faults between the core and the router go undetected. Other research therefore embeds the BIST block in the network interface (NI) of the core, which makes it possible to test all interconnects. The test strategy is based on two BIST blocks: the test data generator (TDG) and the test response analyzer (TRA).
The TDG generates the test vectors and transmits them into the NoC. The TRA receives the test vectors from the NoC and detects whether faults have occurred.

2.2 Fault model definition

When considering short faults in the NoC, it is important to define the region where the faults may occur, i.e., which links are more likely to be short-circuited. Considering that all possible wires can be faulty might not be realistic: the number of faults grows exponentially with the number of wires considered, as shown in (1), where n is the number of independent wires and k is the size of each fault group [2].

$$C(n, k) = \frac{n!}{k!\,(n-k)!} \tag{1}$$

In the worst-case scenario, we suppose that short faults can occur between any two interconnects. Short faults are of two kinds, AND-short and OR-short, as shown in Fig. 1. They cause a packet to change its path or a flit to carry erroneous data. In this work, considering test difficulty and structural scalability, we take a 2×2×2 Stacked Mesh NoC as the minimum search space for the test structure. It has 56 links, and each channel is 8 bits wide.
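The fault model of Section 2.2 can be illustrated with a short C sketch. It evaluates equation (1) and applies an AND-short or OR-short between two wires of 8-bit flits. The names `fault_groups`, `short_kind`, and `inject_short` are hypothetical, and two points are assumptions rather than statements from the paper: both shorted wires are taken to carry the AND (or OR) of the two driven values, and wire 0 is taken to be the leftmost bit of a flit. None of this is the authors' simulator.

```c
#include <assert.h>
#include <stdint.h>

/* Number of k-wire fault groups among n wires, C(n, k) from equation (1).
 * Built up step by step so that no large factorial is ever formed. */
uint64_t fault_groups(unsigned n, unsigned k)
{
    assert(k <= n);
    uint64_t c = 1;
    for (unsigned i = 1; i <= k; ++i) {
        /* After this step c equals C(n - k + i, i); the division is exact. */
        c = c * (n - k + i) / i;
    }
    return c;
}

typedef enum { AND_SHORT, OR_SHORT } short_kind;

/* Short wire i of flit *x with wire j of flit *y.
 * Assumption: wire 0 is the leftmost bit, and after the short both wires
 * carry the AND (or OR) of the two values that were driven. */
void inject_short(uint8_t *x, unsigned i, uint8_t *y, unsigned j,
                  short_kind kind)
{
    assert(i < 8 && j < 8);
    uint8_t mi = (uint8_t)(0x80u >> i);
    uint8_t mj = (uint8_t)(0x80u >> j);
    int a = (*x & mi) != 0;          /* read both wires before writing */
    int b = (*y & mj) != 0;
    int v = (kind == AND_SHORT) ? (a & b) : (a | b);
    *x = (uint8_t)(v ? (*x | mi) : (*x & ~mi));
    *y = (uint8_t)(v ? (*y | mj) : (*y & ~mj));
}
```

For instance, `fault_groups(8, 2)` gives 28 candidate wire pairs inside a single 8-bit channel, and an AND-short between wire 0 of the flit 11011001 and a wire carrying 0 turns that flit into 01011001.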

Figure 1. Fault model.

3. Proposed methodology

At present, mature NoC technology refers to the 2D NoC, whereas simulation tools for 3D NoC architectures are still at a preliminary stage and no complete test plan exists. Applying a test method for 2D NoC interconnects to the 3D NoC is therefore the easier and more feasible route. Three-dimensional space is composed of three kinds of 2D plane, as shown in Fig. 2. Viewing the Stacked Mesh NoC in the same way, we find that it can also be partitioned into three kinds of 2D structure. Applying this observation to the 2×2×2 Stacked Mesh NoC gives the result in Fig. 3. In other words, we can achieve the test objective by partitioning the 3D structure into 2D planes. From Fig. 3, testing any two kinds of plane has the same effect as testing the complete 2×2×2 Stacked Mesh NoC. In this work, we test the Y-Z and Z-X planes because these two kinds of plane have the same test structure.

Figure 2. 3D space is composed of three kinds of 2D plane.

Figure 3. A 2×2×2 Stacked Mesh NoC partitioned into three kinds of 2D structure.
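The claim that any two kinds of plane cover the whole structure can be checked with a small C sketch. It is only an illustration under a simplifying assumption: the 2×2×2 structure is modelled as a plain grid of eight routers with one undirected link between neighbours, so the link counts it works with do not reproduce the paper's own link accounting. The names `node`, `plane_kind`, and `uncovered_links` are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

#define DIM 2  /* routers per dimension in the 2x2x2 structure */

typedef struct { int x, y, z; } node;

/* Kind of 2D plane, named after the two axes it spans. */
typedef enum { PLANE_YZ, PLANE_ZX, PLANE_XY } plane_kind;

/* Two routers are linked when they differ by one step along one axis. */
static int adjacent(node a, node b)
{
    return abs(a.x - b.x) + abs(a.y - b.y) + abs(a.z - b.z) == 1;
}

/* A link lies in a plane of the given kind when both of its end points
 * share the coordinate that the plane holds constant. */
static int link_in_plane(node a, node b, plane_kind p)
{
    assert(adjacent(a, b));
    switch (p) {
    case PLANE_YZ: return a.x == b.x;
    case PLANE_ZX: return a.y == b.y;
    default:       return a.z == b.z;
    }
}

/* Map a router index 0..7 to grid coordinates. */
static node node_of(int id)
{
    node n = { id % DIM, (id / DIM) % DIM, id / (DIM * DIM) };
    return n;
}

/* Count the router-to-router links that lie in neither plane kind p nor q. */
int uncovered_links(plane_kind p, plane_kind q)
{
    int missing = 0;
    for (int i = 0; i < DIM * DIM * DIM; ++i) {
        for (int j = i + 1; j < DIM * DIM * DIM; ++j) {
            node a = node_of(i), b = node_of(j);
            if (adjacent(a, b) && !link_in_plane(a, b, p)
                               && !link_in_plane(a, b, q))
                ++missing;
        }
    }
    return missing;
}
```

Under this model, `uncovered_links(PLANE_YZ, PLANE_ZX)` is 0, as it is for any other pair of distinct plane kinds, whereas a single kind such as `uncovered_links(PLANE_YZ, PLANE_YZ)` leaves the four links along the X axis untested.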

For this work, the number of transmission paths is 4, as shown in Fig. 4, and the total number of links is 24. Because of the 3D structure, the faults must be counted separately on each floor. According to (1), this test structure can generate up to 6,560 faults. The TDG in the core transmits a test sequence that consists of a header and data. The header carries the information flit that describes the path, and the path can be modified by shifting. Therefore, if a short fault occurs on the interconnects, it changes the flit and, with it, the transmission path. The short fault also changes the test vectors in the data, which causes a data error. The TRA detects a fault in two situations. In the first, a change in the path information of the header causes the test sequence to arrive at the wrong target core; this fault is defined as time_out. In the second, the test vectors in the data received by the TRA are wrong; this fault is defined as data_error. Some faults, however, neither change the flit nor affect the path. The flit is 8 bits wide. Bits 0, 1, and 4 control the direction, and bits 3 and 7 control the number of shifts. Consequently, bits 2, 5, and 6 do not directly affect the path; in other words, changes to bits 2, 5, and 6 do not change the path.

Figure 4. Test transmission structure for Z-X plane.

In this work, we use C code to simulate the running state of the test structure. Fig. 5 shows the working flowchart. The test steps are as follows:

1. The TDG generates the 8-bit test sequence.
2. The short fault is injected into the test structure.
3. The short fault affects the test sequence.
4. The path in the header is modified, and the test vectors in the data are changed.
5. The TRA receives the test sequence and detects the fault.

For example, we transmit the test sequence from core 1 to core 7 in Fig. 4. The path information flit in the header is set to 11011001.
Bits 0 and 1 control the direction of movement among east (E), west (W), south (S), and north (N), and bit 3 controls the number of shifts in those directions. Bit 4 controls the direction of movement between up (U) and down (D), and bit 7 controls the number of shifts in those directions. As the test sequence travels through the test structure, each router inspects the path information in the header. The router first checks bit 3. If bit 3 is 1, the router shifts the test sequence in the direction given by bits 0 and 1. If bit 3 is 0, the router checks bit 7 to determine whether to shift in the U or D direction. If bits 3 and 7 are both 0, the router passes the test sequence to the core it is connected to. Under normal circumstances, when the header arrives at the target core 7, the flit is 11001000, which is consistent with numbering the bits from the left, bits 3 and 7 having been cleared.

After short faults are injected into the test structure, the TRA receives the test sequence and detects two error situations. In the first, the short faults change the flit (e.g., to 01011001) and affect the path, so the test sequence arrives at the wrong core. After the set time has elapsed, the TRA in the original target core considers the test sequence lost and reports the fault as time_out. In the second, the short faults either leave the header flit unchanged or change it (e.g., to 01010011) without affecting the path, so the test sequence still reaches the target core 7. However, the short faults change the test vectors in the data. When the TRA compares the test vectors, it finds that they are wrong and reports the fault as data_error.

Figure 5. Working flowchart.

4. Experimental results

In the simulation, we divide the short faults into two kinds, AND-short and OR-short. Because the locations of the short faults are selected randomly, the same fault may be selected more than once. We therefore inject 8,000 faults into each test plane to reduce the effect of repeated selection, which brings the result closer to the actual situation. We then average the test results of the two kinds of plane. The average is equivalent to the test fault coverage of a 2×2×2 Stacked Mesh NoC. Table 1 and Table 2 present the simulation results for AND-short and OR-short faults, respectively. The test results fall into two classes, time_out and data_error. However, the test vectors are certain to be changed because the short faults affect the data. Therefore, a test sequence that suffers a time_out fault also produces a data_error fault.
In this test, we report such a case as time_out only. As shown in Table 1 and Table 2, the incidence of time_out is higher in the OR-short case. The reason lies in the properties of the two fault types: AND-short faults have a 75 percent probability of changing the value to 0, and OR-short faults have a 75 percent probability of changing the value to 1. Given the header format, a value changed to 1 is more likely to affect the path of the test sequence. In the simulation, we also found that the incidence of time_out increases when the header flit contains similar numbers of 0s and 1s. This is why the incidence of time_out on the Z-X plane is higher than on the Y-Z plane. With a different header format, the result would be different.
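The header handling of Section 3 and the two TRA verdicts can be sketched in C as follows. This is a minimal illustration, not the authors' simulator: the names `router_step`, `tra_check`, and the enum values are hypothetical, bit 0 is assumed to be the leftmost bit of the flit, and clearing a shift bit after each hop is an assumption inferred from the example in which 11011001 becomes 11001000 on arrival.

```c
#include <assert.h>
#include <stdint.h>

/* Bit i of the header flit, with bit 0 taken as the leftmost bit
 * (assumption inferred from the 11011001 -> 11001000 example). */
static int hbit(uint8_t h, unsigned i)
{
    assert(i < 8);
    return (h >> (7u - i)) & 1u;
}

/* Return the header with bit i cleared. */
static uint8_t hclear(uint8_t h, unsigned i)
{
    assert(i < 8);
    return (uint8_t)(h & ~(0x80u >> i));
}

typedef enum { MOVE_PLANAR, MOVE_VERTICAL, DELIVER } action;

/* One router decision: bit 3 requests a hop in the E/W/S/N direction
 * selected by bits 0 and 1, bit 7 requests a hop in the U/D direction
 * selected by bit 4, and with both clear the flit goes to the local core. */
action router_step(uint8_t *header)
{
    if (hbit(*header, 3)) {
        *header = hclear(*header, 3);   /* hop consumed (assumption) */
        return MOVE_PLANAR;
    }
    if (hbit(*header, 7)) {
        *header = hclear(*header, 7);   /* hop consumed (assumption) */
        return MOVE_VERTICAL;
    }
    return DELIVER;
}

typedef enum { NO_FAULT, TIME_OUT, DATA_ERROR } verdict;

/* TRA decision: a sequence that never arrives is reported as time_out only,
 * even though its data would also be wrong; otherwise the data is compared. */
verdict tra_check(int arrived, uint8_t received, uint8_t expected)
{
    if (!arrived)
        return TIME_OUT;
    return received == expected ? NO_FAULT : DATA_ERROR;
}
```

Starting from the header 11011001 (0xD9), two calls to `router_step` return `MOVE_PLANAR` and then `MOVE_VERTICAL`, leaving 11001000 (0xC8), and a third call returns `DELIVER`.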

Table 1. Testing fault coverage analysis for AND-short (8,000 faults injected in each plane)

Test area                 Time_out         Data_error only    Total detected
Y-Z plane (1)             538 (6.72%)      7462 (93.28%)      8000 (100%)
Y-Z plane (2)             563 (7.04%)      7437 (92.96%)      8000 (100%)
Z-X plane (1)             587 (7.34%)      7413 (92.66%)      8000 (100%)
Z-X plane (2)             558 (6.98%)      7442 (93.02%)      8000 (100%)
2×2×2 Stacked Mesh NoC    2246 (7.02%)     29754 (92.98%)     32000 (100%)

Table 2. Testing fault coverage analysis for OR-short (8,000 faults injected in each plane)

Test area                 Time_out         Data_error only    Total detected
Y-Z plane (1)             1326 (16.58%)    6674 (83.42%)      8000 (100%)
Y-Z plane (2)             1281 (16.01%)    6719 (83.98%)      8000 (100%)
Z-X plane (1)             1510 (18.88%)    6490 (81.12%)      8000 (100%)
Z-X plane (2)             1558 (19.48%)    6442 (80.52%)      8000 (100%)
2×2×2 Stacked Mesh NoC    5675 (17.73%)    26325 (82.27%)     32000 (100%)

5. Conclusions

In this work, we find that the 3D architecture is a regular arrangement, like the mesh NoC. We can therefore use the concept of 3D space to partition it and apply a test approach for 2D to carry out the required test. The test is not limited to short faults. The simulation results show that using a 2D test approach based on the BIST methodology to test the 3D structure is feasible. We note that the probability that the test sequence is affected is larger when the header flit and the test vectors in the data are more complex. In addition, the choice of test plane depends on the test structure used and on the definition of the fault model. Finally, with high fault coverage, we can achieve the objective of increasing product yield through the replacement of defective chips.

6. References

[1] P. Guerrier and A. Greiner, "A Generic Architecture for On-Chip Packet-Switched Interconnections," Proc. Conf. Design, Automation and Test in Europe, pp. 250-256, 2000.
[2] B. S. Feero and P. P. Pande, "Networks-on-Chip in a Three-Dimensional Environment: A Performance Evaluation," IEEE Transactions on Computers, vol. 58, no. 1, pp. 32-45, Jan. 2009.
[3] E. Cota, F. L. Kastensmidt, L. Fernanda, M. Cassel, M. Hervé, P. Almeida, P. Meirelles, A. Amory, and M. Lubaszewski, "A High-Fault-Coverage Approach for the Test of Data, Control and Handshake Interconnects in Mesh Networks-on-Chip," IEEE Transactions on Computers, vol. 57, no. 9, pp. 1202-1215, Sep. 2008.
[4] E. Cota, F. L. Kastensmidt, M. Cassel, P. Meirelles, A. Amory, and M. Lubaszewski, "Redefining and Testing Interconnect Faults in Mesh NoCs," Proc. IEEE Int'l Test Conf., pp. 1-10, 2007.
[5] M. B. Herve, E. Cota, F. L. Kastensmidt, and M. Lubaszewski, "NoC Interconnection Functional Testing: Using Boundary-Scan to Reduce the Overall Testing Time," IEEE 10th Latin American Test Workshop (LATW '09), pp. 1-6, 2009.
[6] A. M. Amory, E. Briao, E. Cota, M. Lubaszewski, and F. G. Moraes, "A Scalable Test Strategy for Network-on-Chip Routers," Proc. IEEE Int'l Test Conf., p. 9, 2005.
[7] K. Stewart and S. Tragoudas, "Interconnect Testing for Networks on Chips," Proc. 24th IEEE VLSI Test Symp., p. 6, 2006.

[8] C. Grecu, P. Pande, B. Wang, A. Ivanov, and R. Saleh, "Methodologies and Algorithms for Testing Switch-Based NoC Interconnects," Proc. 20th IEEE Int'l Symp. Defect and Fault Tolerance in VLSI Systems, pp. 238-246, 2005.
[9] J. Raik, V. Govind, and R. Ubar, "An External Test Approach for Network-on-a-Chip Switches," Proc. 15th Asian Test Symp., pp. 437-442, 2006.
[10] C. Grecu, P. Pande, A. Ivanov, and R. Saleh, "BIST for Network-on-Chip Interconnect Infrastructures," Proc. 24th IEEE VLSI Test Symp., p. 6, 2006.
[11] P. P. Pande, A. Ganguly, B. Feero, B. Belzer, and C. Grecu, "Design of Low Power and Reliable Networks on Chip through Joint Crosstalk Avoidance and Forward Error Correction Coding," Proc. 21st IEEE Int'l Symp. Defect and Fault Tolerance in VLSI Systems, pp. 466-476, 2006.
[12] F. Angiolini, P. Meloni, S. Carta, L. Benini, and L. Raffo, "Contrasting a NoC and a Traditional Interconnect Fabric with Layout Awareness," Proc. Int'l Conf. Design, Automation and Test in Europe, pp. 1-6, 2006.