A Hybrid Approach to the test of Cache Memory Controllers Embedded in SoCs

14th IEEE International On-Line Testing Symposium 2008

W. J. Perez 1, J. Velasco 1, D. Ravotto 2, E. Sanchez 2, M. Sonza Reorda 2

1 Universidad del Valle, Grupo de Bionanoelectrónica, Cali, Colombia, {wjperezh, jvelasco}@univalle.edu.co

Abstract

Software-Based Self-Test (SBST) is increasingly used for testing processor cores embedded in SoCs, mainly because it allows at-speed, low-cost testing while requiring limited (if any) hardware modifications to the original design. However, the method requires effective techniques for generating suitable test programs and for monitoring the results. In processor core testing, a particularly complex module to test is the cache controller, due to its limited accessibility and observability. In this paper we propose a hybrid methodology that exploits an Infrastructure Intellectual Property (I-IP) to complement an SBST algorithm for testing the data and instruction cache controllers of embedded processors in SoCs. In particular, the I-IP may be programmed to monitor the system buses and generate the appropriate feedback about the correct result of the executed programs (in terms of obtained hit or miss operations). The effectiveness of the proposed methodology is evaluated on a sample SoC design.

I. INTRODUCTION

Today's high-performance processors require efficient memory hierarchy subsystems. A memory hierarchy is usually organized into several consecutive levels of memory, each one smaller, faster, and more expensive per byte than the next one. The main goal of such a structure is to provide the final user with a memory system that is as cheap and as fast as possible [1].
One of the most important elements of the memory system that tries to reduce the gap between fast, high-performance processor cores and slow memory devices is the cache memory; indeed, in modern designs cache devices occupy about 50% of the chip area, and this share is still growing. As sketched by the SIA05 [2] technology roadmap, most of the integrated circuit (IC) manufacturing cost today comes from the test and validation processes. Only a few years ago, testing represented a small percentage of the total cost, but (among other factors) the increasing difficulty of generating appropriate testing and validation patterns, and the long processing times required to test an IC, have raised this share to nearly 70%. Indeed, while production costs continue to go down, the testing cost slope remains flat or trends upward.

These considerations are particularly valid for the first level of memory hierarchy systems embedded in SoCs (Systems on Chip), i.e., caches, since no methodology is yet mature enough to cope with all testing issues. In the SoC testing arena, Software-Based Self-Test (SBST) strategies are increasingly used for microprocessor and peripheral testing. These strategies use the processor core itself to execute a program that is able to test the processor and possibly other cores accessible by it [9]. The program is loaded into an internal memory of the SoC; the on-board processor then executes it, and finally the result produced by the system is checked by monitoring what is produced on specified output ports or memory variables.

This work was partially supported by the European Union through the ALFA/NICRON Project. Contact author: E. Sanchez, Dip. Automatica e Informatica, Politecnico di Torino, Cso. Duca degli Abruzzi 24, 10129, Torino, Italy. E-mail: ernesto.sanchez@polito.it

2 Politecnico di Torino, Dipartimento di Automatica e Informatica, Torino, Italy, {danilo.ravotto,ernesto.sanchez,
The method has several advantages over traditional hardware-based ones (e.g., scan test): it allows cheap at-speed testing of the SoC; it is relatively fast and flexible; it has very limited, if any, requirements in terms of additional hardware for the test; and it is applicable even when the structure of a core is not known or cannot be modified. Currently, the real challenge of software-based testing techniques is how to generate effective test programs.

An SBST algorithm devoted to testing the circuitry that implements the cache controller has been proposed in [8]: this algorithm assumes that an accurate counter is available, which is exploited to check whether the controller performs the expected hit or miss operations. However, in some cases such accurate timers may not be available or easily usable. This paper proposes an alternative solution for these cases.

In recent years, a new family of cores, called Infrastructure Intellectual Property cores (I-IPs), has been introduced in order to improve (among other things) testing, silicon debugging, and diagnosis facilities in SoCs [15]. For example, in [12] and [13] the authors report the use of I-IPs for testing and fault detection in microprocessor-based systems. As has been experimentally demonstrated, the inclusion of I-IPs in a microprocessor-based SoC neither degrades system performance nor requires an excessive hardware overhead, while it improves the testing and diagnostic capabilities of the SoC.

In this paper we propose a hybrid strategy that exploits SBST techniques and an improved version of an I-IP [12] to support the test of data and instruction cache controllers; the presented method is particularly suitable for post-production and incoming inspection testing of processor cores. The stimuli are generated internally by running specially crafted test programs, while test responses are observed by the I-IP, which generates an error signal when an unexpected behaviour is detected.

The test programs are generated by a flexible and parametric algorithm, previously introduced and detailed in [8], that is able to generate implementation-independent solutions for different cache configurations. In a few words, the algorithms are based on a series of specially crafted memory access operations that generate cache hits and misses for every data element in the cache memory, while thoroughly exciting the rest of the cache controller. In the former approach [8] the authors exploited embedded timers to determine the correctness of the executed programs; in this case, instead, we propose the inclusion of an I-IP to accurately and inexpensively verify whether memory accesses have been performed.
In order to show the suitability and usefulness of the presented method, the approach was implemented on a SoC containing a processor core with data and instruction caches; we also report the real performance of the approach, detailing the stuck-at fault coverage and the test length, as well as the hardware overhead required to implement the proposed I-IP.

The rest of the paper is organized as follows: section II outlines the basic concepts required to better understand the rest of the paper; section III describes the proposed approach; sections IV and V illustrate a case study and the gathered experimental results. Finally, section VI concludes the paper.

II. BACKGROUND

Different approaches have been proposed to deal with cache testing. These approaches can be classified into two basic categories: software-based and hardware-based techniques.

Hardware-based approaches usually require considerable modifications to the initial design in order to support the testing procedures, but they normally also allow testing at high operating frequency. For example, in [3] the authors propose a structural modification of the cache architecture in order to improve the IDDQ (quiescent supply current) testing sensitivity. In [4], the authors include an MBIST (Memory Built-In Self-Test) device capable of applying an improved March C- algorithm to L1 and L2 caches.

Software-based approaches, on the other hand, normally propose a direct transformation of March-like tests in order to test mainly the data part of cache memories [5], [6], [7]. The main drawback of these methodologies is that they require special system features to allow main memory write and read operations while the cache memory is disabled; such enabling-disabling mechanisms may not be present in the normal operation mode of a SoC.
Moreover, these methods do not adequately deal with the test of the control part of the cache, despite the importance of this part for the correct behaviour of the processor. Alternatively, in [8] the authors proposed an SBST-based technique suitable for testing the controller of data cache memories. This technique does not require special features in the cache memory but relies on an accurate counter in order to validate cache memory operations.

A. Cache memory description

Cache memories are small but fast memories placed in the first levels of the memory hierarchy. Roughly speaking, caches are composed of two clearly separated parts: the data part and the control part.

The data part is organized in cache lines, each usually containing a data storage portion, an address tag, and some validity bits. Each data storage portion is called a block; cache blocks are grouped together to form the so-called data storage array. Tags and validity bits are usually stored together; tags contain the high-order address bits of the memory block stored in each line, whereas validity bits hold information about the current status of the cache line. The three most widely adopted organization models for the cache are direct mapped, set associative and fully associative, while the most common write mechanisms are write-through and write-back [1].

The cache control circuit manages the data part by performing the following tasks: determining whether a required block is in the cache (hit) or not (miss), placing a block in the appropriate cache position, finding a block in the cache, replacing a block on a miss, and finally, in the case of data caches, writing the updated information to the main memory when required. By comparing the most significant bits of the effective address with the cache tags, the cache controller is able to determine whether a required data word or instruction is in the cache or not.
In the case of read misses, the controller circuit performs block replacements following the adopted replacement policy (random, least recently replaced, least recently used, etc.). The valid bits are used together with the tags to determine whether a cache block contains valid information. In caches that implement the write-back policy, dirty bits indicate the presence of modified data inside the cache memory.

B. Cache controller testing challenges

The cache controller is a module deeply embedded in the SoC architecture, and it is not possible to access this part of the design directly. Instead, it is necessary to excite the cache controller indirectly and thoroughly by executing specially crafted programs that perform several memory operations, carefully selecting the effective address of every memory access; depending on the cache, the memory operations must be read operations (during the fetch stage, in the case of the instruction cache) or read and write operations (in the case of data caches).

A suitable test program for the cache memory controllers should be able to activate and verify the correct behavior of three functional blocks within the cache control circuitry: (i) the look-up circuit in charge of identifying the cache block storing the required information, (ii) the circuit in charge of implementing the replacement policy for every cache block, and (iii) in the specific case of the data cache, the circuitry in charge of implementing the write support strategy.

On the other hand, introducing special hardware specifically devoted to testing the cache controller is not always feasible, especially if designers cannot access the processor core description. Exploiting an I-IP can therefore support the application of the test program by providing appropriate feedback about its execution results. Additionally, the I-IP does not require any modification of the cache memory controller (for example, a cache memory freeze facility); it is simply an additional core plugged into the SoC.

III. PROPOSED APPROACH

In the following, a hybrid approach for testing the controller circuit of data and instruction caches embedded in a SoC is presented.
In order to cope with the different constraints imposed by data and instruction caches, two algorithms have been devised and are briefly summarized here. In order to support the observation of the produced results, the method introduces an I-IP connected to the system buses, which the SBST test program uses to receive feedback about the program execution and to identify faulty conditions in the cache controller. The same I-IP is exploited for testing both controllers; it continuously sniffs the bus in order to provide the SBST test programs with the input needed to determine the hit/miss condition in the cache.

A. Algorithm notation

In this paragraph we introduce the notation used in the description of the algorithms. The following parameters are used by the algorithms to calculate the memory addresses, as well as the exact number of write/read operations in the cache: As is the start address of the cacheable memory zone, Cs is the Cache size, Bs is the Block size, Ds is the Data size, Is is the Instruction size, Ms is the implemented Memory size and Wp is the Write policy (Write Back or Write Through). Consequently, Nb = Cs/Bs is the number of blocks in the cache, Nd = Bs/Ds is the number of data words in a block, and Ni = Bs/Is is the number of instructions in a block.

B. Data cache

The algorithm proposed for data cache testing (Figure 1) is divided into two parts, each structured as a for loop. In both cases, a series of memory operations is performed Na = Nd*Nb times, where Na represents the total number of data elements contained in the cache.
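The derived quantities in the notation can be checked with a few lines of Python. This is only a sketch: the class name `CacheParams` is illustrative, and the 16-byte block size and 4-byte word and instruction sizes are assumed values, not figures taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheParams:
    """Cache parameters in the paper's notation (all sizes in bytes)."""
    Cs: int  # cache size
    Bs: int  # block size
    Ds: int  # data word size
    Is: int  # instruction size

    @property
    def Nb(self) -> int:
        return self.Cs // self.Bs  # number of blocks in the cache

    @property
    def Nd(self) -> int:
        return self.Bs // self.Ds  # data words per block

    @property
    def Ni(self) -> int:
        return self.Bs // self.Is  # instructions per block

    @property
    def Na(self) -> int:
        return self.Nd * self.Nb  # data elements in the whole cache


# Assumed example: 8 KB cache, 16-byte blocks, 4-byte words and instructions.
p = CacheParams(Cs=8192, Bs=16, Ds=4, Is=4)
print(p.Nb, p.Nd, p.Ni, p.Na)  # 512 4 4 2048
```

With these assumed values each loop of the data cache algorithm would perform Na = 2048 iterations.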
    Variables Initialization
    Flush operation (or function)
    for (n = 0 to Nd-1) do
        for (m = 0 to Nb-1) do
            A = Address_Calculation(As,Cs,Ds,Nd,Nt,n,m)
            Set_IIP(1)              //Memory access allowed
            Write_Data (b, A)       //w(miss)
            Set_IIP(0)              //Memory access avoided
            Rx = Read_Data (A)      //r(hit)
            Idle_IIP()              //I-IP idle state
            if (Rx ≠ b) then
                Abnormal Termination <<Cache Faulty>>
            end if
    Restart variables
    for (n = 0 to Nd-1) do
        for (m = 0 to Nb-1) do
            A = Address_Calculation (As,Cs,Ds,Nd,Nt,n,m)
            Set_IIP(1)              //Memory access allowed
            Rx = Read_Data (A)      //r(miss)
            if (Wp = WB) then
                Set_IIP(0)          //Memory access avoided
            else
                Set_IIP(1)          //Memory access allowed
            end if
            Write_Data (/b, A)      //w(hit)
            Set_IIP(0)              //Memory access avoided
            Ry = Read_Data (A)      //r(hit)
            Idle_IIP()              //I-IP idle state
            if (Ry Rx 0) then
                Abnormal Termination <<Cache Faulty>>
            end if
    Normal Termination << Cache OK >>
    End
    I-IP_Exception
        Abnormal termination << Cache Faulty >>
    end Exception

Figure 1. Proposed algorithm for the data cache controller

After initializing some variables and carrying out a complete flush of the data cache (thus invalidating all cache lines), the first loop sets the cache memory to a known initial condition and writes into the main memory some information that is exploited later in the second loop. It is important to notice that the I-IP is programmed (using the macro Set_IIP()) before and after every read/write operation in the memory. The structure of the I-IP and its functioning are detailed in section III.C.

Once the memory address A has been computed, data b is written to the memory. In this case, every write instruction (Write_Data) generates a miss, since the whole cache is invalid. Later on, data b is read from memory at the same memory address A, always producing a hit. Finally, for every pair of write and read instructions the acquired data (Rx) is compared with the reference value b. This pair of loops is repeated Na times in order to guarantee a set of write-read operations on every data element of the cache.

The second loop of the algorithm presented in Figure 1 exploits the same address calculation function for reading and writing; however, the value used for writing is the bit-by-bit logical complement of the value used in the first loop. In this case the loop body contains a read (r(miss)), write (w(hit)), read (r(hit)) sequence in order to excite the rest of the cache controller.

As mentioned before, the accessed addresses are essential to carefully excite the cache controller; to this end a function called Address_Calculation is used. The function supplies the cache entries with information that is as diverse as possible. Additionally, this function guarantees cache accesses to every position of each cache block, and at the same time the generated addresses must fully excite all circuits related to the tag, index and offset fields. The generic form of the address calculation function is the following:

$$A = \mathit{As} + \mathit{a\_tag} + \mathit{a\_index} + \mathit{a\_offset} \qquad (1)$$

where the values denoted with a_ supply the desired patterns to be applied to every address section, while As is the initial address of the cacheable memory. The following equations guarantee a marching one (2) and a marching zero (3) series on the tag, while thoroughly exciting the index and offset parts:

$$A = \mathit{As} + 2^{k}\,\mathit{Cs} + m\,\mathit{Bs} + j\,\mathit{Ds} \qquad (2)$$

$$A = \mathit{As} + (2^{\mathit{Nt}} - 2^{k} - 1)\,\mathit{Cs} + m\,\mathit{Bs} + j\,\mathit{Ds} \qquad (3)$$

where k, m and j determine the limits for these functions, and are strictly defined by the cache specification parameters:

    k = 0 .. Nt-1
    m = 0 .. Nb-1
    j = 0 .. Nd-1

where Nt is the number of bits required to address the cache blocks, Nb is the number of cache blocks, and Nd is the number of data per block.
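Equations (2) and (3) can be sketched in Python to show the patterns they place on the tag. The function names `marching_one`, `marching_zero` and `tag_bits`, and the parameter values (including Nt = 4), are assumptions chosen only to keep the printout short; they do not come from the paper.

```python
def marching_one(As, Cs, Bs, Ds, k, m, j):
    """Equation (2): a single 1 walks through the tag field."""
    return As + (1 << k) * Cs + m * Bs + j * Ds


def marching_zero(As, Cs, Bs, Ds, Nt, k, m, j):
    """Equation (3): a single 0 walks through an Nt-bit field of ones."""
    return As + ((1 << Nt) - (1 << k) - 1) * Cs + m * Bs + j * Ds


# Hypothetical parameters, not taken from the paper.
As, Cs, Bs, Ds, Nt = 0x0, 8192, 16, 4, 4


def tag_bits(A):
    """Multiple of Cs carried by address A, shown as an Nt-bit string."""
    return format((A - As) // Cs, f"0{Nt}b")


ones = [tag_bits(marching_one(As, Cs, Bs, Ds, k, m=3, j=1)) for k in range(Nt)]
zeros = [tag_bits(marching_zero(As, Cs, Bs, Ds, Nt, k, m=3, j=1)) for k in range(Nt)]
print(ones)   # ['0001', '0010', '0100', '1000']
print(zeros)  # ['1110', '1101', '1011', '0111']
```

The index and offset contributions (m*Bs + j*Ds) stay below Cs, so they do not disturb the tag pattern.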
The implementation of function (2) used to calculate the accessed addresses is reported in Figure 2:

    function Address_Calculation(As,Cs,Ds,Nd,Nt,n,m)
        k = mod (m, Nt);
        j = mod (m, Nd);
        A = As + 2^k * Cs + m * Bs + mod((j+n),Nd) * Ds;
        return (A);
    end function

Figure 2. Marching one function for data cache controller

C. I-IP Description

The algorithm defined above is devised to carefully excite the cache controller and the logic behind it: it is able to guarantee the activation of a miss and a hit in the cache for every data element in the block. The execution of the algorithm, however, must be stopped when the operation in the cache is not the expected one. With a pure SBST methodology this goal may be reached by exploiting an embedded timer that is activated before and deactivated after the target instruction for the cache. Clearly, this is not always possible, especially when the granularity of the timer does not provide enough measurement accuracy to detect one-clock-cycle variations.

The introduction of an I-IP, on the other hand, allows a simpler identification of the memory accesses through continuous sniffing of the system bus. The I-IP is also able to provide the SBST test programs with information about the bus activity through the interrupt mechanism. Finally, the area overhead introduced by the I-IP is negligible with respect to the entire SoC area.

When enabled, the I-IP continuously spies on the bus, identifying the memory cycles. The I-IP is connected to the system bus and is addressable like a normal I/O device or a memory-mapped register. Additionally, the I-IP is connected to one of the interrupt ports of the processor and generates an interrupt when abnormal functioning of the cache is detected. The following table summarizes the bus behavior (in terms of write and read cycles) for the different cache operations and write policies.
TABLE 1. BUS BEHAVIOUR ON DIFFERENT CACHE OPERATIONS

    Cache Operation   Write Policy   Write Cycles   Read Cycles
    Read Miss         WB             Possible       Certainly
                      WT             None           Certainly
    Read Hit          WB             None           None
                      WT             None           None
    Write Miss        WB             Possible       Certainly
                      WT             One            Certainly
    Write Hit         WB             None           None
                      WT             One            None

In Table 1, some entries of the Write Cycles column are labeled Possible, indicating that some memory bus cycles may be performed in order to write the dirty words or the entire line to the main memory. It is interesting to note that, for example, in a data cache implementing the write-through policy there are no write cycles on a read miss, since every write operation in the cache is simultaneously performed in the main memory.

To simplify the I-IP and limit the area overhead it introduces, only the memory accesses are checked. The I-IP is designed to be transparent to the write policy; thus, when different bus behaviors are possible, the test program has to take care of them. Moreover, since the main objective of the I-IP is to verify whether the current read or write operation is performed in the cache or in the main memory, no checks are made on the data value, since these checks are already performed by software.

Following these considerations, the I-IP was designed to be programmed to check only whether a memory access is performed or not. The I-IP continuously monitors the bus and verifies whether a memory access takes place. If a violation is detected, the I-IP generates an interrupt that breaks the flow of the program and signals a faulty condition in the cache.
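The interplay between the test program and the I-IP can be illustrated with a toy Python model of the first loop of Figure 1. Everything here is a simplifying assumption rather than the authors' implementation: the names `BusMonitor`, `ToyWriteThroughCache`, `IIPInterrupt` and `run_first_loop` are hypothetical, the cache is direct mapped, write-through and write-allocate (as the write-through rows of Table 1 suggest), addresses are generated linearly instead of through Address_Calculation, and the monitor only flags bus cycles seen while accesses are supposed to be avoided.

```python
class IIPInterrupt(Exception):
    """Models the interrupt raised by the I-IP on an unexpected bus access."""


class BusMonitor:
    """Toy stand-in for the I-IP: watches bus cycles while programmed."""

    def __init__(self):
        self.forbid = False

    def set_iip(self, allowed):      # Set_IIP(1) / Set_IIP(0)
        self.forbid = not allowed

    def idle_iip(self):              # Idle_IIP(): stop sniffing
        self.forbid = False

    def bus_cycle(self):
        if self.forbid:
            raise IIPInterrupt("memory access while a hit was expected")


class ToyWriteThroughCache:
    """Direct-mapped, write-through, write-allocate cache (tags only)."""

    def __init__(self, Cs, Bs, monitor, stuck_miss=False):
        self.Bs, self.Nb = Bs, Cs // Bs
        self.tags = [None] * self.Nb     # None means invalid (flushed)
        self.memory = {}                 # main memory contents
        self.monitor = monitor
        self.stuck_miss = stuck_miss     # injected fault: look-up never hits

    def _access(self, addr):
        block = addr // self.Bs
        index, tag = block % self.Nb, block // self.Nb
        hit = self.tags[index] == tag and not self.stuck_miss
        if not hit:
            self.monitor.bus_cycle()     # line fill from main memory
            self.tags[index] = tag
        return hit

    def write(self, addr, value):
        self._access(addr)
        self.monitor.bus_cycle()         # write-through: one write cycle
        self.memory[addr] = value

    def read(self, addr):
        self._access(addr)
        return self.memory.get(addr, 0)


def run_first_loop(stuck_miss=False, As=0x1000, Cs=256, Bs=16, Ds=4, b=0x55AA55AA):
    mon = BusMonitor()
    cache = ToyWriteThroughCache(Cs, Bs, mon, stuck_miss)
    try:
        for n in range(Bs // Ds):
            for m in range(Cs // Bs):
                A = As + m * Bs + n * Ds     # simplified address generation
                mon.set_iip(True)            # memory access allowed
                cache.write(A, b)
                mon.set_iip(False)           # memory access avoided
                rx = cache.read(A)           # must be served by the cache
                mon.idle_iip()
                if rx != b:
                    return "Cache Faulty"
    except IIPInterrupt:
        return "Cache Faulty"
    return "Cache OK"


print(run_first_loop())                  # Cache OK
print(run_first_loop(stuck_miss=True))   # Cache Faulty
```

In the faulty run the data value read back is still correct, so a purely data-based check would pass; only the monitor notices that the read reached the bus, which is the kind of observability the I-IP is meant to add.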

It is also possible to stop the continuous I-IP sniffing of the bus, so that the program flow can continue normally if the cache works correctly. In the proposed algorithms, the I-IP is programmed through the macro Set_IIP(), while the macro Idle_IIP() puts the I-IP in an idle state in which the continuous sniffing is stopped.

D. Instruction cache

Even though there are several similarities with the data cache, some additional considerations must be highlighted when moving to the test of the instruction cache controller: (i) in order to adequately excite the control part of the instruction cache, a set of instructions placed at carefully selected memory addresses must be executed; (ii) no write instructions are allowed in instruction caches, since program memory is usually only read and not modified; (iii) unlike for data caches, the proposed algorithm must consider possible overlapping problems.

As depicted in Figure 3, after a first configuration step (Variable Initialization and Memory Flushing), three main elements compose the algorithm for testing the instruction cache controller:

- Jumping main loop: this loop iterates Na = Cs/Is times, and for every iteration a properly generated address is calculated by the Address_Calculation function; the control flow of the program is then changed by jumping to this position; the main goal of this jumping loop is to guarantee the execution of instructions carefully placed at memory addresses able to thoroughly excite the cache controller;
- I-IP configuration: in order to guarantee the correct execution of the devised program, the I-IP mechanism described above is used;
- Atomic routines: small routines previously placed in the main memory, in charge of executing some instructions that ensure miss and hit events on an appropriate word of every block in the instruction cache; at the end of its execution, each routine returns the program control flow to the main jump loop.

    Variables Initialization
    Flush operation (or function)
    for (n = 0 to Nd-1) do
        for (m = 0 to Nb-1) do
            A = Address_Calculation (As,Cs,Is,Nb,Nd,Nt,n,m)
            Set_IIP(0)          //Memory access avoided
            Jump to A           //r(miss)
            Set_IIP(1)          //Memory access allowed
            Jump to A           //r(hit)
    Normal termination << Cache OK >>
    End
    I-IP_Exception
        Abnormal termination << Cache Faulty >>
    end Exception

    / Address_A 1 /
        Instruction 1
        Idle_IIP();             //I-IP idle state
        Return
    ...
    / Address_A Na /
        Instruction 1
        Idle_IIP();             //I-IP idle state
        Return

Fig. 3. Instruction Cache Controller; proposed algorithm

It is worth noting that in the jumping main loop two jumps are performed to the same address; the first one always produces a miss in the cache, while the second always produces a hit. In this way, it is possible to guarantee the miss/hit sequence in the cache also for routines stored across two blocks. It is important to notice that the jumping main loop is placed in a non-cacheable zone in order to avoid loading its machine code into the cache.

The pseudo-code of the Address_Calculation function for instruction cache controller testing is provided in Figure 4.

    function Address_Calculation (As,Cs,Is,Nd,Nt,n,m)
        k = mod (m, Nt);
        j = mod (m, Ni);
        if (mod(n,2)==0) then
            pag = 2^k;                  //TAG: marching one
        else
            pag = (2^Nt - 1) - 2^k;     //TAG: marching zero
        endif
        A = As + pag*Cs + m*Bs + mod((j+n),Ni)*Is;
        return (A);
    end function

Fig. 4. Address calculation function pseudo-code

In order to avoid overlapping problems, in this case the tag field is excited with a marching one strategy and a marching zero strategy on alternate consecutive cache fillings.

IV. CASE STUDY

The effectiveness of our hybrid methodology has been experimentally evaluated on a benchmark SoC derived from a publicly available one [11], containing the fully pipelined RISC processor OpenRISC 1200 described at RTL in Verilog, the I-IP described in the proposed approach, 64 KB of on-chip RAM and several other IP cores.
The SoC uses a 32-bit WISHBONE bus rev. B for the communication between the cores. The operating frequency of the SoC is 150 MHz. The OpenRISC processor is a 32-bit scalar RISC with Harvard architecture, a 5-stage integer pipeline, and virtual memory support. It includes supplementary functionalities, such as a programmable interrupt controller, a power management unit and a high-resolution tick timer. The processor implements an 8 Kbyte data cache and an 8 Kbyte 1-way direct-mapped instruction cache; both caches implement a write-through mechanism.

Table 2 shows details about the gate-level descriptions of the data cache and instruction cache. It is important to note that the information provided at gate level does not include the memory elements of the cache memory.

TABLE 2. DATA CACHE AND INSTRUCTION CACHE DESCRIPTIONS INFORMATION

                Gates    Faults
    D-cache       985     4,528
    I-cache       831     3,805

The I-IP counts about 150 gates and is implemented as a memory-mapped register sharing the system bus with the memory banks. If the I-IP functionalities described above are added to the I-IP presented in [12], the area overhead is lower than 2%.

V. EXPERIMENTAL RESULTS

Following the guidelines detailed above, we developed two test programs in assembly language for the OpenRISC 1200 microprocessor core. The first one implements the algorithm described in section III.B and aims at testing the data cache controller of the microprocessor. This assembly program contains 230 lines of code and its machine code occupies about 410 bytes, requiring about 159 K clock cycles to be executed.

The second assembly program implements the algorithm outlined in section III.D. The program counts 10,370 lines of code, occupies about 29 Kbytes of memory and takes about 435 K clock cycles to be executed. The test size in this case is larger than for the data cache, since the test program is structured as many atomic routines stored at different memory addresses. This is necessary because the instruction cache cannot be written directly, but only accessed by means of jumps to predefined addresses in the test program, as described above.

Each test program was simulated using a commercial logic simulator (ModelSim 6.2e by Mentor Graphics), and fault simulation results were gathered against stuck-at faults using the TetraMAX version Z tool by Synopsys. The experiments were performed on a PC with an Intel Core 2 with 2 GB of RAM. To prove the effectiveness of the proposed algorithms at the gate level, the data cache and the instruction cache were synthesized using a generic home-developed technology library.
Targeting the stuck-at fault model, the proposed methodology achieved more than 95% stuck-at fault coverage for both the instruction and data cache memory controllers. The undetected faults are mainly due to the address space limitation of the processor in this specific SoC implementation, which prevents the registers involved in the memory exchanges from altering some bits in the higher part of the address without causing a memory exception.

VI. CONCLUSIONS AND FUTURE WORKS

In this paper we proposed a mixed methodology for testing the control part of the data and instruction caches often present in processor cores embedded in a SoC. For both controllers, a fully parametric algorithm able to deal with different cache configurations, while maintaining linear complexity with respect to the cache size, was previously proposed; it exploited a timer to check the correct behavior of the controller. In order to remove the requirement for the timer, while still avoiding any change to the processor/cache architecture, a hybrid solution has been proposed in this paper, based on integrating the same SBST algorithm with an external module (in the form of an I-IP) added to the SoC.

The effectiveness of the proposed methodology has been evaluated on a sample SoC that included an improved version of the I-IP core described previously: for both controllers the methodology reaches a high stuck-at fault coverage at a reduced cost in terms of code memory size and test application time.

It is also interesting to note that, even though cache memories are deeply embedded in SoCs, the inclusion of an improved I-IP avoids hardware modifications to the processor core while still achieving very high coverage figures.

We are currently working toward the evaluation of this methodology with respect to delay faults in cache controllers.

REFERENCES

[1] John L. Hennessy, David A. Patterson, "Computer Architecture", 3rd edition, Morgan Kaufmann Publishers.
[2] Semiconductor Industry Association, "International Technology Roadmap for Semiconductors 2006 Update".
[3] S. Bhunia, Li Hai, K. Roy, "A high performance IDDQ testable cache for scaled CMOS technologies", IEEE Asian Test Symposium (ATS '02).
[4] P. J. Tan, Le Tung, Mantri Prasad, J. Westfall, "Testing of UltraSPARC T1 Microprocessor and its Challenges", IEEE International Test Conference, 2006 (ITC '06).
[5] Sultan M. Al-Harbi, Sandeep K. Gupta, "A Methodology for Transforming Memory Tests for In-System Testing of Direct Mapped Cache Tags", 16th IEEE VLSI Test Symposium (VTS '98).
[6] J. Sosnowski, "In system of cache memories", IEEE International Test Conference, 1995 (ITC).
[7] J. Sosnowski, "Improving software based self-testing for cache memories", Proc. of IEEE 2nd Int. Design and Test Workshop, 2007.
[8] W. J. Perez, J. Velasco, D. Ravotto, E. Sanchez, M. Sonza Reorda, "Software-Based Self-Test Strategy for Data Cache Memories Embedded in SoCs", IEEE Workshop on Design and Diagnostics of Electronic Systems (DDECS), 2008.
[9] N. Kranitis, A. Paschalis, D. Gizopoulos, G. Xenoulis, "Software-based self-testing of embedded processors", IEEE Transactions on Computers, vol. 54, issue 4, 2005.
[10] Ad J. Van De Goor, "Using March Tests to Test SRAMs", IEEE Design & Test, vol. 10, issue 1, 1993.
[11] Opencores.
[12] P. Bernardi, M. Grosso, M. Rebaudengo, M. Reorda, "Exploiting an I-IP for both Test and Silicon Debug of Microprocessor Cores", Microprocessor Test and Verification, IEEE Sixth International Workshop.
[13] P. Bernardi, L. Bolzani, M. Sonza, "A Hybrid Approach to Fault Detection and Correction in SoCs", IEEE International On-Line Testing Symposium (IOLTS), 2007.
[14] IEEE P1500 Standard for Embedded Core Test (SECT).
[15] Y. Zorian, "What Is an Infrastructure IP?", IEEE Design and Test of Computers, vol. 19, no. 3, pp. 5-7, May/June.
