Single error correction, double error detection and double adjacent error correction with no mis-correction code

This article has been accepted and published on J-STAGE in advance of copyediting. Content is final as presented.
IEICE Electronics Express, Vol.* No.*,*-*

Ho-yoon Jun and Yong-surk Lee a)
School of Electrical and Electronic Engineering, Yonsei University, 134 Sinchon-dong, Seodaemun-gu, Seoul 120-749, Korea
a) yonglee@yonsei.ac.kr

Abstract: A single error correction, double error detection and double adjacent error correction (SEC-DED-DAEC) code without mis-correction of double non-adjacent errors is proposed to achieve high reliability protection against soft errors in on-chip memory systems. To eliminate mis-correction among information bits, the orthogonality of orthogonal Latin square codes is incorporated into the H-matrix of the proposed code. Experimental results show that the proposed code has no mis-correction and that its implementation overhead is lower than that of other SEC-DED-DAEC codes. The proposed SEC-DED-DAEC code is suitable for on-chip memory applications that require high reliability.

Keywords: error correcting code, soft error, multiple cell upset, memory
Classification: Integrated circuits

IEICE 2013
DOI: 10.1587/elex.10.20130743
Received September 24, 2013
Accepted September 25, 2013
Publicized October 08, 2013

References
[1] Soonyoung Lee, Sang Hoon Jeon, Sanghyeon Baeg, and Dongho Lee: IEEE Trans. Nuclear Science 60 [2] (2013) 1384.
[2] Sanguhn Cha and Hongil Yoon: IEICE Electron. Express 10 (2013) 20130103.
[3] M. Y. Hsiao: IBM J. Res. and Dev. 14 [4] (1970) 395.
[4] Costas Argyrides, Pedro Reviriego, and Juan Antonio Maestro: IEEE Trans. Reliability 62 [1] (2013) 238.
[5] Shu Lin and Daniel J. Costello: Error Control Coding (Prentice Hall, 2004) 2nd.
[6] Avijit Dutta and Nur A. Touba: IEEE 25th VLSI Test Symp. (2007) 349.
[7] Michael Richter, Klaus Oberlaender, and Michael Goessel: IEEE 14th Int. On-Line Testing Symp. (2008) 37.
[8] Rudrajit Datta and Nur A. Touba: IEEE 27th VLSI Test Symp. (2009) 47.
[9] Adam Neale and Manoj Sachdev: IEEE Trans. Device and Materials Reliability 13 [1] (2013) 223.
[10] M. Y. Hsiao, D. C. Bossen, and R. T. Chien: IBM J. Res. and Dev. 14 [4] (1970) 390.

1 Introduction
In the deep sub-micron CMOS regime, neutron-induced soft errors are becoming a nontrivial issue in on-chip memory systems. Neutron-induced soft errors lead to multiple cell upset (MCU) in physically adjacent regions, and these errors cause unrecoverable system malfunctions [1]. To address these problems, an error correcting code (ECC) is employed in on-chip memory systems [2]. The single error correction and double error detection (SEC-DED) code designed by Hsiao [3] is the most widely adopted code in ECC-protected memory systems. The most significant feature of Hsiao's code is its fast encoding and decoding with a small number of parity check bits. In addition, an interleaving structure combined with the SEC-DED code and scrubbing is effective against MCU [4]. However, long-distance interleaving is accompanied by aspect ratio problems in floor planning and by overhead in performance, area and power consumption. Scrubbing also requires additional power consumption and clock cycles because it periodically reads the memory content. Even though more powerful cyclic-based ECCs, such as the Reed-Solomon (RS) code, the Bose-Chaudhuri-Hocquenghem (BCH) code and the Euclidean Geometry (EG) code, have been proposed to address multiple errors [5], they require longer latency, larger area, higher power consumption and more parity check bits than SEC-DED codes.

Recently, double adjacent error correction (DAEC) codes have been proposed in the ECC literature [6-9]. However, they do not resolve the mis-correction of double non-adjacent errors, because the syndrome of a double non-adjacent error can equal that of a double adjacent error. This potential problem can also lead to system failure. To achieve highly reliable memory systems, the mis-correction problem must be removed.
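The mis-correction mechanism described above can be illustrated with a minimal Python sketch. The five 5-bit columns in TOY_COLUMNS are a hypothetical toy example chosen only for illustration and are not taken from any of the cited codes; the helper names syndrome and decode are likewise assumptions. The sketch shows a non-adjacent double error producing the same syndrome as an adjacent double error, so a decoder that corrects adjacent double errors flips the wrong bits.

```python
# Hypothetical toy H-matrix columns (information part only), one integer per
# column, one bit per row. Every column has odd weight (three 1s).
TOY_COLUMNS = [0b00111, 0b11010, 0b11001, 0b01011, 0b10101]


def syndrome(cols, error_positions):
    """XOR of the H-matrix columns at the flipped bit positions."""
    s = 0
    for p in error_positions:
        s ^= cols[p]
    return s


def decode(cols, s):
    """Return the bit positions a simple SEC-DED-DAEC decoder would flip.

    () means no error, None means detected but uncorrectable.
    """
    if s == 0:
        return ()
    if s in cols:                       # single error: syndrome equals a column
        return (cols.index(s),)
    for i in range(len(cols) - 1):      # double adjacent error
        if cols[i] ^ cols[i + 1] == s:
            return (i, i + 1)
    return None                         # double error, detected only


if __name__ == "__main__":
    adjacent = syndrome(TOY_COLUMNS, (3, 4))
    non_adjacent = syndrome(TOY_COLUMNS, (0, 2))
    print(bin(adjacent), bin(non_adjacent))
    # The decoder cannot tell the two cases apart and flips bits 3 and 4.
    print(decode(TOY_COLUMNS, non_adjacent))
```

In this toy case the errors at positions 0 and 2 are "corrected" by flipping positions 3 and 4, which leaves four wrong bits instead of two. This is the failure mode that the code proposed in this letter is designed to rule out.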
In this letter, we propose a SEC-DED-DAEC code with no mis-correction to realize highly reliable protection against soft errors in on-chip memory systems. The proposed code is evaluated and compared with alternatives. The results show that the proposed code introduces no mis-correction and that its implementation overhead is lower than that of other SEC-DED-DAEC codes.

2 Proposed code generation rules
The proposed code is derived from Hsiao's code. The H-matrix of the proposed code consists of odd-weight columns, i.e. the number of 1s in every column vector is odd. In the proposed SEC-DED-DAEC code, the syndromes for single errors and double adjacent errors must be unique so that these errors can be corrected. For single error correction, all columns in the H-matrix correspond to syndromes for SEC. For double adjacent error correction, the XOR result of two adjacent column vectors in the H-matrix should also correspond to syndromes for DAEC. However, because all column vectors in the H-matrix are of odd weight, the XOR result for any double error, adjacent or not, is of even weight, so both kinds of double error share the even-weight part of the syndrome space. Therefore, the XOR results for double adjacent errors and double non-adjacent errors should be separated. Additionally, the XOR results for non-adjacent errors can overlap with each other because they are uncorrectable errors that can only be detected.

[Figure: the H-matrix, with columns numbered 1 to 12 and r XOR gates, where r is the length of columns, mapped onto the syndrome space. Odd weight: syndromes for SEC (unique). Even weight: syndromes for DAEC (unique) and syndromes for non-adjacent DED (overlapped). All-zero syndrome (= no error).]
Fig. 1. Block diagram of mapping syndrome space onto H-matrix

Fig. 1 shows a block diagram for mapping the syndrome space onto the H-matrix of the proposed code, which consists of 12 r-tuple column vectors. The size of the syndrome space is $2^{r}$. The subspace of syndromes for SEC has size $n$, which is the length of a codeword. The subspace of syndromes for double errors has size $2^{r-1} - 1$. The subspace of syndromes for adjacent error correction has size $k - 1$, where $k$ is the length of information. If the values of the syndrome are all zero, there is no error.

The H-matrix of the proposed code is generated to satisfy the following constraints:
1. Every column weight should be nonzero.
2. Every column should be distinct.
3. Every column should be of odd weight.
4. The XOR result of each pair of adjacent column vectors should be distinct.
5. The XOR result of two non-adjacent column vectors in the parity check matrix should not overlap with that of any two adjacent column vectors.
6. The XOR result of one column in the parity check matrix and one column in the identity matrix may overlap with that of two adjacent column vectors.

The first two constraints provide a Hamming distance of 3 for SEC. The third constraint provides a Hamming distance of 4 for SEC-DED. The fourth constraint provides DAEC capability, and the fifth constraint completely eliminates mis-correction caused by double non-adjacent bit errors among information bits.
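Constraints 1 to 5 can be checked mechanically. The following Python sketch is one possible way to do so, not the authors' implementation: the function name check_constraints, the integer bit-mask representation of columns and the small example column lists are all assumptions made for illustration. Constraint 6 is a permission rather than a requirement, so pairs involving identity-matrix columns are deliberately not tested for overlap.

```python
from itertools import combinations


def weight(col):
    """Number of 1s in a column stored as an integer bit mask."""
    return bin(col).count("1")


def check_constraints(info_cols, r):
    """Check constraints 1-5 for the information-bit columns of H.

    info_cols: columns of the parity check matrix in physical order,
               one integer per column, one bit per row (hypothetical format).
    r:         number of parity check bits (rows); the identity matrix
               columns are the r weight-1 bit masks.
    """
    identity = [1 << i for i in range(r)]
    all_cols = list(info_cols) + identity
    adjacent = [a ^ b for a, b in zip(info_cols, info_cols[1:])]
    adjacent_set = set(adjacent)
    non_adjacent = [
        info_cols[i] ^ info_cols[j]
        for i, j in combinations(range(len(info_cols)), 2)
        if j - i > 1
    ]
    return {
        "nonzero": all(c != 0 for c in all_cols),                         # 1
        "distinct": len(set(all_cols)) == len(all_cols),                  # 2
        "odd_weight": all(weight(c) % 2 == 1 for c in all_cols),          # 3
        "adjacent_xor_distinct": len(adjacent_set) == len(adjacent),      # 4
        "no_miscorrection": all(x not in adjacent_set
                                for x in non_adjacent),                   # 5
    }


if __name__ == "__main__":
    # Toy 5-row examples (assumed for illustration, not from the letter).
    print(check_constraints([0b00111, 0b11001, 0b11010], r=5))
    print(check_constraints([0b00111, 0b11010, 0b11001, 0b01011, 0b10101], r=5))
```

The second toy list satisfies constraints 1 to 4 but violates constraint 5, because the XOR of its first and third columns equals the XOR of its last two columns.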
In addition, interleaving using a column Mux is used to achieve effective regularity of the SRAM layout, because the bit-cell pitch in the horizontal direction of SRAM columns is typically smaller than that of an I/O circuit, including the sense amplifier and write driver. As a result, most information bits are physically stored away from parity check bits. Therefore, the last constraint is acceptable.

3 Proposed code construction procedures
The proposed code is constructed using a heuristic approach. For an $r \times k$ parity check matrix, the number of possible choices $n_{pc}$ is defined by

$$ n_{pc} = \sum_{s} \binom{r}{s}, \qquad s = 2i + 1, \quad r \ge s \quad \text{for } i \ge 1. \qquad (1) $$

The variable $s$ is a sample of $r$ and $i$ is an integer greater than zero, so only odd column weights of 3 or more are counted. For example, for a (39,32) code, $r = 7$ and $n_{pc} = \binom{7}{3} + \binom{7}{5} + \binom{7}{7} = 35 + 21 + 1 = 57$. In addition, there are $\binom{57}{32} \approx 9.9 \times 10^{15}$ choices for 32 columns. Furthermore, there are $32! \approx 2.6 \times 10^{35}$ column permutations to examine against the constraints of the proposed code. As a result, the exhaustive search cost to generate the H-matrix is $32! \times \binom{57}{32} \approx 2.6 \times 10^{51}$. Therefore, a heuristic approach is required to generate the proposed SEC-DED-DAEC code.

For (n, k) code generation, all code parameters (n, k, r) are based on those of Hsiao's code. The H-matrix for the proposed code is constructed using the algorithm shown in Fig. 2.

- Input
    k: the length of information
    r: the length of parity check bits
- Output
    H-matrix satisfying the constraints
- Variables
    OWcpl: odd weight column pools
    EWcpl: even weight column pools
    THx: temporary H-matrix
    FHx: final H-matrix
    LcolTHx: last updated column in THx
    OLW: the number of overlap weight (default value = 1)
    Ccol: confirmed column
    Scol: selected column

 1: r = the number of Hsiao parity check bits(k)
 2: while Ccol < k do
 3:   OWcpl, EWcpl = create_pool(r)
 4:   THx, FHx = create_empty_hmatrix_array(r)
 5:   for i from 3 to i <= r do
 6:     THx = select_seed_column(OWcpl_i)
 7:     while Ccol < k do
 8:       for j from 3 to j <= r do
 9:         Scol = select_column(LcolTHx, OWcpl_j, OLW)
10:         if unique(THx, Ccol, Scol) && no-miscorrect(THx, Ccol, Scol) is true then
11:           Ccol++
12:           update_column_pools_status(OWcpl, EWcpl)
13:           go back to the beginning of for at line 8
14:         end if
15:       end for
16:       OLW++
17:       if OLW > r then
18:         exit while at line 7
19:       end if
20:     end while
21:     if Ccol == k && (one_min(THx) < one_min(last FHx)) is true then
22:       FHx = THx
23:     end if
24:   end for
25:   if Ccol < k then
26:     r++
27:     go back to the beginning of while at line 2
28:   end if
29: end while
Fig. 2. H-matrix construction algorithm for the proposed SEC-DED-DAEC code

First, column pools are created with the pattern shown in Fig. 3. This pattern is useful for eliminating mis-corrections because it adopts characteristics of orthogonal Latin square codes [10].

11111111111111100000000000000000000
11111000000000011111111110000000000
10000111100000011110000001111110000
01000100011100010001110001110001110
00100010010011001001001101001101101
00010001001010100100101010101011011
00001000100101100010010110010110111
Fig. 3 Column pool example to eliminate mis-correction

To access the column pools, a column is always retrieved from left to right. Second, column selection from the column pools is determined by the overlap weight, i.e. the weight of the XOR result of the last confirmed column and the column to be selected, because a constant overlap weight among adjacent columns in the H-matrix is also useful for separating double adjacent errors from double non-adjacent errors. Third, the temporary H-matrix updated with the last column is tested for uniqueness and mis-correction. If an appropriate H-matrix is not found, the process is repeated with varied values of the overlap weight, the seed column and the number of parity check bits. Finally, the H-matrix with the smallest number of 1s is selected to decrease decoding logic area and power consumption.

4 Experimental results
To verify the usefulness of the proposed SEC-DED-DAEC code, it was implemented and simulated in a high-level language. The H-matrix of the proposed (42, 32) code and the associated code parameters are shown in Fig. 4(a) and 4(b), respectively. The H-matrix pattern is similar to that of the orthogonal Latin square code, which is effective in eliminating mis-correction. In Fig. 4(b), the numbers in brackets indicate the decrease in parity check bits compared with the double error correction (DEC) BCH code. The number of 1s is counted only in the parity check matrix, excluding the identity matrix. The number of 1s in the parity check matrix determines the number of XOR gates in the decoder.
Therefore, a smaller number of 1s implies lower decoding complexity and area.

H =
011111111111111000000000000000001000000000
010101010000000111110000100000000100000000
110010101000000100101111010000000010000000
101010000100000010000000001100100001000000
101000000010100000010101000000000000100000
000100000010010001000100010101000000010000
000000100001010010000010000111110000001000
000001000000101101000001101010100000000100
000000001001000000011000111010010000000010
000000010100001000101010000001010000000001
(a)

proposed SEC-DED-DAEC code parameters
k      r        n      # 1s
32     10(-2)   42     96
64     11(-3)   75     254
128    13(-3)   141    536
(b)
Fig. 4 Proposed (42, 32) code and code parameters
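The "# 1s" entry for k = 32 in Fig. 4(b) can be cross-checked against the matrix in Fig. 4(a). The short Python sketch below does this under the assumption that the ten rows are transcribed exactly as printed above; the names FIG4A_ROWS, count_info_ones and info_column_weights are illustrative only. It counts the 1s in the first 32 columns (the parity check matrix without the identity matrix) and lists the distinct column weights.

```python
# Rows of the (42, 32) H-matrix as transcribed from Fig. 4(a).
FIG4A_ROWS = [
    "011111111111111000000000000000001000000000",
    "010101010000000111110000100000000100000000",
    "110010101000000100101111010000000010000000",
    "101010000100000010000000001100100001000000",
    "101000000010100000010101000000000000100000",
    "000100000010010001000100010101000000010000",
    "000000100001010010000010000111110000001000",
    "000001000000101101000001101010100000000100",
    "000000001001000000011000111010010000000010",
    "000000010100001000101010000001010000000001",
]
K = 32  # number of information bits


def count_info_ones(rows, k):
    """Count the 1s in the first k columns, i.e. excluding the identity matrix."""
    return sum(row[:k].count("1") for row in rows)


def info_column_weights(rows, k):
    """Weight (number of 1s) of each of the first k columns."""
    return [sum(row[c] == "1" for row in rows) for c in range(k)]


if __name__ == "__main__":
    print("1s in parity check matrix:", count_info_ones(FIG4A_ROWS, K))
    print("distinct column weights:", set(info_column_weights(FIG4A_ROWS, K)))
```

If the transcription is exact, the count should agree with the 96 listed in Fig. 4(b), and every information column should have odd weight as required by the third constraint.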

Table I shows a comparison of SEC-DED-DAEC codes, including the proposed code and alternatives, where k = 32. Only the proposed code and the DEC BCH code exhibit no mis-correction. However, the numbers of parity check bits and 1s in the DEC BCH code are higher than those of the proposed code. Although the Neale code has the smallest number of 1s, it has a 9.0% mis-correction rate. Thus, the proposed code is suitable for on-chip memory systems with high reliability because of its lack of mis-correction, its low-cost decoder and its acceptable number of parity check bits.

Table I. Comparison of SEC-DED-DAEC codes (k=32)
alternatives        r     # 1s    mis-correction
Hsiao code [3]      7     96      N/A
Dutta code [6]      7     96      53.4 %
Richter code [7]    7     115     39.0 %
Datta code [8]      10    140     8.8 %
Neale code [9]      10    80      9.0 %
DEC BCH             12    200     zero %
proposed code       10    96      zero %

5 Conclusions
We demonstrated a new SEC-DED-DAEC code to achieve high reliability protection against neutron-induced soft errors in on-chip memory systems. We showed that the proposed code has higher reliability than other SEC-DED-DAEC codes because it addresses the mis-correction problem. The implementation cost of the proposed code is also acceptable. Therefore, the proposed code is suitable as a protection scheme against MCU in on-chip memory systems.

Acknowledgments
This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MEST) (No. 2013035233).