Adaptive ECC for Tailored Protection of Nanoscale Memory


Dongyeob Shin, Jongsun Park
Korea University, Seoul, Korea

Jangwon Park
Samsung Electronics, Suwon, Korea

Somnath Paul
Intel Corporation, Hillsboro, OR, USA

Swarup Bhunia
University of Florida, Gainesville, FL, USA

Abstract: Increasing run-time failure in nanoscale memory, specifically at low supply voltages, has emerged as a major challenge in current VLSI design. This paper presents a novel reconfigurable Error Correction Code (ECC) for robust nanoscale memory, which can dynamically adapt, in space and time, to the varying reliability of memory blocks, thus providing the right amount of protection to a memory block at a given time. Our analysis shows that the proposed ECC scheme can efficiently tolerate high run-time failure rates with modest performance and area penalty. It can significantly enhance nanoscale memory reliability at iso-overhead compared to the existing uniform ECC scheme.

Keywords: Memory Failures, Error Correction Code (ECC), Variable ECC, Run-time Protection, Robust Nanoscale Memory

I. INTRODUCTION

Post-silicon calibration and healing techniques have emerged as effective solutions for recovering from manufacturing defects or process-variation-induced failures in digital, analog and RF circuits/systems [] []. In the case of nanoscale memories, aggressive area optimization in the quest for higher integration density has made them highly vulnerable to manufacturing defects as well as run-time failures. Built-in redundancy (e.g. in rows/columns) has been a well-adopted healing approach for memory to adapt to hard defects []. However, tolerance to run-time failures in memory remains a serious challenge for system-on-chip designs [], particularly in the sub-5nm technology regime. Increasing process variation in these process nodes largely aggravates the run-time failure rate. Such failures can affect random or contiguous bit positions in a memory codeword.
They can be primarily caused by: 1) supply voltage or thermal noise, and 2) temporal device degradation due to aging effects [4].

(Author footnote) D. Shin and J. Park are with the School of Electrical Engineering, Korea University, 7B Innovation Hall, Seoul -7, Korea (phone/fax: /+89544, shindy99@korea.ac.kr and jongsun@korea.ac.kr). J. Park is with Samsung Electronics, Suwon 44-74, Gyeonggi, Korea (jw849.park@samsung.com). S. Paul is with Intel Corporation, N.E. 5th Ave, MS JF-55, Hillsboro, OR 974, USA (phone: +5754, somnath.paul@intel.com). S. Bhunia is with University of Florida, A Larsen Hall, Gainesville, FL, USA (swarup@ece.ufl.edu).

In order to address multiple-bit run-time failures in on-chip memories, error correcting codes (ECC) such as single error correction and double error detection (SECDED) have been used together with bit-interleaving [5]. Bit-interleaving distributes contiguous errors into different words and facilitates error correction using SECDED. However, it typically incurs significant energy overhead and half-select issues due to pseudo-read operations. One of the primary drawbacks of the conventional uniform ECC approach is that the same ECC protection is applied to all memory blocks, so it fails to account for the distribution of vulnerability to run-time failures across memory blocks. The conventional, overly pessimistic uniform ECC allocation approach, where the error correction capability is based on the worst-case memory block vulnerability, generally wastes significant silicon area and leads to greater power consumption. With increasing spatial as well as temporal shift in the intrinsic reliability of memory blocks, such uniform protection approaches are unattractive in terms of overhead or level of protection. This paper presents a novel reconfigurable ECC scheme for robust nanoscale memory, which can dynamically adapt in space and time to the varying reliability of memory blocks.
This is achieved by incorporating a reconfigurable ECC encoder and decoder with multiple protection capabilities at design time, and selecting among them on demand during actual operation. In order to enhance the effectiveness of the multi-bit error tolerance scheme, we use a Bose-Chaudhuri-Hocquenghem (BCH) cyclic code, which is effective for random multi-bit correction at low hardware overhead. Our approach can provide the right amount of error correction capability to the individual memory blocks depending on their relative vulnerabilities to run-time failures, without incurring large hardware and power overhead.

As a case study, the proposed time-varying ECC approach is applied to a low-power, supply-voltage-scalable MB L cache. In order to reduce the increasing number of errors in the L cache due to voltage scaling, we propose a gradual voltage scaling scheme together with adaptive time duration control. We show that the proposed adaptive ECC approach provides a high level of reliability for the cache while maintaining its low-power advantage.

Fig. : Overall scheme for the proposed variable error correction in nanoscale memory. The correction capability changes over space and time. T and W indicate the number of bits to be corrected and the codeword width, respectively. Two types of configurability for dynamic error correction in nanoscale memory array.

II. SPACE-TIME VARYING ECC

In this section, we first describe prior art on variable ECC and then present the basic concept of the adaptive ECC approach.

A. Related Work on Variable ECC

1) Reliability-Driven ECC Allocations for Adaptive Error Protection: A reliability-driven ECC allocation scheme, where the relative vulnerability of a memory block (determined using post-fabrication characterization) is matched with appropriate ECC protection, has been proposed in []. In this approach, post-fabrication variable ECC allocation to different memories is achieved by storing the check bits in the ways of an associative cache. This work also presents efficient circuit/architecture-level optimizations of the ECC encoding/decoding logic to minimize the impact on area, performance, and energy.

2) Bit-width Reconfigurable ECC: Based on the fact that the most significant bits (MSBs) are significantly more important than the least significant bits (LSBs) in digital signal processing (DSP) applications with respect to output data quality, a bit-width reconfigurable ECC [7] is designed with extra control units for dynamically changing the input data length. When the number of memory failures in a codeword exceeds the maximum correctable number of bits during low-voltage operation, the input data length of the ECC is reduced to focus on the more important MSB parts. As a result, the correction of failures on MSBs can be ensured even at low supply voltages, and the overall system quality degradation caused by SRAM failures can be minimized, since uncorrected LSB failures have a much less prominent effect on the system output.

B. Space-Time Varying ECC

With inter- and intra-die process variations, different sections of a memory array move to different process corners, and some of the memory cells may become marginally functional during the manufacturing test. Those weak cells can undergo run-time failures due to voltage/thermal noise or aging effects. In order to improve reliability, the memory cells that suffer larger process variations should be protected using stronger ECC with higher error correction capabilities. However, due to the unpredictable random process variations, the conventional uniform ECC protection fails to account for the distribution of vulnerability across memory blocks. The proposed space-time varying ECC scheme addresses this shortcoming and allocates detection and correction capabilities proportional to the vulnerability of the blocks. Fig. and illustrate the overall scheme for the proposed variable ECC in memory. As shown in Fig., depending on the severities of static, spatial and temporal variations, the reconfigurable ECC can adaptively change the error correction capability (T) and codeword width (W) over space and time. In the following sections, as an example of the variable ECC approach, this paper presents a temporally varying ECC scheme, where the ECC architecture can dynamically change the error correction capabilities depending on the number of failures in the embedded memory.

Fig. : VC-ECC decoder architecture. (a) The unified parity check matrix of VC-ECC. (b) The complete BCH decoding process (syndrome generator, key equation solver, FIFO, Chien search). (c) Syndrome generator. (d) Peterson algorithm [9] based key equation solver implementation. (e) Chien search. (f) Dynamic adaptation scheme applied to syndrome generator using turning-off gate []. The enable signal Φ is generated from the control module using the mode_selection signal.

III. TEMPORALLY VARYING ECC

The proposed Variable error Correction capability ECC (VC-ECC) scheme offers three different error correction capabilities (bit / bit / 4bit), and the correction capability can be automatically adapted at run time to the number of failures in memory using a dynamic syndrome monitoring approach. When a smaller error correction option (- bit correction) is selected, the unused modules can be easily turned off to save computation energy.

A. The VC-ECC Architecture

1) VC-ECC Encoder/Decoder Architecture: The VC-ECC encoder [8] is mainly composed of Galois field adders and dividers, and three different division parts are used. The area overhead of the reconfiguration (the different division parts) is small since the area of the encoder is much smaller (around 5 %) than that of the decoder. The VC-ECC decoder [8] is composed of syndrome generator (SG), key equation solver (KES), and Chien search (CS) modules as shown in Fig.. The overall VC-ECC decoder is similar to a 4-bit correction BCH decoder, and the architecture is scalable such that simple control logic can easily turn off the unused modules when the correction capability is or bits. The unified parity check matrix of VC-ECC is presented in Fig.. The dimension of the parity check matrix is (, ), meaning that the input is the codeword of bit and the outputs are four odd syndromes of 8-bit width. Each Galois field element of the parity check matrix is an (8, ) vector. As shown in Fig. (c) and (f), only 5% or 5% of the SG is utilized for -bit or -bit correction BCH, respectively. The scalable syndrome calculation is also shown in the unified parity check matrix. For the KES module, the inversion-less Peterson algorithm [9] is adopted to reduce the critical path delay. The inversion-less KES for the 4-bit correction BCH decoder is designed with PE and PE, and PE can be turned off when -bit correction mode is used. The scalable CS modules are also presented in Fig. (e). Fig. (f) illustrates the power-gating scheme [8] to turn off the unused parts in the BCH decoder.
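The scalable syndrome calculation described above can be illustrated in software. The following is a minimal sketch, not the authors' hardware: it assumes the three modes correct t = 1, 2, or 4 bits, that the field is GF(2^8) built from the primitive polynomial x^8 + x^4 + x^3 + x^2 + 1 (the source does not name the polynomial), and a hypothetical 160-bit word. The names `PRIM_POLY`, `syndromes`, and `gf_mul` are illustrative only. A mode with capability t evaluates only the first t odd syndromes, which mirrors how part of the syndrome generator can stay switched off in the weaker modes.

```python
# Sketch of a mode-dependent BCH syndrome generator over GF(2^8).
# Assumptions (not from the paper): primitive polynomial 0x11D and
# correction capabilities t in {1, 2, 4}.
PRIM_POLY = 0x11D  # x^8 + x^4 + x^3 + x^2 + 1 (assumed)


def build_tables():
    """Build exponent/log tables for GF(2^8) with alpha = 2."""
    exp = [0] * 510
    log = [0] * 256
    x = 1
    for i in range(255):
        exp[i] = x
        log[x] = i
        x <<= 1
        if x & 0x100:
            x ^= PRIM_POLY
    # Duplicate the table so that sums of two logs never need a modulo.
    for i in range(255, 510):
        exp[i] = exp[i - 255]
    return exp, log


EXP, LOG = build_tables()


def gf_mul(a, b):
    """Multiply two GF(2^8) elements."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def syndromes(bits, t):
    """Return the odd syndromes S1, S3, ..., S(2t-1) of a received word.

    bits[i] is the coefficient of x^i; S_j is the word evaluated at alpha^j.
    Only t syndromes are computed, so a weaker mode does less work.
    """
    result = []
    for j in range(1, 2 * t, 2):
        s = 0
        for i, bit in enumerate(bits):
            if bit:
                s ^= EXP[(i * j) % 255]
        result.append(s)
    return result


# The all-zero word is a codeword of any linear code, so its syndromes are 0.
word = [0] * 160
clean = syndromes(word, 4)

# Inject a single-bit failure; any non-zero syndrome flags a memory failure.
word[37] ^= 1
faulty = syndromes(word, 4)
failure_detected = any(faulty)
```

For a single-bit error at position i, S1 equals alpha^i and S3 equals S1 cubed, which is the relation the key equation solver exploits; the 1-bit and 2-bit modes simply see a prefix of the 4-bit mode's syndrome list.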
Simple pull-up and pull-down transistors with correct dimensions are used to turn off unused sections of the SG based on whether the -bit, -bit or 4-bit correction scheme is being exercised. The pull-down NMOS transistor is required to ensure that SG modules provide zero output when unused in order to have correct ECC functionality. The additional area for power gating is accounted for in the results presented in Fig..

2) Dynamic Adaptation of VC-ECC: The proposed VC-ECC has three choices of error correction capability, and the correction mode can be controlled using the -bit mode_selection signal as shown in Fig.. For the protection of on-chip cache memory using VC-ECC, the -bit mode_selection is stored per cache block to indicate the encoding type, and the number of ways used to store ECC bits is dynamically adjusted during run time, similar to spatially varying ECC []. The two-bit overhead for the mode_selection storage is negligible considering a typical cache block size (e.g. 5bits). At run time, this mode_selection information is updated on a regular basis by monitoring the frequency of memory failures. This is obtained from the output of the syndrome generator, since any non-zero syndrome indicates the occurrence of a memory failure. When the frequency of memory failures increases or decreases, the VC-ECC scheme can change the mode_selection signal to offer the proper error correction capability. As presented in Fig. (f), we do not need an extra stage to identify or change the ECC mode since the mode_selection signal can be applied directly to the VC-ECC decoder.

B. Experimental Results

Hardware Implementation Results: The proposed VC-ECC decoders are implemented using a 5-nm standard-cell CMOS library, and Fig. shows the implementation results.
SECDED, single-error-correction double-adjacent-error-correction (SECDAEC) ECC, -bit (Hamming), -bit, and 4-bit correction BCH decoders are also implemented for comparison. From the results, it is evident that the power and performance for single-bit error correction with the proposed VC-ECC hardware are comparable with those of stand-alone SECDED, SECDAEC, and Hamming hardware. The area requirement is understandably larger since the VC-ECC hardware provides greater flexibility in error correction. The parity bits of VC-ECC (8// bit) are identical to those of each //4 bit BCH. In Fig., storage bits indicate the parity bits plus the selection bits for configurability, which are called the mode_selection bits. The power consumption results in Fig. are obtained using a clock cycle of ns, i.e., a frequency of 7MHz, for all ECC schemes at .V with circuit-level simulations in Spice using input data.

Fig. : Implementation results of VC-ECC scheme. (a) Hardware implementation results (area, power, performance); the table columns are ECC Type, Total area (μm²), Max. freq. (MHz), # of Cycles, Power (mW), and Storage (bit), with rows for SECDED, SECDAEC, the three BCH decoders, and VC-ECC. (b) L miss rate ratio when VC-ECC is applied to L cache. (c) CPI ratio when VC-ECC is applied to L cache.

Fig. 4: Simulation results with 49.mcf benchmark. (a) Increasing cache block replacement with time. (b) Error frequency variations with the conventional one-step voltage scaling. (c) Error frequency variation with the proposed step-by-step voltage scaling (ΔV dd = mV and ΔT = million cycles).

Fig. and (c) show the simulation results on cache miss rate and clock cycles per instruction (CPI) when the proposed temporally varying ECC scheme is applied to the L cache. The performance of the proposed VC-ECC scheme is measured using the SPEC benchmark suite compiled for a 4-bit single-core out-of-order processor with 4 issue width. It is simulated by the general execution-driven multiprocessor simulator (GEMS) on Simics using the MOESI coherence protocol. Each of the SPEC benchmarks was simulated for million instructions. Our memory system includes a KB, 8-way set-associative L instruction and data cache, and a MB, -way set-associative unified L cache. All caches in the system are configured to have 4 byte lines. The L cache access latency is assumed to be clock cycles, and the L cache is assumed to be protected by SECDED as a baseline.
From the simulation results shown in Fig. and (c), we can observe the following. When the proposed VC-ECC is used as a -bit or 4-bit correctable scheme, the performance degradation induced by the reduction in cache capacity due to parity storage bits and by the additional latency of the ECC decoder logic is negligibly small for most of the benchmarks compared to the baseline SECDED scheme.

In the following section, we present a case study of the VC-ECC application. For very slowly changing temporal variations like aging, since the error rate only increases with time, the adaptation method can be relatively regular and simple. We will consider the more complex example of an L cache with supply voltage scaling.

IV. APPLICATION TO VOLTAGE-SCALABLE L CACHE

When VC-ECC is used in the L cache, there are two issues to consider:

1) Since supply voltage scaling increases the bit error rate (BER), VC-ECC should change the ECC mode to provide stronger protection to the L cache memory. However, data that has already been encoded by the ECC encoder and stored in the cache needs to be decoded in the same way as it was encoded. For example, consider that an L cache word has been stored with ECC mode (-bit correction mode) before the supply voltage scaling, and the ECC mode changes from mode to mode (-bit correction mode) after scaling down the voltage. Then the cache words already encoded with ECC mode need to be decoded by mode after the mode change, although the current ECC mode is mode.

2) A simple and effective way to cope with the ECC mode change is to read out all the cache data and re-encode the data following the new mode_selection signal. However, re-encoding all the data in the memory would incur large latency and power overheads. In practice, the normal cache read/write operations naturally replace the cache data with the new ECC mode as time goes on, which is shown in Fig. 4. In normal cache operation, since the data in the cache is decoded with the previous ECC mode while reading, and the data is encoded with the updated ECC mode when writing to memory, the cache data is naturally updated even without re-encoding.

However, when the supply voltage is scaled down, the cache error rate abruptly increases, which may cause considerable performance loss. As an example, Fig. 4 shows the increasing error frequency when the supply voltage is abruptly scaled down from 7 mV to mV. Here, the error frequency is defined as the number of codewords encountering errors among the total number of decoded codewords during a given time duration. In this work, considering the L cache access rate of 5.45 million per sec (. million per 5 ms) in the worst case of the benchmark simulations, a time duration of 5 ms is used to consider at least million L cache accesses. The total number of decoded codewords during the time duration is around.

To obtain the results shown in Fig. 4, first, the BER of SRAM is obtained using Monte Carlo simulations with the 45nm predictive technology model (PTM) for various supply voltages. Cache access data is calculated using the GEMS simulator on Simics using the SPEC benchmark suite compiled for a GHz 4-bit single-core out-of-order processor. The SPEC benchmark has been simulated for 5 million instructions. The details of the cache configurations can be found in Section III. As presented in Fig. 4, when the supply voltage is scaled down directly from 7 mV to mV in one step, the error frequency starts to increase rapidly. Due to the error rate difference between 7 mV and mV, memory blocks encoded with the lower protection level show a high decoding error frequency. A large error frequency, observed especially at the initial moment of the voltage scaling as shown in Fig. 4, can lead to significant performance loss due to the latency overhead of reading data from the next level of memory, i.e. the main memory (DRAM).

Gradual voltage scaling scheme with VC-ECC: As shown in Fig. 4, the cache blocks are gradually replaced with time. We leverage this gradual replacement of cache blocks to scale the supply voltage down in multiple small steps (ΔV dd) instead of one step, in order to reduce the memory error rate incurred by voltage scaling. In this way, we give enough time (ΔT) for the existing blocks with increased errors to be gradually replaced with new blocks encoded with the updated ECC mode. The results for the step-by-step supply voltage scaling, which incorporates a relatively gradual change in the supply voltage, are presented in Fig. 4 (c). As shown in the figure, dividing the voltage shift into small steps can prevent the error rate from abruptly increasing, and it becomes much lower than that in Fig. 4. In addition to the cache replacement due to normal data writing (the dotted line in Fig. 4), regular cache misses also accelerate the cache replacement (the solid line) for the L cache application. Since a cache miss also needs to replace the cache block with data from the L cache or main memory block, the proposed step-by-step scheme can be applied more effectively to the L cache. The numerical results shown in Fig. 4 (c) are obtained with a fixed voltage step (ΔV dd) of mV and a time duration (ΔT) of million cycles ( ms). Here, the voltage step (ΔV dd) is a fixed parameter, which is limited by the voltage regulator performance []. However, the time duration (ΔT) can be controlled by monitoring the error frequencies to provide enough time for cache replacement, since blindly reducing the supply voltage with a fixed ΔT can increase the memory errors.

Adaptive Time Duration (ΔT) Control: In this approach, we decide whether or not to scale down the supply voltage by comparing the error frequency with an error rate threshold value. When the monitored error frequency is still larger than the threshold value, the supply voltage is maintained to give more time until the error rate decreases with cache replacement.
Otherwise, the supply voltage can be scaled down one step. The cumulative moving average (CMA) [] of the error frequency is used as the threshold value in our approach. The CMA is widely used to determine the moving average since it provides distortion tolerance for applications with equally important input data []. As presented in Fig. 4 (c), the $n$-th CMA from the initial $\Delta V_{dd}$ drop can be expressed as:

$$\mathrm{CMA}_n = \frac{e_n + (n-1)\,\mathrm{CMA}_{n-1}}{n} = \frac{e_1 + e_2 + \cdots + e_n}{n},$$

where $e_n$ is the $n$-th error frequency from the initial $V_{dd}$ drop and $\mathrm{CMA}_0 = 0$. $\mathrm{CMA}_n$ is the average of all the error frequency values from the initial one, and it can be easily calculated with simple hardware (adder, multiplier, divider and flip-flop) based on the current error frequency $e_n$ and the former $\mathrm{CMA}_{n-1}$. When the error frequency $e_n$ is larger than $\mathrm{CMA}_{n-1}$, the supply voltage is maintained (ΔT increases), which means that the cache has not been replaced enough. If $e_n$ is smaller than $\mathrm{CMA}_{n-1}$, the supply voltage can be scaled by a step $\Delta V_{dd}$. In this approach, since each $e_n$ used for computing $\mathrm{CMA}_n$ is equally weighted, a single error frequency does not have a large effect on the CMA value, which helps us to make a more reliable decision as $n$ grows. Since CMA values can be unstable for small $n$ near the initial $\Delta V_{dd}$ drop, the comparison of the error frequency with the CMA starts only after an initial averaging period.

Fig. 5: The proposed voltage scaling system with VC-ECC. (a) Block diagram of the step-by-step voltage scaling process. (b) Comparison of error frequency changes between the adaptive and fixed time duration approaches simulated with the 49.mcf benchmark (ΔV dd = mV, fixed ΔT = million cycles). (c) Operation flow chart of the adaptive time duration control approach. (d) Cumulative sum of errors when various fixed time duration and adaptive time duration approaches are used (simulation with the 49.mcf benchmark).

For the proposed adaptive ΔT control approach, the block diagram of the step-by-step voltage scaling process is presented in Fig. 5. The process is initiated with the start signal generated when the dynamic voltage scaling (DVS) circuitry begins scaling down the supply voltage with a ΔV dd step. VC-ECC also changes the ECC mode by updating the mode_selection signal. After changing the ECC mode, the syndrome monitoring module of VC-ECC sends non-zero syndrome detection information to the monitoring circuit to update the decoding errors. Then, the monitoring circuit and scaling decision module in Fig. 5 begin the adaptive ΔT control approach, and the major steps of this process are presented in Fig. 5(c). In the figure, the monitoring circuit calculates the error frequency based on the non-zero syndrome detection. The scaling decision module computes the CMA using the error frequency throughout the step-by-step voltage scaling process. The scaling decision module also decides whether to scale down V dd by comparing the CMA with the error frequency delivered from the monitoring circuit. For a reliable voltage scaling decision, the comparison between the error frequency and the computed CMA starts after the initial average period ends, as shown in Fig. 5.
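The scaling decision described above can be sketched in a few lines. This is a minimal software model under stated assumptions, not the authors' circuit: the class name `AdaptiveStepController`, the `warmup` parameter (standing in for the initial average period, whose length is not given here), and the choice to hold the voltage when the error frequency exactly equals the CMA are all hypothetical. The CMA is accumulated from the initial voltage drop and is never reset between steps, following the definition in the text.

```python
class AdaptiveStepController:
    """Decide, once per monitoring period, whether Vdd may drop one step.

    The threshold is the cumulative moving average (CMA) of all error
    frequencies observed since the initial voltage drop.
    """

    def __init__(self, warmup):
        self.n = 0            # number of error-frequency samples so far
        self.cma = 0.0        # CMA_0 = 0
        self.warmup = warmup  # assumed length of the initial average period

    def update(self, e_n):
        """Consume the n-th error frequency; return True to scale down."""
        previous_cma = self.cma
        self.n += 1
        # CMA_n = (e_n + (n - 1) * CMA_{n-1}) / n
        self.cma = (e_n + (self.n - 1) * previous_cma) / self.n
        if self.n <= self.warmup:
            # The CMA is still unstable, so keep the current voltage.
            return False
        # Scale down only when the cache has been replaced enough, i.e.
        # the current error frequency is below the former CMA.
        return e_n < previous_cma


controller = AdaptiveStepController(warmup=2)
decisions = [controller.update(e) for e in (0.4, 0.2, 0.1, 0.5)]
```

With the illustrative samples above, the first two periods only build up the average, the third period permits a step because 0.1 is below the former CMA of 0.3, and the fourth holds the voltage because the error frequency has risen above the average again.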
The comparison results between the adaptive time duration (ΔT) and the fixed ΔT ( million cycles) approaches are shown in Fig. 5. As shown in Fig. 5, the average error frequency of the adaptive time duration control approach is reduced to .75, which corresponds to 5.4% and 5.7% error frequency reductions compared to the conventional one-step voltage scaling scheme and the fixed ΔT approach, respectively. We can observe from the results shown in Fig. 5 that a larger time duration at the first step results in reduced error frequency. Fig. 5(d) also shows the cumulative sum of errors when the adaptive time duration approach and the various fixed time duration schemes are used. The total number of errors of the adaptive time duration scheme is 75, which is the same as that of the fixed 5 million cycles case. However, with the fixed time duration of 5 million cycles, the total time taken to scale down the supply voltage from 7 mV to mV with 5 steps is sec (.55 sec for the adaptive time duration case), which incurs relatively large (approximately 9% more) power consumption compared to the proposed adaptive ΔT approach.

Consideration with dirty cache data: In the application of VC-ECC to the L cache, when an error occurs in a clean cache line and the VC-ECC scheme is not able to correct it, re-fetching the line from the next level of memory is a viable solution. However, if the errors occur in dirty cache lines, there is no option to recover. These errors can eventually lead to incorrect program execution. To prevent this case, when VC-ECC encodes the cache data, it can selectively provide stronger protection for dirty blocks (e.g., -bit correction for clean data and -bit correction for dirty data). This is similar to the scheme proposed in [], where different protection levels are used for dirty and clean cache blocks. That work also shows that the average portion of dirty data is considerably small compared to the whole cache size. As presented in Fig. and (c), the increase in L cache miss ratios and CPI ratios with the -bit correction or 4-bit correction modes is negligible. Since the dirty data portion of the whole cache is not large, the overhead of applying stronger ECC to only the dirty cache data is expected to be modest.

V. SUMMARY & FUTURE DIRECTIONS

We have presented an adaptive protection scheme for nanoscale memory arrays that provides the right amount of protection to each memory block under spatially and temporally varying reliability. Area, performance, and energy overheads for the proposed scheme are minimized by an appropriate choice of ECC and joint circuit/architecture-level optimizations of the encoding/decoding hardware. In contrast to existing multiple-bit error tolerance schemes, the proposed ECC approach can tolerate higher rates of both random and contiguous errors, and is amenable to efficient dynamic adaptation to the operating point, e.g. voltage, which makes it attractive for low-power memory. Future investigations will include the application of reliability-aware address mapping and the combination with bit-interleaving to further reduce ECC overhead and enhance error protection.

ACKNOWLEDGMENTS

The work is supported in part by Semiconductor Research Corporation (SRC) grant 5.. This work is also supported by National Research Foundation of Korea (#5MDA745 and #RAB459), and the Information Technology Research and Development Program of Korea Evaluation Institute of Industrial Technology (KEIT) [57, Design technology development of ultralow voltage operating circuit and IP for smart sensor SoC].

REFERENCES

[1] S. Narasimhan, K. Kunaparaju, and S. Bhunia, "Healing of DSP Circuits Under Power Bound Using Post-Silicon Operand Bitwidth Truncation," IEEE Trans. Circuits Syst. I, vol. 59, no. 9, pp. 9-94,.
[2] J. Tschanz, K. Bowman, and V. De, "Variation-Tolerant Circuits: Circuit Solutions and Techniques," Proc. Design Automation Conf., pp. 7-7, 5.
[3] S. S. Mukherjee, J. Emer, T. Fossum, and S. K. Reinhardt, "Cache Scrubbing in Microprocessors: Myth or Necessity?" IEEE Int. Symp. Dependable Computing, pp. 7-4, 4.
[4] E. H. Cannon, A. Kleinosowski, R. Kanj, D. Reinhardt, and R. V. Joshi, "The Impact of Aging Effects and Manufacturing Variation on SRAM Soft-Error Rate," IEEE Trans. Dev. Mat. Rel., vol. 8, no., 8, pp.
[5] N. Quach, "High Availability and Reliability in the Itanium Processor," IEEE Micro, vol., no. 5, pp. -9,.
[6] S. Paul, F. Cai, X. Zhang, and S. Bhunia, "Reliability-Driven ECC Allocation for Multiple Bit Error Resilience in Processor Cache," IEEE Trans. Comput., vol., no., pp. -4,.
[7] J. Park, J. Park, and S. Bhunia, "VL-ECC: Variable Data-Length Error Correction Code for Embedded Memory in DSP Applications," IEEE Trans. Circuits Syst. II, vol., no., pp. -4, 4.
[8] A. Basak, S. Paul, J. Park, J. Park, and S. Bhunia, "Reconfigurable ECC for adaptive protection of memory," IEEE 5th International Midwest Symposium on Circuits and Systems,, pp.
[9] S. Lin and D. Costello, Error Control Coding, nd Edition, Prentice Hall, 4.
[10] S. Park et al., "Accurate Modeling of the Delay and Energy Overhead of Dynamic Voltage and Frequency Scaling in Modern Microprocessors," IEEE Trans. Comput. Aided Design, vol., no. 5, pp.,.
[11] T. Chen and M. Ikeda, "Design and Implementation of Low-Power Hardware Architecture with Single-Cycle Divider for On-Line Clustering Algorithm," IEEE Trans. Circuits Syst. I, vol., no. 8, pp. 5-7,.
[12] L. Li et al., "Soft error and energy consumption interactions: a data cache perspective," in Proc. of International Symposium on Low Power Electronics and Design, pp. -7, 4.

Dongyeob Shin is currently working toward the integrated Master and Ph.D. degree in the VLSI Signal Processing Research Lab, Korea University, Seoul, Korea. His research interests include low-power, energy-efficient VLSI design, and error correction code design. Shin has a BS in electrical engineering from Korea University, Seoul, Korea.

Jangwon Park currently works as a Senior Engineer for Samsung Electronics, Suwon, Korea. His research interests include low-power error correction code design for embedded memory. Park has BS and MS degrees in electrical engineering from Korea University, Seoul, Korea.

9 Jongsun Park is currently an Associate Professor of the School of Electrical Engineering, Korea University, Seoul, Korea. His research interests focus on variation-tolerant, low-power, high-performance VLSI architectures and circuit designs. Park has a PhD in electrical and computer engineering from Purdue University. He is a senior member of IEEE. Somnath Paul is currently a research scientist at Intel Labs, Intel Corporation. His primary research interest is hardware-software co-design for energy-efficiency, yield and reliability in nanoscale technologies. Paul received his Ph.D. degree in Computer Engineering from Case Western Reserve University, Cleveland, OH. Swarup Bhunia is a professor of electrical and computer engineering at the University of Florida. His research interests include hardware and system security, implantable systems, and energy-efficient electronics. He received his PhD in computer engineering from Purdue University. He is a Senior Member of IEEE and a member of ACM. 9

120 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: EXPRESS BRIEFS, VOL. 61, NO. 2, FEBRUARY 2014

120 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: EXPRESS BRIEFS, VOL. 61, NO. 2, FEBRUARY 2014 120 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: EXPRESS BRIEFS, VOL. 61, NO. 2, FEBRUARY 2014 VL-ECC: Variable Data-Length Error Correction Code for Embedded Memory in DSP Applications Jangwon Park,

More information

TOLERANCE to runtime failures in large on-chip caches has

TOLERANCE to runtime failures in large on-chip caches has 20 IEEE TRANSACTIONS ON COMPUTERS, VOL. 60, NO. 1, JANUARY 2011 Reliability-Driven ECC Allocation for Multiple Bit Error Resilience in Processor Cache Somnath Paul, Student Member, IEEE, Fang Cai, Student

More information

AN EFFICIENT DESIGN OF VLSI ARCHITECTURE FOR FAULT DETECTION USING ORTHOGONAL LATIN SQUARES (OLS) CODES

AN EFFICIENT DESIGN OF VLSI ARCHITECTURE FOR FAULT DETECTION USING ORTHOGONAL LATIN SQUARES (OLS) CODES AN EFFICIENT DESIGN OF VLSI ARCHITECTURE FOR FAULT DETECTION USING ORTHOGONAL LATIN SQUARES (OLS) CODES S. SRINIVAS KUMAR *, R.BASAVARAJU ** * PG Scholar, Electronics and Communication Engineering, CRIT

More information

Available online at ScienceDirect. Procedia Technology 25 (2016 )

Available online at  ScienceDirect. Procedia Technology 25 (2016 ) Available online at www.sciencedirect.com ScienceDirect Procedia Technology 25 (2016 ) 544 551 Global Colloquium in Recent Advancement and Effectual Researches in Engineering, Science and Technology (RAEREST

More information

HDL IMPLEMENTATION OF SRAM BASED ERROR CORRECTION AND DETECTION USING ORTHOGONAL LATIN SQUARE CODES

HDL IMPLEMENTATION OF SRAM BASED ERROR CORRECTION AND DETECTION USING ORTHOGONAL LATIN SQUARE CODES HDL IMPLEMENTATION OF SRAM BASED ERROR CORRECTION AND DETECTION USING ORTHOGONAL LATIN SQUARE CODES (1) Nallaparaju Sneha, PG Scholar in VLSI Design, (2) Dr. K. Babulu, Professor, ECE Department, (1)(2)

More information

ARCHITECTURAL APPROACHES TO REDUCE LEAKAGE ENERGY IN CACHES

ARCHITECTURAL APPROACHES TO REDUCE LEAKAGE ENERGY IN CACHES ARCHITECTURAL APPROACHES TO REDUCE LEAKAGE ENERGY IN CACHES Shashikiran H. Tadas & Chaitali Chakrabarti Department of Electrical Engineering Arizona State University Tempe, AZ, 85287. tadas@asu.edu, chaitali@asu.edu

More information

Outline. Parity-based ECC and Mechanism for Detecting and Correcting Soft Errors in On-Chip Communication. Outline

Outline. Parity-based ECC and Mechanism for Detecting and Correcting Soft Errors in On-Chip Communication. Outline Parity-based ECC and Mechanism for Detecting and Correcting Soft Errors in On-Chip Communication Khanh N. Dang and Xuan-Tu Tran Email: khanh.n.dang@vnu.edu.vn VNU Key Laboratory for Smart Integrated Systems

More information

6T- SRAM for Low Power Consumption. Professor, Dept. of ExTC, PRMIT &R, Badnera, Amravati, Maharashtra, India 1

6T- SRAM for Low Power Consumption. Professor, Dept. of ExTC, PRMIT &R, Badnera, Amravati, Maharashtra, India 1 6T- SRAM for Low Power Consumption Mrs. J.N.Ingole 1, Ms.P.A.Mirge 2 Professor, Dept. of ExTC, PRMIT &R, Badnera, Amravati, Maharashtra, India 1 PG Student [Digital Electronics], Dept. of ExTC, PRMIT&R,

More information

Single error correction, double error detection and double adjacent error correction with no mis-correction code

Single error correction, double error detection and double adjacent error correction with no mis-correction code This article has been accepted and published on J-STAGE in advance of copyediting. Content is final as presented. IEICE Electronics Express, Vol.* No.*,*-* Single error correction, double error detection

More information

A Low-Power ECC Check Bit Generator Implementation in DRAMs

A Low-Power ECC Check Bit Generator Implementation in DRAMs 252 SANG-UHN CHA et al : A LOW-POWER ECC CHECK BIT GENERATOR IMPLEMENTATION IN DRAMS A Low-Power ECC Check Bit Generator Implementation in DRAMs Sang-Uhn Cha *, Yun-Sang Lee **, and Hongil Yoon * Abstract

More information

Area-Efficient Error Protection for Caches

Area-Efficient Error Protection for Caches Area-Efficient Error Protection for Caches Soontae Kim Department of Computer Science and Engineering University of South Florida, FL 33620 sookim@cse.usf.edu Abstract Due to increasing concern about various

More information

FPGA Implementation of Double Error Correction Orthogonal Latin Squares Codes

FPGA Implementation of Double Error Correction Orthogonal Latin Squares Codes FPGA Implementation of Double Error Correction Orthogonal Latin Squares Codes E. Jebamalar Leavline Assistant Professor, Department of ECE, Anna University, BIT Campus, Tiruchirappalli, India Email: jebilee@gmail.com

More information

Error Detecting and Correcting Code Using Orthogonal Latin Square Using Verilog HDL

Error Detecting and Correcting Code Using Orthogonal Latin Square Using Verilog HDL Error Detecting and Correcting Code Using Orthogonal Latin Square Using Verilog HDL Ch.Srujana M.Tech [EDT] srujanaxc@gmail.com SR Engineering College, Warangal. M.Sampath Reddy Assoc. Professor, Department

More information

A Low-Cost Correction Algorithm for Transient Data Errors

A Low-Cost Correction Algorithm for Transient Data Errors A Low-Cost Correction Algorithm for Transient Data Errors Aiguo Li, Bingrong Hong School of Computer Science and Technology Harbin Institute of Technology, Harbin 150001, China liaiguo@hit.edu.cn Introduction

More information

Fault Tolerant Parallel Filters Based On Bch Codes

Fault Tolerant Parallel Filters Based On Bch Codes RESEARCH ARTICLE OPEN ACCESS Fault Tolerant Parallel Filters Based On Bch Codes K.Mohana Krishna 1, Mrs.A.Maria Jossy 2 1 Student, M-TECH(VLSI Design) SRM UniversityChennai, India 2 Assistant Professor

More information

CALCULATION OF POWER CONSUMPTION IN 7 TRANSISTOR SRAM CELL USING CADENCE TOOL

CALCULATION OF POWER CONSUMPTION IN 7 TRANSISTOR SRAM CELL USING CADENCE TOOL CALCULATION OF POWER CONSUMPTION IN 7 TRANSISTOR SRAM CELL USING CADENCE TOOL Shyam Akashe 1, Ankit Srivastava 2, Sanjay Sharma 3 1 Research Scholar, Deptt. of Electronics & Comm. Engg., Thapar Univ.,

More information

OUTLINE Introduction Power Components Dynamic Power Optimization Conclusions

OUTLINE Introduction Power Components Dynamic Power Optimization Conclusions OUTLINE Introduction Power Components Dynamic Power Optimization Conclusions 04/15/14 1 Introduction: Low Power Technology Process Hardware Architecture Software Multi VTH Low-power circuits Parallelism

More information

INTERNATIONAL JOURNAL OF PROFESSIONAL ENGINEERING STUDIES Volume 9 /Issue 3 / OCT 2017

INTERNATIONAL JOURNAL OF PROFESSIONAL ENGINEERING STUDIES Volume 9 /Issue 3 / OCT 2017 Design of Low Power Adder in ALU Using Flexible Charge Recycling Dynamic Circuit Pallavi Mamidala 1 K. Anil kumar 2 mamidalapallavi@gmail.com 1 anilkumar10436@gmail.com 2 1 Assistant Professor, Dept of

More information

ECE 637 Integrated VLSI Circuits. Introduction. Introduction EE141

ECE 637 Integrated VLSI Circuits. Introduction. Introduction EE141 ECE 637 Integrated VLSI Circuits Introduction EE141 1 Introduction Course Details Instructor Mohab Anis; manis@vlsi.uwaterloo.ca Text Digital Integrated Circuits, Jan Rabaey, Prentice Hall, 2 nd edition

More information

FPGA BASED ADAPTIVE RESOURCE EFFICIENT ERROR CONTROL METHODOLOGY FOR NETWORK ON CHIP

FPGA BASED ADAPTIVE RESOURCE EFFICIENT ERROR CONTROL METHODOLOGY FOR NETWORK ON CHIP FPGA BASED ADAPTIVE RESOURCE EFFICIENT ERROR CONTROL METHODOLOGY FOR NETWORK ON CHIP 1 M.DEIVAKANI, 2 D.SHANTHI 1 Associate Professor, Department of Electronics and Communication Engineering PSNA College

More information

Exploiting Unused Spare Columns to Improve Memory ECC

Exploiting Unused Spare Columns to Improve Memory ECC 2009 27th IEEE VLSI Test Symposium Exploiting Unused Spare Columns to Improve Memory ECC Rudrajit Datta and Nur A. Touba Computer Engineering Research Center Department of Electrical and Computer Engineering

More information

Low Power Cache Design. Angel Chen Joe Gambino

Low Power Cache Design. Angel Chen Joe Gambino Low Power Cache Design Angel Chen Joe Gambino Agenda Why is low power important? How does cache contribute to the power consumption of a processor? What are some design challenges for low power caches?

More information

Reliability of Memory Storage System Using Decimal Matrix Code and Meta-Cure

Reliability of Memory Storage System Using Decimal Matrix Code and Meta-Cure Reliability of Memory Storage System Using Decimal Matrix Code and Meta-Cure Iswarya Gopal, Rajasekar.T, PG Scholar, Sri Shakthi Institute of Engineering and Technology, Coimbatore, Tamil Nadu, India Assistant

More information

Efficient Majority Logic Fault Detector/Corrector Using Euclidean Geometry Low Density Parity Check (EG-LDPC) Codes

Efficient Majority Logic Fault Detector/Corrector Using Euclidean Geometry Low Density Parity Check (EG-LDPC) Codes Efficient Majority Logic Fault Detector/Corrector Using Euclidean Geometry Low Density Parity Check (EG-LDPC) Codes 1 U.Rahila Begum, 2 V. Padmajothi 1 PG Student, 2 Assistant Professor 1 Department Of

More information

DYNAMIC CIRCUIT TECHNIQUE FOR LOW- POWER MICROPROCESSORS Kuruva Hanumantha Rao 1 (M.tech)

DYNAMIC CIRCUIT TECHNIQUE FOR LOW- POWER MICROPROCESSORS Kuruva Hanumantha Rao 1 (M.tech) DYNAMIC CIRCUIT TECHNIQUE FOR LOW- POWER MICROPROCESSORS Kuruva Hanumantha Rao 1 (M.tech) K.Prasad Babu 2 M.tech (Ph.d) hanumanthurao19@gmail.com 1 kprasadbabuece433@gmail.com 2 1 PG scholar, VLSI, St.JOHNS

More information

Error Correction Using Extended Orthogonal Latin Square Codes

Error Correction Using Extended Orthogonal Latin Square Codes International Journal of Electronics and Communication Engineering. ISSN 0974-2166 Volume 9, Number 1 (2016), pp. 55-62 International Research Publication House http://www.irphouse.com Error Correction

More information

CS250 VLSI Systems Design Lecture 9: Memory

CS250 VLSI Systems Design Lecture 9: Memory CS250 VLSI Systems esign Lecture 9: Memory John Wawrzynek, Jonathan Bachrach, with Krste Asanovic, John Lazzaro and Rimas Avizienis (TA) UC Berkeley Fall 2012 CMOS Bistable Flip State 1 0 0 1 Cross-coupled

More information

Efficient Implementation of Single Error Correction and Double Error Detection Code with Check Bit Precomputation

Efficient Implementation of Single Error Correction and Double Error Detection Code with Check Bit Precomputation http://dx.doi.org/10.5573/jsts.2012.12.4.418 JOURNAL OF SEMICONDUCTOR TECHNOLOGY AND SCIENCE, VOL.12, NO.4, DECEMBER, 2012 Efficient Implementation of Single Error Correction and Double Error Detection

More information

Improved Error Correction Capability in Flash Memory using Input / Output Pins

Improved Error Correction Capability in Flash Memory using Input / Output Pins Improved Error Correction Capability in Flash Memory using Input / Output Pins A M Kiran PG Scholar/ Department of ECE Karpagam University,Coimbatore kirthece@rediffmail.com J Shafiq Mansoor Assistant

More information

Designing a Fast and Adaptive Error Correction Scheme for Increasing the Lifetime of Phase Change Memories

Designing a Fast and Adaptive Error Correction Scheme for Increasing the Lifetime of Phase Change Memories 2011 29th IEEE VLSI Test Symposium Designing a Fast and Adaptive Error Correction Scheme for Increasing the Lifetime of Phase Change Memories Rudrajit Datta and Nur A. Touba Computer Engineering Research

More information

Survey on Stability of Low Power SRAM Bit Cells

Survey on Stability of Low Power SRAM Bit Cells International Journal of Electronics Engineering Research. ISSN 0975-6450 Volume 9, Number 3 (2017) pp. 441-447 Research India Publications http://www.ripublication.com Survey on Stability of Low Power

More information

DETECTION AND CORRECTION OF CELL UPSETS USING MODIFIED DECIMAL MATRIX

DETECTION AND CORRECTION OF CELL UPSETS USING MODIFIED DECIMAL MATRIX DETECTION AND CORRECTION OF CELL UPSETS USING MODIFIED DECIMAL MATRIX ENDREDDY PRAVEENA 1 M.T.ech Scholar ( VLSID), Universal College Of Engineering & Technology, Guntur, A.P M. VENKATA SREERAJ 2 Associate

More information

Low Power Set-Associative Cache with Single-Cycle Partial Tag Comparison

Low Power Set-Associative Cache with Single-Cycle Partial Tag Comparison Low Power Set-Associative Cache with Single-Cycle Partial Tag Comparison Jian Chen, Ruihua Peng, Yuzhuo Fu School of Micro-electronics, Shanghai Jiao Tong University, Shanghai 200030, China {chenjian,

More information

Outline of Presentation Field Programmable Gate Arrays (FPGAs(

Outline of Presentation Field Programmable Gate Arrays (FPGAs( FPGA Architectures and Operation for Tolerating SEUs Chuck Stroud Electrical and Computer Engineering Auburn University Outline of Presentation Field Programmable Gate Arrays (FPGAs( FPGAs) How Programmable

More information

Design of Flash Controller for Single Level Cell NAND Flash Memory

Design of Flash Controller for Single Level Cell NAND Flash Memory Design of Flash Controller for Single Level Cell NAND Flash Memory Ashwin Bijoor 1, Sudharshana 2 P.G Student, Department of Electronics and Communication, NMAMIT, Nitte, Karnataka, India 1 Assistant Professor,

More information

Design of Low Power Wide Gates used in Register File and Tag Comparator

Design of Low Power Wide Gates used in Register File and Tag Comparator www..org 1 Design of Low Power Wide Gates used in Register File and Tag Comparator Isac Daimary 1, Mohammed Aneesh 2 1,2 Department of Electronics Engineering, Pondicherry University Pondicherry, 605014,

More information

Yield Enhancement Considerations for a Single-Chip Multiprocessor System with Embedded DRAM

Yield Enhancement Considerations for a Single-Chip Multiprocessor System with Embedded DRAM Yield Enhancement Considerations for a Single-Chip Multiprocessor System with Embedded DRAM Markus Rudack Dirk Niggemeyer Laboratory for Information Technology Division Design & Test University of Hannover

More information

An Area-Efficient BIRA With 1-D Spare Segments

An Area-Efficient BIRA With 1-D Spare Segments 206 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 26, NO. 1, JANUARY 2018 An Area-Efficient BIRA With 1-D Spare Segments Donghyun Kim, Hayoung Lee, and Sungho Kang Abstract The

More information

2 Asst Prof, ECE Dept, Kottam College of Engineering, Chinnatekur, Kurnool, AP-INDIA.

2 Asst Prof, ECE Dept, Kottam College of Engineering, Chinnatekur, Kurnool, AP-INDIA. www.semargroups.org ISSN 2319-8885 Vol.02,Issue.06, July-2013, Pages:480-486 Error Correction in MLC NAND Flash Memories Based on Product Code ECC Schemes B.RAJAGOPAL REDDY 1, K.PARAMESH 2 1 Research Scholar,

More information

Breaking the Energy Barrier in Fault-Tolerant Caches for Multicore Systems

Breaking the Energy Barrier in Fault-Tolerant Caches for Multicore Systems Breaking the Energy Barrier in Fault-Tolerant Caches for Multicore Systems Paul Ampadu, Meilin Zhang Dept. of Electrical and Computer Engineering University of Rochester Rochester, NY, 14627, USA

More information

Near Optimal Repair Rate Built-in Redundancy Analysis with Very Small Hardware Overhead

Near Optimal Repair Rate Built-in Redundancy Analysis with Very Small Hardware Overhead Near Optimal Repair Rate Built-in Redundancy Analysis with Very Small Hardware Overhead Woosung Lee, Keewon Cho, Jooyoung Kim, and Sungho Kang Department of Electrical & Electronic Engineering, Yonsei

More information

An Integrated ECC and BISR Scheme for Error Correction in Memory

An Integrated ECC and BISR Scheme for Error Correction in Memory An Integrated ECC and BISR Scheme for Error Correction in Memory Shabana P B 1, Anu C Kunjachan 2, Swetha Krishnan 3 1 PG Student [VLSI], Dept. of ECE, Viswajyothy College Of Engineering & Technology,

More information

A Software LDPC Decoder Implemented on a Many-Core Array of Programmable Processors

A Software LDPC Decoder Implemented on a Many-Core Array of Programmable Processors A Software LDPC Decoder Implemented on a Many-Core Array of Programmable Processors Brent Bohnenstiehl and Bevan Baas Department of Electrical and Computer Engineering University of California, Davis {bvbohnen,

More information

International Journal of Scientific & Engineering Research, Volume 4, Issue 5, May-2013 ISSN

International Journal of Scientific & Engineering Research, Volume 4, Issue 5, May-2013 ISSN 255 CORRECTIONS TO FAULT SECURE OF MAJORITY LOGIC DECODER AND DETECTOR FOR MEMORY APPLICATIONS Viji.D PG Scholar Embedded Systems Prist University, Thanjuvr - India Mr.T.Sathees Kumar AP/ECE Prist University,

More information

A 167-processor Computational Array for Highly-Efficient DSP and Embedded Application Processing

A 167-processor Computational Array for Highly-Efficient DSP and Embedded Application Processing A 167-processor Computational Array for Highly-Efficient DSP and Embedded Application Processing Dean Truong, Wayne Cheng, Tinoosh Mohsenin, Zhiyi Yu, Toney Jacobson, Gouri Landge, Michael Meeuwsen, Christine

More information

LOW POWER SRAM CELL WITH IMPROVED RESPONSE

LOW POWER SRAM CELL WITH IMPROVED RESPONSE LOW POWER SRAM CELL WITH IMPROVED RESPONSE Anant Anand Singh 1, A. Choubey 2, Raj Kumar Maddheshiya 3 1 M.tech Scholar, Electronics and Communication Engineering Department, National Institute of Technology,

More information

CMOS Logic Gate Performance Variability Related to Transistor Network Arrangements

CMOS Logic Gate Performance Variability Related to Transistor Network Arrangements CMOS Logic Gate Performance Variability Related to Transistor Network Arrangements Digeorgia N. da Silva, André I. Reis, Renato P. Ribas PGMicro - Federal University of Rio Grande do Sul, Av. Bento Gonçalves

More information

Computer Architecture s Changing Definition

Computer Architecture s Changing Definition Computer Architecture s Changing Definition 1950s Computer Architecture Computer Arithmetic 1960s Operating system support, especially memory management 1970s to mid 1980s Computer Architecture Instruction

More information

Memory Design I. Array-Structured Memory Architecture. Professor Chris H. Kim. Dept. of ECE.

Memory Design I. Array-Structured Memory Architecture. Professor Chris H. Kim. Dept. of ECE. Memory Design I Professor Chris H. Kim University of Minnesota Dept. of ECE chriskim@ece.umn.edu Array-Structured Memory Architecture 2 1 Semiconductor Memory Classification Read-Write Wi Memory Non-Volatile

More information

Memory Systems IRAM. Principle of IRAM

Memory Systems IRAM. Principle of IRAM Memory Systems 165 other devices of the module will be in the Standby state (which is the primary state of all RDRAM devices) or another state with low-power consumption. The RDRAM devices provide several

More information

Design of Adaptive Communication Channel Buffers for Low-Power Area- Efficient Network-on. on-chip Architecture

Design of Adaptive Communication Channel Buffers for Low-Power Area- Efficient Network-on. on-chip Architecture Design of Adaptive Communication Channel Buffers for Low-Power Area- Efficient Network-on on-chip Architecture Avinash Kodi, Ashwini Sarathy * and Ahmed Louri * Department of Electrical Engineering and

More information

DPA: A data pattern aware error prevention technique for NAND flash lifetime extension

DPA: A data pattern aware error prevention technique for NAND flash lifetime extension DPA: A data pattern aware error prevention technique for NAND flash lifetime extension *Jie Guo, *Zhijie Chen, **Danghui Wang, ***Zili Shao, *Yiran Chen *University of Pittsburgh **Northwestern Polytechnical

More information

Three DIMENSIONAL-CHIPS

Three DIMENSIONAL-CHIPS IOSR Journal of Electronics and Communication Engineering (IOSR-JECE) ISSN: 2278-2834, ISBN: 2278-8735. Volume 3, Issue 4 (Sep-Oct. 2012), PP 22-27 Three DIMENSIONAL-CHIPS 1 Kumar.Keshamoni, 2 Mr. M. Harikrishna

More information

DESIGN OF FAULT SECURE ENCODER FOR MEMORY APPLICATIONS IN SOC TECHNOLOGY

DESIGN OF FAULT SECURE ENCODER FOR MEMORY APPLICATIONS IN SOC TECHNOLOGY DESIGN OF FAULT SECURE ENCODER FOR MEMORY APPLICATIONS IN SOC TECHNOLOGY K.Maheshwari M.Tech VLSI, Aurora scientific technological and research academy, Bandlaguda, Hyderabad. k.sandeep kumar Asst.prof,

More information

Yield-driven Near-threshold SRAM Design

Yield-driven Near-threshold SRAM Design Yield-driven Near-threshold SRAM Design Gregory K. Chen, David Blaauw, Trevor Mudge, Dennis Sylvester Department of EECS University of Michigan Ann Arbor, MI 48109 grgkchen@umich.edu, blaauw@umich.edu,

More information

A 256-Radix Crossbar Switch Using Mux-Matrix-Mux Folded-Clos Topology

A 256-Radix Crossbar Switch Using Mux-Matrix-Mux Folded-Clos Topology http://dx.doi.org/10.5573/jsts.014.14.6.760 JOURNAL OF SEMICONDUCTOR TECHNOLOGY AND SCIENCE, VOL.14, NO.6, DECEMBER, 014 A 56-Radix Crossbar Switch Using Mux-Matrix-Mux Folded-Clos Topology Sung-Joon Lee

More information

Improving Memory Repair by Selective Row Partitioning

Improving Memory Repair by Selective Row Partitioning 200 24th IEEE International Symposium on Defect and Fault Tolerance in VLSI Systems Improving Memory Repair by Selective Row Partitioning Muhammad Tauseef Rab, Asad Amin Bawa, and Nur A. Touba Computer

More information

An Efficient Error Detection Technique for 3D Bit-Partitioned SRAM Devices

An Efficient Error Detection Technique for 3D Bit-Partitioned SRAM Devices JOURNAL OF SEMICONDUCTOR TECHNOLOGY AND SCIENCE, VOL.15, NO.5, OCTOBER, 2015 ISSN(Print) 1598-1657 http://dx.doi.org/10.5573/jsts.2015.15.5.445 ISSN(Online) 2233-4866 An Efficient Error Detection Technique

More information

Vdd Programmable and Variation Tolerant FPGA Circuits and Architectures

Vdd Programmable and Variation Tolerant FPGA Circuits and Architectures Vdd Programmable and Variation Tolerant FPGA Circuits and Architectures Prof. Lei He EE Department, UCLA LHE@ee.ucla.edu Partially supported by NSF. Pathway to Power Efficiency and Variation Tolerance

More information

Effective Implementation of LDPC for Memory Applications

Effective Implementation of LDPC for Memory Applications Effective Implementation of LDPC for Memory Applications Y.Sreeja PG Scholar, VLSI & ES, Dept of ECE, Vidya Bharathi Institute of Technology, Janagaon, Warangal, Telangana. Dharavath Jagan Associate Professor,

More information

Post-Manufacturing ECC Customization Based on Orthogonal Latin Square Codes and Its Application to Ultra-Low Power Caches

Post-Manufacturing ECC Customization Based on Orthogonal Latin Square Codes and Its Application to Ultra-Low Power Caches Post-Manufacturing ECC Customization Based on Orthogonal Latin Square Codes and Its Application to Ultra-Low Power Caches Rudrajit Datta and Nur A. Touba Computer Engineering Research Center The University

More information

FABRICATION TECHNOLOGIES

FABRICATION TECHNOLOGIES FABRICATION TECHNOLOGIES DSP Processor Design Approaches Full custom Standard cell** higher performance lower energy (power) lower per-part cost Gate array* FPGA* Programmable DSP Programmable general

More information

A Spherical Placement and Migration Scheme for a STT-RAM Based Hybrid Cache in 3D chip Multi-processors

A Spherical Placement and Migration Scheme for a STT-RAM Based Hybrid Cache in 3D chip Multi-processors , July 4-6, 2018, London, U.K. A Spherical Placement and Migration Scheme for a STT-RAM Based Hybrid in 3D chip Multi-processors Lei Wang, Fen Ge, Hao Lu, Ning Wu, Ying Zhang, and Fang Zhou Abstract As

More information

Enhanced Detection of Double Adjacent Errors in Hamming Codes through Selective Bit Placement

Enhanced Detection of Double Adjacent Errors in Hamming Codes through Selective Bit Placement Enhanced Detection of Double Adjacent Errors in Hamming Codes through Selective Bit Placement 1 Lintu K Babu, 2 Hima Sara Jacob 1 M Tech Student, 2 Assistant Professor 1 Department of Electronics And Communication

More information

EECS 322 Computer Architecture Superpipline and the Cache

EECS 322 Computer Architecture Superpipline and the Cache EECS 322 Computer Architecture Superpipline and the Cache Instructor: Francis G. Wolff wolff@eecs.cwru.edu Case Western Reserve University This presentation uses powerpoint animation: please viewshow Summary:

More information

Spare Block Cache Architecture to Enable Low-Voltage Operation

Spare Block Cache Architecture to Enable Low-Voltage Operation Portland State University PDXScholar Dissertations and Theses Dissertations and Theses 1-1-2011 Spare Block Cache Architecture to Enable Low-Voltage Operation Nafiul Alam Siddique Portland State University

More information

Vulnerabilities in MLC NAND Flash Memory Programming: Experimental Analysis, Exploits, and Mitigation Techniques

Vulnerabilities in MLC NAND Flash Memory Programming: Experimental Analysis, Exploits, and Mitigation Techniques Vulnerabilities in MLC NAND Flash Memory Programming: Experimental Analysis, Exploits, and Mitigation Techniques Yu Cai, Saugata Ghose, Yixin Luo, Ken Mai, Onur Mutlu, Erich F. Haratsch February 6, 2017

More information

Correction Prediction: Reducing Error Correction Latency for On-Chip Memories

Correction Prediction: Reducing Error Correction Latency for On-Chip Memories Correction Prediction: Reducing Error Correction Latency for On-Chip Memories Henry Duwe University of Illinois at Urbana-Champaign Email: duweiii2@illinois.edu Xun Jian University of Illinois at Urbana-Champaign

More information

Microelettronica. J. M. Rabaey, "Digital integrated circuits: a design perspective" EE141 Microelettronica

Microelettronica. J. M. Rabaey, Digital integrated circuits: a design perspective EE141 Microelettronica Microelettronica J. M. Rabaey, "Digital integrated circuits: a design perspective" Introduction Why is designing digital ICs different today than it was before? Will it change in future? The First Computer

More information

Unleashing the Power of Embedded DRAM

Unleashing the Power of Embedded DRAM Copyright 2005 Design And Reuse S.A. All rights reserved. Unleashing the Power of Embedded DRAM by Peter Gillingham, MOSAID Technologies Incorporated Ottawa, Canada Abstract Embedded DRAM technology offers

More information

Reconfigurable Multicore Server Processors for Low Power Operation

Reconfigurable Multicore Server Processors for Low Power Operation Reconfigurable Multicore Server Processors for Low Power Operation Ronald G. Dreslinski, David Fick, David Blaauw, Dennis Sylvester, Trevor Mudge University of Michigan, Advanced Computer Architecture

More information

THE latest generation of microprocessors uses a combination

THE latest generation of microprocessors uses a combination 1254 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 30, NO. 11, NOVEMBER 1995 A 14-Port 3.8-ns 116-Word 64-b Read-Renaming Register File Creigton Asato Abstract A 116-word by 64-b register file for a 154 MHz

More information

High Performance Memory Read Using Cross-Coupled Pull-up Circuitry

High Performance Memory Read Using Cross-Coupled Pull-up Circuitry High Performance Memory Read Using Cross-Coupled Pull-up Circuitry Katie Blomster and José G. Delgado-Frias School of Electrical Engineering and Computer Science Washington State University Pullman, WA

More information

ISSN Vol.05,Issue.09, September-2017, Pages:

ISSN Vol.05,Issue.09, September-2017, Pages: WWW.IJITECH.ORG ISSN 2321-8665 Vol.05,Issue.09, September-2017, Pages:1693-1697 AJJAM PUSHPA 1, C. H. RAMA MOHAN 2 1 PG Scholar, Dept of ECE(DECS), Shirdi Sai Institute of Science and Technology, Anantapuramu,

More information

International Journal of Scientific & Engineering Research, Volume 5, Issue 2, February ISSN

International Journal of Scientific & Engineering Research, Volume 5, Issue 2, February ISSN International Journal of Scientific & Engineering Research, Volume 5, Issue 2, February-2014 938 LOW POWER SRAM ARCHITECTURE AT DEEP SUBMICRON CMOS TECHNOLOGY T.SANKARARAO STUDENT OF GITAS, S.SEKHAR DILEEP

More information

DESIGN AND IMPLEMENTATION OF 8X8 DRAM MEMORY ARRAY USING 45nm TECHNOLOGY

DESIGN AND IMPLEMENTATION OF 8X8 DRAM MEMORY ARRAY USING 45nm TECHNOLOGY DESIGN AND IMPLEMENTATION OF 8X8 DRAM MEMORY ARRAY USING 45nm TECHNOLOGY S.Raju 1, K.Jeevan Reddy 2 (Associate Professor) Digital Systems & Computer Electronics (DSCE), Sreenidhi Institute of Science &

More information

REPLACING 6T SRAMS WITH 3T1D DRAMS IN THE L1 DATA CACHE TO COMBAT PROCESS VARIABILITY

REPLACING 6T SRAMS WITH 3T1D DRAMS IN THE L1 DATA CACHE TO COMBAT PROCESS VARIABILITY ... REPLACING 6T SRAMS WITH 3T1D DRAMS IN THE L1 DATA CACHE TO COMBAT PROCESS VARIABILITY... Xiaoyao Liang Harvard University Ramon Canal Universitat Politècnica de Catalunya Gu-Yeon Wei David Brooks Harvard

More information

J. Manikandan Research scholar, St. Peter s University, Chennai, Tamilnadu, India.

J. Manikandan Research scholar, St. Peter s University, Chennai, Tamilnadu, India. Design of Single Correction-Double -Triple -Tetra (Sec-Daed-Taed- Tetra Aed) Codes J. Manikandan Research scholar, St. Peter s University, Chennai, Tamilnadu, India. Dr. M. Manikandan Associate Professor,

More information

Random Access Memory (RAM)

Random Access Memory (RAM) Random Access Memory (RAM) EED2003 Digital Design Dr. Ahmet ÖZKURT Dr. Hakkı YALAZAN 1 Overview Memory is a collection of storage cells with associated input and output circuitry Possible to read and write

More information

/$ IEEE

/$ IEEE IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: EXPRESS BRIEFS, VOL. 56, NO. 1, JANUARY 2009 81 Bit-Level Extrinsic Information Exchange Method for Double-Binary Turbo Codes Ji-Hoon Kim, Student Member,

More information

EE586 VLSI Design. Partha Pande School of EECS Washington State University

EE586 VLSI Design. Partha Pande School of EECS Washington State University EE586 VLSI Design Partha Pande School of EECS Washington State University pande@eecs.wsu.edu Lecture 1 (Introduction) Why is designing digital ICs different today than it was before? Will it change in

More information

Outline Marquette University

Outline Marquette University COEN-4710 Computer Hardware Lecture 1 Computer Abstractions and Technology (Ch.1) Cristinel Ababei Department of Electrical and Computer Engineering Credits: Slides adapted primarily from presentations

More information

250nm Technology Based Low Power SRAM Memory

250nm Technology Based Low Power SRAM Memory IOSR Journal of VLSI and Signal Processing (IOSR-JVSP) Volume 5, Issue 1, Ver. I (Jan - Feb. 2015), PP 01-10 e-issn: 2319 4200, p-issn No. : 2319 4197 www.iosrjournals.org 250nm Technology Based Low Power

More information
