Energy-Efficient Value-Based Selective Refresh for Embedded DRAMs

K. Patel 1, L. Benini 2, Enrico Macii 1, and Massimo Poncino 1

1 Politecnico di Torino, Torino, Italy
2 Università di Bologna, Bologna, Italy

Abstract. DRAM idle power consumption consists for a large part of the power required for the refresh operation. This is exacerbated by (i) the increasing amount of memory devoted to caches, which filter out many accesses to DRAM, and (ii) the increased temperature of the chips, which increases leakage and thus shortens data retention times. The well-known structured distribution of zeros in a memory, combined with the observation that DRAM cells containing zeros do not need to be refreshed, can be used to reduce the number of unnecessary refresh operations. We propose a value-based selective refresh scheme in which both horizontal and vertical clusters of zeros are identified and used to selectively deactivate the refresh of such clusters. As a result, our technique achieves a net reduction of the number of refresh operations of 31% on average, evaluated on a set of typical embedded applications.

1 Introduction

Embedded DRAM (EDRAM) is viewed as a viable design option for applications with significant memory requirements, tight performance constraints and limited power budgets. Embedded DRAM has lower density and requires a more expensive mask set and fabrication process, but it offers drastically improved energy per access [1]. The energy-efficiency advantage of EDRAMs may be reduced, or even compromised, if adequate countermeasures are not taken to address its idle power consumption, which is caused mainly by the periodic refresh operation. Refresh is a more serious problem for EDRAMs than for standard DRAMs for two main reasons.
First, technology options to reduce cell leakage cannot be pursued as aggressively in EDRAM as in standard DRAMs, for cost reasons and, more fundamentally, as a consequence of the tradeoff between efficient logic and efficient memory. Second, the presence of fast switching logic on the same die causes higher-temperature operation, which increases leakage and therefore requires a higher refresh rate.

From the architectural viewpoint, faster refresh rates imply larger idle power for EDRAM. The importance of idle power is magnified by the fact that DRAMs are often used with a low duty cycle of busy periods, since DRAM accesses are filtered by the higher levels of the memory hierarchy (i.e., fast and small SRAM-based caches).

V. Paliouras, J. Vounckx, and D. Verkest (Eds.): PATMOS 2005, LNCS 3728, © Springer-Verlag Berlin Heidelberg 2005

For these reasons, several researchers have proposed techniques

for idle power minimization in EDRAM memories [4], which are also applicable to generic DRAMs [5]. Most of these techniques aim at providing very low-power shutdown states, either with loss of information or with a significant access time penalty. Alternative approaches aim at reducing power less aggressively than with shutdown, while at the same time minimizing access time overhead [9, 11].

In this paper, we propose a low-overhead refresh power reduction technique based on the concept of selective refresh. We exploit the well-known dominance of zero bits in the data stored in DRAM by adding a limited amount of redundant storage and logic (the overhead of our technique) to index the memory blocks that contain only zeros, so as to eliminate the refresh of these blocks. As a result, our technique significantly reduces the number of refresh operations, decreasing idle power. One important design exploration parameter is the granularity at which we apply zero-value detection and tagging. In this work, we propose two alternative mechanisms, namely horizontal and vertical zero clustering, and we analyze the granularity at which they can be applied. Our results demonstrate an average reduction of 31% of the refresh operations, measured across different granularities.

2 Previous Work

The non-uniform distribution of values in memories has been exploited in several ways, although this property has mainly been used in the context of caches, with the objective of reducing the average access time or the total energy by reducing the cost of memory reads using the common-case principle. The frequent value (FV) cache [7] is based on an analysis of application data usage that identifies a few frequently accessed data values; these are stored in a small buffer, so that most accesses are served by this small buffer.
The experimentally observed dominance of zeros in a cache has been exploited by the value-conscious cache [8], where a sort of hierarchical bitline scheme is used to avoid discharging the bitline whenever a zero value is stored. A similar approach is used in the dynamic zero compression (DZC) architecture [6], where zero bytes are encoded using one bit, achieving energy reduction when zero bytes are accessed in the cache.

Concerning architectural techniques that aim at reducing the energy impact of refresh, Ohsawa et al. [9] propose two refresh architectures. The first one, called selective refresh, is based on the observation that data need to be retained (and thus refreshed) only for the duration of their lifetimes. The difficulty in implementing this architecture is that this information is not straightforward to extract and may require some support from the compiler. The second architecture, called variable refresh period, uses multiple refresh periods for different regions of the array, based on the fact that the data retention time of the cells is not constant. This property was first exploited at the circuit level [10], and has been refined in [9] by employing a refresh counter for each row, plus a register that stores the refresh period of a given row. This idea has been elaborated into a more sophisticated scheme in [11], where the variable refresh period is applied at a granularity (a block) smaller than a row.

3 Value-Based Selective Refresh

3.1 Motivation

The technique proposed in this paper is motivated by the observation of two properties of DRAM memory systems. The first one is a consequence of the fact that, since most memory references are filtered out by caches, only a few accesses go to main memory (normally a DRAM). This causes the contribution of idle power consumption to become dominant, since refresh is a mandatory operation. The plot in Figure 1-(a) supports this: it shows, for a set of benchmarks, the split between refresh and access energy as a function of the refresh period, for a system with 16KB of L1 cache and no L2 cache. We notice that refresh becomes dominant for refresh periods of around a few million cycles; assuming a 200MHz frequency typical of a SoC, this figure is equivalent to a few tens of ms, comparable to the refresh periods of common EDRAM macros [2, 3]. Notice that the addition of an L2 cache would make refresh energy even more dominant.

Fig. 1. (a) Refresh vs. Access Energy Split (b) Distribution of Zero Clusters
[Panel (a): refresh/access split [%] against refresh period [cycles] for adpcm, epic, g721, gsm, jpeg, pegwit and rasta.]

The second property is that memories, in general, tend to exhibit a structured distribution of 0s and 1s. This also holds for DRAMs, which, besides the classical high horizontal occurrence of 0s (e.g., a large number of zero bytes), exhibit an even more relevant vertical occurrence of 0s. We will use the term clusters to denote these subsets of rows or columns. Figure 1-(b) shows the occurrence frequency of 0s in a DRAM for a set of benchmarks. Values are given for different granularities (8, 16, and 32 bits) of 0-valued clusters, either vertical or horizontal.
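To make the notion of horizontal and vertical zero clusters concrete, the following sketch counts all-zero clusters in a small bit matrix. It is illustrative only: the function names, the toy array and the representation of memory as Python lists are assumptions, not the tooling used to produce Figure 1-(b).

```python
def horizontal_zero_fraction(rows, n):
    """Fraction of n-bit horizontal clusters (n adjacent bits of one row) that are all zero."""
    total = zero = 0
    for row in rows:
        for start in range(0, len(row) - n + 1, n):
            total += 1
            if not any(row[start:start + n]):
                zero += 1
    return zero / total if total else 0.0


def vertical_zero_fraction(rows, n):
    """Fraction of n-bit vertical clusters (one column over n adjacent rows) that are all zero."""
    total = zero = 0
    width = len(rows[0]) if rows else 0
    for top in range(0, len(rows) - n + 1, n):
        for col in range(width):
            total += 1
            if not any(rows[r][col] for r in range(top, top + n)):
                zero += 1
    return zero / total if total else 0.0


# Toy 4x4 array (hypothetical data): the two rightmost columns are zero in every row.
memory = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 1, 0, 0],
]
h4 = horizontal_zero_fraction(memory, 4)  # no row is entirely zero
v4 = vertical_zero_fraction(memory, 4)    # columns 2 and 3 are entirely zero
```

In this toy array no 4-bit horizontal cluster is zero, while half of the 4-bit vertical clusters are, which mirrors in miniature the kind of asymmetry the text describes.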
The plot shows that, while the number of horizontal zero clusters decreases quite rapidly as the cluster size increases, vertical clusters are more frequent and do not decrease much for larger cluster sizes: on average, 38% of the vertical 32-bit clusters contain all zeros. Our idea is to use the latter property to reduce the number of refresh operations by observing that cells containing a zero do not need to be refreshed. Since they can be grouped into clusters, it is possible to avoid the refresh of an entire

cluster. In other words, we transform the refresh operation, which is normally performed independently of the value contained in the cell, into a value-dependent operation. From the architectural standpoint, our value-based selective refresh consists of grouping 0s in clusters. The information regarding the value of each cluster is stored in an extra DRAM cell. This cell stores a one if all bits in the cluster are zero, and zero otherwise. As the data of Figure 1-(b) show, clustering of zeros can be either horizontal or vertical; hence, in this paper, we investigate two different approaches to cluster zero cells.

3.2 Horizontal Zero Clustering (HZC)

Our approach to clustering zeros in the horizontal direction is similar to the one proposed by Villa et al. [6], in which clustering of zeros is exploited to reduce dynamic power dissipation in caches (SRAM). Figure 2-(a) shows the conceptual architecture of HZC. Each word line is divided into a number of clusters, each one having its own Zero Indicator Bit (ZIB). The ZIBs are placed in extra columns in the cell array, depicted as vertical gray stripes in Figure 2-(a). As shown in Figure 3-(a), depending on the content of the ZIB cell, the local word line of the corresponding cluster can be disconnected from the global word line. The operations of a memory with the HZC architecture can be summarized as follows.

Fig. 2. Architecture of (a) Horizontal and (b) Vertical Zero Clustering

Refresh: During the refresh operation, the local wordline of the cluster is connected to or disconnected from the global wordline based on the content of the ZIB cell. Figure 3-(a) shows that refresh is not performed if the ZIB is one (M2 is turned off, and M1 is turned on, grounding the local word line).

Read: The read operation is similar to refresh. During a read, the ZIB is read and, depending on its value, the bits in the cluster will be read or not.
If the ZIB stores a one, then the bits in the cluster are not read (we know that they are all zero), whereas if it stores a zero they will be read out.

Write: During a write operation, the ZIB is updated when its cluster is written. The Zero Detect Circuit (ZDC) detects whether all cluster bits are zero; if so, a 1 is written into the ZIB. The design of the ZDC is very similar to the one found in [6], and thus it is not reported here.
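The three HZC operations above can be summarized in a small behavioural model. This is a sketch under stated assumptions, not the circuit itself: the class name, the `refreshes` counter and the choice of starting with an all-zero cluster are hypothetical, and timing and sense-amplifier effects are ignored.

```python
class HZCCluster:
    """Behavioural model of one n-bit cluster guarded by a Zero Indicator Bit (ZIB)."""

    def __init__(self, n):
        self.n = n
        self.bits = [0] * n   # assumption: the cluster starts out all zero
        self.zib = 1          # 1 means "all bits in the cluster are zero"
        self.refreshes = 0    # number of cluster refreshes actually performed

    def write(self, bits):
        if len(bits) != self.n:
            raise ValueError("cluster width mismatch")
        # Zero Detect Circuit: the ZIB becomes 1 only if every written bit is zero.
        self.zib = 0 if any(bits) else 1
        self.bits = list(bits)

    def read(self):
        # The ZIB is read first; a zero cluster is returned without reading its cells.
        if self.zib:
            return [0] * self.n
        return list(self.bits)

    def refresh(self):
        # With ZIB == 1 the local wordline is disconnected, so the cells are skipped.
        if not self.zib:
            self.refreshes += 1


cluster = HZCCluster(8)
cluster.refresh()                          # zero cluster: skipped
cluster.write([0, 0, 1, 0, 0, 0, 0, 0])
cluster.refresh()                          # non-zero cluster: refreshed
cluster.write([0] * 8)
cluster.refresh()                          # zero again: skipped
```

After this sequence only one of the three refresh requests has touched the cluster cells.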

Fig. 3. (a) ZIB circuit [5] (b) Refresh Period Operations: VZC Architecture

Notice that during the read and refresh operations the ZIB is read; since the read operation is destructive, the proposed scheme imposes a small read time overhead. When the ZIB is 0, the cluster bits have to be refreshed along with the ZIB during the refresh operation. Referring to Figure 3-(a), we notice that, during refresh, the bitline is pre-charged to Vdd/2, thus partially charging the capacitance C of the ZIB, possibly turning transistor M2 off and hence cutting off the local wordline of the remaining n bits. If we wait for the value of the ZIB to be restored, the local wordline becomes active again and connects the cluster bits to their bitlines.

When the ZIB is 1, this indicates a zero cluster. This cuts off the local wordline of the cluster during a read. Thus, the sense amplifiers of the cluster's columns would remain in a meta-stable state. To avoid this problem, the sense amplifier design has to be modified, as done in [6] (the modified circuit is not shown here for space reasons).

3.3 Vertical Zero Clustering (VZC)

Vertical Zero Clustering (VZC) aims at detecting and exploiting the presence of clusters of zeros in columns of the DRAM array. Figure 2-(b) shows the conceptual architecture of VZC. Depending on the granularity of clustering, every n rows have one Zero Indicator Row (ZIR). Each ZIR contains one ZIB for each column in the DRAM array. Since we add one ZIR for every n rows, we need a separate decoder for the ZIRs, which uses the higher-order address bits to access them, depending on the granularity. We also add an additional column to the array, containing a set of dirty bit indicators, one for each ZIR. These bits are used to track writes and to ensure that the ZIBs are correctly updated (as discussed in detail later).
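Because one ZIR covers n consecutive rows, the ZIR decoder only needs the higher-order bits of the row address. A minimal sketch of that mapping, assuming n is a power of two and rows are numbered from zero (both assumptions; the function name is hypothetical):

```python
def zir_index(row_address, n):
    """Index of the Zero Indicator Row that covers `row_address` for cluster size n."""
    if n <= 0 or n & (n - 1):
        raise ValueError("n is assumed to be a power of two")
    shift = n.bit_length() - 1      # log2(n): number of low-order address bits dropped
    return row_address >> shift


# With n = 8, rows 0..7 share ZIR 0, rows 8..15 share ZIR 1, and so on.
indices = [zir_index(r, 8) for r in (0, 7, 8, 15, 16)]
```

The same index selects the dirty bit of the cluster, since there is one dirty bit per ZIR.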
Similarly to HZC, VZC also requires support circuitry for each column of the array. These circuits, shown in Figure 4-(a) and Figure 4-(b), are used at different moments by the different operations on the DRAM. Memory operations in VZC can be summarized as follows.

Read: A read is performed as a normal read, the only difference being the presence of the circuit depicted in Figure 4-(a).

Fig. 4. (a) Selective Refresh Circuit: VZC (b) Write Circuit of ZIB

During a read operation, the Read signal is held high, ensuring that transistor M1 is turned on. Then, a normal pre-charge takes place, followed by the read of the desired cell. Notice that the ATD signal in the figure is used only during refresh, to selectively disable the pre-charge of bitlines.

Write: A write is also done in the usual way. Notice that as long as data is just read from memory, the status of the clusters will not change. Conversely, when a datum is written into the memory, it may modify the status of the clusters; therefore, the status of the ZIB must be changed accordingly. This may be a problem, since, in order to change the status of the ZIB, all rows of the cluster must be read. We avoid this overhead by postponing the ZIB update until the next refresh operation: the write simply zeroes the ZIB of the cluster corresponding to the current write address. This operation also sets the dirty bit of that ZIR by writing 1 to it. Based on the value of the dirty bit, the next refresh operation of this cluster will determine the latest status of the ZIBs.

Refresh: The refresh operation has two modes. One is the normal refresh operation, where zero clusters are not refreshed, and the other is the ZIB update mode.

Normal Refresh: Before starting the refresh of the first row of the cluster, the corresponding ZIR is read. Reading of the ZIR is triggered by the Address Transition Detection (ATD) signal. The ATD signal goes to one every n rows that are refreshed, that is, when we cross the boundary of a ZIR. The ATD signal triggers the ZIR read operation using the ZIR decoder, shown in Figure 2-(b). As shown in Figure 4-(a), ATD turns on transistor M2 and, depending on the value of the ZIB, the capacitor is charged or discharged. At the end of the read operation on the ZIR, ATD goes low.
If the ZIB is 1, a 0 will be produced at the output of the inverter of Figure 4-(a). The Read signal is held low during the refresh operation, so that the transmission gate M3 is turned on, which also turns transistor M6 on. This, in turn, turns transistor M1 off, cutting off the pre-charge from the bitline. Hence, if the ZIB is 1, the pre-charge for that particular column remains disabled for the next n rows. Therefore, during the row-based refresh operation, the bits of this column belonging to the cluster will not be refreshed. Conversely, if the ZIB is 0, the capacitance is discharged, forcing the output of the inverter to 1. This turns M6 off and M1 on, so that a normal refresh occurs.

ZIB update mode: During the ZIB update mode, the status of the clusters has to be determined in order to update the value of the ZIB. This part of the circuit is shown in Figure 4-(b). As explained above, before starting the refresh of the first row of a cluster, the content of the ZIR for that cluster is read by raising ATD high. Reading the ZIR also reads the dirty bit corresponding to that ZIR. If the dirty bit is set (i.e., the status of the ZIR is unknown), this turns transistor M2 on through the NAND gate. This charges the capacitance C, which in turn puts a 1 at the Write Output (Write O/P) through transistor M1. All these operations occur when the ZIR is read. Assuming the ZIR is reset and its dirty bit is set, the regular row-based refresh follows the ZIR read operation. Notice that the ATD signal goes low before the row-based refresh starts. During refresh, if any of the bits of a given column is 1, it turns transistor M3 on and grounds the charged capacitance, setting the output of the inverter to 1. This makes Write O/P go to 0. For those ZIRs which have the dirty bit set, the value of Write O/P is written back to the corresponding ZIR at the end of the refresh of all the rows of the cluster. The end is detected, again, by ATD, since after the refresh of the last row of the cluster the refresh moves to the first row of the next cluster.
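The interplay of writes, dirty bits and the two refresh modes can be followed in a behavioural model. The sketch below is an approximation under stated assumptions: the class and attribute names are hypothetical, the array starts out all zero with every ZIB set, the dirty bit is assumed to be cleared once the ZIR has been rewritten, and the refresh of the ZIR itself is not counted.

```python
class VZCArray:
    """Behavioural model of a DRAM array with Vertical Zero Clustering."""

    def __init__(self, rows, cols, n):
        if rows % n:
            raise ValueError("rows is assumed to be a multiple of n")
        self.n, self.cols = n, cols
        self.cells = [[0] * cols for _ in range(rows)]
        # One Zero Indicator Row (a ZIB per column) and one dirty bit per cluster.
        self.zir = [[1] * cols for _ in range(rows // n)]
        self.dirty = [0] * (rows // n)

    def write(self, row, bits):
        if len(bits) != self.cols:
            raise ValueError("row width mismatch")
        self.cells[row] = list(bits)
        k = row // self.n
        self.zir[k] = [0] * self.cols   # reset the ZIR: cluster status is unknown
        self.dirty[k] = 1               # ZIB update is deferred to the next refresh

    def refresh(self):
        """Run one refresh sweep and return the number of cell refreshes performed."""
        refreshed = 0
        for k, zibs in enumerate(self.zir):
            rows = self.cells[k * self.n:(k + 1) * self.n]
            for col in range(self.cols):
                if zibs[col] == 0:      # pre-charge enabled: this column is refreshed
                    refreshed += self.n
            if self.dirty[k]:
                # ZIB update mode: a column gets ZIB = 1 only if none of its cells holds a 1.
                self.zir[k] = [0 if any(r[c] for r in rows) else 1
                               for c in range(self.cols)]
                self.dirty[k] = 0       # assumption: dirty bit cleared after the update
        return refreshed


mem = VZCArray(rows=8, cols=4, n=4)
idle = mem.refresh()            # nothing written yet: every cluster is skipped
mem.write(1, [1, 0, 0, 0])
first = mem.refresh()           # dirty cluster: all 4 columns x 4 rows are refreshed
second = mem.refresh()          # ZIBs updated: only column 0 of cluster 0 is refreshed
```

The sweep right after the write pays the full cost for the dirty cluster; the following sweep benefits from the updated ZIBs.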
The sequence of operations occurring during refresh of the VZC DRAM is summarized in Figure 3-(b). When refreshing the first row of a cluster, the refresh operation is performed after the ZIB update and the ZIR read. The following n - 1 row refresh operations are normal refresh operations. Notice that since the content of the ZIR is built from the content of DRAM cells, it has to be refreshed as well. This is done when reading the ZIR, and does not require an explicit refresh.

3.4 Write Impact on Refresh

In the VZC architecture, whenever a write arrives, it resets the ZIR of the cluster to which the write belongs and sets the dirty bit. Hence, on the next refresh, this cluster will have to be refreshed. If during the same refresh period another write arrives for the same cluster, it will not change the status of the ZIR, since this cluster has already been declared dirty. If, instead, the write goes to another cluster, it destroys that cluster too by resetting its ZIR. Hence, on the next refresh that cluster will have to be refreshed as well. If many writes

are distributed over different clusters, this will jeopardize the opportunities to save refresh on these clusters. This is also strictly related to the cluster size. Experiments show that as we move towards coarser granularity, the percentage of dirty clusters increases. This is due to the fact that, even though the distribution of writes to different clusters is reduced, the total number of clusters is reduced as well. In general, however, this percentage remains quite small and hence dirty writes do not significantly reduce the benefit of VZC.

4 Overhead Analysis

Both the HZC and VZC architectures have some area and timing overhead. Here we briefly discuss it with an approximate quantification. Concerning area, Table 1 summarizes the different components contributing to the area overhead of the HZC and VZC architectures for different cluster granularities n. The percentage overhead is with respect to the area of that component in a regular, non-clustered DRAM.

Table 1. Relative Percentage Area Overhead
Columns: HZC (n=8, n=16, n=32) and VZC (n=8, n=16, n=32). Rows: Data Matrix, Bitline, Wordline, Sel. Refresh (negligible constant overhead), Row Decoder (No Overhead for HZC).

Data Array: For the data array in the HZC architecture, every n bits we have three extra transistors (2 n-mos and 1 p-mos, Figure 3-(a)), i.e., an overhead of 3/n. In the VZC architecture we have an additional row for every n rows, hence an overhead of 1/n. Notice that in the HZC architecture the p-mos transistor drives the local wordline and, depending on the granularity of the cluster, it has to be wide enough to drive it without introducing an extra delay in reading.

Bitlines: In the HZC architecture, we have an extra bitline for every n bitlines, while in the VZC architecture we have an extra wire running parallel to every bitline (Figure 4-(a)). Though this wire is not an extra bitline, for the sake of the overhead calculation we considered it as a bitline overhead.
Wordlines: Due to the divided-wordline architecture of HZC (Figure 3-(a)), we have extra local wordlines, whose total length per row is approximately equal to the length of the global wordline. In the VZC architecture we have an extra row for every n rows.
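The data-array ratios stated above (3/n for HZC and 1/n for VZC) can be evaluated for the three cluster sizes considered in this paper. The short sketch below only applies those two ratios; it is not a reconstruction of the entries of Table 1, and the function name is hypothetical.

```python
def data_array_overhead(n):
    """Relative data-array overhead in percent for cluster size n, from the stated ratios."""
    hzc = 3 / n * 100   # three extra transistors per n-bit cluster
    vzc = 1 / n * 100   # one extra row (the ZIR) per n rows
    return hzc, vzc


overheads = {n: data_array_overhead(n) for n in (8, 16, 32)}
# n = 8  -> HZC 37.5 %,  VZC 12.5 %
# n = 16 -> HZC 18.75 %, VZC 6.25 %
# n = 32 -> HZC 9.375 %, VZC 3.125 %
```

Both ratios halve each time the cluster size doubles, and the HZC ratio stays three times the VZC one at every granularity.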

Row Decoders: While the HZC architecture does not need an extra row decoder, the VZC architecture has an extra row decoder for decoding ZIRs. Though the complexity of this extra row decoder is significantly smaller than that of the main row decoder, its cost depends on n. As shown in the table, this extra row decoder has a marginal overhead with respect to the regular row decoder. Its contribution to the overall area overhead is very small, since the dynamic-logic-based decoders themselves have a complexity of 5 to 8% with respect to the data matrix (considering transistor counts).

Delay, on the contrary, is slightly more critical for read operations. For the HZC architecture, the read operation has a slight increase in delay, since the ZIB has to be read out to determine the value of the rest of the n bits. In the VZC architecture, instead, the read operation is performed in the normal way, and hence there is no visible increase in delay. Concerning the write operation, in the HZC scheme the zero detect circuit has to determine during each write whether there are zero clusters, before committing the write to the lines. Hence there is an increase in write time, determined by the delay of the zero detect circuit. Conversely, in the VZC architecture, the write operation is carried out normally, followed by the resetting of the ZIR and the setting of the dirty bit. Hence, there is no sizable delay increase during the write operation either.

Overall, the overhead of the VZC architecture is smaller and, even more important, it does not impact normal read and write operations. This fact, coupled with the statistics of Figure 1-(b), seems to make VZC a more competitive architecture than HZC.

5 Experimental Results

For our experiments we have used a modified version of SimpleScalar-3.0/PISA [12]. All our experiments are run using sim-outorder.
We configured the simulator to have separate L1 instruction (direct mapped) and data (4-way set associative) caches, both of size 16KB with a 32-byte block size. The L2 cache has been disabled, since the relatively limited execution time of the applications did not generate sufficient traffic towards the DRAM. During all our experiments we have kept the data retention time of a DRAM cell at one million CPU cycles. Assuming a 200MHz frequency of a typical SoC, this is equivalent to 5 milliseconds. We have used the MediaBench suite [13], which includes various multimedia, networking and security-related applications. Most of the benchmarks have separate encoding and decoding applications.

Figure 5 plots the percentage of refreshes avoided by the HZC and VZC architectures, for different cluster granularities. The plots correspond to the encoding and decoding applications of the MediaBench benchmarks. Notice that the reported values already account for the refresh overheads brought by HZC and VZC, and are in fact equivalent to reductions of refresh energy. In the plots, x_v and x_h denote the relative savings brought by the VZC and HZC architectures, respectively, where x is the granularity of vertical (v) or horizontal (h) clustering. As can be seen from the plots, at the byte granularity both VZC and HZC bring almost the same percentage of savings, but as we move towards granularity of 16 and

32 bits, the dominance of the VZC architecture becomes visible. As the plots show, the savings of the VZC architecture for granularities of 8, 16 and 32 bits are not too different, whereas for the HZC architecture the difference is large. The average savings for the best architectures are 26.5% for HZC (cluster size = 8) and 33% for VZC (cluster size = 32). Notice that VZC with a cluster size of 32 provides the best results, due to its much smaller overhead.

Fig. 5. (a) Relative Refresh Energy Savings in Encoding (b) Decoding applications
[Series 8_v, 8_h, 16_v, 16_h, 32_v, 32_h. Encode: adpcm, epic, g721, gsm, jpeg, pegwit, rasta. Decode: adpcm, epic, g721, gsm, jpeg, mpeg2, pegwit.]

6 Conclusions

In this paper, we have proposed two value-conscious refresh architectures suitable for embedded DRAMs. Based on the observation that zeros do not need to be refreshed, we group bits into clusters to avoid the refresh of entire clusters. We have explored clustering in both horizontal and vertical directions, and various cluster sizes. Our experiments show that as we move towards larger cluster sizes, vertical clustering becomes more effective than horizontal clustering. Due to its smaller overhead, vertical clustering offers a substantial advantage at larger cluster sizes. Experimental results show that the best overall architecture, that is, vertical clustering with a cluster size of 32, provides a 33% reduction of refresh energy, evaluated on a set of embedded multimedia applications.

References

1. D. Keitel-Schulz, N. Wehn, Embedded DRAM Development: Technology, Physical Design, and Application Issues, IEEE Design and Test, Vol. 18, No. 3, pp. 7-15, May
2. C.-W. Yoon et al., A 80/20MHz 160mW Multimedia Processor integrated with Embedded DRAM MPEG-4 Accelerator and 3D Rendering Engine for Mobile Applications, ISSCC 04, Feb.
3. R. Woo et al., A Low-Power Graphics LSI integrating 29Mb Embedded DRAM for Mobile Multimedia Applications, ASPDAC 04, Feb.
4. F. Morishita et al., A 312MHz 16Mb Random-Cycle Embedded DRAM Macro with 73µW Power-Down Mode for Mobile Applications, ISSCC 04, Feb.
5. V. Delaluz et al., Hardware and Software Techniques for Controlling DRAM Power Modes, IEEE Transactions on Computers, Vol. 50, No. 11, Nov. 2001
6. L. Villa, M. Zhang, K. Asanovic, Dynamic zero compression for cache energy reduction, Micro-33: 33rd International Symposium on Microarchitecture, Dec. 2000
7. Y. Zhang, J. Yang, R. Gupta, Frequent Value Locality and Value-Centric Data Cache Design, ASPLOS 00, Nov. 2000
8. Y.J. Chang, C.L. Yang, F. Lai, Value-Conscious Cache: Simple Technique for Reducing Cache Access Power, DATE 04, Feb.
9. T. Ohsawa, K. Kai, K. Murakami, Optimizing the DRAM Refresh Count for Merged DRAM/Logic LSIs, ISLPED 98, Aug. 1998
10. Y. Idei et al., Dual-Period Self-Refresh Scheme for Low-Power DRAMs with On-Chip PROM Mode Register, IEEE Journal on Solid-State Circuits, Vol. 33, No. 2, Feb. 1998
11. J. Kim, M.C. Papaefthymiou, Block-Based Multiperiod Dynamic Memory Design for Low Data-Retention Power, IEEE Transactions on VLSI Systems, Vol. 11, No. 6, Dec. 2003
12. SimpleScalar home page
13. C. Lee, M. Potkonjak, W. Mangione-Smith, MediaBench: A Tool for Evaluating and Synthesizing Multimedia and Communications Systems, International Symposium on Microarchitecture, Dec. 1997


More information

Low-Power SRAM and ROM Memories

Low-Power SRAM and ROM Memories Low-Power SRAM and ROM Memories Jean-Marc Masgonty 1, Stefan Cserveny 1, Christian Piguet 1,2 1 CSEM, Neuchâtel, Switzerland 2 LAP-EPFL Lausanne, Switzerland Abstract. Memories are a main concern in low-power

More information

Simulation and Analysis of SRAM Cell Structures at 90nm Technology

Simulation and Analysis of SRAM Cell Structures at 90nm Technology Vol.1, Issue.2, pp-327-331 ISSN: 2249-6645 Simulation and Analysis of SRAM Cell Structures at 90nm Technology Sapna Singh 1, Neha Arora 2, Prof. B.P. Singh 3 (Faculty of Engineering and Technology, Mody

More information

Chapter 6 Memory 11/3/2015. Chapter 6 Objectives. 6.2 Types of Memory. 6.1 Introduction

Chapter 6 Memory 11/3/2015. Chapter 6 Objectives. 6.2 Types of Memory. 6.1 Introduction Chapter 6 Objectives Chapter 6 Memory Master the concepts of hierarchical memory organization. Understand how each level of memory contributes to system performance, and how the performance is measured.

More information

Embedded Memories. Advanced Digital IC Design. What is this about? Presentation Overview. Why is this important? Jingou Lai Sina Borhani

Embedded Memories. Advanced Digital IC Design. What is this about? Presentation Overview. Why is this important? Jingou Lai Sina Borhani 1 Advanced Digital IC Design What is this about? Embedded Memories Jingou Lai Sina Borhani Master students of SoC To introduce the motivation, background and the architecture of the embedded memories.

More information

Memory. Outline. ECEN454 Digital Integrated Circuit Design. Memory Arrays. SRAM Architecture DRAM. Serial Access Memories ROM

Memory. Outline. ECEN454 Digital Integrated Circuit Design. Memory Arrays. SRAM Architecture DRAM. Serial Access Memories ROM ECEN454 Digital Integrated Circuit Design Memory ECEN 454 Memory Arrays SRAM Architecture SRAM Cell Decoders Column Circuitry Multiple Ports DRAM Outline Serial Access Memories ROM ECEN 454 12.2 1 Memory

More information

CHAPTER 6 Memory. CMPS375 Class Notes (Chap06) Page 1 / 20 Dr. Kuo-pao Yang

CHAPTER 6 Memory. CMPS375 Class Notes (Chap06) Page 1 / 20 Dr. Kuo-pao Yang CHAPTER 6 Memory 6.1 Memory 341 6.2 Types of Memory 341 6.3 The Memory Hierarchy 343 6.3.1 Locality of Reference 346 6.4 Cache Memory 347 6.4.1 Cache Mapping Schemes 349 6.4.2 Replacement Policies 365

More information

DIRECT Rambus DRAM has a high-speed interface of

DIRECT Rambus DRAM has a high-speed interface of 1600 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 34, NO. 11, NOVEMBER 1999 A 1.6-GByte/s DRAM with Flexible Mapping Redundancy Technique and Additional Refresh Scheme Satoru Takase and Natsuki Kushiyama

More information

The Memory Hierarchy. Daniel Sanchez Computer Science & Artificial Intelligence Lab M.I.T. April 3, 2018 L13-1

The Memory Hierarchy. Daniel Sanchez Computer Science & Artificial Intelligence Lab M.I.T. April 3, 2018 L13-1 The Memory Hierarchy Daniel Sanchez Computer Science & Artificial Intelligence Lab M.I.T. April 3, 2018 L13-1 Memory Technologies Technologies have vastly different tradeoffs between capacity, latency,

More information

Managing Dynamic Reconfiguration Overhead in Systems-on-a-Chip Design Using Reconfigurable Datapaths and Optimized Interconnection Networks

Managing Dynamic Reconfiguration Overhead in Systems-on-a-Chip Design Using Reconfigurable Datapaths and Optimized Interconnection Networks Managing Dynamic Reconfiguration Overhead in Systems-on-a-Chip Design Using Reconfigurable Datapaths and Optimized Interconnection Networks Zhining Huang, Sharad Malik Electrical Engineering Department

More information

Lecture-14 (Memory Hierarchy) CS422-Spring

Lecture-14 (Memory Hierarchy) CS422-Spring Lecture-14 (Memory Hierarchy) CS422-Spring 2018 Biswa@CSE-IITK The Ideal World Instruction Supply Pipeline (Instruction execution) Data Supply - Zero-cycle latency - Infinite capacity - Zero cost - Perfect

More information

A Low Energy Clustered Instruction Memory Hierarchy for Long Instruction Word Processors

A Low Energy Clustered Instruction Memory Hierarchy for Long Instruction Word Processors A Low Energy Clustered Instruction Memory Hierarchy for Long Instruction Word Processors Murali Jayapala 1, Francisco Barat 1, Pieter Op de Beeck 1, Francky Catthoor 2, Geert Deconinck 1 and Henk Corporaal

More information

A Novel Architecture of SRAM Cell Using Single Bit-Line

A Novel Architecture of SRAM Cell Using Single Bit-Line A Novel Architecture of SRAM Cell Using Single Bit-Line G.Kalaiarasi, V.Indhumaraghathavalli, A.Manoranjitham, P.Narmatha Asst. Prof, Department of ECE, Jay Shriram Group of Institutions, Tirupur-2, Tamilnadu,

More information

250nm Technology Based Low Power SRAM Memory

250nm Technology Based Low Power SRAM Memory IOSR Journal of VLSI and Signal Processing (IOSR-JVSP) Volume 5, Issue 1, Ver. I (Jan - Feb. 2015), PP 01-10 e-issn: 2319 4200, p-issn No. : 2319 4197 www.iosrjournals.org 250nm Technology Based Low Power

More information

Design of Low Power Wide Gates used in Register File and Tag Comparator

Design of Low Power Wide Gates used in Register File and Tag Comparator www..org 1 Design of Low Power Wide Gates used in Register File and Tag Comparator Isac Daimary 1, Mohammed Aneesh 2 1,2 Department of Electronics Engineering, Pondicherry University Pondicherry, 605014,

More information

DESIGN OF PARAMETER EXTRACTOR IN LOW POWER PRECOMPUTATION BASED CONTENT ADDRESSABLE MEMORY

DESIGN OF PARAMETER EXTRACTOR IN LOW POWER PRECOMPUTATION BASED CONTENT ADDRESSABLE MEMORY DESIGN OF PARAMETER EXTRACTOR IN LOW POWER PRECOMPUTATION BASED CONTENT ADDRESSABLE MEMORY Saroja pasumarti, Asst.professor, Department Of Electronics and Communication Engineering, Chaitanya Engineering

More information

CS250 VLSI Systems Design Lecture 9: Memory

CS250 VLSI Systems Design Lecture 9: Memory CS250 VLSI Systems esign Lecture 9: Memory John Wawrzynek, Jonathan Bachrach, with Krste Asanovic, John Lazzaro and Rimas Avizienis (TA) UC Berkeley Fall 2012 CMOS Bistable Flip State 1 0 0 1 Cross-coupled

More information

Power Estimation of System-Level Buses for Microprocessor-Based Architectures: A Case Study

Power Estimation of System-Level Buses for Microprocessor-Based Architectures: A Case Study Power Estimation of System-Level Buses for Microprocessor-Based Architectures: A Case Study William Fornaciari Politecnico di Milano, DEI Milano (Italy) fornacia@elet.polimi.it Donatella Sciuto Politecnico

More information

Gated-V dd : A Circuit Technique to Reduce Leakage in Deep-Submicron Cache Memories

Gated-V dd : A Circuit Technique to Reduce Leakage in Deep-Submicron Cache Memories To appear in the Proceedings of the International Symposium on Low Power Electronics and Design (ISLPED), 2000. Gated-V dd : A Circuit Technique to Reduce in Deep-Submicron Cache Memories Michael Powell,

More information

Memory Design I. Array-Structured Memory Architecture. Professor Chris H. Kim. Dept. of ECE.

Memory Design I. Array-Structured Memory Architecture. Professor Chris H. Kim. Dept. of ECE. Memory Design I Professor Chris H. Kim University of Minnesota Dept. of ECE chriskim@ece.umn.edu Array-Structured Memory Architecture 2 1 Semiconductor Memory Classification Read-Write Wi Memory Non-Volatile

More information

CPE300: Digital System Architecture and Design

CPE300: Digital System Architecture and Design CPE300: Digital System Architecture and Design Fall 2011 MW 17:30-18:45 CBC C316 Cache 11232011 http://www.egr.unlv.edu/~b1morris/cpe300/ 2 Outline Review Memory Components/Boards Two-Level Memory Hierarchy

More information

Limiting the Number of Dirty Cache Lines

Limiting the Number of Dirty Cache Lines Limiting the Number of Dirty Cache Lines Pepijn de Langen and Ben Juurlink Computer Engineering Laboratory Faculty of Electrical Engineering, Mathematics and Computer Science Delft University of Technology

More information

2D/3D Graphics Accelerator for Mobile Multimedia Applications. Ramchan Woo, Sohn, Seong-Jun Song, Young-Don

2D/3D Graphics Accelerator for Mobile Multimedia Applications. Ramchan Woo, Sohn, Seong-Jun Song, Young-Don RAMP-IV: A Low-Power and High-Performance 2D/3D Graphics Accelerator for Mobile Multimedia Applications Woo, Sungdae Choi, Ju-Ho Sohn, Seong-Jun Song, Young-Don Bae,, and Hoi-Jun Yoo oratory Dept. of EECS,

More information

! Memory. " RAM Memory. " Serial Access Memories. ! Cell size accounts for most of memory array size. ! 6T SRAM Cell. " Used in most commercial chips

! Memory.  RAM Memory.  Serial Access Memories. ! Cell size accounts for most of memory array size. ! 6T SRAM Cell.  Used in most commercial chips ESE 57: Digital Integrated Circuits and VLSI Fundamentals Lec : April 5, 8 Memory: Periphery circuits Today! Memory " RAM Memory " Architecture " Memory core " SRAM " DRAM " Periphery " Serial Access Memories

More information

Reducing DRAM Latency at Low Cost by Exploiting Heterogeneity. Donghyuk Lee Carnegie Mellon University

Reducing DRAM Latency at Low Cost by Exploiting Heterogeneity. Donghyuk Lee Carnegie Mellon University Reducing DRAM Latency at Low Cost by Exploiting Heterogeneity Donghyuk Lee Carnegie Mellon University Problem: High DRAM Latency processor stalls: waiting for data main memory high latency Major bottleneck

More information

FeRAM Circuit Technology for System on a Chip

FeRAM Circuit Technology for System on a Chip FeRAM Circuit Technology for System on a Chip K. Asari 1,2,4, Y. Mitsuyama 2, T. Onoye 2, I. Shirakawa 2, H. Hirano 1, T. Honda 1, T. Otsuki 1, T. Baba 3, T. Meng 4 1 Matsushita Electronics Corp., Osaka,

More information

The Memory Hierarchy. Silvina Hanono Wachman Computer Science & Artificial Intelligence Lab M.I.T.

The Memory Hierarchy. Silvina Hanono Wachman Computer Science & Artificial Intelligence Lab M.I.T. The Memory Hierarchy Silvina Hanono Wachman Computer Science & Artificial Intelligence Lab M.I.T. L13-1 Memory Technologies Technologies have vastly different tradeoffs between capacity, latency, bandwidth,

More information

15-740/ Computer Architecture Lecture 19: Main Memory. Prof. Onur Mutlu Carnegie Mellon University

15-740/ Computer Architecture Lecture 19: Main Memory. Prof. Onur Mutlu Carnegie Mellon University 15-740/18-740 Computer Architecture Lecture 19: Main Memory Prof. Onur Mutlu Carnegie Mellon University Last Time Multi-core issues in caching OS-based cache partitioning (using page coloring) Handling

More information

Performance and Power Solutions for Caches Using 8T SRAM Cells

Performance and Power Solutions for Caches Using 8T SRAM Cells Performance and Power Solutions for Caches Using 8T SRAM Cells Mostafa Farahani Amirali Baniasadi Department of Electrical and Computer Engineering University of Victoria, BC, Canada {mostafa, amirali}@ece.uvic.ca

More information

Power Protocol: Reducing Power Dissipation on Off-Chip Data Buses

Power Protocol: Reducing Power Dissipation on Off-Chip Data Buses Power Protocol: Reducing Power Dissipation on Off-Chip Data Buses K. Basu, A. Choudhary, J. Pisharath ECE Department Northwestern University Evanston, IL 60208, USA fkohinoor,choudhar,jayg@ece.nwu.edu

More information

Address connections Data connections Selection connections

Address connections Data connections Selection connections Interface (cont..) We have four common types of memory: Read only memory ( ROM ) Flash memory ( EEPROM ) Static Random access memory ( SARAM ) Dynamic Random access memory ( DRAM ). Pin connections common

More information

A Single Ended SRAM cell with reduced Average Power and Delay

A Single Ended SRAM cell with reduced Average Power and Delay A Single Ended SRAM cell with reduced Average Power and Delay Kritika Dalal 1, Rajni 2 1M.tech scholar, Electronics and Communication Department, Deen Bandhu Chhotu Ram University of Science and Technology,

More information

Memory in Digital Systems

Memory in Digital Systems MEMORIES Memory in Digital Systems Three primary components of digital systems Datapath (does the work) Control (manager) Memory (storage) Single bit ( foround ) Clockless latches e.g., SR latch Clocked

More information

Lecture 13: SRAM. Slides courtesy of Deming Chen. Slides based on the initial set from David Harris. 4th Ed.

Lecture 13: SRAM. Slides courtesy of Deming Chen. Slides based on the initial set from David Harris. 4th Ed. Lecture 13: SRAM Slides courtesy of Deming Chen Slides based on the initial set from David Harris CMOS VLSI Design Outline Memory Arrays SRAM Architecture SRAM Cell Decoders Column Circuitry Multiple Ports

More information

a) Memory management unit b) CPU c) PCI d) None of the mentioned

a) Memory management unit b) CPU c) PCI d) None of the mentioned 1. CPU fetches the instruction from memory according to the value of a) program counter b) status register c) instruction register d) program status word 2. Which one of the following is the address generated

More information

Semiconductor Memory Classification

Semiconductor Memory Classification ESE37: Circuit-Level Modeling, Design, and Optimization for Digital Systems Lec 6: November, 7 Memory Overview Today! Memory " Classification " Architecture " Memory core " Periphery (time permitting)!

More information

INTERNATIONAL JOURNAL OF PROFESSIONAL ENGINEERING STUDIES Volume 9 /Issue 3 / OCT 2017

INTERNATIONAL JOURNAL OF PROFESSIONAL ENGINEERING STUDIES Volume 9 /Issue 3 / OCT 2017 Design of Low Power Adder in ALU Using Flexible Charge Recycling Dynamic Circuit Pallavi Mamidala 1 K. Anil kumar 2 mamidalapallavi@gmail.com 1 anilkumar10436@gmail.com 2 1 Assistant Professor, Dept of

More information

Introduction to SRAM. Jasur Hanbaba

Introduction to SRAM. Jasur Hanbaba Introduction to SRAM Jasur Hanbaba Outline Memory Arrays SRAM Architecture SRAM Cell Decoders Column Circuitry Non-volatile Memory Manufacturing Flow Memory Arrays Memory Arrays Random Access Memory Serial

More information

EEM 486: Computer Architecture. Lecture 9. Memory

EEM 486: Computer Architecture. Lecture 9. Memory EEM 486: Computer Architecture Lecture 9 Memory The Big Picture Designing a Multiple Clock Cycle Datapath Processor Control Memory Input Datapath Output The following slides belong to Prof. Onur Mutlu

More information

Chapter 5B. Large and Fast: Exploiting Memory Hierarchy

Chapter 5B. Large and Fast: Exploiting Memory Hierarchy Chapter 5B Large and Fast: Exploiting Memory Hierarchy One Transistor Dynamic RAM 1-T DRAM Cell word access transistor V REF TiN top electrode (V REF ) Ta 2 O 5 dielectric bit Storage capacitor (FET gate,

More information

Chapter 5. Internal Memory. Yonsei University

Chapter 5. Internal Memory. Yonsei University Chapter 5 Internal Memory Contents Main Memory Error Correction Advanced DRAM Organization 5-2 Memory Types Memory Type Category Erasure Write Mechanism Volatility Random-access memory(ram) Read-write

More information

Exploiting On-Chip Data Transfers for Improving Performance of Chip-Scale Multiprocessors

Exploiting On-Chip Data Transfers for Improving Performance of Chip-Scale Multiprocessors Exploiting On-Chip Data Transfers for Improving Performance of Chip-Scale Multiprocessors G. Chen 1, M. Kandemir 1, I. Kolcu 2, and A. Choudhary 3 1 Pennsylvania State University, PA 16802, USA 2 UMIST,

More information

OUTLINE Introduction Power Components Dynamic Power Optimization Conclusions

OUTLINE Introduction Power Components Dynamic Power Optimization Conclusions OUTLINE Introduction Power Components Dynamic Power Optimization Conclusions 04/15/14 1 Introduction: Low Power Technology Process Hardware Architecture Software Multi VTH Low-power circuits Parallelism

More information

CENG4480 Lecture 09: Memory 1

CENG4480 Lecture 09: Memory 1 CENG4480 Lecture 09: Memory 1 Bei Yu byu@cse.cuhk.edu.hk (Latest update: November 8, 2017) Fall 2017 1 / 37 Overview Introduction Memory Principle Random Access Memory (RAM) Non-Volatile Memory Conclusion

More information

International Journal of Scientific & Engineering Research, Volume 5, Issue 2, February ISSN

International Journal of Scientific & Engineering Research, Volume 5, Issue 2, February ISSN International Journal of Scientific & Engineering Research, Volume 5, Issue 2, February-2014 938 LOW POWER SRAM ARCHITECTURE AT DEEP SUBMICRON CMOS TECHNOLOGY T.SANKARARAO STUDENT OF GITAS, S.SEKHAR DILEEP

More information

Logic and Computer Design Fundamentals. Chapter 8 Memory Basics

Logic and Computer Design Fundamentals. Chapter 8 Memory Basics Logic and Computer Design Fundamentals Memory Basics Overview Memory definitions Random Access Memory (RAM) Static RAM (SRAM) integrated circuits Arrays of SRAM integrated circuits Dynamic RAM (DRAM) Read

More information

The Memory Hierarchy 1

The Memory Hierarchy 1 The Memory Hierarchy 1 What is a cache? 2 What problem do caches solve? 3 Memory CPU Abstraction: Big array of bytes Memory memory 4 Performance vs 1980 Processor vs Memory Performance Memory is very slow

More information

Low Power using Match-Line Sensing in Content Addressable Memory S. Nachimuthu, S. Ramesh 1 Department of Electrical and Electronics Engineering,

Low Power using Match-Line Sensing in Content Addressable Memory S. Nachimuthu, S. Ramesh 1 Department of Electrical and Electronics Engineering, Low Power using Match-Line Sensing in Content Addressable Memory S. Nachimuthu, S. Ramesh 1 Department of Electrical and Electronics Engineering, K.S.R College of Engineering, Tiruchengode, Tamilnadu,

More information

A Review Paper on Reconfigurable Techniques to Improve Critical Parameters of SRAM

A Review Paper on Reconfigurable Techniques to Improve Critical Parameters of SRAM IJSRD - International Journal for Scientific Research & Development Vol. 4, Issue 09, 2016 ISSN (online): 2321-0613 A Review Paper on Reconfigurable Techniques to Improve Critical Parameters of SRAM Yogit

More information

ESE370: Circuit-Level Modeling, Design, and Optimization for Digital Systems

ESE370: Circuit-Level Modeling, Design, and Optimization for Digital Systems ESE370: Circuit-Level Modeling, Design, and Optimization for Digital Systems Lec 26: November 9, 2018 Memory Overview Dynamic OR4! Precharge time?! Driving input " With R 0 /2 inverter! Driving inverter

More information

High Performance Memory Read Using Cross-Coupled Pull-up Circuitry

High Performance Memory Read Using Cross-Coupled Pull-up Circuitry High Performance Memory Read Using Cross-Coupled Pull-up Circuitry Katie Blomster and José G. Delgado-Frias School of Electrical Engineering and Computer Science Washington State University Pullman, WA

More information

CALCULATION OF POWER CONSUMPTION IN 7 TRANSISTOR SRAM CELL USING CADENCE TOOL

CALCULATION OF POWER CONSUMPTION IN 7 TRANSISTOR SRAM CELL USING CADENCE TOOL CALCULATION OF POWER CONSUMPTION IN 7 TRANSISTOR SRAM CELL USING CADENCE TOOL Shyam Akashe 1, Ankit Srivastava 2, Sanjay Sharma 3 1 Research Scholar, Deptt. of Electronics & Comm. Engg., Thapar Univ.,

More information

Memories. Design of Digital Circuits 2017 Srdjan Capkun Onur Mutlu.

Memories. Design of Digital Circuits 2017 Srdjan Capkun Onur Mutlu. Memories Design of Digital Circuits 2017 Srdjan Capkun Onur Mutlu http://www.syssec.ethz.ch/education/digitaltechnik_17 Adapted from Digital Design and Computer Architecture, David Money Harris & Sarah

More information

International Journal of Advanced Research in Electrical, Electronics and Instrumentation Engineering

International Journal of Advanced Research in Electrical, Electronics and Instrumentation Engineering IP-SRAM ARCHITECTURE AT DEEP SUBMICRON CMOS TECHNOLOGY A LOW POWER DESIGN D. Harihara Santosh 1, Lagudu Ramesh Naidu 2 Assistant professor, Dept. of ECE, MVGR College of Engineering, Andhra Pradesh, India

More information

Filter-Based Dual-Voltage Architecture for Low-Power Long-Word TCAM Design

Filter-Based Dual-Voltage Architecture for Low-Power Long-Word TCAM Design Filter-Based Dual-Voltage Architecture for Low-Power Long-Word TCAM Design Ting-Sheng Chen, Ding-Yuan Lee, Tsung-Te Liu, and An-Yeu (Andy) Wu Graduate Institute of Electronics Engineering, National Taiwan

More information

ECE 2300 Digital Logic & Computer Organization

ECE 2300 Digital Logic & Computer Organization ECE 2300 Digital Logic & Computer Organization Spring 201 Memories Lecture 14: 1 Announcements HW6 will be posted tonight Lab 4b next week: Debug your design before the in-lab exercise Lecture 14: 2 Review:

More information

250 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 19, NO. 2, FEBRUARY 2011

250 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 19, NO. 2, FEBRUARY 2011 250 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 19, NO. 2, FEBRUARY 2011 Energy-Efficient Hardware Data Prefetching Yao Guo, Member, IEEE, Pritish Narayanan, Student Member,

More information

A 256-Radix Crossbar Switch Using Mux-Matrix-Mux Folded-Clos Topology

A 256-Radix Crossbar Switch Using Mux-Matrix-Mux Folded-Clos Topology http://dx.doi.org/10.5573/jsts.014.14.6.760 JOURNAL OF SEMICONDUCTOR TECHNOLOGY AND SCIENCE, VOL.14, NO.6, DECEMBER, 014 A 56-Radix Crossbar Switch Using Mux-Matrix-Mux Folded-Clos Topology Sung-Joon Lee

More information

Low-Power Technology for Image-Processing LSIs

Low-Power Technology for Image-Processing LSIs Low- Technology for Image-Processing LSIs Yoshimi Asada The conventional LSI design assumed power would be supplied uniformly to all parts of an LSI. For a design with multiple supply voltages and a power

More information

Embedded Systems Design: A Unified Hardware/Software Introduction. Outline. Chapter 5 Memory. Introduction. Memory: basic concepts

Embedded Systems Design: A Unified Hardware/Software Introduction. Outline. Chapter 5 Memory. Introduction. Memory: basic concepts Hardware/Software Introduction Chapter 5 Memory Outline Memory Write Ability and Storage Permanence Common Memory Types Composing Memory Memory Hierarchy and Cache Advanced RAM 1 2 Introduction Memory:

More information

Embedded Systems Design: A Unified Hardware/Software Introduction. Chapter 5 Memory. Outline. Introduction

Embedded Systems Design: A Unified Hardware/Software Introduction. Chapter 5 Memory. Outline. Introduction Hardware/Software Introduction Chapter 5 Memory 1 Outline Memory Write Ability and Storage Permanence Common Memory Types Composing Memory Memory Hierarchy and Cache Advanced RAM 2 Introduction Embedded

More information

Analysis of 8T SRAM Cell Using Leakage Reduction Technique

Analysis of 8T SRAM Cell Using Leakage Reduction Technique Analysis of 8T SRAM Cell Using Leakage Reduction Technique Sandhya Patel and Somit Pandey Abstract The purpose of this manuscript is to decrease the leakage current and a memory leakage power SRAM cell

More information

DYNAMIC CIRCUIT TECHNIQUE FOR LOW- POWER MICROPROCESSORS Kuruva Hanumantha Rao 1 (M.tech)

DYNAMIC CIRCUIT TECHNIQUE FOR LOW- POWER MICROPROCESSORS Kuruva Hanumantha Rao 1 (M.tech) DYNAMIC CIRCUIT TECHNIQUE FOR LOW- POWER MICROPROCESSORS Kuruva Hanumantha Rao 1 (M.tech) K.Prasad Babu 2 M.tech (Ph.d) hanumanthurao19@gmail.com 1 kprasadbabuece433@gmail.com 2 1 PG scholar, VLSI, St.JOHNS

More information

Bus Encoding Technique for hierarchical memory system Anne Pratoomtong and Weiping Liao

Bus Encoding Technique for hierarchical memory system Anne Pratoomtong and Weiping Liao Bus Encoding Technique for hierarchical memory system Anne Pratoomtong and Weiping Liao Abstract In microprocessor-based systems, data and address buses are the core of the interface between a microprocessor

More information

Performance Analysis and Designing 16 Bit Sram Memory Chip Using XILINX Tool

Performance Analysis and Designing 16 Bit Sram Memory Chip Using XILINX Tool Performance Analysis and Designing 16 Bit Sram Memory Chip Using XILINX Tool Monika Solanki* Department of Electronics & Communication Engineering, MBM Engineering College, Jodhpur, Rajasthan Review Article

More information

Column decoder using PTL for memory

Column decoder using PTL for memory IOSR Journal of Electronics and Communication Engineering (IOSR-JECE) e-issn: 2278-2834,p- ISSN: 2278-8735. Volume 5, Issue 4 (Mar. - Apr. 2013), PP 07-14 Column decoder using PTL for memory M.Manimaraboopathy

More information

MEMORY BHARAT SCHOOL OF BANKING- VELLORE

MEMORY BHARAT SCHOOL OF BANKING- VELLORE A memory is just like a human brain. It is used to store data and instructions. Computer memory is the storage space in computer where data is to be processed and instructions required for processing are

More information

Very Large Scale Integration (VLSI)

Very Large Scale Integration (VLSI) Very Large Scale Integration (VLSI) Lecture 8 Dr. Ahmed H. Madian ah_madian@hotmail.com Content Array Subsystems Introduction General memory array architecture SRAM (6-T cell) CAM Read only memory Introduction

More information

Design and Implementation of Low Leakage Power SRAM System Using Full Stack Asymmetric SRAM

Design and Implementation of Low Leakage Power SRAM System Using Full Stack Asymmetric SRAM Design and Implementation of Low Leakage Power SRAM System Using Full Stack Asymmetric SRAM Rajlaxmi Belavadi 1, Pramod Kumar.T 1, Obaleppa. R. Dasar 2, Narmada. S 2, Rajani. H. P 3 PG Student, Department

More information

Memory Design I. Semiconductor Memory Classification. Read-Write Memories (RWM) Memory Scaling Trend. Memory Scaling Trend

Memory Design I. Semiconductor Memory Classification. Read-Write Memories (RWM) Memory Scaling Trend. Memory Scaling Trend Array-Structured Memory Architecture Memory Design I Professor hris H. Kim University of Minnesota Dept. of EE chriskim@ece.umn.edu 2 Semiconductor Memory lassification Read-Write Memory Non-Volatile Read-Write

More information

Memory Hierarchies. Instructor: Dmitri A. Gusev. Fall Lecture 10, October 8, CS 502: Computers and Communications Technology

Memory Hierarchies. Instructor: Dmitri A. Gusev. Fall Lecture 10, October 8, CS 502: Computers and Communications Technology Memory Hierarchies Instructor: Dmitri A. Gusev Fall 2007 CS 502: Computers and Communications Technology Lecture 10, October 8, 2007 Memories SRAM: value is stored on a pair of inverting gates very fast

More information

Computer Architecture

Computer Architecture Computer Architecture Lecture 7: Memory Hierarchy and Caches Dr. Ahmed Sallam Suez Canal University Spring 2015 Based on original slides by Prof. Onur Mutlu Memory (Programmer s View) 2 Abstraction: Virtual

More information

An Integrated Circuit/Architecture Approach to Reducing Leakage in Deep-Submicron High-Performance I-Caches

An Integrated Circuit/Architecture Approach to Reducing Leakage in Deep-Submicron High-Performance I-Caches To appear in the Proceedings of the Seventh International Symposium on High-Performance Computer Architecture (HPCA), 2001. An Integrated Circuit/Architecture Approach to Reducing Leakage in Deep-Submicron

More information

Se-Hyun Yang, Michael Powell, Babak Falsafi, Kaushik Roy, and T. N. Vijaykumar

Se-Hyun Yang, Michael Powell, Babak Falsafi, Kaushik Roy, and T. N. Vijaykumar AN ENERGY-EFFICIENT HIGH-PERFORMANCE DEEP-SUBMICRON INSTRUCTION CACHE Se-Hyun Yang, Michael Powell, Babak Falsafi, Kaushik Roy, and T. N. Vijaykumar School of Electrical and Computer Engineering Purdue

More information

Lossless Compression using Efficient Encoding of Bitmasks

Lossless Compression using Efficient Encoding of Bitmasks Lossless Compression using Efficient Encoding of Bitmasks Chetan Murthy and Prabhat Mishra Department of Computer and Information Science and Engineering University of Florida, Gainesville, FL 326, USA

More information

Summer 2003 Lecture 18 07/09/03

Summer 2003 Lecture 18 07/09/03 Summer 2003 Lecture 18 07/09/03 NEW HOMEWORK Instruction Execution Times: The 8088 CPU is a synchronous machine that operates at a particular clock frequency. In the case of the original IBM PC, that clock

More information

CHAPTER 6 Memory. CMPS375 Class Notes Page 1/ 16 by Kuo-pao Yang

CHAPTER 6 Memory. CMPS375 Class Notes Page 1/ 16 by Kuo-pao Yang CHAPTER 6 Memory 6.1 Memory 233 6.2 Types of Memory 233 6.3 The Memory Hierarchy 235 6.3.1 Locality of Reference 237 6.4 Cache Memory 237 6.4.1 Cache Mapping Schemes 239 6.4.2 Replacement Policies 247

More information

Using a Victim Buffer in an Application-Specific Memory Hierarchy

Using a Victim Buffer in an Application-Specific Memory Hierarchy Using a Victim Buffer in an Application-Specific Memory Hierarchy Chuanjun Zhang Depment of lectrical ngineering University of California, Riverside czhang@ee.ucr.edu Frank Vahid Depment of Computer Science

More information

An Energy-Efficient High-Performance Deep-Submicron Instruction Cache

An Energy-Efficient High-Performance Deep-Submicron Instruction Cache An Energy-Efficient High-Performance Deep-Submicron Instruction Cache Michael D. Powell ϒ, Se-Hyun Yang β1, Babak Falsafi β1,kaushikroy ϒ, and T. N. Vijaykumar ϒ ϒ School of Electrical and Computer Engineering

More information

Design and Simulation of Low Power 6TSRAM and Control its Leakage Current Using Sleepy Keeper Approach in different Topology

Design and Simulation of Low Power 6TSRAM and Control its Leakage Current Using Sleepy Keeper Approach in different Topology Vol. 3, Issue. 3, May.-June. 2013 pp-1475-1481 ISSN: 2249-6645 Design and Simulation of Low Power 6TSRAM and Control its Leakage Current Using Sleepy Keeper Approach in different Topology Bikash Khandal,

More information

William Stallings Computer Organization and Architecture 8th Edition. Chapter 5 Internal Memory

William Stallings Computer Organization and Architecture 8th Edition. Chapter 5 Internal Memory William Stallings Computer Organization and Architecture 8th Edition Chapter 5 Internal Memory Semiconductor Memory The basic element of a semiconductor memory is the memory cell. Although a variety of

More information

An Approach for Adaptive DRAM Temperature and Power Management

An Approach for Adaptive DRAM Temperature and Power Management IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS 1 An Approach for Adaptive DRAM Temperature and Power Management Song Liu, Yu Zhang, Seda Ogrenci Memik, and Gokhan Memik Abstract High-performance

More information