Cache Memory Configurations and Their Respective Energy Consumption


Dylan Petrae
Department of Electrical and Computer Engineering
University of Central Florida
Orlando, FL

Abstract

When it comes to accessing data in a computer system, the memory hierarchy becomes critical. Accessing data held in larger-capacity memory, or in the Last Level Cache (LLC) of the system, quickly becomes time- and energy-consuming. The purpose of computer systems is to provide accurate, quick, and efficient data manipulation, and accessing these LLCs is becoming a bottleneck for fast computation. This paper analyzes the benefits of using Spin-Transfer Torque Random Access Memory (STT-RAM) instead of Static RAM (SRAM) in different levels of cache, and how different applications and cache configurations lead to different cache latency and energy consumption.

Keywords: CPU, DRAM, SRAM, STT-RAM, Cache, Registers, Memory Hierarchy, Blocks, Lines, Hit Rate, Miss Rate, Associativity, Direct-Mapped Cache, Fully Associative, Set Associative, Volatile Memory, Non-Volatile Memory

I. INTRODUCTION

On the motherboard, there is the central processing unit (CPU) and dynamic random access memory (DRAM). Instructions and data are stored in the DRAM and need to be referenced and accessed by the CPU, since there is limited space in the CPU. This limited space introduces the need for a memory hierarchy to improve the efficiency of the machine. The closer a memory component is to the CPU, the faster its access time, since signals have a shorter distance to travel. Another contributing factor to the speed of memory access is the size of the memory device in question. Smaller memory, such as the limited number of registers in the CPU itself, can be accessed much faster than the much larger memory space of DRAM, since there are fewer locations and fewer address bits to decode in order to find the data being sought.
Between the registers of the CPU and the DRAM in the memory hierarchy sits another element of memory, called cache. Cache also appears in multiple levels. Since programs access only a small portion of the address space at a time, it is beneficial to place the more frequently used parts of the program in the memory devices that are closest to the CPU and that have a smaller memory space. The cache levels (ordered L1, L2, L3, etc.) that are closer to the CPU and smaller in memory space can be accessed faster than the higher-numbered levels of cache. Following the 90/10 rule, which states that 90% of the work a program does stems from 10% of its code, you would put frequently used parts of the program, such as loops, inside the L1 cache. The goal is to access the farther elements of memory as seldom as possible.

Two metrics measure the effectiveness of memory access. Hit rate is the fraction of memory accesses found in cache. The higher this metric is, the faster the program runs, since the data you are trying to access is in the first place you look for it. The second metric is miss rate, which is simply the fraction of memory accesses NOT found at that level of the memory hierarchy, and which can be derived as 1 - Hit Rate.

Associativity is a technique to reduce conflict misses, which occur when more than one memory location maps to the same location in cache, and thereby improve hit rate. There are three design strategies for cache associativity that can be used in different applications.

The first design for cache block placement is direct-mapped cache, which actually contains no associativity. In a direct-mapped design, each memory block maps to exactly one cache line. The advantage of this block placement is that it uses a bitwise mod of the address to find the sought line, while the disadvantage is that conflict misses will lower the hit rate.

The second design is fully associative, which is unrestricted associativity. The advantage of this design is that it provides a fully flexible mapping, which extends up to the capacity of the cache. The disadvantage is that it has the largest tag field, which results in a higher number of comparisons and yields a longer (slower) access time for the tag search.

The final design is set associative, which is bounded associativity. This design is a hybrid of the previous two block placement designs. It is also known as k-way set associative, where k lines can store each block. The advantage of this hybrid is that it balances the flexibility of fully associative placement against the complexity of tag matching.

The contents of the cache are accessed in different ways under the different strategies. For direct-mapped, let the width of the address bus be a bits and let each word contain 2^w bytes. The cache size is 2^n blocks, so n bits are needed to express the line index. The block size is 2^m words, so m bits are needed to indicate the word index within the block. The tag field size is the difference between the width of the address bus and the sum of n, m, and w, that is, a - (n + m + w) bits.

Figure 2. Set-associative cache fields

There is a distinction between device technologies that are labeled volatile and non-volatile. Volatile memory is RAM that requires a voltage supply to maintain its values. Two volatile devices are static RAM (SRAM) and dynamic RAM (DRAM). SRAM storage relies on transistors; DRAM, on the other hand, relies on capacitors. STT-RAM is non-volatile memory. Non-volatile memory is used primarily in secondary storage devices located farther from the CPU.
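Returning to the cache organization described earlier in this section, the field arithmetic and the block placement strategies can be illustrated with a small simulator. This is a minimal sketch under stated assumptions, not a model from any of the cited papers: the class name `Cache`, its parameter names, the use of byte addresses, and the LRU replacement policy are all hypothetical choices made for illustration. Setting `ways=1` gives a direct-mapped cache.

```python
from collections import OrderedDict


class Cache:
    """Toy k-way set-associative cache (ways=1 is direct-mapped).

    Hypothetical illustration: byte addresses and LRU replacement are assumed.
    """

    def __init__(self, addr_bits, index_bits, word_bits, byte_bits, ways=1):
        self.index_bits = index_bits              # n (or s): selects the line/set
        self.offset_bits = word_bits + byte_bits  # m + w: position inside a block
        # Tag size = a - (n + m + w), as described in the text.
        self.tag_bits = addr_bits - (index_bits + word_bits + byte_bits)
        self.ways = ways
        # One ordered dict of tags per set; insertion order tracks recency.
        self.sets = [OrderedDict() for _ in range(2 ** index_bits)]
        self.hits = 0
        self.accesses = 0

    def split(self, addr):
        """Return the (tag, index, offset) fields of a byte address."""
        offset = addr & ((1 << self.offset_bits) - 1)
        index = (addr >> self.offset_bits) & ((1 << self.index_bits) - 1)
        tag = addr >> (self.offset_bits + self.index_bits)
        return tag, index, offset

    def access(self, addr):
        """Look up addr; return True on a hit, False on a miss."""
        tag, index, _ = self.split(addr)
        lines = self.sets[index]
        self.accesses += 1
        if tag in lines:
            lines.move_to_end(tag)      # mark as most recently used
            self.hits += 1
            return True
        if len(lines) >= self.ways:
            lines.popitem(last=False)   # evict the least recently used tag
        lines[tag] = True
        return False

    @property
    def hit_rate(self):
        return self.hits / self.accesses if self.accesses else 0.0

    @property
    def miss_rate(self):
        return 1.0 - self.hit_rate


# Two addresses that share an index but differ in tag keep evicting each
# other in a direct-mapped cache; a 2-way cache with the same number of
# lines (half as many sets) can hold both.
trace = [0x0000, 0x1000, 0x0000, 0x1000]
direct = Cache(addr_bits=32, index_bits=6, word_bits=4, byte_bits=2, ways=1)
two_way = Cache(addr_bits=32, index_bits=5, word_bits=4, byte_bits=2, ways=2)
for a in trace:
    direct.access(a)
    two_way.access(a)
print(direct.tag_bits, direct.hit_rate, two_way.hit_rate)  # 20 0.0 0.5
```

With a 32-bit address, 64 lines, and 64-byte blocks, the tag is 32 - (6 + 4 + 2) = 20 bits. On this trace the direct-mapped cache misses every time because of the conflict, while the 2-way cache hits on the two repeated accesses.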
Since it does not need a voltage supply to maintain its values, non-volatile memory is well suited to long-term storage that must be preserved when the power is turned off.

In this paper, cache latency and energy consumption are the two metrics covered. Energy consumption is measured in nanojoules (nJ) or joules (J) and is the energy required to access memory and process data. Cache latency is measured in nanoseconds (ns) or seconds. Eight sources will be identified and analyzed, examining these two metrics and how improvements have been made from the year 2000 through today with the potential switch from SRAM-based cache designs to STT-RAM.

Figure 1. Direct-mapped cache fields

The set-associative strategy is laid out in a similar way to direct-mapped, but the line (n) field is replaced with a set (s) field. The s-bit set field selects one of 2^s sets, so the cache holds 2^s sets. Every block of memory maps to exactly one set, but it can be placed in any of the k lines (ways) of that set, which gives each block k possible locations in the cache.

II. LITERATURE REVIEW

In the last 10 years, we have seen Spin-Transfer Torque Random Access Memory (STT-RAM) emerge as a new memory design. It has a multitude of trade-offs versus one of its predecessors, SRAM. SRAM has equal read and write speeds, but STT-RAM, since it is non-volatile, has slow write speeds and high dynamic energy. However, STT-RAM needs no standby power and has high density and low leakage power, as opposed to SRAM. We are reaching a technological limit where speeds cannot get much faster, because older RAM technologies rely on the level of electron charge to represent data. The new spintronic devices instead use the angular momentum of an electron.

Magnetic Tunnel Junctions (MTJs) are what comprise an STT-RAM device. The magnetization direction determines the state stored in the cell, and switching the magnetization direction switches the value from a 1 to a 0. If the reference layer and storage layer are magnetized in the same direction, the MTJ is in a low-resistance state and the data value is 0. When they are magnetized in opposite directions, it is in a high-resistance state and has a value of 1. With STT-RAM, the device only needs to send a magnetic field, where the current is orthogonal to the magnetic field, over the MTJs that need to be altered. STT-RAM can result in slower read and write latency and higher energy usage (nJ), but it offers higher density and low leakage power [4].

Multi-level cells (MLC) have been proposed for the STT-RAM design. This would increase the density of the cell, but create a write disturbance. Since the soft bits of the cell have to be reset right after the hard bits change, these resets can become cumbersome and create an energy consumption issue. A possible solution is overwriting these soft bits using a read reuse distance (RRD) replacement policy, which uses the instructions in cache to predict the reuse of blocks [3].

Another technology that will be analyzed is the eDRAM design for memory, and how the probability of failure of different technology nodes correlates with their retention time. Evidence shows that a smaller (lower-nanometer) technology node is more likely to suffer retention failure than a larger node [2].

In 2013, a study was carried out to find whether eDRAM could be a viable alternative to SRAM. While eDRAM has low leakage energy and a higher density, its frequent refreshes become the primary energy consumer. The results show that with proper control of the refresh functions, eDRAM is potentially a very viable memory device technology because it can be much more energy efficient [9].

In 2012, studies were conducted to determine whether STT-RAM can outperform SRAM, despite its slow write speeds and its longer retention duration. Figure 3 is a graph that compares SRAM to STT-RAM with varying retention times, and also to an iteration of STT-RAM with a refresh period. The total energy is reduced because of the simultaneous reduction in leakage power [6]. The design of STT-RAM can also be improved by adding a refresh period to compensate for the shortened retention time. Even though the dynamic energy increases because of more frequent writing within the caches, it is negligible compared to total energy usage.

A similar study in 2011 investigated trading the non-volatility of STT-RAM for a more energy- and performance-efficient device, using the aforementioned refresh scheme and a reduced retention time. Figure 4 shows how much lower the leakage power is for STT-RAM compared to SRAM. The study's system also used different retention levels at different cache levels, optimizing for their differing access patterns [8].

In 2006, a study suggested that an increase in data cache size correlates with more efficient data sharing. This correlation means that it is beneficial to have a shared last level cache (LLC), as opposed to an LLC partitioned into numerous private caches [7]. The latency between the LLC and the memory devices beyond it can be significantly reduced by exploiting the increased efficiency of data sharing.

Looking a little further into the past, a 1994 study examined Concurrent-Read-Exclusive-Write (CREW) access versus Concurrent-Read-Concurrent-Write (CRCW) access, comparing the percentage of reads against the time (in nanoseconds) needed to complete them. It found that the fastest access time comes from a cache-to-cache scheme for transferring data, but that similar results can be achieved with directory schemes that are much less complex [1].
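The energy trade-offs discussed in this review can be summarized with a simple decomposition of total cache energy. The expression below is a generic sketch, not a formula taken from the cited studies, and every symbol in it is an assumption introduced here: P_leak is leakage power, T is execution time, N and E are access counts and per-access energies, and the last term accounts for refresh operations where a design needs them.

```latex
E_{\text{total}}
  = \underbrace{P_{\text{leak}} \, T}_{\text{static (leakage)}}
  + \underbrace{N_{\text{read}} E_{\text{read}} + N_{\text{write}} E_{\text{write}}}_{\text{dynamic}}
  + \underbrace{N_{\text{refresh}} E_{\text{refresh}}}_{\text{refresh}}
```

Read this way, the studies above describe moving energy between terms: STT-RAM lowers the leakage term relative to SRAM at the cost of a larger write component, and reducing retention time lowers the write cost while adding a refresh term, as with eDRAM.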

III. DATA ANALYSIS

Figure 3. Total Energy Usage SRAM vs. STT-RAM
(Plot title: Total Energy Usage of SRAM vs. STT-RAM of varying retention times. Vertical axis: normalized energy usage. Configurations: S-1MB, M-4MB, M-4MB(1s), M-4MB(1s), M-4MB(1ms).)

This first graph shows that STT-RAM can become more energy efficient than standard SRAM when the retention time is reduced, which avoids the potentially large energy overhead that STT-RAM incurs from its slow write speeds.

Figure 4. SRAM vs STT-RAM power leakage
(Plot title: Power Leakage of SRAM vs. STT-RAM. Vertical axis: leakage power (mW). Bars: SRAM, low-retention STT-RAM, medium-retention STT-RAM, high-retention STT-RAM.)

This graph shows how much lower the leakage power is for STT-RAM than for SRAM, which benefits overall power consumption.

Figure 5. R/W latencies for SRAM, STT-RAM, and eDRAM
(Plot title: Read and Write Latency Discrepancies Between SRAM, STT-RAM, and eDRAM. Vertical axis: latency (ns). Series: read latency, write latency.)

Read and write speeds are fairly symmetrical for SRAM and eDRAM; the write latency is where STT-RAM differs greatly. The write speed for STT-RAM is drastically slower.

Figure 6. Areas of SRAM, STT-RAM, and eDRAM
(Plot title: Area of SRAM, STT-RAM, and eDRAM Technologies. Vertical axis: area (mm^2).)

The areas shown in this graph for the different memory device technologies support the idea that STT-RAM is a much denser option for cache, which can benefit fabrication costs as well as data access.

TABLE I. COMPARING MEMORY SPACE AND LATENCY BETWEEN CACHE LEVELS

Each cache level (Level 1 (L1) for Instruction (I) or Data (D), Level 2 (L2), and Level 3 (L3) or Last Level Cache (LLC)) is described by capacity, set associativity, device technology, number of cache lines (# of CL), and protocol.

Processor and L1:

Source        | # of cores | Freq.  | Capacity        | Set Assoc. | Device Tech. | # of CL | Protocol
Khoshavi [2]  | 8          | 3GHz   | 32KB            | 8-way      | SRAM         | 512     | MESI
Sun [8]       | 4          | 2GHz   | 32KB            | 4-way      | SRAM         | 512     | N/A
Crawford [1]  | N/A        | N/A    | N/A             | N/A        | N/A          | N/A     | N/A
Chen [3]      | 4          | 3.3GHz | 32KB            | 8-way      | SRAM         |         |
Khoshavi [4]  | 8          | 3GHz   | 32KB            | 8-way      | SRAM         | 512     |
Jog [6]       | 4          | 2GHz   | 32KB (per core) | 4-way      | SRAM         | 512     |
Jaleel [7]    | 8          | N/A    | 32KB            | 4-way      | DRAM         | 512     |
Chang [9]     | 8          | 2GHz   | 32KB            | 8-way      | SRAM         |         |
Sun [10]      | 8          | 2GHz   | 16KB            | 2-way      | SRAM         | 256     |
Lin [5]       | 2          | 8 MHz  | 32KB            | 4-way      | DRAM         | N/A     |

L2 and L3 or LLC (rows whose alignment is recoverable):

Source        | L2 Capacity | L2 Set Assoc. | L2 Device Tech. | L2 # of CL | L2 Protocol | L3 Capacity | L3 Set Assoc. | L3 Device Tech. | L3 # of CL | L3 Protocol
Khoshavi [2]  | 512KB       | 8-way         | SRAM            | 8192       | MESI        | 96MB        | 16-way        | eDRAM           | ~1M        | WB
Sun [8]       | 256KB       | 8-way         | SRAM            | 4,096      | N/A         | 4MB         | 16-way        | STT-RAM         | N/A        |
Crawford [1]  | N/A         | N/A           | N/A             | N/A        | N/A         | N/A         | N/A           | N/A             | N/A        | N/A

Remaining L2/L3 entries (row alignment not recoverable), in source order: STT; STT eDRAM; 4MB 8-way STT-RAM N/A; 8-way SRAM N/A; 65,536; N/A (x6); 96MB 16-way eDRAM ~1.5M N/A; 1MB 16-way SRAM 16,384; N/A (x6); Through; 256KB 8-way DRAM MESI; 256KB 8-way SRAM 4,096 MESI; 32MB 16-way; Through; 8MB 32-way STT-RAM; 512KB 16-way DRAM; ,72; 64MB 16-way DRAM ~1M; SRAM STT 524,288 eDRAM; N/A (x10).

Number of cache lines = cache capacity (bytes) / cache line size (bytes), assuming a cache line size of 64 bytes in all cases.
Protocol column = {Write Back (WB), Write Through (WT), MESI, MOESI, Not Available (N/A)}
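The cache-line formula stated with Table I can be checked with a few lines of code. This is a small sketch: the function name `cache_lines` and the capacity-string parsing are hypothetical helpers, and binary units (1 KB = 1024 bytes) are assumed, which is what reproduces the counts listed in the table.

```python
LINE_SIZE_BYTES = 64  # cache line size assumed for every row of Table I

UNITS = {"KB": 1024, "MB": 1024 ** 2}  # binary units (assumption)


def cache_lines(capacity, line_size=LINE_SIZE_BYTES):
    """Number of cache lines for a capacity string such as '32KB'."""
    value, unit = capacity[:-2], capacity[-2:]
    return int(value) * UNITS[unit] // line_size


for cap in ["16KB", "32KB", "256KB", "512KB", "1MB", "4MB", "32MB", "96MB"]:
    print(f"{cap:>6}: {cache_lines(cap):>9,} lines")
```

For example, a 32KB L1 holds 512 lines and a 512KB L2 holds 8,192 lines, matching the table; 96MB works out to 1,572,864 lines, which is the roughly 1.5M figure that appears for the eDRAM LLC.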

IV. CONCLUSION

I conclude from the reading that the STT-RAM device technology has the capability of being an efficient alternative to SRAM and eDRAM. When it is implemented with a clock-controlled refresh scheme and a lower retention time, it proves to be extremely energy efficient without sacrificing crucial read latency. The devices tend to be much denser than other memory devices, which benefits fabrication costs. With roadblocks approaching from fundamental physical limitations in technology, the industry needs a denser and more energy-efficient memory device.

REFERENCES

Baseline designs:

[1] S. E. Crawford and R. F. DeMara, "Cache coherence in a multiport memory environment," in Proceedings of the Second International Conference on Massively Parallel Computing Systems (MPCS-95), Ischia, Italy, May 2-6.
[2] N. Khoshavi, X. Chen, J. Wang and R. F. DeMara, "Bit-Upset Vulnerability Factor for eDRAM Last Level Cache Immunity Analysis," in Proceedings of 17th International Symposium on Quality Electronic Design (ISQED 2016), Santa Clara, CA, USA, March 15-16, 2016.
[3] X. Chen, N. Khoshavi, J. Zhou, D. Huang, R. F. DeMara, J. Wang, W. Wen and Y. Chen, "AOS: Adaptive Overwrite Scheme for Energy-Efficient MLC STT-RAM Cache," 53rd Design Automation Conference, Austin, TX, USA, 2016.
[4] N. Khoshavi, X. Chen, J. Wang and R. F. DeMara, "Read-Tuned STT-RAM and eDRAM Cache Hierarchies for Throughput and Energy Enhancement," arXiv preprint, 2016.
[5] M. Lin, et al., "ASTRO: Synthesizing application-specific reconfigurable hardware traces to exploit memory-level parallelism," Microprocessors and Microsystems 39.7 (2015).

Comparison designs:

[6] A. Jog, A. K. Mishra, C. Xu, Y. Xie, V. Narayanan, R. Iyer, and C. R. Das, "Cache Revive: Architecting Volatile STT-RAM Caches for Enhanced Performance in CMPs," in Proceedings of 49th Annual Design Automation Conference (DAC), 2012.
[7] A. Jaleel, M. Mattina, and B. Jacob, "Last Level Cache (LLC) Performance of Data Mining Workloads on a CMP-a Case Study of Parallel Bioinformatics Workloads," in Proceedings of 12th International Symposium on High Performance Computer Architecture (HPCA), 2006.
[8] Z. Sun, X. Bi, H. H. Li, W.-F. Wong, Z.-L. Ong, X. Zhu, and W. Wu, "Multi Retention Level STT-RAM Cache Designs with a Dynamic Refresh Scheme," in Proceedings of 44th Annual IEEE/ACM International Symposium on Microarchitecture, 2011.
[9] M.-T. Chang, P. Rosenfeld, S.-L. Lu, and B. Jacob, "Technology Comparison for Large Last-level Caches (L3Cs): Low-leakage SRAM, Low Write-energy STT-RAM, and Refresh-optimized eDRAM," in Proceedings of 19th International Symposium on High Performance Computer Architecture (HPCA), 2013.
[10] Z. Sun, X. Bi, and H. Li, "Process variation aware data management for STT-RAM cache design," in Proceedings of the 2012 ACM/IEEE International Symposium on Low Power Electronics and Design (ISLPED), 2012.

Revolutionizing Technological Devices such as STT- RAM and their Multiple Implementation in the Cache Level Hierarchy

Revolutionizing Technological Devices such as STT- RAM and their Multiple Implementation in the Cache Level Hierarchy Revolutionizing Technological s such as and their Multiple Implementation in the Cache Level Hierarchy Michael Mosquera Department of Electrical and Computer Engineering University of Central Florida Orlando,

More information

Analysis of Cache Configurations and Cache Hierarchies Incorporating Various Device Technologies over the Years

Analysis of Cache Configurations and Cache Hierarchies Incorporating Various Device Technologies over the Years Analysis of Cache Configurations and Cache Hierarchies Incorporating Various Technologies over the Years Sakeenah Khan EEL 30C: Computer Organization Summer Semester Department of Electrical and Computer

More information

A Brief Compendium of On Chip Memory Highlighting the Tradeoffs Implementing SRAM,

A Brief Compendium of On Chip Memory Highlighting the Tradeoffs Implementing SRAM, A Brief Compendium of On Chip Memory Highlighting the Tradeoffs Implementing, RAM, or edram Justin Bates Department of Electrical and Computer Engineering University of Central Florida Orlando, FL 3816-36

More information

Comparisons Of Different Level Of Cache Using Various Technologies From Multiple Reverences

Comparisons Of Different Level Of Cache Using Various Technologies From Multiple Reverences Comparisons Of Different Level Of Cache Using Various Technologies From ultiple Reverences Parameswari Chandrasekar Department of Electrical and Computer Engineering University of Central Florida Orlando,

More information

Cache Memory Introduction and Analysis of Performance Amongst SRAM and STT-RAM from The Past Decade

Cache Memory Introduction and Analysis of Performance Amongst SRAM and STT-RAM from The Past Decade Cache Memory Introduction and Analysis of Performance Amongst S and from The Past Decade Carlos Blandon Department of Electrical and Computer Engineering University of Central Florida Orlando, FL 386-36

More information

A Spherical Placement and Migration Scheme for a STT-RAM Based Hybrid Cache in 3D chip Multi-processors

A Spherical Placement and Migration Scheme for a STT-RAM Based Hybrid Cache in 3D chip Multi-processors , July 4-6, 2018, London, U.K. A Spherical Placement and Migration Scheme for a STT-RAM Based Hybrid in 3D chip Multi-processors Lei Wang, Fen Ge, Hao Lu, Ning Wu, Ying Zhang, and Fang Zhou Abstract As

More information

CS24: INTRODUCTION TO COMPUTING SYSTEMS. Spring 2014 Lecture 14

CS24: INTRODUCTION TO COMPUTING SYSTEMS. Spring 2014 Lecture 14 CS24: INTRODUCTION TO COMPUTING SYSTEMS Spring 2014 Lecture 14 LAST TIME! Examined several memory technologies: SRAM volatile memory cells built from transistors! Fast to use, larger memory cells (6+ transistors

More information

Memory. Lecture 22 CS301

Memory. Lecture 22 CS301 Memory Lecture 22 CS301 Administrative Daily Review of today s lecture w Due tomorrow (11/13) at 8am HW #8 due today at 5pm Program #2 due Friday, 11/16 at 11:59pm Test #2 Wednesday Pipelined Machine Fetch

More information

Mohsen Imani. University of California San Diego. System Energy Efficiency Lab seelab.ucsd.edu

Mohsen Imani. University of California San Diego. System Energy Efficiency Lab seelab.ucsd.edu Mohsen Imani University of California San Diego Winter 2016 Technology Trend for IoT http://www.flashmemorysummit.com/english/collaterals/proceedi ngs/2014/20140807_304c_hill.pdf 2 Motivation IoT significantly

More information

Memory hierarchy Outline

Memory hierarchy Outline Memory hierarchy Outline Performance impact Principles of memory hierarchy Memory technology and basics 2 Page 1 Performance impact Memory references of a program typically determine the ultimate performance

More information

Large and Fast: Exploiting Memory Hierarchy

Large and Fast: Exploiting Memory Hierarchy CSE 431: Introduction to Operating Systems Large and Fast: Exploiting Memory Hierarchy Gojko Babić 10/5/018 Memory Hierarchy A computer system contains a hierarchy of storage devices with different costs,

More information

Area, Power, and Latency Considerations of STT-MRAM to Substitute for Main Memory

Area, Power, and Latency Considerations of STT-MRAM to Substitute for Main Memory Area, Power, and Latency Considerations of STT-MRAM to Substitute for Main Memory Youngbin Jin, Mustafa Shihab, and Myoungsoo Jung Computer Architecture and Memory Systems Laboratory Department of Electrical

More information

LECTURE 11. Memory Hierarchy

LECTURE 11. Memory Hierarchy LECTURE 11 Memory Hierarchy MEMORY HIERARCHY When it comes to memory, there are two universally desirable properties: Large Size: ideally, we want to never have to worry about running out of memory. Speed

More information

MEMORY. Objectives. L10 Memory

MEMORY. Objectives. L10 Memory MEMORY Reading: Chapter 6, except cache implementation details (6.4.1-6.4.6) and segmentation (6.5.5) https://en.wikipedia.org/wiki/probability 2 Objectives Understand the concepts and terminology of hierarchical

More information

OAP: An Obstruction-Aware Cache Management Policy for STT-RAM Last-Level Caches

OAP: An Obstruction-Aware Cache Management Policy for STT-RAM Last-Level Caches OAP: An Obstruction-Aware Cache Management Policy for STT-RAM Last-Level Caches Jue Wang, Xiangyu Dong, Yuan Xie Department of Computer Science and Engineering, Pennsylvania State University Qualcomm Technology,

More information

Evaluating STT-RAM as an Energy-Efficient Main Memory Alternative

Evaluating STT-RAM as an Energy-Efficient Main Memory Alternative Evaluating STT-RAM as an Energy-Efficient Main Memory Alternative Emre Kültürsay *, Mahmut Kandemir *, Anand Sivasubramaniam *, and Onur Mutlu * Pennsylvania State University Carnegie Mellon University

More information

Improving Energy Efficiency of Write-asymmetric Memories by Log Style Write

Improving Energy Efficiency of Write-asymmetric Memories by Log Style Write Improving Energy Efficiency of Write-asymmetric Memories by Log Style Write Guangyu Sun 1, Yaojun Zhang 2, Yu Wang 3, Yiran Chen 2 1 Center for Energy-efficient Computing and Applications, Peking University

More information

A Low-Power Hybrid Magnetic Cache Architecture Exploiting Narrow-Width Values

A Low-Power Hybrid Magnetic Cache Architecture Exploiting Narrow-Width Values A Low-Power Hybrid Magnetic Cache Architecture Exploiting Narrow-Width Values Mohsen Imani, Abbas Rahimi, Yeseong Kim, Tajana Rosing Computer Science and Engineering, UC San Diego, La Jolla, CA 92093,

More information

Computer Architecture and System Software Lecture 09: Memory Hierarchy. Instructor: Rob Bergen Applied Computer Science University of Winnipeg

Computer Architecture and System Software Lecture 09: Memory Hierarchy. Instructor: Rob Bergen Applied Computer Science University of Winnipeg Computer Architecture and System Software Lecture 09: Memory Hierarchy Instructor: Rob Bergen Applied Computer Science University of Winnipeg Announcements Midterm returned + solutions in class today SSD

More information

A Coherent Hybrid SRAM and STT-RAM L1 Cache Architecture for Shared Memory Multicores

A Coherent Hybrid SRAM and STT-RAM L1 Cache Architecture for Shared Memory Multicores A Coherent Hybrid and L1 Cache Architecture for Shared Memory Multicores Jianxing Wang, Yenni Tim Weng-Fai Wong, Zhong-Liang Ong Zhenyu Sun, Hai (Helen) Li School of Computing Swanson School of Engineering

More information

Emerging NVM Memory Technologies

Emerging NVM Memory Technologies Emerging NVM Memory Technologies Yuan Xie Associate Professor The Pennsylvania State University Department of Computer Science & Engineering www.cse.psu.edu/~yuanxie yuanxie@cse.psu.edu Position Statement

More information

LECTURE 10: Improving Memory Access: Direct and Spatial caches

LECTURE 10: Improving Memory Access: Direct and Spatial caches EECS 318 CAD Computer Aided Design LECTURE 10: Improving Memory Access: Direct and Spatial caches Instructor: Francis G. Wolff wolff@eecs.cwru.edu Case Western Reserve University This presentation uses

More information

CENG4480 Lecture 09: Memory 1

CENG4480 Lecture 09: Memory 1 CENG4480 Lecture 09: Memory 1 Bei Yu byu@cse.cuhk.edu.hk (Latest update: November 8, 2017) Fall 2017 1 / 37 Overview Introduction Memory Principle Random Access Memory (RAM) Non-Volatile Memory Conclusion

More information

CS 320 February 2, 2018 Ch 5 Memory

CS 320 February 2, 2018 Ch 5 Memory CS 320 February 2, 2018 Ch 5 Memory Main memory often referred to as core by the older generation because core memory was a mainstay of computers until the advent of cheap semi-conductor memory in the

More information

Couture: Tailoring STT-MRAM for Persistent Main Memory. Mustafa M Shihab Jie Zhang Shuwen Gao Joseph Callenes-Sloan Myoungsoo Jung

Couture: Tailoring STT-MRAM for Persistent Main Memory. Mustafa M Shihab Jie Zhang Shuwen Gao Joseph Callenes-Sloan Myoungsoo Jung Couture: Tailoring STT-MRAM for Persistent Main Memory Mustafa M Shihab Jie Zhang Shuwen Gao Joseph Callenes-Sloan Myoungsoo Jung Executive Summary Motivation: DRAM plays an instrumental role in modern

More information

International Journal of Information Research and Review Vol. 05, Issue, 02, pp , February, 2018

International Journal of Information Research and Review Vol. 05, Issue, 02, pp , February, 2018 International Journal of Information Research and Review, February, 2018 International Journal of Information Research and Review Vol. 05, Issue, 02, pp.5221-5225, February, 2018 RESEARCH ARTICLE A GREEN

More information

CSE 431 Computer Architecture Fall Chapter 5A: Exploiting the Memory Hierarchy, Part 1

CSE 431 Computer Architecture Fall Chapter 5A: Exploiting the Memory Hierarchy, Part 1 CSE 431 Computer Architecture Fall 2008 Chapter 5A: Exploiting the Memory Hierarchy, Part 1 Mary Jane Irwin ( www.cse.psu.edu/~mji ) [Adapted from Computer Organization and Design, 4 th Edition, Patterson

More information

Lecture-14 (Memory Hierarchy) CS422-Spring

Lecture-14 (Memory Hierarchy) CS422-Spring Lecture-14 (Memory Hierarchy) CS422-Spring 2018 Biswa@CSE-IITK The Ideal World Instruction Supply Pipeline (Instruction execution) Data Supply - Zero-cycle latency - Infinite capacity - Zero cost - Perfect

More information

CPS101 Computer Organization and Programming Lecture 13: The Memory System. Outline of Today s Lecture. The Big Picture: Where are We Now?

CPS101 Computer Organization and Programming Lecture 13: The Memory System. Outline of Today s Lecture. The Big Picture: Where are We Now? cps 14 memory.1 RW Fall 2 CPS11 Computer Organization and Programming Lecture 13 The System Robert Wagner Outline of Today s Lecture System the BIG Picture? Technology Technology DRAM A Real Life Example

More information

Chapter 2: Memory Hierarchy Design Part 2

Chapter 2: Memory Hierarchy Design Part 2 Chapter 2: Memory Hierarchy Design Part 2 Introduction (Section 2.1, Appendix B) Caches Review of basics (Section 2.1, Appendix B) Advanced methods (Section 2.3) Main Memory Virtual Memory Fundamental

More information

Computer Architecture Memory hierarchies and caches

Computer Architecture Memory hierarchies and caches Computer Architecture Memory hierarchies and caches S Coudert and R Pacalet January 23, 2019 Outline Introduction Localities principles Direct-mapped caches Increasing block size Set-associative caches

More information

CS311 Lecture 21: SRAM/DRAM/FLASH

CS311 Lecture 21: SRAM/DRAM/FLASH S 14 L21-1 2014 CS311 Lecture 21: SRAM/DRAM/FLASH DARM part based on ISCA 2002 tutorial DRAM: Architectures, Interfaces, and Systems by Bruce Jacob and David Wang Jangwoo Kim (POSTECH) Thomas Wenisch (University

More information

Phase Change Memory An Architecture and Systems Perspective

Phase Change Memory An Architecture and Systems Perspective Phase Change Memory An Architecture and Systems Perspective Benjamin C. Lee Stanford University bcclee@stanford.edu Fall 2010, Assistant Professor @ Duke University Benjamin C. Lee 1 Memory Scaling density,

More information

Chapter 2: Memory Hierarchy Design Part 2

Chapter 2: Memory Hierarchy Design Part 2 Chapter 2: Memory Hierarchy Design Part 2 Introduction (Section 2.1, Appendix B) Caches Review of basics (Section 2.1, Appendix B) Advanced methods (Section 2.3) Main Memory Virtual Memory Fundamental

More information

CS 33. Architecture and Optimization (3) CS33 Intro to Computer Systems XVI 1 Copyright 2018 Thomas W. Doeppner. All rights reserved.
