Analysis of Cache Configurations and Cache Hierarchies Incorporating Various Device Technologies over the Years


Sakeenah Khan
EEL 30C: Computer Organization, Summer Semester
Department of Electrical and Computer Engineering
University of Central Florida, Orlando, FL

Abstract

The objective of this paper was to evaluate fundamental metrics for the cache configurations of selected studies, observe trends in those configurations, and compare design approaches and device technologies. The metrics studied include read and write energy consumption, read latency, k-way set-associativity, and cache capacity. Trends were identified with regard to cache hierarchy, device technology, and k-way set-associativity. Over time, the number of cache levels and the degree of set-associativity have generally increased. While SRAM was originally the predominant device technology, newer technologies such as STT-RAM, ReRAM, eDRAM, and RRAP have replaced SRAM's role in L2 and L3/LLC caches. The performance of the different device technologies was compared by examining the read and write energy and the read latency. STT-RAM on average had the highest energy consumption. RRAP and eDRAM on average had the lowest read latency.

Keywords

fast/slow memory, memory hierarchy, multilevel cache, hit/miss ratio, hit time, miss penalty, associativity, direct-mapped, fully-associative, set-associative, SRAM, STT-RAM, ReRAM, eDRAM, RRAP, cache lines, read/write energy, read latency

I. INTRODUCTION

Primary memory consists of registers, cache, and main memory (DRAM). The registers and cache are located in the CPU and are considered fast memory, while main memory is considered slow memory. This distinction stems from the fact that capacity and speed are opposing properties. A memory hierarchy uses the principle of locality (which states that programs use a small part of their memory space frequently) to create memory that behaves as if it were large, fast, and inexpensive. This is accomplished by storing commonly used data in fast memory and less commonly used data in slow memory.
The cache holds recently and frequently used data for fast reference, and can be extended to a hierarchy of levels. With each level that is farther from the CPU, the speed decreases while the capacity increases. Multilevel caches are useful because they decrease the average memory access time, provided that the requested data is in the cache. Multiple levels optimize the cache by reducing the miss penalty.

FIGURE I. A TYPICAL MEMORY HIERARCHY

If the requested data is located in the cache, it is a hit; if not, it is a miss. The hit ratio is the portion of memory accesses found in the cache, while the miss ratio is the portion of memory accesses not found in the cache. Likewise, the hit time is the time to access data in the cache, and the miss penalty is the time to access data from main memory and replace a block in the cache. When the CPU requests a word from the contents of a read address (RA), the cache is first checked to see whether it holds the block containing the RA's contents; if it is a hit, the memory access is a fast process. If it is a miss, main memory must be accessed, the block containing the RA's contents is read and transferred to the cache, and finally the requested word is forwarded to the processor. This is a longer process, so in general the hit time is much smaller than the miss penalty.

Associativity is a design approach that provides flexibility by associating a block of memory with a corresponding line in the cache. There are three design strategies pertaining to cache associativity: direct-mapped, fully-associative, and set-associative. In the direct-mapped strategy, there is no associativity; each block is mapped to only one possible line. When data that is in the cache is referenced, it is located by the tag and the line index. The tag and line index are both derived from the memory address.
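As an illustration of the direct-mapped lookup just described, the following minimal Python sketch splits an address into tag, line index, and byte offset and counts hits and misses. The 64-byte line size, the function and class names, and the example trace are assumptions made for illustration only. The `average_access_time` helper uses the standard textbook relation (hit time plus miss ratio times miss penalty), which the paper describes only qualitatively.

```python
LINE_SIZE = 64    # bytes per cache line (assumed for illustration)
NUM_LINES = 512   # e.g. a 32 KB cache: 32 * 1024 / 64


def split_address(addr, line_size=LINE_SIZE, num_lines=NUM_LINES):
    """Return (tag, line_index, byte_offset) for a direct-mapped cache."""
    offset = addr % line_size     # position of the byte inside the line
    block = addr // line_size     # memory block number
    index = block % num_lines     # the single line this block can occupy
    tag = block // num_lines      # distinguishes blocks sharing that line
    return tag, index, offset


class DirectMappedCache:
    """Hypothetical direct-mapped cache model that only tracks tags."""

    def __init__(self, num_lines=NUM_LINES, line_size=LINE_SIZE):
        self.num_lines = num_lines
        self.line_size = line_size
        self.tags = [None] * num_lines   # one stored tag per line
        self.hits = 0
        self.misses = 0

    def access(self, addr):
        """Return True on a hit; on a miss, overwrite the indexed line."""
        tag, index, _ = split_address(addr, self.line_size, self.num_lines)
        if self.tags[index] == tag:
            self.hits += 1
            return True
        self.tags[index] = tag   # the line index selects the victim line
        self.misses += 1
        return False


def average_access_time(hit_time, miss_ratio, miss_penalty):
    """Standard relation: hit time + miss ratio * miss penalty."""
    return hit_time + miss_ratio * miss_penalty


if __name__ == "__main__":
    cache = DirectMappedCache(num_lines=4, line_size=16)
    for address in [0, 4, 64, 0, 16, 64]:
        print(address, "hit" if cache.access(address) else "miss")
    print("hit ratio:", cache.hits / (cache.hits + cache.misses))
```

In the small trace, addresses 0 and 64 map to the same line and repeatedly evict each other, which is the conflict-miss behaviour associated with the direct-mapped strategy.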
Under this strategy, there are more conflict misses, which occur when multiple memory locations are mapped to the same cache location. If the memory access is a miss and the cache is full, a capacity miss has occurred. The data that was just accessed will replace a line in the cache (the line is selected by the line index; its tag and data are overwritten).

In the fully-associative strategy, there is unrestricted associativity; any block of memory can be mapped to any line of the cache. This allows fully flexible mapping up to the capacity of the cache; however, since this strategy has the largest tag field, the tag search takes longer. There are no conflict misses, only capacity misses.

In the set-associative strategy (referred to as k-way set-associative), there is bounded associativity. Each block of memory is mapped to one set. There are k lines in each set, so a given block of memory can be stored in any one of the k lines. When data that is in the cache is referenced, it is located by the set and the tag. Some conflict misses can occur, but fewer than under the direct-mapped strategy.

For both the fully-associative and set-associative strategies, if there is a capacity miss, the blocks most likely to be used are kept in the cache. Either the LRU (least recently used) or LFU (least frequently used) block is replaced.

The device technologies studied in this paper include SRAM, STT-RAM, eDRAM, RRAP, and ReRAM. SRAM, eDRAM, and RRAP are volatile memory technologies, meaning they require a voltage supply to maintain values. STT-RAM and ReRAM are non-volatile [].

Within this paper, thirteen studies with unique cache configurations were analyzed and compared. These studies span from 1995 to 2016. Section II summarizes the literature and observes trends in cache hierarchies, device technologies, and set-associativity. Section III analyzes the data and observes how different device technologies have different energy consumptions and read latencies.

II. LITERATURE REVIEW

Thirteen research studies were analyzed throughout this paper. The studies span from 1995 to 2016. Most of the studies' cache configurations employed set-associativity strategies. From 2005 to 2007, the three selected studies all had two cache levels.
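The k-way set-associative lookup and LRU replacement described in the introduction, and the k-way figures quoted for the studies in this section, can be illustrated with a small simulation. This is a minimal sketch under stated assumptions: the class name, the 64-byte default line size, and the example trace are hypothetical and are not taken from any of the surveyed studies.

```python
from collections import OrderedDict


class SetAssociativeCache:
    """Hypothetical k-way set-associative cache with LRU replacement."""

    def __init__(self, capacity_bytes, ways, line_size=64):
        num_lines = capacity_bytes // line_size
        self.ways = ways
        self.line_size = line_size
        self.num_sets = num_lines // ways
        # One ordered mapping per set: oldest entry first, newest last.
        self.sets = [OrderedDict() for _ in range(self.num_sets)]
        self.hits = 0
        self.misses = 0

    def access(self, addr):
        """Return True on a hit, False on a miss (with LRU eviction)."""
        block = addr // self.line_size
        set_index = block % self.num_sets   # each block maps to one set
        tag = block // self.num_sets        # identifies the block in the set
        lines = self.sets[set_index]
        if tag in lines:
            lines.move_to_end(tag)          # mark as most recently used
            self.hits += 1
            return True
        if len(lines) >= self.ways:
            lines.popitem(last=False)       # evict the least recently used
        lines[tag] = True
        self.misses += 1
        return False


if __name__ == "__main__":
    # A 32 KB, 8-way cache with 64-byte lines has 512 lines in 64 sets.
    l1 = SetAssociativeCache(32 * 1024, ways=8)
    print("sets:", l1.num_sets)

    # Tiny 2-way cache (4 lines, 2 sets) to make evictions visible.
    tiny = SetAssociativeCache(256, ways=2, line_size=64)
    for address in [0, 128, 0, 256, 0, 128]:
        print(address, "hit" if tiny.access(address) else "miss")
```

Setting `ways=1` reduces this model to the direct-mapped strategy, and setting `ways` equal to the total number of lines gives the fully-associative strategy, so the three strategies differ only in how many lines a block may occupy.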
For the 2005 to 2007 studies, the L1 set-associativity was 2-way for sources [11] and [12] and 4-way for source [10]. For L2, the set-associativity was 8-way for sources [10] and [12] and 16-way for source [11].

From 2011 to 2013, SRAM and STT-RAM technologies were popular among the four selected studies. The study from 2011 employed three cache levels [7], while the others employed two cache levels [6, 8, 13]. The L1 device technology in these studies was SRAM, with either 2-way or 4-way set-associativity [7, 6, 8]. The study from 2011 used SRAM for L1, L2, and L3 [7]. In the two studies from 2012, STT-RAM technology was used for L2 with 16-way and 8-way set-associativity [6, 8].

Of the studies from 2016, two had two cache levels [2, 3] while the other two had three [4, 9]. SRAM and STT-RAM were still popular, while new technologies emerged such as eDRAM, RRAP, and ReRAM. Three of the studies used SRAM technology for L1, and all four used 8-way set-associativity for L1 [2, 3, 4, 9]. None of the studies employed SRAM technology for L2 or L3.

Over the past decade, three-level caches became more common. The set-associativity of the L1 cache has generally increased. For most of the studies, SRAM technology was used for the L1 cache. As for L2 and L3, different alternatives were introduced over the decade to replace SRAM technology.

In order to meet the demands for high performance and energy efficiency, a large portion of modern processors is occupied by multilevel SRAM caches. This fast, low-capacity technology is most often employed in the L1 and L2 levels of the cache. However, SRAM's significant leakage power and cell area are great disadvantages [3, 4]. Leakage power can be greatly decreased by replacing SRAM LLCs with nonvolatile memory technologies such as STT-RAM (spin-transfer torque RAM) and ReRAM (resistive RAM). STT-RAM's advantages include its near-zero leakage power, high cell density, and short read access time [3, 4, 6, 8, 9]. However, key drawbacks of STT-RAM include its long write latency and high write energy [6, 8].
ReRAM's largest advantage is its high compatibility with CMOS, which makes it a strong cost competitor to SRAM. However, it has a longer access latency and lower cell endurance than STT-RAM, making it more suitable as the LLC technology in a deep cache hierarchy (e.g. a three-level cache) [9].

More recently, large eDRAM (embedded dynamic RAM) has been introduced as the LLC technology to further alleviate the core-memory speed gap. eDRAM offers a high cache capacity, smaller area, and faster on-chip communication. However, it also has a high refresh demand because the stored value must be kept in a valid state, which increases the dynamic energy consumption [2, 4].

Another recently introduced technology is RRAP (read reference activity persistent), which optimizes the L2 cache and maximizes the benefit of STT-RAM's extra capacity by using a heterogeneous STT-RAM. RRAP accelerates the service to critical load requests from the LLC while also efficiently managing regular L2 cache requests.

III. DATA ANALYSIS

Table I details the information and metrics provided in each of the studies, including the cache hierarchies, cache capacities, set-associativity, device technologies, and protocols. The number of cache lines (CL) is also included in Table I, and was calculated using the following equation (assuming the cache line size is always 64 Bytes):

EQUATION I. [# of CL] = [cache capacity] / (64 Bytes)

Table II contains the read and write energy and the read latency from five studies. The read and write energy is the sum of the read energy and the write energy. To convert the read latency from cycles to ns, the following equation was used:

EQUATION II. [latency (ns)] = [cycles] / [frequency (GHz)]

Figures II and III illustrate the data given in Table II in the form of bar graphs. Figure II shows the read and write energy consumption for different device technologies over the years.
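Equations I and II can be checked with a short calculation. The sketch below is illustrative only: the function names are hypothetical, the capacities are example inputs, and the 6-cycle, 3 GHz latency figure is an assumed input rather than a value reported by any of the studies.

```python
KB = 1024
MB = 1024 * KB


def cache_lines(capacity_bytes, line_size_bytes=64):
    """Equation I: number of cache lines = capacity / line size."""
    return capacity_bytes // line_size_bytes


def cycles_to_ns(cycles, frequency_ghz):
    """Equation II: latency in ns = cycles / frequency in GHz."""
    return cycles / frequency_ghz


if __name__ == "__main__":
    for label, capacity in [("32 KB", 32 * KB), ("512 KB", 512 * KB), ("8 MB", 8 * MB)]:
        print(label, "->", cache_lines(capacity), "cache lines")
    # Assumed example: a 6-cycle read on a 3 GHz core.
    print("6 cycles at 3 GHz ->", cycles_to_ns(6, 3.0), "ns")
```

Because 1 GHz corresponds to one cycle per nanosecond, dividing a cycle count by the clock frequency in GHz gives the latency directly in nanoseconds.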

Figure III shows the cache read latency for different device technologies over the years.

As Figure II shows, using SRAM for L1 requires little energy consumption. The order of technologies from least to most energy consuming on average is SRAM, RRAP, eDRAM, and STT-RAM. STT-RAM's average energy consumption is significantly higher than that of the other technologies. The order of technologies from lowest to highest average read latency is eDRAM, RRAP, SRAM, and STT-RAM. The average read latencies of eDRAM and RRAP are almost half those of SRAM and STT-RAM.

IV. CONCLUSION

After analyzing the literature, clear trends became apparent over time and across device technologies. Most of the studies' cache configurations were set-associative. Over the past decade, three-level caches became more common. The degree of set-associativity generally increased. Across the board, SRAM technology was used for the L1 cache. In earlier literature, SRAM technology was much more prevalent and was often used for every cache level. As years passed, SRAM technology was increasingly replaced in L2 and L3 by nonvolatile technologies such as STT-RAM and ReRAM. Most recently, eDRAM and RRAP technologies have emerged. eDRAM is often used for the LLC, while the RRAP strategy would be used in L2. As observed from the literature, among the different device technologies STT-RAM required the highest average energy consumption, while SRAM and STT-RAM had the highest average read latencies.

REFERENCES

[1] S. E. Crawford and R. F. DeMara, "Cache coherence in a multiport memory environment," in Proceedings of the Second International Conference on Massively Parallel Computing Systems (MPCS-95), Ischia, Italy, May 1995.
[2] N. Khoshavi, X. Chen, J. Wang, and R. F. DeMara, "Bit-Upset Vulnerability Factor for eDRAM Last Level Cache Immunity Analysis," in Proceedings of the 17th International Symposium on Quality Electronic Design (ISQED), Santa Clara, CA, USA, March 2016.
[3] X. Chen, N. Khoshavi, J. Zhou, D. Huang, R. F. DeMara, J. Wang, W. Wen, and Y. Chen, "AOS: Adaptive Overwrite Scheme for Energy-Efficient MLC STT-RAM Cache," 53rd Design Automation Conference, Austin, TX, USA, 2016.
[4] N. Khoshavi, X. Chen, J. Wang, and R. F. DeMara, "Read-Tuned STT-RAM and eDRAM Cache Hierarchies for Throughput and Energy Enhancement," arXiv preprint, 2016.
[5] M. Lin, et al., "ASTRO: Synthesizing application-specific reconfigurable hardware traces to exploit memory-level parallelism," Microprocessors and Microsystems 39.7 (2015).
[6] A. Jog, A. K. Mishra, C. Xu, Y. Xie, V. Narayanan, R. Iyer, and C. R. Das, "Cache Revive: Architecting Volatile STT-RAM Caches for Enhanced Performance in CMPs," in Proceedings of the 49th Annual Design Automation Conference (DAC), 2012.
[7] Z. Sun, X. Bi, H. H. Li, W.-F. Wong, Z.-L. Ong, X. Zhu, and W. Wu, "Multi Retention Level STT-RAM Cache Designs with a Dynamic Refresh Scheme," in Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture, 2011.
[8] Z. Sun, X. Bi, and H. Li, "Process variation aware data management for STT-RAM cache design," in Proceedings of the 2012 ACM/IEEE International Symposium on Low Power Electronics and Design (ISLPED), 2012.
[9] M. R. Jokar, M. Arjomand, and H. Sarbazi-Azad, "Sequoia: High-Endurance NVM-Based Cache Architecture," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2016.
[10] D. Chandra, et al., "Predicting inter-thread cache contention on a chip multi-processor architecture," 11th International Symposium on High-Performance Computer Architecture, 2005.
[11] J. Huh, et al., "A NUCA substrate for flexible CMP cache sharing," IEEE Transactions on Parallel and Distributed Systems 18.8 (2007).
[12] M. K. Qureshi, D. Thompson, and Y. N. Patt, "The V-Way cache: demand-based associativity via global replacement," 32nd International Symposium on Computer Architecture (ISCA'05), 2005.
[13] R. Parihar, et al., "Protection, utilization and collaboration in shared cache through rationing," cs.rochester.edu/u/cding/documents/publications/tr-ration.pdf (2013).
[14] R. F. DeMara, "Memory Hierarchy" [Module PowerPoint], EEL 380C: Computer Organization, University of Central Florida.

TABLE I. METRICS FOR VARIOUS MULTILEVEL CACHE DESIGNS

Study | Year | # of cores | Freq. | L1 (I or D) Capacity | L1 Assoc. | L1 Tech. | L1 # of CL | L1 Protocol | L2 Capacity | L2 Assoc. | L2 Tech. | L2 # of CL | L2 Protocol | L3/LLC Capacity | L3/LLC Assoc. | L3/LLC Tech. | L3/LLC # of CL | L3/LLC Protocol
Crawford [1] | 1995 | N/A | N/A | infinite | N/A | N/A | N/A | CREW/CRCW | infinite | N/A | N/A | N/A | CREW/CRCW | infinite | N/A | N/A | N/A | CREW/CRCW
Khoshavi [2] | 2016 | 8 | 3GHz | 32KB | 8-way | SRAM | 512 | MESI | 512KB | 8-way | SRAM | 8192 | MESI | 96MB | 16-way | eDRAM | ~00M | WB
Chen [3] | 2016 | 4 | 3.3GHz | 32KB | 8-way | SRAM | 512 | WB | 4MB | 8-way | STT-RAM | | WB | N/A | N/A | N/A | N/A | N/A
Khoshavi [4] | 2016 | 8 | 3GHz | 32KB | 8-way | SRAM | 512 | WB | 1024KB | 8-way | RRAP | | WB | 96MB | 16-way | eDRAM | ~00M | WB
Lin [5] | 2015 | N/A | N/A | 32KB | N/A | N/A | 512 | MOESI | 512KB | N/A | N/A | 8192 | MOESI | N/A | N/A | N/A | N/A | N/A
Jog [6] | 2012 | 4 | GHz | 32KB | 4-way | SRAM | 512 | WB | MB or 4MB | 16-way | SRAM or STT-RAM | | WB | N/A | N/A | N/A | N/A | N/A
Sun [7] | 2011 | 4 | GHz | 32KB | 4-way | SRAM | 512 | MESI | 256KB | 8-way | SRAM | 4096 | WB | 4MB | 16-way | SRAM | | WB
Sun [8] | 2012 | 8 | GHz | 16KB | 2-way | SRAM | 256 | WT | 8MB | 32-way | STT-RAM | | WB | N/A | N/A | N/A | N/A | N/A
Jokar [9] | 2016 | | GHz | 32KB | 8-way | N/A | 512 | WB | 256KB or MB | 8-way | STT-RAM | 4096 | WB | 8MB | 8-way | ReRAM | 131072 | WB
Chandra [10] | 2005 | | GHz | 32KB | 4-way | N/A | 512 | WB | 512KB | 8-way | N/A | 8192 | WB | N/A | N/A | N/A | N/A | N/A
Huh [11] | 2007 | N/A | 5GHz | 32KB | 2-way | N/A | 512 | N/A | 256KB | 16-way | N/A | 4096 | N/A | N/A | N/A | N/A | N/A | N/A
Qureshi [12] | 2005 | N/A | N/A | 16KB | 2-way | N/A | 256 | N/A | 256KB | 8-way | N/A | 4096 | N/A | N/A | N/A | N/A | N/A | N/A
Parihar [13] | 2013 | N/A | N/A | 32KB | 2-way | N/A | 512 | N/A | 512KB | 8-way | N/A | 8192 | N/A | N/A | N/A | N/A | N/A | N/A

TABLE II. ENERGY & LATENCY FOR DIFFERENT DEVICE TECHNOLOGIES

Columns: Technology and Details; Read and Write Energy (nJ); Read Latency (ns)
Rows: 32 KB SRAM, L1, 2011 [7]; 256 KB SRAM, L2, 2011 [7]; 4 MB SRAM, L3, 2011 [7]; MB SRAM, L2, 2012 [6]; 32 KB SRAM, L1, 2016 [3]; 4 MB STT-RAM, L2, 2012 [6]; MB STT-RAM, L2, 2016 [3]; MB STT-RAM [9]; LRSC; HRSC; MB eDRAM

FIGURE II. READ AND WRITE ENERGY
[Bar chart: read and write energy (nJ) for the cache configurations and device technologies listed in Table II.]

FIGURE III. READ LATENCY
[Bar chart: read latency (ns) for the cache configurations and device technologies listed in Table II.]


More information

Cache/Memory Optimization. - Krishna Parthaje

Cache/Memory Optimization. - Krishna Parthaje Cache/Memory Optimization - Krishna Parthaje Hybrid Cache Architecture Replacing SRAM Cache with Future Memory Technology Suji Lee, Jongpil Jung, and Chong-Min Kyung Department of Electrical Engineering,KAIST

More information

Advanced Computer Architecture

Advanced Computer Architecture ECE 563 Advanced Computer Architecture Fall 2009 Lecture 3: Memory Hierarchy Review: Caches 563 L03.1 Fall 2010 Since 1980, CPU has outpaced DRAM... Four-issue 2GHz superscalar accessing 100ns DRAM could

More information

Chapter 6 Memory 11/3/2015. Chapter 6 Objectives. 6.2 Types of Memory. 6.1 Introduction

Chapter 6 Memory 11/3/2015. Chapter 6 Objectives. 6.2 Types of Memory. 6.1 Introduction Chapter 6 Objectives Chapter 6 Memory Master the concepts of hierarchical memory organization. Understand how each level of memory contributes to system performance, and how the performance is measured.

More information

Question?! Processor comparison!

Question?! Processor comparison! 1! 2! Suggested Readings!! Readings!! H&P: Chapter 5.1-5.2!! (Over the next 2 lectures)! Lecture 18" Introduction to Memory Hierarchies! 3! Processor components! Multicore processors and programming! Question?!

More information

COSC 6385 Computer Architecture. - Memory Hierarchies (I)

COSC 6385 Computer Architecture. - Memory Hierarchies (I) COSC 6385 Computer Architecture - Hierarchies (I) Fall 2007 Slides are based on a lecture by David Culler, University of California, Berkley http//www.eecs.berkeley.edu/~culler/courses/cs252-s05 Recap

More information

ECE 485/585 Microprocessor System Design

ECE 485/585 Microprocessor System Design Microprocessor System Design Lecture 11: Reducing Hit Time Cache Coherence Zeshan Chishti Electrical and Computer Engineering Dept Maseeh College of Engineering and Computer Science Source: Lecture based

More information

Lecture-14 (Memory Hierarchy) CS422-Spring

Lecture-14 (Memory Hierarchy) CS422-Spring Lecture-14 (Memory Hierarchy) CS422-Spring 2018 Biswa@CSE-IITK The Ideal World Instruction Supply Pipeline (Instruction execution) Data Supply - Zero-cycle latency - Infinite capacity - Zero cost - Perfect

More information

WEEK 7. Chapter 4. Cache Memory Pearson Education, Inc., Hoboken, NJ. All rights reserved.

WEEK 7. Chapter 4. Cache Memory Pearson Education, Inc., Hoboken, NJ. All rights reserved. WEEK 7 + Chapter 4 Cache Memory Location Internal (e.g. processor registers, cache, main memory) External (e.g. optical disks, magnetic disks, tapes) Capacity Number of words Number of bytes Unit of Transfer

More information

A Comparison of Capacity Management Schemes for Shared CMP Caches

A Comparison of Capacity Management Schemes for Shared CMP Caches A Comparison of Capacity Management Schemes for Shared CMP Caches Carole-Jean Wu and Margaret Martonosi Princeton University 7 th Annual WDDD 6/22/28 Motivation P P1 P1 Pn L1 L1 L1 L1 Last Level On-Chip

More information

ECE468 Computer Organization and Architecture. Memory Hierarchy

ECE468 Computer Organization and Architecture. Memory Hierarchy ECE468 Computer Organization and Architecture Hierarchy ECE468 memory.1 The Big Picture: Where are We Now? The Five Classic Components of a Computer Processor Control Input Datapath Output Today s Topic:

More information

COSC 6385 Computer Architecture - Memory Hierarchies (I)

COSC 6385 Computer Architecture - Memory Hierarchies (I) COSC 6385 Computer Architecture - Memory Hierarchies (I) Edgar Gabriel Spring 2018 Some slides are based on a lecture by David Culler, University of California, Berkley http//www.eecs.berkeley.edu/~culler/courses/cs252-s05

More information

I, J A[I][J] / /4 8000/ I, J A(J, I) Chapter 5 Solutions S-3.

I, J A[I][J] / /4 8000/ I, J A(J, I) Chapter 5 Solutions S-3. 5 Solutions Chapter 5 Solutions S-3 5.1 5.1.1 4 5.1.2 I, J 5.1.3 A[I][J] 5.1.4 3596 8 800/4 2 8 8/4 8000/4 5.1.5 I, J 5.1.6 A(J, I) 5.2 5.2.1 Word Address Binary Address Tag Index Hit/Miss 5.2.2 3 0000

More information

Phase Change Memory An Architecture and Systems Perspective

Phase Change Memory An Architecture and Systems Perspective Phase Change Memory An Architecture and Systems Perspective Benjamin C. Lee Stanford University bcclee@stanford.edu Fall 2010, Assistant Professor @ Duke University Benjamin C. Lee 1 Memory Scaling density,

More information

CMPSC 311- Introduction to Systems Programming Module: Caching

CMPSC 311- Introduction to Systems Programming Module: Caching CMPSC 311- Introduction to Systems Programming Module: Caching Professor Patrick McDaniel Fall 2016 Reminder: Memory Hierarchy L0: Registers CPU registers hold words retrieved from L1 cache Smaller, faster,

More information

Fundamentals of Computer Systems

Fundamentals of Computer Systems Fundamentals of Computer Systems Caches Stephen A. Edwards Columbia University Summer 217 Illustrations Copyright 27 Elsevier Computer Systems Performance depends on which is slowest: the processor or

More information

Performance metrics for caches

Performance metrics for caches Performance metrics for caches Basic performance metric: hit ratio h h = Number of memory references that hit in the cache / total number of memory references Typically h = 0.90 to 0.97 Equivalent metric:

More information

Let!s go back to a course goal... Let!s go back to a course goal... Question? Lecture 22 Introduction to Memory Hierarchies

Let!s go back to a course goal... Let!s go back to a course goal... Question? Lecture 22 Introduction to Memory Hierarchies 1 Lecture 22 Introduction to Memory Hierarchies Let!s go back to a course goal... At the end of the semester, you should be able to......describe the fundamental components required in a single core of

More information

Page 1. Multilevel Memories (Improving performance using a little cash )

Page 1. Multilevel Memories (Improving performance using a little cash ) Page 1 Multilevel Memories (Improving performance using a little cash ) 1 Page 2 CPU-Memory Bottleneck CPU Memory Performance of high-speed computers is usually limited by memory bandwidth & latency Latency

More information

ADAPTIVE BLOCK PINNING BASED: DYNAMIC CACHE PARTITIONING FOR MULTI-CORE ARCHITECTURES

ADAPTIVE BLOCK PINNING BASED: DYNAMIC CACHE PARTITIONING FOR MULTI-CORE ARCHITECTURES ADAPTIVE BLOCK PINNING BASED: DYNAMIC CACHE PARTITIONING FOR MULTI-CORE ARCHITECTURES Nitin Chaturvedi 1 Jithin Thomas 1, S Gurunarayanan 2 1 Birla Institute of Technology and Science, EEE Group, Pilani,

More information

Row Buffer Locality Aware Caching Policies for Hybrid Memories. HanBin Yoon Justin Meza Rachata Ausavarungnirun Rachael Harding Onur Mutlu

Row Buffer Locality Aware Caching Policies for Hybrid Memories. HanBin Yoon Justin Meza Rachata Ausavarungnirun Rachael Harding Onur Mutlu Row Buffer Locality Aware Caching Policies for Hybrid Memories HanBin Yoon Justin Meza Rachata Ausavarungnirun Rachael Harding Onur Mutlu Executive Summary Different memory technologies have different

More information

CS152 Computer Architecture and Engineering Lecture 17: Cache System

CS152 Computer Architecture and Engineering Lecture 17: Cache System CS152 Computer Architecture and Engineering Lecture 17 System March 17, 1995 Dave Patterson (patterson@cs) and Shing Kong (shing.kong@eng.sun.com) Slides available on http//http.cs.berkeley.edu/~patterson

More information

EITF20: Computer Architecture Part4.1.1: Cache - 2

EITF20: Computer Architecture Part4.1.1: Cache - 2 EITF20: Computer Architecture Part4.1.1: Cache - 2 Liang Liu liang.liu@eit.lth.se 1 Outline Reiteration Cache performance optimization Bandwidth increase Reduce hit time Reduce miss penalty Reduce miss

More information

Area, Power, and Latency Considerations of STT-MRAM to Substitute for Main Memory

Area, Power, and Latency Considerations of STT-MRAM to Substitute for Main Memory Area, Power, and Latency Considerations of STT-MRAM to Substitute for Main Memory Youngbin Jin, Mustafa Shihab, and Myoungsoo Jung Computer Architecture and Memory Systems Laboratory Department of Electrical

More information

Portland State University ECE 587/687. Caches and Memory-Level Parallelism

Portland State University ECE 587/687. Caches and Memory-Level Parallelism Portland State University ECE 587/687 Caches and Memory-Level Parallelism Revisiting Processor Performance Program Execution Time = (CPU clock cycles + Memory stall cycles) x clock cycle time For each

More information

Lecture 17 Introduction to Memory Hierarchies" Why it s important " Fundamental lesson(s)" Suggested reading:" (HP Chapter

Lecture 17 Introduction to Memory Hierarchies Why it s important  Fundamental lesson(s) Suggested reading: (HP Chapter Processor components" Multicore processors and programming" Processor comparison" vs." Lecture 17 Introduction to Memory Hierarchies" CSE 30321" Suggested reading:" (HP Chapter 5.1-5.2)" Writing more "

More information

Memory Hierarchy Technology. The Big Picture: Where are We Now? The Five Classic Components of a Computer

Memory Hierarchy Technology. The Big Picture: Where are We Now? The Five Classic Components of a Computer The Big Picture: Where are We Now? The Five Classic Components of a Computer Processor Control Datapath Today s Topics: technologies Technology trends Impact on performance Hierarchy The principle of locality

More information

A Page-Based Storage Framework for Phase Change Memory

A Page-Based Storage Framework for Phase Change Memory A Page-Based Storage Framework for Phase Change Memory Peiquan Jin, Zhangling Wu, Xiaoliang Wang, Xingjun Hao, Lihua Yue University of Science and Technology of China 2017.5.19 Outline Background Related

More information

Fundamentals of Computer Systems

Fundamentals of Computer Systems Fundamentals of Computer Systems Caches Martha A. Kim Columbia University Fall 215 Illustrations Copyright 27 Elsevier 1 / 23 Computer Systems Performance depends on which is slowest: the processor or

More information

ארכיטקטורת יחידת עיבוד מרכזי ת

ארכיטקטורת יחידת עיבוד מרכזי ת ארכיטקטורת יחידת עיבוד מרכזי ת (36113741) תשס"ג סמסטר א' July 2, 2008 Hugo Guterman (hugo@ee.bgu.ac.il) Arch. CPU L8 Cache Intr. 1/77 Memory Hierarchy Arch. CPU L8 Cache Intr. 2/77 Why hierarchy works

More information

Computer Architecture and System Software Lecture 09: Memory Hierarchy. Instructor: Rob Bergen Applied Computer Science University of Winnipeg

Computer Architecture and System Software Lecture 09: Memory Hierarchy. Instructor: Rob Bergen Applied Computer Science University of Winnipeg Computer Architecture and System Software Lecture 09: Memory Hierarchy Instructor: Rob Bergen Applied Computer Science University of Winnipeg Announcements Midterm returned + solutions in class today SSD

More information

Memory Hierarchy and Caches

Memory Hierarchy and Caches Memory Hierarchy and Caches COE 301 / ICS 233 Computer Organization Dr. Muhamed Mudawar College of Computer Sciences and Engineering King Fahd University of Petroleum and Minerals Presentation Outline

More information

Solutions for Chapter 7 Exercises

Solutions for Chapter 7 Exercises olutions for Chapter 7 Exercises 1 olutions for Chapter 7 Exercises 7.1 There are several reasons why you may not want to build large memories out of RAM. RAMs require more transistors to build than DRAMs

More information

Using Non-volatile Memories for Browser Performance Improvement. Seongmin KIM and Taeseok KIM *

Using Non-volatile Memories for Browser Performance Improvement. Seongmin KIM and Taeseok KIM * 2017 2nd International Conference on Computer, Network Security and Communication Engineering (CNSCE 2017) ISBN: 978-1-60595-439-4 Using Non-volatile Memories for Browser Performance Improvement Seongmin

More information

Embedded Systems Design: A Unified Hardware/Software Introduction. Outline. Chapter 5 Memory. Introduction. Memory: basic concepts

Embedded Systems Design: A Unified Hardware/Software Introduction. Outline. Chapter 5 Memory. Introduction. Memory: basic concepts Hardware/Software Introduction Chapter 5 Memory Outline Memory Write Ability and Storage Permanence Common Memory Types Composing Memory Memory Hierarchy and Cache Advanced RAM 1 2 Introduction Memory:

More information

Embedded Systems Design: A Unified Hardware/Software Introduction. Chapter 5 Memory. Outline. Introduction

Embedded Systems Design: A Unified Hardware/Software Introduction. Chapter 5 Memory. Outline. Introduction Hardware/Software Introduction Chapter 5 Memory 1 Outline Memory Write Ability and Storage Permanence Common Memory Types Composing Memory Memory Hierarchy and Cache Advanced RAM 2 Introduction Embedded

More information

Lecture 2: Memory Systems

Lecture 2: Memory Systems Lecture 2: Memory Systems Basic components Memory hierarchy Cache memory Virtual Memory Zebo Peng, IDA, LiTH Many Different Technologies Zebo Peng, IDA, LiTH 2 Internal and External Memories CPU Date transfer

More information

CS 61C: Great Ideas in Computer Architecture. Direct Mapped Caches

CS 61C: Great Ideas in Computer Architecture. Direct Mapped Caches CS 61C: Great Ideas in Computer Architecture Direct Mapped Caches Instructor: Justin Hsia 7/05/2012 Summer 2012 Lecture #11 1 Review of Last Lecture Floating point (single and double precision) approximates

More information

A novel SRAM -STT-MRAM hybrid cache implementation improving cache performance

A novel SRAM -STT-MRAM hybrid cache implementation improving cache performance A novel SRAM -STT-MRAM hybrid cache implementation improving cache performance Odilia Coi, Guillaume Patrigeon, Sophiane Senni, Lionel Torres, Pascal Benoit To cite this version: Odilia Coi, Guillaume

More information

Contents. Main Memory Memory access time Memory cycle time. Types of Memory Unit RAM ROM

Contents. Main Memory Memory access time Memory cycle time. Types of Memory Unit RAM ROM Memory Organization Contents Main Memory Memory access time Memory cycle time Types of Memory Unit RAM ROM Memory System Virtual Memory Cache Memory - Associative mapping Direct mapping Set-associative

More information

CS 61C: Great Ideas in Computer Architecture. The Memory Hierarchy, Fully Associative Caches

CS 61C: Great Ideas in Computer Architecture. The Memory Hierarchy, Fully Associative Caches CS 61C: Great Ideas in Computer Architecture The Memory Hierarchy, Fully Associative Caches Instructor: Alan Christopher 7/09/2014 Summer 2014 -- Lecture #10 1 Review of Last Lecture Floating point (single

More information

The Memory Hierarchy. Daniel Sanchez Computer Science & Artificial Intelligence Lab M.I.T. April 3, 2018 L13-1

The Memory Hierarchy. Daniel Sanchez Computer Science & Artificial Intelligence Lab M.I.T. April 3, 2018 L13-1 The Memory Hierarchy Daniel Sanchez Computer Science & Artificial Intelligence Lab M.I.T. April 3, 2018 L13-1 Memory Technologies Technologies have vastly different tradeoffs between capacity, latency,

More information

Chapter 5B. Large and Fast: Exploiting Memory Hierarchy

Chapter 5B. Large and Fast: Exploiting Memory Hierarchy Chapter 5B Large and Fast: Exploiting Memory Hierarchy One Transistor Dynamic RAM 1-T DRAM Cell word access transistor V REF TiN top electrode (V REF ) Ta 2 O 5 dielectric bit Storage capacitor (FET gate,

More information

A Reconfigurable Cache Design for Embedded Dynamic Data Cache

A Reconfigurable Cache Design for Embedded Dynamic Data Cache I J C T A, 9(17) 2016, pp. 8509-8517 International Science Press A Reconfigurable Cache Design for Embedded Dynamic Data Cache Shameedha Begum, T. Vidya, Amit D. Joshi and N. Ramasubramanian ABSTRACT Applications

More information

CS152 Computer Architecture and Engineering Lecture 16: Memory System

CS152 Computer Architecture and Engineering Lecture 16: Memory System CS152 Computer Architecture and Engineering Lecture 16: System March 15, 1995 Dave Patterson (patterson@cs) and Shing Kong (shing.kong@eng.sun.com) Slides available on http://http.cs.berkeley.edu/~patterson

More information

Amnesic Cache Management for Non-Volatile Memory

Amnesic Cache Management for Non-Volatile Memory Amnesic Cache Management for Non-Volatile Memory Dongwoo Kang, Seungjae Baek, Jongmoo Choi Dankook University, South Korea {kangdw, baeksj, chiojm}@dankook.ac.kr Donghee Lee University of Seoul, South

More information

Memory. Lecture 22 CS301

Memory. Lecture 22 CS301 Memory Lecture 22 CS301 Administrative Daily Review of today s lecture w Due tomorrow (11/13) at 8am HW #8 due today at 5pm Program #2 due Friday, 11/16 at 11:59pm Test #2 Wednesday Pipelined Machine Fetch

More information

Mainstream Computer System Components

Mainstream Computer System Components Mainstream Computer System Components Double Date Rate (DDR) SDRAM One channel = 8 bytes = 64 bits wide Current DDR3 SDRAM Example: PC3-12800 (DDR3-1600) 200 MHz (internal base chip clock) 8-way interleaved

More information

Chapter 1. Memory Systems. 1.1 Introduction. Yoongu Kim Carnegie Mellon University. Onur Mutlu Carnegie Mellon University

Chapter 1. Memory Systems. 1.1 Introduction. Yoongu Kim Carnegie Mellon University. Onur Mutlu Carnegie Mellon University Chapter 1 Memory Systems Yoongu Kim Carnegie Mellon University Onur Mutlu Carnegie Mellon University 1.1 Introduction............................................................... 1.1.1 Basic Concepts

More information

A LITERATURE SURVEY ON CPU CACHE RECONFIGURATION

A LITERATURE SURVEY ON CPU CACHE RECONFIGURATION A LITERATURE SURVEY ON CPU CACHE RECONFIGURATION S. Subha SITE, Vellore Institute of Technology, Vellore, India E-Mail: ssubha@rocketmail.com ABSTRACT CPU caches are designed with fixed number of sets,

More information

18-447: Computer Architecture Lecture 25: Main Memory. Prof. Onur Mutlu Carnegie Mellon University Spring 2013, 4/3/2013

18-447: Computer Architecture Lecture 25: Main Memory. Prof. Onur Mutlu Carnegie Mellon University Spring 2013, 4/3/2013 18-447: Computer Architecture Lecture 25: Main Memory Prof. Onur Mutlu Carnegie Mellon University Spring 2013, 4/3/2013 Reminder: Homework 5 (Today) Due April 3 (Wednesday!) Topics: Vector processing,

More information

Advanced Memory Organizations

Advanced Memory Organizations CSE 3421: Introduction to Computer Architecture Advanced Memory Organizations Study: 5.1, 5.2, 5.3, 5.4 (only parts) Gojko Babić 03-29-2018 1 Growth in Performance of DRAM & CPU Huge mismatch between CPU

More information

Couture: Tailoring STT-MRAM for Persistent Main Memory. Mustafa M Shihab Jie Zhang Shuwen Gao Joseph Callenes-Sloan Myoungsoo Jung

Couture: Tailoring STT-MRAM for Persistent Main Memory. Mustafa M Shihab Jie Zhang Shuwen Gao Joseph Callenes-Sloan Myoungsoo Jung Couture: Tailoring STT-MRAM for Persistent Main Memory Mustafa M Shihab Jie Zhang Shuwen Gao Joseph Callenes-Sloan Myoungsoo Jung Executive Summary Motivation: DRAM plays an instrumental role in modern

More information