The Impact of Write Back on Cache Performance


Daniel Kroening and Silvia M. Mueller
Computer Science Department, Universitaet des Saarlandes, Saarbruecken, Germany
Tel.: (+49) , Fax: (+49)
(Supported by the Graduiertenkolleg "Effizienz und Komplexität von Algorithmen und Rechenanlagen".)

Submitted to the International Conference on Computer Design (ICCD 99), March 23, 1999

Abstract

This paper quantifies the impact of write policies and allocation policies on cache performance. Adding write back to a cache design increases hardware cost, since additional storage for dirty bits is required. Furthermore, write back is not always faster. This paper derives a criterion which makes it easy to check whether write back improves the performance of a given memory system. This is done with the help of extensive cache simulations on a MIPS RISC architecture with the SPEC92 benchmarks as workload.

1 Introduction

Background. Memory throughput is crucial for the performance of modern high-speed processors. In order to speed up accesses to slow DRAM, caches made of fast on-chip RAM are used. Such a cache only stores a subset of the data in the DRAM. If the requested data is not in the cache (cache miss), a slow access to the DRAM is necessary. Thus, larger caches usually result in better performance.

Beyond the cache size, many parameters affect cache performance, such as associativity, line size, or allocation policy. There are many publications which study the impact of those parameters on the miss ratio [1, 2, 3, 4, 5]. However, adding write back does not affect the miss ratio but still has a significant performance impact. Cache miss rate simulations therefore cannot answer the write back question, and different simulations are required. Their results are presented in this paper, and they explain why modern CPUs use write back caches.

A write through cache passes all writes directly to the next level in the hierarchy. In contrast, a write back cache only records the write access in the cache and marks the line with a dirty bit. The memory is only updated on a cache miss, and in some systems, additional write backs are performed during idle cycles. In case of a dirty miss, the whole cache line has to be flushed. If this takes longer than a single write access, write back can cause performance degradation. Write back is also more expensive in terms of hardware cost, since storage for the dirty bits is required. Simulations are used to predict whether write back is worth the extra cost.

Organization. In order to compare the different write policies, we derive formulae for the total access time required by write through and by write back. These formulae also depend on the allocation policy. In section 2, we compare three common allocation policies. With the help of the access time formulae, we present in section 3 a metric which allows an easy performance comparison of both write policies. The rest of the paper presents simulation results for different cache parameter sets.
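To make the behavioral difference concrete, the following is a minimal sketch (not the ACS simulator used in this paper) of how the two policies handle a write hit; the names Line, write_hit_write_through, and memory_write are illustrative assumptions.

class Line:
    def __init__(self, tag=None):
        self.tag = tag
        self.data = None
        self.dirty = False   # only a write back cache needs this bit

def write_hit_write_through(line, data, memory_write):
    line.data = data
    memory_write(line.tag, data)   # every write goes to the next level

def write_hit_write_back(line, data):
    line.data = data
    line.dirty = True              # memory is updated later, on a dirty miss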

2 Allocation Policies

The allocation policy determines when a new cache entry is allocated. Three different policies are common:

1. Using read-write allocate, both read and write misses allocate a new cache entry. Hits do not affect the cache directory.

2. Using read-only allocate, only read misses allocate new cache entries. Writes and read hits do not affect the cache directory.

3. Using write-invalidate, only read misses allocate new cache entries. In case of a write hit, the cache entry is invalidated, i.e., cleared.

As our simulations show, read-write allocation provides the best miss rates (tables 1 to 3). The read-write allocation miss rates are four to eight times better than those of caches with read-only allocation. Write-invalidate is worst, with miss rates about twice those of read-only allocation. Read-write allocation shows a significant improvement in the miss ratio when the cache size is increased even beyond 16 KB, which is not the case for the other two policies. Thus, read-write allocation makes better use of larger caches. Furthermore, write back only permits read-write and read-only allocate. In the following, we therefore assume read-write allocation.

3 Write Back vs. Write Through

The performance gain achievable with write back depends on the performance of the underlying memory system. The slower the memory system, the bigger the potential benefit of write back. In order to compare the performance of write through (WT) and write back (WB), we derive for each policy a formula for the total time spent on cache accesses.

Write Through. On a write through system, a cache access with read-write allocate is processed as follows:

1. A check is performed whether the desired data is in the cache.

2. In case of a read hit, the data is read from the cache. Let Check denote the number of clock cycles required for the test and for reading.

3. In case of a miss, the appropriate cache line has to be filled into the cache. Let Linefill denote the number of clock cycles for this task.

4. In case of a write access, the data is written into the cache and passed to the memory system. The cache and memory access can be overlapped. Let Update denote the number of clock cycles for this task.

In order to calculate the total time t_WT spent on cache accesses, it is necessary to sum up the total time required for the read/write hits and misses. Let #rhit denote the number of read hits, #whit the number of write hits, and so on. The write through total t_WT is calculated as:

t_WT = #rhit · Check
     + #rmiss · (Check + Linefill)
     + #whit · (Check + Update)
     + #wmiss · (Check + Linefill + Update)    (1)

Write Back. On a write back system, a cache access is processed as follows:

1. A check is performed whether the desired data is in the cache. The data is read in case of a read hit. This is identical to the behavior of the write through cache.

2. In case of a miss, there are now two cases:

   (a) If the line is not dirty (i.e., the dirty bit is not set), the cache line can be replaced by the new entry immediately.

   (b) If the line is dirty (i.e., the dirty bit is set), the line has to be flushed first. This is the price for write back. Let Penalty denote the number of clock cycles necessary for the write back of the dirty line. After the flush, the line is filled into the cache as done by a write through cache, which takes Linefill many cycles.
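Equation (1) is easy to evaluate mechanically. The sketch below implements it directly; the cycle counts and event counts passed in are hypothetical examples, not values from this paper (in the paper, the event counts come from the cache simulator and the cycle counts from the memory system specification).

def t_wt(rhit, rmiss, whit, wmiss, check, update, linefill):
    """Total cycles spent on cache accesses with write through (eq. 1)."""
    return (rhit  * check
          + rmiss * (check + linefill)
          + whit  * (check + update)
          + wmiss * (check + linefill + update))

# Example with made-up counts and cycle times:
print(t_wt(700_000, 30_000, 250_000, 20_000, check=1, update=2, linefill=8))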

Table 1: Average miss rate for different allocation policies (unified cache, associativity 1, no sectoring, LRU replacement, line size 128 bytes). Rows: cache sizes 1K to 64K; columns: read-only, read/write, and write-invalidate allocate, integer and floating point averages. [Numeric values not recoverable from the source.]

Table 2: Average miss rate for different allocation policies (unified cache, associativity 2, no sectoring, LRU replacement, line size 128 bytes). [Numeric values not recoverable from the source.]

Table 3: Average miss rate for different allocation policies (unified cache, associativity 4, no sectoring, LRU replacement, line size 128 bytes). [Numeric values not recoverable from the source.]

                 Write Through                Write Back
read hit         Check                        Check
 (clean) miss    Check + Linefill             Check + Linefill
 dirty miss      -                            Check + Penalty + Linefill
write hit        Check + Update               Check + Cacheupdate
 (clean) miss    Check + Linefill + Update    Check + Linefill
 dirty miss      -                            Check + Penalty + Linefill

Table 4: Duration formulae for a cache access

3. In case of a write access, the cache has to be updated. The write access to the cache can be overlapped with the line fill in case of a miss. On a write hit, the cache update can be overlapped with the following cache access. However, the Check time might not be sufficient for the update. Thus, let Cacheupdate denote the extra time required.

Table 4 summarizes the cache access time formulae. In case of the write back cache, it is necessary to distinguish between dirty and clean misses. Let #rcleanmiss denote the number of clean read misses, #wdirtymiss the number of dirty write misses, and so on.

Let us first consider a system with fast cache update, i.e., Cacheupdate = 0. The total access time t_WB of the write back cache then adds up to:

t_WB = #rhit · Check
     + #rcleanmiss · (Check + Linefill)
     + #rdirtymiss · (Check + Penalty + Linefill)
     + #whit · Check
     + #wcleanmiss · (Check + Linefill)
     + #wdirtymiss · (Check + Penalty + Linefill)    (2)

The write back cache is better than the write through cache if

t_WT > t_WB    (3)

holds. Under the assumption given above, this simplifies to:

#write · Update > #dirtymiss · Penalty    (4)

Thus, whether write back is faster than write through depends on only four parameters. Of these, the cycle counts Update and Penalty are specified by the memory system. The values #dirtymiss and #write, which denote the total number of dirty misses and the total number of writes, depend on the cache parameter set and on the workload, i.e., they can be determined with a cache simulator. Write back improves the cache performance if

Update / Penalty > #dirtymiss / #write =: α    (5)

holds. The cache simulator provides the value α.

We now analyze the case Cacheupdate > 0, i.e., a write hit on the write back system takes longer than the check of the cache line, and thus the update cannot be hidden completely on a hit. Each write hit therefore adds Cacheupdate many cycles to the time t_WB. In order to simplify the formula, we charge the Cacheupdate penalty on every write access, not just on hits; miss rates are low on reasonably sized caches, so the numbers of write hits and writes are similar. With this simplification, equation (4) changes to:

#write · Update > #dirtymiss · Penalty + #write · Cacheupdate    (6)

In the general case, write back improves the cache performance if

(Update − Cacheupdate) / Penalty > #dirtymiss / #write = α.    (7)
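The decision criterion of equations (5) and (7) can be stated in a few lines of code. In this sketch, α would come from a cache simulation and the cycle counts from the memory system; the numbers in the usage example are illustrative, not values from the paper's tables.

def alpha(dirty_misses, writes):
    """Dirty miss / write ratio measured by the simulator (eq. 5)."""
    return dirty_misses / writes

def write_back_wins(update, penalty, a, cacheupdate=0):
    """True if write back beats write through, per equation (7).

    With cacheupdate=0 this reduces to equation (5)."""
    return (update - cacheupdate) / penalty > a

# Example: Update = 5, Cacheupdate = 1, Penalty = 8 gives a threshold of
# (5 - 1) / 8 = 0.5, so any simulated alpha below 0.5 favors write back.
print(write_back_wins(update=5, penalty=8, a=0.3, cacheupdate=1))  # True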

Table 5: Average miss ratio for different numbers of sectors (unified cache, associativity 1, LRU replacement, line size 128 bytes). Rows: cache sizes 1K to 64K; columns: integer and floating point averages per sector count. [Numeric values not recoverable from the source.]

4 Simulation Results

In order to supply values for the dirty miss / write ratio, we performed detailed cache simulations. The SPEC92 benchmark suite served as workload, since it is a well-known measure of CPU performance and since plain miss ratios are already available for this workload [4]. The cache simulation was done with the memory trace driven simulator ACS [6]. The memory traces of the SPEC92 benchmarks were provided by the TraceBase project [7] and were generated on a MIPS RISC architecture. They contain about 1,000,000 memory references for each benchmark of the suite. We extended the simulator to accept further cache parameters, such as the different allocation policies, and to provide the desired dirty miss / write ratio α.

We varied the cache size from 1 KByte to 64 KByte, the set associativity from one (direct-mapped) to four, and the cache line size from four to 128 bytes. We used read-write, read-only, and write-invalidate as allocation policy, and LRU (least recently used) and random replacement. All parameters were applied to split and unified caches. For split caches, the miss ratios were accounted separately for the instruction cache and the data cache. Tables 6 to 8 list the dirty miss / write ratio α for different caches without sectoring, with LRU replacement, and with read-write allocation.

Examples. Consider a memory system with a 64-bit wide data bus and a cache with 32 byte line size. The system supports bus bursts, and the cache update requires one additional cycle (i.e., Cacheupdate = 1). In this case, the ratio (Update − Cacheupdate) / Penalty is (5 − 1) / 8 = 0.5. Thus, the write back policy is advantageous for a given cache if its value α listed in the tables is smaller than 0.5. The Intel Pentium, for example, has the same bus width and line size but different bus burst timing [8, 9]. Thus, values of α lower than 0.43 are required for the write back policy to win. The original Pentium has a two-way split cache of 16K total size. For this configuration, Table 7 gives a value well below this threshold under an integer workload. Thus, write back is a big win for the Pentium.

In contrast to Intel's Pentium, the MIPS R3000 architecture uses a split, direct-mapped write through cache [10]. Assuming a line size of 16 bytes, 32-bit bus width, and a bus burst, a similar bound on α results. Table 6 shows that write back would reduce the performance of floating point applications on smaller caches; thus, MIPS was right not to use write back.

Miss Rate Simulations. In addition to the write back analysis, we performed miss rate simulations. One goal was to provide detailed cache performance data for a given parameter set in a convenient way. The individual results for each simulation are available via the world wide web [11]. A form permits selecting the various parameters, including the individual SPEC benchmark trace file. The simulator returns the miss ratios for cache sizes from 1K to 64K.
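The measurement itself is straightforward. The sketch below is a minimal trace-driven illustration (not the ACS simulator): a direct-mapped write back cache with read-write allocation that counts writes and dirty misses and reports α = #dirtymiss / #write. The trace format, a list of (op, address) pairs, is an assumption for this example.

def measure_alpha(trace, cache_bytes=16 * 1024, line_bytes=32):
    n_lines = cache_bytes // line_bytes
    tags = [None] * n_lines          # tag per cache line
    dirty = [False] * n_lines        # dirty bit per cache line
    writes = dirty_misses = 0

    for op, addr in trace:           # op is 'r' or 'w'
        line_addr = addr // line_bytes
        index = line_addr % n_lines
        tag = line_addr // n_lines
        if op == 'w':
            writes += 1
        if tags[index] != tag:       # miss
            if dirty[index]:         # victim must be written back first
                dirty_misses += 1
            tags[index] = tag        # read-write allocate: always fill
            dirty[index] = False
        if op == 'w':
            dirty[index] = True      # record the write in the cache only

    return dirty_misses / writes if writes else 0.0

# Toy trace: two writes to one line, then a read that evicts the dirty line.
trace = [('w', 0), ('w', 8), ('r', 16 * 1024)]
print(measure_alpha(trace))          # 0.5: one dirty miss, two writes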

Furthermore, we simulated sectoring. Each sector is assumed to have dirty/valid flags of its own. As our simulations show (table 5), sectoring usually results in lower hit rates. Thus, hit rate simulations cannot explain why sectoring is used. However, sectoring lowers the average write back penalty; thus, it can be an option for some memory systems.

5 Conclusion

In this paper, we derived a simple formula which makes it easy to check whether write back or write through is faster for a given memory system. As our results show, write back is advantageous for caches with common line sizes of 16 and 32 bytes and cache sizes above 8 KB. Furthermore, we showed that read-write allocation is superior to read-only allocation and write-invalidate for almost any common cache parameter set.

References

[1] M.D. Hill. Aspects of Cache Memory and Instruction Buffer Performance. PhD thesis, Computer Science Division (EECS), UC Berkeley, CA 94720.

[2] D.W. Clark. Cache Performance in the VAX-11/780. ACM Trans. Comp. Sys., 1(1):24-37.

[3] S. Przybylski, M. Horowitz, and J. Hennessy. Performance tradeoffs in cache design. In Proc. 15th Annual International Symposium on Computer Architecture. IEEE Computer Society Press.

[4] J.D. Gee, M.D. Hill, D.N. Pnevmatikatos, and A.J. Smith. Cache Performance of the SPEC92 Benchmark Suite. IEEE Micro, 13(4):17-27.

[5] J.L. Hennessy and D.A. Patterson. Computer Architecture: A Quantitative Approach. Morgan Kaufmann Publishers, Inc., San Mateo, CA, 2nd edition.

[6] Bryan Hunt. Acme cache simulator. acme/acs.html.

[7] Trace Database, Parallel Architecture Research Laboratory, New Mexico State University.

[8] Intel Corporation. 82430FX PCIset Datasheet: 82437FX System Controller (TSC) and 82438FX Data Path Unit (TDP).

[9] Intel Corporation. Pentium Processor Family Developer's Manual, Vol. 1-3.

[10] G. Kane and J. Heinrich. MIPS RISC Architecture. Prentice Hall.

[11] Cache Simulations, Tables and Web Form, Computer Science Department, Universität des Saarlandes. kroening/cache eng/.

Table 6: Average dirty miss / write ratio α for associativity 1, without sectoring, LRU replacement, read/write allocation. Four panels: integer and floating point averages for unified and split caches; rows: cache sizes 1K to 64K. [Numeric values not recoverable from the source.]

Table 7: Average dirty miss / write ratio α for associativity 2, without sectoring, LRU replacement, read/write allocation. Four panels: integer and floating point averages for unified and split caches; rows: cache sizes 1K to 64K. [Numeric values not recoverable from the source.]

Table 8: Average dirty miss / write ratio α for associativity 4, without sectoring, LRU replacement, read/write allocation. Four panels: integer and floating point averages for unified and split caches; rows: cache sizes 1K to 64K. [Numeric values not recoverable from the source.]
