Computer Architecture: Memory Hierarchies and Caches
S Coudert and R Pacalet, January 23, 2019
Outline: Introduction; Locality principles; Direct-mapped caches; Increasing block size; Set-associative caches; Write strategies; Cache coherence in multiprocessor systems.
The memory latency problem. The latency of external memory accesses tends to increase (tens to hundreds of CPU clock cycles). The clock cycles between a CPU load and the instruction/data returning from memory are wasted, so the Clocks Per Instruction (CPI) increases. On the other hand, on average, 90% of execution time corresponds to 10% of code instructions. Caches are a way to exploit this to mitigate the memory latency problem.
Principles of caches. All CPU memory accesses go to a small fast memory; the larger slow memory is accessed only when needed. Along the hierarchy, size goes from smallest to largest, speed from fastest to slowest, and cost ($/bit) from highest to lowest. Keep the most frequently accessed data in the small (expensive) fast (close) memory. Performance depends on hit and miss times and on the hit rate. [Table comparing register, Static RAM (SRAM), Dynamic RAM (DRAM) and magnetic disk by latency (s) and cost ($ per byte); the values were lost in transcription.]
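The dependence on hit time, miss rate and miss penalty can be captured by the Average Memory Access Time. A minimal sketch; the cycle counts below are illustrative assumptions, not figures from the slides:

```python
def amat(hit_time, miss_rate, miss_penalty):
    """Average Memory Access Time: every access pays the hit time,
    and the fraction of accesses that miss also pays the miss penalty."""
    return hit_time + miss_rate * miss_penalty

# Assumed numbers: 2-cycle cache hit, 5% miss rate, 100-cycle memory penalty.
print(amat(hit_time=2, miss_rate=0.05, miss_penalty=100))  # 7.0 cycles on average
```

Halving the miss rate to 2.5% brings the average from 7.0 down to 4.5 cycles, which is why cache design focuses so heavily on the hit rate.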
Memory hierarchy. [Diagram: the CPU, then level 1 to level n, with size and latency increasing away from the CPU.] Goal: minimize the miss rate. Ideally, the full hierarchy is as fast as level 1 (miss rate = 0) while offering the size of level n (GBytes).
Cache miss and cache hit. [Diagram: the CPU, cache and memory connected by address and data buses. On a hit the cache returns the data directly; on a miss the request goes on to memory.] A cache miss causes CPU wait states.
Most frequently accessed. The best possible choice for cached data is the one that will be needed next, but "the future's not ours to see" (Que sera sera). The second-best choice: the most frequently accessed data. How to identify them? By approximation, with heuristics based on two locality principles. Spatial: in a given short period of time a program frequently accesses a small memory area (example: working on an array). Temporal: a program often accesses the same memory cell several times in a short period (example: instructions in a loop).
Locality principles: example. Sub-program example:

    for (i = 0; i < 1000; i++) {
        C[i] = A[i] + B[i];
    }

Variable addresses: array A at 24000, array B at 28000, array C at 32000, plus two constants (these values can be read off the assembly listing that follows; the address table itself was lost in transcription).
Locality principles: example. The loop body below executes 1000 times. Temporal locality: an accessed memory location is likely to be accessed again soon. Spatial locality: a memory location near an accessed one is likely to be accessed soon.

    8000 lw   $1,36000($0)   # $r1 <- 0            (initialization)
    8004 lw   $2,36004($0)   # $r2 <- 3996
    8008 lw   $3,24000($1)   # $r3 <- A[i]         (loop body)
    8012 lw   $4,28000($1)   # $r4 <- B[i]
    8016 add  $3,$3,$4       # $r3 <- $r3 + $r4
    8020 sw   $3,32000($1)   # C[i] <- $r3
    8024 beq  $1,$2,8036     # jump to 8036 if $r1 = $r2
    8028 addi $1,$1,4        # increment $r1
    8032 j    8008           # jump to 8008
    8036 ...                 # sequel
Locality principles: example (continued). [Time diagrams over five loop iterations: instruction fetches to addresses 8008..8032 repeat every iteration (temporal locality); data fetches to arrays A, B and C advance through consecutive addresses (spatial locality).]
Most frequently accessed. The selection of data to cache is usually based on the locality heuristics. When data is fetched from lower levels, it is loaded into the cache and kept there for later re-use (temporal locality), and data in its neighbourhood is also loaded, just in case it will be needed too (spatial locality).
Cache miss / cache hit. [Diagram, steps 1 to 8: an access to address x misses; the fault handler loads the data at x and its neighbourhood into the cache; x and its neighbour x+1 then hit; a later access to a distant address y misses and is handled the same way, after which y hits.]
Cache management strategies. Where in the cache shall we store the incoming data when handling cache faults? Upon CPU accesses, how do we know whether a datum is in the cache, and where? In case a datum must be replaced, which one to choose? How do we handle write accesses? Various kinds of caches, with associated strategies, answer these questions.
Direct-mapped caches. Smallest cacheable unit: the smallest Addressing Unit (AU, e.g. one byte or one word). For any AU there is one unique possible location in the cache. Cache capacity: 2^k cache lines (the cache stores at most 2^k AUs). Where in the cache is the AU whose memory address is a? In line a mod 2^k. How do we know it is the right AU? The cache line also stores the tag: a / 2^k. How do we know it is a valid AU (e.g. after reset)? The cache line also stores a validity bit. [Diagram: the address splits into a tag and a k-bit line index; each of the 2^k cache lines stores a validity bit V, the tag and the data.]
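The a mod 2^k / a div 2^k mechanics above can be sketched in a few lines; the 8-line cache size and the stored values are arbitrary assumptions for illustration:

```python
K = 3                      # 2**K = 8 cache lines (toy size, an assumption)
LINES = 1 << K

# Each cache line stores (valid, tag, data).
cache = [(False, None, None)] * LINES

def lookup(addr):
    """Return the cached data for address `addr`, or None on a miss."""
    index = addr % LINES       # line index: a mod 2^k
    tag = addr // LINES        # tag: a / 2^k (integer division)
    valid, stored_tag, data = cache[index]
    if valid and stored_tag == tag:
        return data            # hit
    return None                # miss: line invalid or tag mismatch

def fill(addr, data):
    """Load `data` for `addr` into its unique possible line."""
    cache[addr % LINES] = (True, addr // LINES, data)

fill(42, "x")          # 42 mod 8 = 2, tag 42 // 8 = 5
print(lookup(42))      # hit: prints x
print(lookup(10))      # 10 mod 8 = 2 but tag 1 != 5: miss, prints None
```

Note how two addresses (42 and 10) compete for the same line, which is exactly what set associativity later relaxes.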
Direct-mapped cache architecture. Example: a 4 GB memory (32-bit addresses) with a 1 KB cache and one-byte cache lines. [Diagram: the memory address splits into a tag and a 10-bit line index; the tag stored in the selected line is compared with the address tag and combined with the validity bit V to produce the hit signal and the data.]
Direct-mapped cache running. [Diagram, four steps on a small example memory holding values a..p and a 4-line cache: an access misses, the fault handler loads the value and its tag into the corresponding line, a later access to the same address hits, and an access mapping to an occupied line with a different tag misses and replaces the line's content.]
Larger addressing unit and cache line. Addressing unit: the CPU word (e.g. 32 bits). Locality: a cache line stores several consecutive words (a block). Blocks are aligned in memory and in the cache. [Diagram: the memory address splits into tag, line index, word-in-block and byte-in-word fields; each of the n cache lines stores a validity bit V, the tag and a block of words, and a multiplexer selects the requested word within the block.]
Direct-mapped cache running with multi-word blocks. [Diagram, four steps: a miss loads a whole block (e.g. the words m, n, o, p) into a cache line; subsequent accesses to the other words of the block hit; a miss on a different block mapping to the same line replaces the whole block.]
Exercise #1: cache size and architecture. Assume 4-byte words, 4-word blocks (cache lines), 64-bit addresses, a direct-mapped cache, and a cache capacity of 2^16 bytes of data. Give the address breakdown, the cache architecture, and the total cache size (with tags and valid bits).
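One way to organise the computation for this exercise; this is a sketch, not the official solution, so check the derivation against your own:

```python
from math import log2

word_bytes  = 4         # parameters stated in the exercise
block_words = 4
addr_bits   = 64
capacity    = 2**16     # bytes of data

block_bytes = word_bytes * block_words          # 16 bytes per block
n_lines     = capacity // block_bytes           # 4096 lines (direct-mapped)

byte_off = int(log2(word_bytes))                # 2 bits (byte in word)
word_off = int(log2(block_words))               # 2 bits (word in block)
index    = int(log2(n_lines))                   # 12 bits (line index)
tag      = addr_bits - index - word_off - byte_off   # 48 bits

# Each line stores: 1 valid bit + tag + data block.
line_bits  = 1 + tag + block_bytes * 8          # 177 bits per line
total_bits = n_lines * line_bits                # 724992 bits in total
print(tag, index, word_off, byte_off, total_bits)
```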
Limits of block size. Fewer cache lines for the same cache capacity: favours spatial locality over temporal locality. More data is loaded from memory when handling a cache miss: potentially increases the cache miss cost. Example (simplified): access to three consecutive data words in memory, with a cache access time of 2 cycles and a memory access time of 20 cycles per word. With 1 word per cache line there are 3 cache misses: 3 x (2 + 20) = 66 cycles. With 4 words per cache line there is 1 cache miss: 3 x 2 + 4 x 20 = 86 cycles. Large blocks therefore require efficient data transfer between memory and cache.
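The two cycle counts in the example can be checked mechanically; the parameters come from the slide:

```python
cache_cycles = 2    # cache access time, from the slide
mem_cycles   = 20   # memory access time per word, from the slide

# 1-word cache lines: each of the 3 consecutive accesses misses.
one_word  = 3 * (cache_cycles + mem_cycles)      # 66 cycles
# 4-word cache lines: 3 cache accesses plus one miss loading 4 words.
four_word = 3 * cache_cycles + 4 * mem_cycles    # 86 cycles
print(one_word, four_word)                       # 66 86
```

The larger block loses here only because the four words arrive sequentially at 20 cycles each; the burst-transfer techniques on the next slide exist precisely to shrink that 4 x 20 term.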
Improving memory-cache transfers. Wider data bus: expensive, with bounded efficiency. Higher bus frequency: expensive, limited by Printed Circuit Board (PCB) constraints; Double Data Rate (DDR). DRAM is organised in banks, rows and columns, with row latency greater than column latency. Wrapping bursts deliver the requested word first. Multi-banking masks row latencies. [Diagram: a DDR memory with bank decoder, control logic, row and column selection, multiplexer and read logic between the address and data buses.]
Cache efficiency vs block size. [Plot from Computer Organization and Design: The Hardware/Software Interface, Patterson and Hennessy, second edition, 1998; measurements on a VAX machine with a direct-mapped cache.]
Set-associative caches. Another possible improvement: several blocks per index (a set). An N-way set-associative cache has N blocks per set. [Figure from Computer Organization and Design: The Hardware/Software Interface, Patterson and Hennessy, second edition, 1998.]
Set-associative cache architecture. [Figure from Computer Organization and Design: The Hardware/Software Interface, Patterson and Hennessy, second edition, 1998.]
Set-associative caches: replacement policy. With several blocks per set, which block should be replaced when a set is full? First In First Out (FIFO), i.e. Least Recently Cached: not that good, since the first block in may still be re-referenced frequently. Random: simple hardware, sub-optimal but satisfactory. Least Recently Used (LRU): complex hardware beyond 2 ways, hence LRU approximations. Combinations of FIFO and LRU (Not Most Recently Used).
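The weakness of FIFO noted above, that a frequently re-referenced block is still evicted once it is the oldest, shows up in a toy simulation of a single 2-way set; the access trace is an invented example:

```python
from collections import OrderedDict

def simulate(trace, ways, policy):
    """Count hits for one set of `ways` blocks.
    policy 'fifo' evicts the oldest-loaded block;
    policy 'lru' evicts the least recently used one."""
    set_ = OrderedDict()               # front of the dict = oldest
    hits = 0
    for block in trace:
        if block in set_:
            hits += 1
            if policy == "lru":
                set_.move_to_end(block)    # refresh recency; FIFO ignores reuse
        else:
            if len(set_) == ways:
                set_.popitem(last=False)   # evict the front (oldest)
            set_[block] = True
    return hits

trace = ["a", "b", "a", "c", "a", "d", "a"]
print(simulate(trace, ways=2, policy="fifo"))   # 2 hits: "a" keeps being evicted
print(simulate(trace, ways=2, policy="lru"))    # 3 hits: "a" stays resident
```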
Exercise #2: set-associative cache running. Consider a 2-set, 2-way set-associative cache with a FIFO replacement policy. Build a sequence of accesses (addresses) illustrating the various situations: hits and misses.
Write policies. Goals: avoid memory corruption and improve performance. [Diagram: a read of x misses and loads x; a write to x then hits in the cache; a later read of y misses and replaces x.] Write-through: write in the cache and in memory; performance issues arise when the write rate exceeds the memory throughput. Write-back: memory is written only when a dirty block is replaced; a dirty flag is added to the cache lines; helpful when the write rate exceeds the memory throughput. On a write miss: either write only in memory (No Write Allocate), or fetch the block from memory and then write the data (Write Allocate). All combinations are possible; some are more frequent (guess which). Write buffers can be added to smooth the write rate to memory.
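The key property of write-back, memory updated only when a dirty block is evicted, can be sketched with a toy one-line write-allocate cache; the class and the addresses are illustrative assumptions:

```python
class WriteBackCache:
    """Toy single-line write-back, write-allocate cache over a dict `memory`.
    Memory is only updated when a dirty block is evicted."""
    def __init__(self, memory):
        self.memory = memory
        self.addr = None           # address of the single cached block
        self.value = None
        self.dirty = False

    def _evict(self):
        if self.addr is not None and self.dirty:
            self.memory[self.addr] = self.value   # write back on eviction
        self.dirty = False

    def write(self, addr, value):
        if addr != self.addr:      # write miss: allocate the block
            self._evict()
            self.addr = addr
        self.value = value
        self.dirty = True          # memory is now stale for this address

memory = {0: 10, 1: 20}
c = WriteBackCache(memory)
c.write(0, 99)
print(memory[0])      # prints 10: the write went to the cache only
c.write(1, 77)        # replacing the dirty block forces a write-back
print(memory[0])      # prints 99: memory updated at eviction time
```

A write-through cache would instead update `memory` inside `write` itself, trading extra memory traffic for memory that is never stale.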
Cache coherence problem. [Diagram: CPU 1 to CPU n, each with a private cache, share a memory over a bus. CPU 1 and CPU 2 both hold x = 123 in their caches; CPU 2 executes Write x,237. Whether the write reaches memory or stays in CPU 2's cache, CPU 1 still sees x = 123: the copies are incoherent.] With multiple copies in the caches and in memory, how do we ensure coherence?
Cache coherence. Note: the problem exists even in mono-processor systems, because of Direct Memory Access (DMA) peripherals. Snooping: caches inform the others about what they do, each cache continuously monitors the others' activity (snooping), and appropriate actions are taken when needed; what is appropriate depends on the cache coherence protocol. Directory-based: central or distributed directories track blocks and caches (not studied in this course). Cache coherence protocols require support from the bus protocol, both to exchange state changes and to exchange cached data.
Cache coherence protocols. They define what action is taken in which circumstance (state). Examples. Write invalidate: the writing cache sends a write-invalidate message and all snooping caches invalidate their copy of the written block (what happens with write-through? with write-back?). Write update (broadcast): the writing cache broadcasts the new block and all snooping caches update their copy (same questions). The operations of one cache can: be delayed upon request from another cache; wait for acknowledgements by other caches; expect a response from other caches, or use a delay before acting unless responded to; be served by another cache instead of by memory.
Cache coherence protocols. Each cache maintains a state for each block (e.g. Invalid, Clean, Dirty, Exclusive, Owned), reacts on events from its own processor (processor read, processor write), reacts on messages from other caches snooped on the bus (read, write, flush), emits messages to other caches on the bus, and exchanges data with other caches on the bus.
Cache coherence protocols, write-through caches. Exercise #3: imagine the protocol (messages, actions).
Example of coherence protocol: MSI (Modified, Shared, Invalid). Each block is in one of the 3 states (M, S, I) for each cache; a block can be in different states in different caches. State definitions. Invalid: not in cache (or not valid). Shared: in cache and valid, but read-only. Modified: in cache, valid, and read-write. A block not in the cache is considered to be in the I (Invalid) state. If a block is in the M (Modified) state in one cache, it is the only copy. The 3 states can be encoded using the Valid and Dirty flags.
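The MSI transitions can be sketched as pure state functions; this is a deliberate simplification that tracks states only for one block in one cache, omitting data movement and bus arbitration:

```python
M, S, I = "Modified", "Shared", "Invalid"

def on_processor(state, event):
    """event: 'read' or 'write' from this cache's own CPU."""
    if event == "read":
        return S if state == I else state   # read miss fetches the block: -> Shared
    if event == "write":
        return M                            # write (after invalidating others): -> Modified

def on_snoop(state, event):
    """event snooped on the bus from another cache: 'read' or 'write'."""
    if event == "read":
        return S if state == M else state   # another reader: flush and downgrade M -> S
    if event == "write":
        return I                            # another writer invalidates our copy

s = I
s = on_processor(s, "read")    # I -> Shared
s = on_processor(s, "write")   # Shared -> Modified (invalidate message sent)
s = on_snoop(s, "read")        # Modified -> Shared (block flushed)
s = on_snoop(s, "write")       # Shared -> Invalid
print(s)                       # Invalid
```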
Example of coherence protocol: MSI, for write-back, write-allocate caches. Exercise #4: imagine the MSI protocol (events, actions) as transitions between the M, S and I states.
Example of coherence protocol: MESI. The 3 states of MSI can be encoded using the Valid and Dirty flags, so one more state comes for free. Let us reduce bandwidth with a fourth, Exclusive (E), state. Exclusive: block in cache, valid, read-only, and it is the only copy. Write-back, write-allocate caches. Exercise #5: imagine the MESI protocol (events, actions) as transitions between the M, E, S and I states.
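The point of the Exclusive state, a write that upgrades to Modified without any bus message, can be sketched as state-only functions; this is a simplification with invented helper names, assuming the bus reports whether other caches hold the block:

```python
M, E, S, I = "Modified", "Exclusive", "Shared", "Invalid"

def on_read_miss(others_have_copy):
    """A read miss allocates in E when no other cache holds the block
    (the bus is assumed to report this), otherwise in S."""
    return S if others_have_copy else E

def on_processor_write(state):
    """Return (new state, whether an invalidate message must go on the bus).
    E -> M is silent: ours is already the only copy."""
    return M, state in (S, I)

state = on_read_miss(others_have_copy=False)   # Exclusive
state, bus_msg = on_processor_write(state)     # Modified, no bus traffic
print(state, bus_msg)                          # Modified False
```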
Cache coherence protocols. Homework: imagine further improvements with one more state. Example: the MOESI protocol, with an O (Owned) state: the block is modified in one cache and shared (S) in the others; the owner is responsible for the write-back and for providing the block to other caches.
Vocabulary
Cache hit (miss): the requested data is (is not) in the cache.
Block: smallest cacheable unit (2^n words, aligned).
Cache line: where the cache stores a block, its tag and its flags.
Set: group of blocks with the same cache index.
N-way cache: cache with N blocks per set.
Direct-mapped cache: 1-way cache (number of lines = number of sets).
Fully-associative cache: 1-set cache (number of lines = number of ways).
Index: part of the address used to designate a set.
Tag: part of the address stored in the cache line and compared with the requested address to decide hit or miss.
Valid flag: flag stored in the cache line indicating whether its content is valid.
Dirty flag: flag stored in the cache line indicating whether its content has been modified.
Write-through: writing in the cache and in memory.
Write-back: writing in the cache but not in memory.
Eviction (replacement): replacing the block in a cache line by another block, after writing the replaced block to memory if the cache is write-back and the block was dirty.
Write-allocate: a cache that, upon a write miss, reads the block from memory, stores it in the cache (evicting another block if needed) and writes in the cache (and in memory if it is a write-through cache).
Write-no-allocate: a cache that, upon a write miss, writes in memory but not in the cache.
Coherence: property guaranteeing that all CPUs see the same memory content despite their local caches.
More informationAdapted from instructor s supplementary material from Computer. Patterson & Hennessy, 2008, MK]
Lecture 17 Adapted from instructor s supplementary material from Computer Organization and Design, 4th Edition, Patterson & Hennessy, 2008, MK] SRAM / / Flash / RRAM / HDD SRAM / / Flash / RRAM/ HDD SRAM
More informationMemory. Objectives. Introduction. 6.2 Types of Memory
Memory Objectives Master the concepts of hierarchical memory organization. Understand how each level of memory contributes to system performance, and how the performance is measured. Master the concepts
More informationregisters data 1 registers MEMORY ADDRESS on-chip cache off-chip cache main memory: real address space part of virtual addr. sp.
Cache associativity Cache and performance 12 1 CMPE110 Spring 2005 A. Di Blas 110 Spring 2005 CMPE Cache Direct-mapped cache Reads and writes Textbook Edition: 7.1 to 7.3 Second Third Edition: 7.1 to 7.3
More informationEECS151/251A Spring 2018 Digital Design and Integrated Circuits. Instructors: John Wawrzynek and Nick Weaver. Lecture 19: Caches EE141
EECS151/251A Spring 2018 Digital Design and Integrated Circuits Instructors: John Wawrzynek and Nick Weaver Lecture 19: Caches Cache Introduction 40% of this ARM CPU is devoted to SRAM cache. But the role
More informationChapter 5 Large and Fast: Exploiting Memory Hierarchy (Part 1)
Department of Electr rical Eng ineering, Chapter 5 Large and Fast: Exploiting Memory Hierarchy (Part 1) 王振傑 (Chen-Chieh Wang) ccwang@mail.ee.ncku.edu.tw ncku edu Depar rtment of Electr rical Engineering,
More informationChapter 8. Virtual Memory
Operating System Chapter 8. Virtual Memory Lynn Choi School of Electrical Engineering Motivated by Memory Hierarchy Principles of Locality Speed vs. size vs. cost tradeoff Locality principle Spatial Locality:
More informationComputer Architecture Computer Science & Engineering. Chapter 5. Memory Hierachy BK TP.HCM
Computer Architecture Computer Science & Engineering Chapter 5 Memory Hierachy Memory Technology Static RAM (SRAM) 0.5ns 2.5ns, $2000 $5000 per GB Dynamic RAM (DRAM) 50ns 70ns, $20 $75 per GB Magnetic
More informationWelcome to Part 3: Memory Systems and I/O
Welcome to Part 3: Memory Systems and I/O We ve already seen how to make a fast processor. How can we supply the CPU with enough data to keep it busy? We will now focus on memory issues, which are frequently
More informationLecture 16. Today: Start looking into memory hierarchy Cache$! Yay!
Lecture 16 Today: Start looking into memory hierarchy Cache$! Yay! Note: There are no slides labeled Lecture 15. Nothing omitted, just that the numbering got out of sequence somewhere along the way. 1
More informationCache introduction. April 16, Howard Huang 1
Cache introduction We ve already seen how to make a fast processor. How can we supply the CPU with enough data to keep it busy? The rest of CS232 focuses on memory and input/output issues, which are frequently
More informationCache Architectures Design of Digital Circuits 217 Srdjan Capkun Onur Mutlu http://www.syssec.ethz.ch/education/digitaltechnik_17 Adapted from Digital Design and Computer Architecture, David Money Harris
More informationRecap: Machine Organization
ECE232: Hardware Organization and Design Part 14: Hierarchy Chapter 5 (4 th edition), 7 (3 rd edition) http://www.ecs.umass.edu/ece/ece232/ Adapted from Computer Organization and Design, Patterson & Hennessy,
More informationCycle Time for Non-pipelined & Pipelined processors
Cycle Time for Non-pipelined & Pipelined processors Fetch Decode Execute Memory Writeback 250ps 350ps 150ps 300ps 200ps For a non-pipelined processor, the clock cycle is the sum of the latencies of all
More informationThe University of Adelaide, School of Computer Science 13 September 2018
Computer Architecture A Quantitative Approach, Sixth Edition Chapter 2 Memory Hierarchy Design 1 Programmers want unlimited amounts of memory with low latency Fast memory technology is more expensive per
More informationLECTURE 4: LARGE AND FAST: EXPLOITING MEMORY HIERARCHY
LECTURE 4: LARGE AND FAST: EXPLOITING MEMORY HIERARCHY Abridged version of Patterson & Hennessy (2013):Ch.5 Principle of Locality Programs access a small proportion of their address space at any time Temporal
More informationWhy memory hierarchy? Memory hierarchy. Memory hierarchy goals. CS2410: Computer Architecture. L1 cache design. Sangyeun Cho
Why memory hierarchy? L1 cache design Sangyeun Cho Computer Science Department Memory hierarchy Memory hierarchy goals Smaller Faster More expensive per byte CPU Regs L1 cache L2 cache SRAM SRAM To provide
More informationCHAPTER 6 Memory. CMPS375 Class Notes Page 1/ 16 by Kuo-pao Yang
CHAPTER 6 Memory 6.1 Memory 233 6.2 Types of Memory 233 6.3 The Memory Hierarchy 235 6.3.1 Locality of Reference 237 6.4 Cache Memory 237 6.4.1 Cache Mapping Schemes 239 6.4.2 Replacement Policies 247
More informationCPU issues address (and data for write) Memory returns data (or acknowledgment for write)
The Main Memory Unit CPU and memory unit interface Address Data Control CPU Memory CPU issues address (and data for write) Memory returns data (or acknowledgment for write) Memories: Design Objectives
More informationRegisters. Instruction Memory A L U. Data Memory C O N T R O L M U X A D D A D D. Sh L 2 M U X. Sign Ext M U X ALU CTL INSTRUCTION FETCH
PC Instruction Memory 4 M U X Registers Sign Ext M U X Sh L 2 Data Memory M U X C O T R O L ALU CTL ISTRUCTIO FETCH ISTR DECODE REG FETCH EXECUTE/ ADDRESS CALC MEMOR ACCESS WRITE BACK A D D A D D A L U
More informationCache Coherence. CMU : Parallel Computer Architecture and Programming (Spring 2012)
Cache Coherence CMU 15-418: Parallel Computer Architecture and Programming (Spring 2012) Shared memory multi-processor Processors read and write to shared variables - More precisely: processors issues
More informationA Cache Hierarchy in a Computer System
A Cache Hierarchy in a Computer System Ideally one would desire an indefinitely large memory capacity such that any particular... word would be immediately available... We are... forced to recognize the
More informationChapter 5. Large and Fast: Exploiting Memory Hierarchy
Chapter 5 Large and Fast: Exploiting Memory Hierarchy Memory Technology Static RAM (SRAM) 0.5ns 2.5ns, $2000 $5000 per GB Dynamic RAM (DRAM) 50ns 70ns, $20 $75 per GB Magnetic disk 5ms 20ms, $0.20 $2 per
More informationChapter 5. Large and Fast: Exploiting Memory Hierarchy
Chapter 5 Large and Fast: Exploiting Memory Hierarchy Memory Technology Static RAM (SRAM) 0.5ns 2.5ns, $2000 $5000 per GB Dynamic RAM (DRAM) 50ns 70ns, $20 $75 per GB Magnetic disk 5ms 20ms, $0.20 $2 per
More informationCPE300: Digital System Architecture and Design
CPE300: Digital System Architecture and Design Fall 2011 MW 17:30-18:45 CBC C316 Cache 11232011 http://www.egr.unlv.edu/~b1morris/cpe300/ 2 Outline Review Memory Components/Boards Two-Level Memory Hierarchy
More informationCS356: Discussion #9 Memory Hierarchy and Caches. Marco Paolieri Illustrations from CS:APP3e textbook
CS356: Discussion #9 Memory Hierarchy and Caches Marco Paolieri (paolieri@usc.edu) Illustrations from CS:APP3e textbook The Memory Hierarchy So far... We modeled the memory system as an abstract array
More informationEE 457 Unit 7a. Cache and Memory Hierarchy
EE 457 Unit 7a Cache and Memory Hierarchy 2 Memory Hierarchy & Caching Use several levels of faster and faster memory to hide delay of upper levels Registers Unit of Transfer:, Half, or Byte (LW, LH, LB
More informationCS 433 Homework 5. Assigned on 11/7/2017 Due in class on 11/30/2017
CS 433 Homework 5 Assigned on 11/7/2017 Due in class on 11/30/2017 Instructions: 1. Please write your name and NetID clearly on the first page. 2. Refer to the course fact sheet for policies on collaboration.
More informationIntroduction to OpenMP. Lecture 10: Caches
Introduction to OpenMP Lecture 10: Caches Overview Why caches are needed How caches work Cache design and performance. The memory speed gap Moore s Law: processors speed doubles every 18 months. True for
More informationCHAPTER 6 Memory. CMPS375 Class Notes (Chap06) Page 1 / 20 Dr. Kuo-pao Yang
CHAPTER 6 Memory 6.1 Memory 341 6.2 Types of Memory 341 6.3 The Memory Hierarchy 343 6.3.1 Locality of Reference 346 6.4 Cache Memory 347 6.4.1 Cache Mapping Schemes 349 6.4.2 Replacement Policies 365
More informationCS3350B Computer Architecture
CS335B Computer Architecture Winter 25 Lecture 32: Exploiting Memory Hierarchy: How? Marc Moreno Maza wwwcsduwoca/courses/cs335b [Adapted from lectures on Computer Organization and Design, Patterson &
More informationChapter 6 Memory 11/3/2015. Chapter 6 Objectives. 6.2 Types of Memory. 6.1 Introduction
Chapter 6 Objectives Chapter 6 Memory Master the concepts of hierarchical memory organization. Understand how each level of memory contributes to system performance, and how the performance is measured.
More informationChapter 7-1. Large and Fast: Exploiting Memory Hierarchy (part I: cache) 臺大電機系吳安宇教授. V1 11/24/2004 V2 12/01/2004 V3 12/08/2004 (minor)
Chapter 7-1 Large and Fast: Exploiting Memory Hierarchy (part I: cache) 臺大電機系吳安宇教授 V1 11/24/2004 V2 12/01/2004 V3 12/08/2004 (minor) 臺大電機吳安宇教授 - 計算機結構 1 Outline 7.1 Introduction 7.2 The Basics of Caches
More informationMemory Hierarchy. Maurizio Palesi. Maurizio Palesi 1
Memory Hierarchy Maurizio Palesi Maurizio Palesi 1 References John L. Hennessy and David A. Patterson, Computer Architecture a Quantitative Approach, second edition, Morgan Kaufmann Chapter 5 Maurizio
More informationComputer Architecture. Memory Hierarchy. Lynn Choi Korea University
Computer Architecture Memory Hierarchy Lynn Choi Korea University Memory Hierarchy Motivated by Principles of Locality Speed vs. Size vs. Cost tradeoff Locality principle Temporal Locality: reference to
More informationChapter Seven. Memories: Review. Exploiting Memory Hierarchy CACHE MEMORY AND VIRTUAL MEMORY
Chapter Seven CACHE MEMORY AND VIRTUAL MEMORY 1 Memories: Review SRAM: value is stored on a pair of inverting gates very fast but takes up more space than DRAM (4 to 6 transistors) DRAM: value is stored
More informationCache and Memory. CS230 Tutorial 07
Cache and Memory CS230 Tutorial 07 Cache Overview Memory Hierarchy Blocks rom fastest and most expensive to slowest and least expensive astest memory is smallest and closest to CPU Slowest memory is largest
More informationMIPS) ( MUX
Memory What do we use for accessing small amounts of data quickly? Registers (32 in MIPS) Why not store all data and instructions in registers? Too much overhead for addressing; lose speed advantage Register
More informationCSF Improving Cache Performance. [Adapted from Computer Organization and Design, Patterson & Hennessy, 2005]
CSF Improving Cache Performance [Adapted from Computer Organization and Design, Patterson & Hennessy, 2005] Review: The Memory Hierarchy Take advantage of the principle of locality to present the user
More informationPortland State University ECE 587/687. Caches and Memory-Level Parallelism
Portland State University ECE 587/687 Caches and Memory-Level Parallelism Revisiting Processor Performance Program Execution Time = (CPU clock cycles + Memory stall cycles) x clock cycle time For each
More informationCSE 2021: Computer Organization
CSE 2021: Computer Organization Lecture-12a Caches-1 The basics of caches Shakil M. Khan Memory Technology Static RAM (SRAM) 0.5ns 2.5ns, $2000 $5000 per GB Dynamic RAM (DRAM) 50ns 70ns, $20 $75 per GB
More informationSarah L. Harris and David Money Harris. Digital Design and Computer Architecture: ARM Edition Chapter 8 <1>
Chapter 8 Digital Design and Computer Architecture: ARM Edition Sarah L. Harris and David Money Harris Digital Design and Computer Architecture: ARM Edition 215 Chapter 8 Chapter 8 :: Topics Introduction
More informationChapter Seven. Large & Fast: Exploring Memory Hierarchy
Chapter Seven Large & Fast: Exploring Memory Hierarchy 1 Memories: Review SRAM (Static Random Access Memory): value is stored on a pair of inverting gates very fast but takes up more space than DRAM DRAM
More informationThe levels of a memory hierarchy. Main. Memory. 500 By 1MB 4GB 500GB 0.25 ns 1ns 20ns 5ms
The levels of a memory hierarchy CPU registers C A C H E Memory bus Main Memory I/O bus External memory 500 By 1MB 4GB 500GB 0.25 ns 1ns 20ns 5ms 1 1 Some useful definitions When the CPU finds a requested
More informationCENG 3420 Computer Organization and Design. Lecture 08: Memory - I. Bei Yu
CENG 3420 Computer Organization and Design Lecture 08: Memory - I Bei Yu CEG3420 L08.1 Spring 2016 Outline q Why Memory Hierarchy q How Memory Hierarchy? SRAM (Cache) & DRAM (main memory) Memory System
More informationKey Point. What are Cache lines
Caching 1 Key Point What are Cache lines Tags Index offset How do we find data in the cache? How do we tell if it s the right data? What decisions do we need to make in designing a cache? What are possible
More informationCSE 2021: Computer Organization
CSE 2021: Computer Organization Lecture-12 Caches-1 The basics of caches Shakil M. Khan Memory Technology Static RAM (SRAM) 0.5ns 2.5ns, $2000 $5000 per GB Dynamic RAM (DRAM) 50ns 70ns, $20 $75 per GB
More informationMemory Hierarchy Y. K. Malaiya
Memory Hierarchy Y. K. Malaiya Acknowledgements Computer Architecture, Quantitative Approach - Hennessy, Patterson Vishwani D. Agrawal Review: Major Components of a Computer Processor Control Datapath
More informationLecture 15: Caches and Optimization Computer Architecture and Systems Programming ( )
Systems Group Department of Computer Science ETH Zürich Lecture 15: Caches and Optimization Computer Architecture and Systems Programming (252-0061-00) Timothy Roscoe Herbstsemester 2012 Last time Program
More informationAssignment 1 due Mon (Feb 4pm
Announcements Assignment 1 due Mon (Feb 19) @ 4pm Next week: no classes Inf3 Computer Architecture - 2017-2018 1 The Memory Gap 1.2x-1.5x 1.07x H&P 5/e, Fig. 2.2 Memory subsystem design increasingly important!
More informationChapter 5B. Large and Fast: Exploiting Memory Hierarchy
Chapter 5B Large and Fast: Exploiting Memory Hierarchy One Transistor Dynamic RAM 1-T DRAM Cell word access transistor V REF TiN top electrode (V REF ) Ta 2 O 5 dielectric bit Storage capacitor (FET gate,
More informationV. Primary & Secondary Memory!
V. Primary & Secondary Memory! Computer Architecture and Operating Systems & Operating Systems: 725G84 Ahmed Rezine 1 Memory Technology Static RAM (SRAM) 0.5ns 2.5ns, $2000 $5000 per GB Dynamic RAM (DRAM)
More informationLogical Diagram of a Set-associative Cache Accessing a Cache
Introduction Memory Hierarchy Why memory subsystem design is important CPU speeds increase 25%-30% per year DRAM speeds increase 2%-11% per year Levels of memory with different sizes & speeds close to
More informationTextbook: Burdea and Coiffet, Virtual Reality Technology, 2 nd Edition, Wiley, Textbook web site:
Textbook: Burdea and Coiffet, Virtual Reality Technology, 2 nd Edition, Wiley, 2003 Textbook web site: www.vrtechnology.org 1 Textbook web site: www.vrtechnology.org Laboratory Hardware 2 Topics 14:332:331
More informationChapter 5 Memory Hierarchy Design. In-Cheol Park Dept. of EE, KAIST
Chapter 5 Memory Hierarchy Design In-Cheol Park Dept. of EE, KAIST Why cache? Microprocessor performance increment: 55% per year Memory performance increment: 7% per year Principles of locality Spatial
More informationDigital Logic & Computer Design CS Professor Dan Moldovan Spring Copyright 2007 Elsevier 8-<1>
Digital Logic & Computer Design CS 4341 Professor Dan Moldovan Spring 21 Copyright 27 Elsevier 8- Chapter 8 :: Memory Systems Digital Design and Computer Architecture David Money Harris and Sarah L.
More informationMemory Technology. Chapter 5. Principle of Locality. Chapter 5 Large and Fast: Exploiting Memory Hierarchy 1
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface Chapter 5 Large and Fast: Exploiting Memory Hierarchy 5 th Edition Memory Technology Static RAM (SRAM) 0.5ns 2.5ns, $2000 $5000 per GB Dynamic
More informationLecture 10: Cache Coherence: Part I. Parallel Computer Architecture and Programming CMU , Spring 2013
Lecture 10: Cache Coherence: Part I Parallel Computer Architecture and Programming Cache design review Let s say your code executes int x = 1; (Assume for simplicity x corresponds to the address 0x12345604
More informationIntroduction. Memory Hierarchy
Introduction Why memory subsystem design is important CPU speeds increase 25%-30% per year DRAM speeds increase 2%-11% per year 1 Memory Hierarchy Levels of memory with different sizes & speeds close to
More informationLECTURE 5: MEMORY HIERARCHY DESIGN
LECTURE 5: MEMORY HIERARCHY DESIGN Abridged version of Hennessy & Patterson (2012):Ch.2 Introduction Programmers want unlimited amounts of memory with low latency Fast memory technology is more expensive
More informationChapter 8 :: Topics. Chapter 8 :: Memory Systems. Introduction Memory System Performance Analysis Caches Virtual Memory Memory-Mapped I/O Summary
Chapter 8 :: Systems Chapter 8 :: Topics Digital Design and Computer Architecture David Money Harris and Sarah L. Harris Introduction System Performance Analysis Caches Virtual -Mapped I/O Summary Copyright
More informationMemory System Design Part II. Bharadwaj Amrutur ECE Dept. IISc Bangalore.
Memory System Design Part II Bharadwaj Amrutur ECE Dept. IISc Bangalore. References: Outline Computer Architecture a Quantitative Approach, Hennessy & Patterson Topics Memory hierarchy Cache Multi-core
More information