Chapter 7: Large and Fast: Exploiting Memory Hierarchy
1 Chapter 7: Large and Fast: Exploiting Memory Hierarchy
2 Basic Memory Requirements
Users/programmers demand: a large computer memory with very fast access.
Technology limitations: a large computer memory has relatively slow access; a small computer memory has relatively fast access.
So how do you build a large computer memory with fast access? Computer Architecture CS
3 Computer Memory Use-Case Scenarios
If a memory item is referenced, it will most likely be referenced again soon (temporal locality), and its neighbors will tend to be referenced soon (spatial locality).
Basic philosophy: employ the basic requirements, the technology limitations, and the stated use-case scenarios to architect the memory.
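Both locality principles show up in ordinary code. The Python sketch below is my own illustration (not from the slides): summing a matrix row by row reuses the accumulator on every step (temporal locality) and touches neighboring elements in memory order (spatial locality), so each cache block fetched on a miss serves the next several accesses.

```python
def row_major_sum(matrix):
    total = 0
    for row in matrix:          # temporal locality: 'total' is reused every step
        for value in row:       # spatial locality: neighbors accessed in order
            total += value
    return total

grid = [[1, 2], [3, 4]]
print(row_major_sum(grid))      # 10
```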
4 Memory Hierarchy Design Philosophy
Build a hierarchy of memories with fast access close to the CPU, and employ temporal locality and spatial locality in the design.
(Figure: Level 1 sits closest to the CPU, down through Level n; speed increases toward the CPU, while access time and the size of the memory at each level increase with distance from the CPU.)
5 Memory Speed Technology Trend
Technology / Access time / $ per GB in 2004:
SRAM: 0.5-5 ns, $4,000-$10,000
DRAM: 50-70 ns, $100-$200
Magnetic disk: 5,000,000-20,000,000 ns, $0.50-$2
6 Three-Level Computer Memory Hierarchy
CPU -> Cache (SRAM): smallest memory and fastest.
Main memory (DRAM).
Virtual memory on magnetic disk (stores data): biggest and slowest.
The fastest memory is closest to the CPU.
7 Memory Hierarchy: Upper & Lower Levels
If the data requested by the CPU appears in a cache block: hit. If it is not in the cache: miss.
Memory performance:
hit rate = hits / memory accesses
miss rate = 1 - hit rate
hit time = time to access the cache and transfer the data
miss penalty = time to access the lower level + time to transfer the block into the upper level + time to deliver the data to the CPU
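These quantities combine into the average memory access time. A minimal Python sketch (the function name `amat` and the example numbers are my own, not from the slides):

```python
def amat(hit_time, miss_rate, miss_penalty):
    """Average memory access time = hit time + miss rate * miss penalty."""
    return hit_time + miss_rate * miss_penalty

# e.g. 1 ns hit time, 5% miss rate, 100 ns miss penalty
print(amat(1.0, 0.05, 100.0))   # 6.0 (ns)
```

Note how even a small miss rate dominates the average when the miss penalty is large, which is why the hierarchy tries so hard to keep misses rare.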
8 Basics of Caches
A simple cache example: before the request, the cache holds X1, X2, ..., Xn-1. The CPU requests Xn, but Xn is not in the cache, so a trip to memory is needed: Xn is copied into the cache from memory and returned to the CPU. After the request, the cache holds X1, ..., Xn.
How do we know if a data item is in the cache, and how do we find it?
9 Cache Structure: Direct Mapped
Each memory location is mapped directly to a unique location in the cache.
Example mapping scheme: cache index = (block address) modulo (number of cache blocks in the cache)
Example: a cache with 8 entries of one word each (8 = 2^3).
The total number of entries in a direct-mapped cache must be a power of 2.
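The mapping scheme can be stated directly in code. A minimal sketch (my own, assuming the 8-entry example above):

```python
def cache_index(block_address, num_blocks):
    """Direct-mapped placement: index = block address modulo number of blocks."""
    return block_address % num_blocks

# In an 8-block cache, memory blocks 1, 9, 17, 25, ... all map to index 1,
# which is exactly the many-to-one mapping the next slide discusses.
print([cache_index(b, 8) for b in (1, 9, 17, 25)])   # [1, 1, 1, 1]
```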
10 Basics of Cache
Cache mapping is many memory words to one cache location. How do we know whether the data in the cache corresponds to a requested word? Add a set of tags to the cache; each tag contains the upper bits of the word address, the ones not used in the indexing.
How do we know if a cache block contains valid information? Cache contents are invalid (empty) during CPU initialization. Solution: add valid bits to indicate a valid address.
11 Accessing a Direct-Mapped Cache
Assume a direct-mapped cache with 8 one-word blocks. Requests:
Decimal Ref Address / Binary Ref Address / Hit or Miss
22 / 10110 / miss -- assigned cache block 110 (fig. 7.6b)
26 / 11010 / miss (fig. 7.6c)
22 / 10110 / hit
26 / 11010 / hit
16 / 10000 / miss (fig. 7.6d)
3 / 00011 / miss (fig. 7.6e)
16 / 10000 / hit
18 / 10010 / miss (fig. 7.6f)
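This whole table can be replayed with a few lines of Python. The simulator below is my own sketch of the scheme described above (8 one-word blocks, index = address mod 8, tag = remaining upper bits), assuming the trace 22, 26, 22, 26, 16, 3, 16, 18:

```python
def simulate_direct(addresses, num_blocks=8):
    """Direct-mapped cache with one-word blocks; returns hit/miss per access."""
    valid = [False] * num_blocks
    tags = [None] * num_blocks
    outcome = []
    for addr in addresses:
        index = addr % num_blocks     # low bits select the block
        tag = addr // num_blocks      # upper bits identify the address
        if valid[index] and tags[index] == tag:
            outcome.append("hit")
        else:
            outcome.append("miss")    # copy the word from memory, update tag
            valid[index], tags[index] = True, tag
    return outcome

trace = [22, 26, 22, 26, 16, 3, 16, 18]
print(simulate_direct(trace))
# ['miss', 'miss', 'hit', 'hit', 'miss', 'miss', 'hit', 'miss']
```

The final miss on address 18 happens because 18 and 26 share index 010 but have different tags, so 18 evicts 26's block.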
12 Accessing a Cache
State of the cache at initialization: every index has valid bit 0; the cache is empty.
13 Accessing a Cache
Request #1: post memory reference 10110 (22). The CPU encounters a miss; the cache copies Memory[10110] into block index 110, sets the tag to 10, and sets the valid bit to 1.
14 Accessing a Cache
Request #2: post memory reference 11010 (26). The CPU encounters a miss; the cache copies Memory[11010] into block index 010, tag 11, valid bit 1.
15 Accessing a Cache
Request #3: post memory reference 10110 (22). The CPU encounters a hit: block 110 is valid and its tag matches.
16 Accessing a Cache
Request #4: post memory reference 11010 (26). The CPU encounters a hit.
17 Accessing a Cache
Request #5: post memory reference 10000 (16). The CPU encounters a miss; the cache copies Memory[10000] into block index 000, tag 10, valid bit 1.
18 Accessing a Cache
Request #6: post memory reference 00011 (3). The CPU encounters a miss; the cache copies Memory[00011] into block index 011, tag 00, valid bit 1.
19 Accessing a Cache
Request #7: post memory reference 10000 (16). The CPU encounters a hit.
20 Accessing a Cache
Request #8: post memory reference 10010 (18). The CPU encounters a miss; block index 010 currently holds tag 11 (address 26), so the cache replaces it with Memory[10010] and sets the tag to 10.
21 MIPS Direct-Mapped Cache
32-bit byte reference address (showing bit positions):
Bits 1-0: byte offset, ignored for indexing (not significant).
Bits 11-2: cache index (10 bits).
Bits 31-12: tag field (20 bits).
On a CPU access, the index selects one cache entry; the stored tag is compared (=) against the address tag, and Hit is asserted when they match and the valid bit is set.
Cache size: 2^10 blocks, 1 word per block.
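Under the bit layout above (2-bit byte offset, 10-bit index, 20-bit tag), the field extraction can be sketched in Python (the function name and the example address are my own):

```python
def split_address(addr):
    """Split a 32-bit byte address into (tag, index, byte_offset)."""
    byte_offset = addr & 0b11          # bits 1-0
    index = (addr >> 2) & 0x3FF        # bits 11-2: 10 bits -> 1024 blocks
    tag = addr >> 12                   # bits 31-12: 20 bits
    return tag, index, byte_offset

print(split_address(0x1234))   # (1, 141, 0)
```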
22 Direct-Mapped Cache: Cache Size
Let's assume the index field occupies n bits, so the cache holds 2^n blocks, and the offset field occupies m bits, so the block size is 2^m words = 2^(m+5) bits (a word is 32 = 2^5 bits).
For a 32-bit byte address: number of bits for the tag field = 32 - (n + m + 2).
Size of cache = 2^n x (block size + tag size + valid size)
= 2^n x (2^(m+5) + [32 - (n + m + 2)] + 1)
= 2^n x (2^m x 32 + [32 - (n + m + 2)] + 1)
Cache size = 2^n x (2^m x 32 + 31 - n - m)
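The final formula is easy to check numerically. A small Python sketch (my own) for a direct-mapped cache with 2^n blocks of 2^m 32-bit words, using 32-bit byte addresses:

```python
def cache_bits(n, m):
    """Total storage bits of a direct-mapped cache: data + tag + valid."""
    data = 2**m * 32                   # block size in bits
    tag = 32 - (n + m + 2)             # address bits not used for index/offset
    valid = 1
    return 2**n * (data + tag + valid)

# 1024 one-word blocks (n=10, m=0): 1024 * (32 + 20 + 1) bits
print(cache_bits(10, 0))   # 54272
```

So a cache holding 4 KB of data actually costs about 6.6 KB of storage once tags and valid bits are counted.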
23 Direct-Mapped Cache: Four-Word Blocks & Total Size 16 Words
The cache has 4 blocks (index/block 00, 01, 10, 11), each holding four data words.
How do we map a memory address to a direct-mapped cache with four-word blocks?
24 Four-Word Blocks & Total Size 16 Words
Consider the following memory reference requests (decimal / binary word addresses): 22 (10110), 20 (10100), 26 (11010), 23 (10111), 28 (11100), 16 (10000), 3 (00011), 16 (10000), 18 (10010).
Show the direct-mapped cache contents at each stage.
25 Direct-Mapped Cache: Four-Word Blocks & Total Size 16 Words
Initial content: all four blocks (00, 01, 10, 11) have valid bit 0; the cache is empty.
26 Direct-Mapped Cache: Four-Word Blocks & Total Size 16 Words
Request #1: post memory reference address 22 (10110). Word offset (trailing 2 bits): 22 mod 4 = 2 = 10. Index (next 2 bits): 01. Tag (remaining upper bits): 1, which differs from the cache content, and the block is not valid. Miss. Transfer Con(20) through Con(23) into block 01, then set the tag to 1 and the valid bit to 1.
27 Direct-Mapped Cache: Four-Word Blocks & Total Size 16 Words
Request #2: post memory reference address 20 (10100). Word offset: 20 mod 4 = 0 = 00. Index: 01. Tag: 1; the tag and valid bit of block 01 are already set to 1. Hit.
28 Direct-Mapped Cache: Four-Word Blocks & Total Size 16 Words
Request #3: post memory reference address 26 (11010). Word offset: 26 mod 4 = 2 = 10. Index: 10. Tag: 1; block 10 is not valid. Miss. Transfer Con(24) through Con(27) into block 10, then set the tag to 1 and the valid bit to 1.
29 Direct-Mapped Cache: Four-Word Blocks & Total Size 16 Words
Request #4: post memory reference address 23 (10111). Word offset: 23 mod 4 = 3 = 11. Index: 01. Tag: 1, already set; valid bit 1. Hit.
30 Direct-Mapped Cache: Four-Word Blocks & Total Size 16 Words
Request #5: post memory reference address 28 (11100). Word offset: 28 mod 4 = 0 = 00. Index: 11. Tag: 1; block 11 is not valid. Miss. Transfer Con(28) through Con(31) into block 11, then set the tag to 1 and the valid bit to 1.
31 Direct-Mapped Cache: Four-Word Blocks & Total Size 16 Words
Request #6: post memory reference address 16 (10000). Word offset: 16 mod 4 = 0 = 00. Index: 00. Tag: 1; block 00 is not valid. Miss. Transfer Con(16) through Con(19) into block 00, then set the tag to 1 and the valid bit to 1.
32 Direct-Mapped Cache: Four-Word Blocks & Total Size 16 Words
Request #7: post memory reference address 3 (00011). Word offset: 3 mod 4 = 3 = 11. Index: 00. Tag: 0, but block 00's tag is set to 1. Miss. Transfer Con(0) through Con(3) into block 00, then set the tag to 0 and the valid bit to 1.
33 Direct-Mapped Cache: Four-Word Blocks & Total Size 16 Words
Request #8: post memory reference address 16 (10000). Word offset: 16 mod 4 = 0 = 00. Index: 00. Tag: 1, but block 00's tag is now set to 0. Miss. Transfer Con(16) through Con(19) into block 00, then set the tag to 1 and the valid bit to 1.
34 Direct-Mapped Cache: Four-Word Blocks & Total Size 16 Words
Request #9: post memory reference address 18 (10010). Word offset: 18 mod 4 = 2 = 10. Index: 00. Tag: 1, already set; valid bit 1. Hit.
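The nine-request walkthrough can be replayed with a small simulator. This Python sketch of a four-word-block direct-mapped cache is my own, assuming the trace 22, 20, 26, 23, 28, 16, 3, 16, 18 from the preceding slides:

```python
def simulate_multiword(addresses, num_blocks=4, words_per_block=4):
    """Direct-mapped cache with multi-word blocks; returns hit/miss per access."""
    valid = [False] * num_blocks
    tags = [None] * num_blocks
    outcome = []
    for addr in addresses:
        block_addr = addr // words_per_block   # strip the word offset
        index = block_addr % num_blocks
        tag = block_addr // num_blocks
        if valid[index] and tags[index] == tag:
            outcome.append("hit")
        else:
            outcome.append("miss")             # fetch the whole 4-word block
            valid[index], tags[index] = True, tag
    return outcome

trace = [22, 20, 26, 23, 28, 16, 3, 16, 18]
print(simulate_multiword(trace))
# ['miss', 'hit', 'miss', 'hit', 'miss', 'miss', 'miss', 'miss', 'hit']
```

Note how spatial locality pays off: the miss on 22 also brings in 20 and 23, turning those accesses into hits, while 3 and 16 thrash block 00 because they share an index with different tags.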
35 Handling Hits and Misses (Control Unit)
Read hits: business as usual -- this is what we want!
Read misses: stall the CPU; the control unit fetches the block from memory, delivers it to the cache, and restarts the access.
Write hits: either replace the data in both the cache and memory (write-through), or write the data only into the cache and write it back to memory later (write-back).
Write misses: read the entire block into the cache, then write the word.
36 Other Cache Structures: Reducing Cache Misses
Fully associative cache: each block in memory can be placed anywhere in the cache.
Set-associative cache: each block in memory can be placed in a fixed number of locations within a set. In a 2-way set-associative cache, each set has 2 elements (cache locations).
37 Set-Associative Cache Structure (2-Way)
The cache locations are organized as sets (e.g., sets 0 through 3), each containing two elements. Each block address can be placed in one of the 2 elements within its assigned set.
38 Set-Associative Cache Structure (2-Way)
1. How do we map a block address to a unique set within the cache? Bit-selection algorithm: set = (block address) mod (number of sets within the cache).
2. How do we identify the memory address within a set? Associate each element of the set with a tag: the set address + the tag contents + the valid bit together identify the block address.
39 Set-Associative Cache Structure (2-Way)
Each set holds two elements (Element 1 and Element 2), each with its own valid bit, tag, and data.
40 Set-Associative Cache Structure (4-Way)
Each set holds four elements (Element 1 through Element 4). Each block address can be placed in one of the 4 elements within a given set. All tags within a set are searched in parallel.
41 Two-Way Set-Associative Cache
Show the cache contents after each request. Order of requests (decimal / binary reference addresses): 0 (00000), 8 (01000), 0 (00000), 6 (00110), 8 (01000).
42 Two-Way Set-Associative Cache: State of Cache at Initialization
We will assume:
1. The set-associative cache has 2 sets (2^1), with two elements per set.
2. One-word cache blocks and 5-bit word addresses.
Number of tag bits = (max bits to represent the address) - (bits to represent the set) = 5 - 1 = 4.
Both elements of both sets start with valid bit 0.
43 Two-Way Set-Associative Cache Contents
Request #1: post memory reference address 0 (00000). Locate set: 0 mod 2 = 0. Search the tags in set 0 for 0000 (the upper bits): the tag does not exist in set 0. Miss. Transfer the data into set 0, element 1 (CO[0]), then set the tag to 0000 and the valid bit to 1.
44 Two-Way Set-Associative Cache Contents
Request #2: post memory reference address 8 (01000). Locate set: 8 mod 2 = 0. Search the tags in set 0 for 0100: the tag does not exist in set 0. Miss. Transfer the data into set 0, element 2 (CO[8]), then set the tag to 0100 and the valid bit to 1.
45 Two-Way Set-Associative Cache Contents
Request #3: post memory reference address 0 (00000). Locate set: 0 mod 2 = 0. Search the tag bits in set 0 (parallel algorithm) for 0000: the tag exists in set 0. Hit. Send the cache contents for memory address 0 to the CPU.
46 Two-Way Set-Associative Cache Contents
Request #4: post memory reference address 6 (00110). Locate set: 6 mod 2 = 0. Search the tags in set 0 for 0011: the tag does not exist in set 0. Miss. Cache replacement algorithm: replace the least recently used block (CO[8], in element 2), transfer the data into that element, then set the tag to 0011 and the valid bit to 1. Set 0 now holds CO[0] and CO[6].
47 Two-Way Set-Associative Cache Contents
Request #5: post memory reference address 8 (01000). Locate set: 8 mod 2 = 0. Search the tags in set 0 for 0100: the tag does not exist in set 0. Miss. Cache replacement algorithm: replace the least recently used block (CO[0], in element 1), transfer the data into that element, then set the tag to 0100 and the valid bit to 1. Set 0 now holds CO[8] and CO[6].
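Requests #1-#5 can be replayed with a small two-way set-associative simulator with LRU replacement. The Python sketch below is my own, assuming the setup above (2 sets, one-word blocks, set = address mod 2):

```python
def simulate_two_way(addresses, num_sets=2):
    """2-way set-associative cache with LRU; returns hit/miss per access."""
    sets = [[] for _ in range(num_sets)]   # each set: up to 2 tags, LRU first
    outcome = []
    for addr in addresses:
        s = addr % num_sets
        tag = addr // num_sets
        if tag in sets[s]:
            outcome.append("hit")
            sets[s].remove(tag)            # will re-append as most recently used
        else:
            outcome.append("miss")
            if len(sets[s]) == 2:
                sets[s].pop(0)             # evict the least recently used tag
        sets[s].append(tag)                # mark this tag most recently used
    return outcome

print(simulate_two_way([0, 8, 0, 6, 8]))
# ['miss', 'miss', 'hit', 'miss', 'miss']
```

Addresses 0, 8, and 6 all compete for set 0, so even with two ways this short trace keeps evicting: associativity reduces conflict misses but cannot eliminate them when three blocks share one set.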
48 Set-Associative Cache Block Replacement Algorithms
Least Recently Used (LRU): the block replaced is the one that has been unused for the longest time. Requires tracking the usage of each element in a set; expensive with increased associativity (m-way set-associative with very large m).
Other policies: Most Recently Used (MRU), Least Frequently Used (LFU), Most Frequently Used (MFU), First In First Out (FIFO).
49 Let's Summarize: M-Way Set-Associative Caches
1-way set-associative cache (M=1): direct mapped.
2-way set-associative cache (M=2): cache block replacement, e.g., LRU.
4-way set-associative cache (M=4): cache block replacement, e.g., LRU.
Fully associative cache: a block can be placed in any location in the cache; all entries in the cache must be searched in response to a cache request (expensive).
50 Worked Example
The cache has a total size of 16 words configured in 4 blocks: there are 4 words in each block. 4 blocks -> 2 address bits for the blocks; therefore the cache index is 2 bits wide, and each memory address decomposes into TAG + INDEX + OFFSET (2 bits).
For the reference sequence, the outcomes in order were: miss, hit, hit, miss, miss, hit, miss, miss, miss, hit. The misses load, in turn, the blocks containing words {3,2,1,0}, {23,22,21,20}, {19,18,17,16}, {27,26,25,24}, {51,50,49,48}, and {3,2,1,0} again.
Final cache contents by cache index (block): 00: 3,2,1,0; 01: 23,22,21,20; 10: 27,26,25,24; 11: not set.
More informationAdvanced Computer Architecture
ECE 563 Advanced Computer Architecture Fall 2009 Lecture 3: Memory Hierarchy Review: Caches 563 L03.1 Fall 2010 Since 1980, CPU has outpaced DRAM... Four-issue 2GHz superscalar accessing 100ns DRAM could
More informationComputer Systems Architecture
Computer Systems Architecture Lecture 12 Mahadevan Gomathisankaran March 4, 2010 03/04/2010 Lecture 12 CSCE 4610/5610 1 Discussion: Assignment 2 03/04/2010 Lecture 12 CSCE 4610/5610 2 Increasing Fetch
More informationCS24: INTRODUCTION TO COMPUTING SYSTEMS. Spring 2014 Lecture 14
CS24: INTRODUCTION TO COMPUTING SYSTEMS Spring 2014 Lecture 14 LAST TIME! Examined several memory technologies: SRAM volatile memory cells built from transistors! Fast to use, larger memory cells (6+ transistors
More informationCaches and Memory Hierarchy: Review. UCSB CS240A, Fall 2017
Caches and Memory Hierarchy: Review UCSB CS24A, Fall 27 Motivation Most applications in a single processor runs at only - 2% of the processor peak Most of the single processor performance loss is in the
More informationThe Memory Hierarchy Cache, Main Memory, and Virtual Memory
The Memory Hierarchy Cache, Main Memory, and Virtual Memory Lecture for CPSC 5155 Edward Bosworth, Ph.D. Computer Science Department Columbus State University The Simple View of Memory The simplest view
More informationCray XE6 Performance Workshop
Cray XE6 Performance Workshop Mark Bull David Henty EPCC, University of Edinburgh Overview Why caches are needed How caches work Cache design and performance. 2 1 The memory speed gap Moore s Law: processors
More informationCOMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. 5 th. Edition. Chapter 5. Large and Fast: Exploiting Memory Hierarchy
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 5 Large and Fast: Exploiting Memory Hierarchy Principle of Locality Programs access a small proportion of their address
More informationChapter 5. Large and Fast: Exploiting Memory Hierarchy
Chapter 5 Large and Fast: Exploiting Memory Hierarchy Memory Technology Static RAM (SRAM) 0.5ns 2.5ns, $2000 $5000 per GB Dynamic RAM (DRAM) 50ns 70ns, $20 $75 per GB Magnetic disk 5ms 20ms, $0.20 $2 per
More informationMemory Hierarchy. Caching Chapter 7. Locality. Program Characteristics. What does that mean?!? Exploiting Spatial & Temporal Locality
Caching Chapter 7 Basics (7.,7.2) Cache Writes (7.2 - p 483-485) configurations (7.2 p 487-49) Performance (7.3) Associative caches (7.3 p 496-54) Multilevel caches (7.3 p 55-5) Tech SRAM (logic) SRAM
More informationCaches and Memory Hierarchy: Review. UCSB CS240A, Winter 2016
Caches and Memory Hierarchy: Review UCSB CS240A, Winter 2016 1 Motivation Most applications in a single processor runs at only 10-20% of the processor peak Most of the single processor performance loss
More informationAssignment 1 due Mon (Feb 4pm
Announcements Assignment 1 due Mon (Feb 19) @ 4pm Next week: no classes Inf3 Computer Architecture - 2017-2018 1 The Memory Gap 1.2x-1.5x 1.07x H&P 5/e, Fig. 2.2 Memory subsystem design increasingly important!
More informationThe levels of a memory hierarchy. Main. Memory. 500 By 1MB 4GB 500GB 0.25 ns 1ns 20ns 5ms
The levels of a memory hierarchy CPU registers C A C H E Memory bus Main Memory I/O bus External memory 500 By 1MB 4GB 500GB 0.25 ns 1ns 20ns 5ms 1 1 Some useful definitions When the CPU finds a requested
More informationCaching Basics. Memory Hierarchies
Caching Basics CS448 1 Memory Hierarchies Takes advantage of locality of reference principle Most programs do not access all code and data uniformly, but repeat for certain data choices spatial nearby
More informationCache Memory COE 403. Computer Architecture Prof. Muhamed Mudawar. Computer Engineering Department King Fahd University of Petroleum and Minerals
Cache Memory COE 403 Computer Architecture Prof. Muhamed Mudawar Computer Engineering Department King Fahd University of Petroleum and Minerals Presentation Outline The Need for Cache Memory The Basics
More informationChapter 5. Large and Fast: Exploiting Memory Hierarchy
Chapter 5 Large and Fast: Exploiting Memory Hierarchy Principle of Locality Programs access a small proportion of their address space at any time Temporal locality Items accessed recently are likely to
More informationMemory Hierarchy. Reading. Sections 5.1, 5.2, 5.3, 5.4, 5.8 (some elements), 5.9 (2) Lecture notes from MKP, H. H. Lee and S.
Memory Hierarchy Lecture notes from MKP, H. H. Lee and S. Yalamanchili Sections 5.1, 5.2, 5.3, 5.4, 5.8 (some elements), 5.9 Reading (2) 1 SRAM: Value is stored on a pair of inerting gates Very fast but
More informationThe Memory Hierarchy & Cache The impact of real memory on CPU Performance. Main memory basic properties: Memory Types: DRAM vs.
The Hierarchical Memory System The Memory Hierarchy & Cache The impact of real memory on CPU Performance. Main memory basic properties: Memory Types: DRAM vs. SRAM The Motivation for The Memory Hierarchy:
More informationECE331: Hardware Organization and Design
ECE331: Hardware Organization and Design Lecture 24: Cache Performance Analysis Adapted from Computer Organization and Design, Patterson & Hennessy, UCB Overview Last time: Associative caches How do we
More informationThe check bits are in bit numbers 8, 4, 2, and 1.
The University of Western Australia Department of Electrical and Electronic Engineering Computer Architecture 219 (Tutorial 8) 1. [Stallings 2000] Suppose an 8-bit data word is stored in memory is 11000010.
More informationTrying to design a simple yet efficient L1 cache. Jean-François Nguyen
Trying to design a simple yet efficient L1 cache Jean-François Nguyen 1 Background Minerva is a 32-bit RISC-V soft CPU It is described in plain Python using nmigen FPGA-friendly Designed for reasonable
More informationComputer Architecture Computer Science & Engineering. Chapter 5. Memory Hierachy BK TP.HCM
Computer Architecture Computer Science & Engineering Chapter 5 Memory Hierachy Memory Technology Static RAM (SRAM) 0.5ns 2.5ns, $2000 $5000 per GB Dynamic RAM (DRAM) 50ns 70ns, $20 $75 per GB Magnetic
More informationCS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 1
CS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 1 Instructors: Nicholas Weaver & Vladimir Stojanovic http://inst.eecs.berkeley.edu/~cs61c/ Components of a Computer Processor
More informationEN1640: Design of Computing Systems Topic 06: Memory System
EN164: Design of Computing Systems Topic 6: Memory System Professor Sherief Reda http://scale.engin.brown.edu Electrical Sciences and Computer Engineering School of Engineering Brown University Spring
More informationMemory Hierarchy, Fully Associative Caches. Instructor: Nick Riasanovsky
Memory Hierarchy, Fully Associative Caches Instructor: Nick Riasanovsky Review Hazards reduce effectiveness of pipelining Cause stalls/bubbles Structural Hazards Conflict in use of datapath component Data
More information14:332:331. Week 13 Basics of Cache
14:332:331 Computer Architecture and Assembly Language Spring 2006 Week 13 Basics of Cache [Adapted from Dave Patterson s UCB CS152 slides and Mary Jane Irwin s PSU CSE331 slides] 331 Week131 Spring 2006
More informationModern Computer Architecture
Modern Computer Architecture Lecture3 Review of Memory Hierarchy Hongbin Sun 国家集成电路人才培养基地 Xi an Jiaotong University Performance 1000 Recap: Who Cares About the Memory Hierarchy? Processor-DRAM Memory Gap
More informationHomework 6. BTW, This is your last homework. Assigned today, Tuesday, April 10 Due time: 11:59PM on Monday, April 23. CSCI 402: Computer Architectures
Homework 6 BTW, This is your last homework 5.1.1-5.1.3 5.2.1-5.2.2 5.3.1-5.3.5 5.4.1-5.4.2 5.6.1-5.6.5 5.12.1 Assigned today, Tuesday, April 10 Due time: 11:59PM on Monday, April 23 1 CSCI 402: Computer
More informationChapter 5. Large and Fast: Exploiting Memory Hierarchy
Chapter 5 Large and Fast: Exploiting Memory Hierarchy Static RAM (SRAM) Dynamic RAM (DRAM) 50ns 70ns, $20 $75 per GB Magnetic disk 0.5ns 2.5ns, $2000 $5000 per GB 5.1 Introduction Memory Technology 5ms
More informationCache Optimization. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University
Cache Optimization Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Cache Misses On cache hit CPU proceeds normally On cache miss Stall the CPU pipeline
More informationMemory hierarchy and cache
Memory hierarchy and cache QUIZ EASY 1). What is used to design Cache? a). SRAM b). DRAM c). Blend of both d). None. 2). What is the Hierarchy of memory? a). Processor, Registers, Cache, Tape, Main memory,
More informationCPE 631 Lecture 04: CPU Caches
Lecture 04 CPU Caches Electrical and Computer Engineering University of Alabama in Huntsville Outline Memory Hierarchy Four Questions for Memory Hierarchy Cache Performance 26/01/2004 UAH- 2 1 Processor-DR
More informationCMPT 300 Introduction to Operating Systems
CMPT 300 Introduction to Operating Systems Cache 0 Acknowledgement: some slides are taken from CS61C course material at UC Berkeley Agenda Memory Hierarchy Direct Mapped Caches Cache Performance Set Associative
More informationEE 4683/5683: COMPUTER ARCHITECTURE
EE 4683/5683: COMPUTER ARCHITECTURE Lecture 6A: Cache Design Avinash Kodi, kodi@ohioedu Agenda 2 Review: Memory Hierarchy Review: Cache Organization Direct-mapped Set- Associative Fully-Associative 1 Major
More informationEastern Mediterranean University School of Computing and Technology CACHE MEMORY. Computer memory is organized into a hierarchy.
Eastern Mediterranean University School of Computing and Technology ITEC255 Computer Organization & Architecture CACHE MEMORY Introduction Computer memory is organized into a hierarchy. At the highest
More information