ENG3380 Computer Organization and Architecture: Cache Memory, Part II


ENG3380 Computer Organization and Architecture
Cache Memory, Part II
Winter 2017
S. Areibi, School of Engineering, University of Guelph

With thanks to W. Stallings, Hamacher, J. Hennessy, and M. J. Irwin for lecture slide contents. Many slides are adapted from the PPT slides accompanying the textbook and from the CSE331 course.

Topics
- Memory hierarchy
- Locality
- Cache motivation
- Cache principles
- Elements of cache design: cache addresses, cache size, mapping function, replacement algorithms, write policy, line size
- Summary

References
I. Computer Organization and Architecture: Designing for Performance, 10th edition, by William Stallings, Pearson.
II. Computer Organization and Design: The Hardware/Software Interface, 5th edition, by D. Patterson and J. Hennessy, Morgan Kaufmann.
III. Computer Organization and Architecture: Themes and Variations, 2014, by Alan Clements, Cengage Learning.

Memory Hierarchy
- The design constraints on a computer memory can be summed up by three questions: (i) how much, (ii) how fast, (iii) how expensive.
- There is a tradeoff among these three key characteristics.
- A variety of technologies are used to implement memory systems.
- The dilemma facing the designer is clear: the memory should have large capacity, be fast, and cost little, all at once.
- Solution: employ a memory hierarchy.
[Figure: memory hierarchy levels and their technologies: registers (flip-flops), static RAM, dynamic RAM (main memory), magnetic disk, removable media.]

Chapter 5: Large and Fast: Exploiting Memory Hierarchy

Memory Hierarchy
As you go further from the processor, capacity and latency increase.

Level | Capacity | Latency
Registers | 1 KB | 1 cycle
L1 data or instruction cache (static RAM) | 32 KB | 2 cycles
L2 cache (static RAM) | 2 MB | 15 cycles
Memory (dynamic RAM) | 1 GB | 300 cycles
Disk | 80 GB | 10M cycles

[Figure: system organization. The CPU with its registers and static RAM cache sits on the local CPU/memory bus with the dynamic RAM and its controller and a co-processor. The PCI (Peripheral Component Interconnect) bus carries the hard drive controller, video adaptor, and SCSI adaptor (SCSI bus). An EISA/PCI bridge controller connects to the EISA PC bus with PC cards 1 to 3.]

Memory Hierarchy Levels
- Block (aka line): the unit of copying. May be multiple words.
- If the accessed data is present in the upper level: hit. The access is satisfied by the upper level. Hit ratio = hits/accesses.
- If the accessed data is absent: miss. The block is copied from the lower level. Time taken: miss penalty. Miss ratio = misses/accesses = 1 - hit ratio. The accessed data is then supplied from the upper level.

How is the Hierarchy Managed?
- Registers <-> memory: by the compiler (programmer?).
- Cache <-> main memory: by the cache controller hardware.
- Main memory <-> disks: by the operating system (virtual memory), with virtual to physical address mapping assisted by the hardware (TLB), and by the programmer (files).

Taking Advantage of Locality
- Memory hierarchy: store everything on disk.
- Copy recently accessed (and nearby) items from disk to a smaller DRAM memory: main memory.
- Copy more recently accessed (and nearby) items from DRAM to a smaller SRAM memory: cache memory attached to the CPU.

Principle of Locality (Temporal)
- Programs access a small proportion of their address space at any time.
- Temporal locality (locality in time): items accessed recently are likely to be accessed again soon.
- Keep the most recently accessed data items closer to the processor.
- E.g., instructions in a loop, induction variables.

Principle of Locality (Spatial)
- Programs access a small proportion of their address space at any time.
- Spatial locality (locality in space): items near those accessed recently are likely to be accessed soon.
- Move blocks consisting of contiguous words to the upper levels.
- E.g., sequential instruction access, scanning array data.

Cache and Locality
Why do caches work?
- Temporal locality: if you used some data recently, you will likely use it again.
- Spatial locality: if you used some data recently, you will likely access its neighbors.
- No hierarchy: average access time for data = 300 cycles.
- With a 32 KB, 1-cycle L1 cache that has a hit rate of 95%: average access time = 0.95 x 1 + 0.05 x (301) = 16 cycles.

Cache Motivation
for (i = 0; i < 1000; i++) x[i] = x[i] + s;
[Figure: the processor exchanges data with the upper level (Blk X), which exchanges blocks with the lower level (Blk Y).]
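The average access time quoted above can be checked with a few lines of Python. This is a minimal sketch, assuming a single cache level in front of memory and assuming that a miss costs the cache lookup plus the memory access (1 + 300 = 301 cycles, as in the slide). The function name `average_access_time` is illustrative and not part of the lecture.

```python
def average_access_time(hit_rate, hit_cycles, memory_cycles):
    """Average cycles per access for one cache level in front of memory.

    A hit costs hit_cycles; a miss costs the cache lookup plus the
    memory access, i.e. hit_cycles + memory_cycles.
    """
    miss_rate = 1.0 - hit_rate
    return hit_rate * hit_cycles + miss_rate * (hit_cycles + memory_cycles)


# Numbers from the slide: 300-cycle memory, 1-cycle L1 cache, 95% hit rate.
no_hierarchy = 300
with_l1 = average_access_time(0.95, 1, 300)
print(no_hierarchy, round(with_l1, 2))
```

With these numbers a cache that misses only one access in twenty already cuts the average from 300 cycles to about 16.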

A Typical Memory Hierarchy
By taking advantage of the principle of locality:
- Present the user with as much memory as is available in the cheapest technology.
- Provide access at the speed offered by the fastest technology.
[Figure: on-chip components (control, datapath, register file, ITLB, DTLB, instruction cache, data cache), second-level cache (SRAM), main memory (DRAM), secondary memory (disk). Access time (ns) and size (bytes) grow from the register file to the disk; cost per byte is highest on chip and lowest on disk.]

Why Pipeline? For Throughput!
- To avoid a structural hazard, two caches are needed on-chip: one for instructions (I$) and one for data (D$).
- To keep the pipeline running at its maximum rate, both I$ and D$ need to satisfy a request from the datapath every cycle.
- What happens when they can't do that?
[Figure: pipeline diagram, instruction order (Inst 0 to Inst 4) against time (clock cycles); each instruction passes through I$, Reg, ALU, D$, Reg.]

The Memory Hierarchy
- Take advantage of the principle of locality to present the user with as much memory as is available in the cheapest technology at the speed offered by the fastest technology.
- Access time increases with distance from the processor, as does the (relative) size of the memory at each level.
- Transfer units: processor <-> L1$: 4-8 bytes (word); L1$ <-> L2$: 8-32 bytes (block); L2$ <-> main memory: 1 to 4 blocks; main memory <-> secondary memory: 1,024+ bytes (disk sector = page).
- Inclusive: what is in L1$ is a subset of what is in L2$, which is a subset of what is in main memory, which is a subset of what is in secondary memory.

Processor-Memory Performance Gap
- The processor vs. DRAM speed disparity continues to grow.
- Good memory hierarchy (cache) design is increasingly important to overall performance.

Memory Hierarchy Terminology
- Hit: data is in some block in the upper level (Blk X).
- Hit rate: fraction of memory accesses found in the upper level.
- Hit time: time to access the upper level, which consists of SRAM access time + time to determine hit/miss.
- Miss: data is not in the upper level, so it needs to be retrieved from a block in the lower level (Blk Y).
- Miss rate = 1 - (hit rate).
- Miss penalty: time to bring in a block from the lower level and replace a block in the upper level with it + time to deliver the block to the processor.
- Hit time << miss penalty.
[Figure: the processor exchanges data with the upper level (Blk X), which exchanges blocks with the lower level (Blk Y).]

Four Questions for Cache Design
- Q1: Where can a block be placed in the upper level? (Block placement)
- Q2: How is a block found if it is in the upper level? (Block identification)
- Q3: Which block should be replaced on a miss? (Block replacement strategy)
- Q4: What happens on a write? (Write strategy)

Cache Memory
- Cache memory: the level of the memory hierarchy closest to the CPU.
- Given accesses X1, ..., Xn-1, Xn: how do we know if the data is present? Where do we look?

Q1: Block Placement
- Determined by associativity.
- Direct mapped (1-way set associative): one choice for placement.
- n-way set associative: n choices within a set.
- Fully associative: any location.
- Higher associativity reduces miss rate, but increases complexity, cost, and access time.

Q1: Where can a block be placed in the upper level?
Block 12 placed in an 8-block cache: fully associative, direct mapped, 2-way set associative.
- Set-associative mapping = block number modulo number of sets.
- Fully associative: any of the 8 blocks.
- Direct mapped: (12 mod 8) = 4.
- 2-way set associative: (12 mod 4) = 0.

Q2: Block Identification: Finding a Block
Associativity | Location method | Tag comparisons
Direct mapped | Index | 1
n-way set associative | Set index, then search entries within the set | n
Fully associative | Search all entries | #entries
Fully associative | Full lookup table | 0
- Hardware caches: reduce comparisons to reduce cost.

Q2: Block Identification
- Use the lower part of the address as the index into the cache.
- How do we know which requested block is in the cache (9 or 13)? Store the block address as well as the data.
- Actually, only the high-order bits of the block address are needed. They are called the tag.
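The placement rule (block number modulo number of sets) can be sketched in Python as follows. The function name `candidate_frames` is hypothetical, and the assumption that the frames of a set are numbered contiguously is ours, made only so the result can be printed as frame numbers.

```python
def candidate_frames(block_number, num_blocks, ways):
    """Return the cache frames in which a memory block may be placed.

    ways = 1 gives a direct-mapped cache, ways = num_blocks gives a fully
    associative cache. Frames of one set are assumed to be contiguous.
    """
    num_sets = num_blocks // ways
    set_index = block_number % num_sets      # block number modulo number of sets
    first_frame = set_index * ways
    return list(range(first_frame, first_frame + ways))


# Block 12 in an 8-block cache, as in the slide.
print(candidate_frames(12, 8, 1))   # direct mapped: one choice
print(candidate_frames(12, 8, 2))   # 2-way set associative: the two frames of set 0
print(candidate_frames(12, 8, 8))   # fully associative: any frame
```

The length of the returned list is also the number of tag comparisons a lookup needs, which is why higher associativity costs more hardware.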

Block Identification: Tags
- Every block has a tag in addition to data.
- Tag: the upper part of the address, which is not used to index the cache.

Valid Bits
- What if there is no data in a location?
- Valid bit: 1 = present, 0 = not present. Initially 0.
- Each cache entry holds: valid bit | tag | data.

Direct Mapped Cache
- Location determined by address.
- Direct mapped: only one choice. Index = (BlockAddress) modulo (#Blocks in cache).
- Example: Index = 9 mod 4 = 1.
- If #Blocks (i.e., the number of entries in the cache) is a power of 2, then the modulo (i.e., the index) can be computed simply by using the low-order log2(cache size in blocks) bits of the address (log2(4) = 2).

Direct Mapped Cache Hit/Miss: Example
Consider the main memory word reference string 0 1 2 3 4 3 4 15. Start with an empty cache: all blocks initially marked as not valid.

Reference | Tag | Index | Result
0 | 00 | 00 | miss
1 | 00 | 01 | miss
2 | 00 | 10 | miss
3 | 00 | 11 | miss
4 | 01 | 00 | miss (replaces Mem(0))
3 | 00 | 11 | hit
4 | 01 | 00 | hit
15 | 11 | 11 | miss (replaces Mem(3))

8 requests, 6 misses.

Another Reference String Mapping
Consider the main memory word reference string 0 4 0 4 0 4 0 4. Start with an empty cache: all blocks initially marked as not valid.
- Every reference misses: 0 (tag 00) and 4 (tag 01) both map to index 00 and keep replacing each other.
- 8 requests, 8 misses.
- Ping-pong effect due to conflict misses: two memory locations that map into the same cache block.
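Both reference strings on this page can be replayed with a small direct-mapped cache model. The sketch below assumes a 4-block cache with one word per block, so the word address is also the block address. The names `simulate_direct_mapped` and `count_misses` are illustrative only.

```python
def simulate_direct_mapped(block_addresses, num_blocks):
    """Replay block addresses through a direct-mapped cache.

    Returns a list of (address, index, tag, hit) tuples, one per access.
    """
    valid = [False] * num_blocks      # valid bits, initially "not present"
    tags = [0] * num_blocks           # stored tag per cache entry
    trace = []
    for addr in block_addresses:
        index = addr % num_blocks     # low-order bits select the entry
        tag = addr // num_blocks      # high-order bits identify the block
        hit = valid[index] and tags[index] == tag
        if not hit:                   # on a miss, load the block into the entry
            valid[index] = True
            tags[index] = tag
        trace.append((addr, index, tag, hit))
    return trace


def count_misses(trace):
    """Number of accesses in the trace that missed."""
    return sum(1 for (_, _, _, hit) in trace if not hit)


first = simulate_direct_mapped([0, 1, 2, 3, 4, 3, 4, 15], 4)
ping_pong = simulate_direct_mapped([0, 4, 0, 4, 0, 4, 0, 4], 4)
print(count_misses(first), count_misses(ping_pong))
```

The second string shows the ping-pong effect: three of the four entries stay empty while addresses 0 and 4 evict each other from entry 0.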

The Direct Mapped Cache
- For each item of data at the lower level (main memory), there is exactly one location in the upper level (cache) where it might be, so lots of items at the lower level must share locations in the upper level.
- Address mapping: (block address) modulo (# of blocks in the cache).
- Direct-mapped cache: each address maps to a unique cache location.

Caching: Example
[Figure: a 4-entry cache (index, valid, tag, data) in front of a 16-word main memory with word addresses 0000xx to 1111xx.]
- The two low-order bits define the byte in the word (32-bit words).
- Q1: How do we find it? Use the next 2 low-order memory address bits, the index, to determine which cache block (i.e., modulo the number of blocks in the cache).
- Q2: Is it there? Compare the cache tag to the high-order 2 memory address bits to tell if the memory block is in the cache.

Accessing the Cache
[Figure: a byte address selects one of the 8 sets (blocks) of the data array.]
- 8 sets: 3 index bits.
- Offset: 3 bits (8-byte words).

The Tag Array
[Figure: the byte address is split into tag, index bits, and offset; the tag read from the tag array is compared with the address tag while the data array (8 sets) is read.]
- Because each cache location can contain the contents of a number of different memory locations, a tag is added to every block to further identify the requested item.

Equations for DM Cache
- BlockAddress = ByteAddress / BytesPerBlock
- Index = BlockAddress % #Blocks
- Tag = BlockAddress / #Blocks

Larger Block Size
A cache with 64 blocks and 16 bytes/block. To what block number does byte address 1200 map?
- Block address = byte address / block size = 1200/16 = 75
- Index (block number) = 75 modulo 64 = 11
- Tag = 75 / 64 = 1
- Address fields: tag 22 bits | index 6 bits | offset 4 bits

Direct Mapped Example
- A processor generates byte addresses.
- It has a direct-mapped (1-way set associative) cache with 4 sets (blocks).
- The set (block) size is 4 bytes.
- For each access, is it a hit or a miss?
Solution: for each byte address, compute the block address, then Index = BlockAddress % #Blocks and Tag = BlockAddress / #Blocks.
[Table: byte address, block address, index, and tag for each access.]
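The direct-mapped equations on this page (BlockAddress = ByteAddress / BytesPerBlock, Index = BlockAddress % #Blocks, Tag = BlockAddress / #Blocks) translate directly into integer arithmetic. This is a minimal sketch; `split_address` is a hypothetical helper name, and returning the byte offset as well is our addition.

```python
def split_address(byte_address, num_blocks, bytes_per_block):
    """Split a byte address into (tag, index, offset) for a direct-mapped cache."""
    block_address = byte_address // bytes_per_block   # which memory block
    offset = byte_address % bytes_per_block           # byte within the block
    index = block_address % num_blocks                # which cache entry
    tag = block_address // num_blocks                 # identifies the block in that entry
    return tag, index, offset


# Larger Block Size example: 64 blocks, 16 bytes/block, byte address 1200.
tag, index, offset = split_address(1200, 64, 16)
print(tag, index, offset)   # block address 75 -> tag 1, index 11, offset 0
```

Because both block count and block size are powers of two, the divisions and remainders amount to slicing the address into bit fields (here 22 tag bits, 6 index bits, 4 offset bits for a 32-bit address).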

Direct Mapped Example (#Blocks = 4)
Compute Index = BlockAddress % #Blocks and Tag = BlockAddress / #Blocks for each access.
[Slides: the cache contents and the "Hit or Miss?" row are filled in one access at a time. The first five accesses all miss. Cache entries shown along the way: (valid 1, tag 5, MemBlock[22]), (valid 1, tag 6, MemBlock[26]), and (valid 1, tag 4, MemBlock[16]).]

Direct Mapped Example (#Blocks = 4), continued
[Slides: the sixth access misses and loads MemBlock[3]; the seventh access hits; the eighth access misses and loads MemBlock[18] (valid 1, tag 4) in place of MemBlock[26]. The cache then holds MemBlock[16], MemBlock[18], and MemBlock[3].]
Hit or miss, in access order: m m m m m m h m
8 requests, 7 misses.

Block Size Considerations
- Larger blocks should reduce the miss rate, due to spatial locality.
- But in a fixed-sized cache, larger blocks mean fewer of them: more competition and an increased miss rate. Larger blocks also cause pollution.
- Larger miss penalty (i.e., cost (time) for transfer), which can override the benefit of the reduced miss rate.
- Early restart and critical-word-first can help.
- Early restart: resume execution as soon as the requested word of the block is returned, rather than waiting for the entire block.

Reduce Misses via Larger Block Size
- Increasing the cache size decreases the miss rate.
- Increasing the block size lowers the miss rate. However, the miss rate may eventually go up if the block size becomes a significant fraction of the cache size. Why? Because the number of blocks that can be held in the cache becomes small, and there is a great deal of competition for those blocks.
[Figure: miss rate (0% to 25%) versus block size (bytes), one curve per cache size up to 256K.]

Direct-Mapped Cache Size
The total number of bits needed for a cache is a function of (a) the cache size and (b) the address size, because the cache includes both the storage for the data and the tags.
For the following situation:
- 32-bit addresses
- A direct-mapped cache
- The cache size is $2^n$ blocks, so n bits are used for the index
- The block size is $2^m$ words ($2^{m+2}$ bytes), so m bits are used for the word within the block, and two bits are used for the byte part of the address
Address fields: tag | index (n bits) | block offset (m bits) | byte offset (2 bits)
The size of the tag field is $32 - (n + m + 2)$.
The total number of bits in a direct-mapped cache is
$2^n \times (\text{block size} + \text{tag size} + \text{valid field size})$.
Since the block size is $2^m$ words (a word is 32 bits, i.e., $2^5$ bits, so a block is $2^{m+5}$ bits), and we need 1 bit for the valid field, the number of bits is
$2^n \times (2^m \times 2^5 + (32 - n - m - 2) + 1) = 2^n \times (2^m \times 32 + 31 - n - m)$.

Direct-Mapped Cache Size
- The total number of bits in a direct-mapped cache is #blocks x (block size + tag size + valid field size).
- Although this is the actual size in bits, the naming convention is to exclude the size of the tag and valid field and to count only the size of the data.
- Each cache entry holds: valid bit | tag | data.

DM Cache Size: Example
How many total bits are required for a DM cache with 16 KiB of data and 4-word blocks, assuming a 32-bit address?
Solution:
- 16 KiB is 4096 ($2^{12}$) words. With a block size of 4 words ($2^2$), there are 1024 ($2^{10}$) blocks.
- Each block has data: 4 x 32 = 128 bits, plus tag: 32 - 10 - 2 - 2 = 18 bits, plus valid bit: 1 bit.
- Thus, the total cache size is (128 + 18 + 1) x 1024 blocks = 150,528 bits (147 Kibibits):
  $2^{10} \times (4 \times 32 + (32 - 10 - 2 - 2) + 1) = 2^{10} \times 147 = 147$ Kibibits.
- That is 18.4 KiB for a 16 KiB cache: the total number of bits is about 1.15 times as many as needed for the data storage alone.

Cache Access and Size
This cache has 1024 entries. Each entry (block) is one word, and each word is 32 bits (4 bytes). Therefore:
- 2 bits are used as the byte offset
- 10 bits are used as the index
- Tag = 32 - (10 + 2) = 20 bits
- The cache size in this case is 4 KiB.

Cache Performance Metrics
- HitRate = #Hits / #Accesses
- MissRate = #Misses / #Accesses = 1 - HitRate
- HitTime = time for a hit
- MissPenalty = cost of a miss
- Average Memory Access Time (AMAT) = HitTime + MissRate x MissPenalty
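The bit count for a direct-mapped cache can be reproduced with a short calculation. The sketch below assumes power-of-two sizes, 32-bit words, and one valid bit per block, as in the slides. The function name `direct_mapped_bits` and its parameters are illustrative, not part of the lecture.

```python
def direct_mapped_bits(data_bytes, words_per_block, address_bits=32, word_bits=32):
    """Return (tag_bits, total_bits) for a direct-mapped cache.

    Assumes all sizes are powers of two and one valid bit per block.
    """
    bytes_per_word = word_bits // 8
    num_blocks = data_bytes // bytes_per_word // words_per_block
    n = num_blocks.bit_length() - 1                    # index bits
    m = words_per_block.bit_length() - 1               # word-within-block bits
    byte_offset_bits = bytes_per_word.bit_length() - 1
    tag_bits = address_bits - (n + m + byte_offset_bits)
    bits_per_block = words_per_block * word_bits + tag_bits + 1  # data + tag + valid
    return tag_bits, num_blocks * bits_per_block


# 16 KiB of data, 4-word blocks, 32-bit addresses.
tag_bits, total_bits = direct_mapped_bits(16 * 1024, 4)
print(tag_bits, total_bits, total_bits / 1024, total_bits / 8 / 1024)
```

For the 16 KiB example this gives an 18-bit tag and 150,528 bits in total (147 Kibibits, about 18.4 KiB); for the 1024-entry, one-word-per-block cache it gives a 20-bit tag.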

Cache Miss Categories: 3 Cs Model
- Compulsory: the first access to a block is always a miss. Also called cold start misses. These are the misses in an infinite-size cache.
- Capacity: the cache cannot contain all blocks needed. Capacity misses occur due to blocks being discarded and later retrieved.
- Conflict: multiple memory locations mapped to the same cache location. Also called collision misses. All other misses.

Cache Example (more blocks)
Recall the earlier cache with 4 blocks (8 requests, 7 misses). This cache has 8 blocks (instead of 4), 1 word/block, direct mapped.
Initial state: all eight entries (index 000 to 111) have V = N.

Word addr | Binary addr | Hit/miss | Cache block
22 | 10 110 | Miss | 110
26 | 11 010 | Miss | 010
22 | 10 110 | Hit | 110
26 | 11 010 | Hit | 010

Cache contents after these accesses:
Index | V | Tag | Data
000 | N | |
001 | N | |
010 | Y | 11 | Mem[11010]
011 | N | |
100 | N | |
101 | N | |
110 | Y | 10 | Mem[10110]
111 | N | |
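The 3 Cs model can be turned into a small classifier. The sketch below uses one common convention, which is an assumption on our part and not stated in the slides: a miss is compulsory if the block was never referenced before, a conflict miss if a fully associative LRU cache of the same size would have hit, and a capacity miss otherwise. The block-address sequence 22, 26, 22, 26, 16, 3, 16, 18 is also an assumption: it is the sequence consistent with the MemBlock entries and the hit/miss pattern shown in the examples. The name `classify_misses` is illustrative.

```python
from collections import OrderedDict


def classify_misses(block_addresses, num_blocks):
    """Count hits and compulsory/capacity/conflict misses of a direct-mapped cache."""
    dm_entries = [None] * num_blocks   # direct-mapped cache: block held by each entry
    lru = OrderedDict()                # fully associative LRU cache of the same size
    seen = set()                       # every block referenced so far (infinite cache)
    counts = {"hit": 0, "compulsory": 0, "capacity": 0, "conflict": 0}
    for block in block_addresses:
        index = block % num_blocks
        dm_hit = dm_entries[index] == block
        fa_hit = block in lru
        # Update the fully associative reference cache.
        if fa_hit:
            lru.move_to_end(block)
        else:
            if len(lru) == num_blocks:
                lru.popitem(last=False)    # evict the least recently used block
            lru[block] = True
        # Classify the access as seen by the direct-mapped cache.
        if dm_hit:
            counts["hit"] += 1
        elif block not in seen:
            counts["compulsory"] += 1
        elif fa_hit:
            counts["conflict"] += 1
        else:
            counts["capacity"] += 1
        dm_entries[index] = block
        seen.add(block)
    return counts


sequence = [22, 26, 22, 26, 16, 3, 16, 18]   # assumed access order, see lead-in
print(classify_misses(sequence, 4))
print(classify_misses(sequence, 8))
```

Under these assumptions the 4-block cache suffers 5 compulsory and 2 conflict misses (7 in total), while the 8-block cache keeps only the 5 compulsory misses, which matches the "7 misses" and "5 misses" totals quoted for the two caches.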

Cache Example (continued)
Word addr | Binary addr | Hit/miss | Cache block
16 | 10 000 | Miss | 000
3 | 00 011 | Miss | 011
16 | 10 000 | Hit | 000
18 | 10 010 | Miss | 010

Final cache contents:
Index | V | Tag | Data
000 | Y | 10 | Mem[10000]
001 | N | |
010 | Y | 10 | Mem[10010]
011 | Y | 00 | Mem[00011]
100 | N | |
101 | N | |
110 | Y | 10 | Mem[10110]
111 | N | |

8 requests, 5 misses (vs. 8 requests with 7 misses for the 4-block cache).

Address Subdivision
A cache memory can hold 32 Kbytes. Data is transferred between main memory and the cache in blocks of 16 bytes each. The main memory consists of 512 Kbytes. Show the format of main memory addresses in a direct-mapped cache organization. Assume that addressing is done at the byte level.
Solution:
- Total address lines needed for 512 Kbytes: 19 bits.
- Number of blocks in the cache: 32 Kbytes / 16 bytes = 2048, so 11 bits for the index.
- Byte offset within a block: 16 bytes, so 4 bits for the word (byte offset).
- Tag = 19 - 11 - 4 = 4 bits.

Mapping Functions
- Use a small cache with 128 blocks of 16 words.
- Use a main memory with 64K words (4K blocks).
- Word-addressable memory, so a 16-bit address.
- Direct mapping.

Direct Mapped Cache Example
Cache size = 1K words, one word/block.
[Figure: address subdivision and cache architecture. The 32-bit address is split into a 20-bit tag, a 10-bit index, and a 2-bit byte offset. The index selects one of the 1024 entries (valid, tag, data); a comparator checks the stored 20-bit tag against the address tag to produce Hit, and the 32-bit data word is returned.]
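The address-format exercise above follows one recipe: offset bits from the block size, index bits from the number of cache blocks, and the tag takes whatever address bits remain. A minimal sketch, assuming all sizes are powers of two and are given in addressable units (bytes for a byte-addressable memory, words for a word-addressable one). The name `address_fields` is hypothetical.

```python
def address_fields(memory_units, cache_units, block_units):
    """Return (tag_bits, index_bits, offset_bits) for a direct-mapped cache.

    All sizes are in addressable units and assumed to be powers of two.
    """
    address_bits = (memory_units - 1).bit_length()
    offset_bits = (block_units - 1).bit_length()
    index_bits = (cache_units // block_units - 1).bit_length()
    tag_bits = address_bits - index_bits - offset_bits
    return tag_bits, index_bits, offset_bits


# 512 Kbytes of main memory, 32-Kbyte cache, 16-byte blocks (byte addressable).
print(address_fields(512 * 1024, 32 * 1024, 16))
# 64K-word memory, 128 blocks of 16 words (word addressable).
print(address_fields(64 * 1024, 128 * 16, 16))
# 32-bit byte addresses, 1K one-word (4-byte) blocks.
print(address_fields(2 ** 32, 1024 * 4, 4))
```

The first call reproduces the 4/11/4 split of the 19-bit address worked out above; the last one reproduces the 20-bit tag, 10-bit index, and 2-bit byte offset of the 1K-word cache.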

Multiword Block Direct Mapped Cache
Cache size = 1K words, four words/block.
[Figure: the address is split into tag, 8-bit index, block offset, and byte offset. The index selects one of the entries (valid, tag, data); the tag comparison produces Hit, and the block offset selects the 32-bit data word.]
What kind of locality are we taking advantage of?

Multiword Block Direct Mapped Cache
Cache size = 16K words, four words/block.
[Figure: address (showing bit positions) split into a 16-bit tag, index, block offset, and byte offset. The index selects one of 4K entries holding 128 bits of data; the tag comparison produces Hit, and a multiplexer driven by the block offset selects the 32-bit data word.]

Cache Misses
- On a cache hit, the CPU proceeds normally.
- On a cache miss: stall the CPU pipeline and fetch the block from the next level of the hierarchy.
- Instruction cache miss: restart instruction fetch.
- Data cache miss: complete data access.

Cache Misses / Improving Cache Performance
- Compulsory: first access to a block, a cold fact of life.
- Conflict: multiple memory locations mapped to the same cache location. Solution 1: increase cache size. Solution 2: increase associativity.
- Capacity: the cache cannot contain all blocks accessed by the program. Solution: increase cache size.
AMAT = HitTime + MissRate x MissPenalty
- Reduce HitTime: small and simple cache.
- Reduce MissRate: larger block size, higher associativity.
- Reduce MissPenalty: multilevel caches, give priority to read misses.

Example: Intrinsity FastMATH
- Embedded MIPS processor
- 12-stage pipeline
- Instruction and data access on each cycle
- Split cache: separate I-cache and D-cache, each 16 KB: 256 blocks x 16 words/block
- D-cache: write-through or write-back
- SPEC2000 miss rates: I-cache 0.4%, D-cache 11.4%, weighted average 3.2%

Example: Intrinsity FastMATH
[Figure: FastMATH cache organization.]

Example Access Pattern
- Assume that addresses are 8 bits long.
- How many of the following address requests are hits/misses? 4, 7, 10, 13, 16, 68, 73, 78, 83, 88, 4, 7, 10
- 8-byte words.
- Direct-mapped cache: each address maps to a unique cache location.
[Figure: the byte address is split into tag and index fields; the tag read from the tag array is compared with the address tag while the data array is read.]

Summary
- The principle of locality: programs are likely to access a relatively small portion of the address space at any instant of time. Temporal locality: locality in time. Spatial locality: locality in space.
- Three major categories of cache misses:
  - Compulsory misses: sad facts of life, e.g., cold start misses.
  - Conflict misses: increase cache size and/or associativity. Nightmare scenario: ping-pong effect!
  - Capacity misses: increase cache size.

What's Next?
- Set associative caches
- Write and replacement policies

Memory Systems that Support Caches
The off-chip interconnect and memory architecture affect overall system performance dramatically.
[Figure: on-chip CPU and cache connected by a bus (32-bit data and 32-bit address per cycle) to main memory.]
Assume:
1. 1 clock cycle (1 ns) to send the address from the cache to main memory.
2. 50 ns (50 processor clock cycles) for DRAM first-word access time, 10 ns (10 clock cycles) cycle time (remaining words in a burst for SDRAM).
3. 1 clock cycle (1 ns) to return a word of data from main memory to the cache.
Memory-bus to cache bandwidth: the number of bytes accessed from main memory and transferred to the cache/CPU per clock cycle.

One Word Wide Memory Organization
[Figure: on-chip CPU and cache, bus, main memory.]
If the block size is one word, then for a memory access due to a cache miss, the pipeline has to stall for the number of cycles required to return one data word from memory:
- 1 cycle to send the address
- 50 cycles to read DRAM
- 1 cycle to return the data
- 52 total clock cycles of miss penalty
The number of bytes transferred per clock cycle (bandwidth) for a miss is 4/52 = 0.077 bytes per clock.

Burst Memory Organization
[Figure: on-chip CPU and cache, bus, main memory.]
What if the block size is four words and a (DDR) SDRAM is used?
- 1 cycle to send the first address
- 50 + 3 x 10 = 80 cycles to read DRAM (50 cycles for the first word, then 10 cycles for each of the remaining three words)
- 1 cycle to return the last data word
- 82 total clock cycles of miss penalty
The number of bytes transferred per clock cycle (bandwidth) for a single miss is (4 x 4)/82 = 0.195 bytes per clock.
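The two miss-penalty calculations follow the same pattern and can be parameterized. The sketch below assumes the timing model stated in the slides (1 cycle to send the address, 50 cycles for the first DRAM word, 10 cycles for each further word of a burst, 1 cycle to return the last word) and 4-byte words. The function names and keyword parameters are illustrative.

```python
def miss_penalty_cycles(words_per_block, send_address=1, first_access=50,
                        burst_cycle=10, return_word=1):
    """Clock cycles to fetch one block from a burst-capable DRAM."""
    dram_cycles = first_access + (words_per_block - 1) * burst_cycle
    return send_address + dram_cycles + return_word


def bandwidth_bytes_per_cycle(words_per_block, bytes_per_word=4):
    """Bytes delivered per clock cycle for a single miss."""
    return words_per_block * bytes_per_word / miss_penalty_cycles(words_per_block)


for words in (1, 4):
    print(words, miss_penalty_cycles(words),
          round(bandwidth_bytes_per_cycle(words), 3))
```

Going from one-word to four-word blocks raises the miss penalty from 52 to 82 cycles, but the bandwidth per miss rises from about 0.077 to about 0.195 bytes per clock because the 50-cycle first access is paid only once.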


More information

Memory Hierarchy. Maurizio Palesi. Maurizio Palesi 1

Memory Hierarchy. Maurizio Palesi. Maurizio Palesi 1 Memory Hierarchy Maurizio Palesi Maurizio Palesi 1 References John L. Hennessy and David A. Patterson, Computer Architecture a Quantitative Approach, second edition, Morgan Kaufmann Chapter 5 Maurizio

More information

COMPUTER ORGANIZATION AND DESIGN

COMPUTER ORGANIZATION AND DESIGN COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 5 Large and Fast: Exploiting Memory Hierarchy Principle of Locality Programs access a small proportion of their address

More information

Donn Morrison Department of Computer Science. TDT4255 Memory hierarchies

Donn Morrison Department of Computer Science. TDT4255 Memory hierarchies TDT4255 Lecture 10: Memory hierarchies Donn Morrison Department of Computer Science 2 Outline Chapter 5 - Memory hierarchies (5.1-5.5) Temporal and spacial locality Hits and misses Direct-mapped, set associative,

More information

Computer Architecture Computer Science & Engineering. Chapter 5. Memory Hierachy BK TP.HCM

Computer Architecture Computer Science & Engineering. Chapter 5. Memory Hierachy BK TP.HCM Computer Architecture Computer Science & Engineering Chapter 5 Memory Hierachy Memory Technology Static RAM (SRAM) 0.5ns 2.5ns, $2000 $5000 per GB Dynamic RAM (DRAM) 50ns 70ns, $20 $75 per GB Magnetic

More information

Chapter 5. Large and Fast: Exploiting Memory Hierarchy

Chapter 5. Large and Fast: Exploiting Memory Hierarchy Chapter 5 Large and Fast: Exploiting Memory Hierarchy Memory Technology Static RAM (SRAM) 0.5ns 2.5ns, $2000 $5000 per GB Dynamic RAM (DRAM) 50ns 70ns, $20 $75 per GB Magnetic disk 5ms 20ms, $0.20 $2 per

More information

Chapter 5. Large and Fast: Exploiting Memory Hierarchy

Chapter 5. Large and Fast: Exploiting Memory Hierarchy Chapter 5 Large and Fast: Exploiting Memory Hierarchy Static RAM (SRAM) Dynamic RAM (DRAM) 50ns 70ns, $20 $75 per GB Magnetic disk 0.5ns 2.5ns, $2000 $5000 per GB 5.1 Introduction Memory Technology 5ms

More information

Review: Performance Latency vs. Throughput. Time (seconds/program) is performance measure Instructions Clock cycles Seconds.

Review: Performance Latency vs. Throughput. Time (seconds/program) is performance measure Instructions Clock cycles Seconds. Performance 980 98 982 983 984 985 986 987 988 989 990 99 992 993 994 995 996 997 998 999 2000 7/4/20 CS 6C: Great Ideas in Computer Architecture (Machine Structures) Caches Instructor: Michael Greenbaum

More information

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface COEN-4710 Computer Hardware Lecture 7 Large and Fast: Exploiting Memory Hierarchy (Chapter 5) Cristinel Ababei Marquette University Department

More information

Modern Computer Architecture

Modern Computer Architecture Modern Computer Architecture Lecture3 Review of Memory Hierarchy Hongbin Sun 国家集成电路人才培养基地 Xi an Jiaotong University Performance 1000 Recap: Who Cares About the Memory Hierarchy? Processor-DRAM Memory Gap

More information

The Memory Hierarchy & Cache

The Memory Hierarchy & Cache Removing The Ideal Memory Assumption: The Memory Hierarchy & Cache The impact of real memory on CPU Performance. Main memory basic properties: Memory Types: DRAM vs. SRAM The Motivation for The Memory

More information

ELEC 5200/6200 Computer Architecture and Design Spring 2017 Lecture 6: Memory Organization Part I

ELEC 5200/6200 Computer Architecture and Design Spring 2017 Lecture 6: Memory Organization Part I ELEC 5200/6200 Computer Architecture and Design Spring 2017 Lecture 6: Memory Organization Part I Ujjwal Guin, Assistant Professor Department of Electrical and Computer Engineering Auburn University, Auburn,

More information

Caches Part 1. Instructor: Sören Schwertfeger. School of Information Science and Technology SIST

Caches Part 1. Instructor: Sören Schwertfeger.   School of Information Science and Technology SIST CS 110 Computer Architecture Caches Part 1 Instructor: Sören Schwertfeger http://shtech.org/courses/ca/ School of Information Science and Technology SIST ShanghaiTech University Slides based on UC Berkley's

More information

COSC 6385 Computer Architecture. - Memory Hierarchies (I)

COSC 6385 Computer Architecture. - Memory Hierarchies (I) COSC 6385 Computer Architecture - Hierarchies (I) Fall 2007 Slides are based on a lecture by David Culler, University of California, Berkley http//www.eecs.berkeley.edu/~culler/courses/cs252-s05 Recap

More information

EECS151/251A Spring 2018 Digital Design and Integrated Circuits. Instructors: John Wawrzynek and Nick Weaver. Lecture 19: Caches EE141

EECS151/251A Spring 2018 Digital Design and Integrated Circuits. Instructors: John Wawrzynek and Nick Weaver. Lecture 19: Caches EE141 EECS151/251A Spring 2018 Digital Design and Integrated Circuits Instructors: John Wawrzynek and Nick Weaver Lecture 19: Caches Cache Introduction 40% of this ARM CPU is devoted to SRAM cache. But the role

More information

EN1640: Design of Computing Systems Topic 06: Memory System

EN1640: Design of Computing Systems Topic 06: Memory System EN164: Design of Computing Systems Topic 6: Memory System Professor Sherief Reda http://scale.engin.brown.edu Electrical Sciences and Computer Engineering School of Engineering Brown University Spring

More information

The Memory Hierarchy Cache, Main Memory, and Virtual Memory

The Memory Hierarchy Cache, Main Memory, and Virtual Memory The Memory Hierarchy Cache, Main Memory, and Virtual Memory Lecture for CPSC 5155 Edward Bosworth, Ph.D. Computer Science Department Columbus State University The Simple View of Memory The simplest view

More information

Handout 4 Memory Hierarchy

Handout 4 Memory Hierarchy Handout 4 Memory Hierarchy Outline Memory hierarchy Locality Cache design Virtual address spaces Page table layout TLB design options (MMU Sub-system) Conclusion 2012/11/7 2 Since 1980, CPU has outpaced

More information

The Memory Hierarchy & Cache Review of Memory Hierarchy & Cache Basics (from 350):

The Memory Hierarchy & Cache Review of Memory Hierarchy & Cache Basics (from 350): The Memory Hierarchy & Cache Review of Memory Hierarchy & Cache Basics (from 350): Motivation for The Memory Hierarchy: { CPU/Memory Performance Gap The Principle Of Locality Cache $$$$$ Cache Basics:

More information

Advanced Memory Organizations

Advanced Memory Organizations CSE 3421: Introduction to Computer Architecture Advanced Memory Organizations Study: 5.1, 5.2, 5.3, 5.4 (only parts) Gojko Babić 03-29-2018 1 Growth in Performance of DRAM & CPU Huge mismatch between CPU

More information

CS152 Computer Architecture and Engineering Lecture 17: Cache System

CS152 Computer Architecture and Engineering Lecture 17: Cache System CS152 Computer Architecture and Engineering Lecture 17 System March 17, 1995 Dave Patterson (patterson@cs) and Shing Kong (shing.kong@eng.sun.com) Slides available on http//http.cs.berkeley.edu/~patterson

More information

CS 61C: Great Ideas in Computer Architecture. The Memory Hierarchy, Fully Associative Caches

CS 61C: Great Ideas in Computer Architecture. The Memory Hierarchy, Fully Associative Caches CS 61C: Great Ideas in Computer Architecture The Memory Hierarchy, Fully Associative Caches Instructor: Alan Christopher 7/09/2014 Summer 2014 -- Lecture #10 1 Review of Last Lecture Floating point (single

More information

ECE232: Hardware Organization and Design

ECE232: Hardware Organization and Design ECE232: Hardware Organization and Design Lecture 22: Introduction to Caches Adapted from Computer Organization and Design, Patterson & Hennessy, UCB Overview Caches hold a subset of data from the main

More information

CS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 2

CS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 2 CS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 2 Instructors: John Wawrzynek & Vladimir Stojanovic http://insteecsberkeleyedu/~cs61c/ Typical Memory Hierarchy Datapath On-Chip

More information

5. Memory Hierarchy Computer Architecture COMP SCI 2GA3 / SFWR ENG 2GA3. Emil Sekerinski, McMaster University, Fall Term 2015/16

5. Memory Hierarchy Computer Architecture COMP SCI 2GA3 / SFWR ENG 2GA3. Emil Sekerinski, McMaster University, Fall Term 2015/16 5. Memory Hierarchy Computer Architecture COMP SCI 2GA3 / SFWR ENG 2GA3 Emil Sekerinski, McMaster University, Fall Term 2015/16 Movie Rental Store You have a huge warehouse with every movie ever made.

More information

LECTURE 4: LARGE AND FAST: EXPLOITING MEMORY HIERARCHY

LECTURE 4: LARGE AND FAST: EXPLOITING MEMORY HIERARCHY LECTURE 4: LARGE AND FAST: EXPLOITING MEMORY HIERARCHY Abridged version of Patterson & Hennessy (2013):Ch.5 Principle of Locality Programs access a small proportion of their address space at any time Temporal

More information

Memory Hierarchy Y. K. Malaiya

Memory Hierarchy Y. K. Malaiya Memory Hierarchy Y. K. Malaiya Acknowledgements Computer Architecture, Quantitative Approach - Hennessy, Patterson Vishwani D. Agrawal Review: Major Components of a Computer Processor Control Datapath

More information

Memory. Principle of Locality. It is impossible to have memory that is both. We create an illusion for the programmer. Employ memory hierarchy

Memory. Principle of Locality. It is impossible to have memory that is both. We create an illusion for the programmer. Employ memory hierarchy Datorarkitektur och operativsystem Lecture 7 Memory It is impossible to have memory that is both Unlimited (large in capacity) And fast 5.1 Intr roduction We create an illusion for the programmer Before

More information

CPE 631 Lecture 04: CPU Caches

CPE 631 Lecture 04: CPU Caches Lecture 04 CPU Caches Electrical and Computer Engineering University of Alabama in Huntsville Outline Memory Hierarchy Four Questions for Memory Hierarchy Cache Performance 26/01/2004 UAH- 2 1 Processor-DR

More information

CS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 1

CS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 1 CS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 1 Instructors: Nicholas Weaver & Vladimir Stojanovic http://inst.eecs.berkeley.edu/~cs61c/ Components of a Computer Processor

More information

ECE331: Hardware Organization and Design

ECE331: Hardware Organization and Design ECE331: Hardware Organization and Design Lecture 22: Direct Mapped Cache Adapted from Computer Organization and Design, Patterson & Hennessy, UCB Intel 8-core i7-5960x 3 GHz, 8-core, 20 MB of cache, 140

More information

COSC 6385 Computer Architecture - Memory Hierarchies (I)

COSC 6385 Computer Architecture - Memory Hierarchies (I) COSC 6385 Computer Architecture - Memory Hierarchies (I) Edgar Gabriel Spring 2018 Some slides are based on a lecture by David Culler, University of California, Berkley http//www.eecs.berkeley.edu/~culler/courses/cs252-s05

More information

ECE232: Hardware Organization and Design

ECE232: Hardware Organization and Design ECE232: Hardware Organization and Design Lecture 21: Memory Hierarchy Adapted from Computer Organization and Design, Patterson & Hennessy, UCB Overview Ideally, computer memory would be large and fast

More information

10/16/17. Outline. Outline. Typical Memory Hierarchy. Adding Cache to Computer. Key Cache Concepts

10/16/17. Outline. Outline. Typical Memory Hierarchy. Adding Cache to Computer. Key Cache Concepts // CS C: Great Ideas in Computer Architecture (Machine Structures) s Part Instructors: Krste Asanović & Randy H Katz http://insteecsberkeleyedu/~csc/ Organization and Principles Write Back vs Write Through

More information

EEC 170 Computer Architecture Fall Improving Cache Performance. Administrative. Review: The Memory Hierarchy. Review: Principle of Locality

EEC 170 Computer Architecture Fall Improving Cache Performance. Administrative. Review: The Memory Hierarchy. Review: Principle of Locality Administrative EEC 7 Computer Architecture Fall 5 Improving Cache Performance Problem #6 is posted Last set of homework You should be able to answer each of them in -5 min Quiz on Wednesday (/7) Chapter

More information

Caches. Hiding Memory Access Times

Caches. Hiding Memory Access Times Caches Hiding Memory Access Times PC Instruction Memory 4 M U X Registers Sign Ext M U X Sh L 2 Data Memory M U X C O N T R O L ALU CTL INSTRUCTION FETCH INSTR DECODE REG FETCH EXECUTE/ ADDRESS CALC MEMORY

More information

EN1640: Design of Computing Systems Topic 06: Memory System

EN1640: Design of Computing Systems Topic 06: Memory System EN164: Design of Computing Systems Topic 6: Memory System Professor Sherief Reda http://scale.engin.brown.edu Electrical Sciences and Computer Engineering School of Engineering Brown University Spring

More information

CS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 2

CS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 2 CS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 2 Instructors: Krste Asanović & Randy H. Katz http://inst.eecs.berkeley.edu/~cs61c/ 10/16/17 Fall 2017 - Lecture #15 1 Outline

More information

Chapter Seven. Large & Fast: Exploring Memory Hierarchy

Chapter Seven. Large & Fast: Exploring Memory Hierarchy Chapter Seven Large & Fast: Exploring Memory Hierarchy 1 Memories: Review SRAM (Static Random Access Memory): value is stored on a pair of inverting gates very fast but takes up more space than DRAM DRAM

More information

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition. Chapter 5. Large and Fast: Exploiting Memory Hierarchy

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition. Chapter 5. Large and Fast: Exploiting Memory Hierarchy COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 5 Large and Fast: Exploiting Memory Hierarchy Different Storage Memories Chapter 5 Large and Fast: Exploiting Memory

More information

CS3350B Computer Architecture

CS3350B Computer Architecture CS3350B Computer Architecture Winter 2015 Lecture 3.1: Memory Hierarchy: What and Why? Marc Moreno Maza www.csd.uwo.ca/courses/cs3350b [Adapted from lectures on Computer Organization and Design, Patterson

More information

The Memory Hierarchy & Cache The impact of real memory on CPU Performance. Main memory basic properties: Memory Types: DRAM vs.

The Memory Hierarchy & Cache The impact of real memory on CPU Performance. Main memory basic properties: Memory Types: DRAM vs. The Hierarchical Memory System The Memory Hierarchy & Cache The impact of real memory on CPU Performance. Main memory basic properties: Memory Types: DRAM vs. SRAM The Motivation for The Memory Hierarchy:

More information

CS 152 Computer Architecture and Engineering. Lecture 7 - Memory Hierarchy-II

CS 152 Computer Architecture and Engineering. Lecture 7 - Memory Hierarchy-II CS 152 Computer Architecture and Engineering Lecture 7 - Memory Hierarchy-II Krste Asanovic Electrical Engineering and Computer Sciences University of California at Berkeley http://www.eecs.berkeley.edu/~krste

More information

Chapter 5. Large and Fast: Exploiting Memory Hierarchy. Jiang Jiang

Chapter 5. Large and Fast: Exploiting Memory Hierarchy. Jiang Jiang Chapter 5 Large and Fast: Exploiting Memory Hierarchy Jiang Jiang jiangjiang@ic.sjtu.edu.cn [Adapted from Computer Organization and Design, 4 th Edition, Patterson & Hennessy, 2008, MK] Chapter 5 Large

More information

Memory Hierarchy: Caches, Virtual Memory

Memory Hierarchy: Caches, Virtual Memory Memory Hierarchy: Caches, Virtual Memory Readings: 5.1-5.4, 5.8 Big memories are slow Computer Fast memories are small Processor Memory Devices Control Input Datapath Output Need to get fast, big memories

More information

Memory Hierarchy. Reading. Sections 5.1, 5.2, 5.3, 5.4, 5.8 (some elements), 5.9 (2) Lecture notes from MKP, H. H. Lee and S.

Memory Hierarchy. Reading. Sections 5.1, 5.2, 5.3, 5.4, 5.8 (some elements), 5.9 (2) Lecture notes from MKP, H. H. Lee and S. Memory Hierarchy Lecture notes from MKP, H. H. Lee and S. Yalamanchili Sections 5.1, 5.2, 5.3, 5.4, 5.8 (some elements), 5.9 Reading (2) 1 SRAM: Value is stored on a pair of inerting gates Very fast but

More information

Memory Hierarchy, Fully Associative Caches. Instructor: Nick Riasanovsky

Memory Hierarchy, Fully Associative Caches. Instructor: Nick Riasanovsky Memory Hierarchy, Fully Associative Caches Instructor: Nick Riasanovsky Review Hazards reduce effectiveness of pipelining Cause stalls/bubbles Structural Hazards Conflict in use of datapath component Data

More information

registers data 1 registers MEMORY ADDRESS on-chip cache off-chip cache main memory: real address space part of virtual addr. sp.

registers data 1 registers MEMORY ADDRESS on-chip cache off-chip cache main memory: real address space part of virtual addr. sp. Cache associativity Cache and performance 12 1 CMPE110 Spring 2005 A. Di Blas 110 Spring 2005 CMPE Cache Direct-mapped cache Reads and writes Textbook Edition: 7.1 to 7.3 Second Third Edition: 7.1 to 7.3

More information

LECTURE 11. Memory Hierarchy

LECTURE 11. Memory Hierarchy LECTURE 11 Memory Hierarchy MEMORY HIERARCHY When it comes to memory, there are two universally desirable properties: Large Size: ideally, we want to never have to worry about running out of memory. Speed

More information

CS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 2

CS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 2 CS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 2 Instructors: Bernhard Boser & Randy H Katz http://insteecsberkeleyedu/~cs61c/ 10/18/16 Fall 2016 - Lecture #15 1 Outline

More information

CMPT 300 Introduction to Operating Systems

CMPT 300 Introduction to Operating Systems CMPT 300 Introduction to Operating Systems Cache 0 Acknowledgement: some slides are taken from CS61C course material at UC Berkeley Agenda Memory Hierarchy Direct Mapped Caches Cache Performance Set Associative

More information

Performance! (1/latency)! 1000! 100! 10! Capacity Access Time Cost. CPU Registers 100s Bytes <10s ns. Cache K Bytes ns 1-0.

Performance! (1/latency)! 1000! 100! 10! Capacity Access Time Cost. CPU Registers 100s Bytes <10s ns. Cache K Bytes ns 1-0. Since 1980, CPU has outpaced DRAM... EEL 5764: Graduate Computer Architecture Appendix C Hierarchy Review Ann Gordon-Ross Electrical and Computer Engineering University of Florida http://www.ann.ece.ufl.edu/

More information

CS 61C: Great Ideas in Computer Architecture Caches Part 2

CS 61C: Great Ideas in Computer Architecture Caches Part 2 CS 61C: Great Ideas in Computer Architecture Caches Part 2 Instructors: Nicholas Weaver & Vladimir Stojanovic http://insteecsberkeleyedu/~cs61c/fa15 Software Parallel Requests Assigned to computer eg,

More information

CS161 Design and Architecture of Computer Systems. Cache $$$$$

CS161 Design and Architecture of Computer Systems. Cache $$$$$ CS161 Design and Architecture of Computer Systems Cache $$$$$ Memory Systems! How can we supply the CPU with enough data to keep it busy?! We will focus on memory issues,! which are frequently bottlenecks

More information

ECE7995 (4) Basics of Memory Hierarchy. [Adapted from Mary Jane Irwin s slides (PSU)]

ECE7995 (4) Basics of Memory Hierarchy. [Adapted from Mary Jane Irwin s slides (PSU)] ECE7995 (4) Basics of Memory Hierarchy [Adapted from Mary Jane Irwin s slides (PSU)] Major Components of a Computer Processor Devices Control Memory Input Datapath Output Performance Processor-Memory Performance

More information

Cache Memory COE 403. Computer Architecture Prof. Muhamed Mudawar. Computer Engineering Department King Fahd University of Petroleum and Minerals

Cache Memory COE 403. Computer Architecture Prof. Muhamed Mudawar. Computer Engineering Department King Fahd University of Petroleum and Minerals Cache Memory COE 403 Computer Architecture Prof. Muhamed Mudawar Computer Engineering Department King Fahd University of Petroleum and Minerals Presentation Outline The Need for Cache Memory The Basics

More information

Chapter Seven. Memories: Review. Exploiting Memory Hierarchy CACHE MEMORY AND VIRTUAL MEMORY

Chapter Seven. Memories: Review. Exploiting Memory Hierarchy CACHE MEMORY AND VIRTUAL MEMORY Chapter Seven CACHE MEMORY AND VIRTUAL MEMORY 1 Memories: Review SRAM: value is stored on a pair of inverting gates very fast but takes up more space than DRAM (4 to 6 transistors) DRAM: value is stored

More information

Advanced Computer Architecture

Advanced Computer Architecture ECE 563 Advanced Computer Architecture Fall 2009 Lecture 3: Memory Hierarchy Review: Caches 563 L03.1 Fall 2010 Since 1980, CPU has outpaced DRAM... Four-issue 2GHz superscalar accessing 100ns DRAM could

More information

10/19/17. You Are Here! Review: Direct-Mapped Cache. Typical Memory Hierarchy

10/19/17. You Are Here! Review: Direct-Mapped Cache. Typical Memory Hierarchy CS 6C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 3 Instructors: Krste Asanović & Randy H Katz http://insteecsberkeleyedu/~cs6c/ Parallel Requests Assigned to computer eg, Search

More information

Chapter 5 Large and Fast: Exploiting Memory Hierarchy (Part 1)

Chapter 5 Large and Fast: Exploiting Memory Hierarchy (Part 1) Department of Electr rical Eng ineering, Chapter 5 Large and Fast: Exploiting Memory Hierarchy (Part 1) 王振傑 (Chen-Chieh Wang) ccwang@mail.ee.ncku.edu.tw ncku edu Depar rtment of Electr rical Engineering,

More information

CS/ECE 3330 Computer Architecture. Chapter 5 Memory

CS/ECE 3330 Computer Architecture. Chapter 5 Memory CS/ECE 3330 Computer Architecture Chapter 5 Memory Last Chapter n Focused exclusively on processor itself n Made a lot of simplifying assumptions IF ID EX MEM WB n Reality: The Memory Wall 10 6 Relative

More information

Let!s go back to a course goal... Let!s go back to a course goal... Question? Lecture 22 Introduction to Memory Hierarchies

Let!s go back to a course goal... Let!s go back to a course goal... Question? Lecture 22 Introduction to Memory Hierarchies 1 Lecture 22 Introduction to Memory Hierarchies Let!s go back to a course goal... At the end of the semester, you should be able to......describe the fundamental components required in a single core of

More information

Recap: Machine Organization

Recap: Machine Organization ECE232: Hardware Organization and Design Part 14: Hierarchy Chapter 5 (4 th edition), 7 (3 rd edition) http://www.ecs.umass.edu/ece/ece232/ Adapted from Computer Organization and Design, Patterson & Hennessy,

More information

ECE7995 (6) Improving Cache Performance. [Adapted from Mary Jane Irwin s slides (PSU)]

ECE7995 (6) Improving Cache Performance. [Adapted from Mary Jane Irwin s slides (PSU)] ECE7995 (6) Improving Cache Performance [Adapted from Mary Jane Irwin s slides (PSU)] Measuring Cache Performance Assuming cache hit costs are included as part of the normal CPU execution cycle, then CPU

More information

CS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 3

CS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 3 CS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 3 Instructors: Krste Asanović & Randy H. Katz http://inst.eecs.berkeley.edu/~cs61c/ 10/19/17 Fall 2017 - Lecture #16 1 Parallel

More information

Topics. Digital Systems Architecture EECE EECE Need More Cache?

Topics. Digital Systems Architecture EECE EECE Need More Cache? Digital Systems Architecture EECE 33-0 EECE 9-0 Need More Cache? Dr. William H. Robinson March, 00 http://eecs.vanderbilt.edu/courses/eece33/ Topics Cache: a safe place for hiding or storing things. Webster

More information

Memory Hierarchies. Instructor: Dmitri A. Gusev. Fall Lecture 10, October 8, CS 502: Computers and Communications Technology

Memory Hierarchies. Instructor: Dmitri A. Gusev. Fall Lecture 10, October 8, CS 502: Computers and Communications Technology Memory Hierarchies Instructor: Dmitri A. Gusev Fall 2007 CS 502: Computers and Communications Technology Lecture 10, October 8, 2007 Memories SRAM: value is stored on a pair of inverting gates very fast

More information

registers data 1 registers MEMORY ADDRESS on-chip cache off-chip cache main memory: real address space part of virtual addr. sp.

registers data 1 registers MEMORY ADDRESS on-chip cache off-chip cache main memory: real address space part of virtual addr. sp. 13 1 CMPE110 Computer Architecture, Winter 2009 Andrea Di Blas 110 Winter 2009 CMPE Cache Direct-mapped cache Reads and writes Cache associativity Cache and performance Textbook Edition: 7.1 to 7.3 Third

More information

ELEC 5200/6200 Computer Architecture and Design Spring 2017 Lecture 7: Memory Organization Part II

ELEC 5200/6200 Computer Architecture and Design Spring 2017 Lecture 7: Memory Organization Part II ELEC 5200/6200 Computer Architecture and Design Spring 2017 Lecture 7: Organization Part II Ujjwal Guin, Assistant Professor Department of Electrical and Computer Engineering Auburn University, Auburn,

More information