ECE 331 Hardware Organization and Design. UMass ECE Discussion 10 4/5/2018



1 ECE 331 Hardware Organization and Design UMass ECE Discussion 10 4/5/2018

2 Today's Discussion Topics
- Direct and Set Associative Cache
- Midterm Review
- Hazards
- Code reordering and forwarding

3 Direct Mapped Cache
- The simplest mapping is a direct mapped cache.
- Each memory address is associated with exactly one possible block within the cache.
- Therefore, we only need to look in a single location in the cache to find the data, if it is in the cache at all.
ECE232: Introduction to Caches 18


4 Direct Mapped Cache - Textbook
- Location is determined by the address.
- Direct mapped: only one choice: (Block address) modulo (#Blocks in cache)
- #Blocks is a power of 2, so use the low-order address bits.
ECE232: Introduction to Caches 14
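The equivalence between the modulo and the low-order address bits can be checked with a short sketch. The names `NUM_BLOCKS`, `index_by_modulo`, and `index_by_low_bits` are illustrative and not from the slides, and the 4-block cache size is an assumption chosen to match the later examples.

```python
NUM_BLOCKS = 4  # assumed cache size; must be a power of 2 for the mask to work

def index_by_modulo(block_address):
    """Cache index as (block address) modulo (#blocks in cache)."""
    return block_address % NUM_BLOCKS

def index_by_low_bits(block_address):
    """Same index, taken directly from the low-order address bits."""
    return block_address & (NUM_BLOCKS - 1)

# Both methods agree for every block address when NUM_BLOCKS is a power of 2.
for block_address in range(16):
    assert index_by_modulo(block_address) == index_by_low_bits(block_address)

print(index_by_modulo(13), index_by_low_bits(13))  # prints: 1 1
```

In hardware the mask costs nothing: the index is just a group of address wires, which is why the number of blocks is chosen to be a power of 2.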

5 Direct mapped cache (assume 1 byte/block)
- Cache Block 0 can be occupied by data from Memory blocks 0, 4, 8, 12
- Cache Block 1 can be occupied by data from Memory blocks 1, 5, 9, 13
- Cache Block 2 can be occupied by data from Memory blocks 2, 6, 10, 14
- Cache Block 3 can be occupied by data from Memory blocks 3, 7, 11, 15
[Figure: 4-Block Direct Mapped Cache, showing the memory block index and the cache index]
ECE232: Introduction to Caches 15

6 Direct Mapped w/tag
- The tag determines which memory block occupies a cache block.
- hit: cache tag field = tag bits of address
- miss: tag field ≠ tag bits of address
[Figure: memory block address split into tag and index; each cache index entry holds a tag]
ECE232: Introduction to Caches 17

7 Why large blocks?
- Fetch large blocks at a time to take advantage of spatial locality.
    for (i = 0; i < length; i++)
        sum += array[i];
- array has spatial locality
- sum has temporal locality
ECE232: Memory Hierarchy 20
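To make the benefit of larger blocks concrete, here is a minimal sketch that counts misses for a sequential scan like the loop above. It assumes a direct mapped cache of 16 bytes total and a 64-byte array of 1-byte elements; the helper `count_misses` and both configurations are illustrative, not taken from the slides.

```python
def count_misses(addresses, block_size, num_blocks):
    """Count misses in a direct mapped cache for a sequence of byte addresses."""
    tags = [None] * num_blocks  # None marks an invalid (empty) cache block
    misses = 0
    for addr in addresses:
        block_addr = addr // block_size
        index = block_addr % num_blocks
        tag = block_addr // num_blocks
        if tags[index] != tag:
            misses += 1          # fetch the whole block from memory
            tags[index] = tag
    return misses

scan = range(64)  # sequential byte accesses, like array[i] in the loop

# Same total cache capacity (16 bytes), different block sizes.
misses_small_blocks = count_misses(scan, block_size=1, num_blocks=16)
misses_large_blocks = count_misses(scan, block_size=4, num_blocks=4)
print(misses_small_blocks, misses_large_blocks)  # prints: 64 16
```

With 4-byte blocks, each miss brings in the next three bytes as well, so only one access in four misses during the scan.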

8 Cache Organization
- Fully-associative: any memory location can be stored anywhere in the cache. Cache location and memory address are unrelated.
- Direct-mapped: each memory location maps onto exactly one cache entry. Some of the memory address bits are used to index the cache.
- N-way set-associative: each memory location maps to exactly one set and can go into any of the N blocks within that set.
[Figure: cache entry with Tag and Data fields; labels: LSBs of Address, MSBs of Address]
ECE232: Associative Caches 7

9 Associativity example
- Main memory: 64 bytes (6-bit addresses, starting at 000000), arranged in 16 blocks of 4 bytes each.
- Cache: 16 bytes, arranged in 4 blocks of 4 bytes each.
- The same 4 cache blocks can be organized as:
  - Direct mapped (1-way set associative): the index, or cache memory block number, is equivalent to the set number
  - 2-way set associative: cache memory set number, plus cache memory block number within the set
  - Fully associative (4-way set associative): cache memory block number only
[Figure: main memory block numbers alongside the three cache organizations]
ECE232: Associative Caches 9

10 Associativity example continued
For example: memory location 13 (001101)
- To find which block in main memory it is stored in: 13 / (4 bytes per block) = 3 remainder 1, so the data can be found in main memory block #3. The remainder, 1, is the actual location of the byte within the block.
- For the direct mapped cache: (memory block #3) / (4 blocks in cache) = 0 remainder 3, so it goes in cache block #3 (same as the mod operation).
- For the 2-way set associative cache: (memory block #3) / (2 sets in cache) = 1 remainder 1, so it goes in cache set #1, in either block within that set.
ECE232: Associative Caches 10
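The arithmetic for memory location 13 can be reproduced in a few lines. This is a sketch using the slide's sizes (4 bytes per block, 4 cache blocks); the constant and variable names are illustrative.

```python
BYTES_PER_BLOCK = 4
CACHE_BLOCKS = 4

address = 13  # 001101 in binary

# Quotient is the main memory block number, remainder is the byte within the block.
block_number, byte_offset = divmod(address, BYTES_PER_BLOCK)

# Direct mapped: 4 sets of 1 block each.
direct_mapped_block = block_number % CACHE_BLOCKS

# 2-way set associative: 2 sets of 2 blocks each.
two_way_set = block_number % (CACHE_BLOCKS // 2)

print(block_number, byte_offset, direct_mapped_block, two_way_set)  # prints: 3 1 3 1
```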

11 Two-way Set Associative Cache
- Two direct-mapped caches operate in parallel.
- The Cache Index selects a set from the cache (a set includes 2 blocks).
- The two tags in the set are compared in parallel.
- Data is selected based on the tag result.
[Figure: two banks, each with Valid, Cache Tag, and Cache Data (Cache Block 0, ...) columns, both addressed by the Cache Index. Each bank's tag is compared with the address tag; the compare outputs drive Sel0 and Sel1 of a mux that selects the Cache Block, and are ORed to produce Hit.]
ECE232: Associative Caches 12
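The lookup described on this slide can be sketched in software as follows. Hardware compares both tags at the same time; the code below does the two comparisons one after the other but selects the data the same way. The function `lookup_two_way` and the sample bank contents are hypothetical.

```python
def lookup_two_way(banks, index, tag):
    """Look up a tag in a 2-way set. Each bank is a list of (valid, tag, data) entries."""
    # One compare per bank; a match only counts if the entry is valid.
    sel = [valid and stored_tag == tag
           for valid, stored_tag, _ in (banks[0][index], banks[1][index])]
    hit = sel[0] or sel[1]  # the OR gate
    if sel[0]:              # the mux: pick the bank whose compare matched
        data = banks[0][index][2]
    elif sel[1]:
        data = banks[1][index][2]
    else:
        data = None
    return hit, data

# Hypothetical contents: 2 sets, 2 blocks per set.
bank0 = [(True, 0b001, "A"), (False, 0b000, None)]
bank1 = [(True, 0b010, "B"), (True, 0b001, "C")]
banks = (bank0, bank1)

print(lookup_two_way(banks, index=0, tag=0b010))  # prints: (True, 'B')
print(lookup_two_way(banks, index=1, tag=0b001))  # prints: (True, 'C')
print(lookup_two_way(banks, index=1, tag=0b111))  # prints: (False, None)
```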

12 Three Types of Hazards
- Data Hazards: RAW (Read After Write)
- Control Hazards: relevant ONLY for branch and jump instructions, i.e. when the PC is affected
- Structural Hazards: due to interactions with the pipeline design
The compiler (you!) should fix hazards

13 Midterm Review: Code Scheduling to Avoid Stalls
- Reorder code to avoid use of a load result in the next instruction.
- C code for A = B + E; C = B + F;

Original order (13 cycles):
    lw  $t1, 0($t0)
    lw  $t2, 4($t0)
    (stall)
    add $t3, $t1, $t2
    sw  $t3, 12($t0)
    lw  $t4, 8($t0)
    (stall)
    add $t5, $t1, $t4
    sw  $t5, 16($t0)

Reordered (11 cycles):
    lw  $t1, 0($t0)
    lw  $t2, 4($t0)
    lw  $t4, 8($t0)
    add $t3, $t1, $t2
    sw  $t3, 12($t0)
    add $t5, $t1, $t4
    sw  $t5, 16($t0)
ECE232: Cache Performance Analysis 13
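The two cycle counts can be reproduced with a small stall counter. This sketch assumes a 5-stage pipeline with forwarding, where the only remaining stall is a one-cycle bubble when an instruction uses the result of the load immediately before it. The tuple encoding `(op, dest, sources)` and the function names are illustrative.

```python
PIPELINE_STAGES = 5  # assumption: classic 5-stage MIPS pipeline with forwarding

def count_load_use_stalls(program):
    """Count one stall for each instruction that reads the result of the lw just before it."""
    stalls = 0
    for prev, curr in zip(program, program[1:]):
        prev_op, prev_dest, _ = prev
        _, _, curr_sources = curr
        if prev_op == "lw" and prev_dest in curr_sources:
            stalls += 1
    return stalls

def total_cycles(program):
    """Cycles to drain the pipeline: instructions + fill time + stalls."""
    return len(program) + (PIPELINE_STAGES - 1) + count_load_use_stalls(program)

# Each instruction is (op, destination register, source registers).
original = [
    ("lw", "$t1", ("$t0",)),
    ("lw", "$t2", ("$t0",)),
    ("add", "$t3", ("$t1", "$t2")),  # uses $t2 right after its load
    ("sw", None, ("$t3", "$t0")),
    ("lw", "$t4", ("$t0",)),
    ("add", "$t5", ("$t1", "$t4")),  # uses $t4 right after its load
    ("sw", None, ("$t5", "$t0")),
]
reordered = [original[0], original[1], original[4],
             original[2], original[3], original[5], original[6]]

print(total_cycles(original), total_cycles(reordered))  # prints: 13 11
```

Moving the third load up puts an independent instruction between each load and its first use, so both bubbles disappear.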

14 Midterm Review: Datapath with Forwarding Hardware
[Figure: pipelined datapath with forwarding. Labeled components: PC, Instruction Memory, Register File, Sign Extend, Shift left 2, Add units, ALU and ALU control, Branch, Data Memory, Control, Forward Unit, PCSrc, and the IF/ID, ID/EX, EX/MEM, and MEM/WB pipeline registers.]
ECE232: Cache Performance Analysis 14

15 Midterm Review: Accessing data in a direct mapped cache
Three types of events:
- cache hit: the cache block is valid and contains the proper address, so read the desired word
- cache miss: nothing is in the cache in the appropriate block, so fetch from memory
- cache miss, block replacement: the wrong data is in the cache at the appropriate block, so discard it and fetch the desired data from memory
Cache Access Procedure:
(1) Use the index bits to select the cache block
(2) If the valid bit is 1, compare the tag bits of the address with the cache block tag bits
(3) If they match, use the offset to read out the word/byte
ECE232: Cache Performance Analysis 21
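The three events and the access procedure can be sketched as a small simulator. The class `DirectMappedCache`, its field names, and the 64-byte memory image are assumptions for illustration; the sizes (4 blocks of 4 bytes) follow the earlier associativity example.

```python
class DirectMappedCache:
    """Toy direct mapped cache that reports which of the three events occurred."""

    def __init__(self, num_blocks, block_size, memory):
        self.num_blocks = num_blocks
        self.block_size = block_size
        self.memory = memory
        self.lines = [{"valid": False, "tag": None, "data": None}
                      for _ in range(num_blocks)]

    def access(self, address):
        offset = address % self.block_size
        block_addr = address // self.block_size
        index = block_addr % self.num_blocks   # (1) index selects the cache block
        tag = block_addr // self.num_blocks
        line = self.lines[index]
        if line["valid"] and line["tag"] == tag:  # (2) valid bit set and tags match
            event = "hit"
        else:
            event = "miss, block replacement" if line["valid"] else "miss"
            start = block_addr * self.block_size
            line["valid"] = True
            line["tag"] = tag
            line["data"] = self.memory[start:start + self.block_size]
        return event, line["data"][offset]        # (3) offset picks the byte

memory = bytes(range(64))  # assumed memory image: byte at address n holds value n
cache = DirectMappedCache(num_blocks=4, block_size=4, memory=memory)

events = [cache.access(a) for a in (13, 14, 29, 13)]
print(events)
# prints: [('miss', 13), ('hit', 14), ('miss, block replacement', 29), ('miss, block replacement', 13)]
```

Addresses 13 and 29 both map to cache block 3 with different tags, so they keep evicting each other.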


17 Midterm Review: Selecting part of a block (block size > 1 byte)
- If block size > 1, the rightmost bits of the address (below the index) are the offset within the indexed block.
- Address fields: TAG | INDEX | OFFSET
  - Tag: to check if we have the correct block
  - Index: to select a block in the cache
  - Offset: byte offset within the block
- Example: block size of 8 bytes; select byte 4 (or the 2nd word)
[Figure: memory address split into tag, cache index, and offset]
ECE232: Cache Performance Analysis 22

18 Midterm Review: Set Associative Cache - addressing
From the main memory address: TAG | INDEX/Set # | OFFSET
- Tag: to check if we have the correct block anywhere in the set
- Index: to select a set in the cache
- Offset: byte offset
Example: main memory address 13 (001101) with 16 bytes of cache arranged in 4 blocks of 4 bytes each
- Direct mapped: tag 00, index 11, offset 01
- 2-way associative: tag 001, index 1, offset 01
- 4-way associative: tag 0011, index -, offset 01
Notice: the size of the tag grows as associativity increases
ECE232: Cache Performance Analysis 25
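The three splits of address 13 can be generated by one function that varies only the associativity. This is a sketch under the slide's parameters (6-bit addresses, 4 cache blocks of 4 bytes); `split_address` and the constants are illustrative names.

```python
ADDRESS_BITS = 6   # 64 bytes of main memory
BLOCK_SIZE = 4     # bytes per block
CACHE_BLOCKS = 4   # total blocks in the cache

def split_address(address, ways):
    """Return (tag, index, offset) as bit strings for a cache with the given associativity."""
    num_sets = CACHE_BLOCKS // ways
    offset_bits = BLOCK_SIZE.bit_length() - 1  # log2 of a power of 2
    index_bits = num_sets.bit_length() - 1
    tag_bits = ADDRESS_BITS - index_bits - offset_bits
    bits = format(address, f"0{ADDRESS_BITS}b")
    tag = bits[:tag_bits]
    index = bits[tag_bits:tag_bits + index_bits]  # empty when there is only one set
    offset = bits[tag_bits + index_bits:]
    return tag, index, offset

for ways in (1, 2, 4):
    print(ways, split_address(13, ways))
# prints:
# 1 ('00', '11', '01')
# 2 ('001', '1', '01')
# 4 ('0011', '', '01')
```

Each doubling of associativity halves the number of sets, so one bit moves from the index into the tag.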
