Improving Memory Access: The Cache and Virtual Memory

1 EECS 322 Computer Architecture: Improving Memory Access: The Cache and Virtual Memory
Instructor: Francis G. Wolff, Case Western Reserve University
This presentation uses PowerPoint animation: please view it as a slide show.

2 Cache associativity (Figure 7.5)
[Figure: where a memory block may be placed, and which tags are searched, in each organization]
  Direct-mapped cache (a.k.a. 1-way set associative cache): placed by block number
  2-way set associative cache: placed by set number
  Fully associative cache: placed anywhere

3 Cache architectures (Figure 7.6)
[Figure: the same cache organized at four associativities; every entry holds a tag and data]
  One-way set associative (direct mapped): indexed by block, one tag/data pair per block
  Two-way set associative: indexed by set, two tag/data pairs per set
  Four-way set associative: indexed by set, four tag/data pairs per set
  Eight-way set associative (fully associative): eight tag/data pairs, no index

4 Direct Mapped Cache: Example (Page 57)
Byte address fields: Tag | Index | Byte offset
  Load              Byte address   Result
  lw $1,0($0)       0              Miss: valid
  lw $2,32($0)      32             Miss: tag
  lw $3,0($0)       0              Miss: tag
  lw $4,24($0)      24             Miss: valid
  lw $5,32($0)      32             Miss: tag
5/5 misses
Final cache contents: index 00 is valid and holds Memory[32]; index 10 is valid and holds Memory[24]; indices 01 and 11 are still invalid.

5 2-Way Set Associative Cache: Example (Page 57)
Byte address fields: Tag | Index | Byte offset
  Load              Byte address   Result
  lw $1,0($0)       0              Miss: valid
  lw $2,32($0)      32             Miss: tag
  lw $3,0($0)       0              Hit!
  lw $4,24($0)      24             Miss: tag
  lw $5,32($0)      32             Miss: tag
4/5 misses
All five addresses map to set 0, so set 1 stays invalid. The last load misses because loading address 24 evicted the block for address 32, which was the least recently used block in the set.

6 Fully Associative Cache: Example (Page 57)
Byte address fields: Tag | Byte offset (there is no index field)
  Load              Byte address   Result
  lw $1,0($0)       0              Miss: valid
  lw $2,32($0)      32             Miss
  lw $3,0($0)       0              Hit!
  lw $4,24($0)      24             Miss
  lw $5,32($0)      32             Hit!
3/5 misses
Final cache contents: three of the four blocks are valid and hold Memory[0], Memory[32], and Memory[24]; the fourth is still invalid.

7 Cache comparisons: Example (page 57)
5 loads: lw $1,0($0); lw $2,32($0); lw $3,0($0); lw $4,24($0); lw $5,32($0)
  Cache architecture        Misses
  Direct mapped             5 out of 5
  2-way set associative     4 out of 5
  Fully associative         3 out of 5
The trade-off is chip area against memory access time: decreasing the number of misses speeds up the access time.
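The three miss counts above can be reproduced with a small simulator. This is a minimal sketch, not the course's own code: the function name `count_misses` and its parameters are made up for illustration, and it assumes a 4-block cache with 4-byte blocks and least-recently-used (LRU) replacement within each set.

```python
from collections import OrderedDict

def count_misses(byte_addresses, num_blocks=4, block_bytes=4, ways=1):
    """Count misses in a cache of num_blocks blocks, LRU replacement per set.

    ways=1 is direct mapped; ways=num_blocks is fully associative.
    All names here are illustrative assumptions.
    """
    num_sets = num_blocks // ways
    # One OrderedDict per set: keys are tags, ordered oldest-used first.
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for address in byte_addresses:
        block = address // block_bytes   # block address
        index = block % num_sets         # set the block maps to
        tag = block // num_sets          # remaining upper bits
        lines = sets[index]
        if tag in lines:
            lines.move_to_end(tag)       # hit: now the most recently used
        else:
            misses += 1
            if len(lines) == ways:       # set is full: evict the LRU block
                lines.popitem(last=False)
            lines[tag] = None
    return misses

loads = [0, 32, 0, 24, 32]
for name, ways in [("direct mapped", 1),
                   ("2-way set associative", 2),
                   ("fully associative", 4)]:
    print(f"{name}: {count_misses(loads, ways=ways)} out of {len(loads)}")
```

Changing `loads` or `num_blocks` shows how sensitive the comparison is to the access pattern: the ranking here comes from five loads that all collide in one set.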

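8 Bits in 64 KB Data Cache: Summary (Page 570)
  Cache type                 Index size   Tag size   Set size   Overhead   Miss rate
  Temporal direct mapped     14           16         49         53%        5.4%
  Spatial direct mapped      12           16         145        13%        1.9%
  2-way set associative      11           17         291        14%        1.5%
  4-way set associative      10           18         585        14%        1.5%
  Fully associative          0            28         157        23%
Sizes are in bits. Overhead is the total number of cache bits beyond the 64 KB of data, as a percentage of the data size.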

9 Decreasing miss penalty with multilevel caches (Page 576)
The effective CPI with only an L1 cache in front of main memory (M):
  Total CPI = Base CPI + (memory stall cycles per instruction)
            = 1.0 + (5% × 100 cycles) = 6.0
The effective CPI with L1 and L2 caches:
  Total CPI = Base CPI + (L1 to L2 penalty) + (L2 to M penalty)
            = 1.0 + (5% × 10 cycles) + (2% × 100 cycles) = 3.5
CPU speedup = 6.0 / 3.5 = 1.7
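The CPI arithmetic can be checked with a few lines of code. This is a sketch under assumptions: it reads the slide's figures as a base CPI of 1.0, a 5% miss rate with a 100-cycle penalty to main memory, and, with an L2 cache, a 10-cycle L2 penalty plus 2% of accesses still paying 100 cycles. The helper name `effective_cpi` is hypothetical.

```python
def effective_cpi(base_cpi, stalls):
    """Base CPI plus the stall cycles per instruction.

    stalls is a list of (miss_rate, penalty_in_cycles) pairs;
    the name and signature are illustrative assumptions.
    """
    return base_cpi + sum(rate * penalty for rate, penalty in stalls)

cpi_l1_only = effective_cpi(1.0, [(0.05, 100)])
cpi_l1_l2 = effective_cpi(1.0, [(0.05, 10), (0.02, 100)])
speedup = cpi_l1_only / cpi_l1_l2

print(f"L1 only:   CPI = {cpi_l1_only:.1f}")
print(f"L1 and L2: CPI = {cpi_l1_l2:.1f}")
print(f"speedup = {speedup:.2f}")
```

Most of the gain comes from the L2 cache turning a 100-cycle penalty into a 10-cycle one for the majority of L1 misses.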

10 Direct Mapping: Cache addresses
  Cache index = (Block address) % (Cache size in blocks)
              = floor(Data byte address / Bytes per cache block) % (Cache size in blocks)
  Block address = floor(Data byte address / Bytes per cache block)
  Cache size in blocks = (Cache size in bytes) / (Bytes per cache block)

11 N-Way Set-Associative Mapping: Cache addresses
An N-way set associative cache consists of a number of sets, where each set consists of N blocks.
  Cache index = Set # = (Block address) % (Cache size in sets)
              = floor(Data byte address / Bytes per cache block) % (Cache size in sets)
  Block address = floor(Data byte address / Bytes per cache block)
  Cache size in N-way sets = (Cache size in bytes) / (N × Data bytes per cache block)

12 Direct Mapped Cache
Suppose we have the following byte addresses: 0, 32, 0, 24, 32
Given a direct mapped cache consisting of 16 data bytes
  where data bytes per cache block = 4
  where cache size = 16 bytes / (4 bytes per block) = 4 blocks
The cache block addresses are: 0, 8, 0, 6, 8
The cache index addresses are: 0, 0, 0, 2, 0
For example, for byte address 24:
  Block address = floor(byte address / data bytes per block) = floor(24/4) = 6
  Cache index = 6 % cache size = 6 % 4 = 2

13 2-Way Set associative cache
Again, suppose we have the following byte addresses: 0, 32, 0, 24, 32
Given a 2-way set associative cache of 16 data bytes
  where data bytes per cache block = 4
  where cache size = 16 bytes / (2 × 4 bytes per block) = 2 sets
The 2-way cache block addresses are: 0, 8, 0, 6, 8
The 2-way cache index addresses are: 0, 0, 0, 0, 0
Note: the direct mapped cache index addresses were: 0, 0, 0, 2, 0
For example, for byte address 24:
  Block address = floor(byte address / data bytes per block) = floor(24/4) = 6
  Cache index = 6 % cache size = 6 % 2 = 0
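The block address and index calculations for both organizations can be sketched in code. The helper names `block_address` and `set_index` are made up for illustration; the sketch assumes the 16-byte cache with 4-byte blocks used in these examples.

```python
def block_address(byte_address, block_bytes):
    """floor(byte address / data bytes per block)."""
    return byte_address // block_bytes

def set_index(byte_address, cache_bytes, block_bytes, ways=1):
    """Block address modulo the number of sets.

    ways=1 gives the direct mapped cache index.
    """
    num_sets = cache_bytes // (ways * block_bytes)
    return block_address(byte_address, block_bytes) % num_sets

addresses = [0, 32, 0, 24, 32]
print("block addresses:", [block_address(a, 4) for a in addresses])
print("direct mapped:  ", [set_index(a, 16, 4, ways=1) for a in addresses])
print("2-way:          ", [set_index(a, 16, 4, ways=2) for a in addresses])
```

With only two sets, every address in this trace lands in set 0, which is why the 2-way cache still misses on four of the five loads.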

14 Fully associative cache
Again, suppose we have the following byte addresses: 0, 32, 0, 24, 32
Given a fully associative cache of 16 data bytes
  where data bytes per cache block = 4
  where cache size = 16 bytes / (4 bytes per block) = 4 blocks
The fully associative cache index addresses are: none!
A block may be placed in any of the 4 blocks, so there is no index field and the whole block address (0, 8, 0, 6, 8) serves as the tag.

15 Bits in a Temporal Direct Mapped Cache (Page 550)
How many total bits are required for a cache with 64 KB (= 2^16 bytes) of data and one-word (= 4 bytes = 32 bits) blocks, assuming a 32-bit byte memory address?
  Cache index width = log2(number of words) = log2(2^16 / 4) = log2(2^14 words) = 14
  Block address width = <byte address width> - log2(bytes per word) = 32 - 2 = 30
  Tag size = <block address width> - <cache index width> = 30 - 14 = 16
  Cache block size = <valid size> + <tag size> + <block data size> = 1 + 16 + 32 bits = 49 bits
  Total size = <number of cache blocks> × <cache block size> = 2^14 × 49 bits = 784 Kbits = 98 KB
  Overhead = 98 KB / 64 KB = 1.5 times

16 Bits in a Spatial Direct Mapped Cache (Page 570)
How many total bits are required for a cache with 64 KB (= 2^16 bytes) of data and four-word (= 4×4 = 16 bytes = 4×32 = 128 bits) blocks, assuming a 32-bit byte memory address?
  Cache index width = log2(number of blocks) = log2(2^16 / 16) = log2(2^12 blocks) = 12
  Block address width = <byte address width> - log2(block size in bytes) = 32 - 4 = 28
  Tag size = <block address width> - <cache index width> = 28 - 12 = 16
  Cache block size = <valid size> + <tag size> + <block data size> = 1 + 16 + 128 bits = 145 bits
  Total size = <number of cache blocks> × <cache block size> = 2^12 × 145 bits = 593,920 bits / (1024×8) = 72.5 KB
  Overhead = 72.5 KB / 64 KB = 1.13 times

17 Bits in a 2-Way Set Associative Cache (Page 570)
How many total bits are required for a cache with 64 KB (= 2^16 bytes) of data and four-word (= 4×4 = 16 bytes = 4×32 = 128 bits) blocks, assuming a 32-bit byte memory address?
  Cache index width = log2(number of sets) = log2(2^16 / (2 × 16)) = log2(2^11 sets) = 11
  Block address width = <byte address width> - log2(block size in bytes) = 32 - 4 = 28 (same!)
  Tag size = <block address width> - <cache index width> = 28 - 11 = 17
  Cache set size = <valid size> + 2 × (<tag size> + <block data size>) = 1 bit + 2 × (17 + 128) bits = 291 bits
  Total size = <number of cache sets> × <cache set size> = 2^11 × 291 bits = 595,968 bits / (1024×8) = 72.75 KB
  Overhead = 72.75 KB / 64 KB = 1.14 times

18 Bits in a 4-Way Set Associative Cache (Page 570)
How many total bits are required for a cache with 64 KB (= 2^16 bytes) of data and four-word (= 4×4 = 16 bytes = 4×32 = 128 bits) blocks, assuming a 32-bit byte memory address?
  Cache index width = log2(number of sets) = log2(2^16 / (4 × 16)) = log2(2^10 sets) = 10
  Block address width = <byte address width> - log2(block size in bytes) = 32 - 4 = 28 (same!)
  Tag size = <block address width> - <cache index width> = 28 - 10 = 18
  Cache set size = <valid size> + 4 × (<tag size> + <block data size>) = 1 bit + 4 × (18 + 128) bits = 585 bits
  Total size = <number of cache sets> × <cache set size> = 2^10 × 585 bits = 599,040 bits / (1024×8) = 73.125 KB
  Overhead = 73.125 KB / 64 KB = 1.14 times

19 Bits in a Fully Associative Cache (Page 570)
How many total bits are required for a cache with 64 KB (= 2^16 bytes) of data and four-word (= 4×4 = 16 bytes = 4×32 = 128 bits) blocks, assuming a 32-bit byte memory address?
  Cache index width = 0
  Block address width = <byte address width> - log2(block size in bytes) = 32 - 4 = 28 (same!)
  Tag size = <block address width> - <cache index width> = 28 - 0 = 28
  Cache block size = <valid size> + <tag size> + <block data size> = 1 + 28 + 128 bits = 157 bits
  Total size = <number of cache blocks> × <cache block size> = 2^12 × 157 bits = 643,072 bits / (1024×8) = 78.5 KB
  Overhead = 78.5 KB / 64 KB = 1.23 times
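The bit-count calculations on the last five slides follow one pattern, which can be sketched as a single function. The name `cache_bits` and its parameters are assumptions for illustration. By default it charges one valid bit per block; passing `valid_per_block=False` charges one valid bit per set, which is the counting that yields 585 bits for the 4-way set.

```python
def cache_bits(data_bytes, block_bytes, ways, address_bits=32,
               valid_per_block=True):
    """Return (index bits, tag bits, bits per set, total bits).

    Sizes must be powers of two. Illustrative sketch only.
    """
    num_blocks = data_bytes // block_bytes
    num_sets = num_blocks // ways
    index_bits = num_sets.bit_length() - 1      # log2 of a power of two
    offset_bits = block_bytes.bit_length() - 1
    tag_bits = address_bits - offset_bits - index_bits
    valid_bits = ways if valid_per_block else 1
    set_bits = valid_bits + ways * (tag_bits + 8 * block_bytes)
    return index_bits, tag_bits, set_bits, num_sets * set_bits

DATA = 64 * 1024  # 64 KB of data
configs = [
    ("temporal direct mapped", 4, 1),
    ("spatial direct mapped", 16, 1),
    ("2-way set associative", 16, 2),
    ("4-way set associative", 16, 4),
    ("fully associative", 16, DATA // 16),
]
for name, block_bytes, ways in configs:
    index, tag, _, total = cache_bits(DATA, block_bytes, ways)
    overhead = total / (8 * DATA)
    print(f"{name}: index={index} tag={tag} "
          f"total={total / 8192:.2f} KB overhead={overhead:.2f}x")
```

The two valid-bit conventions differ by only a few bits per set, so the overhead ratios barely move; the tag width and the block size dominate.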

20 Virtual Memory (Figure 7.20)
Main memory can act as a cache for the secondary storage (disk).
Advantages:
  illusion of having more physical memory
  program relocation
  protection
[Figure: address translation maps virtual addresses to physical addresses or to disk addresses]

21 Pages: virtual memory blocks (Figure 7.21)
[Figure: a virtual address (virtual page number, page offset) is translated into a physical address (physical page number, page offset); only the page number is translated, and the page offset passes through unchanged]
Like the automatic transmission of a car: the hardware/software does the shifting.

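22 Page Tables (Figure 7.23)
[Figure: the virtual page number indexes the page table; each entry holds a valid bit and a physical page or disk address, pointing into physical memory when valid and into disk storage otherwise]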

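23 Virtual Example: 32 Byte pages
  Load             Cache     Page table
  lw $1,32($0)     C-Miss    P-Fault
  lw $2,40($0)     C-Miss    P-Hit!
Byte addresses 32 and 40 lie in the same 32-byte virtual page (virtual page number = floor(address / 32) = 1), so the page brought into memory by the first load's page fault is already present for the second load.
[Figure: cache (Index, Valid, Tag, Data), virtual address (Virtual page number, Page offset), page table (Page index, Valid, Physical page number), memory, and disk]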

24 Page Table Size
Suppose we have
  a 32-bit virtual address (= 2^32 bytes)
  4096 bytes per page (= 2^12 bytes)
  4 bytes per page table entry (= 2^2 bytes)
What is the total page table size?
  Number of page table entries = 2^32 / 2^12 = 2^20 entries
  Size of page table = 2^20 entries × 2^2 bytes = 2^22 bytes = 4 MB
4 megabytes just for page tables!! Too big.
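The same calculation in code, as a minimal sketch. The function name `page_table_size` is hypothetical, and it assumes a single flat page table with one entry per virtual page, as in the example above.

```python
def page_table_size(virtual_address_bits, page_bytes, entry_bytes):
    """Return (number of entries, table size in bytes) for a flat page table.

    page_bytes must be a power of two. Illustrative sketch only.
    """
    offset_bits = page_bytes.bit_length() - 1   # log2(page size)
    entries = 2 ** (virtual_address_bits - offset_bits)
    return entries, entries * entry_bytes

entries, size = page_table_size(32, 4096, 4)
print(f"{entries} entries, {size // 2**20} MB")
```

Because the entry count depends only on the virtual address width and the page size, a larger page shrinks the table: with 16 KB pages the same 32-bit address space needs a quarter as many entries.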

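25 Page Tables (Figure 7.22)
[Figure: the page table register points to the page table. The virtual address is split into a 20-bit virtual page number and a 12-bit page offset. The virtual page number selects a page table entry holding a valid bit and a physical page number. The physical address is the physical page number followed by the unchanged page offset.]
If the valid bit is 0, then the page is not present in memory.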
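The lookup described by this figure can be sketched as follows. This is an illustration under assumptions, not a model of any real memory-management unit: the page table is a plain dictionary, its contents are invented, and a missing page is reported by raising `LookupError`.

```python
PAGE_OFFSET_BITS = 12  # assumes 4096-byte pages (20-bit page number, 12-bit offset)

def translate(virtual_address, page_table):
    """Translate a virtual address to a physical address.

    page_table maps virtual page number -> (valid, physical page number).
    Raises LookupError to stand in for a page fault.
    """
    vpn = virtual_address >> PAGE_OFFSET_BITS
    offset = virtual_address & ((1 << PAGE_OFFSET_BITS) - 1)
    valid, physical_page = page_table.get(vpn, (False, None))
    if not valid:
        raise LookupError(f"page fault: virtual page {vpn} is not in memory")
    # The page offset is copied through unchanged.
    return (physical_page << PAGE_OFFSET_BITS) | offset

# Hypothetical page table: page 0 is in physical page 5, page 1 is on disk.
page_table = {0: (True, 5), 1: (False, None)}
print(hex(translate(0x0ABC, page_table)))
```

Only the upper 20 bits change during translation, which is why the page offset never needs to be stored in the page table.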

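26 Making Address Translation Fast
A cache for address translations: the translation lookaside buffer (TLB)
[Figure: the virtual page number is looked up in the TLB, whose entries hold a valid bit, a tag, and a physical page address pointing into physical memory. Behind the TLB, the full page table holds a valid bit and a physical page or disk address for every virtual page, pointing into physical memory or disk storage.]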

27 TLBs and caches
Virtual address -> TLB access
  TLB hit?
    No: TLB miss exception
    Yes: physical address is available
  Write?
    No: try to read data from the cache
      Cache hit?
        No: cache miss stall
        Yes: deliver data to the CPU
    Yes: write access bit on?
      No: write protection exception
      Yes: write data into the cache, update the tag, and put the data and the address into the write buffer
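The decision flow on this slide can be written as one function that returns the outcome of an access. It is a sketch only: the function name `memory_access`, its boolean parameters, and the returned strings are illustrative, and the TLB, cache, and write buffer themselves are not modeled.

```python
def memory_access(tlb_hit, is_write, write_access_bit=True, cache_hit=True):
    """Return the outcome of one memory access, following the flowchart."""
    if not tlb_hit:
        return "TLB miss exception"
    # A TLB hit yields the physical address; now branch on read vs. write.
    if is_write:
        if not write_access_bit:
            return "write protection exception"
        return "write data into cache and write buffer"
    if not cache_hit:
        return "cache miss stall"
    return "deliver data to CPU"

print(memory_access(tlb_hit=True, is_write=False, cache_hit=True))
print(memory_access(tlb_hit=True, is_write=True, write_access_bit=False))
print(memory_access(tlb_hit=False, is_write=False))
```

Note that in this flow the cache is consulted only after the TLB has produced a physical address, so a TLB miss is detected before any cache lookup happens.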
