Binghamton University. CS-220 Spring. Cached Memory. Computer Systems Chapter 6.2-6.5


Cached Memory Computer Systems Chapter 6.2-6.5

The Memory Hierarchy [Diagram: the levels of the hierarchy trade off cost and speed against capacity]

The Cache Concept [Diagram: the CPU (registers, ALU) exchanges addresses, data, and instructions with memory]

The Cache Concept [Diagram: a cache (fast, expensive, small) now sits between the CPU and memory, intercepting the addresses, data, and instructions]

Cache Pros and Cons Pros: if the CPU requests an address that is in the cache (a cache hit), it gets it MUCH faster! Cons: if the CPU requests an address that is NOT in the cache (a cache miss), it gets it slower.

What happens on a Cache Miss?
1. Recognize that the data is not in the cache
2. Make room in the cache: copy a modified data block from the cache back to memory
3. Copy the requested block of memory into the cache
4. Return the data from the cache
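The four steps above can be sketched in software. The Line class, handle_miss function, and byte-array memory below are hypothetical illustrations of the procedure, not Project 1 code or a real hardware interface:

```python
# Sketch of the four cache-miss steps. All names here (Line, handle_miss,
# memory) are illustrative stand-ins, assuming 64-byte blocks.

BLOCK_SIZE = 64  # bytes per cache block

class Line:
    def __init__(self):
        self.valid = False
        self.dirty = False
        self.tag = None
        self.data = bytearray(BLOCK_SIZE)

def handle_miss(lines, memory, addr):
    tag = addr // BLOCK_SIZE              # block number of the requested address
    base = tag * BLOCK_SIZE
    # 1. We only get here once the lookup has recognized the data is absent.
    # 2. Make room: pick a line to reuse; write it back first if it holds
    #    modified (dirty) data.
    victim = next((l for l in lines if not l.valid), lines[0])
    if victim.valid and victim.dirty:
        old_base = victim.tag * BLOCK_SIZE
        memory[old_base:old_base + BLOCK_SIZE] = victim.data
    # 3. Copy the requested block of memory into the cache.
    victim.data[:] = memory[base:base + BLOCK_SIZE]
    victim.tag, victim.valid, victim.dirty = tag, True, False
    # 4. Return the requested data from the cache.
    return victim.data[addr % BLOCK_SIZE]

memory = bytearray(128)
memory[70] = 99
lines = [Line() for _ in range(2)]
print(handle_miss(lines, memory, 70))   # 99
```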

Example Memory Trace
Fetch Instruction at 0x0000555555554a92
Fetch Integer at 0x00007fffffffffffab48
Fetch Instruction at 0x0000555555554a94
Store Integer at 0x00007fffffffffffab48
Fetch Instruction at 0x0000555555554a96
Fetch Integer at 0x00007fffffffffffab8c
Fetch Instruction at 0x0000555555554905
As a trace file:
I 0000555555554a92
R 00007fffffffffffab48
I 0000555555554a94
W 00007fffffffffffab48
I 0000555555554a96
R 00007fffffffffffab8c
I 0000555555554905

Locality. Local variables are next to each other in the stack frame. Programs tend to read a matrix sequentially, in row-major order. Instructions are read sequentially, and if there is a loop, a small cluster of instructions is re-read multiple times. Chances are, for most memory accesses, the block is already in the cache! It is not uncommon to see 98%+ cache hit rates!
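One way to see why sequential access works so well: count how many distinct blocks a scan actually touches. The 64-byte block size and the access pattern below are illustrative assumptions:

```python
# A rough look at spatial locality: a sequential scan makes many accesses
# but touches few distinct 64-byte blocks. Addresses are synthetic byte
# offsets, not real pointers.

BLOCK = 64

def distinct_blocks(addresses):
    return len({a // BLOCK for a in addresses})

# Sequential scan of a 1024-byte array, 8 bytes (one long) at a time:
seq = list(range(0, 1024, 8))
print(len(seq), distinct_blocks(seq))   # 128 16

# Starting from a cold cache, only the first access to each block can miss,
# so the best-case hit rate for this scan is 1 - 16/128 = 87.5%.
```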

Locality has two dimensions [Plot: probability of cache hit (0-100%) as a function of temporal distance and spatial distance between references]

Fully Associative Cache
Define cache block size = 2^b, the amount of data transferred between cache and main memory, most often 64 = 2^6 bytes
Divide the cache up into 2^w ways
Each way holds one block (e.g. 64 bytes)
Each way has an ID tag: where in memory does this block come from
Each way has flags: a valid flag and a dirty flag
For instance, an 8K fully associative cache contains 128 = 2^7 ways

Fully Associative Cache [Table: 128 ways (0-127), each with an ID tag, valid (V) and dirty (D) flags, and a 64-byte data line. After the trace: way 0 holds tag 0x0000555555554a80 (V=1, D=0) for the instruction at 0x0000555555554a92; way 1 holds 0x00007fffffffffffab40 (V=1, D=1) for the integer at 0x00007fffffffffffab48; way 2 holds 0x00007fffffffffffab80 (V=1, D=1); way 3 holds 0x0000555555554900 (V=1, D=0); ways 125-127 are still invalid.]

Address Sub-Fields (Fully Associative)
Divide each memory address into two sub-fields:
Tag: the high-order bits (b63 .. b6), identifying where in memory this block comes from
Block Offset: the b low-order bits (b5 .. b0) that give the offset within a block
Example: 0x0000555555554a92 ends in the binary byte 1001 0010; the tag is everything above bit b6, and the block offset is the low six bits, 010010.

Random Access Memory (RAM): present an address, get back the data stored there.
ADDR  DATA
00    000
01    001
10    010
11    111

Content Addressable Memory (CAM): present data, get back the address where that data is stored.
ADDR  DATA
00    101
01    110
10    001
11    111

CAM in a Fully Associative Cache
Keep the tags in a CAM whose data width = tag size
An address in the CAM is a way index in the cache
When main memory is written into the cache, also write its tag to the CAM
When accessing memory, present the requested tag to the CAM
The CAM may find that the tag is not there: cache miss
The CAM may find the tag and return the way index of the way that contains that data: probable cache hit!
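In software, a dict playing the role of the CAM (tag in, way index out) captures the lookup behavior described above. cam_lookup and cam_insert are illustrative names; a real CAM compares all entries in parallel rather than hashing:

```python
# A CAM maps content (a tag) back to an index. A hash map from tag to way
# index plays the same role in this sketch; this is an analogy for the
# behavior, not how the hardware is built.

cam = {}            # tag -> way index

def cam_lookup(tag):
    """Return the way index holding this tag, or None on a cache miss."""
    return cam.get(tag)

def cam_insert(tag, way):
    cam[tag] = way

# Block containing 0x...4a92 was loaded into way 0:
cam_insert(0x0000555555554A80 >> 6, 0)
print(cam_lookup(0x0000555555554A92 >> 6))   # 0    -> probable cache hit
print(cam_lookup(0x00005555555549AA >> 6))   # None -> cache miss
```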

Example Fully Associative Cache
[Diagram: the CAM holds one tag per way; a matching tag returns the way index, which selects a line in the fast RAM; the block offset then selects bytes within that 64-byte line.]

Walking the trace through an initially empty cache:
I 0000555555554a92: tag 0000555555554a8 not in CAM, MISS; block loaded into way 0 (V=1, D=0)
R 00007fffffffffffab48: tag 00007fffffffffffab4 not in CAM, MISS; block loaded into way 1 (V=1, D=0)
I 0000555555554a94: tag 0000555555554a8 found, HIT in way 0
W 00007fffffffffffab48: tag 00007fffffffffffab4 found, HIT in way 1; the write sets its dirty flag (D=1)
I 0000555555554a96: tag 0000555555554a8 found, HIT in way 0
R 00007fffffffffffab8c: tag 00007fffffffffffab8 not in CAM, MISS; block loaded into way 2 (V=1, D=0)
I 0000555555554905: tag 000055555555490 not in CAM, MISS; block loaded into way 3 (V=1, D=0)

Cache Miss
1. Select a victim way index (the way to replace):
   If there is an unused way (valid flag off), use it
   Otherwise, pick a way that is unlikely to be used in the near future (Project 1): Random, Least Recently Used, or Least Frequently Used
2. If the victim's dirty flag is on, copy its block from the cache back to memory
3. Write the new tag to the CAM at the chosen way index
4. Copy the block starting at the tag address (with the block offset zeroed) from memory into the chosen way in the cache RAM
5. Turn the valid flag for this way on, and the dirty flag off
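The victim-selection step might be sketched as follows, using the LRU policy from the list above (Random or LFU would just change the selection rule). choose_victim and the per-way fields are hypothetical names:

```python
# Victim selection for the miss procedure above: prefer an unused way,
# otherwise evict the least recently used one. The 'last_used' timestamp
# is the extra bookkeeping data LRU needs.

def choose_victim(ways):
    """ways: list of dicts with 'valid', 'dirty', 'last_used' keys."""
    # If there is an unused way (valid flag off), use it.
    for i, w in enumerate(ways):
        if not w['valid']:
            return i
    # Otherwise evict the least recently used way.
    return min(range(len(ways)), key=lambda i: ways[i]['last_used'])

ways = [
    {'valid': True,  'dirty': False, 'last_used': 7},
    {'valid': True,  'dirty': True,  'last_used': 3},
    {'valid': False, 'dirty': False, 'last_used': 0},
]
print(choose_victim(ways))   # 2 (an invalid way is free to use)
ways[2].update(valid=True, last_used=9)
print(choose_victim(ways))   # 1 (least recently used; dirty, so write it back first)
```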

Fully Associative Cache Pros and Cons
Advantages:
- The cache always contains the most important memory
- Therefore, the cache hit rate is high
- Therefore, speed is fast
Disadvantages:
- Lots of very expensive CAM memory
- Choosing a victim can be slow or incorrect
- Extra data is needed to choose the victim

Direct Map Cache
Define cache block size = 2^b, the amount of data transferred between cache and main memory, most often 64 = 2^6 bytes
Divide the cache up into 2^s sets
Each set contains one way (e.g. one 64-byte block of memory)
Each set is identified by an ID tag: where this block comes from
Each set contains flags: a valid flag and a dirty flag
For instance, an 8K direct map cache contains 128 = 2^7 sets

Address Sub-Fields (Direct Map)
Divide each memory address into three sub-fields:
Tag: high-order bits, where in memory this block comes from
Set Index: s intermediate bits, which set this block is associated with
Block Offset: b low-order bits that give the offset within a block
With b = 6 and s = 7, the block offset is bits b5 .. b0 and the set index is bits b12 .. b6.
Example: 0x0000555555554a92 → Tag 0000555555554, Set Index 0101010 (0x2A), Block Offset 010010
Example: 0x00007fffffffffffab48 → Tag 00007fffffffffff8, Set Index 0101101, Block Offset 001000

Direct Map Cache [Table: 128 sets (0x00 .. 0x7F), each with an ID tag, valid (V) and dirty (D) flags, and a 64-byte data line. After the trace: the instruction block for 0x0000555555554a92 sits in set 0x2A with tag 0000555555554 (V=1, D=0); the integer block for 0x00007fffffffffffab48 sits in its own set with tag 00007fffffffffff8 (V=1, D=1, because of the store); all other sets are invalid.]

On a memory access request (Direct Map)
1. Use the set index to choose the correct set
2. If the valid flag is off: cache miss
3. Compare the tags; if they are not equal, evict the block: cache miss
   If the dirty flag is on, write the line back to main memory at the address formed from the old tag and set
   Read the main memory line for the new tag/set into this set
4. Use the block offset to access the data in the cache
   If this is a memory write, turn on the dirty flag for this set
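The steps above can be exercised with a small simulator. DirectMappedCache is an illustrative sketch using the slides' geometry (64-byte blocks, 128 sets), not a definitive implementation:

```python
# Direct-mapped lookup: set index selects the one candidate line, tag
# comparison decides hit or miss. Geometry: 8 KiB = 128 sets of 64 bytes.

B, S = 6, 7                      # offset bits, set-index bits

class DirectMappedCache:
    def __init__(self):
        self.sets = [{'valid': False, 'dirty': False, 'tag': None}
                     for _ in range(1 << S)]
        self.hits = self.misses = 0

    def access(self, addr, write=False):
        index = (addr >> B) & ((1 << S) - 1)   # step 1: choose the set
        tag = addr >> (B + S)
        line = self.sets[index]
        if line['valid'] and line['tag'] == tag:
            self.hits += 1                     # steps 2-3: valid and tags equal
        else:
            self.misses += 1                   # evict + refill (write back if dirty)
            line.update(valid=True, dirty=False, tag=tag)
        if write:
            line['dirty'] = True               # step 4: mark modified data

cache = DirectMappedCache()
for a in (0x4A92, 0x4A94, 0x4A96):   # three accesses to the same 64-byte block
    cache.access(a)
print(cache.misses, cache.hits)       # 1 2
```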

Direct Map Hit Rates
Sequential access: a cache miss only after 8,192 = 0x2000 = 2^13 bytes; 99.988% hit rate, pretty good!
Problems occur when two access patterns overlap. For instance, if binary code is stored at 0x000000000ca4000 and data is stored at 0x7FFFFFFFFFF64000, both map to the same set. Assuming alternating code/data memory accesses, there is a cache miss EVERY time: a 0.000 hit rate, pretty bad!
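The pathological example can be checked arithmetically. set_index below assumes the 8K direct-mapped geometry from the previous slides (b = 6, s = 7):

```python
# Checking the slide's pathological case: both addresses map to the same
# set of an 8 KiB direct-mapped cache (64-byte blocks, 128 sets), so
# alternating code/data accesses evict each other every time.

B, S = 6, 7

def set_index(addr):
    return (addr >> B) & ((1 << S) - 1)

code = 0x000000000CA4000
data = 0x7FFFFFFFFFF64000
print(set_index(code), set_index(data))    # 0 0 -> same set
print(code >> (B + S) == data >> (B + S))  # False -> different tags, so they conflict
```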

Direct Map Cache Pros and Cons
Advantages:
- No expensive CAM memory
- Choosing a victim has no cost / is fast
- No extra data needed for the victim choice
- Speeds up cache misses
Disadvantages:
- Hit rate is not always high
- Therefore, can be slow

Set Associative Cache (Hybrid)
Define cache block size = 2^b, the amount of data transferred between cache and main memory, most often 64 = 2^6 bytes
Divide the cache up into 2^s sets, and divide each set up into 2^w ways
Each way holds one block (e.g. 64 bytes)
Each way has an ID tag: where in memory does this block come from
Each way has flags: a valid flag and a dirty flag
For instance, an 8K 4-way set associative cache contains 32 = 2^5 sets of 4 = 2^2 ways each

Address Sub-Fields (Set Associative)
Divide each memory address into three sub-fields:
Tag: high-order bits, where in memory this block comes from
Set Index: s intermediate bits, which set this block is associated with
Block Offset: b low-order bits that give the offset within a block
With b = 6 and s = 5, the block offset is bits b5 .. b0 and the set index is bits b10 .. b6.
Example: 0x0000555555554a92 → Tag 00005555555554, Set Index 01010 (0x0A), Block Offset 010010
Example: 0x00007fffffffffffab48 → Tag 00007fffffffffffa, Set Index 01101 (0x0D), Block Offset 001000

Example Set Associative Cache
[Diagram: the set index (here 0x0A) selects one set; a small CAM over that set's four tags finds the matching way, and the block offset selects bytes within the 64-byte line. Set 0x0A holds tag 0000555555554 in one way (V=1, D=0); its other three ways, and every way of the other sets, are invalid (tag 0, V=0).]

On a memory access request (Set Associative)
1. Use the set index to choose the correct set
2. In that set, use the tag with the CAM to find the correct way
3. If the valid flag is off or the tag is not found: cache miss
   Choose a way to evict from this set
   If its dirty flag is on, write the line back to main memory at the address formed from the old tag and set
   Read the main memory line for the new tag/set into this set/way
4. Use the block offset to access the data in the cache
   If this is a memory write, turn on the dirty flag for this way
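A sketch of the same steps for the 4-way geometry above (32 sets of 4 ways). One dict per set stands in for that set's small CAM, and SetAssociativeCache is an illustrative name, not a definitive implementation:

```python
# Set-associative lookup: the set index narrows the search to one set,
# then that set's CAM (here a dict) finds the way. LRU picks the victim.

B, S, W = 6, 5, 4                # offset bits, set-index bits, ways per set

class SetAssociativeCache:
    def __init__(self):
        self.cams = [dict() for _ in range(1 << S)]      # tag -> way, per set
        self.lru = [list(range(W)) for _ in range(1 << S)]  # LRU order, per set
        self.hits = self.misses = 0

    def access(self, addr):
        index = (addr >> B) & ((1 << S) - 1)   # step 1: choose the set
        tag = addr >> (B + S)
        cam, order = self.cams[index], self.lru[index]
        way = cam.get(tag)                     # step 2: CAM search in that set
        if way is None:                        # step 3: miss -> evict LRU way
            self.misses += 1
            way = order[0]
            old = next((t for t, w in cam.items() if w == way), None)
            cam.pop(old, None)                 # drop the evicted block's tag
            cam[tag] = way
        else:
            self.hits += 1
        order.remove(way)                      # mark this way most recently used
        order.append(way)

c = SetAssociativeCache()
for a in (0x4A92, 0xAB48, 0x4A94):   # miss, miss, then hit in the first block
    c.access(a)
print(c.misses, c.hits)   # 2 1
```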

Set Associative Cache Pros and Cons
Advantages:
- The cache almost always contains the most important memory
- Not much expensive CAM memory (several small CAMs are cheaper than one big one)
- Choosing a victim is not too expensive (a 1-of-4 choice)
- Speeds up cache misses
Disadvantages:
- Everything is a compromise

Multiple Cache Levels
[Diagram: CPU (registers, ALU) → L1 Cache (64K): on-board, fast, expensive, small → L2 Cache (2048K): fast, expensive, larger → L3 Cache (8192K): slower, cheaper, larger → Real Memory (16G)]
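Why stacking levels pays off can be seen from the average memory access time (AMAT). The hit rates and latencies below are made-up round numbers for illustration; the slides give sizes but no timings:

```python
# Average memory access time for a multi-level hierarchy: each level's cost
# is its latency plus the miss-rate-weighted cost of the level below it.
# All hit rates and latencies here are assumed example values.

def amat(levels, memory_ns):
    """levels: list of (hit_rate, latency_ns) pairs, innermost (L1) first."""
    t = memory_ns
    for hit_rate, latency in reversed(levels):
        t = latency + (1 - hit_rate) * t
    return t

# Assumed: L1 95% @ 1 ns, L2 90% @ 4 ns, L3 80% @ 12 ns, memory @ 100 ns
print(round(amat([(0.95, 1), (0.90, 4), (0.80, 12)], 100), 2))   # 1.36
```

With those assumed numbers, the average access costs 1.36 ns instead of the 100 ns a cacheless machine would pay, which is why even modest hit rates at each level are worthwhile.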

Virtual Memory
[Diagram: the same hierarchy (CPU registers and ALU, L1 64K, L2 2048K, L3 8192K, Real Memory 16G), extended by a Page Space on disk; real memory acts as a cache for the page space, with data moved in 4K pages]