Caches & Memory. CS 3410, Spring 2014, Computer Science, Cornell University. See P&H Chapter 5.1-5.4, 5.8, 5.15



Code Stored in Memory (also, data and stack).
[Pipelined datapath diagram: PC, +4 / new pc, instruction memory, control, extend / imm, register file (A, B), compute jump/branch targets, ALU, data memory, with pipeline registers IF/ID, ID/EX, EX/MEM, MEM/WB. Stack, Data, Code: stored in Memory.]

Main memory is very, very slow. Remember: SRAM vs. DRAM. SRAM: 6-8 transistors per bit, no refresh, fast. DRAM: 1 transistor per bit, denser, cheaper per bit, needs refresh.

Main memory is very, very slow. CPU clock rates: ~0.33 ns - 2 ns (3 GHz - 0.5 GHz).
Memory technology / Access time in nanosecs (ns) / Access time in cycles:
SRAM (on chip): 0.5-2.5 ns, 1-3 cycles
SRAM (off chip): 1.5-30 ns, 5-15 cycles
DRAM: 50-70 ns, 150-200 cycles
SSD (Flash): 5k-50k ns, tens of thousands
Disk: 5M-20M ns, millions

Main memory is very, very slow. CPU clock rates: ~0.33 ns - 2 ns (3 GHz - 0.5 GHz).
Memory technology / Access time in ns / Access time in cycles / $ per GiB (in 2012) / Capacity:
SRAM (on chip): 0.5-2.5 ns, 1-3 cycles, 256 KB
SRAM (off chip): 1.5-30 ns, 5-15 cycles, $4k, 32 MB
DRAM: 50-70 ns, 150-200 cycles, $10-$20, 8 GB
SSD (Flash): 5k-50k ns, tens of thousands, $0.75-$1, 512 GB
Disk: 5M-20M ns, millions, $0.05-$0.10, 4 TB

Memory Pyramid (these are rough numbers, mileage may vary for latest/greatest; caches usually made of SRAM):
RegFile (100s of bytes): < 1 cycle access
L1 Cache (several KB): 1-3 cycle access
L2 Cache (½-32 MB): 5-15 cycle access (L3 becoming more common)
Memory (128 MB - few GB): 150-200 cycle access
Disk (many GB - few TB): 1,000,000+ cycle access

Can we create an illusion of cheap, large, and fast memory?
Memory Pyramid: RegFile (100s of bytes), L1 Cache (several KB), L2 Cache (½-32 MB), Memory (128 MB - few GB), Disk (many GB - few TB).
Yes, using caches and assuming temporal and spatial locality.

Caches vs. memory vs. tertiary storage: tradeoffs.
Cache organization: Direct Mapped, Fully Associative, N-way Set Associative.
Caching questions: How does a cache work? How fast? How big?

Writing a paper on Beren and Lúthien

Pick a small set of books, not the entire shelf. Spend time on a small set of chapters. Sometimes get other books as well: Norse mythology, a Tolkien biography. Your desk runs out of space: replace less useful books with new ones.

Pick a small set of books, not the entire shelf → cache vs. main memory; working set (the subset in use).
Spend time on a small set of chapters → cache hit; locality of access: temporal and spatial.
Sometimes go to other books → the cache may not have the data (cache miss).
Shelf out of space → cache eviction policy.

int n = 4;
int k[] = { 3, 14, 0, 10 };

int fib(int i) {                      // Temporal Locality
  if (i <= 2) return i;
  else return fib(i-1) + fib(i-2);
}

int main(int ac, char **av) {
  for (int i = 0; i < n; i++) {
    printi(fib(k[i]));                // Spatial Locality
    prints("\n");
  }
}

If Mem[x] was accessed recently...
...then Mem[x] is likely to be accessed soon. Exploit temporal locality: put recently accessed Mem[x] higher in the memory hierarchy, since it will likely be accessed again soon.
...then Mem[x ± ε] is likely to be accessed soon. Exploit spatial locality: put the entire block containing Mem[x] and surrounding addresses higher in the memory hierarchy, since nearby addresses will likely be accessed.
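To make spatial locality concrete, here is a minimal C sketch (not from the slides; the array name and sizes are made up for illustration) contrasting an access order that uses every byte of each cache block with one that jumps far ahead on every access:

#include <stdio.h>

#define ROWS 1024
#define COLS 1024

static int grid[ROWS][COLS];          /* hypothetical array; C stores it row-major */

int main(void) {
  long sum = 0;

  /* Good spatial locality: consecutive accesses touch adjacent
     addresses, so each cache block brought in is fully used.   */
  for (int r = 0; r < ROWS; r++)
    for (int c = 0; c < COLS; c++)
      sum += grid[r][c];

  /* Poor spatial locality: consecutive accesses are COLS*sizeof(int)
     bytes apart, so each one likely lands in a different block. */
  for (int c = 0; c < COLS; c++)
    for (int r = 0; r < ROWS; r++)
      sum += grid[r][c];

  printf("%ld\n", sum);
  return 0;
}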

Memory closer to the processor: small & fast, stores active data.
Memory farther from the processor: big & slow, stores inactive data.
L1 Cache: SRAM, on-chip. L2/L3 Cache: SRAM. Memory: DRAM.

$R3 (Reg file). LW $R3, Mem.
L1 Cache (SRAM, on-chip): 1% of data, accessed the most.
L2/L3 Cache (SRAM): 9% of data is active.
Memory (DRAM): 90% of data is inactive (not accessed).

Memory closer to the processor is fast but small; it usually stores a subset of the memory farther away (strictly inclusive).
Transfer whole blocks (cache lines): 4kb: disk ↔ RAM; 256b: RAM ↔ L2; 64b: L2 ↔ L1.

Processor tries to access Mem[x]. Check: is the block containing Mem[x] in the cache?
Yes: cache hit → return requested data from the cache line.
No: cache miss → read the block from memory (or a lower-level cache), (evict an existing cache line to make room), place the new block in the cache, return the requested data → and stall the pipeline while all of this happens.
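A rough C sketch of that check for a direct-mapped cache (the sizes, struct, and field names here are assumptions for illustration, and the miss path is reduced to a single block fill):

#include <stdint.h>
#include <string.h>

#define LINES       4                 /* assumed number of cache lines */
#define BLOCK_BYTES 2                 /* assumed block (line) size     */

struct line { int valid; uint32_t tag; uint8_t block[BLOCK_BYTES]; };
static struct line cache[LINES];
static uint8_t mem[64];               /* stand-in for main memory      */

uint8_t load_byte(uint32_t addr) {
  uint32_t offset  = addr % BLOCK_BYTES;
  uint32_t blockno = addr / BLOCK_BYTES;
  uint32_t index   = blockno % LINES;
  uint32_t tag     = blockno / LINES;

  struct line *l = &cache[index];
  if (!(l->valid && l->tag == tag)) {                          /* cache miss */
    memcpy(l->block, &mem[blockno * BLOCK_BYTES], BLOCK_BYTES);
    l->valid = 1;                                              /* old line evicted;     */
    l->tag   = tag;                                            /* pipeline stalls here  */
  }
  return l->block[offset];                                     /* hit path */
}

int main(void) { mem[5] = 42; return load_byte(5) == 42 ? 0 : 1; }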

Block (or line): the minimum unit of information that is present (or not) in the cache.
Cache hit, cache miss.
Hit rate: the fraction of memory accesses found in a level of the memory hierarchy.
Miss rate: the converse.

What structure to use?
Where to place a block (book)?
How to find a block (book)?
On a miss, which block to replace?
What happens on a write?

A given data block can be placed...
...in exactly one cache line → Direct Mapped
...in any cache line → Fully Associative
...in a small set of cache lines → Set Associative

Each block number maps to a single cache line index. Simplest hardware.
Questions: How to index into the cache? How to find the correct word/byte? How to match it?
[Diagram: Memory addresses 0x000000-0x000044 mapping onto a cache of 2 cachelines (line 0, line 1), 1 word per cacheline, byte addressable.]

Questions: How to index into the cache? How to find the correct word/byte? How to match it?
32-bit addr = tag (29 bits) | index (1 bit) | offset (2 bits).
[Diagram: Memory addresses 0x000000-0x000048 mapping onto a cache of 2 cachelines (line 0, line 1), 1 word per cacheline, byte addressable.]

Each block number maps to a single cache line index. Simplest hardware.
32-bit addr = tag | index | offset.
[Diagram: Memory addresses 0x000000-0x000044 mapping onto a cache of 2 cachelines, 4 words per cacheline, byte addressable.]

Size of offset? A) 1 B) 2 C) 3 D) 4 E) 5. Size of tag?

Each block number maps to a single cache line index. Simplest hardware.
32-bit addr = tag (27 bits) | index (1 bit) | offset (4 bits).
[Diagram: Memory addresses 0x000000-0x000048 mapping onto a cache of 2 cachelines (line 0, line 1), 4 words per cacheline, byte addressable.]

32-bit addr = tag (27 bits) | index (2 bits) | offset (3 bits).
[Diagram: Memory addresses 0x000000-0x000048 mapping onto a cache of 4 cachelines (lines 0-3), 2 words per cacheline.]
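A small C sketch (an illustration, not code from the slides) that performs this tag / index / offset split for a given direct-mapped geometry, checked against the three configurations above:

#include <stdio.h>

/* Split a 32-bit byte address for a direct-mapped cache with `lines`
   cache lines of `words_per_line` 4-byte words each. */
static void split(unsigned addr, unsigned lines, unsigned words_per_line) {
  unsigned offset_bits = 2, index_bits = 0;          /* 2 bits address a 4-byte word */
  while ((1u << (offset_bits - 2)) < words_per_line) offset_bits++;
  while ((1u << index_bits) < lines) index_bits++;
  unsigned tag_bits = 32 - index_bits - offset_bits;

  unsigned offset = addr & ((1u << offset_bits) - 1);
  unsigned index  = (addr >> offset_bits) & ((1u << index_bits) - 1);
  unsigned tag    = addr >> (offset_bits + index_bits);
  printf("addr 0x%08x: tag=%u (%u bits), index=%u (%u bits), offset=%u (%u bits)\n",
         addr, tag, tag_bits, index, index_bits, offset, offset_bits);
}

int main(void) {
  split(0x00000024, 2, 1);   /* 2 lines, 1 word/line : tag 29, index 1, offset 2 bits */
  split(0x00000024, 2, 4);   /* 2 lines, 4 words/line: tag 27, index 1, offset 4 bits */
  split(0x00000024, 4, 2);   /* 4 lines, 2 words/line: tag 27, index 2, offset 3 bits */
  return 0;
}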

Pros: Very simple hardware

[Direct-mapped lookup hardware: the address is split into Tag | Index | Offset; the index selects one cache line (V, Tag, Block); a comparator (=) checks the stored tag against the address tag to produce hit?; the offset drives word select to deliver 32 bits of data.]

Tag | Index | Offset. 2^m bytes per block, 2^n blocks; n-bit index, m-bit offset.
Q: How big is the cache (data only)?
Cache of 2^n blocks, block size of 2^m bytes.
Cache size: 2^m bytes per block × 2^n blocks = 2^(n+m) bytes.

Tag | Index | Offset. 2^m bytes per block, 2^n blocks; n-bit index, m-bit offset.
Q: How much SRAM is needed (data + overhead)?
Cache of 2^n blocks, block size of 2^m bytes.
Tag field: 32 - (n + m) bits; valid bit: 1 bit.
SRAM size: 2^n × (block size + tag size + valid bit size)
         = 2^n × (2^m bytes × 8 bits-per-byte + (32 - n - m) + 1) bits
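As a quick worked check of both formulas, here is a short C sketch with an assumed example cache (n = 10, m = 6, i.e. 1024 blocks of 64 bytes; these parameters are not from the slides):

#include <stdio.h>

int main(void) {
  unsigned n = 10, m = 6;                       /* assumed: 2^10 blocks of 2^6 bytes */
  unsigned long blocks     = 1ul << n;
  unsigned long block_bits = (1ul << m) * 8;    /* data bits per block               */
  unsigned long tag_bits   = 32 - (n + m);      /* 32-bit addresses                  */

  unsigned long data_bytes = blocks * (1ul << m);                /* 2^(n+m) bytes    */
  unsigned long sram_bits  = blocks * (block_bits + tag_bits + 1 /* valid */);

  printf("data only : %lu bytes (64 KB)\n", data_bytes);         /* 65536            */
  printf("total SRAM: %lu bits (about 66 KB)\n", sram_bits);     /* 1024*(512+16+1)  */
  return 0;
}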

Using byte addresses in this example. Addr Bus = 5 bits.
Processor: LB $1 ← M[ 1 ]; LB $2 ← M[ 5 ]; LB $3 ← M[ 1 ]; LB $3 ← M[ 4 ]; LB $2 ← M[ 0 ].
Cache: 4 cache lines, 2-byte block; 2-bit tag field, 2-bit index field, 1-bit block offset; each line holds V, tag, data.
Memory: addresses 0-15 holding the values 100, 110, 120, ..., 250.

Stepping the trace through the cache (each access indexes a line, then compares the valid bit and tag):
LB $1 ← M[ 1 ]  → index 0: miss; fill with block M[0-1] (100, 110)
LB $2 ← M[ 5 ]  → index 2: miss; fill with block M[4-5] (140, 150)
LB $3 ← M[ 1 ]  → index 0: hit
LB $3 ← M[ 4 ]  → index 2: hit
LB $2 ← M[ 0 ]  → index 0: hit
LB $2 ← M[ 10 ] → index 1: miss; fill with block M[10-11]
LB $2 ← M[ 15 ] → index 3: miss; fill with block M[14-15]
LB $2 ← M[ 8 ]  → index 0: miss; fill with block M[8-9], evicting M[0-1]
LB $2 ← M[ 10 ], M[ 15 ], M[ 8 ] (twice more) → all hits
Misses: 5. Hits: 3 + 3 + 3 = 9.
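As a sanity check on that tally, here is a tiny direct-mapped cache simulator (a sketch written for this summary, not course code) with 4 lines of 2 bytes, replaying the byte addresses from the trace; it prints 5 misses and 9 hits:

#include <stdio.h>

#define LINES 4
#define BLOCK 2                                    /* bytes per cache line */

int main(void) {
  int valid[LINES] = {0};
  unsigned tag[LINES] = {0};
  unsigned trace[] = { 1, 5, 1, 4, 0,              /* first five accesses  */
                       10, 15, 8, 10, 15, 8, 10, 15, 8 };
  int misses = 0, hits = 0;

  for (unsigned i = 0; i < sizeof trace / sizeof trace[0]; i++) {
    unsigned blockno = trace[i] / BLOCK;
    unsigned index   = blockno % LINES;
    unsigned t       = blockno / LINES;
    if (valid[index] && tag[index] == t) {
      hits++;
    } else {                                       /* miss: fill the line, evicting the old block */
      misses++;
      valid[index] = 1;
      tag[index]   = t;
    }
  }
  printf("Misses: %d, Hits: %d\n", misses, hits);  /* expect: Misses: 5, Hits: 9 */
  return 0;
}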

Pathological example. Same cache (4 lines, 2-byte blocks). Trace:
LB $1 ← M[ 1 ]; LB $2 ← M[ 5 ]; LB $3 ← M[ 1 ]; LB $3 ← M[ 4 ]; LB $2 ← M[ 0 ]  → 2 misses, 3 hits, as before.
Then, repeatedly: LB $2 ← M[ 12 ], M[ 8 ], M[ 4 ], M[ 0 ], ...
M[ 12 ] and M[ 4 ] both map to index 2; M[ 8 ] and M[ 0 ] both map to index 0. Each fill evicts a block the loop will need again shortly, so every access in the loop misses.
Misses: 4 + 2 + 2 + 2 + ... (and growing). Hits: 3.

The working set is not too big for the cache. Yet, we can't make it work: the addresses map onto the same cache lines and keep evicting each other (conflict misses).