Cache Memory - II Some of the slides are adapted from David Patterson (UCB)

Outline: Direct-Mapped Cache; Types of Cache Misses; A (long) detailed example; Peer-to-peer education example; Block Size Tradeoff; Types of Cache Misses

Review: Memory Hierarchy (figure: processor with control, datapath, and registers, backed by increasingly large, slow, and cheap levels of memory) Registers -> L1 Cache (SRAM) -> L2 Cache (SRAM) -> RAM/ROM (DRAM, EEPROM) -> Hard Disk. Speed: fastest at the top, slowest at the bottom. Size: smallest at the top, biggest at the bottom. Cost per byte: highest at the top, lowest at the bottom.

Review: Direct-Mapped Cache (#1/2) In a direct-mapped cache, each memory address is associated with exactly one possible block within the cache. Therefore, we only need to look in a single location in the cache to find the data, if it exists in the cache. The block is the unit of transfer between cache and memory.

Review: Direct-Mapped Cache with 1-Word Blocks Consider a 16-byte memory (addresses 0x0 through 0xF) and an 8-byte direct-mapped cache with a block size of 4 bytes (two blocks, Index 0 and Index 1). Cache location 0 can be occupied by data from memory locations 0-3 or 8-B; in general, any 4 memory locations starting at address 8*n (n = 0, 1, 2, ...). Cache location 1 can be occupied by data from memory locations 4-7 or C-F; in general, any 4 memory locations starting at address 8*n + 4 (n = 0, 1, 2, ...).
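As a quick sanity check, the mapping rule above can be expressed in a few lines of code. This is a minimal sketch; the names cache_size_bytes, block_size_bytes, and cache_slot are illustrative assumptions, not code from the slides:

```python
# Which direct-mapped cache slot does a byte address fall into?
# slot = (address // block_size) % number_of_blocks
def cache_slot(address: int, cache_size_bytes: int = 8, block_size_bytes: int = 4) -> int:
    num_blocks = cache_size_bytes // block_size_bytes
    return (address // block_size_bytes) % num_blocks

# Memory locations 0-3 and 8-B map to slot 0; 4-7 and C-F map to slot 1.
for addr in (0x0, 0x4, 0x8, 0xC):
    print(hex(addr), "->", cache_slot(addr))
```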

Direct-Mapped Cache with 1-Word Blocks: Example Hardware (figure: cache lookup datapath) The address is split into a tag, a block index, and a byte offset. The block index selects one row of the tag RAM and the data RAM; the stored tag is compared against the address tag to produce the hit signal, and a mux uses the byte offset to select the requested data from the block.

Review: Direct-Mapped Cache Terminology All fields are read as unsigned integers. Index: specifies the cache index (which row or line of the cache we should look in). Offset: once we've found the correct block, specifies which byte within the block we want. Tag: the remaining bits after offset and index are determined; these are used to distinguish between all the memory addresses that map to the same cache location.

Reading Material Steve Furber: ARM System-on-Chip Architecture; 2nd Ed., Addison-Wesley, 2000, ISBN: 0-201-67519-6. Chapter 10.

Accessing Data in a Direct-Mapped Cache (#1/3) Example: 16 KB of data, direct-mapped, 4-word (16-byte) blocks. Read 4 addresses: 0x00000014, 0x0000001C, 0x00000034, 0x00008014. Only one level of cache/memory hierarchy. Memory contents (address in hex / value of word): 0x00000010 a, 0x00000014 b, 0x00000018 c, 0x0000001C d; 0x00000030 e, 0x00000034 f, 0x00000038 g, 0x0000003C h; 0x00008010 i, 0x00008014 j, 0x00008018 k, 0x0000801C l.

Accessing Data in a Direct-Mapped Cache (#2/3) The 4 addresses (0x00000014, 0x0000001C, 0x00000034, 0x00008014) are divided (for convenience) into Tag, Index, and Byte Offset fields: tttttttttttttttttt iiiiiiiiii oooo, i.e. 18 tag bits to check whether we have the correct block, 10 index bits to select the block, and 4 offset bits to select the byte within the block. 0x00000014 = 0...00 0000000001 0100 -> Tag 0, Index 1, Offset 0x4. 0x0000001C = 0...00 0000000001 1100 -> Tag 0, Index 1, Offset 0xC. 0x00000034 = 0...00 0000000011 0100 -> Tag 0, Index 3, Offset 0x4. 0x00008014 = 0...10 0000000001 0100 -> Tag 2, Index 1, Offset 0x4.
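The field extraction is just shifting and masking. Below is a minimal sketch of the split for this 16 KB, 16-byte-block cache; the function name split_address is an illustrative assumption, while the field widths are the ones derived above:

```python
OFFSET_BITS = 4    # 16-byte blocks -> 4 offset bits
INDEX_BITS = 10    # 16 KB / 16 B = 1024 blocks -> 10 index bits

def split_address(addr: int) -> tuple[int, int, int]:
    """Split a 32-bit address into (tag, index, offset) fields."""
    offset = addr & ((1 << OFFSET_BITS) - 1)
    index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset

for addr in (0x00000014, 0x0000001C, 0x00000034, 0x00008014):
    tag, index, offset = split_address(addr)
    print(f"{addr:#010x}: tag={tag} index={index} offset={offset:#x}")
```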

Accessing Data in a Direct-Mapped Cache (#3/3) So let's go through accessing some data in this cache: 16 KB of data, direct-mapped, 4-word blocks. We will see 3 types of events: cache miss: nothing in the cache at the appropriate block, so fetch from memory; cache hit: the cache block is valid and contains the proper address, so read the desired word; cache miss with block replacement: the wrong data is in the cache at the appropriate block, so discard it and fetch the desired data from memory.

16 KB Direct-Mapped Cache, 16-Byte Blocks Valid bit: determines whether anything is stored in that row (when the computer is initially turned on, all entries are invalid). Each of the 1024 rows (Index 0 through 1023) holds a valid bit, a tag, and four data words covering byte offsets 0x0-0x3, 0x4-0x7, 0x8-0xB, and 0xC-0xF. In this example, every valid bit starts at 0.

Read 0x00000014 = 0...00 0000000001 0100: Tag field = 0, Index field = 1, Offset = 0x4.

So we read cache block 1 (Index field = 1).

It holds no valid data (valid bit = 0): a cache miss.

So load that data into the cache, setting the tag to 0 and the valid bit to 1; block 1 now holds the words a, b, c, d.

Read from the cache at offset 0x4 and return word b.

Read 0x0000001C = 0...00 0000000001 1100: Tag field = 0, Index field = 1, Offset = 0xC.

The data is valid and the tag matches (a hit), so read offset 0xC and return word d.

Read 0x00000034 = 0...00 0000000011 0100: Tag field = 0, Index field = 3, Offset = 0x4.

So we read cache block 3.

It holds no valid data: another cache miss.

Load that cache block (words e, f, g, h, tag 0, valid bit 1) and return word f.

Read 0x00008014 = 0...10 0000000001 0100: Tag field = 2, Index field = 1, Offset = 0x4.

So we read cache block 1; its data is valid.

But cache block 1's stored tag (0) does not match the address tag (2).

It's a miss, so replace block 1 with the new data (words i, j, k, l) and tag 2.

And return word j.

Do an example yourself. What happens? Choose from: Cache: hit, miss, miss with replacement. Values returned: a, b, c, d, e, ..., k, l. Read address 0x00000030 (= 0...00 0000000011 0000)? Read address 0x0000001C (= 0...00 0000000001 1100)? Current cache state: Index 1 is valid with tag 2 and words i, j, k, l; Index 3 is valid with tag 0 and words e, f, g, h; all other rows are invalid.

Answers: 0x00000030 is a hit: Index = 3, the tag matches, Offset = 0, value = e. 0x0000001C is a miss with replacement: Index = 1, tag mismatch (the cache holds tag 2), so replace the block from memory; Offset = 0xC, value = d. The values read from the cache must equal the memory values whether or not they are cached: 0x00000030 = e and 0x0000001C = d. Memory (address / value of word): 0x00000010 a, 0x00000014 b, 0x00000018 c, 0x0000001C d; 0x00000030 e, 0x00000034 f, 0x00000038 g, 0x0000003C h; 0x00008010 i, 0x00008014 j, 0x00008018 k, 0x0000801C l.
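The entire walkthrough above, including the peer example, can be reproduced with a small simulator. This is a minimal sketch under the slides' parameters (16 KB, direct-mapped, 16-byte blocks); the class name DirectMappedCache and the MEMORY dictionary are illustrative assumptions, not code from the lecture:

```python
OFFSET_BITS, INDEX_BITS = 4, 10  # 16-byte blocks, 1024 rows -> 16 KB of data

# Backing memory, taken from the slides' table: word address -> value.
MEMORY = {0x10: "a", 0x14: "b", 0x18: "c", 0x1C: "d",
          0x30: "e", 0x34: "f", 0x38: "g", 0x3C: "h",
          0x8010: "i", 0x8014: "j", 0x8018: "k", 0x801C: "l"}

class DirectMappedCache:
    def __init__(self):
        # One row per index: (valid, tag, 4-word block); all rows start invalid.
        self.rows = [(False, 0, None)] * (1 << INDEX_BITS)

    def read(self, addr):
        offset = addr & ((1 << OFFSET_BITS) - 1)
        index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
        tag = addr >> (OFFSET_BITS + INDEX_BITS)
        valid, stored_tag, block = self.rows[index]
        if valid and stored_tag == tag:
            event = "hit"
        else:
            event = "miss, block replacement" if valid else "miss"
            base = addr & ~((1 << OFFSET_BITS) - 1)  # start of the 16-byte block
            block = [MEMORY.get(base + 4 * i) for i in range(4)]
            self.rows[index] = (True, tag, block)
        return event, block[offset // 4]

cache = DirectMappedCache()
for addr in (0x14, 0x1C, 0x34, 0x8014, 0x30, 0x1C):
    print(hex(addr), cache.read(addr))
# miss -> b, hit -> d, miss -> f, miss w/ replacement -> j,
# hit -> e, miss w/ replacement -> d   (matching the slides' answers)
```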

Block Size Tradeoff (#1/3) Benefits of Larger Block Size. Spatial locality: if we access a given word, we're likely to access other nearby words soon (another Big Idea). Very applicable with the Stored-Program Concept: if we execute a given instruction, it's likely that we'll execute the next few as well. Works nicely for sequential array accesses too.

Block Size Tradeoff (#2/3) Drawbacks of Larger Block Size. A larger block size means a larger miss penalty: on a miss, it takes longer to load the new block from the next level. And if the block size is too big relative to the cache size, there are too few blocks, so the miss rate goes up. In general, minimize Average Access Time = Hit Time x Hit Rate + Miss Penalty x Miss Rate.

Block Size Tradeoff (#3/3) Hit Time = time to find and retrieve data from the current level cache. Miss Penalty = average time to retrieve data on a current-level miss (includes the possibility of misses at successive levels of the memory hierarchy). Hit Rate = % of requests that are found in the current level cache. Miss Rate = 1 - Hit Rate.
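To make the formula concrete, here is a small worked example; the cycle counts are illustrative assumptions, not figures from the lecture:

```python
# Illustrative numbers (assumptions, not from the lecture):
hit_time = 1       # cycles to access this cache level on a hit
miss_penalty = 50  # cycles to fetch the block from the next level on a miss
hit_rate = 0.95
miss_rate = 1 - hit_rate

# Average Access Time = Hit Time x Hit Rate + Miss Penalty x Miss Rate
avg_access_time = hit_time * hit_rate + miss_penalty * miss_rate
print(avg_access_time)  # 1 x 0.95 + 50 x 0.05 = 3.45 cycles
```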

Extreme Example: One Big Block. Cache size = 4 bytes, block size = 4 bytes: only ONE entry in the cache (one valid bit, one tag, and data bytes B3, B2, B1, B0). If an item is accessed, it is likely to be accessed again soon, but it is unlikely to be accessed again immediately, so the next access will likely be a miss again: we continually load data into the cache but discard it (force it out) before we use it again. This is the cache designer's nightmare: the Ping-Pong Effect.
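The ping-pong effect is easy to provoke in this one-entry cache: two blocks that map to the same (only) entry but carry different tags evict each other on every access. A minimal illustrative sketch (the variable names are assumptions):

```python
# One-entry cache: remembers a single block tag at a time.
slot_tag = None
misses = 0
# Addresses 0x0 and 0x4 live in different 4-byte blocks, so in a 4-byte cache
# alternating between them misses every time: the Ping-Pong Effect.
for addr in [0x0, 0x4] * 4:
    tag = addr // 4  # the block number doubles as the tag (one-entry cache)
    if slot_tag != tag:
        misses += 1
        slot_tag = tag
print(misses, "misses out of 8 accesses")  # 8 out of 8
```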

Block Size Tradeoff Conclusions (figure: three curves plotted against block size). Miss penalty: grows as block size grows. Miss rate: first falls as larger blocks exploit spatial locality, then rises when the cache has too few blocks, which compromises temporal locality. Average access time: has a sweet spot; beyond it, the increased miss penalty and miss rate push the average access time back up.

Things to Remember. Cache access involves 3 types of events: cache miss: nothing in the cache at the appropriate block, so fetch from memory; cache hit: the cache block is valid and contains the proper address, so read the desired word; cache miss with block replacement: the wrong data is in the cache at the appropriate block, so discard it and fetch the desired data from memory.