Review: New-School Machine Structures. Review: Direct-Mapped Cache. TIO: Dan's great cache mnemonic. Memory Access without Cache

Transcription:

UCB CS61C: Machine Structures, Lecture 13: Caches II
Guest Lecturer Alan Christopher
inst.eecs.berkeley.edu/~cs61c

MEMRISTOR MEMORY ON ITS WAY (HOPEFULLY): HP has begun testing research prototypes of a novel non-volatile memory element, the memristor. Memristors have double the storage density of flash and 10x more read-write cycles than flash (10^6 vs. 10^5). Memristors are (in principle) also capable of acting as both memory and logic; how cool is that? Originally slated to be ready by 2013, HP later pushed that date to some time in 2014.
www.technologyreview.com/computing/518
http://www.technologyreview.com/view/51361/can-hp-save-itself/

Review: New-School Machine Structures
Software and hardware harness parallelism to achieve high performance:
- Parallel Requests, assigned to a computer, e.g., Search "Katz" (warehouse scale computer)
- Parallel Threads, assigned to a core, e.g., Lookup, Ads
- Parallel Instructions, >1 instruction @ one time, e.g., 5 pipelined instructions
- Parallel Data, >1 data item @ one time, e.g., add of 4 pairs of words (A0+B0, A1+B1, A2+B2, A3+B3)
- Hardware descriptions, all gates @ one time
- Programming Languages
The hardware levels run from the Warehouse Scale Computer down through a Computer (e.g., a smart phone) with Cores, Memory (Cache), and Input/Output, then a Core's Instruction Unit(s), Functional Unit(s), and Cache Memory, down to Logic Gates. Today's lecture sits at the memory (cache) level.

Review: Direct-Mapped Cache
All fields are read as unsigned integers.
- Index: specifies the cache index (which row/block of the cache to look in)
- Offset: specifies which byte within the block we want
- Tag: distinguishes between the addresses that map to the same location
Address layout: tag (to check if we have the correct block) | index (to select the block) | byte offset (within the block):
tttttttttttttttttt iiiiiiiiii oooo

TIO: Dan's great cache mnemonic
Picture the address split as a rectangle: WIDTH (size of one block, B/block) spans the Offset bits, HEIGHT (# of blocks) spans the Index bits, and AREA (cache size, B) = HEIGHT x WIDTH, since 2^(H+W) = 2^H x 2^W. The address size (often 32 bits) is Tag + Index + Offset.

Memory Access without Cache
Load word instruction: lw $t0, 0($t1); $t1 contains 1022 ten, Memory[1022] = 99.
1. Processor issues address 1022 ten to Memory
2. Memory reads the word at address 1022 ten (99)
3. Memory sends 99 to the Processor
4. Processor loads 99 into register $t0

Memory Access with Cache
Load word instruction: lw $t0, 0($t1); $t1 contains 1022 ten, Memory[1022] = 99.
With a cache (similar to a hash):
1. Processor issues address 1022 ten to the Cache
2. Cache checks to see if it has a copy of the data at address 1022 ten
   a. If it finds a match (Hit): cache reads 99 and sends it to the processor
   b. No match (Miss): cache sends address 1022 to Memory
      I. Memory reads 99 at address 1022 ten
      II. Memory sends 99 to the Cache
      III. Cache replaces a word with the new 99
      IV. Cache sends 99 to the processor
3. Processor loads 99 into register $t0
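The with-cache flow above is essentially a hash-like lookup followed by a fill on a miss. Below is a minimal C sketch of those numbered steps, assuming a word-addressed memory array and a single-entry cache just to keep the flow visible; the names and sizes are illustrative and not from the lecture.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define MEM_WORDS 2048
static uint32_t memory[MEM_WORDS];            /* stand-in for DRAM, word-addressed */

/* one-entry "cache": valid bit, the address it holds, and the cached word */
static bool     cache_valid = false;
static uint32_t cache_addr, cache_data;

/* Memory access WITH cache, following the slide's steps for lw $t0, 0($t1) */
static uint32_t load_word(uint32_t addr) {
    /* 1. Processor issues the address to the Cache.                        */
    /* 2. Cache checks to see if it has a copy of the data at that address. */
    if (cache_valid && cache_addr == addr)
        return cache_data;                    /* 2a. Hit: send it to the processor */
    /* 2b. Miss: Cache sends the address to Memory (steps I-II),            */
    /*     keeps the returned word (III), and forwards it (IV).             */
    cache_valid = true;
    cache_addr  = addr;
    cache_data  = memory[addr];
    return cache_data;
    /* 3. Processor loads the returned value into the destination register. */
}

int main(void) {
    memory[1022] = 99;                        /* Memory[1022] = 99, as in the example */
    printf("%u\n", load_word(1022));          /* first access misses: fetched from memory */
    printf("%u\n", load_word(1022));          /* second access hits: served from the cache */
    return 0;
}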

Caching Terminology
When reading memory, three things can happen:
- cache hit: the cache block is valid and contains the proper address, so read the desired word
- cache miss: nothing in the cache at the appropriate block, so fetch from memory
- cache miss, block replacement: the wrong data is in the cache at the appropriate block, so discard it and fetch the desired data from memory (the cache always holds a copy)

Cache Terms
- Hit rate: fraction of accesses that hit in the cache
- Miss rate: 1 - Hit rate
- Miss penalty: time to replace a block from the lower level of the memory hierarchy into the cache
- Hit time: time to access the cache memory (including the tag comparison)
- Abbreviation: $ = cache (a Berkeley innovation!)

Accessing data in a direct-mapped cache
Example: 16 KB of data, direct-mapped, 4-word blocks. Can you work out the height, width, and area?
Read 4 addresses:
1. 0x00000014
2. 0x0000001C
3. 0x00000034
4. 0x00008014
Memory values at those locations:
Address (hex)   Value of word
00000010        a
00000014        b
00000018        c
0000001C        d
00000030        e
00000034        f
00000038        g
0000003C        h
00008010        i
00008014        j
00008018        k
0000801C        l

The 4 addresses divided (for convenience) into Tag, Index, and Byte Offset fields:
0x00000014 -> Tag 0, Index 1, Offset 0x4
0x0000001C -> Tag 0, Index 1, Offset 0xC
0x00000034 -> Tag 0, Index 3, Offset 0x4
0x00008014 -> Tag 2, Index 1, Offset 0x4
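As a sketch of the "height, width, area" question for this example (assuming 32-bit byte addresses; the constants and names below are mine, not the slides'): the blocks are 4 words = 16 B wide, so WIDTH = 16 B and the Offset is 4 bits; 16 KB / 16 B = 1024 blocks, so HEIGHT = 1024 and the Index is 10 bits; AREA = 16 KB, and the Tag takes the remaining 32 - 10 - 4 = 18 bits. The C below reproduces the T/I/O split listed above.

#include <stdint.h>
#include <stdio.h>

/* Parameters from the lecture's example: 16 KB of data, direct-mapped,
 * 4-word (16-byte) blocks, 32-bit byte addresses. */
#define CACHE_SIZE   (16 * 1024)                 /* AREA: bytes of data        */
#define BLOCK_SIZE   16                          /* WIDTH: bytes per block     */
#define NUM_BLOCKS   (CACHE_SIZE / BLOCK_SIZE)   /* HEIGHT: 1024 blocks        */

#define OFFSET_BITS  4                           /* log2(BLOCK_SIZE)           */
#define INDEX_BITS   10                          /* log2(NUM_BLOCKS)           */
#define TAG_BITS     (32 - INDEX_BITS - OFFSET_BITS)   /* 18 bits              */

static void tio(uint32_t addr) {
    uint32_t offset = addr & (BLOCK_SIZE - 1);
    uint32_t index  = (addr >> OFFSET_BITS) & (NUM_BLOCKS - 1);
    uint32_t tag    = addr >> (OFFSET_BITS + INDEX_BITS);
    printf("0x%08x -> tag %u, index %u, offset 0x%x\n", addr, tag, index, offset);
}

int main(void) {
    uint32_t addrs[] = { 0x00000014, 0x0000001C, 0x00000034, 0x00008014 };
    for (int i = 0; i < 4; i++)
        tio(addrs[i]);
    return 0;
}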

16 KB Direct Mapped Cache, 16 B blocks
The cache is a table of 1024 rows (indexes 0 through 1023). Each row holds a Valid bit, a Tag, and one 16-byte block of data (the words at byte offsets 0x0-3, 0x4-7, 0x8-B, and 0xC-F). The Valid bit determines whether anything is stored in that row; when the computer is initially turned on, all entries are invalid.

1. Read 0x00000014
Tag field = 0, Index field = 1, Offset = 0x4.
So we read block 1 (Index = 1). There is no valid data there yet, so this is a miss: load that block from memory into the cache, setting the Tag (0) and the Valid bit. Row 1 now holds words a, b, c, d at offsets 0x0-3 through 0xC-F. Then read from the cache at offset 0x4 and return word b.

2. Read 0x0000001C
Tag field = 0, Index field = 1, Offset = 0xC.
The indexed row is valid and its Tag matches, so this is a hit: read from row 1 at offset 0xC and return word d.

3. Read 0x00000034
Tag field = 0, Index field = 3, Offset = 0x4.
So we read block 3. There is no valid data there, so this is a miss: load that cache block (row 3 now holds words e, f, g, h with Tag 0 and Valid set) and return the word at offset 0x4, word f.

4. Read 0x00008014
Tag field = 2, Index field = 1, Offset = 0x4.
So we read cache block 1. Its data is valid, but the stored Tag does not match (0 != 2). Miss, so replace block 1 with the new data and tag: row 1 now holds words i, j, k, l with Tag 2. Then return the word at offset 0x4, word j.

Do an example yourself. What happens?
Choose from -- Cache: Hit, Miss, Miss with replacement; Values returned: a, b, c, d, e, ..., k, l.
- Read address 0x00000030?
- Read address 0x0000001C?
(Cache state at this point: row 1 holds i, j, k, l with Tag 2; row 3 holds e, f, g, h with Tag 0.)

Answers
- 0x00000030: a hit. Index = 3, Tag matches, Offset = 0, value = e.
- 0x0000001C: a miss. Index = 1, Tag mismatch, so replace the block from memory; Offset = 0xC, value = d.
Since these are reads, the values returned must equal the memory values whether or not they were cached: 0x00000030 = e and 0x0000001C = d (per the memory table above).

Administrivia
Proj 1-2 due Sunday.

Multiword-Block Direct-Mapped Cache
Four words/block, cache size = 4K words. (The slide's datapath: address bits 31-14 form the 18-bit Tag, bits 13-4 the 10-bit Index, bits 3-2 the block offset, and bits 1-0 the byte offset; an 18-bit tag comparison ANDed with the Valid bit produces Hit, and a 4-to-1 multiplexor driven by the block offset selects the 32-bit Data word.) What kind of locality are we taking advantage of?

Peer Instruction
1) Memory hierarchies were invented before 1950. (The UNIVAC I wasn't delivered until 1951.)
2) All caches take advantage of spatial locality.
3) All caches take advantage of temporal locality.
Answer choices: FFF, FFT, FTF, FTT, TFF, TFT, TTF, TTT.

And in Conclusion
Caches provide a mechanism for transparent movement of data among the levels of a storage hierarchy:
- a set of address/value bindings
- the address's index selects the set of candidates
- compare the desired address with the tag
- service the hit or miss; load a new block and binding on a miss
address: tag | index | offset
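To tie the conclusion back to the walkthrough, here is a minimal direct-mapped read simulator for the same 16 KB, 16-byte-block cache that replays the four lecture reads and the two self-test reads. It is a sketch under the stated assumptions, not course reference code; the memory stand-in, identifiers, and output format are illustrative.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define BLOCK_SIZE 16
#define NUM_ROWS   1024

struct row {
    bool     valid;
    uint32_t tag;
    uint8_t  data[BLOCK_SIZE];
};

static struct row cache[NUM_ROWS];

/* Stand-in for main memory: every byte of a word returns that word's
 * letter name from the slides (0x10 = a, 0x14 = b, ..., 0x801C = l). */
static uint8_t mem_read_byte(uint32_t addr) {
    switch (addr & ~0x3u) {                   /* word-align the address */
        case 0x00000010: return 'a';  case 0x00000014: return 'b';
        case 0x00000018: return 'c';  case 0x0000001C: return 'd';
        case 0x00000030: return 'e';  case 0x00000034: return 'f';
        case 0x00000038: return 'g';  case 0x0000003C: return 'h';
        case 0x00008010: return 'i';  case 0x00008014: return 'j';
        case 0x00008018: return 'k';  case 0x0000801C: return 'l';
        default:         return 0;
    }
}

/* Read one byte through the cache, reporting hit / miss / miss with replacement. */
static uint8_t cache_read(uint32_t addr) {
    uint32_t offset = addr & (BLOCK_SIZE - 1);
    uint32_t index  = (addr / BLOCK_SIZE) % NUM_ROWS;
    uint32_t tag    = addr / BLOCK_SIZE / NUM_ROWS;
    struct row *r   = &cache[index];

    if (r->valid && r->tag == tag) {
        printf("0x%08x: hit                     (index %u)\n", addr, index);
    } else {
        printf("0x%08x: %s (index %u)\n", addr,
               r->valid ? "miss, block replacement" : "miss                   ", index);
        /* fetch the whole block from memory and bind it to this row */
        for (uint32_t i = 0; i < BLOCK_SIZE; i++)
            r->data[i] = mem_read_byte((addr & ~(BLOCK_SIZE - 1u)) + i);
        r->valid = true;
        r->tag   = tag;
    }
    return r->data[offset];
}

int main(void) {
    /* the four reads from the walkthrough, then the two self-test reads */
    uint32_t addrs[] = { 0x00000014, 0x0000001C, 0x00000034,
                         0x00008014, 0x00000030, 0x0000001C };
    for (size_t i = 0; i < sizeof addrs / sizeof addrs[0]; i++)
        printf("  -> word %c\n", cache_read(addrs[i]));
    return 0;
}

Running it should reproduce the lecture's sequence: miss (b), hit (d), miss (f), miss with replacement (j), then hit (e) and miss with replacement (d) for the self-test addresses.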