
Caches II
CSE 410 Winter 2017
Instructor: Justin Hsia
Teaching Assistants: Kathryn Chan, Kevin Bi, Ryan Wong, Waylon Huang, Xinyu Sui

IBM's hub for wearables could shorten your hospital stay
"Researchers at IBM have developed a hub for wearables that can gather information from multiple wearable devices. The gadget, [called 'Chiyo'], funnels data from devices such as smart watches and fitness bands into the IBM Cloud. There, it's analyzed and the results are shared with the user and their doctor. IBM isn't planning to get into the wearables business. Instead, it plans to offer the service as a platform on which other companies can build their own health services."
http://www.pcworld.com/article/3753/wearables/ ibms hub for wearables could have you out of the hospital faster.html

Administrivia
Lab 3 due Thursday (2/23)
Homework 4 released today (Structs, Caches)
Mid-Quarter Survey Feedback
  Pace is moderate to a bit too fast, and the course is more than 3 units of work
  You talk too fast in lecture (or rush at the end), and I wish there were more peer instruction questions
  Canvas quiz answer keys are annoying, but instant homework feedback is great
  Sections: shorten discussions and lengthen explanations

Review: Example Memory Hierarchy
From smaller, faster, costlier per byte (top) to larger, slower, cheaper per byte (bottom):
  registers: CPU registers hold words retrieved from L1 cache
  on-chip L1 cache (SRAM): holds cache lines retrieved from L2 cache
  off-chip L2 cache (SRAM): holds cache lines retrieved from main memory
  main memory (DRAM): holds disk blocks retrieved from local disks
  local secondary storage (local disks): holds files retrieved from disks on remote network servers
  remote secondary storage (distributed file systems, web servers)

Making memory accesses fast!
Cache basics
Principle of locality
Memory hierarchies
Cache organization
  Direct mapped (sets; index + tag)
  Associativity (ways)
  Replacement policy
  Handling writes
Program optimizations that consider caches

Cache Organization (1)
Note: The textbook uses "B" for block size and "b" for offset bits.
Block size: unit of transfer between the cache and memory
  Given in bytes and always a power of 2 (e.g. 64 B)
  Blocks consist of adjacent bytes (differ in address by 1)
    Spatial locality!
Offset field
  The low-order log2(block size) bits of the address tell you which byte within a block
    (address) mod 2^n = the n lowest bits of the address
    (address) modulo (# of bytes in a block)
Address layout (the address refers to a byte in memory):
  high-order bits: block number
  low-order log2(block size) bits: block offset
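The block number / block offset split above can be sketched in a few lines. This is a minimal illustration, not course code: the function name `split_block_address` and the sample values are assumptions made for the example.

```python
def split_block_address(address, block_size):
    """Return (block_number, block_offset) for a byte address.

    Hypothetical helper for illustration; block_size must be a power of 2.
    """
    if block_size <= 0 or block_size & (block_size - 1):
        raise ValueError("block size must be a power of 2")
    offset_bits = block_size.bit_length() - 1    # log2(block_size)
    block_number = address >> offset_bits        # remaining high-order bits
    block_offset = address & (block_size - 1)    # low-order bits, same as address % block_size
    return block_number, block_offset


# With 64 B blocks, byte address 200 is byte 8 of block 3.
print(split_block_address(200, 64))
```

Because the block size is a power of 2, the shift and mask give the same result as integer division and modulo, which is why the hardware can simply read the offset off the low-order address bits.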

Cache Organization (2)
Cache size: amount of data the cache can store
  Cache can only hold so much data (subset of next level)
  Given in bytes or in number of blocks (cache size / block size)
  Example: cache size = 32 KiB = 512 blocks if using 64 B blocks
Where should data go in the cache?
  We need a mapping from memory addresses to specific locations in the cache to make checking the cache for an address fast
What is a data structure that provides fast lookup?
  Hash table!

Review: Hash Tables for Fast Lookup
Insert: 5 27 34 2 9
Apply hash function to map data to buckets
Figure: a hash table with buckets numbered 0 through 9

Place Data in Cache by Hashing Address
Figure: memory blocks (block address, block data) mapped to cache lines (index, block data). Here block size = 4 B and cache size / block size = 4 blocks.
Map to cache index from block address
  Use the next log2(cache size / block size) address bits above the offset
  (block address) mod (# blocks in cache)
Lets adjacent blocks fit in cache simultaneously!
  Consecutive blocks go in consecutive cache indices
Collision!
  Different memory blocks can map to the same cache index
  This might confuse the cache later when we access the data
  Solution?
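The mapping and the collision it produces can be reproduced with a short sketch. It assumes the slide's configuration of 4 B blocks and a 4-block cache; the names `BLOCK_SIZE`, `NUM_BLOCKS`, and `cache_index` are illustrative, not taken from the course.

```python
BLOCK_SIZE = 4   # bytes per block (assumed from the slide's example)
NUM_BLOCKS = 4   # cache size / block size (assumed from the slide's example)


def cache_index(address):
    """Hash a byte address to a cache index: (block address) mod (# blocks in cache)."""
    block_address = address // BLOCK_SIZE
    return block_address % NUM_BLOCKS


# Consecutive blocks land in consecutive indices...
for address in (0, 4, 8, 12):
    print(address, "->", cache_index(address))

# ...but the fifth block wraps around and collides with the first.
print(16, "->", cache_index(16))
```

The wrap-around at address 16 is the collision the slide points out: the index alone cannot tell block 0 from block 4.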

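Tags Differentiate Blocks in Same Index
Figure: the same memory-to-cache mapping, with each cache line now storing a tag next to its block data. Here block size = 4 B and cache size / block size = 4 blocks.
Tag = rest of address bits
  tag bits = address width - index bits - offset bits
  Check this during a cache lookup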

Checking for a Requested Address
CPU sends address request for chunk of data
  Address and requested data are not the same thing!
  Analogy: your friend is not the same thing as his or her phone number
TIO address breakdown:
  Address layout, from high-order to low-order bits: Tag | Index | Offset
  Tag and Index together form the block number
  Index field tells you where to look in cache
  Tag field lets you check that data is the block you want
  Offset field selects specified start byte within block
Note: tag and index sizes will change based on hash function
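A minimal sketch of the tag / index / offset breakdown follows. The helper name `split_tio` and the sample address are assumptions for illustration; the field widths correspond to a cache with 4 B blocks and 4 blocks, so 2 offset bits and 2 index bits.

```python
def split_tio(address, index_bits, offset_bits):
    """Split a byte address into (tag, index, offset) fields.

    Hypothetical helper: the offset is the lowest offset_bits bits,
    the index is the next index_bits bits, and the tag is everything above.
    """
    offset = address & ((1 << offset_bits) - 1)
    index = (address >> offset_bits) & ((1 << index_bits) - 1)
    tag = address >> (offset_bits + index_bits)
    return tag, index, offset


# Address 27 is 0b1_10_11: tag 1, index 2, offset 3.
tag, index, offset = split_tio(27, index_bits=2, offset_bits=2)
print(tag, index, offset)
```

Concatenating the three fields in order gives back the original address, which is a quick way to check a hand-computed breakdown.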

Cache Puzzle #1
Vote at http://pollev.com/justinh
Based on the following behavior, which of the following block sizes is NOT possible for our cache?
  Cache starts empty, also known as a cold cache
  Access (addr: hit/miss) stream: (14: miss), (15: hit), (16: miss)
A. 4 bytes
B. 8 bytes
C. 16 bytes
D. 32 bytes
E. We're lost

Direct Mapped Cache
Figure: memory blocks (block address, block data) mapped to cache lines (index, tag, block data). Here block size = 4 B and cache size / block size = 4 blocks.
Hash function: (block address) mod (# of blocks in cache)
  Each memory address maps to exactly one index in the cache
  Fast (and simpler) to find an address

Direct Mapped Cache Problem
Figure: the same direct-mapped cache, with the contents of the contested cache line shown as unknown. Here block size = 4 B and cache size / block size = 4 blocks.
What happens if we access the following addresses?
  8, 24, 8, 24, 8, ...?
  Conflict in cache (misses!)
  Rest of cache goes unused
Solution?
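The conflict can be checked with a small simulation. This is a sketch under the slide's stated configuration (4 B blocks, 4 cache lines); the function `simulate_direct_mapped` is a hypothetical helper that tracks tags only, not block data.

```python
def simulate_direct_mapped(addresses, block_size=4, num_blocks=4):
    """Return a 'hit'/'miss' result for each byte address, starting from a cold cache."""
    tags = [None] * num_blocks           # None marks an invalid (empty) line
    results = []
    for address in addresses:
        block_address = address // block_size
        index = block_address % num_blocks
        tag = block_address // num_blocks
        if tags[index] == tag:
            results.append("hit")
        else:
            results.append("miss")
            tags[index] = tag            # evict whatever was there; no choice of victim
    return results


# Addresses 8 and 24 both map to index 2, so they keep evicting each other.
print(simulate_direct_mapped([8, 24, 8, 24, 8]))
```

Every access misses even though three of the four cache lines are never touched, which is the motivation for associativity.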

Associativity
What if we could store data in any place in the cache?
  More complicated hardware = more power consumed, slower
So we combine the two ideas:
  Each address maps to exactly one set
  Each set can store a block in more than one way
For a cache of 8 blocks:
  1-way: 8 sets, 1 block each (direct mapped)
  2-way: 4 sets, 2 blocks each
  4-way: 2 sets, 4 blocks each
  8-way: 1 set, 8 blocks (fully associative)

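Cache Organization (3)
Note: The textbook uses "b" for offset bits.
Associativity: # of ways for each set
  A cache with n ways per set is called an "n-way set associative cache"
  We now index into cache sets, of which there are (cache size / block size / associativity)
  Use the lowest log2(cache size / block size / associativity) bits of the block address as the index
    Direct mapped: associativity = 1, so index bits = log2(cache size / block size) as we saw previously
    Fully associative: associativity = cache size / block size, so index bits = 0
Address fields, from high-order to low-order bits:
  Tag: used for tag comparison
  Index: selects the set
  Offset: selects the byte from block
Decreasing associativity leads toward direct mapped (only one way); increasing associativity leads toward fully associative (only one set).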

Example Placement
Where would data from address 0x1833 be placed?
  Binary: 0b 0001 1000 0011 0011
  Block size: 16 B
  Capacity: 8 blocks
  Address: 16 bits
Field widths:
  offset bits = log2(block size)
  index bits = log2(cache size / block size / associativity)
  tag bits = address width - index bits - offset bits
Address layout, from high-order to low-order bits: Tag | Index | Offset
  Direct mapped (8 sets): index bits = 3
  2-way set associative (4 sets): index bits = 2
  4-way set associative (2 sets): index bits = 1
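The placement for each organization can be worked through with a short sketch. The concrete values used here (address 0x1833, 16 B blocks, an 8-block cache) are an assumed reading of the slide's example, and `placement` is a hypothetical helper, not course code.

```python
def placement(address, block_size, num_blocks, associativity):
    """Return (index_bits, set_index, tag) for a byte address.

    Assumes block_size, num_blocks, and associativity are powers of 2.
    """
    num_sets = num_blocks // associativity
    offset_bits = block_size.bit_length() - 1   # log2(block size)
    index_bits = num_sets.bit_length() - 1      # log2(number of sets)
    block_address = address >> offset_bits
    set_index = block_address % num_sets        # lowest index_bits bits of the block address
    tag = block_address >> index_bits           # remaining high-order bits
    return index_bits, set_index, tag


ADDRESS = 0x1833  # assumed example address
for ways in (1, 2, 4):
    index_bits, set_index, tag = placement(ADDRESS, block_size=16, num_blocks=8, associativity=ways)
    print(f"{ways}-way: index bits = {index_bits}, set = {set_index}, tag = {tag:#x}")
```

Each doubling of associativity halves the number of sets, so one bit moves from the index field into the tag field while the offset stays fixed.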

Block Replacement
Any empty block in the correct set may be used to store the block
If there are no empty blocks, which one should we replace?
  No choice for direct mapped caches
  Caches typically use something close to least recently used (LRU)
    (hardware usually implements "not most recently used")
Figure: the direct mapped, 2-way set associative, and 4-way set associative layouts of an 8-block cache (set, tag, data)
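A single set with true LRU replacement can be sketched as follows. This models exact LRU for clarity, whereas the slide notes that hardware usually settles for an approximation; the class name `LRUSet` and its interface are assumptions for the example.

```python
from collections import OrderedDict


class LRUSet:
    """One cache set holding up to `ways` tags, with least-recently-used eviction."""

    def __init__(self, ways):
        self.ways = ways
        self.lines = OrderedDict()   # tags ordered from least to most recently used

    def access(self, tag):
        if tag in self.lines:
            self.lines.move_to_end(tag)          # mark as most recently used
            return "hit"
        if len(self.lines) >= self.ways:
            self.lines.popitem(last=False)       # no empty line left: evict the LRU tag
        self.lines[tag] = None
        return "miss"


# A 2-way set: tag 2 evicts tag 1 (least recently used), then tag 1 evicts tag 0.
cache_set = LRUSet(ways=2)
print([cache_set.access(tag) for tag in (0, 1, 0, 2, 1)])
```

With `ways=1` the set degenerates to a direct-mapped line, where every miss evicts the only resident block and there is no choice to make.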

Cache Puzzle #2
What can you infer from the following behavior?
  Cache starts empty, also known as a cold cache
  Access (addr: hit/miss) stream: (: miss), (2: miss), (: miss)
Associativity?
Number of sets?

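General Cache Organization
Figure: the cache drawn as a grid of sets, each set holding several lines. A line is a block plus management bits (valid bit V and tag), and the block holds bytes 0 through K-1, where K is the number of bytes per block.
  Number of sets = 2^(index bits)
  Blocks/lines per set = associativity
  Bytes per block = block size
Cache size = (number of sets) x (lines per set) x (bytes per block) data bytes (doesn't include V or Tag)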

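Notation Review
We just introduced a lot of new variable names!
  Please be mindful of block size notation when you look at past exam questions or are watching videos
Variables:
  Block size ("B" in book)
  Cache size
  Associativity
  Number of sets
  Address space
  Address width
  Tag field width
  Index field width
  Offset field width ("b" in book)
Formulas:
  address space = 2^(address width), so address width = log2(address space)
  number of sets = 2^(index field width), so index field width = log2(number of sets)
  block size = 2^(offset field width), so offset field width = log2(block size)
  cache size = block size x associativity x number of sets
  index field width = log2(cache size / block size / associativity)
  address width = tag field width + index field width + offset field width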