Marten van Dijk, Syed Kamran Haider, Chenglu Jin, Phuong Ha Nguyen. Department of Electrical & Computer Engineering, University of Connecticut


CSE 5095 & ECE 4451 & ECE 5451, Spring 2017
Lecture 5a: Caching Review
Marten van Dijk, Syed Kamran Haider, Chenglu Jin, Phuong Ha Nguyen
Department of Electrical & Computer Engineering, University of Connecticut

Material taken from:
1. Cache and Virtual Memory Replacement Algorithms, slides by Michael Smaili, Spring 2008
2. Cache Replacement Policies, Mikko H. Lipasti, Spring 2012
3. Design and Implementation of the AEGIS Single-Chip Secure Processor Using Physical Random Functions, slides by G. E. Suh, C. W. O'Donnell, I. Sachdev, and S. Devadas

Memory Layout

Memory Hierarchy
Provide memories of various speeds and sizes at different points in the system, with a memory management scheme that moves data between levels: items used most often are kept in the faster levels, items seldom used in the lower levels.
Cache: a small, fast buffer between the CPU and main memory that holds the most recently accessed data.
Virtual Memory: programs and data are assigned addresses independent of the amount of physical main memory actually available and of the location from which the program will actually execute.
Hit ratio: the probability that the next memory access is found in the cache.
Miss rate: 1.0 - hit ratio.
Primary goal of a cache: increase speed. Primary goal of virtual memory: increase space.
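To make the hit-ratio and miss-rate definitions concrete, here is a small Python sketch computing the average access time of a two-level hierarchy. The latencies and hit ratio below are illustrative assumptions, not numbers from the slides.

```python
def effective_access_time(hit_ratio, t_cache, t_mem):
    """Average time per access: hits are served at cache speed,
    misses pay the main-memory latency."""
    miss_rate = 1.0 - hit_ratio  # miss rate = 1.0 - hit ratio
    return hit_ratio * t_cache + miss_rate * t_mem

# Assumed numbers: 95% hit ratio, 1 ns cache, 100 ns main memory.
print(effective_access_time(0.95, 1.0, 100.0))  # ~5.95 ns
```

Even a 5% miss rate dominates the average here, which is why the hierarchy's primary cache goal is speed.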

Cache Design
Key issues are:
Placement: where can a block of memory go?
Identification: how do I find a block of memory?
Replacement: how do I make space for new blocks?
Write policy: how do I propagate changes?
We consider these for caches (usually SRAM), but they also apply to main memory and disks.

Placement and Identification

Fully Associative Mapping
A main memory block can map into any block in the cache.
[Figure: main memory blocks mapping to any cache block; italicized fields are stored in memory.]

Pros/Cons
Advantages: no contention; easy to implement (compare the address tag against every stored tag).
Disadvantages: very expensive; very wasteful of cache storage, since the full primary-memory address (tag) must be stored with each block.
[Figure: 32-bit address split into tag and offset; the tag check against the SRAM cache tags signals a hit and selects the data out.]

Direct Mapping
Store the higher-order tag bits along with the data in the cache; the index bits select the cache block.
[Figure: main memory blocks mapping to cache blocks by index bits; italicized fields are stored in memory.]

Pros/Cons
Advantages: low cost (doesn't require an associative memory in hardware); uses less cache space.
Disadvantages: contention between main memory blocks that share the same index bits.
[Figure: 32-bit address split into index and offset; the index selects the SRAM cache block that drives data out.]
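The tag/index/offset split shown in the address diagrams can be sketched in Python. The block size and set count below are assumptions chosen for illustration.

```python
def split_address(addr, offset_bits, index_bits):
    """Split a 32-bit address into (tag, index, offset) fields
    for a direct-mapped cache."""
    offset = addr & ((1 << offset_bits) - 1)           # byte within block
    index = (addr >> offset_bits) & ((1 << index_bits) - 1)  # which cache block
    tag = addr >> (offset_bits + index_bits)           # stored alongside data
    return tag, index, offset

# Assumed geometry: 64-byte blocks (6 offset bits), 1024 blocks (10 index bits).
tag, index, offset = split_address(0x12345678, 6, 10)
print(hex(tag), index, offset)  # 0x1234 345 56
```

Two addresses that agree in the index bits but differ in the tag contend for the same cache block, which is exactly the direct-mapping disadvantage above.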

Set Associative Mapping
Puts a fully associative cache within a direct-mapped cache: the index selects a set, and the block may go in any of that set's a ways.
[Figure: main memory blocks mapping into cache sets by index bits; italicized fields are stored in memory.]

Pros/Cons
An intermediate compromise between fully associative and direct mapping:
Not as expensive and complex as a fully associative approach.
Not as much contention as in a direct-mapped approach.
[Figure: 32-bit address split into tag, index, and offset; the index selects a set of a tags and a data blocks, the a tag comparators (?=) check in parallel, and a match selects the data out.]
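A minimal sketch of the tag check in an a-way set. Hardware compares all a tags in parallel; the loop below is the sequential equivalent, and the cache contents are made up for illustration.

```python
def lookup(sets, index, tag):
    """Check every (valid, tag) pair in the selected set.
    Returns the way number on a hit, or None on a miss."""
    for way, (valid, stored_tag) in enumerate(sets[index]):
        if valid and stored_tag == tag:
            return way
    return None

# A tiny 2-way cache with 4 sets; one block pre-installed for the demo.
sets = [[(False, 0), (False, 0)] for _ in range(4)]
sets[2][1] = (True, 0x1234)
print(lookup(sets, 2, 0x1234))  # hit in way 1
print(lookup(sets, 2, 0x5678))  # miss -> None
```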

Replacement
How do we choose a victim? (Verbs: victimize, evict, replace, cast out.)
Many policies are possible:
FIFO (first-in, first-out)
LRU (least recently used), pseudo-LRU
LFU (least frequently used)
NMRU (not most recently used)
NRU (not recently used)
Pseudo-random (yes, really!)
Optimal
Etc.

LRU
For a = 2, LRU is equivalent to NMRU: a single bit per set indicates which line is LRU/MRU, set or cleared on each access.
For a > 2, LRU is difficult/expensive:
Timestamps? How many bits? Must find the minimum timestamp on each eviction.
Sorted list? Re-sort on every access? List overhead: log2(a) bits per block.
Shift-register implementation.

LRU Implementation
Keep an LRU counter for each line in a set.
When a line is accessed:
Get the old value X of its counter.
Set its counter to the maximum value.
For every other line in the set, if its counter is larger than X, decrement it.
When a replacement is needed, select the line whose counter is 0.
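The counter scheme above can be sketched directly in Python (a model of the hardware behavior, not an implementation of it):

```python
class LRUCounters:
    """Per-line LRU counters as described: an accessed line's counter
    jumps to the maximum; counters larger than its old value decrement;
    the line whose counter is 0 is the victim."""
    def __init__(self, a):
        self.counters = list(range(a))  # distinct values 0..a-1

    def access(self, line):
        x = self.counters[line]                 # old value X
        for i in range(len(self.counters)):
            if self.counters[i] > x:            # every larger counter
                self.counters[i] -= 1           # ...decrements
        self.counters[line] = len(self.counters) - 1  # accessed line -> max

    def victim(self):
        return self.counters.index(0)           # counter 0 is LRU

lru = LRUCounters(4)
for line in [0, 1, 2, 3, 0]:
    lru.access(line)
print(lru.victim())  # line 1 is now least recently used
```

Note the cost this slide is pointing at: every access touches all a counters, which is why true LRU gets expensive as associativity grows.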

Practical Pseudo-LRU
Rather than true LRU, use a binary tree: each node records which half of its subtree is older/newer.
Update the nodes on each reference; follow the "older" pointers to find the LRU victim.
[Figure: a binary tree over 8 ways (J, F, C, B, X, Y, A, Z) with node bits 0/1 marking the older and newer halves.]
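One common encoding of this tree is a sketch like the following, assuming an 8-way set with the 7 node bits stored in heap order (the slide's exact bit layout may differ):

```python
class TreePLRU:
    """Binary-tree pseudo-LRU for an 8-way set: 7 node bits, each
    recording which half of its subtree is older (0 = left, 1 = right)."""
    def __init__(self):
        self.bits = [0] * 7  # node i's children are 2i+1 and 2i+2

    def access(self, way):
        # Walk root -> leaf; point each node at the half NOT containing
        # the accessed way, since that half is now the older one.
        node = 0
        for shift in (2, 1, 0):
            bit = (way >> shift) & 1
            self.bits[node] = 1 - bit
            node = 2 * node + 1 + bit

    def victim(self):
        # Follow the 'older' pointers down to a leaf; evict that way.
        node = 0
        for _ in range(3):
            node = 2 * node + 1 + self.bits[node]
        return node - 7  # leaves 7..14 map to ways 0..7

plru = TreePLRU()
for way in range(8):
    plru.access(way)
print(plru.victim())  # way 0: the least recently touched way
```

This needs only 7 bits per 8-way set and 3 bit-flips per access, versus a full counter per line for true LRU; the price is that the victim is only approximately least-recently-used.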

True LRU Shortcomings
Streaming data/scans: x0, x1, ..., xn with effectively no temporal reuse.
Thrashing: reuse distance > a; temporal reuse exists, but LRU fails.
All blocks march from MRU to LRU, pushing out other conflicting blocks; for n > a, no useful blocks remain after the scan/thrash, so many conflict misses are incurred after the scan ends.
Pseudo-LRU sometimes helps a little.
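A quick simulation illustrates the thrashing case: under true LRU, a cyclic working set just one block larger than the associativity misses on every access. This sketch models a single a-way set (equivalently, a small fully associative cache).

```python
from collections import OrderedDict

def misses(trace, a):
    """Count misses for a trace of block numbers in one a-way,
    true-LRU set, using an OrderedDict as the recency list."""
    cache = OrderedDict()
    count = 0
    for block in trace:
        if block in cache:
            cache.move_to_end(block)   # hit: becomes MRU
        else:
            count += 1
            if len(cache) == a:
                cache.popitem(last=False)  # evict the LRU block
            cache[block] = True
    return count

# A working set of 4 blocks fits a 4-way set: only cold misses.
print(misses([0, 1, 2, 3] * 4, 4))     # 4 misses
# A cyclic working set of 5 blocks thrashes it: every access misses.
print(misses([0, 1, 2, 3, 4] * 4, 4))  # 20 misses
```

The second trace has plenty of temporal reuse (every block returns after 4 intervening accesses), but its reuse distance of 5 exceeds a = 4, so LRU evicts each block just before it is needed again.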

Write Policy
Do we allocate cache lines on a write?
Write-allocate: a write miss brings the block into the cache.
No-write-allocate: a write miss leaves the cache as it was.
Do we update memory on writes?
Write-through: memory is immediately updated on each write.
Write-back: memory is updated only when the line is replaced.

Write-Back Caches
Need a dirty bit for each line: a dirty line has more recent data than memory.
A line starts clean (not dirty) and becomes dirty on the first write to it; memory is not updated yet, so the cache holds the only up-to-date copy of a dirty line's data.
Replacing a dirty line: must write its data back to memory (the write-back).
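The dirty-bit protocol can be sketched for a single line. Write-back with write-allocate is assumed, and `tag` stands in for the block's address; this is a model of the bookkeeping, not a full cache.

```python
class WriteBackCacheLine:
    """One cache line with a dirty bit; data goes back to memory only
    when a dirty line is replaced."""
    def __init__(self):
        self.tag = None
        self.dirty = False
        self.writebacks = 0  # counts write-backs to memory

    def fill(self, tag):
        if self.tag is not None and self.dirty:
            self.writebacks += 1  # dirty victim: write its data back first
        self.tag = tag
        self.dirty = False        # a freshly filled line starts clean

    def write(self, tag):
        if self.tag != tag:
            self.fill(tag)        # write-allocate: the miss brings the block in
        self.dirty = True         # line becomes dirty on the first write

line = WriteBackCacheLine()
line.write(0xA)  # miss: allocate, line becomes dirty
line.write(0xA)  # hit: no memory traffic at all
line.write(0xB)  # miss: dirty 0xA must be written back first
print(line.writebacks)  # 1
```

Contrast with write-through: the same three writes would have updated memory three times instead of once.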

Prefetching
Predict future misses and fetch the data into the cache ahead of time.
If the access does happen, we now have a hit (or a partial miss, if the data is still on the way).
If the access does not happen, we get cache pollution: we replaced useful data with junk we don't need.
To avoid pollution (a big problem for small caches), use prefetch buffers:
Keep a small separate buffer for prefetched data.
When we do access it, move the data into the cache.
If we never access it, the cache is not polluted.
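A sketch of the prefetch-buffer idea, assuming a simple next-sequential-block predictor and a made-up buffer size; eviction from the cache itself is not modeled.

```python
class PrefetchBuffer:
    """Next-line prefetcher with a small separate buffer, so wrong
    guesses never pollute the cache: a prefetched block moves into
    the cache only when it is actually referenced."""
    def __init__(self, cache, capacity=4):
        self.cache = cache       # set of blocks resident in the cache
        self.buffer = []         # FIFO of prefetched-but-unused blocks
        self.capacity = capacity

    def access(self, block):
        hit = block in self.cache
        if not hit and block in self.buffer:
            self.buffer.remove(block)
            self.cache.add(block)   # promote only on real use
            hit = True
        elif not hit:
            self.cache.add(block)   # demand miss fills the cache directly
        nxt = block + 1             # predict the next sequential block
        if nxt not in self.cache and nxt not in self.buffer:
            if len(self.buffer) == self.capacity:
                self.buffer.pop(0)  # dropped guess never touched the cache
            self.buffer.append(nxt)
        return hit

pb = PrefetchBuffer(set())
print([pb.access(b) for b in [0, 1, 2]])  # [False, True, True]
```

On a sequential scan, only the first access misses; a wrong guess simply ages out of the buffer without evicting anything from the cache.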

Advanced Topics (not covered)
Inclusive and exclusive caches (in a cache hierarchy).
Blocking and non-blocking caches.
Etc.