Fig 7.30 The Cache Mapping Function. Memory Fields and Address Translation


Fig 7.30 The Cache Mapping Function

(Figure: the CPU issues a main memory address, split into block and word fields; the mapping function locates the referenced block in the cache.)

The cache mapping function is responsible for all cache operations:

- Placement strategy: where to place an incoming block in the cache
- Replacement strategy: which block to replace upon a miss
- Read and write policy: how to handle reads and writes upon cache misses

The mapping function must be implemented in hardware. (Why? Because it is applied on every memory reference, so a software implementation would be far too slow.)

There are three different types of mapping functions:

- Associative
- Direct mapped
- Block-set associative

Memory Fields and Address Translation

A processor-issued virtual address is partitioned into two fields, a block field and a word field. The word field represents the offset into the block specified by the block field. Example of a specific reference: block 9, with the word field selecting a word within that block.

Computer Systems Design and Architecture by V. Heuring and H. Jordan, 1997.
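The block/word partitioning can be sketched in a few lines. The field widths below (a 16-bit address with 8-word blocks, giving a 3-bit word field) are illustrative assumptions, since the slide's exact figures were lost in transcription:

```python
# Sketch of block/word address partitioning.
# Assumed widths: 16-bit word address, 8 words per block (3-bit word field).
WORD_BITS = 3           # log2(words per block)

def split_address(addr):
    """Return (block_number, word_offset) for a word address."""
    word = addr & ((1 << WORD_BITS) - 1)    # low-order bits: offset in block
    block = addr >> WORD_BITS               # high-order bits: block number
    return block, word

block, word = split_address(0b0000000001001_101)
print(block, word)   # -> 9 5  (block 9, word 5)
```

The same shift-and-mask structure applies to every mapping scheme that follows; only how the block field is further subdivided changes.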

Fig 7.31 Associative Mapped Cache

Associative mapped cache model: any block from main memory can be placed anywhere in the cache. The main memory address is divided into a block (tag) field and a byte field. (The examples assume an unrealistically small main memory address, which simplifies the figures.)

Fig 7.32 Associative Cache Mechanism

Because any block can reside anywhere in the cache, an associative (content-addressable) memory is used, and all tag locations are searched simultaneously. The tag of the incoming main memory address is placed in the argument register and compared against every stored tag in parallel; the match bit of the matching line gates that line's block through the selector to the CPU.
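A software analogue of the mechanism above, as a minimal sketch: the list comprehension stands in for the hardware comparators that examine every tag at once (class name and sizes are illustrative, not from the slide):

```python
# Minimal sketch of a fully associative cache lookup.
# In hardware, all tag comparators fire simultaneously; here a
# comprehension over every line models that parallel search.
class AssociativeCache:
    def __init__(self, num_lines):
        self.lines = [None] * num_lines       # each entry: (tag, block_data)

    def lookup(self, tag):
        matches = [line for line in self.lines
                   if line is not None and line[0] == tag]
        return matches[0][1] if matches else None   # hit -> data, miss -> None

    def fill(self, tag, data):
        # Any free line will do; replacement policy is omitted for brevity.
        for i, line in enumerate(self.lines):
            if line is None:
                self.lines[i] = (tag, data)
                return

cache = AssociativeCache(4)
cache.fill(0x2F, b"block data")
print(cache.lookup(0x2F))   # hit -> b'block data'
print(cache.lookup(0x30))   # miss -> None
```

Note the cost this models: one comparator per line, which is exactly the "lots of hardware" drawback discussed next.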

Advantages and Disadvantages of the Associative Mapped Cache

Advantage:
- The most flexible of all: any MM block can go anywhere in the cache.

Disadvantages:
- Large tag memory.
- The need to search the entire tag memory simultaneously means a lot of hardware.
- Replacement policy is an issue when the cache is full (more later).

Q.: How is an associative search conducted at the logic gate level?

Direct-mapped caches simplify the hardware by allowing each MM block to go into only one place in the cache.

Fig 7.33 Direct-Mapped Cache

(Figure: main memory blocks arranged in groups, with one cache line per group; the main memory address is divided into tag, group, and byte fields.)

Key idea: all the MM blocks from a given group can go into only one location in the cache, corresponding to the group number. The cache therefore needs to examine only the single group that the reference specifies.

Fig 7.34 Direct-Mapped Cache Operation

1. Decode the group number of the incoming MM address to select the group.
2. Gate out the tag field of the selected cache line.
3. Compare the cache tag with the incoming tag using a comparator.
4. If they match AND the line is valid, it is a hit; otherwise it is a miss.
5. On a hit, gate out the cache line through the selector and use the word field to select the desired word.

Direct-Mapped Caches

The direct-mapped cache uses less hardware, but is much more restrictive in block placement. If two blocks from the same group are frequently referenced, the cache will thrash: it will repeatedly bring the two competing blocks into and out of the cache, degrading performance. The block replacement strategy, however, is trivial. The compromise is to allow several cache blocks in each group: the block-set-associative cache.
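The five steps above can be sketched as follows. The field widths (256 groups, 8-byte lines) are illustrative assumptions rather than the figure's exact values:

```python
# Sketch of a direct-mapped cache lookup.
# Assumed geometry: 256 groups (8-bit group field), 8-byte lines (3-bit byte field).
GROUP_BITS, BYTE_BITS = 8, 3
GROUPS = 1 << GROUP_BITS

class DirectMappedCache:
    def __init__(self):
        self.valid = [False] * GROUPS
        self.tags = [0] * GROUPS
        self.data = [None] * GROUPS    # one line (a bytes object) per group

    def split(self, addr):
        byte = addr & ((1 << BYTE_BITS) - 1)
        group = (addr >> BYTE_BITS) & (GROUPS - 1)
        tag = addr >> (BYTE_BITS + GROUP_BITS)
        return tag, group, byte

    def lookup(self, addr):
        tag, group, byte = self.split(addr)
        # A hit requires BOTH a valid line and a tag match, in one group only.
        if self.valid[group] and self.tags[group] == tag:
            return self.data[group][byte]
        return None   # miss: block must be fetched from main memory

    def fill(self, addr, line):
        tag, group, _ = self.split(addr)
        self.valid[group], self.tags[group], self.data[group] = True, tag, line

cache = DirectMappedCache()
cache.fill(0x1AB8, bytes(range(8)))   # install the 8-byte line for this block
print(cache.lookup(0x1ABD))           # same tag and group, byte offset 5 -> 5
```

Only one tag comparison is needed per access, which is why the hardware is so much cheaper than the associative design.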

Fig 7.35 Two-Way Set-Associative Cache

The example shows the same main memory groups, now with a set of two cache lines per group. This is referred to as a 2-way set-associative cache: any block from a given group may go into either line of the corresponding set. The main memory address is divided into tag, set, and byte fields.

Getting Specific: The Intel Pentium Cache

The Pentium actually has two separate caches, one for instructions and one for data. The Pentium issues 32-bit MM addresses. Each cache is 2-way set-associative and 8K = 8192 bytes in size, with 32 = 2^5 bytes per line. Thus there are 2 x 32 = 64 = 2^6 bytes per set, and therefore 8192/64 = 2^7 = 128 groups. This leaves 32 - 5 - 7 = 20 bits for the tag field:

  Tag (20) | Set/group (7) | Word (5)

This cache arithmetic is important, and deserves your mastery.
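The Pentium cache arithmetic above generalizes to any set-associative cache; a small helper (names are my own, not from the slide) makes the derivation explicit:

```python
# Derive (tag, set, word) field widths from cache parameters,
# following the arithmetic worked for the Pentium above.
from math import log2

def cache_fields(addr_bits, cache_bytes, ways, line_bytes):
    word_bits = int(log2(line_bytes))      # offset within a line
    set_bytes = ways * line_bytes          # bytes per set
    groups = cache_bytes // set_bytes      # number of sets (groups)
    set_bits = int(log2(groups))
    tag_bits = addr_bits - set_bits - word_bits
    return tag_bits, set_bits, word_bits

# Pentium: 32-bit addresses, 8 KB cache, 2-way, 32-byte lines.
print(cache_fields(32, 8 * 1024, 2, 32))   # -> (20, 7, 5)
```

All sizes must be powers of two for the bit-field decomposition to be exact, which is why real caches are built that way.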

Read and Write Policies

Read and write cache hit policies:
- Write-through: updates both the cache and MM upon each write.
- Write-back: updates only the cache; MM is updated only upon block removal. A dirty bit is set upon the first write to indicate that the block must be written back.

Read and write cache miss policies:
- Read miss: bring the block in from MM. Either forward the desired word as it is brought in, or wait until the entire line is filled and then repeat the cache request.
- Write miss:
  - Write-allocate: bring the block into the cache, then update it.
  - Write-no-allocate: write the word to MM without bringing the block into the cache.

Block Replacement Strategies

- Not needed with a direct-mapped cache.
- Least Recently Used (LRU): track usage with a counter per line. Each time a block is accessed: clear the counter of the accessed block; increment the counters with values less than the one accessed; leave all others unchanged. When the set is full, remove the line with the highest count.
- Random replacement: replace a block at random. Even random replacement is a fairly effective strategy.
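The counter-based LRU bookkeeping described above can be sketched for one set of a set-associative cache (4 ways assumed for illustration):

```python
# Sketch of counter-based LRU for one cache set (4 ways assumed).
# Invariant: the counters are always a permutation of 0..WAYS-1;
# the highest counter marks the least recently used line.
WAYS = 4

class LRUSet:
    def __init__(self):
        self.counters = list(range(WAYS))

    def access(self, way):
        old = self.counters[way]
        for w in range(WAYS):
            # Increment every counter smaller than the accessed block's...
            if self.counters[w] < old:
                self.counters[w] += 1
        # ...clear the accessed block's counter; all others stay unchanged.
        self.counters[way] = 0

    def victim(self):
        # The line with the highest count is the one to replace.
        return self.counters.index(max(self.counters))

s = LRUSet()
for w in (0, 1, 2, 3, 0):
    s.access(w)
print(s.victim())   # -> 1  (way 1 is now least recently used)
```

This scheme needs only log2(ways) bits per line, which is why counter (or equivalent ordering) hardware is practical for small set sizes.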

Cache Performance

Recall the access time for a two-level hierarchy with hit ratio h:

  t_a = h * t_p + (1 - h) * t_s

for the primary and secondary levels. With t_p = t_C for the cache and t_s = t_M for main memory:

  t_a = h * t_C + (1 - h) * t_M

We define the speedup S as S = T_without / T_with for a given process, where T_without is the time the process takes without the improvement (the cache, in this case) and T_with is the time it takes with the improvement. Given a model for the cache and MM access times and the cache line fill time, the speedup can be calculated once the hit ratio is known.

Fig 7.36 The PowerPC 601 Cache Structure

(Figure: a set of 8 lines per group, each line holding an address tag and two sectors; the physical address is divided into tag, line (set) #, and word # fields.)

The PPC 601 has a unified cache, that is, a single cache for both instructions and data. It is 32 KB in size, organized as 64 x 8 block-set associative, with 64-byte blocks organized as two independent 4-word sectors for convenience in the updating process: a cache line can be updated in two single-cycle operations of 4 words each. Normal operation is write-back, but write-through can be selected on a per-line basis via software. The cache can also be disabled via software.
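A worked instance of the access-time and speedup formulas above; the hit ratio and access times are illustrative values I have assumed, not figures from the slide:

```python
# Worked example of t_a = h*t_C + (1 - h)*t_M and S = T_without / T_with.
def access_time(h, t_c, t_m):
    """Average access time for hit ratio h, cache time t_c, MM time t_m."""
    return h * t_c + (1 - h) * t_m

t_c, t_m = 10.0, 100.0    # ns: assumed cache and main-memory access times
h = 0.95                  # assumed hit ratio

t_with = access_time(h, t_c, t_m)   # average access time with the cache
t_without = t_m                     # without a cache, every access goes to MM

print(t_with)               # -> 14.5 (ns)
print(t_without / t_with)   # speedup S, roughly 6.9
```

Note how strongly S depends on h: even a few percent more misses pulls t_a sharply toward the slow t_M term.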