Review: Computer Organization


Review: Computer Organization, Cache. Chansu Yu

Caches: The Basic Idea
A cache is a smaller set of storage locations storing a subset of the information from a larger storage. Typically it is SRAM in front of the DRAM main memory: Processor <-> Cache <-> Memory. Goal: decrease the average time for data access. Use: look in the cache for data first; look in the larger storage only if the data is not found in the cache. The cache is invisible to the programmer. There are multiple ways to organize a cache; we will see several. For simplicity, assume all memory accesses are word-sized. c.yu@csuohio.edu

Terminology
The cache holds code/data that will probably be accessed later. If a later access really finds the data in the cache: Hit. If it is not found in the cache: Miss. Hit time: the cache access time. Miss penalty: the memory access time. On a miss, bring the data into the cache, since it will probably be accessed again, and bring the following data along with it (the block or line size). This may require throwing away the oldest block in the cache (replacement).

Caches: Flowchart
Memory reference: is the block in the cache? Yes (hit): read the desired bytes from the block. No (miss): read the block from memory into the cache, then extract the desired bytes. Return the data to the CPU.

Hits vs. Misses
Read hits: this is what we want! Read misses: stall the CPU, fetch the block from memory, deliver it to the cache, restart. Write hits: either write the data to both the cache and memory (write-through), or write the data only into the cache and write it back to memory later (write-back). Write misses: read the entire block into the cache, then write the word.

Cache Performance
Describing performance: Hit time = time to access the cache on a hit. Usually, read hit time = write hit time. SRAM caches: a few cycles. Miss penalty = additional time to access memory on a miss. Usually, read miss penalty ≈ write miss penalty. DRAM main memory: tens to hundreds of cycles. Hit ratio = #hits / #accesses. Miss ratio = #misses / #accesses. Measuring performance: lots of benchmarking & simulations.
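The quantities above combine into the usual average memory access time. As a minimal sketch (the cycle counts below are illustrative assumptions, not values from the slides):

```python
def amat(hit_time, miss_ratio, miss_penalty):
    """Average memory access time, in cycles:
    AMAT = hit time + miss ratio * miss penalty."""
    return hit_time + miss_ratio * miss_penalty

# Assumed example: 2-cycle hit, 5% miss ratio, 100-cycle miss penalty.
print(amat(2, 0.05, 100))  # 2 + 0.05 * 100 = 7.0 cycles
```

Note how a small miss ratio still dominates AMAT when the miss penalty is large; this is why improving the hit ratio matters so much.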

Cache Block Placement
Two issues: How do we know if a data item is in the cache? If it is, how do we find it? Our first example (with decimal numbers, which is not the actual case): memory is divided into fixed-size blocks, and the cache holds a smaller number of those blocks, direct mapped.

Direct-mapped Block Placement
(Figure: memory blocks mapped onto the blocks of a direct-mapped cache; accessing an address selects exactly one cache block.)

Decoding
The memory address splits into an offset within the block and a block index identifying the original memory block. Given a memory address: extract the block index, then check whether the indexed cache block corresponds to the requested memory block. For this, each cache entry remembers a tag alongside the data. If the tag matches the first digit of the memory address, HIT; then extract the byte within the block using the offset.

Direct-mapped Block Placement
(Figure: direct-mapped placement of memory blocks into the cache.)
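The decoding steps above can be sketched in code. The geometry here (16-byte blocks, 8 cache blocks, binary rather than the slides' decimal example) is an assumption for illustration:

```python
# Assumed geometry: 16-byte blocks -> 4 offset bits; 8 blocks -> 3 index bits.
BLOCK_SIZE = 16
NUM_BLOCKS = 8

def decode(addr):
    """Split an address into (tag, block index, byte offset)."""
    offset = addr % BLOCK_SIZE
    index = (addr // BLOCK_SIZE) % NUM_BLOCKS
    tag = addr // (BLOCK_SIZE * NUM_BLOCKS)
    return tag, index, offset

print(decode(0x1234))  # the tag, index, and offset fields of 0x1234
```

A hit then means: the entry at `index` is valid and its stored tag equals `tag`.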

Exercise: Make connections!
(Figure: direct-mapped block placement exercise.)

Exercise: Find them!
(Figure: direct-mapped block placement exercise.)

Decoding (a miss)
Given a memory address: extract the block index and check whether the indexed block corresponds to the requested range of memory. If the stored tag does not match the first digit of the memory address, MISS. So, what happens when a miss occurs? The indexed cache block currently holds a different range of memory: read the requested range from memory, replace the contents of the block, and change the tag accordingly.

Decoding (a hit?)
Given another memory address: extract the block index and check the tag. The tag matches, and therefore HIT. However, the tags could match simply because the cache is initially empty and the entry was never filled. That is not a real HIT! => We need a valid bit.
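The miss handling and the valid bit described above can be combined into a small lookup sketch. The sizes are again illustrative assumptions:

```python
# Assumed geometry: 16-byte blocks, 8 direct-mapped cache blocks.
BLOCK_SIZE = 16
NUM_BLOCKS = 8

# Each entry: (valid, tag). All entries start invalid (cold cache).
cache = [(False, None)] * NUM_BLOCKS

def access(addr):
    """Return 'hit' or 'miss'; on a miss, install the block's tag."""
    index = (addr // BLOCK_SIZE) % NUM_BLOCKS
    tag = addr // (BLOCK_SIZE * NUM_BLOCKS)
    valid, stored_tag = cache[index]
    if valid and stored_tag == tag:
        return "hit"
    cache[index] = (True, tag)   # replace the block, update the tag
    return "miss"

print(access(0x100))  # cold cache -> miss
print(access(0x104))  # same block -> hit
print(access(0x900))  # same index, different tag -> conflict miss
```

The valid bit is what distinguishes a genuine tag match from a match against an uninitialized entry.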

Exercise: What is invalid?
(Figure: direct-mapped cache where each entry holds a tag of one digit (in fact, one bit is enough in this example) and a one-bit valid field.)

Handling Writes
Write strategy. On a write hit: Write-through writes the data to both the cache and the memory. Write-back writes the data only to the cache; the modified (dirty) block is written to memory when it is replaced, which requires a dirty bit for each block (more complexity). Write-back gives better performance but a semantic problem: memory can temporarily hold stale data. On a write miss: Write allocate (fetch-on-write) naturally pairs with write-back. No write allocate (write-around) naturally pairs with write-through.
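The write-hit policies above can be contrasted with a toy sketch: a single-block "cache" with a dirty bit in front of a one-word "memory" (all names here are illustrative stand-ins, not the slides' notation):

```python
memory = {0: 10}                          # toy main memory, one word
cache = {"data": 10, "dirty": False}      # toy one-block cache

def write(value, policy):
    cache["data"] = value
    if policy == "write-through":
        memory[0] = value                 # memory updated immediately
    else:                                 # write-back
        cache["dirty"] = True             # memory updated only on eviction

def evict():
    if cache["dirty"]:
        memory[0] = cache["data"]         # flush the dirty block
        cache["dirty"] = False

write(42, "write-back")
print(memory[0])   # still 10: memory is stale until the block is evicted
evict()
print(memory[0])   # now 42
```

This makes the "semantic problem" concrete: under write-back, memory and cache disagree between the write and the eviction.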

Single-Word Cache Block: Exercise
Questions: What is the address size (# bits for a memory address)? The block size (# bits for the byte offset)? The number of cache blocks (# bits for the block index)? How does a memory address map to a cache address? Which (and how many) memory blocks are candidates for a given cache block? What is the tag size? How is the memory address decomposed?
(Figure: direct-mapped cache datapath showing the address bit positions, the byte offset, the index into a valid/tag/data array, the tag comparison producing Hit, and the data output.)

Exercise: Multiword Cache Block
(Figure: multiword-block cache datapath with byte offset, block offset, and index fields, a valid/tag/data array, a tag comparator producing Hit, and a mux selecting the requested word.)
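The exercise's bit-counting questions follow mechanically from the cache geometry. A sketch under assumed sizes (32-bit addresses, 16-byte blocks, 1024 blocks; the slides use their own numbers):

```python
from math import log2

ADDR_BITS = 32        # assumed address width
BLOCK_BYTES = 16      # assumed block size -> offset bits
NUM_BLOCKS = 1024     # assumed # of cache blocks -> index bits

offset_bits = int(log2(BLOCK_BYTES))               # bits to pick a byte
index_bits = int(log2(NUM_BLOCKS))                 # bits to pick a block
tag_bits = ADDR_BITS - index_bits - offset_bits    # the rest is the tag

print(tag_bits, index_bits, offset_bits)
```

Every memory block whose address shares the same index bits is a candidate for the same cache block; there are 2^tag_bits of them.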

Other Block Placement Policies
Direct-mapped: a memory block can be placed in only one position in the cache. Set-associative (n-way): a memory block can be placed in n positions in the cache. Fully-associative: a memory block can be placed in any position in the cache. Do they decrease the miss ratio?

Direct-mapped Block Placement
(Figure: memory blocks A-F placed into a direct-mapped cache with valid bits; the memory address splits into a tag, a block #, and an offset within the block.)

(N-Way) Set-associative Block Placement
(Figure: memory blocks A-F placed into a set-associative cache organized into sets; the memory address splits into a tag, a set #, and an offset within the block. How many bits go to the tag and the set # is the question the slide poses.)

Fully-associative Block Placement
(Figure: memory blocks A-F placed into a fully-associative cache; the address splits into a tag and an offset within the block. How many bits go to the tag and whether a block # field remains is the question the slide poses.)

Block Identification (cont'd)
N-way set-associative: there are N blocks to compare, and the N tag comparisons are done in parallel. Block # to set #: choose the low-order bits of the block # as the set #, so the address decomposes as tag + set # + offset. Consecutive blocks then map to different sets, giving fewer conflicts in the cache, especially in the presence of spatial locality. For the same cache size with higher associativity: how do #blocks per set, the index size, and the tag size change?
(Figure: the CPU address split into tag, set #, and offset; the set # indexes all the ways of the set-associative cache in parallel, each way's tag is compared, the hit signals are ORed, and a MUX selects the data from the matching way.)
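The tag + set # + offset decomposition above can be sketched as follows, under an assumed geometry (16-byte blocks, 8 blocks total, 2-way; not the slides' numbers):

```python
BLOCK_SIZE = 16                  # bytes per block (assumed)
NUM_BLOCKS = 8                   # total cache blocks (assumed)
WAYS = 2                         # 2-way set-associative (assumed)
NUM_SETS = NUM_BLOCKS // WAYS    # 4 sets

def set_fields(addr):
    """Split an address into (tag, set #, byte offset).
    The set # is the low-order bits of the block #."""
    offset = addr % BLOCK_SIZE
    set_idx = (addr // BLOCK_SIZE) % NUM_SETS
    tag = addr // (BLOCK_SIZE * NUM_SETS)
    return tag, set_idx, offset

# Consecutive blocks land in different sets:
print(set_fields(0x40))
print(set_fields(0x50))
```

Doubling the associativity at the same cache size halves the number of sets, so the index shrinks by one bit and the tag grows by one bit.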

An Example
(Figure: a set-associative cache datapath with an index into several valid/tag/data ways, parallel tag comparators, a multiplexor, and Hit/Data outputs.) Questions: how many ways is it? Number of blocks? Block size? Tag field size? Cache size?

Replacement Policies
In a direct-mapped cache, no replacement policy is necessary. In a set-associative cache, an important question is: which block to replace among the set? Least recently used (LRU) is the most commonly used scheme. How do we keep track of the usage of blocks? A single bit suffices in the two-way set-associative case; see the textbook section for the higher-associativity case.
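LRU within one set can be sketched with an ordered list used as a recency stack (most recently used at the end). This is an illustrative model, not the single-bit hardware scheme the slide mentions for the two-way case:

```python
WAYS = 2   # assumed associativity of the set

def access_set(lru, tag):
    """lru: tags resident in one set, least recently used first.
    Returns 'hit' or 'miss'; on a miss, evicts the LRU tag if full."""
    if tag in lru:
        lru.remove(tag)
        lru.append(tag)          # promote to most-recently-used
        return "hit"
    if len(lru) == WAYS:
        lru.pop(0)               # evict the least recently used tag
    lru.append(tag)
    return "miss"

s = []
print(access_set(s, 1))  # miss (cold)
print(access_set(s, 2))  # miss (cold)
print(access_set(s, 1))  # hit: 1 becomes most recently used
print(access_set(s, 3))  # miss: evicts 2, the least recently used
```

In real two-way hardware, the same effect is achieved with one bit per set that records which way was used last.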