L10 Memory


MEMORY
Reading: Chapter 6, except cache implementation details (6.4.1-6.4.6) and segmentation (6.5.5)
https://en.wikipedia.org/wiki/probability

2 Objectives
- Understand the concepts and terminology of hierarchical memory organization
- Understand how each level of memory contributes to system performance, and how that performance is measured
- Understand the concept behind cache memory

3 Reading
- Text: sections 6.1-6.4
- Wikipedia: en.wikipedia.org/wiki/cpu_cache

4 Types of Memory
Types of main memory:
- Random Access Memory (RAM): data can be accessed quickly, in any random order
- Read-Only Memory (ROM): cannot be easily modified
Non-random-access storage includes CD, DVD, hard disk, and magnetic tape.

5 RAM
Types of RAM:
- Dynamic RAM (DRAM): contains capacitors that slowly leak their charge over time. Inexpensive, but must be refreshed every few milliseconds to prevent data loss.
- Static RAM (SRAM): very fast and does not need to be periodically refreshed, but volatile (useful for cache memory).
Volatile memory is computer memory that requires power to maintain the stored information (Wikipedia).

6 ROM
- Does not need to be refreshed; needs very little charge to retain its contents
- Used to store permanent or semi-permanent data that persists even while the system is turned off

7 Mobile DDR
- Double Data Rate synchronous DRAM for mobile computers
- Generations: LPDDR (low-power version); LPDDR3 (current generation); LPDDR4 (50% better performance than LPDDR3, consuming 40% less energy)
- DDR SDRAM modules for desktop computers are commonly called DIMMs

8 The Memory Hierarchy
- In general, fast memory is more expensive than slow memory (light travels a little over a foot in a nanosecond)
- Memory is organized in a hierarchy to provide good price/performance
- Small, fast storage elements are kept in the CPU; larger, slower main memory is accessed through the data bus
- Larger, (almost) permanent storage (e.g., disk drives) is located farther from the CPU
- The goal is to minimize user involvement in determining the location of data

9 The Memory Hierarchy
- Storage organization can be thought of as a pyramid
- Trade-off of access time vs. storage size

10 Cache Memory
- Results in faster accesses by storing recently used data closer to the CPU
- Smaller than main memory, but faster
- The processor determines whether an address is contained in the cache by using the cache controller
- The cache controller sometimes accesses data by content; hence the cache is often called content-addressable memory
- Where else do you see caching principles applied?

11 Levels of Cache
- L1 (Level 1) cache: fastest, but smallest
- L2 (Level 2) cache: larger than L1, but not as fast
- L3 (Level 3) cache: even larger and slower
- Example: Apple ARM A8: L1 64KB/64KB, L2 1MB, L3 4MB

12 Cache Overview
- To access a particular piece of data, the CPU first sends a request to its nearest memory (usually L1 cache)
- If the data is not in that cache, the next layer of memory (L2 cache, then L3 or main memory) is queried
- If the data is not in cache or main memory, the request goes to virtual memory (e.g., disk)
- Once the data is located, a block (called a cache line) containing the data is fetched into cache memory
- There is a very large potential variation in access time for data

13 Locality of Reference
- Locality encourages moving data in blocks
- Principle of locality: once a byte is accessed, it is likely that a nearby data element will soon be referenced
- Three forms of locality:
  - Temporal locality: recently accessed data elements tend to be accessed again
  - Spatial locality: accesses tend to cluster
  - Sequential locality: instructions tend to be accessed sequentially
- What are high-level-language examples of the forms of locality?

14 What is a Cache Block?
- A block is just a way of organizing memory into groups of bytes that can be moved between memory and cache
- Example: if memory is 16MB and the block size is 64 bytes:
  - How many bytes (as a power of 2)? 2^24
  - How many blocks (as a power of 2)? 2^24 / 2^6 = 2^18
  - Check: 2^18 blocks * 2^6 bytes/block = 2^24 bytes
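The block arithmetic on slide 14 can be checked with a short script. This is just an illustrative sketch using the slide's numbers (16MB memory, 64-byte blocks):

```python
# Slide 14: a 16MB memory divided into 64-byte blocks.
MEMORY_BYTES = 2**24   # 16 MB
BLOCK_BYTES = 2**6     # 64 bytes per block

num_blocks = MEMORY_BYTES // BLOCK_BYTES
print(num_blocks == 2**18)       # True: 2^24 / 2^6 = 2^18 blocks
print(num_blocks * BLOCK_BYTES)  # 16777216, i.e. 2^24 bytes, as expected
```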

15 Cache Data Movement
- Data moves a block (cache line) at a time between memory and cache
- A memory address is mapped into a cache address
- Strategy question: which cache block corresponds to a memory block?

16 Offset Addressing
- A memory address can be calculated by adding the address of a memory block to the offset within that block
- Example: a 16-byte block would have offset values 0000, 0001, 0010, 0011, ..., 1111

17 Calculating a Block Number
- A memory address can be decomposed into a block number and an offset
- If a 4KB memory uses a block size of 16 bytes, memory contains 2^12 bytes and 2^8 blocks
- Example: address 1100 1110 0010 decomposes into block number 1100 1110 and offset 0010

18 Direct Mapped Cache
- A cache line contains a block of data (and more)
- Each cache line is associated with multiple blocks in memory
- The processor needs to know whether the address it has is associated with an active block in the cache
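Decomposing an address into block number and offset is just shifting and masking. A minimal sketch using slide 17's geometry (16-byte blocks; the helper name is my own):

```python
OFFSET_BITS = 4  # 2^4 = 16 bytes per block

def split_address(addr):
    """Return (block_number, offset) for a byte address."""
    return addr >> OFFSET_BITS, addr & ((1 << OFFSET_BITS) - 1)

# Slide 17's example address 1100 1110 0010:
block, offset = split_address(0b110011100010)
print(bin(block))   # 0b11001110  (block number 1100 1110)
print(bin(offset))  # 0b10        (offset 0010)
```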

19 Cache Access Process
- The CPU maps a memory address into a tag, cache line location, and offset
- The cache line might hold any one of many memory blocks
- The processor matches the tag field of the memory address to the tag field of the cache line
- If the tag fields do not match (the requested line is not in cache), the line is moved from main memory to the cache
- Mapping schemes: direct, fully associative, and set associative (the above applies to direct-mapped cache)

20 Memory Address Fields
- The computer converts the main memory address to a cache line address
- The tag identifies which block in memory is in the cache
- Address fields: tag | cache line address | offset, where tag + cache line address together form the block number

21 Cache Line Layout
- The cache holds not only the data block, but also a block identifier (tag)
- Tag: matching the tag of the memory address with the tag field of the cache line identifies a hit or miss
- Flag bits (valid bit and dirty bit) tell the processor whether the cache line has been modified
- Line layout: tag | data block | flag bits

22 How the Tag Field Works
- The cache block address is determined from the block field of the memory address
- The tag field of the memory address is compared with the tag field of the cache line
(Slide figure: a worked bit-level example comparing the tag bits of memory addresses such as 0011 and 1011 against the tag fields of cache lines.)

23 Example: Direct Mapped Cache
- Byte-addressable main memory with 4 bytes per block
- 4 blocks of main memory, 2 blocks of cache memory
- Example address 0011: tag field 1 bit (identifies the block among those possible for that cache slot), block number 1 bit (2^1 blocks of cache), offset 2 bits (2^2 bytes per block)
- Memory addresses 0011 and 1011 compete for the same cache block

24 Are We on Track?
A computer has a direct-mapped cache with 32 blocks, each cache block is 16 bytes, and there are 2^20 bytes of byte-addressable main memory.
1. How many blocks of main memory are there?
2. What is the format of a memory address as seen by the cache (i.e., what are the sizes of the tag, block, and offset fields)?
3. To which cache block will the memory address 0DB63 map?

25 Were We on Track?
- How many blocks of main memory? Main memory is 2^20 bytes with 2^4 bytes per block, so there are 2^16 blocks of main memory (2^20 / 2^4)
- Size of fields: offset 4 bits (2^4 bytes per block); block 5 bits (2^5 cache blocks); tag 11 bits (20 - 5 - 4)
- Memory address 0DB63 is 0000 1101 1011 0110 0011 in binary (tag 11 bits, block 5 bits, offset 4 bits); the block field is 10110, so it maps to cache block 10110

26 Definitions
- Hit: data is found at a given memory level
- Miss: data is not found
- Hit rate: percentage of time data is found at a given memory level
- Miss rate: percentage of time data is not found; miss rate = (1 - hit rate)
- Hit time: time required to access data at a given memory level
- Miss penalty: time required to process a miss, including the time to replace a block of memory plus the time to deliver the data to the processor
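The worked example above can be verified mechanically. This sketch hard-codes the slide's geometry (32 cache blocks, 16-byte blocks, 2^20-byte memory):

```python
OFFSET_BITS = 4                            # 2^4 = 16 bytes per block
BLOCK_BITS = 5                             # 2^5 = 32 cache blocks
TAG_BITS = 20 - BLOCK_BITS - OFFSET_BITS   # remaining address bits

addr = 0x0DB63                             # the slide's 20-bit address
offset = addr & 0xF                        # low 4 bits
block = (addr >> OFFSET_BITS) & 0x1F       # next 5 bits
tag = addr >> (OFFSET_BITS + BLOCK_BITS)   # top 11 bits

print(TAG_BITS)    # 11
print(bin(block))  # 0b10110 -> maps to cache block 10110
```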

27 Probability Recap
- Probability: the likelihood that an event will occur, quantified as a number between 0 and 1 (0 = not possible, 1 = certain)
- Example: the probability that a coin flip will result in heads is .5
- Expected value: the probability-weighted average of all possible values
- Example: the expected value of a roll of a die is 3.5 = 1*.167 + 2*.167 + 3*.167 + 4*.167 + 5*.167 + 6*.167

28 Effective Access Time (EAT)
- A measure of hierarchical memory performance
- Assumes memory and cache accesses initiate simultaneously
- Expected (or average) time per access = cache hit probability * cache access time + cache miss probability * memory access time
- Example: access times of 10ns (cache) and 200ns (memory); probabilities of .99 (cache hit) and .01 (cache miss):
  EAT = .99 * 10ns + .01 * 200ns = 9.9ns + 2ns = 11.9ns
- We will extend this when we consider multi-level cache and virtual memory
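The EAT formula translates directly to code; a sketch with the slide's numbers (the function name is my own):

```python
def eat(hit_rate, cache_ns, memory_ns):
    """Effective access time, assuming cache and memory
    accesses initiate simultaneously, so a miss costs the
    full memory access time."""
    return hit_rate * cache_ns + (1 - hit_rate) * memory_ns

# Slide 28's example: 10ns cache, 200ns memory, 99% hit rate.
print(round(eat(0.99, 10, 200), 1))  # 11.9
```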

31 Finding the Line in the Cache
- Look up in the cache to find a match (if any) with the address
- Various approaches:
  - Direct-mapped: each location in memory corresponds to only one entry in the cache
  - Content-addressable memory: every address in the cache is examined in parallel, returning the cache line or a no-hit response
  - 2-way set associative: each memory location maps to 2 locations in the cache
- A direct-mapped cache is simpler to implement, but an associative cache performs better

32 Fully Associative Cache
- A main memory block can be placed anywhere in the cache
- Cache lookup is much more complex, since all cache lines are matched against the memory address
- Address fields: tag | offset; the tag field is equivalent to the tag + block fields in direct mapping
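A fully associative lookup amounts to comparing the requested tag against every valid line. In hardware all comparisons happen in parallel; this illustrative sketch (names are mine) scans sequentially:

```python
def lookup(cache_lines, tag):
    """Return the data for `tag`, or None on a miss.
    cache_lines: list of (valid, tag, data) tuples."""
    for valid, line_tag, data in cache_lines:
        if valid and line_tag == tag:   # invalid lines never match
            return data
    return None                         # no-hit response

lines = [(True, 0b1101, "A"), (False, 0b0110, "B"), (True, 0b0110, "C")]
print(lookup(lines, 0b0110))  # C (the invalid line holding "B" is skipped)
print(lookup(lines, 0b1111))  # None (miss)
```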

33 Cache Write Policies
- Cache replacement policies must take into account dirty blocks (blocks that have been updated while they were in the cache); dirty blocks must be written back to memory
- A write policy determines how to write back to memory:
  - Write-through: updates cache and main memory simultaneously on every write
  - Write-back (also called copyback): updates memory only when the block is selected for replacement

34 Cache Write Policy Pros & Cons
- Write-through
  - Advantage: cache coherence
  - Disadvantage: memory must be updated with each cache write (the slowdown is usually negligible, because the majority of accesses tend to be reads, not writes)
- Write-back
  - Advantage: memory traffic is minimized
  - Disadvantage: memory does not always agree with the value in cache, causing potential problems in multi-core systems
- Cache coherence becomes important with multi-core systems
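The traffic difference between the two policies can be made concrete with a toy one-block cache. This is my own illustrative sketch, not code from the text:

```python
class TinyCache:
    """One-block cache contrasting write-through and write-back."""
    def __init__(self, memory, write_back=True):
        self.memory = memory          # backing store: block number -> value
        self.write_back = write_back
        self.block = None             # which memory block is cached
        self.value = None
        self.dirty = False
        self.memory_writes = 0        # counts traffic to main memory

    def write(self, block, value):
        if self.block != block:       # miss: evict the current block first
            self._evict()
            self.block = block
        self.value = value
        if self.write_back:
            self.dirty = True         # defer the memory update
        else:
            self.memory[block] = value  # write-through: update memory now
            self.memory_writes += 1

    def _evict(self):
        if self.dirty:                # write-back flushes only on eviction
            self.memory[self.block] = self.value
            self.memory_writes += 1
            self.dirty = False

mem = {0: 0}
wb = TinyCache(mem, write_back=True)
for v in range(5):
    wb.write(0, v)                    # five writes to the same block
wb._evict()                           # force the final flush
print(wb.memory_writes)               # 1: only the last value reaches memory

mem2 = {0: 0}
wt = TinyCache(mem2, write_back=False)
for v in range(5):
    wt.write(0, v)
print(wt.memory_writes)               # 5: every write goes to memory
```

Note how write-back leaves memory stale until eviction: that is exactly the coherence hazard slide 34 warns about for multi-core systems.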

35 Cache Coherence
- Cache coherence: the consistency of data stored in the local caches of a shared resource
- The clients are usually separate cores in a multi-core processor

36 Separate Caches
- Unified (integrated) cache: both instructions and data are cached together
- Many modern systems employ separate caches for data and instructions (called a Harvard cache)
- The separation of data from instructions provides better locality, at the cost of greater complexity
- Why do separate caches for instructions and data work well?

37 Multi-Level Cache Memory
- Most of today's small systems employ multilevel cache hierarchies; the levels of cache form their own small memory hierarchy
- Level 1 cache (8KB to 64KB): on the processor; access time is typically about 4ns
- Level 2 cache (64KB to 2MB): may be on the die, on the motherboard, or on an expansion card; access time is usually around 15-20ns
- Cache sizes range from the text's estimates (the low end) to higher; L1 cache size increases as chip real estate becomes available

38 Example: AMD K8
- 64-byte cache lines (source: Wikipedia)

39 3-Level Cache Memory
- Level 2 cache on the same die as the CPU reduces its access time to about 10ns
- Level 3 cache (2MB to 256MB): either situated between the processor and main memory, or on the die

40 Have You Met the Objectives?
- Understand the concepts and terminology of hierarchical memory organization
- Understand how each level of memory contributes to system performance, and how the performance is measured
- Understand the concept behind cache memory