The Memory Hierarchy 10/25/16


Transition
First half of course: hardware focus
  How the hardware is constructed
  How the hardware works
  How to interact with hardware
Second half: performance and software systems
  Memory performance
  Operating systems
  Standard libraries
  Parallel programming

Making programs efficient
Algorithms matter
  CS35
  CS41
Hardware matters
  Engineering
Using the hardware properly matters
  CPU vs GPU
  Parallel programming
  Memory hierarchy

Memory so far: array abstraction
Memory is a big array of bytes. Every address is an index into this array.
This is the level of abstraction at which an assembly programmer thinks. C programmers can think even more abstractly with variables.
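The array abstraction can be sketched directly in C. This is a toy model, not how real hardware is built: the names `memory`, `store32`, and `load32` and the 64-byte size are assumptions made for illustration. An address is simply an index into a byte array, and a 4-byte value occupies four consecutive slots.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical toy "memory": a flat array of bytes indexed by address. */
enum { MEM_SIZE = 64 };
static uint8_t memory[MEM_SIZE];

/* Store a 32-bit value in the four bytes starting at address addr. */
void store32(uint32_t addr, uint32_t value) {
    assert((size_t)addr + sizeof value <= (size_t)MEM_SIZE);
    memcpy(&memory[addr], &value, sizeof value);
}

/* Load the 32-bit value whose first byte lives at address addr. */
uint32_t load32(uint32_t addr) {
    uint32_t value;
    assert((size_t)addr + sizeof value <= (size_t)MEM_SIZE);
    memcpy(&value, &memory[addr], sizeof value);
    return value;
}
```

A C variable hides this indexing: the compiler picks the address, and the programmer only uses the name.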

Memory Technologies
Volatile (loses data without power)
  Latches (registers, cache)
  Capacitors (DRAM)
Non-volatile (maintains data when computer is turned off)
  Magnetic (hard drives)
  Flash (SSDs)

The Memory Hierarchy
(Faster toward the top, cheaper toward the bottom)
Registers: 1 cycle to access
Cache(s) (SRAM): a few cycles to access
Main memory (DRAM): ~100 cycles to access
Local secondary storage (disk, SSD): ~100,000,000 cycles to access

Key idea this week: caching
Store everything in cheap, slow storage.
Store a subset in fast, expensive storage.
Try to guess the most useful subset to cache.

A note on terminology
Caching: the general principle of holding a small subset of your data in fast-access storage.
The cache: SRAM memory inside the CPU.

Connecting CPU and Memory
Components are connected by a bus:
  A bus is a bundle of parallel wires that carry address, data, and control signals.
  Buses are typically shared by multiple devices.
[Figure: the CPU chip (register file, ALU, cache, bus interface) connects over the system bus to the I/O bridge, which connects over the memory bus to main memory.]

How a Memory Read Works (1)
Load operation: movl (A), %eax
CPU places address A on the memory bus.
[Figure: CPU chip (register file with %eax, ALU, cache, bus interface), I/O bridge, and main memory holding X at address A; A is on the bus.]

How a Memory Read Works (2)
Load operation: movl (A), %eax
Main memory reads address A from the memory bus, fetches the data X at that address, and puts it on the bus.
[Figure: same components; X is on the bus.]

How a Memory Read Works (3)
Load operation: movl (A), %eax
CPU reads X from the bus and copies it into register %eax. A copy also goes into the on-chip cache memory.
[Figure: same components; X is now in %eax and in the cache.]

Write
Store operation: movl %eax, (A)
1. CPU writes A to the bus; memory reads it.
2. CPU writes Y to the bus; memory reads it.
3. Memory stores the value it read, Y, at address A.
[Figure: same components; Y is in %eax and in the cache, and is stored at address A in main memory.]

I/O Bus: connects Devices & Memory
OS moves data between main memory and devices.
[Figure: the CPU chip (register file, ALU, cache, bus interface) connects over the system bus to the I/O bridge, which connects over the memory bus to main memory. The I/O bus attaches a USB controller (mouse, keyboard), a graphics controller (monitor), a disk controller (disk), and expansion slots for other devices such as a network controller.]

Device Driver: OS device-specific code
OS driver code running on the CPU makes read & write requests to a device controller via the I/O bridge.
[Figure: same layout as the previous slide: CPU chip, system bus, I/O bridge, memory bus, main memory, and the I/O bus with USB, graphics, and disk controllers.]

Abstraction Goal
Reality: There is no one type of memory to rule them all!
Abstraction: hide the complex/undesirable details of reality.
Illusion: We have the speed of SRAM, with the capacity of disk, at reasonable cost.

What's Inside A Disk Drive?
Parts: actuator arm, spindle, platters, R/W head, bus connector, controller electronics (includes processor & memory).
Data are encoded as points of magnetism on the platter surfaces.
The device driver (part of OS code) interacts with the controller to read from and write to the disk.
(Image from Seagate Technology)

Reading and Writing to Disk
Data blocks are located in some sector of some track on some surface.
1. Disk arm moves to the correct track (seek time).
2. Wait for the sector to spin under the R/W head (rotational latency).
3. As the sector spins under the head, data are read or written (transfer time).
The disk surface spins at a fixed rotational rate, ~7200 rotations/min.
The disk arm sweeps across the surface to position the read/write head over a specific track.
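The three steps above add up to the time for one disk access. The sketch below works through the arithmetic. The function names are hypothetical, and the average rotational latency is taken to be half a rotation, a common simplifying assumption that the slides do not state. Seek and transfer times vary by drive, so they are left as parameters.

```c
#include <assert.h>

/* Time for one full rotation, in milliseconds, at the given RPM. */
double rotation_ms(double rpm) {
    assert(rpm > 0.0);
    return 60.0 * 1000.0 / rpm;
}

/* Assumption: on average the wanted sector is half a rotation away. */
double avg_rotational_latency_ms(double rpm) {
    return rotation_ms(rpm) / 2.0;
}

/* Total access time = seek time + rotational latency + transfer time. */
double access_time_ms(double seek_ms, double rpm, double transfer_ms) {
    assert(seek_ms >= 0.0 && transfer_ms >= 0.0);
    return seek_ms + avg_rotational_latency_ms(rpm) + transfer_ms;
}
```

At 7200 rotations/min one rotation takes 60,000 / 7200, or about 8.3 ms, so the average rotational latency under this assumption is about 4.2 ms.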

Cache Basics
CPU real estate is dedicated to cache.
Usually two levels:
  L1: smallest, fastest
  L2: larger, slower
Same rules apply: L1 is a subset of L2.
[Figure: CPU (registers, ALU, L1, L2 cache) connected by the memory bus to main memory.]

Cache Basics
CPU real estate is dedicated to cache.
Usually two levels:
  L1: smallest, fastest
  L2: larger, slower
We'll assume one cache (same principles).
Cache is a subset of main memory. (Not to scale, memory is much bigger!)
[Figure: CPU (registers, ALU, cache) connected by the memory bus to main memory.]

Cache Basics: Read from memory
In cache?
In parallel:
  Issue read to memory (request data)
  Check cache
[Figure: CPU (registers, ALU, cache) connected by the memory bus to main memory.]

Cache Basics: Read from memory
In cache?
In parallel:
  Issue read to memory
  Check cache
Data in cache (hit):
  Good, send to register
  Cancel/ignore memory

Cache Basics: Read from memory
In cache?
In parallel:
  Issue read to memory
  Check cache
Data in cache (hit):
  Good, send to register
  Cancel/ignore memory
Data not in cache (miss):
  1. Load cache from memory (~200 cycles; might need to evict data)
  2. Send to register
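The hit and miss paths can be sketched as a small software model. Everything here is an assumption made for illustration: the sizes, the names `cache_read` and `main_memory`, and the placement rule (a word can live in only one line, chosen by `addr % NUM_LINES`), which the slides do not specify. The model also checks the cache first and then goes to memory, whereas the hardware described above does both in parallel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum { NUM_LINES = 4, MEM_WORDS = 32 };   /* hypothetical sizes */

typedef struct {
    bool     valid;   /* does this line currently hold data? */
    uint32_t addr;    /* which memory word is cached here */
    int      data;    /* the cached copy of that word */
} CacheLine;

static int       main_memory[MEM_WORDS];
static CacheLine cache[NUM_LINES];
static int       hits, misses;

/* Read one word. Hit: return the cached copy. Miss: load the line
 * from memory (evicting whatever was there), then return the value. */
int cache_read(uint32_t addr) {
    assert(addr < (uint32_t)MEM_WORDS);
    CacheLine *line = &cache[addr % NUM_LINES];   /* assumed placement rule */
    if (line->valid && line->addr == addr) {
        hits++;
        return line->data;
    }
    misses++;
    line->valid = true;
    line->addr  = addr;
    line->data  = main_memory[addr];
    return line->data;
}
```

Reading the same address twice gives one miss followed by one hit. Reading a second address that maps to the same line evicts the first, so the next read of the first address misses again.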

Cache Basics: Write to memory
Assume the data is already cached. Otherwise, bring it in like a read.
1. Update cached copy.
2. Update memory?
[Figure: CPU (registers, ALU, cache) connected by the memory bus to main memory.]

When should we copy the written data from cache to memory? Why?
A. Immediately update the data in memory when we update the cache.
B. Update the data in memory when we evict the data from the cache.
C. Update the data in memory if the data is needed elsewhere (e.g., another core).
D. Update the data in memory at some other time. (When?)

When should we copy the written data from cache to memory? Why?
A. Immediately update the data in memory when we update the cache. ("Write-through")
B. Update the data in memory when we evict the data from the cache. ("Write-back")
C. Update the data in memory if the data is needed elsewhere (e.g., another core).
D. Update the data in memory at some other time. (When?)

Cache Basics: Write to memory
Both options (write-through, write-back) are viable.
Write-through: write to memory immediately
  Simpler
  Accesses memory more often (slower)
Write-back: only write to memory on eviction
  More complex (cache can be inconsistent with memory)
  Potentially reduces memory accesses (faster)
  Sells better. Servers/desktops/laptops
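The difference in memory traffic can be seen in a toy model of a single cached value. The type and function names below are hypothetical, and a real cache tracks many lines rather than one. Write-through pays one memory write per store. Write-back marks the line dirty and pays one memory write at eviction.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { WRITE_THROUGH, WRITE_BACK } WritePolicy;

/* Hypothetical one-line cache in front of a one-word memory. */
typedef struct {
    WritePolicy policy;
    int  cached_value;
    int  memory_value;
    bool dirty;           /* write-back only: cache differs from memory */
    int  memory_writes;   /* how many times memory was updated */
} OneLineCache;

/* Store a value: always update the cached copy; update memory now
 * only under write-through. */
void cache_write(OneLineCache *c, int value) {
    assert(c != NULL);
    c->cached_value = value;
    if (c->policy == WRITE_THROUGH) {
        c->memory_value = value;
        c->memory_writes++;
    } else {
        c->dirty = true;
    }
}

/* Evict the line: under write-back, flush it to memory if dirty. */
void cache_evict(OneLineCache *c) {
    assert(c != NULL);
    if (c->policy == WRITE_BACK && c->dirty) {
        c->memory_value = c->cached_value;
        c->memory_writes++;
        c->dirty = false;
    }
}
```

Ten stores to the same location followed by an eviction cost ten memory writes under write-through and one under write-back. Until that eviction, the write-back memory copy is stale, which is the inconsistency noted above.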

Discussion Question
What data should we keep in the cache?
What principles can we use to make a decent guess?

Problem: Prediction
We can't know the future.
So are we out of luck? What might we look at to help us decide?
The past is often a pretty good predictor.

Analogy: two types of Netflix users
[Figure: user 1 and user 2]
What should be next in each user's queue?

Critical Concept: Locality
Locality: we tend to repeatedly access recently accessed items, or those that are nearby.
Temporal locality: An item accessed recently is likely to be accessed again soon.
Spatial locality: We're likely to access an item that's near others we just accessed.

In the following code, how many examples are there of temporal / spatial locality? Where are they?
    void print_array(int *array, int num) {
        int i;
        for (i = 0; i < num; i++) {
            printf("%d : %d", i, array[i]);
        }
    }
A. 1 temporal, 1 spatial
B. 1 temporal, 2 spatial
C. 2 temporal, 1 spatial
D. 2 temporal, 2 spatial
E. Some other number

Example
    void print_array(int *array, int num) {
        int i;
        for (i = 0; i < num; i++) {
            printf("%d : %d", i, array[i]);
        }
    }
Temporal locality?
  array, num, and i are used over and over again in each iteration
Spatial locality?
  array bucket access
  program instructions
Programs with loops tend to have a lot of locality, and most programs have loops: it's hard to write a long-running program without a loop.
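Traversal order shows how much spatial locality matters. The sketch below counts how often consecutive accesses to a 2D array land in a different block of memory. The 64-byte block size, the 16 x 16 array shape, and the name `block_switches` are assumptions made for illustration. The code works with byte offsets from the start of the array, so it does not depend on where the array actually sits in memory.

```c
#include <stdbool.h>
#include <stdint.h>

enum { ROWS = 16, COLS = 16, BLOCK_BYTES = 64 };   /* hypothetical sizes */

/* Walk a ROWS x COLS array of int32_t (stored row by row, as in C) in
 * row-major or column-major order, and count how many accesses fall in
 * a different BLOCK_BYTES-sized block than the access before them. */
long block_switches(bool row_major) {
    long switches = 0;
    long prev_block = -1;                 /* no block touched yet */
    int outer_n = row_major ? ROWS : COLS;
    int inner_n = row_major ? COLS : ROWS;
    for (int outer = 0; outer < outer_n; outer++) {
        for (int inner = 0; inner < inner_n; inner++) {
            int r = row_major ? outer : inner;
            int c = row_major ? inner : outer;
            long offset = ((long)r * COLS + c) * (long)sizeof(int32_t);
            long block = offset / BLOCK_BYTES;
            if (block != prev_block) {
                switches++;
                prev_block = block;
            }
        }
    }
    return switches;
}
```

With these sizes each row fills exactly one block, so the row-major walk changes block 16 times. The column-major walk changes block on every one of its 256 accesses, because each step jumps a full row ahead.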

Use Locality to Speed Up Memory Access
Caching key idea: keep copies of data that are likely to be accessed soon in higher levels of the memory hierarchy, to make future accesses to them faster:
  recently accessed data (temporal locality)
  data near recently accessed data (spatial locality)
If a program has a high degree of locality, the next data access is likely to be in the cache.
  - If there is little or no locality, then caching won't help.
  + Luckily, most programs have a high degree of locality.
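How much caching helps depends on how often accesses hit. The sketch below averages the per-access cost over a mix of hits and misses. The function name is hypothetical and the cycle counts are parameters. Plugging in 4 cycles for a hit (an assumed stand-in for "a few") and ~100 cycles for a miss, as in the hierarchy figures earlier, gives a feel for the numbers.

```c
#include <assert.h>

/* Average cycles per access, given how many accesses hit the cache,
 * how many missed, and the cost in cycles of each kind. */
double average_access_cycles(long hits, long misses,
                             double hit_cycles, double miss_cycles) {
    long total = hits + misses;
    assert(hits >= 0 && misses >= 0 && total > 0);
    return (hits * hit_cycles + misses * miss_cycles) / (double)total;
}
```

With these assumed costs, a program whose accesses hit 99 times out of 100 averages about 5 cycles per access, while one that hits only half the time averages 52.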

Discussion Question
What data should we evict from the cache?
What principles can we use to make a decent guess?