Virtual Memory - Objectives

Similar documents
ECE232: Hardware Organization and Design

ECE331: Hardware Organization and Design

Virtual Memory Virtual memory first used to relive programmers from the burden of managing overlays.

Virtual Memory. Patterson & Hennessey Chapter 5 ELEC 5200/6200 1

Virtual Memory. Adapted from instructor s supplementary material from Computer. Patterson & Hennessy, 2008, MK]

Chapter 5. Large and Fast: Exploiting Memory Hierarchy

Memory Hierarchy. Jinkyu Jeong Computer Systems Laboratory Sungkyunkwan University

5DV118 Computer Organization and Architecture Umeå University Department of Computing Science Stephen J. Hegner

Virtual Memory. Lecture for CPSC 5155 Edward Bosworth, Ph.D. Computer Science Department Columbus State University

Chapter 5. Large and Fast: Exploiting Memory Hierarchy

UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering. Computer Architecture ECE 568/668

Computer Architecture Computer Science & Engineering. Chapter 5. Memory Hierachy BK TP.HCM

Virtual Memory. Reading. Sections 5.4, 5.5, 5.6, 5.8, 5.10 (2) Lecture notes from MKP and S. Yalamanchili

Virtual Memory. Virtual Memory

Memory Hierarchy Y. K. Malaiya

CPS104 Computer Organization and Programming Lecture 16: Virtual Memory. Robert Wagner

Chapter 5. Large and Fast: Exploiting Memory Hierarchy

Transistor: Digital Building Blocks

The Memory Hierarchy. Cache, Main Memory, and Virtual Memory (Part 2)

1. Creates the illusion of an address space much larger than the physical memory

Agenda. CS 61C: Great Ideas in Computer Architecture. Virtual Memory II. Goals of Virtual Memory. Memory Hierarchy Requirements

Chapter 5. Large and Fast: Exploiting Memory Hierarchy. Jiang Jiang

CISC 662 Graduate Computer Architecture Lecture 16 - Cache and virtual memory review

CS/ECE 3330 Computer Architecture. Chapter 5 Memory

CPS 104 Computer Organization and Programming Lecture 20: Virtual Memory

Virtual Memory, Address Translation

Another View of the Memory Hierarchy. Lecture #25 Virtual Memory I Memory Hierarchy Requirements. Memory Hierarchy Requirements

Memory Hierarchy Requirements. Three Advantages of Virtual Memory

CS 333 Introduction to Operating Systems. Class 11 Virtual Memory (1) Jonathan Walpole Computer Science Portland State University

Virtual Memory, Address Translation

Chapter 5B. Large and Fast: Exploiting Memory Hierarchy

Chapter Seven. Memories: Review. Exploiting Memory Hierarchy CACHE MEMORY AND VIRTUAL MEMORY

Memory Hierarchy Computing Systems & Performance MSc Informatics Eng. Memory Hierarchy (most slides are borrowed)

Memory Hierarchy Computing Systems & Performance MSc Informatics Eng. Memory Hierarchy (most slides are borrowed)

virtual memory. March 23, Levels in Memory Hierarchy. DRAM vs. SRAM as a Cache. Page 1. Motivation #1: DRAM a Cache for Disk

Memory hier ar hier ch ar y ch rev re i v e i w e ECE 154B Dmitri Struko Struk v o

ECE468 Computer Organization and Architecture. Virtual Memory

Chapter 5. Large and Fast: Exploiting Memory Hierarchy

ECE4680 Computer Organization and Architecture. Virtual Memory

Virtual Memory: From Address Translation to Demand Paging

Virtual Memory: From Address Translation to Demand Paging

UC Berkeley CS61C : Machine Structures

virtual memory Page 1 CSE 361S Disk Disk

Lecture 21: Virtual Memory. Spring 2018 Jason Tang

Virtual Memory. Motivations for VM Address translation Accelerating translation with TLBs

EEC 483 Computer Organization. Chapter 5.3 Measuring and Improving Cache Performance. Chansu Yu

Chapter 5 (Part II) Large and Fast: Exploiting Memory Hierarchy. Baback Izadi Division of Engineering Programs

Chapter 5. Large and Fast: Exploiting Memory Hierarchy

Main Memory (Fig. 7.13) Main Memory

CSF Improving Cache Performance. [Adapted from Computer Organization and Design, Patterson & Hennessy, 2005]

Chapter 5 B. Large and Fast: Exploiting Memory Hierarchy

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. 5 th. Edition. Chapter 5. Large and Fast: Exploiting Memory Hierarchy

CS162 - Operating Systems and Systems Programming. Address Translation => Paging"

Random-Access Memory (RAM) Systemprogrammering 2007 Föreläsning 4 Virtual Memory. Locality. The CPU-Memory Gap. Topics

Memory Technology. Chapter 5. Principle of Locality. Chapter 5 Large and Fast: Exploiting Memory Hierarchy 1

Random-Access Memory (RAM) Systemprogrammering 2009 Föreläsning 4 Virtual Memory. Locality. The CPU-Memory Gap. Topics! The memory hierarchy

Virtual Memory Oct. 29, 2002

CS 61C: Great Ideas in Computer Architecture. Virtual Memory

Virtual Memory Nov 9, 2009"

Sarah L. Harris and David Money Harris. Digital Design and Computer Architecture: ARM Edition Chapter 8 <1>

Chapter 5. Large and Fast: Exploiting Memory Hierarchy. Part II Virtual Memory

Memory hierarchy review. ECE 154B Dmitri Strukov

Chapter 5. Large and Fast: Exploiting Memory Hierarchy

ECE7995 (6) Improving Cache Performance. [Adapted from Mary Jane Irwin s slides (PSU)]

CS 61C: Great Ideas in Computer Architecture. Virtual Memory III. Instructor: Dan Garcia

ADDRESS TRANSLATION AND TLB

Translation Buffers (TLB s)

Computer Architecture. Memory Hierarchy. Lynn Choi Korea University

CS252 S05. Main memory management. Memory hardware. The scale of things. Memory hardware (cont.) Bottleneck

Computer Systems. Virtual Memory. Han, Hwansoo

LECTURE 12. Virtual Memory

ADDRESS TRANSLATION AND TLB

Motivations for Virtual Memory Virtual Memory Oct. 29, Why VM Works? Motivation #1: DRAM a Cache for Disk

CISC 360. Virtual Memory Dec. 4, 2008

Virtual Memory. Daniel Sanchez Computer Science & Artificial Intelligence Lab M.I.T. November 15, MIT Fall 2018 L20-1

UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering. Computer Architecture ECE 568/668

Computer Organization and Structure. Bing-Yu Chen National Taiwan University

HY225 Lecture 12: DRAM and Virtual Memory

Virtual Memory. Motivation:

ELEC 5200/6200 Computer Architecture and Design Spring 2017 Lecture 7: Memory Organization Part II

Computer Structure. X86 Virtual Memory and TLB

Page 1. Memory Hierarchies (Part 2)

EE 4683/5683: COMPUTER ARCHITECTURE

Virtual Memory. User memory model so far:! In reality they share the same memory space! Separate Instruction and Data memory!!

Memory Hierarchy. Goal: Fast, unlimited storage at a reasonable cost per bit.

Virtual Memory. Daniel Sanchez Computer Science & Artificial Intelligence Lab M.I.T. April 12, 2018 L16-1

SE-292 High Performance Computing. Memory Hierarchy. R. Govindarajan

COMPUTER ORGANIZATION AND DESIGN

CS152 Computer Architecture and Engineering Lecture 18: Virtual Memory


Topic 18: Virtual Memory

UC Berkeley CS61C : Machine Structures

COSC3330 Computer Architecture Lecture 20. Virtual Memory

EN1640: Design of Computing Systems Topic 06: Memory System

CSE 560 Computer Systems Architecture

COEN-4730 Computer Architecture Lecture 3 Review of Caches and Virtual Memory

CS3350B Computer Architecture

CSE 141 Computer Architecture Spring Lectures 17 Virtual Memory. Announcements Office Hour

CPSC 330 Computer Organization

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface

Transcription:

ECE232: Hardware Organization and Design Part 16: Virtual Memory Chapter 7 http://www.ecs.umass.edu/ece/ece232/ Adapted from Computer Organization and Design, Patterson & Hennessy Virtual Memory - Objectives 1. Allow program to be written without memory constraints program can exceed the size of the main memory 2. Many Programs sharing DRAM Memory so that context switches can occur 3. Relocation: Parts of the program can be placed at different locations in the memory instead of a big chunk Virtual Memory: I. Main Memory holds many programs running at same time (processes) II. use Main Memory as a kind of cache for disk Datapath Processor Control Cache Regs Main Memory (DRAM) Disk ECE232: Virtual Memory 2 Adapted from Computer Organization and Design, Patterson&Hennessy; Kundu, UMass Koren, 2011

Disk Technology in Brief Disk is mechanical memory tracks 3600 4200-5400-7200-10000 RPM rotation speed R/W arm Disk Access Time = seek time + rotational delay + transfer time usually measured in milliseconds Miss to disk is extremely expensive typical access time = millions of clock cycles Addressing a sector ECE232: Virtual Memory 3 Adapted from Computer Organization and Design, Patterson&Hennessy; Kundu, UMass Koren, 2011 Virtual to Physical Memory mapping Each process has its own private virtual address space (e.g., 2 32 Bytes); CPU actually generates virtual addresses Each computer has a physical address space (e.g., 128 MB DRAM); also called real memory Address translation: mapping virtual Virtual Memory addresses to physical addresses Allows some chunks of virtual memory to be present on disk, not in main memory Physical Memory Allows multiple programs to use (different chunks of physical) memory at same time Processor virtual address ECE232: Virtual Memory 4 Adapted from Computer Organization and Design, Patterson&Hennessy; Kundu, UMass Koren, 2011

Mapping Virtual Memory to Physical Memory Divide Memory into equal sized pages (say, 4KB each) A page of Virtual Memory can be assigned to any page frame of Physical Memory Virtual Memory 4GB Stack 128 MB Physical Memory Heap Heap Static Single Process Code 0 0 ECE232: Virtual Memory 5 Adapted from Computer Organization and Design, Patterson&Hennessy; Kundu, UMass Koren, 2011 How to Perform Address Translation? VM divides memory into equal sized pages Address translation maps entire pages offsets within the pages do not change if page size is a power of two, the virtual address separates into two fields: like cache index, offset fields virtual address Virtual Page Number Page Offset ECE232: Virtual Memory 6 Adapted from Computer Organization and Design, Patterson&Hennessy; Kundu, UMass Koren, 2011

Mapping Virtual to Physical Address Virtual Address 31 30 29 28 27..12 11 10 9 8... 3 2 1 0 Virtual Page Number Page Offset Translation 1KB page size Physical Page Number Page Offset 27...12 11 10 Physical Address 9 8... 3 2 1 0 ECE232: Virtual Memory 7 Adapted from Computer Organization and Design, Patterson&Hennessy; Kundu, UMass Koren, 2011 Address Translation Want fully associative page placement How to locate the physical page? Search impractical (too many pages) A page table is a data structure which contains the mapping of virtual pages to physical pages There are several different ways, all up to the operating system, to keep and update this data Each process running in the system has its own page table CPU Virtual Address Virtual Page No. Page Offset Hit_time = 70-100 CPU cycles Miss_penalty = 10^6 cycles Miss_rate = 1% Page Table Address Mapping Physical Page No. Page Offset Physical Address Memory ECE232: Virtual Memory 8 Adapted from Computer Organization and Design, Patterson&Hennessy; Kundu, UMass Koren, 2011

Address Translation: Page Table Virtual Address (VA): virtual page # offset Page Table Register index into page table Page Table is located in physical memory Val -id Page Table... V A.R. P. P. N. Access Rights Physical Page Number 1 A.R. P. P. N. 0 A.R.... Access Rights: None, Read Only, Read/Write, Execute Physical Memory Address (PA) disk ECE232: Virtual Memory 9 Adapted from Computer Organization and Design, Patterson&Hennessy; Kundu, UMass Koren, 2011 Page Table ECE232: Virtual Memory 10 Adapted from Computer Organization and Design, Patterson&Hennessy; Kundu, UMass Koren, 2011

Handling Page Faults A page fault is like a cache miss Must find page in lower level of hierarchy If valid bit is zero, the Physical Page Number points to a page on disk When OS starts new process, it creates space on disk for all the pages of the process, sets all valid bits in page table to zero, and all Physical Page Numbers to point to disk called Demand Paging - pages of the process are loaded from disk only as needed ECE232: Virtual Memory 11 Adapted from Computer Organization and Design, Patterson&Hennessy; Kundu, UMass Koren, 2011 Comparing the 2 hierarchies Cache Block or Line Miss Block Size: 32-64B Placement: Direct Mapped, N-way Set Associative Replacement: LRU or Random Write Thru or Back How Managed: Hardware Virtual Memory Page Page Fault Page Size: 4K-16KB Fully Associative Least Recently Used (LRU) approximation Write Back Hardware + Software (Operating System) ECE232: Virtual Memory 12 Adapted from Computer Organization and Design, Patterson&Hennessy; Kundu, UMass Koren, 2011

Optimizing for Space Page Table too big! 4GB Virtual Address Space 4 KB page 1 million Page Table Entries 4 MB just for Page Table of single process! Variety of solutions to tradeoff Page Table size for slower performance Multilevel page table, Paging page tables, etc. (Take O/S Class to learn more) ECE232: Virtual Memory 13 Adapted from Computer Organization and Design, Patterson&Hennessy; Kundu, UMass Koren, 2011 How to Translate Fast? Problem: Virtual Memory requires two memory accesses! one to translate Virtual Address into Physical Address (page table lookup) - Page Table is in physical memory one to transfer the actual data (hopefully cache hit) Observation: since there is locality in pages of data, must be locality in virtual addresses of those pages! Why not create a cache of virtual to physical address translations to make translation fast? (smaller is faster) For historical reasons, such a page table cache is called a Translation Lookaside Buffer, or TLB ECE232: Virtual Memory 14 Adapted from Computer Organization and Design, Patterson&Hennessy; Kundu, UMass Koren, 2011

Translation-Lookaside Buffer (TLB) Physical Page 0 of page 1 Physical Page 1 Physical Page N-1 Main Memory H. Stone, High Performance Computer Architecture, AW 1993 ECE232: Virtual Memory 15 Adapted from Computer Organization and Design, Patterson&Hennessy; Kundu, UMass Koren, 2011 TLB and Page Table ECE232: Virtual Memory 16 Adapted from Computer Organization and Design, Patterson&Hennessy; Kundu, UMass Koren, 2011

Typical TLB Format Virtual Physical Valid Ref Dirty Access Page Number Page Number Rights tag data TLB just a cache of the page table mappings Dirty: since use write back, need to know whether or not to write page to disk when replaced Ref: Used to calculate LRU on replacement Reference bit - set when page accessed; OS periodically sorts and moves the referenced pages to the top & resets all Ref bits Must provide timer interrupt to update LRU bits TLB access time comparable to cache (much less than main memory access time) ECE232: Virtual Memory 17 Adapted from Computer Organization and Design, Patterson&Hennessy; Kundu, UMass Koren, 2011 Translation Look-Aside Buffers TLB is usually small, typically 32-512 entries Like any other cache, the TLB can be fully associative, set associative, or direct mapped data data Processor virtual addr. physical addr. hit hit miss TLB Cache Main miss Memory Page Table OS Fault Handler page fault/ protection violation Disk Memory ECE232: Virtual Memory 18 Adapted from Computer Organization and Design, Patterson&Hennessy; Kundu, UMass Koren, 2011

Steps in Memory Access - Example ECE232: Virtual Memory 19 Adapted from Computer Organization and Design, Patterson&Hennessy; Kundu, UMass Koren, 2011 Virtual Address 31 30 29 15 14 13 12 11 10 9 8 3 2 1 0 DECStation 3100/ MIPS R2000 Virtualpage number Page offset 20 12 TLB TLB hit 64 entries, fully associative Valid Dirty Tag Physical page number 20 Physical Address Physical page number Page offset Physical address tag Cache index 16 14 2 Byte offset Valid Tag Data Cache 16K entries, direct mapped 32 ECE232: Cache Virtual hit Memory 20 Adapted from Computer Organization and Design, Patterson&Hennessy; Data Kundu, UMass Koren, 2011

Real Stuff: Pentium Pro Memory Hierarchy Address Size: 32 bits (VA, PA) VM Page Size: 4 KB, 4 MB TLB organization: separate i,d TLBs (i-tlb: 32 entries, d-tlb: 64 entries) 4-way set associative LRU approximated hardware handles miss L1 Cache: 8 KB, separate i,d 4-way set associative LRU approximated 32 byte block write back L2 Cache: 256 or 512 KB ECE232: Virtual Memory 21 Adapted from Computer Organization and Design, Patterson&Hennessy; Kundu, UMass Koren, 2011 Intel Nehalim quad-core processor 13.5 19.6 mm die; 731 million transistors; Two 128-bit memory channels Each processor has: private 32-KB instruction and 32-KB data caches and a 512-KB L2 cache. The four cores share an 8-MB L3 cache. Each core also has a two-level TLB. ECE232: Virtual Memory 22 Adapted from Computer Organization and Design, Patterson&Hennessy; Kundu, UMass Koren, 2011

Comparing Intel s Nehalim to AMD s Opteron Virtual addr Physical addr Page size L1 TLB (per core) L2 TLB (per core) TLB misses Intel Nehalem 48 bits 44 bits 4KB, 2/4MB L1 I-TLB: 128 entries L1 D-TLB: 64 entries Both 4-way, LRU replacement Single L2 TLB: 512 entries 4-way, LRU replacement Handled in hardware AMD Opteron X4 48 bits 48 bits 4KB, 2/4MB L1 I-TLB: 48 entries L1 D-TLB: 48 entries Both fully associative, LRU replacement L2 I-TLB: 512 entries L2 D-TLB: 512 entries Both 4-way, round-robin LRU Handled in hardware ECE232: Virtual Memory 23 Adapted from Computer Organization and Design, Patterson&Hennessy; Kundu, UMass Koren, 2011 Further Comparison L1 caches (per core) L2 unified cache (per core) L3 unified cache (shared) Intel Nehalem L1 I-cache: 32KB, 64-byte blocks, 4-way, approx LRU, hit time n/a L1 D-cache: 32KB, 64- byte blocks, 8-way, approx LRU, writeback/allocate, hit time n/a 256KB, 64-byte blocks, 8- way, approx LRU, writeback/allocate, hit time n/a 8MB, 64-byte blocks, 16- way, write-back/allocate, hit time n/a AMD Opteron X4 L1 I-cache: 32KB, 64-byte blocks, 2-way, LRU, hit time 3 cycles L1 D-cache: 32KB, 64- byte blocks, 2-way, LRU, write-back/allocate, hit time 9 cycles 512KB, 64-byte blocks, 16-way, approx LRU, write-back/allocate, hit time n/a 2MB, 64-byte blocks, 32- way, write-back/allocate, hit time 32 cycles ECE232: Virtual Memory 24 Adapted from Computer Organization and Design, Patterson&Hennessy; Kundu, UMass Koren, 2011