Virtual Memory

Motivation:
- Each process would like to see its own, full address space.
- It is clearly impossible to provide full physical memory for all processes.
- Processes may define a large address space but use only a small part of it at any one time.
- Processes would like their memory to be protected from access and modification by other processes.
- The operating system needs to be protected from applications.

Inf3 Computer Architecture - 2016-2017

Basic idea:
- Each process has its own virtual address space, divided into fixed-size pages.
- Virtual pages that are in use get mapped to pages of physical memory, called page frames (virtual memory: pages; physical memory: frames).
- Virtual pages not recently used may be stored on disk.
- This extends the memory hierarchy out to the swap partition of a disk.
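The fixed-size-page idea above can be sketched in a few lines of Python. This is a minimal illustration, assuming 4 KB pages; the names `split_va`, `PAGE_SIZE` and `OFFSET_BITS` are illustrative, not from the slides.

```python
# Splitting a virtual address into a virtual page number (VPN) and a
# page offset, assuming 4 KB pages (12 offset bits).

PAGE_SIZE = 4096
OFFSET_BITS = 12

def split_va(va):
    """Return (vpn, offset) for a virtual address."""
    vpn = va >> OFFSET_BITS          # which virtual page
    offset = va & (PAGE_SIZE - 1)    # position within the page
    return vpn, offset

print(split_va(0x00003ABC))   # -> (3, 2748): page 3, offset 0xABC
```

Only the VPN is translated; the offset passes through unchanged, which is what makes page-granular mapping cheap.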

Virtual and Physical Memory
- Example with 4 KB page size.
- Process 1 has pages A, B, C and D; page B is held on disk.
- Process 2 has pages X, Y, Z; page Z is held on disk.
- Process 1 cannot access pages X, Y, Z; Process 2 cannot access pages A, B, C, D.
- The O/S can access any page (full privileges).
[Figure: the two per-process virtual address spaces (addresses 4K-36K), the physical memory holding frames for A, C, D, X and Y, and the swap disk holding B and Z, with arrows indicating page swapping.]

Sharing memory using virtual aliases (synonyms)
- Process 1 and Process 2 want to share a page of memory.
- Process 1 maps virtual page A to physical page P; Process 2 maps virtual page Z to physical page P.
- Permissions can vary between the sharing processes.
- Note: a process can also map the same physical page at multiple virtual addresses, i.e. aliased within one process.
[Figure: both virtual address spaces mapping pages A and Z to shared physical page P; physical page Q is mapped at multiple virtual addresses of a single process (aliasing within one process); the swap disk handles page swapping.]

Typical Virtual Memory Parameters¹

parameter     | L1 cache      | memory
size          | 4KB-64KB      | 128MB-1TB
block/page    | 16-128 bytes  | 4KB-4GB
hit time      | 1-3 cycles    | 100-300 cycles
miss penalty  | 8-300 cycles  | 1M-10M cycles
miss rate     | 0.1-10%       | 0.00001-0.001%

(H&P 5/e Fig. B.2)

- Modern OSs support several page sizes for flexibility. On Linux: normal pages are 4KB; huge pages are 2MB or 1GB.
- A virtual memory miss is called a page fault.

¹ Note: these parameters are due to a combination of physical memory organization and virtual memory implementation.

Virtual Memory Policies
- Block identification: finding the correct page frame.
  - Assigning tags to memory page frames and comparing tags is impractical.
  - Instead, the OS maintains a table that maps all virtual pages to physical page frames: the Page Table (PT).
  - The OS updates the PT with a new mapping whenever it allocates a page frame to a virtual page.
  - The PT is accessed on every memory request to translate virtual to physical addresses, which is inefficient. Solution: cache translations (in a TLB).
  - There is one PT per process and one for the OS.
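The OS-maintained map described above can be sketched as follows. This is a hedged illustration, not how any real OS structures its page tables; the names `PageTable`, `map_page`, `translate` and `PageFault` are invented for the example.

```python
# Per-process page table: a map from virtual page number (VPN) to
# physical frame number (PFN), updated by the OS on frame allocation.

class PageFault(Exception):
    pass

class PageTable:
    def __init__(self):
        self.entries = {}              # vpn -> pfn

    def map_page(self, vpn, pfn):
        """OS adds a mapping when it allocates a frame to a virtual page."""
        self.entries[vpn] = pfn

    def translate(self, vpn):
        """Done on every memory request -- the cost that motivates TLBs."""
        if vpn not in self.entries:
            raise PageFault(vpn)       # unmapped (or on-disk) page
        return self.entries[vpn]

pt = PageTable()
pt.map_page(5, 2)
print(pt.translate(5))   # -> 2
```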

Virtual Memory Policies
- Block placement: location of a page in memory.
  - More freedom means lower miss rates but higher hit and miss penalties.
  - Memory access time is already high and the memory miss penalty (i.e., disk access time) is huge, so miss rates must be minimized.
  - As a result, memory is fully associative: a virtual page can be located in any page frame, so there are no conflict misses.
  - It is important to reduce the time to find a page in memory (hit time).
  - To place new pages in memory, the OS maintains a list of free frames.
  - Block placement may be constrained by the use of translated virtual address bits when indexing the cache (see later).

Virtual Memory Policies
- Block replacement: choosing a page frame to reuse.
  - Minimize misses (page faults): LRU policy. True LRU is expensive; the CPU time of the algorithm must be minimized.
  - Simple solution: the OS sets a Used bit whenever a page is accessed in a time quantum. In the next quantum, any page with its Used bit clear is eligible for replacement. This requires two sets of Used bits.
  - Minimize write-backs to disk: give priority to clean pages.
- Write strategy: what happens when a page is written.
  - Write-through would mean writing the cache block back to disk whenever the page is updated in main memory; this is not practical due to latency and bandwidth considerations (~4 orders of magnitude latency gap between memory and disk).
  - Write-back is the norm in today's virtual memory systems. The OS tracks modified pages through the use of Dirty bits in page table entries.
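The Used-bit heuristic and the clean-page preference above can be combined into a tiny victim-selection sketch. This is an illustration only; the function name and the dict fields are invented, and real kernels use more elaborate structures.

```python
# Victim selection, assuming per-frame Used and Dirty bits recorded
# during the previous time quantum: frames with Used clear are eligible,
# and clean frames are preferred (no write-back needed).

def eligible_victims(frames):
    """frames: list of dicts with 'used' and 'dirty' booleans.
    Returns indices of replacement candidates, clean pages first."""
    candidates = [i for i, f in enumerate(frames) if not f["used"]]
    return sorted(candidates, key=lambda i: frames[i]["dirty"])

frames = [
    {"used": True,  "dirty": False},   # recently used: keep
    {"used": False, "dirty": True},    # eligible, but needs write-back
    {"used": False, "dirty": False},   # eligible and clean: best victim
]
print(eligible_victims(frames))   # -> [2, 1]
```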

Page Tables and Address Translation
- The virtual address (bits [31:12] Virtual Page Number, bits [11:0] Page Offset) is translated by indexing the Page Table, located via the Page Table Address Register.
- A Page Table Entry (PTE) tracks the access permissions for each page: Read, Write, Execute.
- A Disk bit indicates if the page is on disk, in which case the Physical Page Number field indicates the location within the swap file.
- The Dirty bit indicates if there were any writes to the page.
- PTE fields: Physical Page Number, Permissions (r-w-x), Dirty, Valid; 4B per PTE in this example.
- The physical address is formed from the Physical Page Number (bits [31:12]) and the unchanged Page Offset (bits [11:0]).
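The full translation step on this slide can be sketched as below, assuming 32-bit addresses, 4 KB pages, and PTEs holding Valid, Dirty and r/w/x permission fields. The dict-based PTE layout and the function name are illustrative, not from the slides.

```python
# Virtual-to-physical translation through a page table, including the
# permission check and the Dirty-bit update on writes.

OFFSET_BITS = 12

def translate(page_table, va, access):
    """page_table: dict vpn -> PTE dict; access: 'r', 'w' or 'x'."""
    vpn = va >> OFFSET_BITS
    offset = va & ((1 << OFFSET_BITS) - 1)
    pte = page_table.get(vpn)
    if pte is None or not pte["valid"]:
        raise RuntimeError("page fault: page unmapped or on disk")
    if access not in pte["perms"]:
        raise RuntimeError("protection fault")
    if access == "w":
        pte["dirty"] = True            # OS tracks modified pages
    return (pte["ppn"] << OFFSET_BITS) | offset

pt = {0x00003: {"valid": True, "dirty": False, "perms": "rw", "ppn": 0x00007}}
print(hex(translate(pt, 0x00003ABC, "r")))   # -> 0x7abc
```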

Making Page Tables space-efficient
- The number of entries in the table is the number of virtual pages: many!
  - E.g., with 4KB pages: 2^20 = 1M entries for a 32-bit address space, needing 4MB per process; 2^52 entries for a 64-bit address space, i.e. petabytes per process.
- Solution:
  - Exploit the observation that the virtual address space of each process is sparse: only a fraction of all virtual addresses is actually used.
  - Hash virtual addresses to avoid maintaining a map from each virtual page (many) to each physical frame (few).
  - The resulting structure is called the inverted page table.
- Other (complementary) solutions:
  - Store PTs in the virtual memory of the OS, and swap out recently unused portions.
  - Use large pages.
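The sizes quoted above follow from simple arithmetic, reproduced here (4 KB pages, 4 B PTEs, flat one-level table):

```python
# Flat page-table size: one PTE per virtual page.

page_bits = 12                         # 4 KB pages
pte_bytes = 4

entries_32 = 2 ** (32 - page_bits)     # 2^20 = 1M entries
print(entries_32 * pte_bytes)          # -> 4194304 bytes = 4 MB per process

entries_64 = 2 ** (64 - page_bits)     # 2^52 entries
print(entries_64 * pte_bytes // 2**50) # -> 16 (petabytes per process)
```

This is exactly why flat tables are untenable at 64 bits, motivating inverted (hashed) and multi-level page tables.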

Fast address translation: the TLB
- Typically a small, fully-associative cache of Page Table Entries (PTEs).
- The tag is given by the VPN for that PTE; the PPN is taken from the PTE.
- A Valid bit is required; the D (dirty) bit indicates whether the page has been modified.
- R, W, X bits indicate Read, Write and Execute permission; permissions are checked on every memory access.
- The physical address is formed from the PPN and the Page Offset.
- TLB exceptions: TLB miss (no matching entry) and privilege violation.
- Often there are separate TLBs for instruction and data references.
[Figure: the Virtual Page Number (bits [31:12]) is compared against all TLB tags in parallel; on a hit, the Physical Page Number of the matching entry is concatenated with the Page Offset (bits [11:0]) to form the physical address.]
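A behavioral sketch of the small fully-associative TLB described above: every entry's tag (the VPN) is conceptually compared against the incoming VPN, and on a hit the PPN comes from the matching entry. The FIFO eviction policy here is an illustrative choice, not from the slides.

```python
from collections import OrderedDict

class TLB:
    """Tiny fully-associative VPN -> PPN cache of translations."""
    def __init__(self, capacity=8):
        self.capacity = capacity
        self.entries = OrderedDict()         # models parallel tag compare

    def lookup(self, vpn):
        return self.entries.get(vpn)         # None models a TLB-miss exception

    def fill(self, vpn, ppn):
        """Refill after walking the page table on a miss."""
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False) # evict oldest entry (FIFO)
        self.entries[vpn] = ppn

tlb = TLB()
tlb.fill(3, 7)
print(tlb.lookup(3), tlb.lookup(4))   # -> 7 None
```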

How to address a cache in a virtual-memory system
- Option 1: physically-addressed caches perform address translation before the cache access.
  - Hit time is increased to accommodate translation.
[Figure: the CPU's virtual address passes through address translation to produce a physical address, which then accesses the L1 cache (tag, data) and, on a miss, main memory.]

How to address a cache in a virtual-memory system
- Option 2: virtually-addressed caches perform address translation after the cache access, and only on a miss.
  - Hit time does not include translation.
  - Aliases become a problem.
[Figure: the CPU's virtual address accesses the L1 cache directly; only on a miss is it translated to a physical address to access main memory.]

Problems with virtually addressed caches
- Virtually tagged data cache problems:
  - A program may use different virtual addresses pointing to the same physical address (aliases, or synonyms).
    - Two copies could exist in the same data cache.
    - Writing to copy 1 would not be reflected in copy 2; reading copy 2 would get stale data.
    - Thus, the cache does not provide a coherent view of memory.
  - Also, the cache must be able to distinguish across different processes: the same VA mapping to different PAs (homonyms).

Solutions for handling homonyms and synonyms
- Flush the cache on a context switch, or add a process ID to each tag.
  - Solves the homonym problem, but not the synonym problem.
- Use physically addressed caches.
  - Solves both the homonym and synonym problems: synonyms all have the same physical address, so only one copy exists in the cache.
  - Implication: address translation must be done before accessing the cache.
- Use physically addressed tags?
  - Addresses must be translated before the cache tag check.
  - The cache may still be indexed using non-translated low-order address bits under certain circumstances.

VI-PT: translating in parallel with the L1-$ access
- Access the TLB and the L1-$ in parallel.
- Requires that the L1-$ index be obtained from the non-translated bits of the virtual address.
- This constraint on the number of bits available for the index limits the size of the cache.
- IMPORTANT: if the cache index extends beyond bit 11, into the translated part of the address, then translation must take place before the cache can be indexed.
[Figure: a 4KB direct-mapped cache with 32-byte lines; the index (bits [11:5]) and offset (bits [4:0]) come from the untranslated Page Offset, so the TLB lookup on the Virtual Page Number (bits [31:12]) proceeds in parallel with cache indexing, and the tag comparison uses the translated Physical Page Number.]
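The bit fields of the example cache on this slide can be checked with a short sketch, assuming the 4 KB direct-mapped cache with 32-byte lines shown; the function name is illustrative.

```python
# Address fields for a 4 KB direct-mapped cache, 32-byte lines:
# offset bits [4:0], index bits [11:5], tag bits [31:12].

LINE = 32
SETS = 4096 // LINE          # 128 sets -> 7 index bits

def cache_fields(addr):
    offset = addr & (LINE - 1)
    index = (addr >> 5) & (SETS - 1)
    tag = addr >> 12
    return tag, index, offset

# Index and offset use bits [11:0] only, i.e. the untranslated page
# offset, so the cache can be indexed in parallel with the TLB lookup.
print(cache_fields(0x00003ABC))   # -> (3, 85, 28)
```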

Coping with large VI-PT caches
- Multi-way caches: multiple blocks in the same set.
  - E.g., Intel Haswell: 32KB 8-way cache with 4KB pages; high associativity affords large capacity.
- Check other potential sets for aliases on a miss.
  - E.g., AMD Opteron: 64KB 2-way cache with 4KB pages; on a miss, 7 additional cycles to check for aliases in other sets.
- Larger page size: more bits available for the index.
  - Not a universal solution, since the normal page size of most OSs is 4-8KB.
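The two examples above follow from one piece of arithmetic: the bytes covered per way (capacity / associativity) must not exceed the page size, or the index spills into translated bits. A sketch (function name is illustrative):

```python
# Does the index of a set-associative cache stay within the
# untranslated page-offset bits? (4 KB pages assumed.)

PAGE = 4096

def index_within_page(capacity, ways):
    return capacity // ways <= PAGE

print(index_within_page(32 * 1024, 8))   # -> True: Haswell-style 32KB 8-way
print(index_within_page(64 * 1024, 2))   # -> False: Opteron-style 64KB 2-way
```

The `False` case is exactly why the Opteron-style design must spend extra cycles on a miss checking the other possible sets for aliases.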

Coping with large VI-PT caches (cont'd)
- Rely on the page allocator in the O/S to allocate pages such that the translation of the index bits is always an identity relation.
- Hence, if virtual address V translates to physical address P, the page allocator must guarantee that V[12] == P[12].
- This approach is referred to as page coloring.
- Any translated bit used to index the cache must be identical in both the virtual and physical addresses.
[Figure: the cache addressing fields (tag bits [31:13], index [12:5], offset [4:0]) laid over the virtual addressing fields (Virtual Page Number [31:12], page offset [11:0]); index bit 12 falls in the translated part of the address.]
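The allocator-side invariant above can be expressed as a one-line check. A hedged sketch: the function name and example addresses are invented, and only bit 12 is colored as on this slide.

```python
# Page coloring check: every translated bit used by the cache index
# must match between the virtual address (va) and physical address (pa).

def color_ok(va, pa, colored_bits=(12,)):
    return all(((va >> b) & 1) == ((pa >> b) & 1) for b in colored_bits)

print(color_ok(0x00001ABC, 0x00071ABC))   # -> True: bit 12 matches
print(color_ok(0x00001ABC, 0x00072ABC))   # -> False: bit 12 differs
```

An allocator enforcing this simply restricts which free frames a given virtual page may be placed in, trading some placement freedom for parallel VI-PT indexing.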

Summary: how to address a cache
- PI-PT: physically indexed, physically tagged.
  - Translation first, then cache access.
  - Con: translation occurs in sequence with the L1-$ access, giving high latency.
- VI-VT: virtually indexed, virtually tagged.
  - The L1-$ is indexed with the virtual address, and the tag contains the virtual address.
  - Con: cannot distinguish synonyms/homonyms in the cache.
  - Pro: the TLB lookup is only performed on an L1-$ miss.
- VI-PT: virtually indexed, physically tagged.
  - The L1-$ is indexed with the virtual address, or often just its un-translated bits.
  - Translation must take place before the tag can be checked.
  - Con: translation must take place on every L1-$ access.
  - Pro: no synonym/homonym problems in the cache.
- PI-VT: physically indexed, virtually tagged.
  - Not interesting.