Lecture 22: Virtual Memory: Survey of Modern Systems

Similar documents
Lecture 19: Survey of Modern VMs. Housekeeping

Lecture 19: Survey of Modern VMs + a Decomposition of Meltdown. James C. Hoe Department of ECE Carnegie Mellon University

Learning Outcomes. An understanding of page-based virtual memory in depth. Including the R3000 s support for virtual memory.

Learning Outcomes. An understanding of page-based virtual memory in depth. Including the R3000 s support for virtual memory.

EECS 470. Lecture 16 Virtual Memory. Fall 2018 Jon Beaumont

Lecture 17. Fall 2007 Prof. Thomas Wenisch. row enable. _bitline. Lecture 18 Slide 1 EECS 470

Lecture 20: Virtual Memory, Protection and Paging. Multi-Level Caches

CS162 - Operating Systems and Systems Programming. Address Translation => Paging"

Learning Outcomes. Page Tables Revisited. Two-level Translation. R3000 TLB Refill. Virtual Linear Array page table

Address Translation. Jinkyu Jeong Computer Systems Laboratory Sungkyunkwan University

A Typical Memory Hierarchy

A Typical Memory Hierarchy

CS252 Spring 2017 Graduate Computer Architecture. Lecture 17: Virtual Memory and Caches

HY225 Lecture 12: DRAM and Virtual Memory

CS 152 Computer Architecture and Engineering. Lecture 11 - Virtual Memory and Caches

CS5460: Operating Systems Lecture 14: Memory Management (Chapter 8)

VIRTUAL MEMORY II. Jo, Heeseung

Multi-level Translation. CS 537 Lecture 9 Paging. Example two-level page table. Multi-level Translation Analysis

Virtual Memory. Daniel Sanchez Computer Science & Artificial Intelligence Lab M.I.T. November 15, MIT Fall 2018 L20-1

Lecture 17: Address Translation. James C. Hoe Department of ECE Carnegie Mellon University

Virtual Memory 2. Today. Handling bigger address spaces Speeding translation

CS 152 Computer Architecture and Engineering. Lecture 9 - Virtual Memory

CS 153 Design of Operating Systems Winter 2016

Recap: Set Associative Cache. N-way set associative: N entries for each Cache Index N direct mapped caches operates in parallel

Virtual Memory 2. To do. q Handling bigger address spaces q Speeding translation

Virtual Memory. Today. Handling bigger address spaces Speeding translation

Computer Architecture Lecture 13: Virtual Memory II

CPS 104 Computer Organization and Programming Lecture 20: Virtual Memory

Fast access ===> use map to find object. HW == SW ===> map is in HW or SW or combo. Extend range ===> longer, hierarchical names

Virtual Memory: From Address Translation to Demand Paging

Virtual Memory. Daniel Sanchez Computer Science & Artificial Intelligence Lab M.I.T. April 12, 2018 L16-1

Virtual memory - Paging

The process. one problem

This training session will cover the Translation Look aside Buffer or TLB

CS 318 Principles of Operating Systems

CPS104 Computer Organization and Programming Lecture 16: Virtual Memory. Robert Wagner

CSE 120 Principles of Operating Systems Spring 2017

CSE 451: Operating Systems Winter Page Table Management, TLBs and Other Pragmatics. Gary Kimura

Chapter 8 Memory Management

Virtual Memory Virtual memory first used to relive programmers from the burden of managing overlays.

18-447: Computer Architecture Lecture 18: Virtual Memory III. Yoongu Kim Carnegie Mellon University Spring 2013, 3/1

CSE 120 Principles of Operating Systems

Lecture 9 - Virtual Memory

Virtual Memory. CS61, Lecture 15. Prof. Stephen Chong October 20, 2011

Computer Architecture and Engineering. CS152 Quiz #3. March 18th, Professor Krste Asanovic. Name:

CS 152 Computer Architecture and Engineering. Lecture 9 - Address Translation

ECE232: Hardware Organization and Design

Fast access ===> use map to find object. HW == SW ===> map is in HW or SW or combo. Extend range ===> longer, hierarchical names

Motivation, Concept, Paging January Winter Term 2008/09 Gerd Liefländer Universität Karlsruhe (TH), System Architecture Group

Virtual Memory. Virtual Memory

Opera&ng Systems ECE344

CS 152 Computer Architecture and Engineering. Lecture 9 - Address Translation

Virtual Memory Review. Page faults. Paging system summary (so far)

CS 537 Lecture 6 Fast Translation - TLBs

Virtual Memory. Study Chapters something I understand. Finally! A lecture on PAGE FAULTS! doing NAND gates. I wish we were still

Virtual Memory: From Address Translation to Demand Paging

ECE 571 Advanced Microprocessor-Based Design Lecture 12

Computer Science 146. Computer Architecture

5DV118 Computer Organization and Architecture Umeå University Department of Computing Science Stephen J. Hegner

ECE 571 Advanced Microprocessor-Based Design Lecture 13

Operating Systems. Paging... Memory Management 2 Overview. Lecture 6 Memory management 2. Paging (contd.)

Memory Management. An expensive way to run multiple processes: Swapping. CPSC 410/611 : Operating Systems. Memory Management: Paging / Segmentation 1

CS 152 Computer Architecture and Engineering. Lecture 8 - Address Translation

ADDRESS TRANSLATION AND TLB

ECE 598 Advanced Operating Systems Lecture 14

Virtual Memory. Patterson & Hennessey Chapter 5 ELEC 5200/6200 1

CS 61C: Great Ideas in Computer Architecture. Lecture 23: Virtual Memory. Bernhard Boser & Randy Katz

Virtual Memory 2. q Handling bigger address spaces q Speeding translation

CS 61C: Great Ideas in Computer Architecture Virtual Memory. Instructors: John Wawrzynek & Vladimir Stojanovic

ADDRESS TRANSLATION AND TLB

Virtual Memory. Samira Khan Apr 27, 2017

CS 152 Computer Architecture and Engineering. Lecture 8 - Address Translation

CS 333 Introduction to Operating Systems. Class 11 Virtual Memory (1) Jonathan Walpole Computer Science Portland State University

Processes and Virtual Memory Concepts

Virtual Memory. CS 351: Systems Programming Michael Saelee

Memory Hierarchy Requirements. Three Advantages of Virtual Memory

Topics: Memory Management (SGG, Chapter 08) 8.1, 8.2, 8.3, 8.5, 8.6 CS 3733 Operating Systems

SE-292 High Performance Computing. Memory Hierarchy. R. Govindarajan

Virtual Memory. Motivation:

EITF20: Computer Architecture Part 5.1.1: Virtual Memory

Lecture 9 Virtual Memory

CS162 Operating Systems and Systems Programming Lecture 14. Caching and Demand Paging

Memory Hierarchy. Mehran Rezaei

Reducing Hit Times. Critical Influence on cycle-time or CPI. small is always faster and can be put on chip

CSE 351. Virtual Memory

Modern Virtual Memory Systems. Modern Virtual Memory Systems

Topic 18 (updated): Virtual Memory

Virtual Memory. Computer Systems Principles

Multi-Process Systems: Memory (2) Memory & paging structures: free frames. Memory & paging structures. Physical memory

Virtual Memory. Adapted from instructor s supplementary material from Computer. Patterson & Hennessy, 2008, MK]

Lecture 11: Interrupt and Exception. James C. Hoe Department of ECE Carnegie Mellon University

Background. IBM sold expensive mainframes to large organizations. Monitor sits between one or more OSes and HW

Virtual Memory. Motivations for VM Address translation Accelerating translation with TLBs

CS 61C: Great Ideas in Computer Architecture. Lecture 23: Virtual Memory

MEMORY MANAGEMENT UNITS

Virtual Memory 2. Hakim Weatherspoon CS 3410, Spring 2012 Computer Science Cornell University. P & H Chapter 5.4

CSE 560 Computer Systems Architecture

Virtual Memory. Today. Segmentation Paging A good, if common example

6.004 Tutorial Problems L20 Virtual Memory

Agenda. CS 61C: Great Ideas in Computer Architecture. Virtual Memory II. Goals of Virtual Memory. Memory Hierarchy Requirements

Transcription:

18-447 Lecture 22: Virtual Memory: Survey of Modern Systems James C. Hoe Dept of ECE, CMU April 15, 2009 S 09 L22-1 Announcements: Spring Carnival!!! Final Thursday, May 7 5:30-8:30p.m Room TBA Two Guest Lectures next Mon and Wed (not on final) L23: multicore cache-coherence by Nikos Hardavellas L24: advanced multicore design by Prof. Onur Mutlu Handouts: Assigned Reading "Virtual memory in contemporary microprocessors." B. L Jacob and T. N. Mudge. IEEE Micro, July/August 1998 S 09 L22-2 Case Studies B. Jacob and T. Mudge, Virtual Memory in Contemporary Processors, IEEE Micro, vol. 18, no. 4, 1998.

SPARC V9 64-bit Virtual Address an implementation can choose not to map the high-order bits (must be sign-extended from the highest mapped bit) e.g. UltraSPARC 1 maps only the lower 44 bits physical address space size set by implementation 64 entry fully associative I-TLB and D-TLB S 09 L22-3 g context 13 VA<63:13> 51 Tag 64 software defined invert endianess no fault only page size 8k~4M valid v size nfo IE Soft diag PA<40:13> Soft L CP CV e p w g Data 64 hw diagnosis bits PPN global writeable priviledged side-effect (no speculation) cacheable in VA-indexed cacheable in PA-indexed locked from replacement software defined TLB Miss Handling SPARC V8 (32-bit) defines a 3-level hierarchical page table for HW MMU page-table walk S 09 L22-4 context context table descriptors +VA [31:24] L1 Table: L2 Table: +VA 256 [23:18] 64 +VA [17:12] descriptors (1024-byte) descriptors (256-byte) L3 Table: 64 PTEs (256-byte) SPARC V9 (64-bit) defines Translation Storage Buffer a software managed, direct-mapped cache of PTEs (think hashed page table) HW assisted address generation on a TLB miss, e.g., for 8-k pages {TSBbase 63:21, Logic(TSBbase 20:13,VA 32:22,size,split?),VA 21:13,0000} TLB miss handler (SW) search TSB. If TSB misses, a slower TSB-miss handler takes over

IBM PowerPC (32-bit) segments 256MB regions S 09 L22-5 seg# 4 seg offset 16 page offset 12 EA 32 16-entry segment table Protection seg ID 24 seg offset 16 page offset 12 VA 52 128 2-way ITLB and DTLB VM+paging PPN 20 page offset 12 PA 32 64-bit PowerPC = 64-bit EA -> 80-bit VA 64-bit PA. How many segments in EA? IBM PowerPC Hashed Page Table S 09 L22-6 VPN 40 Hash Function Hashed Page Table table base + 8 PTE s per group recommend at least N PTEG s for a system with 2N physical pages HW table walk VPN hashes into a PTE group (PTEG) of 8 8 PTEs searched sequentially for tag match if not found in first PTE group search a second PTE group if not found in the 2 nd PTE group, trap to software handler Hashed table structure also used for EA to VA mapping in 64-bit implementations

MIPS R10K 64-bit virtual address top 2 bits set kernel/supervisor/user mode additional bits set cache and translation behavior bit 61-40 not translate at all (holes in the VA??) 8-bit ASID (address space ID) distinguishes between processes 40-bit physical address Translation - 64 -bit VA and 8-bit ASID 40-bit PA 1 GB mapped (ksseg) 0.5 GB unmapped uncached 0.5 GB unmapped cached Bottom 2 GB Mapped (normal) S 09 L22-7 simplified example from 32-bit VA in R2000/3000 MIPS TLB S 09 L22-8 64-entry fully associative unified TLB paired: each entry maps 2 consecutive VPNs to 2 different PPNs software managed 7-instruction page table walk in the best case TLB Write Random: chooses a random entry for TLB replacement OS can exclude some number of TLB entry (low range) to be excluded from the random selection, to hold translations that cannot miss or should not miss TLB entry VPN 20 ASID 6 0 6 R2000 PPN 20 ndvg 0 8 N: noncacheable D: dirty (actually a write-enable bit) V: valid G: global entry, i.e., ignore ASID matching

S 09 L22-9 MIPS Bottom-Up Hierarchical Table Page table organization is not part of the ISA Reference design optimized for software TLB miss handling Bottom-Up Table start with a basic 2-level hierarchical table (32-bit case) map all of the L2 tables (empty or not) linearly in the mapped kseg VPN is the index into this linear table in VA A linear translation table that scales with VA size, Is this okay? Bottom-Up Table Walk S 09 L22-10 PTEBase mem load VPN VPN 0s PO VA on TLB Miss, trap which address space? VA of PTE (generated automatically by HW after TLB miss) PPN status PTE loaded from mem Can this load miss in the TLB? What happens if it misses? Notice translation also eats up TLB entries!

UserTLB Miss Handling S 09 L22-11 mfc0 k0,tlbcxt mfc0 k1,epc lw k0,0(k0) mtc0 k0,entry_lo tlbwr j k1 rfe # move the contents of TLB # context register into k0 # move PC of faulting load # instruction into k1 # load thru address that was # intlb context register # move the loaded value # into the EntryLo register # write entry into the TLB # at a random slot number # jump to PC of faulting # load instruction to retry # RESTORE FROM # EXCEPTION HP PA-RISC: PID and AID S 09 L22-12 2-level translation: 64-bit EA 96-bit VA (global) 64-bit PA Variable sized segmented EA to VA A different twist on protection everyone else: limit what can be named by a process In PowerPC, OS controls what VA can be reached by a process by controlling what s in the segment registers HP-RISC: rights-based access control User controls segment registers, i.e., user can generate any VA it wants Each virtual page has an access ID (not related to ownership by a processes) assigned by the OS Each process has 8 active protection IDs in special HW registers controlled by the OS A process can only access a page if it has the key (PID) that fits the lock (AID)

Intel x86 S 09 L22-13 Intended for two-level address translation with segments for naming and protection (similar to PPC) user private 48-bit effective address 16-bit segment number (implicit) + 32-bit segment offset each addr register has a corresponding segment register a global 32-bit virtual address 20-bit page number + 12-bit page offset an implementation defined paged physical address space What is strange about this? 32-bit VA too small to be shared by multiple processes No major OS today uses segmentation features code, data, stack segments always mapped to 0~(2 32-1) time multiplex VA between processes for naming and protection set MMU to use a different table on context switch must flush TLB on context switch because TLB entries are not defined to have ASIDs