Virtual Memory: Systems

Similar documents
Virtual Memory: Systems

Foundations of Computer Systems

Virtual Memory. Spring Instructors: Aykut and Erkut Erdem

Memory System Case Studies Oct. 13, 2008

Pentium/Linux Memory System March 17, 2005

Virtual Memory III /18-243: Introduc3on to Computer Systems 17 th Lecture, 23 March Instructors: Bill Nace and Gregory Kesden

P6/Linux Memory System Nov 11, 2009"

P6 memory system P6/Linux Memory System October 31, Overview of P6 address translation. Review of abbreviations. Topics. Symbols: ...

Intel P The course that gives CMU its Zip! P6/Linux Memory System November 1, P6 memory system. Review of abbreviations

Intel P6 (Bob Colwell s Chip, CMU Alumni) The course that gives CMU its Zip! Memory System Case Studies November 7, 2007.

Operating Systems: Internals and Design. Chapter 8. Seventh Edition William Stallings

Virtual Memory. Alan L. Cox Some slides adapted from CMU slides

Page Which had internal designation P5

This lecture. Virtual Memory. Virtual memory (VM) CS Instructor: Sanjeev Se(a

Foundations of Computer Systems

Virtual Memory. Samira Khan Apr 27, 2017

Virtual Memory: Concepts

Virtual Memory: Concepts

Systems Programming and Computer Architecture ( ) Timothy Roscoe

198:231 Intro to Computer Organization. 198:231 Introduction to Computer Organization Lecture 14

Carnegie Mellon. Bryant and O Hallaron, Computer Systems: A Programmer s Perspective, Third Edition

Computer Systems. Virtual Memory. Han, Hwansoo

Carnegie Mellon. 16 th Lecture, Mar. 20, Instructors: Todd C. Mowry & Anthony Rowe

Virtual Memory II. CSE 351 Autumn Instructor: Justin Hsia

14 May 2012 Virtual Memory. Definition: A process is an instance of a running program

Processes and Virtual Memory Concepts

Virtual Memory II. CSE 351 Autumn Instructor: Justin Hsia

CSE 153 Design of Operating Systems

Virtual Memory: Concepts

Virtual Memory II CSE 351 Spring

University of Washington Virtual memory (VM)

CS 153 Design of Operating Systems

CS24: INTRODUCTION TO COMPUTING SYSTEMS. Spring 2018 Lecture 24

Virtual Memory Nov 9, 2009"

virtual memory. March 23, Levels in Memory Hierarchy. DRAM vs. SRAM as a Cache. Page 1. Motivation #1: DRAM a Cache for Disk

Virtual Memory. CS61, Lecture 15. Prof. Stephen Chong October 20, 2011

Lecture 19: Virtual Memory: Concepts

CSE 351. Virtual Memory

Roadmap. Java: Assembly language: OS: Machine code: Computer system:

Virtual Memory Oct. 29, 2002

A Few Problems with Physical Addressing. Virtual Memory Process Abstraction, Part 2: Private Address Space

CS 153 Design of Operating Systems

Virtual Memory. Motivations for VM Address translation Accelerating translation with TLBs

Opera&ng Systems ECE344

Understanding the Design of Virtual Memory. Zhiqiang Lin

Motivations for Virtual Memory Virtual Memory Oct. 29, Why VM Works? Motivation #1: DRAM a Cache for Disk

CISC 360. Virtual Memory Dec. 4, 2008

Virtual Memory. Physical Addressing. Problem 2: Capacity. Problem 1: Memory Management 11/20/15

CS24: INTRODUCTION TO COMPUTING SYSTEMS. Spring 2015 Lecture 25

CS24: INTRODUCTION TO COMPUTING SYSTEMS. Spring 2015 Lecture 23

CS24: INTRODUCTION TO COMPUTING SYSTEMS. Spring 2018 Lecture 23

Random-Access Memory (RAM) Systemprogrammering 2007 Föreläsning 4 Virtual Memory. Locality. The CPU-Memory Gap. Topics

Random-Access Memory (RAM) Systemprogrammering 2009 Föreläsning 4 Virtual Memory. Locality. The CPU-Memory Gap. Topics! The memory hierarchy

Main Points. Address Transla+on Concept. Flexible Address Transla+on. Efficient Address Transla+on

Virtual Memory. Computer Systems Principles

VM as a cache for disk

PROCESS VIRTUAL MEMORY PART 2. CS124 Operating Systems Winter , Lecture 19

CIS Operating Systems Memory Management Cache and Demand Paging. Professor Qiang Zeng Spring 2018

Memory Management. Fundamentally two related, but distinct, issues. Management of logical address space resource

Lecture 8: Memory Management

virtual memory Page 1 CSE 361S Disk Disk

CIS Operating Systems Memory Management Cache. Professor Qiang Zeng Fall 2017

@2010 Badri Computer Architecture Assembly II. Virtual Memory. Topics (Chapter 9) Motivations for VM Address translation

CIS Operating Systems Memory Management Cache. Professor Qiang Zeng Fall 2015

seven Virtual Memory Introduction

CSE 120 Principles of Operating Systems

Applications of. Virtual Memory in. OS Design

Recall: Address Space Map. 13: Memory Management. Let s be reasonable. Processes Address Space. Send it to disk. Freeing up System Memory

Virtual Memory I. CSE 351 Winter Instructor: Mark Wyse

CSE 560 Computer Systems Architecture

Another View of the Memory Hierarchy. Lecture #25 Virtual Memory I Memory Hierarchy Requirements. Memory Hierarchy Requirements

CSCI 4061: Virtual Memory

Virtual Memory. CS 351: Systems Programming Michael Saelee

Compsci 104. Duke University. Input/Output. Instructors: Alvin R. Lebeck. Some Slides provided by Randy Bryant and Dave O Hallaron and G.

Memory Hierarchy Requirements. Three Advantages of Virtual Memory

Virtual Memory. Daniel Sanchez Computer Science & Artificial Intelligence Lab M.I.T. November 15, MIT Fall 2018 L20-1

CSE 120 Principles of Operating Systems Spring 2017

Virtual Memory. Daniel Sanchez Computer Science & Artificial Intelligence Lab M.I.T. April 12, 2018 L16-1

Memory Management Day 2. SWE3015 Sung- hun Kim

Lecture 21: Virtual Memory. Spring 2018 Jason Tang

Virtual Memory. Patterson & Hennessey Chapter 5 ELEC 5200/6200 1

University*of*Washington*

Virtual Memory I. CSE 351 Spring Instructor: Ruth Anderson

CS 318 Principles of Operating Systems

Memory Management. Disclaimer: some slides are adopted from book authors slides with permission 1

How to create a process? What does process look like?

Virtual Memory. CS 3410 Computer System Organization & Programming

CS 61C: Great Ideas in Computer Architecture. Virtual Memory III. Instructor: Dan Garcia

CSE 451: Operating Systems Winter Page Table Management, TLBs and Other Pragmatics. Gary Kimura

Memory Mapping. Sarah Diesburg COP5641

Memory Management. Disclaimer: some slides are adopted from book authors slides with permission 1

CS 61C: Great Ideas in Computer Architecture. Lecture 23: Virtual Memory. Bernhard Boser & Randy Katz

Address Translation. Jinkyu Jeong Computer Systems Laboratory Sungkyunkwan University

Learning Outcomes. An understanding of page-based virtual memory in depth. Including the R3000 s support for virtual memory.

Learning Outcomes. An understanding of page-based virtual memory in depth. Including the R3000 s support for virtual memory.

Chapter 6: Demand Paging

Computer Structure. X86 Virtual Memory and TLB

VIRTUAL MEMORY II. Jo, Heeseung

Problem 9. VM address translation. (9 points): The following problem concerns the way virtual addresses are translated into physical addresses.

CIS Operating Systems Memory Management Address Translation for Paging. Professor Qiang Zeng Spring 2018

Transcription:

Virtual Memory: Systems 5-23 / 8-23: Introduc2on to Computer Systems 7 th Lecture, Mar. 22, 22 Instructors: Todd C. Mowry & Anthony Rowe

Today Virtual memory ques7ons and answers Simple memory system example Bonus: Case study: Core i7/linux memory system Bonus: Memory mapping 2

Virtual memory reminder/review Programmer s view of virtual memory Each process has its own private linear address space Cannot be corrupted by other processes System view of virtual memory Uses memory efficiently by caching virtual memory pages Efficient only because of locality Simplifies memory management and programming Simplifies protec2on by providing a convenient interposi2oning point to check permissions 3

Recall: Address Transla7on With a Page Table Page table base register (PTBR) n- Virtual address Virtual page number (VPN) p p- Virtual page offset (VPO) Page table address for process Page table Valid Physical page number (PPN) Valid bit = : page not in memory (page fault) m- Physical page number (PPN) Physical address p p- Physical page offset (PPO) 4

Recall: Address Transla7on: Page Hit CPU Chip CPU VA MMU 2 PTEA PTE 3 PA Cache/ Memory 4 Data 5 ) Processor sends virtual address to MMU 2-3) MMU fetches PTE from page table in memory 4) MMU sends physical address to cache/memory 5) Cache/memory sends data word to processor 5

Ques7on # Are the PTEs cached like other memory accesses? Yes (and no: see next ques7on) 6

Page tables in memory, like other data PTE CPU Chip CPU VA MMU PTEA PA PTEA hit PTEA miss PA miss PTE PTEA PA Memory PA hit Data Data L cache VA: virtual address, PA: physical address, PTE: page table entry, PTEA = PTE address 7

Ques7on #2 Isn t it slow to have to go to memory twice every 7me? Yes, it would be so, real MMUs don t 8

Speeding up Transla7on with a TLB Page table entries (PTEs) are cached in L like any other memory word PTEs may be evicted by other data references PTE hit s2ll requires a small L delay Solu7on: TranslaAon Lookaside Buffer (TLB) Small, dedicated, super- fast hardware cache of PTEs in MMU Contains complete page table entries for small number of pages 9

TLB Hit CPU Chip TLB 2 PTE VPN 3 CPU VA MMU PA 4 Cache/ Memory Data 5 A TLB hit eliminates a memory access

TLB Miss CPU Chip TLB 4 2 PTE VPN 3 CPU VA MMU PTEA PA Cache/ Memory 5 Data 6 A TLB miss incurs an addi7onal memory access (the PTE) Fortunately, TLB misses are rare. Why?

Ques7on #3 Isn t the page table huge? How can it be stored in RAM? Yes, it would be so, real page tables aren t simple arrays 2

Mul7- Level Page Tables Suppose: 4KB (2 2 ) page size, 64- bit address space, 8- byte PTE Level 2 Tables Problem: Would need a 32, TB page table! 2 64 * 2-2 * 2 3 = 2 55 bytes Level Table... Common solu7on: Mul2- level page tables Example: 2- level page table Level table: each PTE points to a page table (always memory resident) Level 2 table: each PTE points to a page (paged in and out like any other data)... 3

A Two- Level Page Table Hierarchy Level page table Level 2 page tables Virtual memory VP PTE PTE PTE 2 (null) PTE 3 (null) PTE 4 (null) PTE 5 (null) PTE 6 (null) PTE 7 (null) PTE... PTE 23 PTE... PTE 23... VP 23 VP 24... VP 247 Gap 2K allocated VM pages for code and data 6K unallocated VM pages PTE 8 (K - 9) null PTEs 23 null PTEs 32 bit addresses, 4KB pages, 4- byte PTEs PTE 23 23 unallocated pages VP 925... 23 unallocated pages allocated VM page for the stack 4

Transla7ng with a k- level Page Table n-! VIRTUAL ADDRESS! p-!! VPN! VPN 2!...! VPN k! VPO! Level! page table! Level 2! page table!...!...! Level k! page table! PPN! m-! p-!! PPN! PPO! PHYSICAL ADDRESS! 5

Ques7on #4 Shouldn t fork() be really slow, since the child needs a copy of the parent s address space? Yes, it would be so, fork() doesn t really work that way 6

Sharing Revisited: Shared Objects Process virtual memory Physical memory Process 2 virtual memory Process maps the shared object. Shared object 7

Sharing Revisited: Shared Objects Process virtual memory Physical memory Process 2 virtual memory Process 2 maps the shared object. No7ce how the virtual addresses can be different. Shared object 8

Sharing Revisited: Private Copy- on- write (COW) Objects Process virtual memory Physical memory Process 2 virtual memory Two processes mapping a private copy- on- write (COW) object. Private copy-on-write area Area flagged as private copy- on- write PTEs in private areas are flagged as read- only Private copy-on-write object 9

Sharing Revisited: Private Copy- on- write (COW) Objects Process virtual memory Physical memory Copy-on-write Private copy-on-write object Process 2 virtual memory Write to private copy-on-write page Instruc7on wri7ng to private page triggers protec7on fault. Handler creates new R/W page. Instruc7on restarts upon handler return. Copying deferred as long as possible! 2

The fork Func7on Revisited fork provides private address space for each process To create virtual address for new process Create exact copies of parent page tables Flag each page in both processes (parent and child) as read- only Flag writeable areas in both processes as private COW On return, each process has exact copy of virtual memory Subsequent writes create new physical pages using COW mechanism Perfect approach for common case of fork() followed by exec() Why? 2

Today Virtual memory ques7ons and answers Simple memory system example Bonus: Case study: Core i7/linux memory system Bonus: Memory mapping 22

Review of Symbols Basic Parameters N = 2 n : Number of addresses in virtual address space M = 2 m : Number of addresses in physical address space P = 2 p : Page size (bytes) Components of the virtual address (VA) VPO: Virtual page offset VPN: Virtual page number TLBI: TLB index TLBT: TLB tag Components of the physical address (PA) PPO: Physical page offset (same as VPO) PPN: Physical page number CO: Byte offset within cache line CI: Cache index CT: Cache tag 23

Simple Memory System Example Addressing 4- bit virtual addresses 2- bit physical address Page size = 64 bytes 3 2 9 8 7 6 5 4 3 2 VPN Virtual Page Number VPO Virtual Page Offset 9 8 7 6 5 4 3 2 PPN Physical Page Number PPO Physical Page Offset 24

Simple Memory System Page Table Only show first 6 entries (out of 256) VPN PPN Valid VPN PPN Valid 28 8 3 9 7 2 33 A 9 3 2 B 4 C 5 6 D 2D 6 E 7 F D 25

Simple Memory System TLB 6 entries 4- way associa7ve TLBT TLBI 3 2 9 8 7 6 5 4 3 2 VPN VPO Set Tag PPN Valid Tag PPN Valid Tag PPN Valid Tag PPN Valid 3 9 D 7 2 3 2D 2 4 A 2 2 8 6 3 3 7 3 D A 34 2 26

Simple Memory System Cache 6 lines, 4- byte block size Physically addressed Direct mapped CT CI CO 9 8 7 6 5 4 3 2 PPN PPO Idx Tag Valid B B B2 B3 Idx Tag Valid B B B2 B3 9 99 23 8 24 3A 5 89 5 9 2D 2 B 2 4 8 A 2D 93 5 DA 3B 3 36 B B 4 32 43 6D 8F 9 C 2 5 D 36 72 F D D 6 4 96 34 5 6 3 E 3 83 77 B D3 7 6 C2 DF 3 F 4 27

Address Transla7on Example # Virtual Address: x3d4 TLBT TLBI 3 2 9 8 7 6 5 4 3 2 VPN VPO VPN xf TLBI x3 TLBT x3 TLB Hit? Y Page Fault? N PPN: xd Physical Address CT CI CO 9 8 7 6 5 4 3 2 PPN PPO CO CI x5 CT xd Hit? Y Byte: x36 28

Address Transla7on Example #2 Virtual Address: xcf TLBT TLBI 3 2 9 8 7 6 5 4 3 2 VPN VPO VPN x7 TLBI x3 TLBT x TLB Hit? N Page Fault? Y PPN: TBD Physical Address CT CI CO 9 8 7 6 5 4 3 2 PPN PPO CO CI CT Hit? Byte: 29

Address Transla7on Example #3 Virtual Address: x2 TLBT TLBI 3 2 9 8 7 6 5 4 3 2 VPN VPO VPN x TLBI TLBT x TLB Hit? N Page Fault? N PPN: x28 Physical Address CT CI CO 9 8 7 6 5 4 3 2 PPN PPO CO CI x8 CT x28 Hit? N Byte: Mem 3

Today Virtual memory ques7ons and answers Simple memory system example Bonus: Case study: Core i7/linux memory system Bonus: Memory mapping 3

Intel Core i7 Memory System Processor package Core x4 Registers Instruc7on fetch MMU (addr transla7on) L d- cache 32 KB, 8- way L i- cache 32 KB, 8- way L d- TLB 64 entries, 4- way L i- TLB 28 entries, 4- way L2 unified cache 256 KB, 8- way L2 unified TLB 52 entries, 4- way QuickPath interconnect 4 links @ 25.6 GB/s each To other cores To I/O bridge L3 unified cache 8 MB, 6- way (shared by all cores) DDR3 Memory controller 3 x 64 bit @.66 GB/s 32 GB/s total (shared by all cores) Main memory 32

Review of Symbols Basic Parameters N = 2 n : Number of addresses in virtual address space M = 2 m : Number of addresses in physical address space P = 2 p : Page size (bytes) Components of the virtual address (VA) TLBI: TLB index TLBT: TLB tag VPO: Virtual page offset VPN: Virtual page number Components of the physical address (PA) PPO: Physical page offset (same as VPO) PPN: Physical page number CO: Byte offset within cache line CI: Cache index CT: Cache tag 33

End- to- end Core i7 Address Transla7on VPN TLB miss CPU 36 2 VPO 32 4 TLBT TLBI Virtual address (VA)... TLB hit 32/64 Result L hit L d- cache (64 sets, 8 lines/set)... L2, L3, and main memory L miss L TLB (6 sets, 4 entries/set) CR3 9 9 VPN VPN2 PTE 9 9 VPN3 VPN4 PTE PTE PTE 4 2 PPN PPO Physical address (PA) 4 6 6 CT CI CO Page tables 34

Core i7 Level - 3 Page Table Entries 63 XD 62 Unused 52 5 2 9 8 7 6 5 4 3 2 Page table physical base address Unused G PS A CD WT U/S R/W P= Available for OS (page table loca7on on disk) P= Each entry references a 4K child page table P: Child page table present in physical memory () or not (). R/W: Read- only or read- write access access permission for all reachable pages. U/S: user or supervisor (kernel) mode access permission for all reachable pages. WT: Write- through or write- back cache policy for the child page table. CD: Caching disabled or enabled for the child page table. A: Reference bit (set by MMU on reads and writes, cleared by sohware). PS: Page size either 4 KB or 4 MB (defined for Level PTEs only). G: Global page (don t evict from TLB on task switch) Page table physical base address: 4 most significant bits of physical page table address (forces page tables to be 4KB aligned) 35

Core i7 Level 4 Page Table Entries 63 XD 62 Unused 52 5 2 9 8 7 6 5 4 3 2 Page physical base address Unused G D A CD WT U/S R/W P= Available for OS (page loca7on on disk) P= Each entry references a 4K child page P: Child page is present in memory () or not () R/W: Read- only or read- write access permission for child page U/S: User or supervisor mode access WT: Write- through or write- back cache policy for this page CD: Cache disabled () or enabled () A: Reference bit (set by MMU on reads and writes, cleared by sohware) D: Dirty bit (set by MMU on writes, cleared by sohware) G: Global page (don t evict from TLB on task switch) Page physical base address: 4 most significant bits of physical page address (forces pages to be 4KB aligned) 36

Core i7 Page Table Transla7on 9 VPN 9 VPN 2 9 VPN 3 VPN 4 9 2 Virtual VPO address CR3 Physical address of L PT 4 / L PT Page global directory L PTE 4 / L2 PT Page upper directory L2 PTE 4 / L3 PT Page middle directory L3 PTE 4 / L4 PT Page table L4 PTE / 2 Offset into physical and virtual page 52 GB region per entry GB region per entry 2 MB region per entry 4 KB region per entry Physical address of page 4 / 4 2 Physical PPO address PPN 37

Cute Trick for Speeding Up L Access CT Tag Check Physical address (PA) 36 6 6 CT CI CO PPN PPO Virtual address (VA) Address Transla7on VPN VPO 36 2 No Change Observa7on Bits that determine CI iden2cal in virtual and physical address Can index into cache while address transla2on taking place Generally we hit in TLB, so PPN bits (CT bits) available next Virtually indexed, physically tagged Cache carefully sized to make this possible CI L Cache 38

Today Virtual memory ques7ons and answers Simple memory system example Bonus: Case study: Core i7/linux memory system Bonus: Memory mapping 39

Memory Mapping VM areas ini7alized by associa7ng them with disk objects. Process is known as memory mapping. Area can be backed by (i.e., get its ini7al values from) : Regular file on disk (e.g., an executable object file) Ini2al page bytes come from a sec2on of a file Anonymous file (e.g., nothing) First fault will allocate a physical page full of 's (demand- zero page) Once the page is wripen to (diraed), it is like any other page Dirty pages are copied back and forth between memory and a special swap file. 4

Demand paging Key point: no virtual pages are copied into physical memory un7l they are referenced! Known as demand paging Crucial for 7me and space efficiency 4

User- Level Memory Mapping void *mmap(void *start, int len, int prot, int flags, int fd, int offset) Map len bytes star7ng at offset offset of the file specified by file descrip7on fd, preferably at address start start: may be for pick an address prot: PROT_READ, PROT_WRITE,... flags: MAP_ANON, MAP_PRIVATE, MAP_SHARED,... Return a pointer to start of mapped area (may not be start) 42

User- Level Memory Mapping void *mmap(void *start, int len, int prot, int flags, int fd, int offset) len bytes len bytes start (or address chosen by kernel) offset (bytes) Disk file specified by file descriptor fd Process virtual memory 43

Using mmap to Copy Files Copying without transferring data to user space. #include "csapp.h" /* * mmapcopy - uses mmap to copy * file fd to stdout */ void mmapcopy(int fd, int size) { } /* Ptr to mem-mapped VM area */ char *bufp; bufp = Mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, ); Write(, bufp, size); return; /* mmapcopy driver */ int main(int argc, char **argv) { struct stat stat; int fd; } /* Check for required cmdline arg */ if (argc!= 2) { printf("usage: %s <filename>\n, argv[]); exit(); } /* Copy the input arg to stdout */ fd = Open(argv[], O_RDONLY, ); Fstat(fd, &stat); mmapcopy(fd, stat.st_size); exit(); 44

Virtual Memory of a Linux Process Different for each process IdenAcal for each process Process- specific data structs (ptables, task and mm structs, kernel stack) Physical memory Kernel code and data Kernel virtual memory %esp User stack brk Memory mapped region for shared libraries Run7me heap (malloc) Process virtual memory x848 (32) x4 (64) Unini7alized data (.bss) Ini7alized data (.data) Program text (.text) 45

Linux Organizes VM as Collec7on of Areas Process virtual memory vm_area_struct task_struct mm_struct vm_end mm pgd vm_start vm_prot mmap vm_flags Shared libraries vm_next pgd: Page global directory address Points to L page table vm_prot: Read/write permissions for this area vm_flags Pages shared with other processes or private to this process vm_end vm_start vm_prot vm_flags vm_next vm_end vm_start vm_prot vm_flags vm_next Data Text 46

Linux Page Fault Handling vm_area_struct Process virtual memory vm_end vm_start vm_prot vm_flags vm_next vm_end vm_start vm_prot vm_flags shared libraries data read 3 read Segmentation fault: accessing a non- exis7ng page Normal page fault vm_next vm_end vm_start vm_prot vm_flags vm_next text 2 write Protec7on excep7on: e.g., viola7ng permission by wri7ng to a read- only page (Linux reports as Segmenta7on fault) 47

The execve Func7on Revisited User stack Private, demand-zero To load and run a new program a.out in the current process using execve: libc.so.data.text Memory mapped region for shared libraries Shared, file-backed Free vm_area_struct s and page tables for old areas a.out.data.text Runtime heap (via malloc) Uninitialized data (.bss) Initialized data (.data) Program text (.text) Private, demand-zero Private, demand-zero Private, file-backed Create vm_area_struct s and page tables for new areas Programs and ini2alized data backed by object files..bss and stack backed by anonymous files. Set PC to entry point in.text Linux will fault in code and data pages as needed. 48