
Administrivia Lab 1 due Friday 12pm. We will give short extensions to groups that run into trouble. But email us: - How much is done & left? - How much longer do you need? Attend section Friday at 2:15pm to learn about lab 2. 1 / 37

Want processes to co-exist Consider multiprogramming on physical memory - What happens if pintos needs to expand? - If emacs needs more memory than is on the machine? - If pintos has an error and writes to address 0x7100? - When does gcc have to know it will run at 0x4000? - What if emacs isn't using its memory? 2 / 37

Issues in sharing physical memory Protection - A bug in one process can corrupt memory in another - Must somehow prevent process A from trashing B's memory - Also prevent A from even observing B's memory (ssh-agent) Transparency - A process shouldn't require particular physical memory bits - Yet processes often require large amounts of contiguous memory (for stack, large data structures, etc.) Resource exhaustion - Programmers typically assume machine has enough memory - Sum of sizes of all processes often greater than physical memory 3 / 37

Virtual memory goals [Diagram: app issues a load to virtual address 0x30408; the MMU checks whether the address is valid; if not, it traps to the kernel's fault handler; if so, the access goes to physical address 0x92408 in data memory] Give each program its own virtual address space - At run time, Memory-Management Unit relocates each load, store to actual memory... App doesn't see physical memory Also enforce protection - Prevent one app from messing with another's memory And allow programs to see more memory than exists - Somehow relocate some memory accesses to disk 4 / 37

Virtual memory advantages Can relocate program while running - Run partially in memory, partially on disk Most of a process's memory may be idle (80/20 rule). [Diagram: physical memory holding gcc, emacs, and the kernel, with large idle regions in each process] - Write idle parts to disk until needed - Let other processes use memory of idle part - Like CPU virtualization: when process not using CPU, switch (Not using a memory region? switch it to another process) Challenge: VM = extra layer, could be slow 5 / 37

Idea 1: load-time linking Linker patches addresses of symbols like printf Idea: link when process executed, not at compile time - Determine where process will reside in memory - Adjust all references within program (using addition) Problems? - How to enforce protection - How to move once already in memory (Consider: data pointers) - What if no contiguous free region fits program? 6 / 37

Idea 2: base + bound register Two special privileged registers: base and bound On each load/store: - Physical address = virtual address + base - Check 0 <= virtual address < bound, else trap to kernel How to move process in memory? - Change base register What happens on context switch? - OS must re-load base and bound registers 7 / 37
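The per-access check above can be sketched in a few lines of C. This is a hypothetical software model of what the hardware does in parallel with the add; the struct, register names, and the fault sentinel are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the base+bound check an MMU performs on every
 * load/store; in real hardware the add and the compare run in parallel. */
typedef struct {
    uint32_t base;   /* start of the process's region in physical memory */
    uint32_t bound;  /* size of the region in bytes */
} mmu_regs;

#define TRANSLATION_FAULT UINT32_MAX  /* stand-in for trapping to the kernel */

uint32_t translate(const mmu_regs *r, uint32_t vaddr)
{
    if (vaddr >= r->bound)        /* out of range: trap to kernel */
        return TRANSLATION_FAULT;
    return r->base + vaddr;       /* in range: relocate by base */
}
```

Moving the process is just a matter of copying its memory and changing `base`; the virtual addresses the program uses never change.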

Definitions Programs load/store to virtual (or logical) addresses Actual memory uses physical (or real) addresses VM Hardware is Memory Management Unit (MMU) - Usually part of CPU - Accessed w. privileged instructions (e.g., load bound reg) - Translates from virtual to physical addresses - Gives per-process view of memory called address space 8 / 37

Address space 9 / 37

Base+bound trade-offs Advantages - Cheap in terms of hardware: only two registers - Cheap in terms of cycles: do add and compare in parallel - Example: Cray-1 used this scheme Disadvantages - Growing a process is expensive or impossible - No way to share code or data (E.g., two copies of bochs, both running pintos) One solution: Multiple segments - E.g., separate code, stack, data segments - Possibly multiple data segments 10 / 37

Segmentation Let processes have many base/bound regs - Address space built from many segments - Can share/protect memory at segment granularity Must specify segment as part of virtual address 11 / 37

Segmentation mechanics Each process has a segment table Each VA indicates a segment and offset: - Top bits of addr select segment, low bits select offset (PDP-10) - Or segment selected by instruction or operand (means you need wider far pointers to specify segment) 12 / 37
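The top-bits scheme can be sketched as below. This is an illustrative model, not any real ISA: a 2-bit segment number in the top bits and a 12-bit offset in the low bits (matching the example on the next slide); the table layout and fault sentinel are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SEG_FAULT UINT32_MAX   /* stand-in for trapping to the kernel */

typedef struct {
    uint32_t base;   /* physical base of this segment */
    uint32_t limit;  /* segment length in bytes */
    int      valid;
} seg_entry;

/* Translate: top 2 bits select a segment table entry, low 12 bits are
 * the offset, which must fall within the segment's limit. */
uint32_t seg_translate(const seg_entry table[4], uint32_t vaddr)
{
    uint32_t seg = (vaddr >> 12) & 0x3;
    uint32_t off = vaddr & 0xfff;
    if (!table[seg].valid || off >= table[seg].limit)
        return SEG_FAULT;
    return table[seg].base + off;
}
```

Sharing falls out naturally: two processes whose segment tables contain entries with the same base share that physical memory.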

Segmentation example [Diagram: a virtual address space of several segments mapped to non-contiguous physical ranges between 0x0000 and 0x4700] 2-bit segment number (1st hex digit), 12-bit offset (last 3) - Where is 0x0240? 0x1108? 0x265c? 0x3002? 0x1600? 13 / 37

Segmentation trade-offs Advantages - Multiple segments per process - Allows sharing! (how?) - Don't need entire process in memory Disadvantages - Requires translation hardware, which could limit performance - Segments not completely transparent to program (e.g., default segment faster or uses shorter instruction) - n-byte segment needs n contiguous bytes of physical memory - Makes fragmentation a real problem. 14 / 37

Fragmentation Fragmentation = Inability to use free memory Over time: - Variable-sized pieces => many small holes (external fragmentation) - Fixed-sized pieces => no external holes, but forced internal waste (internal fragmentation) 15 / 37

Alternatives to hardware MMU Language-level protection (Java) - Single address space for different modules - Language enforces isolation - Singularity OS does this [Hunt] Software fault isolation - Instrument compiler output - Checks before every store operation prevent modules from trashing each other - Google Native Client does this with only about 5% slowdown [Yee] 16 / 37

Paging Divide memory up into small pages Map virtual pages to physical pages - Each process has separate mapping Allow OS to gain control on certain operations - Read-only pages trap to OS on write - Invalid pages trap to OS on read or write - OS can change mapping and resume application Other features sometimes found: - Hardware can set accessed and dirty bits - Control page execute permission separately from read/write - Control caching or memory consistency of page 17 / 37

Paging trade-offs Eliminates external fragmentation Simplifies allocation, free, and backing storage (swap) Average internal fragmentation of .5 pages per segment 18 / 37
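The internal fragmentation figure is easy to check: with fixed-size pages, only the unused tail of a segment's last page is wasted, and that tail is uniformly distributed between 0 and a full page, averaging half a page. A one-line sketch:

```c
#include <assert.h>
#include <stdint.h>

/* Internal fragmentation for a 'size'-byte segment with 'page'-byte pages:
 * the wasted tail of the last page (zero if the size is page-aligned). */
uint32_t internal_frag(uint32_t size, uint32_t page)
{
    uint32_t tail = size % page;
    return tail ? page - tail : 0;
}
```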

Simplified allocation [Diagram: gcc's and emacs's virtual pages scattered across physical memory, with idle pages stored on disk] Allocate any physical page to any process Can store idle virtual pages on disk 19 / 37

Paging data structures Pages are fixed size, e.g., 4K - Least significant 12 (log 2 4K) bits of address are page offset - Most significant bits are page number Each process has a page table - Maps virtual page numbers (VPNs) to physical page numbers (PPNs) - Also includes bits for protection, validity, etc. On memory access: Translate VPN to PPN, then add offset 20 / 37
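The VPN/offset split for 4K pages can be shown directly; the numbers in the test match the 0x30408 -> 0x92408 translation from the goals slide. The helper names here are illustrative, not from any real MMU.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12                       /* 4K pages: log2(4K) = 12 */
#define PAGE_MASK  ((1u << PAGE_SHIFT) - 1)

/* Split a virtual address into VPN and offset; the page table maps the
 * VPN to a PPN, and the offset is carried over unchanged. */
uint32_t va_vpn(uint32_t va)    { return va >> PAGE_SHIFT; }
uint32_t va_offset(uint32_t va) { return va & PAGE_MASK; }

/* Rejoin the PPN returned by the page table with the original offset. */
uint32_t make_pa(uint32_t ppn, uint32_t offset)
{
    return (ppn << PAGE_SHIFT) | offset;
}
```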

Example: Paging on PDP-11 64K virtual memory, 8K pages - Separate address space for instructions & data - I.e., can't read your own instructions with a load Entire page table stored in registers - 8 Instruction page translation registers - 8 Data page translation registers Swap 16 machine registers on each context switch 21 / 37

x86 Paging Paging enabled by bits in a control register (%cr0) - Only privileged OS code can manipulate control registers Normally 4KB pages %cr3: points to 4KB page directory - See pagedir activate in Pintos Page directory: 1024 PDEs (page directory entries) - Each contains physical address of a page table Page table: 1024 PTEs (page table entries) - Each contains physical address of virtual 4K page - Page table covers 4 MB of virtual memory See old Intel manual for simplest explanation - Also volume 2 of AMD64 Architecture docs - Also volume 3A of latest Pentium Manual 22 / 37
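The two-level structure can be modeled as a software page walk. This is a simplified sketch, not real hardware behavior: physical memory is a plain word-indexed array, and the PDE/PTE format is reduced to a present bit (bit 0) plus a 4K-aligned physical address (bits 31-12).

```c
#include <assert.h>
#include <stdint.h>

#define PTE_P     0x1u          /* present bit */
#define ADDR_MASK 0xfffff000u   /* 4K-aligned physical address */

/* Walk the two-level x86 tables for 'va': 10-bit directory index,
 * 10-bit table index, 12-bit offset. 'mem' stands in for physical
 * memory, indexed by 32-bit word; 'cr3' is the directory's address. */
uint32_t walk(const uint32_t *mem, uint32_t cr3, uint32_t va, int *fault)
{
    uint32_t pde = mem[(cr3 >> 2) + ((va >> 22) & 0x3ff)];
    if (!(pde & PTE_P)) { *fault = 1; return 0; }   /* no page table */
    uint32_t pte = mem[((pde & ADDR_MASK) >> 2) + ((va >> 12) & 0x3ff)];
    if (!(pte & PTE_P)) { *fault = 1; return 0; }   /* page not present */
    *fault = 0;
    return (pte & ADDR_MASK) | (va & 0xfff);
}
```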

x86 page translation [Diagram: 32-bit linear address split into 10-bit directory index (bits 31-22), 10-bit table index (bits 21-12), and 12-bit offset (bits 11-0). CR3 (PDBR) points to the page directory of 1024 PDEs; the directory entry points to a page table of 1024 PTEs; the page table entry plus the offset forms the physical address within a 4-KByte page. 1024 x 1024 = 2^20 pages; all tables are 32-bit entries aligned on a 4-KByte boundary] 23 / 37

x86 page directory entry 3 1 P a g e D i r e c t o r y E n t r y (4 K B y t e P a g e Ta b l e ) 1 2 11 9 8 7 6 5 4 3 2 1 0 P a g e Ta b le B a s e A d d re ss A v a il G P S 0 A P C D P W T U / S R / W P A v a ila b le fo r s y s te m p ro g ra m m e r s u s e G lo b a l p a g e (Ig n o re d ) P a g e s iz e (0 in d ic a te s 4 K B y te s ) R e s e rv e d (s e t to 0 ) A c c e s s e d C a c h e d is a b le d W rite th ro u g h U s e r/s u p e rv is o r R e a d /W rite P re s e n t 24 / 37

x86 page table entry 31 Page Table Entry (4 KByte Page) 12 11 9 8 7 6 5 4 3 2 1 0 Page Base Address Avail G P A T D A P C D P W T U / S R / W P Available for system programmer s use Global Page Page Table Attribute Index Dirty Accessed Cache Disabled Write Through User/Supervisor Read/Write Present 25 / 37

x86 hardware segmentation x86 architecture also supports segmentation - Segment register base + pointer val = linear address - Page translation happens on linear addresses Two levels of protection and translation check - Segmentation model has four privilege levels (CPL 0-3) - Paging only two, so 0-2 = kernel, 3 = user Why do you want both paging and segmentation? Short answer: You don't; it just adds overhead - Most OSes use flat mode: set base = 0, bounds = 0xffffffff in all segment registers, then forget about it - x86-64 architecture removes much segmentation support Long answer: Has some fringe/incidental uses - VMware runs guest OS in CPL 1 to trap stack faults - OpenBSD used CS limit for W^X when no PTE NX bit 26 / 37

Making paging fast x86 PTs require 3 memory references per load/store - Look up page table address in page directory - Look up PPN in page table - Actually access physical page corresponding to virtual address For speed, CPU caches recently used translations - Called a translation lookaside buffer or TLB - Typical: 64-2K entries, 4-way to fully associative, 95% hit rate - Each TLB entry maps a VPN -> PPN + protection information On each memory reference - Check TLB; if entry present, get physical address fast - If not, walk page tables, insert in TLB for next time (Must evict some entry) 27 / 37
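The fast-path check can be sketched as a fully associative lookup. A software model only: real TLBs compare all entries in parallel in hardware, and the struct layout here is illustrative.

```c
#include <assert.h>
#include <stdint.h>

#define TLB_ENTRIES 64

typedef struct {
    uint32_t vpn, ppn;
    int valid;
} tlb_entry;

/* Fully associative lookup: return 1 and the cached PPN on a hit;
 * on a miss the caller walks the page tables, then inserts the
 * translation (evicting some entry if the TLB is full). */
int tlb_lookup(const tlb_entry tlb[TLB_ENTRIES], uint32_t vpn, uint32_t *ppn)
{
    for (int i = 0; i < TLB_ENTRIES; i++)
        if (tlb[i].valid && tlb[i].vpn == vpn) {
            *ppn = tlb[i].ppn;
            return 1;       /* hit: fast path */
        }
    return 0;               /* miss: walk page tables */
}
```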

TLB details TLB operates at CPU pipeline speed => small, fast Complication: what to do when switching address spaces? - Flush TLB on context switch (e.g., old x86) - Tag each entry with associated process's ID (e.g., MIPS) In general, OS must manually keep TLB valid E.g., x86 invlpg instruction - Invalidates a page translation in TLB - Must execute after changing a possibly used page table entry - Otherwise, hardware will miss page table change More complex on a multiprocessor (TLB shootdown) 28 / 37

x86 Paging Extensions PSE: Page size extensions - Setting bit 7 in PDE makes a 4MB translation (no PT) PAE: Page address extensions - Newer 64-bit PTE format allows 36 bits of physical address - Page tables, directories have only 512 entries - Use 4-entry Page-Directory-Pointer Table to regain 2 lost bits - PDE bit 7 allows 2MB translation Long mode PAE - In Long mode, pointers are 64-bits - Extends PAE to map 48 bits of virtual address (next slide) - Why aren't all 64 bits of VA usable? 29 / 37

x86 long mode paging [Diagram: 64-bit virtual address: bits 63-48 sign-extend bit 47; bits 47-39 index the Page Map Level 4 table (PML4); bits 38-30 the page directory pointer table; bits 29-21 the page directory; bits 20-12 the page table; bits 11-0 are the offset within the 4-KByte physical page. CR3 holds the PML4 base address; each level's entry (PML4E, PDPE, PDE, PTE) supplies the physical base of the next, up to 52 physical address bits] 30 / 37
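The index extraction for the 4-level walk, and the answer to the previous slide's question: only 48 bits of VA are translated, so a usable ("canonical") address must have bits 63-48 equal to copies of bit 47. A sketch with illustrative helper names:

```c
#include <assert.h>
#include <stdint.h>

/* 9 bits of index per level, 12-bit page offset. */
uint64_t pml4_index(uint64_t va) { return (va >> 39) & 0x1ff; }
uint64_t pdpt_index(uint64_t va) { return (va >> 30) & 0x1ff; }
uint64_t pd_index(uint64_t va)   { return (va >> 21) & 0x1ff; }
uint64_t pt_index(uint64_t va)   { return (va >> 12) & 0x1ff; }

/* 1 if bits 63-48 are a sign extension of bit 47 (a "canonical" VA). */
int canonical(uint64_t va)
{
    int64_t s = (int64_t)(va << 16) >> 16;  /* arithmetic shift sign-extends bit 47 */
    return (uint64_t)s == va;
}
```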

Where does the OS live? In its own address space? - Can't do this on most hardware (e.g., syscall instruction won't switch address spaces) - Also would make it harder to parse syscall arguments passed as pointers So in the same address space as process - Use protection bits to prohibit user code from writing kernel Typically all kernel text, most data at same VA in every address space - On x86, must manually set up page tables for this - Usually just map kernel in contiguous virtual memory when boot loader puts kernel into contiguous physical memory - Some hardware puts physical memory (kernel-only) somewhere in virtual address space 31 / 37

Pintos memory layout (high to low): - 0xffffffff down to 0xc0000000 (PHYS BASE): kernel / pseudo-physical memory - User stack (grows down from PHYS BASE) - BSS / heap - Data segment - Code segment (starts at 0x08048000) - Invalid virtual addresses from 0x08048000 down to 0x00000000 32 / 37

Very different MMU: MIPS Hardware has 64-entry TLB - References to addresses not in TLB trap to kernel Each TLB entry has the following fields: Virtual page, Pid, Page frame, NC, D, V, Global Kernel itself unpaged - All of physical memory contiguously mapped in high VM - Kernel uses these pseudo-physical addresses User TLB fault handler very efficient - Two hardware registers reserved for it - utlb miss handler can itself fault, allowing paged page tables OS is free to choose page table format! 33 / 37

DEC Alpha MMU Software-managed TLB (like MIPS) - 8KB, 64KB, 512KB, 4MB pages all available - TLB supports 128 instruction/128 data entries of any size But TLB miss handler not part of OS - Processor ships with special PAL code in ROM - Processor-specific, but provides uniform interface to OS - Basically firmware that runs from main memory like OS Various events vector directly to PAL code - call_pal instruction, TLB miss/fault, FP disabled PAL code runs in special privileged processor mode - Interrupts always disabled - Has access to special instructions and registers 34 / 37

PAL code interface details Examples of Digital Unix PALcode entry functions - callsys/retsys - make, return from system call - swpctx - change address spaces - wrvptptr - write virtual page table pointer - tbi - TLB invalidate Some fields in PALcode page table entries - GH - 2-bit granularity hint: 2^N pages have same translation - ASM - address space match: mapping applies in all processes 35 / 37

Example: Paging to disk gcc needs a new page of memory OS reclaims an idle page from emacs If page is clean (i.e., also stored on disk): - E.g., page of text from emacs binary on disk - Can always re-read same page from binary - So okay to discard contents now & give page to gcc If page is dirty (meaning memory is only copy) - Must write page to disk first before giving to gcc Either way: - Mark page invalid in emacs - emacs will fault on next access to virtual page - On fault, OS reads page data back from disk into new page, maps new page into emacs, resumes executing 36 / 37
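The clean-vs-dirty decision above can be summarized in a short sketch. The struct, field names, and return convention are hypothetical; a real kernel would also pick the victim, do the disk I/O, and shoot down TLB entries.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct {
    bool dirty;   /* memory copy differs from the copy on disk */
    bool valid;   /* currently mapped in the owning process */
} page;

/* Reclaim a page for reuse; returns true if it had to be written back.
 * Clean pages can be discarded because disk already holds the data. */
bool reclaim(page *p)
{
    bool wrote_back = p->dirty;  /* dirty: memory holds the only copy */
    p->dirty = false;            /* after write-back, disk copy is current */
    p->valid = false;            /* owner faults on its next access */
    return wrote_back;
}
```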

Paging in day-to-day use Demand paging Growing the stack BSS page allocation Shared text Shared libraries Shared memory Copy-on-write (fork, mmap, etc.) Q: Which pages should have global bit set on x86? 37 / 37