Multiprogramming on physical memory

- Want processes to co-exist
- Consider multiprogramming on physical memory:
  - What happens if pintos needs to expand?
  - If emacs needs more memory than is on the machine?
  - If pintos has an error and writes to address 0x7100?
  - When does gcc have to know it will run at 0x4000?
  - What if emacs isn't using its memory?

Issues in sharing physical memory

- Protection
  - A bug in one process can corrupt memory in another
  - Must somehow prevent process A from trashing B's memory
  - Also prevent A from even observing B's memory (ssh-agent)
- Transparency
  - A process shouldn't require particular physical memory bits
  - Yet processes often require large amounts of contiguous memory (for stack, large data structures, etc.)
- Resource exhaustion
  - Programmers typically assume machine has enough memory
  - Sum of sizes of all processes often greater than physical memory

Virtual memory goals

(figure: app issues "load virtual address 0x30408"; the MMU checks "Is address legal?" - if yes, the access goes to phys. addr 0x2408 in memory; if no, it vectors to the kernel's fault handler)

- Give each program its own virtual address space
  - At run time, Memory-Management Unit relocates each load, store to actual memory... App doesn't see physical memory
- Also enforce protection
  - Prevent one app from messing with another's memory
- And allow programs to see more memory than exists
  - Somehow relocate some memory accesses to disk

Virtual memory advantages

- Can re-locate program while running
  - Run partially in memory, partially on disk
- Most of a process's memory will be idle (80/20 rule)
  (figure: physical memory holding gcc, emacs, idle regions, and the kernel)
  - Write idle parts to disk until needed
  - Let other processes use memory of idle part
  - Like CPU virtualization: when process not using CPU, switch (Not using a memory region? Switch it to another process)
- Challenge: VM = extra layer, could be slow

Idea 1: load-time linking

- Linker patches addresses of symbols like printf
- Idea: link when process executed, not at compile time
  - Determine where process will reside in memory
  - Adjust all references within program (using addition - see the sketch below)
- Problems?
  - How to enforce protection
  - How to move once in memory (Consider: data pointers)
  - What if no contiguous free region fits program?
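As a concrete illustration of "adjust all references using addition", here is a minimal C sketch of what a load-time linker might do. The relocation record format and the names (struct reloc, relocate) are made up for illustration; real object formats carry considerably more information.

    #include <stdint.h>
    #include <stddef.h>

    /* Hypothetical relocation record: one absolute address the compile-time
       linker left to be patched at load time. */
    struct reloc {
        size_t offset;              /* where in the image the address is stored */
    };

    /* Patch every absolute reference by adding the base address the OS
       actually chose for this process -- "relocation by addition". */
    static void relocate(uint8_t *image, uint32_t load_base,
                         const struct reloc *relocs, size_t nrelocs)
    {
        for (size_t i = 0; i < nrelocs; i++) {
            uint32_t *slot = (uint32_t *) (image + relocs[i].offset);
            *slot += load_base;     /* e.g., printf's address shifts by load_base */
        }
    }

The problems on the slide follow directly: once these words have been patched, nothing marks them as addresses, so the process cannot easily be moved again and nothing stops it from writing outside its region.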

Idea 2: base + bound register

- Two special privileged registers: base and bound
- On each load/store (see the sketch after this section):
  - Physical address = virtual address + base
  - Check 0 ≤ virtual address < bound, else trap to kernel
- How to move process in memory?
  - Change base register
- What happens on context switch?
  - OS must re-load base and bound register

Definitions

- Programs load/store to virtual (or logical) addresses
- Actual memory uses physical (or real) addresses
- VM Hardware is Memory Management Unit (MMU)
  - Usually part of CPU
  - Accessed w. privileged instructions (e.g., load bound reg)
  - Translates from virtual to physical addresses
  - Gives per-process view of memory called address space

Address space

(figure)

Base+bound trade-offs

- Advantages
  - Cheap in terms of hardware: only two registers
  - Cheap in terms of cycles: do add and compare in parallel
  - Examples: Cray-1 used this scheme
- Disadvantages
  - Growing a process is expensive or impossible
  - No way to share code or data (e.g., two copies of bochs, both running pintos)
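A minimal sketch of the per-access check the base + bound hardware performs; the struct and function names are illustrative, and real hardware does the add and the compare in parallel rather than in C.

    #include <stdbool.h>
    #include <stdint.h>

    /* Per-process relocation state: what the two privileged registers hold. */
    struct bb_mmu {
        uintptr_t base;     /* where the process starts in physical memory */
        uintptr_t bound;    /* size of the process's region, in bytes */
    };

    /* Done (conceptually) on every load/store.  Returns false where real
       hardware would trap to the kernel. */
    static bool bb_translate(const struct bb_mmu *mmu, uintptr_t vaddr,
                             uintptr_t *paddr)
    {
        if (vaddr >= mmu->bound)          /* enforce 0 <= vaddr < bound */
            return false;                 /* out of bounds: fault */
        *paddr = mmu->base + vaddr;       /* physical = virtual + base */
        return true;
    }

Moving the process is then just copying its memory and changing base; a context switch reloads both registers.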

Segmentation

- One solution: Multiple segments
  - E.g., separate code, stack, data segments
  - Possibly multiple data segments
- Let processes have many base/bound regs
  - Address space built from many segments
  - Can share/protect memory at segment granularity
- Must specify segment as part of virtual address

Segmentation mechanics

- Each process has a segment table
- Each VA indicates a segment and offset (see the sketch after this section):
  - Top bits of addr select segment, low bits select offset (PDP-10)
  - Or segment selected by instruction or operand (means you need wider "far" pointers to specify segment)

Segmentation example

(figure: virtual addresses 0x0000-0x4000 mapped by segment onto physical memory 0x0000-0x4700, with boundaries at virtual 0x0700, 0x1000, 0x2000, 0x3000 and physical 0x0500, 0x1500, 0x3000, 0x4000)

- 2-bit segment number (1st digit), 12-bit offset (last 3)
- Where is 0x0240? 0x1108? 0x265c? 0x3002? 0x1600?

Segmentation trade-offs

- Advantages
  - Multiple segments per process
  - Allows sharing! (how?)
  - Don't need entire process in memory
- Disadvantages
  - Requires translation hardware, which could limit performance
  - Segments not completely transparent to program (e.g., default segment faster or uses shorter instruction)
  - n-byte segment needs n contiguous bytes of physical memory
  - Makes fragmentation a real problem

Fragmentation

- Fragmentation = Inability to use free memory
- Over time:
  - Variable-sized pieces = many small holes (external fragmentation)
  - Fixed-sized pieces = no external holes, but force internal waste (internal fragmentation)
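To make the segment-table mechanics concrete, here is a C sketch using the toy address format from the example (a 2-bit segment number in the top hex digit, a 12-bit offset in the low three digits). The table layout and names are assumptions for illustration; the entries are placeholders, not the answers to the slide's exercise.

    #include <stdbool.h>
    #include <stdint.h>

    /* One segment-table entry: a base + bound pair, per segment. */
    struct segment {
        uintptr_t base;     /* physical address where the segment starts */
        uintptr_t limit;    /* segment length in bytes */
        bool      valid;
    };

    /* Translate a toy 14-bit virtual address: top 2 bits pick the segment,
       low 12 bits are the offset within it. */
    static bool seg_translate(const struct segment seg_table[4],
                              uint16_t vaddr, uintptr_t *paddr)
    {
        unsigned seg = (vaddr >> 12) & 0x3;   /* first hex digit */
        unsigned off = vaddr & 0xfff;         /* last three hex digits */

        if (!seg_table[seg].valid || off >= seg_table[seg].limit)
            return false;                     /* would trap to the kernel */
        *paddr = seg_table[seg].base + off;   /* per-segment base + bound */
        return true;
    }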

Alternatives to hardware MMU

- Language-level protection (Java)
  - Single address space for different modules
  - Language enforces isolation
  - Singularity OS does this [Hunt]
- Software fault isolation
  - Instrument compiler output
  - Checks before every store operation prevent modules from trashing each other
  - Google Native Client does this with only about 5% slowdown [Yee]

Paging

- Divide memory up into small pages
- Map virtual pages to physical pages
  - Each process has separate mapping
- Allow OS to gain control on certain operations
  - Read-only pages trap to OS on write
  - Invalid pages trap to OS on read or write
  - OS can change mapping and resume application
- Other features sometimes found:
  - Hardware can set accessed and dirty bits
  - Control page execute permission separately from read/write
  - Control caching of page

Paging trade-offs

- Eliminates external fragmentation
- Simplifies allocation, free, and backing storage (swap)
- Average internal fragmentation of .5 pages per "segment"

Simplified allocation

(figure: gcc and emacs pages scattered across physical memory, with some pages stored on disk)

- Allocate any physical page to any process
- Can store idle virtual pages on disk

Paging data structures

- Pages are fixed size, e.g., 4K
  - Least significant 12 (log2 4K) bits of address are page offset
  - Most significant bits are page number
- Each process has a page table
  - Maps virtual page numbers (VPNs) to physical page numbers (PPNs)
  - Also includes bits for protection, validity, etc.
- On memory access: Translate VPN to PPN, then add offset (see the sketch below)

Example: Paging on PDP-11

- 64K virtual memory, 8K pages
  - Separate address space for instructions & data
  - I.e., can't read your own instructions with a load
- Entire page table stored in registers
  - 8 Instruction page translation registers
  - 8 Data page translations
- Swap 16 machine registers on each context switch
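A sketch of the basic paging data structure in use: split the virtual address into page number and offset, index a flat per-process page table, and splice the physical page number back onto the offset. The PTE layout here (just a valid bit plus a physical page number) is a simplification for illustration.

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    #define PAGE_SHIFT 12                        /* 4K pages: 12 offset bits */
    #define PAGE_MASK  ((1u << PAGE_SHIFT) - 1)

    struct pte {                                 /* simplified page table entry */
        uint32_t ppn;                            /* physical page number */
        bool     valid;                          /* protection bits omitted */
    };

    /* Translate VPN -> PPN, then add back the page offset. */
    static bool page_translate(const struct pte *page_table, size_t npages,
                               uint32_t vaddr, uint32_t *paddr)
    {
        uint32_t vpn = vaddr >> PAGE_SHIFT;      /* virtual page number */
        uint32_t off = vaddr & PAGE_MASK;        /* offset within the page */

        if (vpn >= npages || !page_table[vpn].valid)
            return false;                        /* invalid page: fault to OS */
        *paddr = (page_table[vpn].ppn << PAGE_SHIFT) | off;
        return true;
    }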

x86 Paging

- Paging enabled by bits in a control register (%cr0)
  - Only privileged OS code can manipulate control registers
- Normally 4KB pages
- %cr3: points to 4KB page directory
  - See pagedir_activate in Pintos
- Page directory: 1024 PDEs (page directory entries)
  - Each contains physical address of a page table
- Page table: 1024 PTEs (page table entries)
  - Each contains physical address of virtual 4K page
  - Page table covers 4 MB of virtual mem
- See old intel manual for simplest explanation
  - Also volume 2 of AMD64 Architecture docs
  - Also volume 3A of latest Pentium Manual

x86 page translation

(figure: the 32-bit linear address splits into a 10-bit directory index (bits 31-22), a 10-bit table index (bits 21-12), and a 12-bit offset (bits 11-0); CR3 (PDBR) points to the page directory, the directory entry points to a page table, and the page table entry points to the 4 KByte physical page; 1024 PDEs x 1024 PTEs = 2^20 pages; entries are 32 bits, aligned on a 4 KByte boundary; a software sketch of this walk appears after this section)

x86 page directory and page table entries

(figure: bit layout of a page directory entry for a 4 KByte page table and of a page table entry for a 4 KByte page; bits 31-12 hold the page table / page base address, and the low bits are: Available for system programmer's use, Global page, Page size (PDE, 0 indicates 4 KBytes) / Page Table Attribute Index (PTE), Dirty (PTE only), Accessed, Cache disabled, Write-through, User/Supervisor, Read/Write, Present)

x86 hardware segmentation

- x86 architecture also supports segmentation
  - Segment register base + pointer val = linear address
  - Page translation happens on linear addresses
- Two levels of protection and translation check
  - Segmentation model has four privilege levels (CPL 0-3)
  - Paging only two, so 0-2 = kernel, 3 = user
- Why do you want both paging and segmentation?
  - Short answer: You don't - it just adds overhead
    - Most OSes use "flat mode" - set base = 0, bounds = 0xffffffff in all segment registers, then forget about it
    - x86-64 architecture removes much segmentation support
  - Long answer: Has some fringe/incidental uses
    - VMware runs guest OS in CPL 1 to trap stack faults
    - OpenBSD used CS limit for W^X when no PTE NX bit
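The two-level walk from the page-translation figure, written out as software for clarity. The index/offset split and the present bit are as in the x86 manuals; phys_read32() is a hypothetical helper standing in for "read one 32-bit word of physical memory", and error handling is reduced to a single page-fault case.

    #include <stdbool.h>
    #include <stdint.h>

    #define PG_PRESENT 0x1u          /* bit 0 of a PDE/PTE */
    #define PG_FRAME   0xfffff000u   /* bits 31-12: physical base address */

    /* Hypothetical helper: fetch a 32-bit word from physical memory. */
    extern uint32_t phys_read32(uint32_t paddr);

    /* Walk %cr3 -> page directory -> page table -> physical page
       (4KB pages, no PSE).  This is what the MMU does on a TLB miss. */
    static bool x86_translate(uint32_t cr3, uint32_t linear, uint32_t *paddr)
    {
        uint32_t di  = (linear >> 22) & 0x3ff;   /* 10-bit directory index */
        uint32_t ti  = (linear >> 12) & 0x3ff;   /* 10-bit table index */
        uint32_t off = linear & 0xfff;           /* 12-bit page offset */

        uint32_t pde = phys_read32((cr3 & PG_FRAME) + di * 4);
        if (!(pde & PG_PRESENT))
            return false;                        /* page fault: no page table */

        uint32_t pte = phys_read32((pde & PG_FRAME) + ti * 4);
        if (!(pte & PG_PRESENT))
            return false;                        /* page fault: page not mapped */

        *paddr = (pte & PG_FRAME) | off;
        return true;
    }

Counting the accesses makes the cost obvious: every load or store needs the PDE, the PTE, and then the data itself, which is why the TLB in the next section exists.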

Making paging fast

- x86 PTs require 3 memory references per load/store
  - Look up page table address in page directory
  - Look up PPN in page table
  - Actually access physical page corresponding to virtual address
- For speed, CPU caches recently used translations
  - Called a translation lookaside buffer or TLB
  - Typical: 64-2K entries, 4-way to fully associative, 95% hit rate
  - Each TLB entry maps a VPN to a PPN + protection information
- On each memory reference (see the sketch after this section)
  - Check TLB; if entry present, get physical address fast
  - If not, walk page tables, insert in TLB for next time (must evict some entry)

TLB details

- TLB operates at CPU pipeline speed ⇒ small, fast
- Complication: what to do when switch address space?
  - Flush TLB on context switch (e.g., old x86)
  - Tag each entry with associated process's ID (e.g., MIPS)
- In general, OS must manually keep TLB valid
- E.g., x86 invlpg instruction
  - Invalidates a page translation in TLB
  - Must execute after changing a possibly used page table entry
  - Otherwise, hardware will miss page table change
- More complex on a multiprocessor (TLB shootdown)

x86 Paging Extensions

- PSE: Page size extensions
  - Setting bit 7 in PDE makes a 4MB translation (no PT)
- PAE: Page address extensions
  - Newer 64-bit PTE format allows 36 bits of physical address
  - Page tables, directories have only 512 entries
  - Use 4-entry Page-Directory-Pointer Table to regain 2 lost bits
  - PDE bit 7 allows 2MB translation
- Long mode PAE
  - In Long mode, pointers are 64-bits
  - Extends PAE to map 48 bits of virtual address (next slide)
  - Why aren't all 64 bits of VA usable?

x86 long mode paging

(figure: the 48-bit virtual address splits into four 9-bit indexes - Page Map Level 4 (PML4) offset, Page Directory Pointer offset, Page Directory offset, Page Table offset - plus a 12-bit physical page offset; bits 63-48 are the sign extension of bit 47; CR3 holds the PML4 base address, and each level's entry points to the next-level table, ending at a 4 KByte physical page)

Where does the OS live?

- In its own address space?
  - Can't do this on most hardware (e.g., syscall instruction won't switch address spaces)
  - Also would make it harder to parse syscall arguments passed as pointers
- So in the same address space as process
  - Use protection bits to prohibit user code from writing kernel
- Typically all kernel text, most data at same VA in every address space
  - On x86, must manually set up page tables for this
  - Usually just map kernel in contiguous virtual memory when boot loader puts kernel into contiguous physical memory
  - Some hardware puts physical memory (kernel-only) somewhere in virtual address space

Pintos memory layout

(figure: from 0x00000000 up - invalid virtual addresses, code segment starting at 0x08048000, data segment, BSS / heap, and user stack below 0xc0000000 (PHYS_BASE); kernel / pseudo-physical memory occupies 0xc0000000 to 0xffffffff; a sketch of the conversion this layout enables follows the TLB sketch below)
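A toy fully-associative TLB in front of the page-table walk, to show the check-then-walk-then-insert flow. The size, the round-robin eviction, and the walk_page_table() helper are assumptions for illustration; real TLBs also carry protection bits and, on some machines, an address-space ID.

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    #define TLB_ENTRIES 64

    struct tlb_entry { uint32_t vpn, ppn; bool valid; };

    static struct tlb_entry tlb[TLB_ENTRIES];
    static size_t next_victim;               /* trivial round-robin eviction */

    /* Assumed helper: the full page-table walk (e.g., x86_translate above). */
    extern bool walk_page_table(uint32_t vpn, uint32_t *ppn);

    static bool tlb_translate(uint32_t vaddr, uint32_t *paddr)
    {
        uint32_t vpn = vaddr >> 12, off = vaddr & 0xfff, ppn;

        for (size_t i = 0; i < TLB_ENTRIES; i++)      /* hit: fast path */
            if (tlb[i].valid && tlb[i].vpn == vpn) {
                *paddr = (tlb[i].ppn << 12) | off;
                return true;
            }

        if (!walk_page_table(vpn, &ppn))              /* miss: walk the tables */
            return false;                             /* page fault */

        tlb[next_victim].vpn = vpn;                   /* insert for next time, */
        tlb[next_victim].ppn = ppn;                   /* evicting some entry */
        tlb[next_victim].valid = true;
        next_victim = (next_victim + 1) % TLB_ENTRIES;

        *paddr = (ppn << 12) | off;
        return true;
    }

A context switch (or an invlpg after editing a PTE) must invalidate entries like these, or the hardware keeps using the stale mapping.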

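Because Pintos maps all of physical memory contiguously above PHYS_BASE, converting between physical addresses and kernel virtual addresses is a single addition or subtraction. Pintos provides helpers along these lines (named ptov and vtop); the definitions below are a simplified approximation rather than the exact source.

    #include <stdint.h>

    #define PHYS_BASE 0xc0000000u   /* start of kernel / pseudo-physical memory */

    /* Physical address -> kernel virtual address. */
    static inline void *ptov(uintptr_t paddr)
    {
        return (void *) (paddr + PHYS_BASE);
    }

    /* Kernel virtual address -> physical address. */
    static inline uintptr_t vtop(const void *vaddr)
    {
        return (uintptr_t) vaddr - PHYS_BASE;
    }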
Very different MMU: MIPS

- Hardware has 64-entry TLB
  - References to addresses not in TLB trap to kernel
- Each TLB entry has the following fields: Virtual page, Pid, Page frame, NC, D, V, Global
- Kernel itself unpaged
  - All of physical memory contiguously mapped in high VM
  - Kernel uses these pseudo-physical addresses
- User TLB fault handler very efficient
  - Two hardware registers reserved for it
  - utlb miss handler can itself fault, allowing paged page tables
- OS is free to choose page table format!

DEC Alpha MMU

- Software-managed TLB (like MIPS)
  - 8KB, 64KB, 512KB, 4MB pages all available
  - TLB supports 128 instruction/128 data entries of any size
- But TLB miss handler not part of OS
  - Processor ships with special "PAL code" in ROM
  - Processor-specific, but provides uniform interface to OS
  - Basically firmware that runs from main memory like OS
- Various events vector directly to PAL code
  - call_pal instruction, TLB miss/fault, FP disabled
- PAL code runs in special privileged processor mode
  - Interrupts always disabled
  - Have access to special instructions and registers

PAL code interface details

- Examples of Digital Unix PALcode entry functions
  - callsys/retsys - make, return from system call
  - swpctx - change address spaces
  - wrvptptr - write virtual page table pointer
  - tbi - TLB invalidate
- Some fields in PALcode page table entries
  - GH - 2-bit granularity hint: 2^N pages have same translation
  - ASM - address space match: mapping applies in all processes

Example: Paging to disk

- gcc needs a new page of memory
- OS re-claims an idle page from emacs (see the sketch at the end of this section)
- If page is clean (i.e., also stored on disk):
  - E.g., page of text from emacs binary on disk
  - Can always re-read same page from binary
  - So okay to discard contents now & give page to gcc
- If page is dirty (meaning memory is only copy)
  - Must write page to disk first before giving to gcc
- Either way:
  - Mark page invalid in emacs
  - emacs will fault on next access to virtual page
  - On fault, OS reads page data back from disk into new page, maps new page into emacs, resumes executing

Paging in day-to-day use

- Demand paging
- Growing the stack
- BSS page allocation
- Shared text
- Shared libraries
- Shared memory
- Copy-on-write (fork, mmap, etc.)
- Q: Which pages should have global bit set on x86?
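The clean-versus-dirty decision from the paging-to-disk example, as a sketch. Every type and helper here (struct page, page_is_dirty, write_to_swap, mark_not_present, give_frame_to_gcc) is hypothetical; the point is only the order of the steps.

    #include <stdbool.h>

    struct page;                                   /* victim's bookkeeping, elsewhere */

    extern bool page_is_dirty(struct page *p);     /* hardware dirty bit set? */
    extern void write_to_swap(struct page *p);     /* slow: must complete first */
    extern void mark_not_present(struct page *p);  /* victim faults on next access */
    extern void give_frame_to_gcc(struct page *p); /* reuse the physical frame */

    /* OS reclaims an idle page from emacs so gcc can have a frame. */
    static void reclaim_frame(struct page *victim)
    {
        if (page_is_dirty(victim))
            write_to_swap(victim);    /* memory is the only copy: save it */
        /* else: clean, e.g. text from the emacs binary -- just re-read it
           from the file if emacs ever touches the page again */

        mark_not_present(victim);     /* emacs's next access page-faults */
        give_frame_to_gcc(victim);    /* the fault handler will later read
                                         the data back into a new frame */
    }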