CIS Operating Systems Memory Management Address Translation. Professor Qiang Zeng Fall 2017


CIS 5512 - Operating Systems Memory Management Address Translation Professor Qiang Zeng Fall 2017

Outline
Contiguous allocation (fixed partitions, dynamic partitions): each process occupies one contiguous memory region in physical memory.
Non-contiguous allocation (segmentation, paging): each process occupies multiple memory regions scattered in physical memory.

Fixed vs. dynamic partitioning
Fixed partitioning: street parking with meters. Internal fragmentation: parking a small car in a full-size slot wastes the rest of the slot.
Dynamic partitioning: street parking without meters. External fragmentation: when small cars drive out, the non-contiguous holes they leave cannot be used by a big SUV.

Previous class: why is non-contiguous allocation better?
Reason 1: contiguous allocation does not exploit the working-set theory. With non-contiguous allocation, the memory needed by a process is cut into either a few parts (segmentation) or many equal-sized parts (paging). Only the parts (segments or pages) that belong to the current working set are swapped into memory, so the same memory supports more active processes.
Reason 2: the fragmentation problem is mitigated. Small chunks can now be used by segments or pages.

Segmentation vs. paging
Segmentation: causes external fragmentation; does not handle dynamic components (e.g., stack, heap) very well; swapping a segment is still slow; the working set is exploited at segment granularity.
Paging: no external fragmentation; allows dynamic components to grow and shrink flexibly; swapping occurs at page granularity; the working set is exploited at page granularity. Cons: complex and expensive address translation.

Paging example
Internal fragmentation? Yes, but only when the requested size is not a multiple of the page size. E.g., a process that requests 3.1 pages of space gets 4 pages.
External fragmentation? No. Any holes (i.e., page frames) left by an exited process can be reused.

Previous class
What happens if the RAM is smaller than the total size of the working sets of the processes? Thrashing.
Consider a simple example: memcpy(dst, src, 4096). Assume dst and src reside in pages 0 and 1, respectively. The working set during the call is (at least) two pages, but what if the physical memory allocated to the process is only one page? There will be continuous swapping: to access page 1 you first have to swap out page 0, and vice versa.

Logical view and physical view
The layout of processes in physical memory forms a physical view. A compiler, however, compiles a program based on a logical view.
Some slides are courtesy of Dr. Thomas Anderson, Dr. Abraham Silberschatz, and Dr. William Stallings.

Logical view vs. physical view
[Figure: the process view (virtual pages VPage 0 to VPage N holding code, data, heap, and stack) next to physical memory (Frame 0 to Frame M), where pieces labeled Code0, Data0, Heap0 to Heap2, Stack 0, Code1, Data1, Heap1, and Stack 1 are interleaved.]

Addresses
Logical address: a reference to a memory location in the logical view; used by the processor and the compiler. In segmentation and paging it is also called a virtual address. On Intel, logical address + segment base = virtual address; in Linux the segment base is 0, so the logical address still equals the virtual address.
Physical address: the actual location in main memory; sent to the memory bus.

Address Translation
Typically, a hardware component, the MMU (Memory Management Unit), does the translation.
[Figure: the processor issues a virtual (logical) address; translation either yields a physical address used to access data in physical memory (valid) or raises an exception (invalid).]

Registers for Contiguous Allocation
A pair of base and limit (or bound) registers defines the partition in physical memory; the limit is the partition size.

Address translation for Contiguous Allocation
[Figure: address translation with the base register.]

Address translation for Contiguous Allocation: questions
What is updated at a process switch to assist address translation? The base and limit registers.
Can it keep a program from accidentally overwriting its own code? No.
Can it share code/data with other processes? No.
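The base-and-limit check described above can be sketched in a few lines. This is a minimal illustration, not the hardware's actual logic; the names `translate_contiguous` and `ProtectionFault` and the register values are assumptions made for the example.

```python
class ProtectionFault(Exception):
    """Raised when a logical address falls outside the partition."""


def translate_contiguous(logical_addr: int, base: int, limit: int) -> int:
    # The limit register holds the partition size, so valid logical
    # addresses are 0 .. limit - 1.
    if not 0 <= logical_addr < limit:
        raise ProtectionFault(f"address {logical_addr:#x} is outside the partition")
    # Relocation: the base register is added to every valid logical address.
    return base + logical_addr


# Hypothetical register values, loaded by the OS at a process switch.
print(hex(translate_contiguous(0x100, base=0x4000, limit=0x1000)))  # 0x4100
```

Because the only state is one (base, limit) pair, there is no way to mark part of the partition read-only or to point two processes at the same region, which matches the two "No" answers.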

Address translation for Segmentation
A logical address consists of a segment number (or label) and an offset inside that segment.
Each process has a segment table. Each entry in the table stores the information for one segment: base, bound, and access permission.

Address translation for Segmentation
[Figure: in the process view, virtual memory holds code, data, heap, and stack segments. In the implementation, the segment number of a virtual address selects a segment table entry (base, bound, access; code is Read, the other three are R/W). The offset is checked against the bound, raising an exception on violation, and added to the base to form the physical address. In physical memory, segment i occupies Base i to Base i + Bound i.]

Questions
With segmentation, what is updated at a process switch to assist address translation? The register that points to the segment table.
Recall that a segment can be swapped in and out of memory. If a segment is swapped in and its location changes, what information in the table should be updated? The base field of the entry describing the location of that segment in physical memory.
Can it keep a program from accidentally overwriting its own code? Yes. The code segment can be set as read-only.
Can it share code/data with other processes? Yes. A physical segment can be pointed to by multiple processes.
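A segment-table lookup with bound and permission checks might look like the sketch below. The names `SegmentEntry`, `SegmentFault`, and `translate_segment`, and the table contents, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


class SegmentFault(Exception):
    """Raised on a bad segment number, offset, or access type."""


@dataclass
class SegmentEntry:
    base: int       # start of the segment in physical memory
    bound: int      # segment size in bytes
    writable: bool  # False models a read-only segment such as code


def translate_segment(table, seg, offset, write=False):
    if not 0 <= seg < len(table):
        raise SegmentFault("no such segment")
    entry = table[seg]
    if not 0 <= offset < entry.bound:
        raise SegmentFault("offset beyond segment bound")
    if write and not entry.writable:
        raise SegmentFault("write to read-only segment")
    return entry.base + offset


table = [SegmentEntry(0x4000, 0x700, writable=False),  # code
         SegmentEntry(0x0000, 0x500, writable=True)]   # data
print(hex(translate_segment(table, 0, 0x500)))  # 0x4500
```

Swapping a segment back in at a new location only requires changing `base` in its entry; sharing a segment means two tables hold entries with the same `base`.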

Sharing segments
[Figure: processes 1 and 2 each issue virtual address 0x0500 (segment 0, offset 500). Each process has its own segment table, with the code segment marked Read and the data, heap, and stack segments marked R/W. In physical memory the two code entries point to the same segment (P1's + P2's code), while each process has its own data, heap, and stack.]

UNIX fork()
UNIX fork virtually makes a complete copy of the parent's memory: (1) any given variable (except for the return value of fork()) has the same value in parent and child; (2) when a variable is changed in one process, the copy in the other is not changed, since the two processes are independent.
A very slow implementation: copy every segment.
A slightly improved implementation: share the code segment and copy the others.

UNIX fork() and Copy-on-Write (CoW)
Efficient implementation: copy-on-write.
Copy the segment table.
Set the read-only flag for all segments, for both parent and child, and increment the reference count (initially 0).
When the child or parent writes to any segment (e.g., stack, heap), an exception is triggered and control traps into the kernel.
The kernel copies the segment and sets the new copy as r/w.
The kernel decrements the reference count of the original segment; if it reaches 0, the write protection is removed.
This is a lazy copy of segments, which avoids unnecessary copying and a slow return from fork().

Questions
How does the system distinguish a valid write to a CoW segment from an invalid write? The ground-truth permission information for each segment is stored separately in the kernel.
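The copy-on-write bookkeeping can be simulated as follows. This is a simplified sketch under stated assumptions: `Segment`, `Entry`, `fork`, `write`, and `handle_write_fault` are illustrative names, `may_write` stands in for the ground-truth permission kept by the kernel, and the write protection of the last remaining sharer is removed lazily on its next write fault rather than at the moment the count reaches 0.

```python
class Segment:
    def __init__(self, data: bytes):
        self.data = bytearray(data)
        self.refcount = 0           # extra sharers; 0 means not shared


class Entry:
    """One segment-table entry."""
    def __init__(self, segment, may_write):
        self.segment = segment
        self.may_write = may_write  # ground-truth permission (kernel side)
        self.writable = may_write   # flag the hardware would check


def fork(parent_table):
    child_table = []
    for entry in parent_table:
        entry.writable = False              # write-protect the parent's view
        entry.segment.refcount += 1
        child = Entry(entry.segment, entry.may_write)
        child.writable = False              # and the child's view
        child_table.append(child)
    return child_table


def handle_write_fault(entry):
    if not entry.may_write:
        raise PermissionError("invalid write: segment is really read-only")
    original = entry.segment
    if original.refcount > 0:               # still shared: make a private copy
        entry.segment = Segment(bytes(original.data))
        original.refcount -= 1
    entry.writable = True                   # private now, so allow writes


def write(table, seg_no, offset, value):
    entry = table[seg_no]
    if not entry.writable:                  # this is where the trap would occur
        handle_write_fault(entry)
    entry.segment.data[offset] = value
```

After `fork`, the first writer pays for one copy; the other process keeps the original segment and never copies it.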

Paging
Divide physical memory into fixed-sized blocks called frames. The size is a power of 2, between 512 bytes and 16 MB (or larger), depending on the system.
Divide logical memory into blocks of the same size called pages.
A page can be placed in any frame.

Virtual Address
An address generated by the CPU is divided into:
Page number (p): used as an index into a page table, which is an array of entries for translating page # to frame #.
Page offset (d): combined with the frame # to define the physical address that is sent to the memory unit.
With an m-bit virtual address and an n-bit offset, the page number p takes m - n bits and the page offset d takes n bits.

Page Table
Each process has its own page table, which is an array of page table entries for address translation.
A page table entry contains:
present bit (1 if the page is in physical memory)
frame number
protection bits: read, write, exec
modified bit (1 if the page has been modified)
reference bit (1 if the page was recently referenced)

Steps of address translation using a page table
Use the page number in the address to index the page table.
If the entry's present bit is 1, build the physical address: (frame #, offset).
Otherwise, raise a page fault.
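These steps can be sketched as a single function. The names `PTE`, `PageFault`, and `translate_page` are assumptions for illustration, and a 12-bit offset (4 KB pages) is used as the default.

```python
from dataclasses import dataclass


class PageFault(Exception):
    """Raised when the page is not present in physical memory."""


@dataclass
class PTE:
    present: bool
    frame: int = 0


def translate_page(page_table, vaddr, offset_bits=12):
    page = vaddr >> offset_bits                 # high bits: page number
    offset = vaddr & ((1 << offset_bits) - 1)   # low n bits: page offset
    entry = page_table[page]
    if not entry.present:
        raise PageFault(f"page {page} is not in memory")
    return (entry.frame << offset_bits) | offset


page_table = [PTE(True, 7), PTE(False)]
print(hex(translate_page(page_table, 0x0abc)))  # 0x7abc
```

The offset passes through unchanged; only the page number is replaced by the frame number.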

Paging Hardware

Questions
What is updated at a process switch to assist address translation? The register pointing to the page table.
Given a 4 KB page size and a 32-bit virtual address, calculate m (# of bits for the virtual address space), n (# of bits for the offset within a page), and the number of bits representing the page #. m = 32 and n = 12, so the page # is represented by 32 - 12 = 20 bits.

Virtual address space: 16 bytes (m = 4). Memory size per page: 4 bytes (n = 2).
Write the page table and use it to translate the virtual address 0xa.

Page #   Contents   Page table entry
0        A B C D    frame # = 3
1        E F G H    frame # = 2
2        I J K L    frame # = 0
3        (none)     present = 0

Physical memory: frame 0 holds I J K L, frame 2 holds E F G H, frame 3 holds A B C D.
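One way to work the exercise, assuming the flattened page table above reads as page 0 to frame 3, page 1 to frame 2, page 2 to frame 0, and page 3 not present:

```latex
\begin{align*}
\texttt{0xa} &= 1010_2 \\
\text{page \#} &= 10_2 = 2, \qquad \text{offset} = 10_2 = 2 \\
\text{page table}[2] &\Rightarrow \text{frame \#} = 0 \\
\text{physical address} &= 0 \times 4 + 2 = 2
\end{align*}
```

Under that reading, physical address 2 falls in frame 0 (I J K L) at offset 2, which is the byte K, the same byte found at virtual address 0xa in page 2.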

Questions
What if the page size is very large? Internal fragmentation, if we don't need all of the space inside a page frame.
What if the page size is very small? Many page table entries, which consume memory, and poor TLB use (covered later).
Can we share memory between processes? Yes: the same page frames are pointed to by the page tables of the processes that share the memory. This is how processes share the kernel address space, program code, and library code.

Paging and Copy on Write
UNIX fork with copy on write:
Copy the page table of the parent into the child process.
Mark all pages (in both page tables) as read-only.
Trap into the kernel on a write (in child or parent).
Copy the page.
Mark both as writeable.
Resume execution.

Issues with the array-based page table implementation
Large memory consumption due to the table.
32-bit space, 4 KB page size: $2^{32} / 2^{12} = 2^{20}$ = 1M entries; at 4 bytes per entry, 4 MB of memory.
A 64-bit machine currently uses a 48-bit address space: $2^{48} / 2^{12} = 2^{36}$ = 64G entries, i.e., 256 GB of memory.
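The arithmetic can be checked with a short helper. `flat_page_table_bytes` is a hypothetical name; the 4-byte entry size is the one the slide uses for both cases.

```python
def flat_page_table_bytes(va_bits, page_bits=12, pte_bytes=4):
    entries = 1 << (va_bits - page_bits)   # one entry per virtual page
    return entries * pte_bytes


MIB, GIB = 1 << 20, 1 << 30
print(flat_page_table_bytes(32) // MIB)  # 4   (MB, 32-bit space)
print(flat_page_table_bytes(48) // GIB)  # 256 (GB, 48-bit space)
```

Note that this cost is per process, since each process has its own page table.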

Multilevel Page Table

Questions
What range of the virtual address space is covered by one entry in the L1 page table? $2^{22}$ bytes = 4 MB.
Recall that the single-level implementation always needs 4 MB; how much memory does the multi-level page table need at most? 4 KB for the L1 page table (1024 entries, 4 bytes each) plus 1024 L2 page tables (4 MB): 1 + 1024 = 1025 frames, that is, 4 KB + 4 MB.
Is that even worse? It is, but only in the worst case.

Questions
If the process needs 8 MB ($2^{23}$ bytes) of memory, how many page frames are used as page tables at least? How many at most?
Best case: all pages are together in the virtual space. 1 frame for the L1 page table + 2 for L2 page tables ($2^{23} / 2^{12} = 2^{11} = 2048$ entries, which fill 2 L2 page tables).
Worst case: pages are scattered. 1 frame for the L1 page table, plus the 2048 pages scattered across all 1024 L2 page tables, for 1025 frames.
Which is the case in reality?
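The best and worst cases can be computed for any process size. This sketch assumes the two-level layout used in these questions (4 KB pages, 1024 entries per table) and, for the best case, that the pages start on a 4 MB boundary; `page_table_frames` is an illustrative name.

```python
import math

ENTRIES_PER_TABLE = 1024   # a 4 KB table holds 1024 four-byte entries
PAGE_SIZE = 4096


def page_table_frames(process_bytes):
    pages = math.ceil(process_bytes / PAGE_SIZE)
    # Best case: pages are packed together, filling as few L2 tables as possible.
    best = 1 + math.ceil(pages / ENTRIES_PER_TABLE)
    # Worst case: every page lands in a different L2 table, up to all 1024.
    worst = 1 + min(pages, ENTRIES_PER_TABLE)
    return best, worst


print(page_table_frames(8 * 1024 * 1024))  # (3, 1025)
```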

Anatomy of the virtual address space of a process

Multilevel Translation
Pros: saves the physical memory used by page tables.
Cons: two (or more) lookups per memory reference: 2 levels for 32 bits, 4 levels for 48 bits (in current 64-bit systems).
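A two-level walk for a 32-bit address might be sketched as below, assuming a 10/10/12 split of the address bits. Dictionaries stand in for the tables so that missing L2 tables cost nothing, which is the memory saving; `split`, `walk`, and `PageFault` are hypothetical names.

```python
class PageFault(Exception):
    """Raised when no mapping exists for the address."""


def split(vaddr):
    l1 = (vaddr >> 22) & 0x3FF    # top 10 bits: index into the L1 table
    l2 = (vaddr >> 12) & 0x3FF    # next 10 bits: index into an L2 table
    offset = vaddr & 0xFFF        # low 12 bits: offset within the page
    return l1, l2, offset


def walk(l1_table, vaddr):
    l1, l2, offset = split(vaddr)
    l2_table = l1_table.get(l1)   # first lookup
    if l2_table is None:
        raise PageFault("no L2 table for this 4 MB region")
    frame = l2_table.get(l2)      # second lookup
    if frame is None:
        raise PageFault("page not mapped")
    return (frame << 12) | offset


l1_table = {1: {2: 0x55}}         # one L2 table with one mapped page
vaddr = (1 << 22) | (2 << 12) | 0x34
print(hex(walk(l1_table, vaddr)))  # 0x55034
```

Each call performs two table lookups before the actual memory access, which is the cost listed under "Cons".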

TLB (Translation Lookaside Buffer)
TLB: a hardware cache for the page tables; each entry stores a mapping from page # to frame #.
Address translation for an address (page #, offset):
If the page # is in the cache, get the frame # out.
Otherwise, get the frame # from the page table in memory, then load the (page #, frame #) mapping into the TLB.
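The hit and miss paths can be modeled in software as a small cache in front of the page table. The class name `TLB`, its capacity, and the least-recently-used eviction policy are assumptions for this sketch; the slides do not specify a replacement policy.

```python
from collections import OrderedDict


class TLB:
    def __init__(self, capacity=4):
        self.capacity = capacity
        self.entries = OrderedDict()   # page # -> frame #
        self.hits = 0
        self.misses = 0

    def lookup(self, page, page_table):
        if page in self.entries:               # hit: no page table access
            self.hits += 1
            self.entries.move_to_end(page)     # mark as most recently used
            return self.entries[page]
        self.misses += 1
        frame = page_table[page]               # miss: slow page table lookup
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)   # evict least recently used (assumed policy)
        self.entries[page] = frame             # load the mapping into the TLB
        return frame
```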

TLB Lookup
[Figure: the page # of a virtual address is compared against the TLB entries (virtual page, page frame, access). On a matching entry, the frame is combined with the offset to form the physical address; otherwise a page table lookup is performed.]

TLB and Page Table Translation
[Figure: the processor sends a virtual address to the TLB. On a hit, the frame and offset form the physical address. On a miss, the page table is consulted: a valid entry yields the frame, an invalid entry raises an exception. The physical address is then used to access data in physical memory.]

Cost
Hit ratio: percentage of times that the cache has the needed data, denoted $h$; the miss ratio is then $1 - h$.
Cost of a TLB lookup: $T_{tlb}$. Cost of a page table lookup: $T_{pt}$.
Cost of translation $= h \cdot T_{tlb} + (1 - h)(T_{tlb} + T_{pt}) = T_{tlb} + (1 - h) \cdot T_{pt}$
Assume the cost of a TLB lookup is 1 ns, the cost of a page table lookup is 200 ns, and the hit ratio is 99%.
Cost $= 1 + 0.01 \times 200 = 3$ ns, a boost compared to 200 ns.
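The formula is easy to evaluate for other hit ratios. `translation_cost` is a hypothetical helper that implements the simplified form, TLB cost plus miss ratio times page table cost.

```python
def translation_cost(hit_ratio, t_tlb, t_pt):
    # Every reference pays the TLB lookup; only misses pay the page table lookup.
    return t_tlb + (1 - hit_ratio) * t_pt


print(round(translation_cost(0.99, 1, 200), 6))  # 3.0
print(round(translation_cost(0.90, 1, 200), 6))  # 21.0
```

Dropping the hit ratio from 99% to 90% raises the cost from 3 ns to 21 ns, which shows how sensitive the result is to the miss ratio.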

Improvement 1: Superpages
On many systems, a TLB entry can map a superpage. This saves many TLB entries compared to small pages.
x86: superpages of 2 MB or 4 MB, and 1 GB on x86-64.

Superpages
[Figure: TLB lookup with superpages. Each TLB entry holds either a superpage (SP) or a page #, mapped to a superframe (SF) or a frame, plus access bits. A matching page entry yields frame + offset; a matching superpage entry yields SF + offset; otherwise a page table lookup is performed.]

Improvement 2: Tagged TLB
An address mapping only makes sense for a specific process. One way to deal with a process switch is to flush the TLB, but it takes time to refill TLB entries for the newly switched-in process. Some entries can be pinned, e.g., those for the kernel.
Tagged TLB: each TLB entry carries a process ID, and a TLB hit occurs only if the process ID matches the current process, so there is no need to flush the TLB.
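The tagging idea can be sketched by keying each entry on the pair (process ID, page #). `TaggedTLB` and the sample mappings are assumptions made for illustration.

```python
class TaggedTLB:
    def __init__(self):
        self.entries = {}   # (process ID, page #) -> frame #

    def insert(self, pid, page, frame):
        self.entries[(pid, page)] = frame

    def lookup(self, pid, page):
        # A hit requires both the page # and the process ID to match.
        return self.entries.get((pid, page))   # None models a miss


tlb = TaggedTLB()
tlb.insert(pid=0, page=0x53, frame=0x03)
tlb.insert(pid=1, page=0x53, frame=0x12)
print(tlb.lookup(0, 0x53), tlb.lookup(1, 0x53), tlb.lookup(2, 0x53))  # 3 18 None
```

Both processes map the same page number to different frames, and the entries coexist, so nothing has to be flushed when the CPU switches between them.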

[Figure: tagged TLB implementation. The page # of the virtual address and the current process ID are compared against each TLB entry (process ID, page, frame, access). A matching entry yields frame + offset as the physical address; otherwise a page table lookup is performed.]

Improvement 3: TLB Consistency
The OS should discard the related TLB entry, e.g., when swapping out a page, or when R/W pages become read-only due to fork(). These events can be summarized as permission reduction.
To support shared memory, upon permission reduction all the corresponding TLB entries should be discarded.
On a multicore, upon permission reduction, the OS must ask each CPU to discard the related TLB entries. This is called TLB shootdown. It requires inter-processor interrupts, and the initiating processor has to wait until all other processors acknowledge completion, so it is costly.

TLB Shootdown

TLB              Process ID   VirtualPage   PageFrame   Access
Processor 1      0            0x0053        0x0003      R/W
Processor 1      1            0x40FF        0x0012      R/W
Processor 2      0            0x0053        0x0003      R/W
Processor 2      0            0x0001        0x0005      Read
Processor 3      1            0x40FF        0x0012      R/W
Processor 3      0            0x0001        0x0005      Read

Question
Why upon permission reduction only? In the case of a permission increase, a page fault will be triggered and the TLB has a chance to be updated then (page fault handling is covered next class). This is a lazy update.

Summary
Address translation for contiguous allocation: base + bound registers
Address translation for segmentation: segment table, copy-on-write, sharing
Memory-efficient address translation for paging: multi-level page tables
Accelerated address translation for paging: TLB

Writing assignment
What is the benefit of CoW?
Compare the conventional page table and the inverted page table.
Why do we use multi-level page tables for address translation?
What is a TLB? What is it used for?

Backup pages

Previous class
What is internal fragmentation? What is external fragmentation?
Fragmentation: small useless chunks.
Internal fragmentation: small useless chunks inside the allocated partitions.
External fragmentation: small useless chunks outside the allocated partitions.

How the Buddy System Helps
Three sentences: split-based allocation; coalescing-buddies-based deallocation; free-lists-based implementation (an array of free lists, each storing the unallocated partitions of a certain size).
Internal fragmentation: best fit every time. Upper bound: ~50%. E.g., 128.01k is rounded up to 256k.
External fragmentation: coalescing can only occur between buddies; there is a restriction on splitting, e.g., a 64 kB minimum.
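The rounding and the buddy relationship can be sketched with three small helpers. The function names are hypothetical, sizes are in kB with the 64 kB minimum from the example above, and block addresses are assumed to be offsets from the start of the managed pool, aligned to the block size.

```python
def round_up_pow2(size, minimum=64):
    # Keep doubling the block until the request fits.
    block = minimum
    while block < size:
        block *= 2
    return block


def buddy_of(addr, block):
    # Two buddies differ only in the address bit equal to the block size.
    return addr ^ block


def internal_fragmentation(size, minimum=64):
    block = round_up_pow2(size, minimum)
    return (block - size) / block


print(round_up_pow2(129))                        # 256
print(buddy_of(256, 256), buddy_of(512, 256))    # 0 768
print(round(internal_fragmentation(128.01), 3))  # 0.5
```

A freed block can merge only with `buddy_of(addr, block)`, not with any other free neighbor, which is why some external fragmentation remains.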

Previous class
If swapping is not used, do segmentation and paging still have advantages over contiguous memory allocation?
Yes. Segmentation and paging can be regarded as a special type of dynamic partitioning in which a process can occupy multiple partitions rather than one. Small chunks that are probably useless in dynamic partitioning may be used to accommodate segments or pages of one or more processes, so space is used better. You will see other advantages, e.g., sharing.

Inverted Page Table
One page table for each process may be too costly.
The inverted page table is a hash table that maps virtual page # to frame #.
The whole system uses a single inverted page table regardless of the number of processes or the size of the virtual space.
The structure is called inverted because it indexes page table entries by frame number rather than by page number.

Inverted Page Table
Each entry in the page table includes:
Process identifier: the process that owns this page frame.
Page number: needed because there may be hash collisions.
Control bits: flags plus protection and locking information.
Chain pointer: used to resolve hash collisions.

Inverted Page Table Note that j is used as the frame #
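A toy inverted page table with chained collision handling is sketched below. It is one possible layout, not the one in the slide's figure: it assumes a separate array of bucket heads (`buckets`) that point into a per-frame entry array, and all class and field names are hypothetical. The index `j` of the matching entry is returned as the frame #.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IPTEntry:
    pid: int               # process that owns this page frame
    page: int              # virtual page number, compared after hashing
    next: Optional[int]    # chain pointer: next frame index in the same bucket


class InvertedPageTable:
    def __init__(self, num_frames, num_buckets=None):
        self.entries = [None] * num_frames                  # one entry per frame
        self.buckets = [None] * (num_buckets or num_frames)

    def _bucket(self, pid, page):
        return hash((pid, page)) % len(self.buckets)

    def map(self, pid, page, frame):
        b = self._bucket(pid, page)
        # Link the new entry in front of the bucket's existing chain.
        self.entries[frame] = IPTEntry(pid, page, self.buckets[b])
        self.buckets[b] = frame

    def lookup(self, pid, page):
        j = self.buckets[self._bucket(pid, page)]
        while j is not None:
            entry = self.entries[j]
            if entry.pid == pid and entry.page == page:
                return j                 # j is used as the frame #
            j = entry.next               # collision: follow the chain
        return None                      # no mapping: page fault
```

The table size depends on the number of physical frames only, no matter how many processes exist or how large their virtual spaces are.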

Terms (x86)
Real mode: a legacy operating mode of x86-compatible CPUs. 20-bit address space, i.e., 1 MB. physical_address = segment_part × 16 + 16-bit offset. No support for memory protection: each process can access any physical location.
Protected mode: enables memory protection (recall privilege rings) by setting the Protection Enable bit in CR0. Segmentation and paging.
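The real-mode formula can be tried out directly. `real_mode_address` is a hypothetical helper; masking the result to 20 bits models the wrap-around of a CPU with 20 address lines and is an assumption of this sketch.

```python
def real_mode_address(segment, offset):
    assert 0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF
    # segment * 16 is a left shift by 4 bits; keep the low 20 bits.
    return ((segment << 4) + offset) & 0xFFFFF


print(hex(real_mode_address(0x1234, 0x0010)))  # 0x12350
print(hex(real_mode_address(0x1235, 0x0000)))  # 0x12350
```

Different segment:offset pairs can name the same physical byte, and nothing stops a program from loading any segment value, which is why real mode offers no memory protection.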

Terms (x86)
GDT (Global Descriptor Table) and LDT (Local Descriptor Table): they contain segment descriptors, each of which defines the base, limit, and privilege of a segment.
A segment register contains a segment selector, which determines which table to use and provides an index into the table.
Each logical address consists of a segment register and an offset. However, the segment selector is usually specified implicitly: e.g., an instruction fetch implies the code segment (CS); a data access implies DS.
Privilege check: CPL ≤ DPL, where CPL is the current privilege level (found in the CS register) and DPL is the descriptor privilege level of the segment to be accessed.
This is how Protection

Segmentation in Linux/x86
Segmentation cannot be disabled on x86-32 processors; it is used to translate a two-part logical address to a linear address.
Linux has to pretend that it is using segmentation: for each segment, base = 0 and limit = 4 GB, which means that a logical address equals a linear address.
However, the segment selector registers, especially the code segment register, are critical to implementing the protection ring idea.
Linux defines four segment selector values: USER_CS and USER_DS for user space, KERNEL_CS and KERNEL_DS for kernel space.
The stack segment register uses the data segment.