Main Memory: Address Translation (Chapter 8)
CS 4410 Operating Systems
Can't We All Just Get Along?
Physical reality: different processes/threads share the same hardware → need to multiplex:
- CPU (temporal)
- Memory (spatial)
- Disk and devices (later)
Why worry about memory sharing?
- The complete working state of a process and/or the kernel is defined by its data in memory (+ registers)
- Don't want different processes to have access to each other's memory (protection)
Aspects of Memory Multiplexing
- Isolation: don't want distinct process states to collide in physical memory (unintended overlap → chaos)
- Sharing: want the option to overlap when desired (for communication)
- Virtualization: want to create the illusion of more resources than exist in the underlying physical system
- Utilization: want to make the best use of this limited resource
Address Translation
- Paged Translation
- Efficient Address Translation
All in the context of the OS
A Day in the Life of a Program
A sum.c source file goes through the Compiler (+ Assembler + Linker) to produce an executable: .text at 0x00400000, .data at 0x10000000 (holding max, sum, ...), plus machine code for instructions such as jal and addi. The Loader then sets up the process (pid xxx): PC = 0x00400000, GP = 0x10000000, SP at the top of the stack. It's alive! The address space from 0x00000000 to 0xffffffff holds text, data, heap, and stack.
Logical View of Process Memory
One address space from 0x00000000 to 0xffffffff: text, data, heap (growing up), stack (growing down).
What's wrong with this in the context of: multiple processes? multiple threads?
Paged Translation
Processor's view: virtual pages 0..N holding text, data, heap, and stack. Physical memory: frames 0..M holding those pages in any order (e.g., STACK 0, HEAP 0, TEXT 1, HEAP 1, DATA 0, TEXT 0, STACK 1).
TERMINOLOGY ALERT: Page: the data itself. Frame: the physical location.
No more external fragmentation!
Paging Overview
Divide:
- physical memory into fixed-sized blocks called frames
- logical memory into blocks of the same size called pages
Management: keep track of all free frames. To run a program with n pages, find n free frames and load the program.
Notice: the logical address space can be noncontiguous! A process is given frames whenever/wherever they are available.
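The free-frame bookkeeping above can be sketched as a simple bitmap allocator. This is a minimal illustration, not how any particular kernel does it (names such as alloc_frame and NUM_FRAMES are hypothetical; real kernels use free lists or buddy allocators):

```c
#include <stdint.h>

#define NUM_FRAMES 64                  /* hypothetical tiny physical memory */

static uint8_t frame_used[NUM_FRAMES]; /* 0 = free, 1 = allocated */

/* Return the index of a free frame and mark it used, or -1 if none. */
int alloc_frame(void) {
    for (int f = 0; f < NUM_FRAMES; f++) {
        if (!frame_used[f]) { frame_used[f] = 1; return f; }
    }
    return -1;                         /* out of physical memory */
}

void free_frame(int f) {
    frame_used[f] = 0;
}

/* To run a program with n pages, grab n free frames (in any order). */
int load_program(int n, int frames_out[]) {
    for (int i = 0; i < n; i++) {
        int f = alloc_frame();
        if (f < 0) return -1;          /* not enough free frames */
        frames_out[i] = f;
    }
    return 0;
}
```

Note that the frames handed out need not be contiguous, which is exactly why the logical address space can map to scattered physical memory.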
Address Translation, Conceptually
The processor issues a virtual address; the translation unit either rejects it (invalid → raise exception) or maps it (valid) to a physical address, which is then used to fetch the data from physical memory.
Memory Management Unit (MMU)
- A hardware device that maps each virtual address to a physical address (used to access the data)
- The user process deals only with virtual addresses; it never sees the physical address
High-Level Address Translation
Processor's view: pages 0..N (text, data, heap, stack). Physical memory: frames 0..M (STACK 0, HEAP 0, TEXT 1, HEAP 1, DATA 0, TEXT 0, STACK 1). Example: the red cube is the 255th byte in page 2. Where is the red cube in physical memory?
Logical Address Components
- Page number (upper bits): must be translated into a physical frame number
- Page offset (lower bits): does not change in translation
For a given logical address space of size 2^m and page size 2^n:
page number = upper (m - n) bits, page offset = lower n bits
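The split can be sketched in a few lines, assuming 32-bit addresses and 4 KB pages (so m = 32, n = 12; the function names are illustrative):

```c
#include <stdint.h>

#define PAGE_BITS 12u                  /* 4 KB pages: 2^12 bytes */
#define PAGE_SIZE (1u << PAGE_BITS)

/* Upper (m - n) bits: the virtual page number, which gets translated. */
uint32_t page_number(uint32_t vaddr) {
    return vaddr >> PAGE_BITS;
}

/* Lower n bits: the offset within the page; unchanged by translation. */
uint32_t page_offset(uint32_t vaddr) {
    return vaddr & (PAGE_SIZE - 1);
}
```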
High-Level Address Translation
Virtual memory (pages at 0x0000 through 0x5000) vs. physical memory (frames at 0x0000 through 0x6000: STACK 0, HEAP 0, TEXT 1, HEAP 1, DATA 0, TEXT 0, STACK 1).
Who keeps track of the mapping? → The Page Table.
Example: virtual address 0x20FF translates to 0x????
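The translation step itself is just an array lookup on the page number. A minimal sketch, using an illustrative mapping (the page_table contents below are made up, not the ones from the slide's figure):

```c
#include <stdint.h>

#define PAGE_BITS 12u
#define PAGE_SIZE (1u << PAGE_BITS)
#define NUM_PAGES 6u

/* Hypothetical per-process page table: page number -> frame number. */
static const uint32_t page_table[NUM_PAGES] = { 4, 3, 6, 1, 0, 5 };

/* Translate: replace the page number with the frame number,
 * keep the offset unchanged. */
uint32_t translate(uint32_t vaddr) {
    uint32_t page   = vaddr >> PAGE_BITS;
    uint32_t offset = vaddr & (PAGE_SIZE - 1);
    return (page_table[page] << PAGE_BITS) | offset;
}
```

With this mapping, 0x20FF (page 2, offset 0x0FF) lands in frame 6, i.e., physical address 0x60FF.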
Simplified Page Table
- The page table lives in memory
- The page-table base register (PTBR) points to the page table
- The PTBR is saved/restored on context switch
Leveraging Paging
- Protection
- Dynamic Loading
- Dynamic Linking
- Copy-On-Write
Full Page Table
Each page-table entry also holds metadata about the mapping: Protection (R/W/X), Modified, Valid, etc.
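A PTE with such metadata is typically a single word with flag bits packed below the frame number. The layout below is a hypothetical sketch (real bit positions vary by architecture):

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical PTE layout: low bits are flags, upper bits the frame #. */
#define PTE_VALID  (1u << 0)
#define PTE_WRITE  (1u << 1)
#define PTE_EXEC   (1u << 2)
#define PTE_DIRTY  (1u << 3)
#define FRAME_SHIFT 12u

uint32_t make_pte(uint32_t frame, uint32_t flags) {
    return (frame << FRAME_SHIFT) | flags;
}

uint32_t pte_frame(uint32_t pte) {
    return pte >> FRAME_SHIFT;
}

/* Does this PTE permit the access?  is_write: store vs. load. */
bool access_ok(uint32_t pte, bool is_write) {
    if (!(pte & PTE_VALID)) return false;           /* raise exception */
    if (is_write && !(pte & PTE_WRITE)) return false; /* protection fault */
    return true;
}
```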
Dynamic Loading & Linking
Dynamic Loading:
- A routine is not loaded until it is called
- Better memory-space utilization; an unused routine is never loaded
- No special support from the OS needed
Dynamic Linking:
- A routine is not linked until execution time; locate (or load) the library routine when it is called
- AKA shared libraries (e.g., DLLs)
Copy on Write (COW)
P1 calls fork(); P2 is created with its own page table but the same translations. All pages are marked COW (in the page table), so both virtual address spaces (stack, heap, data, text) initially map to the same physical frames.
COW, then keep executing
Option #1: the child keeps executing. Upon a write to a COW page, a page fault occurs; the handler allocates a new frame, copies the old frame into it, and both pages are no longer COW.
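The fault-handler logic above can be simulated in miniature. This is a deliberately simplified sketch (a real handler also tracks frame reference counts and clears the other sharer's COW bit when it becomes the sole owner; all names here are hypothetical):

```c
#include <stdint.h>
#include <string.h>

#define NUM_FRAMES 8
#define PAGE_SIZE  16               /* tiny pages, for illustration only */

static uint8_t memory[NUM_FRAMES][PAGE_SIZE];
static int     next_free = 2;       /* frames 0,1 already in use */

/* Simulated PTE: frame number plus a COW flag. */
struct pte { int frame; int cow; };

/* On a write fault to a COW page: allocate a fresh frame, copy the
 * shared frame's contents into it, repoint this PTE, clear COW. */
void cow_fault(struct pte *p) {
    int f = next_free++;            /* no real free-list here */
    memcpy(memory[f], memory[p->frame], PAGE_SIZE);
    p->frame = f;
    p->cow = 0;
}
```

After the fault, the writer has a private copy while the other process still sees the original frame.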
COW, then call exec (before)
Option #2: the child calls exec(). Before the call, P1 and P2 still share all frames (heap, text, data, stack), with every page marked COW.
COW, then call exec (after)
After exec(), new frames are loaded for the child's new program; the child's page table points to them, the pages are no longer COW, and P1 keeps its original frames.
Downsides to Paging
Memory consumption:
- Internal fragmentation. Make pages smaller? But then...
- Page table space: consider a 32-bit address space, 4 KB page size, each PTE 8 bytes. How big is this page table? How many pages in memory does it need?
Performance: every data/instruction access requires two memory accesses: one for the page table, one for the data/instruction.
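The flat-table arithmetic from the question above can be checked directly: 2^(32-12) = 2^20 entries, times 8 bytes each, is 8 MB per process, which itself occupies 2048 pages:

```c
#include <stdint.h>

/* The slide's numbers: 32-bit addresses, 4 KB pages, 8-byte PTEs. */
#define ADDR_BITS 32u
#define PAGE_BITS 12u              /* 4 KB = 2^12 */
#define PTE_SIZE   8u              /* bytes per page-table entry */

uint64_t num_ptes(void)    { return 1ull << (ADDR_BITS - PAGE_BITS); }
uint64_t table_bytes(void) { return num_ptes() * PTE_SIZE; }
uint64_t table_pages(void) { return table_bytes() >> PAGE_BITS; }
```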
Address Translation
- Paged Translation
- Efficient Address Translation
  - Multi-Level Page Tables
  - Inverted Page Tables
  - TLBs
Multi-Level Page Tables to the Rescue!
Implementation: the virtual address is split into Index 1, Index 2, Index 3, and an offset. Each index selects an entry at the corresponding level of the tree; the final level yields the frame, which is combined with the offset to form the physical address.
+ Allocate only the PTEs in use
+ Simple memory allocation
- More lookups per memory reference
Two-Level Paging Example
32-bit machine, 1 KB page size. The logical address is divided into:
- a page offset of 10 bits (1024 = 2^10)
- a page number of 22 bits (32 - 10)
Since the page table is itself paged, the page number is further divided into:
- a 12-bit first index
- a 10-bit second index
Thus a logical address looks like: | index 1 (12 bits) | index 2 (10 bits) | offset (10 bits) |
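Extracting the three fields from the example's 12/10/10 split is pure shifting and masking (function names are illustrative):

```c
#include <stdint.h>

/* Field widths from the example: 12-bit index 1, 10-bit index 2,
 * 10-bit offset (1 KB pages). */
#define OFFSET_BITS 10u
#define IDX2_BITS   10u

uint32_t index1(uint32_t vaddr) {
    return vaddr >> (OFFSET_BITS + IDX2_BITS);
}
uint32_t index2(uint32_t vaddr) {
    return (vaddr >> OFFSET_BITS) & ((1u << IDX2_BITS) - 1);
}
uint32_t offset(uint32_t vaddr) {
    return vaddr & ((1u << OFFSET_BITS) - 1);
}
```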
This one goes to three!
Implementation: the same idea with a third level. Index 1 → Index 2 → Index 3 → frame, combined with the offset to form the physical address.
+ The first level requires less contiguous memory
- Even more lookups per memory reference
Complete Page Table Entry (PTE)
Fields: Valid | Protection (R/W/X) | Referenced | Dirty | Index
The Index is an index into:
- the table of memory frames (if bottom level)
- the table of page-table frames (if multilevel page table)
- the backing store (if the page was swapped out)
Synonyms: Valid bit == Present bit; Dirty bit == Modified bit; Referenced bit == Accessed bit
Inverted Page Table: Motivation
So many virtual pages, comparatively few physical frames. Traditional page tables map pages to frames, and are numerous and sparse. Why not map frames to pages instead? (How?)
Inverted Page Table: Implementation
One page table for the entire system, with one entry per frame in memory; each entry records which (pid, page #) currently occupies that frame. To translate, search the table for the entry matching the (pid, page #) of the virtual address; combine the matching frame with the offset.
Why don't we store the frame #? Because the entry's position in the table is the frame #.
Not to scale: the page table << memory.
Inverted Page Table: Discussion
Tradeoffs: less memory to store page tables, but more time to search them.
Solution: hashing. hash(page, pid) → PT entry (or a chain of entries).
What about: collisions? sharing?
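A hashed inverted table can be sketched as follows. This is a toy: the hash function is illustrative, and linear probing stands in for the slide's chain of entries (real designs, e.g., PowerPC's, use a separate hash anchor table):

```c
#include <stdint.h>

#define NUM_FRAMES 16u              /* tiny physical memory, for illustration */

/* One entry per frame: which (pid, page) currently occupies it. */
struct ipte { int pid; uint32_t page; int used; };

static struct ipte table[NUM_FRAMES];

/* Toy hash; a real kernel uses something stronger. */
static uint32_t hash(int pid, uint32_t page) {
    return (page * 31u + (uint32_t)pid) % NUM_FRAMES;
}

int ipt_insert(int pid, uint32_t page) {
    for (uint32_t i = 0; i < NUM_FRAMES; i++) {
        uint32_t f = (hash(pid, page) + i) % NUM_FRAMES;
        if (!table[f].used) {
            table[f] = (struct ipte){ pid, page, 1 };
            return (int)f;          /* the entry's index IS the frame # */
        }
    }
    return -1;                      /* no free frame */
}

int ipt_lookup(int pid, uint32_t page) {
    for (uint32_t i = 0; i < NUM_FRAMES; i++) {
        uint32_t f = (hash(pid, page) + i) % NUM_FRAMES;
        if (table[f].used && table[f].pid == pid && table[f].page == page)
            return (int)f;
        if (!table[f].used) return -1;  /* not mapped: page fault */
    }
    return -1;
}
```

Note how sharing is awkward here: one frame can record only one (pid, page), so shared frames need extra machinery.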
Translation Lookaside Buffer (TLB)
A cache of virtual-to-physical page translations; a major efficiency improvement. The page # of the virtual address is matched against the TLB's entries (virtual page → page frame, plus access bits). On a matching entry, the frame is combined with the offset to form the physical address; otherwise, fall back to a page table lookup.
Address Translation with TLB
Access the TLB before you access memory: on a hit, use the cached frame; on a miss, walk the page table (invalid → raise exception; valid → frame), then combine frame + offset into the physical address and access physical memory.
Trick: access the TLB while you access the cache.
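The hit/miss flow can be sketched with a tiny software TLB in front of a one-level page table. Everything here is illustrative (the page-table contents, the 4-entry TLB, and the round-robin replacement are assumptions, not any real MMU's design):

```c
#include <stdint.h>

#define PAGE_BITS 12u
#define TLB_SIZE  4u
#define NUM_PAGES 8u

struct tlb_entry { uint32_t page, frame; int valid; };

static struct tlb_entry tlb[TLB_SIZE];
static uint32_t page_table[NUM_PAGES] = { 4, 3, 6, 1, 0, 5, 2, 7 };
static int next_victim;             /* simple round-robin replacement */

/* Translate via the TLB; on a miss, walk the page table and
 * install the translation for next time. */
uint32_t translate_tlb(uint32_t vaddr) {
    uint32_t page   = vaddr >> PAGE_BITS;
    uint32_t offset = vaddr & ((1u << PAGE_BITS) - 1);

    for (uint32_t i = 0; i < TLB_SIZE; i++)      /* TLB hit? */
        if (tlb[i].valid && tlb[i].page == page)
            return (tlb[i].frame << PAGE_BITS) | offset;

    uint32_t frame = page_table[page];           /* TLB miss: walk table */
    tlb[next_victim] = (struct tlb_entry){ page, frame, 1 };
    next_victim = (int)(((uint32_t)next_victim + 1) % TLB_SIZE);
    return (frame << PAGE_BITS) | offset;
}
```

The second access to the same page skips the page-table walk entirely, which is the whole point.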
Address Translation Uses!
- Process isolation: keep a process from touching anyone else's memory, or the kernel's
- Efficient interprocess communication: shared regions of memory between processes
- Shared code segments: common libraries used by many different programs
- Program initialization: start running a program before it is entirely in memory
- Dynamic memory allocation: allocate and initialize stack/heap pages on demand
MORE Address Translation Uses!
- Program debugging: data breakpoints when an address is accessed
- Memory-mapped files: access file data using load/store instructions
- Demand-paged virtual memory: the illusion of near-infinite memory, backed by disk or memory on other machines
- Checkpointing/restart: transparently save a copy of a process, without stopping the program while the save happens
- Distributed shared memory: the illusion of memory that is shared between machines