CS 550 Operating Systems Spring 2018 Memory Management: Paging 1
Recap: Memory Management Ideally programmers want memory that is large fast non volatile Memory hierarchy small amount of fast, expensive memory cache some medium-speed, medium price main memory Lots of slow, cheap disk storage Memory manager handles the memory hierarchy 2
Recap: (Virtual) Address Space The abstraction that the OS is providing to the running program The running program has an illusion that it is can use the memory space starting at a particular address (e.g., 0x0), and going up to a very large address space (e.g., 32-bits or 64- bits); 0KB 1KB 2KB 15KB 16KB Program Code Heap (free) Stack the code segment: where instructions live the heap segment: contains malloc d data dynamic data structures (it grows downward) (it grows upward) the stack segment: contains local variables arguments to routines, return values, etc. 0KB 64KB 128KB 192KB 256KB 320KB 384KB 448KB 512KB Operating System (code, data, etc.) (free) Process C (code, data, etc.) Process B (code, data, etc.) (free) Process A (code, data, etc.) (free) (free) 3
Recap: Relocation Relocation Address locations in a program are relative. They are added to a base value to map to physical addresses. LIMIT 0 Physical MAX Relative Addresses in original program binary BASE + LIMIT Relocated Addresses in Executing Binary BASE 0 4
Recap: Paging Paging Split up the address space into equal-sized units, which are called pages. OS then decides which pages stay in memory and which get moved to disk. Virtual Address Space of a single Process } Page Entire Physical RAM 5
Recap: Memory Management Unit (MMU) MMU is a hardware module that accompanies the CPU It translates the Virtual Address used by executing instructions to Physical Addresses in the main memory. 6
Recap: Page Table An array that stores the mapping from virtual page numbers (VPN) to physical frame number (PFN) The OS maintains One page table per userspace process. And usually another page table for kernel memory. VPN PFN 7
Address Translation Consider a machine that has a 32-bit virtual address size and 4 KByte page size. 1.How large is the virtual address space? 2^32 = 4 GB 1.How many virtual pages could a process have? 4GB/4KB = 2^20 2. How many page-table entries (for one process) are there in the page table? 4GB/4KB = 2^20 3. How many address bits are needed to select a page Log2(2^20) = 20 4. How many address bits are needed for the byte offset within a 4-KB page Log2(4KB) =Log2(2^12) = 12 8
Address Translation Given the 4 KByte page size, a 32-bit virtual address is divided into: 31 30 29 28 12 11 10 1 0 VPN Byte offset 9
Recap: Virtual Address Translation For Small Address Space Internal operation of MMU with 16 4 KB pages 10
Quiz Consider a machine that has a 32-bit virtual address space and 8KByte page size. 1.How many virtual pages could a process have? 2.How many bits in a 32-bit address are needed to determine the virtual page number of the address? 3.How many bits in a 32-bit address represent the byte offset into a page? 11
Quiz Answers Consider a machine that has a 32-bit virtual address space and 8KByte page size. 1. How many virtual pages could a process have? 4GB/8KB = 512*1024 = 2^9*2^10 = 2^19 2. How many bits in a 32-bit address are needed to determine the virtual page number of the address? log2(4gb/8kb) = log2(2^19) = 19 bits 3. How many bits in a 32-bit address represent the byte offset into a page? log2(8kb) = log2(2^13) = 13 Also, 32 19 = 13 bits 12
Problems Consider a machine that has a 32-bit virtual address space and 4 KByte page size. 1.How many virtual pages could a process have? 4GB/4KB = 2^20 2. How many page-table entries (for one process) are there in the page table? 4GB/4KB = 2^20 3. How many address bits are needed to select a page Log2(2^20) = 20 4. How many address bits are needed for the byte offset within a 4-KB page Log2(4KB) =Log2(2^12) = 12 13
Problems 32-bit address space with 4KB pages 4 bytes per page table entry (PTE): 4 2 20 = 4 MB Is this a big deal? If there are 100 processes, 400 MB space is needed just for storing page tables! We cannot keep page tables in any special on-chip hardware. Instead, page tables are stored in kernel memory. 14
Solution 1: Bigger pages 32 bit address, 4 KB pages 20 bit VPN, 12 bit offset 2^20 = 1M pages (entries) 4MB per process 32 bit address, 16 KB pages 18 bit VPN, 14 bit offset 2^18 = 256K pages (entries) 1MB per process 32 bit address, 1 MB pages 12 bit VPN, 20 bit offset 2^12 = 4 K pages (entries) 16 KB per process Problems? Internal fragmentation Reality: 4KB or 8KB pages. 15
Fragmentation Fragmentation is a phenomenon in which storage space is used inefficiently, reducing capacity or performance and often both. Types of fragmentation Internal fragmentation: the wasted space within each allocated block because of rounding up from the actual requested allocation to the allocation granularity. External fragmentation: the various free spaced holes that are generated in either your memory or disk space. 16
Solution 2: Multi-level page tables The high-level idea to get rid of those invalid/unused regions in the page table instead of keeping them all in memory. The basic idea Divide up the page table into equal-sized regions. if an entire region of page-table entries is invalid, don t allocate space for that region at all. 17
Virtual Address Translation For Large Address Space Second-level page tables 32 bit address with 4 KB pages: Top-level page table Two-level page tables But how does MMU know where to find PT? Registers (CR2 on Intel) 18
Virtual Address Translation For Large Address Space Second-level page tables 32 bit address with 2 page table fields Top-level page table Two-level page tables But how does MMU know where to find PT? Registers (CR2 on Intel) 19
A detailed multi-level example 14 bit address size, with 64-byte pages How many bits for offset, VPN? How many PTE in a linear page table? 6 bit offset (log2(64) = 6) 8 bit VPN (2^8=256 PTEs for a linear page table) Single-level page tables For the linear page table, we need to allocate 256 PTEs. Assume we divide the 256 PTEs to 16 chunks sub page tables, with each chunk (or sub page tables) containing 256 / 16 = 16 PTEs. Then, the page directory (PD) contain 16 entries, each points to one of the 16 sub page tables. 20
14 bit address (8 bit VPN, 6 bit offset) 16 entries in page directory 4 bit for page directory index Each sub page table have 16 PTEs 4 bit for page table index 21
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 Single-level page table Page Directory Second-level Page Table Multi-level page tables 22
Single-level page tables 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 Page Directory Second-level Page Table Multi-level page tables Instead of allocate 256 PTEs for the whole page table (linear-page table), 2-level page table only needs to allocate 48 PTEs. 23
Typical Page Table Entry (PTE) Page Frame number = physical page number for the virtual page represented by the PTE Referenced bit: Whether the page was accessed since last time the bit was reset. Modified bit: Also called Dirty bit. Whether the page was written to, since the last time the bit was reset. Protection bits: Whether the page is readable? writeable? executable? contains higher privilege code/data? Present/Absent bit: Whether the PTE contains a valid page frame #. Used for marking swapped/unallocated pages. 24
Paging is slow Where is the page table of the running process stored? Page-table base register: physical address of the starting location of the page table Pseudo-code of address translation with paging VPN = (VirtualAddress & VPN_MASK) >> SHIFT PTEAddr = PageTableBaseRegister + (VPN * sizeof(pte)) offset = VirtualAddress & OFFSET_MASK PhysAddr = (PFN << SHIFT) offset 25