CS/COE 1550 www.cs.pitt.edu/~nlf4/cs1550/ Virtual Memory
What if a program is too big for memory?
- Ye olde solution: overlays!
- Programmers split their programs up into overlays, each containing a subset of the overall code/data
- When the program was loaded, what was actually loaded was the overlay manager for that program
We need an automated approach!
- How can we dynamically and automatically divide up and load the collection of code and data needed by a program?
- Virtual memory
Virtual memory idea
[Figure: an address space spanning 0x0 to 0xFFFF alongside RAM spanning 0x0 to 0x7FFF]
Virtual address space is completely abstract
- So if a program references memory address 0, what physical address should be read?
- Where should this decision be made?
OK, but how is that decision made?
- How about using an array?
- Each index is a virtual address; the entry at that index holds the physical address where that data is stored
- Assume 16-bit addresses
  - How much memory can we address?
  - How big will our array be?
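
The two questions on this slide can be worked out with a few lines of arithmetic. The sketch below assumes byte-addressable memory and that each array entry stores one full 16-bit physical address; neither detail is stated on the slide, and the variable names are illustrative.

```python
ADDRESS_BITS = 16  # from the slide

# Assumption: memory is byte-addressable, so each address names one byte.
addressable_bytes = 2 ** ADDRESS_BITS

# One array entry per virtual address.
table_entries = addressable_bytes

# Assumption: each entry stores a full 16-bit (2-byte) physical address.
entry_bytes = ADDRESS_BITS // 8
table_bytes = table_entries * entry_bytes

print(addressable_bytes // 1024, "KiB addressable")
print(table_entries, "entries,", table_bytes // 1024, "KiB of table")
```

Under these assumptions the array is twice the size of the memory it describes, which is why a per-address mapping is not practical.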
Paging
- Split the address space/memory into chunks called pages
- We split virtual addresses into a page number and an offset
- The page number functions as an index into a page table
  - The entry at that index points us to the page frame in physical memory that contains the address we're after
- The offset gives us the location of the address within the specified frame
Page table math
- Assuming 16-bit addresses and 4KiB pages
  - How many bits of the virtual address should be used as an offset?
  - How many bits should be used as a page number?
  - How many entries will the page table have?
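
A minimal sketch of the arithmetic behind these three questions, using the slide's 16-bit addresses and 4KiB pages. The variable names are illustrative.

```python
ADDRESS_BITS = 16        # from the slide
PAGE_SIZE = 4 * 1024     # 4KiB pages, from the slide

# A page of 2**n bytes needs n bits to name every byte inside it.
offset_bits = PAGE_SIZE.bit_length() - 1

# Whatever is left of the address selects the page.
page_number_bits = ADDRESS_BITS - offset_bits

# One page table entry per possible page number.
page_table_entries = 2 ** page_number_bits

print(offset_bits, page_number_bits, page_table_entries)
```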
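Virtual address translation
[Figure: a 16-entry page table holding 4-bit frame numbers, indexed by bits 15-12 of the virtual address. Entries 15 down to 5 hold 0000, entry 4 holds 0100, entry 3 holds 0110, entries 2 and 1 hold 0000, and entry 0 holds 0011.]
- MOVE REG, 14000
- Virtual address 0011011010110000 (14000): page number 0011, offset 011010110000
- Page table entry 3 holds frame 0110, giving physical address 0110011010110000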
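
The translation in this figure can be sketched in a few lines. The page table contents below are read off the extracted figure, interpreted so that 14000 (0011011010110000) is the virtual address and page 3 maps to frame 0110; that reading is an assumption, and the names `page_table` and `translate` are illustrative.

```python
OFFSET_BITS = 12
OFFSET_MASK = (1 << OFFSET_BITS) - 1

# Index = page number, value = frame number (interpretation of the figure).
page_table = [0b0000] * 16
page_table[0] = 0b0011
page_table[3] = 0b0110
page_table[4] = 0b0100

def translate(virtual_address):
    """Swap the page number for a frame number; the offset is unchanged."""
    page_number = virtual_address >> OFFSET_BITS
    offset = virtual_address & OFFSET_MASK
    frame_number = page_table[page_number]
    return (frame_number << OFFSET_BITS) | offset

physical = translate(14000)
print(f"{14000:016b} -> {physical:016b}")
```

Only the top four bits change during translation; the low 12 bits are copied straight through.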
Page table entries
[Figure: entry layout with the fields Protection, R, D, V, and Page frame number]
- R: referenced bit
- D: dirty bit
- V: valid bit
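
One way to picture these fields is as bit ranges packed into a single integer. The layout below (a 32-bit entry, flag bits at the bottom, frame number starting at bit 12, 3 protection bits) is purely hypothetical and chosen for illustration; the slide does not give bit positions, and real architectures differ.

```python
# Hypothetical bit positions, not taken from any specific architecture.
VALID_BIT = 1 << 0
DIRTY_BIT = 1 << 1
REFERENCED_BIT = 1 << 2
PROTECTION_SHIFT = 3
PROTECTION_MASK = 0b111
FRAME_SHIFT = 12

def make_pte(frame, protection, valid=True, dirty=False, referenced=False):
    """Pack the fields into one integer."""
    pte = (frame << FRAME_SHIFT) | ((protection & PROTECTION_MASK) << PROTECTION_SHIFT)
    if valid:
        pte |= VALID_BIT
    if dirty:
        pte |= DIRTY_BIT
    if referenced:
        pte |= REFERENCED_BIT
    return pte

def decode_pte(pte):
    """Unpack an entry back into named fields."""
    return {
        "frame": pte >> FRAME_SHIFT,
        "protection": (pte >> PROTECTION_SHIFT) & PROTECTION_MASK,
        "referenced": bool(pte & REFERENCED_BIT),
        "dirty": bool(pte & DIRTY_BIT),
        "valid": bool(pte & VALID_BIT),
    }

pte = make_pte(frame=0x12B8A, protection=0b110, dirty=True)
print(f"{pte:#010x}", decode_pte(pte))
```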
What happens if a referenced page isn't in RAM?
- Page fault
- Traps to OS, which will load the page from disk into a page frame in memory
- Will then restart the instruction that generated the fault
How big will the page table be?
- For our 16-bit address, 4KiB page example?
- What about with 32-bit addresses (still 4KiB pages)?
- How much would actually be used if a process only needed 10MiB of space in RAM?
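
A rough calculation for these questions. The entry size of 4 bytes is an assumption (the slide does not state one), and the names are illustrative.

```python
PAGE_SIZE = 4 * 1024
PTE_BYTES = 4  # assumption: 4-byte page table entries

def single_level_entries(address_bits):
    """One entry per page in the whole virtual address space."""
    return 2 ** address_bits // PAGE_SIZE

entries_16 = single_level_entries(16)
entries_32 = single_level_entries(32)
table_bytes_32 = entries_32 * PTE_BYTES

# A process that needs 10MiB only touches this many pages (and entries).
pages_needed = (10 * 1024 * 1024) // PAGE_SIZE
used_fraction = pages_needed / entries_32

print(entries_16, "entries for 16-bit addresses")
print(entries_32, "entries,", table_bytes_32 // (1024 * 1024), "MiB for 32-bit addresses")
print(pages_needed, "entries in use,", f"{used_fraction:.2%} of the table")
```

Almost the entire single-level table sits unused for a small process, which motivates the multilevel design.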
Multilevel page tables
[Figure: a 1st level page table pointing to 2nd level page tables, which in turn point to frames in RAM]
32-bit address, 4KiB page example
- Need 12 bits of the address to use as an offset; how should we use the other 20?
- How many 1st-level page table entries/2nd-level page tables should we have?
- How big should the 1st and 2nd level page tables be?
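2 level page table example
[Figure: a two-level page table translation involving RAM and the 32-bit addresses 01100110101100000110011010110000 and 00010010101110001010011010110000, which share the 12-bit offset 011010110000]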
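
A minimal sketch of a two-level lookup using the two bit strings from this example. It assumes a 10/10/12 split of the address (a common choice that makes each table 1024 entries) and assumes the first bit string is the virtual address and the second the physical address; the extracted figure does not make either point explicit. The nested dictionaries stand in for page tables.

```python
OFFSET_BITS = 12
INDEX_BITS = 10  # assumption: 10 bits per level

def split(virtual_address):
    """Return (1st-level index, 2nd-level index, offset)."""
    first = virtual_address >> (OFFSET_BITS + INDEX_BITS)
    second = (virtual_address >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    offset = virtual_address & ((1 << OFFSET_BITS) - 1)
    return first, second, offset

virtual = 0b01100110101100000110011010110000
first, second, offset = split(virtual)

# Sparse tables: only the one 2nd-level table we need exists.
first_level = {first: {second: 0b00010010101110001010}}

def translate(virtual_address):
    first, second, offset = split(virtual_address)
    second_level = first_level[first]   # memory access 1
    frame = second_level[second]        # memory access 2
    return (frame << OFFSET_BITS) | offset

print(first, second, hex(offset))
print(f"{translate(virtual):032b}")
```

Using dictionaries here mirrors the point of the design: second-level tables for unused regions of the address space never need to be allocated.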
How big are the page tables now?
- Assume again that a process needs 10MiB of RAM
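
A back-of-the-envelope answer, as a sketch under several assumptions not stated on the slide: 4-byte entries, 1024-entry tables at both levels, and a process whose 10MiB is contiguous and aligned in its virtual address space (a process with widely separated regions, such as a stack far from its heap, would need more 2nd-level tables).

```python
PAGE_SIZE = 4 * 1024
PTE_BYTES = 4                                # assumption
ENTRIES_PER_TABLE = PAGE_SIZE // PTE_BYTES   # 1024 entries fill one page

pages_needed = (10 * 1024 * 1024) // PAGE_SIZE

# Each 2nd-level table maps 1024 pages (4MiB); round up.
second_level_tables = -(-pages_needed // ENTRIES_PER_TABLE)

# One 1st-level table is always present.
total_tables = 1 + second_level_tables
total_bytes = total_tables * PAGE_SIZE

print(second_level_tables, "2nd-level tables,", total_bytes // 1024, "KiB total")
```

Compare this with the 4MiB a single-level table would take under the same entry-size assumption.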
This is still all for 32-bit addresses
- What about 64-bit?
- Still using 4KiB pages, so still need 12 bits for offset
- Page table entry size?
  - More than 2^20 4KiB chunks with 64-bit physical addresses
  - Need 52 bits to identify page frame!
  - At least 7 bytes per page table entry; 8 bytes to be aligned to 4-byte words...
- Still want 2nd-level page tables to be a page
  - But with 8-byte entries, only 512 entries per 2nd-level table
  - 64 - 12 - 9 = 43 bits left of the address
  - 2^43 entries in 1st-level page table!!
  - Each still needing 7 (or 8) bytes to identify the 2nd-level table!
  - 2^43 * 8 == 2^40 * 2^3 * 8 == 1TiB * 8 * 8
  - 64 TiB just to store the 1st-level page table!!
Do we really need all of those entries?
- We were worried about wasting the 2^20 entries in a single-level page table for 32-bit addresses
- So let's add another level!
  - 12 bits for offset
  - 9 bits for 3rd-level page table
  - 9 bits for 2nd-level page table
  - 34 bits for 1st-level page table
4 LEVEL PAGE TABLES
- 12 BITS FOR OFFSET
- 9 BITS FOR 4th-LEVEL PAGE TABLE
- 9 BITS FOR 3rd-LEVEL PAGE TABLE
- 9 BITS FOR 2nd-LEVEL PAGE TABLE
- 25 BITS FOR 1st-LEVEL PAGE TABLE...
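
The pattern across the last three slides (43, 34, then 25 bits left for the 1st-level table) can be checked with a short calculation. It uses 8-byte entries and 9 index bits per lower level, as on the slides; the function names are illustrative.

```python
ADDRESS_BITS = 64
OFFSET_BITS = 12
INDEX_BITS = 9    # 512 eight-byte entries fill one 4KiB page
PTE_BYTES = 8

def top_level_bits(levels):
    """Bits left for the 1st-level index once the lower levels take 9 each."""
    return ADDRESS_BITS - OFFSET_BITS - INDEX_BITS * (levels - 1)

def top_level_bytes(levels):
    return 2 ** top_level_bits(levels) * PTE_BYTES

for levels in (2, 3, 4):
    print(levels, "levels:", top_level_bits(levels), "bits,",
          top_level_bytes(levels), "bytes for the 1st-level table")
```

Each added level shrinks the 1st-level table by a factor of 512, but even with four levels a full 64-bit address leaves it at 256MiB.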
Let's take a step back for a minute
- 64-bit addresses means we can address 2^4 * 2^60 bytes
  - 16 * 2^60 bytes
  - 16 EiB
  - 18446744073709551616 bytes
- No machine has this much RAM
  - So we really don't need 64-bit physical addresses
- No program needs that much virtual address space, either
Current x86_64 CPUs do not support 64-bit addresses
- Currently, only support:
  - 46-bit physical addresses: 2^40 * 2^6 == 64TiB
  - 48-bit virtual addresses: 2^40 * 2^8 == 256TiB addressable
- Top "half" (128TiB) for kernel use (0xFFFF800000000000-0xFFFFFFFFFFFFFFFF)
- Bottom "half" (128TiB) for user programs (0x0000000000000000-0x00007FFFFFFFFFFF)
[Figure: the address space with Kernel at the top, non-canonical addresses in the middle, and User at the bottom]
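
The two usable halves are exactly the addresses whose upper bits (63 down to 47) are all zeros or all ones. A small sketch of that check, with `is_canonical` as an illustrative name:

```python
def is_canonical(address, va_bits=48):
    """True if bits 63..(va_bits-1) are all 0 (user half) or all 1 (kernel half)."""
    top = address >> (va_bits - 1)
    all_ones = (1 << (64 - va_bits + 1)) - 1
    return top == 0 or top == all_ones

for address in (0x00007FFFFFFFFFFF, 0x0000800000000000,
                0xFFFF7FFFFFFFFFFF, 0xFFFF800000000000):
    print(f"{address:#018x}", is_canonical(address))
```

The boundaries printed here match the ranges listed on the slide: the last user address and the first kernel address are canonical, and everything between them is not.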
This makes for efficient use of a 4-level page table
- 9 + 9 + 9 + 9 + 12 == 48 bits of virtual address
- In 2017, Linux gained support for 5-level page tables
  - 9 + 9 + 9 + 9 + 9 + 12 == 57-bit virtual addresses: 128 PiB of virtual address space
  - Supports 52-bit physical addresses: 4 PiB of physical memory
  - Ready for supporting hardware
What if we want to use 64-bit addresses?
- Each level added to the page table increases the number of memory accesses needed to do 1 memory access
- What if, instead of making 1 page table entry per page, we made one per frame?
- This creates an inverted page table
Inverted page table example (32-bit)
[Figure: an inverted page table lookup involving RAM, the PID field 00000000101001110101010011110010, and the 32-bit addresses 01100110101100000110011010110000 and 00010010101110001010011010110000]
Inverted page table questions
- How big will the page table be?
- Will it be faster than a multilevel page table?
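
A toy inverted page table helps make both questions concrete. This is a sketch with hypothetical names and a tiny 8-frame RAM; it uses a plain linear search, whereas real designs typically hash on the PID and page number to avoid scanning every entry.

```python
OFFSET_BITS = 12
NUM_FRAMES = 8  # tiny hypothetical RAM: one table entry per frame

# Index = frame number, value = (pid, page number) stored there, or None.
inverted_table = [None] * NUM_FRAMES

def map_page(frame, pid, page):
    inverted_table[frame] = (pid, page)

def translate(pid, virtual_address):
    page = virtual_address >> OFFSET_BITS
    offset = virtual_address & ((1 << OFFSET_BITS) - 1)
    # Linear search: cost grows with the number of frames.
    for frame, entry in enumerate(inverted_table):
        if entry == (pid, page):
            return (frame << OFFSET_BITS) | offset
    raise LookupError("page fault: no frame holds this (pid, page)")

map_page(frame=5, pid=42, page=0x66B06)
print(hex(translate(42, 0x66B066B0)))
```

The table's size depends only on the amount of physical memory, not on the size of the virtual address space or the number of processes; the price is that a lookup becomes a search rather than an index.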
Is this fast enough?
- How can we speed it up?
The TLB
- Translation lookaside buffer
- Generally 32-1024 entries
- Maps page numbers to frame numbers, which is exactly what the page table does
  - So why bother?
- Is it going to be faster to search through 1024 TLB entries or to do a page table walk?
The TLB will be faster
- If it's implemented in hardware!
- All entries are checked simultaneously as the memory address is grabbed from the instruction
TLB example
TLB terminology
- Hit: virtual address is found in the TLB
- Miss: not found in the TLB
  - Have to check the page table: a "page table walk"
  - What if it's not there either?
  - Then add to the TLB
  - Which entry to replace?
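
The hit/miss flow can be sketched as a small software model. The slide leaves the replacement question open; this sketch picks least-recently-used eviction purely as an assumption. The `TLB` class and its fields are hypothetical, and a software model checks entries one lookup at a time, unlike hardware that compares all entries in parallel.

```python
from collections import OrderedDict

class TLB:
    def __init__(self, capacity, page_table):
        self.capacity = capacity
        self.page_table = page_table   # page -> frame, stands in for the walk
        self.entries = OrderedDict()   # page -> frame, oldest use first
        self.hits = 0
        self.misses = 0

    def lookup(self, page):
        if page in self.entries:
            self.hits += 1
            self.entries.move_to_end(page)    # mark as most recently used
            return self.entries[page]
        self.misses += 1
        frame = self.page_table[page]         # "page table walk"
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # evict least recently used (assumed policy)
        self.entries[page] = frame
        return frame

tlb = TLB(capacity=2, page_table={0: 10, 1: 11, 2: 12})
frames = [tlb.lookup(page) for page in (0, 1, 0, 2, 1)]
print(frames, "hits:", tlb.hits, "misses:", tlb.misses)
```

In this trace only the second reference to page 0 hits; page 1 is evicted to make room for page 2 and then misses again when it is referenced next.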
More TLB caching terminology
- Compulsory miss
  - First time some data is accessed, so it can't already be cached
- Capacity miss
  - Not enough room in the cache to store everything, so the data you're looking for had to be evicted
- Conflict miss
  - Some caches have rules about what data can be stored in given entries in the cache, so even if the cache isn't full, some data had to be evicted
  - For more info: associativity of CPU caches
Cache rules everything around me
- CPU has a cache
- Web browsers cache page data
- BIND DNS servers cache DNS entries
- Hard disks have a cache built in to the drive
- Virtual memory can be viewed as a hierarchy of caches
- Ideally we would have one giant, incredibly fast, and incredibly cheap RAM to replace the entire memory hierarchy...