Operating Systems (2INC0) 2017/18
Virtual Memory (10)
Courtesy of Dr. I. Radovanovic, Dr. R. Mak
System Architecture and Networking Group
Agenda
- Recap: memory management in early systems
- Principles of virtual memory
- Paging
- Segmentation
- Paging and Segmentation
Memory management: requirements
IDEALLY, memory needs to be:
- Simple to use
- Private (isolation)
- Non-volatile / permanent (data remains in memory)
- Fast (zero-time) access
- Huge (unlimited) capacity
- Cheap (cost-effective)
These requirements are conflicting. We may make use of a memory hierarchy and virtualization to compensate.
Memory Hierarchy
[Figure: the memory hierarchy, from fastest access (top) to largest storage (bottom)]
- CPU registers
- L1 cache memory
- L2 cache memory
- Main memory (primary, executable)
- Solid-state memory
- Rotating magnetic memory
- Optical memory
- Sequentially accessed memory (secondary storage)
Early systems (properties)
- Every active process resides in its entirety in main memory
  - Large programs can't execute (except with an overlay structure)
  - Overlay: structure a program in independent parts (e.g. function calls) which can be overlaid in memory
- Every active process is allocated a contiguous part of main memory
- Three partitioning schemes: fixed, dynamic, relocatable dynamic partitions
Early systems (partitioning schemes)
- Fixed partitions: limits the number and maximal size of the active processes; internal fragmentation
- Dynamic partitions: external fragmentation
- Relocatable dynamic partitions: requires dynamic binding and relocatable load modules; no fragmentation, but expensive compaction (or swapping)
Recap: do we have the ideal memory properties so far?
- Simple: Yes (with the exception of overlays)
- Private: Yes/No (isolation can be provided with dynamic binding; no sharing)
- Permanent: No (unless the programmer enforces this during execution)
- Fast: No (process-based compaction and swapping are expensive)
- Huge: No (process size cannot exceed main memory size)
- Cost-effective: Yes (hardware hierarchy)
VM: abstraction
What needs to be done so that programs do not have to be stored contiguously, and only parts of a program, not the whole, are kept in main memory?
Approach: split the memory into segments and try to fit parts of the program into those segments.
Terminology:
- If segments are of different sizes: segmentation
- If segments are of the same size: paging
- Physical memory blocks: (page) frames
- Logical memory blocks: pages
Paging and segmentation can be combined.
Paging memory allocation
Processes are divided into pages (memory blocks) of equal size:
- provides non-contiguous memory allocation
- works well if the page size equals the page frame size and the disk's sector size
Advantages:
- An empty page frame is always usable by any process
- No compaction scheme is required
- No external and almost no internal fragmentation
Disadvantage:
- A mechanism is needed to keep track of page locations
Non-contiguous allocation - an example
Process 1: size = 350 lines; page size = 100 lines; OS size = 300 lines
- 1st 100 lines: Page 0
- 2nd 100 lines: Page 1
- 3rd 100 lines: Page 2
- remaining 50 lines: Page 3 (50 lines wasted: internal fragmentation)
Main memory (13 page frames):
- frames 0-2: operating system
- frame 5: Process 1, Page 2
- frame 8: Process 1, Page 0
- frame 10: Process 1, Page 1
- frame 11: Process 1, Page 3
Number of free page frames left: 13 - 3 - 4 = 6
- Any process of more than 600 lines has to wait until Process 1 ends
- Any process of more than 1000 lines cannot fit into memory at all!
Problem remains (needs solving): the entire process must be stored in memory during its execution.
VM: demand paging
Bring a page into memory only when it is needed. This takes advantage of the fact that programs execute largely sequentially, so not all pages are necessary at once. For example:
- user-written error-handling modules
- mutually exclusive modules
- parts that are not always accessible
- only a fraction of a table is actually used
There is no longer a restriction that the entire process be stored in memory; this gives the appearance of an infinite physical memory.
VM: implementation issues
- Address binding / translation: logical addresses are translated to physical ones at run time; hardware support reduces access time and enforces isolation
- Placement strategies: simple for pages (any free page frame will do); for segments, strategies are similar to those for dynamic partitions
- Replacement strategies: decide which page(s) or segment(s) must be swapped out when there is not enough free space in main memory
- Load control policies: determine how many pages of a process are resident in main memory, and when to load pages (demand paging, pre-paging)
- Sharing: possible with both paging and segmentation
Agenda
- Recap: memory management in early systems
- Principles of virtual memory
- Paging
  - address translation: frame table, page table, TLBs
  - replacement strategies
  - load policies
- Segmentation
- Segmentation with paging
Frame table vs page table
Frame table (one entry per page frame):

  Frame | Process ID | Process page number
  ------+------------+--------------------
    0   |     3      |      9
    1   |     2      |     11
    2   |     1      |      0
    3   |     1      |      3
    4   |     3      |      8
    5   |     1      |      2
    6   |     3      |     12
    7   |     1      |      1

Page table of process 1 (one entry per page):

  Page | Page frame number
  -----+------------------
    0  |        2
    1  |        7
    2  |        5
    3  |        3

Say the CPU wants to load an instruction of a certain process (at a certain page p and an offset w). How do we find the physical address (pa)?
Virtual vs physical address
Virtual address (for a process id): va = (p, w), where p is the page number (p bits) and w the offset in the corresponding page (w bits, counted in words).
- 2^p pages, 2^w words/page, so the VM size is 2^(p+w) words
Physical address: pa = (f, w), where f is the frame number (f bits) and w the same offset.
- 2^f page frames, 2^w words/page (or /frame), so the PM size is 2^(f+w) words
Goal: address a word.
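The split of a virtual address into p and w can be sketched as follows (Python used for illustration; the concrete bit widths in the example are assumptions, not prescribed by the slides):

```python
def split_va(va, w_bits):
    """Split a virtual address into (page number p, offset w)."""
    w = va & ((1 << w_bits) - 1)  # low w bits: offset within the page
    p = va >> w_bits              # remaining high bits: page number
    return p, w

# Example: 12-bit offset, so pages hold 2**12 = 4096 words
p, w = split_va(0x12345, 12)      # p = 0x12, w = 0x345
```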
How to find the physical address using the frame table?
Virtual address: (id, p, w), where id is the process id.

address_map (id, p, w) {
    pa = UNDEFINED;
    for (f = 0; f < F; f++) {
        if (FT[f].pid == id && FT[f].page == p)
            pa = f | w;   // concatenate the bits of f and w
    }
    return pa;
}

FT is the frame table with F entries; each entry records the owning process (pid) and the page it holds (recall the example table).
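A minimal executable version of this frame-table search (a Python sketch; the table contents come from the earlier example, and the 8-bit offset width is an assumption):

```python
def address_map_ft(FT, pid, p, w, w_bits):
    """Search the frame table: FT[f] = (owner pid, page number).
    On a match, concatenate frame number f with offset w;
    return None if the page is not resident (page fault)."""
    for f, (owner, page) in enumerate(FT):
        if owner == pid and page == p:
            return (f << w_bits) | w
    return None

# Frame table from the example: frame -> (pid, page)
FT = [(3, 9), (2, 11), (1, 0), (1, 3), (3, 8), (1, 2), (3, 12), (1, 1)]
pa = address_map_ft(FT, 1, 3, 5, 8)   # page 3 of process 1 is in frame 3
```

Note how the search is linear in the number of frames, which is why this scheme needs hardware support or a different table layout in practice.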
How to find the physical address using page tables?
Virtual address: (p, w).

address_map (p, w) {
    pa = *(PTR + p) | w;   // concatenate frame number f and offset w
    return pa;
}

The Page Table Register (PTR) points to the page table of the running process (recall the example table) and is used for fast access (hardware support); its content is stored in the process table. Reading/writing from/to memory requires two memory accesses: one for accessing the page table, and one for accessing the actual data.
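The same translation with a per-process page table, as a runnable sketch (Python; the page table is taken from the example, and the 8-bit offset width is an assumption):

```python
def address_map_pt(PT, p, w, w_bits):
    """Direct lookup: PT[p] is the frame number f; the physical
    address is f concatenated with the in-page offset w."""
    f = PT[p]                  # first memory access: the page-table entry
    return (f << w_bits) | w   # the second access fetches the data itself

PT = [2, 7, 5, 3]              # page table of process 1 from the example
pa = address_map_pt(PT, 1, 10, 8)   # page 1 lives in frame 7
```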
Process table
Maintained by the OS for context switching, scheduling, etc. Entries (rows) hold process control information: name, state, priority, registers, a semaphore waited on, page table location, etc.

(a) Initially the process table has three entries, one per process:
    P1: size 400, page table location 3096
    P2: size 200, page table location 3100
    P3: size 500, page table location 3150
(b) The second process ends; its entry in the table is released:
    P1: size 400, page table location 3096
    P3: size 500, page table location 3150
(c) The freed entry is replaced by information about the next process that is processed:
    P1: size 400, page table location 3096
    new: size 700, page table location 3100
    P3: size 500, page table location 3150
Translation Look-aside Buffer
Problems with page tables:
- they can be huge
- they require 2 memory accesses per reference
Approach: use a TLB (a cache memory)
- keeps track of the locations of the most recently used pages in main memory
- does not contain actual data or instructions
- most common replacement scheme: least recently used (LRU)
- if the page location cannot be found in the TLB, the page table is used
- if the page location still cannot be found, a page fault is generated, meaning the page is currently not in physical memory
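The TLB behaviour described above can be sketched as a small LRU cache in front of the page table (Python; the capacity and page table here are illustrative assumptions, and a real TLB is associative hardware):

```python
from collections import OrderedDict

class TLB:
    """LRU cache of page-to-frame translations."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()          # page -> frame, LRU order

    def translate(self, p, page_table):
        """Return (frame, hit?) for page p."""
        if p in self.entries:                 # TLB hit: no table access
            self.entries.move_to_end(p)
            return self.entries[p], True
        f = page_table[p]                     # TLB miss: consult page table
        self.entries[p] = f
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used
        return f, False
```

A miss that also misses in the page table would raise a page fault; that case is abstracted away in this sketch.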
Page faults
Page faults happen when a part of a program needs to be brought into main memory. Upon a page fault:
- the page fault handler determines whether there are empty frames in main memory
- if not, it must decide which page to swap out; this depends on a predefined policy for page removal
- a single memory reference may create several page faults, e.g. when the page table itself is not in main memory
- tables to be updated after swapping: the page tables of two tasks (1 in, 1 out) and a table to locate free frames
Problem with swapping: thrashing.
Thrashing
A process may spend more time paging than executing. When does it happen?
- pages in active use are replaced by other pages in active use
- happens with an increased degree of multiprogramming
Solution: provide the process with as many frames in main memory as it needs; this requires a replacement strategy.
Thrashing - an example
The loop sits on page 0 while the printf code sits on page 1, so with too few frames every iteration swaps between these two pages:

for (j = 1; j < 100; ++j) {
    k = j * j;
    m = a * j;
    printf("\n%d %d %d", j, k, m);   /* on page 1 */
}
printf("\n");
Replacement strategies
Comparison of replacement strategies is done using reference strings:
- an execution trace in which only memory references are recorded
- only the page number of the referenced location is mentioned
- reference string: r_0 r_1 r_2 ...
Goodness criteria:
- the number of generated page faults
- the total number of pages loaded due to page faults
(these two are equal under pure demand paging)
Global strategies:
- a fixed number of page frames is shared by all processes
- the evicted page need not be owned by the process that needs extra memory
Local strategies:
- each process has a set of pages called the working set
- when a process runs out of memory, a page from its working set is evicted
Global replacement strategies
- MIN replacement (looks into the future): select the page which will not be used for the longest time in the future; this gives the minimum number of page faults
- Random replacement: select a random page for replacement
- FIFO replacement: select the page that has been resident in main memory for the longest time
- LRU replacement: select the page that is least recently used
- Clock replacement (second chance): a circular list of all resident pages, each equipped with a use-bit u; upon each reference u is set to 1 (set to 2 for third chance); search clockwise for u = 0, while setting the use-bits to zero
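The clock (second-chance) sweep described in the last bullet can be sketched like this (Python; a minimal single-bit version, i.e. without the third-chance variant):

```python
def clock_select(use_bits, hand):
    """Advance the clock hand over the circular frame list, clearing
    use bits, until a frame with use bit 0 is found.
    Returns (victim frame index, new hand position)."""
    n = len(use_bits)
    while use_bits[hand] == 1:
        use_bits[hand] = 0        # give this page a second chance
        hand = (hand + 1) % n
    return hand, (hand + 1) % n

u = [1, 0, 1]                     # frame 1 was not referenced recently
victim, hand = clock_select(u, 0) # victim = 1
```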
MIN policy (looks into the future)

  Time:           1  2  3  4  5  6  7  8  9 10 11
  Page requested: A  B  A  C  A  B  D  B  A  C  D
  Page frame 1:   A  A  A  A  A  A  D  D  D  D  D
  Page frame 2:   -  B  B  C  C  B  B  B  A  C  C
  Page fault:     *  *     *     *  *     *  *

How each requested page is swapped into the 2 available page frames using MIN. When the program is ready to be processed, all 4 pages are on secondary storage. Throughout the program, 11 page requests are issued. When the program calls a page that isn't already in memory, a page fault is issued (shown by *): 7 page faults.
FIFO policy

  Time:           1  2  3  4  5  6  7  8  9 10 11
  Page requested: A  B  A  C  A  B  D  B  A  C  D
  Page frame 1:   A  A  A  C  C  B  B  B  A  A  D
  Page frame 2:   -  B  B  B  A  A  D  D  D  C  C
  Page fault:     *  *     *  *  *  *     *  *  *

9 page faults.
LRU policy

  Time:           1  2  3  4  5  6  7  8  9 10 11
  Page requested: A  B  A  C  A  B  D  B  A  C  D
  Page frame 1:   A  A  A  A  A  A  D  D  A  A  D
  Page frame 2:   -  B  B  C  C  B  B  B  B  C  C
  Page fault:     *  *     *     *  *     *  *  *

Only 8 page faults; efficiency is slightly better than FIFO. LRU is the most widely used static replacement algorithm.
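The FIFO and LRU traces above can be checked with a small simulator (a Python sketch; MIN needs lookahead and is not simulated here):

```python
def count_faults(refs, nframes, policy):
    """Count page faults for 'FIFO' or 'LRU' with nframes frames.
    frames is kept ordered: oldest (FIFO) or least recent (LRU) first."""
    frames, faults = [], 0
    for r in refs:
        if r in frames:
            if policy == 'LRU':      # a hit refreshes recency
                frames.remove(r)
                frames.append(r)
            continue
        faults += 1                  # page fault: r must be loaded
        if len(frames) == nframes:
            frames.pop(0)            # evict front of the list
        frames.append(r)
    return faults

refs = list("ABACABDBACD")           # reference string from the traces
```

With 2 frames this reproduces the counts from the traces: 9 faults for FIFO and 8 for LRU.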
Page table extensions
Extra fields per entry:

  Page | Status bit | Referenced bit | Modified bit | Page frame
  -----+------------+----------------+--------------+-----------
    0  |     1      |       1        |      1       |      5
    1  |     1      |       0        |      0       |      9
    2  |     1      |       0        |      0       |      7
    3  |     1      |       1        |      0       |     12

- Status bit: indicates whether the page is currently in memory or not
- Referenced bit (use bit): indicates whether the page has been referenced recently; used by LRU to determine which pages should be swapped out
- Modified bit (dirty bit): indicates whether the page contents have been altered; used to determine whether the page must be rewritten to secondary storage when it is swapped out
There may be more of these bits for other purposes, e.g. locking.
Local replacement strategies
VMIN replacement (looks into the future):
- at each memory reference: if there is a page fault, the requested page is loaded immediately
- if that page is not referenced during the next τ memory references, it is removed
Working Set (WS) model (uses recent history: the past τ memory references):
- at each memory reference a working set is determined
- only the pages that belong to the WS reside in main memory
- a process can run only if its entire WS is in main memory
- for a reference string r_0 r_1 ... r_T, the working set is W(t, τ) = { r_j | t−τ < j ≤ t }
- hardware support comes in the form of aging registers
Working-set model (example)
The set of active pages must be large enough to avoid thrashing. A parameter τ determines a working-set window.

Page reference string:
  ... 2 6 1 5 7 7 7 7 5 1 6 2 3 4 1 2 3 4 4 4 3 4 3 4 4 4 1 3 2 3 4 4 4 ...
With window size τ = 9:
  WS(t1) = {1, 2, 5, 6, 7}    WS(t2) = {3, 4}

D = Σ WSS_i, where D is the total demand for frames in memory and WSS_i is the working-set size of process i. Thrashing will occur when D is greater than the total number of available frames!
Working set - an example
Working set: references from t−τ to t, i.e. (t−τ, t]; in this case τ = 4.

  Time t:      −2 −1 |  0  1  2  3  4  5  6  7  8  9 10
  Reference:    e  d |  a  c  c  d  b  c  e  c  e  a  d
  IN (fault):         a* c*       b*    e*       a* d*
  OUT:                      e     a        d  b

A page is brought in (IN, marked *) on a page fault and removed (OUT) once it drops out of the working-set window.
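The IN events of this trace can be reproduced with a short working-set simulation (a Python sketch in which a page is resident exactly when it is among the last τ references):

```python
def ws_faults(history, refs, tau):
    """Return (time, page) pairs at which a page fault occurs under the
    working-set model: just before reference t, exactly the pages among
    the last tau references are resident."""
    trace = list(history) + list(refs)
    start = len(history)
    faults = []
    for t in range(start, len(trace)):
        resident = set(trace[max(0, t - tau):t])   # current working set
        if trace[t] not in resident:
            faults.append((t - start, trace[t]))   # IN event, a fault
    return faults

# history "ed" models the references at times -2 and -1
ins = ws_faults("ed", "accdbcecead", 4)
```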
Load control
Paging strategy: which and how many pages to load?
- static paging: upon activation all pages of the process are loaded (also called simple paging)
- dynamic paging: upon a page fault one or more pages are loaded
  - pure demand paging loads a single page
  - demand paging with pre-paging loads several pages upon (re)activation
Degree of multiprogramming:
- when using local replacement (working set) → automatic
- when using global replacement → a separate policy is needed to determine the number of pages per process
- trade off the service time of a page fault against the average time between page faults; a good tradeoff helps avoid thrashing
Load control (cont'd)
Global replacement with static paging: replace the pages of which process? (depends on scheduling)
- the lowest-priority process: follows CPU scheduling (unlikely to be immediately scheduled again)
- the last process activated: considered to be the least important
- the smallest process: least expensive to swap out
- the largest process: frees the largest number of frames
Agenda
- Recap: memory management in early systems
- Principles of virtual memory
- Paging
- Segmentation
- Segmentation with paging
Segmented memory allocation
Based on the common practice of programmers structuring their programs in modules (logical groupings of code). A segment is a logical unit such as: main program, subroutine, procedure, function, local variables, global variables, common block, stack, symbol table, or an array.
- Main memory is not divided into page frames, because each segment has a different size
- Memory is allocated dynamically
Segmentation
Each process has three obvious candidate segments: code, data, and stack. Other candidates are the stacks of the individual threads of a process and memory-mapped files.
Virtual addresses consist of a segment number s and an offset w:

address_map (s, w) {
    pa = *(STR + s) + w;
    return pa;
}

Note: *(STR + s) gives the address of the first word of segment s (STR is the Segment Table Register).
Until now: pure segmentation (contiguous segments). Next: segmentation with paging (paged segments).
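A runnable sketch of the pure-segmentation mapping (Python; the limit check is standard segmentation hardware although not shown in the slide's pseudocode, and the table contents are made up for illustration):

```python
def address_map_seg(ST, s, w):
    """Pure segmentation: physical address = segment base + offset.
    ST[s] = (base, limit); offsets beyond the limit are illegal."""
    base, limit = ST[s]
    if w >= limit:
        raise MemoryError("segmentation violation")
    return base + w

ST = [(1000, 400), (2400, 200)]   # hypothetical segment table
pa = address_map_seg(ST, 0, 399)  # last word of segment 0
```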
Agenda
- Recap: memory management in early systems
- Principles of virtual memory
- Paging
- Segmentation
- Segmentation with paging
  - address translation, segment table, TLBs
Segment + page tables
Virtual address: (s, p, w). The Segment Table Register (STR) locates the segment table; the entry for segment s points to a page table, whose entry for page p gives the frame number f; the physical address combines frame f with offset w.
Segment + page tables (cont'd)
Each memory reference requires three accesses to main memory:

address_map (s, p, w) {
    pa = *(*(STR + s) + p) + w;
    return pa;
}

Use a TLB to reduce the number of memory accesses per reference to one.
Balance between s and p:
- older systems: s long and p short, i.e. many small segments; segment tables are large and can themselves be paged, which results in yet another memory access
- multimedia applications, however, favor s short and p long, i.e. many pages per segment; now the page tables are large and must be paged
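The two-level lookup can be sketched as follows (Python; here each segment-table entry directly holds that segment's page table, frame numbers are concatenated with the offset as in the paging slides, and the tables are illustrative assumptions):

```python
def address_map_sp(ST, s, p, w, w_bits):
    """Segmentation with paging: segment table -> page table -> frame."""
    page_table = ST[s]         # access 1: the segment's page table
    f = page_table[p]          # access 2: the page-table entry (frame f)
    return (f << w_bits) | w   # access 3 would fetch the data itself

ST = [[2, 7], [5, 3]]          # two segments with two pages each
pa = address_map_sp(ST, 1, 0, 9, 8)   # segment 1, page 0 -> frame 5
```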
Advantages of VM
- Works well in a multiprogramming environment: most programs spend a lot of time waiting
- Process size is no longer restricted to main memory size (or the free space within main memory)
- Memory is used more efficiently
- Eliminates external fragmentation when used with paging, and eliminates internal fragmentation when used with segmentation
- Allows an unlimited degree of multiprogramming
- Allows a program to be loaded multiple times, occupying a different memory location each time
- Allows sharing of code and data
- Facilitates dynamic linking of program segments
Disadvantages of VM
- Increased processor hardware costs
- Increased overhead for handling paging interrupts
- Increased software complexity to prevent thrashing
Last remark: cache memory
A cache sits between the CPU and main memory (CPU, CPU bus, cache, memory bus, memory devices):
- is based on the principle of locality
- is a high-speed memory that speeds up the CPU's access to data and increases performance
- no need to contend for the (memory) bus on a cache hit
Cache memory - improved efficiency
Cache hit ratio h:
  h = (number of requests found in the cache) / (total number of requests)
Average memory access time:
  t_a = h * t_c + (1 − h) * t_m
where t_c is the average cache access time and t_m is the average main memory access time.
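A quick numeric check of the formula (Python; the hit ratio and access times are made-up illustrative values):

```python
def avg_access_time(h, t_c, t_m):
    """t_a = h * t_c + (1 - h) * t_m"""
    return h * t_c + (1 - h) * t_m

# e.g. a 90% hit ratio, 1 ns cache, 100 ns main memory
t_a = avg_access_time(0.9, 1.0, 100.0)   # 10.9 ns
```

Even a high hit ratio leaves the average dominated by the slow memory, which is why cache effectiveness is so sensitive to h.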
Summary
- Simple: Yes (the user has a linear address space)
- Private: Yes (VM also facilitates sharing)
- Permanent: No (unless the programmer enforces this during execution)
- Fast: Moderate (management overhead for tables and strategies)
- Huge: Yes (memory size is virtually unlimited)
- Cost-effective: Yes (hardware support for VM can be expensive)
History: memory.c, taken from http://lxr.free-electrons.com