CSE 120, Day 5: Memory (July 18). Instructor: Neil Rhodes. Topics: Translation Lookaside Buffer (TLB) implemented in hardware; software TLB management.


CSE 120, July 18, 2006, Day 5: Memory. Instructor: Neil Rhodes.

Translation Lookaside Buffer (TLB), Implemented in Hardware
- A cache mapping virtual page numbers to page frames
- Associative memory: hardware looks up all cache entries simultaneously
- Usually not big: 64-128 entries
- TLB entry: page number, valid, modified, protection, page frame
- If the page is not present, do an ordinary lookup, then evict an entry from the TLB and add the new one (evict which entry?)
- Serial vs. parallel lookup:
  - Serial: first look in the TLB; if not found, then look in the page table
  - Parallel: look in the TLB and the page table in parallel; if the page is not found in the TLB, the page-table lookup is already in progress

Software TLB Management
- The MMU doesn't handle page tables; software does
- On a TLB miss, generate a TLB fault and let the OS deal with it
- Search a larger in-memory cache (the page containing the cache must itself be in the TLB, for speed)
- If not in that cache, search the page table
- Once the page frame, etc. are found, update the TLB
- Why not do it in hardware? Logic to search the page table takes space on the die; that die area could be spent instead to:
  - increase the memory cache
  - reduce cost and power consumption

Cost Example
- Direct memory access: 100ns
- Without a TLB: 200ns (lookup in the page table first)
- With a TLB, assume a TLB lookup costs 10ns and the TLB hit rate is 90%:
  - Serial lookup: average cost = .9*110ns + .1*200ns = 119ns
  - Parallel lookup: average cost = .9*110ns + .1*(200ns - 10ns) = 118ns

TLB Summary
- Caches are very sensitive to the hit rate and to the cost of a cache miss
- Note that the TLB must be flushed on a context switch, unless TLB entries include a process ID
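The cost example above is just expected-value arithmetic, and can be checked in a few lines. This is a sketch using the lecture's numbers (100ns memory access, 10ns TLB lookup, 90% hit rate, 200ns page-table path); the function name is illustrative, not from the slides.

```python
def avg_access_ns(hit_rate_pct, hit_cost_ns, miss_cost_ns):
    """Expected access time given a TLB hit rate (as an integer percent)."""
    return (hit_rate_pct * hit_cost_ns + (100 - hit_rate_pct) * miss_cost_ns) / 100

MEM = 100   # direct memory access (ns)
TLB = 10    # TLB lookup (ns)

# Serial: TLB lookup (10ns) then memory (100ns) on a hit; a miss pays
# the 200ns page-table path from the slide.
serial = avg_access_ns(90, TLB + MEM, 200)

# Parallel: the page-table walk starts alongside the TLB lookup, so a
# miss saves the 10ns TLB cost relative to the serial path.
parallel = avg_access_ns(90, TLB + MEM, 200 - TLB)

print(serial, parallel)   # 119.0 118.0
```

The 1ns difference shows why parallel lookup is only a marginal win at a 90% hit rate: the averages are dominated by the hit cost.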

Inverted Page Tables
- Traditional page tables: 1 entry per virtual page
- Inverted page tables: 1 entry per physical frame of memory, reached through a hash table
- Space is proportional to the number of allocated memory frames: 1 hash-table entry for each allocated page
- Why? Size: with 64-bit virtual addresses and 4KB pages, a traditional table is infeasible, but with 256MB of RAM per process an inverted page table needs only 65536 entries
- Page table entry: process ID, virtual page number, additional PTE info
- Translation: a virtual address (pid: p, offset) maps to a physical address (f, offset) via hash(p) -> f
- Searching linearly through a table of 65536 entries is slow
- Solution: a hash table whose key is the virtual page number; each entry contains the virtual page, process ID, and page frame
- Advantage: page-table memory is proportional to physical memory, not to the logical address space, and not to the number of processes
- Disadvantage: hard to share memory between processes

Segmentation vs. Paging
- The slide compares the two techniques in a table (Segmentation and Paging columns) on these questions:
  - Need the programmer be aware the technique is being used?
  - How many linear address spaces are there?
  - Can the total address space exceed the size of physical memory?
  - Can procedures and data be distinguished and separately protected?
  - Can tables whose size fluctuates be accommodated easily?
  - Is sharing of procedures between users facilitated?

Page Fault Handling for Paging
- The MMU generates a page fault (protection violation or page not present); the page-fault handler must:
  - Save registers
  - Figure out the virtual address that caused the fault (often available in a hardware register)
  - If it is a protection problem, signal or kill the process
  - If it is a write just beyond the currently allocated stack:
    - allocate a free page to grow the stack
    - update the page table
    - restart the instruction for the faulting process (must undo any partial effects)
  - Otherwise, signal or kill the process
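A minimal sketch of an inverted page table as described above, keyed by (process ID, virtual page number). The class and method names are mine, not the lecture's; a real kernel would use an open hash over frames rather than a dictionary.

```python
class InvertedPageTable:
    """One entry per physical frame; lookups hash (pid, vpn) -> frame."""
    def __init__(self, num_frames):
        self.num_frames = num_frames
        self.by_key = {}      # (pid, vpn) -> frame number
        self.by_frame = {}    # frame number -> (pid, vpn), for eviction

    def map(self, pid, vpn, frame):
        self.by_key[(pid, vpn)] = frame
        self.by_frame[frame] = (pid, vpn)

    def translate(self, pid, vaddr, page_size=4096):
        vpn, offset = divmod(vaddr, page_size)
        frame = self.by_key[(pid, vpn)]       # a KeyError here is a page fault
        return frame * page_size + offset

# Table size is proportional to physical memory, as the slide notes:
# 256MB of RAM with 4KB pages needs only 65536 entries.
print((256 * 2**20) // 4096)                 # 65536

ipt = InvertedPageTable(num_frames=65536)
ipt.map(pid=1, vpn=5, frame=9)
print(ipt.translate(1, 5 * 4096 + 12))       # 36876  (9*4096 + 12)
```

Note how the sharing disadvantage shows up directly: because the key includes the pid, two processes cannot map the same frame through one entry.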

Virtual Memory
- Idea: use fast (small, expensive) memory as a cache for slow (large, cheap) disk
- 90/10 rule: processes spend 90% of their time in 10% of the code
- Not all of a process's address space need be in memory at one time
- Gives the illusion of near-infinite memory
- Allows more processes in memory (a higher degree of multiprogramming)
- Locality:
  - Spatial: the likelihood of accessing a resource is higher if a resource close to it was just referenced
  - Temporal: the likelihood of accessing a resource is higher if it was recently accessed

Page Fault Handling for Virtual Memory
- The MMU generates a page fault (protection violation or page not present); the handler must:
  - Save registers
  - Figure out the virtual address that caused the fault (often available in a hardware register)
  - If it is a protection problem, signal or kill the process
  - If there is no free frame, evict a page from memory (which one?)
    - If modified, write it to backing store (dedicated paging space or a normal file)
    - Keep the disk location of this page (not in the page table, but in some other data structure; the MMU doesn't need to know disk locations)
    - Suspend the faulting process (resume when the write is complete)
  - Read the data for the faulting page: from backing store, from application code, or fill with zeros
  - Suspend the faulting process (resume when the read is complete)
  - Update the page table
  - Restart the instruction for the faulting process (must undo any partial effects)

Paging and the Translation Lookaside Buffer
- (Flowchart) The CPU checks the TLB. If the PTE is in the TLB, the CPU generates the physical address directly. Otherwise it accesses the page table: if the page is in main memory, update the TLB and return to the failed instruction. If not, the OS finds a page frame: if none is free, it instructs the CPU to write a victim page to disk, transfers that page from main memory to disk, and updates the page table; it then instructs the CPU to read the faulting page from disk, transfers it from disk to main memory, and updates the page table.

Resident Set Management
- How many page frames are allocated to each active process? Fixed or variable
- Which existing pages can be considered for replacement?
  - Local: only pages of the process that caused the page fault
  - Global: pages of all processes
- Cleaning policy:
  - Pre-cleaning: write dirty pages out prospectively
  - Demand cleaning: write dirty pages out only as needed
- Fetch policy:
  - Demand paging
  - Prepaging: load extra pages speculatively while you're loading others
  - Copy-on-write: lazy duplication of pages; for example, on fork, don't copy a data page until a write occurs

Replacement Policy
- Which page, among those eligible, should be replaced?
- All policies want to replace pages that won't be needed for a long time
- Since most processes exhibit locality, recent behavior helps predict future behavior
- Eligibility may be limited by locked frames (kernel pages, I/O buffers in kernel space)
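The page-fault-handling steps above can be sketched as a small simulation. Everything here (the free list, the dirty set, the victim-selection callback, the backing-store dictionary) is a hypothetical stand-in for the kernel structures the slides describe, not real kernel code.

```python
def handle_fault(vpn, page_table, free_frames, resident, dirty,
                 backing_store, choose_victim):
    """Demand-paging fault: find a frame (evicting if necessary),
    fill it, and update the page table. Returns the frame used."""
    if free_frames:
        frame = free_frames.pop()
    else:
        victim = choose_victim(resident)          # replacement policy hook
        frame = page_table.pop(victim)
        resident.remove(victim)
        if victim in dirty:                       # write back only if modified
            backing_store[victim] = f"contents of page {victim}"
            dirty.discard(victim)
    # Read the faulting page from backing store, or zero-fill a fresh page.
    _data = backing_store.get(vpn, b"\x00" * 4096)
    page_table[vpn] = frame
    resident.append(vpn)
    return frame

page_table, free, resident, dirty, store = {}, [0, 1], [], set(), {}
fifo = lambda pages: pages[0]                     # simplest victim choice: FIFO
for vpn in (7, 8, 9):
    handle_fault(vpn, page_table, free, resident, dirty, store, fifo)
print(page_table)   # {8: 0, 9: 1}  (page 7 was evicted to make room for 9)
```

The `choose_victim` parameter is where the replacement policies of the following slides (OPT, FIFO, LRU, Clock) would plug in.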

Page References
- Assumption: the sequence of page references exhibits locality
- A reference string is the list of page numbers used by a program, for example <0 1 2 3 0 1 4 0 1 2 3 4>
- Consecutive references to the same page are removed (that page had better still be in memory)
- A reference means a read or a write

OPT: the Optimal Page Replacement Policy
- Swap out the page that will be used farthest in the future
- Difficult to implement (it requires knowing the future)
- Example reference string: <0 1 2 3 0 1 4 0 1 2 3 4>, with three page frames

FIFO: First-In First-Out
- Swap out the page that has been in memory the longest
- Works well for swapping out initialization code; not so good for often-used code

FIFO: Belady's Anomaly
- For FIFO, adding extra page frames can cause more page faults
- Example reference string: <0 1 2 3 0 1 4 0 1 2 3 4>, worked with three page frames and then with four page frames
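The worked fault tables for this reference string did not survive the transcription, but they can be regenerated by simulation. This sketch (the function names are mine) counts faults for FIFO and OPT: FIFO takes 9 faults with three frames but 10 with four, which is exactly Belady's anomaly, while OPT needs only 7 faults with three frames.

```python
from collections import deque

def fifo_faults(refs, nframes):
    frames, queue, faults = set(), deque(), 0
    for page in refs:
        if page not in frames:
            faults += 1
            if len(frames) == nframes:           # evict the longest-resident page
                frames.discard(queue.popleft())
            frames.add(page)
            queue.append(page)
    return faults

def opt_faults(refs, nframes):
    frames, faults = set(), 0
    for i, page in enumerate(refs):
        if page not in frames:
            faults += 1
            if len(frames) == nframes:
                # Evict the page whose next use is farthest away (or never).
                def next_use(p):
                    future = refs[i + 1:]
                    return future.index(p) if p in future else float("inf")
                frames.discard(max(frames, key=next_use))
            frames.add(page)
    return faults

refs = [0, 1, 2, 3, 0, 1, 4, 0, 1, 2, 3, 4]
print(fifo_faults(refs, 3), fifo_faults(refs, 4), opt_faults(refs, 3))
# 9 10 7  (more frames, more faults under FIFO: Belady's anomaly)
```

OPT is a benchmark rather than a usable policy, since `next_use` peeks at the future of the reference string.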

Least Recently Used (LRU)
- Remove the page that has been unused the longest
- Hardware support: keep a counter in the PTE, updated on use; evict the PTE with the lowest counter
- Or, keep a linked list ordered by usage
- Example reference string: <0 1 2 3 0 1 4 0 1 2 3 4>

Clock (or Second Chance)
- Choose the oldest page that hasn't been referenced
- Implementation: pages in a circular list, with an R bit maintained in each PTE
  - Hardware: whenever a PTE is accessed (a read or write of that page), the R bit is set to 1
  - Software: can set the R bit to 0 or 1
- When a page is loaded, set its R bit to 1
- A hand points to a particular page; when a page frame is needed, check the R bit of the page under the hand:
  - If set, clear it and move to the next page
  - If not set, this is the page to free
- Another view: there are two levels of pages, old pages (not referenced since the last sweep) and new pages (referenced since the last sweep); the algorithm picks one of the old pages, not necessarily the oldest (as LRU would)
- Yet another way to look at it: FIFO with a second chance (if the page at the front of the list has been referenced, clear its reference bit and move it to the back)

Nth Chance
- Clock gives a second chance, so it can distinguish only 2 ages; give n chances instead
- Don't evict a page unless the hand has swept past it n times (needs a counter in the PTE)
- The higher we make n, the more closely it approximates LRU
- Can it loop infinitely?
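The clock policy above is easy to simulate. This sketch (the function name and representation are mine) keeps a circular array of (page, R-bit) pairs and a hand, and counts faults on the lecture's reference string; one reasonable convention for hand movement is assumed, since implementations differ on exactly when the hand advances.

```python
def clock_faults(refs, nframes):
    frames = [None] * nframes       # each slot holds [page, R-bit] or None
    hand, faults = 0, 0
    for page in refs:
        slot = next((f for f in frames if f and f[0] == page), None)
        if slot:                    # hit: hardware would set the R bit
            slot[1] = 1
            continue
        faults += 1
        while True:                 # sweep until a victim (R == 0) is found
            if frames[hand] is None or frames[hand][1] == 0:
                frames[hand] = [page, 1]      # load the page, R bit set
                hand = (hand + 1) % nframes
                break
            frames[hand][1] = 0     # second chance: clear R and move on
            hand = (hand + 1) % nframes
    return faults

print(clock_faults([0, 1, 2, 3, 0, 1, 4, 0, 1, 2, 3, 4], 3))   # 9
```

On this particular string clock does no better than FIFO, because nearly every resident page gets referenced between faults; its advantage shows on workloads where some resident pages go cold.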

Working-Set Model
- Developed by Denning
- W(Δ, t): the set of pages a process has accessed from time t-Δ to time t
- t is virtual time (measured in memory accesses); Δ is the size of the window (a larger window means a possibly larger set of pages)
- The working set can grow and shrink over time
- Idea for an algorithm:
  - Monitor the working set of each process
  - Shrink or grow the page frames allocated to a process down or up to the size of its working set
  - If there is not enough space for the working set, swap the process to disk
- Difficulties: what size of Δ to use? Keeping track of the working set is very difficult
- Approximation: monitor the page-fault frequency of each process
  - Above an upper threshold: add a page frame
  - Below a lower threshold: remove a page frame

Keeping Free Pages
- Keeping some clean free pages makes page faults faster:
  - No need to run the page replacement algorithm; just go to the free list
  - Only need to wait for the page to be brought in (instead of first waiting for a dirty page to be written out)
- Retain the contents of freed page frames: if a page is requested again, reuse its frame without I/O
- Write modified page frames lazily: save them on a modified-page list and write them out in groups (based on disk locality)

Thrashing
- What it is: spending more time paging than doing real work
- Why it happens: if the degree of multiprogramming gets too high, each process's working set is not resident
  - With local replacement, the number of frames allocated to the process isn't enough (fighting within a process)
  - With global replacement, one process causes pages from other processes' working sets to be evicted (fighting among processes)
- Solution: reduce the degree of multiprogramming by swapping processes out to disk
- How to determine a good degree of multiprogramming:
  - Look at utilization of the paging device (50% utilization is optimal)
  - Look at the mean time between faults versus the mean time to service a fault (making them equal maximizes CPU utilization)
  - For the clock algorithm, look at the rate at which the hand sweeps around the clock:
    - Too low: few page faults (not many requests to move the pointer), and few pages scanned per request (most pages not referenced)
    - Too high: a high fault rate, and many pages scanned per request (most pages are referenced)

Memory-Mapped Files
- A file can be mapped into an address space
- The pager must read and write the file, similar to the way it pages in from an executable
- Processes can read and write using memory accesses rather than file read/write calls
- Written data is cached in page frames
- Difficult to change EOF
- Can be shared between processes
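The working-set definition above translates directly into code: W(Δ, t) is the set of distinct pages touched in the last Δ references up to virtual time t. A minimal sketch (function name is mine):

```python
def working_set(refs, t, delta):
    """W(delta, t): distinct pages referenced in the window (t-delta, t].

    Virtual time t counts memory accesses, so refs[:t] are the accesses
    that have happened by time t."""
    start = max(0, t - delta)
    return set(refs[start:t])

refs = [0, 1, 2, 3, 0, 1, 4, 0, 1, 2, 3, 4]
print(working_set(refs, t=6, delta=3))   # {0, 1, 3}
print(working_set(refs, t=12, delta=4))  # {1, 2, 3, 4}
```

Consistent with the slide's remark, enlarging Δ can only keep the set the same or grow it; the difficulty in practice is that the OS cannot afford to track a set like this on every memory access, which is why the page-fault-frequency approximation is used instead.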

Page Sizes
- Advantages of a smaller page size:
  - Less internal fragmentation (on average, the address space of each process wastes P/2 space)
- Advantages of a larger page size:
  - The TLB covers more bytes (TLB size * P), so a better TLB hit rate
  - Smaller page tables (need address space / P PTEs)
- As memory has become cheaper and address spaces larger, page sizes have increased:
  - 1970s, VAX: 512 bytes
  - 1990s, PowerPC: 4KB
  - 1990s, Pentium: 4KB or 4MB (selected per secondary page table)
  - 1990s, MIPS: 16KB

Summary
- Some page replacement algorithms are better than others: OPT, LRU, Clock, FIFO (best to worst)
- Locality is what makes VM (or any caching) work
- Physical memory is a cache for logical memory
- Keep the working set in memory; otherwise, thrashing
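The page-size trade-offs above reduce to simple arithmetic. This sketch (the 64-entry TLB and 4GB address space are illustrative assumptions, not from the slides) computes TLB coverage and flat-page-table size for two page sizes:

```python
def tlb_coverage_bytes(tlb_entries, page_size):
    # A fully utilized TLB maps tlb_entries * page_size bytes.
    return tlb_entries * page_size

def page_table_entries(address_space, page_size):
    # A flat page table needs one PTE per virtual page.
    return address_space // page_size

KB, MB, GB = 2**10, 2**20, 2**30

# A 64-entry TLB: 4KB pages cover 256KB of address space; 4MB pages cover 256MB.
print(tlb_coverage_bytes(64, 4 * KB) // KB)    # 256
print(tlb_coverage_bytes(64, 4 * MB) // MB)    # 256

# A 4GB address space needs 2^20 PTEs with 4KB pages but only 2^10 with 4MB pages.
print(page_table_entries(4 * GB, 4 * KB))      # 1048576
print(page_table_entries(4 * GB, 4 * MB))      # 1024
```

Both quantities scale linearly in P, which is why growing address spaces and cheaper memory have pushed page sizes up, at the cost of P/2 internal fragmentation per process on average.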