CPS104 Computer Organization and Programming Lecture 16: Virtual Memory. Robert Wagner


CPS104 Computer Organization and Programming. Lecture 16: Virtual Memory. Robert Wagner. Fall 2000.

Outline of Today's Lecture: Virtual Memory
- Paged virtual memory
- Virtual-to-physical translation: the page table
- Fragmentation
- Page replacement policies
- Reducing the virtual-to-physical address translation time: the TLB
- Parallel access to the TLB and cache
- Memory protection
- Putting it all together: the SPARCstation 20 memory system

Virtual Memory

Memory System Management Problems
- Different programs have different memory requirements. How do we manage program placement?
- Different machines have different amounts of memory. How do we run the same program on many different machines?
- At any given time each machine runs a different set of programs. How do we fit the program mix into memory? Reclaim unused memory? Move code around?
- The amount of memory consumed by each program is dynamic (it changes over time). How do we grow or shrink a program's memory allocation?
- Program bugs can cause a program to generate reads and writes outside its address space. How do we protect one program from another?

Virtual Memory
- Provides the illusion of a very large memory: the sum of the memory of many jobs can be greater than physical memory, and the address space of each job can be larger than physical memory.
- Allows the available (fast but expensive) physical memory to be well utilized.
- Simplifies memory management: code and data movement, protection, ... (the main reason today).
- Exploits the memory hierarchy to keep average access time low. Involves at least two storage levels: main and secondary.
Terminology:
- Virtual address: the address used by the programmer.
- Virtual address space: the collection of such addresses.
- Memory address: an address in physical memory, also known as a physical address or real address.

Paged Virtual Memory: Main Idea
- Divide memory (virtual and physical) into fixed-size blocks: pages in virtual space, frames in physical space.
- Make the page size a power of 2 (page size = 2^k).
- All pages in the virtual address space are contiguous, but pages can be mapped into physical frames in any order.
- Some pages are in main memory (DRAM); other pages are on secondary memory (disk).
- All programs are written using the virtual memory address space. The hardware does on-the-fly translation between the virtual and physical address spaces, using a page table to translate between virtual and physical addresses.
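Because the page size is a power of two, splitting a virtual address into a page number and an offset is just a shift and a mask. A minimal sketch, assuming 4 KB (2^12-byte) pages:

```python
PAGE_SHIFT = 12              # assumed page size: 4 KB = 2**12 bytes
PAGE_SIZE = 1 << PAGE_SHIFT

def split_va(va):
    """Split a virtual address into (virtual page number, page offset)."""
    return va >> PAGE_SHIFT, va & (PAGE_SIZE - 1)
```

For example, split_va(0x12345) yields virtual page number 0x12 and offset 0x345; no arithmetic beyond the shift and mask is needed.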

Basic Issues in Virtual Memory System Design
- The size of the information blocks (pages) that are transferred from secondary to main storage (M).
- When a block of information is brought into M and M is full, some region of M must be released to make room for the new block: the replacement policy.
- Which region of M is to hold the new block: the placement policy.
- A missing item is fetched from secondary memory only on the occurrence of a fault: the demand load policy.
[Figure: the storage hierarchy (registers, cache, main memory, disk); paging organization, with the virtual and physical address spaces partitioned into blocks of equal size (pages and frames).]

Virtual and Physical Memories
[Figure: virtual pages Page-0 through Page-N mapped to physical frames Frame-0 through Frame-5, with the remaining pages resident on disk.]

Virtual to Physical Address Translation
Page size: 4 KB.
[Figure: virtual address bits 31..12 form the virtual page number and bits 11..0 the page offset. The page table maps the virtual page number to a physical frame number (physical address bits 29..12), which is concatenated with the unchanged page offset to form the physical address.]

Address Map
- V = {0, 1, ..., n - 1}: the virtual address space.
- M = {0, 1, ..., m - 1}: the physical address space, with n > m.
- MAP: V -> M u {0}: the address mapping function.
- MAP(a) = a' if the data at virtual address a is present at physical address a' in M.
- MAP(a) = 0 if the data at virtual address a is not present in M: a missing-item fault.
[Figure: on a fault, the processor traps to the fault handler and the OS transfers the missing page from secondary memory to main memory; otherwise the address translation mechanism produces the physical address a'.]
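The MAP function can be sketched directly. In this illustrative sketch the page table is a hypothetical Python dict from virtual page number to frame number, and None stands in for the missing-item fault:

```python
PAGE_SHIFT = 12   # assumed 4 KB pages

def map_addr(page_table, va):
    """MAP: V -> M u {fault}. Return the physical address for va,
    or None to model the missing-item fault."""
    vpn = va >> PAGE_SHIFT
    offset = va & ((1 << PAGE_SHIFT) - 1)
    frame = page_table.get(vpn)
    if frame is None:
        return None                      # fault: the OS must fetch the page
    return (frame << PAGE_SHIFT) | offset
```

With page_table = {0x12: 0x5}, a reference to 0x12345 translates to 0x5345, while a reference to an unmapped page faults.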

The Page Table
[Figure: virtual address bits 31..12 (the virtual page number) index the page table, which is located by the page table register. Each entry holds a valid bit (V), access control bits (AC), and a frame number. The frame number (physical address bits 29..12) is concatenated with the page offset (bits 11..0) to form the physical address.]

Address Mapping Algorithm
- If V = 1, the page is in main memory at the frame address stored in the table; otherwise the page is located in secondary memory.
- Access control: R = read-only, R/W = read/write, X = execute-only. If the kind of access is not compatible with the specified access rights, a protection violation fault is raised. If the valid bit is not set, a page fault is raised.
- Protection fault: an access control violation; causes a trap to a hardware or software fault handler.
- Page fault: the page is not resident in physical memory; also causes a trap, usually accompanied by a context switch: the current process is suspended while the page is fetched from secondary storage.
- Example: the VAX 11/780. Each process sees a 4-gigabyte (2^32-byte) virtual address space, half for user regions and half for a system-wide name space shared by all processes. The page size is 512 bytes (too small).
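The two kinds of faults can be distinguished as in the sketch below. The PTE representation here (a valid bit, an access-rights string, and a frame number) is made up for illustration, not any real machine's format:

```python
def check_access(pte, op):
    """Return the frame number on success, or the kind of fault raised.
    pte is a (valid, access, frame) tuple; op is 'read', 'write', or 'exec'."""
    valid, access, frame = pte
    allowed = {"R": {"read"}, "R/W": {"read", "write"}, "X": {"exec"}}
    if op not in allowed[access]:
        return "protection_violation_fault"   # access control violation
    if not valid:
        return "page_fault"                   # page not resident in memory
    return frame
```

A write through an R/W entry succeeds, a write through a read-only entry raises a protection fault, and any access through an invalid entry raises a page fault.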

The Page Frame Table
- One entry per physical page frame.
- Hardware sets the Recently Used (RU) bit to 1 when the frame is referenced, and the Dirty (D) bit to 1 when the frame is stored into. The operating system resets these bits to 0. Dirty means the frame contents do NOT match the disk copy.
- The operating system maintains the Valid bit and the Virtual Page # field. Together these fields indicate that, if V == 1, the frame contains the virtual page indicated, while if V == 0, the frame contains no valid page.
- Alternative designs are possible, but are constrained by what the hardware maintains.

Example frame table (RU and D set by hardware; V and Virtual Page # maintained by the OS):

Frame #  RU  D   V  Virtual Pg #
0        0   0   0  100
1        1   0   1  A34
2        1   1   1  B22
3        0   1   1  0
4        1   0   0  100
5        1   0   0  0
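The hardware/OS division of labor over these bits can be modeled as below; this is a sketch with illustrative field names, not a real hardware layout:

```python
def touch_frame(fte, is_store):
    """Model the hardware side: set RU on any reference, and D on a store."""
    fte["RU"] = 1
    if is_store:
        fte["D"] = 1

def os_clear(fte):
    """Model the OS side: periodically reset the hardware-set bits to 0."""
    fte["RU"] = 0
    fte["D"] = 0
```

After a store reference both RU and D are set; the OS later clears them so that "recently used" and "dirty" again describe only the interval since the last clearing.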

Page Replacement Algorithms
Just like cache block replacement!
Least Recently Used (LRU):
- Selects the least recently used page for replacement.
- Requires knowledge about past references, and is more difficult to implement: thread a list through the page table entries from most recently referenced to least recently referenced; when a page is referenced, place it at the head of the list; the page at the end of the list is the one to replace.
- Good performance; recognizes the principle of locality.
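The threaded-list scheme above can be sketched with an ordered dictionary; this is a sketch of the policy, not of how the list would actually be threaded through page table entries:

```python
from collections import OrderedDict

class LRUList:
    """Referencing a page moves it to the most-recently-used end;
    the least-recently-used page is the replacement victim."""
    def __init__(self):
        self.pages = OrderedDict()

    def reference(self, page):
        self.pages.pop(page, None)     # unlink from its current position
        self.pages[page] = True        # relink at the MRU end

    def victim(self):
        return next(iter(self.pages))  # the LRU end of the list
```

After referencing pages 1, 2, 1 in that order, page 2 is the least recently used and is chosen as the victim.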

Page Replacement (Continued)
Not Recently Used (NRU):
- Associated with each page frame is a reference flag: ref flag = 1 if the frame has been referenced in the recent past, 0 otherwise.
- If replacement is necessary, choose any page frame whose reference bit is 0: a frame that has not been referenced in the recent past.
- Per-fault implementation of NRU: keep a last-replaced pointer (lrp) into the frame table. When replacement is to take place, advance lrp to the next entry (mod table size) until one with a 0 RU bit is found; this is the target for replacement. As a side effect, all examined frame table entries have their Recently Used bits set to zero.
- An optimization is to first search for a page that is both not recently used AND not dirty, and only if none exists use the first not-recently-used frame. Why? A clean page need not be written back to disk before its frame is reused.
- The virtual page number in the frame table entry allows access to the page table entry that points to this page frame.
[Figure: a frame table with RU bits 1 0 1 0 1 0 0 0 and the last-replaced pointer (lrp) advancing through it.]
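The per-fault NRU scan described above (often called the clock algorithm) can be sketched as:

```python
def nru_victim(ru, lrp):
    """Advance the last-replaced pointer (mod table size) until a frame
    with RU == 0 is found; clear the RU bit of every frame examined
    along the way. ru is mutated in place; returns the victim index."""
    n = len(ru)
    while True:
        lrp = (lrp + 1) % n
        if ru[lrp] == 0:
            return lrp
        ru[lrp] = 0        # side effect: examined frames lose their RU bit
```

If every frame's RU bit is set, the pointer sweeps the whole table, clearing bits as it goes, and settles on the frame just past where it started.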

Choosing a Page Size
What if the page is too small?
- Too many misses.
- BIG page tables.
What if the page is too big?
- Fragmentation: the program doesn't use all of the page, but that DRAM can't be used for other pages; we want to minimize fragmentation (i.e., get good utilization of physical memory).
- Smaller page tables.
The trend is toward larger pages, driven by the increasing gap between CPU, DRAM, and disk speeds.

Optimal Page Size
Choose the page size that minimizes fragmentation (partial use of pages). A large page size makes internal fragmentation more severe, BUT decreases the number of pages per name space, giving smaller page tables.
In general, the trend is toward larger page sizes because:
- Memories get larger as the price of RAM drops.
- The gap between processor speed and disk speed grows wider.
- Programmers demand larger virtual address spaces (larger programs).
- Larger pages mean smaller page tables.
Most machines use 4 KB to 8 KB pages today, with page sizes likely to increase.

Fragmentation
Page table fragmentation occurs when page tables become very large because of large virtual address spaces; a linearly mapped page table can take up a sizable chunk of memory.
Example: the VAX architecture (late 1970s). The virtual address is split into a 2-bit region field, a 21-bit page number, and a 9-bit displacement:
- 00 = P0 region of the user process
- 01 = P1 region of the user process
- 10 = system name space
NOTE: this implies that a page table could require up to 2^21 entries, each on the order of 4 bytes long (8 MB).
Alternatives to the linear page table:
(1) Hardware associative mapping: requires one entry per page frame (O(M)) rather than one per virtual page (O(N)).
(2) A "software" approach based on a hash table (an inverted page table): an associative or hashed lookup on the page number field yields the access control bits and the physical address.
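The 8 MB figure follows directly from the field widths; a worked check of the slide's arithmetic:

```python
page_number_bits = 21              # VAX per-region page number field
pte_bytes = 4                      # the slide's estimate per entry

entries = 1 << page_number_bits    # up to 2**21 page table entries
table_bytes = entries * pte_bytes  # 2**23 bytes = 8 MB per linear table
```

So a single region's linearly mapped page table can reach 2^21 x 4 bytes = 8 MB, which motivates the associative and inverted alternatives.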

Virtual Addresses and a Cache
[Figure: the CPU issues a virtual address (VA), which is translated to a physical address (PA) before the cache is accessed; a cache miss goes to main memory.]
It takes an extra memory access to translate a VA to a PA. This makes cache access very expensive, and this is the "innermost loop" that you want to go as fast as possible.

Translation Lookaside Buffer (TLB)
A way to speed up translation is to use a special cache of recently used page table entries. This structure has many names, but the most frequently used is Translation Lookaside Buffer, or TLB. Each entry holds a virtual address tag, a physical address, and Dirty, Reference, Valid, and Access bits.
TLB access time is comparable to cache access time (much less than main memory access time). A typical TLB is a 64-256 entry fully associative cache with random replacement.

Translation Look-Aside Buffers (Continued)
Just like any other cache, the TLB can be organized as fully associative, set associative, or direct mapped.
TLBs are usually small, typically no more than 128-256 entries even on high-end machines. This permits a fully associative lookup on these machines. Many mid-range machines use small n-way set associative organizations.
[Figure: the CPU presents a VA to the TLB; on a hit the resulting PA is looked up in the cache, with main memory accessed on a cache miss. Illustrative access times: translation with a TLB hit, 1/2 t; cache data access, t; translation with a TLB miss, 20 t.]
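A fully associative lookup compares the virtual page number against every valid entry; the hardware does all comparisons in parallel, but a sequential sketch (with illustrative entry fields) captures the behavior:

```python
def tlb_lookup(tlb, vpn):
    """Return the physical frame number on a TLB hit, or None on a miss.
    Each entry is a dict with illustrative fields: valid, vpn, frame."""
    for entry in tlb:
        if entry["valid"] and entry["vpn"] == vpn:
            return entry["frame"]   # TLB hit
    return None                     # TLB miss: fall back to the page table
```

Note that an invalid entry never matches, even if its tag happens to equal the requested virtual page number.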

TLB Design
- Must be fast and must not lengthen the critical path.
- Must achieve a high hit ratio; generally small and highly associative (a 64-128 entry fully associative cache).
- Mapping changes: when a page is added to or removed from physical memory, the processor must invalidate the corresponding TLB entry (via special instructions).
- A page table entry (PTE) is a per-process entity, so multiple processes may use the same virtual addresses. What happens on a context switch? Either flush the TLB, or add an Address Space ID (ASID) or Process ID (PID) to each entry; the current ASID is part of the processor state and must be set on a context switch.

Hardware-Managed TLBs
- Hardware handles the TLB miss: TLB control walks the page table with a complicated state machine.
- This dictates the page table organization.
- An exception is raised only on an access violation.

Software-Managed TLBs
- Software handles the TLB miss: the OS reads translations from the page table and puts them into the TLB using special instructions.
- The page table organization is flexible.
- The hardware is simple: it only detects hit or miss.
- An exception is raised on a TLB miss or an access violation.
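An OS-side miss handler for a software-managed TLB can be sketched as below, using the slide's random replacement policy; page fault handling and the special TLB-write instructions are abstracted away:

```python
import random

def tlb_miss_handler(tlb, page_table, vpn):
    """Refill sketch: read the translation from the page table and
    install it in a randomly chosen TLB slot (random replacement)."""
    frame = page_table[vpn]                 # page fault handling omitted
    slot = random.randrange(len(tlb))       # victim chosen at random
    tlb[slot] = {"valid": True, "vpn": vpn, "frame": frame}
    return frame
```

Because the handler is ordinary software, the OS is free to organize the page table however it likes (linear, hashed, inverted) as long as this routine can produce a frame number.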

Reducing Translation Time
Machines with TLBs go one step further to reduce the number of cycles per cache access: they overlap the cache access with the TLB access. This works because the high-order bits of the VA are used to look in the TLB while the low-order bits are used as the index into the cache.

Overlapped Cache and TLB Access
[Figure: the 20-bit page number feeds the associative TLB lookup while the 12-bit displacement indexes the cache in parallel; the physical frame number from the TLB is compared against the cache tag.]
IF cache hit AND (cache tag = PA) THEN deliver data to the CPU
ELSE IF [cache miss OR (cache tag != PA)] AND TLB hit THEN access memory with the PA from the TLB
ELSE do the standard VA translation.

Problems With Overlapped TLB Access
Overlapped access only works as long as the address bits used to index into the cache do not change as a result of VA translation. This usually limits things to small caches, large page sizes, or highly set-associative caches if you want a large cache.
Example: suppose everything is the same as before, except that the cache is increased to 8 KB instead of 4 KB. One cache index bit now lies above the 12-bit page offset; this bit is changed by VA translation, but is needed for cache lookup.
Solutions: go to an 8 KB page size; go to a 2-way set associative cache (1K sets of 2 ways); or have software guarantee that VA[13] = PA[13].
[Figure: the 20-bit virtual page number and 12-bit displacement, with the enlarged cache index overlapping one bit of the page number.]
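The constraint can be checked arithmetically: overlap is safe only if the block-offset bits plus the index bits fit within the page offset. A sketch, with all sizes in bytes and assumed to be powers of two:

```python
def overlap_safe(cache_bytes, ways, block_bytes, page_bytes):
    """True iff every cache index bit lies within the page offset,
    so the cache can be indexed in parallel with the TLB lookup."""
    sets = cache_bytes // (ways * block_bytes)
    index_bits = sets.bit_length() - 1       # log2(number of sets)
    block_bits = block_bytes.bit_length() - 1
    page_bits = page_bytes.bit_length() - 1
    return block_bits + index_bits <= page_bits
```

With 4 KB pages and 4-byte blocks, a 4 KB direct-mapped cache is safe (2 + 10 <= 12), an 8 KB direct-mapped cache is not (2 + 11 > 12), and both of the slide's fixes work: making the cache 2-way set associative, or making the pages 8 KB.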

More on Selecting a Page Size
Reasons for a larger page size:
- The page table size is inversely proportional to the page size.
- Faster cache hit time when cache size <= page size, since this allows use of the virtual address as the cache index (p. 596); a bigger page also allows a bigger such cache.
- Transferring larger pages to or from secondary storage is more efficient (higher bandwidth).
- The number of TLB entries is restricted by the clock cycle time, so a larger page size reduces TLB misses.
Reasons for a smaller page size:
- Don't waste storage; data must be contiguous within a page.
- Quicker process start-up for small processes(?).
Hybrid solution: multiple page sizes. Alpha and UltraSPARC support 8 KB, 64 KB, 512 KB, and 4 MB pages.

Memory Protection
Paged virtual memory provides protection in the following ways:
- Each process (user or OS) has a different virtual address space.
- The OS maintains the page tables for all processes.
- A reference outside a process's allocated space causes an exception that lets the OS decide what to do.
- Memory sharing between processes is done via different virtual spaces mapped to common physical frames.

Putting It All Together: The SPARCstation 20
The SPARCstation 20 has the following memory system.
Caches: two level-1 caches, an I-cache and a D-cache.

Parameter     Instruction cache   Data cache
Organization  20 KB, 5-way SA     16 KB, 4-way SA
Page size     4 KB                4 KB
Line size     8 bytes             4 bytes
Replacement   Pseudo-LRU          Pseudo-LRU

TLB: 64-entry fully associative, random replacement.
External level-2 cache: 1 MB, direct-mapped, 128-byte blocks with 32-byte sub-blocks.

SPARCstation 20 Data Access
[Figure: the 20-bit virtual page number is looked up in the 64-entry fully associative TLB (tag0..tag63) while the 12-bit offset (a 10-bit index plus a 2-bit byte select) indexes the 4-way set associative data cache (1K sets of 4-byte lines). The physical address from the TLB is compared against the four tags to select the data delivered to the CPU; misses go to memory.]