Foundations of Computer Systems

Similar documents
Virtual Memory: Systems

Carnegie Mellon. Bryant and O Hallaron, Computer Systems: A Programmer s Perspective, Third Edition

Computer Systems. Virtual Memory. Han, Hwansoo

Virtual Memory: Systems

Processes and Virtual Memory Concepts

Virtual Memory: Concepts

CSE 153 Design of Operating Systems

Virtual Memory. Alan L. Cox Some slides adapted from CMU slides

14 May 2012 Virtual Memory. Definition: A process is an instance of a running program

Systems Programming and Computer Architecture ( ) Timothy Roscoe

Memory System Case Studies Oct. 13, 2008

Pentium/Linux Memory System March 17, 2005

Virtual Memory: Concepts

Carnegie Mellon. 16 th Lecture, Mar. 20, Instructors: Todd C. Mowry & Anthony Rowe

VM as a cache for disk

198:231 Intro to Computer Organization. 198:231 Introduction to Computer Organization Lecture 14

This lecture. Virtual Memory. Virtual memory (VM) CS Instructor: Sanjeev Se(a

Foundations of Computer Systems

Virtual Memory. Spring Instructors: Aykut and Erkut Erdem

Understanding the Design of Virtual Memory. Zhiqiang Lin

P6/Linux Memory System Nov 11, 2009"

Virtual Memory. CS61, Lecture 15. Prof. Stephen Chong October 20, 2011

Lecture 19: Virtual Memory: Concepts

P6 memory system P6/Linux Memory System October 31, Overview of P6 address translation. Review of abbreviations. Topics. Symbols: ...

Intel P The course that gives CMU its Zip! P6/Linux Memory System November 1, P6 memory system. Review of abbreviations

Virtual Memory: Concepts

Intel P6 (Bob Colwell s Chip, CMU Alumni) The course that gives CMU its Zip! Memory System Case Studies November 7, 2007.

Roadmap. Java: Assembly language: OS: Machine code: Computer system:

Page Which had internal designation P5

Virtual Memory. Samira Khan Apr 27, 2017

Virtual Memory II CSE 351 Spring

Virtual Memory II. CSE 351 Autumn Instructor: Justin Hsia

VIRTUAL MEMORY: CONCEPTS

Operating Systems: Internals and Design. Chapter 8. Seventh Edition William Stallings

Virtual Memory II. CSE 351 Autumn Instructor: Justin Hsia

University of Washington Virtual memory (VM)

CS 153 Design of Operating Systems

Virtual Memory III /18-243: Introduc3on to Computer Systems 17 th Lecture, 23 March Instructors: Bill Nace and Gregory Kesden

CS24: INTRODUCTION TO COMPUTING SYSTEMS. Spring 2018 Lecture 24

Virtual Memory Oct. 29, 2002

virtual memory. March 23, Levels in Memory Hierarchy. DRAM vs. SRAM as a Cache. Page 1. Motivation #1: DRAM a Cache for Disk

Virtual Memory. Motivations for VM Address translation Accelerating translation with TLBs

CSE 351. Virtual Memory

Motivations for Virtual Memory Virtual Memory Oct. 29, Why VM Works? Motivation #1: DRAM a Cache for Disk

Virtual Memory. Physical Addressing. Problem 2: Capacity. Problem 1: Memory Management 11/20/15

Virtual Memory Nov 9, 2009"

CISC 360. Virtual Memory Dec. 4, 2008

A Few Problems with Physical Addressing. Virtual Memory Process Abstraction, Part 2: Private Address Space

Virtual Memory. Computer Systems Principles

CS24: INTRODUCTION TO COMPUTING SYSTEMS. Spring 2015 Lecture 23

Random-Access Memory (RAM) Systemprogrammering 2007 Föreläsning 4 Virtual Memory. Locality. The CPU-Memory Gap. Topics

Virtual Memory I. CSE 351 Spring Instructor: Ruth Anderson

Random-Access Memory (RAM) Systemprogrammering 2009 Föreläsning 4 Virtual Memory. Locality. The CPU-Memory Gap. Topics! The memory hierarchy

CS24: INTRODUCTION TO COMPUTING SYSTEMS. Spring 2018 Lecture 23

CS 153 Design of Operating Systems

virtual memory Page 1 CSE 361S Disk Disk

CS24: INTRODUCTION TO COMPUTING SYSTEMS. Spring 2015 Lecture 25

@2010 Badri Computer Architecture Assembly II. Virtual Memory. Topics (Chapter 9) Motivations for VM Address translation

Virtual Memory I. CSE 351 Winter Instructor: Mark Wyse

PROCESS VIRTUAL MEMORY PART 2. CS124 Operating Systems Winter , Lecture 19

CS 261 Fall Mike Lam, Professor. Virtual Memory

Virtual Memory. Daniel Sanchez Computer Science & Artificial Intelligence Lab M.I.T. November 15, MIT Fall 2018 L20-1

University*of*Washington*

CSE 120 Principles of Operating Systems

CIS Operating Systems Memory Management Cache and Demand Paging. Professor Qiang Zeng Spring 2018

CS 61C: Great Ideas in Computer Architecture. Lecture 23: Virtual Memory. Bernhard Boser & Randy Katz

Memory Management. Fundamentally two related, but distinct, issues. Management of logical address space resource

Virtual Memory. Daniel Sanchez Computer Science & Artificial Intelligence Lab M.I.T. April 12, 2018 L16-1

CSE 560 Computer Systems Architecture

CS 61C: Great Ideas in Computer Architecture. Lecture 23: Virtual Memory

CIS Operating Systems Memory Management Cache. Professor Qiang Zeng Fall 2017

Lecture 21: Virtual Memory. Spring 2018 Jason Tang

Virtual Memory. Patterson & Hennessey Chapter 5 ELEC 5200/6200 1

CSE 120 Principles of Operating Systems Spring 2017

Recall: Address Space Map. 13: Memory Management. Let s be reasonable. Processes Address Space. Send it to disk. Freeing up System Memory

Memory Management. Disclaimer: some slides are adopted from book authors slides with permission 1

18-447: Computer Architecture Lecture 16: Virtual Memory

Memory Management! Goals of this Lecture!

CIS Operating Systems Memory Management Cache. Professor Qiang Zeng Fall 2015

CS 318 Principles of Operating Systems

Memory Management! How the hardware and OS give application pgms:" The illusion of a large contiguous address space" Protection against each other"

Virtual Memory. CS 351: Systems Programming Michael Saelee

Virtual Memory. CS 3410 Computer System Organization & Programming

Memory Hierarchy Requirements. Three Advantages of Virtual Memory

Computer Architecture Lecture 13: Virtual Memory II

Memory Management. Goals of this Lecture. Motivation for Memory Hierarchy

Virtual Memory: From Address Translation to Demand Paging

Compsci 104. Duke University. Input/Output. Instructors: Alvin R. Lebeck. Some Slides provided by Randy Bryant and Dave O Hallaron and G.

seven Virtual Memory Introduction

Memory Allocation. Copyright : University of Illinois CS 241 Staff 1

Another View of the Memory Hierarchy. Lecture #25 Virtual Memory I Memory Hierarchy Requirements. Memory Hierarchy Requirements

Agenda. CS 61C: Great Ideas in Computer Architecture. Virtual Memory II. Goals of Virtual Memory. Memory Hierarchy Requirements

Virtual Memory. Virtual Memory

John Wawrzynek & Nick Weaver

Memory Management. Disclaimer: some slides are adopted from book authors slides with permission 1

Virtual Memory. CS 3410 Computer System Organization & Programming. [K. Bala, A. Bracy, E. Sirer, and H. Weatherspoon]

18-447: Computer Architecture Lecture 18: Virtual Memory III. Yoongu Kim Carnegie Mellon University Spring 2013, 3/1

Applications of. Virtual Memory in. OS Design

Recap: Memory Management

How to create a process? What does process look like?

Transcription:

8-6 Foundations of Computer Systems Lecture 5: Virtual Memory Concepts and Systems October 8, 27 8-6 SE PL OS CA Required Reading Assignment: Chapter 9 of CS:APP (3 rd edition) by Randy Bryant & Dave O Hallaron. /8/27 8-6 Lecture #5

Socrative Experiment (Continuing) Pittsburgh Students (86PGH): https://api.socrative.com/rc/icjvvc Silicon Valley Students (86SV): https://api.socrative.com/rc/iez85z Microphone/Speak out/raise Hand: Still G-R-E-A-T! Socrative: Let s me open floor for electronic questions, putting questions into a visual queue so I don t miss any Let s me do flash polls, etc. Prevents cross-talk and organic discussions in more generalized forums from pulling coteries out of class discussion into parallel question space. Keeps focus and reduces distraction while adding another vehicle for classroom interactivity. Won t allow more than 5 students per room So, I created one room per campus May later try random assignment to a room, etc. /8/27 8-6 Lecture #5 2

8-6 Foundations of Computer Systems Lecture 5: Virtual Memory Concepts and Systems Address spaces VM as a tool for caching VM as a tool for memory management VM as a tool for memory protection Address translation Simple memory system example Case study: Core i7/linux memory system Memory mapping /8/27 8-6 Lecture #5 3

A System Using Physical Addressing CPU Physical address (PA) 4 Main memory : : 2: 3: 4: 5: 6: 7: 8:... M-: Data word Used in simple systems like embedded microcontrollers in devices like cars, elevators, and digital picture frames /8/27 8-6 Lecture #5 4

A System Using Virtual Addressing CPU Chip CPU Virtual address (VA) 4 MMU Physical address (PA) 4 Main memory : : 2: 3: 4: 5: 6: 7: 8:... M-: Data word Used in all modern servers, laptops, and smart phones One of the great ideas in computer science /8/27 8-6 Lecture #5 5

Address Spaces Linear address space: Ordered set of contiguous non-negative integer addresses: {,, 2, 3 } Virtual address space: Set of N = 2 n virtual addresses {,, 2, 3,, N-} Physical address space: Set of M = 2 m physical addresses {,, 2, 3,, M-} /8/27 8-6 Lecture #5 6

Why Virtual Memory (VM)? Uses main memory efficiently Use DRAM as a cache for parts of a virtual address space Simplifies memory management Each process gets the same uniform linear address space Isolates address spaces One process can t interfere with another s memory User program cannot access privileged kernel information and code /8/27 8-6 Lecture #5 7

VM as a Tool for Caching Conceptually, virtual memory is an array of N contiguous bytes stored on disk. The contents of the array on disk are cached in physical memory (DRAM cache) These cache blocks are called pages (size is P = 2 p bytes) Virtual memory Physical memory VP VP Unallocated Cached Uncached Unallocated Empty Empty PP PP Cached Uncached Empty VP 2 n-p - Cached Uncached N- M- PP 2 m-p - Virtual pages (VPs) stored on disk Physical pages (PPs) cached in DRAM /8/27 8-6 Lecture #5 8

DRAM Cache Organization DRAM cache organization driven by the enormous miss penalty DRAM is about x slower than SRAM Disk is about,x slower than DRAM Consequences Large page (block) size: typically 4 KB, sometimes 4 MB Fully associative Any VP can be placed in any PP Requires a large mapping function different from cache memories Highly sophisticated, expensive replacement algorithms Too complicated and open-ended to be implemented in hardware Write-back rather than write-through /8/27 8-6 Lecture #5 9

Enabling Data Structure: Page Table A page table is an array of page table entries (PTEs) that maps virtual pages to physical pages. Per-process kernel data structure in DRAM Valid PTE PTE 7 Physical page number or disk address null null Memory resident page table (DRAM) Physical memory (DRAM) VP VP 2 VP 7 VP 4 Virtual memory (disk) /8/27 8-6 Lecture #5 VP VP 2 VP 3 VP 4 VP 6 VP 7 PP PP 3

Page Hit Page hit: reference to VM word that is in physical memory (DRAM cache hit) Virtual address Valid PTE PTE 7 Physical page number or disk address null null Memory resident page table (DRAM) Physical memory (DRAM) VP VP 2 VP 7 VP 4 Virtual memory (disk) VP VP 2 VP 3 VP 4 PP PP 3 VP 6 VP 7 /8/27 8-6 Lecture #5

Page Fault Page fault: reference to VM word that is not in physical memory (DRAM cache miss) Virtual address Valid PTE PTE 7 Physical page number or disk address null null Memory resident page table (DRAM) Physical memory (DRAM) VP VP 2 VP 7 VP 4 Virtual memory (disk) VP VP 2 VP 3 VP 4 PP PP 3 VP 6 VP 7 /8/27 8-6 Lecture #5 2

Handling Page Fault Page miss causes page fault (an exception) Virtual address Valid PTE PTE 7 Physical page number or disk address null null Memory resident page table (DRAM) Physical memory (DRAM) VP VP 2 VP 7 VP 4 Virtual memory (disk) VP VP 2 VP 3 VP 4 PP PP 3 VP 6 VP 7 /8/27 8-6 Lecture #5 3

Handling Page Fault Page miss causes page fault (an exception) Page fault handler selects a victim to be evicted (here VP 4) Virtual address Valid PTE PTE 7 Physical page number or disk address null null Memory resident page table (DRAM) Physical memory (DRAM) VP VP 2 VP 7 VP 4 Virtual memory (disk) VP VP 2 VP 3 VP 4 PP PP 3 VP 6 VP 7 /8/27 8-6 Lecture #5 4

Handling Page Fault Page miss causes page fault (an exception) Page fault handler selects a victim to be evicted (here VP 4) Virtual address Valid PTE PTE 7 Physical page number or disk address null null Memory resident page table (DRAM) Physical memory (DRAM) VP VP 2 VP 7 VP 3 Virtual memory (disk) VP VP 2 VP 3 VP 4 PP PP 3 VP 6 VP 7 /8/27 8-6 Lecture #5 5

Handling Page Fault Page miss causes page fault (an exception) Page fault handler selects a victim to be evicted (here VP 4) Offending instruction is restarted: page hit! Virtual address Valid PTE PTE 7 Physical page number or disk address null null Memory resident page table (DRAM) Key point: Waiting until the miss to copy the page to DRAM is known as demand paging Physical memory (DRAM) VP VP 2 VP 7 VP 3 Virtual memory (disk) VP VP 2 VP 3 VP 4 VP 6 VP 7 PP PP 3 /8/27 8-6 Lecture #5 6

Allocating Pages Allocating a new page (VP 5) of virtual memory. Valid PTE PTE 7 Physical page number or disk address null Memory resident page table (DRAM) Physical memory (DRAM) VP VP 2 VP 7 VP 3 Virtual memory (disk) VP VP 2 VP 3 VP 4 VP 5 VP 6 VP 7 PP PP 3 /8/27 8-6 Lecture #5 7

Locality to the Rescue Again! Virtual memory seems terribly inefficient, but it works because of locality. At any point in time, programs tend to access a set of active virtual pages called the working set Programs with better temporal locality will have smaller working sets If (working set size < main memory size) Good performance for one process after compulsory misses If ( SUM(working set sizes) > main memory size ) Thrashing: Performance meltdown where pages are swapped (copied) in and out continuously /8/27 8-6 Lecture #5 8

VM as a Tool for Memory Management Key idea: each process has its own virtual address space It can view memory as a simple linear array Mapping function scatters addresses through physical memory Well-chosen mappings can improve locality Virtual Address Space for Process : VP VP 2... Address translation PP 2 Physical Address Space (DRAM) N- Virtual Address Space for Process 2: VP VP 2... PP 6 PP 8... (e.g., read-only library code) N- M- /8/27 8-6 Lecture #5 9

VM as a Tool for Memory Management Simplifying memory allocation Each virtual page can be mapped to any physical page A virtual page can be stored in different physical pages at different times Sharing code and data among processes Map virtual pages to the same physical page (here: PP 6) Address translation Virtual Address Space for Process : VP VP 2... PP 2 Physical Address Space (DRAM) N- Virtual Address Space for Process 2: VP VP 2... PP 6 PP 8... (e.g., read-only library code) N- M- /8/27 8-6 Lecture #5 2

Simplifying Linking and Loading Linking Each program has similar virtual address space Code, data, and heap always start at the same addresses. Loading execve allocates virtual pages for.text and.data sections & creates PTEs marked as invalid The.text and.data sections are copied, page by page, on demand by the virtual memory system x4 Kernel virtual memory User stack (created at runtime) Memory-mapped region for shared libraries Run-time heap (created by malloc) Read/write segment (.data,.bss) Read-only segment (.init,.text,.rodata) Unused Memory invisible to user code %rsp (stack pointer) brk Loaded from the executable file /8/27 8-6 Lecture #5 2

VM as a Tool for Memory Protection Extend PTEs with permission bits MMU checks these bits on each access Process i: VP : VP : VP 2: SUP No No Yes READ Yes Yes Yes WRITE No Yes Yes EXEC Yes Yes No Address PP 6 PP 4 PP 2 Physical Address Space PP 2 PP 4 PP 6 Process j: VP : VP : VP 2: SUP No Yes No READ Yes Yes Yes WRITE No Yes Yes EXEC Yes Yes Yes Address PP 9 PP 6 PP PP 8 PP 9 PP /8/27 8-6 Lecture #5 22

VM Address Translation Virtual Address Space V = {,,, N} Physical Address Space P = {,,, M} Address Translation MAP: V P U { } For virtual address a: MAP(a) = a if data at virtual address a is at physical address a in P MAP(a) = if data at virtual address a is not in physical memory Either invalid or stored on disk /8/27 8-6 Lecture #5 23

Summary of Address Translation Symbols Basic Parameters N = 2 n : Number of addresses in virtual address space M = 2 m : Number of addresses in physical address space P = 2 p : Page size (bytes) Components of the virtual address (VA) TLBI: TLB index TLBT: TLB tag VPO: Virtual page offset VPN: Virtual page number Components of the physical address (PA) PPO: Physical page offset (same as VPO) PPN: Physical page number /8/27 8-6 Lecture #5 24

Address Translation With a Page Table Virtual address Page table base register (PTBR) n- Virtual page number (VPN) p p- Virtual page offset (VPO) Physical page table address for the current process Page table Valid Physical page number (PPN) Valid bit = : Page not in memory (page fault) Valid bit = m- p p- Physical page number (PPN) Physical page offset (PPO) Physical address /8/27 8-6 Lecture #5 25

Address Translation: Page Hit CPU Chip CPU VA MMU 2 PTEA PTE 3 PA Cache/ Memory 4 Data 5 ) Processor sends virtual address to MMU 2-3) MMU fetches PTE from page table in memory 4) MMU sends physical address to cache/memory 5) Cache/memory sends data word to processor /8/27 8-6 Lecture #5 26

Address Translation: Page Fault Exception 4 Page fault handler CPU Chip CPU VA 7 MMU 2 PTEA PTE 3 Cache/ Memory Victim page 5 New page Disk 6 ) Processor sends virtual address to MMU 2-3) MMU fetches PTE from page table in memory 4) Valid bit is zero, so MMU triggers page fault exception 5) Handler identifies victim (and, if dirty, pages it out to disk) 6) Handler pages in new page and updates PTE in memory 7) Handler returns to original process, restarting faulting instruction /8/27 8-6 Lecture #5 27

Integrating VM and Cache PTE CPU Chip CPU VA MMU PTEA PA PTEA hit PTEA miss PA miss PTE PTEA PA Memory PA hit Data Data L cache VA: virtual address, PA: physical address, PTE: page table entry, PTEA = PTE address /8/27 8-6 Lecture #5 28

Speeding up Translation with a TLB Page table entries (PTEs) are cached in L like any other memory word PTEs may be evicted by other data references PTE hit still requires a small L delay Solution: Translation Lookaside Buffer (TLB) Small set-associative hardware cache in MMU Maps virtual page numbers to physical page numbers Contains complete page table entries for small number of pages /8/27 8-6 Lecture #5 29

Accessing the TLB MMU uses the VPN portion of the virtual address to access the TLB: TLBT matches tag of line within set n- TLB tag (TLBT) T = 2 t sets VPN p+t p+t- TLB index (TLBI) p p- VPO Set v tag PTE v tag PTE TLBI selects the set Set v tag PTE v tag PTE Set T- v tag PTE v tag PTE /8/27 8-6 Lecture #5 3

TLB Hit CPU Chip TLB 2 VPN PTE 3 CPU VA MMU PA 4 Cache/ Memory Data 5 A TLB hit eliminates a memory access /8/27 8-6 Lecture #5 3

TLB Miss CPU Chip TLB 4 2 PTE VPN 3 CPU VA MMU PTEA PA Cache/ Memory 5 Data 6 A TLB miss incurs an additional memory access (the PTE) Fortunately, TLB misses are rare. Why? /8/27 8-6 Lecture #5 32

... Multi-Level Page Tables Suppose: 4KB (2 2 ) page size, 48-bit address space, 8-byte PTE Level 2 Tables Problem: Would need a 52 GB page table! 2 48 * 2-2 * 2 3 = 2 39 bytes Level Table... Common solution: Multi-level page table Example: 2-level page table Level table: each PTE points to a page table (always memory resident) Level 2 table: each PTE points to a page (paged in and out like any other data) /8/27 8-6 Lecture #5 33

A Two-Level Page Table Hierarchy Level page table PTE PTE PTE 2 (null) PTE 3 (null) PTE 4 (null) PTE 5 (null) PTE 6 (null) PTE 7 (null) PTE 8 (K - 9) null PTEs Level 2 page tables PTE... PTE 23 PTE... PTE 23 23 null PTEs 32 bit addresses, 4KB pages, 4-byte PTEs Virtual memory VP... VP 23 VP 24... VP 247 Gap PTE 23 23 unallocated pages VP 925... 2K allocated VM pages for code and data 6K unallocated VM pages 23 unallocated pages allocated VM page for the stack /8/27 8-6 Lecture #5 34

Translating with a k-level Page Table Page table base register (PTBR) n- VPN VIRTUAL ADDRESS VPN 2... VPN k p- VPO Level page table Level 2 page table...... Level k page table PPN m- PPN p- PPO PHYSICAL ADDRESS /8/27 8-6 Lecture #5 35

Summary Programmer s view of virtual memory Each process has its own private linear address space Cannot be corrupted by other processes System view of virtual memory Uses memory efficiently by caching virtual memory pages Efficient only because of locality Simplifies memory management and programming Simplifies protection by providing a convenient interpositioning point to check permissions /8/27 8-6 Lecture #5 36

Review of Symbols Basic Parameters N = 2 n : Number of addresses in virtual address space M = 2 m : Number of addresses in physical address space P = 2 p : Page size (bytes) Components of the virtual address (VA) TLBI: TLB index TLBT: TLB tag VPO: Virtual page offset VPN: Virtual page number Components of the physical address (PA) PPO: Physical page offset (same as VPO) PPN: Physical page number CO: Byte offset within cache line CI: Cache index CT: Cache tag /8/27 8-6 Lecture #5 37

Simple Memory System Example Addressing 4-bit virtual addresses 2-bit physical address Page size = 64 bytes 3 2 9 8 7 6 5 4 3 2 VPN Virtual Page Number VPO Virtual Page Offset 9 8 7 6 5 4 3 2 PPN Physical Page Number PPO Physical Page Offset /8/27 8-6 Lecture #5 38

. Simple Memory System TLB 6 entries 4-way associative TLBT TLBI 3 2 9 8 7 6 5 4 3 2 VPN VPO Set Tag PPN Valid Tag PPN Valid Tag PPN Valid Tag PPN Valid 3 9 D 7 2 3 2D 2 4 A 2 2 8 6 3 3 7 3 D A 34 2 /8/27 8-6 Lecture #5 39

2. Simple Memory System Page Table Only show first 6 entries (out of 256) VPN PPN Valid VPN PPN Valid 28 8 3 9 7 2 33 A 9 3 2 B 4 C 5 6 D 2D 6 E 7 F D /8/27 8-6 Lecture #5 4

3. Simple Memory System Cache 6 lines, 4-byte block size Physically addressed Direct mapped CT CI CO 9 8 7 6 5 4 3 2 PPN PPO Idx Tag Valid B B B2 B3 Idx Tag Valid B B B2 B3 9 99 23 8 24 3A 5 89 5 9 2D 2 B 2 4 8 A 2D 93 5 DA 3B 3 36 B B 4 32 43 6D 8F 9 C 2 5 D 36 72 F D D 6 4 96 34 5 6 3 E 3 83 77 B D3 7 6 C2 DF 3 F 4 /8/27 8-6 Lecture #5 4

Address Translation Example # Virtual Address: x3d4 TLBT TLBI 3 2 9 8 7 6 5 4 3 2 VPN VPO VPN xf TLBI x3 TLBT x3 TLB Hit? Y Page Fault? N PPN: xd Physical Address CT CI CO 9 8 7 6 5 4 3 2 PPN PPO CO CI x5 CT xd Hit? Y Byte: x36 /8/27 8-6 Lecture #5 42

Address Translation Example #2 Virtual Address: x2 TLBT TLBI 3 2 9 8 7 6 5 4 3 2 VPN VPO VPN x TLBI TLBT x TLB Hit? N Page Fault? N PPN: x28 Physical Address CT CI CO 9 8 7 6 5 4 3 2 PPN PPO CO CI x8 CT x28 Hit? N Byte: Mem /8/27 8-6 Lecture #5 43

Address Translation Example #3 Virtual Address: x2 TLBT TLBI 3 2 9 8 7 6 5 4 3 2 VPN VPO VPN x TLBI TLBT x TLB Hit? N Page Fault? N PPN: x28 Physical Address CT CI CO 9 8 7 6 5 4 3 2 PPN PPO CO CI x8 CT x28 Hit? N Byte: Mem /8/27 8-6 Lecture #5 44

Intel Core i7 Memory System Processor package Core x4 Registers Instruction fetch MMU (addr translation) L d-cache 32 KB, 8-way L i-cache 32 KB, 8-way L d-tlb 64 entries, 4-way L i-tlb 28 entries, 4-way L2 unified cache 256 KB, 8-way L3 unified cache 8 MB, 6-way (shared by all cores) L2 unified TLB 52 entries, 4-way QuickPath interconnect 4 links @ 25.6 GB/s each DDR3 Memory controller 3 x 64 bit @.66 GB/s 32 GB/s total (shared by all cores) To other cores To I/O bridge Main memory /8/27 8-6 Lecture #5 45

Review of Symbols Basic Parameters N = 2 n : Number of addresses in virtual address space M = 2 m : Number of addresses in physical address space P = 2 p : Page size (bytes) Components of the virtual address (VA) TLBI: TLB index TLBT: TLB tag VPO: Virtual page offset VPN: Virtual page number Components of the physical address (PA) PPO: Physical page offset (same as VPO) PPN: Physical page number CO: Byte offset within cache line CI: Cache index CT: Cache tag /8/27 8-6 Lecture #5 46

End-to-end Core i7 Address Translation VPN TLB miss CPU 36 2 32 TLBT Virtual address (VA) VPO 4 TLBI... TLB hit 32/64 Result L hit L d-cache (64 sets, 8 lines/set)... L2, L3, and main memory L miss L TLB (6 sets, 4 entries/set) CR3 9 VPN PTE 9 9 9 VPN2 VPN3 VPN4 PTE PTE PTE 4 2 PPN PPO Physical address (PA) 4 6 6 CT CI CO Page tables /8/27 8-6 Lecture #5 47

Core i7 Level -3 Page Table Entries 63 XD 62 Unused 52 5 2 9 8 7 6 5 4 3 2 Page table physical base address Unused G PS A CD WT U/S R/W P= Available for OS (page table location on disk) P= Each entry references a 4K child page table. Significant fields: P: Child page table present in physical memory () or not (). R/W: Read-only or read-write access access permission for all reachable pages. U/S: user or supervisor (kernel) mode access permission for all reachable pages. WT: Write-through or write-back cache policy for the child page table. A: Reference bit (set by MMU on reads and writes, cleared by software). PS: Page size either 4 KB or 4 MB (defined for Level PTEs only). Page table physical base address: 4 most significant bits of physical page table address (forces page tables to be 4KB aligned) XD: Disable or enable instruction fetches from all pages reachable from this PTE. /8/27 8-6 Lecture #5 48

Core i7 Level 4 Page Table Entries 63 XD 62 Unused 52 5 2 9 8 7 6 5 4 3 2 Page physical base address Unused G D A CD WT U/S R/W P= Available for OS (page location on disk) P= Each entry references a 4K child page. Significant fields: P: Child page is present in memory () or not () R/W: Read-only or read-write access permission for child page U/S: User or supervisor mode access WT: Write-through or write-back cache policy for this page A: Reference bit (set by MMU on reads and writes, cleared by software) D: Dirty bit (set by MMU on writes, cleared by software) Page physical base address: 4 most significant bits of physical page address (forces pages to be 4KB aligned) XD: Disable or enable instruction fetches from this page. /8/27 8-6 Lecture #5 49

Core i7 Page Table Translation 9 VPN 9 VPN 2 9 VPN 3 VPN 4 9 2 Virtual VPO address CR3 Physical address of L PT L PT L2 PT L3 PT Page global Page upper Page middle directory directory directory 4 / 4 / 4 / 4 / L PTE L2 PTE L3 PTE L4 PT Page table L4 PTE /2 Offset into physical and virtual page 52 GB region per entry GB region per entry 2 MB region per entry 4 KB region per entry Physical address of page 4 2 Physical PPN 4 / PPO address /8/27 8-6 Lecture #5 5

Cute Trick for Speeding Up L Access CT Tag Check Physical address (PA) 4 6 6 CT CI CO PPN PPO Virtual address (VA) Address Translation VPN 36 2 Observation Bits that determine CI identical in virtual and physical address Can index into cache while address translation taking place Generally we hit in TLB, so PPN bits (CT bits) available next Virtually indexed, physically tagged Cache carefully sized to make this possible VPO No Change CI L Cache /8/27 8-6 Lecture #5 5

Virtual Address Space of a Linux Process Different for each process Identical for each process %rsp Process-specific data structs (ptables, task and mm structs, kernel stack) Physical memory Kernel code and data User stack Kernel virtual memory brk x4 Memory mapped region for shared libraries Runtime heap (malloc) Uninitialized data (.bss) Initialized data (.data) Program text (.text) Process virtual memory /8/27 8-6 Lecture #5 52

Linux Organizes VM as Collection of Areas task_struct mm mm_struct pgd mmap vm_area_struct vm_end vm_start vm_prot vm_flags vm_next Process virtual memory Shared libraries pgd: Page global directory address Points to L page table vm_prot: Read/write permissions for this area vm_flags Pages shared with other processes or private to this process vm_end vm_start vm_prot vm_flags vm_next vm_end vm_start vm_prot vm_flags vm_next /8/27 8-6 Lecture #5 53 Data Text

Linux Page Fault Handling vm_area_struct Process virtual memory vm_end vm_start vm_prot vm_flags vm_next vm_end vm_start vm_prot vm_flags shared libraries data read 3 read Segmentation fault: accessing a non-existing page Normal page fault vm_next vm_end vm_start vm_prot vm_flags vm_next text 2 write Protection exception: e.g., violating permission by writing to a read-only page (Linux reports as Segmentation fault) /8/27 8-6 Lecture #5 54

Memory Mapping VM areas initialized by associating them with disk objects. Process is known as memory mapping. Area can be backed by (i.e., get its initial values from) : Regular file on disk (e.g., an executable object file) Initial page bytes come from a section of a file Anonymous file (e.g., nothing) First fault will allocate a physical page full of 's (demand-zero page) Once the page is written to (dirtied), it is like any other page Dirty pages are copied back and forth between memory and a special swap file. /8/27 8-6 Lecture #5 55

Sharing Revisited: Shared Objects Process virtual memory Physical memory Process 2 virtual memory Process maps the shared object. Shared object /8/27 8-6 Lecture #5 56

Sharing Revisited: Shared Objects Process virtual memory Physical memory Process 2 virtual memory Process 2 maps the shared object. Notice how the virtual addresses can be different. Shared object /8/27 8-6 Lecture #5 57

Sharing Revisited: Private Copy-on-write (COW) Objects Process virtual memory Physical memory Process 2 virtual memory Two processes mapping a private copy-on-write (COW) object. Private copy-on-write area Area flagged as private copy-on-write PTEs in private areas are flagged as read-only Private copy-on-write object /8/27 8-6 Lecture #5 58

Sharing Revisited: Private Copy-on-write (COW) Objects Process virtual memory Physical memory Copy-on-write Process 2 virtual memory Write to private copy-on-write page Instruction writing to private page triggers protection fault. Handler creates new R/W page. Instruction restarts upon handler return. Copying deferred as long as possible! Private copy-on-write object /8/27 8-6 Lecture #5 59

The fork Function Revisited VM and memory mapping explain how fork provides private address space for each process. To create virtual address for new new process Create exact copies of current mm_struct, vm_area_struct, and page tables. Flag each page in both processes as read-only Flag each vm_area_struct in both processes as private COW On return, each process has exact copy of virtual memory Subsequent writes create new pages using COW mechanism. /8/27 8-6 Lecture #5 6

The execve Function Revisited User stack Private, demand-zero To load and run a new program a.out in the current process using execve: libc.so.data.text Memory mapped region for shared libraries Shared, file-backed Free vm_area_struct s and page tables for old areas a.out.data.text Runtime heap (via malloc) Uninitialized data (.bss) Initialized data (.data) Program text (.text) Private, demand-zero Private, demand-zero Private, file-backed Create vm_area_struct s and page tables for new areas Programs and initialized data backed by object files..bss and stack backed by anonymous files. Set PC to entry point in.text Linux will fault in code and data pages as needed. /8/27 8-6 Lecture #5 6

User-Level Memory Mapping void *mmap(void *start, int len, int prot, int flags, int fd, int offset) Map len bytes starting at offset offset of the file specified by file description fd, preferably at address start start: may be for pick an address prot: PROT_READ, PROT_WRITE,... flags: MAP_ANON, MAP_PRIVATE, MAP_SHARED,... Return a pointer to start of mapped area (may not be start) /8/27 8-6 Lecture #5 62

User-Level Memory Mapping void *mmap(void *start, int len, int prot, int flags, int fd, int offset) len bytes len bytes start (or address chosen by kernel) offset (bytes) Disk file specified by file descriptor fd Process virtual memory /8/27 8-6 Lecture #5 63

Example: Using mmap to Copy Files Copying a file to stdout without transferring data to user space. #include "csapp.h" void mmapcopy(int fd, int size) { } /* Ptr to memory mapped area */ char *bufp; bufp = Mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, ); Write(, bufp, size); return; mmapcopy.c /* mmapcopy driver */ int main(int argc, char **argv) { struct stat stat; int fd; } /* Check for required cmd line arg */ if (argc!= 2) { printf("usage: %s <filename>\n", argv[]); exit(); } /* Copy input file to stdout */ fd = Open(argv[], O_RDONLY, ); Fstat(fd, &stat); mmapcopy(fd, stat.st_size); exit(); mmapcopy.c /8/27 8-6 Lecture #5 64

8-6 Foundations of Computer Systems Lecture 6: Dynamic Memory Allocation October 23, 27 Required Reading Assignment: Chapter 9 of CS:APP (3 rd edition) by Randy Bryant & Dave O Hallaron. /8/27 8-6 Lecture #5 65