Systemprogrammering 2007 (Systems Programming), Lecture 4: Random-Access Memory (RAM), Locality, the CPU-Memory Gap, and Virtual Memory


Systemprogrammering 2007, Lecture 4 (F4)

Topics
  - The memory hierarchy
  - Motivations for VM
  - Address translation
  - Accelerating translation with TLBs

Random-Access Memory (RAM)
  Key features
    - RAM is packaged as a chip.
    - Basic storage unit is a cell (one bit per cell).
    - Multiple RAM chips form a memory.
  Static RAM (SRAM)
    - Each cell stores a bit with a six-transistor circuit.
    - Retains its value indefinitely, as long as it is kept powered.
    - Relatively insensitive to disturbances such as electrical noise.
    - Faster and more expensive than DRAM.
  Dynamic RAM (DRAM)
    - Each cell stores a bit with a capacitor and a transistor.
    - Value must be refreshed every 10-100 ms.
    - Sensitive to disturbances.
    - Slower and cheaper than SRAM.

The CPU-Memory Gap
  The increasing gap between DRAM, disk, and CPU speeds.
  [Figure: time in ns (log scale) versus year, 1980-2000, for disk seek time, DRAM access time, SRAM access time, and CPU cycle time]

Locality
  Principle of Locality: Programs tend to reuse data and instructions near those they have used recently, or that were recently referenced themselves.
    - Temporal locality: Recently referenced items are likely to be referenced in the near future.
    - Spatial locality: Items with nearby addresses tend to be referenced close together in time.

  Locality Example:

    sum = 0;
    for (i = 0; i < n; i++)
        sum += a[i];
    return sum;

  Data
    - Reference array elements in succession (stride-1 reference pattern): spatial locality
    - Reference sum each iteration: temporal locality
  Instructions
    - Reference instructions in sequence: spatial locality
    - Cycle through loop repeatedly: temporal locality

Memory Hierarchies
  Some fundamental and enduring properties of hardware and software:
    - Fast storage technologies cost more per byte and have less capacity.
    - The gap between CPU and main memory speed is widening.
    - Well-written programs tend to exhibit good locality.
  These fundamental properties complement each other beautifully. They suggest an approach for organizing memory and storage systems known as a memory hierarchy.

An Example Memory Hierarchy
  Toward the top: smaller, faster, and costlier (per byte) storage devices.
  Toward the bottom: larger, slower, and cheaper (per byte) storage devices.
    L0: registers. CPU registers hold words retrieved from the L1 cache.
    L1: on-chip L1 cache (SRAM). The L1 cache holds cache lines retrieved from the L2 cache memory.
    L2: off-chip L2 cache (SRAM). The L2 cache holds cache lines retrieved from main memory.
    L3: main memory (DRAM). Main memory holds disk blocks retrieved from local disks.
    L4: local secondary storage (local disks). Local disks hold files retrieved from disks on remote network servers.
    L5: remote secondary storage (distributed file systems, Web servers).

Caches
  Cache: A smaller, faster storage device that acts as a staging area for a subset of the data in a larger, slower device.
  Fundamental idea of a memory hierarchy:
    - For each k, the faster, smaller device at level k serves as a cache for the larger, slower device at level k+1.
  Why do memory hierarchies work?
    - Programs tend to access the data at level k more often than they access the data at level k+1.
    - Thus, the storage at level k+1 can be slower, and thus larger and cheaper per bit.
    - Net effect: A large pool of memory that costs as much as the cheap storage near the bottom, but that serves data to programs at the rate of the fast storage near the top.

General Caching Concepts
  Types of cache misses:
    - Cold (compulsory) miss
        Cold misses occur because the cache is empty.
    - Conflict miss
        Most caches limit blocks at level k+1 to a small subset (sometimes a singleton) of the block positions at level k.
        E.g., block i at level k+1 must be placed in block (i mod 4) at level k.
        Conflict misses occur when the level k cache is large enough, but multiple data objects all map to the same level k block.
        E.g., referencing blocks 0, 8, 0, 8, 0, 8, ... would miss every time.
    - Capacity miss
        Occurs when the set of active cache blocks (working set) is larger than the cache.

Examples of Caching in the Hierarchy

  Cache Type            What Cached           Where Cached         Latency (cycles)  Managed By
  Registers             4-byte word           CPU registers        0                 Compiler
  TLB                   Address translations  On-Chip TLB          0                 Hardware
  L1 cache              32-byte block         On-Chip L1           1                 Hardware
  L2 cache              32-byte block         Off-Chip L2          10                Hardware
  Virtual Memory        4-KB page             Main memory          100               Hardware + OS
  Buffer cache          Parts of files        Main memory          100               OS
  Network buffer cache  Parts of files        Local disk           10,000,000        AFS/NFS client
  Browser cache         Web pages             Local disk           10,000,000        Web browser
  Web cache             Web pages             Remote server disks  1,000,000,000     Web proxy server

Motivations for Virtual Memory
  Use Physical DRAM as a Cache for the Disk
    - Address space of a process can exceed physical memory size.
    - Sum of address spaces of multiple processes can exceed physical memory.
  Simplify Memory Management
    - Multiple processes resident in main memory.
        Each process with its own address space.
    - Only "active" code and data is actually in memory.
        Allocate more memory to process as needed.
  Provide Protection
    - One process can't interfere with another, because they operate in different address spaces.
    - User process cannot access privileged information: different sections of address spaces have different permissions.

Motivation #1: DRAM a "Cache" for Disk
  Full address space is quite large:
    - 32-bit addresses: ~4,000,000,000 (4 billion) bytes
    - 64-bit addresses: ~16,000,000,000,000,000,000 (16 quintillion) bytes
  Disk storage is ~300X cheaper than DRAM storage:
    - 80 GB of DRAM: ~$33,000
    - 80 GB of disk: ~$110
  To access large amounts of data in a cost-effective manner, the bulk of the data must be stored on disk.
  [Figure: SRAM, 4 MB: ~$500; DRAM, 1 GB: ~$200; Disk, 80 GB: ~$110]

A System with Physical Memory Only
  Examples: most Cray machines, early PCs, nearly all embedded systems, etc.
  Addresses generated by the CPU correspond directly to bytes in physical memory.
  [Figure: the CPU sends physical addresses directly to memory locations 0 to N-1]

A System with Virtual Memory
  Examples: workstations, servers, modern PCs, etc.
  Address Translation: Hardware converts virtual addresses to physical addresses via an OS-managed lookup table (page table).
  [Figure: the CPU issues virtual addresses; the page table (entries 0 to P-1) maps each one to a physical address in memory (0 to N-1) or to a location on disk]

Page Faults (like "Cache Misses")
  What if an object is on disk rather than in memory?
    - Page table entry indicates virtual address not in memory.
    - OS exception handler invoked to move data from disk into memory.
        Current process suspends, others can resume.
        OS has full control over placement, etc.
  [Figure: before the fault, the page table entry points to disk; after the fault, it points to a page in memory]

Servicing a Page Fault
  (1) Processor signals controller
        Read block of length P starting at disk address X and store starting at memory address Y.
  (2) Read occurs
        Direct Memory Access (DMA), under control of I/O controller.
  (3) I/O controller signals completion
        Interrupt processor.
        OS resumes suspended process.
  [Figure: processor (registers, cache), memory-I/O bus, memory, I/O controller, and disks; (1) Initiate Block Read, (2) DMA Transfer, (3) Read Done]

Motivation #2: Memory Management
  Multiple processes can reside in physical memory.
  How do we resolve address conflicts?
    - What if two processes access something at the same address?

  Linux/x86 process memory image (highest addresses first):
    kernel virtual memory                    (memory invisible to user code)
    stack                                    (%esp points to the top of the stack)
    memory mapped region for shared libraries
    runtime heap (via malloc)                (the "brk" ptr marks the top of the heap)
    uninitialized data (.bss)
    initialized data (.data)
    program text (.text)
    forbidden                                (lowest addresses)

Solution: Separate Virt. Addr. Spaces
  - Virtual and physical address spaces divided into equal-sized blocks.
      Blocks are called "pages" (both virtual and physical).
  - Each process has its own virtual address space.
      Operating system controls how virtual pages are assigned to physical memory.
  [Figure: the virtual address spaces of Process 1 and Process 2 (each 0 to N-1, holding VP 1, VP 2, ...) are mapped by address translation onto the physical address space (DRAM, 0 to M-1, holding PP 2, PP 7, PP 10); one physical page is shared by both processes (e.g., read-only library code)]

Motivation #3: Protection
  Page table entry contains access rights information.
    - Hardware enforces this protection (trap into OS if violation occurs).

  Page Tables
    Process i:      Read?  Write?  Physical Addr
      VP 0:         Yes    No      PP 9
      VP 1:         Yes    Yes     PP 4
      VP 2:         No     No      XXXXXXX

    Process j:      Read?  Write?  Physical Addr
      VP 0:         Yes    Yes     PP 6
      VP 1:         Yes    No      PP 9
      VP 2:         No     No      XXXXXXX

VM Address Translation
  Virtual Address Space
    - $V = \{0, 1, \ldots, N-1\}$
  Physical Address Space
    - $P = \{0, 1, \ldots, M-1\}$
    - $M < N$ (usually)
  Address Translation
    - $\mathrm{MAP}: V \to P \cup \{\varnothing\}$
    - For virtual address $a$:
        $\mathrm{MAP}(a) = a'$ if data at virtual address $a$ is at physical address $a'$ in $P$
        $\mathrm{MAP}(a) = \varnothing$ if data at virtual address $a$ is not in physical memory (either invalid or stored on disk)

VM Address Translation
  Parameters
    - $P = 2^p$ = page size (bytes)
    - $N = 2^n$ = virtual address limit
    - $M = 2^m$ = physical address limit

  Virtual address:   bits n-1 to p = virtual page number,  bits p-1 to 0 = page offset
  Physical address:  bits m-1 to p = physical page number, bits p-1 to 0 = page offset

  Address translation replaces the virtual page number with a physical page number.
  Page offset bits don't change as a result of translation.

Page Tables
  [Figure: a memory-resident page table indexed by virtual page number; each entry holds a valid bit and either a physical page address or a disk address; valid entries point into physical memory, the others into disk storage (swap file or regular file system file)]

Address Translation via Page Table
  - The page table base register locates the page table.
  - The virtual page number (VPN), bits n-1 to p of the virtual address, acts as the table index.
  - Each entry holds a valid bit, access rights, and a physical page number (PPN).
  - If valid = 0, then the page is not in memory.
  - The physical address is the PPN (bits m-1 to p) followed by the unchanged page offset (bits p-1 to 0).

Page Table Operation
  Translation
    - Separate (set of) page table(s) per process.
    - VPN forms index into page table (points to a page table entry).
  Computing Physical Address
    - Page Table Entry (PTE) provides information about page.
        If (valid bit = 1) then the page is in memory: use physical page number (PPN) to construct address.
        If (valid bit = 0) then the page is on disk: page fault.
  Checking Protection
    - Access rights field indicates allowable access.
        E.g., read-only, read-write, execute-only.
        Typically support multiple protection modes (e.g., kernel vs. user).
    - Protection violation fault if user doesn't have necessary permission.

Integrating VM and Cache
  [Figure: the CPU issues a virtual address (VA); translation produces a physical address (PA); the cache is looked up with the PA; a hit returns data, a miss goes to main memory]
  Most Caches "Physically Addressed"
    - Accessed by physical addresses.
    - Allows multiple processes to have blocks in cache at same time.
    - Allows multiple processes to share pages.
    - Cache doesn't need to be concerned with protection issues.
        Access rights checked as part of address translation.
  Perform Address Translation Before Cache Lookup
    - But this could involve a memory access itself (of the PTE).
    - Of course, page table entries can also become cached.

Speeding up Translation with a TLB
  "Translation Lookaside Buffer" (TLB)
    - Small hardware cache in MMU.
    - Maps virtual page numbers to physical page numbers.
    - Contains complete page table entries for small number of pages.
  [Figure: the CPU issues a VA; a TLB lookup that hits yields the PA directly; a TLB miss goes through translation; the PA is then used for the cache lookup, which returns data on a hit and goes to main memory on a miss]

Address Translation with a TLB
  [Figure: the virtual page number (bits n-1 to p of the virtual address) is compared against the tags of the TLB entries (valid, tag, physical page number); a TLB hit yields the physical address, which is split into tag, index, and byte offset for the cache lookup (valid, tag, data); a cache hit yields the data]

Simple Memory System Example
  Addressing
    - 14-bit virtual addresses
    - 12-bit physical address
    - Page size = 64 bytes

  Virtual address:   bits 13-6 = VPN (Virtual Page Number),  bits 5-0 = VPO (Virtual Page Offset)
  Physical address:  bits 11-6 = PPN (Physical Page Number), bits 5-0 = PPO (Physical Page Offset)

Simple Memory System Page Table
  Only show first 16 entries.

    VPN  PPN  Valid      VPN  PPN  Valid
    00   28   1          08   13   1
    01   -    0          09   17   1
    02   33   1          0A   09   1
    03   02   1          0B   -    0
    04   -    0          0C   -    0
    05   16   1          0D   2D   1
    06   -    0          0E   11   1
    07   -    0          0F   0D   1

Simple Memory System TLB
  TLB
    - 16 entries
    - 4-way associative

  Virtual address: bits 13-8 = TLBT (TLB tag), bits 7-6 = TLBI (TLB index), bits 5-0 = VPO

    Set  Tag  PPN  Valid   Tag  PPN  Valid   Tag  PPN  Valid   Tag  PPN  Valid
    0    03   -    0       09   0D   1       00   -    0       07   02   1
    1    03   2D   1       02   -    0       04   -    0       0A   -    0
    2    02   -    0       08   -    0       06   -    0       03   -    0
    3    07   -    0       03   0D   1       0A   34   1       02   -    0

Simple Memory System Cache
  Cache
    - 16 lines
    - 4-byte line size
    - Direct mapped

  Physical address: bits 11-6 = CT (cache tag), bits 5-2 = CI (cache index), bits 1-0 = CO (cache offset)

    Idx  Tag  Valid  B0  B1  B2  B3      Idx  Tag  Valid  B0  B1  B2  B3
    0    19   1      99  11  23  11      8    24   1      3A  00  51  89
    1    15   0      -   -   -   -       9    2D   0      -   -   -   -
    2    1B   1      00  02  04  08      A    2D   1      93  15  DA  3B
    3    36   0      -   -   -   -       B    0B   0      -   -   -   -
    4    32   1      43  6D  8F  09      C    12   0      -   -   -   -
    5    0D   1      36  72  F0  1D      D    16   1      04  96  34  15
    6    31   0      -   -   -   -       E    13   1      83  77  1B  D3
    7    16   1      11  C2  DF  03      F    14   0      -   -   -   -

Address Translation Example #1
  Virtual Address x3d4
    VPN ___  TLBI ___  TLBT ___  TLB Hit? ___  Page Fault? ___  PPN: ___
  Physical Address
    Offset ___  CI ___  CT ___  Hit? ___  Byte: ___

Address Translation Example #2
  Virtual Address x38f
    VPN ___  TLBI ___  TLBT ___  TLB Hit? ___  Page Fault? ___  PPN: ___
  Physical Address
    Offset ___  CI ___  CT ___  Hit? ___  Byte: ___

Address Translation Example #3
  Virtual Address x4
    VPN ___  TLBI ___  TLBT ___  TLB Hit? ___  Page Fault? ___  PPN: ___
  Physical Address
    Offset ___  CI ___  CT ___  Hit? ___  Byte: ___

Multi-Level Page Tables
  Given:
    - 4KB ($2^{12}$) page size
    - 32-bit address space
    - 4-byte PTE
  Problem:
    - Would need a 4 MB page table!
        $2^{20} \times 4$ bytes
  Common solution
    - Multi-level page tables, e.g., 2-level table (P6).
        Level 1 table: 1024 entries, each of which points to a Level 2 page table.
        Level 2 table: 1024 entries, each of which points to a page.
  [Figure: one Level 1 table whose entries point to Level 2 tables]

Main Themes
  Programmer's View
    - Large "flat" address space.
        Can allocate large blocks of contiguous addresses.
    - Process "owns" machine.
        Has private address space.
        Unaffected by behavior of other processes.
  System View
    - User virtual address space created by mapping to set of pages.
        Need not be contiguous.
        Allocated dynamically.
        Enforce protection during address translation.
    - OS manages many processes simultaneously.
        Continually switching among processes.
        Especially when one must wait for resource, e.g., disk I/O to handle page fault.