Virtual memory: why, concepts, and parameters compared to a first-level cache


Brooke Morgan
5 years ago

Lecture 16: Virtual memory

Outline:
- Virtual memory concepts (5.10)
- Protection (5.11)
- The memory hierarchy of Alpha (5.13)

Virtual memory: why?

[Figure: the virtual address spaces of proc 0, proc 1 and proc 2 mapped onto physical memory]

Reasons to use VM:
- Large address space
- Several processes sharing the same physical memory
- Protection of memory
- Relocation

Virtual memory concepts

Part of the memory hierarchy:
- The virtual address space (AS) is divided into pages
- The physical AS is divided into page frames
- A miss is called a page fault
- Pages not in main memory are stored on disk
- The CPU uses virtual addresses
- We need an address translation mechanism

Virtual memory parameters, compared to first-level cache

Parameter            First-level cache         Virtual memory
Block (page) size    [missing] bytes           4K-64K bytes
Hit time             1-3 clock cycles          [missing] clock cycles
Miss penalty         8-150 clock cycles        1000K-10000K clock cycles
  (Access time)      (6-130 clock cycles)      (800K-8000K clock cycles)
  (Transfer time)    (2-20 clock cycles)       (200K-2000K clock cycles)
Miss rate            0.1%-10%                  [missing]%-0.001%
Data memory size     16 Kbyte - 1 Mbyte        16 Mbyte - 8 Gbyte

The miss penalty is the sum of the access time and the transfer time.

- Replacement in the cache is handled by HW
- Replacement in VM is handled by SW
- The backing store for VM (the paging (swap) partition on disk) is shared with the file system
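The address translation mechanism mentioned above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: 4 Kbyte pages (the low end of the page-size range in the table), a single-level page table modelled as a dictionary, and the hypothetical names `translate`, `PageFault` and `page_table`, none of which come from the slides.

```python
PAGE_SIZE = 4096      # assumed: 4 Kbyte pages
OFFSET_BITS = 12      # log2(PAGE_SIZE)


class PageFault(Exception):
    """Raised when the referenced page is not in main memory."""


def translate(vaddr, page_table):
    """Map a virtual address to a physical address.

    page_table is a hypothetical dict: virtual page number -> page frame number.
    A missing entry stands for a page that currently lives on disk.
    """
    vpn = vaddr >> OFFSET_BITS          # virtual page number
    offset = vaddr & (PAGE_SIZE - 1)    # offset within the page, unchanged
    if vpn not in page_table:
        raise PageFault(f"virtual page {vpn} is not resident")
    return (page_table[vpn] << OFFSET_BITS) | offset


# Virtual page 0 sits in frame 5, virtual page 1 in frame 2.
page_table = {0: 5, 1: 2}
print(hex(translate(0x1234, page_table)))
```

Only the page number is translated; the offset within the page is copied through unchanged, which is why pages and page frames must have the same size.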

VM: Block placement

Where can a block (page) be placed in main memory?
- Cache access: 0.3 ns
- Memory access: 100 ns
- Disk access: 1 ms
- Consequence: minimise the miss rate
- The high miss penalty makes it possible to use SW solutions to implement a fully associative address mapping

VM: Block identification

Use a page table stored in main memory:
- Suppose 4 Kbyte pages and a 32-bit virtual address
- The page table takes 2^32 / 2^12 * 4 = 2^22 bytes = 4 Mbyte (2^20 entries of 4 bytes each)!
Solutions:
- Multi-level page table
- Inverted page table
How do we make the page table lookup fast?

VM: Page replacement

Most important: minimise the number of page faults
Page replacement strategies:
- FIFO: First-In-First-Out
- LRU: Least Recently Used
- Approximation of LRU:
  - Each page has a reference bit that is set on a reference
  - The OS periodically resets the reference bits (often using some kind of counter)
  - When a page needs to be replaced, a page whose reference bit is not set (high counter value) is chosen
- Main memory is large these days

VM: Write strategy

Write back or write through? Write back!
Write through is impossible to use:
- Too long access time to disk
- The write buffer would need to be very large
- The I/O system would need an extremely high bandwidth
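The reference-bit approximation of LRU described above could look roughly like the following sketch. The slides do not say how the counter is maintained, so the scheme here is an assumption: the counter records how many OS sweeps have passed since the page was last referenced. The names `Page`, `touch`, `periodic_sweep` and `choose_victim` are hypothetical.

```python
class Page:
    def __init__(self, number):
        self.number = number
        self.referenced = False   # reference bit, set on every access
        self.idle_periods = 0     # assumed counter: sweeps since last reference


def touch(page):
    """Model a memory reference: the hardware sets the reference bit."""
    page.referenced = True


def periodic_sweep(pages):
    """Run by the OS at a fixed interval: update counters, then reset the bits."""
    for p in pages:
        if p.referenced:
            p.idle_periods = 0
        else:
            p.idle_periods += 1
        p.referenced = False


def choose_victim(pages):
    """Prefer a page whose reference bit is clear; among those, the highest counter."""
    return max(pages, key=lambda p: (not p.referenced, p.idle_periods))


pages = [Page(0), Page(1), Page(2)]
touch(pages[0]); touch(pages[1])
periodic_sweep(pages)
touch(pages[1])
periodic_sweep(pages)
touch(pages[0])
print(choose_victim(pages).number)   # page 2 was never referenced
```

The counter is what lets the OS distinguish a page that has been idle for one interval from a page that has been idle for many, which a single bit alone cannot express.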

Address translation

Example: The Alpha
- The segment is selected by bits 62 and 63 of the address
- kseg: kernel segment. Used by the OS. Does not use virtual memory.
- User segment 1: used for the stack
- User segment 0: used for instructions, static data and heap

Fast address translation

How do we avoid three extra memory references for each original memory reference?
- Store the most commonly used address translations in a cache: the Translation Lookaside Buffer (TLB)

[Figure: TLB and cache access path. Labels: P, data, VA, TLB lookup, TLB hit, TLB miss, Translation, PA, Cache, Cache hit, Cache miss, Main memory]

Protection

[Figure: address translation from Process 0 and Process 1 to physical memory; pages labelled X, X, Y, Z]

- Process 0 must not be allowed to alter the memory of process 1, and vice versa
- They should, however, be able to share pages

Protection mechanisms

The address translation mechanism can be used to provide memory protection:
- Use protection attribute bits for each page
- Stored in the page table entry (PTE) and the TLB
- Each page gets its own protection
- If a process does not have permission to, e.g., write to a memory address, this is detected during address translation and an exception is raised
- Supervisor/user modes are necessary to prevent user processes from changing page tables
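To make the interplay between the TLB and the protection bits concrete, here is a minimal sketch. It is not a model of any particular processor: the capacity of 32 entries, the LRU eviction policy, the single "writable" bit per page, and the names `TLB`, `lookup` and `ProtectionFault` are all assumptions chosen for illustration.

```python
from collections import OrderedDict


class ProtectionFault(Exception):
    """Raised when an access violates the page's protection bits."""


class TLB:
    def __init__(self, capacity=32):       # assumed size, for illustration only
        self.capacity = capacity
        self.entries = OrderedDict()       # vpn -> (frame, writable)
        self.hits = 0
        self.misses = 0

    def lookup(self, vpn, page_table, write=False):
        """Return the page frame for vpn, checking the write permission."""
        if vpn in self.entries:
            self.hits += 1
            self.entries.move_to_end(vpn)  # mark as most recently used
        else:
            self.misses += 1
            # Slow path: read the translation from the page table in memory.
            if len(self.entries) >= self.capacity:
                self.entries.popitem(last=False)   # evict least recently used
            self.entries[vpn] = page_table[vpn]
        frame, writable = self.entries[vpn]
        if write and not writable:
            raise ProtectionFault(f"write to read-only page {vpn}")
        return frame


# Hypothetical page table: vpn -> (frame, writable)
page_table = {0: (7, True), 1: (3, False)}
tlb = TLB()
print(tlb.lookup(0, page_table), tlb.lookup(0, page_table), tlb.hits, tlb.misses)
```

Because the protection bits are cached together with the translation, the permission check costs nothing extra on a TLB hit.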

Computer Architecture: Lecture 9

The memory hierarchy of Alpha

- Separate instruction and data TLBs and caches
- TLBs are fully associative
- TLB updates in SW
- Caches: 8 Kbyte, direct mapped, write through
- Critical 8 bytes first
- Instruction prefetch of one block
- 2 MB L2 cache, direct mapped, write back
- Victim cache
- 4-entry write buffer between L1 and L2

[Figure: block diagram with ITLB, I-cache, prefetch buffer, DTLB, D-cache, write buffer, L2 cache, victim]

The Alpha data TLB

- 32 page table entries
- Fully associative
- Valid bit, kernel and user read and write permissions
- The instruction TLB has only 12 entries

The Alpha data cache

- 8 Kbyte, 32-byte block size, 256 blocks
- The 1st-level cache is write-through, with a write buffer
- A special one-entry write buffer is used to pipeline stores

Alpha memory hierarchy performance

[Figure: CPI (axis from 0 to 5) for the benchmarks TPC-B (db1), TPC-B (db2), Alphasort, espresso, eqntott, compress, sc, gcc, spice, doduc, mdljdp2, wave5, tomcatv, ora, alvinn, ear, mdljsp2, swm256, su2cor, hydro2d, nasa7, fpppp; each bar is broken down into I cache, D cache, L2, Instr. issue and Other]

Instruction issue and data stalls are by far the largest contributions to the overall CPI
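The data cache figures above (8 Kbyte, direct mapped, 32-byte blocks) determine how an address is split into tag, index and block offset. The following sketch works through that arithmetic; the function name `split_address` and the sample address are hypothetical.

```python
CACHE_BYTES = 8 * 1024                        # 8 Kbyte data cache
BLOCK_BYTES = 32                              # 32-byte blocks
NUM_BLOCKS = CACHE_BYTES // BLOCK_BYTES       # 256 blocks, as on the slide

OFFSET_BITS = BLOCK_BYTES.bit_length() - 1    # log2(32)  = 5
INDEX_BITS = NUM_BLOCKS.bit_length() - 1      # log2(256) = 8


def split_address(addr):
    """Split an address into (tag, index, offset) for a direct-mapped cache."""
    offset = addr & (BLOCK_BYTES - 1)
    index = (addr >> OFFSET_BITS) & (NUM_BLOCKS - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset


print(NUM_BLOCKS, OFFSET_BITS, INDEX_BITS)
print(split_address(0x12345))
```

Two addresses that are exactly 8 Kbyte apart get the same index and differ only in the tag, so in a direct-mapped cache they compete for the same block.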

Summary

Cache memories:
- Crucial with modern µprocessor technology
- Separate instruction and data caches permit simultaneous instruction fetch and data access
- Four questions: block placement, block identification, block replacement, write strategy

Virtual memory:
- Also part of the memory hierarchy
- Very high miss penalty, so the miss rate must be very low
- Also facilitates: program loading, memory protection, multiprogramming
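The claim that a very high miss penalty forces a very low miss rate follows from the usual average access time formula, hit time + miss rate * miss penalty. The sketch below plugs in illustrative numbers: the 100-cycle hit time is an assumption made for this example, and the 1,000,000-cycle miss penalty is simply a value of the magnitude typical for a page fault. The function name `average_access_time` is hypothetical.

```python
def average_access_time(hit_time, miss_rate, miss_penalty):
    """Average access time in clock cycles: hit time + miss rate * miss penalty."""
    return hit_time + miss_rate * miss_penalty


HIT_TIME = 100             # assumed main-memory hit time, in clock cycles
MISS_PENALTY = 1_000_000   # assumed page-fault penalty, in clock cycles

# Compare a page-fault rate of 0.001% with a cache-like miss rate of 1%.
for miss_rate in (0.00001, 0.01):
    cycles = average_access_time(HIT_TIME, miss_rate, MISS_PENALTY)
    print(f"miss rate {miss_rate:.5%}: about {cycles:,.0f} cycles per access")
```

With these numbers, a miss rate of 0.001% adds roughly 10% to the average access time, while a miss rate of 1% makes every access about a hundred times slower on average.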
