ECE 571 Advanced Microprocessor-Based Design Lecture 13

Similar documents
ECE 571 Advanced Microprocessor-Based Design Lecture 12

ECE 598 Advanced Operating Systems Lecture 14

ECE 598 Advanced Operating Systems Lecture 14

Virtual Memory: From Address Translation to Demand Paging

1. Creates the illusion of an address space much larger than the physical memory

CSE 120 Principles of Operating Systems

Virtual Memory. Daniel Sanchez Computer Science & Artificial Intelligence Lab M.I.T. November 15, MIT Fall 2018 L20-1

LECTURE 12. Virtual Memory

Lecture 21: Virtual Memory. Spring 2018 Jason Tang

Virtual Memory: From Address Translation to Demand Paging

Chapter 8. Virtual Memory

Multi-level Translation. CS 537 Lecture 9 Paging. Example two-level page table. Multi-level Translation Analysis

Virtual Memory. Daniel Sanchez Computer Science & Artificial Intelligence Lab M.I.T. April 12, 2018 L16-1

Virtual Memory. Patterson & Hennessey Chapter 5 ELEC 5200/6200 1

Virtual Memory. CS61, Lecture 15. Prof. Stephen Chong October 20, 2011

Virtual Memory. Today. Handling bigger address spaces Speeding translation

CSE 120. Translation Lookaside Buffer (TLB) Implemented in Hardware. July 18, Day 5 Memory. Instructor: Neil Rhodes. Software TLB Management

CSE 120 Principles of Operating Systems Spring 2017

Address Translation. Jinkyu Jeong Computer Systems Laboratory Sungkyunkwan University

CS 318 Principles of Operating Systems

Virtual Memory. Virtual Memory

Operating Systems. Operating Systems Sina Meraji U of T

CS 333 Introduction to Operating Systems. Class 11 Virtual Memory (1) Jonathan Walpole Computer Science Portland State University

ADDRESS TRANSLATION AND TLB

Lecture 12: Demand Paging

Chapter Seven. Memories: Review. Exploiting Memory Hierarchy CACHE MEMORY AND VIRTUAL MEMORY

Virtual Memory 2. Today. Handling bigger address spaces Speeding translation

CIS Operating Systems Memory Management Cache and Demand Paging. Professor Qiang Zeng Spring 2018

ADDRESS TRANSLATION AND TLB

Virtual Memory 2. To do. q Handling bigger address spaces q Speeding translation

Operating Systems. 09. Memory Management Part 1. Paul Krzyzanowski. Rutgers University. Spring 2015

Reducing Hit Times. Critical Influence on cycle-time or CPI. small is always faster and can be put on chip

Virtual Memory. Stefanos Kaxiras. Credits: Some material and/or diagrams adapted from Hennessy & Patterson, Hill, online sources.

CIS Operating Systems Memory Management Cache. Professor Qiang Zeng Fall 2017

Page 1. Review: Address Segmentation " Review: Address Segmentation " Review: Address Segmentation "

Memory: Page Table Structure. CSSE 332 Operating Systems Rose-Hulman Institute of Technology

ECE 571 Advanced Microprocessor-Based Design Lecture 10

CSE 451: Operating Systems Winter Page Table Management, TLBs and Other Pragmatics. Gary Kimura

ECE 571 Advanced Microprocessor-Based Design Lecture 10

Recall: Address Space Map. 13: Memory Management. Let s be reasonable. Processes Address Space. Send it to disk. Freeing up System Memory

Modern Virtual Memory Systems. Modern Virtual Memory Systems

Module 2: Virtual Memory and Caches Lecture 3: Virtual Memory and Caches. The Lecture Contains:

Virtual Memory Virtual memory first used to relive programmers from the burden of managing overlays.

ECE232: Hardware Organization and Design

Random-Access Memory (RAM) Systemprogrammering 2007 Föreläsning 4 Virtual Memory. Locality. The CPU-Memory Gap. Topics

CS162 Operating Systems and Systems Programming Lecture 10 Caches and TLBs"

Random-Access Memory (RAM) Systemprogrammering 2009 Föreläsning 4 Virtual Memory. Locality. The CPU-Memory Gap. Topics! The memory hierarchy

CS 152 Computer Architecture and Engineering. Lecture 11 - Virtual Memory and Caches

Virtual Memory 2. q Handling bigger address spaces q Speeding translation

CPU issues address (and data for write) Memory returns data (or acknowledgment for write)

Virtual Memory. Main memory is a CACHE for disk Advantages: illusion of having more physical memory program relocation protection.

Paging! 2/22! Anthony D. Joseph and Ion Stoica CS162 UCB Fall 2012! " (0xE0)" " " " (0x70)" " (0x50)"

CS252 S05. Main memory management. Memory hardware. The scale of things. Memory hardware (cont.) Bottleneck

John Wawrzynek & Nick Weaver

CS162 - Operating Systems and Systems Programming. Address Translation => Paging"

CIS Operating Systems Memory Management Cache. Professor Qiang Zeng Fall 2015

CS 153 Design of Operating Systems Winter 2016

Handout 4 Memory Hierarchy

Virtual Memory. CS 3410 Computer System Organization & Programming

Chapter 5 Memory Hierarchy Design. In-Cheol Park Dept. of EE, KAIST

Virtual Memory. Kevin Webb Swarthmore College March 8, 2018

CPS 104 Computer Organization and Programming Lecture 20: Virtual Memory

Chapter Seven. SRAM: value is stored on a pair of inverting gates very fast but takes up more space than DRAM (4 to 6 transistors)

Page 1. CS162 Operating Systems and Systems Programming Lecture 14. Caching and Demand Paging

COSC3330 Computer Architecture Lecture 20. Virtual Memory

Levels in memory hierarchy

VIRTUAL MEMORY II. Jo, Heeseung

Memory Management. To improve CPU utilization in a multiprogramming environment we need multiple programs in main memory at the same time.

EECS 470. Lecture 16 Virtual Memory. Fall 2018 Jon Beaumont

Basic Memory Management. Basic Memory Management. Address Binding. Running a user program. Operating Systems 10/14/2018 CSC 256/456 1

CPS104 Computer Organization and Programming Lecture 16: Virtual Memory. Robert Wagner

Learning Outcomes. An understanding of page-based virtual memory in depth. Including the R3000 s support for virtual memory.

ECE4680 Computer Organization and Architecture. Virtual Memory

CS3350B Computer Architecture

Memory Allocation. Copyright : University of Illinois CS 241 Staff 1

LECTURE 4: LARGE AND FAST: EXPLOITING MEMORY HIERARCHY

Chapter 8 Memory Management

Memory Management Topics. CS 537 Lecture 11 Memory. Virtualizing Resources

CSE 141 Computer Architecture Spring Lectures 17 Virtual Memory. Announcements Office Hour

CS 152 Computer Architecture and Engineering. Lecture 9 - Address Translation

CS5460: Operating Systems Lecture 14: Memory Management (Chapter 8)

Chapter 5B. Large and Fast: Exploiting Memory Hierarchy

Learning Outcomes. An understanding of page-based virtual memory in depth. Including the R3000 s support for virtual memory.

CS 152 Computer Architecture and Engineering. Lecture 9 - Address Translation

PROCESS VIRTUAL MEMORY PART 2. CS124 Operating Systems Winter , Lecture 19

ECE331: Hardware Organization and Design

Memory Hierarchies 2009 DAT105

Memory Management. Dr. Yingwu Zhu

Virtual Memory. Lecture for CPSC 5155 Edward Bosworth, Ph.D. Computer Science Department Columbus State University

Outline. V Computer Systems Organization II (Honors) (Introductory Operating Systems) Advantages of Multi-level Page Tables

Virtual Memory. CS 3410 Computer System Organization & Programming. [K. Bala, A. Bracy, E. Sirer, and H. Weatherspoon]

Address spaces and memory management

CS 3733 Operating Systems:

CS162 Operating Systems and Systems Programming Lecture 14. Caching and Demand Paging

Basic Memory Management

Main Memory: Address Translation

ECE468 Computer Organization and Architecture. Virtual Memory

CS162 Operating Systems and Systems Programming Lecture 14. Caching (Finished), Demand Paging

MEMORY: SWAPPING. Shivaram Venkataraman CS 537, Spring 2019

Another View of the Memory Hierarchy. Lecture #25 Virtual Memory I Memory Hierarchy Requirements. Memory Hierarchy Requirements

Transcription:

ECE 571 Advanced Microprocessor-Based Design Lecture 13 Vince Weaver http://web.eece.maine.edu/~vweaver vincent.weaver@maine.edu 21 March 2017

Announcements More on HW#6 When ask for reasons why cache better than other, give details. Saying Jetson cache is better... not clear. Is it bigger? Faster? Have more ways? Write through? Memory behavior mostly tied to size of data being processed and memory access patterns. This might be tangentially related to branch or integer/floating point, but you need to make an argument why (say FP might be accessing arrays sequentially or integer might 1

be using less regular data structures). Memory footprint (i.e. fitting in memory) is the most important thing for cache performance, all the other things happen once you start getting misses. Grad Class. Not necessarily a right answer, but need to explain. Gathering data is often tedious but straightforward. Analyzing/making guesses about the behavior you see is the interesting part. Want to see you are applying what you ve learned to the real world results. Not necessarily a right answer. 2

HW#7 will be after the midterm Project topic reminder. Topic due (an e-mail from group) on the 30th, which is next Thursday. 3

Midterm Review Closed book/laptop/phone but can have front of one 8.5x11 piece of paper worth of notes if you want. 1. Performance/Benchmarking Be familiar with the general idea of performance counters and interpreting perf results. Benchmark choice: it should match what you plan to do with the computer. Know a little about the difference between integer benchmarks and floating point (integer have 4

more random/ unpredictable behavior with lots of conditionals; floating point are often regular looped strides over large arrays or data sets) Be familiar with concept of skid. 2. Power Know the CMOS Power equation Energy, Energy Delay, Energy Delay Squared Idle Power Question 3. Branch Prediction Static vs Dynamic 5

2-bit up/down counter Looking at some simple C constructs say expected branch predict rate 4. Cache Given some parameters (size, way, blocksize, addr space) be able to calculate number of bits in tag, index, and offset. Know why caches are used, that they exploit temporal and spatial locality, and know the tradeoffs (speed vs nondeterminism) 6

Be at least familiar with the types of cache misses (cold, conflict, capacity) Know difference between writeback and write-through Be able to work a few simple steps in a cache example (like in HW#5) 5. Prefetch Why have prefetchers? Common prefetch patterns? 7

Virtual Memory Original purpose was to give the illusion of more main memory than available, with disk as backing store. Give each process own linear view of memory. Demand paging (no swapping out whole processes). Execution of processes only partly in memory, effectively a cache. Memory protection Security 8

Diagram Virtual Process 1 Virtual Process 2 Kernel Kernel Stack Stack Heap BSS Data Text Physical RAM Heap BSS Data Text 9

Memory Management Unit Can run without MMU. There s even MMU-less Linux. How do you keep processes separate? Very carefully... 10

Page Table Collection of Page Table Entries (PTE) Some common components: ID of owner Virtual Page Number valid bit, location of page (memory, disk, etc) protection info (read only, etc) page is dirty, age (how recent updated, for LRU) 11

Hierarchical Page Tables With 4GB memory and 4kb pages, you have 1 Million pages per process. If each has 4-byte PTE then 4MB of page tables per-process. Too big. It is likely each process does not use all 4GB at once. (sparse) So put page tables in swappable virtual memory themselves! 4MB page table is 1024 pages which can be mapped in 1 4KB page. 12

Hierarchical Page Table Diagram Physical Memory Virtual Address 10bits 10bits 12bits 4MB Page Table 4kB page tables Page Table Base Address (Stored in a register) 13

Hierarchical Page Table Diagram 32-bit x86 chips have hardware 2-level page tables ARM 2-level page tables What do you do if you have more than 32-bits? 64-bit x86 has 4-level page tables (256TBv/64TBp) 44/40-bits? Push by Intel for 5-level tables (128PBv/4PBp) 57 bits? What happens when you have unused bits? People try to use them, causes problems later. AMD64 canonical 14

addresses. 15

Inverted Page Table How to handle larger 64-bit address spaces? Can add more levels of page tables (4? 5?) but that becomes very slow Can use hash to find page. Better best case performance, can perform poorly if hash algorithm has lots of aliasing. 16

Inverted Page Table Diagram Physical Memory Virtual Address Page Tables HASH hit alias re hash 17

Walking the Page Table Can be walked in Hardware or Software Hardware is more common Early RISC machines would do it in Software. Can be slow. Has complications: what if the page-walking code was swapped out? 18

TLB Translation Lookaside Buffer (Lookaside Buffer is an obsolete term meaning cache) Caches page tables Much faster than doing a page-table walk. Historically fully associative, recently multi-level multiway TLB shootdown when change a setting on a mapping 19

and TLB invalidated on all other processors 20

Flushing the TLB May need to do this on context switch if doesn t store ASID or ASIDs run out. Sometimes called a TLB Shootdown Hurts performance as the TLB gradually refills Avoiding this is why the top part is mapped to kernel under Linux 21

What happens on a memory access Cache hit, generally not a problem, see later. To be in cache had to have gone through the whole VM process. Although some architectures do a lookup anyway in case permissions have changed. Cache miss, then send access out to memory If in TLB, not a problem, right page fetched from physical memory, TLB updated If not in TLB, then the page tables are walked 22

It no physical mapping in page table, then page fault happens 23

What happens on a page fault Walk the page table and see if the page is valid and there minor page is already in memory, just need to point a PTE at it. For example, shared memory, shared libraries, etc. major page needs to be created or brought in from disk. Demand paging. Needs to find room in physical memory. If no free space 24

available, needs to kick something out. Disk-backed (and not dirty) just discarded. Disk-backed and dirty, written back. Memory can be paged to disk. Eventually can OOM. Memory is then loaded, or zeroed, and PTE updated. Can it be shared? (zero page) invalid segfault 25

What happens on a fork? Do you actually copy all of memory? Why would that be bad? (slow, also often exec() right away) Page table marked read-only, then shared Only if writes happen, take page fault, then copy made Copy-on-write 26