Computer Architecture and Engineering. CS152 Quiz #3. March 18th, Professor Krste Asanovic. Name:

Similar documents

Computer Architecture and Engineering CS152 Quiz #2 March 7th, 2016 Professor George Michelogiannakis Name: <ANSWER KEY>

Computer Architecture and Engineering. CS152 Quiz #2. March 3rd, Professor Krste Asanovic. Name:

Virtual Memory: From Address Translation to Demand Paging

Computer Architecture and Engineering. CS152 Quiz #2. March 3rd, Professor Krste Asanovic. Name:

Virtual Memory: From Address Translation to Demand Paging

CS 152 Computer Architecture and Engineering. Lecture 11 - Virtual Memory and Caches

Computer Architecture and Engineering. CS152 Quiz #1. February 19th, Professor Krste Asanovic. Name:

Virtual Memory. Daniel Sanchez Computer Science & Artificial Intelligence Lab M.I.T. November 15, MIT Fall 2018 L20-1

6.823 Computer System Architecture. Problem Set #3 Spring 2002

Address Translation. Jinkyu Jeong Computer Systems Laboratory Sungkyunkwan University

Modern Virtual Memory Systems. Modern Virtual Memory Systems

Computer Architecture and Engineering CS152 Quiz #5 April 27th, 2016 Professor George Michelogiannakis Name: <ANSWER KEY>

Computer Architecture and Engineering. CS152 Quiz #5. April 23rd, Professor Krste Asanovic. Name: Answer Key

COSC3330 Computer Architecture Lecture 20. Virtual Memory

CSE 451: Operating Systems Winter Page Table Management, TLBs and Other Pragmatics. Gary Kimura

CS 152 Computer Architecture and Engineering. Lecture 9 - Address Translation

Virtual to physical address translation

VIRTUAL MEMORY II. Jo, Heeseung

Virtual Memory. Samira Khan Apr 27, 2017

CS 153 Design of Operating Systems Winter 2016

CS 152 Computer Architecture and Engineering. Lecture 9 - Address Translation

Memory: Page Table Structure. CSSE 332 Operating Systems Rose-Hulman Institute of Technology

Computer Architecture and Engineering CS152 Quiz #5 May 2th, 2013 Professor Krste Asanović Name: <ANSWER KEY>

Virtual Memory. Daniel Sanchez Computer Science & Artificial Intelligence Lab M.I.T. April 12, 2018 L16-1

John Wawrzynek & Nick Weaver

CS 152 Computer Architecture and Engineering. Lecture 8 - Address Translation

CS 152 Computer Architecture and Engineering. Lecture 9 - Virtual Memory

Lecture 9 - Virtual Memory

Virtual Memory: Concepts

Learning Outcomes. An understanding of page-based virtual memory in depth. Including the R3000 s support for virtual memory.

SOLUTION. Midterm #1 February 26th, 2018 Professor Krste Asanovic Name:

Learning Outcomes. An understanding of page-based virtual memory in depth. Including the R3000 s support for virtual memory.

Chapter 5B. Large and Fast: Exploiting Memory Hierarchy

CS 152 Computer Architecture and Engineering. Lecture 8 - Address Translation

Virtual Memory Review. Page faults. Paging system summary (so far)

Computer System Architecture Quiz #1 March 8th, 2019

Memory Hierarchy. Goal: Fast, unlimited storage at a reasonable cost per bit.

CS5460: Operating Systems Lecture 14: Memory Management (Chapter 8)

Translation Buffers (TLB s)

New-School Machine Structures. Overarching Theme for Today. Agenda. Review: Memory Management. The Problem 8/1/2011

Agenda. CS 61C: Great Ideas in Computer Architecture. Virtual Memory II. Goals of Virtual Memory. Memory Hierarchy Requirements

CS 61C: Great Ideas in Computer Architecture Virtual Memory. Instructors: John Wawrzynek & Vladimir Stojanovic

Virtual Memory. Patterson & Hennessey Chapter 5 ELEC 5200/6200 1

Virtual Memory Virtual memory first used to relive programmers from the burden of managing overlays.

Cache Performance and Memory Management: From Absolute Addresses to Demand Paging. Cache Performance

Computer System Architecture Final Examination

CS 61C: Great Ideas in Computer Architecture. Virtual Memory

CPS 104 Computer Organization and Programming Lecture 20: Virtual Memory

Learning to Play Well With Others

Carnegie Mellon. Bryant and O Hallaron, Computer Systems: A Programmer s Perspective, Third Edition

CS 61C: Great Ideas in Computer Architecture. Lecture 23: Virtual Memory. Bernhard Boser & Randy Katz

Virtual Memory. Motivations for VM Address translation Accelerating translation with TLBs

Computer Systems. Virtual Memory. Han, Hwansoo

Computer Science 146. Computer Architecture

CS24: INTRODUCTION TO COMPUTING SYSTEMS. Spring 2018 Lecture 24

A Few Problems with Physical Addressing. Virtual Memory Process Abstraction, Part 2: Private Address Space

Virtual Memory. Motivation:

Computer System Architecture Midterm Examination Spring 2002

CS 61C: Great Ideas in Computer Architecture. Lecture 23: Virtual Memory

Virtual Memory, Address Translation

Problem M1.1: Self Modifying Code on the EDSACjr

Lecture 9 Virtual Memory

198:231 Intro to Computer Organization. 198:231 Introduction to Computer Organization Lecture 14

Computer System Architecture Final Examination Spring 2002

Multi-level Translation. CS 537 Lecture 9 Paging. Example two-level page table. Multi-level Translation Analysis

Virtual Memory. CS 351: Systems Programming Michael Saelee

CS152 Computer Architecture and Engineering

Computer System Architecture Quiz #2 April 5th, 2019

Memory Hierarchy Requirements. Three Advantages of Virtual Memory

Computer Architecture and Engineering CS152 Quiz #3 March 22nd, 2012 Professor Krste Asanović

Processes and Virtual Memory Concepts

Random-Access Memory (RAM) Systemprogrammering 2007 Föreläsning 4 Virtual Memory. Locality. The CPU-Memory Gap. Topics

ECE 571 Advanced Microprocessor-Based Design Lecture 12

ECE 571 Advanced Microprocessor-Based Design Lecture 13

CSE 120. Translation Lookaside Buffer (TLB) Implemented in Hardware. July 18, Day 5 Memory. Instructor: Neil Rhodes. Software TLB Management

Random-Access Memory (RAM) Systemprogrammering 2009 Föreläsning 4 Virtual Memory. Locality. The CPU-Memory Gap. Topics! The memory hierarchy

Chapter 8 Memory Management

Virtual Memory Worksheet

Carnegie Mellon. 16 th Lecture, Mar. 20, Instructors: Todd C. Mowry & Anthony Rowe

Memory Management. An expensive way to run multiple processes: Swapping. CPSC 410/611 : Operating Systems. Memory Management: Paging / Segmentation 1

Lecture 19: Survey of Modern VMs. Housekeeping

Chapter 10: Virtual Memory. Lesson 05: Translation Lookaside Buffers

Virtual Memory. CS 3410 Computer System Organization & Programming

CS 333 Introduction to Operating Systems. Class 11 Virtual Memory (1) Jonathan Walpole Computer Science Portland State University

Virtual Memory. Lecture for CPSC 5155 Edward Bosworth, Ph.D. Computer Science Department Columbus State University

Improving Cache Performance and Memory Management: From Absolute Addresses to Demand Paging. Highly-Associative Caches

CMU Introduction to Computer Architecture, Spring 2014 HW 5: Virtual Memory, Caching and Main Memory

Virtual Memory II. CSE 351 Autumn Instructor: Justin Hsia

Virtual Memory 2. To do. q Handling bigger address spaces q Speeding translation

Virtual Memory Oct. 29, 2002

Address spaces and memory management

Operating Systems. Designed and Presented by Dr. Ayman Elshenawy Elsefy

CS252 Spring 2017 Graduate Computer Architecture. Lecture 17: Virtual Memory and Caches

Memory hier ar hier ch ar y ch rev re i v e i w e ECE 154B Dmitri Struko Struk v o

18-447: Computer Architecture Lecture 18: Virtual Memory III. Yoongu Kim Carnegie Mellon University Spring 2013, 3/1

Lecture 19: Virtual Memory: Concepts

6.004 Tutorial Problems L20 Virtual Memory

Chapter 8 & Chapter 9 Main Memory & Virtual Memory

Learning to Play Well With Others

Transcription:

Computer Architecture and Engineering CS152 Quiz #3 March 18th, 2008 Professor Krste Asanovic Name: Notes: This is a closed book, closed notes exam. 80 Minutes 10 Pages Not all questions are of equal difficulty, so look over the entire exam and budget your time carefully. Please carefully state any assumptions you make. Please write your name on every page in the quiz. You must not discuss a quiz's contents with students who have not yet taken the quiz. If you have inadvertently been exposed to the quiz prior to taking it, you must tell the instructor or TA. You will get no credit for selecting multiple choice answers without giving explanations if the instructions ask you to explain your choice. Writing name on each sheet 1 Point Question 1 30 Points Question 2 25 Points Question 3 24 Points TOTAL 80 Points

Problem Q3.1: Page Size and TLBs 30 Points The configuration of the hierarchical page table in this problem is similar to the one in Handout #3 (included at the back of the exam for your convenience), but we modify two parameters: 1) this problem evaluates a virtual memory system with two page sizes, 4KB and 1MB (instead of 4 MB), and 2) all PTEs are 16 Bytes (instead of 8 Bytes). The following figure summarizes the page table structure and indicates the sizes of the page tables and data pages (not drawn to scale): Root ptr. (processor register) L1 Table (4096 PTEs, 64KB) L2 Table (4096 PTEs, 64KB) L3 Table (256 PTEs, 4KB) Data Page (4KB) Data Page (1MB) The processor has a data TLB with 64 entries, and each entry can map either a 4KB page or a 1MB page. After a TLB miss, a hardware engine walks the page table to reload the TLB. The TLB uses a first-in/first-out (FIFO) replacement policy. We will evaluate the execution of the following program which adds the elements from two 1MB arrays and stores the results in a third 1MB array (note that, 1MB = 1,048,576 Bytes, the starting address of the arrays are given below): byte A[1048576]; // 1MB array 0x00001000000 byte B[1048576]; // 1MB array 0x00001100000 byte C[1048576]; // 1MB array 0x00001200000 for(int i=0; i<1048576; i++) C[i] = A[i] + B[i]; Assume that the above program is the only process in the system, and ignore any instruction memory or operating system overheads. The data TLB is initially empty.

Problem Q3.1.A 5 Points Consider the execution of the program. Assume that there is no data cache, and that each memory lookup has 100 cycle latency. Note that the program loop accesses memory one byte at a time. If all data pages are 4KB, compute the ratio of cycles for address translation to cycles for data access. Problem Q3.1.B 5 Points If all data pages are 1MB, compute the ratio of cycles for address translation to cycles for data access.

Problem Q3.1.C 10 Points For this question, assume that in addition we will add a PTE cache with single cycle latency. A PTE cache will cache page table entries from any level in order to speed up address translation. If this PTE cache has unlimited capacity and is fully associative, compute the ratio of cycles for address translation to cycles for data access for the 4KB data page case. Problem Q3.1.D 5 Points When running this code on a system with both a PTE cache and a TLB, is there any benefit to caching L3 PTE entries in the PTE cache? Explain. Problem Q3.1.E 5 Points What is the minimum capacity (number of entries) needed in the PTE cache to get the same performance on this code as an unlimited PTE cache? (Assume that the fully associative PTE cache does not cache L3 PTE entries and all data pages are 4KB)

Problem Q3.2: 64-bit Address Spaces 25 Points Ben read in a recent technical journal that many current implementations of 64-bit ISAs implement only part of the large virtual address space. They usually segment the virtual address space into three parts: one used for stack, one used for code and heap data, and the third one unused. Ben s proposed processor s virtual address space is shown below: 0xFFFFFFFFFFFFFFFF 0xFF00000000000000 Reserved for Stack 0x00FFFFFFFFFFFFFF 0x0000000000000000 Unuse Reserved for Code and Heap A special circuit is used to detect whether the top eight bits of an address are all zeros or all ones before the address is sent to the virtual memory system. If they are not all equal, an invalid virtual memory address trap is raised. This scheme in effect removes the top seven bits from the virtual memory address, but retains a memory layout that will be compatible with future designs that implement a larger virtual address space. Alyssa P. Hacker is unhappy with the large hole in the virtual address space given by Ben s scheme. She decides that a hashed page table is the way to go. Again, the machine has a 64-bit virtual address and 4KB pages. The hardware paging system has only one page table with 64 slots, each containing 8 PTEs. Alyssa decides to use X mod 64 as the hash function to select a slot, where X is the VPN. The page table resides in memory and Alyssa s design has no TLB, so each PTE read requires one memory access. During a page table lookup, all PTEs in each slot are searched sequentially until a match is found (each of these sequential accesses is a memory access). If there is a miss in the page table, a trap is raised and a software handler will refill the page table, with each refill requiring 10 memory accesses on average. Problem Q3.2.A 12 Points

Alyssa is happy with her new modification and runs a very simple benchmark that repeatedly loops over an array of 2 21 bytes, reading one byte at a time in sequential address order. On average in the steady state, how many memory accesses are performed for each byte read by the user program? Ignore the memory traffic for instruction fetch, assume that the array starts on a page boundary, and there are no other memory accesses in the user code apart from the single byte memory accesses. Problem Q3.2.B 5 Points Alyssa now decides to add a one-entry TLB to the hashed paging system. What should the replacement policy for the TLB be? Circle the most appropriate of the following choices and explain: a) FIFO b) LRU c) Random d) Doesn t matter, all of the above give the same performance Problem Q3.2.C 8 Points Given your answer to the previous question, what is now the average number of memory

accesses per user byte read for Alyssa s benchmark?

Problem Q3.3: Virtual Memory and Caches Problem Q3.3.A 24 Points 8 Points Ben Bitdiddle is designing a one level 4-way set associative cache that is virtually indexed and virtually tagged. He realizes that such a cache suffers from a homonym aliasing problem. The homonym problem happens when two processes use the same virtual address to access different physical locations. Ben asks Alyssa P. Hacker for help with solving this problem. She suggests that Ben should add a PID (Process ID) to the virtual tag. Does this solve the homonym problem? Explain. Problem Q3.3.B 8 Points Another problem with virtually indexed and virtually tagged caches is called the synonym problem. The synonym problem happens when distinct virtual addresses refer to the same physical location. Does Alyssa s idea solve this problem? Explain.

Problem Q3.3.C 8 Points Ben thinks that a different way of solving synonym and homonym problems is to have a direct mapped cache, rather than a set associative cache. Is he right? Explain.

END OF EXAM SECTION THIS PAGE PURPOSEFULLY LEFT BLANK

CS152 Computer Architecture and Engineering Virtual Memory Implementation Last Updated: 3/11/2008 6:28 PM http://inst.eecs.berkeley.edu/~cs152/sp08/ Hierarchical Page Table Supporting Variable-Sized Pages Small fixed-sized pages (e.g. 4 KB) reduce internal fragmentation and the page fault penalty compared to large fixed-sized pages. However, when we run an application with a large working set, they may degrade a processor s performance by incurring a number of TLB misses because of their small TLB reach. Therefore, researchers have proposed to support variable-sized pages to increase the TLB reach without losing the benefits of small fixed-sized pages. Many modern processor families (e.g. UltraSparc, PA-RISC, MIPS) and operating systems (e.g. Sun Solaris, SGI IRIX) support this feature. In this handout, we present an example implementation of variable-sized pages, supporting only two page sizes: 4 KB and 4 MB. Assume that the system uses 44-bit virtual addresses and 40-bit physical addresses. 4KB pages are mapped using a threelevel hierarchical page table. 4MB pages are mapped using the first two levels of the same page table. An L2 Page Table Entry (PTE) contains information which indicates if it points to an L3 table or a 4MB data page. All PTEs are 8 Bytes. The following figure summarizes the page table structure and indicates the sizes of the page tables and data pages (not drawn to scale): Root ptr. (processor register) L1 Table (2048 PTEs, 16KB) L2 Table (2048 PTEs, 16KB) L3 Table (1024 PTEs, 8KB) Data Page (4KB) Data Page (4MB) Figure H8-A. Example implementation of variable-sized pages.

Page Table Entries and Translation Lookaside Buffers Each Page Table Entry (PTE) will map a virtual page number (VPN) to a physical page number (PPN). In addition to the page number translation, each page table entry also contains some permission/status bits. Bit Name PPN / DBN V (valid) R (resident) W (writable) U (used) M (modified) S (supervisor) Bit Definition Physical Page Number / Disk Block Number 1 if the page table entry is valid, 0 otherwise 1 if the page is resident in memory, 0 otherwise 1 if the page is writable, 0 otherwise 1 if the page has been accessed recently, 0 otherwise 1 if the page has been modified, 0 otherwise 1 if the page is only accessible in supervisor mode, 0 otherwise Each entry in the Translation Lookaside Buffer (TLB) has a tag that is matched against the VPN and a TLB Entry Valid bit (note, the TLB Entry Valid bit is not the V bit shown in the table above). The TLB Entry Valid bit will be set if the TLB entry is valid. Each TLB entry also contains all the fields from the page table that are listed above. A TLB miss (VPN does not match any of the tags for entries that have the TLB Entry Valid bit set) causes an exception. On a TLB miss kernel software will load the page table entry into the TLB and will restart the memory access. (Kernel software can modify anything in the TLB that it likes and always runs in supervisor mode). If the entry being replaced was valid, then the kernel will also write the TLB entry that is being replaced back to the page table. Hardware will set the used bit whenever a TLB hit to the corresponding entry occurs. Similarly, the modified bit (in the TLB entry) will be set when a store to the page happens. All exceptions that come from the TLB (hit or miss) are handled by software. For example, the possible exceptions are as follows: TLB Miss: VPN does not match any of tags for entries that have the TLB Entry Valid bit set. Page Table Entry Invalid: Trying to access a virtual page that has no mapping to a physical address. Write Fault (Store only): Trying to modify a read-only page (W is 0). Protection Violation: Trying to access a protected (supervisor) page while in user mode. Page Fault: Page is not resident. (Unless noted, exceptions can occur for both loads and stores.)