Full Name:

CSCI 540, Fall 2014
Practice Final Exam

Instructions:

Make sure that your exam is not missing any sheets, then write your full name on the front.
Put your name or student ID on each page.
Write your answers in the space provided below the problem. If you make a mess, clearly indicate your final answer.
This exam is OPEN BOOK and you can use a single page of notes. You cannot use a computer.

Good luck!

Problem   Page   Possible   Score
1         1      20
2         2      20
3         4      20
4         5      20
Total            80

CSCI-540 - Fall 2014 - 1 - December 7, 2014

1. [ 20 Points ] The following problem concerns basic cache lookups.

The memory is byte addressable.
Memory accesses are to 1-byte words (not 4-byte words).
Physical addresses are 12 bits wide.
The cache is 4-way set associative, with a 4-byte block size and 64 total bytes.

In the following tables, all numbers are given in hexadecimal. The left-most value is byte #0 and the right-most value is byte #3. The contents of the cache are as follows:

4-way Set Associative Cache

Set  (V?, Tag, Data)       (V?, Tag, Data)       (V?, Tag, Data)       (V?, Tag, Data)
0    (1, 00E, 6ECF3F9D)    (1, 003, 4584EBAF)    (1, 023, 436831E8)    (1, 0F6, 74DC71BD)
1    (1, 0E9, 5FC155A5)    (1, 00C, 16DE30ED)    (1, 00B, 22876351)    (1, 003, 12ECA0AA)
2    (1, 004, 27189C37)    (1, 0FA, 0F01197D)    (1, 006, 189A3395)    (1, 005, 55274324)
3    (1, 00C, 61DC7BB3)    (1, 003, 74758DBD)    (1, 006, 1111C0D6)    (1, 00D, 4FCCBF9D)

[ 5 Points ] Label the parts of the address used as the block offset (BO) within the line, the cache set index (CI) and the cache tag (CT).

Our computer accesses specific memory locations, given the cache state above. For each given physical address, indicate the Block Offset (BO), the Cache Set Index (CI) and the Cache Tag (CT). Then, give the value (V) that would be loaded if the load is a hit; if the address is a cache miss, write "miss". Use hexadecimal values throughout.

(a) [ 5 Points ] Physical address: 233

(b) [ 5 Points ] Physical address: 78D

(c) [ 5 Points ] Physical address: FA8
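The field boundaries for the cache in Problem 1 follow from its geometry: 64 bytes / (4-byte blocks × 4 ways) gives 4 sets, so 2 bits of block offset, 2 bits of set index, and the remaining 8 bits of tag. The sketch below is one possible way to express that split in C. The names `split_address` and `struct addr_fields` are illustrative assumptions and do not come from the exam.

```c
#include <assert.h>

/* Geometry taken from Problem 1: 12-bit physical addresses,
 * 4-byte blocks, 64-byte cache, 4-way set associative. */
enum {
    ADDR_BITS   = 12,
    BLOCK_BYTES = 4,
    CACHE_BYTES = 64,
    WAYS        = 4,
    NUM_SETS    = CACHE_BYTES / (BLOCK_BYTES * WAYS), /* 4 sets */
    BO_BITS     = 2,  /* log2(BLOCK_BYTES) */
    CI_BITS     = 2   /* log2(NUM_SETS) */
};

/* Hypothetical helper type: the three fields of a physical address. */
struct addr_fields {
    unsigned bo;  /* block offset within the line */
    unsigned ci;  /* cache set index */
    unsigned ct;  /* cache tag */
};

/* Split a 12-bit physical address into tag, set index and block offset. */
struct addr_fields split_address(unsigned addr)
{
    struct addr_fields f;

    assert(addr < (1u << ADDR_BITS));
    f.bo = addr & (BLOCK_BYTES - 1);            /* low 2 bits   */
    f.ci = (addr >> BO_BITS) & (NUM_SETS - 1);  /* next 2 bits  */
    f.ct = addr >> (BO_BITS + CI_BITS);         /* upper 8 bits */
    return f;
}
```

As an illustration with an address that is not one of the exam questions, `split_address(0xABC)` gives BO = 0, CI = 3 and CT = 0xAB. A lookup then compares CT against the four tags stored in set CI and, on a match, selects byte BO of that line's data.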

2. [ 20 Points ] You've been hired to optimize a Gaussian blur filter on the world's tiniest images, each of which is only 8 × 8. You start with the code:

    char m[3][8][8] = ...;
    for (int j = 1; j < cols-1; j++) {
        for (int i = 1; i < rows-1; i++) {
            for (int p = 0; p < 3; p++) {
                char up    = m[p][i-1][j];
                char down  = m[p][i+1][j];
                char left  = m[p][i][j-1];
                char right = m[p][i][j+1];
                ... = (m[p][i][j] + left + right + up + down)/5;
            }
        }
    }

You should assume:

A char takes 1 byte.
You should ignore stores (memory writes) and consider only loads.
The array m starts at address 0.
Memory addresses are 12 bits long.
All scalars are held in registers.

Your cache is 2-way set associative with 4-byte lines, a total size of 32 bytes, and a least recently used replacement policy.

Below, list the address for the first 12 READ or LOAD references (ignore the Store/Write) and indicate if it is a hit or miss in the cache. Use decimal numbers throughout. It's easiest to first write down the array entry (e.g. m[1][2][3]), translate that to an address, and then figure out the hit or miss.

[ 12 Points ]

Ref #   Address   Array Entry   Hit?
0
1
2
3
4
5
6
7
8
9
10
11
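Translating an array entry to an address relies on C's row-major layout: for `char m[3][8][8]` starting at address 0, the last index varies fastest. The following is a minimal sketch of that translation and of the block and set a given address falls into, assuming the Problem 2 cache (4-byte lines, 32 bytes, 2 ways, hence 4 sets). The function names are hypothetical.

```c
#include <assert.h>

/* Array shape and cache geometry taken from Problem 2. */
enum { PLANES = 3, ROWS = 8, COLS = 8, LINE_BYTES = 4, NUM_SETS = 4 };

/* Byte address of m[p][i][j] for char m[3][8][8] starting at address 0.
 * Row-major order: j varies fastest, then i, then p. */
unsigned element_address(unsigned p, unsigned i, unsigned j)
{
    assert(p < PLANES && i < ROWS && j < COLS);
    return (p * ROWS + i) * COLS + j;
}

/* Starting address of the cache block that contains addr. */
unsigned block_start(unsigned addr)
{
    return addr - addr % LINE_BYTES;
}

/* Index of the set that the block containing addr maps to. */
unsigned set_index(unsigned addr)
{
    return (addr / LINE_BYTES) % NUM_SETS;
}
```

For the example entry mentioned in the problem, `element_address(1, 2, 3)` is 83, which lies in the block starting at address 80 and maps to set 0.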

[ 4 Points ] Below, draw a diagram to show the state of the cache at the end of the references above. You should clearly distinguish each set. For each cache line, you should indicate if the entry is valid and the appropriate starting memory address for that line/block (if valid). If the entry is not valid, just leave the data blank and/or have the tag be zero (we're ignoring the valid bit in this example). Rather than showing the tag bits, which are harder to compute, indicate the starting memory address of the cache block, which should be evenly divisible by the block size. All numbers must be decimal.

[ 4 Points ] Assume the two loops were switched into this order:

    char m[3][8][8] = ...;
    for (int p = 0; p < 3; p++) {
        for (int i = 1; i < rows-1; i++) {
            for (int j = 1; j < cols-1; j++) {
                ...

How many cache misses would occur when that code is executed, assuming that's the only code that executed and the cache was initially empty?
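Hit/miss traces and miss counts like the ones asked for in Problem 2 can be cross-checked with a small simulator. The sketch below models a 2-way set-associative cache with 4-byte lines, 4 sets, and LRU replacement, tracking each line by the starting address of its block as the diagram question suggests. All type and function names here are assumptions made for illustration.

```c
#include <assert.h>
#include <string.h>

/* Geometry of the Problem 2 cache: 32 bytes, 4-byte lines, 2 ways. */
enum { LINE_BYTES = 4, WAYS = 2, NUM_SETS = 4 };

struct line {
    int valid;
    unsigned block;      /* starting address of the cached block */
    unsigned last_used;  /* time of the most recent access (for LRU) */
};

struct cache {
    struct line sets[NUM_SETS][WAYS];
    unsigned clock;      /* counts loads, used as an LRU timestamp */
    unsigned misses;     /* running miss count */
};

/* Start with every line invalid and all counters at zero. */
void cache_init(struct cache *c)
{
    assert(c != NULL);
    memset(c, 0, sizeof *c);
}

/* Simulate one load. Returns 1 on a hit and 0 on a miss. */
int cache_load(struct cache *c, unsigned addr)
{
    unsigned block = addr - addr % LINE_BYTES;
    struct line *set = c->sets[(addr / LINE_BYTES) % NUM_SETS];
    struct line *victim = &set[0];
    int w;

    c->clock++;
    for (w = 0; w < WAYS; w++) {
        if (set[w].valid && set[w].block == block) {
            set[w].last_used = c->clock;
            return 1;
        }
    }
    /* Miss: fill an invalid way if one exists, else evict the LRU way. */
    for (w = 0; w < WAYS; w++) {
        if (!set[w].valid) {
            victim = &set[w];
            break;
        }
        if (set[w].last_used < victim->last_used)
            victim = &set[w];
    }
    victim->valid = 1;
    victim->block = block;
    victim->last_used = c->clock;
    c->misses++;
    return 0;
}
```

To use it, call `cache_init` once, then call `cache_load` for each load in program order; the `misses` field holds the total afterwards, and the `block` fields of the valid lines give the final cache contents.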

3. [ 20 Points ] The following problem concerns optimizing a procedure for maximum performance on an Intel Pentium III with the following characteristics of the functional units:

Operation                    Latency   Issue Time/Rate
Integer Add                  1         1
Integer Multiply             4         1
Integer Divide               36        36
Floating Point Add           3         1
Floating Point Multiply      5         2
Floating Point Divide        38        38
Load or Store (Cache Hit)    1         1

You've just joined a programming team that is trying to develop the world's fastest factorial routine. Starting with recursive factorial, they've converted the code to use iteration:

    int fact(int n)
    {
        int i;
        int result = 1;

        for (i = n; i > 0; i--)
            result = result * i;
        return result;
    }

By doing so, they have reduced the number of cycles per element (CPE) for the function from around 63 to around 4 (really!). Still, they would like to do better.

One of the programmers heard about loop unrolling. He generated the following code:

    int fact_u2(int n)
    {
        int i;
        int result = 1;

        for (i = n; i > 0; i-=2) {
            result = (result * i) * (i-1);
        }
        return result;
    }

Unfortunately, the team has discovered that this code returns 0 for some value(s) of argument n.

(a) [ 5 Points ] For what values of n will fact_u2 and fact return different values?

(b) [ 5 Points ] Show the simple fix for fact_u2 that makes its behavior identical to fact.

(c) [ 5 Points ] Benchmarking fact_u2 shows no improvement in performance. How would you explain that? You might want to sketch out the assembly for that loop.

(d) [ 5 Points ] You modify the line inside the loop to read:

    result = result * (i * (i-1));

To everyone's astonishment, the measured performance now has a CPE of 2.5. How do you explain this performance improvement? You might want to characterize how the assembly language for this version would differ from the former.

4. [ 20 Points ] The following problem concerns the way virtual addresses are translated into physical addresses.

The memory is byte addressable.
Memory accesses are to 1-byte words (not 4-byte words).
Virtual addresses are 10 bits wide.
Physical addresses are 14 bits wide.
The page size is 64 bytes.
The TLB is 2-way set associative with 8 total entries.
The L1 Cache is direct mapped, with a 4-byte block size and 64 total bytes.

In the following tables, all numbers are given in hexadecimal, and the left-most value is byte #0 and the right-most value is byte #3, where applicable. The contents of the TLB, a portion of the page tables, and the 16 entries of the Cache are as follows:

TLB

Index   Tag   PPN   Valid     Tag   PPN   Valid
0       1     0d    1         -     -     0
1       1     18    1         3     0c    1
2       2     16    1         -     -     0
3       2     3e    1         -     -     0

Page Table

VPN   PPN   Present
000   019   1
001   001   1
002   03f   1
003   020   1
004   00d   1
005   018   1
007   001   1
008   015   1
009   000   1
00a   016   1
00b   03e   1
00c   06f   1
00d   00c   1
00f   039   1

Cache

Index   Valid   Tag   Data
0       1       3b    BF3A02F3
1       1       4a    E8E7BA4F
2       1       12    23033CCA
3       1       16    7AFB27EE
4       1       2f    8F9F64E8
5       1       3e    EA13BEFD
6       1       2b    FEA8AAA6
7       1       4c    BD501308
8       1       0d    4D011D8E
9       0       1b    7EFEB6ED
10      1       2b    860DFCB3
11      1       15    9D769441
12      1       3a    62DA7A7D
13      1       5b    C8D747DD
14      1       6f    CA8DC445
15      1       7a    90FAAF41

(a) [ 3 Points ] The box below shows the format of a virtual address. Indicate (by labeling the diagram) the fields (if they exist) that would be used to determine the following. (If a field doesn't exist, don't draw it on the diagram.)

    VPO    The virtual page offset
    TLBI   The TLB index
    VPN    The virtual page number
    TLBT   The TLB tag

(b) [ 2 Points ] The box below shows the format of a physical address. Indicate (by labeling the diagram) the fields that would be used to determine the following: PPO (the physical page offset) and PPN (the physical page number).

(c) [ 25 Points ] (5 points each) For the given virtual addresses, indicate the TLB entry accessed and the physical address. Indicate whether the TLB misses and whether the entry is or is not in the page table. If the physical page number and address cannot be determined, write N/A. Then, if a physical address exists, indicate the cache translation parts, whether it's a cache hit, and a value if applicable. If any part can't be determined, just write N/A.

i. Virtual address: 2d5

A. Virtual address format (one bit per box)

B. Address translation
    VPN
    TLB Index
    TLB Tag
    TLB Hit? (Y/N)
    In Page Table? (Y/N)

C. Physical address format (one bit per box)

D. Cache Translation
    Block Offset
    Cache Index
    Cache Tag
    Cache Hit? (Y/N)

ii. Virtual address: 1b1

A. Virtual address format (one bit per box)

B. Address translation
    VPN
    TLB Index
    TLB Tag
    TLB Hit? (Y/N)
    In Page Table? (Y/N)

C. Physical address format (one bit per box)

D. Cache Translation
    Block Offset
    Cache Index
    Cache Tag
    Cache Hit? (Y/N)

iii. Virtual address: 33b

A. Virtual address format (one bit per box)

B. Address translation
    VPN
    TLB Index
    TLB Tag
    TLB Hit? (Y/N)
    In Page Table? (Y/N)

C. Physical address format (one bit per box)

D. Cache Translation
    Block Offset
    Cache Index
    Cache Tag
    Cache Hit? (Y/N)

iv. Virtual address: 112

A. Virtual address format (one bit per box)

B. Address translation
    VPN
    TLB Index
    TLB Tag
    TLB Hit? (Y/N)
    In Page Table? (Y/N)

C. Physical address format (one bit per box)

D. Cache Translation
    Block Offset
    Cache Index
    Cache Tag
    Cache Hit? (Y/N)

v. Virtual address: 22f

A. Virtual address format (one bit per box)

B. Address translation
    VPN
    TLB Index
    TLB Tag
    TLB Hit? (Y/N)
    In Page Table? (Y/N)

C. Physical address format (one bit per box)

D. Cache Translation
    Block Offset
    Cache Index
    Cache Tag
    Cache Hit? (Y/N)
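The address fields used throughout Problem 4 follow from the stated parameters: a 64-byte page gives a 6-bit offset, the 8-entry 2-way TLB has 4 sets (2 index bits), and the 64-byte direct-mapped cache with 4-byte blocks has 16 lines (4 index bits, 2 offset bits). The sketch below shows one way to compute those fields in C. The struct and function names are hypothetical, and the TLB or page-table lookup that supplies the PPN is deliberately left to the caller.

```c
#include <assert.h>

/* Widths derived from the Problem 4 parameters. */
enum {
    VA_BITS   = 10,
    PA_BITS   = 14,
    VPO_BITS  = 6,  /* log2(64-byte page); PPO has the same width */
    TLBI_BITS = 2,  /* 8 entries / 2 ways = 4 sets */
    BO_BITS   = 2,  /* log2(4-byte block) */
    CI_BITS   = 4   /* 64 bytes / 4-byte blocks = 16 lines */
};

struct va_fields { unsigned vpn, vpo, tlbi, tlbt; };
struct pa_fields { unsigned ppn, ppo, bo, ci, ct; };

/* Split a 10-bit virtual address into VPN/VPO and TLB index/tag. */
struct va_fields split_va(unsigned va)
{
    struct va_fields f;

    assert(va < (1u << VA_BITS));
    f.vpo  = va & ((1u << VPO_BITS) - 1);
    f.vpn  = va >> VPO_BITS;
    f.tlbi = f.vpn & ((1u << TLBI_BITS) - 1);  /* low bits of the VPN  */
    f.tlbt = f.vpn >> TLBI_BITS;               /* high bits of the VPN */
    return f;
}

/* Form a physical address from a PPN (found in the TLB or page table)
 * and the page offset, which is unchanged by translation. */
unsigned make_pa(unsigned ppn, unsigned vpo)
{
    return (ppn << VPO_BITS) | vpo;
}

/* Split a 14-bit physical address into page and cache fields. */
struct pa_fields split_pa(unsigned pa)
{
    struct pa_fields f;

    assert(pa < (1u << PA_BITS));
    f.ppo = pa & ((1u << VPO_BITS) - 1);
    f.ppn = pa >> VPO_BITS;
    f.bo  = pa & ((1u << BO_BITS) - 1);
    f.ci  = (pa >> BO_BITS) & ((1u << CI_BITS) - 1);
    f.ct  = pa >> (BO_BITS + CI_BITS);
    return f;
}
```

As an illustration with an address that is not one of the exam questions, `split_va(0x155)` gives VPN = 5, VPO = 0x15, TLBI = 1 and TLBT = 1. The TLB above holds tag 1 in set 1 with PPN 18, so `make_pa(0x18, 0x15)` is 0x615, and `split_pa(0x615)` gives BO = 1, CI = 5 and CT = 0x18. With this geometry the block offset and cache index together span exactly the 6-bit page offset, so the cache tag equals the PPN.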