OPEN BOOK, OPEN NOTES. NO COMPUTERS, OR SOLVING PROBLEMS DIRECTLY USING CALCULATORS.

Similar documents
ECE 313 Computer Organization FINAL EXAM December 13, 2000

CS 230 Practice Final Exam & Actual Take-home Question. Part I: Assembly and Machine Languages (22 pts)

CS/CoE 1541 Mid Term Exam (Fall 2018).

Computer Architecture CS372 Exam 3

Question 1: (20 points) For this question, refer to the following pipeline architecture.

ECE Exam II - Solutions November 8 th, 2017

1. Truthiness /8. 2. Branch prediction /5. 3. Choices, choices /6. 5. Pipeline diagrams / Multi-cycle datapath performance /11

CS 251, Winter 2019, Assignment % of course mark

Final Exam Fall 2007

CENG 3531 Computer Architecture Spring a. T / F A processor can have different CPIs for different programs.

ECE 313 Computer Organization FINAL EXAM December 14, This exam is open book and open notes. You have 2 hours.

CS 251, Winter 2018, Assignment % of course mark

SOLUTION. Midterm #1 February 26th, 2018 Professor Krste Asanovic Name:

Pipelining. CSC Friday, November 6, 2015

Q1: Finite State Machine (8 points)

Computer Organization and Structure

c. What are the machine cycle times (in nanoseconds) of the non-pipelined and the pipelined implementations?

Pipelining and Caching. CS230 Tutorial 09

Final Exam Fall 2008

The University of Michigan - Department of EECS EECS 370 Introduction to Computer Architecture Midterm Exam 2 solutions April 5, 2011

CSE Lecture 13/14 In Class Handout For all of these problems: HAS NOT CANNOT Add Add Add must wait until $5 written by previous add;

Instruction word R0 R1 R2 R3 R4 R5 R6 R8 R12 R31

CSE 378 Final Exam 3/14/11 Sample Solution

ECE 313 Computer Organization FINAL EXAM December 14, This exam is open book and open notes. You have 2 hours.

cs470 - Computer Architecture 1 Spring 2002 Final Exam open books, open notes

ISA Instruction Operation

CS2100 Computer Organisation Tutorial #10: Pipelining Answers to Selected Questions

3/12/2014. Single Cycle (Review) CSE 2021: Computer Organization. Single Cycle with Jump. Multi-Cycle Implementation. Why Multi-Cycle?

Do not open this exam until instructed to do so. CS/ECE 354 Final Exam May 19, CS Login: QUESTION MAXIMUM SCORE TOTAL 115

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor

Chapter 4. Instruction Execution. Introduction. CPU Overview. Multiplexers. Chapter 4 The Processor 1. The Processor.

CSCE 212: FINAL EXAM Spring 2009

EE557--FALL 1999 MAKE-UP MIDTERM 1. Closed books, closed notes

Alexandria University

CS3350B Computer Architecture Quiz 3 March 15, 2018

CMSC411 Fall 2013 Midterm 1

CS 2506 Computer Organization II Test 2

CS/CoE 1541 Exam 1 (Spring 2019).

ECE Sample Final Examination

Chapter 4. The Processor

ENGN 2910A Homework 03 (140 points) Due Date: Oct 3rd 2013

ELE 375 / COS 471 Final Exam Fall, 2001 Prof. Martonosi

Pipelining Exercises, Continued

CS 61C Fall 2016 Guerrilla Section 4: MIPS CPU (Datapath & Control)

Data Hazards Compiler Scheduling Pipeline scheduling or instruction scheduling: Compiler generates code to eliminate hazard

4. What is the average CPI of a 1.4 GHz machine that executes 12.5 million instructions in 12 seconds?

CS 2506 Computer Organization II

Chapter 4. The Processor

Midterm #2 Solutions April 23, 1997

Final Exam Spring 2017

CS146 Computer Architecture. Fall Midterm Exam

Midnight Laundry. IC220 Set #19: Laundry, Co-dependency, and other Hazards of Modern (Architecture) Life. Return to Chapter 4

CSE 141 Spring 2016 Homework 5 PID: Name: 1. Consider the following matrix transpose code int i, j,k; double *A, *B, *C; A = (double

CS 61C Summer 2016 Guerrilla Section 4: MIPS CPU (Datapath & Control)

HW1 Solutions. Type Old Mix New Mix Cost CPI

1 Hazards COMP2611 Fall 2015 Pipelined Processor

Pipelined Processor Design

CS 2506 Computer Organization II Test 2. Do not start the test until instructed to do so! printed

CS61C Summer 2013 Final Exam

CS 351 Exam 2 Mon. 11/2/2015

CENG 5133 Computer Architecture Design Spring Sample Exam 2

EXAM #1. CS 2410 Graduate Computer Architecture. Spring 2016, MW 11:00 AM 12:15 PM

CS 351 Exam 2, Fall 2012

THE HONG KONG UNIVERSITY OF SCIENCE & TECHNOLOGY Computer Organization (COMP 2611) Spring Semester, 2014 Final Examination

CS 2506 Computer Organization II Test 2. Do not start the test until instructed to do so! printed

COMPUTER ORGANIZATION AND DESIGN

ELE 375 Final Exam Fall, 2000 Prof. Martonosi

CS232 Final Exam May 5, 2001

Perfect Student CS 343 Final Exam May 19, 2011 Student ID: 9999 Exam ID: 9636 Instructions Use pencil, if you have one. For multiple choice

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. 5 th. Edition. Chapter 4. The Processor

MIPS Pipelining. Computer Organization Architectures for Embedded Computing. Wednesday 8 October 14

Control Hazards - branching causes problems since the pipeline can be filled with the wrong instructions.

ECE331: Hardware Organization and Design

Cache Organizations for Multi-cores

Instruction Frequency CPI. Load-store 55% 5. Arithmetic 30% 4. Branch 15% 4

Chapter 4. The Processor

CS 2506 Computer Organization II Test 2. Do not start the test until instructed to do so! printed

امتحان نهاية الفصل الدراسي الثاني )يونية ) ٢٠١٥. Computer Architectures (CS x35) معماريات الحاسب x35) (CS. ii. if (g >= h) g = g + 1; else h = h 1;

CSEE W3827 Fundamentals of Computer Systems Homework Assignment 5 Solutions

LECTURE 10. Pipelining: Advanced ILP

Static, multiple-issue (superscaler) pipelines

Outline. A pipelined datapath Pipelined control Data hazards and forwarding Data hazards and stalls Branch (control) hazards Exception

Pipelining. lecture 15. MIPS data path and control 3. Five stages of a MIPS (CPU) instruction. - factory assembly line (Henry Ford years ago)

5008: Computer Architecture HW#2

ADVANCED COMPUTER ARCHITECTURES: Prof. C. SILVANO Written exam 11 July 2011

CS 2410 Mid term (fall 2015) Indicate which of the following statements is true and which is false.

University of California, Berkeley College of Engineering

CS 433 Homework 4. Assigned on 10/17/2017 Due in class on 11/7/ Please write your name and NetID clearly on the first page.

University of California, Berkeley College of Engineering

CS 341l Fall 2008 Test #2

s complement 1-bit Booth s 2-bit Booth s

ELE 818 * ADVANCED COMPUTER ARCHITECTURES * MIDTERM TEST *

Written Exam / Tentamen

Department of Electrical Engineering and Computer Sciences Fall 2017 Instructors: Randy Katz, Krste Asanovic CS61C MIDTERM 2

The University of Alabama in Huntsville Electrical & Computer Engineering Department CPE Test II November 14, 2000

CS455/CpE 442 Intro. To Computer Architecure. Review for Term Exam

Basic Instruction Timings. Pipelining 1. How long would it take to execute the following sequence of instructions?

Question 1 (5 points) Consider a cache with the following specifications Address space is 1024 words. The memory is word addressable The size of the

ECE 30, Lab #8 Spring 2014

Transcription:

CS/ECE472 Midterm #2 Fall 2008 NAME: Student ID#: OPEN BOOK, OPEN NOTES. NO COMPUTERS, OR SOLVING PROBLEMS DIRECTLY USING CALCULATORS. Your signature is your promise that you have not cheated and will not cheat on this exam, nor will you help others to cheat on this exam. Signature: Each Question; 10 points Question 1 Question 2 Question 3 Question 4 Question 5 Question 6 Question 7 Question 8 Question 9 Question 10 TOTAL

1. Assume that you have a pipelined implementation of MIPS with a perfect cache. Draw the pipelining diagram for the code along the left side. I have started the diagram for you with the first statement. Indicate all data dependencies by circling the registers in the code. Indicate all necessary forwarding from the output of one functional unit to the input of another functional unit. Indicate all necessary stalls with a nop. Keep everything that happens in the same clock cycle in a straight column. You do not have to shade the active units. add $6, $1, $2 IF ID EX MA WB subi $6, $6, 1 lw $2, 100($6) beq $2, $6, NEXT sub $6, $7, $2 sw $6, $6($2) 2. Assume you have a loop with a branch statement at memory address 4000. Every time the program encounters this loop, the branch is taken 999 times and then not taken once which ends the loop. Assume this execution of the program has encountered this loop before and that no other branch statement indexes to the same entry in the branch prediction array. a) If a 1-bit prediction scheme is used, how many times will it be correct for the branch at memory address 4000 out of the 1000 times it is executed in the course of the loop? b) If a 2-bit prediction scheme is used, how many times will it be correct for the branch at memory address 4000 out of the 1000 times it is executed in the course of the loop?

3. The CPI of a machine with a perfect cache is 3.14 for a certain benchmark. Now, assume an instruction cache miss rate of 2% and a data cache miss rate of 6%. Assume the miss penalty (for either cache) is 270 cycles. The frequency of data access instructions in this benchmark is 48%. How much faster is the machine with the perfect cache than the one with the imperfect cache? Otherwise the machines are identical and running the same code. Express your answer either as a fraction or as a decimal rounded to the hundredths place. ( X.XX) Your answer should be > 1. 4. Assume an 80-bit virtual address and a 64-bit physical address. The page size is 8 KB. How many entries are there in the page table? Express your answer in powers of 2. Show your work. (KB = 2^10 bytes) 5. What is the order is which the system accesses the physically indexed/tagged L1 cache, the page table, the physically indexed/tagged L2 cache and the TLB assuming misses? (Assume all accesses are done sequentially, not in parallel.) 1. 2. 3. 4.

6. See the diagram below. The TLB is fully associative and contains the 4 bits: valid (V), protection (P), dirty (D), and reference (R). The page size is 8 KB. The virtual address is 37 bits long. The physical address is 32 bits long. virtual address 36 0 TLB 0... 63 VPDR Tag _? a) How many bits are in this tag? Be specific or no credit. b) What else (besides V,P,D,R and the tag) is stored in a TLB entry? c) How many bits are used for (b)? physical address 31 0

7. Assume we have a perfect cache. Assume the MIPS instruction set we have been studying all term. (Multicycle: loads = 5 cycles, stores and ALU instr = 4 cycles, branches and jumps = 3 cycles.) Assume the following functional unit times: (ps = 10-12 sec) 800 ps for memory access 130 ps for ALU operation 100 ps for register file read or write Assume a certain benchmark has the following instruction frequencies: 48% loads 12% stores 6% branches 4% jumps 30% ALU instructions For the pipelined design: Loads take 1 clock cycle when there is no load-use data dependency and 2 cycles when there is. Branches take 1 clock cycle when correctly predicted and 2 cycles when not. Jumps always cause a stall and therefore always take 2 cycles. 20% of the load instructions have a load-use data dependency. 1/4 of the branch instructions are incorrectly predicted. a) What is the minimum clock period for a single cycle implementation (in ps)? b) What is the minimum clock period for a multi-cycle implementation (in ps)? c) What is the minimum clock period for a pipelined implementation (in ps)? d) What is the CPI for a single cycle implementation? e) What is the CPI for a multi-cycle implementation? Round your answers to the hundredths place. X.XX f) What is the CPI for a pipelined implementation? g) How long (in ps) does it take for an instruction to come out of the single cycle implementation? Round to the hundredths place. h) How long (in ps) does it take (on average) for an instruction to come out of the multicycle implementation? Round to the hundredths place. i) How long (in ps) does it take (on average) for an instruction to come out of the pipelined implementation? Round to the hundredths place.

8. Assume we have a 16 KB cache direct mapped cache. (KB = 2^10 bytes) The block size is 8 words. Given a 32-bit physical address, divide up all the bits and indicate what they are used for to find and access the requested word in the cache. Do NOT draw the cache entries, the mux, the AND gate,... Just indicate exactly how many bits and which bits of the address are used for each purpose (the tag, the index,...). Label your groups of bits with their purpose. 31 0 9. Assume we have a 16 KB cache 8-way set associative cache. (KB = 2^10 bytes) The block size is 2 word. Given a 32-bit physical address, divide up all the bits and indicate what they are used for to find and access the requested word in the cache. Do NOT draw the cache entries, the mux, the AND gate,... Just indicate exactly how many bits and which bits of the address are used for each purpose. Label your groups of bits with their purpose. 31 0

10. A) Write a loop in MIPS that adds up all the ints stored in an array of size 100. I have started it for you. Use my registers ($t0, $s1, $s2) as declared in the comments. Use only MIPS core instructions. Feel free to use registers $t1 - $t7. LP: # $t0 holds the address of the first int in the array addi $s1, $t0, 400 #first word not in the array addi $s2, $zero, 0 #sum initialized to zero B) Unroll your loop (just the code that starts at LP) so that it executes 50 times and has no data hazards. Write your code to the right of your code from part A). C) Given the registers you see used in the starter code of A), what register(s) must be saved to the stack frame in order to follow MIPS calling conventions?

Page 8

Page 9