Homework 4 - Solutions (Floating point representation, Performance, Recursion and Stacks) Maximum points: 80 points

Similar documents
COMP 303 Computer Architecture Lecture 3. Comp 303 Computer Architecture

Chapter 2. Computer Abstractions and Technology. Lesson 4: MIPS (cont )

EN164: Design of Computing Systems Lecture 11: Processor / ISA 4

comp 180 Lecture 10 Outline of Lecture Procedure calls Saving and restoring registers Summary of MIPS instructions

CS/COE1541: Introduction to Computer Architecture

CSCI 402: Computer Architectures. Instructions: Language of the Computer (3) Fengguang Song Department of Computer & Information Science IUPUI.

Lecture 5: Procedure Calls

2. dead code elimination (declaration and initialization of z) 3. common subexpression elimination (temp1 = j + g + h)

Lecture 5: Procedure Calls

Instruction Set Architecture

CSE Lecture In Class Example Handout

CDA3100 Midterm Exam, Summer 2013

2) Using the same instruction set for the TinyProc2, convert the following hex values to assembly language: x0f

MIPS function continued

MIPS ISA and MIPS Assembly. CS301 Prof. Szajda

ECE/CS 552: Introduction to Computer Architecture ASSIGNMENT #1 Due Date: At the beginning of lecture, September 22 nd, 2010

October 24. Five Execution Steps

Homework 2 - Solutions (Assembly Language and Machine Language) Maximum points : 40 points

Branch Addressing. Jump Addressing. Target Addressing Example. The University of Adelaide, School of Computer Science 28 September 2015

ECE232: Hardware Organization and Design

Control Instructions. Computer Organization Architectures for Embedded Computing. Thursday, 26 September Summary

Control Instructions

CSEE W3827 Fundamentals of Computer Systems Homework Assignment 5 Solutions

ICS DEPARTMENT ICS 233 COMPUTER ARCHITECTURE & ASSEMBLY LANGUAGE. Midterm Exam. First Semester (141) Time: 1:00-3:30 PM. Student Name : _KEY

Function Calls. 1 Administrivia. Tom Kelliher, CS 240. Feb. 13, Announcements. Collect homework. Assignment. Read

CS64 Week 5 Lecture 1. Kyle Dewey

The University of Alabama in Huntsville Electrical & Computer Engineering Department CPE Test II November 14, 2000

Common Problems on Homework

EE 361 University of Hawaii Fall

Do-While Example. In C++ In assembly language. do { z--; while (a == b); z = b; loop: addi $s2, $s2, -1 beq $s0, $s1, loop or $s2, $s1, $zero

Lecture 14: Recap. Today s topics: Recap for mid-term No electronic devices in midterm

Review of Activation Frames. FP of caller Y X Return value A B C

Thomas Polzer Institut für Technische Informatik

ECE 30 Introduction to Computer Engineering

MIPS Functions and Instruction Formats

CS/COE 0447 Example Problems for Exam 2 Fall 2010

ECE331: Hardware Organization and Design

Subroutines. int main() { int i, j; i = 5; j = celtokel(i); i = j; return 0;}

ECE 2035 A Programming Hw/Sw Systems Spring problems, 8 pages Final Exam Solutions 29 April 2015

Computer Science and Engineering 331. Midterm Examination #1. Fall Name: Solutions S.S.#:

4. What is the average CPI of a 1.4 GHz machine that executes 12.5 million instructions in 12 seconds?

6.004 Tutorial Problems L3 Procedures and Stacks

CENG3420 Computer Organization and Design Lab 1-2: System calls and recursions

Chapter 2A Instructions: Language of the Computer

Points available Your marks Total 100

Function Calls. Tom Kelliher, CS 220. Oct. 24, SPIM programs due Wednesday. Refer to homework handout for what to turn in, and how.

Chapter 3. Instructions:

ECE 2035 Programming HW/SW Systems Fall problems, 7 pages Exam Two 23 October 2013

Machine Language Instructions Introduction. Instructions Words of a language understood by machine. Instruction set Vocabulary of the machine

Final Exam Fall 2007

bits 5..0 the sub-function of opcode 0, 32 for the add instruction

Lecture 7: MIPS Functions Part 2. Nested Function Calls. Lecture 7: Character and String Operations. SPIM Syscalls. Recursive Functions

COE608: Computer Organization and Architecture

(a) Implement processor with the following instructions: addi, sw, lw, add, sub, and, andi, or, ori, nor, sll, srl, mul

ENGN1640: Design of Computing Systems Topic 03: Instruction Set Architecture Design

ECE369. Chapter 2 ECE369

Winter 2017 MIDTERM TEST #1 Wednesday, February 8 7:00pm to 8:30pm. Please do not write your U of C ID number on this cover page.

COMP2611: Computer Organization MIPS function and recursion

Chapter 2. Instructions:

CSE Lecture In Class Example Handout

Patterson PII. Solutions

CS 316: Procedure Calls/Pipelining

ECE260: Fundamentals of Computer Engineering

CS61C Machine Structures. Lecture 12 - MIPS Procedures II & Logical Ops. 2/13/2006 John Wawrzynek. www-inst.eecs.berkeley.

CSCI 2321 (Computer Design), Spring 2018 Homework 3

Winter 2003 MID-SESSION TEST Monday, March 10 6:30 to 8:00pm

CS 61C Summer 2016 Guerrilla Section 4: MIPS CPU (Datapath & Control)

Lecture 5. Announcements: Today: Finish up functions in MIPS

Computer Architecture CS372 Exam 3

ECE/CS 314 Fall 2003 Homework 2 Solutions. Question 1

Instructions: MIPS arithmetic. MIPS arithmetic. Chapter 3 : MIPS Downloaded from:

MIPS Procedure Calls. Lecture 6 CS301

2B 52 AB CA 3E A1 +29 A B C. CS120 Fall 2018 Final Prep and super secret quiz 9

Defining Performance. Performance. Which airplane has the best performance? Boeing 777. Boeing 777. Boeing 747. Boeing 747

ECE 2035 Programming HW/SW Systems Spring problems, 6 pages Exam Two 11 March Your Name (please print) total

CS 61C: Great Ideas in Computer Architecture More MIPS, MIPS Functions

CS 230 Practice Final Exam & Actual Take-home Question. Part I: Assembly and Machine Languages (22 pts)

CS 61C Fall 2016 Guerrilla Section 4: MIPS CPU (Datapath & Control)

Procedure Calls Main Procedure. MIPS Calling Convention. MIPS-specific info. Procedure Calls. MIPS-specific info who cares? Chapter 2.7 Appendix A.

Homework 1. Problem 1. The following code fragment processes an array and produces two values in registers $v0 and $v1:

ECE 2035 Programming HW/SW Systems Fall problems, 6 pages Exam Two 21 October 2016

Chapter 1. Computer Abstractions and Technology. Lesson 3: Understanding Performance

Instruction Set Architecture part 1 (Introduction) Mehran Rezaei

MIPS Assembly Programming

Instruction Set Architectures (4)

Computer Architecture I Midterm I

CS 2506 Computer Organization II Test 1

Lecture 4: MIPS Instruction Set

CS152 Computer Architecture and Engineering

Computer Architecture. Chapter 2-2. Instructions: Language of the Computer

Functions in MIPS. Functions in MIPS 1

CENG3420 Lecture 03 Review

Computer Architecture

ENCM 369 Winter 2017 Lab 3 for the Week of January 30

ECE 2035 Programming HW/SW Systems Fall problems, 6 pages Exam Two 23 October Your Name (please print clearly) Signed.

ECE 2035 Programming HW/SW Systems Spring problems, 7 pages Exam One Solutions 4 February 2013

CSE 378 Final Exam 3/14/11 Sample Solution

Stored Program Concept. Instructions: Characteristics of Instruction Set. Architecture Specification. Example of multiple operands

Chapter 2: Instructions:

101 Assembly. ENGR 3410 Computer Architecture Mark L. Chang Fall 2009

Transcription:

Homework 4 - Solutions (Floating point representation, Performance, Recursion and Stacks) Maximum points: 80 points Directions This assignment is due Friday, Feb. th. Submit your solutions on a separate sheet of paper. Learning Objectives In the process of completing this homework assignment, students will develop their abilities to: Determine the performance of a processor. Determine how different parameters affect the performance of a machine Apply Amdahl s law to determine how much speedup can be obtained by improving an architecture. Interpret and represent floating point numbers Write recursive procedures. Problems ) In this problem, you will analyze the performance of two implementations of the MULDER instruction set architecture. a. [5 points] For the FOX implementation, the clock rate is GHz, and instructions fall into two categories. Category X instructions require 4 cycles to execute, while category Y instructions require 5 cycles. If the File program executes.27 0 category X instructions and.0 0 category Y instructions, how long will it take to execute? Express your answer in seconds. N CPI = IC i * CPI i where IC i is the # of instructions of type i and CPI i is the t E i= average CPI for instructions of type i for the architecture..27x0 x4 +.0x0 x5 CPI =.27x0 +.0x0 = ofinstructions CPI = (.27 0 +.0 0 ) CPI ClockRate 0 # 9 = 03seconds b. [5 points] The SPOOKY implementation is identical to the FOX implementation, except that the clock rate is.2 GHz and category Y instructions require 6 cycles. How long will the File program take to execute on this implementation?.27x0 x4 +.0x0 x6 CPI =.27x0 +.0x0

t E = ofinstructions CPI = (.27 0 +.0 0 ) CPI ClockRate.2 0 # 9 = 928seconds c. [5 points] The AGENT implementation is identical to the FOX implementation, except that it includes a category Z instruction that requires 8 cycles to execute and does the same thing as a common combination consisting of one category X instruction and one category Y instruction. This combination accounts for 0% of the category X instructions used by the File program on the FOX implementation. How long will the File program take to execute on the AGENT implementation? # of Z type instructions = 0% x # of X type instructions =. x.27 x 0 =.27 x 0 0 Total remaining # of X type instructions when Z type instructions are used =.27 x 0.27 x 0 0 =.43 x 0 Total remaining # of Y type instructions when Z type instructions are used =.0 x 0.27 x 0 0 = 0.883 x 0 0.27x0 8 +.883 0 5 +.43 4 CPI = 0.27 0 +.883 0 +.43 0 t E 0 =.43 0 + 0.883 0.27 0 ) CPI 0 ( 9 = 000.3seconds d. [5 points] Between the FOX and SPOOKY implementations, which is faster for the File program, and by how much? Comparing the execution times calculated, the SPOOKY implementation is faster by 85 seconds or.09 times as fast or 8.39% (85*00/03) faster. 2) [8 points] Suppose we have two implementations of the same instruction set architecture. Computer A has a clock cycle time of 4 ns and a CPI of 2.0 for some program, and computer B has a clock cycle time of 2 ns and a CPI of.2 for the same program. Which computer is faster for this program? timea = cycle time of A x times IC x times CPI of A = 4 x 2 x IC = 8 x IC ns timeb = cycle time of B x times IC x times CPI of B = 2 x :2 x IC = 2:4 IC ns B is faster than A. 3) [9 points] Suppose we could improve the speed of the CPU in a computer by a factor of 5 (without affecting I/O performance) for 5 times the cost. Also assume that the CPU is used 50% of the time, and the rest of the time the CPU is waiting for I/O. If the CPU is one-third of the total cost of the computer, is increasing the CPU speed by a factor of 5 a good investment from a cost/performance viewpoint?

Using Amdahl's rule to determine the overall speedup: Overall Speedup = f ( f ) + s = =.667 = 66.7% 0.5 ( 0.5) + 5 Assuming that the original cost of the computer is x. Original CPU cost = /3x New CPU cost = 5 * /3x New cost of computer = (5 * /3 + ½)x = 2.67 x Overall increase in cost = 267% 4) [8 points] The incomplete MIPS procedure below computes the recursive function b, if a = 0 foo(a,b) = foo(a-,b-a) * b, otherwise Finish the procedure by adding the code for the procedure entrance, the recursive procedure call, and the procedure exit. Do not modify any of the given code, and be sure to follow the MIPS convention for register usage. Hint: $t0 and $s0 hold the local values of a and b, respectively. You may write the answer to this question in the boxes provided and attach the sheets with the rest of your solutions. foo:# Procedure entrance addi $sp, $sp, -4 # Since procedure may call another sw $ra, 0($sp) # procedure, save $ra move $t0, $a0 # $t0 = a addi $sp, $sp, -4 # Since $s0 is a saved register, sw $s0, 0($sp) # before we over-write it, must # save its value. move $s0, $a # $s0 = b # We use $s0 to hold b, because we # need the value of b after the # call to foo. # i.e. foo(a-,b-a) * b # if (a == 0) return b bne $t0, $zero, Else move $t, $s0 j Exit # otherwise, return foo(a-,b-a)*b Else: sub $t2, $t0,

sub $t3, $s0, $t0 # Recursive procedure call move $a, $t3 jal foo //Move arguments to argument //registers //Call the procedure move $t, $v0 mul $t, $t, $s0 Exit: move $v0, $t #Procedure exit lw $s0, 0($sp) lw $ra, 4($sp) addi $sp, $sp, 8 //Restore $ra and $s0 jr $ra //Quit procedure 5) [0 pts] Some computer architectures do not provide instructions for multiplication. The following Java code describes one (not particularly efficient) algorithm that could be used to perform multiplication within such an instruction set: public static int mymultiply( int a, int b ) { int rv; if ( a == 0 ) { rv = 0; } else if ( a > 0 ) { rv = mymultiply( a -, b ) + b; } else { rv = - mymultiply( -a, b ); } } return rv; Write a MIPS program that implements this algorithm. mymultiply: addi $sp, $sp, -8 #Save $ra to the stack sw $ra, 4($sp) sw $s0, 0($sp) #Save $s0 to the stack move $t0, $a0 move $s0, $a #$t0 = a #$s0 = b

#If (a==0) return rv = 0; bne $t0, $0, Greater move $v0, $0 # return 0 j Exit # It is not a good programming practice # to have # multiple exit points from a program. Greater: #if(a > 0), return mymultiply( a -, b ) + b slt $t2, $0, $t0 beq $t2, $0, Less addi $t2, $t0, - # $t2 = a- move $a, $s0 # Move arguments to argument # registers jal mymultiply # call procedure add $v0, $v0, $s0 # $v0 = mymultiply (,) + b j Exit Less: # return - mymultiply( -a, b ) sub $t2, $0, $t0 #$t2 = -a move $a, $s0 jal mymultiply # call procedure sub $v0, $0, $v0 add $t2, $0, $t #$t2 = -a move $a, $s0 jal mymultiply # call procedure sub $v0, $0, $v0 Exit: lw $ra, 4($sp) # Restore $s0 and $ra lw $s0, 0($sp) addi $sp, $sp, 8 jr $ra # Exit procedure 6) [0 points] Convert the following decimal numbers to single precision IEEE 754 floating point format: (a) -2.625 (b) 9.25 (a) (2:625 )=> 00.0 (-) x (.0000) x 2 4, exp = 3

000 00 00 00 0000 0000 0000 000 (b) 9.25 => 00.0 (-) 0 x (.000) x 2 3, exp = 30 0 000 000 000 000 0000 0000 0000 000 7) [5 points] The following bit patterns are floating point numbers in single precision IEEE 754 format. Convert them to decimal: (a) 00000 => exp = 25.00 =>.42875 (-) x.42875 x 2 25 = -47, 70, 208 (b) 00 => exp = -4.00 =>.875 (-) 0 x.875 x 2-4 = 0.0742875 (c) 00000000 => exp = -26 0.0 => 0.4375 (-) 0 x 0.4375 x 2-26 = 5.427877 x 0-39