ECE 3056 Exam I - Solutions
February 19th, 2015, 3:00 pm to 4:25 pm
1. (35 pts) Consider the following block of SPIM code. The text segment starts at 0x00400000 and the data segment starts at 0x10010000.

        .data
first:  .word 1,0,1,0,1,0,1,
last:   .word 0
        .text
        .globl main
main:   la   $t0, first          # load start address of array
        la   $t1, last           # load end address of the array
        addi $t1, $t1, 4         # point to first word after the array
        li   $t2, 0              # initialize count using immediate
        add  $t3, $zero, $zero   # initialize sum using another approach
loop:   lw   $t4, 0($t0)         # fetch array element
        add  $t3, $t3, $t4       # update sum
        addi $t0, $t0, 4         # point to next word
        addi $t2, $t2, 1         # increment count
        bne  $t1, $t0, loop      # if not done, start next iteration
        li   $v0, 10
        syscall

a. (10 pts) Provide the hexadecimal encodings of the following instructions.

    lw  $t4, 0($t0)        0x8d0c0000
    bne $t1, $t0, loop     0x1528fffc
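The two encodings in part a can be reproduced by packing the I-format fields by hand. The Python sketch below is illustrative only; the helper name `encode_i_type` and the `REG` table are assumptions made for this example. The MIPS architecture counts a branch offset from the instruction after the branch, which would give -5 words (0xfffb) here. The answer's 0xfffc corresponds to counting -4 words from the branch itself, and the sketch assumes that convention in order to match the answer.

```python
# Register numbers for the registers used in the two instructions.
REG = {"$zero": 0, "$t0": 8, "$t1": 9, "$t4": 12}

def encode_i_type(opcode, rs, rt, imm):
    """Pack an I-format word: opcode(6) | rs(5) | rt(5) | immediate(16)."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)

# lw $t4, 0($t0): opcode 0x23, base register in rs, destination in rt.
lw_word = encode_i_type(0x23, REG["$t0"], REG["$t4"], 0)

# bne $t1, $t0, loop: opcode 0x05, rs = $t1, rt = $t0.
# The label `loop` sits 4 words before the bne instruction.
# ASSUMPTION: the offset is counted from the branch's own address,
# which reproduces the 0xfffc given in the answer.
bne_word = encode_i_type(0x05, REG["$t1"], REG["$t0"], -4)

# For comparison: the same branch with the offset counted from PC + 4.
bne_word_pc_plus_4 = encode_i_type(0x05, REG["$t1"], REG["$t0"], -5)

print(f"{lw_word:08x}")             # 8d0c0000
print(f"{bne_word:08x}")            # 1528fffc
print(f"{bne_word_pc_plus_4:08x}")  # 1528fffb
```

The only field that differs between the two branch conventions is the 16-bit immediate; the opcode and register fields (0x1528 in the upper half-word) are the same either way.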
b. (5 pts) Is the above code relocatable? Explain.

Yes. None of the instructions depend on the absolute address of another instruction.

c. (10 pts) The following is the binary representation of a block of assembled SPIM code. Disassemble the program, producing the original SPIM instructions. Use the opcode map at the end of this exam.

    Assembled Binary     MIPS Instruction
    0x21080004           addi $8, $8, 4
    0x2129ffff           addi $9, $9, -1
    0x1520fffc           bne  $9, $0, -4 (words)

d. (6 pts) Consider a jal instruction stored at address 0x00400028. Determine whether each of the following addresses can be a target of this instruction, i.e., the starting location of the procedure. You must justify your answer to receive credit.

i. 0x20020040

The jal instruction provides the lower 28 bits of the target byte address. The upper 4 bits come from the PC to produce a full 32-bit address. For this instruction, the upper four bits of its PC are 0x0. Therefore, the target produced by the jal instruction cannot have the value 0x2 in its upper four bits. The target is not in the same 256 MB segment as the jal instruction. Hence, this address cannot be a target of the jal instruction.

ii. 0x04040044

Following the reasoning above, the upper four bits of this address are 0x0, which matches the PC, so this address can be a target of the jal instruction.

e. (4 pts) What will be contained in the symbol table after assembly?

The symbol main, which is declared as globl.
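The jal reasoning in part d can be written as a small check. This is a sketch under stated assumptions: the function name `jal_can_reach` is made up for the example, and the word-alignment test is an extra condition that follows from the 26-bit field being shifted left by two bits.

```python
def jal_can_reach(jal_address, target):
    """Return True if a jal stored at jal_address can encode target."""
    # The upper 4 bits of the target are taken from the PC after the fetch.
    next_pc = (jal_address + 4) & 0xFFFFFFFF
    same_segment = (next_pc & 0xF0000000) == (target & 0xF0000000)
    # The 26-bit field is shifted left by 2, so the target is word aligned.
    word_aligned = target % 4 == 0
    return same_segment and word_aligned

print(jal_can_reach(0x00400028, 0x20020040))  # False: upper nibble is 0x2
print(jal_can_reach(0x00400028, 0x04040044))  # True: upper nibble is 0x0
```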
2. (10 pts) Consider the 32-bit ALU design discussed in class and shown below. Note that it supports the beq instruction. Now suppose that we wished to similarly add hardware support for the branch-on-less-than-or-equal-to instruction

    ble $t0, $t1, loop

Clearly i) show the necessary changes to the ALU hardware, ii) list the corresponding values of the ALU control signals for this instruction, and iii) describe briefly (in a sentence or two) how the single cycle datapath would operate with these changes.

    Instruction    Binvert    Operation
    ble            1          10

The ALU is set to perform $t0 - $t1 (control value 110). Add a 2-input OR gate that forms the logical OR of the Zero signal and the MSB of the ALU result to produce a new ble signal. At the datapath level, the branch address multiplexor is now driven by a logical OR condition. The branch is taken if i) it is a beq instruction and the Zero signal = 1, or ii) it is a ble instruction and the ble signal = 1. The ble instruction can use the same ALUOp as beq.
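The ble condition described above (Zero OR sign bit of the difference) can be modeled in a few lines of Python. This is a behavioral sketch, not a gate-level description; the names `alu_subtract` and `ble_signal` are assumptions for the example. Like the solution, it ignores overflow, so it is only meaningful when the subtraction does not overflow 32 bits.

```python
MASK32 = 0xFFFFFFFF

def alu_subtract(a, b):
    """Model a 32-bit subtract and return (result, zero, msb)."""
    result = (a - b) & MASK32      # two's complement wrap to 32 bits
    zero = int(result == 0)        # Zero output of the ALU
    msb = result >> 31             # sign bit of the 32-bit result
    return result, zero, msb

def ble_signal(a, b):
    """New ble signal: OR of the Zero output and the result's MSB."""
    _, zero, msb = alu_subtract(a, b)
    return zero | msb

print(ble_signal(3, 5))   # 1: 3 - 5 is negative, MSB is set
print(ble_signal(5, 5))   # 1: difference is zero
print(ble_signal(7, 5))   # 0: difference is positive and nonzero
```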
3. (25 pts) Consider the single cycle SPIM datapath shown overleaf. You wish to add a new instruction, decrement-and-branch-on-zero:

    dcb $t0, loop

This instruction will first decrement the register $t0 by 1 and then branch to the label loop if the result is 0. Register $t0 is updated with the decremented value.

a. (10 pts) Clearly show the hardware modifications on the datapath. [To receive any credit your modifications must be legible]

You can reuse the implementation of beq, which already performs a subtract operation. Add the constant 1 as an input to the ALUSrc mux (now driven by a 2-bit control) so that the ALU computes $t0 - 1. Write the result of the ALU operation (the decremented value) back to the rs register: expand the RegDst mux to include rs as an input, so that RegDst becomes a 2-bit control signal.

b. (10 pts) Fill in the truth table below for the single cycle datapath, including any changes to accommodate the new instruction.

    Instr.     RegDst  AluSrc  MemToReg  RegWrite  MemRead  MemWrite  Branch  AluOp
    R-format   01      00      0         1         0        0         0       10
    lw         00      01      1         1         1        0         0       00
    sw         X       01      X         0         0        1         0       00
    beq        X       00      X         0         0        0         1       01
    DCB        10      10      0         1         0        0         1       01

c. (5 pts) Assuming the opcode for the new instruction is 100111, show the modifications required to the controller hardware below for the new instruction. Make and state any assumptions you need.
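For part c, one possible way to think about the controller change is a new product term that recognizes opcode 100111 and is ORed into every control output that the DCB row of the truth table sets to a nonzero value. The Python sketch below only illustrates that idea. It assumes an AND/OR (PLA-style) controller, and the names `is_dcb` and `DCB_CONTROL` are made up for the example; it is not the exam's official answer.

```python
# Control values for DCB, copied from the DCB row of the truth table in part b.
DCB_CONTROL = {
    "RegDst": 0b10, "AluSrc": 0b10, "MemToReg": 0, "RegWrite": 1,
    "MemRead": 0, "MemWrite": 0, "Branch": 1, "AluOp": 0b01,
}

def is_dcb(opcode):
    """Product term for opcode 100111: Op5 . Op4' . Op3' . Op2 . Op1 . Op0."""
    op = [(opcode >> i) & 1 for i in range(6)]   # op[0] is Op0, the LSB
    return op[5] & (1 - op[4]) & (1 - op[3]) & op[2] & op[1] & op[0]

# Control outputs that the new product term must feed (nonzero in the DCB row).
driven_by_dcb = [name for name, value in DCB_CONTROL.items() if value]

print(is_dcb(0b100111))   # 1
print(is_dcb(0b100011))   # 0 (this is the lw opcode)
print(driven_by_dcb)      # ['RegDst', 'AluSrc', 'RegWrite', 'Branch', 'AluOp']
```

Exactly one of the 64 possible opcodes activates the term, so the existing instructions keep their control values unchanged.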
4. (30 pts) Consider the execution of the following block of SPIM code on a multicycle datapath. Assume that the fetch cycle for the first instruction is cycle number 0. Assume that i) all immediate instructions require the same number of states as an add instruction, and ii) all branch instructions require the same number of states as a beq instruction. The text segment starts at 0x00400000 and the data segment starts at 0x10010000.

# a simple counting for loop and a conditional if-then-else statement
        .data
L1:     .word 0x44,22,33,55
        .text
        .globl main
main:   lui  $t0, 0x1001
        li   $t1, 4
        add  $t2, $t2, $zero
loop:   lw   $t3, 0($t0)
        add  $t2, $t2, $t3
        addi $t0, $t0, 4
        addi $t1, $t1, -1
        bne  $t1, $zero, loop
        bgt  $t2, $0, then
        move $s0, $t2
        j    exit
then:   move $s1, $t2
exit:   li   $v0, 10
        syscall

a. (5 pts) What are the contents of ALUOut after cycle number 14 completes? Why?

0x10010000. Cycle 14 is the address calculation cycle of the lw instruction. The address calculation is $t0 + 0, and the contents of $t0 are 0x10010000.

b. (15 pts) Fill in the following values during the requested cycle (remember the first cycle is cycle number 0!).

    Cycle  ALUOp  ALUSrcB  IRWrite  RegWrite  MemToReg  PCSource
    10     10     00       0        0         X         XX
    18     00     11       0        0         X         XX

Cycle number 10 corresponds to the addition operation for the add $t2, $t2, $zero instruction. Cycle number 18 corresponds to the decode state for the add $t2, $t2, $t3 instruction.
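The cycle numbers used in parts a and b can be checked by laying the instructions end to end and counting states. The sketch below assumes the usual multicycle state counts (4 states for add and immediate instructions, 5 for lw), which is consistent with the answers above. The state names and the helper `state_at` are labels chosen for this example.

```python
# ASSUMPTION: state sequences of the textbook multicycle datapath.
STATES = {
    "alu": ["fetch", "decode", "execute", "write back"],
    "lw":  ["fetch", "decode", "address calculation", "memory read", "write back"],
}

# The first five instructions executed, in order, starting at cycle 0.
PROGRAM = [
    ("lui $t0, 0x1001", "alu"),
    ("li $t1, 4", "alu"),
    ("add $t2, $t2, $zero", "alu"),
    ("lw $t3, 0($t0)", "lw"),
    ("add $t2, $t2, $t3", "alu"),
]

def state_at(cycle):
    """Return (instruction, state) active during the given cycle number."""
    if cycle < 0:
        raise ValueError("cycle numbers start at 0")
    start = 0
    for text, kind in PROGRAM:
        states = STATES[kind]
        if cycle < start + len(states):
            return text, states[cycle - start]
        start += len(states)
    raise ValueError("cycle is past the instructions listed in PROGRAM")

for cycle in (10, 14, 18):
    print(cycle, state_at(cycle))
```

Cycles 10, 14, and 18 map to the execute state of the first add, the address calculation of lw, and the decode state of the second add, respectively.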
c. (10 pts) Consider the multicycle datapath and assume that each state consumes the same amount of energy, E joules, and that the bne instruction takes the same number of cycles as the beq instruction. Now consider that the datapath executes at 1 GHz. Compare the power dissipation of each loop. Your answer must include the computation of the power dissipated by each loop.

Loop1:  lw   $t3, 0($t0)
        add  $t2, $t2, $t3
        addi $t0, $t0, 4
        addi $t1, $t1, -1
        bne  $t1, $zero, Loop1

Loop2:  lw   $t3, 0($t0)
        add  $t2, $t2, $t3
        addi $t0, $t0, 4
        bne  $t1, $t4, Loop2

Each state dissipates the same amount of energy, and states are executed at the rate of 1 GHz for both loops. Therefore, both loops expend energy at the same rate: E joules per nanosecond. Power is the rate of expenditure of energy, so the power dissipation is the same for both loops.
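The computation asked for in part c can be spelled out per loop iteration. The sketch below assumes the usual multicycle state counts (lw 5, add and immediate instructions 4, branches 3) and measures energy in units of E; the names `CYCLES`, `LOOP1`, `LOOP2`, and `power_in_units_of_E` are chosen for this example. Exact fractions are used so the comparison does not depend on floating-point rounding.

```python
from fractions import Fraction

# ASSUMPTION: states per instruction class on the multicycle datapath.
CYCLES = {"lw": 5, "alu": 4, "branch": 3}

LOOP1 = ["lw", "alu", "alu", "alu", "branch"]   # lw, add, addi, addi, bne
LOOP2 = ["lw", "alu", "alu", "branch"]          # lw, add, addi, bne

CLOCK_HZ = 10**9   # 1 GHz, so each state lasts 1 ns

def cycles_per_iteration(loop):
    """Total number of states executed in one pass through the loop."""
    return sum(CYCLES[kind] for kind in loop)

def power_in_units_of_E(loop):
    """Energy per iteration divided by time per iteration (watts per E)."""
    cycles = cycles_per_iteration(loop)
    energy = cycles                        # each state costs E joules
    seconds = Fraction(cycles, CLOCK_HZ)   # each state takes 1/f seconds
    return energy / seconds

print(cycles_per_iteration(LOOP1), power_in_units_of_E(LOOP1))  # 20 1000000000
print(cycles_per_iteration(LOOP2), power_in_units_of_E(LOOP2))  # 16 1000000000
```

Loop1 spends 20E joules in 20 ns and Loop2 spends 16E joules in 16 ns, so the cycle count cancels and both dissipate E x 10^9 watts. Loop1 uses more energy per iteration, but not more power.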