
University of Calgary
Department of Electrical and Computer Engineering
ENCM 369: Computer Organization
Instructors: Dr. S. A. Norman (L01) and Dr. S. Yanushkevich (L02)

Winter 2002 FINAL EXAMINATION

Note for Winter 2005 students: In order to save paper, this document has been reformatted to leave out the spaces for answers. There will be spaces for answers on the Winter 2005 question paper.

Instructions

Please note that the official University of Calgary examination regulations are printed on page 1 of the Examination Regulations and Reference Material booklet that accompanies this examination paper. All of those regulations are in effect for this examination, except that you must write your answers on the question paper, not in the examination booklet.

You may not use electronic calculators or computers during the examination.

The examination is closed-book. You may not refer to books or notes during the examination, with one exception: you may refer to the Examination Regulations and Reference Material booklet that accompanies this examination paper.

You are not required to add comments to assembly language code you write, but you are strongly encouraged to do so, because writing good comments will improve the probability that your code is correct and will help you to check your code after it is finished.

Some problems are relatively easy and some are relatively difficult. Go after the easy marks first.

Write all answers on the question paper and hand in the question paper when you are done. Please do not hand in the Examination Regulations and Reference Material booklet.

Please print or write your answers legibly. What cannot be read cannot be marked.

If you write anything you do not want marked, put a large X through it and write rough work beside it.

You may use the backs of pages for rough work.

PROBLEM 1 (total of 20 marks)

Part a (12 marks). Write a SPIM translation of the procedure f in the translation unit listed below. Use only instructions from the Final Examination Instruction Subset described in the Examination Regulations and Reference Material booklet. Follow the calling conventions used in lectures and labs, and observe the following additional conventions regarding floating-point registers: the return value goes in $f0; the argument to root goes in $f12; $f2, $f4, ..., $f10 may be used as temporaries (like $t0-$t9); and $f20, $f22, ..., $f30 may be used as local variables (like $s0-$s7).

    double root(double x);

    double f(const double *a, const double *b, int n)
    {
        int i;
        double dot, sum_a, sum_b;
        dot = 0.0;
        sum_a = 0.0;
        sum_b = 0.0;
        for (i = 0; i < n; i++) {
            dot += a[i] * b[i];
            sum_a += a[i] * a[i];
            sum_b += b[i] * b[i];
        }
        if (dot < 0.0)
            dot = -dot;
        return dot / (root(sum_a) * root(sum_b));
    }

Part b (4 marks). Consider the following bit pattern:

    1100_0000_1010_1000_0000_0000_0000_0000

According to the IEEE 754 single-precision standard, what number does the bit pattern represent? Show your work, and express your answer as a base ten number.

Part c (4 marks). What is the IEEE 754 single-precision representation of the base ten number 0.0703125? Express your answer in base two. Show your work. (Hint: 0.0703125 = 1/16.0 + 1/128.0.)

PROBLEM 2 (10 marks)

Consider the C translation unit printed below. Write a SPIM translation of the procedure foo. Follow the calling conventions used in lectures and labs, and use only instructions from the Final Examination Instruction Subset described in the Examination Regulations and Reference Material booklet. Your code does not need to check that j or k are nonzero before they are used in division operations. (You may recall from lectures that some C compilers set up checks for division-by-zero before trying integer division.)

    int quux(int x);

    int foo(int a, int b, int *p)
    {
        int j, k;
        k = quux(a);
        j = quux(*p);
        *p = a / k;
        return b % j;
    }

PROBLEM 3 (total of 14 marks)

This problem concerns the multicycle processor design of Chapter 5 of the course textbook. The complete datapath and set of control signals is shown here:

[Students in 2005 and later please refer to Fig. 5.28 from Patterson and Hennessy.]

Part a (8 marks). The processor takes four clock cycles to execute an slt instruction. The finite state machine (FSM) that was designed for this processor passes through states 0, 1, 6 and 7 in these four clock cycles. Complete the following table to show what the control signals should be in states 0, 6, and 7. You do not need to have memorized the FSM to do this problem! If you understand how the processor works, you should be able to deduce the correct control signal values.

The ALUOp encoding is as follows:

    00 tells the ALU Control to tell the ALU to add;
    01 tells the ALU Control to tell the ALU to subtract;
    10 tells the ALU Control to select the ALU operation based on bits 5-0 of the instruction.

Use X to indicate that a signal is a don't care in a particular state.

    signal         state 0    state 1    state 6    state 7
    RegDst                    X
    RegWrite                  0
    ALUSrcA                   0
    MemRead                   0
    MemWrite                  0
    MemtoReg                  X
    IorD                      X
    IRWrite                   0
    PCWrite                   0
    PCWriteCond               0
    ALUOp                     00
    ALUSrcB                   11
    PCSource                  XX

Part b (6 marks). Suppose that a new datapath element is available to support logical shift operations:

[Figure: a Shifter block with a 1-bit LeftRight input, a 5-bit Count input, a 32-bit DataIn input, and a 32-bit Result output.]

The 1-bit signal LeftRight is 1 for a left shift and 0 for a right shift. The 5-bit signal Count represents a shift count in the range from 0 to 31. The uses of the 32-bit input DataIn and 32-bit output Result should be obvious.

The format of the sll instruction is the following, in base 2:

    000000_00000_sssss_ddddd_ccccc_000000

sssss is the bit pattern that specifies the source register, ddddd is the bit pattern that specifies the destination register, and ccccc is the bit pattern that specifies the shift count. Note that bits 31-26 are also 000000 for add, sub, and, or and slt.

Describe how the multicycle machine can be extended to support the sll instruction. Your answer must deal with both datapath and control, and must make it clear that all of the other instructions will still be executed correctly. Use text and/or diagrams to explain your answer.

PROBLEM 4 (total of 12 marks)

Part a (3 marks). Consider a MIPS-like computer system with 8-bit bytes, 32-bit words, and 32-bit addresses; the system does not use virtual memory. Suppose a data cache for this system is direct-mapped, has a four-word block size, and holds 2^14 bytes of data. Show how a data address would be broken into index, tag, byte offset and block offset.

Part b (2 marks). Suppose the following sequence of instructions is executed on the system of part a:

    lui   $t1, 0x1001
    ori   $t1, $t1, 0x0004
    lw    $t0, ($t1)
    lb    $t2, -1($t1)

If there is a miss in the data cache on the lw instruction, how many words will be read into the cache from main memory? Briefly give a reason for your answer.

Part c (2 marks). In the code of part b, will there be a hit or a miss in the data cache on the lb instruction? Or is it not possible to decide with the information given? Briefly show how you obtained your answer.

Part d (3 marks). Suppose the cache of part a is replaced with a two-way set-associative cache, with a four-word block size, holding 2^14 bytes of data. Make a diagram of this cache, showing the organization of valid bits, tags, and data words. (Leave out circuitry such as comparators and multiplexors.)

Part e (2 marks). Which cache is likely to have a lower miss rate, that of part a or that of part d? Give a brief justification of your answer.

PROBLEM 5 (total of 9 marks)

Part a (3 marks). Give a brief definition of the term data hazard as it relates to pipelining. Then briefly describe one strategy used by processor designers to minimize stalls due to data hazards.

Part b (3 marks). Give a brief definition of the term control hazard as it relates to pipelining. Then briefly describe one strategy used by processor designers to minimize stalls due to control hazards.

Part c (3 marks). Consider the following procedure written in SPIM assembly language:

            .text
            .globl proc
    proc:   addu   $t0, $a0, $zero
            beq    $a1, $zero, L1
            lw     $t0, ($a1)
    L1:     addiu  $t0, $t0, 7
            srl    $v0, $t0, 3
            jr     $ra

By adding no more than one nop instruction and reordering some existing instructions, rewrite the procedure in MIPS assembly language, so that it would work correctly with MIPS branch and load delays. Assume that the MIPS assembler does not automatically reorder instructions or insert nop instructions. (Reminder: The branch delay rule applies to jr instructions in the same way it applies to branch instructions.)

PROBLEM 6 (total of 9 marks)

Consider a MIPS-like computer system with virtual memory. Physical and virtual addresses are both 32 bits in size. Words consist of four 8-bit bytes. The computer has two TLBs: one for instructions and one for data. Each TLB has room for 64 translations from virtual to physical page numbers. The size of a single page is 2^14 bytes.

Part a (2 marks). Somebody tells you that for a particular process, the instruction at address 0x0040_010c is at physical address 0x07fe_210c. Explain why that cannot possibly be true.

Part b (4 marks). Suppose that for a particular process, the word with virtual address 0x7fff_ff04 is at physical address 0x09b7_ff04. Suppose the $sp register contains 0x7fff_ff00 and the following instruction is executed:

    lw    $t0, 4($sp)

Suppose there is a miss in the data TLB when this instruction is executed. Describe the steps taken by the data TLB and the operating system kernel to ensure that the instruction is executed successfully.

Part c (3 marks). A page fault may occur as a result of a TLB miss; a page fault is a sequence of events that is different from what happens in part b. How much time will it take the operating system kernel to handle a page fault compared to handling the TLB miss of part b? The same amount of time? A small amount longer? Or much longer? Give a good reason for your answer.

PROBLEM 7 (total of 8 marks)

Part a (4 marks). What does the term overflow mean when applied to integer subtraction? Give an example of 8-bit integer subtraction that results in overflow.

Part b (4 marks). Suppose a computer is big-endian and uses the SPIM instruction set. What will be in $t0 and $t1 after the following sequence of instructions is run?

    lui   $t2, 0x89ab
    ori   $t2, 0xcdef
    sw    $t2, 0($sp)
    lui   $t2, 0x1234
    ori   $t2, 0x5678
    sw    $t2, 4($sp)
    lb    $t0, 1($sp)
    lb    $t1, 7($sp)

Assume that $sp and $sp+4 are addresses of stack slots that are being used as local variables. Show how you obtained your answer.

PROBLEM 8 (10 marks)

Consider the 32-bit IEEE 754 floating-point type. Some values of this type, such as 12345.0 and 3.0, can also be represented as 32-bit two's-complement integers. Others can't; for example, 2.25 can't be represented as a 32-bit two's-complement integer, because it is not an integer at all, and 3,000,000,000.0 can't be represented as a 32-bit two's-complement integer because it is too large.

Write a SPIM procedure to match the following C interface:

    int has_int_rep(unsigned f);

The argument f is assumed to hold the bit pattern of an IEEE 754 single-precision number. The return value should be 1 if the number represented by the floating-point bit pattern can also be represented as a 32-bit two's-complement integer, and 0 otherwise.

Follow the calling conventions used in lectures and labs. Use only integer instructions from the Final Examination Instruction Subset described in the Examination Regulations and Reference Material booklet, plus two additional instructions:

    sllv rdest, rsrc1, rsrc2
        shift value of register rsrc1 left by count taken from register rsrc2, filling with 0's from the right; put result in register rdest. Shift count in rsrc2 must be ≥ 0 and ≤ 31.

    srlv rdest, rsrc1, rsrc2
        shift value of register rsrc1 right by count taken from register rsrc2, filling with 0's from the left; put result in register rdest. Shift count in rsrc2 must be ≥ 0 and ≤ 31.

IMPORTANT: Use integer instructions and registers only! If you can't find an algorithm that always works, you can get partial credit for code that works for at least some argument values.