Computer Organization and Architecture (CSCI-365) Sample Final Exam

NAME:                    STUDENT NUMBER:

1. Consider a computer system with 64 Kbytes of main memory and a 256-byte cache. Assume the cache line size is 8 bytes. (15 points)

   1. Show the format of the main memory address for direct mapping, associative mapping, and 2-way set-associative mapping, respectively.
   2. If direct mapping is used, into which cache line would the bytes with each of the following byte addresses be stored: 0001 0001 0001 1011 and 1010 1010 1010 1010?
   3. If direct mapping is used, suppose the byte with byte address 0001 1111 0001 1011 is stored in the cache. What are the addresses of the other bytes stored along with it?
   4. If direct mapping is used, why is the tag also stored in the cache?
   5. Draw a diagram of the cache organization and show briefly how the different fields of the address are interpreted. Assume direct mapping is still used.
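For self-checking answers to question 1, the following is a minimal Python sketch of how the address fields can be derived from the sizes given in the question. The function names (`log2_exact`, `address_fields`, `direct_mapped_line`) are illustrative and not part of the exam; the sketch assumes byte addressing and power-of-two sizes.

```python
def log2_exact(n):
    """Return log2(n) for a positive power of two."""
    assert n > 0 and n & (n - 1) == 0, "size must be a power of two"
    return n.bit_length() - 1

def address_fields(memory_bytes, cache_bytes, line_bytes, ways):
    """Return (tag_bits, index_bits, offset_bits) for a cache with `ways` lines per set."""
    address_bits = log2_exact(memory_bytes)
    offset_bits = log2_exact(line_bytes)
    num_lines = cache_bytes // line_bytes
    num_sets = num_lines // ways
    index_bits = log2_exact(num_sets)
    tag_bits = address_bits - index_bits - offset_bits
    return tag_bits, index_bits, offset_bits

# Sizes taken from the question: 64 Kbytes memory, 256-byte cache, 8-byte lines.
MEMORY, CACHE, LINE = 64 * 1024, 256, 8

direct = address_fields(MEMORY, CACHE, LINE, ways=1)
two_way = address_fields(MEMORY, CACHE, LINE, ways=2)
# Fully associative: every line belongs to one single set.
associative = address_fields(MEMORY, CACHE, LINE, ways=CACHE // LINE)

def direct_mapped_line(address):
    """Return the cache line number a byte address maps to under direct mapping."""
    _, index_bits, offset_bits = direct
    return (address >> offset_bits) & ((1 << index_bits) - 1)

if __name__ == "__main__":
    print("direct (tag, line, offset):", direct)
    print("2-way (tag, set, offset):", two_way)
    print("associative (tag, -, offset):", associative)
    print("line for 0001 0001 0001 1011:", direct_mapped_line(0b0001000100011011))
```

The same function covers all three mappings because direct mapping and associative mapping are the two extremes of set-associative mapping (one line per set, and one set holding every line).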

2. Convert the C function below to MIPS assembly language. Make sure that your assembly language code could be called from a standard C program; that is, make sure you follow the MIPS calling conventions (see Figure 2.14 on page 121). (10 points)

       unsigned int sum(unsigned int n)
       {
           if (n == 0)
               return 0;
           else
               return n + sum(n-1);
       }

3. Suppose we are running an application program that requires 32 Gbytes of virtual memory space. The available physical memory space is 128 Mbytes. If we plan to use a page size of 16 Kbytes, describe the virtual and physical address formats. Use a figure or text to illustrate briefly the format of the page table and the address translation from a virtual address to a physical address. (15 points)
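For self-checking answers to question 3, here is a minimal Python sketch of the field-width arithmetic and of a single-level page table lookup. It assumes byte addressing and a flat (single-level) page table modeled as a dictionary; the names `translate` and `page_table` are hypothetical, and a real page table entry would also carry valid and protection bits.

```python
def log2_exact(n):
    """Return log2(n) for a positive power of two."""
    assert n > 0 and n & (n - 1) == 0, "size must be a power of two"
    return n.bit_length() - 1

KIB, MIB, GIB = 2**10, 2**20, 2**30

# Sizes taken from the question.
virtual_bits = log2_exact(32 * GIB)     # width of a virtual address
physical_bits = log2_exact(128 * MIB)   # width of a physical address
offset_bits = log2_exact(16 * KIB)      # page offset, shared by both formats

vpn_bits = virtual_bits - offset_bits     # virtual page number
frame_bits = physical_bits - offset_bits  # physical frame number

def translate(virtual_address, page_table):
    """Map a virtual address to a physical address.

    `page_table` maps virtual page numbers to frame numbers; a missing
    entry stands in for a page fault (assumption of this sketch).
    """
    vpn = virtual_address >> offset_bits
    offset = virtual_address & ((1 << offset_bits) - 1)
    if vpn not in page_table:
        raise LookupError(f"page fault on virtual page {vpn}")
    return (page_table[vpn] << offset_bits) | offset

if __name__ == "__main__":
    print("virtual address:", vpn_bits, "page bits +", offset_bits, "offset bits")
    print("physical address:", frame_bits, "frame bits +", offset_bits, "offset bits")
    print(hex(translate((2 << offset_bits) | 0x123, {2: 5})))
```

The offset passes through translation unchanged; only the page number is replaced by the frame number found in the table.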

4. Consider a sequence of nine instructions with no hazards, and show how it passes through a two-stage pipeline and a six-stage pipeline, respectively (draw a figure for each of the two situations). How many clock cycles are needed in each case? What is the speedup? In which case is the execution time of the sequence shorter? Can we conclude that increasing the number of stages always increases performance? Justify your answer. (10 points)

5. What are pipeline hazards? Briefly describe three types of hazards and their causes. Briefly describe how the resulting penalties can be reduced. (7 points)
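For self-checking the cycle counts in question 4, the usual textbook relation for a hazard-free pipeline is that n instructions need k + n - 1 cycles on k stages. The sketch below is a minimal Python illustration under that assumption; the function names are illustrative, and the speedup function assumes the unpipelined machine spends k cycles of the same length on every instruction.

```python
def pipeline_cycles(num_instructions, num_stages):
    """Cycles to run a hazard-free sequence: fill the pipe, then one result per cycle."""
    return num_stages + num_instructions - 1

def speedup_over_unpipelined(num_instructions, num_stages):
    """Speedup versus executing each instruction in num_stages cycles, one at a time."""
    unpipelined = num_instructions * num_stages
    return unpipelined / pipeline_cycles(num_instructions, num_stages)

if __name__ == "__main__":
    for stages in (2, 6):
        cycles = pipeline_cycles(9, stages)
        speedup = speedup_over_unpipelined(9, stages)
        print(f"{stages} stages: {cycles} cycles, speedup {speedup:.2f}")
```

Cycle counts alone do not settle which execution time is shorter: that comparison also depends on the clock period of each design, which this sketch deliberately leaves out.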

6. Computer Arithmetic (28 points in total)

Given the following numbers, convert each to decimal, binary, hexadecimal, and octal (6 points):

    DECIMAL     BINARY      HEXADECIMAL     OCTAL
    37.625
                1110
                            3A.2D
                                            1212

Find the sum of each of the following pairs of two's complement numbers (one byte is used for storage). Indicate whether an error has occurred. Convert each number to its decimal equivalent to check your results. (2 points)

      11111111          01001011
    + 11111100        + 01010011

Given x = -9 and y = 3, compute the product p = x * y in two's complement notation using Booth's algorithm. (5 points)
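For checking hand-worked results on the two's complement sums and the Booth multiplication, the following is a minimal Python sketch. The function names are illustrative, and the register width passed to `booth_multiply` is a choice of this sketch (5 bits is the smallest width that holds -9). The Booth routine uses an accumulator of the same width as the operands, so it does not handle a multiplicand equal to the most negative value of that width.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value

def add_twos_complement(a, b, bits=8):
    """Add two bit patterns; return (result pattern, overflow flag)."""
    result = (a + b) & ((1 << bits) - 1)
    sa, sb, sr = to_signed(a, bits), to_signed(b, bits), to_signed(result, bits)
    # Overflow: operands share a sign and the result has the other sign.
    overflow = (sa < 0) == (sb < 0) and (sr < 0) != (sa < 0)
    return result, overflow

def booth_multiply(multiplicand, multiplier, bits):
    """Booth's algorithm with registers A, Q, Q-1 and M, each `bits` wide."""
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    m = multiplicand & mask
    a, q, q_minus1 = 0, multiplier & mask, 0
    for _ in range(bits):
        pair = (q & 1, q_minus1)
        if pair == (1, 0):        # start of a run of ones: A = A - M
            a = (a - m) & mask
        elif pair == (0, 1):      # end of a run of ones: A = A + M
            a = (a + m) & mask
        # Arithmetic shift right of the combined register A, Q, Q-1.
        q_minus1 = q & 1
        q = (q >> 1) | ((a & 1) << (bits - 1))
        a = (a >> 1) | (a & sign_bit)
    return to_signed((a << bits) | q, 2 * bits)

if __name__ == "__main__":
    for x, y in ((0b11111111, 0b11111100), (0b01001011, 0b01010011)):
        result, overflow = add_twos_complement(x, y)
        print(f"{x:08b} + {y:08b} = {result:08b}, overflow={overflow}")
    print("Booth product:", booth_multiply(-9, 3, bits=5))
```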

Show the IEEE 754 single-precision binary representation of the floating-point number -0.75 (decimal). What decimal value does the IEEE 754 single-precision bit pattern 1011 1110 1001 0000 0000 0000 0000 0000 represent? (5 points)

Calculate A x (B + C), assuming A, B, and C are in the 16-bit NVIDIA format with 1 guard bit, 1 round bit, and 1 sticky bit. (10 points)

    A = 1.5234375 x 10^-1,  B = 2.0703125 x 10^-1,  C = 9.96875 x 10^1
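Hand-derived encodings can be cross-checked with Python's standard `struct` module, which packs floats in IEEE 754 formats. The sketch below is a minimal illustration with illustrative helper names. For the 16-bit part it assumes that the exam's 16-bit format matches IEEE 754 half precision (1 sign bit, 5 exponent bits, 10 fraction bits), which `struct` exposes as the `"e"` format code, and that rounding is round-to-nearest-even after each operation.

```python
import struct

def float_to_bits(value):
    """Return the 32-bit IEEE 754 single-precision pattern of value as an int."""
    (raw,) = struct.unpack(">I", struct.pack(">f", value))
    return raw

def bits_to_float(raw):
    """Return the value encoded by a 32-bit single-precision pattern."""
    (value,) = struct.unpack(">f", struct.pack(">I", raw))
    return value

def split_fields(raw):
    """Split a single-precision pattern into (sign, biased exponent, fraction)."""
    return raw >> 31, (raw >> 23) & 0xFF, raw & 0x7FFFFF

def to_half(value):
    """Round value to the nearest half-precision number (assumed 16-bit format)."""
    (rounded,) = struct.unpack(">e", struct.pack(">e", value))
    return rounded

if __name__ == "__main__":
    print(f"{float_to_bits(-0.75):032b}")
    print(bits_to_float(0b10111110100100000000000000000000))

    a, b, c = 1.5234375e-1, 2.0703125e-1, 9.96875e1
    partial_sum = to_half(b + c)          # round after the addition
    product = to_half(a * partial_sum)    # round again after the multiplication
    print(partial_sum, product)
```

Rounding after each step mirrors what the guard, round, and sticky bits achieve in a hand calculation; rounding only once at the end could give a different result.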

7. Consider a superscalar processor organization with 3 processing stages, in which two instructions can be fetched and decoded at a time, three functional units (floating-point unit, ADDF; integer adder, ADD; integer multiplier, MUL) can work in parallel, and two instructions can be written back (completed) at a time. Assume I1 requires two cycles to execute. All other pipeline stages take one cycle. Consider the following program to be executed on this processor:

    I1: ADDF R12, R13, R14
    I2: ADD  R1, R8, R9
    I3: MUL  R4, R2, R3
    I4: MUL  R5, R6, R7
    I5: ADD  R10, R5, R7
    I6: ADD  R11, R2, R3
    I7: MUL  R5, R9, R2

   1. What dependencies exist in the program? Specify them. (3 points)
   2. Which of them can be resolved by register renaming? (1 point)
   3. Which data dependencies must this superscalar processor take into account when using in-order issue with in-order completion? (1 point)
   4. Which data dependencies must this superscalar processor take into account when using out-of-order issue with out-of-order completion? (1 point)
   5. Show the pipeline activities on the processor using in-order issue with in-order completion. (3 points)
   6. Show the pipeline activities on the processor using in-order issue with out-of-order completion. (3 points)
   7. Show the pipeline activities on the processor using out-of-order issue with out-of-order completion. (3 points)
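For cross-checking a hand analysis of the data dependencies in question 7, here is a minimal Python sketch that compares every ordered pair of instructions. It assumes the first operand of each instruction is the destination register and the remaining two are sources; the names `PROGRAM` and `find_data_dependencies` are illustrative. The pairwise check ignores intervening writes to the same register and does not model resource conflicts on the functional units.

```python
# (label, opcode, destination, sources), transcribed from the question.
PROGRAM = [
    ("I1", "ADDF", "R12", ("R13", "R14")),
    ("I2", "ADD", "R1", ("R8", "R9")),
    ("I3", "MUL", "R4", ("R2", "R3")),
    ("I4", "MUL", "R5", ("R6", "R7")),
    ("I5", "ADD", "R10", ("R5", "R7")),
    ("I6", "ADD", "R11", ("R2", "R3")),
    ("I7", "MUL", "R5", ("R9", "R2")),
]

def find_data_dependencies(program):
    """Return (kind, earlier, later, register) for each dependent pair."""
    found = []
    for i, (early, _, early_dest, early_srcs) in enumerate(program):
        for late, _, late_dest, late_srcs in program[i + 1:]:
            if early_dest in late_srcs:      # later reads what earlier writes
                found.append(("RAW", early, late, early_dest))
            if late_dest in early_srcs:      # later overwrites what earlier reads
                found.append(("WAR", early, late, late_dest))
            if early_dest == late_dest:      # both write the same register
                found.append(("WAW", early, late, early_dest))
    return found

if __name__ == "__main__":
    for kind, early, late, register in find_data_dependencies(PROGRAM):
        print(f"{kind}: {early} -> {late} on {register}")
```

Only RAW pairs are true data dependencies; WAR and WAW pairs arise from reusing a register name, which is why they are the ones register renaming can remove.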