CPE 631 Advanced Computer Systems Architecture: Homework #2
Issued: 02/01/2006
Due: 02/15/2006

Q#1. (30 points) Evaluate the effectiveness of the blocking optimization for matrix multiplication on SRx machines.

A. Write a C subroutine for matrix multiplication MC = MA x MB:

void mm(double **ma, double **mb, double **mc, int d);

Matrices ma, mb, and mc are square, with d x d elements of type double. Let d be an input parameter of your main program, which initializes the matrices and calls the matrix multiplication subroutine.

B. Modify your program from part A to support the blocking optimization technique. Let b (the blocking factor) be an additional input parameter of your program.

C. Compare the performance of the base program from part A and the program from part B for different blocking factors b, assuming d = 128 (or 256). What is the optimal b for the solution with blocking?

Q#2. (30 points) Evaluate the effectiveness of the blocking optimization for matrix multiplication executing on a simulated machine. Using the SimpleScalar toolset for the PISA instruction set (sim-cache and sim-outorder simulators), repeat the measurements from Q#1 for a simulated computer system with the following characteristics: L1I (8KB, direct-mapped, 64B cache line) + L1D (8KB, direct-mapped, 64B cache line), L2U (256KB, 64B cache line, LRU replacement policy, 4-way set-associative).

UAH-ECE CPE 631: Homework #2 Page 1 of 6
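Hint for Q#1 (a sketch only, not the required solution): the baseline routine from part A and one possible blocked variant for part B might look as follows. Only the mm signature comes from the handout; the names alloc_matrix, free_matrix, mm_blocked, and min_int are hypothetical, and the loop order shown (blocking over the j and k dimensions) is one common textbook arrangement among several valid ones.

```c
#include <assert.h>
#include <stdlib.h>

/* Free the first `rows` rows of a matrix and the row-pointer array
   (hypothetical helper). */
void free_matrix(double **m, int rows)
{
    if (m == NULL) return;
    for (int i = 0; i < rows; i++)
        free(m[i]);
    free(m);
}

/* Allocate a zero-filled d x d matrix as an array of row pointers
   (hypothetical helper; the handout fixes only the mm signature). */
double **alloc_matrix(int d)
{
    double **m = malloc((size_t)d * sizeof *m);
    if (m == NULL) return NULL;
    for (int i = 0; i < d; i++) {
        m[i] = calloc((size_t)d, sizeof **m);
        if (m[i] == NULL) {
            free_matrix(m, i);   /* release only the rows allocated so far */
            return NULL;
        }
    }
    return m;
}

/* Part A baseline: mc = ma x mb, plain triple loop. */
void mm(double **ma, double **mb, double **mc, int d)
{
    for (int i = 0; i < d; i++) {
        for (int j = 0; j < d; j++) {
            double sum = 0.0;
            for (int k = 0; k < d; k++)
                sum += ma[i][k] * mb[k][j];
            mc[i][j] = sum;
        }
    }
}

static int min_int(int a, int b) { return a < b ? a : b; }

/* Part B sketch: same product, computed on b x b sub-blocks of mb so that
   a block can be reused while it is still resident in the cache. */
void mm_blocked(double **ma, double **mb, double **mc, int d, int b)
{
    assert(b > 0);

    /* Partial sums are accumulated into mc, so it must start at zero. */
    for (int i = 0; i < d; i++)
        for (int j = 0; j < d; j++)
            mc[i][j] = 0.0;

    for (int jj = 0; jj < d; jj += b) {
        for (int kk = 0; kk < d; kk += b) {
            int j_end = min_int(jj + b, d);   /* handles d not divisible by b */
            int k_end = min_int(kk + b, d);
            for (int i = 0; i < d; i++) {
                for (int j = jj; j < j_end; j++) {
                    double sum = 0.0;
                    for (int k = kk; k < k_end; k++)
                        sum += ma[i][k] * mb[k][j];
                    mc[i][j] += sum;
                }
            }
        }
    }
}
```

A main program would read d and b from the command line, fill ma and mb, call each routine, and time the calls. With b = d the blocked version performs the same work as the baseline, which gives a convenient sanity check before measuring.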
Q#3. (10 points) Cache basics

A. (3 points) Assume a computer with the following characteristics: a word is 32 bits, the addressable unit is a byte, and the data cache is 2-way set-associative with 4-word blocks and a total size of 512B. The replacement policy is LRU, the write policy is write-back, and on a write miss the block is loaded into the cache (write allocate). Determine the sizes of the Tag, Index, and Offset fields. Draw the structure of the cache memory (tags + data, status bits, LRU bits).

B. (7 points) Fill in the following table for the cache memory described above. Assume that the cache memory is empty at the beginning, and that each memory location contains its own address (e.g., Mem[0x0000 000C] = 0x0000 000C). All write actions write 0 to the specified locations.

CPU action   | Hit/Miss | Replacement [-/Yes (which block)] | Memory operation [Read/Write] + block address
-------------+----------+-----------------------------------+----------------------------------------------
Read 0x02    |          |                                   |
Read 0x40    |          |                                   |
Write 0x41   |          |                                   |
Write 0x1042 |          |                                   |
Write 0x1243 |          |                                   |

Draw the structure of the cache memory after the last CPU action is done.

Q#4. (10 points) Textbook A#2.

Q#5. (5 points) Textbook A#3.
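Hint for Q#3 (a sketch only): the field sizes asked for in part A follow from the cache geometry, and the same arithmetic splits any address from part B into its tag, index, and offset. The sketch below assumes 32-bit addresses; the handout states a 32-bit word and shows 8-hex-digit addresses but does not give the address width explicitly. All names (cache_fields, field_sizes, addr_parts, split_address, log2u) are hypothetical.

```c
#include <stdint.h>

/* Cache parameters taken from Q#3.A; ADDR_BITS is an assumption. */
enum {
    CACHE_BYTES = 512,  /* total data capacity */
    WORD_BYTES  = 4,    /* 32-bit word */
    BLOCK_WORDS = 4,    /* 4-word blocks */
    WAYS        = 2,    /* 2-way set-associative */
    ADDR_BITS   = 32    /* assumed address width */
};

typedef struct { unsigned offset_bits, index_bits, tag_bits; } cache_fields;
typedef struct { uint32_t tag, index, offset; } addr_parts;

/* Integer log2 for exact powers of two. */
static unsigned log2u(unsigned x)
{
    unsigned n = 0;
    while (x > 1) {
        x >>= 1;
        n++;
    }
    return n;
}

/* Derive the field widths from the geometry:
   offset selects a byte within a block, index selects a set,
   and the tag is whatever remains of the address. */
cache_fields field_sizes(void)
{
    unsigned block_bytes = BLOCK_WORDS * WORD_BYTES;
    unsigned sets = CACHE_BYTES / (block_bytes * WAYS);
    cache_fields f;
    f.offset_bits = log2u(block_bytes);
    f.index_bits  = log2u(sets);
    f.tag_bits    = ADDR_BITS - f.index_bits - f.offset_bits;
    return f;
}

/* Split a byte address into tag, set index, and block offset. */
addr_parts split_address(uint32_t addr)
{
    cache_fields f = field_sizes();
    addr_parts p;
    p.offset = addr & ((1u << f.offset_bits) - 1u);
    p.index  = (addr >> f.offset_bits) & ((1u << f.index_bits) - 1u);
    p.tag    = addr >> (f.offset_bits + f.index_bits);
    return p;
}
```

Calling split_address on each address in the part B table shows which set an access falls into, which is the starting point for deciding hits, misses, and LRU replacements by hand.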
Q#6. (15 points) MIPS Pipeline

Consider the following code fragment, assuming the MIPS integer pipeline where branches are resolved during the Instruction Decode stage. All memory accesses are cache hits. The initial value of R3 is R2 + 400. Branches are handled by freezing the pipeline.

loop: lw R1, 0(R2)

A. (5 points) Show the timing of this instruction sequence for the MIPS pipeline without any forwarding hardware. How many clock cycles does this loop take to execute?

B. (5 points) Show the timing of this instruction sequence for the MIPS pipeline with forwarding hardware. Mark all data forwarding paths using arrows directed from source to destination. How many clock cycles does this loop take to execute?

C. (5 points) Assuming the MIPS pipeline with delayed branch and forwarding hardware, schedule the instructions in the loop, including the branch delay slot. You may reorder instructions and modify individual instruction operands, but do not undertake other loop transformations. Show a pipeline diagram and compute the number of cycles needed to execute the entire loop. Mark all data forwarding paths using arrows directed from source to destination.

Solution:
A. [Pipeline timing diagram: clock cycles 1 to 40 across the top; rows for the 1st and 2nd loop iterations, each beginning with lw R1, 0(R2)]

Execution time:
B. [Pipeline timing diagram: clock cycles 1 to 40 across the top; rows for the 1st and 2nd loop iterations, each beginning with lw R1, 0(R2)]

Execution time:
C. [Pipeline timing diagram: clock cycles 1 to 40 across the top; rows for the 1st and 2nd loop iterations]

Execution time: