Prerequisite Quiz January 23, 2007 CS252 Computer Architecture and Engineering

University of California, Berkeley
College of Engineering
Computer Science Division, EECS
Spring 2007
John Kubiatowicz

This prerequisite quiz will be used in determining class admissions. Good Luck!

Your Name:
SID Number:
Discussion Section:

Problem    1    2    3    Total
Score

[ This page left for π ]
3.141592653589793238462643383279502884197169399375105820974944

Problem 1: Memory Hierarchy

Problem 1a: Below is a series of memory read references sent to a cache. The cache holds 128 bytes total. It has 2-word blocks (i.e. 64 bits), is 2-way set associative, and uses a least-recently-used replacement policy. Assume that the cache is initially empty. Classify each memory reference as a hit or a miss. Identify each cache miss as either compulsory, conflict, or capacity. One example is shown below. Feel free to use space in the margin as scratch.

    Address    Hit/Miss?    Miss Type?
    0x7        Miss         Compulsory
    0x4D
    0x2A
    0x79
    0xAB
    0xCE
    0x2E
    0x4B
    0x6D
    0x8A
    0xAF
    0x29
    0xC8
    0xCE
    0x6A

Problem 1b: Calculate the Miss Rate and the Hit Rate:
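A hand classification for Problem 1a can be cross-checked with a small simulator. The following is only a sketch under stated assumptions, not an official solution: it assumes byte addresses and 4-byte words (so a 2-word block is 8 bytes), and it uses the common convention that a miss is "compulsory" on the first reference to a block, "conflict" if a fully associative LRU cache of the same capacity would have hit, and "capacity" otherwise. The names `classify` and `REFS` are hypothetical.

```python
from collections import OrderedDict

BLOCK_BYTES = 8                          # assumption: 2 words x 4 bytes, byte addresses
CACHE_BYTES = 128
WAYS = 2
NUM_BLOCKS = CACHE_BYTES // BLOCK_BYTES  # 16 blocks in total
NUM_SETS = NUM_BLOCKS // WAYS            # 8 sets

# Reference stream from Problem 1a (the worked example 0x7 comes first).
REFS = [0x7, 0x4D, 0x2A, 0x79, 0xAB, 0xCE, 0x2E, 0x4B,
        0x6D, 0x8A, 0xAF, 0x29, 0xC8, 0xCE, 0x6A]


def touch(lru, block, capacity):
    """Access `block` in an LRU-ordered dict (oldest first), evicting if full."""
    if block in lru:
        lru.move_to_end(block)
        return
    if len(lru) == capacity:
        lru.popitem(last=False)          # evict the least recently used block
    lru[block] = True


def classify(addresses):
    """Return a list of (address, 'Hit' or 'Miss', miss type or None)."""
    sets = [OrderedDict() for _ in range(NUM_SETS)]
    fully_assoc = OrderedDict()          # reference cache used to tell conflict from capacity
    seen = set()                         # blocks referenced at least once
    results = []
    for addr in addresses:
        block = addr // BLOCK_BYTES
        lines = sets[block % NUM_SETS]
        if block in lines:
            outcome, kind = "Hit", None
        elif block not in seen:
            outcome, kind = "Miss", "Compulsory"
        elif block in fully_assoc:
            outcome, kind = "Miss", "Conflict"
        else:
            outcome, kind = "Miss", "Capacity"
        touch(lines, block, WAYS)
        touch(fully_assoc, block, NUM_BLOCKS)
        seen.add(block)
        results.append((addr, outcome, kind))
    return results


if __name__ == "__main__":
    for addr, outcome, kind in classify(REFS):
        print(f"{addr:#04x}  {outcome:4}  {kind or ''}")
```

The hit and miss rates for Problem 1b follow by counting the "Hit" entries in the returned list and dividing by the number of references.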

Problem 1c: Suppose you have a 32-bit processor, with a virtual-memory page size of 16K. The data cache is 32K in size with 32-byte cache blocks. Finally, your TLB has 4 entries. Assume that you wish to do TLB lookups in parallel with cache lookups.

Draw a block diagram of the data cache and TLB organization, showing a virtual address as input and both a physical address and data as output. Include cache hit and TLB hit output signals. Include as much information about the internals of the TLB and cache organization as possible. Include, among other things, all of the comparators in the system and any muxes as well. You can indicate RAM with a simple block, but make sure to label address widths and data widths. Make sure to use abstraction in your diagram so that we can understand it. Label the function of various blocks and the width of any buses.
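The bus widths asked for in Problem 1c come from simple bit-field arithmetic, which can be sketched as below. This is an illustration under assumptions rather than a prescribed answer: it assumes a virtually indexed, physically tagged cache (one common way to overlap TLB and cache lookups), in which the index and block-offset bits must fit inside the page offset, and it assumes 32-bit physical addresses, which the problem does not state. The helper names are hypothetical.

```python
ADDRESS_BITS = 32            # 32-bit virtual addresses (from the problem)
PHYSICAL_ADDRESS_BITS = 32   # assumption: not stated in the problem
PAGE_BYTES = 16 * 1024
CACHE_BYTES = 32 * 1024
BLOCK_BYTES = 32


def log2_exact(n):
    """log2 of a power of two, using integer arithmetic only."""
    assert n > 0 and n & (n - 1) == 0, "expected a power of two"
    return n.bit_length() - 1


def field_widths(ways):
    """Bit widths of the address fields for a cache with `ways` ways."""
    page_offset = log2_exact(PAGE_BYTES)
    block_offset = log2_exact(BLOCK_BYTES)
    index = log2_exact(CACHE_BYTES // (BLOCK_BYTES * ways))
    return {
        "page_offset_bits": page_offset,
        "vpn_bits": ADDRESS_BITS - page_offset,
        "block_offset_bits": block_offset,
        "index_bits": index,
        "tag_bits": PHYSICAL_ADDRESS_BITS - index - block_offset,
        # Parallel lookup needs the cache index to be untranslated bits.
        "index_fits_in_page_offset": index + block_offset <= page_offset,
    }


def min_ways_for_parallel_lookup():
    """Smallest power-of-two associativity whose index fits in the page offset."""
    ways = 1
    while not field_widths(ways)["index_fits_in_page_offset"]:
        ways *= 2
    return ways


if __name__ == "__main__":
    ways = min_ways_for_parallel_lookup()
    print("ways:", ways)
    for name, value in field_widths(ways).items():
        print(f"{name}: {value}")
```

The associativity returned here determines how many tag comparators the cache side of the diagram needs; the TLB side needs one comparator per entry if it is drawn as fully associative, which is itself an assumption.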

Problem 2: Pipelining

[Figure 1: A simple 5-stage pipeline. Labels visible in the figure: Next PC, Next SEQ PC, two adders (one adding 4), Zero?, Address, Memory, IF/ID, Reg File (RS1, RS2), Sign Extend (Imm), ID/EX, ALU, EX/MEM, Data Memory, MEM/WB, several MUXes, RD carried through the pipeline registers, WB Data.]

Problem 2a: Is it possible to have zero branch delay-slots without guessing? Explain carefully.

Problem 2b: What is a load delay-slot? Why does it exist in the above pipeline?

Problem 2c: What sort of logic would be involved in stalling so that the above pipeline will have correct behavior for instruction sequences like:

    lw   r2, 32(r19)    ; load value
    addi r3, r2, #3     ; add 3 to it.

Be general (get all sequences involving loads). Feel free to use pseudo-code to represent logic; data in registers can be named using name of register (i.e. "rs1 value in IF/ID = r0"):
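The kind of interlock Problem 2c asks about can be sketched in software as a load-use check between the instruction in decode and the instruction one stage ahead. This is a minimal sketch, not the expected answer: it assumes all other hazards are covered by forwarding so that only a load followed immediately by a consumer needs a one-cycle stall, and it assumes r0 is hardwired to zero as in DLX. The `Instr` record and `must_stall` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Instr:
    """Decoded fields of one instruction (illustrative, not a real DLX encoding)."""
    is_load: bool = False
    rd: Optional[int] = None    # destination register, None if the instruction writes none
    rs1: Optional[int] = None   # first source register, None if unused
    rs2: Optional[int] = None   # second source register, None if unused


def must_stall(in_id: Instr, in_ex: Optional[Instr]) -> bool:
    """True if the instruction in ID must wait one cycle for a load now in EX."""
    if in_ex is None or not in_ex.is_load:
        return False
    if in_ex.rd is None or in_ex.rd == 0:
        return False            # assumption: writes to r0 are discarded
    return in_ex.rd in (in_id.rs1, in_id.rs2)


if __name__ == "__main__":
    lw = Instr(is_load=True, rd=2, rs1=19)    # lw   r2, 32(r19)
    addi = Instr(rd=3, rs1=2)                 # addi r3, r2, #3
    print(must_stall(addi, lw))
```

When the check is true, hardware would typically hold the PC and the IF/ID register and inject a bubble into ID/EX; those details are left out of the sketch.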

The code sequence below is written in DLX assembly. Assume that it will execute on a pipeline similar to that in Figure 1, which includes a multiplier in the ALU. In addition, assume that there is one branch delay slot and that load instructions stall if necessary to get correct execution.

     0          addi  r5, r0, #50     ; reset sum to max
     4          lw    r1, 16(r19)     ; load base address from stack
     8          lw    r2, 32(r19)     ; load number of iterations
    12   loop:  lw    r3, 0(r1)       ; load data x
    16          lw    r4, 4(r1)       ; load data y
    20          mulu  r7, r3, r4      ; multiply them together
    24          subu  r5, r5, r7      ; decrement result
    28          addi  r1, r1, 8       ; increment base by 2 words
    32          addi  r2, r2, -1      ; decrement count
    36          bnez  r2, loop        ; branch to next iteration
    40          noop                  ; do nothing
    44   exit:  hcf                   ; Halt and catch fire (exit)

Problem 2d: Assuming that multiplications take 2 cycles to compute, but we otherwise utilize the pipeline shown in Figure 1, how many cycles will the above code take for each iteration of the loop? Explain.

Problem 2e: Rearrange the instructions above to shorten the number of cycles/iteration:

Problem 2f: Draw the forwarding logic and control logic required to handle the following sequence without stalling in the pipeline of Figure 1:

    lw   r1, 16(r2)
    sw   r1, 32(r2)

Problem 2g: Would the existence of the logic from 2f change your answer for 2c?


Problem 3: State Machine Control

In this problem, you must design a six-state finite state machine (FSM) that implements a Gray Code Counter with exceptions. There are 4 count states (0-3) and 2 exception states (Overflow, Underflow). The counter only counts when the COUNT signal is asserted.

When counting up (the UP signal is asserted), the counter will count as follows:

    Underflow, 0, 1, 3, 2, Overflow, Overflow, Overflow

When counting down (the UP signal is not asserted), the counter will count as follows:

    Overflow, 2, 3, 1, 0, Underflow, Underflow, Underflow

The RESET signal will take the counter to the 0 state.

Problem 3a: Complete the following State Transition Diagram for the Gray-Code counter:

[State transition diagram skeleton with six states: Uf, 0, 1, 2, 3, Of]
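The counting behavior described for Problem 3 can be written down as a behavioral next-state function, which is useful for checking a transition diagram or table drawn by hand. This is a sketch under assumptions: states are represented by the strings used in the diagram, and RESET is assumed to take priority over COUNT, which the problem does not spell out. The names `UP_ORDER` and `next_state` are hypothetical.

```python
# States in the order visited when counting up; counting down walks it backwards.
UP_ORDER = ["Uf", "0", "1", "3", "2", "Of"]


def next_state(state, count, up, reset=False):
    """Behavioral next-state function for the Gray-code counter with exceptions."""
    if reset:
        return "0"              # assumption: RESET overrides COUNT and UP
    if not count:
        return state            # the counter holds when COUNT is not asserted
    i = UP_ORDER.index(state)
    if up:
        return UP_ORDER[min(i + 1, len(UP_ORDER) - 1)]   # saturate at Of
    return UP_ORDER[max(i - 1, 0)]                       # saturate at Uf


def run(start, steps, up):
    """List the states visited from `start` over `steps` counting cycles."""
    states = [start]
    for _ in range(steps):
        states.append(next_state(states[-1], count=True, up=up))
    return states


if __name__ == "__main__":
    print(run("Uf", 7, up=True))
    print(run("Of", 7, up=False))
```

Enumerating `next_state` over all six states and both values of COUNT and UP gives the rows of a symbolic transition table; the bit-level encoding is left open, as in the problem.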

Problem 3b: Construct a State Transition Table for this FSM. Encode the state as 3 bits: S_r S_1 S_0. Here, S_r indicates out of range: S_r = 1 means either Uf or Of, depending on the values of the other bits. The other two bits (assuming that S_r = 0) indicate the value, where S_1 is the MSB (i.e. S_r S_1 S_0 = 010 for state 2). Ignore RESET. Feel free to encode the out-of-range states in any way that you like (hint: the more don't-care states, the easier it is to encode):

Problem 3c: Derive Next-State Logic Equations, given your state transition table. Include the RESET signal in your equations. You will have 3 equations. Simplify these as much as possible (i.e. combine together terms as much as possible). Show your work. Hint: Separate the COUNT=0 and COUNT=1 cases, derive separately, then combine.

    S_r =

    S_1 =

    S_0 =

Problem 3d: Draw logic for the S_0 state bit, including (1) all inputs, (2) the flip-flop that stores the state, (3) the clock input.
