b) Register renaming c) CDB, register file, and ROB d) 0,1,X (output of a gate is never Z)


Avice Francis
5 years ago

1) a) Issuing stores to memory and (maybe) writing results back to the register file (this depends on whether we have a distinct physical register file). Instruction dispatch is usually done in program order, but that order can be violated (e.g. during branch mispredicts) without violating precise state.
b) Register renaming
c) CDB, register file, and ROB
d) 0, 1, X (the output of a gate is never Z)

2) a) Asynchronous interrupts (e.g. from an I/O device) can be deferred until later, so the pipeline can pause instruction fetch and let all in-flight instructions finish naturally. A synchronous exception is tied to a specific instruction and must be handled immediately, squashing all later instructions. So exceptions generally have an equal or greater penalty.
b) The instruction cache can only supply one instruction per cycle, so there's no way to consistently sustain 3 instructions every cycle (unless there's some sort of fetch buffer that can hold a tight program loop).

3) a) Base CPI is 1. One cycle of delay occurs for every load followed by a dependent instruction, plus 3 cycles for every mispredicted (i.e. taken) branch. Summing these terms gives
CPI_baseline = 1.45
IPC = 1 / CPI_baseline ≈ 0.7
b) 3 new sources of delay:
- A 2-cycle arithmetic instruction followed by a dependent instruction results in 1 cycle of delay
- A load followed by a dependent instruction now results in 2 cycles of delay, not 1
- A load followed by a dependent instruction two instructions later adds one cycle of delay (but make sure not to double count if the previous bullet also applies)
CPI_new = 1.5 + 0.18X
where X is the fraction of ALU operations with MSBs not set to 0. In order to have a speedup, we need CPI_new / 1.2 < CPI_baseline:

1.5 + 0.18X < 1.45 × 1.2 = 1.74
0.18X < 0.24
X < 1.33
Since X is a fraction and cannot exceed 1, there will be a speedup regardless of X.
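The inequality above can be checked numerically. The following is a minimal sketch, assuming (as the condition CPI_new / 1.2 < CPI_baseline suggests) that 1.2 is the ratio of the new clock frequency to the baseline clock frequency. The function and constant names are hypothetical and not part of the original solution.

```python
CPI_BASELINE = 1.45  # result of part 3a
CLOCK_RATIO = 1.2    # assumption: new clock frequency / baseline clock frequency


def cpi_new(x):
    """CPI of the modified pipeline, where x is the fraction X from part 3b."""
    return 1.5 + 0.18 * x


def speedup(x):
    """Baseline time per instruction divided by new time per instruction."""
    # Time per instruction is CPI / frequency, so the ratio simplifies to this.
    return (CPI_BASELINE * CLOCK_RATIO) / cpi_new(x)


# Value of X at which the two designs would take the same time per instruction.
break_even_x = (CPI_BASELINE * CLOCK_RATIO - 1.5) / 0.18

for x in (0.0, 0.5, 1.0):
    print(f"X = {x:.1f}: CPI_new = {cpi_new(x):.2f}, speedup = {speedup(x):.3f}")
print(f"break-even X = {break_even_x:.2f}")
```

Because the break-even point lies above 1, every legal value of X (0 to 1) yields a speedup greater than 1, with the smallest gain at X = 1.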

c) This question is vague without specifying X from above, but any reasonable calculation of an average will be accepted here. Note that although we're dealing with throughputs, the harmonic mean should not be used: the amount of time each processor runs is fixed, so we don't need to weight them differently. The arithmetic mean is most appropriate.

4) a) Outputs shown in parentheses

b)
typedef enum logic [1:0] { s_, s_1, s_10, s_101 } STATE;

module FSM (
    input  clock, reset,
    input  A,
    output B
);
    STATE state, next_state;

    // Moore output: asserted only while in state s_101
    assign B = (state == s_101);

    // Next-state logic
    always_comb begin
        next_state = state;
        case (state)
            s_:    next_state = A ? s_1   : s_;
            s_1:   next_state = A ? s_1   : s_10;
            s_10:  next_state = A ? s_101 : s_;
            s_101: next_state = A ? s_1   : s_10;
        endcase
    end

    // State register with synchronous reset
    always_ff @(posedge clock) begin
        if (reset) state <= s_;
        else       state <= next_state;
    end
endmodule
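As a sanity check on the transition table in the module above, the FSM can be mirrored in a small behavioural model. This Python sketch is an illustration only, not part of the original solution; the names `TRANSITIONS` and `run_fsm` are hypothetical, and the input sequence is made up for the example.

```python
# Behavioural model of the four-state FSM; state names mirror the enum above.
TRANSITIONS = {
    # state: (next state when A == 0, next state when A == 1)
    "s_":    ("s_",   "s_1"),
    "s_1":   ("s_10", "s_1"),
    "s_10":  ("s_",   "s_101"),
    "s_101": ("s_10", "s_1"),
}


def run_fsm(bits):
    """Return the value of B after each clock edge, starting from the reset state."""
    state = "s_"
    outputs = []
    for a in bits:
        state = TRANSITIONS[state][a]
        outputs.append(int(state == "s_101"))
    return outputs


print(run_fsm([1, 0, 1, 0, 1, 1, 0, 1]))  # prints [0, 0, 1, 0, 1, 0, 0, 1]
```

The second 1 in the output shows that overlapping occurrences are detected: from s_101, an input of 0 moves to s_10 rather than back to s_, so the trailing 1 of one match is reused as the leading 1 of the next.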

5)
Signal name        Input/Output  Bit width  Cycle Used  Description
clock/reset        Input         1          Multiple    Normal usage
AGU_addr_in        Input         32         X           Address calculated by AGU for load/store instruction
AGU_ROB_in         Input         7          X           ROB # corresponding to load/store instruction whose addr was sent by AGU
AGU_valid_in       Input         1          X           Whether AGU signals are valid
Dispatch_st_en     Input         1          D           Whether store is being dispatched
Dispatch_ld_en     Input         1          D           Whether load is being dispatched
ROB_in_disp        Input         7          D           ROB idx of dispatched instruction
store_data_in      Input         64         X           Data to be stored (sent from RS)
complete_stall     Input         1          C           Indicates external structural hazard preventing completion
st_retire_in       Input         1          R           Indicates whether there is a store instruction at head of ROB retiring
dispatch_stall     Output        1          D           If instruction can't be dispatched due to structural hazard
complete_ROB_out   Output        7          C           ROB index of completed instruction sent to CDB
ld_addr_out        Output        32         X           Address sent to cache
ld_valid           Output        1          X
ld_data_out        Output        64         C           In event of forwarding, value sent on CDB
st_comp_en         Output        1          C           Sent to ROB, indicates store is ready to go to memory
st_addr_out        Output        32         R           Address sent to cache for store (could be merged with ld_addr_out)
st_data_out        Output        64         R           Data sent to cache for store
st_ret_en          Output        1          R           Indicates store successfully retired

6)

Map Table ROB Reg Tag + Entry # PC Executed? Value Dest Head Tail # Reg x44 Y x48 N 2 8 0x4C Y 14 4 RS# Valid? Op Type Tag 1 RS Value 1 Register File Tag 2 Value 2 ROB # Reg # Value 0 N N N Y N 4 4

NOTES:
- Head points to the oldest instruction that hasn't retired
- Tail points to the entry after the most recently dispatched instruction
- A tag of '0' means the data isn't in the ROB

Consider that ROB #3 is a branch which was mispredicted: it should have branched to instruction 0x44. Starting at 0x44, the next three instructions are:
I0: R2 = R1*2
I1: R2 = R2+R3
I2: R4 = R3-1
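To illustrate how the map table would evolve as I0 through I2 are dispatched, here is a minimal Python sketch. It is not the original solution: it assumes, purely for illustration, an empty map table after recovery and a first free ROB entry of 4, and it ignores ROB wraparound. The function name `dispatch` is hypothetical. A tag of 0 follows the convention in the notes above (the data isn't in the ROB).

```python
def dispatch(instrs, map_table, first_rob_entry):
    """Rename (dest, sources) pairs in program order.

    Returns a list of (rob_entry, dest, source_tags). A source tag of 0 means
    the operand is not in the ROB and is read from the register file.
    """
    renamed = []
    rob_entry = first_rob_entry
    for dest, sources in instrs:
        # Look up sources BEFORE overwriting the destination mapping, so that
        # "R2 = R2+R3" sees the previous producer of R2.
        source_tags = {reg: map_table.get(reg, 0) for reg in sources}
        map_table[dest] = rob_entry
        renamed.append((rob_entry, dest, source_tags))
        rob_entry += 1
    return renamed


program = [
    ("R2", ["R1"]),        # I0: R2 = R1*2
    ("R2", ["R2", "R3"]),  # I1: R2 = R2+R3
    ("R4", ["R3"]),        # I2: R4 = R3-1
]

map_table = {}  # assumption: no in-flight producers remain after recovery
renamed = dispatch(program, map_table, first_rob_entry=4)  # assumption: entry 4 is free
for entry in renamed:
    print(entry)
print(map_table)
```

Under these assumptions, I1 picks up I0's ROB entry as the tag for R2, and the final map table points R2 at I1's entry, since I1 is the most recent writer of R2.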

EECS 470 Midterm Exam Fall 2014
