TDT4255 Friday the 21st of October. Real world examples of pipelining? How does pipelining influence instruction

Virginia Banks
1 Review Friday the 21st of October
Real-world examples of pipelining?
How does pipelining influence instruction latency?
How does pipelining influence instruction throughput?
What are the three types of hazard in a processor pipeline?
What are the five stages of the MIPS pipeline?

2 Today: Friday the 21st of October
4.7 Data Hazards: Forwarding vs Stalling
4.8 Control Hazards (branches)
4.9 Exceptions

3 Datapath with control from last week

4 Data Hazards and Forwarding
sub $2, $1, $3
and $12, $2, $5
or $13, $6, $2
add $14, $2, $2
sw $15, 100($2)
There are dependencies in this sequence: all of the last four instructions use register $2. Assume that $2 is 10 before the sub instruction and that $1-$3 is -20. The and instruction should use -20 for $2, but reads 10 from the register file. The or instruction also reads 10 from the register file. The add and sw instructions read the correct value -20 from the register file.
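The dependence check on this slide can be sketched in a few lines of Python. This is only an illustration: the tuple encoding of instructions and the name `stale_reads` are assumptions, the register numbers follow the textbook version of the sequence (sub $2, $1, $3 and so on), and the sketch assumes the register file is written in the first half of a cycle and read in the second half, so only the two instructions directly after a producer read too early.

```python
# Each instruction: (mnemonic, destination register or None, source registers).
# The encoding is hypothetical; it only keeps what the dependence check needs.
PROGRAM = [
    ("sub", 2, (1, 3)),
    ("and", 12, (2, 5)),
    ("or", 13, (6, 2)),
    ("add", 14, (2, 2)),
    ("sw", None, (15, 2)),
]


def stale_reads(program):
    """Return (consumer, producer, register) triples that read too early.

    Without forwarding, an instruction reads its operands in ID. A producer
    one or two instructions ahead has not reached WB yet, so the consumer
    sees the old register value.
    """
    hazards = []
    for j, (consumer, _, sources) in enumerate(program):
        for i in range(max(0, j - 2), j):
            producer, dest, _ = program[i]
            # Register $0 is hard-wired to 0 and never creates a dependence.
            if dest is not None and dest != 0 and dest in sources:
                hazards.append((consumer, producer, dest))
    return hazards


print(stale_reads(PROGRAM))
```

For the sequence above this reports the and and or instructions as reading a stale $2, while add and sw are far enough behind the sub to read the register file safely.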

5 CC 1
IF: sub $2, $1, $3
$2 = 10

6 CC 2
IF: and $12, $2, $5; ID: sub $2, $1, $3
$2 = 10

7 CC 3
IF: or $13, $6, $2; ID: and $12, $2, $5; EX: sub $2, $1, $3
$2 = 10
The and instruction reads 10 for $2 and stores it in ID/EX.

8 CC 4
IF: add $14, $2, $2; ID: or $13, $6, $2; EX: and $12, $2, $5; MEM: sub $2, $1, $3
$2 = 10
The or instruction reads 10 for $2 and stores it in ID/EX. The and instruction uses 10 in the ALU operation.

9 CC 5
IF: sw $15, 100($2); ID: add $14, $2, $2; EX: or $13, $6, $2; MEM: and $12, $2, $5; WB: sub $2, $1, $3
$2 = 10/-20
add reads the new value -20 for $2 from the register file. or uses 10 in the ALU operation.

10 CC 6
ID: sw $15, 100($2); EX: add $14, $2, $2; MEM: or $13, $6, $2; WB: and $12, $2, $5
$2 = -20
sw reads the new value -20 for $2 from the register file. add uses -20 in the ALU operation. and writes register $12 with a value calculated with $2 = 10.

11 When is the data needed and when is it produced?
sub $2, $1, $3
and $12, $2, $5
or $13, $6, $2
add $14, $2, $2
sw $15, 100($2)

12 CC 1
IF: sub $2, $1, $3
$2 = 10

13 CC 2
IF: and $12, $2, $5; ID: sub $2, $1, $3
$2 = 10

14 CC 3
IF: or $13, $6, $2; ID: and $12, $2, $5; EX: sub $2, $1, $3
$2 = 10
EX/MEM.ALUOut gets the new value for $2 (-20). The and instruction reads 10 for $2 and stores it in ID/EX.

15 CC 4
IF: add $14, $2, $2; ID: or $13, $6, $2; EX: and $12, $2, $5; MEM: sub $2, $1, $3
$2 = 10
The ALU needs the new $2 value, which is available in EX/MEM. The or instruction reads 10 for $2 and stores it in ID/EX. The and instruction can use -20 from EX/MEM.ALUOut.

16 CC 5
IF: sw $15, 100($2); ID: add $14, $2, $2; EX: or $13, $6, $2; MEM: and $12, $2, $5; WB: sub $2, $1, $3
$2 = 10/-20
The ALU needs the new $2 value, which is now available in MEM/WB. add reads the new value -20 for $2 from the register file. or can use -20 from MEM/WB.ALUOut.

17 How can a need for forwarding be detected?

18 Detecting hazards - types of hazard conditions
Notation:
EX/MEM.RegisterRd (RegisterRd in the EX/MEM register)
EX/MEM.ALUOut (register ALUOut in the EX/MEM register)
Hazard conditions:
1a) EX/MEM.RegisterRd = ID/EX.RegisterRs
1b) EX/MEM.RegisterRd = ID/EX.RegisterRt
2a) MEM/WB.RegisterRd = ID/EX.RegisterRs
2b) MEM/WB.RegisterRd = ID/EX.RegisterRt

19 CC 4 - Which hazard type?
IF: add $14, $2, $2; ID: or $13, $6, $2; EX: and $12, $2, $5; MEM: sub $2, $1, $3
1a) EX/MEM.RegisterRd = ID/EX.RegisterRs
1b) EX/MEM.RegisterRd = ID/EX.RegisterRt
2a) MEM/WB.RegisterRd = ID/EX.RegisterRs
2b) MEM/WB.RegisterRd = ID/EX.RegisterRt

20 CC 5 - Which hazard type?
IF: sw $15, 100($2); ID: add $14, $2, $2; EX: or $13, $6, $2; MEM: and $12, $2, $5; WB: sub $2, $1, $3
1a) EX/MEM.RegisterRd = ID/EX.RegisterRs
1b) EX/MEM.RegisterRd = ID/EX.RegisterRt
2a) MEM/WB.RegisterRd = ID/EX.RegisterRs
2b) MEM/WB.RegisterRd = ID/EX.RegisterRt

21 Data hazards detection continued (R-type)
Detection is performed in the EX stage.
There is no hazard if the previous instruction will not write the result: RegWrite for the earlier instruction must be asserted.
$0 is always 0, and a write to $0 will not create dependencies.
1a) EX/MEM.RegWrite and EX/MEM.RegisterRd != 0 and EX/MEM.RegisterRd = ID/EX.RegisterRs
1b) EX/MEM.RegWrite and EX/MEM.RegisterRd != 0 and EX/MEM.RegisterRd = ID/EX.RegisterRt
2a) MEM/WB.RegWrite and MEM/WB.RegisterRd != 0 and MEM/WB.RegisterRd = ID/EX.RegisterRs
2b) MEM/WB.RegWrite and MEM/WB.RegisterRd != 0 and MEM/WB.RegisterRd = ID/EX.RegisterRt
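As a rough sketch, the four conditions can be written as a small Python function that labels which hazard types are present in a given clock cycle. The class `PipelineRegister` and the function `hazard_types` are hypothetical names, and the register numbers in the two calls assume the textbook sequence (sub $2, $1, $3; and $12, $2, $5; or $13, $6, $2).

```python
from dataclasses import dataclass


@dataclass
class PipelineRegister:
    """Only the fields the hazard conditions look at (illustrative names)."""
    reg_write: bool = False
    rd: int = 0
    rs: int = 0
    rt: int = 0


def hazard_types(id_ex, ex_mem, mem_wb):
    """Return the labels of all hazard conditions that hold, e.g. ["1a"]."""
    found = []
    # Type 1: the producer is one instruction ahead (result sits in EX/MEM).
    if ex_mem.reg_write and ex_mem.rd != 0:
        if ex_mem.rd == id_ex.rs:
            found.append("1a")
        if ex_mem.rd == id_ex.rt:
            found.append("1b")
    # Type 2: the producer is two instructions ahead (result sits in MEM/WB).
    if mem_wb.reg_write and mem_wb.rd != 0:
        if mem_wb.rd == id_ex.rs:
            found.append("2a")
        if mem_wb.rd == id_ex.rt:
            found.append("2b")
    return found


# CC 4: and is in EX, sub is in MEM.
cc4 = hazard_types(
    id_ex=PipelineRegister(rs=2, rt=5),
    ex_mem=PipelineRegister(reg_write=True, rd=2),
    mem_wb=PipelineRegister(),
)
# CC 5: or is in EX, and is in MEM, sub is in WB.
cc5 = hazard_types(
    id_ex=PipelineRegister(rs=6, rt=2),
    ex_mem=PipelineRegister(reg_write=True, rd=12),
    mem_wb=PipelineRegister(reg_write=True, rd=2),
)
print(cc4, cc5)
```

Under these assumptions the sub-and pair is classified as type 1a and the sub-or pair as type 2b.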

22 Sequence with forwarding
Dependence between pipeline registers and the inputs to the ALU.
The required data exists in time for later instructions.
It is possible to supply the inputs to the ALU needed by the and instruction and the or instruction by forwarding the results found in the pipeline registers.

23 Adding forwarding logic
If inputs to the ALU can be taken from any pipeline register, the proper data can be forwarded.
By adding multiplexers to the inputs of the ALU, the pipeline can be run at full speed in the presence of dependencies.

24 We must not forget the immediate values

25 Control lines from forwarding unit
Mux control     Source    Explanation
ForwardA = 00   ID/EX     ALU operand A comes from the register file
ForwardA = 10   EX/MEM    ALU operand A comes from the previous cycle's ALU result
ForwardA = 01   MEM/WB    ALU operand A comes from the previous cycle's memory read or an earlier ALU result
ForwardB = 00   ID/EX     ALU operand B comes from the register file
ForwardB = 10   EX/MEM    ALU operand B comes from the previous cycle's ALU result
ForwardB = 01   MEM/WB    ALU operand B comes from the previous cycle's memory read or an earlier ALU result
1a) If (EX/MEM.RegWrite and (EX/MEM.RegisterRd != 0) and (EX/MEM.RegisterRd = ID/EX.RegisterRs)) ForwardA <= 10
1b) If (EX/MEM.RegWrite and (EX/MEM.RegisterRd != 0) and (EX/MEM.RegisterRd = ID/EX.RegisterRt)) ForwardB <= 10
2a) If (MEM/WB.RegWrite and (MEM/WB.RegisterRd != 0) and (MEM/WB.RegisterRd = ID/EX.RegisterRs)) ForwardA <= 01
2b) If (MEM/WB.RegWrite and (MEM/WB.RegisterRd != 0) and (MEM/WB.RegisterRd = ID/EX.RegisterRt)) ForwardB <= 01

26 CC 1
Sequence: add $1, $1, $2; add $1, $1, $3; add $1, $1, $4; add $1, $1, $5
IF: add $1, $1, $2

27 CC 2
IF: add $1, $1, $3; ID: add $1, $1, $2

28 CC 3
IF: add $1, $1, $4; ID: add $1, $1, $3; EX: add $1, $1, $2
add $1, $1, $3 reads the old value from the register file.

29 CC 4
IF: add $1, $1, $5; ID: add $1, $1, $4; EX: add $1, $1, $3; MEM: add $1, $1, $2
add $1, $1, $4 reads the old value from the register file. add $1, $1, $3 gets the forwarded value from EX/MEM.
1a) If (EX/MEM.RegWrite and (EX/MEM.RegisterRd != 0) and (EX/MEM.RegisterRd = ID/EX.RegisterRs)) ForwardA <= 10

30 CC 5
IF: add $1, $1, $6; ID: add $1, $1, $5; EX: add $1, $1, $4; MEM: add $1, $1, $3; WB: add $1, $1, $2
Both conditions now match for the instruction in EX:
1a) If (EX/MEM.RegWrite and (EX/MEM.RegisterRd != 0) and (EX/MEM.RegisterRd = ID/EX.RegisterRs)) ForwardA <= 10
2a) If (MEM/WB.RegWrite and (MEM/WB.RegisterRd != 0) and (MEM/WB.RegisterRd = ID/EX.RegisterRs)) ForwardA <= 01

31 Control lines from forwarding unit
Mux control     Source    Explanation
ForwardA = 00   ID/EX     ALU operand A comes from the register file
ForwardA = 10   EX/MEM    ALU operand A comes from the previous cycle's ALU result
ForwardA = 01   MEM/WB    ALU operand A comes from the previous cycle's memory read or an earlier ALU result
ForwardB = 00   ID/EX     ALU operand B comes from the register file
ForwardB = 10   EX/MEM    ALU operand B comes from the previous cycle's ALU result
ForwardB = 01   MEM/WB    ALU operand B comes from the previous cycle's memory read or an earlier ALU result
1a) If (EX/MEM.RegWrite and (EX/MEM.RegisterRd != 0) and (EX/MEM.RegisterRd = ID/EX.RegisterRs)) ForwardA <= 10
1b) If (EX/MEM.RegWrite and (EX/MEM.RegisterRd != 0) and (EX/MEM.RegisterRd = ID/EX.RegisterRt)) ForwardB <= 10
2a) If (MEM/WB.RegWrite and (MEM/WB.RegisterRd != 0) and (EX/MEM.RegisterRd != ID/EX.RegisterRs) and (MEM/WB.RegisterRd = ID/EX.RegisterRs)) ForwardA <= 01
2b) If (MEM/WB.RegWrite and (MEM/WB.RegisterRd != 0) and (EX/MEM.RegisterRd != ID/EX.RegisterRt) and (MEM/WB.RegisterRd = ID/EX.RegisterRt)) ForwardB <= 01
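A minimal Python sketch of the mux-select decision for one ALU operand is given below. The function name `forward_select` and its parameter names are assumptions, and the select encodings 10, 01 and 00 follow the textbook convention. The early return gives the EX/MEM result priority over the MEM/WB result; this is slightly more permissive than the extra test on the slide, because it still forwards from MEM/WB when EX/MEM names the same register but does not write it.

```python
def forward_select(operand_reg, ex_mem_reg_write, ex_mem_rd,
                   mem_wb_reg_write, mem_wb_rd):
    """Return the mux select ("00", "10" or "01") for one ALU operand.

    operand_reg is ID/EX.RegisterRs for ForwardA or ID/EX.RegisterRt
    for ForwardB.
    """
    # EX hazard: the most recent result is in EX/MEM, so it wins.
    if ex_mem_reg_write and ex_mem_rd != 0 and ex_mem_rd == operand_reg:
        return "10"
    # MEM hazard: only reached when no EX hazard matched.
    if mem_wb_reg_write and mem_wb_rd != 0 and mem_wb_rd == operand_reg:
        return "01"
    # No hazard: use the value read from the register file.
    return "00"


# A chain of adds into $1: both EX/MEM and MEM/WB hold a pending write to $1,
# and the newer one must be chosen.
print(forward_select(1, True, 1, True, 1))
```

Checking the EX/MEM condition first is what makes a running sum such as repeated adds into $1 pick up the newest partial result.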

32 Datapath modified to resolve hazards
Forwarding unit in the EX stage (with MUXes).
Operand register numbers are passed on from the ID stage via the ID/EX pipeline register.
Some details are left out, like the sign-extension unit.
What about store instructions following R-type instructions:
add $2, $1, $3 followed by sw $2, ($3)
add $2, $1, $5 followed by sw $5, ($2)
or a store following a load:
lw $2, ($3) followed by sw $2, ($4)

33 CC 1
Example: add $2, $1, $3 followed by sw $2, ($3)
[Figure PAT6F2.eps: pipelined datapath with the IF/ID, ID/EX, EX/MEM and MEM/WB pipeline registers]
IF: add $2, $1, $3

34 CC 2
IF: sw $2, ($3); ID: add $2, $1, $3

35 CC 3
ID: sw $2, ($3); EX: add $2, $1, $3

36 CC 4
EX: sw $2, ($3); MEM: add $2, $1, $3
1b) If (EX/MEM.RegWrite and (EX/MEM.RegisterRd != 0) and (EX/MEM.RegisterRd = ID/EX.RegisterRt)) ForwardB <= 10
The data read from port B is replaced by the Rd value held in EX/MEM.

37 CC 1
Example: add $2, $1, $3 followed by sw $5, ($2)
[Figure PAT6F2.eps: pipelined datapath with the IF/ID, ID/EX, EX/MEM and MEM/WB pipeline registers]
IF: add $2, $1, $3

38 CC 2
IF: sw $5, ($2); ID: add $2, $1, $3

39 CC 3
ID: sw $5, ($2); EX: add $2, $1, $3

40 CC 4
EX: sw $5, ($2); MEM: add $2, $1, $3
1a) If (EX/MEM.RegWrite and (EX/MEM.RegisterRd != 0) and (EX/MEM.RegisterRd = ID/EX.RegisterRs)) ForwardA <= 10
The data read from port A is replaced by the Rd value held in EX/MEM.

41 CC 1
Example: lw $2, ($3) followed by sw $2, ($4)
[Figure PAT6F2.eps: pipelined datapath with the IF/ID, ID/EX, EX/MEM and MEM/WB pipeline registers]
IF: lw $2, ($3)

42 CC 3
ID: sw $2, ($4); EX: lw $2, ($3)

43 CC 4
EX: sw $2, ($4); MEM: lw $2, ($3)

44 CC 4
EX: sw $2, ($4); MEM: lw $2, ($3)
1b) If (EX/MEM.RegWrite and (EX/MEM.RegisterRd != 0) and (EX/MEM.RegisterRd = ID/EX.RegisterRt)) ForwardB <= 10
sw is forwarded the wrong value for $2 (EX/MEM holds the load address, not the loaded data).

45 CC 5
MEM: sw $2, ($4); WB: lw $2, ($3)
???

46 Data hazards and stalls (6.5)
lw $2, 20($1)
and $4, $2, $5
or $8, $2, $6
add $9, $4, $2
slt $1, $6, $7
Forwarding cannot avoid stalling the pipeline when an instruction tries to read a register following a load instruction that writes the same register.
The data is still being read from memory in clock cycle 4 while the ALU is performing the operation for the following instruction.
Something must stall the pipeline for the combination of a load followed by an instruction that reads its result. Hazard detection is needed.

47 CC 1
IF: lw $2, 20($1)

48 CC 2
IF: and $4, $2, $5; ID: lw $2, 20($1)

49 CC 3
IF: or $8, $2, $6; ID: and $4, $2, $5; EX: lw $2, 20($1)
The and instruction reads the old value of $2 and stores it in ID/EX.

50 CC 4
IF: add $9, $4, $2; ID: or $8, $2, $6; EX: and $4, $2, $5; MEM: lw $2, 20($1)
The or instruction reads the old value of $2 and stores it in ID/EX. The and instruction needs the new $2 value, but it is not available.

51 CC 5
IF: slt $1, $6, $7; ID: add $9, $4, $2; EX: or $8, $2, $6; MEM: and $4, $2, $5; WB: lw $2, 20($1)
The add instruction reads $2 and stores it in ID/EX. The or instruction can get ALU operand A from the MEM/WB register.

52 Data hazards and stalls: hazard detection
The ID stage must test whether the previous instruction is a load.
Then it must be decided whether the source registers match the destination register of the load:
if (ID/EX.MemRead and ((ID/EX.RegisterRt = IF/ID.RegisterRs) or (ID/EX.RegisterRt = IF/ID.RegisterRt))) stall the pipeline
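The stall condition translates almost directly into code. The following Python sketch uses a hypothetical function name `must_stall`; its parameters mirror the pipeline register fields named on the slide, and the example values assume a load into $2 followed by and $4, $2, $5.

```python
def must_stall(id_ex_mem_read, id_ex_rt, if_id_rs, if_id_rt):
    """Return True when the instruction in ID must wait for a load in EX.

    id_ex_mem_read is asserted only for loads; id_ex_rt is the load's
    destination register; if_id_rs and if_id_rt are the source registers
    of the instruction currently being decoded.
    """
    return bool(id_ex_mem_read and id_ex_rt in (if_id_rs, if_id_rt))


# Load into $2 is in EX while and $4, $2, $5 is in ID: one stall cycle needed.
print(must_stall(True, 2, 2, 5))
```

A store or R-type instruction in EX never triggers the stall, because MemRead is deasserted for those instructions.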

53 Stall insertion
[Figure PAT6F35.eps: multi-cycle pipeline diagram, time in clock cycles CC 1 to CC 10 against program execution order (lw $2, 20($1); and becomes nop; and $4, $2, $5; or $8, $2, $6; add $9, $4, $2), with a bubble inserted after the lw]
CC2: and is fetched and lw is decoded.
CC3: and is decoded, or is fetched and lw is executed.
CC4: and is decoded again, or is fetched again, a nop is executed and lw is in the MEM stage.
The nop can be achieved by setting harmless control signals.

54 Stall / No operation
Both the instructions in the ID and IF stages must be stalled so that the fetched instructions are not lost.
Preventing these two instructions from making progress is accomplished simply by preventing the PC register and the IF/ID pipeline register from changing.
The back half of the pipeline, starting with the EX stage, is executing instructions that have no effect: nops, which act like bubbles.
Deasserting all 9 control signals in the EX, MEM and WB stages will create a do-nothing or nop instruction. No registers or memories are written.
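To see the effect of one bubble on timing, the following Python sketch computes the clock cycle in which each instruction reaches EX. It is an illustration under stated assumptions: the tuple encoding and the name `ex_cycles` are hypothetical, the instruction list assumes the textbook sequence starting with lw $2, 20($1), forwarding handles every hazard except load-use, and only a load directly followed by a consumer costs a cycle.

```python
# (mnemonic, destination register, source registers, is_load)
PROGRAM = [
    ("lw", 2, (1,), True),
    ("and", 4, (2, 5), False),
    ("or", 8, (2, 6), False),
    ("add", 9, (4, 2), False),
    ("slt", 1, (6, 7), False),
]


def ex_cycles(program):
    """Return the clock cycle in which each instruction is in the EX stage."""
    cycles = []
    for index, (_, _, sources, _) in enumerate(program):
        if index == 0:
            cycles.append(3)  # IF in CC 1, ID in CC 2, EX in CC 3
            continue
        _, prev_dest, _, prev_is_load = program[index - 1]
        # Holding PC and IF/ID for one cycle delays this instruction and
        # everything behind it; a nop occupies the skipped EX slot.
        bubble = 1 if prev_is_load and prev_dest in sources else 0
        cycles.append(cycles[-1] + 1 + bubble)
    return cycles


schedule = ex_cycles(PROGRAM)
last_write_back = schedule[-1] + 2  # MEM and WB follow EX
print(schedule, last_write_back)
```

With the single load-use stall the and instruction executes in CC 5 instead of CC 4, and the last instruction writes back in CC 10 rather than CC 9.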

55 Pipeline with control, forwarding and hazard detection

56 The big picture (page 374)
Although the hardware may or may not rely on the compiler to resolve hazard dependences to ensure correct execution, the compiler must understand the pipeline to achieve the best performance. Otherwise, unexpected stalls will reduce the performance of the compiled code.

57 Branch hazards / control hazard (6.6)
The decision whether to branch or not is taken in the MEM stage.
Without intervention, the three sequential instructions following the branch will be fetched and begin execution.
Control hazards are less frequent than data hazards, but a three-instruction flush is costly.

58 Assume branch not taken
Stalling until the branch is complete is too slow.
Improvement: assume the branch will not be taken and continue execution. If it is taken, the instructions that are being fetched and decoded must be discarded.
If branches are untaken half the time, and if it costs little to discard the instructions, this optimization halves the cost of control hazards.
To discard instructions: change the original control values to 0s. Also change the three instructions in the IF, ID and EX stages when the branch reaches the MEM stage.
Discarding instructions means we must be able to flush instructions in the IF, ID and EX stages of the pipeline.
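The "halves the cost" claim is simple arithmetic, sketched here in Python. The function name `expected_flush_cycles` is hypothetical, and the sketch assumes that always stalling costs the same three cycles as a flush and that discarding instructions costs nothing extra.

```python
def expected_flush_cycles(taken_fraction, flush_cycles):
    """Average cycles lost per branch when predicting not taken.

    Only taken branches pay the flush penalty; untaken branches are free.
    """
    return taken_fraction * flush_cycles


ALWAYS_STALL = 3  # assumed: every branch waits until it resolves in MEM
predict_not_taken = expected_flush_cycles(taken_fraction=0.5, flush_cycles=3)
print(predict_not_taken, predict_not_taken / ALWAYS_STALL)
```

With half of the branches untaken the average penalty drops from 3 cycles to 1.5 cycles per branch.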

59 Reducing the delay of branches
If branch execution is moved earlier in the pipeline, fewer instructions need to be flushed. (So far we have assumed that the next PC for a branch is selected in the MEM stage.)
Many branches can rely on simple tests that do not require a full ALU operation.
Moving the branch decision up requires two actions to occur earlier: computing the branch target address and evaluating the branch decision.

60 Reducing the delay of branches: early branch detection
1. We already have the PC and the immediate field in the IF/ID pipeline register, so we just move the branch adder from the EX stage to the ID stage.
2. BEQ: we would compare the two registers read during the ID stage (simple logic).
Moving the branch test to the ID stage implies additional forwarding and hazard detection hardware, since a branch dependent on a result still in the pipeline must still work properly with this optimization.

61 Example page 379
36 sub $10, $4, $8
40 beq $1, $3, 7 # PC-relative branch to 40 + 4 + 7 * 4 = 72
44 and $12, $2, $5
48 or $13, $2, $6
52 add $14, $4, $2
56 slt $15, $6, $7
...
72 lw $4, 50($7)
Assumes that the pipeline is optimized for branches not taken and that branch execution is moved to the ID stage.
The ID stage of CC3 determines that the branch must be taken, so it selects 72 as the next PC address and zeros the instruction fetched for the next CC.
CC4 shows the instruction at location 72 being fetched and the single bubble or nop instruction that results from the taken branch.
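The branch target arithmetic for this example can be checked with a short Python sketch. The function names `sign_extend16` and `branch_target` are assumptions; the computation follows the MIPS rule that the 16-bit offset counts words relative to the instruction after the branch.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a two's complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def branch_target(pc, immediate):
    """Target of a PC-relative branch located at address pc.

    The offset is sign-extended, shifted left by 2 (words to bytes) and
    added to PC + 4, which is what the branch adder in ID computes.
    """
    return pc + 4 + (sign_extend16(immediate) << 2)


# beq at address 40 with offset 7, as in the example.
print(branch_target(40, 7))
```

For the beq at address 40 with offset 7 this gives 72, matching the address of the lw that is fetched after the taken branch.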

62 Dynamic branch prediction
In a deeper pipeline (than the 5-stage MIPS) a simple static prediction scheme will probably waste too much performance.
Dynamic branch prediction uses runtime information to decide where to begin fetching new instructions.
A branch prediction buffer or branch history table is a small memory indexed by the lower portion of the address of the branch instruction. The memory contains one bit indicating whether the branch was recently taken or not.
A problem is that we don't know if the prediction is the right one, and it may have been put there by another branch with the same low-order bits.
Another shortcoming: even if a branch is almost always taken, we will predict incorrectly twice, rather than once, when it is not taken.
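The "twice rather than once" shortcoming shows up in a small simulation. The Python sketch below models a single 1-bit entry for one branch; the function name `one_bit_mispredictions` and the loop pattern (taken nine times, then not taken, repeated three times) are assumptions chosen for illustration.

```python
def one_bit_mispredictions(outcomes, initial_prediction=True):
    """Count mispredictions of a 1-bit predictor for one branch.

    The single bit simply remembers the most recent outcome.
    """
    prediction = initial_prediction
    misses = 0
    for taken in outcomes:
        if prediction != taken:
            misses += 1
        prediction = taken  # overwrite the bit with what just happened
    return misses


# A loop branch: taken 9 times, then falls through; the loop runs 3 times.
loop_outcomes = ([True] * 9 + [False]) * 3
print(one_bit_mispredictions(loop_outcomes))
```

After the first pass, every execution of the loop costs two mispredictions: one on the loop exit and one on the first iteration of the next pass, because the exit flipped the bit.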

63 2-bit prediction scheme
By using 2 bits rather than 1, a branch that strongly favors taken or not taken will be mispredicted only once.
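One common way to realize a 2-bit scheme is a saturating counter, sketched below in Python. This is an assumption about the implementation, not necessarily the exact state machine on the slide; the function name `two_bit_mispredictions` and the loop pattern (taken nine times, then not taken, repeated three times) are also illustrative.

```python
def two_bit_mispredictions(outcomes, counter=3):
    """Count mispredictions of a 2-bit saturating counter for one branch.

    Counter values 2 and 3 predict taken, 0 and 1 predict not taken.
    The prediction only flips after two wrong guesses in a row.
    """
    misses = 0
    for taken in outcomes:
        predicted_taken = counter >= 2
        if predicted_taken != taken:
            misses += 1
        # Move one step toward the observed outcome, saturating at 0 and 3.
        counter = min(3, counter + 1) if taken else max(0, counter - 1)
    return misses


# A loop branch: taken 9 times, then falls through; the loop runs 3 times.
loop_outcomes = ([True] * 9 + [False]) * 3
print(two_bit_mispredictions(loop_outcomes))
```

Here only the loop exit is mispredicted on each pass, because a single not-taken outcome is not enough to change the prediction for the next pass.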

64 Other branch handling strategies
Delayed branch: always execute the following instruction. Compilers and assemblers try to fill in the following instruction with one without dependencies. Loses effect on long pipelines and multiple-issue pipelines.
Branch target buffer: store the expected jump address in a buffer.
Global dynamic prediction: use the global branch behavior to determine the prediction. Effective if combined with local branch prediction.

65 The final datapath and control for chapter 4

66 Exceptions (4.9)
add $1, $2, $1; suppose overflow. We must:
Transfer control to the exception routine immediately after this instruction.
Flush the instructions following the add from the pipeline and begin fetching instructions from the new address.
Same mechanisms as for taken branches, but with the exception deasserting the control lines.

67 Datapath with controls to handle exceptions (fig page 387)
ID.Flush is ORed with the stall signal from the hazard detection unit.
To flush instructions in the EX stage: EX.Flush causes MUXes to zero the control lines.
An additional input to the PC is added to be able to fetch instructions from 8000 0180hex, which is the exception location for overflow.

68 Causes of exceptions (page 385):
1) I/O device request
2) Hardware malfunction
3) Invoking an operating system service from a user program
4) Using an undefined instruction
5) Overflow
1) and 2) are not associated with a specific instruction, so the implementation has some flexibility as to when to interrupt the pipeline, using the mechanism used for other exceptions.
In case of simultaneous multiple exceptions the normal solution is to prioritize the exceptions.

69 Example
40hex sub $11, $2, $4
44hex and $12, $2, $5
48hex or $13, $2, $6
4Chex add $1, $2, $1
50hex slt $15, $6, $7
54hex lw $16, 50($7)
Instructions to be invoked on the exception:
4000 0040hex sw $25, 1000($0)
4000 0044hex sw $26, 1004($0)
Overflow for add in the EX stage: 4000 0040hex is forced into the PC.
CC7 shows that add and the following instructions are flushed and the first instruction of the exception code is fetched.
The address of the instruction following add is saved: 4Chex + 4 = 50hex.
and and or complete.
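The flush on an exception can be sketched as a function from one cycle's pipeline contents to the next. This Python sketch is an illustration only: the function name `take_exception`, the dictionary layout and the stage labels are assumptions, and the handler address is passed in as a parameter because its exact value is read from a damaged slide (4000 0040 hex is assumed in the call).

```python
def take_exception(stages, offending_pc, handler_address):
    """Return (next-cycle stage contents, saved EPC, next PC).

    stages maps "IF", "ID", "EX", "MEM", "WB" to instruction names in the
    cycle where the instruction in EX raises the exception.
    """
    next_stages = {
        "IF": "first handler instruction",  # fetched from handler_address
        "ID": "bubble",    # IF.Flush zeroed the instruction that was in IF
        "EX": "bubble",    # ID.Flush zeroed the control lines of the one in ID
        "MEM": "bubble",   # EX.Flush squashed the offending instruction
        "WB": stages["MEM"],  # older instructions are allowed to complete
    }
    epc = offending_pc + 4  # address of the instruction after the offender
    return next_stages, epc, handler_address


cc6 = {"IF": "lw", "ID": "slt", "EX": "add", "MEM": "or", "WB": "and"}
cc7, epc, next_pc = take_exception(cc6, offending_pc=0x4C,
                                   handler_address=0x4000_0040)
print(cc7, hex(epc), hex(next_pc))
```

Under these assumptions the saved address is 50 hex, the three younger pipeline slots become bubbles, and the or instruction still reaches write-back.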

70 [Figure PAT6F43.eps: pipelined datapath with exception handling (IF.Flush, ID.Flush and EX.Flush signals, Cause and EPC registers, hazard detection unit, forwarding unit). Clock 6: lw $16, 50($7) in IF, slt $15, $6, $7 in ID, add $1, $2, $1 in EX, or $13, ... in MEM, and $12, ... in WB. Clock 7: sw $25, 1000($0) in IF, bubbles (nops) in ID, EX and MEM, or $13, ... in WB.]

71 [Figure PAT6F43.eps: same datapath figure as slide 70.]

72 HW/SW interface (1/2)
HW and OS work in conjunction so that exceptions behave as expected.
The HW contract is normally to stop the offending instruction in midstream, let all prior instructions complete, flush all following instructions, set a register to show the cause of the exception, save the address of the offending instruction, and jump to the prearranged address.

73 HW/SW interface (2/2)
The OS contract is to look at the cause of the exception and act appropriately:
Undefined instruction, HW failure, overflow: kills the program and returns an indicator of the reason.
I/O request or OS service call: saves the state of the program, performs the desired task, and restores the program to continue execution.
One of the most important and frequent uses of exceptions is handling page faults and TLB exceptions (chapter 7).
