Pipelining. Chapter 4

Margaret Lawrence
5 years ago

1 Pipelining Chapter 4

2 Make the processor run faster. Pipelining is an implementation technique in which multiple instructions are overlapped in execution. It is the key to making processors fast.

3 Pipelining. Single-cycle datapath: we need many parallel units. Multicycle datapath: reduction of hardware, sharing of elements such as the ALU, complex timing, one instruction executed at a time. Pipelined datapath: units are used at the same time, and elements are shared by different instructions.

4 Example: laundry problem. Four processing stages, identical time per stage (30 minutes), fixed sequence of usage. Total time for n loads: n × 2 hours.

5 Example: laundry optimization. Units operate independently; overlapping use of resources. Total time for n loads: 2 hours + (n - 1) × ½ hour. Average time per load for 4 loads: 3.5 h / 4 = 52.5 min.
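The laundry arithmetic can be checked with a short sketch. It assumes four stages of 30 minutes each, as in the laundry example; the function names are illustrative and not part of the slides.

```python
def sequential_minutes(loads, stages=4, stage_minutes=30):
    """Each load finishes every stage before the next load starts."""
    return loads * stages * stage_minutes


def pipelined_minutes(loads, stages=4, stage_minutes=30):
    """The first load takes a full pass; every later load adds one stage time."""
    if loads == 0:
        return 0
    return stages * stage_minutes + (loads - 1) * stage_minutes


if __name__ == "__main__":
    for n in (1, 4, 20):
        seq = sequential_minutes(n)
        pipe = pipelined_minutes(n)
        print(f"{n:2d} loads: sequential {seq} min, pipelined {pipe} min, "
              f"average {pipe / n:.1f} min per load")
```

The average time per load approaches the time of a single stage as the number of loads grows, while the time for any one load stays at the full two hours.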

6 Pipeline lessons. Pipelining doesn't help the latency of a single task; it helps the throughput of the entire workload. Multiple tasks operate simultaneously using different resources. Potential speedup = number of pipe stages. Pipeline rate is limited by the slowest pipeline stage. Unbalanced lengths of pipe stages reduce speedup. Time to fill the pipeline and time to drain it reduce speedup. Stall for dependences.

7 Pipeline. MIPS pipeline steps: (1) fetch instruction from memory; (2) read registers while decoding; (3) execute the operation or calculate an address; (4) access an operand in memory; (5) write results into a register. Cycle 1: IF, Cycle 2: REG/DEC, Cycle 3: EXEC, Cycle 4: MEM, Cycle 5: WR.

8 Single Cycle, Multiple Cycle, vs. Pipeline
Single-cycle implementation: Load takes one long cycle; Store takes an equally long cycle, part of which is wasted.
Multiple-cycle implementation: Load takes five cycles (IF, REG/DEC, EXEC, MEM, WR); Store follows with four (IF, REG/DEC, EXEC, MEM); the R-type instruction then starts with IF.
Pipeline implementation: Load, Store, and R-type each pass through IF, REG/DEC, EXEC, MEM, WR, each starting one cycle after the previous instruction.

9 Real pipeline: unequal time for steps (in ns)

10 Single-cycle, non-pipelined execution (top) versus pipelined execution

11 Pipeline speedup
\[ \text{Time between instructions}_{\text{pipelined}} = \frac{\text{Time between instructions}_{\text{nonpipelined}}}{\text{Number of pipe stages}} \]
Assume the stages are perfectly balanced. Assume ideal conditions. Holds only for a large number of instructions. Pipelining increases throughput; it does not decrease the execution time of an individual instruction.
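The dependence of speedup on instruction count and stage balance can be sketched numerically. The stage times below are illustrative assumptions (the table from the "unequal time for steps" slide is not reproduced here), and `pipeline_speedup` is a hypothetical helper, not something defined in the slides.

```python
def pipeline_speedup(n_instructions, stage_times_ns):
    """Speedup of a pipeline over a single-cycle design for n instructions.

    Single-cycle: every instruction takes the sum of all stage times.
    Pipelined: the clock is set by the slowest stage, and the pipeline
    needs (stages - 1) extra cycles to fill.
    """
    single_cycle_ns = sum(stage_times_ns)
    clock_ns = max(stage_times_ns)
    stages = len(stage_times_ns)
    t_nonpipelined = n_instructions * single_cycle_ns
    t_pipelined = (stages + n_instructions - 1) * clock_ns
    return t_nonpipelined / t_pipelined


if __name__ == "__main__":
    balanced = [2, 2, 2, 2, 2]      # assumed: five equal stages
    unbalanced = [2, 1, 2, 2, 1]    # assumed: two shorter stages
    for n in (1, 10, 1000, 1_000_000):
        print(n, round(pipeline_speedup(n, balanced), 3),
              round(pipeline_speedup(n, unbalanced), 3))
```

With balanced stages the speedup approaches the number of stages only for long instruction sequences; with unbalanced stages it stays below that bound because the clock must fit the slowest stage.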

12 Preconditions (what makes it easy?). Instruction set design: instructions (ideally) of equal length; few instruction formats; memory operands only in load and store; aligned data, so that a read needs only one memory access. Sources of problems: instructions with variable length -> multiple memory accesses; unaligned data -> multiple memory accesses for one data item.

13 Pipelining can also get us into trouble! (Hazards) Structural hazards: attempt to use the same resource in two different ways at the same time (for example, accessing the memory at the same time). Control hazards: attempt to make a decision before the condition is evaluated (branching). Data hazards: attempt to use an item before it is ready; an instruction depends on the result of a prior instruction still in the pipeline. Hazards can usually be resolved by waiting; the pipeline control must detect the hazard and take action to resolve it.

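14 Resolving structural hazard. [Figure: four overlapped instructions, each passing through IF, Reg/Dec, ALU, Mem, WB; the memory access (R/W) of one instruction falls in the same cycle as a later instruction's fetch from memory.] Fix with separate instruction and data memories (I$ and D$).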

15 Structural hazard: register access. [Figure: four overlapped instructions, each passing through IF, Reg/Dec, ALU, Mem, WB; one instruction writes to the register file in the same cycle in which a later instruction reads it.] Fix the register file access hazard by doing writes in the first half of the cycle and reads in the second half. [Figure labels: Reg; Pipeline Reg; using rising edge; using falling edge.]

16 Resolving control hazard. Assumption: in stage 2 all branch computations are ready. Delay by one cycle to wait for the result.

17 Predicting that branches are not taken as a solution to the control hazard. Branch not taken: fetch the next instruction after beq. Branch taken: fetch the next instruction from the beq target.
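The benefit of predicting not-taken over always stalling can be estimated with a simple cost model. This is a sketch under stated assumptions: an ideal CPI of 1, a one-cycle penalty (the branch outcome is known in stage 2), and made-up branch statistics; none of the numbers come from the slides.

```python
def cpi_stall_on_branch(branch_fraction, penalty=1):
    """Every branch waits `penalty` cycles for its outcome."""
    return 1 + branch_fraction * penalty


def cpi_predict_not_taken(branch_fraction, taken_fraction, penalty=1):
    """Only taken branches pay the penalty; not-taken branches cost nothing extra."""
    return 1 + branch_fraction * taken_fraction * penalty


if __name__ == "__main__":
    branches = 0.2   # assumed: 20% of instructions are branches
    taken = 0.5      # assumed: half of them are taken
    print("always stall:      ", cpi_stall_on_branch(branches))
    print("predict not taken: ", cpi_predict_not_taken(branches, taken))
```

The prediction never does worse than stalling in this model, because a wrong guess costs the same one cycle that the stall would have cost anyway.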

18 Delayed decision. Perform operations that do no harm first. Rearrange instructions if there are no dependencies. Original: Add - Branch - Load (no branch). Rearranged: Branch - Add - Load (no branch).
Program execution order (in instructions):
beq $1, $2, 40
add $4, $5, $6 (delayed branch slot)
lw $3, 300($0)
[Figure: each instruction passes through instruction fetch, Reg, ALU, data access, Reg; successive instructions start 2 ns apart.]

19 Data hazards. Access to data that are not yet computed:
add $s0, $t0, $t1
sub $t2, $s0, $t3
Add does not write until the 5th stage; sub reads in stage two; three stalls required. Solutions: compiler optimization (rearranging the instruction sequence); forwarding (using results before they are actually written).
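The stall count follows from the stage numbers alone. The sketch below assumes a five-stage pipeline without forwarding, with the register write in stage 5 and the register read in stage 2; `stalls_without_forwarding` and its parameters are hypothetical names introduced for illustration.

```python
def stalls_without_forwarding(distance, write_stage=5, read_stage=2,
                              same_cycle_ok=False):
    """Bubbles needed between a producer and a consumer of one register.

    distance=1 means the consumer directly follows the producer.
    same_cycle_ok=True models a register file that writes in the first
    half of a cycle and reads in the second half, so a read in the same
    cycle as the write already sees the new value.
    """
    gap = write_stage - read_stage   # stages between register read and write
    needed = gap - distance + (0 if same_cycle_ok else 1)
    return max(0, needed)


if __name__ == "__main__":
    for d in range(1, 5):
        print(f"distance {d}: "
              f"{stalls_without_forwarding(d)} stalls, "
              f"{stalls_without_forwarding(d, same_cycle_ok=True)} "
              f"with split-cycle register file")
```

For back-to-back instructions the model gives three stalls, and one fewer when the register file can be written and read within the same cycle.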

20 Forwarding. Linear execution; direct use of the ALU result.

21 Forwarding. Depending on the instructions, stalls are still possible.

22 Reordering. Find the hazard:
lw $t0, 0($t1)
lw $t2, 4($t1)
sw $t2, 0($t1)
sw $t0, 4($t1)
Reordered:
lw $t0, 0($t1)
lw $t2, 4($t1)
sw $t0, 4($t1)
sw $t2, 0($t1)
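The hazard in the swap sequence can be found mechanically. The sketch assumes forwarding that cannot deliver a loaded value to the instruction immediately after the load (the load-use case), and it assumes the register names `t0`, `t1`, `t2`; the tuple encoding of instructions is made up for this example.

```python
# Each instruction: (opcode, destination register or None, source registers)
ORIGINAL = [
    ("lw", "t0", ["t1"]),
    ("lw", "t2", ["t1"]),
    ("sw", None, ["t2", "t1"]),   # stores t2, address from t1
    ("sw", None, ["t0", "t1"]),   # stores t0, address from t1
]

REORDERED = [ORIGINAL[0], ORIGINAL[1], ORIGINAL[3], ORIGINAL[2]]


def load_use_hazards(program):
    """Return index pairs (i, i+1) where instruction i+1 uses the register
    that the load at index i has just fetched."""
    hazards = []
    for i in range(len(program) - 1):
        opcode, dest, _ = program[i]
        _, _, next_sources = program[i + 1]
        if opcode == "lw" and dest in next_sources:
            hazards.append((i, i + 1))
    return hazards


if __name__ == "__main__":
    print("original: ", load_use_hazards(ORIGINAL))
    print("reordered:", load_use_hazards(REORDERED))
```

Swapping the two stores puts an independent instruction between the second load and the store that needs its value, which removes the stall without changing the result.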

23 Pipelined datapath. Reuse of functional units. Additional hardware: separation of pipeline steps; functional units if used by different instructions at the same time. Extended control: strict sequentialization of instructions; check for hazards; remove hazards by introducing stalls.

24 Single-cycle datapath

25 Problems. Write back to the register file creates a data hazard. Selection of the next PC creates a control hazard. Solution: have a separate datapath for each instruction -> high hardware effort -> not affordable. Instead, chop the datapath into small chunks, keep everything that belongs together in one chunk, and introduce registers for separating the stages.

26 Problems. Three instructions need three datapaths.
Program execution order (in instructions), time in clock cycles CC 1 to CC 7:
lw $1, 100($0): IM, Reg, ALU, DM, Reg (CC 1 to CC 5)
lw $2, 200($0): IM, Reg, ALU, DM, Reg (CC 2 to CC 6)
lw $3, 300($0): IM, Reg, ALU, DM, Reg (CC 3 to CC 7)

27 Pipelined version of the datapath

28 Example: lw. Instruction fetch: the PC is saved in the IF/ID register for later use.

29 Instruction decode. Decoding the instruction; reading the data from the register file.

30 Execute or address calculation. Memory address calculation.

31 Memory access. Access the memory.

32 Write back. Write the data from memory to the register. There is a bug. Can you spot it?

33 Store word. Instruction fetch: as before. Instruction decode and register file read: as before. Execute and address calculation: as before, plus forwarding of the register contents to the EX/MEM pipeline register. Memory access: send data and address to memory. Last step: nothing happens.

34 Store word. Information in EX/MEM: address; register data to be written.

35 Store word. Memory access phase.

36 Store word. Write-back stage.

37 Summary. Some instructions do not require the complete datapath. No information transfer from one pipeline stage to another is possible except through the pipeline registers. Everything that happened in any previous stage will be overwritten. Correction for load required: where is the information on the write register?

38 Corrected datapath. Shifting of the write register address through all subsequent pipeline stages.
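The effect of the bug and of the correction can be sketched without modelling the whole datapath. The sketch assumes a five-stage pipeline with one instruction entering per cycle, so that when instruction i reaches write back, the IF/ID register holds instruction i + 3. The register numbers and the function name are illustrative, not taken from the slides.

```python
def writeback_targets(dest_regs, carry_dest):
    """Return the register number each instruction actually writes.

    dest_regs[i] is the intended destination of instruction i.
    carry_dest=True:  the number travels with the instruction through
                      ID/EX, EX/MEM and MEM/WB (corrected design).
    carry_dest=False: the number is taken from IF/ID at write-back time,
                      where a later instruction already sits (buggy design).
    """
    written = []
    for i, dest in enumerate(dest_regs):
        if carry_dest:
            written.append(dest)
        else:
            later = i + 3   # instruction being decoded while i is in WB
            written.append(dest_regs[later] if later < len(dest_regs) else None)
    return written


if __name__ == "__main__":
    dests = [10, 11, 12, 13, 14]   # assumed destination registers
    print("buggy:    ", writeback_targets(dests, carry_dest=False))
    print("corrected:", writeback_targets(dests, carry_dest=True))
```

In the buggy variant each result lands in the destination of an instruction three positions later; `None` marks the cases where the program has already ended and the field is undefined.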

39 Two-instruction sequence, load / sub, clock 1. lw $10, 20($1): instruction fetch. sub: not yet in the pipeline. [Figure: pipelined datapath with the pipeline registers IF/ID, ID/EX, EX/MEM, MEM/WB.]

40 Two-instruction sequence, load / sub, clock 2. sub $11, $2, $3: instruction fetch. lw $10, 20($1): instruction decode. [Figure: same datapath.]

41 Two-instruction sequence, load / sub, clock 3. sub $11, $2, $3: instruction decode. lw $10, 20($1): execution. [Figure: same datapath.]

42 Two-instruction sequence, load / sub, clock 4. sub $11, $2, $3: execution. lw $10, 20($1): memory. [Figure: same datapath.]

43 Two-instruction sequence, load / sub, clock 5. sub $11, $2, $3: memory. lw $10, 20($1): write back. [Figure: same datapath.]

44 Two-instruction sequence, load / sub, clock 6. sub $11, $2, $3: write back. lw: has left the pipeline. [Figure: same datapath.]

45 Pipelined control. Data travels along the pipeline stages. All data belonging to one instruction must be kept within the stage. Information is transferred only through the pipeline registers. Control information must travel with the instruction.

46 Pipelined control. Instruction fetch: identical for all instructions. Instruction decode / register file read: identical for all instructions. Execution / address calculation: RegDst, ALUOp, ALUSrc. Memory access: Branch, MemRead, MemWrite. Write back: MemtoReg, RegWrite.

47 Control signals

48 Review of control Breakdown for each stage

49 Extended datapath with control. Four of the nine control lines are used in the EX phase. Three are used during the MEM stage. The last two are passed to MEM/WB for use in the WB stage.
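The split of the nine control lines can be written down as a small table. This is a sketch that assumes the usual MIPS signal names, with ALUOp counted as two lines (`ALUOp1`, `ALUOp0`); the dictionary layout and the helper `carried_by` are made up for illustration.

```python
# Control lines grouped by the stage that consumes them.
CONTROL_BY_STAGE = {
    "EX":  ["RegDst", "ALUOp1", "ALUOp0", "ALUSrc"],
    "MEM": ["Branch", "MemRead", "MemWrite"],
    "WB":  ["RegWrite", "MemtoReg"],
}

# Stages that still lie ahead of each pipeline register.
STAGES_AFTER = {
    "ID/EX":  ["EX", "MEM", "WB"],
    "EX/MEM": ["MEM", "WB"],
    "MEM/WB": ["WB"],
}


def carried_by(pipeline_register):
    """Control lines a pipeline register must hold for the stages after it."""
    lines = []
    for stage in STAGES_AFTER[pipeline_register]:
        lines.extend(CONTROL_BY_STAGE[stage])
    return lines


if __name__ == "__main__":
    for reg in STAGES_AFTER:
        print(f"{reg}: {len(carried_by(reg))} lines -> {carried_by(reg)}")
```

Each pipeline register drops the lines that the stage before it has just consumed, so the control field shrinks as the instruction moves toward write back.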

50 Extended datapath with control

51 Example sequence:
lw $10, 20($1)
sub $11, $2, $3
and $12, $4, $5
or $13, $6, $7
add $14, $8, $9

52 Clock 1. IF: lw $10, 20($1); ID: before<1>; EX: before<2>; MEM: before<3>; WB: before<4>. [Figure: pipelined datapath with control, showing the control lines RegDst, ALUOp, ALUSrc, Branch, MemRead, MemWrite, MemtoReg, RegWrite carried through ID/EX, EX/MEM, MEM/WB.]

53 Clock 2. IF: sub $11, $2, $3; ID: lw $10, 20($1); EX: before<1>; MEM: before<2>; WB: before<3>. [Figure: same datapath.]

54 Clock 3. IF: and $12, $4, $5; ID: sub $11, $2, $3; EX: lw $10, ...; MEM: before<1>; WB: before<2>. [Figure: same datapath.]

55 Clock 4. IF: or $13, $6, $7; ID: and $12, $4, $5; EX: sub $11, ...; MEM: lw $10, ...; WB: before<1>. [Figure: same datapath.]

56 Clock 5. IF: add $14, $8, $9; ID: or $13, $6, $7; EX: and $12, ...; MEM: sub $11, ...; WB: lw $10, .... [Figure: same datapath.]

57 Clock 6. IF: after<1>; ID: add $14, $8, $9; EX: or $13, ...; MEM: and $12, ...; WB: sub $11, .... [Figure: same datapath.]

58 Clock 7. IF: after<2>; ID: after<1>; EX: add $14, ...; MEM: or $13, ...; WB: and $12, .... [Figure: same datapath.]

59 Clock 8. IF: after<3>; ID: after<2>; EX: after<1>; MEM: add $14, ...; WB: or $13, .... [Figure: same datapath.]

60 Clock 9. IF: after<4>; ID: after<3>; EX: after<2>; MEM: after<1>; WB: add $14, .... [Figure: same datapath.]
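The stage occupancy on slides 52 to 60 follows one rule: the instruction that entered the pipeline in clock c is in stage number s during clock c + s. A minimal sketch, assuming five stages, one instruction issued per clock, and no stalls; the instructions are identified by mnemonic only, and `occupancy` is an illustrative helper.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def occupancy(instructions):
    """Return one dict per clock cycle mapping stage name to the
    instruction in that stage, or None if the stage holds nothing
    from this sequence."""
    total_cycles = len(instructions) + len(STAGES) - 1
    table = []
    for cycle in range(total_cycles):
        row = {}
        for depth, stage in enumerate(STAGES):
            index = cycle - depth   # instruction that entered IF `depth` cycles ago
            in_range = 0 <= index < len(instructions)
            row[stage] = instructions[index] if in_range else None
        table.append(row)
    return table


if __name__ == "__main__":
    sequence = ["lw", "sub", "and", "or", "add"]
    for clock, row in enumerate(occupancy(sequence), start=1):
        cells = "  ".join(f"{stage}: {row[stage] or '-':<4}" for stage in STAGES)
        print(f"Clock {clock}:  {cells}")
```

Five instructions on five stages need nine clocks in total, and only in clock 5 is every stage busy with an instruction from the sequence.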
