Pipelined Processor Design

1 Pipelined Processor Design. Pipelined Implementation: MIPS. Virendra Singh, Indian Institute of Science, Bangalore. Lecture 20, SE-273: Processor Design

2 Courtesy: Prof. Vishwani Agrawal

3 Pipeline Registers
  [Figure: pipelined datapath with pipeline registers and a CONTROL block driven by the opcode. Labeled blocks: PC, Add, Instr. mem., Reg. File, Sign ext., Shift left 2, Data mem. Labeled control signals include RegDst, RegWrite, Branch, MemWrite, MemRead, MemtoReg.]
  This requires a CONTROL not too different from single-cycle.
  Mar 17, 2008 SE-273@SERC

4 Pipeline Register Functions
  Four pipeline registers are added:
    Register name   Data held
    IF/ID           PC+4, Instruction word (IW)
    ID/EX           PC+4, R1, R2, IW(0-15) sign ext., IW(11-15)
    EX/MEM          PC+4, zero, Result, R2, IW(11-15) or IW(16-20)
    MEM/WB          M[Result], Result, IW(11-15) or IW(16-20)

5 Pipelined Datapath
  [Figure: pipelined datapath. Labeled blocks: PC, Add, Instr mem, Reg. File, Sign ext., Shift left 2, Data mem. The write-register input carries the labels "for R-type" and "for I-type lw".]

6 Five-Cycle Pipeline
  [Figure: the pipeline over clock cycles CC1 to CC5.]

7 Add Instruction
  add $t0, $s1, $s2
  Machine instruction word: opcode | $s1 | $s2 | $t0 | function
    CC1  IF
    CC2  ID:  read $s1, read $s2
    CC3  EX:  add $s1+$s2
    CC4  MEM
    CC5  WB:  write $t0

8 Pipelined Datapath Executing add
  [Figure: the pipelined datapath annotated for add $t0, $s1, $s2. Registers s1 and s2 are read from the Reg. File, the values $s1 and $s2 go to the execute stage, and t0 is the write register (R-type).]

9 Load Instruction
  lw $t0, 1200($t1)
  Machine instruction word: opcode | $t1 | $t0 | 1200
    CC1  IF
    CC2  ID:  read $t1, sign ext 1200
    CC3  EX:  add $t1+1200
    CC4  MEM: read M[addr]
    CC5  WB:  write $t0

10 Pipelined Datapath Executing lw
  [Figure: the pipelined datapath annotated for lw $t0, 1200($t1). Register t1 is read from the Reg. File, $t1 and the sign-extended offset form addr for Data mem, and the data read is written to t0 (I-type lw).]

11 Store Instruction
  sw $t0, 1200($t1)
  Machine instruction word: opcode | $t1 | $t0 | 1200
    CC1  IF
    CC2  ID:  read $t1, sign ext 1200
    CC3  EX:  add $t1+1200 (addr)
    CC4  MEM: write $t0 to M[addr]
    CC5  WB

12 Pipelined Datapath Executing sw
  [Figure: the pipelined datapath annotated for sw $t0, 1200($t1). Registers t0 and t1 are read from the Reg. File, $t1 and the sign-extended offset form addr for Data mem, and $t0 is the data written.]

13 Executing a Program
  Consider a five-instruction segment:
    lw  $10, 20($1)
    sub $11, $2, $3
    add $12, $3, $4
    lw  $13, 24($1)
    add $14, $5, $6

14 Program Execution
  [Figure: pipeline chart with time (CC1 to CC5) on one axis and program instructions on the other: lw $10, 20($1); sub $11, $2, $3; add $12, $3, $4; lw $13, 24($1); add $14, $5, $6.]

15 CC5
    IF:  add $14, $5, $6
    ID:  lw $13, 24($1)
    EX:  add $12, $3, $4
    MEM: sub $11, $2, $3
    WB:  lw $10, 20($1)
  [Figure: the pipelined datapath with these five instructions occupying its stages.]
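The CC5 snapshot follows from a simple rule: in an ideal pipeline with no stalls, instruction i (counting from 0) is in stage s (counting from 0) during clock cycle i + s + 1. The Python sketch below applies that rule to the five-instruction segment. The names STAGES, pipeline_chart and stage_occupancy are illustrative only and are not part of the lecture.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

def pipeline_chart(instructions):
    """For each instruction, map clock cycle -> stage, assuming no hazards or stalls."""
    chart = []
    for i, _ in enumerate(instructions):
        # Instruction i is fetched in cycle i + 1 and advances one stage per cycle.
        chart.append({i + 1 + s: stage for s, stage in enumerate(STAGES)})
    return chart

def stage_occupancy(instructions, cycle):
    """Return {stage: instruction} for one clock cycle."""
    chart = pipeline_chart(instructions)
    return {row[cycle]: instr for instr, row in zip(instructions, chart) if cycle in row}

program = [
    "lw $10, 20($1)",
    "sub $11, $2, $3",
    "add $12, $3, $4",
    "lw $13, 24($1)",
    "add $14, $5, $6",
]

# Print one row per instruction, one column per clock cycle (CC1..CC9).
total_cycles = len(program) + len(STAGES) - 1
for instr, row in zip(program, pipeline_chart(program)):
    cells = [row.get(c, "").ljust(4) for c in range(1, total_cycles + 1)]
    print(instr.ljust(16), " ".join(cells))

print(stage_occupancy(program, 5))
```

Calling stage_occupancy(program, 5) gives the same stage assignment as the CC5 slide: the first lw is in WB while the last add is being fetched.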

16 Advantages of Pipeline
  After the fifth cycle (CC5), one instruction is completed each cycle; CPI ≈ 1, neglecting the initial pipeline latency of 5 cycles.
  Pipeline latency is defined as the number of stages in the pipeline, or the number of clock cycles after which the first instruction is completed.
  The clock cycle time is about four times shorter than that of the single-cycle datapath and about the same as that of the multicycle datapath.
  For multicycle datapath, CPI = 3..
  So, pipelined execution is faster, but...
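The claim that CPI approaches 1 can be checked with a little arithmetic: with k stages and no stalls, the first instruction completes after k cycles and each later instruction completes one cycle after the previous one, so n instructions take k + (n - 1) cycles. The sketch below works this out; the function names are hypothetical, and stalls and hazards are assumed absent.

```python
def pipelined_cycles(n_instructions, n_stages=5):
    """Cycles to finish n instructions on an ideal n_stages-deep pipeline (no stalls)."""
    if n_instructions <= 0:
        raise ValueError("need at least one instruction")
    # n_stages cycles of latency for the first instruction, then one completion per cycle.
    return n_stages + (n_instructions - 1)

def effective_cpi(n_instructions, n_stages=5):
    """Average cycles per instruction, including the initial pipeline latency."""
    return pipelined_cycles(n_instructions, n_stages) / n_instructions

for n in (1, 5, 100, 1_000_000):
    print(n, pipelined_cycles(n), round(effective_cpi(n), 6))
```

For the five-instruction segment this gives 9 cycles (CPI 1.8); for a long program the latency term becomes negligible and the CPI tends to 1.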

17 "Science is always wrong. It never solves a problem without creating ten more." George Bernard Shaw

18 Pipeline Hazards
  Definition: A hazard in a pipeline is a situation in which the next instruction cannot complete execution one clock cycle after completion of the present instruction.
  Three types of hazards:
    Structural hazard (resource conflict)
    Data hazard
    Control hazard

19 Structural Hazard
  Two instructions cannot execute due to a resource conflict.
  Example: Consider a computer with a common data and instruction memory. The fourth cycle of a lw instruction requires memory access (memory read) and at the same time the first cycle of the fourth instruction requires instruction fetch (memory read). This will cause a memory resource conflict.
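The conflict in this example can be located mechanically. The sketch below assumes a single shared memory, an otherwise ideal five-stage pipeline, and that only lw and sw use memory in the MEM stage; memory_conflicts is an illustrative name, not something defined in the lecture.

```python
def memory_conflicts(program):
    """List (cycle, instruction in MEM, instruction in IF) clashes on a shared memory.

    Assumes instruction i (0-based) is fetched in cycle i + 1 with no stalls,
    so its MEM stage falls in cycle i + 4.
    """
    conflicts = []
    for i, instr in enumerate(program):
        if instr.split()[0] in ("lw", "sw"):
            mem_cycle = i + 4
            fetched = mem_cycle - 1  # 0-based index of the instruction fetched in mem_cycle
            if fetched < len(program):
                conflicts.append((mem_cycle, instr, program[fetched]))
    return conflicts

program = [
    "lw $10, 20($1)",
    "sub $11, $2, $3",
    "add $12, $3, $4",
    "lw $13, 24($1)",
]
print(memory_conflicts(program))
```

For this program the only clash is in cycle 4, between the data read of the first lw and the fetch of the fourth instruction.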

20 Example of Structural Hazard
  [Figure: pipeline chart with time (CC1 to CC5) against program instructions lw $10, 20($1); sub $11, $2, $3; add $12, $3, $4; lw $13, 24($1). The common data and instruction memory is needed by two instructions.]

21 Possible Remedies for Structural Hazards
  Provide duplicate hardware resources in datapath.
  Control unit or compiler can insert delays (no-op cycles) between instructions. This is known as pipeline stall or bubble.

22 Stall (Bubble) for Structural Hazard
  [Figure: pipeline chart with time (CC1 to CC5) against program instructions lw $10, 20($1); sub $11, $2, $3; add $12, $3, $4; a stall (bubble); lw $13, 24($1).]

23 Data Hazard
  Data hazard means that an instruction cannot be completed because the needed data, to be generated by another instruction in the pipeline, is not available.
  Example: consider two instructions:
    add $s0, $t0, $t1
    sub $t2, $s0, $t3    # needs $s0

24 Example of Data Hazard
    add $s0, $t0, $t1    write s0 in CC5
    sub $t2, $s0, $t3    read s0 and t3 in CC3
  We need to read s0 from reg file in cycle 3, but s0 will not be written in reg file until cycle 5.
  However, s0 will only be used in cycle 4, and it is available at the end of cycle 3.

25 Forwarding or Bypassing
  Output of a resource used by an instruction is forwarded to the input of some resource being used by another instruction.
  Forwarding can eliminate some, but not all, data hazards.

26 Forwarding for Data Hazard
  [Figure: pipeline chart, time (CC1 to CC5) against program instructions add $s0, $t0, $t1 (write s0 in CC5) and sub $t2, $s0, $t3 (read s0 and t3 in CC3), with a forwarding path from add to sub.]

27 Forwarding Unit Hardware
  [Figure: a Forwarding Unit that receives the source reg. IDs from the opcode and drives the control signals of two forwarding muxes (FORW. MUX). Other labels: Data Mem., MUX, to reg. file.]
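The extracted slide does not preserve the selection logic of the forwarding unit, so the following is only a sketch under assumed, textbook-style conditions: a source operand is taken from the EX/MEM register if the instruction there writes the same non-zero register, otherwise from MEM/WB under the same test, otherwise from the register file. The function name, parameter names and return labels are hypothetical.

```python
def forward_select(src_reg, ex_mem_regwrite, ex_mem_rd, mem_wb_regwrite, mem_wb_rd):
    """Choose where one ALU source operand should come from.

    Assumption: register 0 is hard-wired to zero and is never forwarded;
    the most recent producer (EX/MEM) takes priority over MEM/WB.
    """
    if ex_mem_regwrite and ex_mem_rd != 0 and ex_mem_rd == src_reg:
        return "EX/MEM"
    if mem_wb_regwrite and mem_wb_rd != 0 and mem_wb_rd == src_reg:
        return "MEM/WB"
    return "REG"

# MIPS register numbers: $t3 = 11, $s0 = 16.
S0, T3 = 16, 11

# add $s0, $t0, $t1 is in MEM while sub $t2, $s0, $t3 is in EX.
print(forward_select(S0, True, S0, False, 0))  # first operand of sub
print(forward_select(T3, True, S0, False, 0))  # second operand of sub
```

In this situation the first operand of sub is taken from EX/MEM and the second from the register file, which is the forwarding shown for the add/sub pair.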

28 Forwarding Alone May Not Work
    lw $s0, 20($s1)      write s0 in CC5; data available from memory only at the end of cycle 4
    sub $t2, $s0, $t3    read s0 and t3 in CC3; data needed by sub (data hazard)

29 Use Bubble and Forwarding
  [Figure: pipeline chart, time (CC1 to CC5) against program instructions lw $s0, 20($s1) (write s0 in CC5) and sub $t2, $s0, $t3, with a stall (bubble) inserted and the loaded value forwarded to sub.]

30 Hazard Detection Unit Hardware
  [Figure: a Hazard Detection Unit that receives the source reg. IDs from the opcode, can disable writes to the PC and the instruction register, and selects 0 control signals through a NOP MUX. Other labels: Control, Forwarding Unit, two FORW. MUX, Data Mem., control signals, to reg. file.]
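The detection condition itself is not recoverable from the figure residue. A common formulation, assumed here rather than taken from the slide, is: stall when the instruction in EX is a load and its destination register is a source of the instruction in ID. The sketch below encodes that test; must_stall and its parameter names are hypothetical.

```python
def must_stall(id_ex_memread, id_ex_rt, if_id_rs, if_id_rt):
    """Return True when a load-use hazard requires one bubble.

    Assumption: id_ex_memread is set only for lw, whose destination is the rt field.
    """
    return bool(id_ex_memread) and id_ex_rt in (if_id_rs, if_id_rt)

# MIPS register numbers: $t3 = 11, $s0 = 16.
S0, T3 = 16, 11

# lw $s0, 20($s1) in EX, sub $t2, $s0, $t3 in ID: the loaded value is not ready yet.
print(must_stall(True, S0, S0, T3))

# add $s0, $t0, $t1 in EX instead of lw: forwarding alone is enough, no stall.
print(must_stall(False, S0, S0, T3))
```

When the test is true, the unit would hold the PC and the fetched instruction for one cycle and send zeroed control signals down the pipeline, which is the bubble shown on the preceding slides.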

31 Resolving Hazards
  Hazards are resolved by hazard detection and forwarding units.
  Compiler's understanding of how these units work can improve performance.

32 Avoiding Stall by Code Reorder
  C code:
    A = B + E;
    C = B + F;
  MIPS code:
    lw  $t1, 0($t0)      # $t1 written
    lw  $t2, 4($t0)      # $t2 written
    add $t3, $t1, $t2    # $t1, $t2 needed
    sw  $t3, 12($t0)
    lw  $t4, 8($t0)      # $t4 written
    add $t5, $t1, $t4    # $t4 needed
    sw  $t5, 16($t0)

33 Reordered Code
  C code:
    A = B + E;
    C = B + F;
  MIPS code:
    lw  $t1, 0($t0)
    lw  $t2, 4($t0)
    lw  $t4, 8($t0)
    add $t3, $t1, $t2    # no hazard
    sw  $t3, 12($t0)
    add $t5, $t1, $t4    # no hazard
    sw  $t5, 16($t0)
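The benefit of the reordering can be checked by counting load-use stalls in both sequences. The sketch below uses a deliberately simplified model, stated here as an assumption: forwarding is available, so the only stall is one bubble when an instruction reads a register loaded by the lw immediately before it. The helper names are illustrative.

```python
import re

def regs(text):
    """Return the register names in an instruction, in order of appearance."""
    return re.findall(r"\$\w+", text)

def load_use_stalls(program):
    """Count bubbles caused by an instruction using the result of the preceding lw."""
    stalls = 0
    for prev, curr in zip(program, program[1:]):
        if prev.split()[0] != "lw":
            continue
        loaded = regs(prev)[0]  # lw writes its first operand
        operands = regs(curr)
        # sw reads every register it names; other instructions write the first one.
        sources = operands if curr.split()[0] == "sw" else operands[1:]
        if loaded in sources:
            stalls += 1
    return stalls

original = [
    "lw $t1, 0($t0)", "lw $t2, 4($t0)", "add $t3, $t1, $t2", "sw $t3, 12($t0)",
    "lw $t4, 8($t0)", "add $t5, $t1, $t4", "sw $t5, 16($t0)",
]
reordered = [
    "lw $t1, 0($t0)", "lw $t2, 4($t0)", "lw $t4, 8($t0)", "add $t3, $t1, $t2",
    "sw $t3, 12($t0)", "add $t5, $t1, $t4", "sw $t5, 16($t0)",
]
print(load_use_stalls(original), load_use_stalls(reordered))
```

Under this model the original order incurs two bubbles (after lw $t2 and after lw $t4) and the reordered code incurs none, while both sequences execute the same seven instructions.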

34 Control Hazard
  Instruction to be fetched is not known!
  Example: Instruction being executed is branch-type, which will determine the next instruction:
    add $4, $5, $6
    beq $1, $2, 40
    next instruction
    and $7, $8, $9       (branch target)

35 Stall on Branch
  [Figure: pipeline chart, time (CC1 to CC5) against program instructions add $4, $5, $6; beq $1, $2, 40; a stall (bubble); then either the next instruction or and $7, $8, $9.]

36 Why Only One Stall?
  Extra hardware in ID phase:
    Additional ALU to compute branch address
    Comparator to generate zero signal
    Hazard detection unit writes the branch address in PC

37 Ways to Handle Branch
  Stall or bubble
  Branch prediction:
    Heuristics
      Next instruction
      Prediction based on statistics (dynamic)
      Hardware decision (dynamic)
    Prediction error: pipeline flush
  Delayed branch
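The slide names dynamic prediction without giving a scheme. As one possible illustration, not taken from the lecture, the sketch below uses a two-bit saturating counter, a widely used dynamic predictor, and counts how many pipeline flushes (mispredictions) it causes on a sequence of branch outcomes. The class and function names are hypothetical.

```python
class TwoBitPredictor:
    """Two-bit saturating counter: states 0 and 1 predict not taken, 2 and 3 predict taken."""

    def __init__(self, state=0):
        self.state = state

    def predict(self):
        return self.state >= 2

    def update(self, taken):
        # Move one step toward the observed outcome, saturating at 0 and 3.
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)

def count_flushes(outcomes):
    """Number of mispredictions, each of which would force a pipeline flush."""
    predictor = TwoBitPredictor()
    flushes = 0
    for taken in outcomes:
        if predictor.predict() != taken:
            flushes += 1
        predictor.update(taken)
    return flushes

# A loop-closing branch: taken nine times, then not taken on exit.
loop_branch = [True] * 9 + [False]
print(count_flushes(loop_branch))  # dynamic two-bit prediction
print(sum(loop_branch))            # static "next instruction" (always not taken) guess
```

On this outcome pattern the counter mispredicts three times (twice while warming up, once at loop exit), whereas always predicting the next sequential instruction would mispredict all nine taken branches.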

38 Delayed Branch Example
  Stall on branch:
          add $4, $5, $6
          beq $1, $2, skip
          next instruction
          ...
    skip: or $7, $8, $9
  Delayed branch:
          beq $1, $2, skip
          add $4, $5, $6       # instruction executed irrespective of branch decision
          next instruction
          ...
    skip: or $7, $8, $9

39 Delayed Branch
  [Figure: pipeline chart, time (CC1 to CC5) against program instructions beq $1, $2, skip; add $4, $5, $6; then either the next instruction or skip: or $7, $8, $9.]

40 Summary: Hazards
  Structural hazards
    Cause: resource conflict
    Remedies: (i) hardware resources, (ii) stall (bubble)
  Data hazards
    Cause: data unavailability
    Remedies: (i) forwarding, (ii) stall (bubble), (iii) code reordering
  Control hazards
    Cause: out-of-sequence execution (branch or jump)
    Remedies: (i) stall (bubble), (ii) branch prediction/pipeline flush, (iii) delayed branch/pipeline flush

41 Thank You

