CS 61C: Great Ideas in Computer Architecture. Pipelining Hazards. Instructor: Senior Lecturer SOE Dan Garcia


2 Great Idea #4: Parallelism
Software
    Parallel Requests: assigned to computer, e.g. search "Garcia"
    Parallel Threads: assigned to core, e.g. lookup, ads
    Parallel Instructions: > 1 instruction @ one time, e.g. 5 pipelined instructions
    Parallel Data: > 1 data item @ one time, e.g. add of 4 pairs of words
    Hardware descriptions: all gates functioning in parallel at same time
Leverage Parallelism & Achieve High Performance
Hardware
    Warehouse Scale Computer, Smart Phone
    Computer: Core(s), Memory, Input/Output
    Core: Instruction Unit(s), Functional Unit(s) (A0+B0, A1+B1, A2+B2, A3+B3), Cache Memory
    Logic Gates

3 Review of Last Lecture
Implementing controller for your datapath
    Take decoded signals from instruction and generate control signals
    Use AND and OR Logic scheme
Pipelining improves performance by exploiting Instruction Level Parallelism
    5-stage pipeline for MIPS: IF, ID, EX, MEM, WB
    Executes multiple instructions in parallel
    What can go wrong???

4 Agenda
    Pipelining Performance
    Structural Hazards
    Administrivia
    Data Hazards
        Forwarding
        Load Delay Slot
    Control Hazards

5 Review: Pipelined Datapath (figure)

6 Pipelined Execution Representation
Time ->
    IF  ID  EX  MEM WB
        IF  ID  EX  MEM WB
            IF  ID  EX  MEM WB
                IF  ID  EX  MEM WB
                    IF  ID  EX  MEM WB
                        IF  ID  EX  MEM WB
Every instruction must take the same number of steps, so some stages will idle, e.g. the MEM stage for any arithmetic instruction.

7 Graphical Pipeline Diagrams
(figure: datapath with PC, MUX, +4, instruction memory, rd/rs/rt/imm fields, Register File, ALU, Data memory)
Stages: 1. Instruction Fetch, 2. Decode/Register Read, 3. Execute, 4. Memory, 5. Write Back
Use datapath figure below to represent pipeline: IF, ID, EX, Mem, WB

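8 Graphical Pipeline Representation
RegFile: left half is write, right half is read
(figure: time in clock cycles on the horizontal axis, instruction order on the vertical axis; Load, Add, Store, Sub and Or each pass through I$, Reg, ALU, D$, Reg, each starting one cycle after the previous instruction)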

9 Pipelining Performance (1/3)
Use T_c ("time between completion of instructions") to measure speedup
    T_c,pipelined >= T_c,single-cycle / (number of stages)
    Equality only achieved if stages are balanced (i.e. take the same amount of time)
If not balanced, speedup is reduced
Speedup due to increased throughput
    Latency for each instruction does not decrease

10 Pipelining Performance (2/3)
Assume time for stages is
    100ps for register read or write
    200ps for other stages

Inst      Inst fetch  Register read  ALU op  Memory access  Register write  Total time
lw        200ps       100ps          200ps   200ps          100ps           800ps
sw        200ps       100ps          200ps   200ps          -               700ps
R-format  200ps       100ps          200ps   -              100ps           600ps
beq       200ps       100ps          200ps   -              -               500ps

What is pipelined clock rate?
    Compare pipelined datapath with single-cycle datapath

11 Pipelining Performance (3/3)
Single-cycle: T_c = 800 ps
Pipelined: T_c = 200 ps
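The two clock periods above can be checked with a little arithmetic. The following is an illustrative Python sketch, not part of the lecture: the stage times come from the slide's table, while the function names and the instruction counts tried are assumptions chosen for the example. It assumes an ideal pipeline with no stalls, so n instructions finish in n + 4 cycles.

```python
# Stage latencies in picoseconds, taken from the slide's table.
STAGE_PS = {"IF": 200, "ID": 100, "EX": 200, "MEM": 200, "WB": 100}

# Single-cycle: the clock must fit the slowest instruction (lw uses all stages).
SINGLE_CYCLE_TC = sum(STAGE_PS.values())   # 800 ps
# Pipelined: the clock must fit the slowest stage.
PIPELINED_TC = max(STAGE_PS.values())      # 200 ps


def single_cycle_time(n):
    """Total time in ps for n instructions on the single-cycle datapath."""
    return n * SINGLE_CYCLE_TC


def pipelined_time(n):
    """Total time in ps for n instructions on an ideal (stall-free) pipeline."""
    return (n + len(STAGE_PS) - 1) * PIPELINED_TC


def speedup(n):
    """Ratio of single-cycle time to pipelined time for n instructions."""
    return single_cycle_time(n) / pipelined_time(n)


if __name__ == "__main__":
    print(f"pipelined clock rate: {1e12 / PIPELINED_TC / 1e9:.2f} GHz")
    for n in (3, 1_000_000):
        print(f"n={n}: speedup {speedup(n):.3f}")
```

For long instruction streams the speedup approaches 800/200 = 4 rather than 5, because the stages are not balanced: the 100 ps register stages still occupy a full 200 ps clock cycle.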

12 Pipelining Hazards
A hazard is a situation that prevents starting the next instruction in the next clock cycle
1) Structural hazard
    A required resource is busy (e.g. needed in multiple stages)
2) Data hazard
    Data dependency between instructions
    Need to wait for previous instruction to complete its data read/write
3) Control hazard
    Flow of execution depends on previous instruction

14 1. Structural Hazards
Conflict for use of a resource
MIPS pipeline with a single memory?
    Load/Store requires memory access for data
    Instruction fetch would have to stall for that cycle
        Causes a pipeline "bubble"
Hence, pipelined datapaths require separate instruction/data memories
    Separate L1 I$ and L1 D$ take care of this

15 Structural Hazard #1: Single Memory
(figure: time in clock cycles vs. instruction order for Load, Instr 1, Instr 2, Instr 3, Instr 4, each passing through I$, Reg, ALU, D$, Reg)
Trying to read same memory twice in same clock cycle

16 Structural Hazard #2: Registers (1/2)
(figure: time in clock cycles vs. instruction order for Load, Instr 1, Instr 2, Instr 3, Instr 4, each passing through I$, Reg, ALU, D$, Reg)
Can we read and write to registers simultaneously?

17 Structural Hazard #2: Registers (2/2)
Two different solutions have been used:
1) Split RegFile access in two: Write during 1st half and Read during 2nd half of each clock cycle
    Possible because RegFile access is VERY fast (takes less than half the time of ALU stage)
2) Build RegFile with independent read and write ports
Conclusion: Read and Write to registers during same clock cycle is okay
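Solution 1 can be pictured with a toy model. This is only a behavioral sketch in Python under stated assumptions: the class name `RegFile` and the method `clock_cycle` are hypothetical, and real hardware does this with timing inside the cycle rather than with sequential statements. The point is the ordering: the write from the WB stage lands first, so a read from the ID stage in the same cycle already sees the new value.

```python
class RegFile:
    """Toy 32-entry register file with write-then-read ordering inside one cycle."""

    def __init__(self):
        self.regs = [0] * 32

    def clock_cycle(self, write=None, reads=()):
        """Apply an optional (index, value) write, then return the read values."""
        # First half of the cycle: the WB stage writes.
        # Register 0 is hardwired to zero in MIPS, so writes to it are ignored.
        if write is not None:
            index, value = write
            if index != 0:
                self.regs[index] = value
        # Second half of the cycle: the ID stage reads and sees the new value.
        return [self.regs[r] for r in reads]


if __name__ == "__main__":
    rf = RegFile()
    # Write register 8 and read registers 8 and 9 in the same cycle.
    print(rf.clock_cycle(write=(8, 42), reads=(8, 9)))
```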

19 Administrivia
Check-in with Project 3

21 2. Data Hazards (1/2)
Consider the following sequence of instructions:
    add $t0, $t1, $t2
    sub $t4, $t0, $t3
    and $t5, $t0, $t6
    or  $t7, $t0, $t8
    xor $t9, $t0, $t10

22 2. Data Hazards (2/2)
Data-flow backwards in time are hazards
(figure: time in clock cycles vs. instruction order, stages IF, ID/RF, EX, MEM, WB drawn as I$, Reg, ALU, D$, Reg)
    add $t0,$t1,$t2
    sub $t4,$t0,$t3
    and $t5,$t0,$t6
    or  $t7,$t0,$t8
    xor $t9,$t0,$t10

23 Data Hazard Solution: Forwarding
Forward result as soon as it is available
    OK that it's not stored in RegFile yet
(figure: stages IF, ID/RF, EX, MEM, WB drawn as I$, Reg, ALU, D$, Reg)
    add $t0,$t1,$t2
    sub $t4,$t0,$t3
    and $t5,$t0,$t6
    or  $t7,$t0,$t8
    xor $t9,$t0,$t10
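Which of the consumers above actually need forwarding depends on how far they sit behind the add. The Python sketch below is an illustration under stated assumptions, not the lecture's method: the tuple encoding and the names `raw_dependencies` and `needs_forwarding` are made up for the example, the producer is assumed to be an ALU instruction (not a load), and the register file is assumed to write in the first half of a cycle and read in the second half, as on the earlier slide.

```python
# Each instruction is (text, destination register, source registers).
PROGRAM = [
    ("add $t0,$t1,$t2",  "$t0", ["$t1", "$t2"]),
    ("sub $t4,$t0,$t3",  "$t4", ["$t0", "$t3"]),
    ("and $t5,$t0,$t6",  "$t5", ["$t0", "$t6"]),
    ("or  $t7,$t0,$t8",  "$t7", ["$t0", "$t8"]),
    ("xor $t9,$t0,$t10", "$t9", ["$t0", "$t10"]),
]


def raw_dependencies(program):
    """List (producer, consumer, register, distance) read-after-write pairs.

    For every source register, only the nearest earlier writer is reported.
    """
    deps = []
    for j, (_, _, sources) in enumerate(program):
        for reg in sources:
            for i in range(j - 1, -1, -1):
                if program[i][1] == reg:
                    deps.append((i, j, reg, j - i))
                    break
    return deps


def needs_forwarding(distance):
    """A consumer 1 or 2 instructions behind reads the register before WB writes it."""
    return distance < 3


if __name__ == "__main__":
    for producer, consumer, reg, distance in raw_dependencies(PROGRAM):
        verdict = "forward" if needs_forwarding(distance) else "RegFile is up to date"
        print(f"{PROGRAM[consumer][0]:<18} reads {reg} at distance {distance}: {verdict}")
```

Under these assumptions sub and and need the forwarded value, while or reads in the same cycle that add writes back, and xor reads one cycle later.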

24 Datapath for Forwarding (1/2)
What changes need to be made here?

25 Datapath for Forwarding (2/2)
Handled by forwarding unit

26 Data Hazard: Loads (1/4)
Recall: Dataflow backwards in time are hazards
    lw  $t0,0($t1)
    sub $t3,$t0,$t2
Can't solve all cases with forwarding
    Must stall instruction dependent on load, then forward (more hardware)

27 Data Hazard: Loads (2/4)
Hardware stalls pipeline
    Called "hardware interlock"
    lw  $t0, 0($t1)
    sub $t3,$t0,$t2    (bubble)
    and $t5,$t0,$t4    (bubble)
    or  $t7,$t0,$t6    (bubble)
Schematically, this is what we want, but in reality stalls done "horizontally"
How to stall just part of pipeline?

28 Data Hazard: Loads (3/4)
Stall is equivalent to nop
    lw  $t0, 0($t1)
    nop                (bubble in every stage)
    sub $t3,$t0,$t2
    and $t5,$t0,$t4
    or  $t7,$t0,$t6

29 Data Hazard: Loads (4/4)
Slot after a load is called a load delay slot
    If that instruction uses the result of the load, then the hardware interlock will stall it for one cycle
    Letting the hardware stall the instruction in the delay slot is equivalent to putting a nop in the slot (except the latter uses more code space)
Idea: Let the compiler put an unrelated instruction in that slot -> no stall!

30 Code Scheduling to Avoid Stalls
Reorder code to avoid use of load result in the next instruction!
MIPS code for D=A+B; E=A+C;

# Method 1:
    lw   $t1, 0($t0)
    lw   $t2, 4($t0)
    add  $t3, $t1, $t2    # Stall!
    sw   $t3, 12($t0)
    lw   $t4, 8($t0)
    add  $t5, $t1, $t4    # Stall!
    sw   $t5, 16($t0)
13 cycles

# Method 2:
    lw   $t1, 0($t0)
    lw   $t2, 4($t0)
    lw   $t4, 8($t0)
    add  $t3, $t1, $t2
    sw   $t3, 12($t0)
    add  $t5, $t1, $t4
    sw   $t5, 16($t0)
11 cycles
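The 13 and 11 cycle counts can be reproduced by counting load-use stalls. The following Python sketch is illustrative only: the helper names `parse`, `load_use_stalls` and `cycles` are assumptions, the parser handles just the handful of MIPS instructions used in this lecture, and the model assumes a 5-stage pipeline with full forwarding where the only stall is one cycle when an instruction uses a lw result in the very next slot.

```python
import re

PIPELINE_DEPTH = 5  # IF, ID, EX, MEM, WB


def parse(line):
    """Split one instruction into (op, dest, sources); small MIPS subset only."""
    op, rest = line.split(None, 1)
    regs = re.findall(r"\$\w+", rest)
    if op in ("sw", "beq", "bne"):   # these read registers but write none
        return op, None, regs
    return op, regs[0], regs[1:]


def load_use_stalls(program):
    """Count instructions that read a lw destination in the next slot."""
    parsed = [parse(line) for line in program]
    stalls = 0
    for prev, curr in zip(parsed, parsed[1:]):
        if prev[0] == "lw" and prev[1] in curr[2]:
            stalls += 1
    return stalls


def cycles(program):
    """Cycles to drain the pipeline: n + (depth - 1) + one per load-use stall."""
    return len(program) + PIPELINE_DEPTH - 1 + load_use_stalls(program)


METHOD_1 = [
    "lw $t1, 0($t0)", "lw $t2, 4($t0)", "add $t3, $t1, $t2", "sw $t3, 12($t0)",
    "lw $t4, 8($t0)", "add $t5, $t1, $t4", "sw $t5, 16($t0)",
]
METHOD_2 = [
    "lw $t1, 0($t0)", "lw $t2, 4($t0)", "lw $t4, 8($t0)", "add $t3, $t1, $t2",
    "sw $t3, 12($t0)", "add $t5, $t1, $t4", "sw $t5, 16($t0)",
]

if __name__ == "__main__":
    print("Method 1:", cycles(METHOD_1), "cycles")
    print("Method 2:", cycles(METHOD_2), "cycles")
```

Both methods execute the same seven instructions; moving the third lw up removes both load-use pairs, which accounts for the two-cycle difference.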

32 3. Control Hazards
Branch (beq, bne) determines flow of control
    Fetching next instruction depends on branch outcome
    Pipeline can't always fetch correct instruction
        Still working on ID stage of branch
Simple Solution: Stall on every branch until we have the new PC value
    How long must we stall?

33 Branch Stall
When is comparison result available?
(figure: time in clock cycles vs. instruction order for beq, Instr 1, Instr 2, Instr 3, Instr 4, each passing through I$, Reg, ALU, D$, Reg)
TWO bubbles required per branch!
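The cost of stalling on every branch can be estimated with a one-line formula: each branch adds two bubble cycles on top of an ideal CPI of 1. The Python sketch below is a back-of-the-envelope illustration; the function name is made up, and the 20% branch frequency in the usage example is a hypothetical figure, not a number from the lecture.

```python
def cpi_with_branch_stalls(branch_fraction, bubbles_per_branch=2, base_cpi=1.0):
    """Average cycles per instruction when every branch stalls the pipeline.

    branch_fraction is the share of executed instructions that are branches.
    """
    return base_cpi + branch_fraction * bubbles_per_branch


if __name__ == "__main__":
    # Hypothetical workload: 20% of executed instructions are branches.
    print(f"CPI: {cpi_with_branch_stalls(0.20):.2f}")
```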

34 Summary
Hazards reduce effectiveness of pipelining
    Cause stalls/bubbles
Structural Hazards
    Conflict in use of datapath component
Data Hazards
    Need to wait for result of a previous instruction
Control Hazards
    Address of next instruction uncertain/unknown
More to come next lecture!

35 Question: For each code sequence below, choose one of the statements below:

1:  lw   $t0,0($t0)
    add  $t1,$t0,$t0

2:  add  $t1,$t0,$t0
    addi $t2,$t0,5
    addi $t4,$t1,5

3:  addi $t1,$t0,1
    addi $t2,$t0,2
    addi $t3,$t0,2
    addi $t3,$t0,4
    addi $t5,$t1,5

A) No stalls as is
B) No stalls with forwarding
C) Must stall

36 Code Sequence 1
(figure: lw, add, instr, instr, instr over time in clock cycles)
Must stall

37 Code Sequence 2
(figure: add, addi, addi, instr, instr over time in clock cycles, with the forwarding and no-forwarding paths marked)
No stalls with forwarding

38 Code Sequence 3
(figure: addi, addi, addi, addi, addi over time in clock cycles)
No stalls as is
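The three answers follow from two rules used throughout the lecture: a lw result used in the very next instruction must stall, and any other result used one or two instructions later needs forwarding. The Python sketch below applies those rules mechanically. It is an illustration under assumptions, not the lecture's solution: `parse` and `classify` are made-up helper names, only the instructions appearing in these slides are handled, and the register file is assumed to write in the first half of a cycle and read in the second half.

```python
import re


def parse(line):
    """Split one instruction into (op, dest, sources); small MIPS subset only."""
    op, rest = line.split(None, 1)
    regs = re.findall(r"\$\w+", rest)
    if op in ("sw", "beq", "bne"):   # these read registers but write none
        return op, None, regs
    return op, regs[0], regs[1:]


def classify(program):
    """Return the quiz statement that applies to a straight-line code sequence."""
    parsed = [parse(line) for line in program]
    answer = "No stalls as is"
    for j, (_, _, sources) in enumerate(parsed):
        for reg in sources:
            # Look back at most two instructions for the nearest writer of reg.
            for i in range(j - 1, max(j - 3, -1), -1):
                if parsed[i][1] == reg:
                    if parsed[i][0] == "lw" and j - i == 1:
                        return "Must stall"
                    answer = "No stalls with forwarding"
                    break
    return answer


SEQ_1 = ["lw $t0,0($t0)", "add $t1,$t0,$t0"]
SEQ_2 = ["add $t1,$t0,$t0", "addi $t2,$t0,5", "addi $t4,$t1,5"]
SEQ_3 = ["addi $t1,$t0,1", "addi $t2,$t0,2", "addi $t3,$t0,2",
         "addi $t3,$t0,4", "addi $t5,$t1,5"]

if __name__ == "__main__":
    for name, seq in (("1", SEQ_1), ("2", SEQ_2), ("3", SEQ_3)):
        print(f"Sequence {name}: {classify(seq)}")
```

In sequence 3 the only dependence is on $t1 at a distance of four instructions, so the value is already in the register file by the time it is read.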
