1048: Computer Organization


1 1048: Computer Organization. Lecture 6: Pipelining. Lecture6 - pipelining (cwli@twins.ee.nctu.edu.tw) 6-1

2 Outline
- An overview of pipelining
- A pipelined datapath
- Pipelined control
- Data hazards and forwarding
- Data hazards and stalls
- Branch hazards
- Exceptions
- Superscalar and dynamic pipelining

3 Pipelining Is Natural!
Laundry example: Ann, Brian, Cathy, and Dave each have one load of clothes (A, B, C, D) to wash, dry, and fold.
- Washer takes 30 minutes
- Dryer takes 40 minutes
- Folder takes 20 minutes

4 Sequential Laundry
[Figure: timeline from 6 PM to midnight; task order A, B, C, D; each load takes 30 + 40 + 20 minutes before the next one starts.]
Sequential laundry takes 6 hours for 4 loads. If they learned pipelining, how long would it take?

5 Pipelined Laundry: Start ASAP
[Figure: timeline starting at 6 PM; task order A, B, C, D; intervals of 30, 40, 40, 40, 40, 20 minutes.]
Pipelined laundry takes 3.5 hours for 4 loads.
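
The two laundry totals can be checked with a few lines of arithmetic. This is a minimal sketch: the function names are hypothetical, and the pipelined formula assumes identical loads that may wait between stages, so the slowest stage sets the spacing between finished loads.

```python
def sequential_minutes(stage_minutes, loads):
    """Each load finishes every stage before the next load starts."""
    return loads * sum(stage_minutes)


def pipelined_minutes(stage_minutes, loads):
    """The first load fills the pipeline; later loads finish one
    slowest-stage interval apart."""
    return sum(stage_minutes) + (loads - 1) * max(stage_minutes)


stages = [30, 40, 20]  # washer, dryer, folder (minutes)
print(sequential_minutes(stages, 4) / 60)  # 6.0 hours
print(pipelined_minutes(stages, 4) / 60)   # 3.5 hours
```

The speedup here is 6 / 3.5, well below the 3 stages, because the stages are unbalanced and only four loads are available to amortize the fill and drain time.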

6 Pipelining Lessons
[Figure: pipelined laundry timeline from 6 PM; task order A, B, C, D; intervals of 30, 40, 40, 40, 40, 20 minutes.]
- Pipelining doesn't help the latency of a single task, but it helps the throughput of the entire workload
- Pipeline rate is limited by the slowest stage
- Multiple tasks work at the same time using different resources
- Potential speedup = number of pipe stages
- Unbalanced stage lengths and the time to fill and drain the pipeline reduce speedup
- Stall for dependences

7 Single-, Multi-cycle, vs. Pipeline
[Figure: clock timing for three implementations.
Single-cycle implementation: Load and Store each take one long cycle; the Store wastes part of its cycle.
Multiple-cycle implementation (cycles 1 to 10): Load (Ifetch, Reg, Exec, Mem, Wr), then Store (Ifetch, Reg, Exec, Mem), then R-type (Ifetch, ...), one after another.
Pipeline implementation: Load, Store, and R-type each pass through Ifetch, Reg, Exec, Mem, Wr, starting one cycle apart.]

8 Pipelining MIPS Execution
[Fig. 6.3: program execution order against time for lw $1, 100($0); lw $2, 200($0); lw $3, 300($0). Each instruction passes through Instruction fetch, Reg, ALU, Data access, Reg. Top: nonpipelined execution, where a new instruction starts every 8 ns. Bottom: pipelined execution with 2 ns stages, where a new instruction starts every 2 ns.]
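
As a rough check of the figure, the total time for the three loads can be computed from the 8 ns instruction time and the 2 ns stage time. The helper names below are hypothetical, and the pipelined case assumes five equal 2 ns stages with no stalls.

```python
def nonpipelined_ns(n_instr, instr_ns=8):
    """Instructions run back to back, each taking the full instruction time."""
    return n_instr * instr_ns


def pipelined_ns(n_instr, stage_ns=2, n_stages=5):
    """The first instruction needs n_stages cycles; each later one adds one cycle."""
    return (n_stages + n_instr - 1) * stage_ns


print(nonpipelined_ns(3))  # 24
print(pipelined_ns(3))     # 14
```

For a long instruction stream the ratio approaches 8 / 2 = 4 rather than 5, because the pipelined clock must fit the slowest stage.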

9 Why Pipeline? Because the Resources Are There!
[Figure: instructions Inst 0 to Inst 4 against time in clock cycles. Each instruction uses the single-cycle datapath resources Im, Reg, ALU, Dm, Reg in successive cycles, so in a full pipeline every resource is busy with a different instruction.]

11 Designing a Pipelined Processor
- Examine the datapath and control diagram
  - Starting with single- or multi-cycle datapath?
  - Single- or multi-cycle control?
- Partition the datapath into stages: IF (instruction fetch), ID (instruction decode and register file read), EX (execution or address calculation), MEM (data memory access), WB (write back)
- Associate resources with stages
- Ensure that flows do not conflict, or figure out how to resolve conflicts
- Assert control in the appropriate stage

12 Use Multicycle Execution Steps
Instruction fetch (all instructions):
  IR = Memory[PC]; PC = PC + 4
Instruction decode/register fetch (all instructions):
  A = Reg[IR[25-21]]; B = Reg[IR[20-16]]; ALUOut = PC + (sign-extend(IR[15-0]) << 2)
Execution, address computation, branch/jump completion:
  R-type: ALUOut = A op B
  Memory-reference: ALUOut = A + sign-extend(IR[15-0])
  Branches: if (A == B) then PC = ALUOut
  Jumps: PC = PC[31-28] || (IR[25-0] << 2)
Memory access or R-type completion:
  R-type: Reg[IR[15-11]] = ALUOut
  Memory-reference: Load: MDR = Memory[ALUOut]; or Store: Memory[ALUOut] = B
Memory read completion:
  Load: Reg[IR[20-16]] = MDR
But, use single-cycle datapath...

13 Split Single-Cycle Datapath
- IF: Instruction fetch
- ID: Instruction decode/register file read
- EX: Execute/address calculation
- MEM: Memory access
- WB: Write back
[Fig. 6.9: the single-cycle datapath divided into the five stages, with the feedback paths marked.]

14 Pipeline Registers
Pipeline registers (latches): IF/ID, ID/EX, EX/MEM, MEM/WB
[Figure: the datapath with the four pipeline registers inserted between the stages.]

15 Consider load
[Figure: Load occupies Cycle 1 to Cycle 5 as Ifetch, Reg/Dec, Exec, Mem, Wr.]
- IF: Instruction Fetch. Fetch the instruction from the Instruction Memory
- ID: Instruction Decode. Registers fetch and instruction decode
- EX: Calculate the memory address
- MEM: Read the data from the Data Memory
- WB: Write the data back to the register file

16 Pipelining load
[Figure: 1st lw in Cycles 1 to 5, 2nd lw in Cycles 2 to 6, 3rd lw in Cycles 3 to 7, each as Ifetch, Reg/Dec, Exec, Mem, Wr.]
The 5 functional units in the pipeline datapath are:
- Instruction Memory for the Ifetch stage
- Register File's read ports (busA and busB) for the Reg/Dec stage
- ALU for the Exec stage
- Data Memory for the MEM stage
- Register File's write port (busW) for the WB stage

17 IF Stage of load
IR = mem[PC]; PC = PC + 4
[Fig. 6.2: lw in instruction fetch; IR and PC + 4 are written into IF/ID.]

18 ID Stage of load
A = Reg[IR[25-21]]; B = Reg[IR[20-16]]; ALUOut = PC + (sign-ext(IR[15-0]) << 2) (some ops moved to the next stage)
[Fig. 6.2: lw in instruction decode.]

19 EX Stage of load
ALUOut = A + sign-ext(IR[15-0])
[Fig. 6.3: lw in execution.]

20 MEM Stage of load
MDR = mem[ALUOut]
[Fig. 6.: lw in memory access.]

21 WB Stage of load
Reg[IR[20-16]] = MDR
Who will supply this address?
[Fig. 6.: lw in write back.]

22 The Four Stages of R-type
[Figure: R-type occupies Cycle 1 to Cycle 4 as Ifetch, Reg/Dec, Exec, Wr.]
- IF: fetch the instruction from the Instruction Memory
- ID: registers fetch and instruction decode
- EX: ALU operates on the two register operands
- WB: write ALU output back to the register file

23 Pipelining R-type and load
[Figure: Cycles 1 to 9; R-type, R-type, Load, R-type, R-type issued one per cycle. R-type: Ifetch, Reg/Dec, Exec, Wr. Load: Ifetch, Reg/Dec, Exec, Mem, Wr.]
Oops! We have a problem!
We have a structural hazard:
- Two instructions try to write to the register file at the same time!
- Only one write port

24 Important Observation
- Each functional unit can only be used once per instruction
- Each functional unit must be used at the same stage for all instructions:
  - Load uses Register File's write port during its 5th stage (Ifetch, Reg/Dec, Exec, Mem, Wr)
  - R-type uses Register File's write port during its 4th stage (Ifetch, Reg/Dec, Exec, Wr)
- Several ways to solve: forwarding, adding pipeline bubble, making instructions same length

25 Solution: Delay R-type's Write
Delay R-type's register write by one cycle:
- R-type also uses Reg File's write port at Stage 5
- MEM is a NOP stage: nothing is being done
- R-type also has 5 stages: Ifetch, Reg/Dec, Exec, Mem, Wr
[Figure: Cycles 1 to 9; R-type, R-type, Load, R-type, R-type issued one per cycle, each as Ifetch, Reg/Dec, Exec, Mem, Wr, so no two instructions write in the same cycle.]
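
The write-port conflict and its fix can be reproduced with a small model. This is a sketch only: the kind names `R-type-4` and `R-type-5` are hypothetical labels for the 4-stage and padded 5-stage R-type, and one instruction is assumed to issue per cycle starting at cycle 1.

```python
from collections import defaultdict

# Stage sequences per instruction kind; "Wr" is the register-file write.
STAGES = {
    "R-type-4": ["Ifetch", "Reg/Dec", "Exec", "Wr"],
    "R-type-5": ["Ifetch", "Reg/Dec", "Exec", "Mem", "Wr"],  # Mem is a NOP stage
    "Load": ["Ifetch", "Reg/Dec", "Exec", "Mem", "Wr"],
}


def write_port_conflicts(kinds):
    """Return {cycle: [instruction indices]} for cycles with more than one Wr."""
    writers = defaultdict(list)
    for i, kind in enumerate(kinds):
        start = i + 1  # instruction i enters Ifetch in cycle i + 1
        wr_cycle = start + STAGES[kind].index("Wr")
        writers[wr_cycle].append(i)
    return {cycle: idx for cycle, idx in writers.items() if len(idx) > 1}


mixed = ["R-type-4", "R-type-4", "Load", "R-type-4", "R-type-4"]
fixed = ["R-type-5", "R-type-5", "Load", "R-type-5", "R-type-5"]
print(write_port_conflicts(mixed))  # {7: [2, 3]}
print(write_port_conflicts(fixed))  # {}
```

In the mixed sequence the load and the R-type behind it both reach Wr in cycle 7; once every instruction writes in its 5th stage, the write cycles are all distinct.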

26 The Four Stages of store
[Figure: Store occupies Cycle 1 to Cycle 4 as Ifetch, Reg/Dec, Exec, Mem, followed by Wr.]
- IF: fetch the instruction from the Instruction Memory
- ID: registers fetch and instruction decode
- EX: calculate the memory address
- MEM: write the data into the Data Memory
- Add an extra stage: WB: NOP

27 The Three Stages of beq
[Figure: Beq occupies Ifetch, Reg/Dec, Exec, followed by Mem and Wr.]
- IF: fetch the instruction from the Instruction Memory
- ID: registers fetch and instruction decode
- EX: compare the two register operands, select the correct branch target address, and latch it into PC
- Add two extra stages: MEM: NOP; WB: NOP

28 Pipelined Datapath
[Fig. 6.7: the complete pipelined datapath with pipeline registers IF/ID, ID/EX, EX/MEM, MEM/WB.]

29 Graphically Representing Pipelines
[Figure: time in clock cycles (CC 1 to CC 6) against program execution order; lw $10, 20($1) and sub $11, $2, $3 each pass through IM, Reg, ALU, DM, Reg, one cycle apart.]
Can help with answering questions like:
- How many cycles does it take to execute this code?
- What is the ALU doing during cycle 4?
Helps in understanding datapaths.
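
The two questions on this slide can be answered mechanically for an ideal pipeline. The sketch below assumes one instruction issued per cycle and no stalls; `pipeline_diagram` and `who_uses` are hypothetical helper names.

```python
STAGES = ["IM", "Reg", "ALU", "DM", "Reg"]


def pipeline_diagram(instructions):
    """Return (total_cycles, rows); each row lists the stage per cycle ('' if idle)."""
    total = len(instructions) + len(STAGES) - 1
    rows = []
    for i, text in enumerate(instructions):
        cells = [""] * total
        for s, name in enumerate(STAGES):
            cells[i + s] = name  # instruction i is in stage s during cycle i + s + 1
        rows.append((text, cells))
    return total, rows


def who_uses(instructions, stage_index, cycle):
    """Instruction occupying a stage in a given 1-based cycle, or None."""
    i = cycle - 1 - stage_index
    return instructions[i] if 0 <= i < len(instructions) else None


program = ["lw $10, 20($1)", "sub $11, $2, $3"]
total, rows = pipeline_diagram(program)
print("cycles:", total)  # cycles: 6
for text, cells in rows:
    print(f"{text:<18}" + "".join(f"{c:<5}" for c in cells))
print(who_uses(program, 2, 4))  # sub $11, $2, $3
```

So the code takes 6 cycles, and in cycle 4 the ALU is working on the sub while the lw is in its data memory stage.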

30 Example 1: Cycle 1
lw $10, 20($1): Instruction fetch
[Fig. 6.8: pipelined datapath, Clock 1.]

31 Example 1: Cycle 2
sub $11, $2, $3: Instruction fetch; lw $10, 20($1): Instruction decode
[Fig. 6.8: pipelined datapath, Clock 2.]

32 Example 1: Cycle 3
sub $11, $2, $3: Instruction decode; lw $10, 20($1): Execution
[Fig. 6.8: pipelined datapath, Clock 3.]

33 Example 1: Cycle 4
sub $11, $2, $3: Execution; lw $10, 20($1): Memory
[Fig. 6.8: pipelined datapath, Clock 4.]

34 Example 1: Cycle 5
sub $11, $2, $3: Memory; lw $10, 20($1): Write back
[Fig. 6.8: pipelined datapath, Clock 5.]

35 Example 1: Cycle 6
sub $11, $2, $3: Write back
[Fig. 6.8: pipelined datapath, Clock 6.]

37 Control Signals
[Figure: pipelined datapath annotated with the control signals PCSrc, RegWrite, Branch, ALUSrc, ALUOp, RegDst, MemRead, MemWrite, and MemtoReg. Instruction [15-0] is sign-extended from 16 to 32 bits and its low 6 bits feed ALU control; Instruction [20-16] and Instruction [15-11] are the two candidate write-register numbers selected by RegDst.]

38 Group Signals According to Stages
Can use control signals of single-cycle CPU (Fig. 6.23, 6.2 <==> 5.2, 5.6)
- Execution/Address Calculation stage control lines: RegDst, ALUOp1, ALUOp0, ALUSrc
- Memory access stage control lines: Branch, MemRead, MemWrite
- Write-back stage control lines: RegWrite, MemtoReg
Instruction  RegDst ALUOp1 ALUOp0 ALUSrc | Branch MemRead MemWrite | RegWrite MemtoReg
R-format     1      1      0      0      | 0      0       0        | 1        0
lw           0      0      0      1      | 0      1       0        | 1        1
sw           X      0      0      1      | 0      0       1        | 0        X
beq          X      0      1      0      | 1      0       0        | 0        X

39 Data Stationary Control
- Pass control signals along just like the data
- Main control generates control signals during ID
[Figure: Control decodes the instruction in ID; the WB, M, and EX control fields are carried through ID/EX, EX/MEM, and MEM/WB.]

40 Data Stationary Control
- Signals for EX (ExtOp, ALUSrc, ...) are used 1 cycle later
- Signals for MEM (MemWr, Branch) are used 2 cycles later
- Signals for WB (MemtoReg, RegWr) are used 3 cycles later
[Figure: Main Control in ID writes ExtOp, ALUSrc, ALUOp, RegDst, MemWr, Branch, MemtoReg, and RegWr into the ID/EX register; the EX/MEM register keeps MemWr, Branch, MemtoReg, and RegWr; the MEM/WB register keeps MemtoReg and RegWr.]
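
A minimal sketch of data-stationary control: the control word is produced once in ID and shrinks as each stage consumes its own signals. The signal grouping follows the slide; the `main_control` table, its symbolic ALUOp values, and the `advance` helper are assumptions made for illustration.

```python
# Which stage consumes each control signal (grouping as on the slide).
EX_SIGNALS = ("ExtOp", "ALUSrc", "ALUOp", "RegDst")
MEM_SIGNALS = ("MemWr", "Branch")
WB_SIGNALS = ("MemtoReg", "RegWr")


def main_control(opcode):
    """Control word generated during ID (only lw and R-type are modeled)."""
    table = {
        "lw": dict(ExtOp=1, ALUSrc=1, ALUOp="add", RegDst=0,
                   MemWr=0, Branch=0, MemtoReg=1, RegWr=1),
        "R-type": dict(ExtOp=0, ALUSrc=0, ALUOp="func", RegDst=1,
                       MemWr=0, Branch=0, MemtoReg=0, RegWr=1),
    }
    return table[opcode]


def advance(control, consumed):
    """Drop the signals used in this stage; the rest go to the next pipeline register."""
    return {name: value for name, value in control.items() if name not in consumed}


id_ex = main_control("lw")             # written at the end of ID
ex_mem = advance(id_ex, EX_SIGNALS)    # EX signals were used 1 cycle after ID
mem_wb = advance(ex_mem, MEM_SIGNALS)  # MEM signals were used 2 cycles after ID
print(sorted(ex_mem))  # ['Branch', 'MemWr', 'MemtoReg', 'RegWr']
print(sorted(mem_wb))  # ['MemtoReg', 'RegWr']
```

Because the signals travel with the instruction, no stage needs to know which instruction it holds; it only reads the fields of the pipeline register in front of it.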

41 Datapath with Control
[Figure: pipelined datapath with the Control unit in ID and the WB, M, and EX control fields carried in ID/EX, EX/MEM, and MEM/WB; signals shown are PCSrc, RegWrite, ALUSrc, ALUOp, RegDst, Branch, MemRead, MemWrite, MemtoReg.]
Remember that? Who will supply this address?

42 Let's Try It Out
lw $10, 20($1)
sub $11, $2, $3
and $12, $4, $5
or $13, $6, $7
add $14, $8, $9

43 Example 2: Cycle 1
IF: lw $10, 20($1); ID: before<1>; EX: before<2>; MEM: before<3>; WB: before<4>
[Figure: datapath with control, Clock 1.]

44 Example 2: Cycle 2
IF: sub $11, $2, $3; ID: lw $10, 20($1); EX: before<1>; MEM: before<2>; WB: before<3>
[Figure: datapath with control, Clock 2.]

45 Example 2: Cycle 3
IF: and $12, $4, $5; ID: sub $11, $2, $3; EX: lw $10, ...; MEM: before<1>; WB: before<2>
[Figure: datapath with control, Clock 3.]

46 Example 2: Cycle 4
IF: or $13, $6, $7; ID: and $12, $4, $5; EX: sub $11, ...; MEM: lw $10, ...; WB: before<1>
[Figure: datapath with control, Clock 4.]

47 Example 2: Cycle 5
IF: add $14, $8, $9; ID: or $13, $6, $7; EX: and $12, ...; MEM: sub $11, ...; WB: lw $10, ...
[Figure: datapath with control, Clock 5.]

48 Example 2: Cycle 6
IF: after<1>; ID: add $14, $8, $9; EX: or $13, ...; MEM: and $12, ...; WB: sub $11, ...
[Figure: datapath with control, Clock 6.]

49 Example 2: Cycle 7
IF: after<2>; ID: after<1>; EX: add $14, ...; MEM: or $13, ...; WB: and $12, ...
[Figure: datapath with control, Clock 7.]

50 Example 2: Cycle 8
IF: after<3>; ID: after<2>; EX: after<1>; MEM: add $14, ...; WB: or $13, ...
[Figure: datapath with control, Clock 8.]

51 Example 2: Cycle 9
IF: after<4>; ID: after<3>; EX: after<2>; MEM: after<1>; WB: add $14, ...
[Figure: datapath with control, Clock 9.]

52 Summary of Pipeline Basics
- Pipelining is a fundamental concept
  - Multiple steps using distinct resources
- Utilize capabilities of the datapath by pipelined instruction processing
  - Start next instruction while working on the current one
  - Limited by length of longest stage (plus fill/flush)
  - Need to detect and resolve hazards
- What makes it easy in MIPS?
  - All instructions are of the same length
  - Just a few instruction formats
  - Memory operands only in loads and stores
- What makes pipelining hard? Pipeline hazards

53 Outline
- An overview of pipelining
- A pipelined datapath
- Pipelined control
- Data hazards and forwarding (R-type and R-type)
- Data hazards and stalls (Load and R-type)
- Branch hazards
- Exceptions
- Superscalar and dynamic pipelining

54 Pipeline Hazards
Pipeline hazards:
- Structural hazards: attempt to use the same resource in two different ways at the same time
- Data hazards: attempt to use an item before it is ready
  - Instruction depends on the result of a prior instruction still in the pipeline
- Control hazards: attempt to make a decision before the condition is evaluated
  - Branch instructions
Can always resolve hazards by waiting:
- Pipeline control must detect the hazard
- Take action (or delay action) to resolve hazards

55 Structural Hazard: Single Memory
[Figure: Load followed by Instr 1 to Instr 4 against time; with a single memory, the Load's data access and a later instruction's fetch need the memory in the same cycle.]
Use 2 memories: data memory and instruction memory

56 Data Hazards
[Figure: time in clock cycles (CC 1 to CC 9) against program execution order, with the value of register $2 shown per cycle; the new value is written in CC 5. Each instruction passes through IM, Reg, ALU, DM, Reg.]
sub $2, $1, $3
and $12, $2, $5
or $13, $6, $2
add $14, $2, $2
sw $15, 100($2)

57 Types of Data Hazards
Three types (inst. i1 followed by inst. i2):
- RAW (read after write): i2 tries to read operand before i1 writes it
- WAR (write after read): i2 tries to write operand before i1 reads it
  - Gets wrong operand, e.g., autoincrement addr.
  - Can't happen in MIPS 5-stage pipeline because all instructions take 5 stages, reads are always in stage 2, and writes are always in stage 5
- WAW (write after write): i2 tries to write operand before i1 writes it
  - Leaves wrong result (i1's, not i2's); occurs only in pipelines that write in more than one stage
  - Can't happen in MIPS 5-stage pipeline because all instructions take 5 stages, and writes are always in stage 5
- RAR (read after read)? Not a hazard, since reads do not change the operand
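
The three dependence types can be told apart by comparing destination and source registers. This is a sketch with a hypothetical `classify` helper; each instruction is modeled as a (destination, sources) pair. It reports register dependences only; whether a dependence becomes an actual hazard depends on the pipeline, and in the MIPS 5-stage pipeline only RAW does.

```python
def classify(i1, i2):
    """Dependences from i1 to a later instruction i2.

    Each instruction is (dest, sources); dest is None if nothing is written.
    """
    dest1, sources1 = i1
    dest2, sources2 = i2
    found = set()
    if dest1 is not None and dest1 in sources2:
        found.add("RAW")  # i2 reads what i1 writes
    if dest2 is not None and dest2 in sources1:
        found.add("WAR")  # i2 writes what i1 reads
    if dest1 is not None and dest1 == dest2:
        found.add("WAW")  # both write the same register
    return found


sub = (2, {1, 3})      # sub $2, $1, $3
and_ = (12, {2, 5})    # and $12, $2, $5
print(classify(sub, and_))  # {'RAW'}
print(classify(and_, sub))  # {'WAR'}
```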

58 Pipeline Hazards Illustrated
[Figure: overlapping stage sequences (IF ID EX MEM WB) illustrating a structural hazard, a RAW (read after write) data hazard, a WAW (write after write) data hazard, and a WAR (write after read) data hazard.]

59 Handling Data Hazards
- Use simple, fixed designs
  - Eliminate WAR by always fetching operands early (ID) in pipeline
  - Eliminate WAW by doing all write backs in order (last stage, static)
  - These features have a lot to do with ISA design
- Internal forwarding in register file:
  - Write in first half of clock and read in second half
  - Read delivers what is written; resolves hazard between sub and add
- Detect and resolve remaining ones
  - Compiler inserts NOP
  - Forward
  - Stall

60 Software Solution
Have compiler guarantee no hazards. Where do we insert the NOPs?
sub $2, $1, $3
and $12, $2, $5
or $13, $6, $2
add $14, $2, $2
sw $15, 100($2)
Problem: this really slows us down!

61 Data Hazards
Insert two nops
[Figure: time in clock cycles (CC 1 to CC 9) against program execution order for sub $2, $1, $3; and $12, $2, $5; or $13, $6, $2; add $14, $2, $2; sw $15, 100($2), with the value of register $2 shown per cycle.]
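
Where the NOPs go can be worked out mechanically. The sketch below assumes the register file writes in the first half of the clock and reads in the second half, so a reader must be at least 3 instruction slots behind the writer; `insert_nops` and its tuple format are hypothetical, and the special case of register $0 is ignored.

```python
def insert_nops(program, min_distance=3):
    """Pad a program with nops so every reader is at least min_distance
    instruction slots after the latest writer of any register it reads.

    program: list of (text, dest_register_or_None, source_registers).
    """
    out = []
    last_write = {}  # register number -> position of its latest writer in out
    for text, dest, sources in program:
        need = 0
        for reg in sources:
            if reg in last_write:
                gap = len(out) - last_write[reg]
                need = max(need, min_distance - gap)
        out.extend(["nop"] * need)
        if dest is not None:
            last_write[dest] = len(out)
        out.append(text)
    return out


program = [
    ("sub $2, $1, $3", 2, (1, 3)),
    ("and $12, $2, $5", 12, (2, 5)),
    ("or $13, $6, $2", 13, (6, 2)),
    ("add $14, $2, $2", 14, (2, 2)),
    ("sw $15, 100($2)", None, (15, 2)),
]
padded = insert_nops(program)
print(padded.count("nop"))  # 2
print(padded[:4])  # ['sub $2, $1, $3', 'nop', 'nop', 'and $12, $2, $5']
```

Both nops land between sub and and; the later readers of $2 are then already far enough behind the sub.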

62 Data Hazards: Forwarding
[Figure: time in clock cycles (CC 1 to CC 9) against program execution order for sub $2, $1, $3; and $12, $2, $5; or $13, $6, $2; add $14, $2, $2; sw $15, 100($2), with the value of register $2 shown per cycle.]

63 Pipeline with Forwarding
[Figure: pipelined datapath with a Forwarding unit. Rs, Rt, and Rd are carried from IF/ID into ID/EX; the unit compares them with EX/MEM.RegisterRd and MEM/WB.RegisterRd and drives the ForwardA and ForwardB multiplexor selects at the ALU inputs.]

64 Detecting Data Hazards
Hazard conditions:
- 1a. EX/MEM.RegisterRd = ID/EX.RegisterRs
- 1b. EX/MEM.RegisterRd = ID/EX.RegisterRt
- 2a. MEM/WB.RegisterRd = ID/EX.RegisterRs
- 2b. MEM/WB.RegisterRd = ID/EX.RegisterRt
Two optimizations:
- Don't forward if instruction does not write register => check if RegWrite is asserted
- Don't forward if destination register is $0 => check if RegisterRd = 0

65 Detecting Data Hazards (cont.)
Hazard conditions using control signals:
- At EX stage: EX/MEM.RegWrite and (EX/MEM.RegRd ≠ 0) and (EX/MEM.RegRd = ID/EX.RegRs)
- At MEM stage: MEM/WB.RegWrite and (MEM/WB.RegRd ≠ 0) and (MEM/WB.RegRd = ID/EX.RegRs)
(replace ID/EX.RegRs with ID/EX.RegRt for the other two conditions)

66 Resolving Hazards: Forwarding
Use temporary results, e.g., those in pipeline registers; don't wait for them to be written
[Figure: time in clock cycles (CC 1 to CC 9) for sub $2, $1, $3; and $12, $2, $5; or $13, $6, $2; add $14, $2, $2; sw $15, 100($2), showing the value of register $2, of EX/MEM, and of MEM/WB per cycle. The new value of $2 is in EX/MEM during CC 4 and in MEM/WB during CC 5.]

67 Pipeline with Forwarding
[Figure: pipelined datapath with Control, the ForwardA and ForwardB multiplexors at the ALU inputs, and a Forwarding unit fed by ID/EX Rs and Rt, EX/MEM.RegisterRd, and MEM/WB.RegisterRd.]

68 Forwarding Logic
- Forwarding: input to ALU from any pipe reg.
  - Add multiplexors to ALU input
  - Control forwarding in EX => carry Rs in ID/EX
- Control signals for forwarding:
  - If both WB and MEM forward, e.g., add $1,$1,$2; add $1,$1,$3; add $1,$1,$4; => let MEM forward
  - EX hazard: if (EX/MEM.RegWrite and (EX/MEM.RegRd ≠ 0) and (EX/MEM.RegRd = ID/EX.RegRs)) ForwardA = 10
  - MEM hazard: if (MEM/WB.RegWrite and (MEM/WB.RegRd ≠ 0) and (EX/MEM.RegRd ≠ ID/EX.RegRs) and (MEM/WB.RegRd = ID/EX.RegRs)) ForwardA = 01
  - (ID/EX.RegRt <-> ID/EX.RegRs, ForwardB <-> ForwardA)
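
The forwarding conditions translate almost directly into code. This is a sketch, not the exact hardware equations: `forward_select` and `forwarding_unit` are hypothetical names, each pipeline register is reduced to a (RegWrite, RegisterRd) pair, and the priority of the EX hazard is expressed by checking it first, which plays the role of the extra inequality term in the MEM-hazard condition.

```python
def forward_select(id_ex_reg, ex_mem, mem_wb):
    """Mux select for one ALU input whose source register is id_ex_reg.

    ex_mem and mem_wb are (RegWrite, RegisterRd) pairs.
    Returns 0b10 (take EX/MEM), 0b01 (take MEM/WB), or 0b00 (register file value).
    """
    ex_write, ex_rd = ex_mem
    mem_write, mem_rd = mem_wb
    if ex_write and ex_rd != 0 and ex_rd == id_ex_reg:
        return 0b10  # EX hazard: the most recent result wins
    if mem_write and mem_rd != 0 and mem_rd == id_ex_reg:
        return 0b01  # MEM hazard
    return 0b00


def forwarding_unit(id_ex_rs, id_ex_rt, ex_mem, mem_wb):
    """Return (ForwardA, ForwardB) for the instruction currently in EX."""
    return (forward_select(id_ex_rs, ex_mem, mem_wb),
            forward_select(id_ex_rt, ex_mem, mem_wb))


# Third instruction of add $1,$1,$2; add $1,$1,$3; add $1,$1,$4 is in EX:
# both older adds write $1, so the newer one (in EX/MEM) must be chosen.
print(forwarding_unit(1, 4, (True, 1), (True, 1)))  # (2, 0)
```

The printed pair (2, 0) is ForwardA = 10 and ForwardB = 00 in binary.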

69 Example 3: Cycle 3
IF: or $4, $4, $2; ID: and $4, $2, $5; EX: sub $2, $1, $3; MEM: before<1>; WB: before<2>
[Figure: datapath with forwarding unit, Clock 3.]

70 Example 3: Cycle 4
IF: add $9, $4, $2; ID: or $4, $4, $2; EX: and $4, $2, $5; MEM: sub $2, ...; WB: before<1>
[Fig. 6.: datapath with forwarding unit, Clock 4.]

71 Example 3: Cycle 5
IF: after<1>; ID: add $9, $4, $2; EX: or $4, $4, $2; MEM: and $4, ...; WB: sub $2, ...
[Figure: datapath with forwarding unit, Clock 5.]

72 Example 3: Cycle 6
IF: after<2>; ID: after<1>; EX: add $9, $4, $2; MEM: or $4, ...; WB: and $4, ...
[Figure: datapath with forwarding unit, Clock 6.]

73 Can't Always Forward
lw can still cause a hazard if followed by an instruction that reads the loaded register. Use stalling or the compiler to resolve.
[Fig. 6.3: program execution order for the sequence below; the loaded value is not available in time for the and.]
lw $2, 20($1)
and $4, $2, $5
or $8, $2, $6
add $9, $4, $2
slt $1, $6, $7

74 Stalling
Stall the pipeline by keeping instructions in the same stage and inserting an NOP instead
[Figure: lw $2, 20($1) proceeds normally; and $4, $2, $5 repeats its Reg stage and or $8, $2, $6 repeats its fetch, leaving a bubble behind lw; add $9, $4, $2 and slt $1, $6, $7 start one cycle later.]

75 Pipeline with Stalling Unit
Forwarding controls ALU inputs; hazard detection controls PC, IF/ID, and control signals
[Figure: pipelined datapath with a Hazard detection unit and a Forwarding unit. The hazard detection unit reads ID/EX.MemRead, ID/EX.RegisterRt, IF/ID.RegisterRs, and IF/ID.RegisterRt, and drives PCWrite, IF/IDWrite, and the multiplexor that zeroes the control signals.]

76 Handling Stalls
- Hazard detection unit in ID inserts a stall between a load instruction and its use:
  if (ID/EX.MemRead and ((ID/EX.RegisterRt = IF/ID.RegisterRs) or (ID/EX.RegisterRt = IF/ID.RegisterRt))) stall the pipeline for one cycle
  (ID/EX.MemRead = 1 indicates a load instruction)
- How to stall?
  - Stall instructions in IF and ID: do not change PC and IF/ID => the stages re-execute the instructions
  - What to move into EX: insert an NOP by changing the EX, MEM, WB control fields of the ID/EX pipeline register to 0
  - As the control signals propagate, all control signals to EX, MEM, WB are deasserted and no registers or memories are written
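
The load-use check and its effect on the pipeline can be sketched as follows. The condition is the one on the slide; the dictionary returned by `hazard_controls` is an assumption for illustration, with `ZeroControl` a hypothetical name for the multiplexor select that zeroes the ID/EX control fields.

```python
def load_use_stall(id_ex_mem_read, id_ex_rt, if_id_rs, if_id_rt):
    """True when the instruction in ID reads the register a load in EX will write."""
    return bool(id_ex_mem_read) and id_ex_rt in (if_id_rs, if_id_rt)


def hazard_controls(stall):
    """Outputs of the hazard detection unit for this cycle."""
    return {
        "PCWrite": not stall,     # hold PC so the same instruction is fetched again
        "IF/IDWrite": not stall,  # hold IF/ID so the same instruction is decoded again
        "ZeroControl": stall,     # send all-zero control (a bubble) into ID/EX
    }


# lw $2, 20($1) is in EX (Rt = 2, MemRead = 1); and $4, $2, $5 is in ID.
stall = load_use_stall(1, 2, if_id_rs=2, if_id_rt=5)
print(stall)                   # True
print(hazard_controls(stall))  # {'PCWrite': False, 'IF/IDWrite': False, 'ZeroControl': True}
```

One cycle later the bubble is in EX, so ID/EX.MemRead is 0 and the check no longer fires; the and then receives the loaded value through forwarding.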

77 Example 4: Cycle 2
IF: and $4, $2, $5; ID: lw $2, 20($1); EX: before<1>; MEM: before<2>; WB: before<3>
[Figure: datapath with hazard detection and forwarding units, Clock 2.]

78 Example 4: Cycle 3
IF: or $4, $4, $2; ID: and $4, $2, $5; EX: lw $2, 20($1); MEM: before<1>; WB: before<2>
[Figure: datapath with hazard detection and forwarding units, Clock 3.]

79 Example 4: Cycle 4
IF: or $4, $4, $2 (held); ID: and $4, $2, $5 (held); EX: bubble; MEM: lw $2, ...; WB: before<1>
[Figure: datapath with hazard detection and forwarding units, Clock 4.]

80 Example 4: Cycle 5
IF: add $9, $4, $2; ID: or $4, $4, $2; EX: and $4, $2, $5; MEM: bubble; WB: lw $2, ...
[Figure: datapath with hazard detection and forwarding units, Clock 5.]

81 Example 4: Cycle 6
IF: after<1>; ID: add $9, $4, $2; EX: or $4, $4, $2; MEM: and $4, ...; WB: bubble
[Fig. 6.9: datapath with hazard detection and forwarding units, Clock 6.]

82 Example 4: Cycle 7
IF: after<2>; ID: after<1>; EX: add $9, $4, $2; MEM: or $4, ...; WB: and $4, ...
[Figure: datapath with hazard detection and forwarding units, Clock 7.]

84 Pipeline Datapath with Control Signals
[Figure: pipelined datapath annotated with PCSrc, RegWrite, Branch, ALUSrc, ALUOp, RegDst, MemRead, MemWrite, and MemtoReg; the branch decision (Branch and Zero) is made in the MEM stage.]

85 Branch Hazards
When we decide to branch, other instructions are in the pipeline!
[Figure: program execution order, each instruction passing through IM, Reg, ALU, DM, Reg.]
40 beq $1, $3, 7
44 and $12, $2, $5
48 or $13, $6, $2
52 add $14, $2, $2
72 lw $4, 50($7)

86 Pipeline Hazards Illustrated
[Figure: overlapping stage sequences (IF ID EX MEM WB) illustrating a structural hazard, a RAW (read after write) data hazard, a WAW (write after write) data hazard, a WAR (write after read) data hazard, and a control hazard.]

87 Handling Branch Hazard
- Predict branch always not taken
  - Need to add hardware for flushing instructions if wrong
  - Branch decision made at MEM => need to flush instructions in IF/ID, ID/EX by changing control values to 0
- Reduce delay of taken branch by moving branch execution earlier in the pipeline
  - Move up branch address calculation to ID
  - Check branch equality at ID (using XOR) by comparing the two registers read during ID
  - Branch decision made at ID => one instruction to flush
  - Add a control signal, IF.Flush, to zero the instruction field of IF/ID => making the instruction an NOP
- Dynamic branch prediction
- Compiler rescheduling, delayed branch
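
Two quantities from the branch slides are easy to compute: the branch target address and the number of wrong-path instructions already fetched when a taken branch resolves. The sketch assumes predict-not-taken with one instruction fetched per cycle; the function names and `STAGE_INDEX` table are hypothetical.

```python
STAGE_INDEX = {"IF": 1, "ID": 2, "EX": 3, "MEM": 4, "WB": 5}


def branch_target(pc, offset_words):
    """Target of a MIPS beq: address of the next instruction plus the word offset times 4."""
    return pc + 4 + (offset_words << 2)


def fetched_after_branch(decision_stage):
    """Instructions fetched behind the branch by the time it reaches decision_stage."""
    return STAGE_INDEX[decision_stage] - 1


print(branch_target(40, 7))         # 72
print(fetched_after_branch("MEM"))  # 3
print(fetched_after_branch("ID"))   # 1
```

With beq $1, $3, 7 at address 40, the target is 72, the address of the lw in the branch hazard figure. Deciding in MEM leaves the three instructions at 44, 48, and 52 in the pipeline, while deciding in ID leaves only one.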

88 Pipeline with Flushing
[Fig. 6.: pipelined datapath with Hazard detection unit, Forwarding unit, the IF.Flush signal into IF/ID, and the branch adder (Sign extend, Shift left 2) and register equality comparator (=) moved into ID.]

89 Example 5: Cycle 3
IF: and $12, $2, $5; ID: beq $1, $3, 7; EX: sub $10, $4, $8; MEM: before<1>; WB: before<2>
[Figure: datapath with flushing, Clock 3; the branch target 72 is computed in ID.]

90 Example 5: Cycle 4
IF: lw $4, 50($7); ID: bubble (nop); EX: beq $1, $3, 7; MEM: sub $10, ...; WB: before<1>
[Figure: datapath with flushing, Clock 4.]

91 Delayed Branch
- Predict-not-taken + branch decision at ID => the following instruction is always executed => branches take effect 1 cycle later
- 0 clock cycle penalty per branch instruction if an instruction can be found to put in the slot (≈50% of the time)
[Figure: instruction order add, beq, misc, lw against time in clock cycles; the instruction after beq (misc) always executes, and lw at the branch target follows it.]

93 Handling Exceptions
- How to stop the pipeline? How to restart?
- Suppose overflow occurs at add $1, $2, $1
  - Disable writes of instructions till the trap hits, e.g., flush following instructions using IF.Flush, ID.Flush, EX.Flush to cause multiplexers to zero control signals (overflow exception detected at EX => flush offending instruction)
  - Force trap instruction into IF, e.g., fetch from 40000040hex by adding 40000040hex to the PC input MUX
  - Save address of offending instruction in EPC

94 Pipeline with Exception
[Fig. 6.55: pipelined datapath with Hazard detection unit, Forwarding unit, the IF.Flush, ID.Flush, and EX.Flush signals, and the Cause and Except PC registers.]

95 Handling Exceptions
- 5 instructions executing in a 5-stage pipeline. Who caused the exception?
- Need to know in which stage an exception can occur => helps determine the cause
Stage  Problem interrupts occurring
IF     Page fault; misaligned memory access; memory-protection violation
ID     Undefined or illegal opcode
EX     Arithmetic exception
MEM    Page fault; misaligned memory access; memory error; memory-protection violation

96 Handling Exceptions
- Who to serve first, if multiple interrupts occur at the same time?
- Multiple interrupts: use priority hardware to choose the earliest instruction to interrupt
- External interrupts: flexible in when to interrupt

98 Instruction Level Parallelism, ILP
How to increase the potential amount of ILP:
- Increase the depth of the pipeline to overlap more instructions: super-pipeline
- Launch multiple instructions
  - Static multiple issue (decision made by compiler before execution)
  - Dynamic multiple issue (decision made during execution by the processor)

99 Different Pipelined Designs
[Figure: stage diagrams (IF, D, E, W) for each design.]
- Pipelining. Limitation: issue rate, FU stalls, FU depth
- Super-pipeline: issue one instruction per (fast) cycle; ALU takes multiple cycles. Limitation: clock skew, FU stalls, FU depth
- Super-scalar: issue multiple scalar instructions per cycle. Limitation: hazard resolution
- VLIW (EPIC): each instruction specifies multiple scalar operations; compiler determines parallelism. Limitation: packing
- Vector operations: each instruction specifies a series of identical operations. Limitation: applicability

100 Static Multiple Issue. Use the compiler to assist with packing instructions and handling hazards. Very Long Instruction Word (VLIW). Explicitly Parallel Instruction Computing (EPIC) (Intel IA-64).
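The compiler's packing job can be sketched as follows. This is a minimal illustration under stated assumptions, not the method used by any real compiler: each issue packet is assumed to hold one ALU/branch instruction and one load/store instruction, the packer only looks at adjacent instructions, and the names `Instr`, `MEM_OPS`, and `pack_two_issue` are hypothetical.

```python
from typing import NamedTuple, Optional


class Instr(NamedTuple):
    op: str               # mnemonic, e.g. "add", "lw", "sw"
    dest: Optional[str]   # register written, or None
    srcs: tuple           # registers read


MEM_OPS = {"lw", "sw"}    # assumed load/store slot instructions


def pack_two_issue(program):
    """Greedily pack adjacent instructions into (alu_slot, mem_slot) packets."""
    packets = []
    i = 0
    while i < len(program):
        first = program[i]
        second = program[i + 1] if i + 1 < len(program) else None
        can_pair = (
            second is not None
            # one ALU/branch instruction and one load/store per packet
            and (first.op in MEM_OPS) != (second.op in MEM_OPS)
            # the second instruction must not read the first one's result
            and first.dest not in second.srcs
            # the two instructions must not write the same register
            and (first.dest is None or first.dest != second.dest)
        )
        if can_pair:
            alu, mem = (first, second) if second.op in MEM_OPS else (second, first)
            packets.append((alu, mem))
            i += 2
        else:
            # Issue alone; the empty slot (None) stands for a nop.
            packets.append((None, first) if first.op in MEM_OPS else (first, None))
            i += 1
    return packets


program = [
    Instr("lw", "$t0", ("$s1",)),
    Instr("add", "$t0", ("$t0", "$s2")),
    Instr("sw", None, ("$t0", "$s1")),
    Instr("addi", "$s1", ("$s1",)),
]
packets = pack_two_issue(program)
```

In this example the `lw` and `add` cannot share a packet because `add` reads `$t0`, so only the final `sw` and `addi` are paired. A real compiler would also reorder instructions to fill the empty slots.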

101 A Static Two-issue Datapath (Fig. 6.5). Figure: two-issue datapath with PC, instruction memory, registers, two sign-extend units, two ALUs, and data memory.

102 Dynamic Multiple Issue. The hardware performs the scheduling: hardware tries to find instructions to execute; out-of-order execution is possible; speculative execution and dynamic branch prediction. Superscalar.

103 Superscalar: Three Primary Units (Fig.). Instruction fetch and decode unit (in-order issue); reservation stations feeding the functional units: Integer, Integer, Floating point, Load/Store (out-of-order execute); commit unit (in-order commit).
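The effect of in-order commit on out-of-order execution can be shown with a small sketch. It assumes the commit unit can retire any number of instructions per cycle, and the name `commit_cycles` is illustrative: an instruction may commit only once it and every earlier instruction have finished executing.

```python
def commit_cycles(finish_cycles):
    """Map each instruction's finish cycle (in program order) to its commit cycle.

    An instruction commits no earlier than all instructions before it, so the
    commit cycle is the running maximum of the finish cycles.
    """
    commits = []
    latest = 0
    for finish in finish_cycles:
        latest = max(latest, finish)
        commits.append(latest)
    return commits
```

With finish cycles `[5, 3, 4, 9, 6]`, the second and third instructions finish early but wait until cycle 5 to commit, which keeps the architectural state in program order for exceptions.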

104 Simple Superscalar. Independent INT and FP issue to separate pipelines. Figure blocks: I-Cache, INT Reg, Inst Issue and Bypass, FP Reg, Operand / Result Busses, INT Unit, Load / Store Unit, FP Add, FP Mul, D-Cache.

105 Dynamic Scheduling. All modern processors are very complicated. DEC Alpha 21264: 9-stage pipeline, 6-instruction issue. PowerPC and Pentium: branch history table. Compiler technology is important.

106 Summary. Pipelines pass control information down the pipe just as data moves down the pipe. Forwarding/stalls are handled by local control. Exceptions stop the pipeline. The MIPS instruction set architecture made the pipeline visible (delayed branch, delayed load). More performance comes from deeper pipelines and parallelism.

Chapter 6: Pipelining

Chapter 6: Pipelining Chapter 6: Pipelining Otline An overview of pipelining A pipelined path Pipelined control Data hazards and forwarding Data hazards and stalls Branch hazards Eceptions Sperscalar and dynamic pipelining

More information

Review: Computer Organization

Review: Computer Organization Review: Compter Organization Pipelining Chans Y Landry Eample Landry Eample Ann, Brian, Cathy, Dave each have one load of clothes to wash, dry, and fold Washer takes 3 mintes A B C D Dryer takes 3 mintes

More information

PS Midterm 2. Pipelining

PS Midterm 2. Pipelining PS idterm 2 Pipelining Seqential Landry 6 P 7 8 9 idnight Time T a s k O r d e r A B C D 3 4 2 3 4 2 3 4 2 3 4 2 Seqential landry takes 6 hors for 4 loads If they learned pipelining, how long wold landry

More information

TDT4255 Friday the 21st of October. Real world examples of pipelining? How does pipelining influence instruction

TDT4255 Friday the 21st of October. Real world examples of pipelining? How does pipelining influence instruction Review Friday the 2st of October Real world eamples of pipelining? How does pipelining pp inflence instrction latency? How does pipelining inflence instrction throghpt? What are the three types of hazard

More information

Enhanced Performance with Pipelining

Enhanced Performance with Pipelining Chapter 6 Enhanced Performance with Pipelining Note: The slides being presented represent a mi. Some are created by ark Franklin, Washington University in St. Lois, Dept. of CSE. any are taken from the

More information

Comp 303 Computer Architecture A Pipelined Datapath Control. Lecture 13

Comp 303 Computer Architecture A Pipelined Datapath Control. Lecture 13 Comp 33 Compter Architectre A Pipelined path Lectre 3 Pipelined path with Signals PCSrc IF/ ID ID/ EX EX / E E / Add PC 4 Address Instrction emory RegWr ra rb rw Registers bsw [5-] [2-6] [5-] bsa bsb Sign

More information

What do we have so far? Multi-Cycle Datapath

What do we have so far? Multi-Cycle Datapath What do we have so far? lti-cycle Datapath CPI: R-Type = 4, Load = 5, Store 4, Branch = 3 Only one instrction being processed in datapath How to lower CPI frther? #1 Lec # 8 Spring2 4-11-2 Pipelining pipelining

More information

Pipelining. Chapter 4

Pipelining. Chapter 4 Pipelining Chapter 4 ake processor rns faster Pipelining is an implementation techniqe in which mltiple instrctions are overlapped in eection Key of making processor fast Pipelining Single cycle path we

More information

Chapter 3 & Appendix C Pipelining Part A: Basic and Intermediate Concepts

Chapter 3 & Appendix C Pipelining Part A: Basic and Intermediate Concepts CS359: Compter Architectre Chapter 3 & Appendi C Pipelining Part A: Basic and Intermediate Concepts Yanyan Shen Department of Compter Science and Engineering Shanghai Jiao Tong University 1 Otline Introdction

More information

Chapter 6 Enhancing Performance with. Pipelining. Pipelining. Pipelined vs. Single-Cycle Instruction Execution: the Plan. Pipelining: Keep in Mind

Chapter 6 Enhancing Performance with. Pipelining. Pipelining. Pipelined vs. Single-Cycle Instruction Execution: the Plan. Pipelining: Keep in Mind Pipelining hink of sing machines in landry services Chapter 6 nhancing Performance with Pipelining 6 P 7 8 9 A ime ask A B C ot pipelined Assme 3 min. each task wash, dry, fold, store and that separate

More information

Overview of Pipelining

Overview of Pipelining EEC 58 Compter Architectre Pipelining Department of Electrical Engineering and Compter Science Cleveland State University Fndamental Principles Overview of Pipelining Pipelined Design otivation: Increase

More information

EEC 483 Computer Organization

EEC 483 Computer Organization EEC 83 Compter Organization Chapter.6 A Pipelined path Chans Y Pipelined Approach 2 - Cycle time, No. stages - Resorce conflict E E A B C D 3 E E 5 E 2 3 5 2 6 7 8 9 c.y9@csohio.ed Resorces sed in 5 Stages

More information

Computer Architecture Chapter 5. Fall 2005 Department of Computer Science Kent State University

Computer Architecture Chapter 5. Fall 2005 Department of Computer Science Kent State University Compter Architectre Chapter 5 Fall 25 Department of Compter Science Kent State University The Processor: Datapath & Control Or implementation of the MIPS is simplified memory-reference instrctions: lw,

More information

Chapter 6: Pipelining

Chapter 6: Pipelining CSE 322 COPUTER ARCHITECTURE II Chapter 6: Pipelining Chapter 6: Pipelining Febrary 10, 2000 1 Clothes Washing CSE 322 COPUTER ARCHITECTURE II The Assembly Line Accmlate dirty clothes in hamper Place in

More information

CS 251, Winter 2019, Assignment % of course mark

CS 251, Winter 2019, Assignment % of course mark CS 25, Winter 29, Assignment.. 3% of corse mark De Wednesday, arch 3th, 5:3P Lates accepted ntil Thrsday arch th, pm with a 5% penalty. (7 points) In the diagram below, the mlticycle compter from the corse

More information

CS 251, Winter 2018, Assignment % of course mark

CS 251, Winter 2018, Assignment % of course mark CS 25, Winter 28, Assignment 4.. 3% of corse mark De Wednesday, arch 7th, 4:3P Lates accepted ntil Thrsday arch 8th, am with a 5% penalty. (6 points) In the diagram below, the mlticycle compter from the

More information

Exceptions and interrupts

Exceptions and interrupts Eceptions and interrpts An eception or interrpt is an nepected event that reqires the CPU to pase or stop the crrent program. Eception handling is the hardware analog of error handling in software. Classes

More information

1048: Computer Organization

1048: Computer Organization 48: Compter Organization Lectre 5 Datapath and Control Lectre5A - simple implementation (cwli@twins.ee.nct.ed.tw) 5A- Introdction In this lectre, we will try to implement simplified IPS which contain emory

More information

The single-cycle design from last time

The single-cycle design from last time lticycle path Last time we saw a single-cycle path and control nit for or simple IPS-based instrction set. A mlticycle processor fies some shortcomings in the single-cycle CPU. Faster instrctions are not

More information

The final datapath. M u x. Add. 4 Add. Shift left 2. PCSrc. RegWrite. MemToR. MemWrite. Read data 1 I [25-21] Instruction. Read. register 1 Read.

The final datapath. M u x. Add. 4 Add. Shift left 2. PCSrc. RegWrite. MemToR. MemWrite. Read data 1 I [25-21] Instruction. Read. register 1 Read. The final path PC 4 Add Reg Shift left 2 Add PCSrc Instrction [3-] Instrction I [25-2] I [2-6] I [5 - ] register register 2 register 2 Registers ALU Zero Reslt ALUOp em Data emtor RegDst ALUSrc em I [5

More information

Pipelined Datapath. Reading. Sections Practice Problems: 1, 3, 8, 12 (2) Lecture notes from MKP, H. H. Lee and S.

Pipelined Datapath. Reading. Sections Practice Problems: 1, 3, 8, 12 (2) Lecture notes from MKP, H. H. Lee and S. Pipelined Datapath Lecture notes from KP, H. H. Lee and S. Yalamanchili Sections 4.5 4. Practice Problems:, 3, 8, 2 ing (2) Pipeline Performance Assume time for stages is ps for register read or write

More information

The extra single-cycle adders

The extra single-cycle adders lticycle Datapath As an added bons, we can eliminate some of the etra hardware from the single-cycle path. We will restrict orselves to sing each fnctional nit once per cycle, jst like before. Bt since

More information

Solutions for Chapter 6 Exercises

Solutions for Chapter 6 Exercises Soltions for Chapter 6 Eercises Soltions for Chapter 6 Eercises 6. 6.2 a. Shortening the ALU operation will not affect the speedp obtained from pipelining. It wold not affect the clock cycle. b. If the

More information

Design a MIPS Processor (2/2)

Design a MIPS Processor (2/2) 93-2Digital System Design Design a MIPS Processor (2/2) Lecturer: Chihhao Chao Advisor: Prof. An-Yeu Wu 2005/5/13 Friday ACCESS IC LABORTORY Outline v 6.1 An Overview of Pipelining v 6.2 A Pipelined Datapath

More information

PIPELINING. Pipelining: Natural Phenomenon. Pipelining. Pipelining Lessons

PIPELINING. Pipelining: Natural Phenomenon. Pipelining. Pipelining Lessons Pipelining: Natral Phenomenon Landry Eample: nn, rian, Cathy, Dave each have one load of clothes to wash, dry, and fold Washer takes 30 mintes C D Dryer takes 0 mintes PIPELINING Folder takes 20 mintes

More information

EEC 483 Computer Organization

EEC 483 Computer Organization EEC 483 Compter Organization Chapter 4.4 A Simple Implementation Scheme Chans Y The Big Pictre The Five Classic Components of a Compter Processor Control emory Inpt path Otpt path & Control 2 path and

More information

Pipelined Datapath. Reading. Sections Practice Problems: 1, 3, 8, 12

Pipelined Datapath. Reading. Sections Practice Problems: 1, 3, 8, 12 Pipelined Datapath Lecture notes from KP, H. H. Lee and S. Yalamanchili Sections 4.5 4. Practice Problems:, 3, 8, 2 ing Note: Appendices A-E in the hardcopy text correspond to chapters 7- in the online

More information

EEC 483 Computer Organization. Branch (Control) Hazards

EEC 483 Computer Organization. Branch (Control) Hazards EEC 483 Compter Organization Section 4.8 Branch Hazards Section 4.9 Exceptions Chans Y Branch (Control) Hazards While execting a previos branch, next instrction address might not yet be known. s n i o

More information

Lecture 3: The Processor (Chapter 4 of textbook) Chapter 4.1

Lecture 3: The Processor (Chapter 4 of textbook) Chapter 4.1 Lecture 3: The Processor (Chapter 4 of textbook) Chapter 4.1 Introduction Chapter 4.1 Chapter 4.2 Review: MIPS (RISC) Design Principles Simplicity favors regularity fixed size instructions small number

More information

EXAMINATIONS 2010 END OF YEAR NWEN 242 COMPUTER ORGANIZATION

EXAMINATIONS 2010 END OF YEAR NWEN 242 COMPUTER ORGANIZATION EXAINATIONS 2010 END OF YEAR COPUTER ORGANIZATION Time Allowed: 3 Hors (180 mintes) Instrctions: Answer all qestions. ake sre yor answers are clear and to the point. Calclators and paper foreign langage

More information

Quiz #1 EEC 483, Spring 2019

Quiz #1 EEC 483, Spring 2019 Qiz # EEC 483, Spring 29 Date: Jan 22 Name: Eercise #: Translate the following instrction in C into IPS code. Eercise #2: Translate the following instrction in C into IPS code. Hint: operand C is stored

More information

The multicycle datapath. Lecture 10 (Wed 10/15/2008) Finite-state machine for the control unit. Implementing the FSM

The multicycle datapath. Lecture 10 (Wed 10/15/2008) Finite-state machine for the control unit. Implementing the FSM Lectre (Wed /5/28) Lab # Hardware De Fri Oct 7 HW #2 IPS programming, de Wed Oct 22 idterm Fri Oct 2 IorD The mlticycle path SrcA Today s objectives: icroprogramming Etending the mlti-cycle path lti-cycle

More information

Computer Architecture. Lecture 6: Pipelining

Computer Architecture. Lecture 6: Pipelining Compter Architectre Lectre 6: Pipelining Dr. Ahmed Sallam Based on original slides by Prof. Onr tl Agenda for Today & Net Few Lectres Single-cycle icroarchitectres lti-cycle and icroprogrammed icroarchitectres

More information

1048: Computer Organization

1048: Computer Organization 48: Compter Organization Lectre 5 Datapath and Control Lectre5B - mlticycle implementation (cwli@twins.ee.nct.ed.tw) 5B- Recap: A Single-Cycle Processor PCSrc 4 Add Shift left 2 Add ALU reslt PC address

More information

Review. A single-cycle MIPS processor

Review. A single-cycle MIPS processor Review If three instrctions have opcodes, 7 and 5 are they all of the same type? If we were to add an instrction to IPS of the form OD $t, $t2, $t3, which performs $t = $t2 OD $t3, what wold be its opcode?

More information

Outline Marquette University

Outline Marquette University COEN-4710 Computer Hardware Lecture 4 Processor Part 2: Pipelining (Ch.4) Cristinel Ababei Department of Electrical and Computer Engineering Credits: Slides adapted primarily from presentations from Mike

More information

Review Multicycle: What is Happening. Controlling The Multicycle Design

Review Multicycle: What is Happening. Controlling The Multicycle Design Review lticycle: What is Happening Reslt Zero Op SrcA SrcB Registers Reg Address emory em Data Sign etend Shift left Sorce A B Ot [-6] [5-] [-6] [5-] [5-] Instrction emory IR RegDst emtoreg IorD em em

More information

EXAMINATIONS 2003 END-YEAR COMP 203. Computer Organisation

EXAMINATIONS 2003 END-YEAR COMP 203. Computer Organisation EXAINATIONS 2003 COP203 END-YEAR Compter Organisation Time Allowed: 3 Hors (180 mintes) Instrctions: Answer all qestions. There are 180 possible marks on the eam. Calclators and foreign langage dictionaries

More information

Lecture 7. Building A Simple Processor

Lecture 7. Building A Simple Processor Lectre 7 Bilding A Simple Processor Christos Kozyrakis Stanford University http://eeclass.stanford.ed/ee8b C. Kozyrakis EE8b Lectre 7 Annoncements Upcoming deadlines Lab is de today Demo by 5pm, report

More information

Improve performance by increasing instruction throughput

Improve performance by increasing instruction throughput Improve performance by increasing instruction throughput Program execution order Time (in instructions) lw $1, 100($0) fetch 2 4 6 8 10 12 14 16 18 ALU Data access lw $2, 200($0) 8ns fetch ALU Data access

More information

PART I: Adding Instructions to the Datapath. (2 nd Edition):

PART I: Adding Instructions to the Datapath. (2 nd Edition): EE57 Instrctor: G. Pvvada ===================================================================== Homework #5b De: check on the blackboard =====================================================================

More information

Chapter 4 (Part II) Sequential Laundry

Chapter 4 (Part II) Sequential Laundry Chapter 4 (Part II) The Processor Baback Izadi Division of Engineering Programs bai@engr.newpaltz.edu Sequential Laundry 6 P 7 8 9 10 11 12 1 2 A T a s k O r d e r A B C D 30 30 30 30 30 30 30 30 30 30

More information

Instruction Pipelining is the use of pipelining to allow more than one instruction to be in some stage of execution at the same time.

Instruction Pipelining is the use of pipelining to allow more than one instruction to be in some stage of execution at the same time. Pipelining Pipelining is the se of pipelining to allow more than one instrction to be in some stage of eection at the same time. Ferranti ATLAS (963): Pipelining redced the average time per instrction

More information

Instruction fetch. MemRead. IRWrite ALUSrcB = 01. ALUOp = 00. PCWrite. PCSource = 00. ALUSrcB = 00. R-type completion

Instruction fetch. MemRead. IRWrite ALUSrcB = 01. ALUOp = 00. PCWrite. PCSource = 00. ALUSrcB = 00. R-type completion . (Chapter 5) Fill in the vales for SrcA, SrcB, IorD, Dst and emto to complete the Finite State achine for the mlti-cycle datapath shown below. emory address comptation 2 SrcA = SrcB = Op = fetch em SrcA

More information

Lecture 10: Pipelined Implementations

Lecture 10: Pipelined Implementations U 8-7 S 9 L- 8-7 Lectre : Pipelined Implementations James. Hoe ept of EE, U Febrary 23, 29 nnoncements: Project is de this week idterm graded, d reslts posted Handots: H9 Homework 3 (on lackboard) Graded

More information

CSCI 402: Computer Architectures. Fengguang Song Department of Computer & Information Science IUPUI. Today s Content

CSCI 402: Computer Architectures. Fengguang Song Department of Computer & Information Science IUPUI. Today s Content 3/6/8 CSCI 42: Computer Architectures The Processor (2) Fengguang Song Department of Computer & Information Science IUPUI Today s Content We have looked at how to design a Data Path. 4.4, 4.5 We will design

More information

Computer Architecture

Computer Architecture Compter Architectre Lectre 4: Intro to icroarchitectre: Single- Cycle Dr. Ahmed Sallam Sez Canal University Spring 25 Based on original slides by Prof. Onr tl Review Compter Architectre Today and Basics

More information

Pipelining Analogy. Pipelined laundry: overlapping execution. Parallelism improves performance. Four loads: Non-stop: Speedup = 8/3.5 = 2.3.

Pipelining Analogy. Pipelined laundry: overlapping execution. Parallelism improves performance. Four loads: Non-stop: Speedup = 8/3.5 = 2.3. Pipelining Analogy Pipelined laundry: overlapping execution Parallelism improves performance Four loads: Speedup = 8/3.5 = 2.3 Non-stop: Speedup =2n/05n+15 2n/0.5n 1.5 4 = number of stages 4.5 An Overview

More information

Pipeline Hazards. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Pipeline Hazards. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University Pipeline Hazards Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Hazards What are hazards? Situations that prevent starting the next instruction

More information

CSE Introduction to Computer Architecture Chapter 5 The Processor: Datapath & Control

CSE Introduction to Computer Architecture Chapter 5 The Processor: Datapath & Control CSE-45432 Introdction to Compter Architectre Chapter 5 The Processor: Datapath & Control Dr. Izadi Data Processor Register # PC Address Registers ALU memory Register # Register # Address Data memory Data

More information

Lecture 3. Pipelining. Dr. Soner Onder CS 4431 Michigan Technological University 9/23/2009 1

Lecture 3. Pipelining. Dr. Soner Onder CS 4431 Michigan Technological University 9/23/2009 1 Lecture 3 Pipelining Dr. Soner Onder CS 4431 Michigan Technological University 9/23/2009 1 A "Typical" RISC ISA 32-bit fixed format instruction (3 formats) 32 32-bit GPR (R0 contains zero, DP take pair)

More information

COMPUTER ORGANIZATION AND DESIGN

COMPUTER ORGANIZATION AND DESIGN COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle

More information

COMPUTER ORGANIZATION AND DESIGN

COMPUTER ORGANIZATION AND DESIGN COMPUTER ORGANIZATION AND DESIGN 5 Edition th The Hardware/Software Interface Chapter 4 The Processor 4.1 Introduction Introduction CPU performance factors Instruction count CPI and Cycle time Determined

More information

Computer Architecture

Computer Architecture Compter Architectre Lectre 4: Intro to icroarchitectre: Single- Cycle Dr. Ahmed Sallam Sez Canal University Based on original slides by Prof. Onr tl Review Compter Architectre Today and Basics (Lectres

More information

Department of Computer and IT Engineering University of Kurdistan. Computer Architecture Pipelining. By: Dr. Alireza Abdollahpouri

Department of Computer and IT Engineering University of Kurdistan. Computer Architecture Pipelining. By: Dr. Alireza Abdollahpouri Department of Computer and IT Engineering University of Kurdistan Computer Architecture Pipelining By: Dr. Alireza Abdollahpouri Pipelined MIPS processor Any instruction set can be implemented in many

More information

Processor Design CSCE Instructor: Saraju P. Mohanty, Ph. D. NOTE: The figures, text etc included in slides are borrowed

Processor Design CSCE Instructor: Saraju P. Mohanty, Ph. D. NOTE: The figures, text etc included in slides are borrowed Lecture 3: General Purpose Processor Design CSCE 665 Advanced VLSI Systems Instructor: Saraju P. ohanty, Ph. D. NOTE: The figures, tet etc included in slides are borrowed from various books, websites,

More information

Full Datapath. Chapter 4 The Processor 2

Full Datapath. Chapter 4 The Processor 2 Pipelining Full Datapath Chapter 4 The Processor 2 Datapath With Control Chapter 4 The Processor 3 Performance Issues Longest delay determines clock period Critical path: load instruction Instruction memory

More information

Chapter 4. The Processor

Chapter 4. The Processor Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A simplified

More information

Full Datapath. Chapter 4 The Processor 2

Full Datapath. Chapter 4 The Processor 2 Pipelining Full Datapath Chapter 4 The Processor 2 Datapath With Control Chapter 4 The Processor 3 Performance Issues Longest delay determines clock period Critical path: load instruction Instruction memory

More information

Prof. Kozyrakis. 1. (10 points) Consider the following fragment of Java code:

Prof. Kozyrakis. 1. (10 points) Consider the following fragment of Java code: EE8 Winter 25 Homework #2 Soltions De Thrsday, Feb 2, 5 P. ( points) Consider the following fragment of Java code: for (i=; i

More information

Lecture 6: Pipelining

Lecture 6: Pipelining Lecture 6: Pipelining i CSCE 26 Computer Organization Instructor: Saraju P. ohanty, Ph. D. NOTE: The figures, text etc included in slides are borrowed from various books, websites, authors pages, and other

More information

EITF20: Computer Architecture Part2.2.1: Pipeline-1

EITF20: Computer Architecture Part2.2.1: Pipeline-1 EITF20: Computer Architecture Part2.2.1: Pipeline-1 Liang Liu liang.liu@eit.lth.se 1 Outline Reiteration Pipelining Harzards Structural hazards Data hazards Control hazards Implementation issues Multi-cycle

More information

CPS104 Computer Organization and Programming Lecture 19: Pipelining. Robert Wagner

CPS104 Computer Organization and Programming Lecture 19: Pipelining. Robert Wagner CPS104 Computer Organization and Programming Lecture 19: Pipelining Robert Wagner cps 104 Pipelining..1 RW Fall 2000 Lecture Overview A Pipelined Processor : Introduction to the concept of pipelined processor.

More information

Full Datapath. CSCI 402: Computer Architectures. The Processor (2) 3/21/19. Fengguang Song Department of Computer & Information Science IUPUI

Full Datapath. CSCI 402: Computer Architectures. The Processor (2) 3/21/19. Fengguang Song Department of Computer & Information Science IUPUI CSCI 42: Computer Architectures The Processor (2) Fengguang Song Department of Computer & Information Science IUPUI Full Datapath Branch Target Instruction Fetch Immediate 4 Today s Contents We have looked

More information

Lecture 6: Microprogrammed Multi Cycle Implementation. James C. Hoe Department of ECE Carnegie Mellon University

Lecture 6: Microprogrammed Multi Cycle Implementation. James C. Hoe Department of ECE Carnegie Mellon University 8 447 Lectre 6: icroprogrammed lti Cycle Implementation James C. Hoe Department of ECE Carnegie ellon University 8 447 S8 L06 S, James C. Hoe, CU/ECE/CALC, 208 Yor goal today Hosekeeping nderstand why

More information

Pipelining. Maurizio Palesi

Pipelining. Maurizio Palesi * Pipelining * Adapted from David A. Patterson s CS252 lecture slides, http://www.cs.berkeley/~pattrsn/252s98/index.html Copyright 1998 UCB 1 References John L. Hennessy and David A. Patterson, Computer

More information

CSE 141 Computer Architecture Summer Session I, Lectures 10 Advanced Topics, Memory Hierarchy and Cache. Pramod V. Argade

CSE 141 Computer Architecture Summer Session I, Lectures 10 Advanced Topics, Memory Hierarchy and Cache. Pramod V. Argade CSE 141 Compter Architectre Smmer Session I, 2004 Lectres 10 Advanced Topics, emory Hierarchy and Cache Pramod V. Argade CSE141: Introdction to Compter Architectre Instrctor: TA: Pramod V. Argade (p2argade@cs.csd.ed)

More information

Chapter 4 The Processor 1. Chapter 4A. The Processor

Chapter 4 The Processor 1. Chapter 4A. The Processor Chapter 4 The Processor 1 Chapter 4A The Processor Chapter 4 The Processor 2 Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware

More information

Chapter 4 The Processor 1. Chapter 4B. The Processor

Chapter 4 The Processor 1. Chapter 4B. The Processor Chapter 4 The Processor 1 Chapter 4B The Processor Chapter 4 The Processor 2 Control Hazards Branch determines flow of control Fetching next instruction depends on branch outcome Pipeline can t always

More information

Chapter 4. The Processor

Chapter 4. The Processor Chapter 4 The Processor Recall. ISA? Instruction Fetch Instruction Decode Operand Fetch Execute Result Store Next Instruction Instruction Format or Encoding how is it decoded? Location of operands and

More information

EITF20: Computer Architecture Part2.2.1: Pipeline-1

EITF20: Computer Architecture Part2.2.1: Pipeline-1 EITF20: Computer Architecture Part2.2.1: Pipeline-1 Liang Liu liang.liu@eit.lth.se 1 Outline Reiteration Pipelining Harzards Structural hazards Data hazards Control hazards Implementation issues Multi-cycle

More information

EITF20: Computer Architecture Part2.2.1: Pipeline-1

EITF20: Computer Architecture Part2.2.1: Pipeline-1 EITF20: Computer Architecture Part2.2.1: Pipeline-1 Liang Liu liang.liu@eit.lth.se 1 Outline Reiteration Pipelining Harzards Structural hazards Data hazards Control hazards Implementation issues Multi-cycle

More information

What do we have so far? Multi-Cycle Datapath (Textbook Version)

What do we have so far? Multi-Cycle Datapath (Textbook Version) What do we have so far? ulti-cycle Datapath (Textbook Version) CPI: R-Type = 4, Load = 5, Store 4, Branch = 3 Only one instruction being processed in datapath How to lower CPI further? #1 Lec # 8 Summer2001

More information

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. 5 th. Edition. Chapter 4. The Processor

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. 5 th. Edition. Chapter 4. The Processor COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle

More information

CENG 3420 Lecture 06: Pipeline

CENG 3420 Lecture 06: Pipeline CENG 3420 Lecture 06: Pipeline Bei Yu byu@cse.cuhk.edu.hk CENG3420 L06.1 Spring 2019 Outline q Pipeline Motivations q Pipeline Hazards q Exceptions q Background: Flip-Flop Control Signals CENG3420 L06.2

More information

Processor (II) - pipelining. Hwansoo Han

Processor (II) - pipelining. Hwansoo Han Processor (II) - pipelining Hwansoo Han Pipelining Analogy Pipelined laundry: overlapping execution Parallelism improves performance Four loads: Speedup = 8/3.5 =2.3 Non-stop: 2n/0.5n + 1.5 4 = number

More information

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition. Chapter 4. The Processor

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition. Chapter 4. The Processor COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor The Processor? Chapter 4 The Processor 2 Introduction We will learn How the ISA determines many aspects

More information

CSEE 3827: Fundamentals of Computer Systems

CSEE 3827: Fundamentals of Computer Systems CSEE 3827: Fundamentals of Computer Systems Lecture 21 and 22 April 22 and 27, 2009 martha@cs.columbia.edu Amdahl s Law Be aware when optimizing... T = improved Taffected improvement factor + T unaffected

More information

CS 251, Spring 2018, Assignment 3.0 3% of course mark

CS 251, Spring 2018, Assignment 3.0 3% of course mark CS 25, Spring 28, Assignment 3. 3% of corse mark De onday, Jne 25th, 5:3 P. (5 points) Consider the single-cycle compter shown on page 6 of this assignment. Sppose the circit elements take the following

More information

Instruction Level Parallelism. Appendix C and Chapter 3, HP5e

Instruction Level Parallelism. Appendix C and Chapter 3, HP5e Instruction Level Parallelism Appendix C and Chapter 3, HP5e Outline Pipelining, Hazards Branch prediction Static and Dynamic Scheduling Speculation Compiler techniques, VLIW Limits of ILP. Implementation

More information

Computer Architecture. Lecture 6.1: Fundamentals of

Computer Architecture. Lecture 6.1: Fundamentals of CS3350B Computer Architecture Winter 2015 Lecture 6.1: Fundamentals of Instructional Level Parallelism Marc Moreno Maza www.csd.uwo.ca/courses/cs3350b [Adapted from lectures on Computer Organization and

More information

T = I x CPI x C. Both effective CPI and clock cycle C are heavily influenced by CPU design. CPI increased (3-5) bad Shorter cycle good

T = I x CPI x C. Both effective CPI and clock cycle C are heavily influenced by CPU design. CPI increased (3-5) bad Shorter cycle good CPU performance equation: T = I x CPI x C Both effective CPI and clock cycle C are heavily influenced by CPU design. For single-cycle CPU: CPI = 1 good Long cycle time bad On the other hand, for multi-cycle

More information

Computer Organization and Structure. Bing-Yu Chen National Taiwan University

Computer Organization and Structure. Bing-Yu Chen National Taiwan University Computer Organization and Structure Bing-Yu Chen National Taiwan University The Processor Logic Design Conventions Building a Datapath A Simple Implementation Scheme An Overview of Pipelining Pipelined

More information

Chapter 3 & Appendix C Pipelining Part A: Basic and Intermediate Concepts

Chapter 3 & Appendix C Pipelining Part A: Basic and Intermediate Concepts CS359: Computer Architecture Chapter 3 & Appendix C Pipelining Part A: Basic and Intermediate Concepts Yanyan Shen Department of Computer Science and Engineering Shanghai Jiao Tong University Parallel

More information

Lecture 7 Pipelining. Peng Liu.

Lecture 7 Pipelining. Peng Liu. Lecture 7 Pipelining Peng Liu liupeng@zju.edu.cn 1 Review: The Single Cycle Processor 2 Review: Given Datapath,RTL -> Control Instruction Inst Memory Adr Op Fun Rt

More information

14:332:331 Pipelined Datapath

14:332:331 Pipelined Datapath 14:332:331 Pipelined Datapath I n s t r. O r d e r Inst 0 Inst 1 Inst 2 Inst 3 Inst 4 Single Cycle Disadvantages & Advantages Uses the clock cycle inefficiently the clock cycle must be timed to accommodate

More information

Computer and Information Sciences College / Computer Science Department Enhancing Performance with Pipelining

Computer and Information Sciences College / Computer Science Department Enhancing Performance with Pipelining Computer and Information Sciences College / Computer Science Department Enhancing Performance with Pipelining Single-Cycle Design Problems Assuming fixed-period clock every instruction datapath uses one

More information
