1048: Computer Organization
- Warren Reeves
- 5 years ago
Transcription
1 1048: Computer Organization, Lecture 6: Pipelining. Lecture 6 - pipelining (cwliu@twins.ee.nctu.edu.tw)
2 Outline: An overview of pipelining; A pipelined datapath; Pipelined control; Data hazards and forwarding; Data hazards and stalls; Branch hazards; Exceptions; Superscalar and dynamic pipelining
3 Pipelining Is Natural!
- Laundry example: Ann, Brian, Cathy, and Dave each have one load of clothes (A, B, C, D) to wash, dry, and fold
- Washer takes 30 minutes
- Dryer takes 40 minutes
- Folder takes 20 minutes
4 Sequential Laundry
[Figure: timeline from 6 PM to midnight; loads A, B, C, D run one after another in task order]
- Sequential laundry takes 6 hours for 4 loads
- If they learned pipelining, how long would it take?
5 Pipelined Laundry: Start ASAP
[Figure: timeline starting at 6 PM; loads A, B, C, D overlap, with segments of 30, 40, 40, 40, 40, 20 minutes]
- Pipelined laundry takes 3.5 hours for 4 loads
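The two laundry totals can be checked with a short sketch. It assumes the stage times of the example (30, 40, and 20 minutes) and that a load waits whenever the next machine is still busy; the function names are illustrative and not from the slides.

```python
def sequential_minutes(loads, stage_minutes):
    """Each load finishes every stage before the next load starts."""
    return loads * sum(stage_minutes)


def pipelined_minutes(loads, stage_minutes):
    """Each load starts a stage as soon as it and the machine are both ready."""
    # machine_free[s] is the time at which stage s finished the previous load.
    machine_free = [0] * len(stage_minutes)
    for _ in range(loads):
        ready = 0  # time at which this load is ready for its next stage
        for s, duration in enumerate(stage_minutes):
            start = max(ready, machine_free[s])
            ready = start + duration
            machine_free[s] = ready
    return machine_free[-1]


stages = [30, 40, 20]  # washer, dryer, folder
print(sequential_minutes(4, stages) / 60)  # hours for 4 loads, one at a time
print(pipelined_minutes(4, stages) / 60)   # hours for 4 loads, overlapped
```

The pipelined total is set by the slowest stage (the dryer): once the pipeline is full, one load completes every 40 minutes.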
6 Pipelining Lessons
- Pipelining doesn't help the latency of a single task, but it helps the throughput of the entire workload
- Pipeline rate is limited by the slowest stage
- Multiple tasks work at the same time using different resources
- Potential speedup = number of pipe stages
- Unbalanced stage lengths and the time to fill and drain the pipeline reduce speedup
- Stall for dependences
7 Single-Cycle, Multi-Cycle, vs. Pipeline
- Single-cycle implementation: Load and Store each take one long clock cycle; part of the Store cycle is wasted
- Multiple-cycle implementation (Cycles 1 to 10): Load: Ifetch, Reg, Exec, Mem, Wr; Store: Ifetch, Reg, Exec, Mem; R-type: Ifetch, ...
- Pipeline implementation: Load, Store, and R-type each pass through Ifetch, Reg, Exec, Mem, Wr, starting one cycle apart
8 Pipelining MIPS Execution
[Fig. 6.3: three lw instructions in program execution order, each passing through Instruction fetch, Reg, ALU, Data access, Reg. Nonpipelined: each instruction takes 8 ns, so instructions start 8 ns apart. Pipelined: each stage takes 2 ns, so instructions start 2 ns apart.]
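The timing in this figure can be written as a small calculation. This is a sketch under the figure's numbers (an 8 ns single-cycle clock, a 2 ns pipelined clock, five stages); the function names are assumptions made for illustration.

```python
def single_cycle_ns(n_instructions, cycle_ns=8):
    """Nonpipelined: every instruction occupies one long clock cycle."""
    return n_instructions * cycle_ns


def pipelined_ns(n_instructions, stages=5, cycle_ns=2):
    """Pipelined: the first instruction needs `stages` cycles, each later one adds a cycle."""
    if n_instructions == 0:
        return 0
    return (stages + n_instructions - 1) * cycle_ns


for n in (3, 1000):
    slow, fast = single_cycle_ns(n), pipelined_ns(n)
    print(n, slow, fast, round(slow / fast, 2))
```

For three instructions the gain is small because the fill time dominates; for long instruction streams the ratio approaches 8 ns / 2 ns = 4 rather than the stage count of 5, because the stages are unbalanced and the clock must fit the slowest one.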
9 Why Pipeline? Because the Resources Are There!
[Figure: instructions Inst 0 to Inst 4 over time (clock cycles); each uses Im, Reg, ALU, Dm, Reg of the single-cycle datapath in successive cycles]
10 Outline: An overview of pipelining; A pipelined datapath; Pipelined control; Data hazards and forwarding; Data hazards and stalls; Branch hazards; Exceptions; Superscalar and dynamic pipelining
11 Designing a Pipelined Processor
- Examine the datapath and control diagram
  - Starting with single- or multi-cycle datapath?
  - Single- or multi-cycle control?
- Partition datapath into stages: IF (instruction fetch), ID (instruction decode and register file read), EX (execution or address calculation), MEM (memory access), WB (write back)
- Associate resources with stages
- Ensure that flows do not conflict, or figure out how to resolve
- Assert control in appropriate stage
12 Use Multicycle Execution Steps
- Instruction fetch (all instructions): IR = Memory[PC]; PC = PC + 4
- Instruction decode/register fetch (all instructions): A = Reg[IR[25-21]]; B = Reg[IR[20-16]]; ALUOut = PC + (sign-extend(IR[15-0]) << 2)
- Execution, address computation, branch/jump completion:
  - R-type: ALUOut = A op B
  - Memory reference: ALUOut = A + sign-extend(IR[15-0])
  - Branches: if (A == B) then PC = ALUOut
  - Jumps: PC = PC[31-28] || (IR[25-0] << 2)
- Memory access or R-type completion:
  - R-type: Reg[IR[15-11]] = ALUOut
  - Load: MDR = Memory[ALUOut], or Store: Memory[ALUOut] = B
- Memory read completion: Load: Reg[IR[20-16]] = MDR
- But, use the single-cycle datapath...
13 Split Single-Cycle Datapath
- IF: Instruction fetch
- ID: Instruction decode/register file read
- EX: Execute/address calculation
- MEM: Memory access
- WB: Write back
[Fig. 6.9: single-cycle datapath divided into the five stages, with feedback paths]
14 Pipeline Registers
- Pipeline registers (latches): IF/ID, ID/EX, EX/MEM, MEM/WB
[Figure: datapath with the four pipeline registers inserted between the stages]
15 Consider load
- Cycles 1 to 5: Load: Ifetch, Reg/Dec, Exec, Mem, Wr
- IF: Instruction Fetch: fetch the instruction from the Instruction Memory
- ID: Instruction Decode: registers fetch and instruction decode
- EX: calculate the memory address
- MEM: read the data from the Data Memory
- WB: write the data back to the register file
16 Pipelining load
[Figure: Cycles 1 to 7; the 1st, 2nd, and 3rd lw each pass through Ifetch, Reg/Dec, Exec, Mem, Wr, starting one cycle apart]
- The 5 functional units in the pipeline datapath are:
  - Instruction Memory for the Ifetch stage
  - Register File's read ports (busA and busB) for the Reg/Dec stage
  - ALU for the Exec stage
  - Data Memory for the MEM stage
  - Register File's write port (busW) for the WB stage
17 IF Stage of load
- IR = mem[PC]; PC = PC + 4
[Fig. 6.2: lw in instruction fetch; IR and PC+4 are latched into IF/ID]
18 ID Stage of load
- A = Reg[IR[25-21]]; B = Reg[IR[20-16]]
- ALUOut = PC + (sign-ext(IR[15-0]) << 2) (some ops moved to the next stage)
[Fig. 6.2: lw in instruction decode]
19 EX Stage of load
- ALUOut = A + sign-ext(IR[15-0])
[Fig. 6.3: lw in execution]
20 MEM Stage of load
- MDR = mem[ALUOut]
[Figure: lw in the memory stage]
21 WB Stage of load
- Reg[IR[20-16]] = MDR
- Who will supply this address?
[Figure: lw in write back]
22 The Four Stages of R-type
- Cycles 1 to 4: R-type: Ifetch, Reg/Dec, Exec, Wr
- IF: fetch the instruction from the Instruction Memory
- ID: registers fetch and instruction decode
- EX: ALU operates on the two register operands
- WB: write ALU output back to the register file
23 Pipelining R-type and load
[Figure: Cycles 1 to 9; the sequence R-type, R-type, Load, R-type, R-type, where each R-type uses Ifetch, Reg/Dec, Exec, Wr and the Load uses Ifetch, Reg/Dec, Exec, Mem, Wr]
- Oops! We have a problem!
- We have a structural hazard:
  - Two instructions try to write to the register file at the same time!
  - Only one write port
24 Important Observation
- Each functional unit can only be used once per instruction
- Each functional unit must be used at the same stage for all instructions:
  - Load uses Register File's write port during its 5th stage (Ifetch, Reg/Dec, Exec, Mem, Wr)
  - R-type uses Register File's write port during its 4th stage (Ifetch, Reg/Dec, Exec, Wr)
- Several ways to solve: forwarding, adding pipeline bubble, making instructions same length
25 Solution: Delay R-type's Write
- Delay R-type's register write by one cycle:
  - R-type also uses Reg File's write port at Stage 5
  - MEM is a NOP stage: nothing is being done
  - R-type: Ifetch, Reg/Dec, Exec, Mem, Wr, so R-type also has 5 stages
[Figure: Cycles 1 to 9; the sequence R-type, R-type, Load, R-type, R-type, each with Ifetch, Reg/Dec, Exec, Mem, Wr; no two instructions write in the same cycle]
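The write-port conflict and its fix can be reproduced with a few lines. This is a sketch, assuming one instruction is issued per cycle starting at cycle 1; the function names and the stage tables are illustrative, not part of the lecture.

```python
def write_cycles(program, write_stage):
    """Cycle in which each instruction uses the register file write port."""
    return [issue + write_stage[kind] - 1
            for issue, kind in enumerate(program, start=1)]


def write_port_conflicts(program, write_stage):
    """Return (earlier_index, later_index, cycle) for every shared write cycle."""
    first_user = {}
    conflicts = []
    for index, cycle in enumerate(write_cycles(program, write_stage)):
        if cycle in first_user:
            conflicts.append((first_user[cycle], index, cycle))
        else:
            first_user[cycle] = index
    return conflicts


program = ["R-type", "R-type", "Load", "R-type", "R-type"]
four_stage_rtype = {"R-type": 4, "Load": 5}  # R-type writes in its 4th stage
five_stage_rtype = {"R-type": 5, "Load": 5}  # R-type write delayed to stage 5

print(write_port_conflicts(program, four_stage_rtype))
print(write_port_conflicts(program, five_stage_rtype))
```

With the four-stage R-type, the Load and the R-type that follows it both reach the write port in cycle 7; giving every instruction five stages removes the collision.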
26 The Four Stages of store
- Cycles 1 to 4: Store: Ifetch, Reg/Dec, Exec, Mem (Wr is unused)
- IF: fetch the instruction from the Instruction Memory
- ID: registers fetch and instruction decode
- EX: calculate the memory address
- MEM: write the data into the Data Memory
- An extra stage: WB: NOP
27 The Three Stages of beq
- Cycles 1 to 3: Beq: Ifetch, Reg/Dec, Exec (Mem and Wr are unused)
- IF: fetch the instruction from the Instruction Memory
- ID: registers fetch and instruction decode
- EX:
  - compare the two register operands
  - select the correct branch target address
  - latch it into the PC
- Two extra stages: MEM: NOP; WB: NOP
28 Pipelined Datapath
[Fig. 6.7: pipelined datapath with the IF/ID, ID/EX, EX/MEM, and MEM/WB pipeline registers]
29 Graphically Representing Pipelines
[Figure: multiple-clock-cycle diagram, CC 1 to CC 6, in program execution order; lw $10, 20($1) and sub $11, $2, $3 each pass through IM, Reg, ALU, DM, Reg]
- Can help with answering questions like:
  - How many cycles to execute this code?
  - What is the ALU doing during cycle 4?
- Helps in understanding datapaths
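A multiple-clock-cycle diagram is easy to generate, which makes the two questions on this slide mechanical to answer. The sketch below assumes an ideal five-stage pipeline with no stalls and one instruction entering IF per cycle; `schedule`, `total_cycles`, and `render` are hypothetical helper names.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def schedule(program):
    """For each instruction, map clock cycle -> stage (instruction i enters IF in cycle i + 1)."""
    return [{start + offset: stage for offset, stage in enumerate(STAGES)}
            for start in range(1, len(program) + 1)]


def total_cycles(program):
    """Cycles until the last instruction leaves WB, with no stalls."""
    return len(program) + len(STAGES) - 1 if program else 0


def render(program):
    """Return the diagram as text, one row per instruction (program must be non-empty)."""
    rows = schedule(program)
    n = total_cycles(program)
    width = max(len(text) for text in program)
    lines = [" " * width + " " + " ".join(f"CC{c:<2}" for c in range(1, n + 1))]
    for text, row in zip(program, rows):
        cells = " ".join(f"{row.get(c, ''):<4}" for c in range(1, n + 1))
        lines.append(f"{text:<{width}} {cells}".rstrip())
    return "\n".join(lines)


program = ["lw  $10, 20($1)", "sub $11, $2, $3"]
print(render(program))
print("cycles:", total_cycles(program))
print("stage of sub in cycle 4:", schedule(program)[1][4])
```

For the two-instruction example this gives six cycles in total, and in cycle 4 the ALU (EX stage) is working on the sub while the lw is in MEM.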
30 Example 1: Cycle 1 (Fig. 6.8). lw $10, 20($1): instruction fetch
31 Example 1: Cycle 2 (Fig. 6.8). sub $11, $2, $3: instruction fetch; lw $10, 20($1): instruction decode
32 Example 1: Cycle 3 (Fig. 6.8). sub $11, $2, $3: instruction decode; lw $10, 20($1): execution
33 Example 1: Cycle 4 (Fig. 6.8). sub $11, $2, $3: execution; lw $10, 20($1): memory
34 Example 1: Cycle 5 (Fig. 6.8). sub $11, $2, $3: memory; lw $10, 20($1): write back
35 Example 1: Cycle 6 (Fig. 6.8). sub $11, $2, $3: write back
36 Outline: An overview of pipelining; A pipelined datapath; Pipelined control; Data hazards and forwarding; Data hazards and stalls; Branch hazards; Exceptions; Superscalar and dynamic pipelining
37 Control Signals
[Figure: pipelined datapath annotated with the control signals PCSrc, RegWrite, Branch, ALUSrc, ALUOp, RegDst, MemRead, MemWrite, and MemtoReg, plus the ALU control block]
38 Group Signals According to Stages
- Can use control signals of single-cycle CPU (Fig. 6.23, 6.2 <==> 5.2, 5.6)
- Execution/address calculation stage control lines: RegDst, ALUOp1, ALUOp0, ALUSrc
- Memory access stage control lines: Branch, MemRead, MemWrite
- Write-back stage control lines: RegWrite, MemtoReg
[Figure: table of control-line values per instruction, grouped by stage; some entries are don't-cares (X)]
39 Data Stationary Control
- Pass control signals along just like the data
- Main control generates control signals during ID
[Figure: the control unit decodes the instruction in ID and writes the EX, MEM, and WB control fields into ID/EX; the remaining fields are carried on through EX/MEM and MEM/WB]
40 Data Stationary Control
- Signals for EX (ExtOp, ALUSrc, ...) are used 1 cycle later
- Signals for MEM (MemWr, Branch) are used 2 cycles later
- Signals for WB (MemtoReg, RegWr) are used 3 cycles later
[Figure: main control in ID produces ExtOp, ALUSrc, ALUOp, RegDst, MemWr, Branch, MemtoReg, RegWr; the ID/EX register holds all of them, the EX/MEM register holds MemWr, Branch, MemtoReg, RegWr, and the MEM/WB register holds MemtoReg, RegWr]
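Data-stationary control can be sketched as three registers that shift control fields forward on every clock edge. The table below uses the standard MIPS control values for the single-cycle design; storing don't-care (X) entries as 0, the `nop` entry, and the names `MAIN_CONTROL`, `empty_pipeline`, and `clock` are assumptions of this sketch.

```python
import copy

# Control values grouped by the stage that consumes them (X stored as 0 here).
MAIN_CONTROL = {
    "R-type": {"EX": {"RegDst": 1, "ALUOp": 0b10, "ALUSrc": 0},
               "MEM": {"Branch": 0, "MemRead": 0, "MemWrite": 0},
               "WB": {"RegWrite": 1, "MemtoReg": 0}},
    "lw":     {"EX": {"RegDst": 0, "ALUOp": 0b00, "ALUSrc": 1},
               "MEM": {"Branch": 0, "MemRead": 1, "MemWrite": 0},
               "WB": {"RegWrite": 1, "MemtoReg": 1}},
    "sw":     {"EX": {"RegDst": 0, "ALUOp": 0b00, "ALUSrc": 1},
               "MEM": {"Branch": 0, "MemRead": 0, "MemWrite": 1},
               "WB": {"RegWrite": 0, "MemtoReg": 0}},
    "beq":    {"EX": {"RegDst": 0, "ALUOp": 0b01, "ALUSrc": 0},
               "MEM": {"Branch": 1, "MemRead": 0, "MemWrite": 0},
               "WB": {"RegWrite": 0, "MemtoReg": 0}},
    "nop":    {"EX": {"RegDst": 0, "ALUOp": 0b00, "ALUSrc": 0},
               "MEM": {"Branch": 0, "MemRead": 0, "MemWrite": 0},
               "WB": {"RegWrite": 0, "MemtoReg": 0}},
}


def empty_pipeline():
    """Pipeline registers holding only deasserted control fields."""
    nop = MAIN_CONTROL["nop"]
    return {"ID/EX": copy.deepcopy(nop),
            "EX/MEM": {"MEM": dict(nop["MEM"]), "WB": dict(nop["WB"])},
            "MEM/WB": {"WB": dict(nop["WB"])}}


def clock(regs, op_in_id):
    """One clock edge: decode the instruction in ID and shift older fields forward."""
    return {
        "ID/EX": copy.deepcopy(MAIN_CONTROL[op_in_id]),
        "EX/MEM": {"MEM": dict(regs["ID/EX"]["MEM"]), "WB": dict(regs["ID/EX"]["WB"])},
        "MEM/WB": {"WB": dict(regs["EX/MEM"]["WB"])},
    }


regs = empty_pipeline()
for op in ["lw", "nop", "nop"]:
    regs = clock(regs, op)
print(regs["MEM/WB"]["WB"])  # the lw's write-back signals, three edges after its decode
```

Each stage reads only the fields stored in the pipeline register in front of it, so no central controller has to track which instruction is where.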
41 Datapath with Control
[Figure: pipelined datapath with the control unit; the EX, MEM, and WB control fields travel through ID/EX, EX/MEM, and MEM/WB]
- Remember that? Who will supply this address?
42 Let's Try it Out
- lw $10, 20($1)
- sub $11, $2, $3
- and $12, $4, $5
- or $13, $6, $7
- add $14, $8, $9
43 Example 2: Cycle 1. IF: lw $10, 20($1); ID: before<1>; EX: before<2>; MEM: before<3>; WB: before<4>
44 Example 2: Cycle 2. IF: sub $11, $2, $3; ID: lw $10, 20($1); EX: before<1>; MEM: before<2>; WB: before<3>
45 Example 2: Cycle 3. IF: and $12, $4, $5; ID: sub $11, $2, $3; EX: lw $10, ...; MEM: before<1>; WB: before<2>
46 Example 2: Cycle 4. IF: or $13, $6, $7; ID: and $12, $4, $5; EX: sub $11, ...; MEM: lw $10, ...; WB: before<1>
47 Example 2: Cycle 5. IF: add $14, $8, $9; ID: or $13, $6, $7; EX: and $12, ...; MEM: sub $11, ...; WB: lw $10, ...
48 Example 2: Cycle 6. IF: after<1>; ID: add $14, $8, $9; EX: or $13, ...; MEM: and $12, ...; WB: sub $11, ...
49 Example 2: Cycle 7. IF: after<2>; ID: after<1>; EX: add $14, ...; MEM: or $13, ...; WB: and $12, ...
50 Example 2: Cycle 8. IF: after<3>; ID: after<2>; EX: after<1>; MEM: add $14, ...; WB: or $13, ...
51 Example 2: Cycle 9. IF: after<4>; ID: after<3>; EX: after<2>; MEM: after<1>; WB: add $14, ...
[Figures: each cycle is drawn on the pipelined datapath with control, showing the register numbers and control values held in each pipeline register]
52 Summary of Pipeline Basics
- Pipelining is a fundamental concept
  - Multiple steps using distinct resources
- Utilize capabilities of the datapath by pipelined instruction processing
  - Start next instruction while working on the current one
  - Limited by length of longest stage (plus fill/flush)
  - Need to detect and resolve hazards
- What makes it easy in MIPS?
  - All instructions are of the same length
  - Just a few instruction formats
  - Memory operands only in loads and stores
- What makes pipelining hard? Pipeline hazards
53 Outline: An overview of pipelining; A pipelined datapath; Pipelined control; Data hazards and forwarding (R-type and R-type); Data hazards and stalls (load and R-type); Branch hazards; Exceptions; Superscalar and dynamic pipelining
54 Pipeline Hazards
- Structural hazards: attempt to use the same resource in two different ways at the same time
- Data hazards: attempt to use an item before it is ready
  - Instruction depends on the result of a prior instruction still in the pipeline
- Control hazards: attempt to make a decision before the condition is evaluated
  - Branch instructions
- Can always resolve hazards by waiting?
  - Pipeline control must detect the hazard
  - Take action (or delay action) to resolve hazards
55 Structural Hazard: Single Memory
[Figure: Load followed by Instr 1 to Instr 4 over time; with a single memory, the Load's data access and a later instruction's fetch need the memory in the same cycle]
- Use 2 memories: data memory and instruction memory
56 Data Hazards
[Figure: multiple-clock-cycle diagram, CC 1 to CC 9, tracking the value of register $2, for the sequence below; each instruction passes through IM, Reg, ALU, DM, Reg]
- sub $2, $1, $3
- and $12, $2, $5
- or $13, $6, $2
- add $14, $2, $2
- sw $15, 100($2)
57 Types of Data Hazards
- Three types (inst. i1 followed by inst. i2):
  - RAW (read after write): i2 tries to read operand before i1 writes it
  - WAR (write after read): i2 tries to write operand before i1 reads it
    - Gets wrong operand, e.g., autoincrement addr.
    - Can't happen in the MIPS 5-stage pipeline because all instructions take 5 stages, reads are always in stage 2, and writes are always in stage 5
  - WAW (write after write): i2 tries to write operand before i1 writes it
    - Leaves wrong result (i1's, not i2's); occurs only in pipelines that write in more than one stage
    - Can't happen in the MIPS 5-stage pipeline because all instructions take 5 stages, and writes are always in stage 5
- RAR (read after read)? Not a hazard
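The three dependence types can be told apart from the register sets of two instructions alone. The sketch below only classifies the dependence; whether it becomes an actual hazard depends on the pipeline, as the slide notes. The `(reads, writes)` representation and the function name are assumptions for illustration.

```python
def classify_dependences(first, second):
    """first, second: dicts with 'reads' and 'writes' sets of register numbers.

    Returns the dependence types from `first` (earlier) to `second` (later).
    """
    found = set()
    if first["writes"] & second["reads"]:
        found.add("RAW")  # later instruction reads what the earlier one writes
    if first["reads"] & second["writes"]:
        found.add("WAR")  # later instruction overwrites what the earlier one reads
    if first["writes"] & second["writes"]:
        found.add("WAW")  # both write the same register
    return found


sub_i = {"reads": {1, 3}, "writes": {2}}   # sub $2, $1, $3
and_i = {"reads": {2, 5}, "writes": {12}}  # and $12, $2, $5
add_i = {"reads": {2, 12}, "writes": {1}}  # hypothetical: add $1, $2, $12

print(classify_dependences(sub_i, and_i))  # RAW on $2
print(classify_dependences(sub_i, add_i))  # RAW on $2 and WAR on $1
```

In the MIPS 5-stage pipeline only the RAW case needs hardware attention, since reads always happen in stage 2 and writes in stage 5.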
58 Pipeline Hazards Illustrated
[Figure: pairs of overlapping instructions (IF, ID, EX, MEM, WB) illustrating four cases: structural hazard; RAW (read after write) data hazard; WAW (write after write) data hazard; WAR (write after read) data hazard]
59 Handling Data Hazards
- Use simple, fixed designs
  - Eliminate WAR by always fetching operands early (ID) in pipeline
  - Eliminate WAW by doing all write backs in order (last stage, static)
  - These features have a lot to do with ISA design
- Internal forwarding in register file:
  - Write in first half of clock and read in second half
  - Read delivers what is written, resolving the hazard between sub and add
- Detect and resolve remaining ones
  - Compiler inserts NOP
  - Forward
  - Stall
60 Software Solution
- Have compiler guarantee no hazards
- Where do we insert the NOPs?
  - sub $2, $1, $3
  - and $12, $2, $5
  - or $13, $6, $2
  - add $14, $2, $2
  - sw $15, 100($2)
- Problem: this really slows us down!
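The question on this slide can be answered mechanically. The sketch below assumes no forwarding hardware other than the register file writing in the first half of a cycle and reading in the second half, so a consumer must be at least three instruction slots after its producer; the tuple format and `insert_nops` are hypothetical.

```python
def insert_nops(program, min_distance=3):
    """program: list of (text, dest_register_or_None, source_registers).

    Returns the instruction texts with 'nop' inserted so that every consumer
    is at least `min_distance` slots after the producer of any of its sources.
    """
    emitted = []  # (text, dest) pairs already placed in the output
    for text, dest, sources in program:
        needed = 0
        # Only producers in the last (min_distance - 1) slots can be too close.
        for back in range(1, min_distance):
            if back > len(emitted):
                break
            prev_dest = emitted[-back][1]
            # Register $0 is never a real destination, so it is ignored.
            if prev_dest not in (None, 0) and prev_dest in sources:
                needed = max(needed, min_distance - back)
        emitted.extend([("nop", None)] * needed)
        emitted.append((text, dest))
    return [text for text, _ in emitted]


program = [
    ("sub $2, $1, $3",   2,    {1, 3}),
    ("and $12, $2, $5",  12,   {2, 5}),
    ("or  $13, $6, $2",  13,   {6, 2}),
    ("add $14, $2, $2",  14,   {2}),
    ("sw  $15, 100($2)", None, {15, 2}),
]
print("\n".join(insert_nops(program)))
```

For this sequence two NOPs go between the sub and the and; the later readers of $2 are then far enough away on their own.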
61 Data Hazards: Insert two nops
[Figure: multiple-clock-cycle diagram, CC 1 to CC 9, tracking the value of register $2, for sub $2, $1, $3; and $12, $2, $5; or $13, $6, $2; add $14, $2, $2; sw $15, 100($2), with two nops inserted after the sub]
62 Data Hazards: Forwarding
[Figure: multiple-clock-cycle diagram, CC 1 to CC 9, tracking the value of register $2, for sub $2, $1, $3; and $12, $2, $5; or $13, $6, $2; add $14, $2, $2; sw $15, 100($2)]
63 Pipeline with Forwarding
[Figure: pipelined datapath with a forwarding unit. Inputs: Rs and Rt carried from IF/ID.RegisterRs and IF/ID.RegisterRt, EX/MEM.RegisterRd, and MEM/WB.RegisterRd. Outputs: ForwardA and ForwardB, which select the ALU inputs.]
64 Detecting Data Hazards
- Hazard conditions:
  - 1a. EX/MEM.RegisterRd = ID/EX.RegisterRs
  - 1b. EX/MEM.RegisterRd = ID/EX.RegisterRt
  - 2a. MEM/WB.RegisterRd = ID/EX.RegisterRs
  - 2b. MEM/WB.RegisterRd = ID/EX.RegisterRt
- Two optimizations:
  - Don't forward if instruction does not write register => check if RegWrite is asserted
  - Don't forward if destination register is $0 => check if RegisterRd = 0
65 Detecting Data Hazards (cont.)
- Hazard conditions using control signals:
  - At EX stage: EX/MEM.RegWrite and (EX/MEM.RegRd ≠ 0) and (EX/MEM.RegRd = ID/EX.RegRs)
  - At MEM stage: MEM/WB.RegWrite and (MEM/WB.RegRd ≠ 0) and (MEM/WB.RegRd = ID/EX.RegRs)
  - (replace ID/EX.RegRs with ID/EX.RegRt for the other two conditions)
66 Resolving Hazards: Forwarding
- Use temporary results, e.g., those in pipeline registers; don't wait for them to be written
[Figure: multiple-clock-cycle diagram, CC 1 to CC 9, for sub $2, $1, $3; and $12, $2, $5; or $13, $6, $2; add $14, $2, $2; sw $15, 100($2). Tracked values: register $2 changes in CC 5; EX/MEM holds the new value (-20) in CC 4 and is X otherwise; MEM/WB holds it in CC 5 and is X otherwise.]
67 Pipeline with Forwarding
[Figure: pipelined datapath with control and a forwarding unit; Rs, Rt, and Rd are carried from IF/ID into ID/EX, and the forwarding unit compares them with EX/MEM.RegisterRd and MEM/WB.RegisterRd to drive ForwardA and ForwardB]
68 Forwarding Logic
- Forwarding: input to ALU from any pipe reg.
  - Add multiplexors to ALU input
  - Control forwarding in EX => carry Rs in ID/EX
- Control signals for forwarding:
  - If both WB and MEM forward, e.g., add $1,$1,$2; add $1,$1,$3; add $1,$1,$4; => let MEM forward
  - EX hazard: if (EX/MEM.RegWrite and (EX/MEM.RegRd ≠ 0) and (EX/MEM.RegRd = ID/EX.RegRs)) ForwardA = 10
  - MEM hazard: if (MEM/WB.RegWrite and (MEM/WB.RegRd ≠ 0) and (EX/MEM.RegRd ≠ ID/EX.RegRs) and (MEM/WB.RegRd = ID/EX.RegRs)) ForwardA = 01
  - (ID/EX.RegRt <-> ID/EX.RegRs, ForwardB <-> ForwardA)
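The forwarding conditions translate almost directly into code. This is a sketch of one ALU-input selector, assuming the usual encoding (10 = take the value from EX/MEM, 01 = take it from MEM/WB, 00 = use the register file value); the function and parameter names are illustrative.

```python
def forward_select(id_ex_reg, ex_mem_regwrite, ex_mem_rd, mem_wb_regwrite, mem_wb_rd):
    """Return the 2-bit mux select for one ALU operand (Rs for ForwardA, Rt for ForwardB)."""
    # EX hazard: the instruction one ahead produces the operand.
    if ex_mem_regwrite and ex_mem_rd != 0 and ex_mem_rd == id_ex_reg:
        return 0b10
    # MEM hazard: the instruction two ahead produces it. Checking the EX hazard
    # first gives the same priority as the extra EX/MEM.RegRd != ID/EX.RegRs term.
    if mem_wb_regwrite and mem_wb_rd != 0 and mem_wb_rd == id_ex_reg:
        return 0b01
    return 0b00


# add $1,$1,$2; add $1,$1,$3; add $1,$1,$4 with the third add in EX:
# both older adds write $1, and the most recent one (in MEM) must win.
print(format(forward_select(1, True, 1, True, 1), "02b"))
```

The same function serves ForwardB by passing ID/EX.RegisterRt as the first argument.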
69 Example 3: Cycle 3. IF: or $4, $4, $2; ID: and $4, $2, $5; EX: sub $2, $1, $3; MEM: before<1>; WB: before<2>
70 Example 3: Cycle 4 (Fig. 6.). IF: add $9, $4, $2; ID: or $4, $4, $2; EX: and $4, $2, $5; MEM: sub $2, ...; WB: before<1>
71 Example 3: Cycle 5. IF: after<1>; ID: add $9, $4, $2; EX: or $4, $4, $2; MEM: and $4, ...; WB: sub $2, ...
72 Example 3: Cycle 6. IF: after<2>; ID: after<1>; EX: add $9, $4, $2; MEM: or $4, ...; WB: and $4, ...
[Figures: each cycle is drawn on the datapath with the forwarding unit, showing the register numbers being compared]
73 Can't Always Forward
- lw can still cause a hazard if followed by an instruction that reads the loaded register
- Use stalling or compiler to resolve
[Fig. 6.3: lw $2, 20($1); and $4, $2, $5; or $8, $2, $6; add $9, $4, $2; slt $1, $6, $7 in program execution order; the loaded value is not available in time for the ALU input of the and]
74 Stalling
- Stall pipeline by keeping instructions in the same stage and inserting an NOP instead
[Figure: lw $2, 20($1); and $4, $2, $5; or $8, $2, $6; add $9, $4, $2; slt $1, $6, $7 in program execution order; the and repeats its Reg stage, the or repeats its fetch, and a bubble follows the lw down the pipeline]
75 Pipeline with Stalling Unit
- Forwarding controls ALU inputs; hazard detection controls PC, IF/ID, and control signals
[Figure: pipelined datapath with a hazard detection unit and a forwarding unit. The hazard detection unit reads ID/EX.MemRead, ID/EX.RegisterRt, IF/ID.RegisterRs, and IF/ID.RegisterRt, and drives PCWrite, IF/IDWrite, and a multiplexor that zeroes the control signals.]
76 Handling Stalls
- Hazard detection unit in ID inserts a stall between a load instruction and its use:
  - if (ID/EX.MemRead and ((ID/EX.RegisterRt = IF/ID.RegisterRs) or (ID/EX.RegisterRt = IF/ID.RegisterRt))) stall the pipeline for one cycle
  - (ID/EX.MemRead = 1 indicates a load instruction)
- How to stall?
  - Stall instructions in IF and ID: do not change PC and IF/ID => the stages re-execute the instructions
  - What to move into EX: insert an NOP by changing the EX, MEM, WB control fields of the ID/EX pipeline register to 0
  - As control signals propagate, all control signals to EX, MEM, WB are deasserted and no registers or memories are written
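The load-use check and the resulting actions can be sketched as follows. The condition is the one on the slide; the returned dictionary with `pc_write`, `if_id_write`, and `zero_control` is an illustrative way to name the three effects, not the lecture's notation.

```python
def load_use_hazard(id_ex_memread, id_ex_rt, if_id_rs, if_id_rt):
    """True when the instruction in EX is a load whose destination is read by the one in ID."""
    return bool(id_ex_memread and (id_ex_rt == if_id_rs or id_ex_rt == if_id_rt))


def hazard_unit(id_ex_memread, id_ex_rt, if_id_rs, if_id_rt):
    """Return the three control outputs of the hazard detection unit."""
    stall = load_use_hazard(id_ex_memread, id_ex_rt, if_id_rs, if_id_rt)
    return {
        "pc_write": not stall,     # hold the PC so the same instruction is refetched
        "if_id_write": not stall,  # hold IF/ID so the same instruction is decoded again
        "zero_control": stall,     # send a bubble (all control fields 0) into ID/EX
    }


# lw $2, 20($1) is in EX (Rt = 2) and and $4, $2, $5 is in ID (Rs = 2, Rt = 5).
print(hazard_unit(True, 2, 2, 5))
```

After the one-cycle stall the loaded value sits in MEM/WB, so the forwarding unit can supply it to the ALU.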
77 Example 4: Cycle 2. IF: and $4, $2, $5; ID: lw $2, 20($1); EX: before<1>; MEM: before<2>; WB: before<3>
78 Example 4: Cycle 3. IF: or $4, $4, $2; ID: and $4, $2, $5; EX: lw $2, 20($1); MEM: before<1>; WB: before<2>
79 Example 4: Cycle 4. IF: or $4, $4, $2; ID: and $4, $2, $5; EX: bubble; MEM: lw $2, ...; WB: before<1>
80 Example 4: Cycle 5. IF: add $9, $4, $2; ID: or $4, $4, $2; EX: and $4, $2, $5; MEM: bubble; WB: lw $2, ...
81 Example 4: Cycle 6 (Fig. 6.9). IF: after<1>; ID: add $9, $4, $2; EX: or $4, $4, $2; MEM: and $4, ...; WB: bubble
82 Example 4: Cycle 7. IF: after<2>; ID: after<1>; EX: add $9, $4, $2; MEM: or $4, ...; WB: and $4, ...
[Figures: each cycle is drawn on the datapath with the hazard detection unit and forwarding unit, showing ID/EX.MemRead, ID/EX.RegisterRt, and the register numbers being compared]
83 Outline: An overview of pipelining; A pipelined datapath; Pipelined control; Data hazards and forwarding; Data hazards and stalls; Branch hazards; Exceptions; Superscalar and dynamic pipelining
84 Pipeline Datapath with Control Signals
[Figure: pipelined datapath annotated with PCSrc, RegWrite, Branch, ALUSrc, ALUOp, RegDst, MemRead, MemWrite, and MemtoReg]
85 Branch Hazards
- When we decide to branch, other instructions are already in the pipeline!
[Figure: program execution order with instruction addresses: 40 beq $1, $3, 7; 44 and $12, $2, $5; 48 or $13, $6, $2; 52 add $14, $2, $2; 72 lw $4, 50($7); each passes through IM, Reg, ALU, DM, Reg]
86 Pipeline Hazards Illustrated
[Figure: pairs of overlapping instructions (IF, ID, EX, MEM, WB) illustrating five cases: structural hazard; RAW (read after write) data hazard; WAW (write after write) data hazard; WAR (write after read) data hazard; control hazard]
87 Handling Branch Hazard
- Predict branch always not taken
  - Need to add hardware for flushing instructions if wrong
  - Branch decision made at MEM => need to flush instructions in IF/ID, ID/EX by changing control values to 0
- Reduce delay of taken branch by moving branch execution earlier in the pipeline
  - Move up branch address calculation to ID
  - Check branch equality at ID (using XOR) by comparing the two registers read during ID
  - Branch decision made at ID => one instruction to flush
  - Add a control signal, IF.Flush, to zero the instruction field of IF/ID => making the instruction an NOP
- Dynamic branch prediction
- Compiler rescheduling, delay branch
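Resolving the branch in ID and flushing the one wrongly fetched instruction can be sketched like this. It assumes word-addressed branch offsets relative to PC + 4 and that an all-zero instruction word acts as a NOP in MIPS; the function names are hypothetical.

```python
NOP = 0x00000000  # the all-zero word encodes sll $0, $0, 0, which does nothing


def resolve_branch_in_id(branch_pc, rs_value, rt_value, offset_words):
    """Return (if_flush, next_pc) for a beq sitting in ID at address branch_pc."""
    taken = (rs_value ^ rt_value) == 0  # XOR-based equality check
    target = branch_pc + 4 + (offset_words << 2)
    # The instruction at branch_pc + 4 is being fetched in this same cycle.
    next_pc = target if taken else branch_pc + 8
    return taken, next_pc


def latch_if_id(fetched_word, if_flush):
    """IF.Flush zeroes the instruction field written into IF/ID."""
    return NOP if if_flush else fetched_word


flush, next_pc = resolve_branch_in_id(40, 7, 7, 7)  # beq $1, $3, 7 at address 40
print(flush, next_pc, hex(latch_if_id(0x00452824, flush)))
```

With equal register values the branch at address 40 redirects fetch to address 72 and the instruction fetched from address 44 is turned into a bubble, so a taken branch costs one cycle.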
88 Pipeline with Flushing
[Fig. 6.: pipelined datapath with the branch comparison (=), sign extend, and shift-left-2 adder moved into ID, the IF.Flush signal on IF/ID, the hazard detection unit, and the forwarding unit]
89 Example 5: Cycle 3. IF: and $12, $2, $5; ID: beq $1, $3, 7; EX: sub $10, $4, $8; MEM: before<1>; WB: before<2>. The branch target address 72 is computed in ID.
90 Example 5: Cycle 4. IF: lw $4, 50($7); ID: bubble (nop); EX: beq $1, $3, 7; MEM: sub $10, ...; WB: before<1>
[Figures: each cycle is drawn on the datapath with IF.Flush, the hazard detection unit, and the forwarding unit]
91 Delayed Branch
- Predict-not-taken + branch decision at ID => the following instruction is always executed => branches take effect 1 cycle later
- 0 clock cycle penalty per branch instruction if an instruction can be found to put in the slot (≈ 50% of time)
[Figure: add, beq, misc, lw in instruction order over time (clock cycles); the instruction after the beq fills the delay slot]
92 Outline: An overview of pipelining; A pipelined datapath; Pipelined control; Data hazards and forwarding; Data hazards and stalls; Branch hazards; Exceptions; Superscalar and dynamic pipelining
93 Handling Exceptions
- How to stop the pipeline? How to restart?
- Suppose overflow occurs at add $1,$2,$1
  - Disable writes of instructions till the trap hits, e.g., flush the following instructions using IF.Flush, ID.Flush, EX.Flush to cause multiplexors to zero control signals (overflow exception detected at EX => flush the offending instruction)
  - Force a trap instruction into IF, e.g., fetch from 40000040hex by adding 40000040hex to the PC input MUX
  - Save the address of the offending instruction in EPC
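The three actions on this slide can be put together in a small sketch. The handler address constant is an assumed value used only for illustration, and the stage dictionary, `Cause` string, and function names are hypothetical; real hardware does this with flush signals and a PC multiplexor rather than a function call.

```python
HANDLER_ADDRESS = 0x40000040  # assumed exception handler address, illustrative only


def add_overflows(a, b):
    """True if the signed 32-bit addition a + b does not fit in 32 bits."""
    total = a + b
    return total < -2**31 or total > 2**31 - 1


def take_overflow_exception(offending_pc, pipeline):
    """pipeline maps stage name -> instruction text; the offending add is in EX.

    Returns the saved EPC, the cause, the next PC, and the pipeline after flushing.
    """
    flushed = dict(pipeline)
    for stage in ("IF", "ID", "EX"):  # IF.Flush, ID.Flush, EX.Flush
        flushed[stage] = "nop"
    # Instructions already in MEM and WB are older and are allowed to complete.
    return {"EPC": offending_pc, "Cause": "overflow",
            "next_pc": HANDLER_ADDRESS, "pipeline": flushed}


pipeline = {"IF": "i+2", "ID": "i+1", "EX": "add $1,$2,$1", "MEM": "i-1", "WB": "i-2"}
if add_overflows(2**31 - 1, 1):
    print(take_overflow_exception(0x4C, pipeline))
```

Flushing the add itself matters here because its destination $1 is also a source; if the write went through, the handler could not see the original operand.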
94 Pipeline with Exception
[Fig. 6.55: pipelined datapath with exception support: IF.Flush, ID.Flush, and EX.Flush signals, the Except PC (EPC) and Cause registers, the hazard detection unit, and the forwarding unit]
95 Handling Exceptions
- 5 instructions executing in a 5-stage pipeline
  - Who caused the exception?
  - Need to know in which stage an exception can occur => helps determine the cause
- Stage and problem interrupts occurring:
  - IF: page fault; misaligned memory access; memory-protection violation
  - ID: undefined or illegal opcode
  - EX: arithmetic exception
  - MEM: page fault; misaligned memory access; memory error; memory-protection violation
96 Handling Exceptions
- Who to serve first, if multiple interrupts occur at the same time?
  - Multiple interrupts: use priority hardware to choose the earliest instruction to interrupt
  - External interrupts: flexible in when to interrupt
97 Outline: An overview of pipelining; A pipelined datapath; Pipelined control; Data hazards and forwarding; Data hazards and stalls; Branch hazards; Exceptions; Superscalar and dynamic pipelining
98 Instruction Level Parallelism, ILP
- How to increase the potential amount of ILP:
  - Increase the depth of the pipeline to overlap more instructions: super-pipeline
  - Launch multiple instructions
    - Static multiple issue (decision made by compiler before execution)
    - Dynamic multiple issue (decision made during execution by the processor)
99 Different Pipelined Designs
- Pipelining. Limitation: issue rate, FU stalls, FU depth
- Super-pipeline: issue one instruction per (fast) cycle; ALU takes multiple cycles. Limitation: clock skew, FU stalls, FU depth
- Super-scalar: issue multiple scalar instructions per cycle. Limitation: hazard resolution
- VLIW (EPIC): each instruction specifies multiple scalar operations; compiler determines parallelism. Limitation: packing
- Vector operations: each instruction specifies a series of identical operations. Limitation: applicability
[Figure: IF, D, E, W stage timelines for each design]
100 Static Multiple Issue Use the compiler to assist with packing instructions and handling hazards. Very Long Instruction Word (VLIW). Explicitly Parallel Instruction Computer (EPIC) (Intel IA-64).
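As an illustration of what "packing" can mean, the sketch below greedily groups a straight-line instruction list into two-slot issue packets. Everything specific here is an assumption for the example, not something stated on the slide: slot 0 takes an ALU or branch instruction, slot 1 takes a load or store, instructions are never reordered, and hazards between different packets (for example load-use) are ignored. `Instr`, `NOP` and `pack` are hypothetical names.

```python
from typing import NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    op: str                 # mnemonic, e.g. "add" or "lw"
    dest: Optional[str]     # register written, None for stores
    srcs: Tuple[str, ...]   # registers read


MEM_OPS = {"lw", "sw"}      # assumed to be the only slot-1 instructions
NOP = Instr("nop", None, ())


def is_mem(instr):
    return instr.op in MEM_OPS


def independent(first, second):
    """True if `second` neither reads nor overwrites the result of `first`."""
    if first.dest is None:
        return True
    return first.dest not in second.srcs and first.dest != second.dest


def pack(instrs):
    """Greedy in-order packer: returns a list of (slot0, slot1) packets."""
    packets = []
    i = 0
    while i < len(instrs):
        cur = instrs[i]
        nxt = instrs[i + 1] if i + 1 < len(instrs) else None
        if (not is_mem(cur) and nxt is not None
                and is_mem(nxt) and independent(cur, nxt)):
            packets.append((cur, nxt))   # both slots filled
            i += 2
        elif is_mem(cur):
            packets.append((NOP, cur))   # pad the ALU/branch slot
            i += 1
        else:
            packets.append((cur, NOP))   # pad the load/store slot
            i += 1
    return packets


if __name__ == "__main__":
    prog = [
        Instr("add", "$t0", ("$t1", "$t2")),
        Instr("lw", "$t3", ("$s0",)),
        Instr("addu", "$t5", ("$t0", "$t3")),
        Instr("sw", None, ("$t5", "$s1")),
    ]
    for slot0, slot1 in pack(prog):
        print(f"{slot0.op:5s} | {slot1.op}")
```

A real compiler would also reorder instructions and unroll loops to fill more slots; the padding with `nop` shows why packing is listed as the limitation of this approach.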
101 A Static Two-issue Datapath (Fig. 6.5) [Datapath diagram. Labels: PC, Instruction, Registers, Data, Address, two Sign extend units, two ALUs.]
102 Dynamic Multiple Issue The hardware performs the scheduling: hardware tries to find instructions to execute; out-of-order execution is possible; speculative execution and dynamic branch prediction. Superscalar.
103 Superscalar: Three Primary Units (Fig) [Block diagram. Instruction fetch and decode unit (in-order issue); four reservation stations feeding the functional units Integer, Integer, Floating point, Load/Store (out-of-order execute); Commit unit (in-order commit).]
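The "out-of-order execute, in-order commit" behaviour can be mimicked with a small queue model. This is a simplified sketch, not the design in the figure: the commit unit is modelled as a queue held in program order (often called a reorder buffer, a term that does not appear on the slide), and `commit_trace` is a hypothetical helper name.

```python
from collections import deque


def commit_trace(program_order, completion_order):
    """Return, for each completion event, the instructions committed by it.

    Instructions may finish executing in any order, but one is committed
    only when every earlier instruction in program order has committed.
    """
    waiting = deque(program_order)   # head is the oldest uncommitted instruction
    finished = set()
    trace = []
    for instr in completion_order:
        finished.add(instr)
        committed_now = []
        # Retire from the head for as long as the oldest instruction is done.
        while waiting and waiting[0] in finished:
            committed_now.append(waiting.popleft())
        trace.append(committed_now)
    return trace


if __name__ == "__main__":
    program = ["i1", "i2", "i3", "i4"]
    completed = ["i2", "i1", "i4", "i3"]   # out-of-order execution
    for done, committed in zip(completed, commit_trace(program, completed)):
        print(f"{done} finishes -> commit {committed}")
```

In the example run, `i2` finishes first but has to wait for `i1`, so the first event commits nothing and the second commits both.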
104 Simple Superscalar Independent INT and FP issue to separate pipelines. [Block diagram. Labels: I-Cache, Inst Issue and Bypass, INT Reg, FP Reg, Operand / Result Busses, INT Unit, Load / Store Unit, FP Add, FP Mul, D-Cache.]
105 Dynamic Scheduling All modern processors are very complicated. DEC Alpha 21264: 9-stage pipeline, 6-instruction issue. PowerPC and Pentium: branch history table. Compiler technology is important.
106 Summary Pipelines pass control information down the pipe just as data moves down the pipe. Forwarding/stalls are handled by local control. Exceptions stop the pipeline. The MIPS instruction set architecture made the pipeline visible (delayed branch, delayed load). More performance comes from deeper pipelines and parallelism.