Chapter 6: Pipelining
- Laurel Palmer
- 6 years ago
1 Chapter 6: Pipelining
2 Outline: An overview of pipelining; A pipelined datapath; Pipelined control; Data hazards and forwarding; Data hazards and stalls; Branch hazards; Exceptions; Superscalar and dynamic pipelining
3 Laundry Example: Pipelining Is Natural! Ann, Brian, Cathy, and Dave each have one load of clothes to wash, dry, and fold. The washer takes 30 minutes, the dryer takes 40 minutes, and the folder takes 20 minutes.
4 Sequential Laundry [Figure: timeline from 6 PM to midnight, task order A, B, C, D, each load washed, dried, and folded before the next starts] Sequential laundry takes 6 hours for 4 loads. If they learned pipelining, how long would it take?
5 Pipelined Laundry: Start ASAP [Figure: same timeline, each load starting as soon as the washer is free] Pipelined laundry takes 3.5 hours for 4 loads.
6 Pipelining Lessons
Pipelining doesn't help the latency of a single task, but it helps the throughput of the entire workload.
The pipeline rate is limited by the slowest stage.
Multiple tasks work at the same time using different resources.
Potential speedup = number of pipe stages.
Unbalanced stage lengths and the time to fill and drain the pipeline reduce speedup.
Stall for dependences.
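The laundry numbers can be checked with a short calculation. The sketch below is only an illustration: it assumes stage times of 30, 40, and 20 minutes for the washer, dryer, and folder, assumes a load may wait between stages, and uses function names invented for this note.

```python
def sequential_minutes(stage_minutes, loads):
    """Each load finishes every stage before the next load starts."""
    return loads * sum(stage_minutes)


def pipelined_minutes(stage_minutes, loads):
    """Each load enters a stage as soon as both the load and the stage are free."""
    stage_free_at = [0] * len(stage_minutes)
    for _ in range(loads):
        ready = 0  # time at which this load leaves the previous stage
        for s, duration in enumerate(stage_minutes):
            start = max(ready, stage_free_at[s])
            ready = start + duration
            stage_free_at[s] = ready
    return stage_free_at[-1]


LAUNDRY = [30, 40, 20]  # washer, dryer, folder (assumed values)
print(sequential_minutes(LAUNDRY, 4) / 60)  # 6.0 hours
print(pipelined_minutes(LAUNDRY, 4) / 60)   # 3.5 hours
```

The pipelined total is 30 + 4 x 40 + 20 minutes: the dryer, as the slowest stage, sets the rate once the pipeline is full.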
7 Single-, Multi-cycle, vs. Pipeline
Single Cycle Implementation (Cycle 1, Cycle 2): Load, then Store; the unused part of the store's cycle is wasted.
Multiple Cycle Implementation (Cycle 1 to Cycle 10): Load: Ifetch, Reg, Exec, Mem, Wr; Store: Ifetch, Reg, Exec, Mem; R-type: Ifetch, ...
Pipeline Implementation: Load: Ifetch, Reg, Exec, Mem, Wr; Store: Ifetch, Reg, Exec, Mem, Wr; R-type: Ifetch, Reg, Exec, Mem, Wr; each instruction starts one cycle after the previous one.
8 Pipelining MIPS Execution
Program execution order (in instructions): lw $1, 100($0); lw $2, 200($0); lw $3, 300($0)
Nonpipelined: each instruction (instruction fetch, Reg, ALU, data access, Reg) takes 8 ns, so a new instruction starts every 8 ns.
Pipelined: each stage takes 2 ns, so a new instruction starts every 2 ns.
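To make the timing comparison concrete, here is a rough calculation. It assumes the figures shown on the slide (8 ns per instruction without pipelining, 2 ns per stage with a five-stage pipeline) and no stalls; the function names are illustrative only.

```python
def nonpipelined_ns(instructions, instruction_ns=8):
    """Every instruction runs to completion before the next one starts."""
    return instructions * instruction_ns


def pipelined_ns(instructions, stages=5, stage_ns=2):
    """The first instruction needs `stages` cycles; each later one
    completes one cycle after its predecessor (no stalls assumed)."""
    if instructions == 0:
        return 0
    return (stages + instructions - 1) * stage_ns


print(nonpipelined_ns(3), pipelined_ns(3))        # 24 14
print(nonpipelined_ns(1000) / pipelined_ns(1000))
```

For three loads the gain is modest because fill time dominates; for long instruction streams the ratio approaches 8 / 2 = 4.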
9 Why Pipeline? Because the Resources Are There! [Figure: instructions Inst 0 to Inst 4 over time (clock cycles), each using Im, Reg, ALU, Dm, Reg of the single-cycle datapath in successive cycles]
10 Hazards
Limits to pipelining: hazards prevent the next instruction from executing during its designated clock cycle.
Structural hazards: the hardware cannot support this combination of instructions; two instructions need the same resource.
Data hazards: an instruction depends on the result of a prior instruction still in the pipeline.
Control hazards: pipelining of branches and other instructions that change the PC.
A common solution is to stall the pipeline until the hazard is resolved, inserting one or more bubbles in the pipeline. To do this, hardware or software must detect that a hazard has occurred.
11 Structural Hazards
Structural hazards occur when two or more instructions need the same resource.
Common methods for eliminating structural hazards are: duplicate resources; pipeline the resource; reorder the instructions.
It may be too expensive to eliminate a structural hazard, in which case the pipeline should stall. When the pipeline stalls, no instructions are issued until the hazard has been resolved.
What are some examples of structural hazards?
12 One Memory Port Structural Hazards (Figure 3.6, Page 2) [Figure: cycles 1 to 7; Load, Instr 1, Instr 2, Instr 3, Instr 4 each pass through Ifetch, Reg, ALU, DMem, Reg; the load's DMem access and the Ifetch of Instr 3 fall in the same cycle]
13 One Memory Port Structural Hazards (Figure 3.7, Page 3) [Figure: the same sequence with a stall; the Ifetch of Instr 3 is delayed one cycle and a bubble moves down the pipeline]
14 Outline: An overview of pipelining; A pipelined datapath (6.2); Pipelined control; Data hazards and forwarding; Data hazards and stalls; Branch hazards; Exceptions; Superscalar and dynamic pipelining
15 Designing a Pipelined Processor
Examine the datapath and control diagram. Starting with single- or multi-cycle datapath? Single- or multi-cycle control?
Partition the datapath into stages: IF (instruction fetch), ID (instruction decode and register file read), EX (execution or address calculation), MEM (data memory access), WB (write back).
Associate resources with stages.
Ensure that flows do not conflict, or figure out how to resolve them.
Assert control in the appropriate stage.
16 Use Multicycle Execution Steps
Step: Instruction fetch
  All instructions: IR = Memory[PC]; PC = PC + 4
Step: Instruction decode/register fetch
  All instructions: A = Reg[IR[25-21]]; B = Reg[IR[20-16]]; ALUOut = PC + (sign-extend(IR[15-0]) << 2)
Step: Execution, address computation, branch/jump completion
  R-type: ALUOut = A op B
  Memory reference: ALUOut = A + sign-extend(IR[15-0])
  Branches: if (A == B) then PC = ALUOut
  Jumps: PC = PC[31-28] || (IR[25-0] << 2)
Step: Memory access or R-type completion
  R-type: Reg[IR[15-11]] = ALUOut
  Memory reference: Load: MDR = Memory[ALUOut], or Store: Memory[ALUOut] = B
Step: Memory read completion
  Load: Reg[IR[20-16]] = MDR
But, use the single-cycle datapath...
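The address arithmetic in the decode and execution steps can be worked through numerically. This is a minimal sketch assuming standard 32-bit MIPS encodings (a 16-bit branch offset and a 26-bit jump field, both counted in words); the helper names are made up for this note.

```python
def sign_extend16(imm16):
    """Interpret the low 16 bits as a two's-complement value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def branch_target(pc_plus_4, imm16):
    """ALUOut = PC + (sign-extend(IR[15-0]) << 2), with PC already incremented."""
    return (pc_plus_4 + (sign_extend16(imm16) << 2)) & 0xFFFFFFFF


def jump_target(pc_plus_4, target26):
    """PC = PC[31-28] || (IR[25-0] << 2)."""
    return (pc_plus_4 & 0xF0000000) | ((target26 & 0x03FFFFFF) << 2)


# A beq at address 40 with offset 7: PC + 4 = 44, target = 44 + 28 = 72.
print(branch_target(44, 7))
# A negative offset of -1 word branches back to the beq itself.
print(branch_target(44, 0xFFFF))
```

The first call reproduces the branch used later in the branch-hazard example, where a beq at address 40 with offset 7 transfers control to address 72.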
17 Split Single-cycle Datapath
IF: instruction fetch; ID: instruction decode / register file read; EX: execute / address calculation; MEM: memory access; WB: write back. Feedback path.
[Figure: single-cycle datapath (PC, instruction memory, registers, sign extend, shift left 2, adders, ALU, data memory) divided into the five stages]
18 Pipeline Registers
Pipeline registers (latches): IF/ID, ID/EX, EX/MEM, MEM/WB
[Figure: the datapath of the previous slide with a pipeline register between each pair of stages]
19 Consider load
Cycle 1 to Cycle 5: Ifetch, Reg/Dec, Exec, Mem, Wr
IF: Instruction Fetch: fetch the instruction from the Instruction Memory
ID: Instruction Decode: registers fetch and instruction decode
EX: calculate the memory address
MEM: read the data from the Data Memory
WB: write the data back to the register file
20 Pipelining load
[Figure: cycles 1 to 7; 1st lw, 2nd lw, 3rd lw each pass through Ifetch, Reg/Dec, Exec, Mem, Wr, starting one cycle apart]
The 5 functional units in the pipeline datapath are:
Instruction Memory for the Ifetch stage
Register File's read ports (busA and busB) for the Reg/Dec stage
ALU for the Exec stage
Data Memory for the MEM stage
Register File's write port (busW) for the WB stage
21 IF Stage of load: IR = mem[PC]; PC = PC + 4 (lw instruction fetch: IR and PC + 4 are latched into IF/ID) (Fig. 6.2)
22 ID Stage of load: A = Reg[IR[25-21]]; B = Reg[IR[20-16]]; ALUOut = PC + (sign-ext(IR[15-0]) << 2) (some ops moved to the next stage) (lw instruction decode) (Fig. 6.2)
23 EX Stage of load: ALUOut = A + sign-ext(IR[15-0]) (lw execution) (Fig. 6.3)
24 MEM Stage of load: MDR = mem[ALUOut] (lw memory)
25 WB Stage of load: Reg[IR[20-16]] = MDR (lw write back). Who will supply this address?
26 The Four Stages of R-type
Cycle 1 to Cycle 4: Ifetch, Reg/Dec, Exec, Wr
IF: fetch the instruction from the Instruction Memory
ID: registers fetch and instruction decode
EX: ALU operates on the two register operands
WB: write ALU output back to the register file
27 Pipelining R-type and load
[Figure: cycles 1 to 9; sequence R-type, R-type, Load, R-type, R-type; R-type uses Ifetch, Reg/Dec, Exec, Wr and Load uses Ifetch, Reg/Dec, Exec, Mem, Wr] Oops! We have a problem!
We have a structural hazard: two instructions try to write to the register file at the same time, and there is only one write port.
28 Important Observation
Each functional unit can only be used once per instruction.
Each functional unit must be used at the same stage for all instructions: Load uses the Register File's write port during its 5th stage (Ifetch, Reg/Dec, Exec, Mem, Wr), while R-type uses the Register File's write port during its 4th stage (Ifetch, Reg/Dec, Exec, Wr).
Several ways to solve: forwarding, adding a pipeline bubble, making instructions the same length.
29 Solution: Delay R-type's Write
Delay R-type's register write by one cycle: now R-type instructions also use the Reg File's write port at stage 5. The Mem stage is a NOP stage: nothing is being done.
R-type: Ifetch, Reg/Dec, Exec, Mem, Wr
[Figure: cycles 1 to 9; sequence R-type, R-type, Load, R-type, R-type, all with five stages] R-type also has 5 stages.
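The write-port conflict and its fix can be reproduced with a few lines of code. This sketch assumes one instruction issues per cycle starting in cycle 1, so that stage k of the instruction at position i (counting from 0) falls in cycle i + k; the function and dictionary names are hypothetical.

```python
from collections import defaultdict


def write_port_conflicts(program, write_stage):
    """Return {cycle: [instruction indices]} for cycles in which more than
    one instruction uses the register file's single write port."""
    writes = defaultdict(list)
    for i, kind in enumerate(program):
        if kind in write_stage:
            writes[i + write_stage[kind]].append(i)
    return {cycle: ids for cycle, ids in writes.items() if len(ids) > 1}


program = ["R-type", "R-type", "Load", "R-type", "R-type"]

# R-type writes in its 4th stage, load in its 5th: the load and the
# R-type that follows it both write in cycle 7.
print(write_port_conflicts(program, {"R-type": 4, "Load": 5}))  # {7: [2, 3]}

# Padding R-type to five stages moves every write to stage 5.
print(write_port_conflicts(program, {"R-type": 5, "Load": 5}))  # {}
```

Making every instruction the same length costs an idle Mem stage for R-type instructions but guarantees that writes arrive at the port one per cycle.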
30 The Four Stages of store
Cycle 1 to Cycle 4: Ifetch, Reg/Dec, Exec, Mem (Wr added as a fifth stage)
IF: fetch the instruction from the Instruction Memory
ID: registers fetch and instruction decode
EX: calculate the memory address
MEM: write the data into the Data Memory
Add an extra stage: WB: NOP
31 The Three Stages of beq
Cycle 1 to Cycle 3: Ifetch, Reg/Dec, Exec (Mem and Wr added as extra stages)
IF: fetch the instruction from the Instruction Memory
ID: registers fetch and instruction decode
EX: compare the two register operands, select the correct branch target address, latch it into the PC
Add two extra stages: MEM: NOP; WB: NOP
32 Pipelined Datapath (Fig. 6.7) [Figure: datapath with pipeline registers IF/ID, ID/EX, EX/MEM, MEM/WB]
33 Graphically Representing Pipelines
[Figure: lw $10, 20($1) followed by sub $11, $2, $3 over CC 1 to CC 6, each passing through IM, Reg, ALU, DM, Reg]
Can help with answering questions like: How many cycles does it take to execute this code? What is the ALU doing during cycle 4?
Helps in understanding datapaths.
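Questions such as "how many cycles?" and "what is each stage doing in cycle 4?" can be answered mechanically for a hazard-free sequence. The sketch below assumes a five-stage pipeline that issues one instruction per cycle with no stalls; the instructions are identified only by their mnemonics, and the helper names are invented for illustration.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def total_cycles(program):
    """Cycles until the last instruction leaves WB (no stalls assumed)."""
    return len(program) + len(STAGES) - 1 if program else 0


def stage_in_cycle(program, cycle):
    """Map stage name -> instruction occupying it in the given 1-based cycle."""
    occupancy = {}
    for i, instr in enumerate(program):
        s = cycle - 1 - i
        if 0 <= s < len(STAGES):
            occupancy[STAGES[s]] = instr
    return occupancy


def diagram(program):
    """One text row per instruction, one column per clock cycle."""
    width = total_cycles(program)
    rows = []
    for i, instr in enumerate(program):
        cells = ["."] * width
        cells[i:i + len(STAGES)] = STAGES
        rows.append(f"{instr:<4}" + " ".join(f"{c:<3}" for c in cells))
    return "\n".join(rows)


program = ["lw", "sub"]
print(total_cycles(program))       # 6
print(stage_in_cycle(program, 4))  # lw is in MEM, sub is in EX
print(diagram(program))
```

For the two-instruction example the ALU (EX stage) is working on the sub in cycle 4 while the lw accesses data memory.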
34 Example 1: Cycle 1. lw $10, 20($1): instruction fetch (Fig. 6.8)
35 Example 1: Cycle 2. sub $11, $2, $3: instruction fetch; lw $10, 20($1): instruction decode
36 Example 1: Cycle 3. sub $11, $2, $3: instruction decode; lw $10, 20($1): execution (Fig. 6.8)
37 Example 1: Cycle 4. sub $11, $2, $3: execution; lw $10, 20($1): memory
38 Example 1: Cycle 5. sub $11, $2, $3: memory; lw $10, 20($1): write back
39 Example 1: Cycle 6. sub $11, $2, $3: write back (Fig. 6.8)
40 Outline: An overview of pipelining; A pipelined datapath; Pipelined control (6.3); Data hazards and forwarding; Data hazards and stalls; Branch hazards; Exceptions; Superscalar and dynamic pipelining
41 Pipeline Control: Control Signals [Figure: pipelined datapath annotated with the control signals PCSrc, Branch, RegWrite, ALUSrc, ALUOp, RegDst, MemRead, MemWrite, MemtoReg]
42 Group Signals According to Stages
Can use control signals of the single-cycle CPU (Fig. 6.23, 6.2 <==> 5.2, 5.6)
Instruction | EX stage: RegDst ALUOp1 ALUOp0 ALUSrc | MEM stage: Branch MemRead MemWrite | WB stage: RegWrite MemtoReg
R-type      | 1 1 0 0 | 0 0 0 | 1 0
lw          | 0 0 0 1 | 0 1 0 | 1 1
sw          | X 0 0 1 | 0 0 1 | 0 X
beq         | X 0 1 0 | 1 0 0 | 0 X
43 Data Stationary Control
Pass control signals along just like the data. Main control generates the control signals during ID.
[Figure: Control block feeding the EX, MEM, and WB fields through ID/EX, EX/MEM, MEM/WB]
44 Data Stationary Control (cont.)
Signals for EX (ExtOp, ALUSrc, ...) are used 1 cycle later.
Signals for MEM (MemWr, Branch) are used 2 cycles later.
Signals for WB (MemtoReg, RegWr) are used 3 cycles later.
[Figure: main control in ID generates ExtOp, ALUSrc, ALUOp, RegDst, MemWr, Branch, MemtoReg, RegWr; the ID/EX register holds all of them, the EX/MEM register holds MemWr, Branch, MemtoReg, RegWr, and the MEM/WB register holds MemtoReg, RegWr]
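The idea of control signals travelling with the instruction can be sketched as three pipeline registers that shift on every clock. The control values below are the usual textbook settings for R-type, lw, sw, and beq and should be read as an assumption (None stands for a don't-care); the simulation itself is a simplified illustration, not the slide author's design.

```python
# Control values grouped by the stage that consumes them (None = don't care).
CONTROL = {
    "R-type": {"EX": {"RegDst": 1, "ALUOp": "10", "ALUSrc": 0},
               "MEM": {"Branch": 0, "MemRead": 0, "MemWrite": 0},
               "WB": {"RegWrite": 1, "MemtoReg": 0}},
    "lw":     {"EX": {"RegDst": 0, "ALUOp": "00", "ALUSrc": 1},
               "MEM": {"Branch": 0, "MemRead": 1, "MemWrite": 0},
               "WB": {"RegWrite": 1, "MemtoReg": 1}},
    "sw":     {"EX": {"RegDst": None, "ALUOp": "00", "ALUSrc": 1},
               "MEM": {"Branch": 0, "MemRead": 0, "MemWrite": 1},
               "WB": {"RegWrite": 0, "MemtoReg": None}},
    "beq":    {"EX": {"RegDst": None, "ALUOp": "01", "ALUSrc": 0},
               "MEM": {"Branch": 1, "MemRead": 0, "MemWrite": 0},
               "WB": {"RegWrite": 0, "MemtoReg": None}},
}


def simulate(ops):
    """Return, per cycle, the control group each later stage is using.
    Entry 0 is the cycle in which the first instruction is in ID."""
    id_ex = ex_mem = mem_wb = None  # control fields of the pipeline registers
    used = []
    for op in ops + [None, None, None]:  # extra cycles drain the pipeline
        used.append({
            "EX": id_ex["EX"] if id_ex else None,
            "MEM": ex_mem["MEM"] if ex_mem else None,
            "WB": mem_wb["WB"] if mem_wb else None,
        })
        # Clock edge: each register keeps only the groups still needed.
        mem_wb = {"WB": ex_mem["WB"]} if ex_mem else None
        ex_mem = {"MEM": id_ex["MEM"], "WB": id_ex["WB"]} if id_ex else None
        id_ex = CONTROL[op] if op else None
    return used


trace = simulate(["lw"])
print(trace[1]["EX"])   # EX signals, 1 cycle after decode
print(trace[2]["MEM"])  # MEM signals, 2 cycles after decode
print(trace[3]["WB"])   # WB signals, 3 cycles after decode
```

Each register is narrower than the one before it, which matches the shrinking set of signals carried from ID/EX to MEM/WB.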
45 WB Stage of load: Reg[IR[20-16]] = MDR (lw write back). Who will supply this address? (Fig. 6.)
46 Datapath with Control [Figure: pipelined datapath with the Control block in ID and the EX, MEM, and WB control fields carried in ID/EX, EX/MEM, and MEM/WB; signals shown: PCSrc, RegWrite, ALUSrc, ALUOp, RegDst, Branch, MemRead, MemWrite, MemtoReg]
47 Let's Try It Out
lw $10, 20($1)
sub $11, $2, $3
and $12, $4, $5
or $13, $6, $7
add $14, $8, $9
48 Example 2: Cycle 1. IF: lw $10, 20($1); ID: before<1>; EX: before<2>; MEM: before<3>; WB: before<4>
49 Example 2: Cycle 2. IF: sub $11, $2, $3; ID: lw $10, 20($1); EX: before<1>; MEM: before<2>; WB: before<3>
50 Example 2: Cycle 3. IF: and $12, $4, $5; ID: sub $11, $2, $3; EX: lw $10, ...; MEM: before<1>; WB: before<2>
51 Example 2: Cycle 4. IF: or $13, $6, $7; ID: and $12, $4, $5; EX: sub $11, ...; MEM: lw $10, ...; WB: before<1>
52 Example 2: Cycle 5. IF: add $14, $8, $9; ID: or $13, $6, $7; EX: and $12, ...; MEM: sub $11, ...; WB: lw $10, ...
53 Example 2: Cycle 6. IF: after<1>; ID: add $14, $8, $9; EX: or $13, ...; MEM: and $12, ...; WB: sub $11, ...
54 Example 2: Cycle 7. IF: after<2>; ID: after<1>; EX: add $14, ...; MEM: or $13, ...; WB: and $12, ... (Fig. 6.3)
55 Example 2: Cycle 8. IF: after<3>; ID: after<2>; EX: after<1>; MEM: add $14, ...; WB: or $13, ... (Fig. 6.3)
56 Example 2: Cycle 9. IF: after<4>; ID: after<3>; EX: after<2>; MEM: after<1>; WB: add $14, ...
57 Summary of Pipeline Basics
Pipelining is a fundamental concept: multiple steps using distinct resources.
Utilize the capabilities of the datapath by pipelined instruction processing: start the next instruction while working on the current one; limited by the length of the longest stage (plus fill/flush); need to detect and resolve hazards.
What makes it easy in MIPS? All instructions are of the same length; just a few instruction formats; memory operands only in loads and stores.
What makes pipelining hard? Hazards.
58 Outline: An overview of pipelining; A pipelined datapath; Pipelined control; Data hazards and forwarding (6.); Data hazards and stalls (6.5); Branch hazards; Exceptions; Superscalar and dynamic pipelining
59 Pipeline Hazards
Structural hazards: attempt to use the same resource in two different ways at the same time. E.g., a combined washer/dryer, or a folder busy doing something else (watching TV).
Data hazards: attempt to use an item before it is ready; an instruction depends on the result of a prior instruction still in the pipeline.
Control hazards: attempt to make a decision before the condition is evaluated. E.g., washing football uniforms and needing to see the result of the previous load to get the proper detergent level. Branch instructions.
Can always resolve hazards by waiting: pipeline control must detect the hazard and take action (or delay action) to resolve it.
60 Structural Hazard: Single Memory [Figure: Load, Instr 1, Instr 2, Instr 3, Instr 4 over time, each using Mem, Reg, ALU, Mem, Reg; the load's data access and a later instruction fetch need the memory in the same cycle] Use 2 memories: data and instruction memory.
61 Pipeline Hazards Illustrated [Figure: two overlapping IF ID EX MEM WB sequences illustrating a structural hazard]
62 Data Hazards [Figure: sub $2, $1, $3; and $12, $2, $5; or $13, $6, $2; add $14, $2, $2; sw $15, 100($2) over CC 1 to CC 9, with the value of register $2 shown per cycle; the later instructions depend on the $2 written by sub]
63 Types of Data Hazards
Three types (inst. i1 followed by inst. i2):
RAW (read after write): i2 tries to read an operand before i1 writes it.
WAR (write after read): i2 tries to write an operand before i1 reads it. i1 gets the wrong operand, e.g., with autoincrement addressing. Can't happen in the MIPS 5-stage pipeline because all instructions take 5 stages, reads are always in stage 2, and writes are always in stage 5.
WAW (write after write): i2 tries to write an operand before i1 writes it. Leaves the wrong result (i1's, not i2's); occurs only in pipelines that write in more than one stage. Can't happen in the MIPS 5-stage pipeline because all instructions take 5 stages and writes are always in stage 5.
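The three categories follow directly from comparing the register sets that two instructions read and write. The sketch below classifies the dependences between an earlier and a later instruction; whether a dependence becomes an actual hazard depends on the pipeline, as the slide notes. The tuple layout and the register numbers in the example are assumptions made for illustration.

```python
def classify(first, second):
    """first, second: (reads, writes) sets of register names,
    with `first` earlier in program order."""
    first_reads, first_writes = first
    second_reads, second_writes = second
    found = set()
    if first_writes & second_reads:
        found.add("RAW")  # second reads what first writes
    if first_reads & second_writes:
        found.add("WAR")  # second overwrites what first still has to read
    if first_writes & second_writes:
        found.add("WAW")  # both write the same register
    return found


sub_instr = ({"$1", "$3"}, {"$2"})   # sub $2, $1, $3
and_instr = ({"$2", "$5"}, {"$12"})  # and $12, $2, $5
print(classify(sub_instr, and_instr))  # {'RAW'}
```

In the five-stage MIPS pipeline only the RAW case can cause trouble, because all reads happen in stage 2 and all writes in stage 5.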
64 Pipeline Hazards Illustrated [Figure: overlapping IF ID EX MEM WB sequences illustrating a RAW (read after write) data hazard, a WAW (write after write) data hazard, and a WAR (write after read) data hazard]
65 Handling Data Hazards
Use simple, fixed designs: eliminate WAR by always fetching operands early (ID) in the pipeline; eliminate WAW by doing all write backs in order (last stage, static). These features have a lot to do with ISA design.
Internal forwarding in the register file: write in the first half of the clock and read in the second half; the read delivers what is written, which resolves the hazard between sub and add.
Detect and resolve the remaining ones: compiler inserts NOPs; forward; stall.
66 Software Solution
Have the compiler guarantee no hazards. Where do we insert the NOPs?
sub $2, $1, $3
and $12, $2, $5
or $13, $6, $2
add $14, $2, $2
sw $15, 100($2)
Problem: this really slows us down!
67 Data Hazards [Figure: the same five-instruction sequence over CC 1 to CC 9 with the value of register $2 shown per cycle] Insert two nops.
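The compiler's NOP insertion can be sketched as a scan that keeps dependent instructions far enough apart. This is a minimal illustration assuming no forwarding hardware and a register file that writes in the first half of a cycle and reads in the second, so a consumer must be at least three instruction slots behind its producer. The tuple encoding, the register numbers, and the function name are assumptions, and writes to register $0 are not treated specially.

```python
def insert_nops(program, min_distance=3):
    """program: list of (text, dest, sources). Returns the instruction
    texts with nops inserted so every consumer is at least `min_distance`
    slots after the producer of each of its source registers."""
    out = []
    for text, dest, sources in program:
        window = out[-(min_distance - 1):] if min_distance > 1 else []
        needed = 0
        for back, (_, prev_dest, _) in enumerate(reversed(window), start=1):
            if prev_dest is not None and prev_dest in sources:
                needed = max(needed, min_distance - back)
        out.extend([("nop", None, ())] * needed)
        out.append((text, dest, sources))
    return [text for text, _, _ in out]


program = [
    ("sub $2, $1, $3", "$2", ("$1", "$3")),
    ("and $12, $2, $5", "$12", ("$2", "$5")),
    ("or $13, $6, $2", "$13", ("$6", "$2")),
    ("add $14, $2, $2", "$14", ("$2",)),
    ("sw $15, 100($2)", None, ("$15", "$2")),
]
for line in insert_nops(program):
    print(line)
```

For this sequence two nops end up between the sub and the and, which agrees with the "insert two nops" figure; the remaining instructions are already far enough from the sub.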
68 Data Hazards: Forwarding [Figure: sub $2, $1, $3; and $12, $2, $5; or $13, $6, $2; add $14, $2, $2; sw $15, 100($2) over CC 1 to CC 9, with the value of register $2 shown per cycle and the sub result forwarded to the dependent instructions]
69 Pipeline with Forwarding [Figure: datapath with a forwarding unit; its inputs are ID/EX Rs and Rt, EX/MEM.RegisterRd, and MEM/WB.RegisterRd, and its outputs ForwardA and ForwardB select the ALU inputs]
70 Detecting Data Hazards
Hazard conditions:
1a. EX/MEM.RegisterRd = ID/EX.RegisterRs
1b. EX/MEM.RegisterRd = ID/EX.RegisterRt
2a. MEM/WB.RegisterRd = ID/EX.RegisterRs
2b. MEM/WB.RegisterRd = ID/EX.RegisterRt
Two optimizations:
Don't forward if the instruction does not write a register => check if RegWrite is asserted.
Don't forward if the destination register is $0 => check if RegisterRd = 0.
71 Detecting Data Hazards (cont.)
Hazard conditions using control signals:
At EX stage: EX/MEM.RegWrite and (EX/MEM.Rd ≠ 0) and (EX/MEM.Rd = ID/EX.Rs)
At MEM stage: MEM/WB.RegWrite and (MEM/WB.Rd ≠ 0) and (MEM/WB.Rd = ID/EX.Rs)
(Replace ID/EX.Rs with ID/EX.Rt for the other two conditions.)
72 Resolving Hazards: Forwarding
Use temporary results, e.g., those in pipeline registers; don't wait for them to be written.
[Figure: the sub, and, or, add, sw sequence over CC 1 to CC 9, with the value of register $2, the value of EX/MEM, and the value of MEM/WB shown per cycle; the sub result appears in EX/MEM in CC 4 and in MEM/WB in CC 5]
73 Pipeline with Forwarding [Figure: same datapath with forwarding unit as on slide 69]
74 Forwarding Logic
Forwarding: input to the ALU from any pipeline register. Add multiplexors to the ALU inputs.
Control forwarding in EX => carry Rs in ID/EX.
Control signals for forwarding:
If both WB and MEM forward, e.g., add $1,$1,$2; add $1,$1,$3; add $1,$1,$4; => let MEM forward.
EX hazard: if (EX/MEM.RegWrite and (EX/MEM.Rd ≠ 0) and (EX/MEM.Rd = ID/EX.Rs)) ForwardA = 10
MEM hazard: if (MEM/WB.RegWrite and (MEM/WB.Rd ≠ 0) and (EX/MEM.Rd ≠ ID/EX.Rs) and (MEM/WB.Rd = ID/EX.Rs)) ForwardA = 01
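The forwarding conditions translate almost line for line into code. The sketch below is an illustration under assumptions: the encodings 10 (take the value from EX/MEM) and 01 (take it from MEM/WB) follow the usual textbook convention, pipeline-register fields are modelled as small dictionaries, and the EX check takes priority by returning first rather than by the explicit "EX/MEM.Rd differs" term.

```python
def forward_select(source_reg, ex_mem, mem_wb):
    """Return the 2-bit mux select for one ALU operand.
    source_reg is ID/EX.Rs (for ForwardA) or ID/EX.Rt (for ForwardB);
    ex_mem and mem_wb are dicts with 'RegWrite' and 'Rd'."""
    # EX hazard: the most recent result wins.
    if ex_mem["RegWrite"] and ex_mem["Rd"] != 0 and ex_mem["Rd"] == source_reg:
        return 0b10
    # MEM hazard: only reached when no EX hazard forwards this operand.
    if mem_wb["RegWrite"] and mem_wb["Rd"] != 0 and mem_wb["Rd"] == source_reg:
        return 0b01
    return 0b00  # no forwarding: use the value read from the register file


# add $1,$1,$2; add $1,$1,$3; add $1,$1,$4 with the third add in EX:
ex_mem = {"RegWrite": True, "Rd": 1}  # second add
mem_wb = {"RegWrite": True, "Rd": 1}  # first add
print(format(forward_select(1, ex_mem, mem_wb), "02b"))  # 10
```

The same function serves both operands; calling it with ID/EX.Rt gives ForwardB.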
75 Example 3: Cycle 3. IF: or $4, $4, $2; ID: and $4, $2, $5; EX: sub $2, $1, $3; MEM: before<1>; WB: before<2>
76 Example 3: Cycle 4. IF: add $9, $4, $2; ID: or $4, $4, $2; EX: and $4, $2, $5; MEM: sub $2, ...; WB: before<1> (Fig. 6.)
77 Example 3: Cycle 5. IF: after<1>; ID: add $9, $4, $2; EX: or $4, $4, $2; MEM: and $4, ...; WB: sub $2, ...
78 Example 3: Cycle 6. IF: after<2>; ID: after<1>; EX: add $9, $4, $2; MEM: or $4, ...; WB: and $4, ...
79 Can't Always Forward
lw can still cause a hazard if it is followed by an instruction that reads the loaded register.
[Figure: lw $2, 20($1); and $4, $2, $5; or $8, $2, $6; add $9, $4, $2; slt $1, $6, $7] (Fig. 6.3)
Use stalling or the compiler to resolve.
80 Stalling
Stall the pipeline by keeping instructions in the same stage and inserting an NOP instead.
[Figure: the same sequence with a bubble inserted after lw; and, or, add, and slt are each delayed one cycle]
81 Pipeline with Stalling Unit
Forwarding controls the ALU inputs; hazard detection controls PC, IF/ID, and the control signals.
[Figure: datapath with a hazard detection unit (inputs ID/EX.MemRead, ID/EX.RegisterRt, IF/ID.RegisterRs, IF/ID.RegisterRt; outputs PCWrite, IF/IDWrite, and a control-zeroing mux) and the forwarding unit]
82 Handling Stalls
Hazard detection unit in ID inserts a stall between a load instruction and its use:
if (ID/EX.MemRead and ((ID/EX.RegisterRt = IF/ID.RegisterRs) or (ID/EX.RegisterRt = IF/ID.RegisterRt))) stall the pipeline for one cycle
(ID/EX.MemRead = 1 indicates a load instruction)
How to stall? Stall the instructions in IF and ID: do not change PC and IF/ID => the stages re-execute the instructions.
What to move into EX: insert an NOP by changing the EX, MEM, WB control fields of the ID/EX pipeline register to 0; as the control signals propagate, all control signals to EX, MEM, WB are deasserted and no registers or memories are written.
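The load-use check is a single Boolean expression. The sketch below assumes the field names used on the slide, passed in as plain arguments; the function name and the register numbers in the example are illustrative assumptions.

```python
def must_stall(id_ex_mem_read, id_ex_rt, if_id_rs, if_id_rt):
    """True when the instruction in EX is a load whose destination (Rt)
    is a source register of the instruction currently in ID."""
    return bool(id_ex_mem_read) and id_ex_rt in (if_id_rs, if_id_rt)


# lw $2, 20($1) in EX, and $4, $2, $5 in ID: the and reads $2 too early.
print(must_stall(1, 2, 2, 5))  # True

# One cycle later the lw has moved on and the bubble occupies EX,
# so MemRead is 0 and the and can proceed (its operand is then forwarded).
print(must_stall(0, 0, 2, 5))  # False
```

When the function returns True, the hardware holds PC and IF/ID and zeroes the control fields written into ID/EX, which is what creates the bubble.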
83 Example 4: Cycle 2. IF: and $4, $2, $5; ID: lw $2, 20($1); EX: before<1>; MEM: before<2>; WB: before<3>
84 Example 4: Cycle 3. IF: or $4, $4, $2; ID: and $4, $2, $5; EX: lw $2, 20($1); MEM: before<1>; WB: before<2>
85 Example 4: Cycle 4. IF: or $4, $4, $2; ID: and $4, $2, $5; EX: bubble; MEM: lw $2, ...; WB: before<1>
86 Example 4: Cycle 5. IF: add $9, $4, $2; ID: or $4, $4, $2; EX: and $4, $2, $5; MEM: bubble; WB: lw $2, ...
87 Example 4: Cycle 6. IF: after<1>; ID: add $9, $4, $2; EX: or $4, $4, $2; MEM: and $4, ...; WB: bubble (Fig. 6.9)
88 Example 4: Cycle 7. IF: after<2>; ID: after<1>; EX: add $9, $4, $2; MEM: or $4, ...; WB: and $4, ... (Fig. 6.9)
89 Outline: An overview of pipelining; A pipelined datapath; Pipelined control; Data hazards and forwarding; Data hazards and stalls; Branch hazards (6.6); Exceptions; Superscalar and dynamic pipelining
90 Feedback Path [Figure: pipelined datapath with registers IF/ID, ID/EX, EX/MEM, MEM/WB, showing the path from the branch logic back to the PC]
91 Pipeline Datapath with Control Signals [Figure: pipelined datapath annotated with PCSrc, Branch, RegWrite, ALUSrc, ALUOp, RegDst, MemRead, MemWrite, MemtoReg]
92 Pipeline Hazards Illustrated [Figure: overlapping IF ID EX MEM WB sequences illustrating a structural hazard, a RAW (read after write) data hazard, a WAW (write after write) data hazard, a WAR (write after read) data hazard, and a control hazard]
93 Branch Hazards
When we decide to branch, other instructions are in the pipeline!
[Figure: 40 beq $1, $3, 7; 44 and $12, $2, $5; 48 or $13, $6, $2; 52 add $14, $2, $2; 72 lw $4, 50($7)]
94 Handling Branch Hazard
Predict branch always not taken. Need to add hardware for flushing instructions if wrong. Branch decision made at MEM => need to flush the instructions in IF/ID, ID/EX by changing control values to 0.
Reduce the delay of a taken branch by moving branch execution earlier in the pipeline:
Move up the branch address calculation to ID.
Check branch equality at ID (using XOR) by comparing the two registers read during ID.
Branch decision made at ID => one instruction to flush.
Add a control signal, IF.Flush, to zero the instruction field of IF/ID => making the instruction an NOP.
Dynamic branch prediction.
Compiler rescheduling, delayed branch.
95 Pipeline with Flushing (Fig. 6.) [Figure: datapath with the IF.Flush signal, the hazard detection unit, a register comparator (=) and branch adder in ID, and the forwarding unit]
96 Example 5: Cycle 3. IF: and $12, $2, $5; ID: beq $1, $3, 7; EX: sub $10, $4, $8; MEM: before<1>; WB: before<2>. The branch target 72 is computed in ID and IF.Flush is asserted.
97 Example 5: Cycle 4. IF: lw $4, 50($7); ID: bubble (nop); EX: beq $1, $3, 7; MEM: sub $10, ...; WB: before<1>
98 Delayed Branch
Predict-not-taken + branch decision at ID => the following instruction is always executed => branches take effect 1 cycle later.
[Figure: add, beq, misc, lw over time (clock cycles); the instruction after beq executes before the branch target lw]
99 Dynamic Branch Prediction
Performance = ƒ(accuracy, cost of misprediction)
Branch History Table: the lower bits of the PC address index a table of 1-bit values. Each entry says whether or not the branch was taken last time. No address check.
Problem: in a loop, a 1-bit BHT will cause two mispredictions (avg is 9 iterations before exit): at the end of the loop, when it exits instead of looping as before; and the first time through the loop on the next pass through the code, when it predicts exit instead of looping.
100 1-Bit Prediction
For each branch, keep track of what happened last time and use that outcome as the prediction.
What are the prediction accuracies for branches 1 and 2 below: while (1) { for (i=0;i<10;i++) { branch-1 } for (j=0;j<20;j++) { branch-2 } }
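The accuracy question can be explored with a small simulation. This sketch rests on several assumptions that are not stated on the slide: the inner loops run 10 and 20 iterations, each loop is closed by a branch that is taken on every iteration except the last, each branch has its own predictor entry, and the entry starts out as "not taken".

```python
def loop_branch_outcomes(iterations, passes):
    """Loop-closing branch: taken on every iteration except the last."""
    return ([True] * (iterations - 1) + [False]) * passes


def one_bit_mispredictions(outcomes, last_taken=False):
    """Predict that each branch does whatever it did last time."""
    misses = 0
    for taken in outcomes:
        if taken != last_taken:
            misses += 1
        last_taken = taken
    return misses


for name, iterations in (("branch-1", 10), ("branch-2", 20)):
    outcomes = loop_branch_outcomes(iterations, passes=100)
    misses = one_bit_mispredictions(outcomes)
    print(name, misses, "misses in", len(outcomes), "branches")
```

Under these assumptions each pass through a loop costs two mispredictions (the exit and the first iteration of the next pass), so the 10-iteration branch is predicted correctly 8 times out of 10 and the 20-iteration branch 18 times out of 20.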
101 2-Bit Prediction
For each branch, maintain a 2-bit saturating counter:
if the branch is taken: counter = min(3, counter+1)
if the branch is not taken: counter = max(0, counter-1)
If (counter >= 2), predict taken, else predict not taken.
Advantage: a few atypical branches will not influence the prediction (a better measure of the common case).
Especially useful when multiple branches share the same counter (some bits of the branch PC are used to index into the branch predictor).
Can be easily extended to N bits (in most processors, N=2).
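The counter rules above map directly onto a tiny class. The sketch below applies it to a loop-closing branch that is taken nine times and then falls through, repeated over many passes; the loop length, the initial counter value, and all names are assumptions chosen for illustration.

```python
class TwoBitCounter:
    """2-bit saturating counter: values 0..3, predict taken when >= 2."""

    def __init__(self, counter=0):
        self.counter = counter

    def predict(self):
        return self.counter >= 2

    def update(self, taken):
        if taken:
            self.counter = min(3, self.counter + 1)
        else:
            self.counter = max(0, self.counter - 1)


def count_mispredictions(outcomes, predictor):
    misses = 0
    for taken in outcomes:
        if predictor.predict() != taken:
            misses += 1
        predictor.update(taken)
    return misses


# Loop-closing branch of a 10-iteration loop, executed for 100 passes.
outcomes = ([True] * 9 + [False]) * 100
print(count_mispredictions(outcomes, TwoBitCounter(counter=3)))  # 100
```

Starting from a saturated counter, only the loop exit is mispredicted on each pass: the single not-taken outcome lowers the counter to 2, which still predicts taken when the loop is entered again.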
102 N-bit Branch Prediction Buffers
When the counter is greater than or equal to one half of its maximum value (2^n - 1), the branch is predicted as taken.
The counter is incremented on a taken branch and decremented on an untaken branch.
A branch buffer can be implemented as a small cache accessed during the IF stage.
103 N-bit Branch Prediction Buffers
Use an n-bit saturating counter. Only the loop exit causes a misprediction. A 2-bit predictor is almost as good as any general n-bit predictor.
104 Basic Branch Prediction Buffers
a.k.a. Branch History Table (BHT): a small direct-mapped cache of T/NT bits.
[Figure: the branch PC indexes the BHT; T (predict taken) selects the branch target, NT (predict not taken) selects PC + 4]
105 Outline: An overview of pipelining; A pipelined datapath; Pipelined control; Data hazards and forwarding; Data hazards and stalls; Branch hazards; Exceptions; Superscalar and dynamic pipelining
106 What about Exceptions?
Another form of branch hazard.
How to stop the pipeline? How to restart? Who caused the interrupt? Who to serve first, if multiple interrupts occur at the same time?
107 Handling Exceptions
How to stop the pipeline? How to restart?
Suppose overflow occurs at add $1, $2, $1.
Disable writes of instructions till the trap hits, e.g., flush the following instructions using IF.Flush, ID.Flush, EX.Flush to cause multiplexers to zero the control signals (overflow exception detected at EX => flush the offending instruction).
Force a trap instruction into IF, e.g., fetch from 40000040hex by adding 40000040hex to the PC input MUX.
Save the address of the offending instruction in EPC.
108 Pipeline with Exception [Figure: datapath with IF.Flush, ID.Flush, EX.Flush, the hazard detection unit, Cause and Except PC registers, and the forwarding unit]
109 Who Caused the Exception?
5 instructions are executing in a 5-stage pipeline. Who caused the exception? Need to know in which stage an exception can occur => helps determine the cause.
Stage | Problem interrupts occurring
IF    | Page fault; misaligned memory access; memory-protection violation
ID    | Undefined or illegal opcode
EX    | Arithmetic exception
MEM   | Page fault; misaligned memory access; memory error; memory-protection violation
110 When to Serve?
Who to serve first, if multiple interrupts occur at the same time?
Multiple interrupts: use priority hardware to choose the earliest instruction to interrupt.
External interrupts: flexible in when to interrupt.
111 Outline: An overview of pipelining; A pipelined datapath; Pipelined control; Data hazards and forwarding; Data hazards and stalls; Branch hazards; Exceptions; Superscalar and dynamic pipelining
112 Instruction Level Parallelism
How to increase the potential amount of ILP:
Increase the depth of the pipeline to overlap more instructions: super-pipeline.
Launch multiple instructions: static multiple issue (decision made by the compiler before execution); dynamic multiple issue (decision made during execution by the processor).
113 Different Pipelined Designs
Pipelining: instructions overlap, one IF D E W sequence starting per cycle. Limitation: issue rate, FU stalls, FU depth.
Super-scalar: issue multiple scalar instructions per cycle. Limitation: hazard resolution.
VLIW (EPIC): each instruction specifies multiple scalar operations; the compiler determines parallelism. Limitation: packing.
114 Static Multiple Issue
Use the compiler to assist with packing instructions and handling hazards.
Very Long Instruction Word (VLIW)
Explicitly Parallel Instruction Computer (EPIC) (Intel IA-64)
115 A Static Two-issue Datapath (Fig. 6.5) [Figure: datapath with PC, instruction memory, registers, two sign-extend units, two ALUs, and data memory]
116 Dynamic Multiple Issue
The hardware performs the scheduling: hardware tries to find instructions to execute; out-of-order execution is possible; speculative execution and dynamic branch prediction.
Superscalar.
117 Superscalar: Three Primary Units
Instruction fetch and decode unit (in-order issue); reservation stations feeding the functional units: Integer, Integer, Floating point, Load/Store (out-of-order execute); commit unit (in-order commit).
118 Simple Superscalar
Independent INT and FP issue to separate pipelines.
[Figure: I-Cache, INT Reg, Inst Issue and Bypass, FP Reg, Operand / Result Busses, INT Unit, Load / Store Unit, FP Add, FP Mul, D-Cache]
119 Dynamic Scheduling
All modern processors are very complicated.
DEC Alpha 21264: 9-stage pipeline, 6 instruction issue.
PowerPC and Pentium: branch history table.
Compiler technology is important.
120 Summary
Pipelines pass control information down the pipe just as data moves down the pipe.
Forwarding/stalls are handled by local control.
Exceptions stop the pipeline.
The MIPS instruction set architecture made the pipeline visible (delayed branch, delayed load).
More performance from deeper pipelines and parallelism.
CS359: Compter Architectre Chapter 3 & Appendi C Pipelining Part A: Basic and Intermediate Concepts Yanyan Shen Department of Compter Science and Engineering Shanghai Jiao Tong University 1 Otline Introdction
More informationPipelining. Chapter 4
Pipelining Chapter 4 ake processor rns faster Pipelining is an implementation techniqe in which mltiple instrctions are overlapped in eection Key of making processor fast Pipelining Single cycle path we
More informationComp 303 Computer Architecture A Pipelined Datapath Control. Lecture 13
Comp 33 Compter Architectre A Pipelined path Lectre 3 Pipelined path with Signals PCSrc IF/ ID ID/ EX EX / E E / Add PC 4 Address Instrction emory RegWr ra rb rw Registers bsw [5-] [2-6] [5-] bsa bsb Sign
More informationChapter 4 (Part II) Sequential Laundry
Chapter 4 (Part II) The Processor Baback Izadi Division of Engineering Programs bai@engr.newpaltz.edu Sequential Laundry 6 P 7 8 9 10 11 12 1 2 A T a s k O r d e r A B C D 30 30 30 30 30 30 30 30 30 30
More informationInstruction fetch. MemRead. IRWrite ALUSrcB = 01. ALUOp = 00. PCWrite. PCSource = 00. ALUSrcB = 00. R-type completion
. (Chapter 5) Fill in the vales for SrcA, SrcB, IorD, Dst and emto to complete the Finite State achine for the mlti-cycle datapath shown below. emory address comptation 2 SrcA = SrcB = Op = fetch em SrcA
More informationPIPELINING. Pipelining: Natural Phenomenon. Pipelining. Pipelining Lessons
Pipelining: Natral Phenomenon Landry Eample: nn, rian, Cathy, Dave each have one load of clothes to wash, dry, and fold Washer takes 30 mintes C D Dryer takes 0 mintes PIPELINING Folder takes 20 mintes
More informationEEC 483 Computer Organization
EEC 83 Compter Organization Chapter.6 A Pipelined path Chans Y Pipelined Approach 2 - Cycle time, No. stages - Resorce conflict E E A B C D 3 E E 5 E 2 3 5 2 6 7 8 9 c.y9@csohio.ed Resorces sed in 5 Stages
More informationExceptions and interrupts
Eceptions and interrpts An eception or interrpt is an nepected event that reqires the CPU to pase or stop the crrent program. Eception handling is the hardware analog of error handling in software. Classes
More informationThe extra single-cycle adders
lticycle Datapath As an added bons, we can eliminate some of the etra hardware from the single-cycle path. We will restrict orselves to sing each fnctional nit once per cycle, jst like before. Bt since
More informationSolutions for Chapter 6 Exercises
Soltions for Chapter 6 Eercises Soltions for Chapter 6 Eercises 6. 6.2 a. Shortening the ALU operation will not affect the speedp obtained from pipelining. It wold not affect the clock cycle. b. If the
More information1048: Computer Organization
48: Compter Organization Lectre 5 Datapath and Control Lectre5A - simple implementation (cwli@twins.ee.nct.ed.tw) 5A- Introdction In this lectre, we will try to implement simplified IPS which contain emory
More informationChapter Six. Dataı access. Reg. Instructionı. fetch. Dataı. Reg. access. Dataı. Reg. access. Dataı. Instructionı fetch. 2 ns 2 ns 2 ns 2 ns 2 ns
Chapter Si Pipelining Improve perfomance by increasing instruction throughput eecutionı Time lw $, ($) 2 6 8 2 6 8 access lw $2, 2($) 8 ns access lw $3, 3($) eecutionı Time lw $, ($) lw $2, 2($) 2 ns 8
More informationThe final datapath. M u x. Add. 4 Add. Shift left 2. PCSrc. RegWrite. MemToR. MemWrite. Read data 1 I [25-21] Instruction. Read. register 1 Read.
The final path PC 4 Add Reg Shift left 2 Add PCSrc Instrction [3-] Instrction I [25-2] I [2-6] I [5 - ] register register 2 register 2 Registers ALU Zero Reslt ALUOp em Data emtor RegDst ALUSrc em I [5
More informationEEC 483 Computer Organization. Branch (Control) Hazards
EEC 483 Compter Organization Section 4.8 Branch Hazards Section 4.9 Exceptions Chans Y Branch (Control) Hazards While execting a previos branch, next instrction address might not yet be known. s n i o
More informationEEC 483 Computer Organization
EEC 483 Compter Organization Chapter 4.4 A Simple Implementation Scheme Chans Y The Big Pictre The Five Classic Components of a Compter Processor Control emory Inpt path Otpt path & Control 2 path and
More informationThe single-cycle design from last time
lticycle path Last time we saw a single-cycle path and control nit for or simple IPS-based instrction set. A mlticycle processor fies some shortcomings in the single-cycle CPU. Faster instrctions are not
More informationImprove performance by increasing instruction throughput
Improve performance by increasing instruction throughput Program execution order Time (in instructions) lw $1, 100($0) fetch 2 4 6 8 10 12 14 16 18 ALU Data access lw $2, 200($0) 8ns fetch ALU Data access
More informationThe multicycle datapath. Lecture 10 (Wed 10/15/2008) Finite-state machine for the control unit. Implementing the FSM
Lectre (Wed /5/28) Lab # Hardware De Fri Oct 7 HW #2 IPS programming, de Wed Oct 22 idterm Fri Oct 2 IorD The mlticycle path SrcA Today s objectives: icroprogramming Etending the mlti-cycle path lti-cycle
More informationComputer Architecture Chapter 5. Fall 2005 Department of Computer Science Kent State University
Compter Architectre Chapter 5 Fall 25 Department of Compter Science Kent State University The Processor: Datapath & Control Or implementation of the MIPS is simplified memory-reference instrctions: lw,
More informationInstruction Pipelining is the use of pipelining to allow more than one instruction to be in some stage of execution at the same time.
Pipelining Pipelining is the se of pipelining to allow more than one instrction to be in some stage of eection at the same time. Ferranti ATLAS (963): Pipelining redced the average time per instrction
More informationReview Multicycle: What is Happening. Controlling The Multicycle Design
Review lticycle: What is Happening Reslt Zero Op SrcA SrcB Registers Reg Address emory em Data Sign etend Shift left Sorce A B Ot [-6] [5-] [-6] [5-] [5-] Instrction emory IR RegDst emtoreg IorD em em
More informationComputer Architecture. Lecture 6: Pipelining
Compter Architectre Lectre 6: Pipelining Dr. Ahmed Sallam Based on original slides by Prof. Onr tl Agenda for Today & Net Few Lectres Single-cycle icroarchitectres lti-cycle and icroprogrammed icroarchitectres
More informationCS 251, Winter 2019, Assignment % of course mark
CS 25, Winter 29, Assignment.. 3% of corse mark De Wednesday, arch 3th, 5:3P Lates accepted ntil Thrsday arch th, pm with a 5% penalty. (7 points) In the diagram below, the mlticycle compter from the corse
More informationLecture 10: Pipelined Implementations
U 8-7 S 9 L- 8-7 Lectre : Pipelined Implementations James. Hoe ept of EE, U Febrary 23, 29 nnoncements: Project is de this week idterm graded, d reslts posted Handots: H9 Homework 3 (on lackboard) Graded
More informationWhat do we have so far? Multi-Cycle Datapath (Textbook Version)
What do we have so far? ulti-cycle Datapath (Textbook Version) CPI: R-Type = 4, Load = 5, Store 4, Branch = 3 Only one instruction being processed in datapath How to lower CPI further? #1 Lec # 8 Summer2001
More informationReview. A single-cycle MIPS processor
Review If three instrctions have opcodes, 7 and 5 are they all of the same type? If we were to add an instrction to IPS of the form OD $t, $t2, $t3, which performs $t = $t2 OD $t3, what wold be its opcode?
More informationEXAMINATIONS 2010 END OF YEAR NWEN 242 COMPUTER ORGANIZATION
EXAINATIONS 2010 END OF YEAR COPUTER ORGANIZATION Time Allowed: 3 Hors (180 mintes) Instrctions: Answer all qestions. ake sre yor answers are clear and to the point. Calclators and paper foreign langage
More informationLecture 3: The Processor (Chapter 4 of textbook) Chapter 4.1
Lecture 3: The Processor (Chapter 4 of textbook) Chapter 4.1 Introduction Chapter 4.1 Chapter 4.2 Review: MIPS (RISC) Design Principles Simplicity favors regularity fixed size instructions small number
More informationEXAMINATIONS 2003 END-YEAR COMP 203. Computer Organisation
EXAINATIONS 2003 COP203 END-YEAR Compter Organisation Time Allowed: 3 Hors (180 mintes) Instrctions: Answer all qestions. There are 180 possible marks on the eam. Calclators and foreign langage dictionaries
More informationPART I: Adding Instructions to the Datapath. (2 nd Edition):
EE57 Instrctor: G. Pvvada ===================================================================== Homework #5b De: check on the blackboard =====================================================================
More informationQuiz #1 EEC 483, Spring 2019
Qiz # EEC 483, Spring 29 Date: Jan 22 Name: Eercise #: Translate the following instrction in C into IPS code. Eercise #2: Translate the following instrction in C into IPS code. Hint: operand C is stored
More informationCS 251, Winter 2018, Assignment % of course mark
CS 25, Winter 28, Assignment 4.. 3% of corse mark De Wednesday, arch 7th, 4:3P Lates accepted ntil Thrsday arch 8th, am with a 5% penalty. (6 points) In the diagram below, the mlticycle compter from the
More informationPipelining. Ideal speedup is number of stages in the pipeline. Do we achieve this? 2. Improve performance by increasing instruction throughput ...
CHAPTER 6 1 Pipelining Instruction class Instruction memory ister read ALU Data memory ister write Total (in ps) Load word 200 100 200 200 100 800 Store word 200 100 200 200 700 R-format 200 100 200 100
More informationT = I x CPI x C. Both effective CPI and clock cycle C are heavily influenced by CPU design. CPI increased (3-5) bad Shorter cycle good
CPU performance equation: T = I x CPI x C Both effective CPI and clock cycle C are heavily influenced by CPU design. For single-cycle CPU: CPI = 1 good Long cycle time bad On the other hand, for multi-cycle
More informationChapter 3 & Appendix C Pipelining Part A: Basic and Intermediate Concepts
CS359: Computer Architecture Chapter 3 & Appendix C Pipelining Part A: Basic and Intermediate Concepts Yanyan Shen Department of Computer Science and Engineering Shanghai Jiao Tong University Parallel
More information1048: Computer Organization
48: Compter Organization Lectre 5 Datapath and Control Lectre5B - mlticycle implementation (cwli@twins.ee.nct.ed.tw) 5B- Recap: A Single-Cycle Processor PCSrc 4 Add Shift left 2 Add ALU reslt PC address
More informationCSE 141 Computer Architecture Summer Session I, Lectures 10 Advanced Topics, Memory Hierarchy and Cache. Pramod V. Argade
CSE 141 Compter Architectre Smmer Session I, 2004 Lectres 10 Advanced Topics, emory Hierarchy and Cache Pramod V. Argade CSE141: Introdction to Compter Architectre Instrctor: TA: Pramod V. Argade (p2argade@cs.csd.ed)
More informationLecture 6: Pipelining
Lecture 6: Pipelining i CSCE 26 Computer Organization Instructor: Saraju P. ohanty, Ph. D. NOTE: The figures, text etc included in slides are borrowed from various books, websites, authors pages, and other
More informationLecture 3. Pipelining. Dr. Soner Onder CS 4431 Michigan Technological University 9/23/2009 1
Lecture 3 Pipelining Dr. Soner Onder CS 4431 Michigan Technological University 9/23/2009 1 A "Typical" RISC ISA 32-bit fixed format instruction (3 formats) 32 32-bit GPR (R0 contains zero, DP take pair)
More informationLecture 7. Building A Simple Processor
Lectre 7 Bilding A Simple Processor Christos Kozyrakis Stanford University http://eeclass.stanford.ed/ee8b C. Kozyrakis EE8b Lectre 7 Annoncements Upcoming deadlines Lab is de today Demo by 5pm, report
More informationProf. Kozyrakis. 1. (10 points) Consider the following fragment of Java code:
EE8 Winter 25 Homework #2 Soltions De Thrsday, Feb 2, 5 P. ( points) Consider the following fragment of Java code: for (i=; i
More informationDesigning a Pipelined CPU
Designing a Pipelined CPU CSE 4, S2'6 Review -- Single Cycle CPU CSE 4, S2'6 Review -- ultiple Cycle CPU CSE 4, S2'6 Review -- Instruction Latencies Single-Cycle CPU Load Ifetch /Dec Exec em Wr ultiple
More informationProcessor Design CSCE Instructor: Saraju P. Mohanty, Ph. D. NOTE: The figures, text etc included in slides are borrowed
Lecture 3: General Purpose Processor Design CSCE 665 Advanced VLSI Systems Instructor: Saraju P. ohanty, Ph. D. NOTE: The figures, tet etc included in slides are borrowed from various books, websites,
More information微算機系統第六章. Enhancing Performance with Pipelining 陳伯寧教授電信工程學系國立交通大學. Ann, Brian, Cathy, Dave each have one load of clothes to wash, dry, and fold
微算機系統第六章 Enhancing Performance with Pipelining 陳伯寧教授電信工程學系國立交通大學 chap6- Pipeline is natural! Laundry Example Ann, Brian, athy, Dave each have one load of clothes to wash, dry, and fold A B D Washer takes
More informationPipelined Datapath. Reading. Sections Practice Problems: 1, 3, 8, 12 (2) Lecture notes from MKP, H. H. Lee and S.
Pipelined Datapath Lecture notes from KP, H. H. Lee and S. Yalamanchili Sections 4.5 4. Practice Problems:, 3, 8, 2 ing (2) Pipeline Performance Assume time for stages is ps for register read or write
More informationOutline Marquette University
COEN-4710 Computer Hardware Lecture 4 Processor Part 2: Pipelining (Ch.4) Cristinel Ababei Department of Electrical and Computer Engineering Credits: Slides adapted primarily from presentations from Mike
More informationDesign a MIPS Processor (2/2)
93-2Digital System Design Design a MIPS Processor (2/2) Lecturer: Chihhao Chao Advisor: Prof. An-Yeu Wu 2005/5/13 Friday ACCESS IC LABORTORY Outline v 6.1 An Overview of Pipelining v 6.2 A Pipelined Datapath
More informationPipelined Datapath. Reading. Sections Practice Problems: 1, 3, 8, 12
Pipelined Datapath Lecture notes from KP, H. H. Lee and S. Yalamanchili Sections 4.5 4. Practice Problems:, 3, 8, 2 ing Note: Appendices A-E in the hardcopy text correspond to chapters 7- in the online
More informationComputer Architecture
Compter Architectre Lectre 4: Intro to icroarchitectre: Single- Cycle Dr. Ahmed Sallam Sez Canal University Spring 25 Based on original slides by Prof. Onr tl Review Compter Architectre Today and Basics
More informationCSE Introduction to Computer Architecture Chapter 5 The Processor: Datapath & Control
CSE-45432 Introdction to Compter Architectre Chapter 5 The Processor: Datapath & Control Dr. Izadi Data Processor Register # PC Address Registers ALU memory Register # Register # Address Data memory Data
More informationMIPS An ISA for Pipelining
Pipelining: Basic and Intermediate Concepts Slides by: Muhamed Mudawar CS 282 KAUST Spring 2010 Outline: MIPS An ISA for Pipelining 5 stage pipelining i Structural Hazards Data Hazards & Forwarding Branch
More information4.13 Advanced Topic: An Introduction to Digital Design Using a Hardware Design Language 345.e1
.3 Advanced Topic: An Introdction to Digital Design Using a Hardware Design Langage 35.e.3 Advanced Topic: An Introdction to Digital Design Using a Hardware Design Langage to Describe and odel a Pipeline
More informationComputer Architecture
Compter Architectre Lectre 4: Intro to icroarchitectre: Single- Cycle Dr. Ahmed Sallam Sez Canal University Based on original slides by Prof. Onr tl Review Compter Architectre Today and Basics (Lectres
More informationDepartment of Computer and IT Engineering University of Kurdistan. Computer Architecture Pipelining. By: Dr. Alireza Abdollahpouri
Department of Computer and IT Engineering University of Kurdistan Computer Architecture Pipelining By: Dr. Alireza Abdollahpouri Pipelined MIPS processor Any instruction set can be implemented in many
More informationModern Computer Architecture
Modern Computer Architecture Lecture2 Pipelining: Basic and Intermediate Concepts Hongbin Sun 国家集成电路人才培养基地 Xi an Jiaotong University Pipelining: Its Natural! Laundry Example Ann, Brian, Cathy, Dave each
More informationInstruction Level Parallelism. Appendix C and Chapter 3, HP5e
Instruction Level Parallelism Appendix C and Chapter 3, HP5e Outline Pipelining, Hazards Branch prediction Static and Dynamic Scheduling Speculation Compiler techniques, VLIW Limits of ILP. Implementation
More informationPipeline Hazards. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University
Pipeline Hazards Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Hazards What are hazards? Situations that prevent starting the next instruction
More informationCSCI 402: Computer Architectures. Fengguang Song Department of Computer & Information Science IUPUI. Today s Content
3/6/8 CSCI 42: Computer Architectures The Processor (2) Fengguang Song Department of Computer & Information Science IUPUI Today s Content We have looked at how to design a Data Path. 4.4, 4.5 We will design
More informationCOMPUTER ORGANIZATION AND DESIGN
COMPUTER ORGANIZATION AND DESIGN 5 Edition th The Hardware/Software Interface Chapter 4 The Processor 4.1 Introduction Introduction CPU performance factors Instruction count CPI and Cycle time Determined
More informationPipelining. Maurizio Palesi
* Pipelining * Adapted from David A. Patterson s CS252 lecture slides, http://www.cs.berkeley/~pattrsn/252s98/index.html Copyright 1998 UCB 1 References John L. Hennessy and David A. Patterson, Computer
More informationLecture 05: Pipelining: Basic/ Intermediate Concepts and Implementation
Lecture 05: Pipelining: Basic/ Intermediate Concepts and Implementation CSE 564 Computer Architecture Summer 2017 Department of Computer Science and Engineering Yonghong Yan yan@oakland.edu www.secs.oakland.edu/~yan
More informationChapter 4 The Processor 1. Chapter 4A. The Processor
Chapter 4 The Processor 1 Chapter 4A The Processor Chapter 4 The Processor 2 Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware
More informationFull Datapath. Chapter 4 The Processor 2
Pipelining Full Datapath Chapter 4 The Processor 2 Datapath With Control Chapter 4 The Processor 3 Performance Issues Longest delay determines clock period Critical path: load instruction Instruction memory
More informationEITF20: Computer Architecture Part2.2.1: Pipeline-1
EITF20: Computer Architecture Part2.2.1: Pipeline-1 Liang Liu liang.liu@eit.lth.se 1 Outline Reiteration Pipelining Harzards Structural hazards Data hazards Control hazards Implementation issues Multi-cycle
More informationPipelining Analogy. Pipelined laundry: overlapping execution. Parallelism improves performance. Four loads: Non-stop: Speedup = 8/3.5 = 2.3.
Pipelining Analogy Pipelined laundry: overlapping execution Parallelism improves performance Four loads: Speedup = 8/3.5 = 2.3 Non-stop: Speedup =2n/05n+15 2n/0.5n 1.5 4 = number of stages 4.5 An Overview
More informationCENG 3420 Lecture 06: Pipeline
CENG 3420 Lecture 06: Pipeline Bei Yu byu@cse.cuhk.edu.hk CENG3420 L06.1 Spring 2019 Outline q Pipeline Motivations q Pipeline Hazards q Exceptions q Background: Flip-Flop Control Signals CENG3420 L06.2
More information14:332:331 Pipelined Datapath
14:332:331 Pipelined Datapath I n s t r. O r d e r Inst 0 Inst 1 Inst 2 Inst 3 Inst 4 Single Cycle Disadvantages & Advantages Uses the clock cycle inefficiently the clock cycle must be timed to accommodate
More informationComputer Architecture. Lecture 6.1: Fundamentals of
CS3350B Computer Architecture Winter 2015 Lecture 6.1: Fundamentals of Instructional Level Parallelism Marc Moreno Maza www.csd.uwo.ca/courses/cs3350b [Adapted from lectures on Computer Organization and
More informationECE232: Hardware Organization and Design
ECE232: Harware Organization an Design ectre 11: Introction to IPs path apte from Compter Organization an Design, Patterson & Hennessy, CB IPS-lite processor Compter Want to bil a processor for a sbset
More informationChapter 4. The Processor
Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A simplified
More informationPage 1. Pipelining: Its Natural! Chapter 3. Pipelining. Pipelined Laundry Start work ASAP. Sequential Laundry A B C D. 6 PM Midnight
Pipelining: Its Natural! Chapter 3 Pipelining Laundry Example Ann, Brian, Cathy, Dave each have one load of clothes to wash, dry, and fold Washer takes 30 minutes A B C D Dryer takes 40 minutes Folder
More informationLECTURE 3: THE PROCESSOR
LECTURE 3: THE PROCESSOR Abridged version of Patterson & Hennessy (2013):Ch.4 Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU
More informationECE232: Hardware Organization and Design
ECE232: Hardware Organization and Design Lecture 17: Pipelining Wrapup Adapted from Computer Organization and Design, Patterson & Hennessy, UCB Outline The textbook includes lots of information Focus on
More informationCOMPUTER ORGANIZATION AND DESIGN
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle
More informationEITF20: Computer Architecture Part2.2.1: Pipeline-1
EITF20: Computer Architecture Part2.2.1: Pipeline-1 Liang Liu liang.liu@eit.lth.se 1 Outline Reiteration Pipelining Harzards Structural hazards Data hazards Control hazards Implementation issues Multi-cycle
More informationPipeline Data Hazards. Dealing With Data Hazards
Pipeline Data Hazards Warning, warning, warning! Dealing With Data Hazards In Software inserting independent instructions In Hardware inserting bubbles (stalling the pipeline) data forwarding Data Data
More informationEITF20: Computer Architecture Part2.2.1: Pipeline-1
EITF20: Computer Architecture Part2.2.1: Pipeline-1 Liang Liu liang.liu@eit.lth.se 1 Outline Reiteration Pipelining Harzards Structural hazards Data hazards Control hazards Implementation issues Multi-cycle
More informationCS 251, Spring 2018, Assignment 3.0 3% of course mark
CS 25, Spring 28, Assignment 3. 3% of corse mark De onday, Jne 25th, 5:3 P. (5 points) Consider the single-cycle compter shown on page 6 of this assignment. Sppose the circit elements take the following
More informationCPS104 Computer Organization and Programming Lecture 19: Pipelining. Robert Wagner
CPS104 Computer Organization and Programming Lecture 19: Pipelining Robert Wagner cps 104 Pipelining..1 RW Fall 2000 Lecture Overview A Pipelined Processor : Introduction to the concept of pipelined processor.
More informationAnimating the Datapath. Animating the Datapath: R-type Instruction. Animating the Datapath: Load Instruction. MIPS Datapath I: Single-Cycle
nimating the atapath PS atapath : Single-Cycle npt is either (-type) or sign-etended lower half of instrction (load/store) op offset/immediate W egister File 6 6 + from instrction path beq,, offset if
More informationSI232 Set #20: Laundry, Co-dependency, and other Hazards of Modern (Architecture) Life. Chapter 6 ADMIN. Reading for Chapter 6: 6.1,
SI232 Set #20: Laundry, Co-dependency, and other Hazards of Modern (Architecture) Life Chapter 6 ADMIN ing for Chapter 6: 6., 6.9-6.2 2 Midnight Laundry Task order A 6 PM 7 8 9 0 2 2 AM B C D 3 Smarty
More informationComputer Architecture
Lecture 3: Pipelining Iakovos Mavroidis Computer Science Department University of Crete 1 Previous Lecture Measurements and metrics : Performance, Cost, Dependability, Power Guidelines and principles in
More informationCOMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. 5 th. Edition. Chapter 4. The Processor
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle
More informationLecture 6: Microprogrammed Multi Cycle Implementation. James C. Hoe Department of ECE Carnegie Mellon University
8 447 Lectre 6: icroprogrammed lti Cycle Implementation James C. Hoe Department of ECE Carnegie ellon University 8 447 S8 L06 S, James C. Hoe, CU/ECE/CALC, 208 Yor goal today Hosekeeping nderstand why
More informationMulti-cycle Datapath (Our Version)
ulti-cycle Datapath (Our Version) npc_sel Next PC PC Instruction Fetch IR File Operand Fetch A B ExtOp ALUSrc ALUctr Ext ALU R emrd emwr em Access emto Data em Dst Wr. File isters added: IR: Instruction
More informationCMCS Mohamed Younis CMCS 611, Advanced Computer Architecture 1
CMCS 611-101 Advanced Computer Architecture Lecture 8 Control Hazards and Exception Handling September 30, 2009 www.csee.umbc.edu/~younis/cmsc611/cmsc611.htm Mohamed Younis CMCS 611, Advanced Computer
More informationThomas Polzer Institut für Technische Informatik
Thomas Polzer tpolzer@ecs.tuwien.ac.at Institut für Technische Informatik Pipelined laundry: overlapping execution Parallelism improves performance Four loads: Speedup = 8/3.5 = 2.3 Non-stop: Speedup =
More informationCS 110 Computer Architecture. Pipelining. Guest Lecture: Shu Yin. School of Information Science and Technology SIST
CS 110 Computer Architecture Pipelining Guest Lecture: Shu Yin http://shtech.org/courses/ca/ School of Information Science and Technology SIST ShanghaiTech University Slides based on UC Berkley's CS61C
More informationCS 251, Winter 2018, Assignment % of course mark
CS 25, Winter 28, Assignment 3.. 3% of corse mark De onday, Febrary 26th, 4:3 P Lates accepted ntil : A, Febrary 27th with a 5% penalty. IEEE 754 Floating Point ( points): (a) (4 points) Complete the following
More informationAdvanced Computer Architecture Pipelining
Advanced Computer Architecture Pipelining Dr. Shadrokh Samavi Some slides are from the instructors resources which accompany the 6 th and previous editions of the textbook. Some slides are from David Patterson,
More informationMIPS Pipelining. Computer Organization Architectures for Embedded Computing. Wednesday 8 October 14
MIPS Pipelining Computer Organization Architectures for Embedded Computing Wednesday 8 October 14 Many slides adapted from: Computer Organization and Design, Patterson & Hennessy 4th Edition, 2011, MK
More information