Chapter 6: Enhancing Performance with Pipelining

1 Pipelining

Think of using machines in laundry services.
[Figure: task order (loads A, B, C) against time starting at 6 PM, shown first not pipelined and then pipelined.]
Assume 30 min. for each task (wash, dry, fold, store) and that separate tasks use separate hardware and so can be overlapped.

Pipelined vs. Single-Cycle Instruction Execution: the Plan

[Figure: program execution order (in instructions) against time for lw $1, 100($0); lw $2, 200($0); lw $3, 300($0). Single-cycle: each instruction takes 8 ns. Pipelined: a new instruction starts every 2 ns.]
Assume 2 ns for memory access and ALU operation; 1 ns for register file read or write. Therefore, the single-cycle clock is 8 ns and the pipelined clock cycle is 2 ns.

Pipelining: Keep in Mind

Pipelining does not reduce the latency of a single task; it increases the throughput of the entire workload.
The pipeline rate is limited by the longest stage.
Potential speedup = number of pipe stages.
Unbalanced lengths of pipe stages reduce the speedup.
Time to fill the pipeline and time to drain it, when there is slack in the pipeline, reduce the speedup.
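The timing comparison above can be checked with a little arithmetic. The sketch below is only an illustration: it assumes an ideal pipeline with no stalls, five stages, an 8 ns single-cycle clock and a 2 ns pipelined clock, and the function names are hypothetical.

```python
def single_cycle_time(n_instructions, cycle_ns=8):
    """Total time when every instruction occupies one long clock cycle."""
    return n_instructions * cycle_ns


def pipelined_time(n_instructions, stages=5, cycle_ns=2):
    """Total time for an ideal pipeline: fill the pipe once, then
    complete one instruction per clock cycle."""
    if n_instructions == 0:
        return 0
    return (stages + n_instructions - 1) * cycle_ns


def speedup(n_instructions, stages=5, single_ns=8, pipe_ns=2):
    """Ratio of single-cycle time to pipelined time."""
    return single_cycle_time(n_instructions, single_ns) / pipelined_time(
        n_instructions, stages, pipe_ns
    )


if __name__ == "__main__":
    print(single_cycle_time(3), pipelined_time(3))  # three lw instructions
    print(speedup(1_000_000))                       # long instruction stream
```

Under these assumptions the speedup for a long instruction stream approaches 8/2 = 4 rather than the number of stages, because the stages are not balanced.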

2 Pipelining MIPS

What makes it hard?
Structural hazards: different instructions, at different stages in the pipeline, want to use the same hardware resource.
Control hazards: a succeeding instruction, to be put into the pipeline, depends on the outcome of a previous branch instruction already in the pipeline.
Data hazards: an instruction in the pipeline requires data to be computed by a previous instruction still in the pipeline.
Before actually building the pipelined datapath and control, we first briefly examine these potential hazards individually.

Structural Hazards

Structural hazard: inadequate hardware to simultaneously support all instructions in the pipeline in the same clock cycle.
E.g., suppose there is a single (not separate) instruction and data memory in the pipeline below, with one read port; then there is a structural hazard between the first and fourth lw instructions.
[Figure: pipelined execution of lw $1, 100($0); lw $2, 200($0); lw $3, 300($0); lw $4, 400($0) at 2 ns per stage. Hazard if single memory.]
MIPS was designed to be pipelined: structural hazards are easy to avoid!

Control Hazards

Control hazard: need to make a decision based on the result of a previous instruction still executing in the pipeline.
Solution 1: Stall the pipeline.
[Figure: add $4, $5, $6; beq $1, $2, 40; lw $3, 300($0), with one bubble (pipeline stall) after the beq.]
Note that the branch outcome is computed in the ID stage with added hardware (later).

Solution 2: Predict the branch outcome, e.g., predict branch-not-taken.
[Figure, prediction success: add $4, $5, $6; beq $1, $2, 40; lw $3, 300($0) proceed without a bubble.]
[Figure, prediction failure: the lw is undone (= flushed), leaving a bubble, and or $7, $8, $9 at the branch target is fetched next.]

3 Control Hazards

Solution 3: Delayed branch: always execute the sequentially next statement, with the branch executing after a one-instruction delay. It is the compiler's job to find a statement that can be put in the slot and that is independent of the branch outcome. MIPS does this, but it is an option in SPIM (Simulator -> Settings).
[Figure: beq $1, $2, 40; add $4, $5, $6 (delayed branch slot); lw $3, 300($0).]
Delayed branch: beq is followed by an add that is independent of the branch outcome.

Data Hazards

Data hazard: an instruction needs data from the result of a previous instruction still executing in the pipeline.
Solution: Forward data if possible.
[Figure: instruction pipeline diagram for add $s0, $t0, $t1 through IF, ID, EX, MEM, WB; shading indicates use: left = write, right = read.]
[Figure: add $s0, $t0, $t1 followed by sub $t2, $s0, $t3.]
Without forwarding the blue line has to go back in time; with forwarding the red line is available in time.

Forwarding may not be enough, e.g., if an R-type instruction following a load uses the result of the load. This is called a load-use data hazard.
[Figure: lw $s0, 20($t1) followed by sub $t2, $s0, $t3, shown without a stall and with a one-cycle bubble.]
Without a stall it is impossible to provide input to the sub instruction in time.
With a one-stage stall, forwarding can get the data to the sub instruction in time.

Reordering Code to Avoid Pipeline Stall (Software Solution)

Example:
lw $t0, 0($t1)
lw $t2, 4($t1)
sw $t2, 0($t1)    (data hazard)
sw $t0, 4($t1)

Reordered code (the two sw instructions are interchanged):
lw $t0, 0($t1)
lw $t2, 4($t1)
sw $t0, 4($t1)
sw $t2, 0($t1)
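The effect of the reordering can be sketched in code. This is a minimal illustration, not the slide author's method: it assumes a pipeline with forwarding in which the instruction immediately after a lw stalls for one cycle if it reads the loaded register. The tuple layout, the function name, and the specific register names are assumptions.

```python
def count_load_use_stalls(program):
    """Count one-cycle load-use stalls.

    Each instruction is (opcode, destination, sources). A stall is charged
    whenever the instruction right after a lw reads the register being loaded.
    """
    stalls = 0
    for previous, current in zip(program, program[1:]):
        opcode, destination, _ = previous
        if opcode == "lw" and destination in current[2]:
            stalls += 1
    return stalls


# Hypothetical encoding of a swap-style sequence (register names assumed).
original = [
    ("lw", "$t0", ("$t1",)),
    ("lw", "$t2", ("$t1",)),
    ("sw", None, ("$t2", "$t1")),  # reads $t2 right after it is loaded
    ("sw", None, ("$t0", "$t1")),
]
# Interchanging the two stores removes the back-to-back dependence.
reordered = [original[0], original[1], original[3], original[2]]

if __name__ == "__main__":
    print(count_load_use_stalls(original), count_load_use_stalls(reordered))
```

The model is deliberately simple: it looks only at adjacent instruction pairs, which is enough for a single-issue five-stage pipeline with forwarding.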

4 Pipelined Datapath

We now move to actually building a pipelined datapath. First recall the 5 steps in instruction execution:
1. Instruction Fetch & PC Increment (IF)
2. Instruction Decode and Register Read (ID)
3. Execution or calculate address (EX)
4. Memory access (MEM)
5. Write result into register (WB)
Review: single-cycle processor: all 5 steps are done in a single clock cycle; dedicated hardware is required for each step.
What happens if we break the execution into multiple cycles, but keep the extra hardware?

Review - Single-Cycle Datapath Steps

[Figure: single-cycle datapath divided into IF (Instruction Fetch), ID (Instruction Decode), EX (Execute / Address Calc.), MEM (Memory Access), WB (Write Back).]

Pipelined Datapath Key Idea

What happens if we break the execution into multiple cycles, but keep the extra hardware?
Answer: We may be able to start executing a new instruction at each clock cycle (pipelining), but we shall need extra registers to hold data between cycles: pipeline registers.

Pipelined Datapath

[Figure: pipelined datapath with pipeline registers IF/ID, ID/EX, EX/MEM, MEM/WB.]
Pipeline registers are wide enough to hold the data coming in: 64 bits (IF/ID), 128 bits (ID/EX), 97 bits (EX/MEM), 64 bits (MEM/WB).

5 Pipelined Datapath

[Figure: pipelined datapath; pipeline registers wide enough to hold the data coming in: IF/ID 64 bits, ID/EX 128 bits, EX/MEM 97 bits, MEM/WB 64 bits.]

Bug in the Datapath

[Figure: the same pipelined datapath.]
Only data flowing right to left may cause a hazard. Why?
The write register number comes from another, later instruction!

Corrected Datapath

[Figure: corrected pipelined datapath; IF/ID 64 bits, ID/EX 133 bits, EX/MEM 102 bits, MEM/WB 69 bits.]
The destination register number is also passed through the ID/EX, EX/MEM and MEM/WB registers, which are now wider by 5 bits.

Pipelined Example

Consider the following instruction sequence: lw, sw, add, sub.
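The pipeline register widths can be cross-checked by adding up the fields each register has to hold. The field lists below are an assumption based on the usual textbook five-stage MIPS datapath, not something stated on the slides; the names FIELDS and width are hypothetical.

```python
WORD = 32        # bits in a MIPS word
REG_NUMBER = 5   # bits needed to name one of 32 registers

# Fields latched by each pipeline register (assumed textbook layout).
FIELDS = {
    "IF/ID": {"PC+4": WORD, "instruction": WORD},
    "ID/EX": {
        "PC+4": WORD,
        "read data 1": WORD,
        "read data 2": WORD,
        "sign-extended immediate": WORD,
    },
    "EX/MEM": {
        "branch target": WORD,
        "ALU zero flag": 1,
        "ALU result": WORD,
        "read data 2": WORD,
    },
    "MEM/WB": {"memory read data": WORD, "ALU result": WORD},
}


def width(register, with_destination=False):
    """Width in bits; optionally add the destination register number,
    which the corrected datapath carries from ID/EX onwards."""
    bits = sum(FIELDS[register].values())
    if with_destination and register != "IF/ID":
        bits += REG_NUMBER
    return bits


if __name__ == "__main__":
    for name in FIELDS:
        print(name, width(name), width(name, with_destination=True))
```

With these assumed fields the sums agree with the slide figures: 64, 128, 97 and 64 bits before the fix, and 5 bits more for every register after IF/ID once the destination register number is carried along.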

6 Single-Clock-Cycle Diagrams: Clock Cycles 1 to 4

[Figures: the pipelined datapath in each clock cycle, labelled with the instruction in each stage.]
Clock cycle 1: LW in IF.
Clock cycle 2: SW in IF, LW in ID.
Clock cycle 3: ADD in IF, SW in ID, LW in EX.
Clock cycle 4: SUB in IF, ADD in ID, SW in EX, LW in MEM.

7 Single-Clock-Cycle Diagrams: Clock Cycles 5 to 8

[Figures: the pipelined datapath in each clock cycle, labelled with the instruction in each stage.]
Clock cycle 5: SUB in ID, ADD in EX, SW in MEM, LW in WB.
Clock cycle 6: SUB in EX, ADD in MEM, SW in WB.
Clock cycle 7: SUB in MEM, ADD in WB.
Clock cycle 8: SUB in WB.
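The single-clock-cycle diagrams show which instruction occupies which stage in a given clock cycle. For an ideal pipeline with no stalls this can be computed directly, as in the following sketch; the function name and the use of bare mnemonics are illustrative assumptions.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def occupancy(program, cycle):
    """Map each stage to the instruction it holds in the given clock
    cycle (1-based), or None if the stage is empty."""
    result = {}
    for position, stage in enumerate(STAGES):
        # Instruction i enters IF in cycle i + 1 and advances one stage per cycle.
        index = cycle - 1 - position
        result[stage] = program[index] if 0 <= index < len(program) else None
    return result


if __name__ == "__main__":
    program = ["lw", "sw", "add", "sub"]
    for cycle in range(1, 9):
        print(cycle, occupancy(program, cycle))
```

Printing the result for cycles 1 through 8 gives one row per single-clock-cycle diagram; reading one instruction across rows gives the multiple-clock-cycle view.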

8 Alternative View: Multiple-Clock-Cycle Diagram

[Figure: lw, sw, add, sub against a time axis CC 1 to CC 8, each instruction passing through IM, REG, ALU, DM, REG.]

Notes

One significant difference in the execution of an R-type instruction between multicycle and pipelined implementations: write-back for the R-type instruction is the 5th (the last, write-back) pipeline stage vs. the 4th stage for the multicycle implementation. Why? Think of structural hazards when writing to the register file.
Worth repeating: the essential difference between the pipeline and multicycle implementations is the insertion of pipeline registers to decouple the 5 stages.
The CPI of an ideal pipeline (no stalls) is 1. Why?
The RaVi Architecture Visualization Project of Dortmund U. has pipeline simulations; see the link in our Additional Resources page.
As we develop control for the pipeline, keep in mind that the text does not consider jump; it should not be too hard to implement!

Recall Single-Cycle Control: the Datapath

[Figure: single-cycle datapath with control unit and control signals.]

Recall Single-Cycle ALU Control

Instruction opcode   Instruction operation   Desired ALU action
LW                   load word               add
SW                   store word              add
Branch eq            branch eq               subtract
R-type               add                     add
R-type               subtract                subtract
R-type               AND                     and
R-type               OR                      or
R-type               set on less             set on less

For each row the table also gives the ALUOp bits, the funct field and the ALU control input.
[Table: truth table for the ALU control bits, with inputs ALUOp1, ALUOp0 and funct field bits F5 to F0, and output Operation.]
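The ALU control table can be written as a small lookup. The bit patterns in this sketch are assumptions: they follow the common textbook MIPS convention (a 2-bit ALUOp, the standard funct codes, and a 3-bit ALU control value), and the function name is hypothetical.

```python
# Standard MIPS funct codes mapped to an assumed 3-bit ALU control encoding.
FUNCT_TO_ALU = {
    0b100000: 0b010,  # add
    0b100010: 0b110,  # subtract
    0b100100: 0b000,  # and
    0b100101: 0b001,  # or
    0b101010: 0b111,  # set on less than
}


def alu_control(alu_op, funct=0):
    """Return the ALU control value for a 2-bit ALUOp and a 6-bit funct field."""
    if alu_op == 0b00:   # lw / sw: add base and offset
        return 0b010
    if alu_op == 0b01:   # beq: subtract to compare
        return 0b110
    if alu_op == 0b10:   # R-type: the funct field selects the operation
        return FUNCT_TO_ALU[funct]
    raise ValueError("unsupported ALUOp")


if __name__ == "__main__":
    print(format(alu_control(0b10, 0b101010), "03b"))  # slt
```

Note that loads and stores ignore the funct field entirely, which is why the control unit only needs to send two ALUOp bits down the pipeline.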

9 Recall Single-Cycle Control Signals

Effect of the control bits:
RegDst
  Deasserted: the register destination number for the Write register comes from the rt field (bits 20-16).
  Asserted: the register destination number for the Write register comes from the rd field (bits 15-11).
RegWrite
  Deasserted: none.
  Asserted: the register on the Write register input is written with the value on the Write data input.
ALUSrc
  Deasserted: the second ALU operand comes from the second register file output (Read data 2).
  Asserted: the second ALU operand is the sign-extended lower 16 bits of the instruction.
PCSrc
  Deasserted: the PC is replaced by the output of the adder that computes the value of PC + 4.
  Asserted: the PC is replaced by the output of the adder that computes the branch target.
MemRead
  Deasserted: none.
  Asserted: data memory contents designated by the address input are put on the first Read data output.
MemWrite
  Deasserted: none.
  Asserted: data memory contents designated by the address input are replaced by the value of the Write data input.
MemtoReg
  Deasserted: the value fed to the register Write data input comes from the ALU.
  Asserted: the value fed to the register Write data input comes from the data memory.

[Table: determining the control bits RegDst, ALUSrc, MemtoReg, RegWrite, MemRead, MemWrite, Branch, ALUOp1, ALUOp0 for R-format, lw, sw and beq.]

Pipeline Control

Initial design motivated by single-cycle datapath control: use the same control signals.
Observe:
No separate write signal for the PC, as it is written every cycle.
No separate write signals for the pipeline registers, as they are written every cycle.
No separate read signal for instruction memory, as it is read every clock cycle.
No separate read signal for the register file, as it is read every clock cycle.
Need to set control signals during each pipeline stage.
Since control signals are associated with components active during a single pipeline stage, we can group control lines into five groups according to pipeline stage.

Pipelined Datapath with Control I

[Figure: pipelined datapath with control signals.] Same control signals as the single-cycle datapath.

Pipeline Control Signals

There are five stages in the pipeline:
instruction fetch / PC increment
instruction decode / register fetch
execution / address calculation
memory access
write back
There is nothing to control in the first two stages, as instruction memory read and PC write are always enabled.

[Table: for R-format, lw, sw and beq, the control lines grouped by stage. Execution/address calculation stage control lines: RegDst, ALUOp1, ALUOp0, ALUSrc. Memory access stage control lines: Branch, MemRead, MemWrite. Write-back stage control lines: RegWrite, MemtoReg.]
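The grouping of control lines by pipeline stage can be captured as a nested table. The signal values below are assumptions taken from the usual textbook MIPS control table (don't-care entries are stored as None), and the names CONTROL, CARRIED and control_in are hypothetical.

```python
X = None  # don't care

# Control lines per instruction class, grouped by the stage that uses them.
CONTROL = {
    "R-format": {
        "EX": {"RegDst": 1, "ALUOp": 0b10, "ALUSrc": 0},
        "MEM": {"Branch": 0, "MemRead": 0, "MemWrite": 0},
        "WB": {"RegWrite": 1, "MemtoReg": 0},
    },
    "lw": {
        "EX": {"RegDst": 0, "ALUOp": 0b00, "ALUSrc": 1},
        "MEM": {"Branch": 0, "MemRead": 1, "MemWrite": 0},
        "WB": {"RegWrite": 1, "MemtoReg": 1},
    },
    "sw": {
        "EX": {"RegDst": X, "ALUOp": 0b00, "ALUSrc": 1},
        "MEM": {"Branch": 0, "MemRead": 0, "MemWrite": 1},
        "WB": {"RegWrite": 0, "MemtoReg": X},
    },
    "beq": {
        "EX": {"RegDst": X, "ALUOp": 0b01, "ALUSrc": 0},
        "MEM": {"Branch": 1, "MemRead": 0, "MemWrite": 0},
        "WB": {"RegWrite": 0, "MemtoReg": X},
    },
}

# Which groups each pipeline register still has to carry forward.
CARRIED = {
    "ID/EX": ("EX", "MEM", "WB"),
    "EX/MEM": ("MEM", "WB"),
    "MEM/WB": ("WB",),
}


def control_in(pipeline_register, instruction):
    """Control groups held in a pipeline register for an instruction class."""
    return {group: CONTROL[instruction][group] for group in CARRIED[pipeline_register]}


if __name__ == "__main__":
    print(control_in("EX/MEM", "lw"))
```

Each group is dropped once its stage has used it, so the control portion of the pipeline registers shrinks from ID/EX to MEM/WB.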

10 Pipeline Control Implementation

Pass control signals along just like the data: extend each pipeline register to hold the needed control bits for succeeding stages.
[Figure: the control unit output is split into WB, M and EX fields carried through ID/EX, EX/MEM and MEM/WB.]
Note: The 6-bit funct field of the instruction, required in the EX stage to generate the ALU control, can be retrieved as the 6 least significant bits of the immediate field, which is sign-extended and passed from the IF/ID register to the ID/EX register.

Pipelined Datapath with Control II

[Figure: pipelined datapath with control.] Control signals emanate from the control portions of the pipeline registers.

Pipelined Execution and Control

Instruction sequence:
lw $10, 20($1)
sub $11, $2, $3
and $12, $4, $7
or $13, $6, $7
add $14, $8, $9
Label before<i> means the i-th instruction before lw.

[Figures: the datapath with control in clock cycles 1 to 4.]
Clock cycle 1: IF: lw; ID: before<1>; EX: before<2>; MEM: before<3>; WB: before<4>
Clock cycle 2: IF: sub; ID: lw; EX: before<1>; MEM: before<2>; WB: before<3>
Clock cycle 3: IF: and; ID: sub; EX: lw; MEM: before<1>; WB: before<2>
Clock cycle 4: IF: or; ID: and; EX: sub; MEM: lw; WB: before<1>

11 Pipelined Execution and Control (cont.)

Instruction sequence: lw, sub, and, or, add, as before.
Label after<i> means the i-th instruction after add.

[Figures: the datapath with control in clock cycles 5 to 9.]
Clock cycle 5: IF: add; ID: or; EX: and; MEM: sub; WB: lw
Clock cycle 6: IF: after<1>; ID: add; EX: or; MEM: and; WB: sub
Clock cycle 7: IF: after<2>; ID: after<1>; EX: add; MEM: or; WB: and
Clock cycle 8: IF: after<3>; ID: after<2>; EX: after<1>; MEM: add; WB: or
Clock cycle 9: IF: after<4>; ID: after<3>; EX: after<2>; MEM: after<1>; WB: add

Revisiting Hazards

So far our datapath and control have ignored hazards.
We shall revisit data hazards and control hazards and enhance our datapath and control to handle them in hardware.

12 Data Hazards and Forwarding

Problem with starting an instruction before previous ones are finished: data dependencies that go backward in time, called data hazards.
$2 = 10 before sub; $2 = -20 after sub.
sub $2, $1, $3
and $12, $2, $5
or $13, $6, $2
add $14, $2, $2
sw $15, 100($2)
[Figure: program execution order against time (CC 1 to CC 9), with the value of register $2 in each clock cycle: 10 until the sub writes back in CC 5, -20 afterwards.]

Software Solution

Have the compiler guarantee that there are never any data hazards, by rearranging instructions to insert independent instructions between instructions that would otherwise have a data hazard between them, or, if such rearrangement is not possible, by inserting nops.
For example, two independent instructions (here a lw and an slt) are placed between the sub and the and, or:
sub $2, $1, $3
nop
nop
and $12, $2, $5
or $13, $6, $2
add $14, $2, $2
sw $15, 100($2)
Such compiler solutions may not always be possible, and nops slow the machine down.
MIPS: nop = "no operation" = 00...0 (32 bits) = sll $0, $0, 0

Hardware Solution: Forwarding

Idea: use intermediate data; do not wait for the result to be finally written to the destination register. Two steps:
1. Detect the data hazard.
2. Forward the intermediate data to resolve the hazard.

Pipelined Datapath with Control II (as before)

[Figure: pipelined datapath with control.] Control signals emanate from the control portions of the pipeline registers.
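The software solution of inserting nops can be sketched as a small pass over the instruction list. This is an illustration under stated assumptions, not a real compiler pass: it assumes no forwarding hardware, a register file that is written in the first half of a cycle and read in the second half (so two intervening instructions are enough), and a hypothetical (opcode, destination, sources) tuple format.

```python
NOP = ("nop", None, ())


def insert_nops(program, gap=2):
    """Insert nops so that at least `gap` instructions separate a register
    write from any later read of that register."""
    out = []
    for instruction in program:
        sources = instruction[2]
        needed = 0
        # Look back over the last `gap` emitted slots for a producer.
        for distance, earlier in enumerate(reversed(out[-gap:]), start=1):
            destination = earlier[1]
            if destination not in (None, "$0") and destination in sources:
                needed = max(needed, gap - distance + 1)
        out.extend([NOP] * needed)
        out.append(instruction)
    return out


# Register numbers here are assumed for illustration.
program = [
    ("sub", "$2", ("$1", "$3")),
    ("and", "$12", ("$2", "$5")),
    ("or", "$13", ("$6", "$2")),
    ("add", "$14", ("$2", "$2")),
    ("sw", None, ("$15", "$2")),
]

if __name__ == "__main__":
    print([op for op, _, _ in insert_nops(program)])
```

Only the first dependent instruction needs padding here; by the time the or, add and sw read $2 the sub has already written it back.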

13 Hazard Detection

Hazard conditions:
1a. EX/MEM.RegisterRd = ID/EX.RegisterRs
1b. EX/MEM.RegisterRd = ID/EX.RegisterRt
2a. MEM/WB.RegisterRd = ID/EX.RegisterRs
2b. MEM/WB.RegisterRd = ID/EX.RegisterRt
E.g., in the earlier example, the first hazard, between sub $2, $1, $3 and and $12, $2, $5, is detected when the and is in the EX stage and the sub is in the MEM stage, because EX/MEM.RegisterRd = ID/EX.RegisterRs = $2 (1a).
Whether to forward also depends on:
whether the instruction further along in the pipeline (the potential source of the forwarded data) is going to write a register; if not, there is no need to forward, even if there is a register number match as in the conditions above
whether the destination register of that instruction is $0, in which case there is no need to forward the value ($0 is always 0 and never overwritten)

Data Forwarding

Plan: allow inputs to the ALU not just from ID/EX, but also from later pipeline registers, and use multiplexors and control signals to choose the appropriate inputs to the ALU.
[Figure: sub $2, $1, $3; and $12, $2, $5; or $13, $6, $2; add $14, $2, $2; sw $15, 100($2) against time (CC 1 to CC 9), with the value of register $2, the value of EX/MEM and the value of MEM/WB in each clock cycle.]
Dependencies between pipeline registers move forward in time.

Forwarding Hardware

[Figure a: no forwarding; datapath before adding forwarding hardware.]
[Figure b: with forwarding; datapath after adding forwarding hardware. The forwarding unit takes ID/EX.RegisterRs, ID/EX.RegisterRt, EX/MEM.RegisterRd and MEM/WB.RegisterRd as inputs and drives the ForwardA and ForwardB multiplexor controls.]

Forwarding Hardware with Control

Called the forwarding unit, not the hazard detection unit, because once data is forwarded there is no hazard!
[Figure: datapath with forwarding hardware and control wires; certain details, e.g., branching hardware, are omitted to simplify the drawing.]
Note: so far we have only handled forwarding to R-type instructions!
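The hazard conditions and the two extra checks (the source instruction must write a register, and that register must not be $0) can be combined into one function. This is a sketch only: the 2-bit encodings (10 for forwarding from EX/MEM, 01 from MEM/WB, 00 for no forwarding) are the common textbook convention and are assumed here, as are the function and parameter names.

```python
def forwarding_unit(id_ex_rs, id_ex_rt,
                    ex_mem_rd, ex_mem_reg_write,
                    mem_wb_rd, mem_wb_reg_write):
    """Return (ForwardA, ForwardB) multiplexor selects for the two ALU inputs."""
    forward_a = forward_b = 0b00  # default: operands come from ID/EX

    # Conditions 2a/2b: older result sitting in MEM/WB.
    if mem_wb_reg_write and mem_wb_rd != 0:
        if mem_wb_rd == id_ex_rs:
            forward_a = 0b01
        if mem_wb_rd == id_ex_rt:
            forward_b = 0b01

    # Conditions 1a/1b: newer result in EX/MEM; checked last so it overrides.
    if ex_mem_reg_write and ex_mem_rd != 0:
        if ex_mem_rd == id_ex_rs:
            forward_a = 0b10
        if ex_mem_rd == id_ex_rt:
            forward_b = 0b10

    return forward_a, forward_b


if __name__ == "__main__":
    # A consumer reading register 2 in EX while its producer is in MEM.
    print(forwarding_unit(2, 5, 2, True, 0, False))
```

Checking EX/MEM after MEM/WB means that when both stages hold a write to the same register, the most recent value wins.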

14 Forwarding Execution Example

Instruction sequence:
sub $2, $1, $3
and $4, $2, $5
or $4, $4, $2
add $9, $4, $2

[Figures: the datapath with forwarding unit in clock cycles 3 to 6.]
Clock cycle 3: IF: or; ID: and; EX: sub; MEM: before<1>; WB: before<2>
Clock cycle 4: IF: add; ID: or; EX: and; MEM: sub; WB: before<1>
Clock cycle 5: IF: after<1>; ID: add; EX: or; MEM: and; WB: sub
Clock cycle 6: IF: after<2>; ID: after<1>; EX: add; MEM: or; WB: and

Data Hazards and Stalls

Load word can still cause a hazard: an instruction tries to read a register following a load instruction that writes to the same register.
lw $2, 20($1)
and $4, $2, $5
or $8, $2, $6
add $9, $4, $2
slt $1, $6, $7
[Figure: the sequence against time (CC 1 to CC 9); the dependence from lw to and goes backward in time.]
As even a pipeline register dependency goes backward in time, forwarding will not solve the hazard; therefore, we need a hazard detection unit to stall the pipeline after the load instruction.

Pipelined Datapath with Control II (as before)

[Figure: pipelined datapath with control.] Control signals emanate from the control portions of the pipeline registers.

15 Hazard Detection Logic to Stall

The hazard detection unit implements the following check whether to stall:
if ( ID/EX.MemRead                                  // if the instruction in the EX stage is a load
     and ( ( ID/EX.RegisterRt = IF/ID.RegisterRs )  // and the destination register
        or ( ID/EX.RegisterRt = IF/ID.RegisterRt ) ) )  // matches either source register
                                                    // of the instruction in the ID stage,
then stall the pipeline

Mechanics of Stalling

If the check to stall verifies, then the pipeline needs to stall only 1 clock cycle after the load, as after that the forwarding unit can resolve the dependency.
What the hardware does to stall the pipeline 1 cycle:
It does not let the IF/ID register change (disable write!); this will cause the instruction in the ID stage to repeat, i.e., stall.
Therefore, the instruction just behind, in the IF stage, must be stalled as well, so the hardware does not let the PC change (disable write!); this will cause the instruction in the IF stage to repeat, i.e., stall.
It changes all the EX, MEM and WB control fields in the ID/EX pipeline register to 0s, so effectively the instruction just behind the load becomes a nop; a bubble is said to have been inserted into the pipeline.
Note that we cannot turn that instruction into a nop by 0ing all the bits in the instruction itself (recall nop = 00...0 (32 bits)), because it has already been decoded and its control signals generated.

Hazard Detection Unit

[Figure: datapath with forwarding hardware, the hazard detection unit and control wires. The hazard detection unit takes ID/EX.MemRead, ID/EX.RegisterRt, IF/ID.RegisterRs and IF/ID.RegisterRt as inputs. Certain details, e.g., branching hardware, are omitted to simplify the drawing.]

Stalling Resolves a Hazard

Same instruction sequence as before, for which forwarding by itself could not resolve the hazard:
lw $2, 20($1)
and $4, $2, $5
or $8, $2, $6
add $9, $4, $2
slt $1, $6, $7
[Figure: the sequence against time (CC 1 to CC 10), with a bubble inserted after the lw.]
The hazard detection unit inserts a 1-cycle bubble in the pipeline, after which all pipeline register dependencies go forward, so the forwarding unit can then handle them and there are no more data hazards.
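The stall check and the three hardware actions can be sketched as one function. The output names (PCWrite, IFIDWrite, ZeroControl) are illustrative labels for the actions described above, not signal names confirmed by the slides, and registers are passed as plain integers.

```python
def hazard_detection_unit(id_ex_mem_read, id_ex_rt, if_id_rs, if_id_rt):
    """Decide whether to stall for a load-use hazard.

    Stall when the instruction in EX is a load and its destination (rt)
    matches either source register of the instruction in ID.
    """
    stall = bool(id_ex_mem_read and id_ex_rt in (if_id_rs, if_id_rt))
    return {
        "PCWrite": not stall,      # hold the PC so the IF instruction repeats
        "IFIDWrite": not stall,    # hold IF/ID so the ID instruction repeats
        "ZeroControl": stall,      # zero the control fields entering ID/EX (bubble)
    }


if __name__ == "__main__":
    # A load into register 2 in EX, followed by a reader of register 2 in ID.
    print(hazard_detection_unit(True, 2, 2, 5))
```

The stall lasts a single cycle: in the next cycle the load has moved on to MEM, ID/EX.MemRead is no longer set for it, and forwarding supplies the loaded value.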

16 Stalling Execution Example

Instruction sequence:
lw $2, 20($1)
and $4, $2, $5
or $4, $4, $2
add $9, $4, $2

[Figures: the datapath with hazard detection unit and forwarding unit in clock cycles 2 to 7.]
Clock cycle 2: IF: and; ID: lw; EX: before<1>; MEM: before<2>; WB: before<3>
Clock cycle 3: IF: or; ID: and; EX: lw; MEM: before<1>; WB: before<2>
Clock cycle 4: IF: or (repeated); ID: and (repeated); EX: bubble; MEM: lw; WB: before<1>
Clock cycle 5: IF: add; ID: or; EX: and; MEM: bubble; WB: lw
Clock cycle 6: IF: after<1>; ID: add; EX: or; MEM: and; WB: bubble
Clock cycle 7: IF: after<2>; ID: after<1>; EX: add; MEM: or; WB: and

Control (or Branch) Hazards

The problem with branches in the pipeline we have so far is that the branch decision is not made till the MEM stage, so what instructions, if any, should we insert into the pipeline following the branch instruction?
Possible solution: stall the pipeline till the branch decision is known. This is not efficient; it slows the pipeline significantly!
Another solution: predict the branch outcome, e.g., always predict branch-not-taken and continue with the next sequential instructions. If the prediction is wrong, we have to flush the pipeline behind the branch: discard the instructions already fetched or decoded and continue execution at the branch target.
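The cost of predicting branch-not-taken can be estimated with simple cycle counting. The sketch below assumes a five-stage pipeline with no other stalls and a fixed number of flushed instructions per mispredicted (taken) branch: 3 when the decision is made in the MEM stage, fewer if it is made earlier. The function name and parameters are hypothetical.

```python
def total_cycles(n_instructions, taken_branches, flush_penalty=3, stages=5):
    """Cycles to run a program on a pipeline that predicts not-taken.

    (stages - 1) cycles fill the pipeline, each instruction then completes
    in one cycle, and every taken branch wastes `flush_penalty` cycles.
    """
    if n_instructions == 0:
        return 0
    return (stages - 1) + n_instructions + taken_branches * flush_penalty


def effective_cpi(n_instructions, taken_branches, flush_penalty=3, stages=5):
    """Average cycles per instruction, including fill time and flushes."""
    return total_cycles(n_instructions, taken_branches, flush_penalty, stages) / n_instructions


if __name__ == "__main__":
    print(total_cycles(100, 10))                    # decision in MEM
    print(total_cycles(100, 10, flush_penalty=1))   # decision moved earlier
```

Correctly predicted (not-taken) branches cost nothing in this model, which is why the scheme beats always stalling.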

17 Predicting Branch-not-taken: Misprediction Delay

[Figure: 40 beq $1, $3, 7; 44 and $12, $2, $5; 48 or $13, $6, $2; 52 add $14, $2, $2; 72 lw $4, 50($7), against time (CC 1 to CC 9).]
The outcome of branch taken (prediction wrong) is decided only when beq is in the MEM stage, so the following three sequential instructions already in the pipeline have to be flushed, and execution resumes at lw.

Optimizing the Pipeline to Reduce Branch Delay

Move the branch decision from the MEM stage (as in our current pipeline) earlier to the ID stage.
Calculating the branch target address involves moving the branch adder from the EX stage to the ID stage; the inputs to this adder, the PC value and the immediate field, are already available in the IF/ID pipeline register.
Calculating the branch decision is efficiently done, e.g., for the equality test, by XORing respective bits and then ORing all the results and inverting, rather than using the ALU to subtract and then test for zero (when there is a carry delay).
With the more efficient equality test we can put it in the ID stage without significantly lengthening this stage; remember, an objective of pipeline design is to keep pipeline stages balanced.
We must correspondingly make additions to the forwarding and hazard detection units to forward to, or stall, the branch at the ID stage in case the branch decision depends on an earlier result.

Flushing on Misprediction

Same strategy as for stalling on a load-use data hazard: zero out all the control values (or the instruction itself) in the pipeline registers for the instructions following the branch that are already in the pipeline, effectively turning them into nops so they are flushed.
In the optimized pipeline, with the branch decision made in the ID stage, we have to flush only one instruction, the one in the IF stage; the branch delay penalty is then only one clock cycle.

Optimized Datapath for Branch

[Figure: datapath with IF.Flush, the hazard detection unit, the branch adder and an equality comparator (=) on the register outputs in the ID stage, and the forwarding unit.]
The IF.Flush control zeros out the instruction in the IF/ID pipeline register (which follows the branch).
The branch decision is moved from the MEM stage to the ID stage; this is a simplified drawing that does not show the enhancements to the forwarding and hazard detection units.
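The fast equality test (XOR the respective bits, OR all the results, then invert) can be written out bit by bit. This is a software illustration of the logic only; in hardware the XORs and the OR tree operate in parallel with no carry chain. The function name and the 32-bit default width are assumptions.

```python
def registers_equal(a, b, width=32):
    """Equality test without subtraction: XOR, OR-reduce, invert."""
    mask = (1 << width) - 1
    difference = (a ^ b) & mask          # XOR respective bits
    any_bit_differs = 0
    for i in range(width):               # OR all the XOR results together
        any_bit_differs |= (difference >> i) & 1
    return any_bit_differs == 0          # invert: equal when no bit differs


if __name__ == "__main__":
    print(registers_equal(0x1234, 0x1234), registers_equal(0x1234, 0x1235))
```

Because no bit depends on a carry from its neighbour, the comparison is shallow enough to fit in the ID stage alongside the register read.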

18 Pipelined Branch Execution Example

36 sub $10, $4, $8
40 beq $1, $3, 7
44 and $12, $2, $5
48 or $13, $2, $6
52 add $14, $4, $2
56 slt $15, $6, $7
...
72 lw $4, 50($7)
Optimized pipeline with only one bubble as a result of the taken branch.
[Figure, clock cycle 3: IF: and; ID: beq; EX: sub; MEM: before<1>; WB: before<2>. IF.Flush is asserted.]
[Figure, clock cycle 4: IF: lw (address 72); ID: bubble (nop); EX: beq; MEM: sub; WB: before<1>.]

Superscalar Architecture

A superscalar processor executes more than one instruction during a clock cycle by simultaneously dispatching multiple instructions to redundant functional units on the processor.
Each functional unit is not a separate CPU core but an execution resource within a single CPU.
[Figures: typical pipeline; superscalar pipeline; Pentium pipeline.]

19

Review: Computer Organization

Review: Computer Organization Review: Compter Organization Pipelining Chans Y Landry Eample Landry Eample Ann, Brian, Cathy, Dave each have one load of clothes to wash, dry, and fold Washer takes 3 mintes A B C D Dryer takes 3 mintes

More information

Enhanced Performance with Pipelining

Enhanced Performance with Pipelining Chapter 6 Enhanced Performance with Pipelining Note: The slides being presented represent a mi. Some are created by ark Franklin, Washington University in St. Lois, Dept. of CSE. any are taken from the

More information

Pipelining. Chapter 4

Pipelining. Chapter 4 Pipelining Chapter 4 ake processor rns faster Pipelining is an implementation techniqe in which mltiple instrctions are overlapped in eection Key of making processor fast Pipelining Single cycle path we

More information

PS Midterm 2. Pipelining

PS Midterm 2. Pipelining PS idterm 2 Pipelining Seqential Landry 6 P 7 8 9 idnight Time T a s k O r d e r A B C D 3 4 2 3 4 2 3 4 2 3 4 2 Seqential landry takes 6 hors for 4 loads If they learned pipelining, how long wold landry

More information

What do we have so far? Multi-Cycle Datapath

What do we have so far? Multi-Cycle Datapath What do we have so far? lti-cycle Datapath CPI: R-Type = 4, Load = 5, Store 4, Branch = 3 Only one instrction being processed in datapath How to lower CPI frther? #1 Lec # 8 Spring2 4-11-2 Pipelining pipelining

More information

Chapter 6: Pipelining

Chapter 6: Pipelining CSE 322 COPUTER ARCHITECTURE II Chapter 6: Pipelining Chapter 6: Pipelining Febrary 10, 2000 1 Clothes Washing CSE 322 COPUTER ARCHITECTURE II The Assembly Line Accmlate dirty clothes in hamper Place in

More information

TDT4255 Friday the 21st of October. Real world examples of pipelining? How does pipelining influence instruction

TDT4255 Friday the 21st of October. Real world examples of pipelining? How does pipelining influence instruction Review Friday the 2st of October Real world eamples of pipelining? How does pipelining pp inflence instrction latency? How does pipelining inflence instrction throghpt? What are the three types of hazard

More information

Comp 303 Computer Architecture A Pipelined Datapath Control. Lecture 13

Comp 303 Computer Architecture A Pipelined Datapath Control. Lecture 13 Comp 33 Compter Architectre A Pipelined path Lectre 3 Pipelined path with Signals PCSrc IF/ ID ID/ EX EX / E E / Add PC 4 Address Instrction emory RegWr ra rb rw Registers bsw [5-] [2-6] [5-] bsa bsb Sign

More information

The extra single-cycle adders

The extra single-cycle adders lticycle Datapath As an added bons, we can eliminate some of the etra hardware from the single-cycle path. We will restrict orselves to sing each fnctional nit once per cycle, jst like before. Bt since

More information

Review. A single-cycle MIPS processor

Review. A single-cycle MIPS processor Review If three instrctions have opcodes, 7 and 5 are they all of the same type? If we were to add an instrction to IPS of the form OD $t, $t2, $t3, which performs $t = $t2 OD $t3, what wold be its opcode?

More information

Overview of Pipelining

Overview of Pipelining EEC 58 Compter Architectre Pipelining Department of Electrical Engineering and Compter Science Cleveland State University Fndamental Principles Overview of Pipelining Pipelined Design otivation: Increase

More information

1048: Computer Organization

1048: Computer Organization 8: Compter Organization Lectre 6 Pipelining Lectre6 - pipelining (cwli@twins.ee.nct.ed.tw) 6- Otline An overview of pipelining A pipelined path Pipelined control Data hazards and forwarding Data hazards

More information

Chapter 6: Pipelining

Chapter 6: Pipelining Chapter 6: Pipelining Otline An overview of pipelining A pipelined path Pipelined control Data hazards and forwarding Data hazards and stalls Branch hazards Eceptions Sperscalar and dynamic pipelining

More information

EEC 483 Computer Organization

EEC 483 Computer Organization EEC 483 Compter Organization Chapter 4.4 A Simple Implementation Scheme Chans Y The Big Pictre The Five Classic Components of a Compter Processor Control emory Inpt path Otpt path & Control 2 path and

More information

PIPELINING. Pipelining: Natural Phenomenon. Pipelining. Pipelining Lessons

PIPELINING. Pipelining: Natural Phenomenon. Pipelining. Pipelining Lessons Pipelining: Natral Phenomenon Landry Eample: nn, rian, Cathy, Dave each have one load of clothes to wash, dry, and fold Washer takes 30 mintes C D Dryer takes 0 mintes PIPELINING Folder takes 20 mintes

More information

CS 251, Winter 2019, Assignment % of course mark

CS 251, Winter 2019, Assignment % of course mark CS 25, Winter 29, Assignment.. 3% of corse mark De Wednesday, arch 3th, 5:3P Lates accepted ntil Thrsday arch th, pm with a 5% penalty. (7 points) In the diagram below, the mlticycle compter from the corse

More information

The final datapath. M u x. Add. 4 Add. Shift left 2. PCSrc. RegWrite. MemToR. MemWrite. Read data 1 I [25-21] Instruction. Read. register 1 Read.

The final datapath. M u x. Add. 4 Add. Shift left 2. PCSrc. RegWrite. MemToR. MemWrite. Read data 1 I [25-21] Instruction. Read. register 1 Read. The final path PC 4 Add Reg Shift left 2 Add PCSrc Instrction [3-] Instrction I [25-2] I [2-6] I [5 - ] register register 2 register 2 Registers ALU Zero Reslt ALUOp em Data emtor RegDst ALUSrc em I [5

More information

Solutions for Chapter 6 Exercises

Solutions for Chapter 6 Exercises Soltions for Chapter 6 Eercises Soltions for Chapter 6 Eercises 6. 6.2 a. Shortening the ALU operation will not affect the speedp obtained from pipelining. It wold not affect the clock cycle. b. If the

More information

EEC 483 Computer Organization. Branch (Control) Hazards

EEC 483 Computer Organization. Branch (Control) Hazards EEC 483 Compter Organization Section 4.8 Branch Hazards Section 4.9 Exceptions Chans Y Branch (Control) Hazards While execting a previos branch, next instrction address might not yet be known. s n i o

More information

Chapter 3 & Appendix C Pipelining Part A: Basic and Intermediate Concepts

Chapter 3 & Appendix C Pipelining Part A: Basic and Intermediate Concepts CS359: Compter Architectre Chapter 3 & Appendi C Pipelining Part A: Basic and Intermediate Concepts Yanyan Shen Department of Compter Science and Engineering Shanghai Jiao Tong University 1 Otline Introdction

More information

The single-cycle design from last time

The single-cycle design from last time lticycle path Last time we saw a single-cycle path and control nit for or simple IPS-based instrction set. A mlticycle processor fies some shortcomings in the single-cycle CPU. Faster instrctions are not

More information

CS 251, Winter 2018, Assignment % of course mark

CS 251, Winter 2018, Assignment % of course mark CS 25, Winter 28, Assignment 4.. 3% of corse mark De Wednesday, arch 7th, 4:3P Lates accepted ntil Thrsday arch 8th, am with a 5% penalty. (6 points) In the diagram below, the mlticycle compter from the

More information

The multicycle datapath. Lecture 10 (Wed 10/15/2008) Finite-state machine for the control unit. Implementing the FSM

Lab # Hardware, due Fri Oct 7; HW #2 MIPS programming, due Wed Oct 22; Midterm Fri Oct 2. The multicycle datapath. Today's objectives: Microprogramming; Extending the multi-cycle datapath; Multi-cycle

Review Multicycle: What is Happening. Controlling The Multicycle Design

[Figure: multicycle datapath diagram; labels include Registers, Memory, Sign extend, Shift left, IR, RegDst, IorD]

Computer Architecture Chapter 5. Fall 2005 Department of Computer Science Kent State University

The Processor: Datapath & Control. Our implementation of the MIPS is simplified: memory-reference instructions: lw,

EEC 483 Computer Organization

Chapter 4.6 A Pipelined Datapath. Chansu Yu. Pipelined Approach: cycle time, no. of stages, resource conflict. Resources used in 5 stages

Animating the Datapath. Animating the Datapath: R-type Instruction. Animating the Datapath: Load Instruction. MIPS Datapath I: Single-Cycle

MIPS Datapath I: Single-Cycle. Input is either (R-type) or sign-extended lower half of instruction (load/store). [Figure labels: op, offset/immediate, Register File] beq, offset: if

1048: Computer Organization

Lecture 5: Datapath and Control. Lecture 5A: simple implementation (cwliu@twins.ee.nctu.edu.tw). Introduction: In this lecture, we will try to implement simplified MIPS which contain Memory

Instruction fetch. MemRead. IRWrite ALUSrcB = 01. ALUOp = 00. PCWrite. PCSource = 00. ALUSrcB = 00. R-type completion

(Chapter 5) Fill in the values for ALUSrcA, ALUSrcB, IorD, RegDst and MemtoReg to complete the Finite State Machine for the multi-cycle datapath shown below. [State diagram residue: Instruction fetch, Memory address computation]

PART I: Adding Instructions to the Datapath. (2 nd Edition):

EE57 Instructor: G. Puvvada. Homework #5b. Due: check on the blackboard

Exceptions and interrupts

An exception or interrupt is an unexpected event that requires the CPU to pause or stop the current program. Exception handling is the hardware analog of error handling in software. Classes

4.13 Advanced Topic: An Introduction to Digital Design Using a Hardware Design Language 345.e1

4.13 Advanced Topic: An Introduction to Digital Design Using a Hardware Design Language to Describe and Model a Pipeline

Prof. Kozyrakis. 1. (10 points) Consider the following fragment of Java code:

EE8 Winter 25 Homework #2 Solutions. Due Thursday, Feb 2, 5 PM. 1. (10 points) Consider the following fragment of Java code: for (i=; i

Lecture 7. Building A Simple Processor

Christos Kozyrakis, Stanford University. http://eeclass.stanford.ed/ee8b Announcements: Upcoming deadlines: Lab is due today; demo by 5pm, report

CS 251, Spring 2018, Assignment 3.0 3% of course mark

Due Monday, June 25th, 5:3 PM. (5 points) Consider the single-cycle computer shown on page 6 of this assignment. Suppose the circuit elements take the following

EXAMINATIONS 2003 END-YEAR COMP 203. Computer Organisation

Time Allowed: 3 Hours (180 minutes). Instructions: Answer all questions. There are 180 possible marks on the exam. Calculators and foreign language dictionaries

Quiz #1 EEC 483, Spring 2019

Date: Jan 22. Exercise #1: Translate the following instruction in C into MIPS code. Exercise #2: Translate the following instruction in C into MIPS code. Hint: operand C is stored

Lecture 10: Pipelined Implementations

James C. Hoe, Dept of ECE, CMU, February 23, 29. Announcements: Project is due this week; Midterm graded, results posted. Handouts: H9 Homework 3 (on Blackboard); Graded

ECE232: Hardware Organization and Design

Lecture 11: Introduction to MIPS Datapath. Adapted from Computer Organization and Design, Patterson & Hennessy, UCB. MIPS-lite processor: Want to build a processor for a subset

EXAMINATIONS 2010 END OF YEAR NWEN 242 COMPUTER ORGANIZATION

Time Allowed: 3 Hours (180 minutes). Instructions: Answer all questions. Make sure your answers are clear and to the point. Calculators and paper foreign language

CSE Introduction to Computer Architecture Chapter 5 The Processor: Datapath & Control

CSE-45432 Introduction to Computer Architecture, Chapter 5, The Processor: Datapath & Control. Dr. Izadi. [Figure: datapath overview; labels include PC, Registers, ALU, Data memory]

Hardware Design Tips. Outline

EE 36, University of Hawaii. Outline: Verilog: some subtleties; Simulators; Test Benching; Implementing the MIPS (actually a simplified 6 bit version)

Computer Architecture

Lecture 4: Intro to Microarchitecture: Single-Cycle. Dr. Ahmed Sallam, Suez Canal University, Spring 25. Based on original slides by Prof. Onur Mutlu. Review: Computer Architecture Today and Basics

1048: Computer Organization

Lecture 5: Datapath and Control. Lecture 5B: multicycle implementation (cwliu@twins.ee.nctu.edu.tw). Recap: A Single-Cycle Processor. [Figure labels: PCSrc, Add, Shift left 2, ALU result, PC]

Computer Architecture. Lecture 6: Pipelining

Dr. Ahmed Sallam. Based on original slides by Prof. Onur Mutlu. Agenda for Today & Next Few Lectures: Single-cycle Microarchitectures; Multi-cycle and Microprogrammed Microarchitectures

CS 251, Winter 2018, Assignment % of course mark

CS 251, Winter 2018, Assignment 3.. 3% of course mark. Due Monday, February 26th, 4:3 PM. Lates accepted until : AM, February 27th with a 5% penalty. IEEE 754 Floating Point ( points): (a) (4 points) Complete the following

Computer Architecture

Lecture 4: Intro to Microarchitecture: Single-Cycle. Dr. Ahmed Sallam, Suez Canal University. Based on original slides by Prof. Onur Mutlu. Review: Computer Architecture Today and Basics (Lectures

Chapter 3 & Appendix C Pipelining Part A: Basic and Intermediate Concepts

CS359: Computer Architecture. Yanyan Shen, Department of Computer Science and Engineering, Shanghai Jiao Tong University. Parallel

CSE 141 Computer Architecture Summer Session I, Lectures 10 Advanced Topics, Memory Hierarchy and Cache. Pramod V. Argade

CSE141: Introduction to Computer Architecture, Summer Session I, 2004. Lectures 10: Advanced Topics, Memory Hierarchy and Cache. Instructor: Pramod V. Argade (p2argade@cs.ucsd.edu)

4.13. An Introduction to Digital Design Using a Hardware Design Language to Describe and Model a Pipeline and More Pipelining Illustrations

This online section covers hardware description languages and then gives

Chapter Six. Pipelining

Improve performance by increasing instruction throughput. [Figure: execution timeline for a sequence of lw instructions, with 8 ns and 2 ns intervals marked]

Processor Design CSCE Instructor: Saraju P. Mohanty, Ph. D. NOTE: The figures, text etc included in slides are borrowed

Lecture 3: General Purpose Processor Design. CSCE 665 Advanced VLSI Systems. Instructor: Saraju P. Mohanty, Ph. D. NOTE: The figures, text etc. included in slides are borrowed from various books, websites,

Instruction Pipelining is the use of pipelining to allow more than one instruction to be in some stage of execution at the same time.

Ferranti ATLAS (1963): Pipelining reduced the average time per instruction

Chapter 4 (Part II) Sequential Laundry

Chapter 4 (Part II): The Processor. Baback Izadi, Division of Engineering Programs, bai@engr.newpaltz.edu. Sequential Laundry [Figure: laundry timeline from 6 PM to 2 AM, task order A to D]

Pipelined Datapath. Reading. Sections Practice Problems: 1, 3, 8, 12 (2) Lecture notes from MKP, H. H. Lee and S.

Lecture notes from MKP, H. H. Lee and S. Yalamanchili. Reading: Sections 4.5 4. Practice Problems: 1, 3, 8, 12. Pipeline Performance: Assume time for stages is ps for register read or write

Pipelined Datapath. Reading. Sections Practice Problems: 1, 3, 8, 12

Lecture notes from MKP, H. H. Lee and S. Yalamanchili. Reading: Sections 4.5 4. Practice Problems: 1, 3, 8, 12. Note: Appendices A-E in the hardcopy text correspond to chapters 7- in the online

Computer and Information Sciences College / Computer Science Department Enhancing Performance with Pipelining

Single-Cycle Design Problems: Assuming fixed-period clock, every instruction datapath uses one

Lab 8 (All Sections) Prelab: ALU and ALU Control

Name: Sign the following statement: On my honor, as an Aggie, I have neither given nor received unauthorized aid on this academic work. Objective: In this lab you will

SI232 Set #20: Laundry, Co-dependency, and other Hazards of Modern (Architecture) Life. Chapter 6 ADMIN. Reading for Chapter 6: 6.1,

ADMIN: Reading for Chapter 6: 6.1, 6.9-6.2. Midnight Laundry [Figure: task order A to D, timeline from 6 PM to 2 AM]. Smarty

Lecture 6: Microprogrammed Multi Cycle Implementation. James C. Hoe Department of ECE Carnegie Mellon University

18-447. Your goal today. Housekeeping. Understand why

Winter 2013 MIDTERM TEST #2 Wednesday, March 20 7:00pm to 8:15pm. Please do not write your U of C ID number on this cover page.

University of Calgary, Department of Electrical and Computer Engineering. ENCM 369: Computer Organization. Lecture Instructors: Steve Norman and Norm Bartley.

T = I x CPI x C. Both effective CPI and clock cycle C are heavily influenced by CPU design. CPI increased (3-5) bad Shorter cycle good

CPU performance equation: T = I x CPI x C. Both effective CPI and clock cycle C are heavily influenced by CPU design. For single-cycle CPU: CPI = 1 (good), long cycle time (bad). On the other hand, for multi-cycle

The Disciplined Flood Protocol in Sensor Networks

Young-ri Choi and Mohamed G. Gouda, Department of Computer Sciences, The University of Texas at Austin, U.S.A. {yrchoi, gouda}@cs.utexas.edu. Hussein M. Abdel-Wahab

comp 180 Lecture 25 Outline of Lecture The ALU Control Operation & Design The Datapath Control Operation & Design HKUST 1 Computer Science

Control: After the design of partial single MIPS datapath, we need to add the control unit

CSSE232 Computer Architecture I. Multicycle Datapath

Class Status: Next 3 days: Multicycle datapath. Reading: Multicycle datapath is not in the book! How long do instructions take? ALU 2ns, Mem 2ns, Reg File 1ns. Everything

EE 457 Unit 6a. Basic Pipelining Techniques

Pipelining Introduction: Consider a drink bottling plant. Filling the bottle = 3 sec. Placing the cap = 3 sec. Labeling = 3 sec. Would you want Machine = Does

Multiple-Choice Test Chapter Golden Section Search Method Optimization COMPLETE SOLUTION SET

Chapter 09.0. Which of the following statements is incorrect regarding the Equal Interval Search and Golden Section Search

Lecture 9: Microcontrolled Multi-Cycle Implementations

18-447. James C. Hoe, Dept of ECE, CMU, February 8, 29. Announcements: P&H Appendix D; Get started on Lab. Handouts: Handout #8: Project (on Blackboard). Single-Cycle

1 Hazards COMP2611 Fall 2015 Pipelined Processor

Dependences in Programs. Data dependence example: lw $1, 200($2); add $3, $4, $1. The add can't do ID (i.e., read register $1) until lw updates $1. Control dependence example: bne $1, $2, target; add

POWER-OF-2 BOUNDARIES

CHAPTER 3: POWER-OF-2 BOUNDARIES. 3-1 Rounding Up/Down to a Multiple of a Known Power of 2. Rounding an unsigned integer down to, for example, the next smaller multiple of

MIPS Architecture. Fibonacci (C) Fibonacci (Assembly) Another Example: MIPS. Example: subset of MIPS processor architecture

Another Example: MIPS. From the Harris/Weste book; based on the MIPS-like processor from the Hennessy/Patterson book. MIPS Architecture. Example: subset of MIPS processor architecture. Drawn from Patterson & Hennessy

Pipelining concepts The DLX architecture A simple DLX pipeline Pipeline Hazards and Solution to overcome

Pipeline. Thoai Nam. Reference: Computer Architecture: A Quantitative Approach, John L Hennessy

Outline Marquette University

COEN-4710 Computer Hardware, Lecture 4, Processor Part 2: Pipelining (Ch.4). Cristinel Ababei, Department of Electrical and Computer Engineering. Credits: Slides adapted primarily from presentations from Mike

Lecture 3: The Processor (Chapter 4 of textbook) Chapter 4.1

Introduction. Review: MIPS (RISC) Design Principles. Simplicity favors regularity: fixed size instructions, small number

Lecture 13: Exceptions and Interrupts

18-447. James C. Hoe, Dept of ECE, CMU, March 1, 2010. Announcements: Spring break is almost here; Check grades on Blackboard; Midterm 1 graded. Handouts: Handout #9: Lab

COMPUTER ORGANIZATION AND DESIGN

The Hardware/Software Interface, 5th Edition. Chapter 4: The Processor. Introduction: CPU performance factors: Instruction count (determined by ISA and compiler); CPI and Cycle

Basics of Digital Logic Design

Signals, Logic Operations and Gates. E 675.2: Introduction to Computer Architecture. Rather than referring to voltage levels of signals, we shall consider signals that are logically

Pipelining concepts The DLX architecture A simple DLX pipeline Pipeline Hazards and Solution to overcome

Thoai Nam. Reference: Computer Architecture: A Quantitative Approach, John L Hennessy & David A Patterson,

Improve performance by increasing instruction throughput

[Figure: program execution order (in instructions) vs. time for lw $1, 100($0) and lw $2, 200($0); stages shown include fetch, ALU and data access, with an 8 ns interval marked]

What do we have so far? Multi-Cycle Datapath (Textbook Version)

CPI: R-Type = 4, Load = 5, Store = 4, Branch = 3. Only one instruction being processed in datapath. How to lower CPI further? (Lec # 8, Summer 2001)

Pipelining Analogy. Pipelined laundry: overlapping execution. Parallelism improves performance. Four loads: Non-stop: Speedup = 8/3.5 = 2.3.

Pipelined laundry: overlapping execution. Parallelism improves performance. Four loads: Speedup = 8/3.5 = 2.3. Non-stop: Speedup = 2n/(0.5n + 1.5) ≈ 4 = number of stages. 4.5 An Overview

14:332:331 Pipelined Datapath

Single Cycle Disadvantages & Advantages: Uses the clock cycle inefficiently: the clock cycle must be timed to accommodate

Computer Architectures. DLX ISA: Pipelined Implementation

The Pipelining Principle: Pipelining is nowadays the main basic technique deployed to speed up a CPU. The key idea for pipelining is general, and

Designing a Pipelined CPU

Review: Single Cycle CPU. Review: Multiple Cycle CPU. Review: Instruction Latencies. Single-Cycle CPU: Load (Ifetch, Reg/Dec, Exec, Mem, Wr). Multiple

Review. How to represent real numbers

[Figure: multicycle datapath diagram; labels include PCWrite, PC, IorD, ALUSrcA, MemRead, MemWrite, IRWrite, RegDst, RegWrite]

CS 110 Computer Architecture. Pipelining. Guest Lecture: Shu Yin. School of Information Science and Technology SIST

http://shtech.org/courses/ca/ ShanghaiTech University. Slides based on UC Berkeley's CS61C

ECEC 355: Pipelining

November 8, 2007. What is Pipelining: Pipelining is an implementation technique whereby multiple instructions are overlapped in execution. A pipeline is similar in concept to an assembly

Chapter 4. The Processor

Introduction: CPU performance factors: Instruction count (determined by ISA and compiler); CPI and cycle time (determined by CPU hardware). We will examine two MIPS implementations: A simplified

Computer Architecture Lecture 6: Multi-cycle Microarchitectures. Prof. Onur Mutlu Carnegie Mellon University Spring 2012, 2/6/2012

18-447. Reminder: Homeworks. Homework solutions: Check and study the solutions! Learning now is

Pipelining. Maurizio Palesi

Adapted from David A. Patterson's CS252 lecture slides, http://www.cs.berkeley/~pattrsn/252s98/index.html Copyright 1998 UCB. References: John L. Hennessy and David A. Patterson, Computer

Single-Cycle Examples, Multi-Cycle Introduction

Today's Menu: Single cycle examples; Single cycle machines vs. multi-cycle machines; Why multi-cycle?; Comparative performance; Physical and Logical Design of

Pipelined Processor Design

Pipelined Implementation: MIPS. Virendra Singh, Computer Design and Test Lab., Indian Institute of Science (IISc), Bangalore. virendra@computer.org. Advance Computer Architecture: http://www.serc.iisc.ernet.in/~viren/courses/aca/aca.htm

Lecture 19 Introduction to Pipelining

CSE 30321, Lecture 19: Pipelining (Part 1). Basic pipelining: basic := single, in-order issue; single issue: one instruction at

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor

Introduction: CPU performance factors: Instruction count (determined by ISA and compiler); CPI and Cycle

COMPUTER ORGANIZATION AND DESIGN

5th Edition, The Hardware/Software Interface. Chapter 4: The Processor. 4.1 Introduction: CPU performance factors: Instruction count; CPI and Cycle time. Determined

Chapter 4. Instruction Execution. Introduction. CPU Overview. Multiplexers. Chapter 4 The Processor 1. The Processor.

COMPUTER ORGANIZATION AND DESIGN: The Hardware/Software Interface, 5th Edition. Chapter 4: The Processor. The Processor - Introduction

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor

The Processor - Introduction

Pipelining. Ideal speedup is number of stages in the pipeline. Do we achieve this? 2. Improve performance by increasing instruction throughput ...

CHAPTER 6: Pipelining. Stage times in ps by instruction class (instruction memory, register read, ALU, data memory, register write, total): Load word 200, 100, 200, 200, 100, 800; Store word 200, 100, 200, 200, none, 700; R-format 200, 100, 200, none, 100

Pipelined Datapath. One register file is enough

The goal of pipelining is to allow multiple instructions execute at the same time. We may need to perform several operations in a cycle: Increment the PC and add registers at the same time. Fetch one

Functions of Combinational Logic

CHAPTER 6. Chapter outline: Half and Full Adders; Parallel Binary Adders; Ripple Carry and Look-Ahead Carry Adders; Comparators; Decoders; Encoders; Code