This Unit: (Scalar In-Order) Pipelining. CIS 501 Computer Architecture. Readings. Pre-Class Exercises
- Edward Fields
- 6 years ago
Transcription
This Unit: (Scalar In-Order) Pipelining
CIS 501 Computer Architecture. Unit: Pipelining
[Figure: system layers: App, App, App / System software / Mem, CPU, I/O]
- Principles of pipelining
- Effects of overhead and hazards
- Pipeline diagrams
- Data hazards
- Stalling and bypassing
- Control hazards
- Branch prediction
- Predication
Slides developed by Milo Martin & Amir Roth at the University of Pennsylvania with sources that include University of Wisconsin slides by Mark Hill, Guri Sohi, Jim Smith, and David Wood.
CIS 501 (Martin): Pipelining

Readings
- Chapter 2.1 of MA:FSPTCM

Pre-Class Exercises
- Question #1: you have a washer, dryer, and "folder". Each takes 30 minutes per load.
  - How long for one load in total?
  - How long for two loads of laundry?
  - How long for 100 loads of laundry?
- Question #2: now assume washing takes 30 minutes, drying 60 minutes, and folding 15 minutes.
  - How long for one load in total?
  - How long for two loads of laundry?
  - How long for 100 loads of laundry?
Pre-Class Exercises Answers
- Question #1: you have a washer, dryer, and "folder". Each takes 30 minutes per load.
  - How long for one load in total? 90 minutes
  - How long for two loads of laundry? 90 + 30 = 120 minutes
  - How long for 100 loads of laundry? 90 + 30*99 = 3060 min
- Question #2: now assume washing takes 30 minutes, drying 60 minutes, and folding 15 minutes.
  - How long for one load in total? 105 minutes
  - How long for two loads of laundry? 105 + 60 = 165 minutes
  - How long for 100 loads of laundry? 105 + 60*99 = 6045 min

Datapath Background

Recall: The Sequential Model
[Figure: loop of Fetch, Decode, Read Inputs, Execute, Write Output, Next Insn]
- Basic structure of all modern ISAs
- Processor logically executes the loop at left
- Program order: total order on dynamic insns
  - Order and named storage define computation
- Convenient feature: program counter (PC)
  - Insn itself is at memory[PC]
  - Next PC is PC++ unless insn says otherwise
- Atomic: insn X finishes before insn X+1 starts
  - Can break this constraint physically (pipelining)
  - But must maintain the illusion to preserve programmer sanity

Datapath and Control
[Figure: PC, I$, register file, D$, and control]
- Datapath: implements the execute portion of the fetch/exec. loop
  - Functional units (ALUs), registers, memory interface
- Control: implements the decode portion of the fetch/execute loop
  - Mux selectors, write enable signals regulate flow of data in the datapath
  - Part of decode involves translating insn opcode into control signals
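The laundry answers above can be checked with a few lines of Python. This is a minimal sketch, assuming one machine per stage, loads handled strictly in order, and that a load may wait between stages until the next machine is free. The function name `pipeline_finish_time` is made up for illustration and is not from the slides.

```python
def pipeline_finish_time(stage_minutes, num_loads):
    """Minutes until the last load leaves the last stage.

    Assumes one machine per stage, in-order loads, and unlimited
    waiting space between stages (an assumption of this sketch).
    """
    # free_at[s] = time at which stage s finishes the load it is working on
    free_at = [0] * len(stage_minutes)
    for _ in range(num_loads):
        ready = 0  # time at which this load is ready for its next stage
        for s, minutes in enumerate(stage_minutes):
            start = max(ready, free_at[s])  # wait for the load and the machine
            free_at[s] = start + minutes
            ready = free_at[s]
    return free_at[-1]


# Question #1: three balanced 30-minute stages
print(pipeline_finish_time([30, 30, 30], 1))    # 90
print(pipeline_finish_time([30, 30, 30], 100))  # 3060

# Question #2: the 60-minute dryer is the bottleneck stage
print(pipeline_finish_time([30, 60, 15], 2))    # 165
print(pipeline_finish_time([30, 60, 15], 100))  # 6045
```

With unbalanced stages the steady-state rate is set by the slowest stage, which is the same reason a pipeline's clock period is the maximum stage delay.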
Single-Cycle Datapath
[Figure: PC, I$, register file, D$]
- Single-cycle datapath: true "atomic" fetch/execute loop
  - Fetch, decode, execute one complete instruction every cycle
  - "Hardwired control": opcode decoded to control signals directly
  - Low CPI: 1 by definition
  - Long clock period: to accommodate the slowest instruction

Multi-Cycle Datapath
[Figure: PC, I$, register file, D$ with intermediate latches]
- Multi-cycle datapath: attacks slow clock
  - Fetch, decode, execute one complete insn over multiple cycles
  - Micro-coded control: "stages" control signals
  - Allows insns to take different numbers of cycles (main point)
  - ± Opposite of single-cycle: short clock period, high CPI (think: CISC)

Single-cycle vs. Multi-cycle Performance
- Single-cycle
  - Clock period = 50ns, CPI = 1
  - Performance = 50ns/insn
- Multi-cycle has the opposite performance split of single-cycle
  - Shorter clock period
  - Higher CPI
- Multi-cycle
  - Branch: 20% (3 cycles), load: 20% (5 cycles), ALU: 60% (4 cycles)
  - Clock period = 11ns, CPI = (20%*3) + (20%*5) + (60%*4) = 4
  - Why is the clock period 11ns and not 10ns?
  - Performance = 44ns/insn
- Aside: CISC makes perfect sense in a multi-cycle datapath

Pipelining Basics
Latency vs. Throughput Revisited
[Figure: single-cycle timeline (insn0.fetch, dec, exec; then insn1.fetch, dec, exec) versus multi-cycle timeline (insn0.fetch, insn0.dec, insn0.exec, insn1.fetch, insn1.dec, insn1.exec)]
- Can we have both low CPI and short clock period?
  - Not if the datapath executes only one insn at a time
- Latency and throughput: two views of performance
  - (1) at the program level and (2) at the instruction level
- Single instruction latency
  - Doesn't matter: programs comprise billions of instructions
  - Difficult to reduce anyway
- Goal is to make programs, not individual insns, go faster
  - Instruction throughput -> program latency
  - Key: exploit inter-insn parallelism

Pipelining
[Figure: multi-cycle timeline versus pipelined timeline in which insn1.fetch overlaps insn0.dec, and insn1.dec overlaps insn0.exec]
- Important performance technique
  - Improves instruction throughput rather than instruction latency
- Begin with the multi-cycle design
  - When an insn advances from stage 1 to 2, the next insn enters at stage 1
  - Form of parallelism: "insn-stage parallelism"
  - Maintains the illusion of a sequential fetch/execute loop
  - An individual instruction takes the same number of stages
  - But instructions enter and leave at a much faster rate
- Laundry analogy

Five Stage Pipelined Datapath
[Figure: five-stage datapath with pipeline latches]
- Temporary values (PC, IR, A, B, O, D) re-latched every stage
  - Why? 5 insns may be in the pipeline at once with different PCs
  - Notice, PC is not latched after the ALU stage (not needed later)
- Pipelined control: one single-cycle controller
  - Control signals themselves are pipelined

Five Stage Pipeline Performance
[Figure: stage delays T_insn-mem, T_regfile, T_ALU, T_data-mem, T_regfile compared with T_singlecycle]
- Pipelining: cut the datapath into N stages (here five)
  - One insn in each stage in each cycle
  - Clock period = MAX(T_insn-mem, T_regfile, T_ALU, T_data-mem)
  - Base CPI = 1: an insn enters and leaves every cycle
  - Actual CPI > 1: the pipeline must often stall
  - Individual insn latency increases (pipeline overhead), but that is not the point
Pipeline Terminology
[Figure: five-stage datapath with latches PC, F/D, D/X, X/M, M/W]
- Five stage: Fetch, Decode, eXecute, Memory, Writeback
  - Nothing magical about 5 stages (Pentium 4 had 22 stages!)
- Latches (pipeline registers) named by the stages they separate
  - PC, F/D, D/X, X/M, M/W

More Terminology & Foreshadowing
- Scalar pipeline: one insn per stage per cycle
  - Alternative: "superscalar" (later)
- In-order pipeline: insns enter the execute stage in order
  - Alternative: "out-of-order" (later)
- Pipeline depth: number of pipeline stages
  - Nothing magical about five
  - Contemporary high-performance cores have ~15 stage pipelines

Pipeline Example: Cycles 1 and 2
[Datapath figures: add $3,$2,$1 enters the pipeline in cycle 1 and lw $4,0($5) follows in cycle 2; the example uses 3 instructions]
Pipeline Example: Cycles 3 to 6
[Datapath figures: sw $6,4($7) enters in cycle 3 behind lw $4,0($5) and add $3,$2,$1; each instruction advances one stage per cycle, so add leaves the pipeline first, then lw, with sw still in flight in cycle 6]
Pipeline Example: Cycle 7
[Datapath figure: sw is in the last stage]

Pipeline Diagram
- Pipeline diagram: shorthand for what we just saw
  - Across: cycles
  - Down: insns
  - Convention: X means lw $4,0($5) finishes the execute stage and writes into the X/M latch at the end of cycle 4

                 1  2  3  4  5  6  7
  add $3,$2,$1   F  D  X  M  W
  lw $4,0($5)       F  D  X  M  W
  sw $6,4($7)          F  D  X  M  W

Example Pipeline Perf. Calculation
- Single-cycle
  - Clock period = 50ns, CPI = 1
  - Performance = 50ns/insn
- Multi-cycle
  - Branch: 20% (3 cycles), load: 20% (5 cycles), ALU: 60% (4 cycles)
  - Clock period = 11ns, CPI = (20%*3) + (20%*5) + (60%*4) = 4
  - Performance = 44ns/insn
- 5-stage pipelined
  - Clock period = 12ns, approx. (50ns / 5 stages) + overheads
  - CPI = 1 (each insn takes 5 cycles, but 1 completes each cycle)
  - Performance = 12ns/insn
  - Well, actually CPI = 1 + some penalty for pipelining (next)
  - CPI = 1.5 (on average an insn completes every 1.5 cycles)
  - Performance = 18ns/insn
  - Much higher performance than single-cycle or multi-cycle

Q1: Why Is Pipeline Clock Period > (delay thru datapath) / (number of pipeline stages)?
- Three reasons:
  - Latches add delay
  - Pipeline stages have different delays, clock period is the max delay
  - [Later:] Extra datapaths for pipelining (bypassing paths)
- These factors have implications for the ideal number of pipeline stages
  - Diminishing clock frequency gains for longer (deeper) pipelines
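The three-way comparison above is just clock period times CPI. The sketch below redoes that arithmetic so the numbers can be varied. The helper names `average_cpi` and `ns_per_insn` are illustrative, and the 12ns clock and CPI of 1.5 for the pipeline are the slide's assumed values, not derived ones.

```python
def average_cpi(mix):
    """mix: list of (fraction_of_insns, cycles) pairs that sum to 100%."""
    return sum(fraction * cycles for fraction, cycles in mix)


def ns_per_insn(clock_ns, cpi):
    """Average time per instruction = clock period * cycles per instruction."""
    return clock_ns * cpi


single_cycle = ns_per_insn(50, 1)

# Branch 20% (3 cycles), load 20% (5 cycles), ALU 60% (4 cycles)
multi_cpi = average_cpi([(0.20, 3), (0.20, 5), (0.60, 4)])
multi_cycle = ns_per_insn(11, multi_cpi)

# 12ns clock and CPI = 1.5 are the slide's assumed pipeline figures
pipelined = ns_per_insn(12, 1.5)

for name, value in [("single-cycle", single_cycle),
                    ("multi-cycle", multi_cycle),
                    ("pipelined", pipelined)]:
    print(f"{name}: {value:.1f} ns/insn")
```

Rounded to one decimal place this reproduces the 50, 44, and 18 ns/insn figures from the slide.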
Q2: Why Is Pipeline CPI > 1?
- CPI for a scalar in-order pipeline is 1 + stall penalties
- Stalls are used to resolve hazards
  - Hazard: condition that jeopardizes the sequential illusion
  - Stall: pipeline delay introduced to restore the sequential illusion
- Calculating pipeline CPI
  - Frequency of stall * stall cycles
  - Penalties add (stalls generally don't overlap in in-order pipelines)
  - 1 + stall-freq_1*stall-cyc_1 + stall-freq_2*stall-cyc_2 + ...
- Correctness/performance/make common case fast (MCCF)
  - Long penalties are OK if they happen rarely, e.g., 1 + 0.01 * 10 = 1.1
  - Stalls also have implications for the ideal number of pipeline stages

Data Dependences, Pipeline Hazards, and Bypassing

Dependences and Hazards
- Dependence: relationship between two insns
  - Data: two insns use the same storage location
  - Control: one insn affects whether another executes at all
  - Not a bad thing, programs would be boring without them
  - Enforced by making the older insn go before the younger one
  - Happens naturally in single-/multi-cycle designs
  - But not in a pipeline
- Hazard: dependence & possibility of wrong insn order
  - Effects of wrong insn order cannot be externally visible
  - Stall: enforce order by keeping the younger insn in the same stage
  - Hazards are a bad thing: stalls reduce performance

Why Does Every Insn Take 5 Cycles?
[Datapath figure: add $3,$2,$1 and lw $4,0($5) in the pipeline]
- Could/should we allow add to skip M and go to W? No
  - It wouldn't help: peak fetch is still only 1 insn per cycle
  - Structural hazards: imagine add follows lw
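The CPI rule above (base CPI of 1 plus the sum of stall frequency times stall cycles) is small enough to write down directly. This is a sketch under the slide's own assumption that stalls do not overlap; `pipeline_cpi` is a hypothetical helper name, and the second example's numbers are made up purely to show two penalties adding.

```python
def pipeline_cpi(stalls, base_cpi=1.0):
    """stalls: list of (frequency_per_insn, stall_cycles) pairs.

    Assumes penalties simply add, i.e. stalls never overlap, which is
    the approximation the slide uses for in-order pipelines.
    """
    return base_cpi + sum(freq * cycles for freq, cycles in stalls)


# The slide's example: a 10-cycle penalty that hits 1% of instructions
print(round(pipeline_cpi([(0.01, 10)]), 3))  # 1.1

# Hypothetical mix of two independent stall sources (illustrative numbers)
print(round(pipeline_cpi([(0.10, 1), (0.05, 2)]), 3))  # 1.2
```

A rare long penalty and a frequent short one can cost the same, which is the "make the common case fast" point.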
Structural Hazards
- Structural hazards
  - Two insns trying to use the same circuit at the same time
  - E.g., structural hazard on the register file write port
- To fix structural hazards: proper ISA/pipeline design
  - Each insn uses every structure exactly once
  - For at most one cycle
  - Always at the same stage relative to F (fetch)
- Tolerate structural hazards
  - Add stall logic to stall the pipeline when hazards occur

Example Structural Hazard
  ld r2,0(r1)    F D X M W
  add r1,r3,r4     F D X M W
  sub r1,r3,r5       F D X M W
  st r6,0(r1)          F D X M W
- Structural hazard: resource needed twice in one cycle
  - Example: unified instruction & data memories (caches)
- Solutions:
  - Separate instruction/data memories (caches)
  - Redesign the cache to allow 2 accesses per cycle (slow, expensive)
  - Stall the pipeline

Data Hazards
[Datapath figure: sw $6,0($7), lw $4,0($5), add $3,$2,$1 in the pipeline]
- Let's forget about branches and the control for a while
- The three insn sequence we saw earlier executed fine
  - But it wasn't a real program
  - Real programs have data dependences
  - They pass values via registers and memory

Dependent Operations
- Independent operations
    add $3,$2,$1
    add $6,$5,$4
- Would this program execute correctly on a pipeline?
    add $3,$2,$1
    add $6,$5,$3
- What about this program?
    add $3,$2,$1
    lw $4,0($3)
    addi $6,1,$3
    sw $3,0($7)
Data Hazards
[Datapath figure: sw $3,0($7) in F, addi $6,1,$3 in D, lw $4,0($3) in X, add $3,$2,$1 in W]
- Would this program execute correctly on this pipeline?
  - Which insns would execute with correct inputs?
  - add is writing its result into $3 in the current cycle
  - lw read $3 two cycles ago -> got the wrong value
  - addi read $3 one cycle ago -> got the wrong value
  - sw is reading $3 this cycle -> maybe (depending on regfile design)

Memory Data Hazards
[Datapath figure: lw $4,0($1) following sw $5,0($1)]
- Are memory data hazards a problem for this pipeline? No
  - lw following sw to the same address in the next cycle gets the right value
  - Why? Data mem read/write always take place in the same stage
- Data hazards through registers? Yes (previous slide)
  - Occur because the register write is three stages after the register read
  - Can only read a register value three cycles after writing it

Observation!
[Datapath figure: lw $4,0($3) in X, add $3,$2,$1 in M]
- Technically, this situation is broken
  - lw $4,0($3) has already read $3 from the regfile
  - add $3,$2,$1 hasn't yet written $3 to the regfile
- But fundamentally, everything is OK
  - lw $4,0($3) hasn't actually used $3 yet
  - add $3,$2,$1 has already computed $3

Reducing Data Hazards: Bypassing
[Datapath figure: lw $4,0($3) in X, add $3,$2,$1 in M, with a bypass path into the ALU input]
- Bypassing
  - Reading a value from an intermediate (µarchitectural) source
  - Not waiting until it is available from the primary source
  - Here, we are bypassing the register file
  - Also called forwarding
WX Bypassing
[Datapath figure: lw $4,0($3) in X, add $3,$2,$1 in W]
- What about this combination?
  - Add another bypass path and MUX (multiplexor) input
  - First one was an MX bypass
  - This one is a WX bypass

ALUinB Bypassing
[Datapath figure: add $4,$2,$3 in X, add $3,$2,$1 in M]
- Can also bypass to ALU input B

WM Bypassing?
[Datapath figure: sw $3,0($4) in M, lw $3,0($2) in W]
- Does WM bypassing make sense?
  - Not to the address input (why not?)
  - But to the store data input, yes

Bypass Logic
- Each MUX has its own bypass logic; here it is for MUX ALUinA
  (D/X.IR.RegSource1 == X/M.IR.RegDest) => 0
  (D/X.IR.RegSource1 == M/W.IR.RegDest) => 1
  Else => 2
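The ALUinA bypass rule above can be read as a small priority function: prefer the X/M latch (the most recent producer), then the M/W latch, otherwise use the value read from the register file. The sketch below models only the select signal. Representing "this instruction writes no register" as `None` is an assumption of the sketch, not something the slide specifies.

```python
from typing import Optional


def alu_in_a_select(dx_src1: int,
                    xm_dest: Optional[int],
                    mw_dest: Optional[int]) -> int:
    """Return the MUX select for ALU input A.

    0 = MX bypass (value from X/M latch)
    1 = WX bypass (value from M/W latch)
    2 = no bypass (value read from the register file in decode)
    None for a destination means that insn writes no register
    (a modelling assumption of this sketch).
    """
    if xm_dest is not None and dx_src1 == xm_dest:
        return 0  # the newest producer wins
    if mw_dest is not None and dx_src1 == mw_dest:
        return 1
    return 2


# add $3,$2,$1 is one stage ahead of lw $4,0($3): MX bypass
print(alu_in_a_select(dx_src1=3, xm_dest=3, mw_dest=None))  # 0
# add $3,$2,$1 is two stages ahead of the consumer: WX bypass
print(alu_in_a_select(dx_src1=3, xm_dest=5, mw_dest=3))     # 1
# no in-flight producer of $3: use the register file value
print(alu_in_a_select(dx_src1=3, xm_dest=5, mw_dest=6))     # 2
```

Checking X/M before M/W matters when both in-flight instructions write the same register, because the younger value is the one program order requires.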
Pipeline Diagrams with Bypassing
- If a bypass exists, the "from"/"to" stages execute in the same cycle
- Example: full bypassing, use MX bypass
    add r2,r3->r1   F D X M W
    sub r1,r4->r2     F D X M W
- Example: full bypassing, use WX bypass
    add r2,r3->r1   F D X M W
    ld [r7]->r5       F D X M W
    sub r1,r4->r2       F D X M W
- Example: WM bypass
    add r2,r3->r1   F D X M W
    ?                 F D X M W
  - Can you think of a code example that uses the WM bypass?

Have We Prevented All Data Hazards?
[Datapath figure: add $4,$2,$3 stalled in D with a nop injected, lw $3,4($2) in X]
- No. Consider a "load" followed by a dependent "add" insn
- Bypassing alone isn't sufficient!
- Hardware solution: detect this situation and inject a stall cycle
- Software solution: ensure the compiler doesn't generate such code

Stalling to Avoid Data Hazards
[Datapath figure: hazard detection logic writing a nop into D/X]
- Prevent the F/D insn from reading (advancing) this cycle
  - Write a nop into D/X.IR (effectively, insert a nop in hardware)
  - Also reset (clear) the datapath control signals
  - Disable the F/D latch and PC write enables (why?)
- Re-evaluate the situation next cycle

Stalling on Load-To-Use Dependences
[Datapath figure: add $4,$2,$3 stalled in D, lw $3,4($2) in X]
  Stall = (D/X.IR.Operation == LOAD) &&
          ((F/D.IR.RegSrc1 == D/X.IR.RegDest) ||
           ((F/D.IR.RegSrc2 == D/X.IR.RegDest) && (F/D.IR.Op != STORE)))
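The load-to-use stall condition above translates almost word for word into code. The sketch below uses a hypothetical `Insn` record with `op`, `dest`, `src1`, and `src2` fields. Treating `src2` as a store's data register (the operand that WM bypassing covers) follows the slide's formula but is spelled out here as an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Insn:
    """Hypothetical decoded-instruction record; field names are illustrative."""
    op: str                      # e.g. "LOAD", "STORE", "ADD"
    dest: Optional[int] = None   # destination register, if any
    src1: Optional[int] = None   # first source (address base for loads/stores)
    src2: Optional[int] = None   # second source (store data for stores)


def load_use_stall(fd: Insn, dx: Insn) -> bool:
    """True if the insn in F/D must stall behind a load in D/X."""
    if dx.op != "LOAD":
        return False
    uses_src1 = fd.src1 == dx.dest
    # A store's data operand can be supplied later by the WM bypass,
    # so only non-store uses of src2 force a stall.
    uses_src2 = fd.src2 == dx.dest and fd.op != "STORE"
    return uses_src1 or uses_src2


lw = Insn("LOAD", dest=3, src1=2)              # lw $3,4($2)
add = Insn("ADD", dest=4, src1=2, src2=3)      # add $4,$2,$3
sw_data = Insn("STORE", src1=4, src2=3)        # sw $3,0($4): loaded value is store data
sw_addr = Insn("STORE", src1=3, src2=5)        # sw $5,0($3): loaded value is the address

print(load_use_stall(add, lw))      # True
print(load_use_stall(sw_data, lw))  # False
print(load_use_stall(sw_addr, lw))  # True
```

The store-address case still stalls because the address is needed in X, one cycle before the loaded value exists.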
Stalling on Load-To-Use Dependences
[Datapath figures, two animation frames: add $4,$2,$3 held in D while a nop (stall bubble) follows lw $3,4($2) down the pipeline]
  Stall = (D/X.IR.Operation == LOAD) &&
          ((F/D.IR.RegSrc1 == D/X.IR.RegDest) ||
           ((F/D.IR.RegSrc2 == D/X.IR.RegDest) && (F/D.IR.Op != STORE)))

Performance Impact of Load/Use Penalty
- Assume
  - Branch: 20%, load: 20%, store: 10%, other: 50%
  - 50% of loads are followed by a dependent instruction
  - These require a 1 cycle stall (i.e., insertion of 1 nop)
- Calculate CPI
  - CPI = 1 + (1 * 20% * 50%) = 1.1

Reducing Load-Use Stall Frequency
    add $3,$2,$1    F D X M W
    lw $4,4($3)       F D X M W
    addi $6,$4,1        F d* D X M W
    sub $8,$3,$1           F D X M W
- Use compiler scheduling to reduce load-use stall frequency
  - More on compiler scheduling later
    add $3,$2,$1    F D X M W
    lw $4,4($3)       F D X M W
    sub $8,$3,$1        F D X M W
    addi $6,$4,1          F D X M W
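The effect of the scheduling example above can be checked by counting load-use stalls in each instruction order. This is a simplified sketch: it assumes full bypassing, a one-cycle load-use penalty, and that only the instruction immediately after a load can stall on it. The tuple encoding of instructions is made up for the example, and the store-data exception is ignored because the example has no stores.

```python
def count_load_use_stalls(program):
    """program: list of (op, dest_reg, source_regs) tuples in program order.

    Counts one stall for every load whose destination is read by the
    very next instruction (a simplification for this sketch).
    """
    stalls = 0
    for older, younger in zip(program, program[1:]):
        op, dest, _ = older
        if op == "lw" and dest in younger[2]:
            stalls += 1
    return stalls


original = [
    ("add", 3, [2, 1]),   # add  $3,$2,$1
    ("lw", 4, [3]),       # lw   $4,4($3)
    ("addi", 6, [4]),     # addi $6,$4,1   <- uses $4 right after the load
    ("sub", 8, [3, 1]),   # sub  $8,$3,$1
]
scheduled = [original[0], original[1], original[3], original[2]]

print(count_load_use_stalls(original))   # 1
print(count_load_use_stalls(scheduled))  # 0
```

Moving the independent `sub` into the slot after the load hides the load latency without changing the program's result.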
Pipelining and Multi-Cycle Operations
[Datapath figure: multiplier P beside X, with Xctrl and a P/W latch]
- What if you wanted to add a multi-cycle operation?
  - E.g., 4-cycle multiply
  - P/W: separate output latch connects to the W stage
  - Controlled by the pipeline control finite state machine (FSM)

A Pipelined Multiplier
[Datapath figure: multiplier stages with latches P0/P1, P1/P2, P2/P3, P3/W]
- The multiplier itself is often pipelined, what does this mean?
  - Product/multiplicand register/ALUs/latches replicated
  - Can start different multiply operations in consecutive cycles

Pipeline Diagram with Multiplier
    mul $4,$3,$5    F D P0 P1 P2 P3 W
    addi $6,$4,1      F D  d* d* d* X  M W
- What about two instructions trying to write the register file in the same cycle?
  - Structural hazard!
  - Must prevent:
    mul $4,$3,$5    F D P0 P1 P2 P3 W
    addi $6,$1,1      F D  X  M  W
    add $5,$6,$10       F  D  X  M  W

More Multiplier Nasties
- What about mis-ordered writes to the same register?
  - Software thinks add gets $4 from addi, actually gets it from mul
    mul $4,$3,$5    F D P0 P1 P2 P3 W
    addi $4,$1,1      F D  X  M  W
    ...
    add $10,$4,$6            F  D  X  M W
- Common? Not for a 4-cycle multiply with a 5-stage pipeline
  - More common with deeper pipelines
  - In any case, must be correct
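The write-port structural hazard in the multiplier diagrams above comes from instructions with different execute lengths reaching W in the same cycle. A rough sketch of that check follows, assuming one instruction is fetched per cycle with no stalls and that every instruction listed writes the register file. The function name and the `exec_cycles` table are illustrative only.

```python
from collections import defaultdict


def writeback_conflicts(program, exec_cycles):
    """Find cycles in which more than one insn would be in the W stage.

    program: opcode names, fetched one per cycle starting at cycle 1,
             with no stalls (an assumption of this sketch).
    exec_cycles: opcode -> number of cycles spent between D and W
                 (2 for X+M, 4 for the multiplier's P0..P3).
    """
    by_cycle = defaultdict(list)
    for index, op in enumerate(program):
        fetch = index + 1
        # F at `fetch`, D one cycle later, then the execute cycles, then W
        writeback = fetch + 1 + exec_cycles[op] + 1
        by_cycle[writeback].append((index, op))
    return {cycle: insns for cycle, insns in by_cycle.items() if len(insns) > 1}


latency = {"mul": 4, "addi": 2, "add": 2}
print(writeback_conflicts(["mul", "addi", "add"], latency))
# {7: [(0, 'mul'), (2, 'add')]}
```

Here `mul` (fetched in cycle 1) and `add` (fetched in cycle 3) both reach W in cycle 7, which is the collision the slide says the stall logic must prevent.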
Corrected Pipeline Diagram
- With the correct stall logic
  - Prevent mis-ordered writes to the same register
  - Why two cycles of delay?
    mul $4,$3,$5    F D P0 P1 P2 P3 W
    addi $4,$1,1      F d* d* D  X  M W
    ...
    add $10,$4,$6              F  D  X M W
- Multi-cycle operations complicate pipeline logic

Pipelined Functional Units
- Almost all multi-cycle functional units are pipelined
  - Each operation takes N cycles
  - But can initiate a new (independent) operation every cycle
  - Requires internal latching and some hardware replication
  - A cheaper way to add bandwidth than multiple non-pipelined units
    mulf f0,f1,f2   F D E* E* E* E* W
    mulf f3,f4,f5     F D  E* E* E* E* W
- One exception: int/FP divide: difficult to pipeline and not worth it
    divf f0,f1,f2   F D E/ E/ E/ E/ W
    divf f3,f4,f5     F D  s* s* s* E/ E/ E/ E/ W
  - s* = structural hazard, two insns need the same structure
  - ISAs and pipelines are designed to have few of these
  - Canonical example: all insns forced to go through the M stage

Control Dependences and Branch Prediction

What About Branches?
[Datapath figure: branch target computation and comparison in the X stage]
- Control hazards options
  - Could just stall to wait for the branch outcome (two-cycle penalty)
  - Fetch past branch insns before the branch outcome is known
    - Default: assume "not-taken" (at fetch, can't tell it's a branch)
Big Idea: Speculative Execution
- Speculation: "risky transactions on chance of profit"
- Speculative execution
  - Execute before all parameters are known with certainty
  - Correct speculation
    - Avoid stall, improve performance
  - Incorrect speculation (mis-speculation)
    - Must abort/flush/squash incorrect insns
    - Must undo incorrect changes (recover pre-speculation state)
  - The "game": [% correct * gain] - [(1 - % correct) * penalty]
- Control speculation: speculation aimed at control hazards
  - Unknown parameter: are these the correct insns to execute next?

Branch Recovery
[Datapath figure: nops written into F/D and D/X]
- Branch recovery: what to do when a branch is actually taken
  - Insns that will be written into F/D and D/X are wrong
  - Flush them, i.e., replace them with nops
  - They haven't written permanent state yet (regfile, DMem)
  - Two cycle penalty for taken branches

Control Speculation and Recovery
- Correct:
    addi r1,1->r3    F D X M W
    bnez r3,targ       F D X M W
    st r6->[r7]          F D X M W    (speculative)
    mul r8,r9->r10         F D X M W  (speculative)
- Mis-speculation recovery: what to do on a wrong guess
  - Not too painful in a short, in-order pipeline
  - Branch resolves in X
  - Younger insns (in F, D) haven't changed permanent state
  - Flush insns currently in F/D and D/X (i.e., replace with nops)
- Recovery:
    addi r1,1->r3    F D X M W
    bnez r3,targ       F D X M W
    st r6->[r7]          F D -- -- --
    mul r8,r9->r10         F -- -- -- --
    targ: add r4,r5->r4      F D X M W

Branch Performance
- Back of the envelope calculation
  - Branch: 20%, load: 20%, store: 10%, other: 50%
  - Say, 75% of branches are taken
- CPI = 1 + 20% * 75% * 2 = 1 + 0.20 * 0.75 * 2 = 1.3
  - Branches cause a 30% slowdown
  - Even worse with deeper pipelines
  - How do we reduce this penalty?
Reducing Penalty: Fast Branches
[Datapath figure: "<> 0" comparison moved into the decode stage]
- Fast branch: targets the control-hazard penalty
  - Basically, branch insns that can resolve at D, not X
  - Test must be a comparison to zero or equality, no time for the ALU
  - New taken branch penalty is 1
  - Additional comparison insns (e.g., cmplt, slt) for complex tests
  - Must bypass into the decode stage now, too
    bnez r3,targ         F D X M W
    st r6->[r7]            F D -- -- --
    targ: add r4,r5,r4       F D X M W

Fast Branch Performance
- Assume: Branch: 20%, 75% of branches are taken
  - CPI = 1 + 20% * 75% * 1 = 1 + 0.20*0.75*1 = 1.15
  - 15% slowdown (better than the 30% from before)
- But wait, fast branches assume only simple comparisons
  - Fine for MIPS
  - But not fine for ISAs with "branch if $1 > $2" operations
- In such cases, say 25% of branches require an extra insn
  - CPI = 1 + (20% * 75% * 1) + 20%*25%*1 (extra insn) = 1.2
- Example of ISA and micro-architecture interaction
  - Type of branch instructions
  - What about condition codes?

Fewer Mispredictions: Branch Prediction
[Datapath figure: branch predictor (BP) at fetch, target (TG) checked when the branch resolves, nops injected on a flush]
- Dynamic branch prediction: hardware guesses the outcome
  - Start fetching from the guessed address
  - Flush on mis-prediction
Branch Prediction Performance
- Parameters
  - Branch: 20%, load: 20%, store: 10%, other: 50%
  - 75% of branches are taken
- Dynamic branch prediction
  - Branches predicted with 95% accuracy
  - CPI = 1 + 20% * 5% * 2 = 1.02

Dynamic Branch Prediction Components
[Datapath figure: I$, BP, regfile, D$]
- Step #1: is it a branch?
  - Easy after decode...
- Step #2: is the branch taken or not taken?
  - Direction predictor (applies to conditional branches only)
  - Predicts taken/not-taken
- Step #3: if the branch is taken, where does it go?
  - Easy after decode...

Branch Direction Prediction
- Learn from the past, predict the future
  - Record the past in a hardware structure
- Direction predictor (DIRP)
  - Maps a conditional-branch PC to a taken/not-taken (T/N) decision
  - Individual conditional branches are often biased or weakly biased
    - 90%+ one way or the other is considered "biased"
    - Why? Loop back edges, checking for uncommon conditions
- Branch history table (BHT): simplest predictor
  - PC indexes a table of bits (0 = N, 1 = T), no tags
  - Essentially: the branch will go the same way it went last time
  - [Figure: PC bits [31:10] unused, [9:2] index the BHT, [1:0] ignored; the BHT outputs T or NT]
  - What about aliasing?
    - Two PCs with the same lower bits?
    - No problem, it is just a prediction!

Branch History Table (BHT)
- Branch history table (BHT): simplest direction predictor
  - PC indexes a table of bits (0 = N, 1 = T), no tags
  - Essentially: the branch will go the same way it went last time
  - Problem: inner loop branch below
      for (i=0;i<100;i++)
        for (j=0;j<3;j++)
          // whatever
  - Two "built-in" mis-predictions per inner loop iteration
  - Branch predictor "changes its mind too quickly"

  Time  State  Prediction  Outcome  Result?
  1     N      N           T        Wrong
  2     T      T           T        Correct
  3     T      T           T        Correct
  4     T      T           N        Wrong
  5     N      N           T        Wrong
  6     T      T           T        Correct
  7     T      T           T        Correct
  8     T      T           N        Wrong
  9     N      N           T        Wrong
  10    T      T           T        Correct
  11    T      T           T        Correct
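The one-bit branch history table described above can be simulated in a few lines. This is a minimal sketch: the class and helper names are invented for illustration, the table is indexed with PC bits [9:2] as in the figure, and the outcome stream `TTTN` repeated three times models the inner-loop branch pattern from the table.

```python
class BranchHistoryTable:
    """Untagged table of 1-bit entries: each stores the last outcome seen."""

    def __init__(self, index_bits=8):
        self.mask = (1 << index_bits) - 1
        self.table = [False] * (1 << index_bits)  # False = N, True = T

    def _index(self, pc):
        return (pc >> 2) & self.mask  # drop bits [1:0], keep bits [9:2]

    def predict(self, pc):
        return self.table[self._index(pc)]

    def update(self, pc, taken):
        self.table[self._index(pc)] = taken


def count_mispredictions(predictor, pc, outcomes):
    wrong = 0
    for taken in outcomes:
        if predictor.predict(pc) != taken:
            wrong += 1
        predictor.update(pc, taken)
    return wrong


# Inner-loop pattern from the table, repeated three times
outcomes = [c == "T" for c in "TTTN" * 3]
bht = BranchHistoryTable()
print(count_mispredictions(bht, 0x1000, outcomes))  # 6

# Aliasing: PCs that differ only above bit 9 share an entry (no tags)
print(bht._index(0x1000) == bht._index(0x1400))  # True
```

Six wrong guesses in twelve outcomes is two per pass through the pattern, matching the "two built-in mis-predictions" noted above.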
Two-Bit Saturating Counters (2bc)
- Two-bit saturating counters (2bc) [Smith 1981]
  - Replace each single-bit prediction
    - (0,1,2,3) = (N,n,t,T)
  - Adds "hysteresis"
    - Force the predictor to mis-predict twice before "changing its mind"
  - One mispredict each loop execution (rather than two)
    - Fixes this pathology (which is not contrived, by the way)
  - Can we do even better?

  Time  State  Prediction  Outcome  Result?
  1     N      N           T        Wrong
  2     n      N           T        Wrong
  3     t      T           T        Correct
  4     T      T           N        Wrong
  5     t      T           T        Correct
  6     T      T           T        Correct
  7     T      T           T        Correct
  8     T      T           N        Wrong
  9     t      T           T        Correct
  10    T      T           T        Correct
  11    T      T           T        Correct
  12    T      T           N        Wrong

Correlated Predictor
- Correlated (two-level) predictor [Patt 1991]
  - Exploits the observation that branch outcomes are correlated
  - Maintains a separate prediction per (PC, BHR) pair
    - Branch history register (BHR): recent branch outcomes
  - Simple working example: assume the program has one branch
    - BHT: one 1-bit DIRP entry
    - BHT + 2-bit BHR: 2^2 = 4 1-bit DIRP entries
  - Why didn't we do better?
    - BHT not long enough to capture the pattern

  Time  Pattern  State (NN NT TN TT)  Prediction  Outcome  Result?
  1     NN       N  N  N  N           N           T        Wrong
  2     NT       T  N  N  N           N           T        Wrong
  3     TT       T  T  N  N           N           T        Wrong
  4     TT       T  T  N  T           T           N        Wrong
  5     TN       T  T  N  N           N           T        Wrong
  6     NT       T  T  T  N           T           T        Correct
  7     TT       T  T  T  N           N           T        Wrong
  8     TT       T  T  T  T           T           N        Wrong
  9     TN       T  T  T  N           T           T        Correct
  10    NT       T  T  T  N           T           T        Correct
  11    TT       T  T  T  N           N           T        Wrong
  12    TT       T  T  T  T           T           N        Wrong

Correlated Predictor: 3 Bit Pattern
- Try 3 bits of history
- 2^3 DIRP entries per pattern

  Time  Pattern  State (NNN NNT NTN NTT TNN TNT TTN TTT)  Prediction  Outcome  Result?
  1     NNN      N N N N N N N N                          N           T        Wrong
  2     NNT      T N N N N N N N                          N           T        Wrong
  3     NTT      T T N N N N N N                          N           T        Wrong
  4     TTT      T T N T N N N N                          N           N        Correct
  5     TTN      T T N T N N N N                          N           T        Wrong
  6     TNT      T T N T N N T N                          N           T        Wrong
  7     NTT      T T N T N T T N                          T           T        Correct
  8     TTT      T T N T N T T N                          N           N        Correct
  9     TTN      T T N T N T T N                          T           T        Correct
  10    TNT      T T N T N T T N                          T           T        Correct
  11    NTT      T T N T N T T N                          T           T        Correct
  12    TTT      T T N T N T T N                          N           N        Correct

- No mis-predictions after the predictor learns all the relevant patterns!

Correlated Predictor Design
- Design choice I: one global BHR or one per PC (local)?
  - Each one captures different kinds of patterns
  - Global is better, captures local patterns for tight loop branches
- Design choice II: how many history bits (BHR size)?
  - Tricky one
  - Given unlimited resources, longer BHRs are better, but...
  - BHT utilization decreases
    - Many history patterns are never seen
    - Many branches are history independent (don't care)
    - PC xor BHR allows multiple PCs to dynamically share the BHT
    - BHR length < log2(BHT size)
  - Predictor takes longer to train
  - Typical length: 8 to 12
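The two-bit counter and correlated predictor tables above can be regenerated by simulation. The sketch below models a single branch with the repeating outcome pattern T, T, T, N and reports the times of the wrong predictions. Function names are illustrative, counters and table entries start at N as in the tables, and the correlated version keeps one 1-bit entry per history pattern.

```python
def run_two_bit_counter(outcomes, state=0):
    """Return 1-based times of mispredictions; states 0..3 = N, n, t, T."""
    wrong = []
    for time, taken in enumerate(outcomes, start=1):
        predicted_taken = state >= 2
        if predicted_taken != taken:
            wrong.append(time)
        # Saturating increment on taken, saturating decrement on not-taken
        state = min(state + 1, 3) if taken else max(state - 1, 0)
    return wrong


def run_correlated(outcomes, history_bits):
    """One branch, one 1-bit entry per history pattern, all starting at N."""
    table = {}                            # history tuple -> last outcome seen
    history = (False,) * history_bits     # BHR starts as all N
    wrong = []
    for time, taken in enumerate(outcomes, start=1):
        if table.get(history, False) != taken:
            wrong.append(time)
        table[history] = taken
        history = history[1:] + (taken,)  # shift the newest outcome into the BHR
    return wrong


outcomes = [c == "T" for c in "TTTN" * 3]
print(run_two_bit_counter(outcomes))  # [1, 2, 4, 8, 12]
print(run_correlated(outcomes, 2))    # [1, 2, 3, 4, 5, 7, 8, 11, 12]
print(run_correlated(outcomes, 3))    # [1, 2, 3, 5, 6]
```

With two bits of history the pattern TT is followed by both T and N, so that entry keeps flipping. Three bits separate NTT from TTT, and the mispredictions stop after warm-up.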
Hybrid Predictor
[Figure: PC and BHR feed a simple BHT, a correlated BHT, and a chooser]
- Hybrid (tournament) predictor [McFarling 1993]
  - Attacks the correlated predictor BHT capacity problem
  - Idea: combine two predictors
    - Simple BHT predicts history independent branches
    - Correlated predictor predicts only branches that need history
    - Chooser assigns branches to one predictor or the other
    - Branches start in the simple BHT, move on reaching a mis-prediction threshold
  - Correlated predictor can be made smaller, handles fewer branches
  - 90 to 95% accuracy

When to Perform Branch Prediction?
- Option #1: During Decode
  - Look at the instruction opcode to determine branch instructions
  - Can calculate the next PC from the instruction (for PC-relative branches)
  - One cycle "mis-fetch" penalty even if the branch predictor is correct
    bnez r3,targ         F D X M W
    targ: add r4,r5,r4       F D X M W
- Option #2: During Fetch?
  - How do we do that?

Revisiting Branch Prediction Components
[Datapath figure: I$, BP, regfile, D$]
- Step #1: is it a branch?
  - Easy after decode... during fetch: predictor
- Step #2: is the branch taken or not taken?
  - Direction predictor (as before)
- Step #3: if the branch is taken, where does it go?
  - Branch target predictor (BTB)
  - Supplies the target PC if the branch is taken

Branch Target Buffer (BTB)
- As before: learn from the past, predict the future
  - Record the past branch targets in a hardware structure
- Branch target buffer (BTB):
  - "Guess" the future PC based on past behavior
  - "Last time the branch X was taken, it went to address Y"
    - "So, in the future, if address X is fetched, fetch address Y next"
- Operation
  - A small RAM: address = PC, data = target-PC
  - Access at Fetch in parallel with instruction memory
    - predicted-target = BTB[hash(PC)]
  - Updated at X whenever target != predicted-target
    - BTB[hash(PC)] = target
  - Hash function is typically just extracting lower bits (as before)
  - Aliasing? No problem, this is only a prediction
Branch Target Buffer (continued)
[Figure: BTB entries hold a tag and a target PC; a tag comparison against the fetch PC selects the predicted target or PC+4]
- At Fetch, how does an insn know it's a branch & should read the BTB? It doesn't have to...
  - All insns access the BTB in parallel with Imem Fetch
- Key idea: use the BTB to predict which insns are branches
  - Implement by "tagging" each entry with its corresponding PC
  - Update the BTB on every taken branch insn, record the target PC:
    - BTB[PC].tag = PC, BTB[PC].target = target of branch
  - All insns access at Fetch in parallel with Imem
    - Check for a tag match, which signifies the insn at that PC is a branch
    - Predicted PC = (BTB[PC].tag == PC) ? BTB[PC].target : PC+4

Why Does a BTB Work?
- Because most control insns use direct targets
  - Target encoded in the insn itself -> same taken target every time
- What about indirect targets?
  - Target held in a register -> can be different each time
  - Two indirect call idioms
    - Dynamically linked functions (DLLs): target always the same
    - Dynamically dispatched (virtual) functions: hard but uncommon
  - Also two indirect unconditional jump idioms
    - Switches: hard but uncommon
    - Function returns: hard and common, but...

Return Address Stack (RAS)
[Figure: RAS beside the BTB, both supplying a predicted target to fetch]
- Return address stack (RAS)
  - Call instruction? RAS[TopOfStack++] = PC+4
  - Return instruction? Predicted-target = RAS[--TopOfStack]
  - Q: how can you tell if an insn is a call/return before decoding it?
    - Accessing the RAS on every insn BTB-style doesn't work
  - Answer: another predictor (or put them in the BTB marked as "return")
    - Or, pre-decode bits in insn mem, written when first executed

Putting It All Together
[Figure: BTB (tag, target), RAS, and BHT (taken/not-taken) combine to select the predicted target]
- BTB & branch direction predictor during fetch
- If the branch prediction is correct, no taken branch penalty
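The tagged BTB lookup and the return address stack described above can be modelled roughly as follows. This is a sketch, not a hardware description: the class names are invented, the BTB is direct-mapped on low PC bits, instructions are assumed to be 4 bytes, and the RAS is an unbounded Python list where a real one would have a small fixed size.

```python
class BranchTargetBuffer:
    """Direct-mapped, tagged BTB indexed by low-order PC bits."""

    def __init__(self, index_bits=4):
        self.mask = (1 << index_bits) - 1
        self.entries = [None] * (1 << index_bits)  # each entry: (tag, target)

    def _index(self, pc):
        return (pc >> 2) & self.mask

    def predict_next_pc(self, pc):
        entry = self.entries[self._index(pc)]
        if entry is not None and entry[0] == pc:  # tag match: predicted branch
            return entry[1]
        return pc + 4                             # otherwise fall through

    def record_taken_branch(self, pc, target):
        self.entries[self._index(pc)] = (pc, target)


class ReturnAddressStack:
    """Push the return address on a call, pop it to predict a return."""

    def __init__(self):
        self.stack = []

    def on_call(self, pc):
        self.stack.append(pc + 4)

    def predict_return(self):
        return self.stack.pop() if self.stack else None


btb = BranchTargetBuffer()
print(hex(btb.predict_next_pc(0x100)))  # 0x104 (no entry yet)
btb.record_taken_branch(0x100, 0x200)
print(hex(btb.predict_next_pc(0x100)))  # 0x200 (tag match)
print(hex(btb.predict_next_pc(0x140)))  # 0x144 (same index, tag mismatch)

ras = ReturnAddressStack()
ras.on_call(0x100)
ras.on_call(0x200)
print(hex(ras.predict_return()))  # 0x204 (innermost call returns first)
```

The tag is what lets every fetched instruction probe the BTB safely: an aliasing non-branch fails the tag check and falls through to PC+4.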
Branch Prediction Performance
- Dynamic branch prediction
  - 20% of instructions are branches
  - Simple predictor: branches predicted with 75% accuracy
    - CPI = 1 + (20% * 25% * 2) = 1.1
  - More advanced predictor: 95% accuracy
    - CPI = 1 + (20% * 5% * 2) = 1.02
- Branch mis-predictions are still a big problem though
  - Pipelines are long: typical mis-prediction penalty is 10+ cycles
  - For cores that do more per cycle, mis-predictions are more costly (later)

Research: Perceptron Predictor
[Figure: PC selects a row of coefficients F; the sum of F_i * BHR_i is compared against a threshold]
- Perceptron predictor [Jimenez]
  - Attacks the BHR size problem using a machine learning approach
  - BHT replaced by a table of function coefficients F_i (signed)
  - Predict taken if Σ(BHR_i * F_i) > threshold
  - Table size #PC * |BHR| * |F| (can use a long BHR: ~60 bits)
    - Equivalent correlated predictor would be #PC * 2^|BHR|
  - How does it learn? Update F_i when the branch is taken
    - BHR_i == 1 ? F_i++ : F_i--
    - "Don't care" F_i bits stay near 0, important F_i bits saturate
  - Hybrid BHT/perceptron accuracy: 95 to 98%

More Research: GEHL Predictor
- Problem with both the correlated predictor and the perceptron
  - Same BHT real-estate dedicated to the 1st history bit (1 column) as to the 2nd, 3rd, 10th, 60th
  - Not a good use of space: the 1st bit is much more important than the 60th
- GEometric History-Length predictor [Seznec, ISCA'05]
  - Multiple BHTs, indexed by geometrically longer BHRs (0, 4, 16, 32)
  - BHTs are (partially) tagged, no separate "chooser"
  - Predict: use the matching entry from the BHT with the longest BHR
  - Mis-predict: create an entry in a BHT with a longer BHR
  - Only 25% of BHT used for bits 16-32 (not 50%)
    - Helps amortize the cost of tagging
  - Trains quickly
  - 95-97% accurate

Championship Branch Prediction
- CBP
  - Workshop held in conjunction with MICRO
  - Submitted code is tested on standard branch traces
  - Highest prediction accuracy wins
- Two tracks
  - Idealistic: predictor simulator must run in under 2 hours
  - Realistic: predictor must synthesize into 32KB + 256 bits or less
- 2006 winners
  - Realistic: L-TAGE (GEHL follow-on)
  - Idealistic: GTL (another GEHL follow-on)
Pipeline Depth
- Trend had been to deeper pipelines
  - 486: 5 stages (50+ gate delays / clock)
  - Pentium: 7 stages
  - Pentium II/III: 12 stages
  - Pentium 4: 22 stages (~10 gate delays / clock), "super-pipelining"
  - Core1/2: 14 stages
- Increasing pipeline depth
  - Increases clock frequency (reduces period)
    - But doubling the stages reduces the clock period by less than 2x
  - Decreases IPC (increases CPI)
    - Branch mis-prediction penalty becomes longer
    - Non-bypassed data hazard stalls become longer
  - At some point, actually causes performance to decrease, but when?
    - 1GHz Pentium 4 was slower than 800 MHz Pentium III
  - "Optimal" pipeline depth is program and technology specific

Summary
[Figure: system layers: App, App, App / System software / Mem, CPU, I/O]
- Principles of pipelining
- Effects of overhead and hazards
- Pipeline diagrams
- Data hazards
- Stalling and bypassing
- Control hazards
- Branch prediction
Chpter 2 Newton believe tht light ws me up of smll prticles. This point ws ebte by scientists for mny yers n it ws not until the 1800 s when series of experiments emonstrte wve nture of light. (But be
More informationEngineer To Engineer Note
Engineer To Engineer Note EE-169 Technicl Notes on using Anlog Devices' DSP components nd development tools Contct our technicl support by phone: (800) ANALOG-D or e-mil: dsp.support@nlog.com Or visit
More informationUnit #9 : Definite Integral Properties, Fundamental Theorem of Calculus
Unit #9 : Definite Integrl Properties, Fundmentl Theorem of Clculus Gols: Identify properties of definite integrls Define odd nd even functions, nd reltionship to integrl vlues Introduce the Fundmentl
More informationData Flow on a Queue Machine. Bruno R. Preiss. Copyright (c) 1987 by Bruno R. Preiss, P.Eng. All rights reserved.
Dt Flow on Queue Mchine Bruno R. Preiss 2 Outline Genesis of dt-flow rchitectures Sttic vs. dynmic dt-flow rchitectures Pseudo-sttic dt-flow execution model Some dt-flow mchines Simple queue mchine Prioritized
More informationSmall Business Networking
Why network is n essentil productivity tool for ny smll business Effective technology is essentil for smll businesses looking to increse the productivity of their people nd business. Introducing technology
More informationMidterm 2 Sample solution
Nme: Instructions Midterm 2 Smple solution CMSC 430 Introduction to Compilers Fll 2012 November 28, 2012 This exm contins 9 pges, including this one. Mke sure you hve ll the pges. Write your nme on the
More informationUnit 5 Vocabulary. A function is a special relationship where each input has a single output.
MODULE 3 Terms Definition Picture/Exmple/Nottion 1 Function Nottion Function nottion is n efficient nd effective wy to write functions of ll types. This nottion llows you to identify the input vlue with
More informationComplete Coverage Path Planning of Mobile Robot Based on Dynamic Programming Algorithm Peng Zhou, Zhong-min Wang, Zhen-nan Li, Yang Li
2nd Interntionl Conference on Electronic & Mechnicl Engineering nd Informtion Technology (EMEIT-212) Complete Coverge Pth Plnning of Mobile Robot Bsed on Dynmic Progrmming Algorithm Peng Zhou, Zhong-min
More informationFunctor (1A) Young Won Lim 8/2/17
Copyright (c) 2016-2017 Young W. Lim. Permission is grnted to copy, distribute nd/or modify this document under the terms of the GNU Free Documenttion License, Version 1.2 or ny lter version published
More informationSmall Business Networking
Why network is n essentil productivity tool for ny smll business Effective technology is essentil for smll businesses looking to increse the productivity of their people nd business. Introducing technology
More informationBruce McCarl's GAMS Newsletter Number 37
Bruce McCrl's GAMS Newsletter Number 37 This newsletter covers 1 Uptes to Expne GAMS User Guie by McCrl et l.... 1 2 YouTube vieos... 1 3 Explntory text for tuple set elements... 1 4 Reing sets using GDXXRW...
More informationCOMP 423 lecture 11 Jan. 28, 2008
COMP 423 lecture 11 Jn. 28, 2008 Up to now, we hve looked t how some symols in n lphet occur more frequently thn others nd how we cn sve its y using code such tht the codewords for more frequently occuring
More informationFunctor (1A) Young Won Lim 10/5/17
Copyright (c) 2016-2017 Young W. Lim. Permission is grnted to copy, distribute nd/or modify this document under the terms of the GNU Free Documenttion License, Version 1.2 or ny lter version published
More informationCaches I. CSE 351 Spring Instructor: Ruth Anderson
L16: Cches I Cches I CSE 351 Spring 2017 Instructor: Ruth Anderson Teching Assistnts: Dyln Johnson Kevin Bi Linxing Preston Jing Cody Ohlsen Yufng Sun Joshu Curtis L16: Cches I Administrivi Homework 3,
More informationExtending Finite Automata to Efficiently Match Perl-Compatible Regular Expressions
Extening Finite Automt to Efficiently Mtch Perl-Comptible Regulr Expressions Michel Becchi Wshington University Computer Science n Engineering St. Louis, MO 63130-4899 mbecchi@cse.wustl.eu ABSTRACT Regulr
More informationSmall Business Networking
Why network is n essentil productivity tool for ny smll business Effective technology is essentil for smll businesses looking to increse the productivity of their people nd business. Introducing technology
More informationCSCI 104. Rafael Ferreira da Silva. Slides adapted from: Mark Redekopp and David Kempe
CSCI 0 fel Ferreir d Silv rfsilv@isi.edu Slides dpted from: Mrk edekopp nd Dvid Kempe LOG STUCTUED MEGE TEES Series Summtion eview Let n = + + + + k $ = #%& #. Wht is n? n = k+ - Wht is log () + log ()
More informationSmall Business Networking
Why network is n essentil productivity tool for ny smll business Effective technology is essentil for smll businesses looking to increse the productivity of their people nd business. Introducing technology
More informationSmall Business Networking
Why network is n essentil productivity tool for ny smll business Effective technology is essentil for smll businesses looking to increse the productivity of their people nd processes. Introducing technology
More informationCaches I. CSE 351 Autumn Instructor: Justin Hsia
L01: Intro, L01: L16: Combintionl Introduction Cches I Logic CSE369, CSE351, Autumn 2016 Cches I CSE 351 Autumn 2016 Instructor: Justin Hsi Teching Assistnts: Chris M Hunter Zhn John Kltenbch Kevin Bi
More informationUT1553B BCRT True Dual-port Memory Interface
UTMC APPICATION NOTE UT553B BCRT True Dul-port Memory Interfce INTRODUCTION The UTMC UT553B BCRT is monolithic CMOS integrted circuit tht provides comprehensive MI-STD- 553B Bus Controller nd Remote Terminl
More informationDynamic Programming. Andreas Klappenecker. [partially based on slides by Prof. Welch] Monday, September 24, 2012
Dynmic Progrmming Andres Klppenecker [prtilly bsed on slides by Prof. Welch] 1 Dynmic Progrmming Optiml substructure An optiml solution to the problem contins within it optiml solutions to subproblems.
More information2 Computing all Intersections of a Set of Segments Line Segment Intersection
15-451/651: Design & Anlysis of Algorithms Novemer 14, 2016 Lecture #21 Sweep-Line nd Segment Intersection lst chnged: Novemer 8, 2017 1 Preliminries The sweep-line prdigm is very powerful lgorithmic design
More informationSection 10.4 Hyperbolas
66 Section 10.4 Hyperbols Objective : Definition of hyperbol & hyperbols centered t (0, 0). The third type of conic we will study is the hyperbol. It is defined in the sme mnner tht we defined the prbol
More informationFig.25: the Role of LEX
The Lnguge for Specifying Lexicl Anlyzer We shll now study how to uild lexicl nlyzer from specifiction of tokens in the form of list of regulr expressions The discussion centers round the design of n existing
More informationECE 550D Fundamentals of Computer Systems and Engineering. Fall 2017
ECE 550D Funamentals of Computer Systems an Engineering Fall 017 Datapaths Prof. John Boar Duke University Slies are erive from work by Profs. Tyler Bletch an Anrew Hilton (Duke) an Amir Roth (Penn) What
More informationGeometric transformations
Geometric trnsformtions Computer Grphics Some slides re bsed on Shy Shlom slides from TAU mn n n m m T A,,,,,, 2 1 2 22 12 1 21 11 Rows become columns nd columns become rows nm n n m m A,,,,,, 1 1 2 22
More informationSmall Business Networking
Why network is n essentil productivity tool for ny smll business Effective technology is essentil for smll businesses looking to increse the productivity of their people nd processes. Introducing technology
More informationP(r)dr = probability of generating a random number in the interval dr near r. For this probability idea to make sense we must have
Rndom Numers nd Monte Crlo Methods Rndom Numer Methods The integrtion methods discussed so fr ll re sed upon mking polynomil pproximtions to the integrnd. Another clss of numericl methods relies upon using
More informationMany analog implementations of CPG exist, typically using operational amplifier or
FPGA Implementtion of Centrl Pttern Genertor By Jmes J Lin Introuction: Mny nlog implementtions of CPG exist, typiclly using opertionl mplifier or trnsistor level circuits. These types of circuits hve
More informationControl Hazards. Branch Recovery. Control Hazard Pipeline Diagram. Branch Performance
Control Hazards ranch Recovery D/
More informationSmall Business Networking
Why network is n essentil productivity tool for ny smll business Effective technology is essentil for smll businesses looking to increse the productivity of their people nd processes. Introducing technology
More informationGeorge Boole. IT 3123 Hardware and Software Concepts. Switching Algebra. Boolean Functions. Boolean Functions. Truth Tables
George Boole IT 3123 Hrdwre nd Softwre Concepts My 28 Digitl Logic The Little Mn Computer 1815 1864 British mthemticin nd philosopher Mny contriutions to mthemtics. Boolen lger: n lger over finite sets
More information6.2 Volumes of Revolution: The Disk Method
mth ppliction: volumes by disks: volume prt ii 6 6 Volumes of Revolution: The Disk Method One of the simplest pplictions of integrtion (Theorem 6) nd the ccumultion process is to determine so-clled volumes
More informationSmall Business Networking
Why network is n essentil productivity tool for ny smll business Effective technology is essentil for smll businesses looking to increse the productivity of their people nd processes. Introducing technology
More informationAlgorithm Design (5) Text Search
Algorithm Design (5) Text Serch Tkshi Chikym School of Engineering The University of Tokyo Text Serch Find sustring tht mtches the given key string in text dt of lrge mount Key string: chr x[m] Text Dt:
More informationCS321 Languages and Compiler Design I. Winter 2012 Lecture 5
CS321 Lnguges nd Compiler Design I Winter 2012 Lecture 5 1 FINITE AUTOMATA A non-deterministic finite utomton (NFA) consists of: An input lphet Σ, e.g. Σ =,. A set of sttes S, e.g. S = {1, 3, 5, 7, 11,
More informationIntroduction to hardware design using VHDL
Introuction to hrwre esign using VHDL Tim Güneysu n Nele Mentens ECC school Novemer 11, 2017, Nijmegen Outline Implementtion pltforms Introuction to VHDL Hrwre tutoril 1 Implementtion pltforms Microprocessor
More information12-B FRACTIONS AND DECIMALS
-B Frctions nd Decimls. () If ll four integers were negtive, their product would be positive, nd so could not equl one of them. If ll four integers were positive, their product would be much greter thn
More informationCaches I. CSE 351 Autumn 2018
Cches I CSE 351 Autumn 2018 Instructors: Mx Willsey Luis Ceze Teching Assistnts: Britt Henderson Luks Joswik Josie Lee Wei Lin Dniel Snitkovsky Luis Veg Kory Wtson Ivy Yu Alt text: I looked t some of the
More informationSIMPLIFYING ALGEBRA PASSPORT.
SIMPLIFYING ALGEBRA PASSPORT www.mthletics.com.u This booklet is ll bout turning complex problems into something simple. You will be ble to do something like this! ( 9- # + 4 ' ) ' ( 9- + 7-) ' ' Give
More informationFault injection attacks on cryptographic devices and countermeasures Part 2
Fult injection ttcks on cryptogrphic devices nd countermesures Prt Isrel Koren Deprtment of Electricl nd Computer Engineering University of Msschusetts Amherst, MA Countermesures - Exmples Must first detect
More informationSlides for Data Mining by I. H. Witten and E. Frank
Slides for Dt Mining y I. H. Witten nd E. Frnk Simplicity first Simple lgorithms often work very well! There re mny kinds of simple structure, eg: One ttriute does ll the work All ttriutes contriute eqully
More informationStack Manipulation. Other Issues. How about larger constants? Frame Pointer. PowerPC. Alternative Architectures
Other Issues Stck Mnipultion support for procedures (Refer to section 3.6), stcks, frmes, recursion mnipulting strings nd pointers linkers, loders, memory lyout Interrupts, exceptions, system clls nd conventions
More informationMA1008. Calculus and Linear Algebra for Engineers. Course Notes for Section B. Stephen Wills. Department of Mathematics. University College Cork
MA1008 Clculus nd Liner Algebr for Engineers Course Notes for Section B Stephen Wills Deprtment of Mthemtics University College Cork s.wills@ucc.ie http://euclid.ucc.ie/pges/stff/wills/teching/m1008/ma1008.html
More informationAlignment of Long Sequences. BMI/CS Spring 2012 Colin Dewey
Alignment of Long Sequences BMI/CS 776 www.biostt.wisc.edu/bmi776/ Spring 2012 Colin Dewey cdewey@biostt.wisc.edu Gols for Lecture the key concepts to understnd re the following how lrge-scle lignment
More informationEECS 281: Homework #4 Due: Thursday, October 7, 2004
EECS 28: Homework #4 Due: Thursdy, October 7, 24 Nme: Emil:. Convert the 24-bit number x44243 to mime bse64: QUJD First, set is to brek 8-bit blocks into 6-bit blocks, nd then convert: x44243 b b 6 2 9
More informationWhat do all those bits mean now? Number Systems and Arithmetic. Introduction to Binary Numbers. Questions About Numbers
Wht do ll those bits men now? bits (...) Number Systems nd Arithmetic or Computers go to elementry school instruction R-formt I-formt... integer dt number text chrs... floting point signed unsigned single
More informationScanner Termination. Multi Character Lookahead. to its physical end. Most parsers require an end of file token. Lex and Jlex automatically create an
Scnner Termintion A scnner reds input chrcters nd prtitions them into tokens. Wht hppens when the end of the input file is reched? It my be useful to crete n Eof pseudo-chrcter when this occurs. In Jv,
More informationEssential Question What are some of the characteristics of the graph of a rational function?
8. TEXAS ESSENTIAL KNOWLEDGE AND SKILLS A..A A..G A..H A..K Grphing Rtionl Functions Essentil Question Wht re some of the chrcteristics of the grph of rtionl function? The prent function for rtionl functions
More informationMATH 25 CLASS 5 NOTES, SEP
MATH 25 CLASS 5 NOTES, SEP 30 2011 Contents 1. A brief diversion: reltively prime numbers 1 2. Lest common multiples 3 3. Finding ll solutions to x + by = c 4 Quick links to definitions/theorems Euclid
More informationStack. A list whose end points are pointed by top and bottom
4. Stck Stck A list whose end points re pointed by top nd bottom Insertion nd deletion tke plce t the top (cf: Wht is the difference between Stck nd Arry?) Bottom is constnt, but top grows nd shrinks!
More informationQuestions About Numbers. Number Systems and Arithmetic. Introduction to Binary Numbers. Negative Numbers?
Questions About Numbers Number Systems nd Arithmetic or Computers go to elementry school How do you represent negtive numbers? frctions? relly lrge numbers? relly smll numbers? How do you do rithmetic?
More informationCPSC 213. Polymorphism. Introduction to Computer Systems. Readings for Next Two Lectures. Back to Procedure Calls
Redings for Next Two Lectures Text CPSC 213 Switch Sttements, Understnding Pointers - 2nd ed: 3.6.7, 3.10-1st ed: 3.6.6, 3.11 Introduction to Computer Systems Unit 1f Dynmic Control Flow Polymorphism nd
More informationRay surface intersections
Ry surfce intersections Some primitives Finite primitives: polygons spheres, cylinders, cones prts of generl qudrics Infinite primitives: plnes infinite cylinders nd cones generl qudrics A finite primitive
More informationUNIT 11. Query Optimization
UNIT Query Optimiztion Contents Introduction to Query Optimiztion 2 The Optimiztion Process: An Overview 3 Optimiztion in System R 4 Optimiztion in INGRES 5 Implementing the Join Opertors Wei-Png Yng,
More informationTransparent neutral-element elimination in MPI reduction operations
Trnsprent neutrl-element elimintion in MPI reduction opertions Jesper Lrsson Träff Deprtment of Scientific Computing University of Vienn Disclimer Exploiting repetition nd sprsity in input for reducing
More informationSection 3.1: Sequences and Series
Section.: Sequences d Series Sequences Let s strt out with the definition of sequence: sequence: ordered list of numbers, often with definite pttern Recll tht in set, order doesn t mtter so this is one
More informationWhat do all those bits mean now? Number Systems and Arithmetic. Introduction to Binary Numbers. Questions About Numbers
Wht do ll those bits men now? bits (...) Number Systems nd Arithmetic or Computers go to elementry school instruction R-formt I-formt... integer dt number text chrs... floting point signed unsigned single
More informationSmall Business Networking
Why network is n essentil productivity tool for ny smll business Effective technology is essentil for smll businesses looking to increse the productivity of their people nd processes. Introducing technology
More informationReducing Costs with Duck Typing. Structural
Reducing Costs with Duck Typing Structurl 1 Duck Typing In computer progrmming with object-oriented progrmming lnguges, duck typing is lyer of progrmming lnguge nd design rules on top of typing. Typing
More informationExample: 2:1 Multiplexer
Exmple: 2:1 Multiplexer Exmple #1 reg ; lwys @( or or s) egin if (s == 1') egin = ; else egin = ; 1 s B. Bs 114 Exmple: 2:1 Multiplexer Exmple #2 Normlly lwys include egin nd sttements even though they
More informationData-Flow Prescheduling for Large Instruction Windows in Out-of-Order Processors
Dt-Flow Prescheduling for Lrge Instruction Windows in Out-of-Order Processors Pierre Michud, André Seznec IRISA/INRIA Cmpus de Beulieu, 35 Rennes Cedex, Frnce {pmichud, seznec}@iris.fr Abstrct The performnce
More informationLooking up objects in Pastry
Review: Pstry routing tbles 0 1 2 3 4 7 8 9 b c d e f 0 1 2 3 4 7 8 9 b c d e f 0 1 2 3 4 7 8 9 b c d e f 0 2 3 4 7 8 9 b c d e f Row0 Row 1 Row 2 Row 3 Routing tble of node with ID i =1fc s - For ech
More informationINTRODUCTION TO SIMPLICIAL COMPLEXES
INTRODUCTION TO SIMPLICIAL COMPLEXES CASEY KELLEHER AND ALESSANDRA PANTANO 0.1. Introduction. In this ctivity set we re going to introduce notion from Algebric Topology clled simplicil homology. The min
More informationIf you are at the university, either physically or via the VPN, you can download the chapters of this book as PDFs.
Lecture 5 Wlks, Trils, Pths nd Connectedness Reding: Some of the mteril in this lecture comes from Section 1.2 of Dieter Jungnickel (2008), Grphs, Networks nd Algorithms, 3rd edition, which is ville online
More informationDr. D.M. Akbar Hussain
Dr. D.M. Akr Hussin Lexicl Anlysis. Bsic Ide: Red the source code nd generte tokens, it is similr wht humns will do to red in; just tking on the input nd reking it down in pieces. Ech token is sequence
More informationCS 241. Fall 2017 Midterm Review Solutions. October 24, Bits and Bytes 1. 3 MIPS Assembler 6. 4 Regular Languages 7.
CS 241 Fll 2017 Midterm Review Solutions Octoer 24, 2017 Contents 1 Bits nd Bytes 1 2 MIPS Assemly Lnguge Progrmming 2 3 MIPS Assemler 6 4 Regulr Lnguges 7 5 Scnning 9 1 Bits nd Bytes 1. Give two s complement
More informationSimplifying Algebra. Simplifying Algebra. Curriculum Ready.
Simplifying Alger Curriculum Redy www.mthletics.com This ooklet is ll out turning complex prolems into something simple. You will e le to do something like this! ( 9- # + 4 ' ) ' ( 9- + 7-) ' ' Give this
More informationCS143 Handout 07 Summer 2011 June 24 th, 2011 Written Set 1: Lexical Analysis
CS143 Hndout 07 Summer 2011 June 24 th, 2011 Written Set 1: Lexicl Anlysis In this first written ssignment, you'll get the chnce to ply round with the vrious constructions tht come up when doing lexicl
More informationΕΠΛ323 - Θεωρία και Πρακτική Μεταγλωττιστών
ΕΠΛ323 - Θωρία και Πρακτική Μταγλωττιστών Lecture 3 Lexicl Anlysis Elis Athnsopoulos elisthn@cs.ucy.c.cy Recognition of Tokens if expressions nd reltionl opertors if è if then è then else è else relop
More informationToday. Search Problems. Uninformed Search Methods. Depth-First Search Breadth-First Search Uniform-Cost Search
Uninformed Serch [These slides were creted by Dn Klein nd Pieter Abbeel for CS188 Intro to AI t UC Berkeley. All CS188 mterils re vilble t http://i.berkeley.edu.] Tody Serch Problems Uninformed Serch Methods
More informationAddress/Data Control. Port latch. Multiplexer
4.1 I/O PORT OPERATION As discussed in chpter 1, ll four ports of the 8051 re bi-directionl. Ech port consists of ltch (Specil Function Registers P0, P1, P2, nd P3), n output driver, nd n input buffer.
More informationpdfapilot Server 2 Manual
pdfpilot Server 2 Mnul 2011 by clls softwre gmbh Schönhuser Allee 6/7 D 10119 Berlin Germny info@cllssoftwre.com www.cllssoftwre.com Mnul clls pdfpilot Server 2 Pge 2 clls pdfpilot Server 2 Mnul Lst modified:
More informationFall 2018 Midterm 1 October 11, ˆ You may not ask questions about the exam except for language clarifications.
15-112 Fll 2018 Midterm 1 October 11, 2018 Nme: Andrew ID: Recittion Section: ˆ You my not use ny books, notes, extr pper, or electronic devices during this exm. There should be nothing on your desk or
More informationData sharing in OpenMP
Dt shring in OpenMP Polo Burgio polo.burgio@unimore.it Outline Expressing prllelism Understnding prllel threds Memory Dt mngement Dt cluses Synchroniztion Brriers, locks, criticl sections Work prtitioning
More informationAgilent Mass Hunter Software
Agilent Mss Hunter Softwre Quick Strt Guide Use this guide to get strted with the Mss Hunter softwre. Wht is Mss Hunter Softwre? Mss Hunter is n integrl prt of Agilent TOF softwre (version A.02.00). Mss
More informationEngineer-to-Engineer Note
Engineer-to-Engineer Note EE-232 Technicl notes on using Anlog Devices DSPs, processors nd development tools Contct our technicl support t dsp.support@nlog.com nd t dsptools.support@nlog.com Or visit our
More informationComputer Arithmetic Logical, Integer Addition & Subtraction Chapter
Computer Arithmetic Logicl, Integer Addition & Sutrction Chpter 3.-3.3 3.3 EEC7 FQ 25 MIPS Integer Representtion -it signed integers,, e.g., for numeric opertions 2 s s complement: one representtion for
More informationLecture 10 Evolutionary Computation: Evolution strategies and genetic programming
Lecture 10 Evolutionry Computtion: Evolution strtegies nd genetic progrmming Evolution strtegies Genetic progrmming Summry Negnevitsky, Person Eduction, 2011 1 Evolution Strtegies Another pproch to simulting
More informationReadings : Computer Networking. Outline. The Next Internet: More of the Same? Required: Relevant earlier meeting:
Redings 15-744: Computer Networking L-14 Future Internet Architecture Required: Servl pper Extr reding on Mobility First Relevnt erlier meeting: CCN -> Nmed Dt Network 2 Outline The Next Internet: More
More informationEngineer To Engineer Note
Engineer To Engineer Note EE-186 Technicl Notes on using Anlog Devices' DSP components nd development tools Contct our technicl support by phone: (800) ANALOG-D or e-mil: dsp.support@nlog.com Or visit
More information10.5 Graphing Quadratic Functions
0.5 Grphing Qudrtic Functions Now tht we cn solve qudrtic equtions, we wnt to lern how to grph the function ssocited with the qudrtic eqution. We cll this the qudrtic function. Grphs of Qudrtic Functions
More informationa(e, x) = x. Diagrammatically, this is encoded as the following commutative diagrams / X
4. Mon, Sept. 30 Lst time, we defined the quotient topology coming from continuous surjection q : X! Y. Recll tht q is quotient mp (nd Y hs the quotient topology) if V Y is open precisely when q (V ) X
More informationCS311H: Discrete Mathematics. Graph Theory IV. A Non-planar Graph. Regions of a Planar Graph. Euler s Formula. Instructor: Işıl Dillig
CS311H: Discrete Mthemtics Grph Theory IV Instructor: Işıl Dillig Instructor: Işıl Dillig, CS311H: Discrete Mthemtics Grph Theory IV 1/25 A Non-plnr Grph Regions of Plnr Grph The plnr representtion of
More information