Datapath Background. This Unit: (Scalar In-Order) Pipelining. CIS 501 Computer Architecture. Readings


This Unit: (Scalar In-Order) Pipelining
CIS 501 Computer Architecture, Unit 6: Pipelining
[Figure: system stack (App, App, App; System software; CPU; I/O)]
  Principles of pipelining
    Effects of overhead and hazards
    Pipeline diagrams
  Data hazards
    Stalling and bypassing
  Control hazards
    Branch prediction
    Predication

Slides originally developed by Amir Roth with contributions by Milo Martin at University of Pennsylvania, with sources that include University of Wisconsin slides by Mark Hill, Guri Sohi, Jim Smith, and David Wood.
CIS 501 (Martin/Roth): Pipelining

Readings
  H&P Appendix A

Datapath Background

Datapath and Control
[Figure: datapath (PC, I$, register file, ALU, D$) driven by a control block]
  Datapath: implements execute portion of fetch/exec. loop
    Functional units (ALUs), registers, memory interface
  Control: implements decode portion of fetch/execute loop
    Mux selectors, write enable signals regulate flow of data in datapath
    Part of decode involves translating insn opcode into control signals

Single-cycle Datapath
  Single-cycle datapath: true "atomic" fetch/execute loop
    Fetch, decode, execute one complete instruction every cycle
    "Hardwired control": opcode to control signals ROM
    Low CPI: 1 by definition
    Long clock period: to accommodate slowest instruction

Multi-Cycle Datapath
  Multi-cycle datapath: attacks slow clock
    Fetch, decode, execute one complete insn over multiple cycles
    Micro-coded control: "stages" control signals
    Allows insns to take different number of cycles (main point)
    ± Opposite of single-cycle: short clock period, high CPI (think: CISC)

Single-cycle vs. Multi-cycle Performance
  Single-cycle
    Clock period = 50ns, CPI = 1
    Performance = 50ns/insn
  Multi-cycle has opposite performance split of single-cycle
    Shorter clock period
    Higher CPI
  Multi-cycle
    Branch: 20% (3 cycles), load: 20% (5 cycles), ALU: 60% (4 cycles)
    Clock period = 11ns, CPI = (20%*3)+(20%*5)+(60%*4) = 4
      Why is clock period 11ns and not 10ns?
    Performance = 44ns/insn
  Aside: CISC makes perfect sense in multi-cycle datapath
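The single-cycle versus multi-cycle comparison above is just CPI times clock period. A minimal sketch of that arithmetic follows; the function name `time_per_insn` and the `(fraction, cycles)` representation of the instruction mix are assumptions made for illustration, while the numbers come from the slide.

```python
def time_per_insn(clock_ns, mix):
    """Return (CPI, ns per insn) for an instruction mix.

    mix is a list of (fraction_of_insns, cycles_taken) pairs; this
    representation is an illustrative assumption, not from the slides.
    """
    cpi = sum(fraction * cycles for fraction, cycles in mix)
    return cpi, cpi * clock_ns

# Single-cycle: every insn takes one 50ns cycle.
single_cpi, single_ns = time_per_insn(50, [(1.0, 1)])

# Multi-cycle: branch 20% (3 cycles), load 20% (5 cycles), ALU 60% (4 cycles).
multi_cpi, multi_ns = time_per_insn(11, [(0.2, 3), (0.2, 5), (0.6, 4)])

print(f"single-cycle: CPI={single_cpi:.1f}, {single_ns:.0f} ns/insn")
print(f"multi-cycle:  CPI={multi_cpi:.1f}, {multi_ns:.0f} ns/insn")
```

With these inputs the multi-cycle design comes out only slightly ahead of the single-cycle one, which is why the later slides look for a way to get a short clock and a low CPI at once.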

Latency versus Throughput
[Figure: single-cycle timeline (insn0.fetch, dec, exec; insn1.fetch, dec, exec) above multi-cycle timeline (insn0.fetch, insn0.dec, insn0.exec, insn1.fetch, insn1.dec, insn1.exec)]
  Can we have both low CPI and short clock period?
    Not if datapath executes only one insn at a time
  Latency vs. Throughput
    Latency: no good way to make a single insn go faster
    Throughput: fortunately, no one cares about single insn latency
      Goal is to make programs, not individual insns, go faster
      Programs contain billions of insns
    Key: exploit inter-insn parallelism

Pipelining Basics

Pipelining
[Figure: multi-cycle timeline versus pipelined timeline, where insn1.fetch overlaps insn0.dec and insn1.dec overlaps insn0.exec]
  Important performance technique
    Improves instruction throughput rather than instruction latency
  Begin with multi-cycle design
    When insn advances from stage 1 to 2, next insn enters at stage 1
    Form of parallelism: "insn-stage parallelism"
    Maintains illusion of sequential fetch/execute loop
    Individual instruction takes the same number of stages
    But instructions enter and leave at a much faster rate
  Laundry analogy
CIS 371 (Roth/Martin): Pipelining

Five Stage Pipelined Datapath
  Temporary values (PC, IR, A, B, O, D) re-latched every stage
    Why? 5 insns may be in pipeline at once with different PCs
    Notice, PC not latched after ALU stage (not needed later)
  Pipelined control: one single-cycle controller
    Control signals themselves pipelined

Five Stage Pipeline Performance
[Figure: datapath annotated with T_insn-mem, T_regfile, T_ALU, T_data-mem, T_regfile, and their sum T_singlecycle]
  Pipelining: cut datapath into N stages (here five)
    One insn in each stage in each cycle
    Clock period = MAX(T_insn-mem, T_regfile, T_ALU, T_data-mem)
    Base CPI = 1: insn enters and leaves every cycle
    Actual CPI > 1: pipeline must often stall
    Individual insn latency increases (pipeline overhead), not the point

Pipeline Terminology
  Five stage: Fetch, Decode, eXecute, Memory, Writeback
    Nothing magical about 5 stages (Pentium 4 had 22 stages!)
  Latches (pipeline registers) named by stages they separate
    PC, F/D, D/X, X/M, M/W

More Terminology & Foreshadowing
  Scalar pipeline: one insn per stage per cycle
    Alternative: "superscalar" (later)
  In-order pipeline: insns enter execute stage in order
    Alternative: "out-of-order" (later)
  Pipeline depth: number of pipeline stages
    Nothing magical about five
    Trend has been to deeper pipelines (again, more later)

Instruction Convention
  Different ISAs use inconsistent register orders
  Some ISAs (for example MIPS)
    Instruction destination (i.e., output) on the left
    add $1, $2, $3 means $1 ← $2+$3
  Other ISAs
    Instruction destination (i.e., output) on the right
    add r1,r2,r3 means r1+r2 → r3
    ld 8(r5),r4 means mem[r5+8] → r4
    st r4,8(r5) means r4 → mem[r5+8]
  Will try to specify to avoid confusion, next slides MIPS style

Pipeline Example: Cycle 1
[Figure: five-stage datapath; in flight: add $3,$2,$1]

Pipeline Example: Cycle 2
[Figure: five-stage datapath; in flight, youngest first: lw $4,0($5), add $3,$2,$1]

Pipeline Example: Cycle 3
[Figure: five-stage datapath; in flight, youngest first: sw $6,4($7), lw $4,0($5), add $3,$2,$1 (3 instructions)]

Pipeline Example: Cycle 4
[Figure: five-stage datapath; in flight, youngest first: sw $6,4($7), lw $4,0($5), add $3,$2,$1 (3 instructions)]

Pipeline Example: Cycle 5
[Figure: five-stage datapath; in flight, youngest first: sw $6,4($7), lw $4,0($5)]

Pipeline Example: Cycle 6
[Figure: five-stage datapath; sw $6,4($7) and lw still in flight]

Pipeline Example: Cycle 7
[Figure: five-stage datapath; sw in its final stage]

Pipeline Diagram
  Pipeline diagram: shorthand for what we just saw
    Across: cycles
    Down: insns
    Convention: X means lw $4,0($5) finishes execute stage and writes into X/M latch at end of cycle 4

  add $3,$2,$1   F D X M W
  lw $4,0($5)      F D X M W
  sw $6,4($7)        F D X M W
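The pipeline-diagram convention above (cycles across, insns down, one new insn entering Fetch per cycle) is mechanical enough to generate. The sketch below prints such a diagram for a stall-free scalar pipeline; `pipeline_diagram` and its text layout are hypothetical names and choices made for this illustration.

```python
STAGES = ["F", "D", "X", "M", "W"]

def pipeline_diagram(insns):
    """Return one text row per insn for a stall-free five-stage pipeline.

    Insn number i (0-based) is assumed to enter Fetch in cycle i+1;
    no hazards or stalls are modelled.
    """
    total_cycles = len(insns) + len(STAGES) - 1
    width = max(len(insn) for insn in insns)
    rows = []
    for i, insn in enumerate(insns):
        cells = [" "] * total_cycles          # one cell per cycle
        for s, stage in enumerate(STAGES):
            cells[i + s] = stage              # stage s is occupied in cycle i+s+1
        rows.append(insn.ljust(width) + "  " + " ".join(cells).rstrip())
    return rows

rows = pipeline_diagram(["add $3,$2,$1", "lw $4,0($5)", "sw $6,4($7)"])
print("\n".join(rows))
```

Each row is shifted one cycle to the right of the row above it, which is exactly the staircase shape used in the diagrams throughout these slides.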

Example Pipeline Perf. Calculation
  Single-cycle
    Clock period = 50ns, CPI = 1
    Performance = 50ns/insn
  Multi-cycle
    Branch: 20% (3 cycles), load: 20% (5 cycles), ALU: 60% (4 cycles)
    Clock period = 11ns, CPI = (20%*3)+(20%*5)+(60%*4) = 4
    Performance = 44ns/insn
  5-stage pipelined
    Clock period = 12ns, approx. (50ns / 5 stages) + overheads
    CPI = 1 (each insn takes 5 cycles, but 1 completes each cycle)
      Performance = 12ns/insn
    Well, actually CPI = 1 + some penalty for pipelining (next)
      CPI = 1.5 (on average insn completes every 1.5 cycles)
      Performance = 18ns/insn
      Much higher performance than single-cycle or multi-cycle

Q1: Why Is Pipeline Clock Period > (delay thru datapath) / (number of pipeline stages)?
  A few reasons:
    Latches add delay
    Extra "bypassing" logic adds delay
    Pipeline stages have different delays, clock period is max delay
  These factors have implications for ideal number of pipeline stages
    Diminishing clock frequency gains for longer (deeper) pipelines

Q2: Why Is Pipeline CPI > 1?
  CPI for scalar in-order pipeline is 1 + stall penalties
  Stalls used to resolve hazards
    Hazard: condition that jeopardizes sequential illusion
    Stall: pipeline delay introduced to restore sequential illusion
  Calculating pipeline CPI
    Frequency of stall * stall cycles
    Penalties add (stalls generally don't overlap in in-order pipelines)
    1 + stall-freq1*stall-cyc1 + stall-freq2*stall-cyc2 + ...
  Correctness/performance/make common case fast (MCCF)
    Long penalties OK if they happen rarely, e.g., 1 + 0.01 * 10 = 1.1
    Stalls also have implications for ideal number of pipeline stages

Dependences, Pipeline Hazards, and Bypassing
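The CPI rule in Q2 (base CPI of 1 plus the sum of stall frequency times stall cycles) can be written down directly. This is a small sketch under the slide's assumption that stalls do not overlap; `pipeline_cpi` is a hypothetical helper name.

```python
def pipeline_cpi(stall_sources, base_cpi=1.0):
    """CPI of a scalar in-order pipeline.

    stall_sources is an iterable of (stall_frequency, stall_cycles) pairs.
    Penalties simply add, which assumes stalls do not overlap.
    """
    return base_cpi + sum(freq * cycles for freq, cycles in stall_sources)

# A long penalty is tolerable if it is rare: 1 + 0.01 * 10.
rare_long_cpi = pipeline_cpi([(0.01, 10)])

# The slide's pipelined design: 12ns clock at an average CPI of 1.5.
pipelined_ns_per_insn = 1.5 * 12

print(f"{rare_long_cpi:.2f} CPI, {pipelined_ns_per_insn:.0f} ns/insn")
```

The same helper applies to any mix of penalties, for example a 1-cycle stall on some fraction of loads together with a 2-cycle penalty on some fraction of branches.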

Dependences and Hazards
  Dependence: relationship between two insns
    Data: two insns use same storage location
    Control: one insn affects whether another executes at all
    Not a bad thing, programs would be boring without them
    Enforced by making older insn go before younger one
      Happens naturally in single-/multi-cycle designs
      But not in a pipeline
  Hazard: dependence & possibility of wrong insn order
    Effects of wrong insn order cannot be externally visible
      Stall: enforce order by keeping younger insn in same stage
    Hazards are a bad thing: stalls reduce performance

Why Does Every Insn Take 5 Cycles?
[Figure: five-stage datapath with add $3,$2,$1 and lw $4,0($5) in flight]
  Could/should we allow add to skip M and go to W? No
    It wouldn't help: peak fetch still only 1 insn per cycle
    Structural hazards: imagine add follows lw

Structural Hazards
  Structural hazards
    Two insns trying to use same circuit at same time
      E.g., structural hazard on register file write port
  To fix structural hazards: proper ISA/pipeline design
    Each insn uses every structure exactly once
    For at most one cycle
    Always at same stage relative to F (fetch)
  Tolerate structure hazards
    Add stall logic to stall pipeline when hazards occur

Example Structural Hazard

  ld r2,0(r1)    F D X M W
  add r1,r3,r4     F D X M W
  sub r1,r3,r5       F D X M W
  st r6,0(r1)          F D X M W

  Structural hazard: resource needed twice in one cycle
    Example: unified instruction & data memories (caches)
  Solutions:
    Separate instruction/data memories (caches)
    Redesign cache to allow 2 accesses per cycle (slow, expensive)
    Stall pipeline

Data Hazards
[Figure: five-stage datapath with sw $6,0($7), lw $4,0($5), add $3,$2,$1 in flight]
  Let's forget about branches and the control for a while
  The three insn sequence we saw earlier executed fine
    But it wasn't a real program
    Real programs have data dependences
      They pass values via registers and memory

Dependent Operations
  Independent operations
    add $3,$2,$1
    add $6,$5,$4
  Would this program execute correctly on a pipeline?
    add $3,$2,$1
    add $6,$5,$3
  What about this program?
    add $3,$2,$1
    lw $4,0($3)
    addi $6,1,$3
    sw $3,0($7)

Data Hazards
[Figure: five-stage datapath with sw $3,0($7), addi $6,1,$3, lw $4,0($3), add $3,$2,$1 in flight]
  Would this program execute correctly on this pipeline?
    Which insns would execute with correct inputs?
    add is writing its result into $3 in current cycle
    lw read $3 two cycles ago → got wrong value
    addi read $3 one cycle ago → got wrong value
    sw is reading $3 this cycle → maybe (depending on regfile design)

Memory Data Hazards
[Figure: five-stage datapath with lw $4,0($1) following sw $5,0($1)]
  Are memory data hazards a problem for this pipeline? No
    lw following sw to same address in next cycle, gets right value
    Why? Data mem read/write always take place in same stage
  Data hazards through registers? Yes (previous slide)
    Occur because register write is three stages after register read
    Can only read a register value three cycles after writing it

Observation!
[Figure: lw $4,0($3) in X, add $3,$2,$1 in M]
  Technically, this situation is broken
    lw $4,0($3) has already read $3 from regfile
    add $3,$2,$1 hasn't yet written $3 to regfile
  But fundamentally, everything is OK
    lw $4,0($3) hasn't actually used $3 yet
    add $3,$2,$1 has already computed $3

Reducing Data Hazards: Bypassing
[Figure: lw $4,0($3) in X, add $3,$2,$1 in M, with a path from the X/M latch back to the ALU input]
  Bypassing
    Reading a value from an intermediate (µarchitectural) source
    Not waiting until it is available from primary source
    Here, we are bypassing the register file
    Also called forwarding

WX Bypassing
[Figure: lw $4,0($3) in X, add $3,$2,$1 in W]
  What about this combination?
    Add another bypass path and MUX (multiplexor) input
    First one was an MX bypass
    This one is a WX bypass

ALUinB Bypassing
[Figure: add $4,$2,$3 in X, add $3,$2,$1 in M]
  Can also bypass to ALU input B

WM Bypassing?
[Figure: sw $3,0($4) in M, lw $3,0($2) in W]
  Does WM bypassing make sense?
    Not to the address input (why not?)
    But to the store data input, yes

Bypass Logic
  Each MUX has its own bypass logic, here it is for MUX ALUinA
    (D/X.IR.RegSource1 == X/M.IR.RegDest) => 0
    (D/X.IR.RegSource1 == M/W.IR.RegDest) => 1
    Else => 2

Pipeline Diagrams with Bypassing
  If bypass exists, "from"/"to" stages execute in same cycle
  Example: full bypassing, use MX bypass

    add r2,r3→r1   F D X M W
    sub r1,r4→r2     F D X M W

  Example: full bypassing, use WX bypass

    add r2,r3→r1   F D X M W
    ld [r7]→r5       F D X M W
    sub r1,r4→r2       F D X M W

  Example: WM bypass

    add r2,r3→r1   F D X M W
    ?                F D X M W

  Can you think of a code example that uses the WM bypass?

Have We Prevented All Data Hazards?
[Figure: stall inserts a nop between add $4,$2,$3 (held in D) and lw $3,4($2)]
  No. Consider a "load" followed by a dependent "add" insn
  Bypassing alone isn't sufficient!
  Hardware solution: detect this situation and inject a stall cycle
  Software solution: ensure compiler doesn't generate such code
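The ALUinA bypass-select rule above can be sketched as a small function. The select values 0, 1, and 2 follow the slide; representing register specifiers as integers, using `None` for "this insn writes no register", and the function name are assumptions for illustration. A real design would typically also check that the producing insn actually writes a register.

```python
def alu_in_a_select(dx_src1, xm_dest, mw_dest):
    """Choose the ALUinA mux input for the insn currently in D/X.

    0 = MX bypass (value from X/M latch), 1 = WX bypass (value from M/W
    latch), 2 = value read from the register file in decode.
    The X/M check comes first so the youngest producer wins.
    """
    if dx_src1 is not None and dx_src1 == xm_dest:
        return 0
    if dx_src1 is not None and dx_src1 == mw_dest:
        return 1
    return 2

# add r2,r3->r1 ; sub r1,r4->r2 : sub is in X while add is in M.
mx_case = alu_in_a_select(dx_src1=1, xm_dest=1, mw_dest=None)

# add r2,r3->r1 ; ld [r7]->r5 ; sub r1,r4->r2 : ld is in M, add is in W.
wx_case = alu_in_a_select(dx_src1=1, xm_dest=5, mw_dest=1)

# No producer in flight: use the register file value.
regfile_case = alu_in_a_select(dx_src1=1, xm_dest=5, mw_dest=6)

print(mx_case, wx_case, regfile_case)
```

The ALUinB mux would use the same comparison with the second source register.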

Stalling to Avoid Data Hazards
[Figure: hazard detection logic drives a nop into the D/X latch]
  Prevent F/D insn from reading (advancing) this cycle
    Write nop into D/X.IR (effectively, insert nop in hardware)
    Also reset (clear) the datapath control signals
    Disable F/D latch and PC write enables (why?)
  Re-evaluate situation next cycle

Stalling on Load-To-Use Dependences
[Figures: three consecutive cycles; add $4,$2,$3 is held in decode behind lw $3,4($2) while a nop (stall bubble) moves down the pipeline]
  Stall = (D/X.IR.Operation == LOAD) &&
          ((F/D.IR.RegSrc1 == D/X.IR.RegDest) ||
           ((F/D.IR.RegSrc2 == D/X.IR.RegDest) && (F/D.IR.OP != STORE)))
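The load-to-use stall condition above translates almost line for line into code. In this sketch the `Insn` record, its field names, and the convention that a store's data register is `src2` are assumptions for illustration; the boolean expression itself mirrors the slide.

```python
from typing import NamedTuple, Optional

class Insn(NamedTuple):
    # Hypothetical decoded-instruction record; field names are illustrative.
    op: str                 # e.g. "LOAD", "STORE", "ALU"
    dest: Optional[int]     # destination register, or None
    src1: Optional[int]     # first source register (address base for memory ops)
    src2: Optional[int]     # second source register (store data for STORE)

def load_use_stall(fd, dx):
    """True if the insn in F/D must stall behind a load in D/X."""
    return (dx.op == "LOAD" and
            (fd.src1 == dx.dest or
             (fd.src2 == dx.dest and fd.op != "STORE")))

lw = Insn("LOAD", dest=3, src1=2, src2=None)      # lw $3,4($2)
add = Insn("ALU", dest=4, src1=2, src2=3)         # add $4,$2,$3
sw = Insn("STORE", dest=None, src1=4, src2=3)     # sw $3,0($4)

print(load_use_stall(add, lw))   # dependent ALU insn right after the load
print(load_use_stall(sw, lw))    # store data can arrive later via WM bypass
```

The store exemption reflects the WM bypass discussed earlier: a store needs its data only in the memory stage, so a load feeding store data does not force a stall, whereas a load feeding the store's address register (`src1`) still does.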

Performance Impact of Load/Use Penalty
  Assume
    Branch: 20%, load: 20%, store: 10%, other: 50%
    50% of loads are followed by dependent instruction
      require 1 cycle stall (i.e., insertion of 1 nop)
  Calculate CPI
    CPI = 1 + (1 * 20% * 50%) = 1.1

Reducing Load-Use Stall Frequency

  add $3,$2,$1    F  D  X  M  W
  lw $4,4($3)        F  D  X  M  W
  addi $6,$4,1          F  d* D  X  M  W
  sub $8,$3,$1             F  D  X  M  W

  Use compiler scheduling to reduce load-use stall frequency
    More on compiler scheduling later

  add $3,$2,$1    F  D  X  M  W
  lw $4,4($3)        F  D  X  M  W
  sub $8,$3,$1          F  D  X  M  W
  addi $6,$4,1             F  D  X  M  W

Pipelining and Multi-Cycle Operations
[Figure: five-stage datapath with a multiplier unit (Xctrl, P) beside the X stage and a P/W latch]
  What if you wanted to add a multi-cycle operation?
    E.g., 4-cycle multiply
    P/W: separate output latch connects to W stage
    Controlled by pipeline control finite state machine (FSM)

Pipelined Multiplier
[Figure: multiplier split into stages with latches P0/P1, P1/P2, P2/P3, P3/W]
  Multiplier itself is often pipelined, what does this mean?
    Product/multiplicand register/ALUs/latches replicated
    Can start different multiply operations in consecutive cycles

Pipeline Diagram with Multiplier

  mul $4,$3,$5    F  D  P0 P1 P2 P3 W
  addi $6,$4,1       F  D  d* d* d* X  M  W

  What about...
    Two instructions trying to write regfile in same cycle?
    Structural hazard!
  Must prevent:

  mul $4,$3,$5    F  D  P0 P1 P2 P3 W
  addi $6,$1,1       F  D  X  M  W
  add $5,$6,$10         F  D  X  M  W

More Multiplier Nasties
  What about...
    Mis-ordered writes to the same register
    Software thinks add gets $4 from addi, actually gets it from mul

  mul $4,$3,$5    F  D  P0 P1 P2 P3 W
  addi $4,$1,1       F  D  X  M  W
  add $10,$4,$6                     F  D  X  M  W

  Common? Not for a 4-cycle multiply with 5-stage pipeline
    More common with deeper pipelines
    In any case, must be correct

Corrected Pipeline Diagram
  With the correct stall logic
    Prevent mis-ordered writes to the same register
    Why two cycles of delay?

  mul $4,$3,$5    F  D  P0 P1 P2 P3 W
  addi $4,$1,1       F  d* d* D  X  M  W
  add $10,$4,$6               F  D  X  M  W

  Multi-cycle operations complicate pipeline logic

Pipelined Functional Units
  Almost all multi-cycle functional units are pipelined
    Each operation takes N cycles
    But can initiate a new (independent) operation every cycle
    Requires internal latching and some hardware replication
    A cheaper way to add bandwidth than multiple non-pipelined units

  mulf f0,f1,f2   F  D  E* E* E* E* W
  mulf f3,f4,f5      F  D  E* E* E* E* W

  One exception: int/FP divide: difficult to pipeline and not worth it

  divf f0,f1,f2   F  D  E/ E/ E/ E/ W
  divf f3,f4,f5      F  D  s* s* s* E/ E/ E/ E/ W

  s* = structural hazard, two insns need same structure
    ISAs and pipelines designed to have few of these
    Canonical example: all insns forced to go through M stage

What About Branches?
[Figure: five-stage datapath with branch target adder (<< 2) in the X stage]
  Control hazards options
    Could just stall to wait for branch outcome (two-cycle penalty)
    Fetch past branch insns before branch outcome is known
      Default: assume "not-taken" (at fetch, can't tell it's a branch)

Control Dependences and Branch Prediction

Branch Recovery
[Figure: nops written into the F/D and D/X latches]
  Branch recovery: what to do when branch is actually taken
    Insns that will be written into F/D and D/X are wrong
    Flush them, i.e., replace them with nops
    They haven't written permanent state yet (regfile, DMem)
    Two cycle penalty for taken branches

Branch Performance
  Back of the envelope calculation
    Branch: 20%, load: 20%, store: 10%, other: 50%
    Say, 75% of branches are taken
  CPI = 1 + 20% * 75% * 2 = 1 + 0.20 * 0.75 * 2 = 1.3
    Branches cause 30% slowdown
    Even worse with deeper pipelines
    How do we reduce this penalty?

Big Idea: Speculative Execution
  Speculation: "risky transactions on chance of profit"
  Speculative execution
    Execute before all parameters known with certainty
    Correct speculation
      Avoid stall, improve performance
    Incorrect speculation (mis-speculation)
      Must abort/flush/squash incorrect insns
      Must undo incorrect changes (recover pre-speculation state)
    The "game": [%correct * gain] - [(1 - %correct) * penalty]
  Control speculation: speculation aimed at control hazards
    Unknown parameter: are these the correct insns to execute next?

Control Speculation and Recovery
  Correct:

    addi r1,1→r3         F D X M W
    bnez r3,targ           F D X M W
    st r6→[r7]               F D X M W      speculative
    targ: add r4,r5→r4         F D X M W    speculative

  Mis-speculation recovery: what to do on wrong guess
    Not too painful in an in-order pipeline
    Branch resolves in X
    Younger insns (in F, D) haven't changed permanent state
    Flush insns currently in F/D and D/X (i.e., replace with nops)
  Recovery:

    addi r1,1→r3         F D X M W
    bnez r3,targ           F D X M W
    st r6→[r7]               F D
    targ: add r4,r5→r4         F
    targ: add r4,r5→r4           F D X M W

Reducing Penalty: Fast Branches
  Fast branch: targets control-hazard penalty
    Basically, branch insns that can resolve at D, not X
      Test must be comparison to zero or equality, no time for ALU
    New taken branch penalty is 1
    Additional comparison insns (e.g., cmplt, slt) for complex tests
    Must bypass into decode stage now, too

    bnez r3,targ         F D X M W
    targ: add r4,r5,r4       F D X M W

Fast Branch Performance
  Assume: Branch: 20%, 75% of branches are taken
    CPI = 1 + 20% * 75% * 1 = 1 + 0.20*0.75*1 = 1.15
    15% slowdown (better than the 30% from before)
  But wait, fast branches assume only simple comparisons
    Fine for MIPS
    But not fine for ISAs with "branch if $1 > $2" operations
  In such cases, say 25% of branches require an extra insn
    CPI = 1 + (20% * 75% * 1) + 20%*25%*1 (extra insn) = 1.2
  Example of ISA and micro-architecture interaction
    Type of branch instructions
    Another option: "Delayed branch" or "branch delay slot"
    What about condition codes?

Fewer Mispredictions: Branch Prediction
[Figure: five-stage datapath with a branch predictor (BP) at fetch, target (TG) carried down the pipeline, and nops injected on a flush]
  Dynamic branch prediction: hardware guesses outcome
    Start fetching from guessed address
    Flush on mis-prediction

Branch Prediction Performance
  Parameters
    Branch: 20%, load: 20%, store: 10%, other: 50%
    75% of branches are taken
  Dynamic branch prediction
    Branches predicted with 95% accuracy
    CPI = 1 + 20% * 5% * 2 = 1.02

Dynamic Branch Prediction Components
[Figure: PC, I$, regfile, D$ with predictor at fetch]
  Step #1: is it a branch?
    Easy after decode...
  Step #2: is the branch taken or not taken?
    Direction predictor (applies to conditional branches only)
    Predicts taken/not-taken
  Step #3: if the branch is taken, where does it go?
    Easy after decode

Branch Direction Prediction
  Learn from past, predict the future
    Record the past in a hardware structure
  Direction predictor (DIRP)
    Map conditional-branch PC to taken/not-taken (T/N) decision
    Individual conditional branches often biased or weakly biased
      90%+ one way or the other considered "biased"
      Why? Loop back edges, checking for uncommon conditions
  Branch history table (BHT): simplest predictor
    PC indexes table of bits (0 = N, 1 = T), no tags
    Essentially: branch will go same way it went last time
    [Figure: PC bits [31:10], [9:2], [1:0]; bits [9:2] index the BHT, which yields T or NT as the prediction (taken or not taken)]
    What about aliasing?
      Two PCs with the same lower bits?
      No problem, just a prediction!

Branch History Table (BHT)
  Branch history table (BHT): simplest direction predictor
    PC indexes table of bits (0 = N, 1 = T), no tags
    Essentially: branch will go same way it went last time
    Problem: consider inner loop branch below (* = mis-prediction)

      for (i=0;i<100;i++)
        for (j=0;j<3;j++)
          // whatever

      State/prediction  N* T  T  T* N* T  T  T* N* T  T  T*
      Outcome           T  T  T  N  T  T  T  N  T  T  T  N

    Two "built-in" mis-predictions per execution of the inner loop
    Branch predictor "changes its mind too quickly"

Two-Bit Saturating Counters (2bc)
  Two-bit saturating counters (2bc) [Smith]
    Replace each single-bit prediction
      (0,1,2,3) = (N,n,t,T)
    Adds "hysteresis"
      Force predictor to mis-predict twice before "changing its mind"

      State/prediction  N* n* t  T* t  T  T  T* t  T  T  T*
      Outcome           T  T  T  N  T  T  T  N  T  T  T  N

    One mispredict each loop execution (rather than two)
      Fixes this pathology (which is not contrived, by the way)
    Can we do even better?

Correlated Predictor
  Correlated (two-level) predictor [Patt]
    Exploits observation that branch outcomes are correlated
    Maintains separate prediction per (PC, BHR)
      Branch history register (BHR): recent branch outcomes
    Simple working example: assume program has one branch
      BHT: one 1-bit DIRP entry
      BHT+2BHR: 2^2 = 4 1-bit DIRP entries

      State/prediction
        BHR=NN         N* T  T  T  T  T  T  T  T  T  T  T
        BHR=NT         N  N* T  T  T  T  T  T  T  T  T  T
        BHR=TN         N  N  N  N  N* T  T  T  T  T  T  T
        BHR=TT         N  N  N* T* N  N  N* T* N  N  N* T*
      Outcome    N  N  T  T  T  N  T  T  T  N  T  T  T  N
      (the first two outcomes are the initial history; in the slide the active pattern is highlighted)

    We didn't make anything better, what's the problem?

Correlated Predictor
  What happened?
    BHR wasn't long enough to capture the pattern
    Try again: BHT+3BHR: 2^3 = 8 1-bit DIRP entries

      State/prediction
        BHR=NNN           N* T  T  T  T  T  T  T  T  T  T  T
        BHR=NNT           N  N* T  T  T  T  T  T  T  T  T  T
        BHR=NTN           N  N  N  N  N  N  N  N  N  N  N  N
        BHR=NTT           N  N  N* T  T  T  T  T  T  T  T  T
        BHR=TNN           N  N  N  N  N  N  N  N  N  N  N  N
        BHR=TNT           N  N  N  N  N  N* T  T  T  T  T  T
        BHR=TTN           N  N  N  N  N* T  T  T  T  T  T  T
        BHR=TTT           N  N  N  N  N  N  N  N  N  N  N  N
      Outcome    N  N  N  T  T  T  N  T  T  T  N  T  T  T  N
      (the first three outcomes are the initial history)

    No mis-predictions after predictor learns all the relevant patterns
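The difference between the one-bit BHT entry and the two-bit saturating counter on the inner-loop branch can be checked with a few lines of simulation. This sketch models a single predictor entry that starts in the strongly not-taken state, as in the tables above; `count_mispredicts` is a hypothetical name.

```python
def count_mispredicts(outcomes, bits):
    """Mispredictions of one saturating counter with the given width.

    The counter starts at 0 (strongly not-taken) and predicts taken when
    it is in the upper half of its range. bits=1 is the plain BHT entry,
    bits=2 is the (N, n, t, T) two-bit counter.
    """
    top = (1 << bits) - 1
    state = 0
    misses = 0
    for taken in outcomes:
        predict_taken = state > top // 2
        if predict_taken != taken:
            misses += 1
        # Saturating increment on taken, saturating decrement on not-taken.
        state = min(top, state + 1) if taken else max(0, state - 1)
    return misses

# Inner loop branch: taken three times, then not taken; three executions.
inner_loop = [True, True, True, False] * 3

one_bit_misses = count_mispredicts(inner_loop, bits=1)
two_bit_misses = count_mispredicts(inner_loop, bits=2)
print(one_bit_misses, two_bit_misses)
```

Over a long run the one-bit entry settles at two mispredictions per execution of the inner loop and the two-bit counter at one, which matches the asterisks in the two tables.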

Correlated Predictor
  Design choice I: one global BHR or one per PC (local)?
    Each one captures different kinds of patterns
    Global is better, captures local patterns for tight loop branches
  Design choice II: how many history bits (BHR size)?
    Tricky one
    Given unlimited resources, longer BHRs are better, but...
    BHT utilization decreases
      Many history patterns are never seen
      Many branches are history independent (don't care)
      PC xor BHR allows multiple PCs to dynamically share BHT
      BHR length < log2(BHT size)
    Predictor takes longer to train
    Typical length: 8-12

Hybrid Predictor
[Figure: PC and BHR index two BHTs; a chooser table selects between them]
  Hybrid (tournament) predictor [McFarling]
    Attacks correlated predictor BHT capacity problem
    Idea: combine two predictors
      Simple BHT predicts history independent branches
      Correlated predictor predicts only branches that need history
      Chooser assigns branches to one predictor or the other
      Branches start in simple BHT, move on mis-prediction threshold
    Correlated predictor can be made smaller, handles fewer branches
    90-95% accuracy

When to Perform Branch Prediction?
  During Decode
    Look at instruction opcode to determine branch instructions
    Can calculate next PC from instruction (for PC-relative branches)
    One cycle "mis-fetch" penalty even if branch predictor is correct

    bnez r3,targ         F D X M W
    targ: add r4,r5,r4       F D X M W

  During Fetch?
    How do we do that?

Revisiting Branch Prediction Components
[Figure: PC, I$, regfile, D$ with predictors at fetch]
  Step #1: is it a branch?
    Easy after decode... during fetch: predictor
  Step #2: is the branch taken or not taken?
    Direction predictor (as before)
  Step #3: if the branch is taken, where does it go?
    Branch target predictor (BTB)
    Supplies target PC if branch is taken
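The effect of BHR length on a correlated predictor can be reproduced with a small model that indexes a table of one-bit entries with "PC xor BHR". This is a sketch for a single branch; the table size, the use of word-aligned PC bits, and the function name `correlated_misses` are assumptions for illustration.

```python
def correlated_misses(outcomes, history_bits, pc=0, table_bits=10):
    """Return a list of booleans, True where the prediction was wrong.

    One global branch history register (BHR) is XORed with PC bits to
    index a table of 1-bit predictions (False = not taken). Both the
    table and the BHR start as all not-taken.
    """
    table = [False] * (1 << table_bits)
    index_mask = (1 << table_bits) - 1
    history_mask = (1 << history_bits) - 1
    bhr = 0
    misses = []
    for taken in outcomes:
        index = ((pc >> 2) ^ bhr) & index_mask
        misses.append(table[index] != taken)
        table[index] = taken                              # train the entry
        bhr = ((bhr << 1) | int(taken)) & history_mask    # shift in outcome
    return misses

pattern = [True, True, True, False] * 25     # T T T N, repeated

steady_2bit = sum(correlated_misses(pattern, history_bits=2)[-40:])
steady_3bit = sum(correlated_misses(pattern, history_bits=3)[-40:])
print(steady_2bit, steady_3bit)
```

With two history bits the history "TT" precedes both a taken and a not-taken outcome, so that entry keeps flipping; with three bits every point in the pattern has a distinct history and the mispredictions disappear once the table is trained.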

Branch Target Buffer (BTB)
  As before: learn from past, predict the future
    Record the past branch targets in a hardware structure
  Branch target buffer (BTB):
    "guess" the future PC based on past behavior
    "Last time the branch X was taken, it went to address Y"
    "So, in the future, if address X is fetched, fetch address Y next"
  Operation
    Like a cache: address = PC, data = target-PC
    Access at Fetch in parallel with instruction memory
      predicted-target = BTB[PC]
    Updated at X whenever target != predicted-target
      BTB[PC] = target
    Aliasing? No problem. As before, this is only a prediction

Branch Target Buffer (continued)
[Figure: PC indexes BTB; stored tag compared (==) with PC to select between BTB target and PC+4 as the predicted target]
  At Fetch, how does insn know it's a branch & should read BTB?
    It doesn't have to
    All insns access BTB in parallel with Imem Fetch
  Key idea: use BTB to predict which insns are branches
    Implement by "tagging" each entry with its corresponding PC
    Update BTB on every taken branch insn, record target PC:
      BTB[PC].tag = PC, BTB[PC].target = target PC of branch
    All insns access at Fetch in parallel with Imem
      Check for tag match, signifies insn at that PC is a branch
      Predicted PC = (BTB[PC].tag == PC) ? BTB[PC].target : PC+4

Why Does a BTB Work?
  Because most control insns use direct targets
    Target encoded in insn itself → same "taken" target every time
  What about indirect targets?
    Target held in a register → can be different each time
    Indirect conditional jumps are not widely supported
    Two indirect call idioms
      Dynamically linked functions (DLLs): target always the same
      Dynamically dispatched (virtual) functions: hard but uncommon
    Also two indirect unconditional jump idioms
      Switches: hard but uncommon
      Function returns: hard and common but...

Return Address Stack (RAS)
[Figure: PC indexes BTB and RAS; pre-decode (PD) bits from I$ select the RAS output as predicted target for returns]
  Return address stack (RAS)
    Call instruction? RAS[TOS++] = PC+4
    Return instruction? Predicted-target = RAS[--TOS]
    Q: how can you tell if an insn is a call/return before decoding it?
      Accessing RAS on every insn BTB-style doesn't work
    Answer: pre-decode bits in Imem, written when first executed
      Can also be used to signify branches
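The tagged BTB lookup and the return address stack described above can be modelled in a few lines. This is a behavioural sketch, not a hardware description: the class names, the 256-entry direct-mapped organisation, full-PC tags, 4-byte instructions, and the unbounded Python list used for the stack are all assumptions for illustration.

```python
class BTB:
    """Direct-mapped, tagged branch target buffer (illustrative model)."""

    def __init__(self, entries=256):
        self.entries = entries
        self.tags = [None] * entries
        self.targets = [0] * entries

    def _index(self, pc):
        return (pc >> 2) % self.entries       # drop byte offset, keep low bits

    def predict(self, pc):
        """Predicted next PC: stored target on a tag match, else PC+4."""
        i = self._index(pc)
        return self.targets[i] if self.tags[i] == pc else pc + 4

    def update(self, pc, target):
        """Record the target of a taken branch at pc."""
        i = self._index(pc)
        self.tags[i] = pc
        self.targets[i] = target


class RAS:
    """Return address stack: push PC+4 on a call, pop on a return."""

    def __init__(self):
        self.stack = []

    def call(self, pc):
        self.stack.append(pc + 4)

    def ret(self):
        return self.stack.pop() if self.stack else None


btb = BTB()
cold = btb.predict(0x100)          # nothing learned yet: fall through
btb.update(0x100, 0x200)           # branch at 0x100 was taken to 0x200
warm = btb.predict(0x100)
alias = btb.predict(0x100 + 4 * 256)   # same index, different tag

ras = RAS()
ras.call(0x40)
ras.call(0x80)
first_return, second_return = ras.ret(), ras.ret()
```

The tag check is what lets every fetched insn probe the BTB safely: a non-branch (or an aliasing PC) misses on the tag and simply falls through to PC+4.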

Putting It All Together
[Figure: PC indexes BTB (tag, target, ==), RAS, and BHT (taken/not-taken) in parallel with I$; pre-decode (PD) "is ret?" bit and +4 adder feed the predicted-target mux]
  BTB & branch direction predictor during fetch
  If branch prediction correct, no taken branch penalty

Branch Prediction Performance
  Dynamic branch prediction
    20% of instructions are branches
    Simple predictor: branches predicted with 75% accuracy
      CPI = 1 + (20% * 25% * 2) = 1.1
    More advanced predictor: 95% accuracy
      CPI = 1 + (20% * 5% * 2) = 1.02
  Branch mis-predictions still a big problem though
    Pipelines are long: typical mis-prediction penalty is 10+ cycles
    Pipelines are superscalar (later)

Avoiding Branches via ISA: Predication
  Conventional control
    Conditionally executed insns also conditionally fetched

    beq r3,targ          F D X M W
    sub r6,1,r5            F D            flushed: wrong path
    targ: add r4,r5,r4       F            flushed: why?
    targ: add r4,r5,r4         F D X M W

    If beq mis-predicts, both sub and add must be flushed
    Waste: add is independent of mis-prediction
  Predication: not prediction, predication
    ISA support for conditionally-executed unconditionally-fetched insns
    If beq mis-predicts, annul sub in place, preserve add
    Example is if-then, but if-then-else can be predicated too
    How is this done? How does add get correct value for r5?

Full Predication
  Full predication
    Every insn can be annulled, annulment controlled by...
    Predicate registers: additional register in each insn (e.g., IA64)

    setp.eq r3,p3        F D X M W
    sub.p r6,1,r5,p3       F D X          annulled
    targ: add r4,r5,r4       F D X M W

    Predicate codes: condition bits in each insn (e.g., ARM)

    setcc r3             F D X M W
    sub.nz r6,1,r5         F D X          annulled
    targ: add r4,r5,r4       F D X M W

    Only ALU insn shown (sub), but this applies to all insns, even stores
    Branches replaced with "set-predicate" insns

Conditional Moves (CMOVs)
  Conditional (register) moves
    Construct appearance of full predication from one primitive
      cmoveq r1,r2,r3 // if (r1==0) r3=r2;
    May require some code duplication to achieve desired effect
    Painful, potentially impossible for some insn sequences
    Requires more registers
    Only good way of retro-fitting predication onto ISA (e.g., IA32, Alpha)

    sub r6,1,r9          F D X M W
    cmovne r3,r9,r5        F D X M W
    targ: add r4,r5,r4       F D X M W

Predication Performance
  Cost/benefit analysis
    Benefit: predication avoids branches
      Thus avoiding mis-predictions
      Also reduces pressure on predictor table (fewer branches to track)
    Cost: extra (annulled) instructions
  As branch predictors are highly accurate...
    Might not help:
      5-stage pipeline, two instructions on each path of if-then-else
      No performance gain, likely slower if branch predictable
    Or even hurt!
    But can help:
      Deeper pipelines, hard-to-predict branches, and few added insns
  Thus, predication is useful, but not a panacea

Research: Perceptron Predictor
[Figure: PC selects a row of coefficients F; Σ F_i*BHR_i is compared with thresh]
  Perceptron predictor [Jimenez]
    Attacks BHR size problem using machine learning approach
    BHT replaced by table of function coefficients F_i (signed)
    Predict taken if Σ(BHR_i * F_i) > threshold
    Table size #PC*|BHR|*|F| (can use long BHR: ~60 bits)
      Equivalent correlated predictor would be #PC*2^|BHR|
    How does it learn? Update F_i when branch is taken
      BHR_i == 1 ? F_i++ : F_i--;
      "don't care" F_i bits stay near 0, important F_i bits saturate
    Hybrid BHT/perceptron accuracy: 95-98%
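The perceptron predictor's predict and update rules can be illustrated for a single branch. This is a simplified sketch and not the published design: encoding history bits as +1/-1, adding a bias weight, training on every outcome, a threshold of 0, and the weight limit of 15 are all assumptions chosen to keep the example short.

```python
def perceptron_misses(outcomes, history_len=4, limit=15):
    """Return a list of booleans, True where the prediction was wrong.

    weights[i] plays the role of coefficient F_i; history[i] is +1 for a
    taken outcome and -1 for not-taken, most recent first. The branch is
    predicted taken when bias + sum(F_i * BHR_i) > 0.
    """
    weights = [0] * history_len
    bias = 0
    history = [-1] * history_len
    misses = []
    for taken in outcomes:
        total = bias + sum(w * h for w, h in zip(weights, history))
        misses.append((total > 0) != taken)

        # Train: move each coefficient toward agreement with the outcome,
        # saturating at +/- limit.
        t = 1 if taken else -1
        weights = [max(-limit, min(limit, w + t * h))
                   for w, h in zip(weights, history)]
        bias = max(-limit, min(limit, bias + t))
        history = [t] + history[:-1]
    return misses

alternating = [True, False] * 20          # T N T N ...
misses = perceptron_misses(alternating)
print(sum(misses), "mispredictions,", sum(misses[-20:]), "in the last 20")
```

On the alternating pattern the coefficient for the most recent history bit quickly becomes strongly negative ("do the opposite of last time"), so the mispredictions are confined to the first few outcomes.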

More Research: GEHL Predictor
  Problem with both correlated predictor and perceptron
    Same BHT real-estate dedicated to 1st history bit (1 column)...
    ...as to 2nd, 3rd, 10th, 60th
    Not a good use of space: 1st bit much more important than 60th
  GEometric History-Length predictor [Seznec, ISCA'05]
    Multiple BHTs, indexed by geometrically longer BHRs (0, 4, 16, 32)
    BHTs are (partially) tagged, not separate "chooser"
    Predict: use matching entry from BHT with longest BHR
    Mis-predict: create entry in BHT with longer BHR
    Only 25% of BHT used for bits (not 50%)
      Helps amortize cost of tagging
    Trains quickly
    95-97% accurate

Championship Branch Prediction
  CBP
    Workshop held in conjunction with MICRO
    Submitted code is tested on standard branch traces
    Highest prediction accuracy wins
  Two tracks
    Idealistic: predictor simulator must run in under 2 hours
    Realistic: predictor must synthesize into 32KB + 256 bits or less
  2006 winners
    Realistic: L-TAGE (GEHL follow-on)
    Idealistic: GTL (another GEHL follow-on)

Research: Runahead Execution
[Figure: pipeline with a shadow regfile (S-regfile) alongside the main regfile, I$, and D$]
  In-order writebacks essentially imply stalls on D$ misses
    Can save power or use idle time for performance
  Runahead execution [Dundas]
    Shadow regfile kept in sync with main regfile (write to both)
    D$ miss: continue executing using shadow regfile (disable stores)
    D$ miss returns: flush pipe and restart with stalled PC
    Acts like a smart prefetch engine
    Performs better as cache miss latency grows (relative to clock period)

Research: Razor
[Figure: pipeline with a second "razor" X/M latch and a comparator (==)]
  Razor [Uht, Ernst]
    Identify pipeline stages with narrow signal margins (e.g., X)
    Add "Razor" X/M latch: relatches X/M input signals after safe delay
    Compare X/M latch with "safe" razor X/M latch, different?
      Flush F, D, X & M
      Restart M using X/M razor latch, restart F using D/X latch
    Pipeline will not "break" → reduce VDD until flush rate too high
    Alternatively: "over-clock" until flush rate too high

Summary
[Figure: system stack (App, App, App; System software; CPU; I/O)]
  Principles of pipelining
    Effects of overhead and hazards
    Pipeline diagrams
  Data hazards
    Stalling and bypassing
  Control hazards
    Branch prediction
    Predication


ECE 550D Fundamentals of Computer Systems and Engineering. Fall 2016 ECE 550D Fundamentals of Computer ystems and Engineering Fall 2016 Pipelines Tyler letsch Duke University lides are derived from work by Andrew Hilton (Duke) and Amir Roth (Penn) Clock Period and CPI ingle-cycle

More information

ECE/CS 250 Computer Architecture. Fall 2017

ECE/CS 250 Computer Architecture. Fall 2017 ECE/CS 250 Computer rchitecture Fall 2017 Pipelining Tyler letsch Duke University Includes material adapted from Dan Sorin (Duke) and mir Roth (Penn). This Unit: Pipelining pplication S Compiler Firmware

More information

Unit #9 : Definite Integral Properties, Fundamental Theorem of Calculus

Unit #9 : Definite Integral Properties, Fundamental Theorem of Calculus Unit #9 : Definite Integrl Properties, Fundmentl Theorem of Clculus Gols: Identify properties of definite integrls Define odd nd even functions, nd reltionship to integrl vlues Introduce the Fundmentl

More information

In the last lecture, we discussed how valid tokens may be specified by regular expressions.

In the last lecture, we discussed how valid tokens may be specified by regular expressions. LECTURE 5 Scnning SYNTAX ANALYSIS We know from our previous lectures tht the process of verifying the syntx of the progrm is performed in two stges: Scnning: Identifying nd verifying tokens in progrm.

More information

UT1553B BCRT True Dual-port Memory Interface

UT1553B BCRT True Dual-port Memory Interface UTMC APPICATION NOTE UT553B BCRT True Dul-port Memory Interfce INTRODUCTION The UTMC UT553B BCRT is monolithic CMOS integrted circuit tht provides comprehensive MI-STD- 553B Bus Controller nd Remote Terminl

More information

Distributed Systems Principles and Paradigms

Distributed Systems Principles and Paradigms Distriuted Systems Principles nd Prdigms Chpter 11 (version April 7, 2008) Mrten vn Steen Vrije Universiteit Amsterdm, Fculty of Science Dept. Mthemtics nd Computer Science Room R4.20. Tel: (020) 598 7784

More information

ECE 468/573 Midterm 1 September 28, 2012

ECE 468/573 Midterm 1 September 28, 2012 ECE 468/573 Midterm 1 September 28, 2012 Nme:! Purdue emil:! Plese sign the following: I ffirm tht the nswers given on this test re mine nd mine lone. I did not receive help from ny person or mteril (other

More information

Caches I. CSE 351 Spring Instructor: Ruth Anderson

Caches I. CSE 351 Spring Instructor: Ruth Anderson L16: Cches I Cches I CSE 351 Spring 2017 Instructor: Ruth Anderson Teching Assistnts: Dyln Johnson Kevin Bi Linxing Preston Jing Cody Ohlsen Yufng Sun Joshu Curtis L16: Cches I Administrivi Homework 3,

More information

Data Flow on a Queue Machine. Bruno R. Preiss. Copyright (c) 1987 by Bruno R. Preiss, P.Eng. All rights reserved.

Data Flow on a Queue Machine. Bruno R. Preiss. Copyright (c) 1987 by Bruno R. Preiss, P.Eng. All rights reserved. Dt Flow on Queue Mchine Bruno R. Preiss 2 Outline Genesis of dt-flow rchitectures Sttic vs. dynmic dt-flow rchitectures Pseudo-sttic dt-flow execution model Some dt-flow mchines Simple queue mchine Prioritized

More information

Small Business Networking

Small Business Networking Why network is n essentil productivity tool for ny smll business Effective technology is essentil for smll businesses looking to increse the productivity of their people nd business. Introducing technology

More information

Engineer To Engineer Note

Engineer To Engineer Note Engineer To Engineer Note EE-169 Technicl Notes on using Anlog Devices' DSP components nd development tools Contct our technicl support by phone: (800) ANALOG-D or e-mil: dsp.support@nlog.com Or visit

More information

Bruce McCarl's GAMS Newsletter Number 37

Bruce McCarl's GAMS Newsletter Number 37 Bruce McCrl's GAMS Newsletter Number 37 This newsletter covers 1 Uptes to Expne GAMS User Guie by McCrl et l.... 1 2 YouTube vieos... 1 3 Explntory text for tuple set elements... 1 4 Reing sets using GDXXRW...

More information

Caches I. CSE 351 Autumn Instructor: Justin Hsia

Caches I. CSE 351 Autumn Instructor: Justin Hsia L01: Intro, L01: L16: Combintionl Introduction Cches I Logic CSE369, CSE351, Autumn 2016 Cches I CSE 351 Autumn 2016 Instructor: Justin Hsi Teching Assistnts: Chris M Hunter Zhn John Kltenbch Kevin Bi

More information

COMP 423 lecture 11 Jan. 28, 2008

COMP 423 lecture 11 Jan. 28, 2008 COMP 423 lecture 11 Jn. 28, 2008 Up to now, we hve looked t how some symols in n lphet occur more frequently thn others nd how we cn sve its y using code such tht the codewords for more frequently occuring

More information

Extending Finite Automata to Efficiently Match Perl-Compatible Regular Expressions

Extending Finite Automata to Efficiently Match Perl-Compatible Regular Expressions Extening Finite Automt to Efficiently Mtch Perl-Comptible Regulr Expressions Michel Becchi Wshington University Computer Science n Engineering St. Louis, MO 63130-4899 mbecchi@cse.wustl.eu ABSTRACT Regulr

More information

ECE 550D Fundamentals of Computer Systems and Engineering. Fall 2017

ECE 550D Fundamentals of Computer Systems and Engineering. Fall 2017 ECE 550D Funamentals of Computer Systems an Engineering Fall 017 Datapaths Prof. John Boar Duke University Slies are erive from work by Profs. Tyler Bletch an Anrew Hilton (Duke) an Amir Roth (Penn) What

More information

Midterm 2 Sample solution

Midterm 2 Sample solution Nme: Instructions Midterm 2 Smple solution CMSC 430 Introduction to Compilers Fll 2012 November 28, 2012 This exm contins 9 pges, including this one. Mke sure you hve ll the pges. Write your nme on the

More information

2 Computing all Intersections of a Set of Segments Line Segment Intersection

2 Computing all Intersections of a Set of Segments Line Segment Intersection 15-451/651: Design & Anlysis of Algorithms Novemer 14, 2016 Lecture #21 Sweep-Line nd Segment Intersection lst chnged: Novemer 8, 2017 1 Preliminries The sweep-line prdigm is very powerful lgorithmic design

More information

Small Business Networking

Small Business Networking Why network is n essentil productivity tool for ny smll business Effective technology is essentil for smll businesses looking to increse the productivity of their people nd business. Introducing technology

More information

Section 10.4 Hyperbolas

Section 10.4 Hyperbolas 66 Section 10.4 Hyperbols Objective : Definition of hyperbol & hyperbols centered t (0, 0). The third type of conic we will study is the hyperbol. It is defined in the sme mnner tht we defined the prbol

More information

Small Business Networking

Small Business Networking Why network is n essentil productivity tool for ny smll business Effective technology is essentil for smll businesses looking to increse the productivity of their people nd business. Introducing technology

More information

Dynamic Programming. Andreas Klappenecker. [partially based on slides by Prof. Welch] Monday, September 24, 2012

Dynamic Programming. Andreas Klappenecker. [partially based on slides by Prof. Welch] Monday, September 24, 2012 Dynmic Progrmming Andres Klppenecker [prtilly bsed on slides by Prof. Welch] 1 Dynmic Progrmming Optiml substructure An optiml solution to the problem contins within it optiml solutions to subproblems.

More information

Chapter 2. 3/28/2004 H133 Spring

Chapter 2. 3/28/2004 H133 Spring Chpter 2 Newton believe tht light ws me up of smll prticles. This point ws ebte by scientists for mny yers n it ws not until the 1800 s when series of experiments emonstrte wve nture of light. (But be

More information

Small Business Networking

Small Business Networking Why network is n essentil productivity tool for ny smll business Effective technology is essentil for smll businesses looking to increse the productivity of their people nd business. Introducing technology

More information

Systems I. Logic Design I. Topics Digital logic Logic gates Simple combinational logic circuits

Systems I. Logic Design I. Topics Digital logic Logic gates Simple combinational logic circuits Systems I Logic Design I Topics Digitl logic Logic gtes Simple comintionl logic circuits Simple C sttement.. C = + ; Wht pieces of hrdwre do you think you might need? Storge - for vlues,, C Computtion

More information

Small Business Networking

Small Business Networking Why network is n essentil productivity tool for ny smll business Effective technology is essentil for smll businesses looking to increse the productivity of their people nd processes. Introducing technology

More information

CSCI 104. Rafael Ferreira da Silva. Slides adapted from: Mark Redekopp and David Kempe

CSCI 104. Rafael Ferreira da Silva. Slides adapted from: Mark Redekopp and David Kempe CSCI 0 fel Ferreir d Silv rfsilv@isi.edu Slides dpted from: Mrk edekopp nd Dvid Kempe LOG STUCTUED MEGE TEES Series Summtion eview Let n = + + + + k $ = #%& #. Wht is n? n = k+ - Wht is log () + log ()

More information

Unit 5 Vocabulary. A function is a special relationship where each input has a single output.

Unit 5 Vocabulary. A function is a special relationship where each input has a single output. MODULE 3 Terms Definition Picture/Exmple/Nottion 1 Function Nottion Function nottion is n efficient nd effective wy to write functions of ll types. This nottion llows you to identify the input vlue with

More information

Complete Coverage Path Planning of Mobile Robot Based on Dynamic Programming Algorithm Peng Zhou, Zhong-min Wang, Zhen-nan Li, Yang Li

Complete Coverage Path Planning of Mobile Robot Based on Dynamic Programming Algorithm Peng Zhou, Zhong-min Wang, Zhen-nan Li, Yang Li 2nd Interntionl Conference on Electronic & Mechnicl Engineering nd Informtion Technology (EMEIT-212) Complete Coverge Pth Plnning of Mobile Robot Bsed on Dynmic Progrmming Algorithm Peng Zhou, Zhong-min

More information

ECEN 468 Advanced Logic Design Lecture 36: RTL Optimization

ECEN 468 Advanced Logic Design Lecture 36: RTL Optimization ECEN 468 Advnced Logic Design Lecture 36: RTL Optimiztion ECEN 468 Lecture 36 RTL Design Optimiztions nd Trdeoffs 6.5 While creting dtpth during RTL design, there re severl optimiztions nd trdeoffs, involving

More information

Geometric transformations

Geometric transformations Geometric trnsformtions Computer Grphics Some slides re bsed on Shy Shlom slides from TAU mn n n m m T A,,,,,, 2 1 2 22 12 1 21 11 Rows become columns nd columns become rows nm n n m m A,,,,,, 1 1 2 22

More information

Fig.25: the Role of LEX

Fig.25: the Role of LEX The Lnguge for Specifying Lexicl Anlyzer We shll now study how to uild lexicl nlyzer from specifiction of tokens in the form of list of regulr expressions The discussion centers round the design of n existing

More information

Functor (1A) Young Won Lim 8/2/17

Functor (1A) Young Won Lim 8/2/17 Copyright (c) 2016-2017 Young W. Lim. Permission is grnted to copy, distribute nd/or modify this document under the terms of the GNU Free Documenttion License, Version 1.2 or ny lter version published

More information

Small Business Networking

Small Business Networking Why network is n essentil productivity tool for ny smll business Effective technology is essentil for smll businesses looking to increse the productivity of their people nd processes. Introducing technology

More information

pdfapilot Server 2 Manual

pdfapilot Server 2 Manual pdfpilot Server 2 Mnul 2011 by clls softwre gmbh Schönhuser Allee 6/7 D 10119 Berlin Germny info@cllssoftwre.com www.cllssoftwre.com Mnul clls pdfpilot Server 2 Pge 2 clls pdfpilot Server 2 Mnul Lst modified:

More information

Caches I. CSE 351 Autumn 2018

Caches I. CSE 351 Autumn 2018 Cches I CSE 351 Autumn 2018 Instructors: Mx Willsey Luis Ceze Teching Assistnts: Britt Henderson Luks Joswik Josie Lee Wei Lin Dniel Snitkovsky Luis Veg Kory Wtson Ivy Yu Alt text: I looked t some of the

More information

CPSC 213. Polymorphism. Introduction to Computer Systems. Readings for Next Two Lectures. Back to Procedure Calls

CPSC 213. Polymorphism. Introduction to Computer Systems. Readings for Next Two Lectures. Back to Procedure Calls Redings for Next Two Lectures Text CPSC 213 Switch Sttements, Understnding Pointers - 2nd ed: 3.6.7, 3.10-1st ed: 3.6.6, 3.11 Introduction to Computer Systems Unit 1f Dynmic Control Flow Polymorphism nd

More information

MA1008. Calculus and Linear Algebra for Engineers. Course Notes for Section B. Stephen Wills. Department of Mathematics. University College Cork

MA1008. Calculus and Linear Algebra for Engineers. Course Notes for Section B. Stephen Wills. Department of Mathematics. University College Cork MA1008 Clculus nd Liner Algebr for Engineers Course Notes for Section B Stephen Wills Deprtment of Mthemtics University College Cork s.wills@ucc.ie http://euclid.ucc.ie/pges/stff/wills/teching/m1008/ma1008.html

More information

Functor (1A) Young Won Lim 10/5/17

Functor (1A) Young Won Lim 10/5/17 Copyright (c) 2016-2017 Young W. Lim. Permission is grnted to copy, distribute nd/or modify this document under the terms of the GNU Free Documenttion License, Version 1.2 or ny lter version published

More information

Small Business Networking

Small Business Networking Why network is n essentil productivity tool for ny smll business Effective technology is essentil for smll businesses looking to increse the productivity of their people nd processes. Introducing technology

More information

What do all those bits mean now? Number Systems and Arithmetic. Introduction to Binary Numbers. Questions About Numbers

What do all those bits mean now? Number Systems and Arithmetic. Introduction to Binary Numbers. Questions About Numbers Wht do ll those bits men now? bits (...) Number Systems nd Arithmetic or Computers go to elementry school instruction R-formt I-formt... integer dt number text chrs... floting point signed unsigned single

More information

CS321 Languages and Compiler Design I. Winter 2012 Lecture 5

CS321 Languages and Compiler Design I. Winter 2012 Lecture 5 CS321 Lnguges nd Compiler Design I Winter 2012 Lecture 5 1 FINITE AUTOMATA A non-deterministic finite utomton (NFA) consists of: An input lphet Σ, e.g. Σ =,. A set of sttes S, e.g. S = {1, 3, 5, 7, 11,

More information

12-B FRACTIONS AND DECIMALS

12-B FRACTIONS AND DECIMALS -B Frctions nd Decimls. () If ll four integers were negtive, their product would be positive, nd so could not equl one of them. If ll four integers were positive, their product would be much greter thn

More information

Looking up objects in Pastry

Looking up objects in Pastry Review: Pstry routing tbles 0 1 2 3 4 7 8 9 b c d e f 0 1 2 3 4 7 8 9 b c d e f 0 1 2 3 4 7 8 9 b c d e f 0 2 3 4 7 8 9 b c d e f Row0 Row 1 Row 2 Row 3 Routing tble of node with ID i =1fc s - For ech

More information

Questions About Numbers. Number Systems and Arithmetic. Introduction to Binary Numbers. Negative Numbers?

Questions About Numbers. Number Systems and Arithmetic. Introduction to Binary Numbers. Negative Numbers? Questions About Numbers Number Systems nd Arithmetic or Computers go to elementry school How do you represent negtive numbers? frctions? relly lrge numbers? relly smll numbers? How do you do rithmetic?

More information

Small Business Networking

Small Business Networking Why network is n essentil productivity tool for ny smll business Effective technology is essentil for smll businesses looking to increse the productivity of their people nd processes. Introducing technology

More information

Introduction to hardware design using VHDL

Introduction to hardware design using VHDL Introuction to hrwre esign using VHDL Tim Güneysu n Nele Mentens ECC school Novemer 11, 2017, Nijmegen Outline Implementtion pltforms Introuction to VHDL Hrwre tutoril 1 Implementtion pltforms Microprocessor

More information

Scanner Termination. Multi Character Lookahead. to its physical end. Most parsers require an end of file token. Lex and Jlex automatically create an

Scanner Termination. Multi Character Lookahead. to its physical end. Most parsers require an end of file token. Lex and Jlex automatically create an Scnner Termintion A scnner reds input chrcters nd prtitions them into tokens. Wht hppens when the end of the input file is reched? It my be useful to crete n Eof pseudo-chrcter when this occurs. In Jv,

More information

Stack. A list whose end points are pointed by top and bottom

Stack. A list whose end points are pointed by top and bottom 4. Stck Stck A list whose end points re pointed by top nd bottom Insertion nd deletion tke plce t the top (cf: Wht is the difference between Stck nd Arry?) Bottom is constnt, but top grows nd shrinks!

More information

CS 241. Fall 2017 Midterm Review Solutions. October 24, Bits and Bytes 1. 3 MIPS Assembler 6. 4 Regular Languages 7.

CS 241. Fall 2017 Midterm Review Solutions. October 24, Bits and Bytes 1. 3 MIPS Assembler 6. 4 Regular Languages 7. CS 241 Fll 2017 Midterm Review Solutions Octoer 24, 2017 Contents 1 Bits nd Bytes 1 2 MIPS Assemly Lnguge Progrmming 2 3 MIPS Assemler 6 4 Regulr Lnguges 7 5 Scnning 9 1 Bits nd Bytes 1. Give two s complement

More information

Transparent neutral-element elimination in MPI reduction operations

Transparent neutral-element elimination in MPI reduction operations Trnsprent neutrl-element elimintion in MPI reduction opertions Jesper Lrsson Träff Deprtment of Scientific Computing University of Vienn Disclimer Exploiting repetition nd sprsity in input for reducing

More information

Algorithm Design (5) Text Search

Algorithm Design (5) Text Search Algorithm Design (5) Text Serch Tkshi Chikym School of Engineering The University of Tokyo Text Serch Find sustring tht mtches the given key string in text dt of lrge mount Key string: chr x[m] Text Dt:

More information

What do all those bits mean now? Number Systems and Arithmetic. Introduction to Binary Numbers. Questions About Numbers

What do all those bits mean now? Number Systems and Arithmetic. Introduction to Binary Numbers. Questions About Numbers Wht do ll those bits men now? bits (...) Number Systems nd Arithmetic or Computers go to elementry school instruction R-formt I-formt... integer dt number text chrs... floting point signed unsigned single

More information

Stack Manipulation. Other Issues. How about larger constants? Frame Pointer. PowerPC. Alternative Architectures

Stack Manipulation. Other Issues. How about larger constants? Frame Pointer. PowerPC. Alternative Architectures Other Issues Stck Mnipultion support for procedures (Refer to section 3.6), stcks, frmes, recursion mnipulting strings nd pointers linkers, loders, memory lyout Interrupts, exceptions, system clls nd conventions

More information

Many analog implementations of CPG exist, typically using operational amplifier or

Many analog implementations of CPG exist, typically using operational amplifier or FPGA Implementtion of Centrl Pttern Genertor By Jmes J Lin Introuction: Mny nlog implementtions of CPG exist, typiclly using opertionl mplifier or trnsistor level circuits. These types of circuits hve

More information

6.2 Volumes of Revolution: The Disk Method

6.2 Volumes of Revolution: The Disk Method mth ppliction: volumes by disks: volume prt ii 6 6 Volumes of Revolution: The Disk Method One of the simplest pplictions of integrtion (Theorem 6) nd the ccumultion process is to determine so-clled volumes

More information

Fault injection attacks on cryptographic devices and countermeasures Part 2

Fault injection attacks on cryptographic devices and countermeasures Part 2 Fult injection ttcks on cryptogrphic devices nd countermesures Prt Isrel Koren Deprtment of Electricl nd Computer Engineering University of Msschusetts Amherst, MA Countermesures - Exmples Must first detect

More information

Dr. D.M. Akbar Hussain

Dr. D.M. Akbar Hussain Dr. D.M. Akr Hussin Lexicl Anlysis. Bsic Ide: Red the source code nd generte tokens, it is similr wht humns will do to red in; just tking on the input nd reking it down in pieces. Ech token is sequence

More information

Data-Flow Prescheduling for Large Instruction Windows in Out-of-Order Processors

Data-Flow Prescheduling for Large Instruction Windows in Out-of-Order Processors Dt-Flow Prescheduling for Lrge Instruction Windows in Out-of-Order Processors Pierre Michud, André Seznec IRISA/INRIA Cmpus de Beulieu, 35 Rennes Cedex, Frnce {pmichud, seznec}@iris.fr Abstrct The performnce

More information

Alignment of Long Sequences. BMI/CS Spring 2012 Colin Dewey

Alignment of Long Sequences. BMI/CS Spring 2012 Colin Dewey Alignment of Long Sequences BMI/CS 776 www.biostt.wisc.edu/bmi776/ Spring 2012 Colin Dewey cdewey@biostt.wisc.edu Gols for Lecture the key concepts to understnd re the following how lrge-scle lignment

More information

Fall 2018 Midterm 1 October 11, ˆ You may not ask questions about the exam except for language clarifications.

Fall 2018 Midterm 1 October 11, ˆ You may not ask questions about the exam except for language clarifications. 15-112 Fll 2018 Midterm 1 October 11, 2018 Nme: Andrew ID: Recittion Section: ˆ You my not use ny books, notes, extr pper, or electronic devices during this exm. There should be nothing on your desk or

More information

Data sharing in OpenMP

Data sharing in OpenMP Dt shring in OpenMP Polo Burgio polo.burgio@unimore.it Outline Expressing prllelism Understnding prllel threds Memory Dt mngement Dt cluses Synchroniztion Brriers, locks, criticl sections Work prtitioning

More information

George Boole. IT 3123 Hardware and Software Concepts. Switching Algebra. Boolean Functions. Boolean Functions. Truth Tables

George Boole. IT 3123 Hardware and Software Concepts. Switching Algebra. Boolean Functions. Boolean Functions. Truth Tables George Boole IT 3123 Hrdwre nd Softwre Concepts My 28 Digitl Logic The Little Mn Computer 1815 1864 British mthemticin nd philosopher Mny contriutions to mthemtics. Boolen lger: n lger over finite sets

More information

Example: 2:1 Multiplexer

Example: 2:1 Multiplexer Exmple: 2:1 Multiplexer Exmple #1 reg ; lwys @( or or s) egin if (s == 1') egin = ; else egin = ; 1 s B. Bs 114 Exmple: 2:1 Multiplexer Exmple #2 Normlly lwys include egin nd sttements even though they

More information

Overview. Network characteristics. Network architecture. Data dissemination. Network characteristics (cont d) Mobile computing and databases

Overview. Network characteristics. Network architecture. Data dissemination. Network characteristics (cont d) Mobile computing and databases Overview Mobile computing nd dtbses Generl issues in mobile dt mngement Dt dissemintion Dt consistency Loction dependent queries Interfces Detils of brodcst disks thlis klfigopoulos Network rchitecture

More information

Enginner To Engineer Note

Enginner To Engineer Note Technicl Notes on using Anlog Devices DSP components nd development tools from the DSP Division Phone: (800) ANALOG-D, FAX: (781) 461-3010, EMAIL: dsp_pplictions@nlog.com, FTP: ftp.nlog.com Using n ADSP-2181

More information

Slides for Data Mining by I. H. Witten and E. Frank

Slides for Data Mining by I. H. Witten and E. Frank Slides for Dt Mining y I. H. Witten nd E. Frnk Simplicity first Simple lgorithms often work very well! There re mny kinds of simple structure, eg: One ttriute does ll the work All ttriutes contriute eqully

More information

INTRODUCTION TO SIMPLICIAL COMPLEXES

INTRODUCTION TO SIMPLICIAL COMPLEXES INTRODUCTION TO SIMPLICIAL COMPLEXES CASEY KELLEHER AND ALESSANDRA PANTANO 0.1. Introduction. In this ctivity set we re going to introduce notion from Algebric Topology clled simplicil homology. The min

More information

CSCI 3130: Formal Languages and Automata Theory Lecture 12 The Chinese University of Hong Kong, Fall 2011

CSCI 3130: Formal Languages and Automata Theory Lecture 12 The Chinese University of Hong Kong, Fall 2011 CSCI 3130: Forml Lnguges nd utomt Theory Lecture 12 The Chinese University of Hong Kong, Fll 2011 ndrej Bogdnov In progrmming lnguges, uilding prse trees is significnt tsk ecuse prse trees tell us the

More information

Small Business Networking

Small Business Networking Why network is n essentil productivity tool for ny smll business Effective technology is essentil for smll businesses looking to increse the productivity of their people nd processes. Introducing technology

More information

Engineer-to-Engineer Note

Engineer-to-Engineer Note Engineer-to-Engineer Note EE-232 Technicl notes on using Anlog Devices DSPs, processors nd development tools Contct our technicl support t dsp.support@nlog.com nd t dsptools.support@nlog.com Or visit our

More information

Engineer To Engineer Note

Engineer To Engineer Note Engineer To Engineer Note EE-186 Technicl Notes on using Anlog Devices' DSP components nd development tools Contct our technicl support by phone: (800) ANALOG-D or e-mil: dsp.support@nlog.com Or visit

More information

Readings : Computer Networking. Outline. The Next Internet: More of the Same? Required: Relevant earlier meeting:

Readings : Computer Networking. Outline. The Next Internet: More of the Same? Required: Relevant earlier meeting: Redings 15-744: Computer Networking L-14 Future Internet Architecture Required: Servl pper Extr reding on Mobility First Relevnt erlier meeting: CCN -> Nmed Dt Network 2 Outline The Next Internet: More

More information

SIMPLIFYING ALGEBRA PASSPORT.

SIMPLIFYING ALGEBRA PASSPORT. SIMPLIFYING ALGEBRA PASSPORT www.mthletics.com.u This booklet is ll bout turning complex problems into something simple. You will be ble to do something like this! ( 9- # + 4 ' ) ' ( 9- + 7-) ' ' Give

More information

ΕΠΛ323 - Θεωρία και Πρακτική Μεταγλωττιστών

ΕΠΛ323 - Θεωρία και Πρακτική Μεταγλωττιστών ΕΠΛ323 - Θωρία και Πρακτική Μταγλωττιστών Lecture 3 Lexicl Anlysis Elis Athnsopoulos elisthn@cs.ucy.c.cy Recognition of Tokens if expressions nd reltionl opertors if è if then è then else è else relop

More information

An Integrated Simulation System for Human Factors Study

An Integrated Simulation System for Human Factors Study An Integrted Simultion System for Humn Fctors Study Ying Wng, Wei Zhng Deprtment of Industril Engineering, Tsinghu University, Beijing 100084, Chin Foud Bennis, Dmien Chblt IRCCyN, Ecole Centrle de Nntes,

More information

CS311H: Discrete Mathematics. Graph Theory IV. A Non-planar Graph. Regions of a Planar Graph. Euler s Formula. Instructor: Işıl Dillig

CS311H: Discrete Mathematics. Graph Theory IV. A Non-planar Graph. Regions of a Planar Graph. Euler s Formula. Instructor: Işıl Dillig CS311H: Discrete Mthemtics Grph Theory IV Instructor: Işıl Dillig Instructor: Işıl Dillig, CS311H: Discrete Mthemtics Grph Theory IV 1/25 A Non-plnr Grph Regions of Plnr Grph The plnr representtion of

More information

Mobility Support for a QoS Aggregation Protocol

Mobility Support for a QoS Aggregation Protocol Mobility Support for QoS Aggregtion Protocol A. Kloxylos^, D. Vli*, S. Psklis+, G. Pngiotou^, I. Goninkis^, E. Zervs # ^ Deprtment of Telecommunictions Science n Technology, University of Peloponnese,

More information

EECS 281: Homework #4 Due: Thursday, October 7, 2004

EECS 281: Homework #4 Due: Thursday, October 7, 2004 EECS 28: Homework #4 Due: Thursdy, October 7, 24 Nme: Emil:. Convert the 24-bit number x44243 to mime bse64: QUJD First, set is to brek 8-bit blocks into 6-bit blocks, nd then convert: x44243 b b 6 2 9

More information

P(r)dr = probability of generating a random number in the interval dr near r. For this probability idea to make sense we must have

P(r)dr = probability of generating a random number in the interval dr near r. For this probability idea to make sense we must have Rndom Numers nd Monte Crlo Methods Rndom Numer Methods The integrtion methods discussed so fr ll re sed upon mking polynomil pproximtions to the integrnd. Another clss of numericl methods relies upon using

More information

File Manager Quick Reference Guide. June Prepared for the Mayo Clinic Enterprise Kahua Deployment

File Manager Quick Reference Guide. June Prepared for the Mayo Clinic Enterprise Kahua Deployment File Mnger Quick Reference Guide June 2018 Prepred for the Myo Clinic Enterprise Khu Deployment NVIGTION IN FILE MNGER To nvigte in File Mnger, users will mke use of the left pne to nvigte nd further pnes

More information

Today. Search Problems. Uninformed Search Methods. Depth-First Search Breadth-First Search Uniform-Cost Search

Today. Search Problems. Uninformed Search Methods. Depth-First Search Breadth-First Search Uniform-Cost Search Uninformed Serch [These slides were creted by Dn Klein nd Pieter Abbeel for CS188 Intro to AI t UC Berkeley. All CS188 mterils re vilble t http://i.berkeley.edu.] Tody Serch Problems Uninformed Serch Methods

More information

Sample Midterm Solutions COMS W4115 Programming Languages and Translators Monday, October 12, 2009

Sample Midterm Solutions COMS W4115 Programming Languages and Translators Monday, October 12, 2009 Deprtment of Computer cience Columbi University mple Midterm olutions COM W4115 Progrmming Lnguges nd Trnsltors Mondy, October 12, 2009 Closed book, no ids. ch question is worth 20 points. Question 5(c)

More information

Essential Question What are some of the characteristics of the graph of a rational function?

Essential Question What are some of the characteristics of the graph of a rational function? 8. TEXAS ESSENTIAL KNOWLEDGE AND SKILLS A..A A..G A..H A..K Grphing Rtionl Functions Essentil Question Wht re some of the chrcteristics of the grph of rtionl function? The prent function for rtionl functions

More information

Agilent Mass Hunter Software

Agilent Mass Hunter Software Agilent Mss Hunter Softwre Quick Strt Guide Use this guide to get strted with the Mss Hunter softwre. Wht is Mss Hunter Softwre? Mss Hunter is n integrl prt of Agilent TOF softwre (version A.02.00). Mss

More information
