Datapath Background. This Unit: (Scalar In-Order) Pipelining. CIS 501 Computer Architecture. Readings
- Arleen Jennifer Clark
- 6 years ago
Transcription
1 This Unit: (Scalar In-Order) Pipelining
CIS 501 Computer Architecture, Unit 6: Pipelining
Figure: system layers (App, App, App; System software; Mem, CPU, I/O).
Principles of pipelining. Effects of overhead and hazards. Pipeline diagrams.
Data hazards: stalling and bypassing.
Control hazards: branch prediction. Predication.
Slides originally developed by Amir Roth with contributions by Milo Martin at University of Pennsylvania, with sources that include University of Wisconsin slides by Mark Hill, Guri Sohi, Jim Smith, and David Wood.
CIS 501 (Martin/Roth): Pipelining

Readings
H&P Appendix

Datapath Background
2 Datapath and Control
Datapath: implements the execute portion of the fetch/exec. loop. Functional units (ALUs), registers, memory interface.
Control: implements the decode portion of the fetch/execute loop. Mux selectors and write enable signals regulate the flow of data in the datapath. Part of decode involves translating the insn opcode into control signals.

Single-cycle Datapath
Single-cycle datapath: true "atomic" fetch/execute loop. Fetch, decode, execute one complete instruction every cycle.
"Hardwired control": opcode to control signals ROM.
Low CPI: 1 by definition.
Long clock period: to accommodate the slowest instruction.

Multi-Cycle Datapath
Multi-cycle datapath: attacks the slow clock. Fetch, decode, execute one complete insn over multiple cycles.
Micro-coded control: "stages" control signals. Allows insns to take different numbers of cycles (main point).
± Opposite of single-cycle: short clock period, high CPI (think: CISC).

Single-cycle vs. Multi-cycle Performance
Single-cycle: clock period = 50ns, CPI = 1. Performance = 50ns/insn.
Multi-cycle has the opposite performance split of single-cycle: shorter clock period, higher CPI.
Multi-cycle: branch: 20% (3 cycles), load: 20% (5 cycles), ALU: 60% (4 cycles).
Clock period = 11ns, CPI = (20%*3)+(20%*5)+(60%*4) = 4.
Why is the clock period 11ns and not 10ns?
Performance = 44ns/insn.
Aside: CISC makes perfect sense in a multi-cycle datapath.
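The single-cycle versus multi-cycle comparison above is plain arithmetic, so it can be checked with a few lines of Python. This is a minimal sketch: the function names are illustrative, and the 4-cycle ALU latency is an assumption inferred from the transcript, which drops the digit.

```python
# Sketch of the single-cycle vs. multi-cycle comparison.
# Assumption: ALU insns take 4 cycles (the digit is missing in the transcript).

def average_cpi(mix):
    """mix: iterable of (fraction_of_insns, cycles) pairs; fractions sum to 1."""
    return sum(fraction * cycles for fraction, cycles in mix)

def ns_per_insn(clock_ns, cpi):
    """Average time per instruction = clock period * CPI."""
    return clock_ns * cpi

single_cycle_ns = ns_per_insn(50, 1)
multi_cycle_cpi = average_cpi([(0.20, 3), (0.20, 5), (0.60, 4)])
multi_cycle_ns = ns_per_insn(11, multi_cycle_cpi)

print(f"single-cycle: {single_cycle_ns:.1f} ns/insn")
print(f"multi-cycle:  CPI {multi_cycle_cpi:.2f}, {multi_cycle_ns:.1f} ns/insn")
```

Under these numbers the multi-cycle design comes out slightly ahead, because its higher CPI is more than offset by the shorter clock.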
3 Latency versus Throughput
Single-cycle: insn0.fetch, dec, exec | insn1.fetch, dec, exec
Multi-cycle: insn0.fetch | insn0.dec | insn0.exec | insn1.fetch | insn1.dec | insn1.exec
Can we have both low CPI and short clock period? Not if the datapath executes only one insn at a time.

Pipelining Basics
Latency vs. throughput.
Latency: no good way to make a single insn go faster.
Throughput: fortunately, no one cares about single insn latency. The goal is to make programs, not individual insns, go faster. Programs contain billions of insns.
Key: exploit inter-insn parallelism.

Pipelining
Multi-cycle: insn0.fetch | insn0.dec | insn0.exec | insn1.fetch | insn1.dec | insn1.exec
Pipelined: insn0.fetch | insn0.dec, insn1.fetch | insn0.exec, insn1.dec | insn1.exec
Important performance technique. Improves instruction throughput rather than instruction latency.
Begin with a multi-cycle design. When an insn advances from stage 1 to 2, the next insn enters at stage 1.
Form of parallelism: "insn-stage parallelism".
Maintains the illusion of a sequential fetch/execute loop.
An individual instruction takes the same number of stages, but instructions enter and leave at a much faster rate.
Laundry analogy.

Five Stage Pipeline Datapath
Temporary values (PC, IR, A, B, O, D) are re-latched every stage. Why? 5 insns may be in the pipeline at once, with different PCs.
Notice, PC is not latched after the ALU stage (not needed later).
Pipelined control: one single-cycle controller. Control signals themselves are pipelined.
4 Five Stage Pipeline Performance
Figure: stage delays T insn-mem, T regfile, T ALU, T data-mem, T regfile, compared with T singlecycle.
Pipelining: cut the datapath into N stages (here five). One insn in each stage in each cycle.
Clock period = MAX(T insn-mem, T regfile, T ALU, T data-mem).
Base CPI = 1: an insn enters and leaves every cycle.
Actual CPI > 1: the pipeline must often stall.
Individual insn latency increases (pipeline overhead); that is not the point.

Pipeline Terminology
Five stages: Fetch, Decode, eXecute, Memory, Writeback.
Nothing magical about 5 stages (Pentium 4 had 22 stages!).
Latches (pipeline registers) are named by the stages they separate: PC, F/D, D/X, X/M, M/W.

More Terminology & Foreshadowing
Scalar pipeline: one insn per stage per cycle. Alternative: "superscalar" (later).
In-order pipeline: insns enter the execute stage in order. Alternative: "out-of-order" (later).
Pipeline depth: number of pipeline stages. Nothing magical about five. The trend has been to deeper pipelines (again, more later).

Instruction Convention
Different ISAs use inconsistent register orders.
Some ISAs (for example MIPS) put the instruction destination (i.e., output) on the left:
add $1, $2, $3 means $1 <- $2 + $3
Other ISAs put the instruction destination on the right:
add r1,r2,r3 means r1 + r2 -> r3
ld 8(r5),r4 means mem[r5+8] -> r4
st r4,8(r5) means r4 -> mem[r5+8]
Will try to specify to avoid confusion; the next slides use MIPS style.
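The clock-period rule above (the slowest stage sets the clock) can be illustrated with a short sketch. The stage delays and the 1 ns latch overhead below are hypothetical numbers chosen for illustration; they do not come from the slides.

```python
# Sketch: clock period of a pipelined vs. a single-cycle datapath.
# All delays below are hypothetical illustration values, not from the slides.

def single_cycle_clock_ns(stage_delays_ns):
    """A single-cycle datapath must fit every stage in one clock."""
    return sum(stage_delays_ns.values())

def pipelined_clock_ns(stage_delays_ns, latch_overhead_ns=0.0):
    """A pipeline is clocked by its slowest stage plus latch overhead."""
    return max(stage_delays_ns.values()) + latch_overhead_ns

stage_delays_ns = {
    "insn-mem": 10, "regfile-read": 8, "ALU": 12, "data-mem": 10, "regfile-write": 8,
}

single = single_cycle_clock_ns(stage_delays_ns)
pipelined = pipelined_clock_ns(stage_delays_ns, latch_overhead_ns=1.0)
insn_latency = pipelined * len(stage_delays_ns)

print(f"single-cycle clock: {single} ns")
print(f"pipelined clock:    {pipelined} ns (one insn completes per cycle at best)")
print(f"per-insn latency in the pipeline: {insn_latency} ns")
```

With these made-up delays the per-instruction latency rises from 48 ns to 65 ns while the best-case throughput improves from one insn per 48 ns to one per 13 ns, which is the trade the slide describes.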
5 Pipeline Example: Cycles 1 to 4
Figure frames: the five-stage datapath with three instructions entering one per cycle.
Cycle 1: add $3,$2,$1 in F.
Cycle 2: lw $4,0($5) in F; add in D.
Cycle 3: sw $6,4($7) in F; lw in D; add in X. 3 instructions in the pipeline.
Cycle 4: sw in D; lw in X; add in M. 3 instructions in the pipeline.
6 Pipeline Example: Cycles 5 to 7
Cycle 5: sw in X; lw in M; add in W.
Cycle 6: sw in M; lw in W.
Cycle 7: sw in W.

Pipeline Diagram
Pipeline diagram: shorthand for what we just saw. Across: cycles. Down: insns.
Convention: X means lw $4,0($5) finishes the execute stage and writes into the X/M latch at the end of cycle 4.
              1 2 3 4 5 6 7
add $3,$2,$1  F D X M W
lw $4,0($5)     F D X M W
sw $6,4($7)       F D X M W
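A pipeline diagram for the stall-free case is mechanical to produce: instruction i enters F in cycle i+1 and moves one stage per cycle. The following Python sketch prints such a diagram. The function name and the "." filler for idle cycles are illustrative choices, and the instruction strings assume the digit 4 that the transcript drops.

```python
# Sketch: text pipeline diagram for an ideal (stall-free) five-stage pipeline.
STAGES = ["F", "D", "X", "M", "W"]

def pipeline_diagram(insns):
    """Return one row per insn; insn i occupies stage k in cycle i + k + 1."""
    total_cycles = len(insns) + len(STAGES) - 1
    width = max(len(insn) for insn in insns)
    rows = []
    for index, insn in enumerate(insns):
        cells = ["."] * total_cycles          # "." marks a cycle with no activity
        for offset, stage in enumerate(STAGES):
            cells[index + offset] = stage
        rows.append(f"{insn:<{width}}  " + " ".join(cells))
    return rows

rows = pipeline_diagram(["add $3,$2,$1", "lw $4,0($5)", "sw $6,4($7)"])
for row in rows:
    print(row)
```

Stalls and bypasses are deliberately left out; they would shift or repeat cells in the affected rows.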
7 Example Pipeline Perf. Calculation
Single-cycle: clock period = 50ns, CPI = 1. Performance = 50ns/insn.
Multi-cycle: branch: 20% (3 cycles), load: 20% (5 cycles), ALU: 60% (4 cycles). Clock period = 11ns, CPI = (20%*3)+(20%*5)+(60%*4) = 4. Performance = 44ns/insn.
5-stage pipelined: clock period = 12ns, approx. (50ns / 5 stages) + overheads.
CPI = 1 (each insn takes 5 cycles, but 1 completes each cycle). Performance = 12ns/insn.
Well, actually CPI = 1 + some penalty for pipelining (next). With CPI = 1.5 (on average an insn completes every 1.5 cycles), performance = 18ns/insn.
Much higher performance than single-cycle or multi-cycle.

Q1: Why Is Pipeline Clock Period > (delay thru datapath) / (number of pipeline stages)?
A few reasons:
Latches add delay.
Extra "bypassing" logic adds delay.
Pipeline stages have different delays; the clock period is the max delay.
These factors have implications for the ideal number of pipeline stages: diminishing clock frequency gains for longer (deeper) pipelines.

Q2: Why Is Pipeline CPI > 1?
CPI for a scalar in-order pipeline is 1 + stall penalties.
Stalls are used to resolve hazards.
Hazard: condition that jeopardizes the sequential illusion.
Stall: pipeline delay introduced to restore the sequential illusion.
Calculating pipeline CPI: frequency of stall * stall cycles. Penalties add (stalls generally don't overlap in in-order pipelines):
CPI = 1 + stall-freq1*stall-cyc1 + stall-freq2*stall-cyc2 + ...
Correctness/performance/make common case fast (MCCF): long penalties are OK if they happen rarely, e.g., 1 + 0.01 * 10 = 1.1.
Stalls also have implications for the ideal number of pipeline stages.

Data Dependences, Pipeline Hazards, and Bypassing
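The CPI rule above (base CPI plus the sum of stall frequency times stall cycles) is easy to turn into a small calculator. This is a sketch with illustrative function names; it assumes, as the slide does, that stalls in an in-order pipeline do not overlap, so their penalties simply add.

```python
# Sketch: CPI of a scalar in-order pipeline as base CPI plus additive stall penalties.

def pipeline_cpi(stalls, base_cpi=1.0):
    """stalls: iterable of (stalls per instruction, cycles per stall) pairs."""
    return base_cpi + sum(freq * cycles for freq, cycles in stalls)

def ns_per_insn(clock_ns, cpi):
    return clock_ns * cpi

rare_long_stall_cpi = pipeline_cpi([(0.01, 10)])   # long penalty, rarely paid
pipelined_ns = ns_per_insn(12, 1.5)                # the slide's 12 ns clock, CPI 1.5

print(f"CPI with a 10-cycle stall on 1% of insns: {rare_long_stall_cpi:.2f}")
print(f"12 ns clock at CPI 1.5: {pipelined_ns:.1f} ns/insn")
```

The same function covers the later load-use and branch examples by passing a different list of (frequency, cycles) pairs.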
8 Dependences and Hazards
Dependence: relationship between two insns.
Data: two insns use the same storage location.
Control: one insn affects whether another executes at all.
Not a bad thing; programs would be boring without them.
Enforced by making the older insn go before the younger one. This happens naturally in single-/multi-cycle designs, but not in a pipeline.
Hazard: dependence & possibility of wrong insn order.
Effects of wrong insn order cannot be externally visible.
Stall: enforces order by keeping the younger insn in the same stage.
Hazards are a bad thing: stalls reduce performance.

Why Does Every Insn Take 5 Cycles?
Figure: add $3,$2,$1 and lw $4,0($5) in the pipeline.
Could/should we allow add to skip M and go to W? No.
It wouldn't help: peak fetch is still only 1 insn per cycle.
Structural hazards: imagine add follows lw.

Structural Hazards
Structural hazard: two insns trying to use the same circuit at the same time. E.g., a structural hazard on the register file write port.
To fix structural hazards: proper ISA/pipeline design. Each insn uses every structure exactly once, for at most one cycle, always at the same stage relative to F (fetch).
To tolerate structural hazards: add stall logic to stall the pipeline when hazards occur.

Example Structural Hazard
ld r2,0(r1)    F D X M W
add r1,r3,r4     F D X M W
sub r1,r3,r5       F D X M W
st r6,0(r1)          F D X M W
Structural hazard: resource needed twice in one cycle. Example: unified instruction & data memories (caches).
Solutions:
Separate instruction/data memories (caches).
Redesign the cache to allow 2 accesses per cycle (slow, expensive).
Stall the pipeline.
9 Data Hazards
Let's forget about branches and the control for a while.
The three insn sequence we saw earlier executed fine, but it wasn't a real program.
Real programs have data dependences: they pass values via registers and memory.

Dependent Operations
Independent operations:
add $3,$2,$1
add $6,$5,$4
Would this program execute correctly on a pipeline?
add $3,$2,$1
add $6,$5,$3
What about this program?
add $3,$2,$1
lw $4,0($3)
addi $6,1,$3
sw $3,0($7)

Data Hazards
Figure: sw $3,0($7), addi $6,1,$3, lw $4,0($3), and add $3,$2,$1 in consecutive pipeline stages.
Would this "program" execute correctly on this pipeline? Which insns would execute with correct inputs?
add is writing its result into $3 in the current cycle.
lw read $3 two cycles ago: got the wrong value.
addi read $3 one cycle ago: got the wrong value.
sw is reading $3 this cycle: maybe (depending on regfile design).

Memory Data Hazards
Figure: lw $4,0($1) following sw $5,0($1).
Are memory data hazards a problem for this pipeline? No.
lw following sw to the same address in the next cycle gets the right value. Why? Data mem read/write always take place in the same stage.
Data hazards through registers? Yes (previous slide).
They occur because the register write is three stages after the register read. A register value can only be read three cycles after writing it.
10 Observation!
Figure: lw $4,0($3) in X, add $3,$2,$1 in M.
Technically, this situation is broken:
lw $4,0($3) has already read $3 from the regfile.
add $3,$2,$1 hasn't yet written $3 to the regfile.
But fundamentally, everything is OK:
lw $4,0($3) hasn't actually used $3 yet.
add $3,$2,$1 has already computed $3.

Reducing Data Hazards: Bypassing
Bypassing: reading a value from an intermediate (µarchitectural) source, not waiting until it is available from the primary source.
Here, we are bypassing the register file.
Also called forwarding.

WX Bypassing
Figure: lw $4,0($3) in X, add $3,$2,$1 in W.
What about this combination? Add another bypass path and MUX (multiplexor) input.
The first one was an MX bypass. This one is a WX bypass.

ALUinB Bypassing
Figure: add $4,$2,$3 in X, add $3,$2,$1 in M.
Can also bypass to ALU input B.
11 WM Bypassing?
Figure: sw $3,0($4) in M, lw $3,0($2) in W.
Does WM bypassing make sense?
Not to the address input (why not?).
But to the store data input, yes.

Bypass Logic
Each MUX has its own bypass logic; here it is for MUX ALUinA:
(D/X.IR.RegSource1 == X/M.IR.RegDest) => 0
(D/X.IR.RegSource1 == M/W.IR.RegDest) => 1
Else => 2

Pipeline Diagrams with Bypassing
If a bypass exists, the "from"/"to" stages execute in the same cycle.
Example: full bypassing, use MX bypass
add r2,r3 -> r1   F D X M W
sub r1,r4 -> r2     F D X M W
Example: full bypassing, use WX bypass
add r2,r3 -> r1   F D X M W
ld [r7] -> r5       F D X M W
sub r1,r4 -> r2       F D X M W
Example: WM bypass
add r2,r3 -> r1   F D X M W
?                   F D X M W
Can you think of a code example that uses the WM bypass?

Have We Prevented All Data Hazards?
Figure: add $4,$2,$3 stalled in D behind lw $3,4($2), with a nop inserted.
No. Consider a load followed by a dependent add insn.
Bypassing alone isn't sufficient!
Hardware solution: detect this situation and inject a stall cycle.
Software solution: ensure the compiler doesn't generate such code.
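The three-way select for the ALU input A mux described above can be written as a small function. This is a sketch, not the slides' hardware description: register numbers are plain integers, and using None for "no source" or "no destination" is an assumption introduced here so that instructions that do not write a register never match.

```python
# Sketch: select signal for the ALUinA bypass mux.
# Assumption: None means "this insn has no such register operand".

MX_BYPASS, WX_BYPASS, REGFILE = 0, 1, 2

def alu_in_a_select(dx_src1, xm_dest, mw_dest):
    """Pick where the insn in X gets its first ALU operand from."""
    if dx_src1 is None:
        return REGFILE
    if dx_src1 == xm_dest:        # producer is one insn ahead: value sits in X/M
        return MX_BYPASS
    if dx_src1 == mw_dest:        # producer is two insns ahead: value sits in M/W
        return WX_BYPASS
    return REGFILE                # no in-flight producer: use the register file

# add r2,r3 -> r1 immediately followed by sub r1,r4 -> r2 uses the MX bypass.
print(alu_in_a_select(dx_src1=1, xm_dest=1, mw_dest=None))
# With one unrelated insn in between, the same pair uses the WX bypass.
print(alu_in_a_select(dx_src1=1, xm_dest=5, mw_dest=1))
```

Checking X/M before M/W matters: if both latches hold a write to the same register, the younger value in X/M is the one the consumer should see.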
12 Stalling to Avoid Data Hazards
Prevent the F/D insn from reading (advancing) this cycle.
Write a nop into D/X.IR (effectively, insert a nop in hardware). Also reset (clear) the datapath control signals.
Disable the F/D latch and PC write enables (why?).
Re-evaluate the situation next cycle.

Stalling on Load-To-Use Dependences
Figure (three animation frames): add $4,$2,$3 is held in D while a nop (stall bubble) is inserted between it and lw $3,4($2).
Stall = (D/X.IR.Operation == LOAD) &&
        ((F/D.IR.RegSrc1 == D/X.IR.RegDest) ||
         ((F/D.IR.RegSrc2 == D/X.IR.RegDest) && (F/D.IR.Op != STORE)))
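The load-to-use stall condition above translates directly into a predicate over the instructions held in the F/D and D/X latches. The sketch below uses a hypothetical Insn record with integer register numbers; the field names and the use of None for a missing operand are assumptions made for illustration.

```python
# Sketch: load-to-use stall detection for a five-stage pipeline with bypassing.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Insn:
    op: str                      # e.g. "LOAD", "STORE", "ADD" (hypothetical encoding)
    src1: Optional[int] = None   # first source register
    src2: Optional[int] = None   # second source register (store data for STORE)
    dest: Optional[int] = None   # destination register, None if the insn writes nothing

def load_use_stall(fd: Insn, dx: Insn) -> bool:
    """True when the insn in F/D must wait one cycle for the load in D/X."""
    if dx.op != "LOAD" or dx.dest is None:
        return False
    src1_hit = fd.src1 == dx.dest
    # A store's data operand is not needed until M, so it does not force a stall.
    src2_hit = fd.src2 == dx.dest and fd.op != "STORE"
    return src1_hit or src2_hit

load = Insn("LOAD", src1=2, dest=3)                    # lw $3,4($2)
dependent_add = Insn("ADD", src1=2, src2=3, dest=4)    # add $4,$2,$3
store_of_loaded = Insn("STORE", src1=7, src2=3)        # sw $3,0($7)

print(load_use_stall(dependent_add, load))     # True: insert one bubble
print(load_use_stall(store_of_loaded, load))   # False: store data can arrive later
```

The store exception mirrors the `Op != STORE` term in the formula: the loaded value can reach the store's data input through a later bypass.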
13 Performance Impact of Load/Use Penalty
Assume: branch: 20%, load: 20%, store: 10%, other: 50%.
50% of loads are followed by a dependent instruction and require a 1 cycle stall (i.e., insertion of 1 nop).
Calculate CPI: CPI = 1 + (1 * 20% * 50%) = 1.1

Reducing Load-Use Stall Frequency
d* = data hazard stall
add $3,$2,$1   F D X M W
lw $4,4($3)      F D X M W
addi $6,$4,1       F d* D X M W
sub $8,$3,$1           F D X M W
Use compiler scheduling to reduce load-use stall frequency. More on compiler scheduling later.
add $3,$2,$1   F D X M W
lw $4,4($3)      F D X M W
sub $8,$3,$1       F D X M W
addi $6,$4,1         F D X M W

Pipelining and Multi-Cycle Operations
What if you wanted to add a multi-cycle operation? E.g., a 4-cycle multiply.
P/W: separate output latch connects to the W stage.
Controlled by the pipeline control finite state machine (FSM).

Pipelined Multiplier
Figure: multiplier stages separated by latches P0/P1, P1/P2, P2/P3, P3/W.
The multiplier itself is often pipelined. What does this mean?
Product/multiplicand registers, ALUs, and latches are replicated.
Different multiply operations can start in consecutive cycles.
14 Pipeline Diagram with Multiplier
mul $4,$3,$5   F  D  P0 P1 P2 P3 W
addi $6,$4,1      F  D  d* d* d* X  M  W
What about two instructions trying to write the regfile in the same cycle? Structural hazard! Must prevent:
mul $4,$3,$5   F  D  P0 P1 P2 P3 W
addi $6,$1,1      F  D  X  M  W
add $5,$6,$10        F  D  X  M  W

More Multiplier Nasties
What about mis-ordered writes to the same register? Software thinks add gets $4 from addi; it actually gets it from mul.
mul $4,$3,$5   F  D  P0 P1 P2 P3 W
addi $4,$1,1      F  D  X  M  W
add $10,$4,$6        F  D  X  M  W
Common? Not for a 4-cycle multiply with a 5-stage pipeline. More common with deeper pipelines. In any case, must be correct.

Corrected Pipeline Diagram
With the correct stall logic: prevent mis-ordered writes to the same register. Why two cycles of delay?
mul $4,$3,$5   F  D  P0 P1 P2 P3 W
addi $4,$1,1      F  d* d* D  X  M  W
add $10,$4,$6              F  D  X  M  W
Multi-cycle operations complicate pipeline logic.

Pipelined Functional Units
Almost all multi-cycle functional units are pipelined.
Each operation takes N cycles, but a new (independent) operation can be initiated every cycle.
Requires internal latching and some hardware replication.
A cheaper way to add bandwidth than multiple non-pipelined units.
mulf f0,f1,f2  F  D  E* E* E* E* W
mulf f3,f4,f5     F  D  E* E* E* E* W
One exception: int/FP divide, which is difficult to pipeline and not worth it.
divf f0,f1,f2  F  D  E/ E/ E/ E/ W
divf f3,f4,f5     F  D  s* s* s* E/ E/ E/ E/ W
s* = structural hazard, two insns need the same structure.
ISAs and pipelines are designed to have few of these.
Canonical example: all insns forced to go through the M stage.
15 What About Branches?
Control hazards options:
Could just stall to wait for the branch outcome (two-cycle penalty).
Fetch past branch insns before the branch outcome is known.
Default: assume "not-taken" (at fetch, can't tell it's a branch).

Control Dependences and Branch Prediction

Branch Recovery
Branch recovery: what to do when a branch is actually taken.
Insns that will be written into F/D and D/X are wrong.
Flush them, i.e., replace them with nops.
They haven't yet written permanent state (regfile, DMem).
Two cycle penalty for taken branches.

Branch Performance
Back of the envelope calculation.
Branch: 20%, load: 20%, store: 10%, other: 50%.
Say, 75% of branches are taken.
CPI = 1 + 20% * 75% * 2 = 1 + 0.20 * 0.75 * 2 = 1.3
Branches cause a 30% slowdown. Even worse with deeper pipelines.
How do we reduce this penalty?
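The back-of-the-envelope branch calculation above can be reproduced with a one-line formula. This sketch uses illustrative names; the only inputs are the slide's own figures (20% branches, 75% of them taken, a two-cycle flush).

```python
# Sketch: CPI impact of flushing on taken branches (predict not-taken by default).

def cpi_with_branch_penalty(branch_fraction, penalized_fraction, penalty_cycles,
                            base_cpi=1.0):
    """penalized_fraction: share of branches that pay the penalty."""
    return base_cpi + branch_fraction * penalized_fraction * penalty_cycles

flush_on_taken_cpi = cpi_with_branch_penalty(0.20, 0.75, 2)
slowdown_percent = (flush_on_taken_cpi - 1.0) * 100

print(f"CPI = {flush_on_taken_cpi:.2f} ({slowdown_percent:.0f}% slowdown)")
```

Lowering either the penalty per branch or the fraction of branches that pay it reduces the result, which is what the techniques on the following slides aim at.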
16 Big Idea: Speculative Execution
Speculation: "risky transactions on chance of profit".
Speculative execution: execute before all parameters are known with certainty.
Correct speculation: avoid stall, improve performance.
Incorrect speculation (mis-speculation): must abort/flush/squash incorrect insns; must undo incorrect changes (recover pre-speculation state).
The "game": [%correct * gain] - [(1 - %correct) * penalty]
Control speculation: speculation aimed at control hazards.
Unknown parameter: are these the correct insns to execute next?

Control Speculation and Recovery
Correct:
addi r1,1 -> r3         F D X M W
bnez r3,targ              F D X M W
st r6 -> [r7]               F D X M W   (speculative)
targ: add r4,r5 -> r4         F D X M W   (speculative)
Mis-speculation recovery: what to do on a wrong guess.
Not too painful in an in-order pipeline: the branch resolves in X, and younger insns (in F, D) haven't changed permanent state.
Flush insns currently in F/D and D/X (i.e., replace with nops).
Recovery:
addi r1,1 -> r3         F D X M W
bnez r3,targ              F D X M W
st r6 -> [r7]               F D             (flushed)
targ: add r4,r5 -> r4         F             (flushed)
targ: add r4,r5 -> r4           F D X M W

Reducing Penalty: Fast Branches
Fast branch: targets the control-hazard penalty.
Basically, branch insns that can resolve at D, not X.
The test must be a comparison to zero or equality; no time for the ALU.
New taken branch penalty is 1.
Additional comparison insns (e.g., cmplt, slt) are needed for complex tests.
Must bypass into the decode stage now, too.
bnez r3,targ          F D X M W
targ: add r4,r5,r4        F D X M W

Fast Branch Performance
Assume: branch: 20%, 75% of branches are taken.
CPI = 1 + 20% * 75% * 1 = 1 + 0.20*0.75*1 = 1.15
15% slowdown (better than the 30% from before).
But wait, fast branches assume only simple comparisons. Fine for MIPS, but not fine for ISAs with "branch if $1 > $2" operations.
In such cases, say 25% of branches require an extra insn:
CPI = 1 + (20% * 75% * 1) + 20%*25%*1 (extra insn) = 1.2
Example of ISA and micro-architecture interaction: the type of branch instructions.
Another option: "delayed branch" or "branch delay slot".
What about condition codes?
17 Fewer Mispredictions: Branch Prediction
Figure: pipeline with a branch predictor (BP) feeding the PC at fetch.
Dynamic branch prediction: hardware guesses the outcome.
Start fetching from the guessed address.
Flush on mis-prediction.

Branch Prediction Performance
Parameters: branch: 20%, load: 20%, store: 10%, other: 50%. 75% of branches are taken.
Dynamic branch prediction: branches predicted with 95% accuracy.
CPI = 1 + 20% * 5% * 2 = 1.02

Dynamic Branch Prediction Components
Step #1: is it a branch? Easy after decode...
Step #2: is the branch taken or not taken? Direction predictor (applies to conditional branches only). Predicts taken/not-taken.
Step #3: if the branch is taken, where does it go? Easy after decode...

Branch Direction Prediction
Learn from the past, predict the future. Record the past in a hardware structure.
Direction predictor (DIRP): maps a conditional-branch PC to a taken/not-taken (T/N) decision.
Individual conditional branches are often biased or weakly biased. 90%+ one way or the other is considered "biased". Why? Loop back edges, checking for uncommon conditions.
Branch history table (BHT): simplest predictor.
PC indexes a table of bits (0 = N, 1 = T), no tags. PC fields: [31:10] [9:2] [1:0].
Essentially: the branch will go the same way it went last time.
What about aliasing? Two PCs with the same lower bits? No problem, it is just a prediction!
18 Branch History Table (BHT)
Branch history table (BHT): simplest direction predictor.
PC indexes a table of bits (0 = N, 1 = T), no tags.
Essentially: the branch will go the same way it went last time.
Problem: consider the inner loop branch below (* = mis-prediction)
for (i=0;i<100;i++) for (j=0;j<3;j++) // whatever
State/prediction  N* T  T  T* N* T  T  T* N* T  T  T*
Outcome           T  T  T  N  T  T  T  N  T  T  T  N
Two "built-in" mis-predictions per inner loop iteration.
The branch predictor "changes its mind too quickly".

Two-Bit Saturating Counters (2bc)
Two-bit saturating counters (2bc) [Smith].
Replace each single-bit prediction: (0,1,2,3) = (N,n,t,T).
Adds "hysteresis": forces the predictor to mis-predict twice before "changing its mind".
State/prediction  N* n* t  T* t  T  T  T* t  T  T  T*
Outcome           T  T  T  N  T  T  T  N  T  T  T  N
One mispredict each loop execution (rather than two).
Fixes this pathology (which is not contrived, by the way).
Can we do even better?

Correlated Predictor
Correlated (two-level) predictor [Patt].
Exploits the observation that branch outcomes are correlated.
Maintains a separate prediction per (PC, BHR).
Branch history register (BHR): recent branch outcomes.
Simple working example: assume the program has one branch.
BHT: one 1-bit DIRP entry.
BHT+2BHR: 2^2 = 4 1-bit DIRP entries.
State/prediction (the entry selected by the current BHR is the active pattern)
BHR=NN        N* T  T  T  T  T  T  T  T  T  T  T
BHR=NT        N  N* T  T  T  T  T  T  T  T  T  T
BHR=TN        N  N  N  N  N* T  T  T  T  T  T  T
BHR=TT        N  N  N* T* N  N  N* T* N  N  N* T*
Outcome  N N  T  T  T  N  T  T  T  N  T  T  T  N
We didn't make anything better. What's the problem?

Correlated Predictor
What happened? The BHR wasn't long enough to capture the pattern.
Try again: BHT+3BHR: 2^3 = 8 1-bit DIRP entries.
State/prediction (the entry selected by the current BHR is the active pattern)
BHR=NNN         N* T  T  T  T  T  T  T  T  T  T  T
BHR=NNT         N  N* T  T  T  T  T  T  T  T  T  T
BHR=NTN         N  N  N  N  N  N  N  N  N  N  N  N
BHR=NTT         N  N  N* T  T  T  T  T  T  T  T  T
BHR=TNN         N  N  N  N  N  N  N  N  N  N  N  N
BHR=TNT         N  N  N  N  N  N* T  T  T  T  T  T
BHR=TTN         N  N  N  N  N* T  T  T  T  T  T  T
BHR=TTT         N  N  N  N  N  N  N  N  N  N  N  N
Outcome  N N N  T  T  T  N  T  T  T  N  T  T  T  N
No mis-predictions after the predictor learns all the relevant patterns.
19 Correlated Predictor
Design choice I: one global BHR or one per PC (local)?
Each one captures different kinds of patterns.
Global is better; it captures local patterns for tight loop branches.
Design choice II: how many history bits (BHR size)? Tricky one.
Given unlimited resources, longer BHRs are better, but:
BHT utilization decreases. Many history patterns are never seen. Many branches are history independent (don't care).
PC xor BHR allows multiple PCs to dynamically share the BHT.
BHR length < log2(BHT size).
The predictor takes longer to train.
Typical length: 8 to 12.

Hybrid Predictor
Hybrid (tournament) predictor [McFarling].
Attacks the correlated predictor's BHT capacity problem.
Idea: combine two predictors.
A simple BHT predicts history independent branches.
A correlated predictor predicts only branches that need history.
A chooser assigns branches to one predictor or the other.
Branches start in the simple BHT and move on a mis-prediction threshold.
The correlated predictor can be made smaller, since it handles fewer branches.
90 to 95% accuracy.
Figure: PC and BHR index two BHTs and a chooser.

When to Perform Branch Prediction?
During Decode:
Look at the instruction opcode to determine branch instructions.
Can calculate the next PC from the instruction (for PC-relative branches).
One cycle "mis-fetch" penalty even if the branch predictor is correct.
bnez r3,targ          F D X M W
targ: add r4,r5,r4        F D X M W
During Fetch? How do we do that?

Revisiting Branch Prediction Components
Step #1: is it a branch? Easy after decode... during fetch: predictor.
Step #2: is the branch taken or not taken? Direction predictor (as before).
Step #3: if the branch is taken, where does it go? Branch target predictor (BTB). Supplies the target PC if the branch is taken.
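The "PC xor BHR" indexing mentioned above can be sketched as a small predictor class. This is an illustration under stated assumptions, not the slides' design: it uses one global history register, 2-bit saturating counters initialized to weakly not-taken, 4-byte instructions, and table sizes picked arbitrarily. The class name borrows the common "gshare" label for this indexing scheme.

```python
# Sketch: correlated predictor whose table index is (PC xor global history).
# Assumptions: 2-bit counters, global BHR, 4-byte insns, arbitrary table sizes.

class GsharePredictor:
    def __init__(self, index_bits=10, history_bits=8):
        self.index_mask = (1 << index_bits) - 1
        self.history_mask = (1 << history_bits) - 1
        self.counters = [1] * (1 << index_bits)   # 1 = weakly not-taken
        self.history = 0                          # global BHR, newest outcome in bit 0

    def _index(self, pc):
        # Drop the byte-offset bits, then fold the history into the index.
        return ((pc >> 2) ^ self.history) & self.index_mask

    def predict(self, pc):
        return self.counters[self._index(pc)] >= 2

    def update(self, pc, taken):
        index = self._index(pc)
        counter = self.counters[index]
        self.counters[index] = min(counter + 1, 3) if taken else max(counter - 1, 0)
        self.history = ((self.history << 1) | int(taken)) & self.history_mask

predictor = GsharePredictor()
branch_pc = 0x400100                          # hypothetical branch address
outcomes = [True, True, True, False] * 50     # the T T T N loop pattern
late_misses = 0
for position, taken in enumerate(outcomes):
    if predictor.predict(branch_pc) != taken and position >= len(outcomes) - 40:
        late_misses += 1
    predictor.update(branch_pc, taken)

print("mis-predictions in the last 40 outcomes:", late_misses)
```

Because the history is folded into the index rather than concatenated with the PC, different branches can share table entries dynamically, at the cost of occasional destructive aliasing.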
20 Branch Target Buffer (BTB)
As before: learn from the past, predict the future. Record the past branch targets in a hardware structure.
Branch target buffer (BTB): "guess" the future PC based on past behavior.
"Last time the branch X was taken, it went to address Y."
"So, in the future, if address X is fetched, fetch address Y next."
Operation:
Like a cache: address = PC, data = target-PC.
Access at Fetch in parallel with instruction memory: predicted-target = BTB[PC]
Updated at X whenever target != predicted-target: BTB[PC] = target
Aliasing? No problem. As before, this is only a prediction.

Branch Target Buffer (continued)
At Fetch, how does an insn know it's a branch & should read the BTB?
It doesn't have to: all insns access the BTB in parallel with Imem Fetch.
Key idea: use the BTB to predict which insns are branches.
Implement by "tagging" each entry with its corresponding PC.
Update the BTB on every taken branch insn, recording the target PC: BTB[PC].tag = PC, BTB[PC].target = target of branch
All insns access at Fetch in parallel with Imem. Check for a tag match, which signifies the insn at that PC is a branch.
Predicted PC = (BTB[PC].tag == PC) ? BTB[PC].target : PC+4

Why Does a BTB Work?
Because most control insns use direct targets.
Target encoded in the insn itself -> same "taken" target every time.
What about indirect targets?
Target held in a register -> can be different each time.
Indirect conditional jumps are not widely supported.
Two indirect call idioms:
Dynamically linked functions (DLLs): target always the same.
Dynamically dispatched (virtual) functions: hard but uncommon.
Also two indirect unconditional jump idioms:
Switches: hard but uncommon.
Function returns: hard and common but...

Return Address Stack (RAS)
Figure: BTB, RAS, and DIRP accessed by PC; a tag match and an "is ret?" bit select the predicted target.
Return address stack (RAS):
Call instruction? RAS[TOS++] = PC+4
Return instruction? Predicted-target = RAS[--TOS]
Q: how can you tell if an insn is a call/return before decoding it? Accessing the RAS on every insn BTB-style doesn't work.
Answer: pre-decode bits in Imem, written when first executed. Can also be used to signify branches.
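The tagged BTB lookup and the return address stack described above can be modeled in a few lines. This is a behavioral sketch under explicit assumptions: 4-byte instructions (so the fall-through is PC+4), a direct-mapped table that stores the full PC as its tag, and a fixed-depth circular stack. The class and method names are hypothetical.

```python
# Sketch: tagged branch target buffer and return address stack.
# Assumptions: 4-byte insns, direct-mapped BTB with full-PC tags, circular RAS.

class BranchTargetBuffer:
    def __init__(self, entries=256):
        self.entries = entries
        self.table = [None] * entries            # each slot: (tag, target) or None

    def _slot(self, pc):
        return (pc >> 2) % self.entries

    def predict_next_pc(self, pc):
        """Tag match means 'this PC is a taken branch'; otherwise fall through."""
        entry = self.table[self._slot(pc)]
        if entry is not None and entry[0] == pc:
            return entry[1]
        return pc + 4

    def record_taken(self, pc, target):
        """Called when a taken branch resolves."""
        self.table[self._slot(pc)] = (pc, target)

class ReturnAddressStack:
    def __init__(self, depth=16):
        self.slots = [0] * depth
        self.tos = 0

    def on_call(self, pc):
        self.slots[self.tos % len(self.slots)] = pc + 4   # RAS[TOS++] = PC+4
        self.tos += 1

    def predict_return(self):
        self.tos -= 1                                     # RAS[--TOS]
        return self.slots[self.tos % len(self.slots)]

btb = BranchTargetBuffer()
print(hex(btb.predict_next_pc(0x1000)))    # no entry yet: falls through to 0x1004
btb.record_taken(0x1000, 0x2000)
print(hex(btb.predict_next_pc(0x1000)))    # tag match: predicts 0x2000

ras = ReturnAddressStack()
ras.on_call(0x100)
ras.on_call(0x200)
print(hex(ras.predict_return()), hex(ras.predict_return()))   # 0x204 0x104
```

A real BTB would store only a partial tag and a real RAS needs a policy for overflow and mis-speculated calls; both are left out here.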
21 Putting It All Together
BTB & branch direction predictor during fetch.
Figure: PC indexes the BTB (tag, target), the RAS (selected by "is ret?"), and the BHT (taken/not-taken) to produce the predicted target or PC+4.
If the branch prediction is correct, no taken branch penalty.

Branch Prediction Performance
Dynamic branch prediction: 20% of instructions are branches.
Simple predictor: branches predicted with 75% accuracy. CPI = 1 + (20% * 25% * 2) = 1.1
More advanced predictor: 95% accuracy. CPI = 1 + (20% * 5% * 2) = 1.02
Branch mis-predictions are still a big problem though.
Pipelines are long: typical mis-prediction penalty is 10+ cycles.
Pipelines are superscalar (later).

Avoiding Branches via ISA: Predication
Conventional control: conditionally executed insns are also conditionally fetched.
beq r3,targ           F D X M W
sub r6,1,r5             F D           (flushed: wrong path)
targ: add r4,r5,r4        F           (flushed: why?)
targ: add r4,r5,r4          F D X M W
If beq mis-predicts, both sub and add must be flushed.
Waste: add is independent of the mis-prediction.
Predication: not prediction, predication.
ISA support for conditionally-executed, unconditionally-fetched insns.
If beq mis-predicts, annul sub in place and preserve add.
The example is if-then, but if-then-else can be predicated too.
How is this done? How does add get the correct value for r5?

Full Predication
Full predication: every insn can be annulled, with annulment controlled by...
Predicate registers: an additional register in each insn (e.g., IA64)
setp.eq r3,p3         F D X M W
sub.p r6,1,r5,p3        F D X         (annulled)
targ: add r4,r5,r4        F D X M W
Predicate codes: condition bits in each insn (e.g., ARM)
setcc r3              F D X M W
sub.nz r6,1,r5          F D X         (annulled)
targ: add r4,r5,r4        F D X M W
Only an ALU insn is shown (sub), but this applies to all insns, even stores.
Branches are replaced with "set-predicate" insns.
22 Conditional Moves (CMOVs)
Conditional (register) moves: construct the appearance of full predication from one primitive.
cmoveq r1,r2,r3 // if (r1==0) r3=r2;
May require some code duplication to achieve the desired effect.
Painful, potentially impossible for some insn sequences.
Requires more registers.
Only good way of retro-fitting predication onto an ISA (e.g., IA32, Alpha).
sub r6,1,r9           F D X M W
cmovne r3,r9,r5         F D X M W
targ: add r4,r5,r4        F D X M W

Predication Performance
Cost/benefit analysis.
Benefit: predication avoids branches, thus avoiding mis-predictions. It also reduces pressure on the predictor table (fewer branches to track).
Cost: extra (annulled) instructions.
As branch predictors are highly accurate, predication might not help: with a 5-stage pipeline and two instructions on each path of an if-then-else, there is no performance gain, and it is likely slower if the branch is predictable. Or it may even hurt!
But it can help: deeper pipelines, hard-to-predict branches, and few added insns.
Thus, predication is useful, but not a panacea.

Research

Research: Perceptron Predictor
Perceptron predictor [Jimenez].
Attacks the BHR size problem using a machine learning approach.
The BHT is replaced by a table of function coefficients F_i (signed).
Predict taken if sum(BHR_i * F_i) > threshold.
Table size: #PC * |BHR| * |F| (can use a long BHR: ~60 bits). The equivalent correlated predictor would be #PC * 2^|BHR|.
How does it learn? Update F_i when the branch is taken: BHR_i == 1 ? F_i++ : F_i--;
"Don't care" F_i bits stay near 0; important F_i bits saturate.
Hybrid BHT/perceptron accuracy: 95 to 98%.
Figure: PC selects a row of F; sum(F_i * BHR_i) > thresh gives the prediction.
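The perceptron idea above (a weighted sum over history bits, compared against a threshold) can be sketched for a single branch. Several details below are assumptions made for the sketch rather than the published design or the slide's exact rule: history bits are encoded as +1/-1, there is an extra bias weight, the threshold is 0, weights are updated on every outcome, and they saturate at an arbitrary limit of 63.

```python
# Sketch: perceptron direction predictor for one branch.
# Assumptions: +1/-1 history encoding, bias weight, threshold 0,
# update on every outcome, weights saturate at +/-63.

class PerceptronPredictor:
    def __init__(self, history_bits=8, weight_limit=63):
        self.weights = [0] * (history_bits + 1)   # weights[0] is the bias term
        self.history = [-1] * history_bits        # newest outcome first; -1 = not taken
        self.weight_limit = weight_limit

    def _weighted_sum(self):
        return self.weights[0] + sum(
            weight * bit for weight, bit in zip(self.weights[1:], self.history))

    def predict(self):
        return self._weighted_sum() > 0

    def update(self, taken):
        outcome = 1 if taken else -1
        inputs = [1] + self.history
        for i, bit in enumerate(inputs):
            # Strengthen weights whose history bit agreed with the outcome.
            adjusted = self.weights[i] + outcome * bit
            self.weights[i] = max(-self.weight_limit, min(self.weight_limit, adjusted))
        self.history = [outcome] + self.history[:-1]

predictor = PerceptronPredictor()
outcomes = [True, True, True, False] * 100        # the T T T N loop pattern
late_misses = 0
for position, taken in enumerate(outcomes):
    if predictor.predict() != taken and position >= len(outcomes) - 40:
        late_misses += 1
    predictor.update(taken)

print("mis-predictions in the last 40 outcomes:", late_misses)
print("weights:", predictor.weights)
```

On this pattern the weights for history positions 4 and 8 saturate, since the outcome always repeats with period 4, while the other weights hover near zero. That matches the slide's remark that important coefficients saturate and "don't care" coefficients stay small.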
23 More Research: GEHL Predictor
Problem with both the correlated predictor and the perceptron:
The same BHT real-estate is dedicated to the 1st history bit (1 column) as to the 2nd, 3rd, 10th, 60th...
Not a good use of space: the 1st bit is much more important than the 60th.
GEometric History-Length predictor [Seznec, ISCA 05].
Multiple BHTs, indexed by geometrically longer BHRs (0, 4, 16, 32).
BHTs are (partially) tagged, with no separate "chooser".
Predict: use the matching entry from the BHT with the longest BHR.
Mis-predict: create an entry in the BHT with a longer BHR.
Only 25% of the BHT is used for bits [...] (not 50%).
Helps amortize the cost of tagging.
Trains quickly.
95 to 97% accurate.

Championship Branch Prediction
CBP: workshop held in conjunction with MICRO.
Submitted code is tested on standard branch traces. Highest prediction accuracy wins.
Two tracks:
Idealistic: predictor simulator must run in under 2 hours.
Realistic: predictor must synthesize into 32KB + 256 bits or less.
2006 winners:
Realistic: L-TAGE (GEHL follow-on).
Idealistic: GTL (another GEHL follow-on).

Research: Runahead Execution
Figure: pipeline with a shadow regfile alongside the main regfile.
In-order writebacks essentially imply stalls on D$ misses. Can save power or use the idle time for performance.
Runahead execution [Dundas]:
Shadow regfile kept in sync with the main regfile (write to both).
D$ miss: continue executing using the shadow regfile (disable stores).
D$ miss returns: flush the pipe and restart with the stalled PC.
Acts like a smart prefetch engine.
Performs better as cache t_miss grows (relative to the clock period).

Research: Razor
Razor [Uht, Ernst+]:
Identify pipeline stages with narrow signal margins (e.g., X).
Add a "Razor" X/M latch: relatches X/M input signals after a safe delay.
Compare the X/M latch with the "safe" razor X/M latch. Different? Flush F, D, X & M. Restart M using the X/M razor latch, restart F using the D/X latch.
Pipeline will not "break" -> reduce V_DD until the flush rate is too high.
Alternatively: "over-clock" until the flush rate is too high.
24 Summary
Figure: system layers (App, App, App; System software; Mem, CPU, I/O).
Principles of pipelining. Effects of overhead and hazards. Pipeline diagrams.
Data hazards: stalling and bypassing.
Control hazards: branch prediction. Predication.
More informationUT1553B BCRT True Dual-port Memory Interface
UTMC APPICATION NOTE UT553B BCRT True Dul-port Memory Interfce INTRODUCTION The UTMC UT553B BCRT is monolithic CMOS integrted circuit tht provides comprehensive MI-STD- 553B Bus Controller nd Remote Terminl
More informationDistributed Systems Principles and Paradigms
Distriuted Systems Principles nd Prdigms Chpter 11 (version April 7, 2008) Mrten vn Steen Vrije Universiteit Amsterdm, Fculty of Science Dept. Mthemtics nd Computer Science Room R4.20. Tel: (020) 598 7784
More informationECE 468/573 Midterm 1 September 28, 2012
ECE 468/573 Midterm 1 September 28, 2012 Nme:! Purdue emil:! Plese sign the following: I ffirm tht the nswers given on this test re mine nd mine lone. I did not receive help from ny person or mteril (other
More informationCaches I. CSE 351 Spring Instructor: Ruth Anderson
L16: Cches I Cches I CSE 351 Spring 2017 Instructor: Ruth Anderson Teching Assistnts: Dyln Johnson Kevin Bi Linxing Preston Jing Cody Ohlsen Yufng Sun Joshu Curtis L16: Cches I Administrivi Homework 3,
More informationData Flow on a Queue Machine. Bruno R. Preiss. Copyright (c) 1987 by Bruno R. Preiss, P.Eng. All rights reserved.
Dt Flow on Queue Mchine Bruno R. Preiss 2 Outline Genesis of dt-flow rchitectures Sttic vs. dynmic dt-flow rchitectures Pseudo-sttic dt-flow execution model Some dt-flow mchines Simple queue mchine Prioritized
More informationSmall Business Networking
Why network is n essentil productivity tool for ny smll business Effective technology is essentil for smll businesses looking to increse the productivity of their people nd business. Introducing technology
More informationEngineer To Engineer Note
Engineer To Engineer Note EE-169 Technicl Notes on using Anlog Devices' DSP components nd development tools Contct our technicl support by phone: (800) ANALOG-D or e-mil: dsp.support@nlog.com Or visit
More informationBruce McCarl's GAMS Newsletter Number 37
Bruce McCrl's GAMS Newsletter Number 37 This newsletter covers 1 Uptes to Expne GAMS User Guie by McCrl et l.... 1 2 YouTube vieos... 1 3 Explntory text for tuple set elements... 1 4 Reing sets using GDXXRW...
More informationCaches I. CSE 351 Autumn Instructor: Justin Hsia
L01: Intro, L01: L16: Combintionl Introduction Cches I Logic CSE369, CSE351, Autumn 2016 Cches I CSE 351 Autumn 2016 Instructor: Justin Hsi Teching Assistnts: Chris M Hunter Zhn John Kltenbch Kevin Bi
More informationCOMP 423 lecture 11 Jan. 28, 2008
COMP 423 lecture 11 Jn. 28, 2008 Up to now, we hve looked t how some symols in n lphet occur more frequently thn others nd how we cn sve its y using code such tht the codewords for more frequently occuring
More informationExtending Finite Automata to Efficiently Match Perl-Compatible Regular Expressions
Extening Finite Automt to Efficiently Mtch Perl-Comptible Regulr Expressions Michel Becchi Wshington University Computer Science n Engineering St. Louis, MO 63130-4899 mbecchi@cse.wustl.eu ABSTRACT Regulr
More informationECE 550D Fundamentals of Computer Systems and Engineering. Fall 2017
ECE 550D Funamentals of Computer Systems an Engineering Fall 017 Datapaths Prof. John Boar Duke University Slies are erive from work by Profs. Tyler Bletch an Anrew Hilton (Duke) an Amir Roth (Penn) What
More informationMidterm 2 Sample solution
Nme: Instructions Midterm 2 Smple solution CMSC 430 Introduction to Compilers Fll 2012 November 28, 2012 This exm contins 9 pges, including this one. Mke sure you hve ll the pges. Write your nme on the
More information2 Computing all Intersections of a Set of Segments Line Segment Intersection
15-451/651: Design & Anlysis of Algorithms Novemer 14, 2016 Lecture #21 Sweep-Line nd Segment Intersection lst chnged: Novemer 8, 2017 1 Preliminries The sweep-line prdigm is very powerful lgorithmic design
More informationSmall Business Networking
Why network is n essentil productivity tool for ny smll business Effective technology is essentil for smll businesses looking to increse the productivity of their people nd business. Introducing technology
More informationSection 10.4 Hyperbolas
66 Section 10.4 Hyperbols Objective : Definition of hyperbol & hyperbols centered t (0, 0). The third type of conic we will study is the hyperbol. It is defined in the sme mnner tht we defined the prbol
More informationSmall Business Networking
Why network is n essentil productivity tool for ny smll business Effective technology is essentil for smll businesses looking to increse the productivity of their people nd business. Introducing technology
More informationDynamic Programming. Andreas Klappenecker. [partially based on slides by Prof. Welch] Monday, September 24, 2012
Dynmic Progrmming Andres Klppenecker [prtilly bsed on slides by Prof. Welch] 1 Dynmic Progrmming Optiml substructure An optiml solution to the problem contins within it optiml solutions to subproblems.
More informationChapter 2. 3/28/2004 H133 Spring
Chpter 2 Newton believe tht light ws me up of smll prticles. This point ws ebte by scientists for mny yers n it ws not until the 1800 s when series of experiments emonstrte wve nture of light. (But be
More informationSmall Business Networking
Why network is n essentil productivity tool for ny smll business Effective technology is essentil for smll businesses looking to increse the productivity of their people nd business. Introducing technology
More informationSystems I. Logic Design I. Topics Digital logic Logic gates Simple combinational logic circuits
Systems I Logic Design I Topics Digitl logic Logic gtes Simple comintionl logic circuits Simple C sttement.. C = + ; Wht pieces of hrdwre do you think you might need? Storge - for vlues,, C Computtion
More informationSmall Business Networking
Why network is n essentil productivity tool for ny smll business Effective technology is essentil for smll businesses looking to increse the productivity of their people nd processes. Introducing technology
More informationCSCI 104. Rafael Ferreira da Silva. Slides adapted from: Mark Redekopp and David Kempe
CSCI 0 fel Ferreir d Silv rfsilv@isi.edu Slides dpted from: Mrk edekopp nd Dvid Kempe LOG STUCTUED MEGE TEES Series Summtion eview Let n = + + + + k $ = #%& #. Wht is n? n = k+ - Wht is log () + log ()
More informationUnit 5 Vocabulary. A function is a special relationship where each input has a single output.
MODULE 3 Terms Definition Picture/Exmple/Nottion 1 Function Nottion Function nottion is n efficient nd effective wy to write functions of ll types. This nottion llows you to identify the input vlue with
More informationComplete Coverage Path Planning of Mobile Robot Based on Dynamic Programming Algorithm Peng Zhou, Zhong-min Wang, Zhen-nan Li, Yang Li
2nd Interntionl Conference on Electronic & Mechnicl Engineering nd Informtion Technology (EMEIT-212) Complete Coverge Pth Plnning of Mobile Robot Bsed on Dynmic Progrmming Algorithm Peng Zhou, Zhong-min
More informationECEN 468 Advanced Logic Design Lecture 36: RTL Optimization
ECEN 468 Advnced Logic Design Lecture 36: RTL Optimiztion ECEN 468 Lecture 36 RTL Design Optimiztions nd Trdeoffs 6.5 While creting dtpth during RTL design, there re severl optimiztions nd trdeoffs, involving
More informationGeometric transformations
Geometric trnsformtions Computer Grphics Some slides re bsed on Shy Shlom slides from TAU mn n n m m T A,,,,,, 2 1 2 22 12 1 21 11 Rows become columns nd columns become rows nm n n m m A,,,,,, 1 1 2 22
More informationFig.25: the Role of LEX
The Lnguge for Specifying Lexicl Anlyzer We shll now study how to uild lexicl nlyzer from specifiction of tokens in the form of list of regulr expressions The discussion centers round the design of n existing
More informationFunctor (1A) Young Won Lim 8/2/17
Copyright (c) 2016-2017 Young W. Lim. Permission is grnted to copy, distribute nd/or modify this document under the terms of the GNU Free Documenttion License, Version 1.2 or ny lter version published
More informationSmall Business Networking
Why network is n essentil productivity tool for ny smll business Effective technology is essentil for smll businesses looking to increse the productivity of their people nd processes. Introducing technology
More informationpdfapilot Server 2 Manual
pdfpilot Server 2 Mnul 2011 by clls softwre gmbh Schönhuser Allee 6/7 D 10119 Berlin Germny info@cllssoftwre.com www.cllssoftwre.com Mnul clls pdfpilot Server 2 Pge 2 clls pdfpilot Server 2 Mnul Lst modified:
More informationCaches I. CSE 351 Autumn 2018
Cches I CSE 351 Autumn 2018 Instructors: Mx Willsey Luis Ceze Teching Assistnts: Britt Henderson Luks Joswik Josie Lee Wei Lin Dniel Snitkovsky Luis Veg Kory Wtson Ivy Yu Alt text: I looked t some of the
More informationCPSC 213. Polymorphism. Introduction to Computer Systems. Readings for Next Two Lectures. Back to Procedure Calls
Redings for Next Two Lectures Text CPSC 213 Switch Sttements, Understnding Pointers - 2nd ed: 3.6.7, 3.10-1st ed: 3.6.6, 3.11 Introduction to Computer Systems Unit 1f Dynmic Control Flow Polymorphism nd
More informationMA1008. Calculus and Linear Algebra for Engineers. Course Notes for Section B. Stephen Wills. Department of Mathematics. University College Cork
MA1008 Clculus nd Liner Algebr for Engineers Course Notes for Section B Stephen Wills Deprtment of Mthemtics University College Cork s.wills@ucc.ie http://euclid.ucc.ie/pges/stff/wills/teching/m1008/ma1008.html
More informationFunctor (1A) Young Won Lim 10/5/17
Copyright (c) 2016-2017 Young W. Lim. Permission is grnted to copy, distribute nd/or modify this document under the terms of the GNU Free Documenttion License, Version 1.2 or ny lter version published
More informationSmall Business Networking
Why network is n essentil productivity tool for ny smll business Effective technology is essentil for smll businesses looking to increse the productivity of their people nd processes. Introducing technology
More informationWhat do all those bits mean now? Number Systems and Arithmetic. Introduction to Binary Numbers. Questions About Numbers
Wht do ll those bits men now? bits (...) Number Systems nd Arithmetic or Computers go to elementry school instruction R-formt I-formt... integer dt number text chrs... floting point signed unsigned single
More informationCS321 Languages and Compiler Design I. Winter 2012 Lecture 5
CS321 Lnguges nd Compiler Design I Winter 2012 Lecture 5 1 FINITE AUTOMATA A non-deterministic finite utomton (NFA) consists of: An input lphet Σ, e.g. Σ =,. A set of sttes S, e.g. S = {1, 3, 5, 7, 11,
More information12-B FRACTIONS AND DECIMALS
-B Frctions nd Decimls. () If ll four integers were negtive, their product would be positive, nd so could not equl one of them. If ll four integers were positive, their product would be much greter thn
More informationLooking up objects in Pastry
Review: Pstry routing tbles 0 1 2 3 4 7 8 9 b c d e f 0 1 2 3 4 7 8 9 b c d e f 0 1 2 3 4 7 8 9 b c d e f 0 2 3 4 7 8 9 b c d e f Row0 Row 1 Row 2 Row 3 Routing tble of node with ID i =1fc s - For ech
More informationQuestions About Numbers. Number Systems and Arithmetic. Introduction to Binary Numbers. Negative Numbers?
Questions About Numbers Number Systems nd Arithmetic or Computers go to elementry school How do you represent negtive numbers? frctions? relly lrge numbers? relly smll numbers? How do you do rithmetic?
More informationSmall Business Networking
Why network is n essentil productivity tool for ny smll business Effective technology is essentil for smll businesses looking to increse the productivity of their people nd processes. Introducing technology
More informationControl Hazards. Branch Recovery. Control Hazard Pipeline Diagram. Branch Performance
Control Hazards ranch Recovery D/
More informationIntroduction to hardware design using VHDL
Introuction to hrwre esign using VHDL Tim Güneysu n Nele Mentens ECC school Novemer 11, 2017, Nijmegen Outline Implementtion pltforms Introuction to VHDL Hrwre tutoril 1 Implementtion pltforms Microprocessor
More informationScanner Termination. Multi Character Lookahead. to its physical end. Most parsers require an end of file token. Lex and Jlex automatically create an
Scnner Termintion A scnner reds input chrcters nd prtitions them into tokens. Wht hppens when the end of the input file is reched? It my be useful to crete n Eof pseudo-chrcter when this occurs. In Jv,
More informationStack. A list whose end points are pointed by top and bottom
4. Stck Stck A list whose end points re pointed by top nd bottom Insertion nd deletion tke plce t the top (cf: Wht is the difference between Stck nd Arry?) Bottom is constnt, but top grows nd shrinks!
More informationCS 241. Fall 2017 Midterm Review Solutions. October 24, Bits and Bytes 1. 3 MIPS Assembler 6. 4 Regular Languages 7.
CS 241 Fll 2017 Midterm Review Solutions Octoer 24, 2017 Contents 1 Bits nd Bytes 1 2 MIPS Assemly Lnguge Progrmming 2 3 MIPS Assemler 6 4 Regulr Lnguges 7 5 Scnning 9 1 Bits nd Bytes 1. Give two s complement
More informationTransparent neutral-element elimination in MPI reduction operations
Trnsprent neutrl-element elimintion in MPI reduction opertions Jesper Lrsson Träff Deprtment of Scientific Computing University of Vienn Disclimer Exploiting repetition nd sprsity in input for reducing
More informationAlgorithm Design (5) Text Search
Algorithm Design (5) Text Serch Tkshi Chikym School of Engineering The University of Tokyo Text Serch Find sustring tht mtches the given key string in text dt of lrge mount Key string: chr x[m] Text Dt:
More informationWhat do all those bits mean now? Number Systems and Arithmetic. Introduction to Binary Numbers. Questions About Numbers
Wht do ll those bits men now? bits (...) Number Systems nd Arithmetic or Computers go to elementry school instruction R-formt I-formt... integer dt number text chrs... floting point signed unsigned single
More informationStack Manipulation. Other Issues. How about larger constants? Frame Pointer. PowerPC. Alternative Architectures
Other Issues Stck Mnipultion support for procedures (Refer to section 3.6), stcks, frmes, recursion mnipulting strings nd pointers linkers, loders, memory lyout Interrupts, exceptions, system clls nd conventions
More informationMany analog implementations of CPG exist, typically using operational amplifier or
FPGA Implementtion of Centrl Pttern Genertor By Jmes J Lin Introuction: Mny nlog implementtions of CPG exist, typiclly using opertionl mplifier or trnsistor level circuits. These types of circuits hve
More information6.2 Volumes of Revolution: The Disk Method
mth ppliction: volumes by disks: volume prt ii 6 6 Volumes of Revolution: The Disk Method One of the simplest pplictions of integrtion (Theorem 6) nd the ccumultion process is to determine so-clled volumes
More informationFault injection attacks on cryptographic devices and countermeasures Part 2
Fult injection ttcks on cryptogrphic devices nd countermesures Prt Isrel Koren Deprtment of Electricl nd Computer Engineering University of Msschusetts Amherst, MA Countermesures - Exmples Must first detect
More informationDr. D.M. Akbar Hussain
Dr. D.M. Akr Hussin Lexicl Anlysis. Bsic Ide: Red the source code nd generte tokens, it is similr wht humns will do to red in; just tking on the input nd reking it down in pieces. Ech token is sequence
More informationData-Flow Prescheduling for Large Instruction Windows in Out-of-Order Processors
Dt-Flow Prescheduling for Lrge Instruction Windows in Out-of-Order Processors Pierre Michud, André Seznec IRISA/INRIA Cmpus de Beulieu, 35 Rennes Cedex, Frnce {pmichud, seznec}@iris.fr Abstrct The performnce
More informationAlignment of Long Sequences. BMI/CS Spring 2012 Colin Dewey
Alignment of Long Sequences BMI/CS 776 www.biostt.wisc.edu/bmi776/ Spring 2012 Colin Dewey cdewey@biostt.wisc.edu Gols for Lecture the key concepts to understnd re the following how lrge-scle lignment
More informationFall 2018 Midterm 1 October 11, ˆ You may not ask questions about the exam except for language clarifications.
15-112 Fll 2018 Midterm 1 October 11, 2018 Nme: Andrew ID: Recittion Section: ˆ You my not use ny books, notes, extr pper, or electronic devices during this exm. There should be nothing on your desk or
More informationData sharing in OpenMP
Dt shring in OpenMP Polo Burgio polo.burgio@unimore.it Outline Expressing prllelism Understnding prllel threds Memory Dt mngement Dt cluses Synchroniztion Brriers, locks, criticl sections Work prtitioning
More informationGeorge Boole. IT 3123 Hardware and Software Concepts. Switching Algebra. Boolean Functions. Boolean Functions. Truth Tables
George Boole IT 3123 Hrdwre nd Softwre Concepts My 28 Digitl Logic The Little Mn Computer 1815 1864 British mthemticin nd philosopher Mny contriutions to mthemtics. Boolen lger: n lger over finite sets
More informationExample: 2:1 Multiplexer
Exmple: 2:1 Multiplexer Exmple #1 reg ; lwys @( or or s) egin if (s == 1') egin = ; else egin = ; 1 s B. Bs 114 Exmple: 2:1 Multiplexer Exmple #2 Normlly lwys include egin nd sttements even though they
More informationOverview. Network characteristics. Network architecture. Data dissemination. Network characteristics (cont d) Mobile computing and databases
Overview Mobile computing nd dtbses Generl issues in mobile dt mngement Dt dissemintion Dt consistency Loction dependent queries Interfces Detils of brodcst disks thlis klfigopoulos Network rchitecture
More informationEnginner To Engineer Note
Technicl Notes on using Anlog Devices DSP components nd development tools from the DSP Division Phone: (800) ANALOG-D, FAX: (781) 461-3010, EMAIL: dsp_pplictions@nlog.com, FTP: ftp.nlog.com Using n ADSP-2181
More informationSlides for Data Mining by I. H. Witten and E. Frank
Slides for Dt Mining y I. H. Witten nd E. Frnk Simplicity first Simple lgorithms often work very well! There re mny kinds of simple structure, eg: One ttriute does ll the work All ttriutes contriute eqully
More informationINTRODUCTION TO SIMPLICIAL COMPLEXES
INTRODUCTION TO SIMPLICIAL COMPLEXES CASEY KELLEHER AND ALESSANDRA PANTANO 0.1. Introduction. In this ctivity set we re going to introduce notion from Algebric Topology clled simplicil homology. The min
More informationCSCI 3130: Formal Languages and Automata Theory Lecture 12 The Chinese University of Hong Kong, Fall 2011
CSCI 3130: Forml Lnguges nd utomt Theory Lecture 12 The Chinese University of Hong Kong, Fll 2011 ndrej Bogdnov In progrmming lnguges, uilding prse trees is significnt tsk ecuse prse trees tell us the
More informationSmall Business Networking
Why network is n essentil productivity tool for ny smll business Effective technology is essentil for smll businesses looking to increse the productivity of their people nd processes. Introducing technology
More informationEngineer-to-Engineer Note
Engineer-to-Engineer Note EE-232 Technicl notes on using Anlog Devices DSPs, processors nd development tools Contct our technicl support t dsp.support@nlog.com nd t dsptools.support@nlog.com Or visit our
More informationEngineer To Engineer Note
Engineer To Engineer Note EE-186 Technicl Notes on using Anlog Devices' DSP components nd development tools Contct our technicl support by phone: (800) ANALOG-D or e-mil: dsp.support@nlog.com Or visit
More informationReadings : Computer Networking. Outline. The Next Internet: More of the Same? Required: Relevant earlier meeting:
Redings 15-744: Computer Networking L-14 Future Internet Architecture Required: Servl pper Extr reding on Mobility First Relevnt erlier meeting: CCN -> Nmed Dt Network 2 Outline The Next Internet: More
More informationSIMPLIFYING ALGEBRA PASSPORT.
SIMPLIFYING ALGEBRA PASSPORT www.mthletics.com.u This booklet is ll bout turning complex problems into something simple. You will be ble to do something like this! ( 9- # + 4 ' ) ' ( 9- + 7-) ' ' Give
More informationΕΠΛ323 - Θεωρία και Πρακτική Μεταγλωττιστών
ΕΠΛ323 - Θωρία και Πρακτική Μταγλωττιστών Lecture 3 Lexicl Anlysis Elis Athnsopoulos elisthn@cs.ucy.c.cy Recognition of Tokens if expressions nd reltionl opertors if è if then è then else è else relop
More informationAn Integrated Simulation System for Human Factors Study
An Integrted Simultion System for Humn Fctors Study Ying Wng, Wei Zhng Deprtment of Industril Engineering, Tsinghu University, Beijing 100084, Chin Foud Bennis, Dmien Chblt IRCCyN, Ecole Centrle de Nntes,
More informationCS311H: Discrete Mathematics. Graph Theory IV. A Non-planar Graph. Regions of a Planar Graph. Euler s Formula. Instructor: Işıl Dillig
CS311H: Discrete Mthemtics Grph Theory IV Instructor: Işıl Dillig Instructor: Işıl Dillig, CS311H: Discrete Mthemtics Grph Theory IV 1/25 A Non-plnr Grph Regions of Plnr Grph The plnr representtion of
More informationMobility Support for a QoS Aggregation Protocol
Mobility Support for QoS Aggregtion Protocol A. Kloxylos^, D. Vli*, S. Psklis+, G. Pngiotou^, I. Goninkis^, E. Zervs # ^ Deprtment of Telecommunictions Science n Technology, University of Peloponnese,
More informationEECS 281: Homework #4 Due: Thursday, October 7, 2004
EECS 28: Homework #4 Due: Thursdy, October 7, 24 Nme: Emil:. Convert the 24-bit number x44243 to mime bse64: QUJD First, set is to brek 8-bit blocks into 6-bit blocks, nd then convert: x44243 b b 6 2 9
More informationP(r)dr = probability of generating a random number in the interval dr near r. For this probability idea to make sense we must have
Rndom Numers nd Monte Crlo Methods Rndom Numer Methods The integrtion methods discussed so fr ll re sed upon mking polynomil pproximtions to the integrnd. Another clss of numericl methods relies upon using
More informationFile Manager Quick Reference Guide. June Prepared for the Mayo Clinic Enterprise Kahua Deployment
File Mnger Quick Reference Guide June 2018 Prepred for the Myo Clinic Enterprise Khu Deployment NVIGTION IN FILE MNGER To nvigte in File Mnger, users will mke use of the left pne to nvigte nd further pnes
More informationToday. Search Problems. Uninformed Search Methods. Depth-First Search Breadth-First Search Uniform-Cost Search
Uninformed Serch [These slides were creted by Dn Klein nd Pieter Abbeel for CS188 Intro to AI t UC Berkeley. All CS188 mterils re vilble t http://i.berkeley.edu.] Tody Serch Problems Uninformed Serch Methods
More informationSample Midterm Solutions COMS W4115 Programming Languages and Translators Monday, October 12, 2009
Deprtment of Computer cience Columbi University mple Midterm olutions COM W4115 Progrmming Lnguges nd Trnsltors Mondy, October 12, 2009 Closed book, no ids. ch question is worth 20 points. Question 5(c)
More informationEssential Question What are some of the characteristics of the graph of a rational function?
8. TEXAS ESSENTIAL KNOWLEDGE AND SKILLS A..A A..G A..H A..K Grphing Rtionl Functions Essentil Question Wht re some of the chrcteristics of the grph of rtionl function? The prent function for rtionl functions
More informationAgilent Mass Hunter Software
Agilent Mss Hunter Softwre Quick Strt Guide Use this guide to get strted with the Mss Hunter softwre. Wht is Mss Hunter Softwre? Mss Hunter is n integrl prt of Agilent TOF softwre (version A.02.00). Mss
More informationOutline. Tiling, formally. Expression tile as rule. Statement tiles as rules. Function calls. CS 412 Introduction to Compilers
CS 412 Introduction to Compilers Andrew Myers Cornell University Lectur8 Finishing genertion 9 Mr 01 Outline Tiling s syntx-directed trnsltion Implementing function clls Implementing functions Optimizing
More informationToday. CS 188: Artificial Intelligence Fall Recap: Search. Example: Pancake Problem. Example: Pancake Problem. General Tree Search.
CS 88: Artificil Intelligence Fll 00 Lecture : A* Serch 9//00 A* Serch rph Serch Tody Heuristic Design Dn Klein UC Berkeley Multiple slides from Sturt Russell or Andrew Moore Recp: Serch Exmple: Pncke
More informationRay surface intersections
Ry surfce intersections Some primitives Finite primitives: polygons spheres, cylinders, cones prts of generl qudrics Infinite primitives: plnes infinite cylinders nd cones generl qudrics A finite primitive
More informationMATH 25 CLASS 5 NOTES, SEP
MATH 25 CLASS 5 NOTES, SEP 30 2011 Contents 1. A brief diversion: reltively prime numbers 1 2. Lest common multiples 3 3. Finding ll solutions to x + by = c 4 Quick links to definitions/theorems Euclid
More informationIf f(x, y) is a surface that lies above r(t), we can think about the area between the surface and the curve.
Line Integrls The ide of line integrl is very similr to tht of single integrls. If the function f(x) is bove the x-xis on the intervl [, b], then the integrl of f(x) over [, b] is the re under f over the
More informationLecture 10 Evolutionary Computation: Evolution strategies and genetic programming
Lecture 10 Evolutionry Computtion: Evolution strtegies nd genetic progrmming Evolution strtegies Genetic progrmming Summry Negnevitsky, Person Eduction, 2011 1 Evolution Strtegies Another pproch to simulting
More information