A Fully Polynomial Time Approximation Scheme for Timing Driven Minimum Cost Buffer Insertion


Shiyan Hu
Dept. of Electrical and Computer Engineering, Michigan Technological University, Houghton, Michigan 49931
shiyan@mtu.edu

ABSTRACT
As VLSI technology enters the nanoscale regime, interconnect delay has become the bottleneck of circuit timing. As one of the most powerful techniques for interconnect optimization, buffer insertion is indispensable in the physical synthesis flow. Buffering is known to be NP-complete, and existing works either use dynamic programming to compute the optimal solution in worst-case exponential time or design efficient heuristics without a performance guarantee. Even though buffer insertion is one of the most studied problems in physical design, whether there is an efficient algorithm with provably good performance has remained unknown. This work settles this open problem. In this paper, the first fully polynomial time approximation scheme for the timing driven minimum cost buffer insertion problem is designed. The new algorithm approximates the optimal buffering solution within a factor of $1+\epsilon$, running in $O(m^2 n^2 b/\epsilon^3 + n^3 b^2/\epsilon)$ time for any $0 < \epsilon < 1$, where n is the number of candidate buffer locations, m is the number of sinks in the tree, and b is the number of buffers in the buffer library. In addition to its theoretical guarantee, our experiments on 1000 industrial nets demonstrate that, compared to the commonly used dynamic programming algorithm, the new algorithm approximates the optimal solution well, with only 0.57% additional buffers and a 4.6× speedup. This clearly demonstrates the practical value of the new algorithm.
Categories and Subject Descriptors
B.7.2 [Integrated Circuits]: Design Aids - Placement and Routing; J.6 [Computer-aided Engineering]: Computer-aided Design

General Terms
Algorithms, Performance, Design

Keywords
Buffer Insertion, Fully Polynomial Time Approximation Scheme, NP-complete, Cost Minimization, Dynamic Programming

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. DAC'09, July 26-31, 2009, San Francisco, California, USA. Copyright 2009 ACM.

Zhuo Li and Charles J. Alpert
IBM Austin Research Laboratory, 11501 Burnet Road, Austin, Texas
{lizhuo, alpert}@us.ibm.com

1. INTRODUCTION
As VLSI technology enters the nanoscale regime, interconnect delay has become the bottleneck of circuit timing, since devices scale much faster than interconnects. As one of the most effective interconnect timing optimization engines, buffer insertion is indispensable in the physical synthesis flow [1, 2, 3]. It is demonstrated in [4] that in two recent IBM ASIC designs, over one-fourth of the gates are buffers.

Buffer insertion is one of the most studied problems in physical design. Existing works include [5, 6, 7, 8], which explore dynamic programming techniques with advanced data structures to compute optimal timing driven buffering solutions. Since buffers themselves are a drain on power, it is highly desirable to use as few buffering resources as possible in buffer insertion. Buffering with cost (power or area) minimization has been considered in [6], which proposes a dynamic programming algorithm that runs in pseudo-polynomial time. This surprises no one, as the minimum cost timing driven buffering problem is NP-complete [9]. Derived from this classic problem, buffering techniques have been developed for various scenarios.
For example, there are works [6, 10, 11, 12] handling slew, noise and/or variations, and works exploring the interaction of buffering with routing [13], placement [14], and floorplanning [15, 16].

Despite the fact that many buffering techniques have been developed, the underlying buffering problem is still less studied, especially in theory. Dynamic programming can compute the optimal solution but does not run in polynomial time, while heuristic algorithms run fast but without any performance guarantee. Whether there is any efficient algorithm with provably good performance has remained unknown. This work aims to settle this open problem and to advance the understanding of the minimum cost timing driven buffering problem from a theoretical point of view. Yet, our new algorithm is highly practical.

In this paper, we propose a fully polynomial time approximation scheme (FPTAS) for the NP-complete timing driven minimum cost buffering problem. In our context, a fully polynomial time approximation scheme refers to a buffering algorithm which is able to compute a solution with cost at most $1+\epsilon$ times the cost of the optimal buffering solution for any $\epsilon > 0$. It runs in time polynomial in the input size of the problem instance and $1/\epsilon$. Given an NP-complete problem, an FPTAS is generally regarded as the ultimate solution in theory.

The main contributions of this paper are summarized as follows.

• An FPTAS algorithm is proposed to approximate the optimal buffering solution within a factor of $1+\epsilon$ in $O(m^2 n^2 b/\epsilon^3 + n^3 b^2/\epsilon)$ time for any $0 < \epsilon < 1$ and in $O(m^2 n^2 b/\epsilon + m n^2 b + n^3 b^2)$ time for any $\epsilon \ge 1$, where n is the number of candidate buffer locations in the tree, m is the number of sinks in the tree, and b is the number of buffers in the buffer library. This work presents the first provably good approximation algorithm for the timing driven minimum cost buffering problem.

• The proposed FPTAS is motivated by the algorithms in [17, 18] for the layer assignment problem. Nevertheless, our FPTAS features novel techniques such as the double-ε oracle based solution search and the timing-cost approximate dynamic programming algorithm. The latter runs in $O\left(\frac{m^2 n^2}{\epsilon_1^2 \epsilon_2} + \frac{m n^2 b}{\epsilon_1 \epsilon_2} + \frac{n^3 b^2}{\epsilon_1}\right)$ time to compute a buffering solution with cost at most $(1+\epsilon_1)W^*$ and timing at most $(1+\epsilon_2)T$, where $W^*$ refers to the optimal cost, T refers to the timing constraint, and $\epsilon_1$ and $\epsilon_2$ are the approximation parameters for cost and timing, respectively.

• The new FPTAS algorithm is highly practical. Experimental results on 1000 industrial nets using a buffer library consisting of 48 buffer types demonstrate that the FPTAS approximates the optimal solution with only 0.57% additional buffers and a 4.6× speedup compared to the dynamic programming algorithm.

2. PRELIMINARIES
A routing tree $T = (V, E)$ is given as an input to the minimum cost timing buffering problem, where $V = \{s_0\} \cup V_s \cup V_n$ and $E \subseteq V \times V$. As in many previous works [9, 11], the routing tree is assumed to be binary in this paper. Trees in other topologies can be easily converted to a binary tree [7]. Vertex $s_0$ is the root/driver of T, $V_s$ is the set of sink vertices, and $V_n$ is the set of candidate buffer locations. For each sink $s \in V_s$, there is a sink capacitance $C(s)$ and a required arrival time (RAT). The driver $s_0$ has an arrival time, denoted by $AT(s_0)$. A net satisfies the timing constraint if the arrival time is no greater than the required arrival time at the driver. By subtracting $AT(s_0)$ from both the driver arrival time and each sink RAT, one can modify the net such that $AT(s_0) = 0$ and all RAT are positive. Note that this does not impact buffering solutions. Define the timing constraint T to be the maximum RAT after the above modification.
A buffer library B containing all buffer types which can be assigned to candidate buffer locations is also given. Note that B includes both non-inverting buffers and inverting buffers. As is common in physical synthesis, the Elmore delay model is adopted. The Elmore delay on an edge $e = (v_i, v_j)$ is computed by $D(e) = R(e)\left(\frac{C(e)}{2} + C(v_j)\right)$, where $C(e)$, $R(e)$, $C(v_j)$ refer to the edge capacitance, the edge resistance and the downstream capacitance viewed at $v_j$, respectively. For a buffer b placed at vertex $v_j$, its buffer delay is computed by $D(b) = R(b) \cdot C(v_j) + K(b)$, where $R(b)$ and $K(b)$ refer to the driving resistance and the intrinsic delay of buffer b, respectively. Each buffer b also has an input capacitance $C(b)$ and a cost $W(b)$. In this paper, the buffer area is used as the buffer cost to illustrate our new algorithm. A buffer assignment γ is a mapping $\gamma : V_n \to B \cup \{\bar{b}\}$, where $\bar{b}$ denotes the case with no buffer inserted. The total cost of a buffering solution γ for the tree T is defined as the sum of the costs over all inserted buffers. The timing driven minimum cost buffering problem, known to be NP-complete [9], can be formulated as follows.

Timing Constrained Minimum Cost Buffering: Given a binary routing tree with n candidate buffer locations and a buffer library with b buffer types, compute a buffer assignment solution such that the timing constraint is satisfied and the total buffer cost is minimized.

3. ALGORITHMIC FLOW
Our FPTAS algorithm for the timing constrained minimum cost buffering problem is motivated by [17, 18] for a layer assignment problem. Let $W^*$ denote the cost of the optimal buffering solution. At a high level, the FPTAS works in the following intuitive framework. It first makes a guess x on $W^*$ and uses a procedure called the oracle to check whether such a guess is good, i.e., sufficiently close to $W^*$ or not. If this is the case, return x. Otherwise, make a new guess on $W^*$. This procedure is iterated until a good guess is made. Certainly, there are two algorithmic design challenges in the above framework.
First, one does not know the optimal cost $W^*$, which is our target, so how can one decide whether $W^* \ge x$ for any x? Moreover, one needs to perform this check efficiently. Second, if the current guess is not good, how does one find a possibly better guess? The first difficulty calls for a careful design of the oracle, and the second calls for an efficient oracle based solution search. These two key components will be described in this paper.

4. DOUBLE-ε ORACLE SEARCH

4.1 The Oracle
Given any positive number x, the oracle can efficiently decide whether $W^* \ge x$, where $W^*$ is the total buffer cost of the optimal buffering solution. In fact, the decision is answered approximately depending on ε, meaning that the answer is either $W^* \ge x$ or $W^* < (1+\epsilon)x$.

A key component in the oracle is a polynomial time timing-cost approximate dynamic programming algorithm which will be described in Section 5. This algorithm has three critical properties. First, the algorithm can be performed efficiently, in time polynomial in the input size and $1/\epsilon$. Second, it will either return a buffering solution with cost no greater than a cost budget value, or conclude that this cost budget is too low to find any buffering solution (approximately) satisfying the timing constraint. Third, the dynamic programming algorithm will return a solution with timing slightly larger than the timing constraint T, with controlled error. Precisely, the timing of the solution is bounded by $(1+\epsilon)T$, where ε is the target approximation ratio. This is acceptable, especially considering that in the early stage of the physical synthesis flow (or when the chip is in the prototype stage), the timing constraint is often set according to the designer's experience and thus in general the timing constraint is not stringent. In addition, by varying ε, our algorithm can provide more design flexibility (in terms of the timing and cost tradeoff) to circuit designers. Even in the late physical design stage where the timing constraint is stringent, in practice one can still rip up and rebuffer the nets with timing violations using [6] to compute the optimal buffering solutions. We call this procedure timing recovery.
Even if FPTAS with timing recovery is not guaranteed to run in polynomial time, in practice it would still run much faster than using [6] alone. This is the case since very few nets violate the timing constraints under our FPTAS, as indicated in the experiments.

Given a timing-cost approximate dynamic programming algorithm, we are ready to present the oracle. First, for any positive number x, each buffer cost w is scaled by the factor of $\frac{n}{x\epsilon}$ followed by down-rounding, i.e., w becomes $\left\lfloor \frac{w n}{x \epsilon} \right\rfloor$. The dynamic programming is applied to the scaled and rounded buffering problem with the cost budget set to $n/\epsilon$. There are two possible decision results in an oracle query.

1. By dynamic programming, a buffering solution with timing no greater than $(1+\epsilon)T$ is found with total buffer cost $W \le n/\epsilon$. This means that using the unscaled and unrounded costs, the cost of the obtained buffering solution will be smaller than $x + x\epsilon = (1+\epsilon)x$. This is the case since the total rounding error in cost is at most $x\epsilon$, by noting that the rounding error is at most $\frac{x\epsilon}{n}$ at each buffer and there are only n candidate buffer locations. Therefore, there is a buffering solution with cost smaller than $(1+\epsilon)x$ and timing at most $(1+\epsilon)T$ in the original buffering problem. We conclude that $W^* < (1+\epsilon)x$.

2. By dynamic programming, there is no buffering solution with cost $W \le n/\epsilon$ which can satisfy even the relaxed timing constraint $(1+\epsilon)T$. This means that the original unscaled buffering problem does not have a buffering solution within the cost $\frac{n}{\epsilon} \cdot \frac{x\epsilon}{n} = x$ satisfying $(1+\epsilon)T$, and thus none satisfying T. We conclude that $W^* \ge x$ in the original buffering problem.

Since both timing and cost are rounded, our FPTAS actually computes the $(1+\epsilon)$ approximation to the buffering problem with the longest path delay bounded by $(1+\epsilon)T$. According to Lemma 2 in Section 5, with both approximation parameters set to ε the proposed dynamic programming algorithm runs in $O\left(\frac{m^2 n^2}{\epsilon^3} + \frac{m n^2 b}{\epsilon^2} + \frac{n^3 b^2}{\epsilon}\right)$ time, which gives the time for an oracle query.

4.2 The Double-ε Oracle Based Solution Search
After obtaining the oracle, a $(1+\epsilon)$ approximation could be computed as follows. Start with an initial lower bound l and an initial upper bound u on the optimal buffering cost. For example, u could correspond to always using the largest buffer at every candidate buffer location, and l could be set to the cost of the single buffer with the smallest cost in the buffer library. One just needs to compare the timing of the given unbuffered net to the timing constraint to account for the case where no buffer needs to be inserted.
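The oracle query of Section 4.1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the timing-cost approximate dynamic programming of Section 5 is abstracted as a callable `approx_dp` (a hypothetical interface, not the authors' code), and `toy_dp` is a made-up stand-in that pretends timing is met exactly when at least a fixed number of buffers is inserted.

```python
import math
from typing import Callable, List, Optional, Sequence

# Hypothetical interface: given scaled integer buffer costs and an integer
# cost budget, return some solution object, or None if the budget is too low.
ApproxDP = Callable[[Sequence[int], int], Optional[object]]


def scale_costs(costs: Sequence[float], x: float, eps: float, n: int) -> List[int]:
    """Scale each buffer cost w by n/(x*eps) and round down."""
    return [math.floor(w * n / (x * eps)) for w in costs]


def oracle(costs: Sequence[float], x: float, eps: float, n: int,
           approx_dp: ApproxDP) -> bool:
    """True  -> Case 1: a solution fits the budget, so W* < (1 + eps) * x.
    False -> Case 2: no solution fits the budget, so W* >= x."""
    scaled = scale_costs(costs, x, eps, n)
    budget = math.floor(n / eps)  # scaled costs are integers, so floor is safe
    return approx_dp(scaled, budget) is not None


def toy_dp(min_buffers: int) -> ApproxDP:
    """Stand-in for the real DP (assumption): timing is considered satisfied
    iff at least `min_buffers` buffers are used, so the cheapest feasible
    solution uses `min_buffers` copies of the cheapest buffer."""
    def dp(scaled: Sequence[int], budget: int) -> Optional[object]:
        cost = min_buffers * min(scaled)
        return cost if cost <= budget else None
    return dp
```

With costs `[2.0, 5.0]`, `n = 4`, `eps = 0.5` and `toy_dp(2)`, the toy optimum is 4; a guess of `x = 4` lands in Case 1 and a guess of `x = 1` lands in Case 2, matching the two decision results above.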
Subsequently, a binary search is performed within these bounds. That is, each time $x = \frac{l+u}{2}$ is used to query the oracle and, depending on the decision results, l and u are updated accordingly, i.e., update $u = (1+\epsilon)x$ in Case 1 and update $l = x$ in Case 2.

The complexity of this algorithm certainly depends on the gap between the initial upper and lower bounds. One can make the time complexity independent of the initial bound values as in [18]. This is accomplished by utilizing the fact that an oracle query takes time inversely proportional to ε. Precisely, a larger ε leads to a coarser approximation but runs faster, while a smaller ε leads to a finer approximation but runs slower. This provides the opportunity to vary the approximation ratio adaptively to accelerate the whole procedure of oracle based solution search. Instead of sticking to the target approximation ratio ε, one can use a larger ε initially and gradually reduce it to the target. When these ε form a decreasing geometric sequence (e.g., ..., 27, 9, 3, 1, 1/3, ..., ε), the total asymptotic runtime will be bounded by the last query, since an oracle query takes time proportional to $1/\epsilon$.

Our oracle based solution search is similar to the one in [18], with an important difference as follows. Since their algorithm only rounds one parameter while our approach rounds two parameters (cost W and timing Q), applying the technique in [18] leads to adaptively setting both rounding parameters. This is not desired for rounding Q, since the timing error will then not be controlled by ε. That is, in the first few iterations of the oracle based search, the timing of the solutions may be significantly larger than $(1+\epsilon)T$, since the approximation ratios there would be big (e.g., the first few approximation ratios in ..., 27, 9, 3, 1, 1/3, ..., ε are much bigger than the target ε). This means that by dynamic programming, one only knows whether the cost budget $n/\epsilon$ is sufficient for computing a solution with much larger timing, which may lead to the wrong decision in narrowing down the gap between the upper and lower bounds.
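This motivates us to change only the ε corresponding to cost but not the ε corresponding to timing. We call it the double-ε oracle based solution search. This is why there are two parameters, $\epsilon_1$ and $\epsilon_2$, in the dynamic programming algorithm in Section 5. Namely, $\epsilon_1$ corresponds to the approximation ratio on cost W and $\epsilon_2$ corresponds to the approximation ratio on timing Q. If we fix both of them to ε, one cannot reduce the total runtime to be independent of the initial bounds. Our idea is to fix $\epsilon_2$ at ε while adaptively changing $\epsilon_1$.

In detail, let $u_{W,i}$ denote the upper bound and $l_{W,i}$ denote the lower bound after the i-th iteration of the oracle based solution search. Initially, $u_{W,0} = u$ and $l_{W,0} = l$. A sequence of approximation ratios $\epsilon_i$ will be used in the oracle based solution search for $\epsilon_1$ (but not for $\epsilon_2$, since it is fixed at ε). Following [18], set

$$\epsilon_i = \sqrt{\frac{u_{W,i}}{l_{W,i}}} - 1. \qquad (1)$$

In each iteration,

$$x = \sqrt{\frac{l_{W,i}\, u_{W,i}}{1+\epsilon_i}} \qquad (2)$$

is used to query the oracle. It can be proved that after the i-th oracle query,

$$\frac{u_{W,i+1}}{l_{W,i+1}} = \left(\frac{u_{W,i}}{l_{W,i}}\right)^{3/4}, \qquad (3)$$

since for Case 1 $u_{W,i+1} = (1+\epsilon_i)x = u_{W,i}^{3/4}\, l_{W,i}^{1/4}$ and $l_{W,i+1} = l_{W,i}$, and for Case 2 $u_{W,i+1} = u_{W,i}$ and $l_{W,i+1} = x = u_{W,i}^{1/4}\, l_{W,i}^{3/4}$.

This process is iterated until the ratio between the upper and lower bounds is no greater than 2. Let t be the first i such that $u_{W,i}/l_{W,i} \le 2$. According to Lemma 2 in Section 5, an oracle query with $\epsilon_1 = \epsilon_i$ and $\epsilon_2 = \epsilon$ takes $O\left(\frac{m^2 n^2}{\epsilon\,\epsilon_i^2} + \frac{m n^2 b}{\epsilon\,\epsilon_i} + \frac{n^3 b^2}{\epsilon_i}\right)$ time. We first bound the time until t, which is

$$O\left(\sum_{0 \le i < t} \frac{m^2 n^2}{\epsilon\,\epsilon_i^2} + \sum_{0 \le i < t} \frac{m n^2 b}{\epsilon\,\epsilon_i} + \sum_{0 \le i < t} \frac{n^3 b^2}{\epsilon_i}\right).$$

Since $\epsilon_i = \sqrt{u_{W,i}/l_{W,i}} - 1$ and $u_{W,i}/l_{W,i} > 2$ for $i < t$, we have $\sqrt{u_{W,i}/l_{W,i}} - 1 \ge (1 - 1/\sqrt{2})\sqrt{u_{W,i}/l_{W,i}}$, which means

$$\frac{1}{\epsilon_i^2} = O\left(\frac{l_{W,i}}{u_{W,i}}\right). \qquad (4)$$

Thus, $O\left(\sum_{0 \le i < t} \frac{m^2 n^2}{\epsilon\,\epsilon_i^2}\right) = O\left(\frac{m^2 n^2}{\epsilon} \sum_{0 \le i < t} \frac{l_{W,i}}{u_{W,i}}\right)$.

By Eq. (3), $u_{W,i}/l_{W,i} = (u_{W,t}/l_{W,t})^{(4/3)^{t-i}}$. It is shown in [18] that

$$\sum_{0 \le i < t} \frac{l_{W,i}}{u_{W,i}} = \sum_{0 \le i < t} \left(\frac{l_{W,t}}{u_{W,t}}\right)^{(4/3)^{t-i}} \le \sum_{0 \le j < t} \left(\frac{l_{W,t}}{u_{W,t}}\right)^{(4/3)^{j}} \qquad (5)$$

(note that j starts from 0), and $\frac{l_{W,t}}{u_{W,t}} < \frac{1}{2^{3/4}}$ by noting that t is the first i such that $u_{W,i}/l_{W,i} \le 2$. Subsequently,

$$\sum_{0 \le i < t} \frac{l_{W,i}}{u_{W,i}} < \sum_{0 \le j < t} \left(\frac{1}{2^{3/4}}\right)^{(4/3)^{j}}. \qquad (6)$$

The last term is dominated by the sum of a monotonically decreasing geometric sequence, which is certainly bounded by O(1). Thus,

$$O\left(\sum_{0 \le i < t} \frac{m^2 n^2}{\epsilon\,\epsilon_i^2}\right) = O\left(\frac{m^2 n^2}{\epsilon}\right). \qquad (7)$$

Similarly, since $\sum_{0 \le i < t} \sqrt{l_{W,i}/u_{W,i}} < \sum_{0 \le j < t} 0.59^{(4/3)^j/2} = O(1)$ [18], we have $O\left(\sum_{0 \le i < t} \frac{m n^2 b}{\epsilon\,\epsilon_i}\right) = O\left(\frac{m n^2 b}{\epsilon}\right)$ and $O\left(\sum_{0 \le i < t} \frac{n^3 b^2}{\epsilon_i}\right) = O(n^3 b^2)$. The total runtime for the oracle based solution search is bounded by

$$O\left(\frac{m^2 n^2}{\epsilon} + \frac{m n^2 b}{\epsilon} + n^3 b^2\right), \qquad (8)$$

since $\epsilon_2$ is fixed at ε.

After t oracle queries, the ratio between the upper and lower bounds is at most 2. With this better starting point, the timing-cost approximate dynamic programming can be efficiently performed with both $\epsilon_1$ and $\epsilon_2$ set to ε. First set x to the current lower bound $l_{W,t}$. Subsequently, scale and round the cost w of each buffer to $\left\lfloor \frac{w n}{x \epsilon} \right\rfloor$, where ε is the target ε. Note that $W^*$ is no greater than $u_{W,t} \le 2x$, which means that there is at least one solution with scaled cost no greater than $2n/\epsilon$. Otherwise, similar to Case 2, there would be no buffering solution within cost $\frac{2n}{\epsilon} \cdot \frac{x\epsilon}{n} = 2x \ge u_{W,t}$, which is a contradiction. This solution is the $(1+\epsilon)$ approximation which will be returned by our FPTAS. Similar to the argument in Case 1, scaling the result back by the factor of $\frac{x\epsilon}{n}$ forms a lower bound on $W^*$, and the maximum rounding error is $n \cdot \frac{x\epsilon}{n} = x\epsilon = \epsilon\, l_{W,t} \le \epsilon W^*$. Thus, the cost of the obtained buffering solution is at most $(1+\epsilon)W^*$. This last step uses the dynamic programming, which takes $O\left(\frac{m^2 n^2}{\epsilon^3} + \frac{m n^2 b}{\epsilon^2} + \frac{n^3 b^2}{\epsilon}\right)$ time by noting that both $\epsilon_1$ and $\epsilon_2$ are set to ε. The efficient computation is due to the fact that the ratio between the current upper and lower bounds has been reduced to 2.

Together with the time for the oracle based solution search in Eq. (8), the total time can be simplified to $O(m^2 n^2 b/\epsilon^3 + n^3 b^2/\epsilon)$ for $0 < \epsilon < 1$ and to $O(m^2 n^2 b/\epsilon + m n^2 b + n^3 b^2)$ for $\epsilon \ge 1$. We reach the following theorem.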
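The bound-narrowing loop of Section 4.2 can be sketched as follows. This is a minimal illustration, not the authors' implementation: the oracle is passed in as a hypothetical callable `oracle(x, eps_cost)` that returns True in Case 1 (the optimal cost is below (1 + eps_cost)·x) and False in Case 2 (the optimal cost is at least x), with the timing parameter assumed to be fixed inside the oracle.

```python
import math
from typing import Callable, Tuple


def double_eps_search(lower: float, upper: float,
                      oracle: Callable[[float, float], bool]
                      ) -> Tuple[float, float, int]:
    """Narrow [lower, upper] around the optimal cost until upper/lower <= 2.

    Each query uses a cost approximation ratio tied to the current gap,
    so the ratio upper/lower shrinks to its 3/4 power per query.
    """
    queries = 0
    while upper / lower > 2.0:
        eps_i = math.sqrt(upper / lower) - 1.0         # adaptive cost ratio
        x = math.sqrt(lower * upper / (1.0 + eps_i))   # guess sent to the oracle
        if oracle(x, eps_i):
            upper = (1.0 + eps_i) * x                  # Case 1: tighten upper bound
        else:
            lower = x                                  # Case 2: raise lower bound
        queries += 1
    return lower, upper, queries


# Toy oracle (assumption): the optimum is known, so the answer is exact.
W_STAR = 37.0
exact_oracle = lambda x, eps_cost: W_STAR < x
```

Starting from bounds 1 and 1000, the loop brackets the toy optimum within a factor of 2 after a handful of queries, regardless of which case each query falls into; the final run of the dynamic programming with the target ratio would then start from these bounds.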
Theorem 1: A $(1+\epsilon)$ approximation to the timing constrained minimum cost buffering problem can be computed in $O(m^2 n^2 b/\epsilon^3 + n^3 b^2/\epsilon)$ time for any $0 < \epsilon < 1$ and in $O(m^2 n^2 b/\epsilon + m n^2 b + n^3 b^2)$ time for $\epsilon \ge 1$, where n is the number of nodes in the tree, m is the number of sinks in the tree, and b is the number of buffers in the buffer library.

5. POLYNOMIAL TIME TIMING-COST APPROXIMATE DYNAMIC PROGRAMMING

5.1 Bound Distinct Cost W and RAT Q
A careful investigation of Lillis' algorithm [6] would reveal that the number of solutions is not polynomially bounded, which is why Lillis' algorithm is a pseudo-polynomial algorithm. To design an efficient algorithm, we certainly need to bound the number of solutions during solution propagation. A major innovation in the proposed FPTAS algorithm is a dynamic programming algorithm with polynomially bounded W and Q. As a result, the number of solutions will also be polynomially bounded (since there is only one possible non-dominated solution for each pair of W and Q, namely, the one with the smallest C).

First note that we have two ε in the algorithm, one being $\epsilon_1$, which refers to the approximation in cost, and the other being $\epsilon_2$, which refers to the approximation in timing. $\epsilon_1$ is varying while $\epsilon_2$ is fixed to ε in the fast double-ε oracle based solution search.

To bound W, recall that in Section 4, one first scales and rounds each buffer cost w to an integer as $w = \left\lfloor \frac{w n}{x \epsilon_1} \right\rfloor$. After that, the oracle only wants to know whether there is a solution (approximately satisfying the timing constraint) with cost up to $n/\epsilon_1$. The oracle uses this cost bound to perform the dynamic programming. Thus, whenever there is a solution with cost greater than $n/\epsilon_1$, it will be eliminated from the solution set. Consequently, there are at most $n/\epsilon_1 + 1$ distinct W $(0, 1, \ldots, n/\epsilon_1)$ at any location during solution propagation.

Our new dynamic programming algorithm is as follows. First, it always works with cost bins (W-bin), since the oracle scales and rounds each buffer cost before performing the dynamic programming algorithm. In dynamic programming, right before a branch merge, all the solutions are also discretized into timing bins (Q-bin) and then the solution pruning is performed.
This allows us to bound Q as well. Consequently, the number of non-dominated solutions during solution propagation is bounded.

To bound the number of distinct Q, right before each branch merge, for all $Q \ge 0$, round up the Q of each branch to the nearest value in $\{0, \epsilon_2 T/m, 2\epsilon_2 T/m, \ldots, T\}$, where m is the number of sinks and $\epsilon_2$ is the approximation parameter for timing ($\epsilon_1$ being the one for cost). We underestimate the delay by rounding. For example, when $\epsilon_2 = 0.5$ and $m = 2$, $Q = 0.7T$ will be up-rounded to $3\epsilon_2 T/m$. The solutions with $Q < 0$ will be pruned, since the arrival time at the driver is 0. Thus, there are at most $m/\epsilon_2 + 1$ distinct Q after rounding. Together with the fact that there are at most $O(n/\epsilon_1)$ distinct W at any point, at most $O\left(\frac{n}{\epsilon_1} \cdot \frac{m}{\epsilon_2}\right) = O\left(\frac{mn}{\epsilon_1 \epsilon_2}\right)$ non-dominated solutions can be obtained after any branch merge (the number of non-dominated solutions before a branch merge will be discussed soon). This is due to the fact that there is only one solution for each pair of Q, W, namely, the one with minimum C.

Note that the rounding error in timing at each branching point is at most $\epsilon_2 T/m$ and thus at most $\epsilon_2 T$ for the whole tree with m branching points. Note that the timing of the obtained solution (in the scaled problem) is at most T, but it is with the rounded Q. This means that after rounding Q back, we obtain a buffering solution with timing at most $(1+\epsilon_2)T$ for the original buffering problem.

The time complexity analysis is as follows. After performing a branch merge, there are at most $O\left(\frac{mn}{\epsilon_1 \epsilon_2}\right)$ non-dominated solutions. After that, solutions are propagated in its upstream branch. An add-wire operation does not introduce new solutions. After performing the buffer insertion operation at a node v, for solutions with new buffers inserted at v, there are only b distinct C (where b is the number of buffer types). Since the number of W is always bounded by $O(n/\epsilon_1)$, there are at most $O\left(b \cdot \frac{n}{\epsilon_1}\right) = O(nb/\epsilon_1)$ non-dominated buffered solutions. This is due to the fact that there is only one solution for each pair of C, W, i.e., the one with maximum Q.

We are to bound the time for computing these $O(nb/\epsilon_1)$ non-dominated buffered solutions. Given a solution, buffer insertion at v leads to b possible new solutions. Since there are at most $O\left(\frac{mn}{\epsilon_1 \epsilon_2} + \frac{n^2 b}{\epsilon_1}\right)$ non-dominated solutions anywhere (see below), computing the non-dominated buffered solutions at node v takes $O\left(\frac{mnb}{\epsilon_1 \epsilon_2} + \frac{n^2 b^2}{\epsilon_1}\right)$ time. This is because within a single W-bin, (Q, C) based pruning takes linear time in the number of generated solutions, which is the same as the pruning without considering W in [5]. Note that cross W-bin pruning is not performed, since this will not improve the asymptotic complexity.

Together with those $O\left(\frac{mn}{\epsilon_1 \epsilon_2}\right)$ unbuffered solutions (which are propagated by add wire from the last branch merge), there are at most $O\left(\frac{mn}{\epsilon_1 \epsilon_2} + \frac{nb}{\epsilon_1}\right)$ non-dominated solutions. When these solutions are propagated along this branch, there are at most $O\left(\frac{mn}{\epsilon_1 \epsilon_2} + n \cdot \frac{nb}{\epsilon_1}\right) = O\left(\frac{mn}{\epsilon_1 \epsilon_2} + \frac{n^2 b}{\epsilon_1}\right)$ non-dominated solutions at any point before the next branch merge, since there are at most n candidate buffer locations and Q is not rounded until the branch merge.

Before a branch merge, all solutions are first put into cost bins (W-bin) and timing bins (Q-bin). That is, they are placed into the bins with integer costs from 0 to $n/\epsilon_1$. In each cost bin, there are $m/\epsilon_2 + 1$ timing bins with the timing as $0, \epsilon_2 T/m, 2\epsilon_2 T/m, \ldots, T$ by up-rounding each Q. By a linear traversal of all bins, dominated solutions will be pruned. The whole process certainly takes time linear in the number of solutions, i.e., $O\left(\frac{mn}{\epsilon_1 \epsilon_2} + \frac{n^2 b}{\epsilon_1}\right)$ time.

In solution pruning, since W is always an integer, there are $W + 1$ possible merging results of left and right branch solutions for each W.
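For example, to obtain merged cost $W = 3$, the four possible combinations of left-branch solution $\gamma_1$ and right-branch solution $\gamma_2$ are (1) $W(\gamma_1) = 0, W(\gamma_2) = 3$, (2) $W(\gamma_1) = 1, W(\gamma_2) = 2$, (3) $W(\gamma_1) = 2, W(\gamma_2) = 1$, and (4) $W(\gamma_1) = 3, W(\gamma_2) = 0$. For a fixed combination on W, there is a list of solutions with distinct Q along each branch. For each Q-bin, the time complexity can be easily bounded, since all the solutions in each branch are non-dominated and thus, for the same W, C are increasingly sorted. One just needs to correspondingly merge two solutions with the same Q.

Table 1: An example for branch merge. The left-branch and right-branch solutions (Q, W, C) are each tabulated by cost W = 0, 1, 2, 3 (columns) and timing Q = 0, $\epsilon_2 T/m$, $2\epsilon_2 T/m$ (rows), with C as the entry. (The numerical entries of this table are not recoverable.)

It is helpful to look at an example to illustrate the above analysis. Refer to Table 1. To obtain the merged cost of 3, suppose that we are merging solutions with $W = 2$ in the left branch and solutions with $W = 1$ in the right branch. The corresponding entries are shown with arrows. For $Q = 0$ after branch merge, the minimum C is 7. For $Q = \epsilon_2 T/m$ after branch merge, the minimum C is 37. For $Q = 2\epsilon_2 T/m$ after branch merge, the minimum C is 90. One then also needs to consider the other three merging possibilities for the merged cost to be 3, i.e., (1) $W(\gamma_1) = 0, W(\gamma_2) = 3$, (2) $W(\gamma_1) = 1, W(\gamma_2) = 2$, and (4) $W(\gamma_1) = 3, W(\gamma_2) = 0$. Subsequently, the solution with the minimum C for each pair of W, Q is picked. This process takes $O(m/\epsilon_2)$ time for each merged W, since there are only $O(m/\epsilon_2)$ distinct Q. Summing over all W up to $n/\epsilon_1$, the time complexity for the branch merge and the solution pruning in the branch merge is $O\left(\frac{m}{\epsilon_2} \cdot \left(\frac{n}{\epsilon_1}\right)^2\right) = O\left(\frac{m n^2}{\epsilon_1^2 \epsilon_2}\right)$. Together with the time for putting the solutions into bins before the branch merge, the total runtime is $O\left(\frac{mn}{\epsilon_1 \epsilon_2} + \frac{n^2 b}{\epsilon_1} + \frac{m n^2}{\epsilon_1^2 \epsilon_2}\right)$ for a single branch merge.

5.2 Time Complexity
As mentioned above, a single buffer insertion together with pruning takes $O\left(\frac{mnb}{\epsilon_1 \epsilon_2} + \frac{n^2 b^2}{\epsilon_1}\right)$ time, where $\epsilon_1$ and $\epsilon_2$ are the approximation parameters for cost and timing. Summing over n candidate buffer locations, total buffer insertion takes $O\left(\frac{m n^2 b}{\epsilon_1 \epsilon_2} + \frac{n^3 b^2}{\epsilon_1}\right)$ time. This certainly upper bounds the time for add wire. A single branch merge takes $O\left(\frac{mn}{\epsilon_1 \epsilon_2} + \frac{n^2 b}{\epsilon_1} + \frac{m n^2}{\epsilon_1^2 \epsilon_2}\right)$ time. Summing over all m branch merges, $O\left(\frac{m^2 n}{\epsilon_1 \epsilon_2} + \frac{m n^2 b}{\epsilon_1} + \frac{m^2 n^2}{\epsilon_1^2 \epsilon_2}\right)$ time is needed.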
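As an aside on the branch merge of Section 5.1, the Q up-rounding and the per-(W, Q) selection of the minimum C can be sketched in Python. This is a simplified illustration under assumptions that are not from the paper: solutions are plain `(W, Q, C)` tuples, the merge enumerates all pairs rather than using the linear-time sorted traversal, and the merged solution takes the sum of costs and capacitances and the smaller of the two Q values (the usual rule for merging required arrival times).

```python
import math
from typing import Dict, Iterable, Tuple

# (scaled cost W, Q-bin index k) -> minimum downstream capacitance C.
# Bin index k stands for the rounded timing value k * eps2 * T / m.
Bins = Dict[Tuple[int, int], float]


def q_bin(q: float, T: float, m: int, eps2: float) -> int:
    """Round Q up to the nearest multiple of eps2*T/m; return the multiple's index."""
    step = eps2 * T / m
    return math.ceil(q / step - 1e-9)  # small tolerance for exact multiples


def discretize(solutions: Iterable[Tuple[int, float, float]],
               T: float, m: int, eps2: float, budget: int) -> Bins:
    """Drop solutions with Q < 0 or W > budget; keep the min C per (W, Q-bin)."""
    bins: Bins = {}
    for w, q, c in solutions:
        if q < 0 or w > budget:
            continue
        key = (w, q_bin(q, T, m, eps2))
        if c < bins.get(key, math.inf):
            bins[key] = c
    return bins


def merge(left: Bins, right: Bins, budget: int) -> Bins:
    """Pairwise merge of two branches: W and C add, Q is the smaller bin."""
    merged: Bins = {}
    for (w1, k1), c1 in left.items():
        for (w2, k2), c2 in right.items():
            w = w1 + w2
            if w > budget:
                continue
            key = (w, min(k1, k2))
            if c1 + c2 < merged.get(key, math.inf):
                merged[key] = c1 + c2
    return merged
```

For instance, with T = 100, m = 2 and a timing parameter of 0.5, a solution with Q = 70 falls into bin 3 (timing 75), mirroring the 0.7T example in Section 5.1. Pruning of solutions dominated across different Q-bins is left out to keep the sketch short.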
Thus, the dynamic programming will run in

$$O\left(\frac{m^2 n^2}{\epsilon_1^2 \epsilon_2} + \frac{m n^2 b}{\epsilon_1 \epsilon_2} + \frac{n^3 b^2}{\epsilon_1}\right) \qquad (9)$$

time to compute a solution with cost at most $(1+\epsilon_1)$ times the optimal cost and with timing at most $(1+\epsilon_2)T$, where $\epsilon_1$ and $\epsilon_2$ are the approximation parameters for cost and timing. We reach the following lemma.

Lemma 2: The timing-cost approximate dynamic programming algorithm can compute a timing driven buffering solution with cost at most $(1+\epsilon_1)$ times the optimal cost and with timing at most $(1+\epsilon_2)T$ in $O\left(\frac{m^2 n^2}{\epsilon_1^2 \epsilon_2} + \frac{m n^2 b}{\epsilon_1 \epsilon_2} + \frac{n^3 b^2}{\epsilon_1}\right)$ time for any $\epsilon_1, \epsilon_2 > 0$, where n is the number of candidate buffer locations, m is the number of sinks, and b is the number of buffers in the buffer library.

6. EXPERIMENTAL RESULTS
We compare the proposed FPTAS for the timing driven minimum cost buffering problem to the dynamic programming algorithm [6], which computes the optimal buffering solution. The experiments are performed on a set of 1000 nets at various scales extracted from an industrial ASIC chip. The buffer library consists of 48 buffer types including buffers and inverters. The buffer cost is measured by buffer area in this paper. However, other metrics can be easily handled in FPTAS.

Refer to Table 2 for the comparison. Cost Ratio and Speedup are computed by comparing to the total buffer cost and the runtime of the dynamic programming algorithm. # Vio. specifies the number of nets with timing violations. The results with small $\epsilon < 1$ are shown. This range of ε is desired in practice, since one always wishes to compute solutions close to the optimum. We make the following observations.

• The dynamic programming in [6] computes the optimal solution, and no net has a timing violation. Its total buffer cost and CPU time serve as the baseline for the comparison.

Table 2: Comparison of the dynamic programming [6] and the FPTAS algorithm on 1000 industrial nets. # Vio. specifies the number of nets with timing violations. Cost Ratio and Speedup are computed by comparing to dynamic programming. Columns: FPTAS ε, # Vio., Total Cost, CPU (s), Cost Ratio, Speedup; seven rows plus an Average row. (Most numerical entries are not recoverable; the surviving values are a speedup of 5.7 in the last row and an average speedup of 5.2.)

Table 3: The obtained timing on the nets violating the timing constraints for FPTAS with ε = 0.01. Columns: Net, Timing Constraint, Actual Delay, Timing Violation; three rows. (The numerical entries are not recoverable.)

• Our FPTAS works very well in practice. Compared to the dynamic programming solutions, there are only slight solution degradations in total buffer cost, while on average over 5× speedup is obtained. For example, when the target approximation ratio is set to ε = 0.01, the actual approximation ratio (cost ratio) is only 0.57% while a 4.6× speedup is achieved. The speedup is so significant because the total number of solutions at the driver over 1000 nets is 66,676 for FPTAS with ε = 0.01 (for all iterations in performing the double-ε oracle based solution search) while it is ,22,604 in the dynamic programming algorithm [6]. It is important to note that the cost ratio is theoretically guaranteed to be no greater than ε. In practice, it is much smaller, as is shown in Table 2. This clearly demonstrates the effectiveness of our FPTAS algorithm.

• A larger ε leads to more speedup, while a smaller ε leads to less solution quality degradation. This is again as guaranteed theoretically.

• Since timing is rounded in our timing-cost approximate dynamic programming, there are timing violations in the obtained buffering solutions. However, this happens with very small probability in practice, as indicated by our experimental results. When ε = 0.01, only 3 out of 1000 nets have timing violations. In addition, the obtained timing is theoretically guaranteed to be within $(1+\epsilon)T$, where T is the timing constraint. For example, the nets with timing violations for FPTAS with ε = 0.01 are shown in Table 3. Their actual delays are clearly bounded by $(1+\epsilon)T = 1.01T$.
When the timing constraints are stringent, one may need every net to satisfy the timing constraint. For this, the following timing recovery procedure could be performed. Those nets with timing violations can be ripped up and rebuffered using the optimal dynamic programming [6]. The overall runtime would still be much better than [6], since few nets need to be rebuffered. For the nets without timing violations, the approximation ratio is bounded by $1+\epsilon$. For the nets with timing violations, the optimal solutions are computed in rebuffering. Thus, the approximation ratio is still bounded by $1+\epsilon$ after timing recovery. Refer to Table 4 for the results. The cost is increased compared to FPTAS without timing recovery, since rounding on timing in FPTAS underestimates delay and may make the cost of the obtained buffering solution smaller than the optimal cost (especially for many of the nets with timing violations), even if rounding on cost increases it. Empirically, FPTAS with ε = 0.01 gives the best performance, since it has the fewest nets which need rebuffering.

Table 4: The results of FPTAS with timing recovery. Columns: FPTAS with Timing Recovery ε, # Vio., Total Cost, CPU (s), Cost Ratio, Speedup; seven rows plus an Average row. (Most numerical entries are not recoverable; the surviving value is a speedup of 2.4 in the last row.)

REFERENCES
[1] P. Saxena, N. Menezes, P. Cocchini, and D.A. Kirkpatrick, Repeater scaling and its impact on CAD, TCAD, vol. 23, no. 4.
[2] J. Cong, An interconnect-centric design flow for nanometer technologies, Proceedings of the IEEE, vol. 89, no. 4, 2001.
[3] Z. Li, C. Alpert, S. Hu, T. Muhmud, S. Quay, and P. Villarrubia, Fast interconnect synthesis with layer assignment, ISPD.
[4] P.J. Osler, Placement driven synthesis case studies on two sets of two chips: hierarchical and flat, ISPD.
[5] L.P.P.P. van Ginneken, Buffer placement in distributed RC-tree networks for minimal Elmore delay, in Proceedings of the IEEE International Symposium on Circuits and Systems, 1990.
[6] J. Lillis, C.-K. Cheng, and T.-T.Y. Lin, Optimal wire sizing and buffer insertion for low power and a generalized delay model, IEEE Journal of Solid State Circuits, vol. 31, no. 3, 1996.
[7] W. Shi and Z. Li, A fast algorithm for optimal buffer insertion, TCAD, vol. 24, no. 6.
[8] R. Chen and H. Zhou, A flexible data structure for efficient buffer insertion, ICCD.
[9] W. Shi, Z. Li, and C. Alpert, Complexity analysis and speedup techniques for optimal buffer insertion with minimum cost, ASPDAC.
[10] C.J. Alpert, A. Devgan, and S.T. Quay, Buffer insertion for noise and delay optimization, DAC, 1998.
[11] S. Hu, C.J. Alpert, J. Hu, S. Karandikar, Z. Li, W. Shi, and C.N. Sze, Fast algorithms for slew constrained minimum cost buffering, DAC.
[12] R. Chen and H. Zhou, Fast min-cost buffer insertion under process variations, DAC.
[13] H. Zhou, D.F. Wong, I.-M. Liu, and A. Aziz, Simultaneous routing and buffer insertion with restrictions on buffer locations, DAC, 1999.
[14] T.-C. Chen, A. Chakraborty, and D.Z. Pan, An integrated nonlinear placement framework with congestion and porosity aware buffer planning, DAC.
[15] J. Cong, T. Kong, and D.Z. Pan, Buffer block planning for interconnect-driven floorplanning, ICCAD, 1999.
[16] C.J. Alpert, J. Hu, S.S. Sapatnekar, and P. Villarrubia, A practical methodology for early buffer and wire resource allocation, DAC, 2001.
[17] S. Hu, Z. Li, and C.J. Alpert, A polynomial time approximation scheme for timing constrained minimum cost layer assignment, ICCAD.
[18] S. Hu, Z. Li, and C.J. Alpert, A faster approximation scheme for timing constrained minimum cost layer assignment, ISPD.

condition w i B i S maximum u i

condition w i B i S maximum u i ecture 10 Dyamic Programmig 10.1 Kapsack Problem November 1, 2004 ecturer: Kamal Jai Notes: Tobias Holgers We are give a set of items U = {a 1, a 2,..., a }. Each item has a weight w i Z + ad a utility

More information

6.854J / J Advanced Algorithms Fall 2008

6.854J / J Advanced Algorithms Fall 2008 MIT OpeCourseWare http://ocw.mit.edu 6.854J / 18.415J Advaced Algorithms Fall 2008 For iformatio about citig these materials or our Terms of Use, visit: http://ocw.mit.edu/terms. 18.415/6.854 Advaced Algorithms

More information

Lecture 5. Counting Sort / Radix Sort

Lecture 5. Counting Sort / Radix Sort Lecture 5. Coutig Sort / Radix Sort T. H. Corme, C. E. Leiserso ad R. L. Rivest Itroductio to Algorithms, 3rd Editio, MIT Press, 2009 Sugkyukwa Uiversity Hyuseug Choo choo@skku.edu Copyright 2000-2018

More information

Ones Assignment Method for Solving Traveling Salesman Problem

Ones Assignment Method for Solving Traveling Salesman Problem Joural of mathematics ad computer sciece 0 (0), 58-65 Oes Assigmet Method for Solvig Travelig Salesma Problem Hadi Basirzadeh Departmet of Mathematics, Shahid Chamra Uiversity, Ahvaz, Ira Article history:

More information

Lecture Notes 6 Introduction to algorithm analysis CSS 501 Data Structures and Object-Oriented Programming

Lecture Notes 6 Introduction to algorithm analysis CSS 501 Data Structures and Object-Oriented Programming Lecture Notes 6 Itroductio to algorithm aalysis CSS 501 Data Structures ad Object-Orieted Programmig Readig for this lecture: Carrao, Chapter 10 To be covered i this lecture: Itroductio to algorithm aalysis

More information

. Written in factored form it is easy to see that the roots are 2, 2, i,

. Written in factored form it is easy to see that the roots are 2, 2, i, CMPS A Itroductio to Programmig Programmig Assigmet 4 I this assigmet you will write a java program that determies the real roots of a polyomial that lie withi a specified rage. Recall that the roots (or

More information

Lecturers: Sanjam Garg and Prasad Raghavendra Feb 21, Midterm 1 Solutions

Lecturers: Sanjam Garg and Prasad Raghavendra Feb 21, Midterm 1 Solutions U.C. Berkeley CS170 : Algorithms Midterm 1 Solutios Lecturers: Sajam Garg ad Prasad Raghavedra Feb 1, 017 Midterm 1 Solutios 1. (4 poits) For the directed graph below, fid all the strogly coected compoets

More information

Counting the Number of Minimum Roman Dominating Functions of a Graph

Counting the Number of Minimum Roman Dominating Functions of a Graph Coutig the Number of Miimum Roma Domiatig Fuctios of a Graph SHI ZHENG ad KOH KHEE MENG, Natioal Uiversity of Sigapore We provide two algorithms coutig the umber of miimum Roma domiatig fuctios of a graph

More information

Hash Tables. Presentation for use with the textbook Algorithm Design and Applications, by M. T. Goodrich and R. Tamassia, Wiley, 2015.

Hash Tables. Presentation for use with the textbook Algorithm Design and Applications, by M. T. Goodrich and R. Tamassia, Wiley, 2015. Presetatio for use with the textbook Algorithm Desig ad Applicatios, by M. T. Goodrich ad R. Tamassia, Wiley, 2015 Hash Tables xkcd. http://xkcd.com/221/. Radom Number. Used with permissio uder Creative

More information

Pseudocode ( 1.1) Analysis of Algorithms. Primitive Operations. Pseudocode Details. Running Time ( 1.1) Estimating performance

Pseudocode ( 1.1) Analysis of Algorithms. Primitive Operations. Pseudocode Details. Running Time ( 1.1) Estimating performance Aalysis of Algorithms Iput Algorithm Output A algorithm is a step-by-step procedure for solvig a problem i a fiite amout of time. Pseudocode ( 1.1) High-level descriptio of a algorithm More structured

More information

Analysis of Algorithms

Analysis of Algorithms Presetatio for use with the textbook, Algorithm Desig ad Applicatios, by M. T. Goodrich ad R. Tamassia, Wiley, 2015 Aalysis of Algorithms Iput 2015 Goodrich ad Tamassia Algorithm Aalysis of Algorithms

More information

Analysis Metrics. Intro to Algorithm Analysis. Slides. 12. Alg Analysis. 12. Alg Analysis

Analysis Metrics. Intro to Algorithm Analysis. Slides. 12. Alg Analysis. 12. Alg Analysis Itro to Algorithm Aalysis Aalysis Metrics Slides. Table of Cotets. Aalysis Metrics 3. Exact Aalysis Rules 4. Simple Summatio 5. Summatio Formulas 6. Order of Magitude 7. Big-O otatio 8. Big-O Theorems

More information

Solving Fuzzy Assignment Problem Using Fourier Elimination Method

Solving Fuzzy Assignment Problem Using Fourier Elimination Method Global Joural of Pure ad Applied Mathematics. ISSN 0973-768 Volume 3, Number 2 (207), pp. 453-462 Research Idia Publicatios http://www.ripublicatio.com Solvig Fuzzy Assigmet Problem Usig Fourier Elimiatio

More information

CS 683: Advanced Design and Analysis of Algorithms

CS 683: Advanced Design and Analysis of Algorithms CS 683: Advaced Desig ad Aalysis of Algorithms Lecture 6, February 1, 2008 Lecturer: Joh Hopcroft Scribes: Shaomei Wu, Etha Feldma February 7, 2008 1 Threshold for k CNF Satisfiability I the previous lecture,

More information

Outline and Reading. Analysis of Algorithms. Running Time. Experimental Studies. Limitations of Experiments. Theoretical Analysis

Outline and Reading. Analysis of Algorithms. Running Time. Experimental Studies. Limitations of Experiments. Theoretical Analysis Outlie ad Readig Aalysis of Algorithms Iput Algorithm Output Ruig time ( 3.) Pseudo-code ( 3.2) Coutig primitive operatios ( 3.3-3.) Asymptotic otatio ( 3.6) Asymptotic aalysis ( 3.7) Case study Aalysis

More information

Homework 1 Solutions MA 522 Fall 2017

Homework 1 Solutions MA 522 Fall 2017 Homework 1 Solutios MA 5 Fall 017 1. Cosider the searchig problem: Iput A sequece of umbers A = [a 1,..., a ] ad a value v. Output A idex i such that v = A[i] or the special value NIL if v does ot appear

More information

Running Time. Analysis of Algorithms. Experimental Studies. Limitations of Experiments

Running Time. Analysis of Algorithms. Experimental Studies. Limitations of Experiments Ruig Time Aalysis of Algorithms Iput Algorithm Output A algorithm is a step-by-step procedure for solvig a problem i a fiite amout of time. Most algorithms trasform iput objects ito output objects. The

More information

Sorting in Linear Time. Data Structures and Algorithms Andrei Bulatov

Sorting in Linear Time. Data Structures and Algorithms Andrei Bulatov Sortig i Liear Time Data Structures ad Algorithms Adrei Bulatov Algorithms Sortig i Liear Time 7-2 Compariso Sorts The oly test that all the algorithms we have cosidered so far is compariso The oly iformatio

More information

Lecture 28: Data Link Layer

Lecture 28: Data Link Layer Automatic Repeat Request (ARQ) 2. Go ack N ARQ Although the Stop ad Wait ARQ is very simple, you ca easily show that it has very the low efficiecy. The low efficiecy comes from the fact that the trasmittig

More information

Lecture 18. Optimization in n dimensions

Lecture 18. Optimization in n dimensions Lecture 8 Optimizatio i dimesios Itroductio We ow cosider the problem of miimizig a sigle scalar fuctio of variables, f x, where x=[ x, x,, x ]T. The D case ca be visualized as fidig the lowest poit of

More information

Running Time ( 3.1) Analysis of Algorithms. Experimental Studies. Limitations of Experiments

Running Time ( 3.1) Analysis of Algorithms. Experimental Studies. Limitations of Experiments Ruig Time ( 3.1) Aalysis of Algorithms Iput Algorithm Output A algorithm is a step- by- step procedure for solvig a problem i a fiite amout of time. Most algorithms trasform iput objects ito output objects.

More information

Analysis of Algorithms

Analysis of Algorithms Aalysis of Algorithms Iput Algorithm Output A algorithm is a step-by-step procedure for solvig a problem i a fiite amout of time. Ruig Time Most algorithms trasform iput objects ito output objects. The

More information

Reversible Realization of Quaternary Decoder, Multiplexer, and Demultiplexer Circuits

Reversible Realization of Quaternary Decoder, Multiplexer, and Demultiplexer Circuits Egieerig Letters, :, EL Reversible Realizatio of Quaterary Decoder, Multiplexer, ad Demultiplexer Circuits Mozammel H.. Kha, Member, ENG bstract quaterary reversible circuit is more compact tha the correspodig

More information

Data Structures and Algorithms. Analysis of Algorithms

Data Structures and Algorithms. Analysis of Algorithms Data Structures ad Algorithms Aalysis of Algorithms Outlie Ruig time Pseudo-code Big-oh otatio Big-theta otatio Big-omega otatio Asymptotic algorithm aalysis Aalysis of Algorithms Iput Algorithm Output

More information

Redundancy Allocation for Series Parallel Systems with Multiple Constraints and Sensitivity Analysis

Redundancy Allocation for Series Parallel Systems with Multiple Constraints and Sensitivity Analysis IOSR Joural of Egieerig Redudacy Allocatio for Series Parallel Systems with Multiple Costraits ad Sesitivity Aalysis S. V. Suresh Babu, D.Maheswar 2, G. Ragaath 3 Y.Viaya Kumar d G.Sakaraiah e (Mechaical

More information

CIS 121 Data Structures and Algorithms with Java Spring Stacks, Queues, and Heaps Monday, February 18 / Tuesday, February 19

CIS 121 Data Structures and Algorithms with Java Spring Stacks, Queues, and Heaps Monday, February 18 / Tuesday, February 19 CIS Data Structures ad Algorithms with Java Sprig 09 Stacks, Queues, ad Heaps Moday, February 8 / Tuesday, February 9 Stacks ad Queues Recall the stack ad queue ADTs (abstract data types from lecture.

More information

What are we going to learn? CSC Data Structures Analysis of Algorithms. Overview. Algorithm, and Inputs

What are we going to learn? CSC Data Structures Analysis of Algorithms. Overview. Algorithm, and Inputs What are we goig to lear? CSC316-003 Data Structures Aalysis of Algorithms Computer Sciece North Carolia State Uiversity Need to say that some algorithms are better tha others Criteria for evaluatio Structure

More information

The isoperimetric problem on the hypercube

The isoperimetric problem on the hypercube The isoperimetric problem o the hypercube Prepared by: Steve Butler November 2, 2005 1 The isoperimetric problem We will cosider the -dimesioal hypercube Q Recall that the hypercube Q is a graph whose

More information

Analysis of Algorithms

Analysis of Algorithms Aalysis of Algorithms Ruig Time of a algorithm Ruig Time Upper Bouds Lower Bouds Examples Mathematical facts Iput Algorithm Output A algorithm is a step-by-step procedure for solvig a problem i a fiite

More information

Lecture 1: Introduction and Strassen s Algorithm

Lecture 1: Introduction and Strassen s Algorithm 5-750: Graduate Algorithms Jauary 7, 08 Lecture : Itroductio ad Strasse s Algorithm Lecturer: Gary Miller Scribe: Robert Parker Itroductio Machie models I this class, we will primarily use the Radom Access

More information

Pattern Recognition Systems Lab 1 Least Mean Squares

Pattern Recognition Systems Lab 1 Least Mean Squares Patter Recogitio Systems Lab 1 Least Mea Squares 1. Objectives This laboratory work itroduces the OpeCV-based framework used throughout the course. I this assigmet a lie is fitted to a set of poits usig

More information

A General Framework for Accurate Statistical Timing Analysis Considering Correlations

A General Framework for Accurate Statistical Timing Analysis Considering Correlations A Geeral Framework for Accurate Statistical Timig Aalysis Cosiderig Correlatios 7.4 Vishal Khadelwal Departmet of ECE Uiversity of Marylad-College Park vishalk@glue.umd.edu Akur Srivastava Departmet of

More information

3D Model Retrieval Method Based on Sample Prediction

3D Model Retrieval Method Based on Sample Prediction 20 Iteratioal Coferece o Computer Commuicatio ad Maagemet Proc.of CSIT vol.5 (20) (20) IACSIT Press, Sigapore 3D Model Retrieval Method Based o Sample Predictio Qigche Zhag, Ya Tag* School of Computer

More information

CSC165H1 Worksheet: Tutorial 8 Algorithm analysis (SOLUTIONS)

CSC165H1 Worksheet: Tutorial 8 Algorithm analysis (SOLUTIONS) CSC165H1, Witer 018 Learig Objectives By the ed of this worksheet, you will: Aalyse the ruig time of fuctios cotaiig ested loops. 1. Nested loop variatios. Each of the followig fuctios takes as iput a

More information

CIS 121 Data Structures and Algorithms with Java Spring Stacks and Queues Monday, February 12 / Tuesday, February 13

CIS 121 Data Structures and Algorithms with Java Spring Stacks and Queues Monday, February 12 / Tuesday, February 13 CIS Data Structures ad Algorithms with Java Sprig 08 Stacks ad Queues Moday, February / Tuesday, February Learig Goals Durig this lab, you will: Review stacks ad queues. Lear amortized ruig time aalysis

More information

Big-O Analysis. Asymptotics

Big-O Analysis. Asymptotics Big-O Aalysis 1 Defiitio: Suppose that f() ad g() are oegative fuctios of. The we say that f() is O(g()) provided that there are costats C > 0 ad N > 0 such that for all > N, f() Cg(). Big-O expresses

More information

A New Morphological 3D Shape Decomposition: Grayscale Interframe Interpolation Method

A New Morphological 3D Shape Decomposition: Grayscale Interframe Interpolation Method A ew Morphological 3D Shape Decompositio: Grayscale Iterframe Iterpolatio Method D.. Vizireau Politehica Uiversity Bucharest, Romaia ae@comm.pub.ro R. M. Udrea Politehica Uiversity Bucharest, Romaia mihea@comm.pub.ro

More information

1 Graph Sparsfication

1 Graph Sparsfication CME 305: Discrete Mathematics ad Algorithms 1 Graph Sparsficatio I this sectio we discuss the approximatio of a graph G(V, E) by a sparse graph H(V, F ) o the same vertex set. I particular, we cosider

More information

CIS 121 Data Structures and Algorithms with Java Fall Big-Oh Notation Tuesday, September 5 (Make-up Friday, September 8)

CIS 121 Data Structures and Algorithms with Java Fall Big-Oh Notation Tuesday, September 5 (Make-up Friday, September 8) CIS 11 Data Structures ad Algorithms with Java Fall 017 Big-Oh Notatio Tuesday, September 5 (Make-up Friday, September 8) Learig Goals Review Big-Oh ad lear big/small omega/theta otatios Practice solvig

More information

arxiv: v2 [cs.ds] 24 Mar 2018

arxiv: v2 [cs.ds] 24 Mar 2018 Similar Elemets ad Metric Labelig o Complete Graphs arxiv:1803.08037v [cs.ds] 4 Mar 018 Pedro F. Felzeszwalb Brow Uiversity Providece, RI, USA pff@brow.edu March 8, 018 We cosider a problem that ivolves

More information

How do we evaluate algorithms?

How do we evaluate algorithms? F2 Readig referece: chapter 2 + slides Algorithm complexity Big O ad big Ω To calculate ruig time Aalysis of recursive Algorithms Next time: Litterature: slides mostly The first Algorithm desig methods:

More information

Random Graphs and Complex Networks T

Random Graphs and Complex Networks T Radom Graphs ad Complex Networks T-79.7003 Charalampos E. Tsourakakis Aalto Uiversity Lecture 3 7 September 013 Aoucemet Homework 1 is out, due i two weeks from ow. Exercises: Probabilistic iequalities

More information

Algorithms for Disk Covering Problems with the Most Points

Algorithms for Disk Covering Problems with the Most Points Algorithms for Disk Coverig Problems with the Most Poits Bi Xiao Departmet of Computig Hog Kog Polytechic Uiversity Hug Hom, Kowloo, Hog Kog csbxiao@comp.polyu.edu.hk Qigfeg Zhuge, Yi He, Zili Shao, Edwi

More information

Computer Science Foundation Exam. August 12, Computer Science. Section 1A. No Calculators! KEY. Solutions and Grading Criteria.

Computer Science Foundation Exam. August 12, Computer Science. Section 1A. No Calculators! KEY. Solutions and Grading Criteria. Computer Sciece Foudatio Exam August, 005 Computer Sciece Sectio A No Calculators! Name: SSN: KEY Solutios ad Gradig Criteria Score: 50 I this sectio of the exam, there are four (4) problems. You must

More information

Heaps. Presentation for use with the textbook Algorithm Design and Applications, by M. T. Goodrich and R. Tamassia, Wiley, 2015

Heaps. Presentation for use with the textbook Algorithm Design and Applications, by M. T. Goodrich and R. Tamassia, Wiley, 2015 Presetatio for use with the textbook Algorithm Desig ad Applicatios, by M. T. Goodrich ad R. Tamassia, Wiley, 201 Heaps 201 Goodrich ad Tamassia xkcd. http://xkcd.com/83/. Tree. Used with permissio uder

More information

A Study on the Performance of Cholesky-Factorization using MPI

A Study on the Performance of Cholesky-Factorization using MPI A Study o the Performace of Cholesky-Factorizatio usig MPI Ha S. Kim Scott B. Bade Departmet of Computer Sciece ad Egieerig Uiversity of Califoria Sa Diego {hskim, bade}@cs.ucsd.edu Abstract Cholesky-factorizatio

More information

Examples and Applications of Binary Search

Examples and Applications of Binary Search Toy Gog ITEE Uiersity of Queeslad I the secod lecture last week we studied the biary search algorithm that soles the problem of determiig if a particular alue appears i a sorted list of iteger or ot. We

More information

Dynamic Programming and Curve Fitting Based Road Boundary Detection

Dynamic Programming and Curve Fitting Based Road Boundary Detection Dyamic Programmig ad Curve Fittig Based Road Boudary Detectio SHYAM PRASAD ADHIKARI, HYONGSUK KIM, Divisio of Electroics ad Iformatio Egieerig Chobuk Natioal Uiversity 664-4 Ga Deokji-Dog Jeoju-City Jeobuk

More information

Morgan Kaufmann Publishers 26 February, COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 5

Morgan Kaufmann Publishers 26 February, COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 5 Morga Kaufma Publishers 26 February, 28 COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Iterface 5 th Editio Chapter 5 Set-Associative Cache Architecture Performace Summary Whe CPU performace icreases:

More information

Combination Labelings Of Graphs

Combination Labelings Of Graphs Applied Mathematics E-Notes, (0), - c ISSN 0-0 Available free at mirror sites of http://wwwmaththuedutw/ame/ Combiatio Labeligs Of Graphs Pak Chig Li y Received February 0 Abstract Suppose G = (V; E) is

More information

CSE 417: Algorithms and Computational Complexity

CSE 417: Algorithms and Computational Complexity Time CSE 47: Algorithms ad Computatioal Readig assigmet Read Chapter of The ALGORITHM Desig Maual Aalysis & Sortig Autum 00 Paul Beame aalysis Problem size Worst-case complexity: max # steps algorithm

More information

Solution printed. Do not start the test until instructed to do so! CS 2604 Data Structures Midterm Spring, Instructions:

Solution printed. Do not start the test until instructed to do so! CS 2604 Data Structures Midterm Spring, Instructions: CS 604 Data Structures Midterm Sprig, 00 VIRG INIA POLYTECHNIC INSTITUTE AND STATE U T PROSI M UNI VERSI TY Istructios: Prit your ame i the space provided below. This examiatio is closed book ad closed

More information

Lecture Notes on Integer Linear Programming

Lecture Notes on Integer Linear Programming Lecture Notes o Iteger Liear Programmig Roel va de Broek October 15, 2018 These otes supplemet the material o (iteger) liear programmig covered by the lectures i the course Algorithms for Decisio Support.

More information

Exact Minimum Lower Bound Algorithm for Traveling Salesman Problem

Exact Minimum Lower Bound Algorithm for Traveling Salesman Problem Exact Miimum Lower Boud Algorithm for Travelig Salesma Problem Mohamed Eleiche GeoTiba Systems mohamed.eleiche@gmail.com Abstract The miimum-travel-cost algorithm is a dyamic programmig algorithm to compute

More information

The Magma Database file formats

The Magma Database file formats The Magma Database file formats Adrew Gaylard, Bret Pikey, ad Mart-Mari Breedt Johaesburg, South Africa 15th May 2006 1 Summary Magma is a ope-source object database created by Chris Muller, of Kasas City,

More information

Image Segmentation EEE 508

Image Segmentation EEE 508 Image Segmetatio Objective: to determie (etract) object boudaries. It is a process of partitioig a image ito distict regios by groupig together eighborig piels based o some predefied similarity criterio.

More information

2. ALGORITHM ANALYSIS

2. ALGORITHM ANALYSIS 2. ALGORITHM ANALYSIS computatioal tractability survey of commo ruig times 2. ALGORITHM ANALYSIS computatioal tractability survey of commo ruig times Lecture slides by Kevi Waye Copyright 2005 Pearso-Addiso

More information

UNIT 4 Section 8 Estimating Population Parameters using Confidence Intervals

UNIT 4 Section 8 Estimating Population Parameters using Confidence Intervals UNIT 4 Sectio 8 Estimatig Populatio Parameters usig Cofidece Itervals To make ifereces about a populatio that caot be surveyed etirely, sample statistics ca be take from a SRS of the populatio ad used

More information

Introduction. Nature-Inspired Computing. Terminology. Problem Types. Constraint Satisfaction Problems - CSP. Free Optimization Problem - FOP

Introduction. Nature-Inspired Computing. Terminology. Problem Types. Constraint Satisfaction Problems - CSP. Free Optimization Problem - FOP Nature-Ispired Computig Hadlig Costraits Dr. Şima Uyar September 2006 Itroductio may practical problems are costraied ot all combiatios of variable values represet valid solutios feasible solutios ifeasible

More information

BASED ON ITERATIVE ERROR-CORRECTION

BASED ON ITERATIVE ERROR-CORRECTION A COHPARISO OF CRYPTAALYTIC PRICIPLES BASED O ITERATIVE ERROR-CORRECTIO Miodrag J. MihaljeviC ad Jova Dj. GoliC Istitute of Applied Mathematics ad Electroics. Belgrade School of Electrical Egieerig. Uiversity

More information

15-859E: Advanced Algorithms CMU, Spring 2015 Lecture #2: Randomized MST and MST Verification January 14, 2015

15-859E: Advanced Algorithms CMU, Spring 2015 Lecture #2: Randomized MST and MST Verification January 14, 2015 15-859E: Advaced Algorithms CMU, Sprig 2015 Lecture #2: Radomized MST ad MST Verificatio Jauary 14, 2015 Lecturer: Aupam Gupta Scribe: Yu Zhao 1 Prelimiaries I this lecture we are talkig about two cotets:

More information

Lower Bounds for Sorting

Lower Bounds for Sorting Liear Sortig Topics Covered: Lower Bouds for Sortig Coutig Sort Radix Sort Bucket Sort Lower Bouds for Sortig Compariso vs. o-compariso sortig Decisio tree model Worst case lower boud Compariso Sortig

More information

Designing a learning system

Designing a learning system CS 75 Machie Learig Lecture Desigig a learig system Milos Hauskrecht milos@cs.pitt.edu 539 Seott Square, x-5 people.cs.pitt.edu/~milos/courses/cs75/ Admiistrivia No homework assigmet this week Please try

More information

IMP: Superposer Integrated Morphometrics Package Superposition Tool

IMP: Superposer Integrated Morphometrics Package Superposition Tool IMP: Superposer Itegrated Morphometrics Package Superpositio Tool Programmig by: David Lieber ( 03) Caisius College 200 Mai St. Buffalo, NY 4208 Cocept by: H. David Sheets, Dept. of Physics, Caisius College

More information

Xiaozhou (Steve) Li, Atri Rudra, Ram Swaminathan. HP Laboratories HPL Keyword(s): graph coloring; hardness of approximation

Xiaozhou (Steve) Li, Atri Rudra, Ram Swaminathan. HP Laboratories HPL Keyword(s): graph coloring; hardness of approximation Flexible Colorig Xiaozhou (Steve) Li, Atri Rudra, Ram Swamiatha HP Laboratories HPL-2010-177 Keyword(s): graph colorig; hardess of approximatio Abstract: Motivated b y reliability cosideratios i data deduplicatio

More information

An Efficient Algorithm for Graph Bisection of Triangularizations

An Efficient Algorithm for Graph Bisection of Triangularizations A Efficiet Algorithm for Graph Bisectio of Triagularizatios Gerold Jäger Departmet of Computer Sciece Washigto Uiversity Campus Box 1045 Oe Brookigs Drive St. Louis, Missouri 63130-4899, USA jaegerg@cse.wustl.edu

More information

Data Structures and Algorithms Part 1.4

Data Structures and Algorithms Part 1.4 1 Data Structures ad Algorithms Part 1.4 Werer Nutt 2 DSA, Part 1: Itroductio, syllabus, orgaisatio Algorithms Recursio (priciple, trace, factorial, Fiboacci) Sortig (bubble, isertio, selectio) 3 Sortig

More information

Adaptive Resource Allocation for Electric Environmental Pollution through the Control Network

Adaptive Resource Allocation for Electric Environmental Pollution through the Control Network Available olie at www.sciecedirect.com Eergy Procedia 6 (202) 60 64 202 Iteratioal Coferece o Future Eergy, Eviromet, ad Materials Adaptive Resource Allocatio for Electric Evirometal Pollutio through the

More information

Σ P(i) ( depth T (K i ) + 1),

Σ P(i) ( depth T (K i ) + 1), EECS 3101 York Uiversity Istructor: Ady Mirzaia DYNAMIC PROGRAMMING: OPIMAL SAIC BINARY SEARCH REES his lecture ote describes a applicatio of the dyamic programmig paradigm o computig the optimal static

More information

Graphs. Minimum Spanning Trees. Slides by Rose Hoberman (CMU)

Graphs. Minimum Spanning Trees. Slides by Rose Hoberman (CMU) Graphs Miimum Spaig Trees Slides by Rose Hoberma (CMU) Problem: Layig Telephoe Wire Cetral office 2 Wirig: Naïve Approach Cetral office Expesive! 3 Wirig: Better Approach Cetral office Miimize the total

More information

Bezier curves. Figure 2 shows cubic Bezier curves for various control points. In a Bezier curve, only

Bezier curves. Figure 2 shows cubic Bezier curves for various control points. In a Bezier curve, only Edited: Yeh-Liag Hsu (998--; recommeded: Yeh-Liag Hsu (--9; last updated: Yeh-Liag Hsu (9--7. Note: This is the course material for ME55 Geometric modelig ad computer graphics, Yua Ze Uiversity. art of

More information

The Adjacency Matrix and The nth Eigenvalue

The Adjacency Matrix and The nth Eigenvalue Spectral Graph Theory Lecture 3 The Adjacecy Matrix ad The th Eigevalue Daiel A. Spielma September 5, 2012 3.1 About these otes These otes are ot ecessarily a accurate represetatio of what happeed i class.

More information

An Improved Shuffled Frog-Leaping Algorithm for Knapsack Problem

An Improved Shuffled Frog-Leaping Algorithm for Knapsack Problem A Improved Shuffled Frog-Leapig Algorithm for Kapsack Problem Zhoufag Li, Ya Zhou, ad Peg Cheg School of Iformatio Sciece ad Egieerig Hea Uiversity of Techology ZhegZhou, Chia lzhf1978@126.com Abstract.

More information

CMSC Computer Architecture Lecture 12: Virtual Memory. Prof. Yanjing Li University of Chicago

CMSC Computer Architecture Lecture 12: Virtual Memory. Prof. Yanjing Li University of Chicago CMSC 22200 Computer Architecture Lecture 12: Virtual Memory Prof. Yajig Li Uiversity of Chicago A System with Physical Memory Oly Examples: most Cray machies early PCs Memory early all embedded systems

More information

Algorithms Chapter 3 Growth of Functions

Algorithms Chapter 3 Growth of Functions Algorithms Chapter 3 Growth of Fuctios Istructor: Chig Chi Li 林清池助理教授 chigchi.li@gmail.com Departmet of Computer Sciece ad Egieerig Natioal Taiwa Ocea Uiversity Outlie Asymptotic otatio Stadard otatios

More information

Greedy Algorithms. Interval Scheduling. Greedy Algorithms. Interval scheduling. Greedy Algorithms. Interval Scheduling

Greedy Algorithms. Interval Scheduling. Greedy Algorithms. Interval scheduling. Greedy Algorithms. Interval Scheduling Greedy Algorithms Greedy Algorithms Witer Paul Beame Hard to defie exactly but ca give geeral properties Solutio is built i small steps Decisios o how to build the solutio are made to maximize some criterio

More information

Computational Geometry

Computational Geometry Computatioal Geometry Chapter 4 Liear programmig Duality Smallest eclosig disk O the Ageda Liear Programmig Slides courtesy of Craig Gotsma 4. 4. Liear Programmig - Example Defie: (amout amout cosumed

More information

Heuristic Approaches for Solving the Multidimensional Knapsack Problem (MKP)

Heuristic Approaches for Solving the Multidimensional Knapsack Problem (MKP) Heuristic Approaches for Solvig the Multidimesioal Kapsack Problem (MKP) R. PARRA-HERNANDEZ N. DIMOPOULOS Departmet of Electrical ad Computer Eg. Uiversity of Victoria Victoria, B.C. CANADA Abstract: -

More information

Chapter 11. Friends, Overloaded Operators, and Arrays in Classes. Copyright 2014 Pearson Addison-Wesley. All rights reserved.

Chapter 11. Friends, Overloaded Operators, and Arrays in Classes. Copyright 2014 Pearson Addison-Wesley. All rights reserved. Chapter 11 Frieds, Overloaded Operators, ad Arrays i Classes Copyright 2014 Pearso Addiso-Wesley. All rights reserved. Overview 11.1 Fried Fuctios 11.2 Overloadig Operators 11.3 Arrays ad Classes 11.4

More information

Elementary Educational Computer

Elementary Educational Computer Chapter 5 Elemetary Educatioal Computer. Geeral structure of the Elemetary Educatioal Computer (EEC) The EEC coforms to the 5 uits structure defied by vo Neuma's model (.) All uits are preseted i a simplified

More information

EE260: Digital Design, Spring /16/18. n Example: m 0 (=x 1 x 2 ) is adjacent to m 1 (=x 1 x 2 ) and m 2 (=x 1 x 2 ) but NOT m 3 (=x 1 x 2 )

EE260: Digital Design, Spring /16/18. n Example: m 0 (=x 1 x 2 ) is adjacent to m 1 (=x 1 x 2 ) and m 2 (=x 1 x 2 ) but NOT m 3 (=x 1 x 2 ) EE26: Digital Desig, Sprig 28 3/6/8 EE 26: Itroductio to Digital Desig Combiatioal Datapath Yao Zheg Departmet of Electrical Egieerig Uiversity of Hawaiʻi at Māoa Combiatioal Logic Blocks Multiplexer Ecoders/Decoders

More information

Master Informatics Eng. 2017/18. A.J.Proença. Memory Hierarchy. (most slides are borrowed) AJProença, Advanced Architectures, MiEI, UMinho, 2017/18 1

Master Informatics Eng. 2017/18. A.J.Proença. Memory Hierarchy. (most slides are borrowed) AJProença, Advanced Architectures, MiEI, UMinho, 2017/18 1 Advaced Architectures Master Iformatics Eg. 2017/18 A.J.Proeça Memory Hierarchy (most slides are borrowed) AJProeça, Advaced Architectures, MiEI, UMiho, 2017/18 1 Itroductio Programmers wat ulimited amouts

More information

Appendix D. Controller Implementation

Appendix D. Controller Implementation COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Iterface 5 th Editio Appedix D Cotroller Implemetatio Cotroller Implemetatios Combiatioal logic (sigle-cycle); Fiite state machie (multi-cycle, pipelied);

More information

Sorting 9/15/2009. Sorting Problem. Insertion Sort: Soundness. Insertion Sort. Insertion Sort: Running Time. Insertion Sort: Soundness

Sorting 9/15/2009. Sorting Problem. Insertion Sort: Soundness. Insertion Sort. Insertion Sort: Running Time. Insertion Sort: Soundness 9/5/009 Algorithms Sortig 3- Sortig Sortig Problem The Sortig Problem Istace: A sequece of umbers Objective: A permutatio (reorderig) such that a ' K a' a, K,a a ', K, a' of the iput sequece The umbers

More information

Minimum Spanning Trees

Minimum Spanning Trees Presetatio for use with the textbook, lgorithm esig ad pplicatios, by M. T. Goodrich ad R. Tamassia, Wiley, 0 Miimum Spaig Trees 0 Goodrich ad Tamassia Miimum Spaig Trees pplicatio: oectig a Network Suppose

More information

Chapter 1. Introduction to Computers and C++ Programming. Copyright 2015 Pearson Education, Ltd.. All rights reserved.

Chapter 1. Introduction to Computers and C++ Programming. Copyright 2015 Pearson Education, Ltd.. All rights reserved. Chapter 1 Itroductio to Computers ad C++ Programmig Copyright 2015 Pearso Educatio, Ltd.. All rights reserved. Overview 1.1 Computer Systems 1.2 Programmig ad Problem Solvig 1.3 Itroductio to C++ 1.4 Testig

More information

ANN WHICH COVERS MLP AND RBF

ANN WHICH COVERS MLP AND RBF ANN WHICH COVERS MLP AND RBF Josef Boští, Jaromír Kual Faculty of Nuclear Scieces ad Physical Egieerig, CTU i Prague Departmet of Software Egieerig Abstract Two basic types of artificial eural etwors Multi

More information
