Interconnect Optimization for High-Level Synthesis of SSA Form Programs

Philip Brisk, Ajay K. Verma, Paolo Ienne
Processor Architecture Laboratory, Swiss Federal Institute of Technology (EPFL), Lausanne, Switzerland
{philip.brisk, ajaykumar.verma,

Majid Sarrafzadeh
Department of Computer Science, University of California, Los Angeles (UCLA), Los Angeles, CA 90095

ABSTRACT

Register allocation for programs in Static Single Assignment (SSA) Form has a polynomial-time solution because interference graphs for procedures in this representation are chordal graphs. This paper explores a complementary problem which is NP-Complete: the assignment of registers to variables in order to minimize interconnect costs. In particular, we attempt to minimize the size of the multiplexers placed on the input to each register. This is particularly important for FPGA-based design flows, where multiplexers have a high cost in terms of both area and delay. An efficient greedy heuristic for color assignment is presented and compared against simulated annealing.

Categories and Subject Descriptors
B.5.2 [Hardware]: Design Aids - automatic synthesis, optimization.

General Terms
Algorithms, Performance, Design.

Keywords
Register Allocation and Assignment, Connectivity Binding, Interconnect Allocation, Static Single Assignment (SSA) Form.

1. INTRODUCTION

In high-level synthesis, register allocation determines how many registers should be allocated to the design. If G = (V, E) is the interference graph of the program, and χ(G) is its chromatic number, then at least χ(G) registers must be allocated. Determining χ(G) is NP-Complete for general graphs; however, interference graphs for Static Single Assignment (SSA) form programs belong to the class of chordal graphs [2, 4, 9], for which χ(G) can be computed in O(|V| + |E|) time [8].

For a graph G, there may be many different χ(G)-colorings. Given G and χ(G), register assignment is the problem of determining the best χ(G)-coloring of G. Given a color assignment, the allocation of interconnect resources (wires and multiplexers) is then derived deterministically.
The goal of register assignment is to minimize the overall cost of the allocated interconnect resources. This problem, connectivity binding, has been proven NP-Complete for applications whose interference graphs are interval graphs [16]; since interval graphs are a subset of the chordal graphs, the problem remains NP-Complete for applications that are synthesized directly from SSA Form.

To the best of our knowledge, this is the first paper to study register assignment for SSA-form programs in the context of synthesis. We have developed two heuristics for this problem. The first is a greedy heuristic that modifies Gavril's optimal algorithm for chordal coloring [8], which does not optimize interconnect resources. The second uses simulated annealing to produce locally optimal solutions; the annealing heuristic that we have developed is tailored specifically to chordal graphs.

The heuristics are compared on a set of large chordal interference graphs taken from real-world benchmarks. The results show that traditional methods for chordal coloring generally perform quite poorly; however, when the chordal coloring algorithm is modified to account for ϕ-functions, the area of multiplexers allocated to the design is reduced by 24.8%, on average. Simulated annealing, in contrast, reduces the area by 25.9%, on average, but at a significant runtime cost.

2. SSA FORM

SSA Form [3, 6-7] is a compiler intermediate representation that has been used for numerous analyses and optimizations. More recently, several techniques for direct synthesis of SSA-form programs have been proposed [4, 12-13].

Any operation of the form x ← ... is a definition of x; any operation of the form ... ← x is a use of x. A procedure P is in SSA Form if: (1) every variable is defined exactly once; and (2) every use of a variable corresponds to its definition.

Ensuring unique definitions of each variable is trivial: each definition of x is replaced with definitions of x_1, x_2, etc. Ensuring that each use corresponds to one definition, however, is a bit more complicated. In Fig. 1
(a), variable x is defined on both sides of a condition, and then used at the join point following the condition. In Fig. 1(b), the definitions of x are replaced with definitions of x_1 and x_2; however, the use of x cannot be changed to a use of x_1 or x_2 without changing the semantics of the application. To rectify this situation, a ϕ-function, x_3 ← ϕ(x_1, x_2), is introduced at the join point, and the use of x is replaced with a use of x_3. The semantics of the ϕ-function are as follows: if the path on the left is taken, then x_3 receives its value from x_1; if the path on the right is taken, then x_3 receives its value from x_2.

ϕ-functions are placed at confluence points in the procedure, where multiple control flow paths merge; the description above is easily generalized to an arbitrary number of converging paths. At a confluence point, there may be multiple ϕ-functions, each corresponding to a different variable in the procedure. Whenever a block containing ϕ-functions is executed, it is assumed that all of the ϕ-functions execute concurrently. This is not a problem for direct synthesis of SSA Form programs, but presents a significant challenge to compilers, which must translate the program out of SSA Form before it can be executed.
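The concurrent semantics of ϕ-functions can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the function `run_phis`, the predecessor labels ("left", "right", "entry", "back"), and the dictionary-based environment are hypothetical names chosen here, not part of the paper.

```python
def run_phis(phis, env, pred):
    """Execute all phi-functions of one block concurrently.

    phis: list of (target, {predecessor_label: source_variable}) pairs
    env:  dict mapping variable names to their current values
    pred: label of the predecessor block that control arrived from
    """
    # Read every selected operand from the old environment first ...
    new_values = {target: env[sources[pred]] for target, sources in phis}
    # ... then commit all results at once, so no phi sees another's output.
    result = dict(env)
    result.update(new_values)
    return result


# x3 <- phi(x1, x2): the value received depends on the path taken.
join = [("x3", {"left": "x1", "right": "x2"})]
env = {"x1": 10, "x2": 20}
from_left = run_phis(join, env, "left")["x3"]
from_right = run_phis(join, env, "right")["x3"]

# Two phis in a loop header that exchange values on the back edge.
header = [("a2", {"entry": "a1", "back": "b2"}),
          ("b2", {"entry": "b1", "back": "a2"})]
state = run_phis(header, {"a1": 1, "b1": 2}, "entry")
state = run_phis(header, state, "back")
```

The second example shows why the order matters for a compiler: evaluating the two ϕ-functions one after the other would overwrite a2 before b2 reads it, whereas a hardware implementation can update both registers in the same clock cycle.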

Figure 1. Illustration of SSA Form.

Techniques to construct SSA Form have been described by Cytron et al. [6] and Briggs et al. [3]; throughout this paper, we assume that Pruned SSA Form [7] is always used. One important advantage of SSA is that all copy operations can be eliminated during its construction [3].

3. RELATED WORK

Connectivity binding subsumes the process of assigning registers to the variables of an application to be synthesized. Typically, the application has already been scheduled, and the operations have already been bound to a set of allocated functional resources. If a value produced by resource A is assigned to register R, then it is necessary to allocate a wire that connects the output of A to the input of R. Since there may be many resources that produce values written to R, a multiplexer must be placed on R's input. The overall goal of connectivity binding is to minimize an objective function that encompasses both the number of wires and the cost of the multiplexers that are allocated during this stage. The problem has been proven NP-Complete by Pangrle [16].

Connectivity binding has been studied in the past, but not for SSA-form programs. Huang et al. [10] used a bipartite weighted matching heuristic to improve an initial assignment of registers. Rim et al. [17] formulated the problem as an integer-linear program and solved it optimally, but in exponential worst-case time. Kim and Liu [14] allocated extra registers in order to reduce the cost of allocating multiplexers, thereby sacrificing an initially optimal allocation of registers. Zhu and Jong [19] developed an approach using network flows, but never compared their technique to any other work. Most recently, Chen and Cong [5] used a k-cofamily based algorithm, and optimized port assignment as a post-processing phase. In many, but not all, cases, Chen and Cong's technique outperformed that of Huang et al.

The ϕ-functions in SSA Form create a new challenge for connectivity binding. Consider a ϕ-function y ← ϕ(..., x, ...).
If x and y are bound to registers r_x and r_y respectively, then a wire from the output of r_x to the input of r_y is required to facilitate the data transfer. On the other hand, if x and y are assigned to the same register r, then no data transfer is required. This issue does not affect synthesis of non-SSA-form applications.

Avakian and Ouaiss [1] studied register binding for FPGAs using simulated annealing. Unlike our work, they did not use SSA Form for synthesis, so their objective function did not consider the effects of ϕ-functions when swapping the registers bound to each variable. Another difference is that they focus on optimizing multiplexers on the inputs to functional units, whereas our work focuses on minimizing the cost of register-to-register transfers.

Their implementation of simulated annealing is limited in two respects. First, it assumes that the interference graph is an interval graph. Interval graphs are the interference graphs of straight-line applications with no control flow; interval graphs also arise for applications with conditionals that have been eliminated via if-conversion, but no loops. Since any procedure can be converted to SSA Form, chordal graphs are much more general. The second limitation is that it only swaps the colors of two variables whose lifetimes are identical. Our implementation of simulated annealing, in contrast, does not impose any restriction on the variables whose colors are swapped.

4. PROBLEM STATEMENT

4.1 Overview

Variable v is live at point p in a program if there is a path from the definition of v to p and a path from p to a use of v. Two variables interfere with one another if there is at least one point in the application where both are live. Two interfering variables cannot reside in the same register.

In SSA Form, an interference graph is a graph G = (V, E, E_ϕ), where V is a set of vertices, each corresponding to an SSA variable; an edge (u, v) ∈ E is placed between every pair of variables u and v that interfere; finally, E_ϕ = {(u, v) ∉ E | u ≠ v and there is a ϕ-function v ← ϕ(..., u, ...)}.
Edges belonging to E_ϕ are called ϕ-edges. Edges in E are undirected; ϕ-edges in E_ϕ are directed, because the data transfer always originates at the parameter of a ϕ-function.

Let f(x) be the color (i.e. register) assigned to variable x. A color assignment is legal if f(u) ≠ f(v) for each edge (u, v) ∈ E. Given a legal color assignment, interconnect allocation is deterministic, as discussed in the next section.

4.2 Interconnect Allocation

In SSA Form, the interference graph G is chordal, so χ(G) can be computed in polynomial time, and χ = χ(G) registers are allocated. Let the set of registers allocated to the design be R = {R_1, ..., R_χ}. Each register R_i has a multiplexer on its input. Let m_i be the number of inputs to this multiplexer. If m_i = 0, then there are no connections to R_i; if m_i = 1, then the multiplexer has only one input, and it can be replaced by a wire; if m_i > 1, then a multiplexer with s_i = ⌈log_2 m_i⌉ selection bits is required. The result of the color assignment is the set M = {m_1, ..., m_χ}.

Next, we derive M from a legal color assignment. Let Φ_ij = {(u, v) ∈ E_ϕ | f(u) = i, f(v) = j}. Φ_ij is the set of ϕ-edges that necessitate a connection from register R_i to R_j. A connection is required if c_ij = |Φ_ij| > 0. Let b_ij = 0 if c_ij = 0 and b_ij = 1 if c_ij > 0. Let B = {b_ij} be a bit-matrix representing all of the b_ij values. The j-th column of B represents all of the connections from other registers to register R_j. From the j-th column, the value of m_j is:

$$ m_j = \sum_{i=1}^{j-1} b_{ij} + \sum_{i=j+1}^{\chi} b_{ij} \qquad (1) $$
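The derivation of M from a legal color assignment can be sketched in a few lines of Python. This is a minimal sketch, not the authors' implementation: the function names `interconnect` and `selection_bits`, the variable names, and the tiny example graph are assumptions made for illustration. Colors are numbered 1..χ as in the text and stored at index color − 1.

```python
def interconnect(phi_edges, color, num_regs):
    """Return (C, B, M) for a color assignment.

    phi_edges: iterable of directed phi-edges (u, v), data flows u -> v
    color:     dict mapping each variable to a register index in 1..num_regs
    """
    n = num_regs
    # c_ij counts the phi-edges that move data from register i to register j.
    C = [[0] * n for _ in range(n)]
    for u, v in phi_edges:
        C[color[u] - 1][color[v] - 1] += 1
    # b_ij records only whether a wire from register i to register j exists.
    B = [[1 if c > 0 else 0 for c in row] for row in C]
    # m_j sums column j of B, skipping the diagonal entry b_jj.
    M = [sum(B[i][j] for i in range(n) if i != j) for j in range(n)]
    return C, B, M


def selection_bits(m):
    """ceil(log2(m)) selection bits; none are needed for 0 or 1 inputs."""
    return 0 if m <= 1 else (m - 1).bit_length()


# Hypothetical example: four variables bound to three registers.
phi_edges = [("a", "c"), ("b", "c"), ("c", "d"), ("a", "d")]
color = {"a": 1, "b": 2, "c": 3, "d": 1}
C, B, M = interconnect(phi_edges, color, 3)
bits = [selection_bits(m) for m in M]
```

In this example the ϕ-edge from a to d costs nothing, because both variables share register 1 and the transfer falls on the diagonal of C.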

There is no need to count b_jj, the connection from R_j to itself. Fig. 2 shows an example chordal interference graph (a) with a color assignment (b). Based on the color assignment, the matrix C = {c_ij} and vector M = {m_i} are shown in Fig. 2(c).

Figure 2. Example chordal extended interference graph (a) with legal 4-color assignment (b) and c_ij and m_i values (c). Legend: interference edge, ϕ-edge.

4.3 Evaluating the Color Assignment

In this paper, the goal is to minimize the aggregate area of multiplexers allocated to the design. A secondary goal could be to minimize the delay of the largest multiplexer; this goal is somewhat dubious because there is no way to tell whether or not that multiplexer will actually lie on the critical path of the final circuit. Let A(m_i) and D(m_i) be the area and delay, respectively, of a multiplexer with m_i inputs. Then the overall area and delay through the multiplexers are as follows:

$$ Area = \sum_{i=1}^{\chi} A(m_i) \qquad (2) $$

$$ Delay = \max_{1 \le i \le \chi} \{ D(m_i) \} \qquad (3) $$

5. REGISTER ASSIGNMENT HEURISTIC

5.1 Optimal Chordal Coloring

Let G = (V, E) be an undirected graph. An elimination order (EO) of G is a one-to-one and onto function α: V → {1, ..., |V|}. Let v_i ∈ V be the vertex such that α(v_i) = i, and V_i = {v_j ∈ V | j ≤ i}. V_0 is the empty set, V_|V| = V, and G_i = (V_i, E_i) is the subgraph of G induced by V_i.

The basic idea of an elimination order is that it can be used to incrementally construct G starting with an empty graph. Inductively, if we have computed G_i, we construct G_{i+1} by adding vertex v_{i+1} to V_i, and adding to E_i all edges connecting v_{i+1} to vertices in V_i. The result is the sets V_{i+1} and E_{i+1}.

Let N(v) be the set of vertices adjacent to v. N_i(v) = N(v) ∩ V_i is the set of vertices adjacent to v with EO indices at most i. An EO is a perfect elimination order (PEO) if N_i(v_i) is a clique for all i.
G is defined to be a chordal graph if and only if G has a PEO; there are several other provably equivalent definitions of chordal graphs, but they are not needed here. A PEO can be computed in O(|V| + |E|) time using an algorithm called Maximum Cardinality Search (MCS).

Given a PEO, a minimum coloring of a chordal graph can be computed in O(|V| + |E|) time using a greedy algorithm by Gavril [8]. Inductively, assume that an optimal coloring has been computed for G_i. Now, consider G_{i+1}, and in particular, vertex v_{i+1}. Since G has a PEO, N_{i+1}(v_{i+1}) is a clique. Therefore, it suffices to assign to v_{i+1} the smallest color not assigned to a vertex in N_{i+1}(v_{i+1}). The maximal color assigned among all vertices is the chromatic number: χ = χ(G).

We should also note that N_i[v_i] = N_i(v_i) ∪ {v_i} is also a clique. The clique N_i[v_i] such that |N_i[v_i]| ≥ |N_j[v_j]|, 1 ≤ j ≤ |V|, is the maximum clique in the interference graph, and its cardinality is equal to χ. The maximum independent set and the minimum clique partition of a chordal graph can also be computed in O(|V| + |E|) time [8].

5.2 Interconnect Optimization

The chordal coloring algorithm described in Section 5.1 computes a minimum coloring of an interference graph, but does not try to optimize Eq. (2) or (3). In this section, we extend the algorithm to optimize area (Eq. (2)). We begin with an interference graph G = (V, E, E_ϕ), as described in Section 4.1. G is chordal since we are synthesizing an application in SSA Form.

First, we run Gavril's algorithm (Section 5.1) to compute χ, and we allocate a set R of registers to the design, such that |R| ≥ χ; a minimum register allocation is not strictly required, and it is possible that allocating more registers can lead to an overall reduction in area once multiplexer optimization has been accounted for. Second, we initialize an |R| × |R| matrix, B, as described in Section 4.2, which is initially empty. Third, we process the vertices of G in PEO order. Any color in the range 1..|R| not assigned to a vertex in N(v_i) is a potential candidate for v_i.
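The processing of vertices in PEO order, with a set of candidate colors per vertex, can be sketched as follows. This is a simplified illustration rather than the paper's implementation: the names `mcs_order`, `greedy_color`, and `choose` are hypothetical, the search is written as a plain O(|V|^2) loop instead of the linear-time version, and the example graph is invented. The visit order of the search is used directly, since each vertex's previously visited neighbors form a clique in a chordal graph. With the default selection rule (smallest candidate) the sketch behaves like the greedy coloring of Section 5.1; a different rule can be passed in through `choose`.

```python
def mcs_order(adj):
    """Maximum Cardinality Search: repeatedly visit the unvisited vertex
    that has the most already-visited neighbors."""
    weight = {v: 0 for v in adj}
    visited = set()
    order = []
    while len(order) < len(adj):
        v = max((u for u in adj if u not in visited), key=weight.get)
        order.append(v)
        visited.add(v)
        for w in adj[v]:
            if w not in visited:
                weight[w] += 1
    return order


def greedy_color(adj, order, num_colors, choose=None):
    """Color vertices in the given order; `choose` picks among free colors."""
    if choose is None:
        choose = lambda v, free, color: min(free)  # smallest free color
    color = {}
    for v in order:
        # Colors already taken by previously colored neighbors of v.
        used = {color[w] for w in adj[v] if w in color}
        free = [c for c in range(1, num_colors + 1) if c not in used]
        if not free:
            raise ValueError("not enough colors for vertex %r" % (v,))
        color[v] = choose(v, free, color)
    return color


# Hypothetical chordal graph: triangles {a,b,c} and {b,c,d}, plus edge d-e.
adj = {"a": ["b", "c"], "b": ["a", "c", "d"], "c": ["a", "b", "d"],
       "d": ["b", "c", "e"], "e": ["d"]}
order = mcs_order(adj)
coloring = greedy_color(adj, order, num_colors=5)
```

Passing the selection rule as a parameter keeps the traversal identical for both the baseline coloring and any interconnect-aware variant; only the choice among free colors changes.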
Let Free(v_i) denote this set of colors; recall that f(v) is the color assigned to vertex v.

$$ Free(v_i) = \{1..|R|\} \setminus \bigcup_{v_j \in N_i(v_i)} f(v_j) \qquad (4) $$

We use the ϕ-edges incident on v_i to help us decide the best color to assign to v_i. ϕ_in(v_i) and ϕ_out(v_i) are the sets of vertices adjacent to v_i via ϕ-edges that have already been assigned colors. The sets of colors assigned to vertices in ϕ_in(v_i) and ϕ_out(v_i), respectively, that are available for v_i, are denoted f_in(v_i) and f_out(v_i). ϕ_in(v_i) and f_in(v_i) are defined as follows; ϕ_out(v_i) and f_out(v_i) are analogous:

$$ \phi_{in}(v_i) = \{ v_j \mid (v_j, v_i) \in E_\phi,\ j < i \} \qquad (5) $$

$$ f_{in}(v_i) = \Big( \bigcup_{v_j \in \phi_{in}(v_i)} f(v_j) \Big) \cap Free(v_i) \qquad (6) $$

We compute a cost, F(c_k), for each color c_k ∈ Free(v_i), and the color with the lowest cost is assigned to v_i; ties are broken arbitrarily. The cost function attempts to minimize the number of new connections that are created by assigning color c_k to vertex v_i. It has two components: the first is the number of wires that will be allocated to transfer data from other registers to r_k if c_k is assigned to v_i; likewise, the second is the number of wires that will be allocated to transfer data from register r_k to other registers. Since b_jk = 1 marks a wire that already exists, a term contributes to the cost only when the corresponding entry of B is 0; we write \bar{b}_{jk} = 1 - b_{jk}.

$$ F(c_k) = \sum_{c_j \in f_{in}(v_i),\ c_j \ne c_k} \bar{b}_{jk} + \sum_{c_j \in f_{out}(v_i),\ c_j \ne c_k} \bar{b}_{kj} \qquad (7) $$

If color c_k is selected for v_i, then we must update the matrix B to account for the new wires that have been allocated. Let B^- denote the matrix prior to assigning a color to v_i, and B the matrix afterward; b^-_jk and b_jk represent individual elements. Let row_k[B] and col_k[B] represent the k-th row and column of B, respectively. These are the only elements of B that will be updated. Let b^-_kj ∈ row_k[B^-] and b^-_jk ∈ col_k[B^-]. Then:

$$ b_{kj} = \begin{cases} 1 & c_j \in f_{out}(v_i) \\ b^-_{kj} & \text{otherwise} \end{cases} \qquad (8) $$

$$ b_{jk} = \begin{cases} 1 & c_j \in f_{in}(v_i) \\ b^-_{jk} & \text{otherwise} \end{cases} \qquad (9) $$

5.3 Complexity

The time complexity of the algorithm described in Section 5.2 is O(|R|^2 |V| + |E|). The complexity of computing the PEO is O(|V| + |E|). As in chordal coloring, the complexity of processing vertices in PEO order and eliminating colors assigned to vertices contained in N_i(v_i) from consideration for v_i is also O(|V| + |E|). In the new algorithm, we consider up to |R| colors for each vertex. The complexity of evaluating F(c_k), per color, is O(|R|) as well, yielding a complexity of O(|R|^2) per vertex. Once color c_k has been selected for vertex v_i, the cost of updating the k-th row and column of B^- is O(|R|) if f_in and f_out are represented as bit-vectors. The overall time complexity is therefore O(|R|^2 |V| + |E|).

6. SIMULATED ANNEALING

Simulated annealing is an iterative improvement heuristic that provides locally optimal solutions to classically hard problems. Due to space limitations, we assume that the reader is familiar with simulated annealing; if not, please refer to the paper by Johnson et al. [11] for an overview.

6.1 Representation

Simulated annealing begins with a problem instance and constructs an initial solution. An objective function evaluates the quality of the solution. The objective function is nonnegative, with smaller values defined as being superior to larger ones. A MOVE operation is then applied to the initial solution. The move is a small perturbation, for example, swapping the colors of two vertices. If the move improves the current solution, then the move is accepted without question; otherwise, the move is either accepted or rejected based on a randomized computation.

A clique is a subset of vertices whose induced subgraph is complete, i.e. there is an edge between every pair of vertices in the subgraph. A clique partition partitions the interference graph into a set of non-overlapping cliques. A clique partition of a chordal graph can be computed in O(|V| + |E|) time [8]. κ(G) is defined to be the clique partition number of G, i.e. the smallest number of cliques that can partition G; κ will denote κ(G) in order to simplify notation. Let C = {C_1, ..., C_κ} be the clique partition. Let Cl(v) = i if v ∈ C_i. An edge (u, v) ∈ E is an intra-clique edge if Cl(u) = Cl(v) and an inter-clique edge otherwise.

The MOVE operation is restricted to swapping the colors of two vertices belonging to the same clique. Therefore, the coloring constraint imposed by each intra-clique edge is satisfied trivially. Using this move operation, intra-clique edges can be removed from the graph, thus reducing the number of constraints.

If |C_i| < χ, then it will not be possible for some vertex v ∈ C_i to receive every possible color in the interference graph. To rectify this, χ - |C_i| dummy vertices are added to each clique.
A dummy vertex is adjacent to each vertex in C_i (including other dummies) but to no other vertices in the interference graph. A dummy vertex is simply a placeholder for a color not used by a clique. After adding dummies, there will be exactly κχ vertices in the graph. Fig. 3 shows Fig. 2 after clique partitioning. The ϕ-edges from Fig. 2 are not shown in Fig. 3. The number of edges is reduced from 22 to .

Figure 3. The graph from Fig. 2(a) with intra-clique edges removed and dummy vertices added; ϕ-edges are not shown.

6.2 MOVE Operation

A vertex v is an illegal vertex if there is some edge (u, v) ∈ E such that f(u) = f(v), i.e. the color assignment is illegal. A vertex v is defined to be a sub-optimal vertex if v is not illegal and there is at least one edge (v, w) ∈ E_ϕ such that f(w) ≠ f(v). The MOVE operation is defined as follows:

(1) Randomly select an illegal vertex v; randomly select another vertex u from v's clique and swap their colors.

(2) If there are no illegal vertices, randomly select a sub-optimal vertex v; randomly select another vertex w from v's clique and swap their colors.

(3) If there are no illegal or sub-optimal vertices, then the current color assignment is optimal; annealing terminates.

In Fig. 2, the solution that would result from swapping the colors of v_8 and v_10 would yield the matrix C = {c_ij} of Eq. (10). The updated vector M becomes [2,,, 2]. Colors 2 and 3 are swapped, and the corresponding rows and columns of C are modified.

[Eq. (10): the updated matrix C]

6.3 Objective Function

Here, we describe the objective function used by the simulated annealing heuristic. For legal solutions, our objective function is the sum of the areas of the multiplexers allocated. Recall that m_i is the number of inputs to register r_i that must be multiplexed. If a coloring solution is legal, then

$$ Obj_{Legal} = \sum_{i=1}^{|R|} A(m_i) \qquad (11) $$

Our implementation of simulated annealing allows illegal coloring solutions to be accepted. A sequence of illegal solutions could lead to a new area of the search space that was not likely to be explored otherwise. Since illegal solutions are of no practical use, we desire an objective function that ensures that all illegal solutions have a higher value than any legal one. The maximum value of a legal objective function is

$$ MAX_{Legal} = |R| \cdot A(|R|) \qquad (12) $$

MAX_Legal is the value of Obj_Legal that would occur if the largest possible multiplexer were placed on the input to every register. If the current solution is illegal, the objective function should influence the annealing procedure toward legalization. An edge (u, v) ∈ E such that f(u) = f(v) is an illegal edge. Let E_Illegal be the subset of interference edges that are illegal based on the current color assignment. Then the objective function when the current solution is illegal is:

$$ Obj_{Illegal} = MAX_{Legal} + |E_{Illegal}| \qquad (13) $$

If MAX_Legal is sufficiently large, there is virtually no chance that a move that causes a legal solution to become illegal will be accepted, due to the large differential in objective function value.
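The legal and illegal objective values described above can be sketched as follows. The sketch rests on assumptions that are not taken from the paper: the function names are hypothetical, the multiplexer sizes M are passed in rather than recomputed, and the area model `tree_area` (an m-input multiplexer built from m − 1 two-input multiplexers) is only a stand-in for a real multiplexer library.

```python
def illegal_edges(edges, color):
    """Interference edges whose two endpoints received the same color."""
    return [(u, v) for u, v in edges if color[u] == color[v]]


def objective(edges, color, M, area):
    """Unnormalized objective: sum of multiplexer areas for a legal
    coloring, MAX_Legal plus the illegal-edge count otherwise."""
    num_regs = len(M)
    max_legal = num_regs * area(num_regs)
    bad = illegal_edges(edges, color)
    if bad:
        return max_legal + len(bad)
    return sum(area(m) for m in M)


def tree_area(m):
    """Assumed area model: m - 1 two-input multiplexers, none for m <= 1."""
    return max(0, m - 1)


# Hypothetical instance with two registers and two interference edges.
edges = [("a", "b"), ("b", "c")]
legal = objective(edges, {"a": 1, "b": 2, "c": 1}, [2, 1], tree_area)
illegal = objective(edges, {"a": 1, "b": 1, "c": 1}, [2, 1], tree_area)
```

Because every m_i is smaller than |R| and the assumed area model is non-decreasing, any legal value stays at or below MAX_Legal, while any illegal value exceeds it by at least one.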
To increase the likelihood of accepting illegal moves, we normalize the objective function such that Obj_Legal takes values in the range [0, 1] and Obj_Illegal takes values in the range (1, 2]. The complete objective function, Obj*, is therefore:

$$ Obj^* = \begin{cases} Obj_{Legal} / MAX_{Legal} & |E_{Illegal}| = 0 \\ 1 + |E_{Illegal}| / |E| & |E_{Illegal}| > 0 \end{cases} \qquad (14) $$

7. EXPERIMENTAL RESULTS

The coloring heuristics described in the preceding sections were integrated into an experimental SSA-based synthesis framework developed by Brisk et al. [4]; this framework has been built on top of the Machine SUIF compiler [18]. We targeted an Altera ACEX 1K FPGA. We generated a library of multiplexers starting with the 5 listed in Table of the paper by Avakian and Ouaiss [1]; we then used the algorithm of Mitra and McCluskey [15] to generate a library of multiplexers from 2-2 inputs. Due to space limitations, the parameters of this library are not shown.

For synthesis, we allocated 5 resources: 2 adders, 2 multipliers, and one ALU that performs logical operations such as AND, OR, etc. We then synthesized each design and studied the effects of the interconnect optimization heuristics described in this paper. We tested 3 approaches for interconnect optimization: chordal coloring [4, 8] (Section 5.1), optimized chordal coloring (Section 5.2), and simulated annealing (Section 6). Table 1 shows the parameters used for simulated annealing; the parameters are taken from the paper by Johnson et al. [11]. Our benchmarks were interference graphs taken from the paper by Brisk et al. [4]; these graphs were selected because they are large, and present a significant challenge to the synthesizer.

Table 2 shows the results of the experiments. The area for each design is shown in terms of logic blocks, and the runtime is presented in milliseconds. The chromatic number of each graph is the number of registers allocated, and each register (and multiplexer) is 32 bits wide. The area results are shown for registers and multiplexers only, because the cost of the other resources is fixed and does not depend on the allocation.
On average, the enhanced color assignment heuristic of Section 5.2 reduced the number of logic cells by 1117 (24.8%) compared to chordal coloring [8]. On average, simulated annealing yielded an additional reduction of 47 cells (25.9% in total). The average runtimes were 8.87 ms for chordal coloring, 39.5 ms for optimized chordal coloring, and 2,662 ms for simulated annealing. We believe that this reflects favorably on the first heuristic due to its relative competitiveness when compared to simulated annealing.

Table 1. Simulated Annealing Parameters
Annealing parameters: N, T, SIZE_FACTOR, CUTOFF, TEMPFACTOR, MINPERCENT, FREEZE_LIM
Values: .9 .5 5

Table 2. Area Results and Runtime of the 3 Heuristics
Columns: Benchmark; Chromatic Number; Area (# Logic Cells): Chordal, Optimized, Annealing; Runtime (milliseconds): Chordal, Optimized, Annealing
Benchmarks: try_combine, simplify_rtx, yyparse, fold_rtx, expand_expr, fold, recog_5, jump_optimize
Values: 5,366 4,32 4, ,83 2,83 2,368 2, ,7 2,887 2,752 2, ,373 5,48 3,776 3, ,964 4,66 3,358 3, ,88 5,76 4,35 4, ,36,48,32, ,467 8,84 4,999 4, ,843
Average: - 4,494 3,378 3, ,662

Figure 4. Distribution of multiplexers allocated by the 3 color assignment heuristics (Chordal, Optimized, Annealing) for the benchmark fold.

Fig. 4 shows the distribution of multiplexers allocated for the benchmark fold by the 3 coloring heuristics. It is easy to see that chordal coloring performs poorly, allocating two -input and one 15-input multiplexer; in contrast, the largest multiplexer allocated by both optimized chordal coloring and simulated annealing has 9 inputs. Simulated annealing also allocates the largest number of 5-input multiplexers, the smallest number of inputs among all allocated multiplexers.

It is important to note that the distribution of multiplexers, as shown in Fig. 4, does not completely characterize the solution to the problem. The area metric reported in Table 2, for example, would be considerably different if a standard cell design were considered instead of an FPGA. It is well known that a multiplexer is not easily synthesized from look-up tables, so its cost, relative to other circuit elements, is significantly higher in an FPGA than in a standard cell design. Thus, the impetus for optimizing multiplexers is considerably greater for FPGAs.

A secondary issue is whether complete or incomplete multiplexers are used. A complete multiplexer has a number of inputs that is a power of 2, so, for example, a 7-input multiplexer would be implemented via an 8-input multiplexer, with one input never used. If the only multiplexers available are complete, all of the 5-, 6-, and 7-input multiplexers in Fig.
4 would have the same cost as an 8-input multiplexer, and all 9- to 15-input multiplexers would have the same cost as a 16-input multiplexer; and of course, the impact of this decision is considerably greater for an FPGA than for a standard cell design. In practice, one would want to generate a large library of incomplete multiplexers using the technique of Mitra et al. [15], or a comparable approach. This must be done prior to running the simulated annealing heuristic to ensure a correct estimate of the area of each multiplexer; it is not necessary to generate the library in advance if the optimized chordal coloring heuristic is used, because its objective function is not based on the area of a specific multiplexer.
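The effect of restricting the library to complete multiplexers can be made concrete with a short sketch. The function name `complete_size` is a hypothetical helper introduced here for illustration; it simply rounds the required number of inputs up to the next power of 2.

```python
def complete_size(m):
    """Number of inputs of the smallest complete multiplexer that can
    serve m inputs (the smallest power of 2 that is >= m)."""
    if m < 2:
        return m  # zero inputs: no connection; one input: a plain wire
    return 1 << (m - 1).bit_length()


# Sizes in the range discussed above, mapped to their complete equivalents.
rounded = {m: complete_size(m) for m in range(5, 17)}
unused_inputs = {m: complete_size(m) - m for m in (5, 7, 9, 15)}
```

Under this rule a 9-input multiplexer wastes 7 of its 16 inputs, which illustrates why an area model based on incomplete multiplexers can differ noticeably from one based on complete multiplexers.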

For datapath circuits, it is quite likely that the size of the largest multiplexer could affect the clock frequency. Ignoring the generation of control signals, the register-to-register path with maximum combinational delay will constrain the clock frequency. It is likely, although not guaranteed, that the largest multiplexer will lie along this path. Consequently, there could be some benefit gained from attempting to constrain the largest multiplexer allocated to the design, rather than focusing solely on area optimization. Doing so accurately, however, would require a detailed model of the layout of the datapath portion of the circuit, which is not likely to be available during high-level synthesis. A reasonable model may be available if register allocation and interconnect optimization are performed as the final steps during high-level synthesis. At the very least, interconnect allocation must occur before logic synthesis can optimize the datapath and certainly before the final layout; and for FPGA-based designs, the delays depend on placement and routing as well. Thus, we have focused on area rather than delay as the metric to study in this paper; however, we may attempt to optimize the latter as future work.

8. CONCLUSION AND FUTURE WORK

The problem of interconnect optimization in high-level synthesis of SSA Form applications has been introduced. SSA Form is an ideal representation for synthesis because the interference graph for each procedure is a chordal graph, which can be colored optimally in O(|V| + |E|) time. To solve this problem, we have introduced two heuristics: the first is an extension of Gavril's optimal algorithm for chordal graph coloring, and the second is based on simulated annealing. Although simulated annealing performs 1.1% better than the other on average, the improvement requires an average increase in runtime of three orders of magnitude. The interconnect optimization problem studied in this paper specifically focuses on minimizing the connections between registers that arise from ϕ-functions.
In the future, we intend to extend the ideas presented in this paper so that operation binding and port assignment are performed concurrently. This makes sense because, as Brisk et al. [4] showed, it is possible to color an interference graph for an SSA Form procedure by making a reverse post-order pass over its dominator tree, and processing the operations in each node in forward order. The extension to chordal graph coloring suggested in Section 5.2 can be implemented in this fashion as well, because it augments traditional chordal graph coloring with an improved method to select the color to assign to each vertex. We believe that the color assignment heuristic can be modified to make operation and port binding decisions as each operation is processed during the traversal of the application. This would be similar in principle to the approach advocated by Chen and Cong [5], while accounting for ϕ-functions.

REFERENCES

[1] Avakian, A., and Ouaiss, I. Optimizing register binding in FPGAs using simulated annealing. In Proc. of the Int. Conf. Reconfigurable Computing and FPGAs (ReConFig 05) (Puebla City, Mexico, September 28-30, 2005) 6.
[2] Bouchez, F., Darte, A., Guillon, C., and Rastello, F. Register Allocation and Spill Complexity Under SSA. Technical Report RR2005-33, ENS-Lyon, Lyon, France, 2005.
[3] Briggs, P., Cooper, K. D., Harvey, T. J., and Simpson, L. T. Practical improvements to the construction and destruction of static single assignment form. Software Practice and Experience, 28, 8 (July, 1998).
[4] Brisk, P., Dabiri, F., Jafari, R., and Sarrafzadeh, M. Optimal register sharing for high-level synthesis of SSA form programs. IEEE Trans. Computer-Aided Design, 25, 5 (May, 2006).
[5] Chen, D., and Cong, J. Register binding and port assignment for multiplexer optimization. In Proc. of the Asia South Pacific Design Automation Conf. (ASP-DAC 04) (Yokohama, Japan, 2004).
[6] Choi, J-D., Cytron, R., and Ferrante, J. Automatic construction of sparse data flow evaluation graphs. In Proc. of the ACM/SIGPLAN Conf. Principles of Progr. Languages (POPL 91) (Orlando, FL, USA, 1991).
[7] Cytron, R., Ferrante, J., Rosen, B. K., Wegman, M. N., and Zadeck, F. K. Efficiently computing static single assignment form and the control dependence graph. ACM Trans. Prog. Lang. and Systems, 13, 4 (October, 1991).
[8] Gavril, F. Algorithms for minimum coloring, maximum clique, minimum covering by cliques, and maximum independent set of a chordal graph. SIAM J. Computing, 1, 2 (June, 1972).
[9] Hack, S., and Goos, G. Optimal register allocation for SSA-form programs in polynomial time. Information Processing Letters, 98, 4 (May, 2006).
[10] Huang, C-Y., Chen, Y-S., Lin, Y-L., and Hsu, Y-C. Data path allocation based on bipartite weighted matching. In Proc. of the Design Automation Conf. (DAC 90) (Orlando, FL, USA, 1990).
[11] Johnson, D. S., Aragon, C. R., McGeoch, L. A., and Schevon, C. Optimization by simulated annealing, part II: graph coloring and number partitioning. Operations Research, 39, 3 (May-June, 1991).
[12] Kaplan, A., Brisk, P., and Kastner, R. Data communication estimation and reduction for reconfigurable systems. In Proc. of the Design Automation Conf. (DAC 03) (Anaheim, CA, USA, 2003).
[13] Kastner, R., et al. Layout driven data communication optimization for high level synthesis. Proceedings of the Conference on design automation and test in Europe (DATE 06) (Munich, Germany, 2006).
[14] Kim, T., and Liu, C. L. An integrated data path synthesis algorithm based on network flow method. In Proc. of the Custom Integrated Circuits Conf. (CICC 95) (Santa Clara, CA, USA, 1995).
[15] Mitra, S., Avra, L. J., and McCluskey, E. J. Efficient multiplexer synthesis techniques. IEEE Design & Test of Computers, (October-December, 1999), 2-9.
[16] Pangrle, B. On the complexity of connectivity binding. IEEE Trans. Computer-Aided Design, 10, 11 (November, 1991).
[17] Rim, M., Jain, R., and De Leone, R. Optimal allocation and binding in high-level synthesis. In Proc. of the Design Automation Conf. (DAC 92) (Anaheim, CA, USA, 1992).
[18] Smith, M. D., and Holloway, G. An Introduction to Machine SUIF and its Portable Libraries for Analysis and Optimization. Technical Report, Harvard University, Cambridge, MA, USA, 2002.
[19] Zhu, H. W., and Jong, C. C. Interconnection optimization in data path allocation using minimal cost maximal flow algorithm. Microelectronics Journal, 33, 9 (September, 2002).


Optimizing Document Scoring for Query Retrieval

Optimizing Document Scoring for Query Retrieval Optmzng Document Scorng for Query Retreval Brent Ellwen baellwe@cs.stanford.edu Abstract The goal of ths project was to automate the process of tunng a document query engne. Specfcally, I used machne learnng

More information

Private Information Retrieval (PIR)

Private Information Retrieval (PIR) 2 Levente Buttyán Problem formulaton Alce wants to obtan nformaton from a database, but she does not want the database to learn whch nformaton she wanted e.g., Alce s an nvestor queryng a stock-market

More information

Loop Transformations, Dependences, and Parallelization

Loop Transformations, Dependences, and Parallelization Loop Transformatons, Dependences, and Parallelzaton Announcements Mdterm s Frday from 3-4:15 n ths room Today Semester long project Data dependence recap Parallelsm and storage tradeoff Scalar expanson

More information

Load Balancing for Hex-Cell Interconnection Network

Load Balancing for Hex-Cell Interconnection Network Int. J. Communcatons, Network and System Scences,,, - Publshed Onlne Aprl n ScRes. http://www.scrp.org/journal/jcns http://dx.do.org/./jcns.. Load Balancng for Hex-Cell Interconnecton Network Saher Manaseer,

More information

Chapter 1. Introduction

Chapter 1. Introduction Chapter 1 Introducton 1.1 Parallel Processng There s a contnual demand for greater computatonal speed from a computer system than s currently possble (.e. sequental systems). Areas need great computatonal

More information

Related-Mode Attacks on CTR Encryption Mode

Related-Mode Attacks on CTR Encryption Mode Internatonal Journal of Network Securty, Vol.4, No.3, PP.282 287, May 2007 282 Related-Mode Attacks on CTR Encrypton Mode Dayn Wang, Dongda Ln, and Wenlng Wu (Correspondng author: Dayn Wang) Key Laboratory

More information

Ecient Computation of the Most Probable Motion from Fuzzy. Moshe Ben-Ezra Shmuel Peleg Michael Werman. The Hebrew University of Jerusalem

Ecient Computation of the Most Probable Motion from Fuzzy. Moshe Ben-Ezra Shmuel Peleg Michael Werman. The Hebrew University of Jerusalem Ecent Computaton of the Most Probable Moton from Fuzzy Correspondences Moshe Ben-Ezra Shmuel Peleg Mchael Werman Insttute of Computer Scence The Hebrew Unversty of Jerusalem 91904 Jerusalem, Israel Emal:

More information

Power-Aware Mapping for Network-on-Chip Architectures under Bandwidth and Latency Constraints

Power-Aware Mapping for Network-on-Chip Architectures under Bandwidth and Latency Constraints Power-Aware Mappng for Network-on-Chp Archtectures under Bandwdth and Latency Constrants Xaohang Wang 1,2, Me Yang 2, Yngtao Jang 2, and Peng Lu 1 1 Department of Informaton Scence and Electronc Engneerng,

More information