A Power Optimization Toolbox for Logic Synthesis and Mapping

Size: px
Start display at page:

Download "A Power Optimization Toolbox for Logic Synthesis and Mapping"

Transcription

1 A Power Optmzaton Toolbox for Logc Synthess and Mappng Alan Mshchenko Robert Brayton Stephen Jang Kevn Chung Department of EECS, Unversty of Calforna, Berkeley LogcMll Technology Qualcom Innovatons {alanm, Abstract The paper descrbes several complementary algorthms for power-aware logc optmzaton: o SmSwtch s an effcent sequental smulator for estmatng swtchng actvty of sgnals n large sequental desgns. o PowerMap uses swtchng actvty to make better decsons durng power-aware technology mappng. o PowerDC s a resynthess algorthm that elmnates wres wth hgh swtchng actvty. The proposed smulator draws on new deas n logc representaton and s geared for speed, e.g. t can smulate a 1Mnode sequental desgn usng 1000 bt patterns for 100 cycles n about 10 seconds on a typcal one-core CPU. Ths s more than 85X faster than prevously publshed actvty estmators wth smlar accuracy. Experments show that, although each technque contrbutes to the fnal qualty, t s ther combnaton that gves the best results. When appled to large ndustral desgns n a hghly-optmzed ndustral flow, prevous work on sequental synthess and wre-aware technology mappng led to a 27.6% reducton n swtchng actvty, whle the technques of ths paper reduce t addtonally by 19.6% wthout a substantal ncrease n runtme or degradaton of other metrcs. 1 Introducton Wth the ncrease of the desgn sze and the reducton of feature sze, power mnmzaton becomes an mportant ssue. Accordng to current estmates, about 67% of power dsspaton n FPGA devces s due to dynamc power [3]. Ths hgh power consumpton s an ncreasng concern for FPGA vendors and ther customers. Reducng power s key to lowerng packagng and coolng costs, and mprovng devce relablty [13]. Total power n an FPGA (or any semconductor devce) conssts of statc power and dynamc power. Statc power results prmarly from transstor leakage current, whch s the small current that travels ether from source-to-dran or through the gate oxde when the transstor s logcally off [33]. Dynamc power dsspaton s caused by sgnal transtons n a crcut and comes from the clock and logc networks. These transtons can be part of normal operaton or can be due to gltchng. The dynamc power s characterzed by the equaton: 1 2 P = f V C dynamc s 2 sgnals where f s the clock frequency, V the supply voltage, C the capactance swtched by sgnal, and s s the probablty of sgnal makng a transton. An overvew of leakage reducton technques used n ASICs and mcroprocessor s descrbed n [30]. However, the focus of ths paper s on reducng the dynamc power consumpton due to sgnal swtchng n the logc network. Most power-aware technology mappng [3] and power-aware placement and routng [15] use swtchng actvty nformaton of the netlst to estmate dynamc power dsspaton. Technques for estmatng swtchng actvty can be smulatonbased or probablty-based (or vectorless). Ths work uses smulaton, whch s more accurate but slower [16]. Sequental smulaton s appled to a Boolean network consstng of logc gates and flp-flops whle keepng track of the transtons. Gatelevel smulaton s a well-studed problem and much effort has been placed on mprovng ts speed [17][30]. Despte many nnovatons, gate-level smulaton remans slow for large desgns. Logc synthess and technology mappng transform logc mplementaton by choosng among dfferent crcut structures. Gven the goal of reducng dynamc power dsspaton, t s natural to seek logc structures resultng n a reducton of the total swtchng actvty of ther consttuent nodes. Thus, t s mportant to develop a fast and accurate swtchng actvty estmator that can drve power-aware technology mappng and resynthess. The present paper contrbutes to these goals n several ways: It descrbes an mplementaton of an effcent sequental smulator, SmSwtch, for estmatng swtchng actvty of all sgnals n a sequental desgn. Ths estmaton s used to gude power-aware logc synthess by evaluatng dfferent restructurng and mappng choces. It descrbes two synergstc algorthms, PowerMap and PowerDC, performng power-aware technology mappng and restructurng of the crcut whle tryng to mnmze the total swtchng actvty. It offers an expermental evaluaton of these methods and demonstrates that a substantal reducton n the dynamc power s possble wthout ncreasng runtme and compromsng other parameters such as area and delay. The contrbutons of ths paper can be extended to work for standard-cell desgns. In partcular, the same sequental smulator can be used to estmate swtchng actvty of all sgnals. A cutbased standard-cell mapper [5] can be modfed to take swtchng actvty nto account, leadng to mappngs wth reduced power consumpton. Fnally, a don t-care-based optmzaton envronment for standard-cells can be developed based on the same methodology as [27]. Ths optmzaton wll rewre a standard-cell desgn whle replacng nets wth hgh swtchng actvty by nets wth low swtchng actvty. Although development of such an envronment s beyond the scope of ths work, our experence wth other optmzatons applcable to both standard-cell and LUT-based desgns suggests that powerreductons for standard-cells are lkely to be comparable to those for the LUT-based desgns presented n ths paper. The rest of the paper s organzed as follows. Secton 2 provdes necessary background on logc synthess and technology mappng. Secton 3 descrbes the sequental smulator for power estmaton. Secton 4 presents the power-aware logc optmzaton algorthms. Secton 5 shows expermental results. Fnally, Secton 6 concludes the paper and outlnes future work. 2 Background 2.1 Boolean Networks A Boolean network s a drected acyclc graph (DAG) wth nodes correspondng to logc gates and drected edges

2 correspondng to wres connectng the gates. The terms Boolean network and crcut are used nterchangeably n ths paper. If the network s sequental, the memory elements are assumed to be D-flp-flops wth ntal states. Terms memory elements, flopflops, and regsters are used nterchangeably n ths paper. A node n has zero or more fanns,.e. nodes that are drvng n, and zero or more fanouts,.e. nodes drven by n. The prmary nputs (PIs) are nodes wthout fanns n the current network. The prmary outputs (POs) are a subset of nodes of the network. If the network s sequental, t contans regsters whose nputs and output are treated as addtonal PIs/POs n combnatonal optmzaton and mappng. It s assumed that each node has a unque nteger called ts node ID. A cut C of node n, called root, s a set of nodes, called leaves, such that each path from a PI to n passes through at least one leaf. A cut s K-feasble f ts cardnalty does not exceed K. A cut s domnated f there s another cut of the same node, contaned, settheoretcally, n the gven cut. A fann (fanout) cone of node n s the subset of the nodes of the network reachable through the fann (fanout) edges from n. A maxmum fanout free cone (MFFC) of node n s a subset of the fann cone, such that every path from a node n the subset to the POs passes through n. Informally, the MFFC of a node contans all the logc used exclusvely to generate the output functon of the node. When a node s removed due to redundancy, the logc n ts MFFC can also be removed. 2.2 And-Inverter Graphs A combnatonal And-Inverter Graph (AIG) [22] s a drected acyclc graph (DAG), n whch a node has ether 0 or 2 ncomng edges. A node wth no ncomng edges s a prmary nput (PI). A node wth 2 ncomng edges s a two-nput AND gate. An edge s ether complemented or not. A complemented edge ndcates the nverson of the sgnal. Certan nodes are marked as prmary outputs (POs). The combnatonal logc of an arbtrary Boolean network can be factored [4] and transformed nto an AIG usng DeMorgan s rule. 2.3 Technology Mappng wth Structural Choces A typcal procedure for structural technology mappng conssts of the followng step: 1) cut enumeraton, 2) delay-optmal mappng, 3) area recovery usng heurstcs, 4) producng the resultng LUT network. A detaled descrpton of these steps s gven n [7]. The man drawback of the structural approaches to technology mappng s ther dependence on the crcut structure gven to the mapper (also known as the problem of structural bas ). If the structure s bad, nether heurstcs nor teratve recovery procedures wll lead to a good result of mappng. To overcome the bas of a sngle netlst structure, mappng wth structural choces (lossless synthess) s explored n [5] and [24]. The structural choces are constructed by recordng equvalent crcut structures at a subset of nodes (called choce nodes). These alternatve structures come from synthess transformatons, such as, AIG rewrtng/balancng or co-factorng. As the mapper traverses over these choce nodes n the subject graph, t can explore the multple alternatve mplementatons smultaneously. It s found that usng structural choces over multple passes of LUT mappng yelds sgnfcantly better results, compared to mappng wthout choces or runnng only a sngle teraton of mappng wth choces [24]. Mappng wth structural choces s used n the experments results secton. 3 Sequental Smulaton (SmSwtch) A sequental smulator apples sequences of values to the prmary nputs. It s assumed that the ntal state of the desgn s known and used to smulate the frst frame. In subsequent frames, the state derved at the prevous teraton s used. Input nformaton (such as the probablty of a transton occurrng) can be gven by the user or produced by another tool (for example, f an nput trace s known, t may be appled to smulate the desg. In ths work, sequental smulaton s used to estmate swtchng actvty of regsters and nternal nodes. Ths s used later to gude heurstc power-aware optmzaton. The desgn s sequentally smulated for a fxed number of tmeframes. The random nput patterns are generated to have a 0.5 probablty of the probablty of the node changng ts value (toggle rate) s fxed. Smulaton s performed from the ntal state for a fxed number of cycles (typcally, 64), of whch the frst few (typcally, 16) are skpped as not representatve of normal operaton; the remanng ones are used to accumulate the swtchng actvty. Prevous work [6][8][12][18] concluded that the man problem wth sequental smulaton for swtchng-actvty estmaton s the prohbtve runtme of the smulator. In ths work, we address ths problem wth SmSwtch, based on new logc representaton and manpulaton methods. The salent features of the new smulator are descrbed below. In general, the smaller the memory footprnt of the smulator, the faster t runs. Ths s due to a CPU havng typcally a local cache rangng n sze from 2Mb to 16Mb. If an applcaton requres more memory than fts nto the cache, repeated cache msses cause the runtme to degrade. Therefore, the challenge s to desgn the smulator that uses mnmalstc data-structures wthout compromsng the computaton speed. We found three orthogonal ways of reducng the memory requrements of the smulator, wthout mpactng ts performance. 4 Approaches to Power Mnmzaton In ths secton, two orthogonal ways of usng swtchng actvty are descrbed. The frst (PowerMap) uses swtchng actvty to make better decsons durng technology mappng whle the second (PowerDC) does power-aware re-synthess after mappng. 4.1 Technology Mappng (PowerMap) LUT-based mappers [7][19][23] have evolved from the frst technology mappng solutons for FPGAs [11][9], to those that have reduced runtme, mproved qualty of results, and allow users to select cost-functons employed durng decson makng, as n [21]. In partcular, a recent technology mapper [23] was successfully modfed to accommodate heurstcs for reducng wre-length and mprovng desgn routablty (WreMap) [12][14]. We propose a smlar modfcaton of the prorty-cut based mapper [23] geared towards reducng swtchng actvty, resultng n an algorthm called PowerMap. Snce the basc mappng algorthm s descrbed n detal n [23] and [14], here we only descrbe the changes n the new cost functon whch use swtchng actvty to prortze the cuts durng technology mappng. The ntuton behnd power-aware technology mappng s to try to cover the nodes wth hgh swtchng actvty (or hot nodes) so they are hdden nsde a LUT. Snce the capactance nsde a LUT s very small, power consumpton wll be reduced. The followng three metrcs are compared below: area flow [10][19][23], edge flow (WreMap) [14], swtchng flow (PowerMap) (ths paper).

3 Area flow (AF) s defned as: AF( Leaf ( ) AF( = Area(, (1) + NumFanout( Leaf ( ) where Area( s the area of the LUT used to map node n, Leaf ( s the -th leaf of the representatve cut of node n, and NumFanouts(Leaf () s the number of fanouts of node Leaf ( n the currently selected mappng. Edge flow (EF) s defned as: EF( Leaf ( ) EF( = Edge(, (2) + NumFanout( Leaf ( ) where Edge( s the total number of fann edges (or wres) of the LUT used to map the current representatve cut of node n, whle other notatons are the same as n (1). Swtchng flow s defned as: SwtchFlow( = Swtch( + SwtchFlow( Leaf ( ), (3) NumFanout( Leaf ( ) where Swtch( s the total swtchng actvty at the output of node n, computed usng sequental smulaton (from Secton 3). Intutvely, each of these flows s an estmate of a resource (area, wrng, swtchng) assocated wth the logc rooted at node n. The man dea s to compute for each cut three dfferent metrcs (flows) capturng the global vew of the netlst durng mappng: area flow, edge flow, and swtchng flow. When computng these metrcs, each cut s characterzed by 1) ts own resources used and 2) the resources used by ts fanns. The value of a resource of a fann s dvded by the number of the fann s fanouts. Consder node n1 n Fgure 1. The drect resources of n1 are: 1) the node tself; 2) the three nput edges; 3) the total resources for the three edges. For nodes n2 and n4, half of the resources s used by node n1 whle the other half s used by other fanouts. For node n3, all resources are used by node n1. Resources owned by the nodes n1 n5 Resources owned by the nodes Formulas (1)-(3) provde dfferent applcatons of the noton of a flow when appled to technology mappng. The cost functons measure ncreasngly complex netlst metrcs, startng from area (LUT count) to the number of fanns (routng wres) to the total swtchng actvty of fanns (power estmate), whle the noton of a flow captures the fact that each ncomng flow s shared by the fanouts of the fanns, whch gves a global vew of each cost functon durng technology mappng. 4.2 SAT-Based Resynthess (PowerDC) An effcent approach to resyntheszng logc networks after technology mappng s gven n [27]. Its features are lsted below: substantal optmzaton power, due to the use of nternal don t-cares; scalable local computaton, due to the use of wndowng; computaton speed, due to the use of Boolean satsfablty for functonal manpulaton, and ablty to accommodate varous optmzaton objectves. Resubsttuton s a logc restructurng technque that modfes the local space of a mapped node by addng or removng ts fanns. When appled to a Boolean network, resubsttuton can be teratvely appled, amng at mnmzng a gven cost functon, such as area, delay, wre-length, etc. The man dea of the power-aware flavor of the resynthess algorthm, called PowerDC, s summarzed below. For detals on the basc algorthm and ts mplementaton, refer to [20][27]. Gven a mapped network, t s frst transformed nto an AIG by applyng Boolean decomposton of the logcal functons of the LUTs nto two-nput gates. Then sequental smulaton s appled to the AIG, resultng n the transton probabltes for all the AIG nodes. Ths nformaton s back-annotated to the mapped network to provde the swtchng actvty for all nodes n the mappng. Next, a subset of wres, called hot wres, s dentfed. Based on the experments, f the toggle rate for prmary nputs s 0.5, hot wres are those wth swtchng probablty greater than 0.4. Note that swtchng computed as a transton probablty may be hgher than 0.5 (but cannot exceed 1.0). Another way of defnng hot wres mght be a fxed rato (say, 10%) of the wres wth the hghest swtchng actvty. n2 n3 n4 Power dstrbuton Fgure 1: Metrcs used n PowerMap. The use of the cost functon (3) durng power-aware technology mappng s smlar to that of ts use n (2) n wreaware mappng [14]. The man dfferences are as follows. When comparng two cuts, ther area flows are compared frst, whle ther swtchng flows are used as the frst te-breaker,.e. f the area flows of two cuts are equal up to a certan delta (say, 0.001), the decson of what cut to use s made based on the swtchng flow. Smlarly, edge flow s used as the second te-breaker. Smlarly, n the second phase of technology mappng when local areas of cuts are compared [14], local swtchng s used as the frst te-breaker and edge flow s used as the second tebreaker. The local area of a cut s the sum of the areas of the LUTs n the MFFC of the cut. Smlarly, the local swtchng of a cut s the sum of swtchng actvtes of the LUTs n the cut s MFFC. Percentage % 90.00% 80.00% 70.00% 60.00% 50.00% 40.00% 30.00% 20.00% 10.00% 0.00% T5 T4 T3 T2 T1 Nodes 5.25% 1.09% 0.64% 0.84% 92.16% Wre 13.90% 1.62% 0.91% 1.29% 82.27% Pwr 85.90% 7.36% 2.68% 2.32% 1.74% Swtchng frequency (T5 s hot and T1 s cool) Nodes Wre Pwr Fgure 2: Dstrbuton of wres and nodes by toggle rate. Based on an experment descrbed n Secton 5, about 13.9% of wres are hot and these contrbute about 86% of the total power % of wres are cool but these only contrbute 1.74% of the total power. A detaled dstrbuton of wres (and nodes) by swtchng actvty and ther power contrbuton are shown n

4 Fgure 2. Snce the vast majorty of dynamc power s dsspated n a small percentage of hot wres, power reducton technques targetng hot wres are very effectve. After hot nodes are found, SAT-based resubsttuton [27] s appled targetng hot nodes as fanns of other mapped nodes. The goal s to remove or replace them by cooler fanns. To save runtme, swtchng actvty s not updated after ndvdual steps of resubsttuton. Ths s reasonable because, although Boolean functons of the nodes after resynthess wth observablty don t-cares may be changed, ther swtchng actvty does not change substantally. Even when t changes, the percentage of nodes whose hot/cool status s modfed durng resynthess, s typcally neglgble. 5 Expermental Results The algorthms of ths paper were mplemented n ABC [1]. Expermental evaluaton targetng 6-LUT mplementatons were performed usng a sute of 20 large ndustral desgns rangng n sze from 12K to 165K 6-LUTs. The experments were run on an Intel Xeon 3GHz 2-CPU 4-core computer wth 8GB of RAM. The resultng networks were verfed usng combnatonal equvalence checker n ABC (command cec). The power consumpton of a crcut s the sum of the power consumed by each wre and the power consumed by each wre s the product of the capactance and swtchng actvty for the wre. Because the netlsts were not placed n ths experment, the capactance of each wre was not known. We assumed a untcapactance model for each wre n our dynamc power calculatons. Note that ths power model assumes that each fanout connecton has the same capactance for a multple fanout net. The sequental smulator for estmatng the swtchng actvty was based on a new AIG package, called ga. The followng commands n ABC were extended to be power-aware. In each case, new swtch -p enables swtchng actvty mnmzaton: Technology mappng for FPGAs [23] (command f p). SAT-based resubsttuton [27] (command mfs p). Swtchng actvty report (command ps p). Several other commands, ncludng sequental synthess and mappng wth structural choces, were used n ths experment: Structural sequental cleanup [26] (command scl). Parttoned regster-correspondence computaton usng smple nducton [26] (command lcorr). Parttoned sgnal-correspondence computaton usng k-step nducton [26] (command scorr). Computaton of structural choces [3] (command dch). 5.1 Smulaton Runtme As mentoned, one of the man ssues wth sequental smulaton s potentally long runtme. In ths secton, we performed two experments. In the frst experment we showed that runnng SmSwtch offers affordable runtme for large desgns. In the second experment, we showed that the runtme of SmSwtch s very fast compared wth ACE-2.0 [16], a fast swtchng estmaton tool that s avalable to the publc. In the frst experment, four ndustral desgns rangng from 304K to 1.3M AIG nodes were smulated wth dfferent numbers of smulaton patterns, rangng from 2,560 to 20,480. The nput toggle rate was assumed to be 0.5. The results are shown n Table 1. Columns AIG and FF show the number of AIG nodes and regsters. The runtmes for dfferent szes of nputs patterns are shown n the last columns. Note that the runtmes are qute affordable even for Desgn C4 wth 1.3M AIG nodes (about 160 K 6-LUT after mappng). In most cases, 2,560 patterns were suffcent for node swtchng actvty rates to converge. Table 1: Runtme of SmSwtch. Desgn AIG FF Runtme for nputs patterns (seconds) C1 304K C2 362K C3 842K C4 1306K In the second experment, we compared the runtme of SmSwtch vs. ACE-2.0 on 14 ndustry desgns and 12 large academc benchmarks. The nput toggle rate s assumed to be 0.5 for both tools. The number of nput patterns s assumed to be 5,000 for both runs. All crcuts are decomposed to AIG netlsts before performng swtchng estmaton. The results, not shown n ths paper, lead to the followng observatons. SmSwtch can fnsh all the desgns n very affordable tme but ACE-2.0 gets 4 tmeouts n ndustry desgns. For ndustry desgns, the runtme of SmSwtch s 146+ tmes faster than ACE-2.0. For large academc benchmarks, the runtme of SmSwtch s 85+ tmes faster than ACE Power-Aware Optmzatons To evaluate the contrbuton of power-aware mappng and resynthess, three experments were performed, denoted Baselne, FullOpt, and PowerMap. Baselne corresponds to two runs of (dch; f e). It performs prorty-cut based technology mappng wth structure choces. WreMap [14] s not used for Baselne because WreMap s known to reduce power dsspaton as a by-product of wre count mnmzaton. FullOpt s the complete flow ncludng hgh-effort sequental and combnatonal logc synthess. Sequental synthess (scl; lcorr; scorr) [26] s run frst to remove sequentally equvalent nodes and regsters. Then, two teratons of combnatonal synthess and mappng wth structural choces are performed (dch; f). WreMap s used n ths flow. It reduces 10% of the wres on top of Baselne [14]. In the PowerMap run, FullOpt s used as the nput netlst followed by two runs (dch; f p), whch s the power-aware technology mappng developed n ths paper. In the PowerDC run, the results of PowerMap are used as nput followed by two teratons of power-aware resynthess (mfs p). The sequental smulator SmSwtch s used to drve the poweraware logc synthess. In sequental smulaton, 2,560 random nput patterns wth a toggle rate of 0.5 were used. The results are reported n Table 4. (Detaled results for some desgns n the table are omtted due to page lmtaton, but the averages are computed over all desgns.) Columns PI, PO, FF, and LUT show the number of prmary nputs, prmary outputs, flp-flops, and 6- nput LUTs. Columns Lv and Pwr show the number of logc levels and the total swtchng actvty of the network. The expermental results lead to the followng observatons: FullOpt mproved on Baselne n the number of regsters, LUTs, and logc levels by 14.8%, 12.2%, and 3%, respectvely. Power s reduced by 27%.

5 FullOpt, PowerMap, and PowerDC have the same number of regsters because only combnatonal synthess s used n PowerMap and PowerDC. PowerMap produces an addtonal 10.5% power reducton on top of FullOpt. PowerDC produces an addtonal 10.1% power reducton on top of PowerMap. The overall power reducton for PowerDC vs. FullOpt s 19.6%. The overall power reducton for PowerDC vs. Baselne s 41.9%. Note that when runnng PowerMap and PowerDC, the LUT count and logc delay s reduced slghtly. The addtonal optmzatons reduce power wthout any effect on desgn speed. To nvestgate robustness of the algorthm, we performed the same experments and changed the toggle rate of the nputs from 0.5 to The results are summarzed n Table 2. PowerMap produces an addtonal 9.8% power reducton on top of FullOpt PowerDC produces an addtonal 8.7% power reducton on top of PowerMap. The power reducton for PowerDC vs. FullOpt s 17.7%. The power reducton for PowerDC vs. Baselne s 39.7% Table 2: Comparson of power-optmzaton algorthms (nputs toggle rate s 0.25). Power BaseLne FullOpt PwrMap PwrDC Geomean Rato Rato Rato Wre Dstrbuton Analyss For the same sute of 20 desgns, the changes n the dstrbuton of wres by ther swtchng actvty as a result of resynthess s analyzed. The results are reported Fgure 3. Column T5 s the total number of wres whose swtchng probablty exceeds 0.4. These hot wres are targeted by power reducton. Column T4 s the total number of wres whose swtchng probablty s between 0.3 and 0.4. Columns T3, T2, and T1 are defned smlarly. Thus the T1 wres are the cool wres. In general, they don t contrbute much to the power because ther swtchng probablty s less than 0.1. The results n Fgure 3 lead to the followng observatons: For PowerMap vs. FullOpt, hot (T5) wres are reduced by 13.22%, and T4 are ncreased by 4.96%. Ths ndcates that PowerMap reduces the number of hot wres. For PowerDC vs. PowerMap, total wres are reduced by 2.7% but power s reduced by 10.1%. Thus, the resynthess has removed the rght wres,.e., the hot wres. Hot wres are reduced by 11.04%. The cooler wres are reduced also but the reducton s smaller n ths case. In addton to wre count reducton for hot wres, the wre dstrbuton n terms of toggle rates s also mproved. Fgure 4 shows the power dstrbuton before and after optmzaton. Wre and Wre2 are the percentages of wres, whle Pwr and Pwr2 are the percentages of power contrbutons before and after optmzaton. The chart leads to the followng conclusons: After FullOpt, 13.9% (82.3%) of the wres are hot (cool). After PowerMap, 11.4% (84.6%) of wres are hot (cool). After FullOpt, 85.9% (1.7%) of the power s contrbuted by hot (cool) wres. After PowerMap, 82.6% (2.0%) of power s contrbuted by hot (cool) wres. The above observatons show that the power contrbuted by hot wres s less than Baselne whle the power contrbuted by cool wres s more than Baselne. The concluson s that the poweraware logc synthess was able to shft the peak of power consumpton to lower-swtchng wres. Reducton (negatve s go Percentage 10.00% 5.00% 0.00% -5.00% % % PowrMap vs FullOpt Wre Comparson T5 T4 T3 T2 T1 Total Wrs PowerDc vs. PowrMap Fgure 3: The changes n wre ratos after two power-aware transforms % 80.00% 60.00% 40.00% 20.00% 0.00% Power dstrbuton before/after optmzaton T5 T4 T3 T2 T1 Wre 13.90% 1.62% 0.91% 1.29% 82.27% Wre % 1.79% 0.93% 1.29% 84.55% Pwr 85.90% 7.36% 2.68% 2.32% 1.74% Pwr % 9.48% 3.19% 2.69% 2.02% Swtchng frequency (temperature) Wre Wre2 Pwr Pwr2 Fgure 4: Power dstrbuton before/after optmzaton. 6 Conclusons Ths paper descrbes a toolbox for power-aware logc synthess and mappng. Estmaton of power consumpton at the early stages of the desgn flow s based on calculatng the probabltes of sgnal transton durng sequental smulaton. The toolbox also ncludes two technques for optmzng the desgn durng synthess and mappng to reduce swtchng actvty and thereby mnmze dynamc power consumpton: PowerMap: a power-aware LUT mapper that extends [23] to prortze cuts based on swtchng actvty of the nodes. PowerDC: a SAT-based engne for resubsttuton wth don t-cares that extends [27] to remove sgnals wth hgh swtchng actvty (so called hot sgnals ). Estmaton of power dsspaton s effcently performed by a new sequental smulator, SmSwtch. The estmaton converges for most ndustral desgns after a reasonable number of smulaton cycles. Ths allows for makng targeted heurstc decsons durng logc synthess and technology mappng Future work wll nclude: Speedng up swtchng actvty estmaton (t s estmated that the current mplementaton can be made 50% faster). Extendng swtchng actvty computaton to nclude gltch estmaton.

6 Implementng other power-aware commands, such as computng better choces to reduce power and logc restructurng to reduce power. References [1] Berkeley Logc Synthess and Verfcaton Group, ABC: A system for sequental synthess and verfcaton, Release 904xx. [2] Altera, Stratx II vs. Vrtex-4 power comparson and estmaton accuracy (whte paper). wp_s2v4_pwr_acc.pdf [3] J. Anderson and F. N. Najm, Power-aware Technology Mappng for LUT-Based FPGAs, IEEE Int. Conf. on Feld Programmable Technology, Dec. 2002, pp [4] R. Brayton and C. McMullen, The decomposton of logc functons, Proc. ICCAD 97, pp [5] S. Chatterjee, A. Mshchenko, R. Brayton, X. Wang, and T. Kam, Reducng structural bas n technology mappng, Proc. ICCAD '05, pp [6] D. Chen, J. Cong, and P. Pan, FPGA desgn automaton: A survey, Foundatons and Trends n Electronc Desgn Automaton, Vol. 1(3), November 2006, pp [7] D. Chen and J. Cong. DAOmap: A depth-optmal area optmzaton mappng algorthm for FPGA desgns, Proc. ICCAD 04, pp [8] L. Cheng, D. Chen, and M.D. Wong, GltchMap: FPGA technology mapper for low power consderng gltches. Proc. DAC '07. [9] J. Cong and Y. Dng, FlowMap: An optmal technology mappng algorthm for delay optmzaton n lookup-table based FPGA desgns, IEEE TCAD, Vol. 13(1), 1994, pp [10] J. Cong, C. Wu, and Y. Dng, Cut rankng and prunng: Enablng a general and effcent FPGA mappng soluton, Proc. FPGA 99, pp [11] R. J. Francs, J. Rose, and K. Chung, Chortle: A technology mappng program for lookup table-based feld programmable gate arrays, Proc. DAC 90, pp [12] A. Ghosh, S. Devadas, K. Keutzer, and J. Whte, Estmaton of average swtchng actvty n combnatonal and sequental crcuts Proc. DAC 92. [13] S. Gupta and J. Anderson, Optmzng FPGA power wth ISE desgn tools, Issue 60, 2007, pp , publcatons/xcellonlne/xcell_60/xc_pdf/p16-19_60-gupta.pdf [14] S. Jang, B. Chan, K. Chung, and A. Mshchenko, "WreMap: FGPA technology mappng for mproved routablty". Proc. FPGA '08. [15] J. Lamoureux and S.J.E. Wlton, On the Interacton between Power- Aware FPGA CAD Algorthms,, Proc. ICCAD, Nov [16] J. Lamoureux and S.J.E. Wlton, Actvty estmaton for Feld- Programmable Gate Arrays, Proc. Intl Conf. Feld-Prog. Logc and Applcatons (FPL), 2006, pp [17] J. N. Kozhaya and F. Najm, Accurate power estmaton for large sequental crcuts, Proc. ICCAD 97. [18] J. Lamoureux, Modelng and reducton of dynamc power n Feld- Programmable Gate Arrays, Ph.D. Thess, [19] V. Manohara-rajah, S. D. Brown, and Z. G. Vranesc, Heurstcs for area mnmzaton n LUT-based FPGA technology mappng, Proc. IWLS 04, pp [20] A. Mshchenko and R. Brayton, "SAT-based complete don't-care computaton for network optmzaton", Proc. DATE '05. [21] A. Mshchenko, S. Chatterjee, R. Brayton, and M. Ceselsk, "An ntegrated technology mappng envronment", Proc. IWLS '05, pp [22] A. Mshchenko, S. Chatterjee, and R. Brayton, "DAG-aware AIG rewrtng: A fresh look at combnatonal logc synthess", Proc. DAC '06, pp [23] A. Mshchenko, S. Cho, S. Chatterjee, and R. Brayton, "Combnatonal and sequental mappng wth prorty cuts", Proc. ICCAD '07, pp [24] A. Mshchenko, S. Chatterjee, and R. Brayton, "Improvements to technology mappng for LUT-based FPGAs", Proc. FPGA '06, pp [25] A. Mshchenko, R. K. Brayton, and S. Jang, "Global delay optmzaton usng structural choces", Proc. IWLS'08, pp [26] A. Mshchenko, M. L. Case, R. K. Brayton, and S. Jang, Scalable and scalably-verfable sequental synthess, Proc. ICCAD'08, pp [27] A. Mshchenko, R. Brayton, J.-H. R. Jang, S. Jang, "Scalable don'tcare-based logc optmzaton and resynthess", Proc. FPGA '09. [28] J. Najm, Transton densty: A new measure of actvty n dgtal crcuts, IEEE Trans. CAD, Vol. 12(2), 1993, pp [29] PowerPlay Power Analyss, Quartus II 9.0 Handbook, Vol. 3, [30] K. Roy, S. Mukhopadhyay, and H. Mahmood-Memand, Leakage current mechansms and leakage reducton technques n deepsubmcrometer CMOS crcuts, n Proceedngs of the IEEE, pages , 2002 [31] E. Todorovch, M. Glabert, G. Sutter, S. Lopez-Buedo, and E. Boemo, A tool for actvty estmaton n FPGAs, Proc. Intl. Conf. Feld-Prog. Logc (FPL), 2002, pp [32] Xlnx Power Estmator User Gude, products/desgn_resources/power_central/ug440.pdf [33] Xlnx Whte Paper: Power Consumpton n 65 nm FPGAs pdf Table 4: Comparson of power-optmzaton algorthms (nput toggle rate s 0.5). Desgn Statstcs Base FullOpt PowerMap PowerDC name PI PO FF LUT Lv Pwr FF LUT Lv Pwr FF LUT Lv Pwr FF LUT Lv Pwr D D D D D D Geom Rato Rato Rato

A Power Optimization Toolbox for Logic Synthesis and Mapping

A Power Optimization Toolbox for Logic Synthesis and Mapping A Power Optimization Toolbox for Logic Synthesis and Mapping Stephen Jang Kevin Chung Alan Mishchenko Robert Brayton Xilinx Inc. Department of EECS, University of California, Berkeley {sjang, kevinc}@xilinx.com

More information

A Binarization Algorithm specialized on Document Images and Photos

A Binarization Algorithm specialized on Document Images and Photos A Bnarzaton Algorthm specalzed on Document mages and Photos Ergna Kavalleratou Dept. of nformaton and Communcaton Systems Engneerng Unversty of the Aegean kavalleratou@aegean.gr Abstract n ths paper, a

More information

Determining the Optimal Bandwidth Based on Multi-criterion Fusion

Determining the Optimal Bandwidth Based on Multi-criterion Fusion Proceedngs of 01 4th Internatonal Conference on Machne Learnng and Computng IPCSIT vol. 5 (01) (01) IACSIT Press, Sngapore Determnng the Optmal Bandwdth Based on Mult-crteron Fuson Ha-L Lang 1+, Xan-Mn

More information

CPE 628 Chapter 2 Design for Testability. Dr. Rhonda Kay Gaede UAH. UAH Chapter Introduction

CPE 628 Chapter 2 Design for Testability. Dr. Rhonda Kay Gaede UAH. UAH Chapter Introduction Chapter 2 Desgn for Testablty Dr Rhonda Kay Gaede UAH 2 Introducton Dffcultes n and the states of sequental crcuts led to provdng drect access for storage elements, whereby selected storage elements are

More information

Compiler Design. Spring Register Allocation. Sample Exercises and Solutions. Prof. Pedro C. Diniz

Compiler Design. Spring Register Allocation. Sample Exercises and Solutions. Prof. Pedro C. Diniz Compler Desgn Sprng 2014 Regster Allocaton Sample Exercses and Solutons Prof. Pedro C. Dnz USC / Informaton Scences Insttute 4676 Admralty Way, Sute 1001 Marna del Rey, Calforna 90292 pedro@s.edu Regster

More information

The Greedy Method. Outline and Reading. Change Money Problem. Greedy Algorithms. Applications of the Greedy Strategy. The Greedy Method Technique

The Greedy Method. Outline and Reading. Change Money Problem. Greedy Algorithms. Applications of the Greedy Strategy. The Greedy Method Technique //00 :0 AM Outlne and Readng The Greedy Method The Greedy Method Technque (secton.) Fractonal Knapsack Problem (secton..) Task Schedulng (secton..) Mnmum Spannng Trees (secton.) Change Money Problem Greedy

More information

High-Level Power Modeling of CPLDs and FPGAs

High-Level Power Modeling of CPLDs and FPGAs Hgh-Level Power Modelng of CPLs and FPGAs L Shang and Nraj K. Jha epartment of Electrcal Engneerng Prnceton Unversty {lshang, jha}@ee.prnceton.edu Abstract In ths paper, we present a hgh-level power modelng

More information

Parallel matrix-vector multiplication

Parallel matrix-vector multiplication Appendx A Parallel matrx-vector multplcaton The reduced transton matrx of the three-dmensonal cage model for gel electrophoress, descrbed n secton 3.2, becomes excessvely large for polymer lengths more

More information

Explicit Formulas and Efficient Algorithm for Moment Computation of Coupled RC Trees with Lumped and Distributed Elements

Explicit Formulas and Efficient Algorithm for Moment Computation of Coupled RC Trees with Lumped and Distributed Elements Explct Formulas and Effcent Algorthm for Moment Computaton of Coupled RC Trees wth Lumped and Dstrbuted Elements Qngan Yu and Ernest S.Kuh Electroncs Research Lab. Unv. of Calforna at Berkeley Berkeley

More information

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data A Fast Content-Based Multmeda Retreval Technque Usng Compressed Data Borko Furht and Pornvt Saksobhavvat NSF Multmeda Laboratory Florda Atlantc Unversty, Boca Raton, Florda 3343 ABSTRACT In ths paper,

More information

The Codesign Challenge

The Codesign Challenge ECE 4530 Codesgn Challenge Fall 2007 Hardware/Software Codesgn The Codesgn Challenge Objectves In the codesgn challenge, your task s to accelerate a gven software reference mplementaton as fast as possble.

More information

VRT012 User s guide V0.1. Address: Žirmūnų g. 27, Vilnius LT-09105, Phone: (370-5) , Fax: (370-5) ,

VRT012 User s guide V0.1. Address: Žirmūnų g. 27, Vilnius LT-09105, Phone: (370-5) , Fax: (370-5) , VRT012 User s gude V0.1 Thank you for purchasng our product. We hope ths user-frendly devce wll be helpful n realsng your deas and brngng comfort to your lfe. Please take few mnutes to read ths manual

More information

Simulation Based Analysis of FAST TCP using OMNET++

Simulation Based Analysis of FAST TCP using OMNET++ Smulaton Based Analyss of FAST TCP usng OMNET++ Umar ul Hassan 04030038@lums.edu.pk Md Term Report CS678 Topcs n Internet Research Sprng, 2006 Introducton Internet traffc s doublng roughly every 3 months

More information

Learning the Kernel Parameters in Kernel Minimum Distance Classifier

Learning the Kernel Parameters in Kernel Minimum Distance Classifier Learnng the Kernel Parameters n Kernel Mnmum Dstance Classfer Daoqang Zhang 1,, Songcan Chen and Zh-Hua Zhou 1* 1 Natonal Laboratory for Novel Software Technology Nanjng Unversty, Nanjng 193, Chna Department

More information

Outline. Digital Systems. C.2: Gates, Truth Tables and Logic Equations. Truth Tables. Logic Gates 9/8/2011

Outline. Digital Systems. C.2: Gates, Truth Tables and Logic Equations. Truth Tables. Logic Gates 9/8/2011 9/8/2 2 Outlne Appendx C: The Bascs of Logc Desgn TDT4255 Computer Desgn Case Study: TDT4255 Communcaton Module Lecture 2 Magnus Jahre 3 4 Dgtal Systems C.2: Gates, Truth Tables and Logc Equatons All sgnals

More information

A Fast Visual Tracking Algorithm Based on Circle Pixels Matching

A Fast Visual Tracking Algorithm Based on Circle Pixels Matching A Fast Vsual Trackng Algorthm Based on Crcle Pxels Matchng Zhqang Hou hou_zhq@sohu.com Chongzhao Han czhan@mal.xjtu.edu.cn Ln Zheng Abstract: A fast vsual trackng algorthm based on crcle pxels matchng

More information

An Optimal Algorithm for Prufer Codes *

An Optimal Algorithm for Prufer Codes * J. Software Engneerng & Applcatons, 2009, 2: 111-115 do:10.4236/jsea.2009.22016 Publshed Onlne July 2009 (www.scrp.org/journal/jsea) An Optmal Algorthm for Prufer Codes * Xaodong Wang 1, 2, Le Wang 3,

More information

Load Balancing for Hex-Cell Interconnection Network

Load Balancing for Hex-Cell Interconnection Network Int. J. Communcatons, Network and System Scences,,, - Publshed Onlne Aprl n ScRes. http://www.scrp.org/journal/jcns http://dx.do.org/./jcns.. Load Balancng for Hex-Cell Interconnecton Network Saher Manaseer,

More information

Parallelism for Nested Loops with Non-uniform and Flow Dependences

Parallelism for Nested Loops with Non-uniform and Flow Dependences Parallelsm for Nested Loops wth Non-unform and Flow Dependences Sam-Jn Jeong Dept. of Informaton & Communcaton Engneerng, Cheonan Unversty, 5, Anseo-dong, Cheonan, Chungnam, 330-80, Korea. seong@cheonan.ac.kr

More information

EECS 730 Introduction to Bioinformatics Sequence Alignment. Luke Huan Electrical Engineering and Computer Science

EECS 730 Introduction to Bioinformatics Sequence Alignment. Luke Huan Electrical Engineering and Computer Science EECS 730 Introducton to Bonformatcs Sequence Algnment Luke Huan Electrcal Engneerng and Computer Scence http://people.eecs.ku.edu/~huan/ HMM Π s a set of states Transton Probabltes a kl Pr( l 1 k Probablty

More information

Conditional Speculative Decimal Addition*

Conditional Speculative Decimal Addition* Condtonal Speculatve Decmal Addton Alvaro Vazquez and Elsardo Antelo Dep. of Electronc and Computer Engneerng Unv. of Santago de Compostela, Span Ths work was supported n part by Xunta de Galca under grant

More information

An Iterative Solution Approach to Process Plant Layout using Mixed Integer Optimisation

An Iterative Solution Approach to Process Plant Layout using Mixed Integer Optimisation 17 th European Symposum on Computer Aded Process Engneerng ESCAPE17 V. Plesu and P.S. Agach (Edtors) 2007 Elsever B.V. All rghts reserved. 1 An Iteratve Soluton Approach to Process Plant Layout usng Mxed

More information

Smoothing Spline ANOVA for variable screening

Smoothing Spline ANOVA for variable screening Smoothng Splne ANOVA for varable screenng a useful tool for metamodels tranng and mult-objectve optmzaton L. Rcco, E. Rgon, A. Turco Outlne RSM Introducton Possble couplng Test case MOO MOO wth Game Theory

More information

ELEC 377 Operating Systems. Week 6 Class 3

ELEC 377 Operating Systems. Week 6 Class 3 ELEC 377 Operatng Systems Week 6 Class 3 Last Class Memory Management Memory Pagng Pagng Structure ELEC 377 Operatng Systems Today Pagng Szes Vrtual Memory Concept Demand Pagng ELEC 377 Operatng Systems

More information

Configuration Management in Multi-Context Reconfigurable Systems for Simultaneous Performance and Power Optimizations*

Configuration Management in Multi-Context Reconfigurable Systems for Simultaneous Performance and Power Optimizations* Confguraton Management n Mult-Context Reconfgurable Systems for Smultaneous Performance and Power Optmzatons* Rafael Maestre, Mlagros Fernandez Departamento de Arqutectura de Computadores y Automátca Unversdad

More information

TPL-Aware Displacement-driven Detailed Placement Refinement with Coloring Constraints

TPL-Aware Displacement-driven Detailed Placement Refinement with Coloring Constraints TPL-ware Dsplacement-drven Detaled Placement Refnement wth Colorng Constrants Tao Ln Iowa State Unversty tln@astate.edu Chrs Chu Iowa State Unversty cnchu@astate.edu BSTRCT To mnmze the effect of process

More information

NAG Fortran Library Chapter Introduction. G10 Smoothing in Statistics

NAG Fortran Library Chapter Introduction. G10 Smoothing in Statistics Introducton G10 NAG Fortran Lbrary Chapter Introducton G10 Smoothng n Statstcs Contents 1 Scope of the Chapter... 2 2 Background to the Problems... 2 2.1 Smoothng Methods... 2 2.2 Smoothng Splnes and Regresson

More information

Steps for Computing the Dissimilarity, Entropy, Herfindahl-Hirschman and. Accessibility (Gravity with Competition) Indices

Steps for Computing the Dissimilarity, Entropy, Herfindahl-Hirschman and. Accessibility (Gravity with Competition) Indices Steps for Computng the Dssmlarty, Entropy, Herfndahl-Hrschman and Accessblty (Gravty wth Competton) Indces I. Dssmlarty Index Measurement: The followng formula can be used to measure the evenness between

More information

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers IOSR Journal of Electroncs and Communcaton Engneerng (IOSR-JECE) e-issn: 78-834,p- ISSN: 78-8735.Volume 9, Issue, Ver. IV (Mar - Apr. 04), PP 0-07 Content Based Image Retreval Usng -D Dscrete Wavelet wth

More information

Simulation: Solving Dynamic Models ABE 5646 Week 11 Chapter 2, Spring 2010

Simulation: Solving Dynamic Models ABE 5646 Week 11 Chapter 2, Spring 2010 Smulaton: Solvng Dynamc Models ABE 5646 Week Chapter 2, Sprng 200 Week Descrpton Readng Materal Mar 5- Mar 9 Evaluatng [Crop] Models Comparng a model wth data - Graphcal, errors - Measures of agreement

More information

Problem Definitions and Evaluation Criteria for Computational Expensive Optimization

Problem Definitions and Evaluation Criteria for Computational Expensive Optimization Problem efntons and Evaluaton Crtera for Computatonal Expensve Optmzaton B. Lu 1, Q. Chen and Q. Zhang 3, J. J. Lang 4, P. N. Suganthan, B. Y. Qu 6 1 epartment of Computng, Glyndwr Unversty, UK Faclty

More information

Helsinki University Of Technology, Systems Analysis Laboratory Mat Independent research projects in applied mathematics (3 cr)

Helsinki University Of Technology, Systems Analysis Laboratory Mat Independent research projects in applied mathematics (3 cr) Helsnk Unversty Of Technology, Systems Analyss Laboratory Mat-2.08 Independent research projects n appled mathematcs (3 cr) "! #$&% Antt Laukkanen 506 R ajlaukka@cc.hut.f 2 Introducton...3 2 Multattrbute

More information

A mathematical programming approach to the analysis, design and scheduling of offshore oilfields

A mathematical programming approach to the analysis, design and scheduling of offshore oilfields 17 th European Symposum on Computer Aded Process Engneerng ESCAPE17 V. Plesu and P.S. Agach (Edtors) 2007 Elsever B.V. All rghts reserved. 1 A mathematcal programmng approach to the analyss, desgn and

More information

Subspace clustering. Clustering. Fundamental to all clustering techniques is the choice of distance measure between data points;

Subspace clustering. Clustering. Fundamental to all clustering techniques is the choice of distance measure between data points; Subspace clusterng Clusterng Fundamental to all clusterng technques s the choce of dstance measure between data ponts; D q ( ) ( ) 2 x x = x x, j k = 1 k jk Squared Eucldean dstance Assumpton: All features

More information

Wishing you all a Total Quality New Year!

Wishing you all a Total Quality New Year! Total Qualty Management and Sx Sgma Post Graduate Program 214-15 Sesson 4 Vnay Kumar Kalakband Assstant Professor Operatons & Systems Area 1 Wshng you all a Total Qualty New Year! Hope you acheve Sx sgma

More information

Newton-Raphson division module via truncated multipliers

Newton-Raphson division module via truncated multipliers Newton-Raphson dvson module va truncated multplers Alexandar Tzakov Department of Electrcal and Computer Engneerng Illnos Insttute of Technology Chcago,IL 60616, USA Abstract Reducton n area and power

More information

Repeater Insertion for Two-Terminal Nets in Three-Dimensional Integrated Circuits

Repeater Insertion for Two-Terminal Nets in Three-Dimensional Integrated Circuits Repeater Inserton for Two-Termnal Nets n Three-Dmensonal Integrated Crcuts Hu Xu, Vasls F. Pavlds, and Govann De Mchel LSI - EPFL, CH-5, Swtzerland, {hu.xu,vasleos.pavlds,govann.demchel}@epfl.ch Abstract.

More information

Improving The Test Quality for Scan-based BIST Using A General Test Application Scheme

Improving The Test Quality for Scan-based BIST Using A General Test Application Scheme _ Improvng The Test Qualty for can-based BIT Usng A General Test Applcaton cheme Huan-Chh Tsa Kwang-Tng Cheng udpta Bhawmk Department of ECE Bell Laboratores Unversty of Calforna Lucent Technologes anta

More information

A fault tree analysis strategy using binary decision diagrams

A fault tree analysis strategy using binary decision diagrams Loughborough Unversty Insttutonal Repostory A fault tree analyss strategy usng bnary decson dagrams Ths tem was submtted to Loughborough Unversty's Insttutonal Repostory by the/an author. Addtonal Informaton:

More information

Assignment # 2. Farrukh Jabeen Algorithms 510 Assignment #2 Due Date: June 15, 2009.

Assignment # 2. Farrukh Jabeen Algorithms 510 Assignment #2 Due Date: June 15, 2009. Farrukh Jabeen Algorthms 51 Assgnment #2 Due Date: June 15, 29. Assgnment # 2 Chapter 3 Dscrete Fourer Transforms Implement the FFT for the DFT. Descrbed n sectons 3.1 and 3.2. Delverables: 1. Concse descrpton

More information

An Entropy-Based Approach to Integrated Information Needs Assessment

An Entropy-Based Approach to Integrated Information Needs Assessment Dstrbuton Statement A: Approved for publc release; dstrbuton s unlmted. An Entropy-Based Approach to ntegrated nformaton Needs Assessment June 8, 2004 Wllam J. Farrell Lockheed Martn Advanced Technology

More information

Meta-heuristics for Multidimensional Knapsack Problems

Meta-heuristics for Multidimensional Knapsack Problems 2012 4th Internatonal Conference on Computer Research and Development IPCSIT vol.39 (2012) (2012) IACSIT Press, Sngapore Meta-heurstcs for Multdmensonal Knapsack Problems Zhbao Man + Computer Scence Department,

More information

Construction of ROBDDs. area. that such graphs, under some conditions, can be easily manipulated.

Construction of ROBDDs. area. that such graphs, under some conditions, can be easily manipulated. A Study of Composton Schemes for Mxed Apply/Compose Based Constructon of s A Narayan 1 S P Khatr 1 J Jan 2 M Fujta 2 R K Brayton 1 A Sangovann-Vncentell 1 Abstract Reduced Ordered Bnary Decson Dagrams

More information

Hermite Splines in Lie Groups as Products of Geodesics

Hermite Splines in Lie Groups as Products of Geodesics Hermte Splnes n Le Groups as Products of Geodescs Ethan Eade Updated May 28, 2017 1 Introducton 1.1 Goal Ths document defnes a curve n the Le group G parametrzed by tme and by structural parameters n the

More information

Improvement of Spatial Resolution Using BlockMatching Based Motion Estimation and Frame. Integration

Improvement of Spatial Resolution Using BlockMatching Based Motion Estimation and Frame. Integration Improvement of Spatal Resoluton Usng BlockMatchng Based Moton Estmaton and Frame Integraton Danya Suga and Takayuk Hamamoto Graduate School of Engneerng, Tokyo Unversty of Scence, 6-3-1, Nuku, Katsuska-ku,

More information

Generation of Compact Single-Detect Stuck-At Test Sets Targeting Unmodeled Defects

Generation of Compact Single-Detect Stuck-At Test Sets Targeting Unmodeled Defects Generaton of Compact Sngle-Detect Stuck-At Test Sets Targetng Unmodeled Defects Xrysovalants Kavousanos and Krshnendu Chakrabarty * Abstract- We present a new method to generate compact stuckat test sets

More information

Feature Reduction and Selection

Feature Reduction and Selection Feature Reducton and Selecton Dr. Shuang LIANG School of Software Engneerng TongJ Unversty Fall, 2012 Today s Topcs Introducton Problems of Dmensonalty Feature Reducton Statstc methods Prncpal Components

More information

A Min-Cost Flow Based Detailed Router for FPGAs

A Min-Cost Flow Based Detailed Router for FPGAs A Mn-Cost Flow Based Detaled Router for FPGAs eokn Lee Dept. of ECE The Unversty of Texas at Austn Austn, TX 78712 Yongseok Cheon Dept. of Computer cences The Unversty of Texas at Austn Austn, TX 78712

More information

Improved Symoblic Simulation By Dynamic Funtional Space Partitioning

Improved Symoblic Simulation By Dynamic Funtional Space Partitioning Improved Symoblc Smulaton By Dynamc Funtonal Space Parttonng Tao Feng, L-.Wang, Kwang-Tng heng Department of EE, U-Santa Barbara, U.S.A tfeng,lcwang, tmcheng @ece.ucsb.edu Andy -. Ln adence Desgn Systems,

More information

A MOVING MESH APPROACH FOR SIMULATION BUDGET ALLOCATION ON CONTINUOUS DOMAINS

A MOVING MESH APPROACH FOR SIMULATION BUDGET ALLOCATION ON CONTINUOUS DOMAINS Proceedngs of the Wnter Smulaton Conference M E Kuhl, N M Steger, F B Armstrong, and J A Jones, eds A MOVING MESH APPROACH FOR SIMULATION BUDGET ALLOCATION ON CONTINUOUS DOMAINS Mark W Brantley Chun-Hung

More information

Efficient Distributed File System (EDFS)

Efficient Distributed File System (EDFS) Effcent Dstrbuted Fle System (EDFS) (Sem-Centralzed) Debessay(Debsh) Fesehaye, Rahul Malk & Klara Naherstedt Unversty of Illnos-Urbana Champagn Contents Problem Statement, Related Work, EDFS Desgn Rate

More information

Load-Balanced Anycast Routing

Load-Balanced Anycast Routing Load-Balanced Anycast Routng Chng-Yu Ln, Jung-Hua Lo, and Sy-Yen Kuo Department of Electrcal Engneerng atonal Tawan Unversty, Tape, Tawan sykuo@cc.ee.ntu.edu.tw Abstract For fault-tolerance and load-balance

More information

For instance, ; the five basic number-sets are increasingly more n A B & B A A = B (1)

For instance, ; the five basic number-sets are increasingly more n A B & B A A = B (1) Secton 1.2 Subsets and the Boolean operatons on sets If every element of the set A s an element of the set B, we say that A s a subset of B, or that A s contaned n B, or that B contans A, and we wrte A

More information

CHAPTER 4 PARALLEL PREFIX ADDER

CHAPTER 4 PARALLEL PREFIX ADDER 93 CHAPTER 4 PARALLEL PREFIX ADDER 4.1 INTRODUCTION VLSI Integer adders fnd applcatons n Arthmetc and Logc Unts (ALUs), mcroprocessors and memory addressng unts. Speed of the adder often decdes the mnmum

More information

Course Introduction. Algorithm 8/31/2017. COSC 320 Advanced Data Structures and Algorithms. COSC 320 Advanced Data Structures and Algorithms

Course Introduction. Algorithm 8/31/2017. COSC 320 Advanced Data Structures and Algorithms. COSC 320 Advanced Data Structures and Algorithms Course Introducton Course Topcs Exams, abs, Proects A quc loo at a few algorthms 1 Advanced Data Structures and Algorthms Descrpton: We are gong to dscuss algorthm complexty analyss, algorthm desgn technques

More information

Quality Improvement Algorithm for Tetrahedral Mesh Based on Optimal Delaunay Triangulation

Quality Improvement Algorithm for Tetrahedral Mesh Based on Optimal Delaunay Triangulation Intellgent Informaton Management, 013, 5, 191-195 Publshed Onlne November 013 (http://www.scrp.org/journal/m) http://dx.do.org/10.36/m.013.5601 Qualty Improvement Algorthm for Tetrahedral Mesh Based on

More information

Test-Cost Modeling and Optimal Test-Flow Selection of 3D-Stacked ICs

Test-Cost Modeling and Optimal Test-Flow Selection of 3D-Stacked ICs Test-Cost Modelng and Optmal Test-Flow Selecton of 3D-Stacked ICs Mukesh Agrawal, Student Member, IEEE, and Krshnendu Chakrabarty, Fellow, IEEE Abstract Three-dmensonal (3D) ntegraton s an attractve technology

More information

Dynamic Voltage Scaling of Supply and Body Bias Exploiting Software Runtime Distribution

Dynamic Voltage Scaling of Supply and Body Bias Exploiting Software Runtime Distribution Dynamc Voltage Scalng of Supply and Body Bas Explotng Software Runtme Dstrbuton Sungpack Hong EE Department Stanford Unversty Sungjoo Yoo, Byeong Bn, Kyu-Myung Cho, Soo-Kwan Eo Samsung Electroncs Taehwan

More information

High-Boost Mesh Filtering for 3-D Shape Enhancement

High-Boost Mesh Filtering for 3-D Shape Enhancement Hgh-Boost Mesh Flterng for 3-D Shape Enhancement Hrokazu Yagou Λ Alexander Belyaev y Damng We z Λ y z ; ; Shape Modelng Laboratory, Unversty of Azu, Azu-Wakamatsu 965-8580 Japan y Computer Graphcs Group,

More information

Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task

Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task Proceedngs of NTCIR-6 Workshop Meetng, May 15-18, 2007, Tokyo, Japan Term Weghtng Classfcaton System Usng the Ch-square Statstc for the Classfcaton Subtask at NTCIR-6 Patent Retreval Task Kotaro Hashmoto

More information

Reducing Frame Rate for Object Tracking

Reducing Frame Rate for Object Tracking Reducng Frame Rate for Object Trackng Pavel Korshunov 1 and We Tsang Oo 2 1 Natonal Unversty of Sngapore, Sngapore 11977, pavelkor@comp.nus.edu.sg 2 Natonal Unversty of Sngapore, Sngapore 11977, oowt@comp.nus.edu.sg

More information

Optimizing Document Scoring for Query Retrieval

Optimizing Document Scoring for Query Retrieval Optmzng Document Scorng for Query Retreval Brent Ellwen baellwe@cs.stanford.edu Abstract The goal of ths project was to automate the process of tunng a document query engne. Specfcally, I used machne learnng

More information

Comparison of Heuristics for Scheduling Independent Tasks on Heterogeneous Distributed Environments

Comparison of Heuristics for Scheduling Independent Tasks on Heterogeneous Distributed Environments Comparson of Heurstcs for Schedulng Independent Tasks on Heterogeneous Dstrbuted Envronments Hesam Izakan¹, Ath Abraham², Senor Member, IEEE, Václav Snášel³ ¹ Islamc Azad Unversty, Ramsar Branch, Ramsar,

More information

How Accurately Can We Model Timing In A Placement Engine?

How Accurately Can We Model Timing In A Placement Engine? How Accurately Can We Model Tmng In A Placement Engne? Amt Chowdhary, Karth Raagopal, Satsh Venatesan, Tung Cao, Vladmr Tourn, Yegna Parasuram, Bll Halpn Intel Corporaton Serra Desgn Automaton Synplcty,

More information

TECHNIQUE OF FORMATION HOMOGENEOUS SAMPLE SAME OBJECTS. Muradaliyev A.Z.

TECHNIQUE OF FORMATION HOMOGENEOUS SAMPLE SAME OBJECTS. Muradaliyev A.Z. TECHNIQUE OF FORMATION HOMOGENEOUS SAMPLE SAME OBJECTS Muradalyev AZ Azerbajan Scentfc-Research and Desgn-Prospectng Insttute of Energetc AZ1012, Ave HZardab-94 E-mal:aydn_murad@yahoocom Importance of

More information

Proper Choice of Data Used for the Estimation of Datum Transformation Parameters

Proper Choice of Data Used for the Estimation of Datum Transformation Parameters Proper Choce of Data Used for the Estmaton of Datum Transformaton Parameters Hakan S. KUTOGLU, Turkey Key words: Coordnate systems; transformaton; estmaton, relablty. SUMMARY Advances n technologes and

More information

Advanced Computer Networks

Advanced Computer Networks Char of Network Archtectures and Servces Department of Informatcs Techncal Unversty of Munch Note: Durng the attendance check a stcker contanng a unque QR code wll be put on ths exam. Ths QR code contans

More information

6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour

6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour 6.854 Advanced Algorthms Petar Maymounkov Problem Set 11 (November 23, 2005) Wth: Benjamn Rossman, Oren Wemann, and Pouya Kheradpour Problem 1. We reduce vertex cover to MAX-SAT wth weghts, such that the

More information

CACHE MEMORY DESIGN FOR INTERNET PROCESSORS

CACHE MEMORY DESIGN FOR INTERNET PROCESSORS CACHE MEMORY DESIGN FOR INTERNET PROCESSORS WE EVALUATE A SERIES OF THREE PROGRESSIVELY MORE AGGRESSIVE ROUTING-TABLE CACHE DESIGNS AND DEMONSTRATE THAT THE INCORPORATION OF HARDWARE CACHES INTO INTERNET

More information

Multi-objective Design Optimization of MCM Placement

Multi-objective Design Optimization of MCM Placement Proceedngs of the 5th WSEAS Int. Conf. on Instrumentaton, Measurement, Crcuts and Systems, Hangzhou, Chna, Aprl 6-8, 26 (pp56-6) Mult-objectve Desgn Optmzaton of MCM Placement Chng-Ma Ko ab, Yu-Jung Huang

More information

Cluster Analysis of Electrical Behavior

Cluster Analysis of Electrical Behavior Journal of Computer and Communcatons, 205, 3, 88-93 Publshed Onlne May 205 n ScRes. http://www.scrp.org/ournal/cc http://dx.do.org/0.4236/cc.205.350 Cluster Analyss of Electrcal Behavor Ln Lu Ln Lu, School

More information

Analysis of Min Sum Iterative Decoder using Buffer Insertion

Analysis of Min Sum Iterative Decoder using Buffer Insertion Analyss of Mn Sum Iteratve ecoder usng Buffer Inserton Saravanan Swapna M.E II year, ept of ECE SSN College of Engneerng M. Anbuselv Assstant Professor, ept of ECE SSN College of Engneerng S.Salvahanan

More information

An Efficient Garbage Collection for Flash Memory-Based Virtual Memory Systems

An Efficient Garbage Collection for Flash Memory-Based Virtual Memory Systems S. J and D. Shn: An Effcent Garbage Collecton for Flash Memory-Based Vrtual Memory Systems 2355 An Effcent Garbage Collecton for Flash Memory-Based Vrtual Memory Systems Seunggu J and Dongkun Shn, Member,

More information

Data Representation in Digital Design, a Single Conversion Equation and a Formal Languages Approach

Data Representation in Digital Design, a Single Conversion Equation and a Formal Languages Approach Data Representaton n Dgtal Desgn, a Sngle Converson Equaton and a Formal Languages Approach Hassan Farhat Unversty of Nebraska at Omaha Abstract- In the study of data representaton n dgtal desgn and computer

More information

A Saturation Binary Neural Network for Crossbar Switching Problem

A Saturation Binary Neural Network for Crossbar Switching Problem A Saturaton Bnary Neural Network for Crossbar Swtchng Problem Cu Zhang 1, L-Qng Zhao 2, and Rong-Long Wang 2 1 Department of Autocontrol, Laonng Insttute of Scence and Technology, Benx, Chna bxlkyzhangcu@163.com

More information

Design of a Real Time FPGA-based Three Dimensional Positioning Algorithm

Design of a Real Time FPGA-based Three Dimensional Positioning Algorithm Desgn of a Real Tme FPGA-based Three Dmensonal Postonng Algorthm Nathan G. Johnson-Wllams, Student Member IEEE, Robert S. Myaoka, Member IEEE, Xaol L, Student Member IEEE, Tom K. Lewellen, Fellow IEEE,

More information

Sum of Linear and Fractional Multiobjective Programming Problem under Fuzzy Rules Constraints

Sum of Linear and Fractional Multiobjective Programming Problem under Fuzzy Rules Constraints Australan Journal of Basc and Appled Scences, 2(4): 1204-1208, 2008 ISSN 1991-8178 Sum of Lnear and Fractonal Multobjectve Programmng Problem under Fuzzy Rules Constrants 1 2 Sanjay Jan and Kalash Lachhwan

More information

Parallel Inverse Halftoning by Look-Up Table (LUT) Partitioning

Parallel Inverse Halftoning by Look-Up Table (LUT) Partitioning Parallel Inverse Halftonng by Look-Up Table (LUT) Parttonng Umar F. Sddq and Sadq M. Sat umar@ccse.kfupm.edu.sa, sadq@kfupm.edu.sa KFUPM Box: Department of Computer Engneerng, Kng Fahd Unversty of Petroleum

More information

FINDING IMPORTANT NODES IN SOCIAL NETWORKS BASED ON MODIFIED PAGERANK

FINDING IMPORTANT NODES IN SOCIAL NETWORKS BASED ON MODIFIED PAGERANK FINDING IMPORTANT NODES IN SOCIAL NETWORKS BASED ON MODIFIED PAGERANK L-qng Qu, Yong-quan Lang 2, Jng-Chen 3, 2 College of Informaton Scence and Technology, Shandong Unversty of Scence and Technology,

More information

Obstacle-Aware Routing Problem in. a Rectangular Mesh Network

Obstacle-Aware Routing Problem in. a Rectangular Mesh Network Appled Mathematcal Scences, Vol. 9, 015, no. 14, 653-663 HIKARI Ltd, www.m-hkar.com http://dx.do.org/10.1988/ams.015.411911 Obstacle-Aware Routng Problem n a Rectangular Mesh Network Norazah Adzhar Department

More information

A Multilevel Analytical Placement for 3D ICs

A Multilevel Analytical Placement for 3D ICs A Multlevel Analytcal Placement for 3D ICs Jason Cong, and Guoje Luo Computer Scence Department Unversty of Calforna, Los Angeles Calforna NanoSystems Insttute Los Angeles, CA 90095, USA Tel : (30) 06-775

More information

Programming in Fortran 90 : 2017/2018

Programming in Fortran 90 : 2017/2018 Programmng n Fortran 90 : 2017/2018 Programmng n Fortran 90 : 2017/2018 Exercse 1 : Evaluaton of functon dependng on nput Wrte a program who evaluate the functon f (x,y) for any two user specfed values

More information

3. CR parameters and Multi-Objective Fitness Function

3. CR parameters and Multi-Objective Fitness Function 3 CR parameters and Mult-objectve Ftness Functon 41 3. CR parameters and Mult-Objectve Ftness Functon 3.1. Introducton Cogntve rados dynamcally confgure the wreless communcaton system, whch takes beneft

More information

Support Vector Machines

Support Vector Machines /9/207 MIST.6060 Busness Intellgence and Data Mnng What are Support Vector Machnes? Support Vector Machnes Support Vector Machnes (SVMs) are supervsed learnng technques that analyze data and recognze patterns.

More information

Optimal Workload-based Weighted Wavelet Synopses

Optimal Workload-based Weighted Wavelet Synopses Optmal Workload-based Weghted Wavelet Synopses Yoss Matas School of Computer Scence Tel Avv Unversty Tel Avv 69978, Israel matas@tau.ac.l Danel Urel School of Computer Scence Tel Avv Unversty Tel Avv 69978,

More information

Storage Binding in RTL synthesis

Storage Binding in RTL synthesis Storage Bndng n RTL synthess Pe Zhang Danel D. Gajsk Techncal Report ICS-0-37 August 0th, 200 Center for Embedded Computer Systems Department of Informaton and Computer Scence Unersty of Calforna, Irne

More information

An Efficient Genetic Algorithm with Fuzzy c-means Clustering for Traveling Salesman Problem

An Efficient Genetic Algorithm with Fuzzy c-means Clustering for Traveling Salesman Problem An Effcent Genetc Algorthm wth Fuzzy c-means Clusterng for Travelng Salesman Problem Jong-Won Yoon and Sung-Bae Cho Dept. of Computer Scence Yonse Unversty Seoul, Korea jwyoon@sclab.yonse.ac.r, sbcho@cs.yonse.ac.r

More information

Memory Modeling in ESL-RTL Equivalence Checking

Memory Modeling in ESL-RTL Equivalence Checking 11.4 Memory Modelng n ESL-RTL Equvalence Checkng Alfred Koelbl 2025 NW Cornelus Pass Rd. Hllsboro, OR 97124 koelbl@synopsys.com Jerry R. Burch 2025 NW Cornelus Pass Rd. Hllsboro, OR 97124 burch@synopsys.com

More information

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision SLAM Summer School 2006 Practcal 2: SLAM usng Monocular Vson Javer Cvera, Unversty of Zaragoza Andrew J. Davson, Imperal College London J.M.M Montel, Unversty of Zaragoza. josemar@unzar.es, jcvera@unzar.es,

More information

Virtual Machine Migration based on Trust Measurement of Computer Node

Virtual Machine Migration based on Trust Measurement of Computer Node Appled Mechancs and Materals Onlne: 2014-04-04 ISSN: 1662-7482, Vols. 536-537, pp 678-682 do:10.4028/www.scentfc.net/amm.536-537.678 2014 Trans Tech Publcatons, Swtzerland Vrtual Machne Mgraton based on

More information

SENSITIVITY ANALYSIS IN LINEAR PROGRAMMING USING A CALCULATOR

SENSITIVITY ANALYSIS IN LINEAR PROGRAMMING USING A CALCULATOR SENSITIVITY ANALYSIS IN LINEAR PROGRAMMING USING A CALCULATOR Judth Aronow Rchard Jarvnen Independent Consultant Dept of Math/Stat 559 Frost Wnona State Unversty Beaumont, TX 7776 Wnona, MN 55987 aronowju@hal.lamar.edu

More information

Mathematics 256 a course in differential equations for engineering students

Mathematics 256 a course in differential equations for engineering students Mathematcs 56 a course n dfferental equatons for engneerng students Chapter 5. More effcent methods of numercal soluton Euler s method s qute neffcent. Because the error s essentally proportonal to the

More information

A Unified Framework for Semantics and Feature Based Relevance Feedback in Image Retrieval Systems

A Unified Framework for Semantics and Feature Based Relevance Feedback in Image Retrieval Systems A Unfed Framework for Semantcs and Feature Based Relevance Feedback n Image Retreval Systems Ye Lu *, Chunhu Hu 2, Xngquan Zhu 3*, HongJang Zhang 2, Qang Yang * School of Computng Scence Smon Fraser Unversty

More information

Analysis of Continuous Beams in General

Analysis of Continuous Beams in General Analyss of Contnuous Beams n General Contnuous beams consdered here are prsmatc, rgdly connected to each beam segment and supported at varous ponts along the beam. onts are selected at ponts of support,

More information

A Frame Packing Mechanism Using PDO Communication Service within CANopen

A Frame Packing Mechanism Using PDO Communication Service within CANopen 28 A Frame Packng Mechansm Usng PDO Communcaton Servce wthn CANopen Mnkoo Kang and Kejn Park Dvson of Industral & Informaton Systems Engneerng, Ajou Unversty, Suwon, Gyeongg-do, South Korea Summary The

More information

Modeling Multiple Input Switching of CMOS Gates in DSM Technology Using HDMR

Modeling Multiple Input Switching of CMOS Gates in DSM Technology Using HDMR 1 Modelng Multple Input Swtchng of CMOS Gates n DSM Technology Usng HDMR Jayashree Srdharan and Tom Chen Dept. of Electrcal and Computer Engneerng Colorado State Unversty, Fort Collns, CO, 8523, USA (jaya@engr.colostate.edu,

More information

Complex Numbers. Now we also saw that if a and b were both positive then ab = a b. For a second let s forget that restriction and do the following.

Complex Numbers. Now we also saw that if a and b were both positive then ab = a b. For a second let s forget that restriction and do the following. Complex Numbers The last topc n ths secton s not really related to most of what we ve done n ths chapter, although t s somewhat related to the radcals secton as we wll see. We also won t need the materal

More information

A RECONFIGURABLE ARCHITECTURE FOR MULTI-GIGABIT SPEED CONTENT-BASED ROUTING. James Moscola, Young H. Cho, John W. Lockwood

A RECONFIGURABLE ARCHITECTURE FOR MULTI-GIGABIT SPEED CONTENT-BASED ROUTING. James Moscola, Young H. Cho, John W. Lockwood A RECONFIGURABLE ARCHITECTURE FOR MULTI-GIGABIT SPEED CONTENT-BASED ROUTING James Moscola, Young H. Cho, John W. Lockwood Dept. of Computer Scence and Engneerng Washngton Unversty, St. Lous, MO {jmm5,

More information

Module Management Tool in Software Development Organizations

Module Management Tool in Software Development Organizations Journal of Computer Scence (5): 8-, 7 ISSN 59-66 7 Scence Publcatons Management Tool n Software Development Organzatons Ahmad A. Al-Rababah and Mohammad A. Al-Rababah Faculty of IT, Al-Ahlyyah Amman Unversty,

More information

Run-Time Operator State Spilling for Memory Intensive Long-Running Queries

Run-Time Operator State Spilling for Memory Intensive Long-Running Queries Run-Tme Operator State Spllng for Memory Intensve Long-Runnng Queres Bn Lu, Yal Zhu, and lke A. Rundenstener epartment of Computer Scence, Worcester Polytechnc Insttute Worcester, Massachusetts, USA {bnlu,

More information