A Power Optimization Toolbox for Logic Synthesis and Mapping
|
|
- Brandon Waters
- 6 years ago
- Views:
Transcription
1 A Power Optmzaton Toolbox for Logc Synthess and Mappng Alan Mshchenko Robert Brayton Stephen Jang Kevn Chung Department of EECS, Unversty of Calforna, Berkeley LogcMll Technology Qualcom Innovatons {alanm, Abstract The paper descrbes several complementary algorthms for power-aware logc optmzaton: o SmSwtch s an effcent sequental smulator for estmatng swtchng actvty of sgnals n large sequental desgns. o PowerMap uses swtchng actvty to make better decsons durng power-aware technology mappng. o PowerDC s a resynthess algorthm that elmnates wres wth hgh swtchng actvty. The proposed smulator draws on new deas n logc representaton and s geared for speed, e.g. t can smulate a 1Mnode sequental desgn usng 1000 bt patterns for 100 cycles n about 10 seconds on a typcal one-core CPU. Ths s more than 85X faster than prevously publshed actvty estmators wth smlar accuracy. Experments show that, although each technque contrbutes to the fnal qualty, t s ther combnaton that gves the best results. When appled to large ndustral desgns n a hghly-optmzed ndustral flow, prevous work on sequental synthess and wre-aware technology mappng led to a 27.6% reducton n swtchng actvty, whle the technques of ths paper reduce t addtonally by 19.6% wthout a substantal ncrease n runtme or degradaton of other metrcs. 1 Introducton Wth the ncrease of the desgn sze and the reducton of feature sze, power mnmzaton becomes an mportant ssue. Accordng to current estmates, about 67% of power dsspaton n FPGA devces s due to dynamc power [3]. Ths hgh power consumpton s an ncreasng concern for FPGA vendors and ther customers. Reducng power s key to lowerng packagng and coolng costs, and mprovng devce relablty [13]. Total power n an FPGA (or any semconductor devce) conssts of statc power and dynamc power. Statc power results prmarly from transstor leakage current, whch s the small current that travels ether from source-to-dran or through the gate oxde when the transstor s logcally off [33]. Dynamc power dsspaton s caused by sgnal transtons n a crcut and comes from the clock and logc networks. These transtons can be part of normal operaton or can be due to gltchng. The dynamc power s characterzed by the equaton: 1 2 P = f V C dynamc s 2 sgnals where f s the clock frequency, V the supply voltage, C the capactance swtched by sgnal, and s s the probablty of sgnal makng a transton. An overvew of leakage reducton technques used n ASICs and mcroprocessor s descrbed n [30]. However, the focus of ths paper s on reducng the dynamc power consumpton due to sgnal swtchng n the logc network. Most power-aware technology mappng [3] and power-aware placement and routng [15] use swtchng actvty nformaton of the netlst to estmate dynamc power dsspaton. Technques for estmatng swtchng actvty can be smulatonbased or probablty-based (or vectorless). Ths work uses smulaton, whch s more accurate but slower [16]. Sequental smulaton s appled to a Boolean network consstng of logc gates and flp-flops whle keepng track of the transtons. Gatelevel smulaton s a well-studed problem and much effort has been placed on mprovng ts speed [17][30]. Despte many nnovatons, gate-level smulaton remans slow for large desgns. Logc synthess and technology mappng transform logc mplementaton by choosng among dfferent crcut structures. Gven the goal of reducng dynamc power dsspaton, t s natural to seek logc structures resultng n a reducton of the total swtchng actvty of ther consttuent nodes. Thus, t s mportant to develop a fast and accurate swtchng actvty estmator that can drve power-aware technology mappng and resynthess. The present paper contrbutes to these goals n several ways: It descrbes an mplementaton of an effcent sequental smulator, SmSwtch, for estmatng swtchng actvty of all sgnals n a sequental desgn. Ths estmaton s used to gude power-aware logc synthess by evaluatng dfferent restructurng and mappng choces. It descrbes two synergstc algorthms, PowerMap and PowerDC, performng power-aware technology mappng and restructurng of the crcut whle tryng to mnmze the total swtchng actvty. It offers an expermental evaluaton of these methods and demonstrates that a substantal reducton n the dynamc power s possble wthout ncreasng runtme and compromsng other parameters such as area and delay. The contrbutons of ths paper can be extended to work for standard-cell desgns. In partcular, the same sequental smulator can be used to estmate swtchng actvty of all sgnals. A cutbased standard-cell mapper [5] can be modfed to take swtchng actvty nto account, leadng to mappngs wth reduced power consumpton. Fnally, a don t-care-based optmzaton envronment for standard-cells can be developed based on the same methodology as [27]. Ths optmzaton wll rewre a standard-cell desgn whle replacng nets wth hgh swtchng actvty by nets wth low swtchng actvty. Although development of such an envronment s beyond the scope of ths work, our experence wth other optmzatons applcable to both standard-cell and LUT-based desgns suggests that powerreductons for standard-cells are lkely to be comparable to those for the LUT-based desgns presented n ths paper. The rest of the paper s organzed as follows. Secton 2 provdes necessary background on logc synthess and technology mappng. Secton 3 descrbes the sequental smulator for power estmaton. Secton 4 presents the power-aware logc optmzaton algorthms. Secton 5 shows expermental results. Fnally, Secton 6 concludes the paper and outlnes future work. 2 Background 2.1 Boolean Networks A Boolean network s a drected acyclc graph (DAG) wth nodes correspondng to logc gates and drected edges
2 correspondng to wres connectng the gates. The terms Boolean network and crcut are used nterchangeably n ths paper. If the network s sequental, the memory elements are assumed to be D-flp-flops wth ntal states. Terms memory elements, flopflops, and regsters are used nterchangeably n ths paper. A node n has zero or more fanns,.e. nodes that are drvng n, and zero or more fanouts,.e. nodes drven by n. The prmary nputs (PIs) are nodes wthout fanns n the current network. The prmary outputs (POs) are a subset of nodes of the network. If the network s sequental, t contans regsters whose nputs and output are treated as addtonal PIs/POs n combnatonal optmzaton and mappng. It s assumed that each node has a unque nteger called ts node ID. A cut C of node n, called root, s a set of nodes, called leaves, such that each path from a PI to n passes through at least one leaf. A cut s K-feasble f ts cardnalty does not exceed K. A cut s domnated f there s another cut of the same node, contaned, settheoretcally, n the gven cut. A fann (fanout) cone of node n s the subset of the nodes of the network reachable through the fann (fanout) edges from n. A maxmum fanout free cone (MFFC) of node n s a subset of the fann cone, such that every path from a node n the subset to the POs passes through n. Informally, the MFFC of a node contans all the logc used exclusvely to generate the output functon of the node. When a node s removed due to redundancy, the logc n ts MFFC can also be removed. 2.2 And-Inverter Graphs A combnatonal And-Inverter Graph (AIG) [22] s a drected acyclc graph (DAG), n whch a node has ether 0 or 2 ncomng edges. A node wth no ncomng edges s a prmary nput (PI). A node wth 2 ncomng edges s a two-nput AND gate. An edge s ether complemented or not. A complemented edge ndcates the nverson of the sgnal. Certan nodes are marked as prmary outputs (POs). The combnatonal logc of an arbtrary Boolean network can be factored [4] and transformed nto an AIG usng DeMorgan s rule. 2.3 Technology Mappng wth Structural Choces A typcal procedure for structural technology mappng conssts of the followng step: 1) cut enumeraton, 2) delay-optmal mappng, 3) area recovery usng heurstcs, 4) producng the resultng LUT network. A detaled descrpton of these steps s gven n [7]. The man drawback of the structural approaches to technology mappng s ther dependence on the crcut structure gven to the mapper (also known as the problem of structural bas ). If the structure s bad, nether heurstcs nor teratve recovery procedures wll lead to a good result of mappng. To overcome the bas of a sngle netlst structure, mappng wth structural choces (lossless synthess) s explored n [5] and [24]. The structural choces are constructed by recordng equvalent crcut structures at a subset of nodes (called choce nodes). These alternatve structures come from synthess transformatons, such as, AIG rewrtng/balancng or co-factorng. As the mapper traverses over these choce nodes n the subject graph, t can explore the multple alternatve mplementatons smultaneously. It s found that usng structural choces over multple passes of LUT mappng yelds sgnfcantly better results, compared to mappng wthout choces or runnng only a sngle teraton of mappng wth choces [24]. Mappng wth structural choces s used n the experments results secton. 3 Sequental Smulaton (SmSwtch) A sequental smulator apples sequences of values to the prmary nputs. It s assumed that the ntal state of the desgn s known and used to smulate the frst frame. In subsequent frames, the state derved at the prevous teraton s used. Input nformaton (such as the probablty of a transton occurrng) can be gven by the user or produced by another tool (for example, f an nput trace s known, t may be appled to smulate the desg. In ths work, sequental smulaton s used to estmate swtchng actvty of regsters and nternal nodes. Ths s used later to gude heurstc power-aware optmzaton. The desgn s sequentally smulated for a fxed number of tmeframes. The random nput patterns are generated to have a 0.5 probablty of the probablty of the node changng ts value (toggle rate) s fxed. Smulaton s performed from the ntal state for a fxed number of cycles (typcally, 64), of whch the frst few (typcally, 16) are skpped as not representatve of normal operaton; the remanng ones are used to accumulate the swtchng actvty. Prevous work [6][8][12][18] concluded that the man problem wth sequental smulaton for swtchng-actvty estmaton s the prohbtve runtme of the smulator. In ths work, we address ths problem wth SmSwtch, based on new logc representaton and manpulaton methods. The salent features of the new smulator are descrbed below. In general, the smaller the memory footprnt of the smulator, the faster t runs. Ths s due to a CPU havng typcally a local cache rangng n sze from 2Mb to 16Mb. If an applcaton requres more memory than fts nto the cache, repeated cache msses cause the runtme to degrade. Therefore, the challenge s to desgn the smulator that uses mnmalstc data-structures wthout compromsng the computaton speed. We found three orthogonal ways of reducng the memory requrements of the smulator, wthout mpactng ts performance. 4 Approaches to Power Mnmzaton In ths secton, two orthogonal ways of usng swtchng actvty are descrbed. The frst (PowerMap) uses swtchng actvty to make better decsons durng technology mappng whle the second (PowerDC) does power-aware re-synthess after mappng. 4.1 Technology Mappng (PowerMap) LUT-based mappers [7][19][23] have evolved from the frst technology mappng solutons for FPGAs [11][9], to those that have reduced runtme, mproved qualty of results, and allow users to select cost-functons employed durng decson makng, as n [21]. In partcular, a recent technology mapper [23] was successfully modfed to accommodate heurstcs for reducng wre-length and mprovng desgn routablty (WreMap) [12][14]. We propose a smlar modfcaton of the prorty-cut based mapper [23] geared towards reducng swtchng actvty, resultng n an algorthm called PowerMap. Snce the basc mappng algorthm s descrbed n detal n [23] and [14], here we only descrbe the changes n the new cost functon whch use swtchng actvty to prortze the cuts durng technology mappng. The ntuton behnd power-aware technology mappng s to try to cover the nodes wth hgh swtchng actvty (or hot nodes) so they are hdden nsde a LUT. Snce the capactance nsde a LUT s very small, power consumpton wll be reduced. The followng three metrcs are compared below: area flow [10][19][23], edge flow (WreMap) [14], swtchng flow (PowerMap) (ths paper).
3 Area flow (AF) s defned as: AF( Leaf ( ) AF( = Area(, (1) + NumFanout( Leaf ( ) where Area( s the area of the LUT used to map node n, Leaf ( s the -th leaf of the representatve cut of node n, and NumFanouts(Leaf () s the number of fanouts of node Leaf ( n the currently selected mappng. Edge flow (EF) s defned as: EF( Leaf ( ) EF( = Edge(, (2) + NumFanout( Leaf ( ) where Edge( s the total number of fann edges (or wres) of the LUT used to map the current representatve cut of node n, whle other notatons are the same as n (1). Swtchng flow s defned as: SwtchFlow( = Swtch( + SwtchFlow( Leaf ( ), (3) NumFanout( Leaf ( ) where Swtch( s the total swtchng actvty at the output of node n, computed usng sequental smulaton (from Secton 3). Intutvely, each of these flows s an estmate of a resource (area, wrng, swtchng) assocated wth the logc rooted at node n. The man dea s to compute for each cut three dfferent metrcs (flows) capturng the global vew of the netlst durng mappng: area flow, edge flow, and swtchng flow. When computng these metrcs, each cut s characterzed by 1) ts own resources used and 2) the resources used by ts fanns. The value of a resource of a fann s dvded by the number of the fann s fanouts. Consder node n1 n Fgure 1. The drect resources of n1 are: 1) the node tself; 2) the three nput edges; 3) the total resources for the three edges. For nodes n2 and n4, half of the resources s used by node n1 whle the other half s used by other fanouts. For node n3, all resources are used by node n1. Resources owned by the nodes n1 n5 Resources owned by the nodes Formulas (1)-(3) provde dfferent applcatons of the noton of a flow when appled to technology mappng. The cost functons measure ncreasngly complex netlst metrcs, startng from area (LUT count) to the number of fanns (routng wres) to the total swtchng actvty of fanns (power estmate), whle the noton of a flow captures the fact that each ncomng flow s shared by the fanouts of the fanns, whch gves a global vew of each cost functon durng technology mappng. 4.2 SAT-Based Resynthess (PowerDC) An effcent approach to resyntheszng logc networks after technology mappng s gven n [27]. Its features are lsted below: substantal optmzaton power, due to the use of nternal don t-cares; scalable local computaton, due to the use of wndowng; computaton speed, due to the use of Boolean satsfablty for functonal manpulaton, and ablty to accommodate varous optmzaton objectves. Resubsttuton s a logc restructurng technque that modfes the local space of a mapped node by addng or removng ts fanns. When appled to a Boolean network, resubsttuton can be teratvely appled, amng at mnmzng a gven cost functon, such as area, delay, wre-length, etc. The man dea of the power-aware flavor of the resynthess algorthm, called PowerDC, s summarzed below. For detals on the basc algorthm and ts mplementaton, refer to [20][27]. Gven a mapped network, t s frst transformed nto an AIG by applyng Boolean decomposton of the logcal functons of the LUTs nto two-nput gates. Then sequental smulaton s appled to the AIG, resultng n the transton probabltes for all the AIG nodes. Ths nformaton s back-annotated to the mapped network to provde the swtchng actvty for all nodes n the mappng. Next, a subset of wres, called hot wres, s dentfed. Based on the experments, f the toggle rate for prmary nputs s 0.5, hot wres are those wth swtchng probablty greater than 0.4. Note that swtchng computed as a transton probablty may be hgher than 0.5 (but cannot exceed 1.0). Another way of defnng hot wres mght be a fxed rato (say, 10%) of the wres wth the hghest swtchng actvty. n2 n3 n4 Power dstrbuton Fgure 1: Metrcs used n PowerMap. The use of the cost functon (3) durng power-aware technology mappng s smlar to that of ts use n (2) n wreaware mappng [14]. The man dfferences are as follows. When comparng two cuts, ther area flows are compared frst, whle ther swtchng flows are used as the frst te-breaker,.e. f the area flows of two cuts are equal up to a certan delta (say, 0.001), the decson of what cut to use s made based on the swtchng flow. Smlarly, edge flow s used as the second te-breaker. Smlarly, n the second phase of technology mappng when local areas of cuts are compared [14], local swtchng s used as the frst te-breaker and edge flow s used as the second tebreaker. The local area of a cut s the sum of the areas of the LUTs n the MFFC of the cut. Smlarly, the local swtchng of a cut s the sum of swtchng actvtes of the LUTs n the cut s MFFC. Percentage % 90.00% 80.00% 70.00% 60.00% 50.00% 40.00% 30.00% 20.00% 10.00% 0.00% T5 T4 T3 T2 T1 Nodes 5.25% 1.09% 0.64% 0.84% 92.16% Wre 13.90% 1.62% 0.91% 1.29% 82.27% Pwr 85.90% 7.36% 2.68% 2.32% 1.74% Swtchng frequency (T5 s hot and T1 s cool) Nodes Wre Pwr Fgure 2: Dstrbuton of wres and nodes by toggle rate. Based on an experment descrbed n Secton 5, about 13.9% of wres are hot and these contrbute about 86% of the total power % of wres are cool but these only contrbute 1.74% of the total power. A detaled dstrbuton of wres (and nodes) by swtchng actvty and ther power contrbuton are shown n
4 Fgure 2. Snce the vast majorty of dynamc power s dsspated n a small percentage of hot wres, power reducton technques targetng hot wres are very effectve. After hot nodes are found, SAT-based resubsttuton [27] s appled targetng hot nodes as fanns of other mapped nodes. The goal s to remove or replace them by cooler fanns. To save runtme, swtchng actvty s not updated after ndvdual steps of resubsttuton. Ths s reasonable because, although Boolean functons of the nodes after resynthess wth observablty don t-cares may be changed, ther swtchng actvty does not change substantally. Even when t changes, the percentage of nodes whose hot/cool status s modfed durng resynthess, s typcally neglgble. 5 Expermental Results The algorthms of ths paper were mplemented n ABC [1]. Expermental evaluaton targetng 6-LUT mplementatons were performed usng a sute of 20 large ndustral desgns rangng n sze from 12K to 165K 6-LUTs. The experments were run on an Intel Xeon 3GHz 2-CPU 4-core computer wth 8GB of RAM. The resultng networks were verfed usng combnatonal equvalence checker n ABC (command cec). The power consumpton of a crcut s the sum of the power consumed by each wre and the power consumed by each wre s the product of the capactance and swtchng actvty for the wre. Because the netlsts were not placed n ths experment, the capactance of each wre was not known. We assumed a untcapactance model for each wre n our dynamc power calculatons. Note that ths power model assumes that each fanout connecton has the same capactance for a multple fanout net. The sequental smulator for estmatng the swtchng actvty was based on a new AIG package, called ga. The followng commands n ABC were extended to be power-aware. In each case, new swtch -p enables swtchng actvty mnmzaton: Technology mappng for FPGAs [23] (command f p). SAT-based resubsttuton [27] (command mfs p). Swtchng actvty report (command ps p). Several other commands, ncludng sequental synthess and mappng wth structural choces, were used n ths experment: Structural sequental cleanup [26] (command scl). Parttoned regster-correspondence computaton usng smple nducton [26] (command lcorr). Parttoned sgnal-correspondence computaton usng k-step nducton [26] (command scorr). Computaton of structural choces [3] (command dch). 5.1 Smulaton Runtme As mentoned, one of the man ssues wth sequental smulaton s potentally long runtme. In ths secton, we performed two experments. In the frst experment we showed that runnng SmSwtch offers affordable runtme for large desgns. In the second experment, we showed that the runtme of SmSwtch s very fast compared wth ACE-2.0 [16], a fast swtchng estmaton tool that s avalable to the publc. In the frst experment, four ndustral desgns rangng from 304K to 1.3M AIG nodes were smulated wth dfferent numbers of smulaton patterns, rangng from 2,560 to 20,480. The nput toggle rate was assumed to be 0.5. The results are shown n Table 1. Columns AIG and FF show the number of AIG nodes and regsters. The runtmes for dfferent szes of nputs patterns are shown n the last columns. Note that the runtmes are qute affordable even for Desgn C4 wth 1.3M AIG nodes (about 160 K 6-LUT after mappng). In most cases, 2,560 patterns were suffcent for node swtchng actvty rates to converge. Table 1: Runtme of SmSwtch. Desgn AIG FF Runtme for nputs patterns (seconds) C1 304K C2 362K C3 842K C4 1306K In the second experment, we compared the runtme of SmSwtch vs. ACE-2.0 on 14 ndustry desgns and 12 large academc benchmarks. The nput toggle rate s assumed to be 0.5 for both tools. The number of nput patterns s assumed to be 5,000 for both runs. All crcuts are decomposed to AIG netlsts before performng swtchng estmaton. The results, not shown n ths paper, lead to the followng observatons. SmSwtch can fnsh all the desgns n very affordable tme but ACE-2.0 gets 4 tmeouts n ndustry desgns. For ndustry desgns, the runtme of SmSwtch s 146+ tmes faster than ACE-2.0. For large academc benchmarks, the runtme of SmSwtch s 85+ tmes faster than ACE Power-Aware Optmzatons To evaluate the contrbuton of power-aware mappng and resynthess, three experments were performed, denoted Baselne, FullOpt, and PowerMap. Baselne corresponds to two runs of (dch; f e). It performs prorty-cut based technology mappng wth structure choces. WreMap [14] s not used for Baselne because WreMap s known to reduce power dsspaton as a by-product of wre count mnmzaton. FullOpt s the complete flow ncludng hgh-effort sequental and combnatonal logc synthess. Sequental synthess (scl; lcorr; scorr) [26] s run frst to remove sequentally equvalent nodes and regsters. Then, two teratons of combnatonal synthess and mappng wth structural choces are performed (dch; f). WreMap s used n ths flow. It reduces 10% of the wres on top of Baselne [14]. In the PowerMap run, FullOpt s used as the nput netlst followed by two runs (dch; f p), whch s the power-aware technology mappng developed n ths paper. In the PowerDC run, the results of PowerMap are used as nput followed by two teratons of power-aware resynthess (mfs p). The sequental smulator SmSwtch s used to drve the poweraware logc synthess. In sequental smulaton, 2,560 random nput patterns wth a toggle rate of 0.5 were used. The results are reported n Table 4. (Detaled results for some desgns n the table are omtted due to page lmtaton, but the averages are computed over all desgns.) Columns PI, PO, FF, and LUT show the number of prmary nputs, prmary outputs, flp-flops, and 6- nput LUTs. Columns Lv and Pwr show the number of logc levels and the total swtchng actvty of the network. The expermental results lead to the followng observatons: FullOpt mproved on Baselne n the number of regsters, LUTs, and logc levels by 14.8%, 12.2%, and 3%, respectvely. Power s reduced by 27%.
5 FullOpt, PowerMap, and PowerDC have the same number of regsters because only combnatonal synthess s used n PowerMap and PowerDC. PowerMap produces an addtonal 10.5% power reducton on top of FullOpt. PowerDC produces an addtonal 10.1% power reducton on top of PowerMap. The overall power reducton for PowerDC vs. FullOpt s 19.6%. The overall power reducton for PowerDC vs. Baselne s 41.9%. Note that when runnng PowerMap and PowerDC, the LUT count and logc delay s reduced slghtly. The addtonal optmzatons reduce power wthout any effect on desgn speed. To nvestgate robustness of the algorthm, we performed the same experments and changed the toggle rate of the nputs from 0.5 to The results are summarzed n Table 2. PowerMap produces an addtonal 9.8% power reducton on top of FullOpt PowerDC produces an addtonal 8.7% power reducton on top of PowerMap. The power reducton for PowerDC vs. FullOpt s 17.7%. The power reducton for PowerDC vs. Baselne s 39.7% Table 2: Comparson of power-optmzaton algorthms (nputs toggle rate s 0.25). Power BaseLne FullOpt PwrMap PwrDC Geomean Rato Rato Rato Wre Dstrbuton Analyss For the same sute of 20 desgns, the changes n the dstrbuton of wres by ther swtchng actvty as a result of resynthess s analyzed. The results are reported Fgure 3. Column T5 s the total number of wres whose swtchng probablty exceeds 0.4. These hot wres are targeted by power reducton. Column T4 s the total number of wres whose swtchng probablty s between 0.3 and 0.4. Columns T3, T2, and T1 are defned smlarly. Thus the T1 wres are the cool wres. In general, they don t contrbute much to the power because ther swtchng probablty s less than 0.1. The results n Fgure 3 lead to the followng observatons: For PowerMap vs. FullOpt, hot (T5) wres are reduced by 13.22%, and T4 are ncreased by 4.96%. Ths ndcates that PowerMap reduces the number of hot wres. For PowerDC vs. PowerMap, total wres are reduced by 2.7% but power s reduced by 10.1%. Thus, the resynthess has removed the rght wres,.e., the hot wres. Hot wres are reduced by 11.04%. The cooler wres are reduced also but the reducton s smaller n ths case. In addton to wre count reducton for hot wres, the wre dstrbuton n terms of toggle rates s also mproved. Fgure 4 shows the power dstrbuton before and after optmzaton. Wre and Wre2 are the percentages of wres, whle Pwr and Pwr2 are the percentages of power contrbutons before and after optmzaton. The chart leads to the followng conclusons: After FullOpt, 13.9% (82.3%) of the wres are hot (cool). After PowerMap, 11.4% (84.6%) of wres are hot (cool). After FullOpt, 85.9% (1.7%) of the power s contrbuted by hot (cool) wres. After PowerMap, 82.6% (2.0%) of power s contrbuted by hot (cool) wres. The above observatons show that the power contrbuted by hot wres s less than Baselne whle the power contrbuted by cool wres s more than Baselne. The concluson s that the poweraware logc synthess was able to shft the peak of power consumpton to lower-swtchng wres. Reducton (negatve s go Percentage 10.00% 5.00% 0.00% -5.00% % % PowrMap vs FullOpt Wre Comparson T5 T4 T3 T2 T1 Total Wrs PowerDc vs. PowrMap Fgure 3: The changes n wre ratos after two power-aware transforms % 80.00% 60.00% 40.00% 20.00% 0.00% Power dstrbuton before/after optmzaton T5 T4 T3 T2 T1 Wre 13.90% 1.62% 0.91% 1.29% 82.27% Wre % 1.79% 0.93% 1.29% 84.55% Pwr 85.90% 7.36% 2.68% 2.32% 1.74% Pwr % 9.48% 3.19% 2.69% 2.02% Swtchng frequency (temperature) Wre Wre2 Pwr Pwr2 Fgure 4: Power dstrbuton before/after optmzaton. 6 Conclusons Ths paper descrbes a toolbox for power-aware logc synthess and mappng. Estmaton of power consumpton at the early stages of the desgn flow s based on calculatng the probabltes of sgnal transton durng sequental smulaton. The toolbox also ncludes two technques for optmzng the desgn durng synthess and mappng to reduce swtchng actvty and thereby mnmze dynamc power consumpton: PowerMap: a power-aware LUT mapper that extends [23] to prortze cuts based on swtchng actvty of the nodes. PowerDC: a SAT-based engne for resubsttuton wth don t-cares that extends [27] to remove sgnals wth hgh swtchng actvty (so called hot sgnals ). Estmaton of power dsspaton s effcently performed by a new sequental smulator, SmSwtch. The estmaton converges for most ndustral desgns after a reasonable number of smulaton cycles. Ths allows for makng targeted heurstc decsons durng logc synthess and technology mappng Future work wll nclude: Speedng up swtchng actvty estmaton (t s estmated that the current mplementaton can be made 50% faster). Extendng swtchng actvty computaton to nclude gltch estmaton.
6 Implementng other power-aware commands, such as computng better choces to reduce power and logc restructurng to reduce power. References [1] Berkeley Logc Synthess and Verfcaton Group, ABC: A system for sequental synthess and verfcaton, Release 904xx. [2] Altera, Stratx II vs. Vrtex-4 power comparson and estmaton accuracy (whte paper). wp_s2v4_pwr_acc.pdf [3] J. Anderson and F. N. Najm, Power-aware Technology Mappng for LUT-Based FPGAs, IEEE Int. Conf. on Feld Programmable Technology, Dec. 2002, pp [4] R. Brayton and C. McMullen, The decomposton of logc functons, Proc. ICCAD 97, pp [5] S. Chatterjee, A. Mshchenko, R. Brayton, X. Wang, and T. Kam, Reducng structural bas n technology mappng, Proc. ICCAD '05, pp [6] D. Chen, J. Cong, and P. Pan, FPGA desgn automaton: A survey, Foundatons and Trends n Electronc Desgn Automaton, Vol. 1(3), November 2006, pp [7] D. Chen and J. Cong. DAOmap: A depth-optmal area optmzaton mappng algorthm for FPGA desgns, Proc. ICCAD 04, pp [8] L. Cheng, D. Chen, and M.D. Wong, GltchMap: FPGA technology mapper for low power consderng gltches. Proc. DAC '07. [9] J. Cong and Y. Dng, FlowMap: An optmal technology mappng algorthm for delay optmzaton n lookup-table based FPGA desgns, IEEE TCAD, Vol. 13(1), 1994, pp [10] J. Cong, C. Wu, and Y. Dng, Cut rankng and prunng: Enablng a general and effcent FPGA mappng soluton, Proc. FPGA 99, pp [11] R. J. Francs, J. Rose, and K. Chung, Chortle: A technology mappng program for lookup table-based feld programmable gate arrays, Proc. DAC 90, pp [12] A. Ghosh, S. Devadas, K. Keutzer, and J. Whte, Estmaton of average swtchng actvty n combnatonal and sequental crcuts Proc. DAC 92. [13] S. Gupta and J. Anderson, Optmzng FPGA power wth ISE desgn tools, Issue 60, 2007, pp , publcatons/xcellonlne/xcell_60/xc_pdf/p16-19_60-gupta.pdf [14] S. Jang, B. Chan, K. Chung, and A. Mshchenko, "WreMap: FGPA technology mappng for mproved routablty". Proc. FPGA '08. [15] J. Lamoureux and S.J.E. Wlton, On the Interacton between Power- Aware FPGA CAD Algorthms,, Proc. ICCAD, Nov [16] J. Lamoureux and S.J.E. Wlton, Actvty estmaton for Feld- Programmable Gate Arrays, Proc. Intl Conf. Feld-Prog. Logc and Applcatons (FPL), 2006, pp [17] J. N. Kozhaya and F. Najm, Accurate power estmaton for large sequental crcuts, Proc. ICCAD 97. [18] J. Lamoureux, Modelng and reducton of dynamc power n Feld- Programmable Gate Arrays, Ph.D. Thess, [19] V. Manohara-rajah, S. D. Brown, and Z. G. Vranesc, Heurstcs for area mnmzaton n LUT-based FPGA technology mappng, Proc. IWLS 04, pp [20] A. Mshchenko and R. Brayton, "SAT-based complete don't-care computaton for network optmzaton", Proc. DATE '05. [21] A. Mshchenko, S. Chatterjee, R. Brayton, and M. Ceselsk, "An ntegrated technology mappng envronment", Proc. IWLS '05, pp [22] A. Mshchenko, S. Chatterjee, and R. Brayton, "DAG-aware AIG rewrtng: A fresh look at combnatonal logc synthess", Proc. DAC '06, pp [23] A. Mshchenko, S. Cho, S. Chatterjee, and R. Brayton, "Combnatonal and sequental mappng wth prorty cuts", Proc. ICCAD '07, pp [24] A. Mshchenko, S. Chatterjee, and R. Brayton, "Improvements to technology mappng for LUT-based FPGAs", Proc. FPGA '06, pp [25] A. Mshchenko, R. K. Brayton, and S. Jang, "Global delay optmzaton usng structural choces", Proc. IWLS'08, pp [26] A. Mshchenko, M. L. Case, R. K. Brayton, and S. Jang, Scalable and scalably-verfable sequental synthess, Proc. ICCAD'08, pp [27] A. Mshchenko, R. Brayton, J.-H. R. Jang, S. Jang, "Scalable don'tcare-based logc optmzaton and resynthess", Proc. FPGA '09. [28] J. Najm, Transton densty: A new measure of actvty n dgtal crcuts, IEEE Trans. CAD, Vol. 12(2), 1993, pp [29] PowerPlay Power Analyss, Quartus II 9.0 Handbook, Vol. 3, [30] K. Roy, S. Mukhopadhyay, and H. Mahmood-Memand, Leakage current mechansms and leakage reducton technques n deepsubmcrometer CMOS crcuts, n Proceedngs of the IEEE, pages , 2002 [31] E. Todorovch, M. Glabert, G. Sutter, S. Lopez-Buedo, and E. Boemo, A tool for actvty estmaton n FPGAs, Proc. Intl. Conf. Feld-Prog. Logc (FPL), 2002, pp [32] Xlnx Power Estmator User Gude, products/desgn_resources/power_central/ug440.pdf [33] Xlnx Whte Paper: Power Consumpton n 65 nm FPGAs pdf Table 4: Comparson of power-optmzaton algorthms (nput toggle rate s 0.5). Desgn Statstcs Base FullOpt PowerMap PowerDC name PI PO FF LUT Lv Pwr FF LUT Lv Pwr FF LUT Lv Pwr FF LUT Lv Pwr D D D D D D Geom Rato Rato Rato
A Power Optimization Toolbox for Logic Synthesis and Mapping
A Power Optimization Toolbox for Logic Synthesis and Mapping Stephen Jang Kevin Chung Alan Mishchenko Robert Brayton Xilinx Inc. Department of EECS, University of California, Berkeley {sjang, kevinc}@xilinx.com
More informationA Binarization Algorithm specialized on Document Images and Photos
A Bnarzaton Algorthm specalzed on Document mages and Photos Ergna Kavalleratou Dept. of nformaton and Communcaton Systems Engneerng Unversty of the Aegean kavalleratou@aegean.gr Abstract n ths paper, a
More informationDetermining the Optimal Bandwidth Based on Multi-criterion Fusion
Proceedngs of 01 4th Internatonal Conference on Machne Learnng and Computng IPCSIT vol. 5 (01) (01) IACSIT Press, Sngapore Determnng the Optmal Bandwdth Based on Mult-crteron Fuson Ha-L Lang 1+, Xan-Mn
More informationCPE 628 Chapter 2 Design for Testability. Dr. Rhonda Kay Gaede UAH. UAH Chapter Introduction
Chapter 2 Desgn for Testablty Dr Rhonda Kay Gaede UAH 2 Introducton Dffcultes n and the states of sequental crcuts led to provdng drect access for storage elements, whereby selected storage elements are
More informationCompiler Design. Spring Register Allocation. Sample Exercises and Solutions. Prof. Pedro C. Diniz
Compler Desgn Sprng 2014 Regster Allocaton Sample Exercses and Solutons Prof. Pedro C. Dnz USC / Informaton Scences Insttute 4676 Admralty Way, Sute 1001 Marna del Rey, Calforna 90292 pedro@s.edu Regster
More informationThe Greedy Method. Outline and Reading. Change Money Problem. Greedy Algorithms. Applications of the Greedy Strategy. The Greedy Method Technique
//00 :0 AM Outlne and Readng The Greedy Method The Greedy Method Technque (secton.) Fractonal Knapsack Problem (secton..) Task Schedulng (secton..) Mnmum Spannng Trees (secton.) Change Money Problem Greedy
More informationHigh-Level Power Modeling of CPLDs and FPGAs
Hgh-Level Power Modelng of CPLs and FPGAs L Shang and Nraj K. Jha epartment of Electrcal Engneerng Prnceton Unversty {lshang, jha}@ee.prnceton.edu Abstract In ths paper, we present a hgh-level power modelng
More informationParallel matrix-vector multiplication
Appendx A Parallel matrx-vector multplcaton The reduced transton matrx of the three-dmensonal cage model for gel electrophoress, descrbed n secton 3.2, becomes excessvely large for polymer lengths more
More informationExplicit Formulas and Efficient Algorithm for Moment Computation of Coupled RC Trees with Lumped and Distributed Elements
Explct Formulas and Effcent Algorthm for Moment Computaton of Coupled RC Trees wth Lumped and Dstrbuted Elements Qngan Yu and Ernest S.Kuh Electroncs Research Lab. Unv. of Calforna at Berkeley Berkeley
More informationA Fast Content-Based Multimedia Retrieval Technique Using Compressed Data
A Fast Content-Based Multmeda Retreval Technque Usng Compressed Data Borko Furht and Pornvt Saksobhavvat NSF Multmeda Laboratory Florda Atlantc Unversty, Boca Raton, Florda 3343 ABSTRACT In ths paper,
More informationThe Codesign Challenge
ECE 4530 Codesgn Challenge Fall 2007 Hardware/Software Codesgn The Codesgn Challenge Objectves In the codesgn challenge, your task s to accelerate a gven software reference mplementaton as fast as possble.
More informationVRT012 User s guide V0.1. Address: Žirmūnų g. 27, Vilnius LT-09105, Phone: (370-5) , Fax: (370-5) ,
VRT012 User s gude V0.1 Thank you for purchasng our product. We hope ths user-frendly devce wll be helpful n realsng your deas and brngng comfort to your lfe. Please take few mnutes to read ths manual
More informationSimulation Based Analysis of FAST TCP using OMNET++
Smulaton Based Analyss of FAST TCP usng OMNET++ Umar ul Hassan 04030038@lums.edu.pk Md Term Report CS678 Topcs n Internet Research Sprng, 2006 Introducton Internet traffc s doublng roughly every 3 months
More informationLearning the Kernel Parameters in Kernel Minimum Distance Classifier
Learnng the Kernel Parameters n Kernel Mnmum Dstance Classfer Daoqang Zhang 1,, Songcan Chen and Zh-Hua Zhou 1* 1 Natonal Laboratory for Novel Software Technology Nanjng Unversty, Nanjng 193, Chna Department
More informationOutline. Digital Systems. C.2: Gates, Truth Tables and Logic Equations. Truth Tables. Logic Gates 9/8/2011
9/8/2 2 Outlne Appendx C: The Bascs of Logc Desgn TDT4255 Computer Desgn Case Study: TDT4255 Communcaton Module Lecture 2 Magnus Jahre 3 4 Dgtal Systems C.2: Gates, Truth Tables and Logc Equatons All sgnals
More informationA Fast Visual Tracking Algorithm Based on Circle Pixels Matching
A Fast Vsual Trackng Algorthm Based on Crcle Pxels Matchng Zhqang Hou hou_zhq@sohu.com Chongzhao Han czhan@mal.xjtu.edu.cn Ln Zheng Abstract: A fast vsual trackng algorthm based on crcle pxels matchng
More informationAn Optimal Algorithm for Prufer Codes *
J. Software Engneerng & Applcatons, 2009, 2: 111-115 do:10.4236/jsea.2009.22016 Publshed Onlne July 2009 (www.scrp.org/journal/jsea) An Optmal Algorthm for Prufer Codes * Xaodong Wang 1, 2, Le Wang 3,
More informationLoad Balancing for Hex-Cell Interconnection Network
Int. J. Communcatons, Network and System Scences,,, - Publshed Onlne Aprl n ScRes. http://www.scrp.org/journal/jcns http://dx.do.org/./jcns.. Load Balancng for Hex-Cell Interconnecton Network Saher Manaseer,
More informationParallelism for Nested Loops with Non-uniform and Flow Dependences
Parallelsm for Nested Loops wth Non-unform and Flow Dependences Sam-Jn Jeong Dept. of Informaton & Communcaton Engneerng, Cheonan Unversty, 5, Anseo-dong, Cheonan, Chungnam, 330-80, Korea. seong@cheonan.ac.kr
More informationEECS 730 Introduction to Bioinformatics Sequence Alignment. Luke Huan Electrical Engineering and Computer Science
EECS 730 Introducton to Bonformatcs Sequence Algnment Luke Huan Electrcal Engneerng and Computer Scence http://people.eecs.ku.edu/~huan/ HMM Π s a set of states Transton Probabltes a kl Pr( l 1 k Probablty
More informationConditional Speculative Decimal Addition*
Condtonal Speculatve Decmal Addton Alvaro Vazquez and Elsardo Antelo Dep. of Electronc and Computer Engneerng Unv. of Santago de Compostela, Span Ths work was supported n part by Xunta de Galca under grant
More informationAn Iterative Solution Approach to Process Plant Layout using Mixed Integer Optimisation
17 th European Symposum on Computer Aded Process Engneerng ESCAPE17 V. Plesu and P.S. Agach (Edtors) 2007 Elsever B.V. All rghts reserved. 1 An Iteratve Soluton Approach to Process Plant Layout usng Mxed
More informationSmoothing Spline ANOVA for variable screening
Smoothng Splne ANOVA for varable screenng a useful tool for metamodels tranng and mult-objectve optmzaton L. Rcco, E. Rgon, A. Turco Outlne RSM Introducton Possble couplng Test case MOO MOO wth Game Theory
More informationELEC 377 Operating Systems. Week 6 Class 3
ELEC 377 Operatng Systems Week 6 Class 3 Last Class Memory Management Memory Pagng Pagng Structure ELEC 377 Operatng Systems Today Pagng Szes Vrtual Memory Concept Demand Pagng ELEC 377 Operatng Systems
More informationConfiguration Management in Multi-Context Reconfigurable Systems for Simultaneous Performance and Power Optimizations*
Confguraton Management n Mult-Context Reconfgurable Systems for Smultaneous Performance and Power Optmzatons* Rafael Maestre, Mlagros Fernandez Departamento de Arqutectura de Computadores y Automátca Unversdad
More informationTPL-Aware Displacement-driven Detailed Placement Refinement with Coloring Constraints
TPL-ware Dsplacement-drven Detaled Placement Refnement wth Colorng Constrants Tao Ln Iowa State Unversty tln@astate.edu Chrs Chu Iowa State Unversty cnchu@astate.edu BSTRCT To mnmze the effect of process
More informationNAG Fortran Library Chapter Introduction. G10 Smoothing in Statistics
Introducton G10 NAG Fortran Lbrary Chapter Introducton G10 Smoothng n Statstcs Contents 1 Scope of the Chapter... 2 2 Background to the Problems... 2 2.1 Smoothng Methods... 2 2.2 Smoothng Splnes and Regresson
More informationSteps for Computing the Dissimilarity, Entropy, Herfindahl-Hirschman and. Accessibility (Gravity with Competition) Indices
Steps for Computng the Dssmlarty, Entropy, Herfndahl-Hrschman and Accessblty (Gravty wth Competton) Indces I. Dssmlarty Index Measurement: The followng formula can be used to measure the evenness between
More informationContent Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers
IOSR Journal of Electroncs and Communcaton Engneerng (IOSR-JECE) e-issn: 78-834,p- ISSN: 78-8735.Volume 9, Issue, Ver. IV (Mar - Apr. 04), PP 0-07 Content Based Image Retreval Usng -D Dscrete Wavelet wth
More informationSimulation: Solving Dynamic Models ABE 5646 Week 11 Chapter 2, Spring 2010
Smulaton: Solvng Dynamc Models ABE 5646 Week Chapter 2, Sprng 200 Week Descrpton Readng Materal Mar 5- Mar 9 Evaluatng [Crop] Models Comparng a model wth data - Graphcal, errors - Measures of agreement
More informationProblem Definitions and Evaluation Criteria for Computational Expensive Optimization
Problem efntons and Evaluaton Crtera for Computatonal Expensve Optmzaton B. Lu 1, Q. Chen and Q. Zhang 3, J. J. Lang 4, P. N. Suganthan, B. Y. Qu 6 1 epartment of Computng, Glyndwr Unversty, UK Faclty
More informationHelsinki University Of Technology, Systems Analysis Laboratory Mat Independent research projects in applied mathematics (3 cr)
Helsnk Unversty Of Technology, Systems Analyss Laboratory Mat-2.08 Independent research projects n appled mathematcs (3 cr) "! #$&% Antt Laukkanen 506 R ajlaukka@cc.hut.f 2 Introducton...3 2 Multattrbute
More informationA mathematical programming approach to the analysis, design and scheduling of offshore oilfields
17 th European Symposum on Computer Aded Process Engneerng ESCAPE17 V. Plesu and P.S. Agach (Edtors) 2007 Elsever B.V. All rghts reserved. 1 A mathematcal programmng approach to the analyss, desgn and
More informationSubspace clustering. Clustering. Fundamental to all clustering techniques is the choice of distance measure between data points;
Subspace clusterng Clusterng Fundamental to all clusterng technques s the choce of dstance measure between data ponts; D q ( ) ( ) 2 x x = x x, j k = 1 k jk Squared Eucldean dstance Assumpton: All features
More informationWishing you all a Total Quality New Year!
Total Qualty Management and Sx Sgma Post Graduate Program 214-15 Sesson 4 Vnay Kumar Kalakband Assstant Professor Operatons & Systems Area 1 Wshng you all a Total Qualty New Year! Hope you acheve Sx sgma
More informationNewton-Raphson division module via truncated multipliers
Newton-Raphson dvson module va truncated multplers Alexandar Tzakov Department of Electrcal and Computer Engneerng Illnos Insttute of Technology Chcago,IL 60616, USA Abstract Reducton n area and power
More informationRepeater Insertion for Two-Terminal Nets in Three-Dimensional Integrated Circuits
Repeater Inserton for Two-Termnal Nets n Three-Dmensonal Integrated Crcuts Hu Xu, Vasls F. Pavlds, and Govann De Mchel LSI - EPFL, CH-5, Swtzerland, {hu.xu,vasleos.pavlds,govann.demchel}@epfl.ch Abstract.
More informationImproving The Test Quality for Scan-based BIST Using A General Test Application Scheme
_ Improvng The Test Qualty for can-based BIT Usng A General Test Applcaton cheme Huan-Chh Tsa Kwang-Tng Cheng udpta Bhawmk Department of ECE Bell Laboratores Unversty of Calforna Lucent Technologes anta
More informationA fault tree analysis strategy using binary decision diagrams
Loughborough Unversty Insttutonal Repostory A fault tree analyss strategy usng bnary decson dagrams Ths tem was submtted to Loughborough Unversty's Insttutonal Repostory by the/an author. Addtonal Informaton:
More informationAssignment # 2. Farrukh Jabeen Algorithms 510 Assignment #2 Due Date: June 15, 2009.
Farrukh Jabeen Algorthms 51 Assgnment #2 Due Date: June 15, 29. Assgnment # 2 Chapter 3 Dscrete Fourer Transforms Implement the FFT for the DFT. Descrbed n sectons 3.1 and 3.2. Delverables: 1. Concse descrpton
More informationAn Entropy-Based Approach to Integrated Information Needs Assessment
Dstrbuton Statement A: Approved for publc release; dstrbuton s unlmted. An Entropy-Based Approach to ntegrated nformaton Needs Assessment June 8, 2004 Wllam J. Farrell Lockheed Martn Advanced Technology
More informationMeta-heuristics for Multidimensional Knapsack Problems
2012 4th Internatonal Conference on Computer Research and Development IPCSIT vol.39 (2012) (2012) IACSIT Press, Sngapore Meta-heurstcs for Multdmensonal Knapsack Problems Zhbao Man + Computer Scence Department,
More informationConstruction of ROBDDs. area. that such graphs, under some conditions, can be easily manipulated.
A Study of Composton Schemes for Mxed Apply/Compose Based Constructon of s A Narayan 1 S P Khatr 1 J Jan 2 M Fujta 2 R K Brayton 1 A Sangovann-Vncentell 1 Abstract Reduced Ordered Bnary Decson Dagrams
More informationHermite Splines in Lie Groups as Products of Geodesics
Hermte Splnes n Le Groups as Products of Geodescs Ethan Eade Updated May 28, 2017 1 Introducton 1.1 Goal Ths document defnes a curve n the Le group G parametrzed by tme and by structural parameters n the
More informationImprovement of Spatial Resolution Using BlockMatching Based Motion Estimation and Frame. Integration
Improvement of Spatal Resoluton Usng BlockMatchng Based Moton Estmaton and Frame Integraton Danya Suga and Takayuk Hamamoto Graduate School of Engneerng, Tokyo Unversty of Scence, 6-3-1, Nuku, Katsuska-ku,
More informationGeneration of Compact Single-Detect Stuck-At Test Sets Targeting Unmodeled Defects
Generaton of Compact Sngle-Detect Stuck-At Test Sets Targetng Unmodeled Defects Xrysovalants Kavousanos and Krshnendu Chakrabarty * Abstract- We present a new method to generate compact stuckat test sets
More informationFeature Reduction and Selection
Feature Reducton and Selecton Dr. Shuang LIANG School of Software Engneerng TongJ Unversty Fall, 2012 Today s Topcs Introducton Problems of Dmensonalty Feature Reducton Statstc methods Prncpal Components
More informationA Min-Cost Flow Based Detailed Router for FPGAs
A Mn-Cost Flow Based Detaled Router for FPGAs eokn Lee Dept. of ECE The Unversty of Texas at Austn Austn, TX 78712 Yongseok Cheon Dept. of Computer cences The Unversty of Texas at Austn Austn, TX 78712
More informationImproved Symoblic Simulation By Dynamic Funtional Space Partitioning
Improved Symoblc Smulaton By Dynamc Funtonal Space Parttonng Tao Feng, L-.Wang, Kwang-Tng heng Department of EE, U-Santa Barbara, U.S.A tfeng,lcwang, tmcheng @ece.ucsb.edu Andy -. Ln adence Desgn Systems,
More informationA MOVING MESH APPROACH FOR SIMULATION BUDGET ALLOCATION ON CONTINUOUS DOMAINS
Proceedngs of the Wnter Smulaton Conference M E Kuhl, N M Steger, F B Armstrong, and J A Jones, eds A MOVING MESH APPROACH FOR SIMULATION BUDGET ALLOCATION ON CONTINUOUS DOMAINS Mark W Brantley Chun-Hung
More informationEfficient Distributed File System (EDFS)
Effcent Dstrbuted Fle System (EDFS) (Sem-Centralzed) Debessay(Debsh) Fesehaye, Rahul Malk & Klara Naherstedt Unversty of Illnos-Urbana Champagn Contents Problem Statement, Related Work, EDFS Desgn Rate
More informationLoad-Balanced Anycast Routing
Load-Balanced Anycast Routng Chng-Yu Ln, Jung-Hua Lo, and Sy-Yen Kuo Department of Electrcal Engneerng atonal Tawan Unversty, Tape, Tawan sykuo@cc.ee.ntu.edu.tw Abstract For fault-tolerance and load-balance
More informationFor instance, ; the five basic number-sets are increasingly more n A B & B A A = B (1)
Secton 1.2 Subsets and the Boolean operatons on sets If every element of the set A s an element of the set B, we say that A s a subset of B, or that A s contaned n B, or that B contans A, and we wrte A
More informationCHAPTER 4 PARALLEL PREFIX ADDER
93 CHAPTER 4 PARALLEL PREFIX ADDER 4.1 INTRODUCTION VLSI Integer adders fnd applcatons n Arthmetc and Logc Unts (ALUs), mcroprocessors and memory addressng unts. Speed of the adder often decdes the mnmum
More informationCourse Introduction. Algorithm 8/31/2017. COSC 320 Advanced Data Structures and Algorithms. COSC 320 Advanced Data Structures and Algorithms
Course Introducton Course Topcs Exams, abs, Proects A quc loo at a few algorthms 1 Advanced Data Structures and Algorthms Descrpton: We are gong to dscuss algorthm complexty analyss, algorthm desgn technques
More informationQuality Improvement Algorithm for Tetrahedral Mesh Based on Optimal Delaunay Triangulation
Intellgent Informaton Management, 013, 5, 191-195 Publshed Onlne November 013 (http://www.scrp.org/journal/m) http://dx.do.org/10.36/m.013.5601 Qualty Improvement Algorthm for Tetrahedral Mesh Based on
More informationTest-Cost Modeling and Optimal Test-Flow Selection of 3D-Stacked ICs
Test-Cost Modelng and Optmal Test-Flow Selecton of 3D-Stacked ICs Mukesh Agrawal, Student Member, IEEE, and Krshnendu Chakrabarty, Fellow, IEEE Abstract Three-dmensonal (3D) ntegraton s an attractve technology
More informationDynamic Voltage Scaling of Supply and Body Bias Exploiting Software Runtime Distribution
Dynamc Voltage Scalng of Supply and Body Bas Explotng Software Runtme Dstrbuton Sungpack Hong EE Department Stanford Unversty Sungjoo Yoo, Byeong Bn, Kyu-Myung Cho, Soo-Kwan Eo Samsung Electroncs Taehwan
More informationHigh-Boost Mesh Filtering for 3-D Shape Enhancement
Hgh-Boost Mesh Flterng for 3-D Shape Enhancement Hrokazu Yagou Λ Alexander Belyaev y Damng We z Λ y z ; ; Shape Modelng Laboratory, Unversty of Azu, Azu-Wakamatsu 965-8580 Japan y Computer Graphcs Group,
More informationTerm Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task
Proceedngs of NTCIR-6 Workshop Meetng, May 15-18, 2007, Tokyo, Japan Term Weghtng Classfcaton System Usng the Ch-square Statstc for the Classfcaton Subtask at NTCIR-6 Patent Retreval Task Kotaro Hashmoto
More informationReducing Frame Rate for Object Tracking
Reducng Frame Rate for Object Trackng Pavel Korshunov 1 and We Tsang Oo 2 1 Natonal Unversty of Sngapore, Sngapore 11977, pavelkor@comp.nus.edu.sg 2 Natonal Unversty of Sngapore, Sngapore 11977, oowt@comp.nus.edu.sg
More informationOptimizing Document Scoring for Query Retrieval
Optmzng Document Scorng for Query Retreval Brent Ellwen baellwe@cs.stanford.edu Abstract The goal of ths project was to automate the process of tunng a document query engne. Specfcally, I used machne learnng
More informationComparison of Heuristics for Scheduling Independent Tasks on Heterogeneous Distributed Environments
Comparson of Heurstcs for Schedulng Independent Tasks on Heterogeneous Dstrbuted Envronments Hesam Izakan¹, Ath Abraham², Senor Member, IEEE, Václav Snášel³ ¹ Islamc Azad Unversty, Ramsar Branch, Ramsar,
More informationHow Accurately Can We Model Timing In A Placement Engine?
How Accurately Can We Model Tmng In A Placement Engne? Amt Chowdhary, Karth Raagopal, Satsh Venatesan, Tung Cao, Vladmr Tourn, Yegna Parasuram, Bll Halpn Intel Corporaton Serra Desgn Automaton Synplcty,
More informationTECHNIQUE OF FORMATION HOMOGENEOUS SAMPLE SAME OBJECTS. Muradaliyev A.Z.
TECHNIQUE OF FORMATION HOMOGENEOUS SAMPLE SAME OBJECTS Muradalyev AZ Azerbajan Scentfc-Research and Desgn-Prospectng Insttute of Energetc AZ1012, Ave HZardab-94 E-mal:aydn_murad@yahoocom Importance of
More informationProper Choice of Data Used for the Estimation of Datum Transformation Parameters
Proper Choce of Data Used for the Estmaton of Datum Transformaton Parameters Hakan S. KUTOGLU, Turkey Key words: Coordnate systems; transformaton; estmaton, relablty. SUMMARY Advances n technologes and
More informationAdvanced Computer Networks
Char of Network Archtectures and Servces Department of Informatcs Techncal Unversty of Munch Note: Durng the attendance check a stcker contanng a unque QR code wll be put on ths exam. Ths QR code contans
More information6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour
6.854 Advanced Algorthms Petar Maymounkov Problem Set 11 (November 23, 2005) Wth: Benjamn Rossman, Oren Wemann, and Pouya Kheradpour Problem 1. We reduce vertex cover to MAX-SAT wth weghts, such that the
More informationCACHE MEMORY DESIGN FOR INTERNET PROCESSORS
CACHE MEMORY DESIGN FOR INTERNET PROCESSORS WE EVALUATE A SERIES OF THREE PROGRESSIVELY MORE AGGRESSIVE ROUTING-TABLE CACHE DESIGNS AND DEMONSTRATE THAT THE INCORPORATION OF HARDWARE CACHES INTO INTERNET
More informationMulti-objective Design Optimization of MCM Placement
Proceedngs of the 5th WSEAS Int. Conf. on Instrumentaton, Measurement, Crcuts and Systems, Hangzhou, Chna, Aprl 6-8, 26 (pp56-6) Mult-objectve Desgn Optmzaton of MCM Placement Chng-Ma Ko ab, Yu-Jung Huang
More informationCluster Analysis of Electrical Behavior
Journal of Computer and Communcatons, 205, 3, 88-93 Publshed Onlne May 205 n ScRes. http://www.scrp.org/ournal/cc http://dx.do.org/0.4236/cc.205.350 Cluster Analyss of Electrcal Behavor Ln Lu Ln Lu, School
More informationAnalysis of Min Sum Iterative Decoder using Buffer Insertion
Analyss of Mn Sum Iteratve ecoder usng Buffer Inserton Saravanan Swapna M.E II year, ept of ECE SSN College of Engneerng M. Anbuselv Assstant Professor, ept of ECE SSN College of Engneerng S.Salvahanan
More informationAn Efficient Garbage Collection for Flash Memory-Based Virtual Memory Systems
S. J and D. Shn: An Effcent Garbage Collecton for Flash Memory-Based Vrtual Memory Systems 2355 An Effcent Garbage Collecton for Flash Memory-Based Vrtual Memory Systems Seunggu J and Dongkun Shn, Member,
More informationData Representation in Digital Design, a Single Conversion Equation and a Formal Languages Approach
Data Representaton n Dgtal Desgn, a Sngle Converson Equaton and a Formal Languages Approach Hassan Farhat Unversty of Nebraska at Omaha Abstract- In the study of data representaton n dgtal desgn and computer
More informationA Saturation Binary Neural Network for Crossbar Switching Problem
A Saturaton Bnary Neural Network for Crossbar Swtchng Problem Cu Zhang 1, L-Qng Zhao 2, and Rong-Long Wang 2 1 Department of Autocontrol, Laonng Insttute of Scence and Technology, Benx, Chna bxlkyzhangcu@163.com
More informationDesign of a Real Time FPGA-based Three Dimensional Positioning Algorithm
Desgn of a Real Tme FPGA-based Three Dmensonal Postonng Algorthm Nathan G. Johnson-Wllams, Student Member IEEE, Robert S. Myaoka, Member IEEE, Xaol L, Student Member IEEE, Tom K. Lewellen, Fellow IEEE,
More informationSum of Linear and Fractional Multiobjective Programming Problem under Fuzzy Rules Constraints
Australan Journal of Basc and Appled Scences, 2(4): 1204-1208, 2008 ISSN 1991-8178 Sum of Lnear and Fractonal Multobjectve Programmng Problem under Fuzzy Rules Constrants 1 2 Sanjay Jan and Kalash Lachhwan
More informationParallel Inverse Halftoning by Look-Up Table (LUT) Partitioning
Parallel Inverse Halftonng by Look-Up Table (LUT) Parttonng Umar F. Sddq and Sadq M. Sat umar@ccse.kfupm.edu.sa, sadq@kfupm.edu.sa KFUPM Box: Department of Computer Engneerng, Kng Fahd Unversty of Petroleum
More informationFINDING IMPORTANT NODES IN SOCIAL NETWORKS BASED ON MODIFIED PAGERANK
FINDING IMPORTANT NODES IN SOCIAL NETWORKS BASED ON MODIFIED PAGERANK L-qng Qu, Yong-quan Lang 2, Jng-Chen 3, 2 College of Informaton Scence and Technology, Shandong Unversty of Scence and Technology,
More informationObstacle-Aware Routing Problem in. a Rectangular Mesh Network
Appled Mathematcal Scences, Vol. 9, 015, no. 14, 653-663 HIKARI Ltd, www.m-hkar.com http://dx.do.org/10.1988/ams.015.411911 Obstacle-Aware Routng Problem n a Rectangular Mesh Network Norazah Adzhar Department
More informationA Multilevel Analytical Placement for 3D ICs
A Multlevel Analytcal Placement for 3D ICs Jason Cong, and Guoje Luo Computer Scence Department Unversty of Calforna, Los Angeles Calforna NanoSystems Insttute Los Angeles, CA 90095, USA Tel : (30) 06-775
More informationProgramming in Fortran 90 : 2017/2018
Programmng n Fortran 90 : 2017/2018 Programmng n Fortran 90 : 2017/2018 Exercse 1 : Evaluaton of functon dependng on nput Wrte a program who evaluate the functon f (x,y) for any two user specfed values
More information3. CR parameters and Multi-Objective Fitness Function
3 CR parameters and Mult-objectve Ftness Functon 41 3. CR parameters and Mult-Objectve Ftness Functon 3.1. Introducton Cogntve rados dynamcally confgure the wreless communcaton system, whch takes beneft
More informationSupport Vector Machines
/9/207 MIST.6060 Busness Intellgence and Data Mnng What are Support Vector Machnes? Support Vector Machnes Support Vector Machnes (SVMs) are supervsed learnng technques that analyze data and recognze patterns.
More informationOptimal Workload-based Weighted Wavelet Synopses
Optmal Workload-based Weghted Wavelet Synopses Yoss Matas School of Computer Scence Tel Avv Unversty Tel Avv 69978, Israel matas@tau.ac.l Danel Urel School of Computer Scence Tel Avv Unversty Tel Avv 69978,
More informationStorage Binding in RTL synthesis
Storage Bndng n RTL synthess Pe Zhang Danel D. Gajsk Techncal Report ICS-0-37 August 0th, 200 Center for Embedded Computer Systems Department of Informaton and Computer Scence Unersty of Calforna, Irne
More informationAn Efficient Genetic Algorithm with Fuzzy c-means Clustering for Traveling Salesman Problem
An Effcent Genetc Algorthm wth Fuzzy c-means Clusterng for Travelng Salesman Problem Jong-Won Yoon and Sung-Bae Cho Dept. of Computer Scence Yonse Unversty Seoul, Korea jwyoon@sclab.yonse.ac.r, sbcho@cs.yonse.ac.r
More informationMemory Modeling in ESL-RTL Equivalence Checking
11.4 Memory Modelng n ESL-RTL Equvalence Checkng Alfred Koelbl 2025 NW Cornelus Pass Rd. Hllsboro, OR 97124 koelbl@synopsys.com Jerry R. Burch 2025 NW Cornelus Pass Rd. Hllsboro, OR 97124 burch@synopsys.com
More informationSLAM Summer School 2006 Practical 2: SLAM using Monocular Vision
SLAM Summer School 2006 Practcal 2: SLAM usng Monocular Vson Javer Cvera, Unversty of Zaragoza Andrew J. Davson, Imperal College London J.M.M Montel, Unversty of Zaragoza. josemar@unzar.es, jcvera@unzar.es,
More informationVirtual Machine Migration based on Trust Measurement of Computer Node
Appled Mechancs and Materals Onlne: 2014-04-04 ISSN: 1662-7482, Vols. 536-537, pp 678-682 do:10.4028/www.scentfc.net/amm.536-537.678 2014 Trans Tech Publcatons, Swtzerland Vrtual Machne Mgraton based on
More informationSENSITIVITY ANALYSIS IN LINEAR PROGRAMMING USING A CALCULATOR
SENSITIVITY ANALYSIS IN LINEAR PROGRAMMING USING A CALCULATOR Judth Aronow Rchard Jarvnen Independent Consultant Dept of Math/Stat 559 Frost Wnona State Unversty Beaumont, TX 7776 Wnona, MN 55987 aronowju@hal.lamar.edu
More informationMathematics 256 a course in differential equations for engineering students
Mathematcs 56 a course n dfferental equatons for engneerng students Chapter 5. More effcent methods of numercal soluton Euler s method s qute neffcent. Because the error s essentally proportonal to the
More informationA Unified Framework for Semantics and Feature Based Relevance Feedback in Image Retrieval Systems
A Unfed Framework for Semantcs and Feature Based Relevance Feedback n Image Retreval Systems Ye Lu *, Chunhu Hu 2, Xngquan Zhu 3*, HongJang Zhang 2, Qang Yang * School of Computng Scence Smon Fraser Unversty
More informationAnalysis of Continuous Beams in General
Analyss of Contnuous Beams n General Contnuous beams consdered here are prsmatc, rgdly connected to each beam segment and supported at varous ponts along the beam. onts are selected at ponts of support,
More informationA Frame Packing Mechanism Using PDO Communication Service within CANopen
28 A Frame Packng Mechansm Usng PDO Communcaton Servce wthn CANopen Mnkoo Kang and Kejn Park Dvson of Industral & Informaton Systems Engneerng, Ajou Unversty, Suwon, Gyeongg-do, South Korea Summary The
More informationModeling Multiple Input Switching of CMOS Gates in DSM Technology Using HDMR
1 Modelng Multple Input Swtchng of CMOS Gates n DSM Technology Usng HDMR Jayashree Srdharan and Tom Chen Dept. of Electrcal and Computer Engneerng Colorado State Unversty, Fort Collns, CO, 8523, USA (jaya@engr.colostate.edu,
More informationComplex Numbers. Now we also saw that if a and b were both positive then ab = a b. For a second let s forget that restriction and do the following.
Complex Numbers The last topc n ths secton s not really related to most of what we ve done n ths chapter, although t s somewhat related to the radcals secton as we wll see. We also won t need the materal
More informationA RECONFIGURABLE ARCHITECTURE FOR MULTI-GIGABIT SPEED CONTENT-BASED ROUTING. James Moscola, Young H. Cho, John W. Lockwood
A RECONFIGURABLE ARCHITECTURE FOR MULTI-GIGABIT SPEED CONTENT-BASED ROUTING James Moscola, Young H. Cho, John W. Lockwood Dept. of Computer Scence and Engneerng Washngton Unversty, St. Lous, MO {jmm5,
More informationModule Management Tool in Software Development Organizations
Journal of Computer Scence (5): 8-, 7 ISSN 59-66 7 Scence Publcatons Management Tool n Software Development Organzatons Ahmad A. Al-Rababah and Mohammad A. Al-Rababah Faculty of IT, Al-Ahlyyah Amman Unversty,
More informationRun-Time Operator State Spilling for Memory Intensive Long-Running Queries
Run-Tme Operator State Spllng for Memory Intensve Long-Runnng Queres Bn Lu, Yal Zhu, and lke A. Rundenstener epartment of Computer Scence, Worcester Polytechnc Insttute Worcester, Massachusetts, USA {bnlu,
More information