Behavior-Level Observability Analysis for Operation Gating in Low-Power Behavioral Synthesis


JASON CONG, BIN LIU, RUPAK MAJUMDAR
University of California, Los Angeles
and
ZHIRU ZHANG
AutoESL Design Technologies, Inc.

Many techniques for power reduction in advanced RTL synthesis tools rely explicitly or implicitly on observability don't-care conditions. In this article we propose a systematic approach to maximize the effectiveness of these techniques by generating power-friendly RTL descriptions in behavioral synthesis. This is done using operation gating, that is, explicitly adding a predicate to an operation based on its observability condition, so that the operation, once identified as unobservable at runtime, can be avoided using RTL power optimization techniques such as clock gating. We first introduce the concept of behavior-level observability and its approximations in the context of behavioral synthesis. We then propose an efficient procedure to compute an approximated behavior-level observability of every operation in a dataflow graph. Unlike previous techniques which work at the bit level in Boolean networks, our method is able to perform analysis at the word level, and thus avoids most computation effort with a reasonable approximation. Our algorithm exploits the observability-masking nature of some Boolean operations, as well as the select operation, and allows certain forms of other knowledge to be considered for stronger observability conditions. The approximation is proved exact for (acyclic) dataflow graphs when non-Boolean operations other than select are treated as black boxes. The behavior-level observability condition obtained by our analysis can be used to guide the operation scheduler to optimize the efficiency of operation gating. In a set of experiments on real-world designs, our method achieves an average of 33.9% reduction in total power; it outperforms a previous method by 17.1% on average and gives close-to-optimal solutions on several designs.
To the best of our knowledge, this is the first time behavior-level observability analysis and optimization are performed during behavioral synthesis in a systematic manner. We believe that our idea can be applied to compiler transformations in general.

Categories and Subject Descriptors: B.7.2 [Integrated Circuits]: Design Aids
General Terms: Algorithms, Design
Additional Key Words and Phrases: Observability, low power, scheduling, operation gating, behavioral synthesis

ACM Reference Format: Cong, J., Liu, B., Majumdar, R., and Zhang, Z. Behavior-level observability analysis for operation gating in low-power behavioral synthesis. ACM Trans. Des. Autom. Electron. Syst. 16, 1, Article 4 (November 2010), 29 pages.

J. Cong is Chief Technology Advisor of AutoESL Design Technologies, Inc. B. Liu is also with AutoESL Design Technologies, Inc. Authors' addresses: J. Cong, B. Liu, R. Majumdar, Computer Science Department, University of California, Los Angeles, CA 90095; email: {cong, bliu, rupak}@cs.ucla.edu; Z. Zhang, AutoESL Design Technologies, Inc., Stevens Creek Blvd., Suite 150, Cupertino, CA 95014; email: zhiruz@autoesl.com. © 2010 ACM.

1. INTRODUCTION

In recent years, power dissipation has become an increasingly critical issue in VLSI design. A number of techniques for power reduction have been developed in advanced RTL synthesis tools. While some techniques try to replace power-hungry devices with their power-efficient counterparts at possible costs in performance and/or area, other orthogonal approaches reduce power by avoiding unnecessary operations, using techniques such as operand isolation, clock gating, and power gating. Observability Don't-care (ODC) conditions, introduced by the logic synthesis community (for example, De Micheli [1994], Devadas et al. [1994], Gajski et al. [2002]), play an important role in the identification of unnecessary operations in a Boolean network. Isolation cells can be inserted at inputs of a functional unit when its result is not observable [Münch et al. 2000]. For clock gating, the simplest approach is based on stability conditions [Fraer et al. 2008]: when the value stored in a register does not change, its clock can be gated. It is recognized that exploiting ODC conditions in clock gating can lead to significantly more power reduction by avoiding unobservable value changes in registers [Babighian et al. 2005; Benini and De Micheli 1996; Benini et al. 1999; Fraer et al. 2008; Ohnishi et al. 1997].
Figure 1 shows a simple case where ODC conditions in an RTL design are used for clock gating: when A is greater than 5, the output of the multiplexer is equal to A, and thus $B^2$ computed by the multiplier is unobservable; we can then get the implementation shown on the right, where the clock is gated for some registers and activity of the multiplier can be avoided.

Fig. 1. An example of clock gating based on ODC conditions in RTL synthesis.

The problem of computing ODC conditions in a sequential RTL model has been approached in a number of ways. Some prior work views the circuit as a Finite-State Machine (FSM) and calculates the exact ODC condition for every bit using formal methods [Benini and De Micheli 1996; Benini et al. 1999]. However, the number of states in an FSM can be exponential in terms of the number of registers. Thus, the exact approach can be prohibitively expensive for moderately large designs. Methods developed in practical systems are often conservative but more scalable, without a thorough analysis of the FSM. The work in Münch et al. [2000] assumes that every value stored in a register is observable and only performs analysis on combinational parts of the circuit. Similar assumptions are used in many commercial products. The approach in Ohnishi et al. [1997] relies on special patterns in the Hardware Description Language (HDL) code to compute ODC conditions, and thus the quality of result depends on the coding style. The algorithm in Babighian et al. [2005] detects ODC conditions based on datapath topology, using backward traversal and levelization techniques. A more recent work [Fraer et al. 2008] shows that more ODC conditions can be uncovered in the results of Babighian et al. [2005] by propagating ODC conditions that are already utilized in other parts of the design (possibly discovered manually by the designer). Wang and Roy [2003] propose to perform observability analysis at the behavior level, in order to discover opportunities for power optimization in an existing RTL design. The approach relies on the branching structure of the program, but ignores correlation between Boolean values.

All these methods are shown to be very effective in practice. However, it is not clear how much opportunity for power optimization still exists due to obvious pessimism when computing ODC conditions. Even with a powerful tool that could calculate the exact ODC condition for every signal in an RTL model efficiently, the opportunity for power saving is still limited by the existing RTL design. In the example shown in Figure 2, the comparison is performed later than the multiplication, and the clock gating technique in Figure 1 cannot be applied. This suggests that huge opportunities remain unexploited at a higher level where there is freedom in choosing a good RTL structure among many alternatives of the same functionality.

Fig. 2. An example of the limitation of ODC-based clock gating in RTL synthesis.

A more sophisticated example is shown in Figure 3, where different schedules with the same latency imply different opportunities for avoiding operations. Note that there is a select instruction in the behavior code (corresponding to a multiplexer in the circuit), and thus the evaluation of some values is not necessary, depending on which value is selected as the output. In the first schedule, two multiplications (v1 and v2) are always executed. In the second one, when v7 is evaluated as false in the first step, v9 will be equal to v3, and values including v1, v2, v5, and v6 are not needed because they will not influence the output. By scheduling instructions intelligently and imposing guarding conditions (predicates) on operations based on ODC conditions in the resulting RTL design (this is referred to as operation gating in this article), we can effectively restructure the control flow and get different equivalent C code as shown on the right side; the resulting implementation can have different power under the same performance constraint. A powerful behavioral synthesis tool could explore such higher-level opportunities to generate and redistribute ODC conditions, whereas an RTL synthesis tool is unable to explore this design space; it can at most take advantage of available ODC conditions for a fixed schedule.

Fig. 3. Two schedules of a dataflow graph and the implied control flows when ODC is exploited.

It then becomes an interesting problem how a power-friendly RTL model can be generated in order to maximize the effectiveness of ODC-based power management techniques. In this work, we study this problem systematically. Our contributions include the following.

(1) We present a formal framework for observability analysis at the behavior level. We introduce several observability measures and explain their relations.
(2) We describe an efficient method to compute observability at the behavior level. We first present an abstraction of a dataflow graph using only black boxes and generalized select operations. Then, a method is developed to compute the smoothed behavior-level observability based on several theorems. The method is exact for dataflow graphs with black box abstraction. We also allow certain forms of knowledge about inputs and other instructions to be considered.
(3) We present a behavioral synthesis flow for power optimization using operation gating, guided by behavior-level observability, and demonstrate the effectiveness of our approach in real-world designs.

To the best of our knowledge, this is the first time that behavioral synthesis is guided by a comprehensive observability analysis for power optimization.

The rest of this article is organized as follows. In Section 2 we describe our assumptions on the behavioral synthesis system and the target hardware architecture. Behavior-level observability, as well as its approximation under the proposed black-box abstraction, is introduced in Section 3. An efficient algorithm for observability analysis at the word level is proposed in Section 4, based on several theorems.
We briefly describe our approach to observability-guided power optimization in scheduling in Section 5 and report experimental results in Section 6. Section 7 discusses related work, followed by concluding remarks in the final section.

2. BACKGROUND

In this section we describe a behavioral synthesis system which serves as the context for the discussions in the following parts of this article. In our behavioral synthesis system, a compiler front-end parses and optimizes behavioral descriptions in high-level languages (like C/C++) and generates descriptions in an intermediate representation, which can be represented as a Control/Data Flow Graph (CDFG). A CDFG is a graph $G = (V, E)$, where each node $v \in V$ represents an operation (also called an instruction), and each directed edge $e \in E$ represents a data dependency or a control dependency. Operations are scheduled statically by the synthesis tool. For each operation $v$, an integer-valued scheduling variable $s_v$ can be introduced to represent the time slot in which operation $v$ is performed. A Finite State Machine with Datapath (FSMD) model [Gajski et al. 1992] can be constructed once the scheduling variable for every operation is decided [Cong and Zhang 2006]. The FSMD model can be subsequently translated into an RTL model through a binding process which maps operations to functional units, variables to storage units, and data transfers to interconnects. The overall flow is illustrated in Figure 4.

Fig. 4. Overview of our behavioral synthesis system.

When we perform operation gating, a predicate is added to an operation based on its observability. Unlike general-purpose processors where the syntactic form of a predicate is usually very limited (e.g., including only one or two dedicated predicate registers), the target architecture of a behavioral synthesis system can be customized, thus allowing much more flexibility in the Instruction-Set Architecture (ISA), the form of a predicate in particular. Here we consider a target architecture where the predicate can be any expression of an arbitrary number of Boolean values (literals). This is based on the fact that evaluation of Boolean expressions on application-specific hardware can often be done in a very short time (typically much shorter than a clock cycle) with a relatively small overhead (a few logic gates and wires). However, we limit the form of a literal to a Boolean value because expressions involving non-Boolean values can be much more expensive to evaluate; if the logic of the predicate involves non-Boolean values, additional operations (such as truncation, comparison) are needed to obtain Boolean values from non-Boolean values.

Observability conditions associated with different levels of abstraction can be different. In the scheduling process of transforming a CDFG into an FSMD, behavior-level observability conditions are translated into FSMD-observability conditions (observability under a given schedule, more precisely defined in Section 5). In the example in Figure 3, a behavior-level observability condition for v1 is v6v7v8, that is, v1 is observable only when v6, v7, and v8 are all true. However, the evaluation of v1 can never be avoided in the first schedule, because the observability of v1 is always unknown when it is evaluated and conservative decisions have to be made to guarantee correctness. The second schedule is better in the sense that we could avoid evaluating v1 when v7 is known to be false, because v6v7v8 will then be false even when v6 and v8 are not yet evaluated. Here we say the FSMD-observability condition of v1 is v7. Clearly, different schedules imply different ODC conditions on their associated FSMDs, and it is not always possible to postpone every instruction until its behavior-level observability condition is known, due partly to performance constraints.

The problem considered in this article can be described as follows: Given a CDFG and profiling information, as well as the cost (average power) for executing each instruction, find a schedule that leads to the smallest expected total cost after operation gating, subject to data-dependency constraints and a latency constraint.

3. BEHAVIOR-LEVEL OBSERVABILITY

3.1 Observability for a General Function

The observability of a function $f(x, y)$ with respect to part of its variables $x$ is a Boolean-valued function of the rest of the variables $y$; the observability is true for values of $y$ which make it possible that changes in $x$ are observable at the output.

Definition 3.1 (Observability). For a function $z = f(x, y) : X \times Y \to Z$, where $x \in X$, $y \in Y$, an observability function of $f$ with respect to $x$ is a function $O_x f : Y \to \{0, 1\}$, so that
$$\big(\exists x_1, x_2 \in X,\; f(x_1, y) \neq f(x_2, y)\big) \Rightarrow O_x f(y).$$

Informally, $O_x f$ is a necessary condition about $y$ for $x$ to be observable. It is clear that Definition 3.1 is compatible with the definition of the observability condition for Boolean functions.
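For small finite domains, Definition 3.1 can be checked by plain enumeration. The following is a minimal sketch under that assumption; the helper `exact_observability` and the toy multiplexer `mux` are hypothetical names introduced only for illustration. It computes the strongest function satisfying the definition (true exactly when two values of x produce different outputs); any weaker function is also a valid observability function.

```python
from itertools import product

def exact_observability(f, x_domain, y_domain):
    """Map each y to True iff some x1, x2 give f(x1, y) != f(x2, y)."""
    xs = list(x_domain)
    return {
        y: any(f(x1, y) != f(x2, y) for x1, x2 in product(xs, repeat=2))
        for y in y_domain
    }

# Hypothetical example: z = x if sel else other, with y = (sel, other).
def mux(x, y):
    sel, other = y
    return x if sel else other

X = range(4)                                   # 2-bit word values for x
Y = list(product([False, True], range(4)))     # (sel, other) pairs
obs = exact_observability(mux, X, Y)

# x is observable exactly when the select input picks it.
print(obs[(True, 0)], obs[(False, 0)])         # prints: True False
```

The enumeration is exponential in the bitwidths, which is why the article works with approximations at the word level rather than with this exact computation.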
When only part of the variables can be used for observability computation, we usually need to make conservative decisions about other unknown variables by using a necessary condition of the exact observability. This can be done using projection.

Definition 3.2 (Projection). For a Boolean-valued function $h(y_1, y_2) : Y_1 \times Y_2 \to \{0, 1\}$, the projection of $h$ onto $y_1$ is a function $P_{y_1} h : Y_1 \to \{0, 1\}$, so that
$$P_{y_1} h(y_1) = \begin{cases} 1 & \text{if } \exists y_2 \in Y_2,\; h(y_1, y_2) = 1 \\ 0 & \text{otherwise.} \end{cases}$$
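Projection is existential quantification over the variables that are dropped. A minimal sketch, assuming small finite domains; the function `project` and the example condition `h` are hypothetical and serve only to illustrate Definition 3.2:

```python
def project(h, y1_domain, y2_domain):
    """P_{y1} h as a dict: y1 -> True iff some y2 makes h(y1, y2) true."""
    y2s = list(y2_domain)
    return {y1: any(h(y1, y2) for y2 in y2s) for y1 in y1_domain}

# Hypothetical condition mixing a Boolean b with a 3-bit word c.
def h(b, c):
    return b and c > 5

p = project(h, [False, True], range(8))

# The projection is a necessary condition: h(b, c) implies p[b].
assert all((not h(b, c)) or p[b] for b in (False, True) for c in range(8))
print(p)   # prints: {False: False, True: True}
```

Here the word-level variable c disappears from the condition, leaving an expression over the Boolean b only, which is the kind of condition that can serve as a predicate.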

Informally, $P_{y_1} h(y_1)$ is weaker than $h(y_1, y_2)$, but it is the strongest necessary condition for $h(y_1, y_2)$ with respect to $y_1$ (see footnote 1).

LEMMA 3.1. For $g : Y_1 \times Y_2 \times Y_3 \to \{0, 1\}$, $P_{y_1}(P_{\{y_1, y_2\}}\, g) \equiv P_{y_1} g$.

For a dataflow graph $g$, let $x \in X$ be the value whose observability is being considered. We cut the edges from the operation that computes $x$, and treat $x$ as a primary input. Let $V \in \mathcal{V}$ be the set of all the other primary inputs in $g$, among which $B \in \mathcal{B}$ is the set of Boolean values, and $C \in \mathcal{C}$ is the set of other values. Let $OUT \in \mathcal{OUT}$ be the output. We view the program as a function $g : X \times \mathcal{B} \times \mathcal{C} \to \mathcal{OUT}$.

Definition 3.3 (BL-Observability). For the program $g$ as described before, the behavior-level observability condition of $x$ is $\mathrm{BLO}(x) \triangleq O_x g$.

According to its definition, $\mathrm{BLO}(x)$ is a function $\mathcal{B} \times \mathcal{C} \to \{0, 1\}$. Since only Boolean values are allowed in a predicate expression in our target architecture, we define the Boolean behavior-level observability condition of $x$ as the projection of $\mathrm{BLO}(x)$ onto $B$, so that we get an observability condition that uses only Boolean values.

Definition 3.4 (B-BL-Observability). For the program $g$ as described before, the Boolean behavior-level observability condition of $x$ is $\mathrm{BBLO}(x) \triangleq P_B\, \mathrm{BLO}(x)$.

3.2 Dataflow Graph Abstraction

Different from a Boolean network where each vertex is a simple Boolean operator and each edge is a single-bit signal, a dataflow graph represents program behavior at the word level. Operations in a dataflow graph can be either Boolean operators like those in a Boolean network (such as and, or, not), or more complex ones at the word level (such as add, mul, div). A special operation is select, whose output is equal to one of its data inputs, depending on the control input. While it is theoretically possible to decompose all complex word-level operations into Boolean networks and compute $\mathrm{BLO}(x)$ using techniques for observability computation in Boolean networks, the approach is often computationally intractable due to the large size of the network.
Moreover, even if we have an efficient way to compute observability in the large Boolean network, the observability condition is likely to be very complex, involving bits from word-level signals. This is not quite useful for operation gating, because only Boolean values can be used in predicates, according to our assumption in Section 2. To get reasonable observability conditions efficiently at the word level without elaborating all details, we propose to model complex operations (i.e., all operations that take a non-Boolean value as an input, excluding select) in the dataflow graph as black boxes. A black box has fixed input/output bitwidths, and it implements some non-Boolean function whose semantics is being ignored in our analysis. In other words, a black box can be instantiated as any function with the specified input/output bitwidths. Under this abstraction, no knowledge about the complex operations can be used, and the goal is to obtain observability conditions that hold regardless of the instantiations of black boxes.

Footnote 1: For two logic conditions $A$ and $B$, if $A \Rightarrow B$, we say that $B$ is weaker than $A$, and that $A$ is stronger than $B$.

Consider the general case where a dataflow graph has $m$ black boxes $B_1, B_2, \ldots, B_m$. The observability condition of a value $x$ depends on the instantiation of each black box. Let $\mathrm{BLO}^{f_1, \ldots, f_m}(x)$ be the observability condition of $x$ when $B_i$ is instantiated as a function $f_i$. We are interested in the strongest necessary condition for $\mathrm{BLO}(x)$, without knowing anything about $f_i$, $i = 1, \ldots, m$. The result is defined as the smoothing of $\mathrm{BLO}(x)$ with respect to all the black boxes. Denoting the result of smoothing as $S_{\{f_1, \ldots, f_m\}}\, \mathrm{BLO}(x)$, or simply $S\,\mathrm{BLO}(x)$, we have
$$S\,\mathrm{BLO}(x) = \sum_{f_1, \ldots, f_m} \mathrm{BLO}^{f_1, \ldots, f_m}(x). \qquad (1)$$

Please recall that the smoothing of a Boolean function $g(x_1, \ldots, x_i, \ldots, x_n)$ over $x_i$ is defined as $S_{x_i} g \triangleq g|_{x_i=1} + g|_{x_i=0}$ in De Micheli [1994]. Note that our definition of smoothing over black-box functions in Eq. (1) is compatible with the original definition in De Micheli [1994]. In fact, they are the same if we view variable $x_i$ as the output of a 0-input-1-output black-box function.

Clearly, $S\,\mathrm{BLO}(x)$ is weaker than the exact $\mathrm{BLO}(x)$ due to the absence of knowledge about operations that are modeled as black boxes. However, the result of such an operation typically depends on all of its operands (with rare exceptions, such as the case when one of the inputs of a mul operation is 0), and correlations between different non-Boolean values are difficult to analyze and represent in a thorough way at the word level.
Thus, we consider the black-box modeling a reasonable abstraction for behavior-level observability analysis.

We also generalize the select operation to facilitate observability analysis. A $(k, l)$-select operation takes $k$ control inputs $(b_1, \ldots, b_k)$ and $l$ data inputs $(d_1, \ldots, d_l)$, and generates one output $z$. All control inputs are Boolean variables, and all data inputs are of the same bitwidth.
$$z = \begin{cases} d_1 & \text{when } s_1(b_1, \ldots, b_k) = 1 \\ d_2 & \text{when } s_2(b_1, \ldots, b_k) = 1 \\ \;\vdots \\ d_l & \text{when } s_l(b_1, \ldots, b_k) = 1 \end{cases} \qquad (2)$$
Or,
$$z = \sum_{i=1}^{l} s_i(b_1, \ldots, b_k)\, d_i. \qquad (3)$$
Here $s_1, \ldots, s_l$ is a set of orthonormal Boolean bases; that is,
$$s_i s_j = 0, \quad \forall i \neq j \in \{1, \ldots, l\} \qquad (4)$$
$$\sum_{i=1}^{l} s_i = 1. \qquad (5)$$
It is easy to note that the traditional select operation is a $(1, 2)$-select with $s_1(b_1) = b_1$ and $s_2(b_1) = \overline{b_1}$. This generalized select allows selecting from multiple data inputs, and absorbs Boolean functions.

With the approach of black-box modeling and the preceding generalization of select, there are only two types of operations remaining in the dataflow graph: black box and generalized select. We make the following restrictions to facilitate discussion in the following parts of the article. Note that a valid dataflow graph can always be normalized to a form that conforms to these restrictions.

(1) If a Boolean variable is used by a select operation as a control input, it cannot be used by black boxes or by any select operation as a data input. In other words, values are divided into two categories: either used purely as control signals (by select operations), or purely as data inputs (by black boxes and select operations).
    If a Boolean variable $b$ is a primary output, introduce a $(1, 2)$-select that selects constant 1 if $b$ is true and constant 0 if $b$ is false.
    If a select operation takes Boolean variables as data inputs, we can always replace the select operation with a Boolean expression (which is eventually absorbed in another select).
(2) For each select operation, all of its inputs are distinct.
    If a Boolean variable appears more than once in the list of control inputs of a select operation, the number of control inputs can be reduced, and the selecting logic can be simplified.
    If the same data value is selected in more than one case, the number of data inputs can be reduced, and the cases where the same value is selected can be merged.
(3) Each data input of a select comes from a black box.
    Primary inputs and constants are regarded as outputs of black boxes.
    When the result of a select $n_1$ is used by another select $n_2$, we can simply replace $n_2$ with a combination of $n_1$ and $n_2$, so that the result of $n_1$ is not used by another select. Note that $n_1$ may still be present after the transformation if it has other uses.

For the example dataflow graph shown in Figure 3, operations that evaluate v1, v2, v3, v4, v5, v6, and v7 are all modeled as black boxes; the operation that evaluates v8 is absorbed in the $(2, 2)$-select operation v9; the selection functions for the $(2, 2)$-select are $s_{v3}(v6, v7) = v6\,v7$ and $s_{v4}(v6, v7) = \overline{v6\,v7}$. The dataflow graph is transformed as shown in Figure 5.
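The generalized select and its orthonormality conditions (4) and (5) can be sketched as follows. This is an illustrative model only: `is_orthonormal` and `generalized_select` are hypothetical helpers, selection functions are plain Python callables over Boolean arguments, and the check is done by enumerating all control assignments, which is reasonable only for small k. Conditions (4) and (5) together say that exactly one selection function is true under every assignment.

```python
from itertools import product

def is_orthonormal(selectors, k):
    """Check Eqs. (4) and (5): exactly one selector is true per assignment."""
    return all(
        sum(1 for s in selectors if s(*bits)) == 1
        for bits in product([False, True], repeat=k)
    )

def generalized_select(selectors, controls, data):
    """Evaluate a (k, l)-select: return the data input whose selector is true."""
    for s, d in zip(selectors, data):
        if s(*controls):
            return d
    raise ValueError("selection functions are not orthonormal")

# The (2, 2)-select v9 of the running example, with controls (v6, v7).
s_v3 = lambda v6, v7: v6 and v7
s_v4 = lambda v6, v7: not (v6 and v7)

assert is_orthonormal([s_v3, s_v4], k=2)
# Two overlapping selectors violate Eq. (4).
assert not is_orthonormal([s_v3, lambda v6, v7: v7], k=2)

v9 = generalized_select([s_v3, s_v4], (True, True), ("v3_value", "v4_value"))
print(v9)   # prints: v3_value
```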

Fig. 5. The dataflow graph abstraction with black boxes and the generalized select operation. Black boxes are filled with shade.

4. BEHAVIOR-LEVEL OBSERVABILITY ANALYSIS

As mentioned previously, computing the exact observability condition requires nontrivial effort, essentially breaking all values into individual bits and applying techniques for Boolean networks. In this section, we describe an algorithm to compute $S\,\mathrm{BLO}$, that is, observability under the abstraction using black boxes described in Section 3. The algorithm propagates and manipulates $S\,\mathrm{BLO}$ directly, and thus it avoids the trouble of considering the instantiations of black boxes.

4.1 Review of Observability Computation in a Boolean Network

Our method for computing $S\,\mathrm{BLO}$ is based on a technique for observability analysis in Boolean networks [De Micheli 1994]. Here we give a brief review of the algorithm. For simplicity, we consider the case when the Boolean network has only one primary output. The observability of a node in the Boolean network with regard to multiple primary outputs can be computed by summing up its observability conditions with regard to each individual primary output. The algorithm labels the observability conditions on nodes and edges in reverse topological order. It proceeds with three kinds of actions.

Initialize. For the primary output $z$ under consideration, set $\mathrm{BLO}(z) = 1$. For every other primary output $w$, set $\mathrm{BLO}(w) = 0$.

Propagate node observability to its input edges. For each node $z = f(x_1, \ldots, x_n)$, compute $\mathrm{BLO}(x_i) = \mathrm{BLO}(z)\, \frac{\partial f}{\partial x_i}$, $i = 1, \ldots, n$.

Merge edge observability to get node observability. If a value $y$ is used $m$ times as $y_1, \ldots, y_m$, when $y$ is visited, we already have the edge observability conditions $\mathrm{BLO}(y_i)$, $i = 1, \ldots, m$. These edge observability conditions are computed independently from downstream operations; thus the correlation that $y_1 = y_2 = \cdots = y_m$ needs to be considered when merging edge observability conditions. Eq. (6) is used to derive node observability.
$$\mathrm{BLO}(y) = \bigoplus_{i=1}^{m} \mathrm{BLO}(y_i)\Big|_{y_{i+1} = \cdots = y_m = \overline{y}} \qquad (6)$$

4.2 Observability Analysis with Black Box

In this subsection, we show that for a Boolean network composed of black boxes and generalized select operations, if the requirements in Section 3.2 are satisfied, we can compute and propagate the smoothed observability $S\,\mathrm{BLO}$ directly, using an approach similar to that for Boolean networks. For simplicity, we first consider the case with only one output (an output is an operation whose result may be used outside the current dataflow graph, or by the next loop iteration); the observability with regard to multiple outputs can be obtained by considering these outputs one by one and summing up the smoothed observability conditions with regard to different outputs for the same value.

We still have the three types of actions, namely initialization, propagation, and merging, among which the initialization is trivial. In the following, we develop theorems that give rules for propagation through black boxes and generalized select operations, as well as rules for merging of control signals and data signals. Proofs of these theorems can be found in the Appendix.

THEOREM 4.1 (PROPAGATE THROUGH A BLACK BOX). For a black box $z = f(x_1, \ldots, x_m)$, the edge observability satisfies $S\,\mathrm{BLO}(x_i) = S\,\mathrm{BLO}(z)$, for each $i \in \{1, \ldots, m\}$.

THEOREM 4.2 (PROPAGATE THROUGH SELECT). For a $(k, l)$-select instruction $z = g(b_1, \ldots, b_k, d_1, \ldots, d_l)$, with $s_i(b_1, \ldots, b_k)$ being the condition under which $d_i$ is selected, we have
$$S\,\mathrm{BLO}(d_i) = S\,\mathrm{BLO}(z)\, s_i(b_1, \ldots, b_k),$$
$$S\,\mathrm{BLO}(b_i) = S\,\mathrm{BLO}(z) \sum_{j=1}^{l} \frac{\partial s_j}{\partial b_i}.$$

THEOREM 4.3 (MERGE EDGE OBSERVABILITY FOR DATA). For a value $x$ that is used by black boxes or by select instructions as data inputs, suppose $x_1, \ldots, x_p$ are the edges beginning from $x$. We have
$$S\,\mathrm{BLO}(x) = \sum_{i=1}^{p} S\,\mathrm{BLO}(x_i).$$

THEOREM 4.4 (MERGE EDGE OBSERVABILITY FOR CONTROL). For a node $x$ that is used as control inputs in $m$ select instructions, $f_1, \ldots, f_m$, each having $l_i$ data inputs, suppose $x_1, \ldots, x_p$ are the edges beginning from $x$. We have
$$S\,\mathrm{BLO}(x) = \sum_{i=1}^{m} S\,\mathrm{BLO}(f_i)\Big|_{x_{i+1} = \cdots = x_m = \overline{x}} \; \sum_{j=1}^{l_i} \frac{\partial s_{i,j}}{\partial x_i},$$
where $s_{i,j}$ is the selection function for data input $j$ in the $i$th select operation.
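The propagation rule of Theorem 4.2 can be illustrated on the (2, 2)-select v9 of the running example. The sketch below is not the authors' implementation: Boolean functions are modeled as Python callables over an environment dictionary, and `boolean_difference` and `propagate_select` are hypothetical helpers. The Boolean difference of s with respect to b is taken to be s with b set to true, XORed with s with b set to false.

```python
def boolean_difference(s, var):
    """Return ds/dvar: true where flipping var changes the value of s."""
    return lambda env: s({**env, var: True}) != s({**env, var: False})

def propagate_select(sblo_z, selectors, controls):
    """Edge observabilities of a select's data and control inputs (Theorem 4.2)."""
    data_obs = [lambda env, s=s: sblo_z(env) and s(env) for s in selectors]
    ctrl_obs = {}
    for b in controls:
        diffs = [boolean_difference(s, b) for s in selectors]
        ctrl_obs[b] = (
            lambda env, diffs=diffs: sblo_z(env) and any(d(env) for d in diffs)
        )
    return data_obs, ctrl_obs

# Selection functions of v9: v3 is chosen when v6 and v7 are both true.
s_v3 = lambda env: env["v6"] and env["v7"]
s_v4 = lambda env: not (env["v6"] and env["v7"])
always = lambda env: True          # S BLO(v9) = true for the primary output

data_obs, ctrl_obs = propagate_select(always, [s_v3, s_v4], ["v6", "v7"])

envs = [{"v6": a, "v7": b} for a in (False, True) for b in (False, True)]
# The control input v6 is observable exactly when v7 is true, and vice versa.
assert all(ctrl_obs["v6"](e) == e["v7"] for e in envs)
assert all(ctrl_obs["v7"](e) == e["v6"] for e in envs)
```

Theorem 4.1 needs no code of its own in this model: every input edge of a black box simply inherits the observability of the box's output.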

Based on the preceding theorems, we develop a process called observability analysis to compute the smoothed observability conditions for all operations. The input program is thoroughly optimized using classic compiler optimizations including control-flow optimization and if-conversion (using the select instructions); a dataflow graph is formed for each acyclic region, and preprocessed into a graph with black boxes and generalized select operations. The algorithm for observability analysis is shown in Algorithm 1.

Algorithm 1. Observability Analysis for a Dataflow Graph with a Single Output
    for all operations I in reverse topological order do
        if I directly generates the primary output then
            Initialize: S BLO(I) = predicate(I), predicate(I) standing for the executing condition of I.
        else
            Merge: calculate S BLO(I) from the smoothed observability conditions of edges starting from I, which have been calculated when visiting downstream operations. The equation in Theorem 4.3 is used if the value is used as data; the equation in Theorem 4.4 is used otherwise.
        end if
        Propagate: calculate the smoothed observability for every incoming edge S BLO(src_i(I)) based on the type of I. If I is a black box, use the equation in Theorem 4.1; otherwise (I is a generalized select) use the equation in Theorem 4.2.
    end for

Based on Theorems 4.1, 4.2, 4.3, and 4.4, we have the following.

THEOREM 4.5. Algorithm 1 computes the correct smoothed observability condition for each value in the dataflow graph.

For the example dataflow graph with black boxes and the generalized select operations shown in Figure 5, after applying Algorithm 1, we get the smoothed behavior-level observability condition shown in Table I.

Since the expression of $S\,\mathrm{BLO}$ contains only Boolean control signals, $S\,\mathrm{BLO}$ is also a BBLO (and is also a BLO), according to definition. However, it is not necessarily the strongest BBLO possible, due to the introduction of black boxes. The output of a black box is completely unknown, and correlations between values are completely lost after a black box.
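As an illustration of Algorithm 1, the sketch below runs the reverse-topological traversal on a hand-written encoding of the running example. Several things here are assumptions made for the sketch rather than part of the article: the graph encoding (`BLACK_BOXES`, `SELECT`, `ORDER`) is reconstructed from the operand relations listed in Table I, a Boolean function over the controls v6 and v7 is stored as the set of assignments on which it is true, the output predicate is taken to be true, and every control value feeds exactly one select, so the merge of Theorem 4.4 reduces to the single edge observability given by Theorem 4.2.

```python
from itertools import product

CTRL = ("v6", "v7")
ALL = frozenset(product([False, True], repeat=len(CTRL)))   # constant true

def fn(pred):
    """Truth set of pred over the control variables."""
    return frozenset(a for a in ALL if pred(dict(zip(CTRL, a))))

def diff(f, var):
    """Boolean difference of truth set f with respect to var."""
    i = CTRL.index(var)
    flip = lambda a: a[:i] + (not a[i],) + a[i + 1:]
    return frozenset(a for a in ALL if (a in f) != (flip(a) in f))

# Hypothetical encoding of the abstracted graph: black box -> its inputs.
BLACK_BOXES = {"v1": ["a"], "v2": ["b"], "v3": ["c", "d"], "v4": ["a", "b"],
               "v5": ["v1", "v2"], "v6": ["v5"], "v7": ["a", "c"]}
SELECT = {"out": "v9", "ctrl": ["v6", "v7"],
          "data": {"v3": fn(lambda e: e["v6"] and e["v7"]),
                   "v4": fn(lambda e: not (e["v6"] and e["v7"]))}}
ORDER = ["a", "b", "c", "d", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v9"]

def observability_analysis():
    edges = {v: [] for v in ORDER}    # edge observabilities per source value
    sblo = {}
    for node in reversed(ORDER):
        if node == SELECT["out"]:
            sblo[node] = ALL                                 # initialize
            for d, s in SELECT["data"].items():              # Theorem 4.2, data
                edges[d].append(sblo[node] & s)
            for b in SELECT["ctrl"]:                         # Theorem 4.2, control
                sens = frozenset().union(
                    *(diff(s, b) for s in SELECT["data"].values()))
                edges[b].append(sblo[node] & sens)
        else:
            sblo[node] = frozenset().union(*edges[node])     # merge by OR
            for src in BLACK_BOXES.get(node, []):            # Theorem 4.1
                edges[src].append(sblo[node])
    return sblo

sblo = observability_analysis()
assert sblo["v1"] == fn(lambda e: e["v7"])    # v1 matters only when v7 is true
assert sblo["c"] == fn(lambda e: e["v6"])
assert sblo["a"] == ALL                       # a is always observable
```

A practical implementation would more likely keep the conditions symbolic (for example, as BDDs or expression trees) instead of enumerating truth sets, which grow exponentially with the number of control values.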
While we consider this black-box model very useful to enable analysis at a higher level, certain knowledge about values in the dataflow graph, once uncovered, can be employed to strengthen the condition. To do that, we are mostly interested in correlations between Boolean values, for example, (x == 3) implies (x < 10). Although capturing exact relations between Boolean values is nontrivial, at least some knowledge can be discovered and exploited. Such techniques have been developed in compilers [August et al. 1999], and can be directly applied in our algorithm. For the example in Figure 3, let us assume that we know input c is always an odd number. By analyzing the observability-propagating instructions, it can be asserted that the two Boolean values, $v_6 = (a * a + b * b == 100)$ and $v_7 = (a == c)$, cannot be true simultaneously, because the set of integer values of a that satisfies a * a + b * b == 100 is {0, 6, 8, 10}, all elements of which are even, so $v_7 = (a == c)$ will be false if $v_6$ is true. Thus we have the knowledge $\overline{v_6 v_7} = \text{true}$, which can be used to simplify the conditions. For example, we have $\mathrm{BLO}(v_3) = v_6 v_7$; with that knowledge, we have $\mathrm{BLO}(v_3) = v_6 v_7 \cdot \overline{v_6 v_7} = \text{false}$, that is, we find that $v_3$ is always unobservable when c is odd.

Table I. Smoothed Behavior-Level Observability Computed by Algorithm 1

| Value | S_BLO | Description |
|---|---|---|
| $v_9$ | true | initialization for primary output ($v_9$) |
| $v_3$ | $\mathrm{S\_BLO}(v_9)\, v_6 v_7 = v_6 v_7$ | propagation from select ($v_9$) to its data input |
| $v_4$ | $\mathrm{S\_BLO}(v_9)\, \overline{v_6 v_7} = \overline{v_6 v_7}$ | propagation from select ($v_9$) to its data input |
| $v_6$ | $\mathrm{S\_BLO}(v_9) \left( \frac{\partial (v_6 v_7)}{\partial v_6} + \frac{\partial\, \overline{v_6 v_7}}{\partial v_6} \right) = v_7$ | propagation from select ($v_9$) to its condition |
| $v_7$ | $\mathrm{S\_BLO}(v_9) \left( \frac{\partial (v_6 v_7)}{\partial v_7} + \frac{\partial\, \overline{v_6 v_7}}{\partial v_7} \right) = v_6$ | propagation from select ($v_9$) to its condition |
| $c_{v_3}$ | $\mathrm{S\_BLO}(v_3) = v_6 v_7$ | propagation from black box ($v_3$) to its input |
| $d$ | $\mathrm{S\_BLO}(v_3) = v_6 v_7$ | propagation from black box ($v_3$) to its input |
| $a_{v_4}$ | $\mathrm{S\_BLO}(v_4) = \overline{v_6 v_7}$ | propagation from black box ($v_4$) to its input |
| $b_{v_4}$ | $\mathrm{S\_BLO}(v_4) = \overline{v_6 v_7}$ | propagation from black box ($v_4$) to its input |
| $v_5$ | $\mathrm{S\_BLO}(v_6) = v_7$ | propagation from black box ($v_6$) to its input |
| $a_{v_7}$ | $\mathrm{S\_BLO}(v_7) = v_6$ | propagation from black box ($v_7$) to its input |
| $c_{v_7}$ | $\mathrm{S\_BLO}(v_7) = v_6$ | propagation from black box ($v_7$) to its input |
| $v_1$ | $\mathrm{S\_BLO}(v_5) = v_7$ | propagation from black box ($v_5$) to its input |
| $v_2$ | $\mathrm{S\_BLO}(v_5) = v_7$ | propagation from black box ($v_5$) to its input |
| $a_{v_1}$ | $\mathrm{S\_BLO}(v_1) = v_7$ | propagation from black box ($v_1$) to its input |
| $b_{v_2}$ | $\mathrm{S\_BLO}(v_2) = v_7$ | propagation from black box ($v_2$) to its input |
| $a$ | $\mathrm{S\_BLO}(a_{v_1}) + \mathrm{S\_BLO}(a_{v_4}) + \mathrm{S\_BLO}(a_{v_7}) = \text{true}$ | merge edge observability for a |
| $b$ | $\mathrm{S\_BLO}(b_{v_2}) + \mathrm{S\_BLO}(b_{v_4}) = v_7 + \overline{v_6 v_7} = \text{true}$ | merge edge observability for b |
| $c$ | $\mathrm{S\_BLO}(c_{v_3}) + \mathrm{S\_BLO}(c_{v_7}) = v_6$ | merge edge observability for c |

Here $c_{v_3}$ denotes the edge observability for the edge from $v_3$ to c. If a node has only one outgoing edge, the edge observability and node observability are the same.

5. SCHEDULING FOR OPERATION GATING OPTIMIZATION

In this section we discuss how the behavior-level observability obtained by Algorithm 1 can be used to improve the effectiveness of operation gating in scheduling.

5.1 Observability Under a Given Schedule

For a given schedule s of program g, let $A_s(x)$ be the set of values available when value x is evaluated (i.e., the set of values generated by operations scheduled to finish before the evaluation of x starts). We define FSMD-observability as follows.

Definition 5.1 (FSMD-Observability). An FSMD-observability condition of g with respect to x under a given schedule s is $\mathrm{FSMDO}_s(x) \triangleq \mathcal{P}_{A_s(x)}\,\mathrm{BLO}(x)$.
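The effect of projecting an observability condition onto the available values can be sketched by brute-force enumeration. The sketch below is only an illustration under a stated assumption: the projection operator is defined earlier in the article, and here it is assumed to behave like existential quantification (smoothing) over the values that are not yet available, so that the projected condition never gates an operation that is actually observable. The function name `project` and the schedule in which only v7 is available are hypothetical.

```python
from itertools import product


def project(cond, variables, available):
    """Existentially quantify away every variable not in `available`.

    cond      -- function mapping a dict {name: bool} to bool
    variables -- all variable names that cond may read
    available -- names whose values are known when the operation starts
    Assumption: projection is modeled as existential quantification.
    """
    hidden = [v for v in variables if v not in available]

    def projected(env):
        # The projected condition holds if some assignment to the
        # unavailable values makes the original condition true.
        return any(
            cond({**env, **dict(zip(hidden, bits))})
            for bits in product([False, True], repeat=len(hidden))
        )

    return projected


def blo_v3(env):
    # BLO(v3) = v6 AND v7, as in the Figure 3 example.
    return env["v6"] and env["v7"]


# Hypothetical schedule in which only v7 is available when v3 starts.
pred_v3 = project(blo_v3, ["v6", "v7"], available={"v7"})
```

Under this assumption, `pred_v3` evaluates to the value of v7 alone, which agrees with the observation in Section 5.2 that the evaluation of v3 can be avoided when v7 is false.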

Fig. 6. Relations between observabilities. The bold arc shows the way we obtain BFSMDO, which is used as the predicate.

Definition 5.2 (B-FSMD-Observability). A Boolean FSMD-observability condition (B-FSMD-observability) of g with respect to x under a given schedule s is $\mathrm{BFSMDO}_s(x) \triangleq \mathcal{P}_{B}\,\mathrm{FSMDO}_s(x)$.

$\mathrm{BFSMDO}_s(x)$ is the condition we can use as the predicate of the operation that computes x. The conceptual difference between BLO, BBLO, FSMDO, and BFSMDO lies in the set of values that are used to evaluate observability. All values in the program can be used for behavior-level analysis, while only available values are meaningful when the schedule is fixed. Theoretically, both Boolean and non-Boolean values can be used, while in practice most architectures support only a Boolean expression as the predicate of an instruction. Using Lemma 3.1, we have the next theorem.

THEOREM 5.1. $\mathrm{BFSMDO}_s(x) = \mathcal{P}_{A_s(x)}\,\mathrm{BBLO}(x)$.

Theorem 5.1 uncovers relations between BLO, BBLO, FSMDO, and BFSMDO; it gives a way to compute BFSMDO under a given schedule by projecting a BBLO condition onto available values. In Figure 6, the arcs illustrate the definition of observabilities by projection, and the bold arc illustrates the way to compute BFSMDO from BBLO by projection, as stated in Theorem 5.1. Using Algorithm 1, we obtain S_BLO as a BBLO, which is subsequently projected as an FSMDO for operation gating.

5.2 Previous Work on Operation Gating

Considering the fact that only Boolean values already evaluated can be used in predicates for operation gating, the impact of scheduling on ODC-based power management is obvious. To our knowledge, the work in Monteiro et al. [1996] presents the first algorithm designed to create more opportunities for ODC-based power management. This method works as a postprocessing step on an existing schedule: it examines multiplexers (select instructions) one by one and tries to move the instruction by computing the Boolean operand earlier if possible. The authors of Monteiro et al. [1996] noticed that their results depended on the order in which multiplexers were examined, and used reverse topological order in their implementation. Chen and Sarrafzadeh [2002] propose an improved optimization technique using priority and soft dependency edges.

Both Monteiro et al. [1996] and Chen and Sarrafzadeh [2002] use a very simple method for observability analysis. They do not generalize the select operation; thus the dataflow graph contains Boolean operations such as and/or/not, which generate the control signals for select. However, those Boolean operations are essentially modeled as black boxes just like add/sub. Hence, knowledge about Boolean operations is unexploited, resulting in weaker observability conditions compared to the proposed method in Section 4. For example, in the schedule illustrated in Figure 3, the evaluation of v3 can be avoided when v7 is false. This may look straightforward in the original code, but it is nontrivial in the if-converted form, where the first operand of the select instruction, v8, is computed later than v3. The methods in Monteiro et al. [1996] and Chen and Sarrafzadeh [2002] will not find such an opportunity. Although these methods can possibly be extended by viewing and/or as a degenerate select, they still cannot capture the information that either operand can mask the observability of the other. Such information is essential for control-flow restructuring when the scheduler exploits different possible speculation/predication schemes under a latency constraint.

5.3 Scheduling Optimization for Operation Gating

When the schedule is optimized for operation gating, along with other objectives such as latency, different algorithm frameworks can be used. As the problem is intrinsically difficult even without the consideration of operation gating, it is often solved using heuristics like list scheduling [Landskov et al. 1980] or force-directed scheduling [Paulin and Knight 1989]. The postprocessing technique in Monteiro et al. [1996] and the approach of Chen and Sarrafzadeh [2002] can be viewed as natural adaptations of previous heuristics to the problem with consideration of operation gating.

In our implementation, we extend a previous formulation of scheduling based on mathematical programming [Cong and Zhang 2006]. The formulation is still a heuristic with approximations instead of an exact method; yet it is able to optimize the schedule of all operations globally. For each operation v, an integer-valued scheduling variable $s_v$ is introduced to represent the time slot in which operation v is performed. Once the scheduling variable for every operation is decided, an FSMD model can be constructed [Cong and Zhang 2006]. The task of scheduling is thus to decide $s_v$ for every operation v.

The formulation uses two types of constraints, namely integer-difference hard constraints and integer-difference soft constraints. Both constraints have the same form, and they differ in the sense that integer-difference soft constraints are not necessarily satisfied.

Definition 5.3 (Integer-Difference Constraint). An integer-difference constraint is a constraint of the form $s_u - s_v \le d$, where d is a constant integer.
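A system of integer-difference constraints of the form in Definition 5.3 can be checked for feasibility with the textbook Bellman-Ford relaxation for difference constraints. The sketch below is not the solver used in this work; it only illustrates the constraint form. The operation names `c`, `v`, and `out`, and the one-cycle separations, are hypothetical.

```python
def solve_difference_constraints(nodes, constraints):
    """Find integers s with s[u] - s[v] <= d for every (u, v, d).

    Returns a dict of non-negative time slots, or None if the
    constraints are contradictory (negative cycle).
    """
    # Starting from 0 everywhere is equivalent to a virtual source
    # connected to every node with a zero-weight edge.
    s = {n: 0 for n in nodes}
    for _ in range(len(nodes)):
        changed = False
        for u, v, d in constraints:
            if s[v] + d < s[u]:
                s[u] = s[v] + d
                changed = True
        if not changed:
            break
    else:
        # Still relaxing after |nodes| passes: no solution exists.
        return None
    shift = min(s.values(), default=0)
    return {n: t - shift for n, t in s.items()}


# Hypothetical dependencies: c and v must each finish one cycle before out.
hard = [("c", "out", -1), ("v", "out", -1)]
schedule = solve_difference_constraints(["c", "v", "out"], hard)

# Forcing c one cycle ahead of v as well lengthens the schedule by one slot.
gated = solve_difference_constraints(["c", "v", "out"], hard + [("c", "v", -1)])
```

In this toy instance the first schedule places c and v in the same slot, so v cannot be gated by c; adding the extra constraint separates them at the cost of one more time slot, which is the kind of trade-off a soft constraint is meant to expose.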

Integer-difference hard constraints can be used to model a wide range of traditional scheduling constraints, including dependency, latency, frequency, resource, etc., as shown in Cong and Zhang [2006]. Integer-difference soft constraints, on the other hand, can be used to express the intention of operation gating. When it is preferred that a Boolean value c is scheduled before another value v so that v can be avoided when c takes a certain value, an integer-difference soft constraint can be added as

$$s_c - s_v \le -b - d_c + 1, \tag{7}$$

where $d_c$ is the number of clock cycles operation c spans, and b is the number of clock cycles needed to separate operations c and v. The value of b depends on the power management technique and the target platform: a typical value of b is 1 if clock gating or operand isolation is used; it means that the condition should be available at least 1 cycle before it can be used as a predicate for clock gating. For power gating, the number b is probably larger than 1.

Then the problem of power optimization using operation gating can be described in a mathematical-programming form as follows.

$$
\begin{aligned}
\min \quad & \sum_k c_k s_k \\
\text{s.t.} \quad & s_{u_i} - s_{v_i} \le p_i, \quad i = 1, \ldots, m \quad \text{(hard constraints)} \\
& s_{c_j} - s_{v_j} \le q_j, \quad j = 1, \ldots, n \quad \text{(soft constraints)}
\end{aligned}
\tag{8}
$$

Here $p_i$, $q_j$ are constants in various constraints for dependency, frequency, latency, resource, etc. Their values are determined by various constraint generators as described in Cong and Zhang [2006]. The $c_k$ are constants that serve as weights for different objectives.

To handle soft constraints in the formulation described in Eq. (8), we introduce a violation variable $w_j$ for each soft constraint j to represent the amount of violation.

$$
\begin{aligned}
s_{c_j} - s_{v_j} - w_j &\le q_j, \\
w_j &\ge 0.
\end{aligned}
\tag{9}
$$

A penalty term $\phi_j(w_j)$ is added to the objective function and the formulation can be written in matrix form as

$$
\begin{aligned}
\min \quad & c^T s + \sum_{j=1}^{n} \phi_j(w_j) \\
\text{s.t.} \quad & G s \le p \\
& H s - w \le q \\
& w \ge 0.
\end{aligned}
\tag{10}
$$

It is shown in Cong et al. [2009b] that the preceding formulation can be solved optimally in polynomial time with integer solutions, if each penalty function $\phi_j(w_j)$ is convex. In such a case, the problem is reduced to a linear program with a totally unimodular constraint matrix [Cong et al. 2009b]; thus it is guaranteed to have integral solutions. For nonconvex penalty functions (in this case the binary penalty, where violating a soft constraint leads to a constant cost), iterative approximation techniques are also developed that approximate a binary penalty with a sequence of linear penalty functions. More details of the solver are omitted here as they are not the focus of this article.
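The role of the violation variables and of the two kinds of penalty can be illustrated by evaluating them on fixed schedules. The sketch below is only an evaluation helper under stated assumptions, not the solver: it reads Eq. (7) as $s_c - s_v \le -b - d_c + 1$, takes the smallest feasible $w_j$ of Eq. (9), and uses hypothetical unit weights and hypothetical schedules.

```python
def soft_bound(b, d_c):
    """Right-hand side q of Eq. (7), read as s_c - s_v <= -b - d_c + 1."""
    return -b - d_c + 1


def violations(schedule, soft):
    """Smallest w_j >= 0 with s_c - s_v - w_j <= q_j for each (c, v, q_j)."""
    return [max(0, schedule[c] - schedule[v] - q) for c, v, q in soft]


def linear_penalty(w, weight=1):
    # Convex penalty: cost grows with the amount of violation.
    return weight * w


def binary_penalty(w, cost=1):
    # Nonconvex penalty: constant cost as soon as the constraint is violated.
    return cost if w > 0 else 0


# Hypothetical soft constraint: c spans 1 cycle and must lead v by b = 1 cycle.
soft = [("c", "v", soft_bound(b=1, d_c=1))]

same_slot = violations({"c": 0, "v": 0}, soft)   # c and v start together
c_late = violations({"c": 2, "v": 0}, soft)      # c starts two slots after v
c_first = violations({"c": 0, "v": 1}, soft)     # c leads v by one slot
```

For the two violating schedules the linear penalty distinguishes the amount of violation while the binary penalty does not, which is why the binary case needs either the iterative approximation mentioned above or an exact integer formulation.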

6. EXPERIMENTAL RESULTS

6.1 Experiment Setup

Techniques proposed in this work have been implemented in the scheduler of AutoPilot™, a commercial behavioral synthesis tool from AutoESL Design Technologies, Inc. [Zhang et al. 2008]. The tool accepts C/C++/SystemC as the input language and generates RTL specifications in VHDL or Verilog. Our scheduler introduces soft constraints and formulates the problem using techniques described in Section 5. We make comparisons to three other approaches: (1) a baseline scheduler using the SDC formulation without operation gating, (2) the iterative algorithm described in Chen and Sarrafzadeh [2002] for operation gating, (3) an Integer-Linear Programming (ILP) formulation to handle binary penalty exactly for optimal operation gating. We will not compare our approach with the original work on operation gating in Monteiro et al. [1996], because Chen and Sarrafzadeh [2002] is algorithmically similar to Monteiro et al. [1996], but with an improved strategy. All these approaches are implemented in C++, and the programs run on a workstation with four 2.4 GHz 64-bit CPUs and 8 GB of primary memory.

The ILP formulation (for the purpose of the optimality study) is briefly described as follows. In addition to all variables and constraints in Eq. (10), a variable $m_j$ is introduced for each violation variable $w_j$. We add constraints

$$w_j - N m_j \le 0, \tag{11}$$

$$m_j \in \{0, 1\}, \tag{12}$$

where N is a large constant number so that the constraint in Eq. (11) can always be satisfied when $m_j = 1$. Then $m_j$ is introduced to the objective with a coefficient reflecting the cost of violating soft constraint j. We also explicitly enforce the constraint that every variable is an integer. It is easy to verify that in the solution of the ILP formulation, we have

$$
m_j =
\begin{cases}
1 & \text{when } w_j > 0, \\
0 & \text{otherwise.}
\end{cases}
$$

After scheduling, a binding algorithm described in Cong et al. [2006] is performed. The RTL code generated by the behavioral synthesis tool is fed to the Magma Talus RTL-to-GDSII toolset. Gate-level simulation under typical input vectors is performed using the Aldec Riviera simulator to obtain power dissipation. All designs are implemented using a TSMC 90nm standard cell library. In this experiment the actual operation gating is carried out by clock gating on the output registers of the gated operations. Further power savings can potentially be achieved if we apply additional low-power techniques (e.g., operand isolation, feeding sleep vectors for leakage reduction).

Several designs in arithmetic and multimedia applications are used in our experiments. Characteristics of these designs are given in Table II.

Table II. Benchmark Characteristics

| Name | #node | Description |
|---|---|---|
| addr | 88 | address space translation unit |
| BoxMuller | 333 | Gaussian noise generator |
| dfmul | 351 | floating-point multiplier |
| MotionComp | 1306 | motion compensation (MPEG4 decoder) |
| MotionEst | 621 | motion estimation (MPEG4 encoder) |

#node means the number of nodes in the CDFG; it is roughly equal to the number of operations in the program.

6.2 Results and Analysis

Results of the four approaches are reported in Table III. Here, area and power after gate-level implementation are reported for each approach. Since the Magma Talus synthesis tool meets the clock cycle time constraint for all cases, we do not report the frequency for each individual approach. We also normalize the power values to those generated by the approach with soft constraints. For some larger designs, the exact ILP formulation (solved by Clp [Forrest et al. 2004], a state-of-the-art open-source ILP solver) fails to find a solution within 7200 seconds. All of the three other methods finish within 60 seconds for all cases.

Table III. Experimental Results. Columns: design; cycle; and area, power, and Np for each of the SDC, iterative, ILP, and soft-constraints approaches. Rows: addr, BoxMuller, dfmul, MotionComp, MotionEst, geomean. Cycle is in ns, area is in $\mu m^2$, and power is in mW. The Np column is the normalized power.

From the results, it is clear that operation gating is a useful technique to create opportunities for power management at the RT level without significant overhead in area. Compared to the SDC scheduling algorithm without considering operation gating, all of the three other methods that optimize for operation gating improve the power dissipation: on average, the method in Chen and Sarrafzadeh [2002] reduces power by 20.1%, the exact method given by ILP reduces power by 34.6%, and our proposed method by 33.9%. The reduction tends to be particularly significant when the design has a complex control structure, like addr.

Relatively large memory blocks are present in some of the designs, including BoxMuller, MotionComp, and MotionEst. For such designs, when the access pattern to the memory is fixed, operation gating tends to be less effective, because the memory power is roughly a constant. But if we exclude the power consumed by memory (approximately 1mW in BoxMuller, more than 2mW for MotionComp, and 3mW for MotionEst) and only look at the power consumed purely by logic (functional units, registers, interconnects), the power saving is still very significant. While power consumed in memory blocks can be a very important part of total power, it is usually not controlled by the operation scheduler when memory operations are unavoidable and the access pattern is fixed. Possible techniques that help to reduce memory power include behavioral transformations (loop transformation to enhance memory locality, to leverage burst-mode memory access, etc.), memory architecture selection, etc., but those are beyond the scope of this study. For a fair comparison, we include the memory power for every design in Table III.

Compared to Chen and Sarrafzadeh [2002], the proposed approach further reduces total power dissipation by an average of 17.1%. This saving is because we are able to consider all opportunities for operation gating simultaneously, and optimize globally in our approach. The approximation of the binary penalty function turns out to work very well; the results generated using our approach are very close to those of the exact formulation, and the observed optimality gap in terms of power is about 1%. At the same time, our method is much more scalable than the exact formulation.

7. RELATED WORK ON OBSERVABILITY ANALYSIS

One might think that after partial dead code elimination in the predicated form [Bodík and Gupta 1997; Ryoo et al. 2006], the predicate of every instruction is equal to its behavior-level observability condition because no redundant instruction will be executed along every control path. However, this is not true. Consider two Boolean values used in a Boolean and instruction: the behavior-level observability condition for either instruction could contain a term about the other. If behavior-level observability conditions are applied as predicates, there will be a cyclic dependency between the two instructions, and the code becomes illegal. Thus, from the perspective of behavior-level observability, one can always find instructions that are unnecessarily executed (unobservable), even after thorough compiler optimization. Since the execution of unnecessary instructions cannot be avoided completely, profile-guided optimization is needed to minimize the cost of unnecessary execution, as shown in this work.

Notably, using behavior-level observability to guide scheduling gives us opportunities to unify both speculative scheduling and control-flow restructuring, as shown in Figure 3. Previous efforts using predicates in hyperblock scheduling also allow speculative scheduling through predicate promotion [Mahlke et al. 1992], and have the ability to simplify program decision logic using knowledge about relations between predicates [August et al. 1999], but a postprocessing pass like predicated partial dead code elimination is needed to strengthen some predicates after scheduling to fully realize the equivalent transformation. Behavior-level observability could provide more information to the scheduler than predicates, and can be helpful when various trade-offs are performed by the scheduler under tight constraints like latency/throughput.

The work in Wang and Roy [2003] introduced the concept of behavior-level observability and used behavior-level ODC to strengthen conditions for operand isolation and clock gating. However, the algorithm did not consider correlations (between Boolean values, and between data values among a network of select operations), and thus did not capture opportunities for control-flow restructuring. In addition, the work was not intended to guide architectural exploration in behavioral synthesis.

A preliminary version of this work was presented in Cong et al. [2009a]. However, the work presented there did not include a rigorous theoretical justification. The method used a similar black-box model for complex operations compared to the one in this article, but did not consider the generalized select operation. Thus, when reconvergent paths involving only select operations occur, some data correlations may be lost, leading to weaker observability don't-care conditions.

Fig. 7. Transformation to get an even better implementation.

Let us again consider the example code in Figure 3. In practice, an experienced designer may optimize the design to the one in Figure 7, where a redundant Boolean value is introduced in the hope that it can be used in observability computation for further avoiding multiplications. The Boolean value ((a b) & 0xFFFFFFFB) == 0xA is a necessary condition for a * a + b * b == 100. A technique to add such Boolean guards has been developed in Ghodrat et al. [2007] for embedded compilers. We believe that when this technique is applied, our proposed approach on operation gating can be more effective in power reduction.

8. CONCLUSION

We have developed the first systematic way to analyze observability at the behavior level, and we show how behavior-level observability can guide the scheduler in a behavioral synthesis tool to maximize the chance of operation gating, enabling more RTL power management techniques. Our approach introduces the generalized select operation to capture the observability masking effect and the correlation between conditions, while modeling arithmetic operations as black boxes to enable efficient word-level analysis. Experimental results show that our approach is very effective for power reduction. The analysis using our theory also reveals possible opportunities in compiler optimization using observability don't-cares. We leave it for future work.

APPENDIX

Here we provide proofs to the theorems in Section 4.

THEOREM 4.1 (PROPAGATE THROUGH A BLACK BOX). For a black box $z = f(x_1, \ldots, x_m)$, the edge observability satisfies $\mathrm{S\_BLO}(x_i) = \mathrm{S\_BLO}(z)$, for each $i \in \{1, \ldots, m\}$.

PROOF. Without loss of generality, we only consider $x_1$. According to the propagation rule in a Boolean network, we have

$$\mathrm{BLO}(x_1) = \mathrm{BLO}(z)\,\frac{\partial z}{\partial x_1} \tag{13}$$

$$= \mathrm{BLO}(z)\,\bigl(f(0, x_2, \ldots, x_m) \oplus f(1, x_2, \ldots, x_m)\bigr). \tag{14}$$
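The Boolean difference that appears in Eqs. (13) and (14) can be checked on small functions by truth-table enumeration. The sketch below is an illustration only; the helper name `boolean_difference` is hypothetical, and the function it is applied to is the conjunction v6 v7 from the Figure 3 example rather than a general black box.

```python
from itertools import product


def boolean_difference(f, names, x):
    """Truth table of f|x=0 XOR f|x=1 over the remaining variables.

    f     -- function mapping a dict {name: bool} to bool
    names -- all variable names read by f
    x     -- the variable to differentiate against
    Returns (others, table), where table maps each assignment of
    `others` (as a tuple of bools) to the value of the difference.
    """
    others = [n for n in names if n != x]
    table = {}
    for bits in product([False, True], repeat=len(others)):
        env = dict(zip(others, bits))
        # Inequality of two bools is exactly their exclusive or.
        table[bits] = f({**env, x: False}) != f({**env, x: True})
    return others, table


def conj(env):
    # The conjunction v6 AND v7.
    return env["v6"] and env["v7"]


others, table = boolean_difference(conj, ["v6", "v7"], "v6")
```

For the conjunction, the resulting table is true exactly when v7 is true, so a change in v6 is visible only when v7 holds; this is the masking effect between the operands of a Boolean and that the analysis exploits.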


More information

Programming in Fortran 90 : 2017/2018

Programming in Fortran 90 : 2017/2018 Programmng n Fortran 90 : 2017/2018 Programmng n Fortran 90 : 2017/2018 Exercse 1 : Evaluaton of functon dependng on nput Wrte a program who evaluate the functon f (x,y) for any two user specfed values

More information

Related-Mode Attacks on CTR Encryption Mode

Related-Mode Attacks on CTR Encryption Mode Internatonal Journal of Network Securty, Vol.4, No.3, PP.282 287, May 2007 282 Related-Mode Attacks on CTR Encrypton Mode Dayn Wang, Dongda Ln, and Wenlng Wu (Correspondng author: Dayn Wang) Key Laboratory

More information

Private Information Retrieval (PIR)

Private Information Retrieval (PIR) 2 Levente Buttyán Problem formulaton Alce wants to obtan nformaton from a database, but she does not want the database to learn whch nformaton she wanted e.g., Alce s an nvestor queryng a stock-market

More information

Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier

Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier Some materal adapted from Mohamed Youns, UMBC CMSC 611 Spr 2003 course sldes Some materal adapted from Hennessy & Patterson / 2003 Elsever Scence Performance = 1 Executon tme Speedup = Performance (B)

More information

LECTURE NOTES Duality Theory, Sensitivity Analysis, and Parametric Programming

LECTURE NOTES Duality Theory, Sensitivity Analysis, and Parametric Programming CEE 60 Davd Rosenberg p. LECTURE NOTES Dualty Theory, Senstvty Analyss, and Parametrc Programmng Learnng Objectves. Revew the prmal LP model formulaton 2. Formulate the Dual Problem of an LP problem (TUES)

More information

Helsinki University Of Technology, Systems Analysis Laboratory Mat Independent research projects in applied mathematics (3 cr)

Helsinki University Of Technology, Systems Analysis Laboratory Mat Independent research projects in applied mathematics (3 cr) Helsnk Unversty Of Technology, Systems Analyss Laboratory Mat-2.08 Independent research projects n appled mathematcs (3 cr) "! #$&% Antt Laukkanen 506 R ajlaukka@cc.hut.f 2 Introducton...3 2 Multattrbute

More information

Concurrent models of computation for embedded software

Concurrent models of computation for embedded software Concurrent models of computaton for embedded software and hardware! Researcher overvew what t looks lke semantcs what t means and how t relates desgnng an actor language actor propertes and how to represent

More information

TPL-Aware Displacement-driven Detailed Placement Refinement with Coloring Constraints

TPL-Aware Displacement-driven Detailed Placement Refinement with Coloring Constraints TPL-ware Dsplacement-drven Detaled Placement Refnement wth Colorng Constrants Tao Ln Iowa State Unversty tln@astate.edu Chrs Chu Iowa State Unversty cnchu@astate.edu BSTRCT To mnmze the effect of process

More information

Module Management Tool in Software Development Organizations

Module Management Tool in Software Development Organizations Journal of Computer Scence (5): 8-, 7 ISSN 59-66 7 Scence Publcatons Management Tool n Software Development Organzatons Ahmad A. Al-Rababah and Mohammad A. Al-Rababah Faculty of IT, Al-Ahlyyah Amman Unversty,

More information

Assembler. Building a Modern Computer From First Principles.

Assembler. Building a Modern Computer From First Principles. Assembler Buldng a Modern Computer From Frst Prncples www.nand2tetrs.org Elements of Computng Systems, Nsan & Schocken, MIT Press, www.nand2tetrs.org, Chapter 6: Assembler slde Where we are at: Human Thought

More information

Assembler. Shimon Schocken. Spring Elements of Computing Systems 1 Assembler (Ch. 6) Compiler. abstract interface.

Assembler. Shimon Schocken. Spring Elements of Computing Systems 1 Assembler (Ch. 6) Compiler. abstract interface. IDC Herzlya Shmon Schocken Assembler Shmon Schocken Sprng 2005 Elements of Computng Systems 1 Assembler (Ch. 6) Where we are at: Human Thought Abstract desgn Chapters 9, 12 abstract nterface H.L. Language

More information

Improvement of Spatial Resolution Using BlockMatching Based Motion Estimation and Frame. Integration

Improvement of Spatial Resolution Using BlockMatching Based Motion Estimation and Frame. Integration Improvement of Spatal Resoluton Usng BlockMatchng Based Moton Estmaton and Frame Integraton Danya Suga and Takayuk Hamamoto Graduate School of Engneerng, Tokyo Unversty of Scence, 6-3-1, Nuku, Katsuska-ku,

More information

Wishing you all a Total Quality New Year!

Wishing you all a Total Quality New Year! Total Qualty Management and Sx Sgma Post Graduate Program 214-15 Sesson 4 Vnay Kumar Kalakband Assstant Professor Operatons & Systems Area 1 Wshng you all a Total Qualty New Year! Hope you acheve Sx sgma

More information

6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour

6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour 6.854 Advanced Algorthms Petar Maymounkov Problem Set 11 (November 23, 2005) Wth: Benjamn Rossman, Oren Wemann, and Pouya Kheradpour Problem 1. We reduce vertex cover to MAX-SAT wth weghts, such that the

More information

Configuration Management in Multi-Context Reconfigurable Systems for Simultaneous Performance and Power Optimizations*

Configuration Management in Multi-Context Reconfigurable Systems for Simultaneous Performance and Power Optimizations* Confguraton Management n Mult-Context Reconfgurable Systems for Smultaneous Performance and Power Optmzatons* Rafael Maestre, Mlagros Fernandez Departamento de Arqutectura de Computadores y Automátca Unversdad

More information

Lecture 3: Computer Arithmetic: Multiplication and Division

Lecture 3: Computer Arithmetic: Multiplication and Division 8-447 Lecture 3: Computer Arthmetc: Multplcaton and Dvson James C. Hoe Dept of ECE, CMU January 26, 29 S 9 L3- Announcements: Handout survey due Lab partner?? Read P&H Ch 3 Read IEEE 754-985 Handouts:

More information

An Entropy-Based Approach to Integrated Information Needs Assessment

An Entropy-Based Approach to Integrated Information Needs Assessment Dstrbuton Statement A: Approved for publc release; dstrbuton s unlmted. An Entropy-Based Approach to ntegrated nformaton Needs Assessment June 8, 2004 Wllam J. Farrell Lockheed Martn Advanced Technology

More information

A Fast Visual Tracking Algorithm Based on Circle Pixels Matching

A Fast Visual Tracking Algorithm Based on Circle Pixels Matching A Fast Vsual Trackng Algorthm Based on Crcle Pxels Matchng Zhqang Hou hou_zhq@sohu.com Chongzhao Han czhan@mal.xjtu.edu.cn Ln Zheng Abstract: A fast vsual trackng algorthm based on crcle pxels matchng

More information

Virtual Memory. Background. No. 10. Virtual Memory: concept. Logical Memory Space (review) Demand Paging(1) Virtual Memory

Virtual Memory. Background. No. 10. Virtual Memory: concept. Logical Memory Space (review) Demand Paging(1) Virtual Memory Background EECS. Operatng System Fundamentals No. Vrtual Memory Prof. Hu Jang Department of Electrcal Engneerng and Computer Scence, York Unversty Memory-management methods normally requres the entre process

More information

Memory Modeling in ESL-RTL Equivalence Checking

Memory Modeling in ESL-RTL Equivalence Checking 11.4 Memory Modelng n ESL-RTL Equvalence Checkng Alfred Koelbl 2025 NW Cornelus Pass Rd. Hllsboro, OR 97124 koelbl@synopsys.com Jerry R. Burch 2025 NW Cornelus Pass Rd. Hllsboro, OR 97124 burch@synopsys.com

More information

Range images. Range image registration. Examples of sampling patterns. Range images and range surfaces

Range images. Range image registration. Examples of sampling patterns. Range images and range surfaces Range mages For many structured lght scanners, the range data forms a hghly regular pattern known as a range mage. he samplng pattern s determned by the specfc scanner. Range mage regstraton 1 Examples

More information

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data A Fast Content-Based Multmeda Retreval Technque Usng Compressed Data Borko Furht and Pornvt Saksobhavvat NSF Multmeda Laboratory Florda Atlantc Unversty, Boca Raton, Florda 3343 ABSTRACT In ths paper,

More information

Edge Detection in Noisy Images Using the Support Vector Machines

Edge Detection in Noisy Images Using the Support Vector Machines Edge Detecton n Nosy Images Usng the Support Vector Machnes Hlaro Gómez-Moreno, Saturnno Maldonado-Bascón, Francsco López-Ferreras Sgnal Theory and Communcatons Department. Unversty of Alcalá Crta. Madrd-Barcelona

More information

Course Introduction. Algorithm 8/31/2017. COSC 320 Advanced Data Structures and Algorithms. COSC 320 Advanced Data Structures and Algorithms

Course Introduction. Algorithm 8/31/2017. COSC 320 Advanced Data Structures and Algorithms. COSC 320 Advanced Data Structures and Algorithms Course Introducton Course Topcs Exams, abs, Proects A quc loo at a few algorthms 1 Advanced Data Structures and Algorthms Descrpton: We are gong to dscuss algorthm complexty analyss, algorthm desgn technques

More information

Motivation. EE 457 Unit 4. Throughput vs. Latency. Performance Depends on View Point?! Computer System Performance. An individual user wants to:

Motivation. EE 457 Unit 4. Throughput vs. Latency. Performance Depends on View Point?! Computer System Performance. An individual user wants to: 4.1 4.2 Motvaton EE 457 Unt 4 Computer System Performance An ndvdual user wants to: Mnmze sngle program executon tme A datacenter owner wants to: Maxmze number of Mnmze ( ) http://e-tellgentnternetmarketng.com/webste/frustrated-computer-user-2/

More information

Synthesizer 1.0. User s Guide. A Varying Coefficient Meta. nalytic Tool. Z. Krizan Employing Microsoft Excel 2007

Synthesizer 1.0. User s Guide. A Varying Coefficient Meta. nalytic Tool. Z. Krizan Employing Microsoft Excel 2007 Syntheszer 1.0 A Varyng Coeffcent Meta Meta-Analytc nalytc Tool Employng Mcrosoft Excel 007.38.17.5 User s Gude Z. Krzan 009 Table of Contents 1. Introducton and Acknowledgments 3. Operatonal Functons

More information

X- Chart Using ANOM Approach

X- Chart Using ANOM Approach ISSN 1684-8403 Journal of Statstcs Volume 17, 010, pp. 3-3 Abstract X- Chart Usng ANOM Approach Gullapall Chakravarth 1 and Chaluvad Venkateswara Rao Control lmts for ndvdual measurements (X) chart are

More information

Polyhedral Compilation Foundations

Polyhedral Compilation Foundations Polyhedral Complaton Foundatons Lous-Noël Pouchet pouchet@cse.oho-state.edu Dept. of Computer Scence and Engneerng, the Oho State Unversty Feb 8, 200 888., Class # Introducton: Polyhedral Complaton Foundatons

More information

Quality Improvement Algorithm for Tetrahedral Mesh Based on Optimal Delaunay Triangulation

Quality Improvement Algorithm for Tetrahedral Mesh Based on Optimal Delaunay Triangulation Intellgent Informaton Management, 013, 5, 191-195 Publshed Onlne November 013 (http://www.scrp.org/journal/m) http://dx.do.org/10.36/m.013.5601 Qualty Improvement Algorthm for Tetrahedral Mesh Based on

More information

Sequential search. Building Java Programs Chapter 13. Sequential search. Sequential search

Sequential search. Building Java Programs Chapter 13. Sequential search. Sequential search Sequental search Buldng Java Programs Chapter 13 Searchng and Sortng sequental search: Locates a target value n an array/lst by examnng each element from start to fnsh. How many elements wll t need to

More information

RADIX-10 PARALLEL DECIMAL MULTIPLIER

RADIX-10 PARALLEL DECIMAL MULTIPLIER RADIX-10 PARALLEL DECIMAL MULTIPLIER 1 MRUNALINI E. INGLE & 2 TEJASWINI PANSE 1&2 Electroncs Engneerng, Yeshwantrao Chavan College of Engneerng, Nagpur, Inda E-mal : mrunalngle@gmal.com, tejaswn.deshmukh@gmal.com

More information

Sum of Linear and Fractional Multiobjective Programming Problem under Fuzzy Rules Constraints

Sum of Linear and Fractional Multiobjective Programming Problem under Fuzzy Rules Constraints Australan Journal of Basc and Appled Scences, 2(4): 1204-1208, 2008 ISSN 1991-8178 Sum of Lnear and Fractonal Multobjectve Programmng Problem under Fuzzy Rules Constrants 1 2 Sanjay Jan and Kalash Lachhwan

More information

Repeater Insertion for Two-Terminal Nets in Three-Dimensional Integrated Circuits

Repeater Insertion for Two-Terminal Nets in Three-Dimensional Integrated Circuits Repeater Inserton for Two-Termnal Nets n Three-Dmensonal Integrated Crcuts Hu Xu, Vasls F. Pavlds, and Govann De Mchel LSI - EPFL, CH-5, Swtzerland, {hu.xu,vasleos.pavlds,govann.demchel}@epfl.ch Abstract.

More information

A New Approach For the Ranking of Fuzzy Sets With Different Heights

A New Approach For the Ranking of Fuzzy Sets With Different Heights New pproach For the ankng of Fuzzy Sets Wth Dfferent Heghts Pushpnder Sngh School of Mathematcs Computer pplcatons Thapar Unversty, Patala-7 00 Inda pushpndersnl@gmalcom STCT ankng of fuzzy sets plays

More information

2x x l. Module 3: Element Properties Lecture 4: Lagrange and Serendipity Elements

2x x l. Module 3: Element Properties Lecture 4: Lagrange and Serendipity Elements Module 3: Element Propertes Lecture : Lagrange and Serendpty Elements 5 In last lecture note, the nterpolaton functons are derved on the bass of assumed polynomal from Pascal s trangle for the fled varable.

More information

Optimal Scheduling of Capture Times in a Multiple Capture Imaging System

Optimal Scheduling of Capture Times in a Multiple Capture Imaging System Optmal Schedulng of Capture Tmes n a Multple Capture Imagng System Tng Chen and Abbas El Gamal Informaton Systems Laboratory Department of Electrcal Engneerng Stanford Unversty Stanford, Calforna 9435,

More information

Mallathahally, Bangalore, India 1 2

Mallathahally, Bangalore, India 1 2 7 IMPLEMENTATION OF HIGH PERFORMANCE BINARY SQUARER PRADEEP M C, RAMESH S, Department of Electroncs and Communcaton Engneerng, Dr. Ambedkar Insttute of Technology, Mallathahally, Bangalore, Inda pradeepmc@gmal.com,

More information

CHAPTER 4 PARALLEL PREFIX ADDER

CHAPTER 4 PARALLEL PREFIX ADDER 93 CHAPTER 4 PARALLEL PREFIX ADDER 4.1 INTRODUCTION VLSI Integer adders fnd applcatons n Arthmetc and Logc Unts (ALUs), mcroprocessors and memory addressng unts. Speed of the adder often decdes the mnmum

More information

Complex Numbers. Now we also saw that if a and b were both positive then ab = a b. For a second let s forget that restriction and do the following.

Complex Numbers. Now we also saw that if a and b were both positive then ab = a b. For a second let s forget that restriction and do the following. Complex Numbers The last topc n ths secton s not really related to most of what we ve done n ths chapter, although t s somewhat related to the radcals secton as we wll see. We also won t need the materal

More information

CE 221 Data Structures and Algorithms

CE 221 Data Structures and Algorithms CE 1 ata Structures and Algorthms Chapter 4: Trees BST Text: Read Wess, 4.3 Izmr Unversty of Economcs 1 The Search Tree AT Bnary Search Trees An mportant applcaton of bnary trees s n searchng. Let us assume

More information

Today s Outline. Sorting: The Big Picture. Why Sort? Selection Sort: Idea. Insertion Sort: Idea. Sorting Chapter 7 in Weiss.

Today s Outline. Sorting: The Big Picture. Why Sort? Selection Sort: Idea. Insertion Sort: Idea. Sorting Chapter 7 in Weiss. Today s Outlne Sortng Chapter 7 n Wess CSE 26 Data Structures Ruth Anderson Announcements Wrtten Homework #6 due Frday 2/26 at the begnnng of lecture Proect Code due Mon March 1 by 11pm Today s Topcs:

More information

ELEC 377 Operating Systems. Week 6 Class 3

ELEC 377 Operating Systems. Week 6 Class 3 ELEC 377 Operatng Systems Week 6 Class 3 Last Class Memory Management Memory Pagng Pagng Structure ELEC 377 Operatng Systems Today Pagng Szes Vrtual Memory Concept Demand Pagng ELEC 377 Operatng Systems

More information

Dynamic Voltage Scaling of Supply and Body Bias Exploiting Software Runtime Distribution

Dynamic Voltage Scaling of Supply and Body Bias Exploiting Software Runtime Distribution Dynamc Voltage Scalng of Supply and Body Bas Explotng Software Runtme Dstrbuton Sungpack Hong EE Department Stanford Unversty Sungjoo Yoo, Byeong Bn, Kyu-Myung Cho, Soo-Kwan Eo Samsung Electroncs Taehwan

More information

S1 Note. Basis functions.

S1 Note. Basis functions. S1 Note. Bass functons. Contents Types of bass functons...1 The Fourer bass...2 B-splne bass...3 Power and type I error rates wth dfferent numbers of bass functons...4 Table S1. Smulaton results of type

More information

Optimal Workload-based Weighted Wavelet Synopses

Optimal Workload-based Weighted Wavelet Synopses Optmal Workload-based Weghted Wavelet Synopses Yoss Matas School of Computer Scence Tel Avv Unversty Tel Avv 69978, Israel matas@tau.ac.l Danel Urel School of Computer Scence Tel Avv Unversty Tel Avv 69978,

More information

ETAtouch RESTful Webservices

ETAtouch RESTful Webservices ETAtouch RESTful Webservces Verson 1.1 November 8, 2012 Contents 1 Introducton 3 2 The resource /user/ap 6 2.1 HTTP GET................................... 6 2.2 HTTP POST..................................

More information

Outline. Digital Systems. C.2: Gates, Truth Tables and Logic Equations. Truth Tables. Logic Gates 9/8/2011

Outline. Digital Systems. C.2: Gates, Truth Tables and Logic Equations. Truth Tables. Logic Gates 9/8/2011 9/8/2 2 Outlne Appendx C: The Bascs of Logc Desgn TDT4255 Computer Desgn Case Study: TDT4255 Communcaton Module Lecture 2 Magnus Jahre 3 4 Dgtal Systems C.2: Gates, Truth Tables and Logc Equatons All sgnals

More information

Kent State University CS 4/ Design and Analysis of Algorithms. Dept. of Math & Computer Science LECT-16. Dynamic Programming

Kent State University CS 4/ Design and Analysis of Algorithms. Dept. of Math & Computer Science LECT-16. Dynamic Programming CS 4/560 Desgn and Analyss of Algorthms Kent State Unversty Dept. of Math & Computer Scence LECT-6 Dynamc Programmng 2 Dynamc Programmng Dynamc Programmng, lke the dvde-and-conquer method, solves problems

More information

Analysis of Continuous Beams in General

Analysis of Continuous Beams in General Analyss of Contnuous Beams n General Contnuous beams consdered here are prsmatc, rgdly connected to each beam segment and supported at varous ponts along the beam. onts are selected at ponts of support,

More information

CPE 628 Chapter 2 Design for Testability. Dr. Rhonda Kay Gaede UAH. UAH Chapter Introduction

CPE 628 Chapter 2 Design for Testability. Dr. Rhonda Kay Gaede UAH. UAH Chapter Introduction Chapter 2 Desgn for Testablty Dr Rhonda Kay Gaede UAH 2 Introducton Dffcultes n and the states of sequental crcuts led to provdng drect access for storage elements, whereby selected storage elements are

More information

NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS

NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS ARPN Journal of Engneerng and Appled Scences 006-017 Asan Research Publshng Network (ARPN). All rghts reserved. NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS Igor Grgoryev, Svetlana

More information

Agenda & Reading. Simple If. Decision-Making Statements. COMPSCI 280 S1C Applications Programming. Programming Fundamentals

Agenda & Reading. Simple If. Decision-Making Statements. COMPSCI 280 S1C Applications Programming. Programming Fundamentals Agenda & Readng COMPSCI 8 SC Applcatons Programmng Programmng Fundamentals Control Flow Agenda: Decsonmakng statements: Smple If, Ifelse, nested felse, Select Case s Whle, DoWhle/Untl, For, For Each, Nested

More information

Video Proxy System for a Large-scale VOD System (DINA)

Video Proxy System for a Large-scale VOD System (DINA) Vdeo Proxy System for a Large-scale VOD System (DINA) KWUN-CHUNG CHAN #, KWOK-WAI CHEUNG *# #Department of Informaton Engneerng *Centre of Innovaton and Technology The Chnese Unversty of Hong Kong SHATIN,

More information

A MOVING MESH APPROACH FOR SIMULATION BUDGET ALLOCATION ON CONTINUOUS DOMAINS

A MOVING MESH APPROACH FOR SIMULATION BUDGET ALLOCATION ON CONTINUOUS DOMAINS Proceedngs of the Wnter Smulaton Conference M E Kuhl, N M Steger, F B Armstrong, and J A Jones, eds A MOVING MESH APPROACH FOR SIMULATION BUDGET ALLOCATION ON CONTINUOUS DOMAINS Mark W Brantley Chun-Hung

More information

High-Boost Mesh Filtering for 3-D Shape Enhancement

High-Boost Mesh Filtering for 3-D Shape Enhancement Hgh-Boost Mesh Flterng for 3-D Shape Enhancement Hrokazu Yagou Λ Alexander Belyaev y Damng We z Λ y z ; ; Shape Modelng Laboratory, Unversty of Azu, Azu-Wakamatsu 965-8580 Japan y Computer Graphcs Group,

More information

Petri Net Based Software Dependability Engineering

Petri Net Based Software Dependability Engineering Proc. RELECTRONIC 95, Budapest, pp. 181-186; October 1995 Petr Net Based Software Dependablty Engneerng Monka Hener Brandenburg Unversty of Technology Cottbus Computer Scence Insttute Postbox 101344 D-03013

More information